
Tuesday, April 15, 2008

Bitmap Join Indexes - A Simple Example

Bitmap join indexes are a way to denormalize without denormalizing.

Consider the EMP/DEPT tables.

People ask "How many people work in RESEARCH?" or "Who in SALES makes more than $5000?"

Note that they do not ask "How many people work in deptno=10?" or "Who in deptno=20 makes more than $5000?"

So, do you move DNAME into EMP? No -- you can create a bitmap join index instead. Consider:


ops$tkyte@ORA10G> create table emp as select * from scott.emp;
Table created.

ops$tkyte@ORA10G> create table dept as select * from scott.dept;
Table created.

ops$tkyte@ORA10G>
ops$tkyte@ORA10G> alter table dept add constraint dept_pk primary key(deptno);

Table altered.

ops$tkyte@ORA10G>
ops$tkyte@ORA10G> set echo off
Enter to continue

So, here is how to create the bitmap join index -- using a FROM and WHERE clause:

ops$tkyte@ORA10G>
ops$tkyte@ORA10G> create bitmap index emp_bm_idx
2 on emp( d.dname )
3 from emp e, dept d
4 where e.deptno = d.deptno
5 /

Index created.

ops$tkyte@ORA10G>
ops$tkyte@ORA10G> set echo off
Enter to continue


Of course, to use a bitmap index we must use the CBO, and we would need lots of data before the optimizer would want to use it -- so we fake it out.


ops$tkyte@ORA10G>
ops$tkyte@ORA10G> exec dbms_stats.set_table_stats( user, 'EMP', numrows => 1000000, numblks => 300000 );

PL/SQL procedure successfully completed.

ops$tkyte@ORA10G> exec dbms_stats.set_table_stats( user, 'DEPT', numrows => 100000, numblks => 30000 );

PL/SQL procedure successfully completed.

ops$tkyte@ORA10G>
ops$tkyte@ORA10G>
ops$tkyte@ORA10G> set echo off
Enter to continue

And now we just query...


ops$tkyte@ORA10G>
ops$tkyte@ORA10G> set autotrace on
ops$tkyte@ORA10G> select count(*)
2 from emp, dept
3 where emp.deptno = dept.deptno
4 and dept.dname = 'SALES'
5 /

COUNT(*)
----------
6


Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=1 Card=1 Bytes=13)
1 0 SORT (AGGREGATE)
2 1 BITMAP CONVERSION (COUNT) (Cost=1 Card=10000 Bytes=130000)
3 2 BITMAP INDEX (SINGLE VALUE) OF 'EMP_BM_IDX' (INDEX (BITMAP))

Note how neither EMP nor DEPT was actually accessed for this query...


ops$tkyte@ORA10G> select emp.*
2 from emp, dept
3 where emp.deptno = dept.deptno
4 and dept.dname = 'SALES'
5 /

EMPNO ENAME JOB MGR HIREDATE SAL COMM DEPTNO
---------- ---------- --------- ------- --------- ------ ------- ------
7499 ALLEN SALESMAN 7698 20-FEB-81 1600 300 30
7521 WARD SALESMAN 7698 22-FEB-81 1250 500 30
7654 MARTIN SALESMAN 7698 28-SEP-81 1250 1400 30
7698 BLAKE MANAGER 7839 01-MAY-81 2850 30
7844 TURNER SALESMAN 7698 08-SEP-81 1500 0 30
7900 JAMES CLERK 7698 03-DEC-81 950 30

6 rows selected.


Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=8712 Card=10000 Bytes=870000)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'EMP' (TABLE) (Cost=8712 Card=10000 Bytes=870000)
2 1 BITMAP CONVERSION (TO ROWIDS)
3 2 BITMAP INDEX (SINGLE VALUE) OF 'EMP_BM_IDX' (INDEX (BITMAP))

And notice how DEPT is not accessed here either.


Bitmap join indexes can be used in some cases INSTEAD of a star transformation -- it is not that they "help" it; rather, they are an alternative to it.

Friday, April 11, 2008

Oracle 11g New Feature - Invisible Indexes


Beginning with Release 11g, you can create invisible indexes.

An invisible index is an index that is ignored by the optimizer unless you explicitly set the OPTIMIZER_USE_INVISIBLE_INDEXES initialization parameter to TRUE at the session or system level.

Using invisible indexes, you can do the following:
· Test the removal of an index before dropping it.
· Use temporary index structures for certain operations or modules of an application without affecting the overall application.

Making an index invisible is an alternative to making it unusable or dropping it.

Unlike an unusable index, an invisible index is maintained during any DML operation but is not used by the optimizer unless you explicitly specify the index with a hint.

Applications often have to be modified without being able to bring the complete application offline. Invisible indexes enable you to leverage temporary index structures for certain operations or modules of an application without affecting the overall application. Furthermore, invisible indexes can be used to test the removal of an index without dropping it right away, thus enabling a grace period for testing in production environments.

You can set the following Oracle parameter to control whether invisible indexes are visible to the optimizer; an example of setting it at session level follows the value list below.

OPTIMIZER_USE_INVISIBLE_INDEXES enables or disables the use of invisible indexes.
Values:
· true - Invisible indexes are treated as visible (normal) indexes.
· false - Invisible indexes will not be considered by the optimizer but will still be maintained by DML operations.
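
For example, to try out an invisible index in a single session before exposing it database-wide, the parameter can be set at session level (a minimal sketch; the query is illustrative only):

ALTER SESSION SET OPTIMIZER_USE_INVISIBLE_INDEXES = TRUE;

-- the optimizer may now consider invisible indexes, but only in this session
SELECT ename FROM emp WHERE ename = 'KING';

ALTER SESSION SET OPTIMIZER_USE_INVISIBLE_INDEXES = FALSE;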

Creating an Invisible Index

To create an invisible index, use the SQL statement CREATE INDEX with the INVISIBLE clause.

The following statement creates an invisible index named emp_ename for the ename column of the emp table:

CREATE INDEX emp_ename ON emp(ename)
TABLESPACE users
STORAGE (INITIAL 20K NEXT 20k PCTINCREASE 75)
INVISIBLE;

Making an Index Invisible

To make a visible index invisible, issue this statement:

ALTER INDEX index INVISIBLE;

To make an invisible index visible, issue this statement:

ALTER INDEX index VISIBLE;

To find out whether an index is visible or invisible, query the dictionary views USER_INDEXES, ALL_INDEXES, or DBA_INDEXES. For example, to determine if the index IND1 is invisible, issue the following query:

SELECT INDEX_NAME, VISIBILITY
FROM USER_INDEXES
WHERE INDEX_NAME = 'IND1';

INDEX_NAME VISIBILITY
---------- ----------
IND1 VISIBLE

Oracle 11g New Feature - Case Sensitive Passwords

In previous releases of Oracle Database, passwords were not case sensitive.

In Oracle 11g, when you create or modify user accounts, by default, passwords are case sensitive. To control the use of case sensitivity in passwords, set the SEC_CASE_SENSITIVE_LOGON initialization parameter. Only users who have the ALTER SYSTEM privilege can set the SEC_CASE_SENSITIVE_LOGON parameter. Set it to TRUE to enable case sensitivity or FALSE to disable case sensitivity.

For greater security, Oracle recommends that you enable case sensitivity in passwords. However, if you have compatibility issues with your applications, you can use this parameter to disable password case sensitivity. Examples of application compatibility issues are passwords for your applications being hard-coded to be case insensitive, or different application modules being inconsistent about case sensitivity when sending credentials to start a database session.

The following statement shows how to enable case sensitivity in passwords:

ALTER SYSTEM SET SEC_CASE_SENSITIVE_LOGON = TRUE;

If you import user accounts from a previous release, for example, Release 10g, into the current database release, the case-insensitive passwords in these accounts remain case insensitive until the user changes his or her password. If the account was granted SYSDBA or SYSOPER privileges, it is imported to the password file. When a password from a user account from the previous release is changed, it then becomes case sensitive.

You can find users who have case sensitive or case insensitive passwords by querying the DBA_USERS view. The PASSWORD_VERSIONS column in this view indicates the release in which the password was created. For example:

SELECT USERNAME,PASSWORD_VERSIONS FROM DBA_USERS;

USERNAME PASSWORD_VERSIONS
------------------------------ -----------------
JONES 10G 11G
ADAMS 10G 11G
CLARK 10G 11G
PRESTON 11G
BLAKE 10G

The passwords for accounts jones, adams, and clark were originally created in Release 10g and then reset in Release 11g. Their passwords, assuming case sensitivity has been enabled, are now case sensitive, as is the password for preston. However, the account for blake is still using the Release 10g standard, so it is case insensitive. Ask him to reset his password so that it will be case sensitive, and therefore more secure.
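
For example, once case sensitivity is enabled, simply resetting the password is enough to move an old account to the case-sensitive 11G password version (the new password below is illustrative only):

ALTER USER blake IDENTIFIED BY NewPassword5;

After this, the PASSWORD_VERSIONS column for BLAKE will include 11G, and the password must be entered in its exact case from then on.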


How Case Sensitivity Affects Password Files

You can enable or disable case sensitivity for password files by using the ignorecase argument in the ORAPWD command line utility. The default value for ignorecase is n (no), which enforces case sensitivity.

The example below shows how to enable case sensitivity in password files.

orapwd file=orapw entries=100 ignorecase=n
Enter password for SYS: password

This creates a password file called orapw. Because ignorecase is set to n (no), the password entered for the password parameter will be case sensitive. Afterwards, if you connect using this password, it succeeds -- as long as you enter it using the exact case sensitivity in which it was created. If you enter the same password but with different case sensitivity, it will fail.

If you set ignorecase to y, then the passwords in the password file are case insensitive, which means that you can enter the password using any capitalization that you want.
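
For comparison, here is the same utility invoked with ignorecase set to y, producing a case-insensitive password file:

orapwd file=orapw entries=100 ignorecase=y
Enter password for SYS: password

Because ignorecase is y, this SYS password can later be entered with any capitalization.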

If you imported user accounts from a previous release and these accounts were created with SYSDBA or SYSOPER privileges, then they will be included in the password file. The passwords for these accounts are case insensitive. The next time these users change their passwords, and assuming case sensitivity is enabled, the passwords become case sensitive. For greater security, have these users change their passwords.

How Case Sensitivity Affects Accounts Created for Database Link Connections

When you create a database link connection, you need to define a user name and password for the connection. When you create the database link connection, the password is case sensitive. How this user enters his or her password for connections depends on the release in which the database link was created:

Before a user can connect from a pre-Release 11g database to a Release 11g database, and assuming that case sensitivity is enabled, you must re-create the password for this database link using all uppercase letters.

The reason you need to re-create the password using all uppercase letters is so that it will match how Oracle Database stores database link passwords. Oracle Database always stores this type of password in uppercase letters, even if the password had originally been created using lower or mixed case letters. If case sensitivity is disabled, the user can enter the password using the case the password was created in.
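
As an illustration only (the link name, user, and connect string below are hypothetical), re-creating such a database link with an all-uppercase password might look like this:

DROP DATABASE LINK remote_11g;

CREATE DATABASE LINK remote_11g
CONNECT TO dblink_user IDENTIFIED BY "SECRETPASSWORD"
USING 'remote_tns';

The quoted, all-uppercase password matches the uppercase form in which the pre-Release 11g database stores database link passwords.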

If the user is connecting from a Release 11g database to another Release 11g database, he or she must enter the password using the case in which it was created, assuming that case sensitivity is enabled.

If the user is connecting from a Release 11g database to a pre-Release 11g database, he or she can enter his or her password using any case, because the password is still case insensitive.

In other words, any time a user connects to a Release 11g database from a database link, he or she needs to enter the password in its exact case.

Optimizer v/s Data

In this blog, we discuss an issue that is commonly faced but goes unnoticed: data entered into the system, possibly because of an application bug, misleading the optimizer. Hence the title of this blog, "Optimizer v/s Data".

Let us visit the query; only the part worth discussing here is pasted below.

and nsup_date_time between to_date(to_char(sysdate,'DD-MON-YYYY')||' '||'00:00:00','DD-MON-YYYY HH24:MI:SS') and to_date(to_char(sysdate,'DD-MON-YYYY')||' '||'23:59:59','DD-MON-YYYY HH24:MI:SS')

This is an indexed column on the mt.m_nagn table, and both the original query and the modified query with an ORDERED hint use this index. Out of curiosity, and to simplify the query, I modified this condition to:

nsup_date_time between trunc(sysdate) and trunc(sysdate+1)-1/86400;

Surprisingly, after this change the query plan changed: the optimizer ceased to use this index and opted for a full table scan. This significantly increased the elapsed time (from less than a second to 10 minutes).

Why did the optimizer behave abnormally after this small change? Neither version alters the amount of data to be fetched, as both request data for a single day. Could it be an optimizer bug?

So, for testing purposes, I reduced the query to a simple single-table query with a single predicate.

explain plan for
select * from mt.m_nagn
where nsup_date_time between to_date(to_char(sysdate,'DD-MON-YYYY')||' '||'00:00:00','DD-MON-YYYY HH24:MI:SS')
and to_date(to_char(sysdate,'DD-MON-YYYY')||' '||'23:59:59','DD-MON-YYYY HH24:MI:SS');

Explained.

SQL> @utlxpls

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------------------------------------------------------
 Id   Operation                       Name         Rows   Bytes  Cost
----------------------------------------------------------------------
  0   SELECT STATEMENT                             28401  2523K  10507
* 1    FILTER
  2     TABLE ACCESS BY INDEX ROWID   M_NAGN       28401  2523K  10507
* 3      INDEX RANGE SCAN             M_NAGN_MK2   51123          149
----------------------------------------------------------------------

explain plan for
select * from mt.m_nagn
where nsup_date_time between trunc(sysdate) and trunc(sysdate+1)-1/86400;

Explained.

SQL> @utlxpls

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------
 Id   Operation            Name     Rows  Bytes  Cost
-------------------------------------------------------
  0   SELECT STATEMENT              435K  37M    14616
* 1    FILTER
* 2     TABLE ACCESS FULL  M_NAGN   435K  37M    14616
-------------------------------------------------------

Since the table was re-analyzed by the time of writing this blog, its statistics changed. Hence, the cost and cardinality of the index / full table scan steps will not match those in my previous blog.

Comparing the two plans, the query with the original predicate does an index scan whereas the modified one goes for a full table scan.

As mentioned in my previous blog, the cardinality calculation for the original predicate is:
num_rows * 0.0025 = 11360572 * 0.0025 = 28401.43, rounded to 28401 (0.0025 being the default 5% x 5% selectivity for a BETWEEN whose two bounds the optimizer cannot evaluate). This matches the cardinality of the TABLE ACCESS BY INDEX ROWID step. The optimizer assumes that 28401 rows will be fetched and hence opts for an index scan.
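
A quick arithmetic check on dual (just a calculation; the optimizer does not run this) reproduces that estimate:

select round(11360572 * 0.05 * 0.05) estimated_card from dual;

ESTIMATED_CARD
--------------
         28401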

With the modified query, based on the predicate, the optimizer assumes that 435K rows will be fetched and hence considers a full table scan a better and more viable option than an index scan. The calculation made by the optimizer is as follows:

nsup_date_time between trunc(sysdate) and trunc(sysdate+1)-1/86400.

This can be written as :
nsup_date_time >= trunc(sysdate) and nsup_date_time <= trunc(sysdate+1)-1/86400.

Now, since trunc(sysdate+1)-1/86400 is treated like a bind variable, the selectivity for that side of the predicate is 5%. For trunc(sysdate), the calculation takes the low_value and high_value into account. This is interesting: a query against dba_tab_columns reveals some interesting facts about the data.

select owner, num_rows, blocks from dba_tables
where table_name='M_NAGN';

OWNER NUM_ROWS BLOCKS
--------- --------------------- ----------
MT 11360572 151955

select num_distinct, num_nulls, density, get_date(low_value) "LOW",
get_date(high_value) "HIGH" from dba_tab_columns
where table_name='M_NAGN'
and column_name='NSUP_DATE_TIME';

NUM_DISTINCT NUM_NULLS DENSITY LOW HIGH
------------ ---------- ---------- ----------- -----------
76637 0 .000013049 16-JAN-2003 03-APR-2024

The high_value for the column is '03-APR-2024'. There were around 7 rows with this date; all the remaining values were from the current year (in fact, the current month). Hence, the calculation is:

SQL> select (to_date('03-Apr-2024 14:00:00','DD-Mon-YYYY HH24:MI:SS')-sysdate) "Req.Range" from dual;

Req.Range
----------
5937.938

SQL> select (to_date('03-Apr-2024 14:00:00','DD-Mon-YYYY HH24:MI:SS') - to_date('16-Jan-2003 22:00:00','DD-Mon-YYYY HH24:MI:SS')) "Available Range" from dual;

Available Range
---------------
7747.66667

Data to be fetched is 5938 days out of 7748 available days, so the estimated cardinality is
= (5938/7748 + 1/ndv) * num_rows = (5938/7748 + 1/76637) * 11360572 = 8706792.08
Total cardinality is 8706792.08 * 5/100 = 435339.604, i.e., 435K.
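
The same figures can be reproduced with a quick calculation on dual (pure arithmetic, shown only to verify the reasoning above):

select round( (5938/7748 + 1/76637) * 11360572 ) range_card,
       round( (5938/7748 + 1/76637) * 11360572 * 0.05 ) final_card
from dual;

This returns approximately 8706788 and 435339; the small difference from the figures above comes from using whole days rather than the fractional day counts.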

It can be seen that a single wrong value in a column can change the optimizer's cardinality calculation and thus impact performance. The entry of this future date caused the behaviour, and it could have been rectified by either correcting the entry or leaving it as NULL.

There are two fixes for this issue:

1. An index hint. This would have improved the performance of the query, with drastically lower logical I/Os, but it would have come across as a BUG in the optimizer.

2. Update the data to reflect the correct date, or leave these entries as NULL (provided the column is not defined as NOT NULL).

For me, it is an application BUG in the code.

Again, I will reiterate that the optimizer is a piece of code and has to work on the inputs provided to it. If the underlying input has an issue, the output returned by the optimizer will have an issue too. Hence, it is always important to know the underlying data; with that knowledge, query performance can be improved by either using a HINT or fixing the data.

Wednesday, April 9, 2008

Virtual Indexes in Oracle

Virtual indexes are another undocumented feature of Oracle. Virtual indexes, as the name suggests, are pseudo-indexes that do not behave the same way normal indexes do, and they are meant for a very specific purpose.


A virtual index is created in a slightly different manner than normal indexes. A virtual index has no segment pegged to it, i.e., the DBA_SEGMENTS view will not show an entry for it. Oracle handles such indexes internally, and a few required dictionary tables are updated so that the optimizer is made aware of their presence and can generate execution plans that consider them.

As per Oracle, this functionality is not intended for standalone usage. It is part of the Oracle Enterprise Manager Tuning Pack (Virtual Index Wizard). The virtual index wizard functionality allows the user to test a potential new index prior to actually building the new index in the database. It allows the CBO to evaluate the potential new index for a selected SQL statement by building an explain plan that is aware of the potential new index. This allows the user to determine if the optimizer would use the index, once implemented.

Therefore, the feature is meant to be driven from Enterprise Manager, not used standalone. I went a bit further and actually tested it using SQL*Plus, basically trying to use the same feature but without Enterprise Manager.

I do not see much use of Virtual Indexes in a development area where we can create and drop indexes while testing. However, this feature could prove handy if a query or group of queries have to be tested in production (for want of simulation or urgency!), to determine if a new index will improve the performance, without impacting existing or new sessions.

Below are some attributes of the Virtual Indexes.

1. These are permanent and continue to exist unless we drop them.

2. Their creation will not affect existing and new sessions. Only sessions marked for Virtual Index usage will become aware of their existence.

3. Such indexes will be used only when the hidden parameter "_use_nosegment_indexes" is set to true.

4. The rule-based optimizer did not recognize virtual indexes when I tested; the CBO, however, does recognize them. In all of my examples I have used the CBO. However, I did not carry out intensive testing with the RBO, and you may come across exceptions to this observation.

5. The dictionary view DBA_SEGMENTS will not show an entry for virtual indexes. DBA_INDEXES and DBA_OBJECTS do have an entry for them in Oracle 8i; from Oracle 9i onwards, DBA_INDEXES no longer shows virtual indexes.

6. Virtual Indexes cannot be altered and throw a "fake index" error!

7. Virtual Indexes can be analyzed, using the ANALYZE command or DBMS_STATS package, but the statistics cannot be viewed (in Oracle 8i, DBA_INDEXES will not show this either).

Oracle may be generating artificial statistics and storing them somewhere for later reference.

Creating Virtual Index

Creating a Virtual Index can be achieved by using the NOSEGMENT clause with the CREATE INDEX command.

e.g.:
SQL> create unique index am304_u1 on am304(col2) nosegment;

Index created.

Parameter _USE_NOSEGMENT_INDEXES

This is a hidden/internal parameter and therefore undocumented. Such parameters should not be altered for Oracle databases unless Oracle Support either advises or recommends that you do so. In our case, we make an exception (!), but only to be set at session level. Do not set it for the complete instance.
Setting the "_use_nosegment_indexes" parameter enables the optimizer to use virtual indexes.

Examples:

Creating the virtual index:

SQL> create index am301_n1 on am301(col1) nosegment;

Index created.

Checking some dictionary tables:

SQL> select segment_name, segment_type, bytes from dba_segments where segment_name = 'AM301_N1';

no rows selected

SQL> select object_name, object_type, status from dba_objects where object_name = 'AM301_N1';

OBJECT_NAME OBJECT_TYPE STATUS
--------------------------------------------------
AM301_N1 INDEX VALID

SQL> select index_name, index_type, status from dba_indexes where index_name = 'AM301_N1';

INDEX_NAME INDEX_TYPE STATUS
---------------------------------------------------------
AM301_N1 NORMAL VALID

Working with the Virtual indexes:

SQL> create table am301
  2  (col1 number, col2 varchar2(20));
Table created.

SQL> insert into am301 values(dbms_random.random, dbms_random.string('A', 20));
1 row created.

SQL> insert into am301 values(dbms_random.random, dbms_random.string('A', 20));
1 row created.

SQL> insert into am301 values(dbms_random.random, dbms_random.string('A', 20));
1 row created.

SQL> insert into am301 values(dbms_random.random, dbms_random.string('A', 20));
1 row created.

SQL> select * from am301;

COL1 COL2
---------- ---------------------
512599724 aCR_PdFVdSGJLCOLCjJQ
-2.049E+09 qiVUlonc^p^R_X_W_^Xn-
189009085 prTNPqPUod^miAnLXrMA
2082093299 Bq_icbmcpNFNUKDRdMi]

-- Though inserting alpha-numeric strings, Oracle also inserted
-- special characters in col2. This is a bug and is
-- resolved in 9i+.

SQL> insert into am301 select * from am301;
4 rows created.

SQL> insert into am301 select * from am301;
8 rows created.

SQL> insert into am301 select * from am301;
16 rows created.

SQL> alter session set optimizer_mode=first_rows;
Session altered.

SQL> create index am301_n1 on am301(col1) nosegment;
Index created.

SQL> analyze table am301 compute statistics;
Table analyzed.

SQL> analyze index am301_n1 compute statistics;
Index analyzed.

-- It is recommended to use the dbms_stats package to
-- generate statistics. Analyze is used here as an
-- example only.

SQL> set autotrace on
SQL> select * from am301 where col1 = 2082093299;
COL1 COL2
---------- --------------------
2082093299 Bq_icbmcpNFNUKDRdMi]
2082093299 Bq_icbmcpNFNUKDRdMi]..
32 rows selected.

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=FIRST_ROWS (Cost=1 Card=32 Bytes=864)
1 0 TABLE ACCESS (FULL) OF 'AM301' (Cost=1 Card=32 Bytes=864)

SQL> alter session set "_use_nosegment_indexes" = true; --set only for the session testing it.
Session altered.

SQL> select * from am301 where col1 = 2082093299;
COL1 COL2
---------- --------------------
2082093299 Bq_icbmcpNFNUKDRdMi]
2082093299 Bq_icbmcpNFNUKDRdMi]
32 rows selected.

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=FIRST_ROWS (Cost=207 Card=32 Bytes=864)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'AM301' (Cost=207 Card=32 Bytes=864)
2 1 INDEX (RANGE SCAN) OF 'AM301_N1' (NON-UNIQUE) (Cost=7 Card=32)

Trying to alter the virtual index:

SQL> alter index am301_n1 rebuild;
alter index am301_n1 rebuild
*
ERROR at line 1:
ORA-08114: can not alter a fake index

SQL> alter index am301_n1 storage(pctincrease 0);
alter index am301_n1 storage(pctincrease 0)
*
ERROR at line 1:
ORA-08114: can not alter a fake index

Trying to re-create the index as a real one:

SQL> create index am301_n1 on am301(col1);
create index am301_n1 on am301(col1)
*
ERROR at line 1:
ORA-00955: name is already used by an existing object

As the Virtual Index has an entry in some of the dictionary tables, it will prevent the creation of an object with the same name. The alternative is to drop and recreate the Virtual Index as a real index.

Dropping and re-creating again as real index:

SQL> drop index am301_n1;
Index dropped.

SQL> create index am301_n1 on am301(col1);
Index created.

However, a Virtual Index will not prevent the creation of an index with the same column(s).

In the example below, a Virtual Index is created with name DUMMY, afterwards a new index with a different name is created with the same column and structure. Both of the indexes will show in the DBA_OBJECTS listing.

SQL> create index dummy on am310(col1, col2, col3) nosegment;
Index created.

SQL> create index am310_n1 on am310(col1, col2, col3);
Index created.

Conclusion

As I mentioned earlier, this is undocumented, so use it at your own risk. The above feature may not be a must-use option, but is a good-to-know fact.
Drop the index once you are done with it, without fail! Its presence can baffle some of the regular scripts that are run to monitor the databases.

Tuesday, April 1, 2008

Types of SQL Statements

Oracle optimizes the following types of SQL statements:


Simple statement

An INSERT, UPDATE, DELETE, or SELECT statement that involves only a single table.

Simple query

Another name for a SELECT statement.

Join

A query that selects data from more than one table. A join is characterized by multiple tables in the FROM clause. Oracle pairs the rows from these tables using the condition specified in the WHERE clause and returns the resulting rows. This condition is called the join condition and usually compares columns of all the joined tables.

Equijoin

A join condition containing an equality operator.

Non-equijoin

A join condition containing something other than an equality operator.
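
For instance, using the scott demo tables, the first join below is an equijoin and the second (against the classic SALGRADE table) is a non-equijoin:

-- equijoin: equality operator in the join condition
select e.ename, d.dname
from emp e, dept d
where e.deptno = d.deptno;

-- non-equijoin: range operators in the join condition
select e.ename, s.grade
from emp e, salgrade s
where e.sal between s.losal and s.hisal;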

Outer join

A join condition using the outer join operator (+) with one or more columns of one of the tables. Oracle returns all rows that meet the join condition. Oracle also returns all rows from the table without the outer join operator for which there are no matching rows in the table with the outer join operator.
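
A short sketch with the scott tables: the (+) operator sits on the EMP side, so every DEPT row is returned even when it has no matching employees:

select d.dname, e.ename
from dept d, emp e
where d.deptno = e.deptno(+);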

Cartesian product

A join with no join condition results in a Cartesian product, or a cross product. A Cartesian product is the set of all possible combinations of rows drawn one from each table. In other words, for a join of two tables, each row in one table is matched in turn with every row in the other. A Cartesian product for more than two tables is the result of pairing each row of one table with every row of the Cartesian product of the remaining tables.

All other kinds of joins are subsets of Cartesian products effectively created by deriving the Cartesian product and then excluding rows that fail the join condition.
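
For example, with the scott tables, EMP (14 rows) joined to DEPT (4 rows) with no join condition produces all 14 x 4 = 56 row combinations:

select e.ename, d.dname
from emp e, dept d;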

Complex statement

An INSERT, UPDATE, DELETE, or SELECT statement that contains a subquery, which is a form of the SELECT statement within another statement that produces a set of values for further processing within the statement. The outer portion of the complex statement that contains a subquery is called the parent statement.
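
For example, in the following complex statement the inner SELECT is the subquery and the outer DELETE is the parent statement:

delete from emp
where deptno in ( select deptno
                  from dept
                  where dname = 'SALES' );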

Compound query

A query that uses set operators (UNION, UNION ALL, INTERSECT, or MINUS) to combine two or more simple or complex statements. Each simple or complex statement in a compound query is called a component query.
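
For example, the following compound query combines two component queries with UNION:

select ename from emp where deptno = 10
union
select ename from emp where sal > 2500;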

Statement accessing views

Simple, join, complex, or compound statement that accesses one or more views as well as tables.

Distributed statement

A statement that accesses data on two or more distinct nodes of a distributed database. A remote statement accesses data on one remote node of a distributed database.

Optimizer Operations

For any SQL statement processed by Oracle, the optimizer does the following:


Evaluation of expressions and conditions

The optimizer first evaluates expressions and conditions containing constants as fully as possible.

Statement transformation

For complex statements involving, for example, correlated subqueries, the optimizer may transform the original statement into an equivalent join statement.

View merging

For SQL statements that access a view, the optimizer often merges the query in the statement with that in the view, and then optimizes the result.

Choice of optimizer approaches

The optimizer chooses either a cost-based or rule-based approach and determines the goal of optimization.

Choice of access paths

For each table accessed by the statement, the optimizer chooses one or more of the available access paths to obtain the table's data.

Choice of join orders

For a join statement that joins more than two tables, the optimizer chooses which pair of tables is joined first, and then which table is joined to the result, and so on.

Choice of join operations

For any join statement, the optimizer chooses an operation to use to perform the join.


Monday, March 31, 2008

Partition Maintenance (By Tom Kyte)

Starting in Oracle9i, another option was added to partition maintenance: the ability to maintain the global indexes during the partition operation using the UPDATE GLOBAL INDEXES clause. This means that as you drop a partition, split a partition, or perform whatever operation necessary on a partition, Oracle will perform the necessary modifications to the global index to keep it up to date. Since most partition operations would otherwise cause global index invalidation, this feature can be a boon to systems that need to provide continual access to the data. You'll find that you sacrifice the raw speed of the partition operation, with its associated window of unavailability immediately afterward as you rebuild indexes, for a slower overall response time from the partition operation but coupled with 100 percent data availability. In short, if you have a data warehouse that cannot have downtime, but must support these common data warehouse techniques of sliding data in and out, then this feature is for you -- but you must understand the implications.

Revisiting our previous example, if our partition operations had used the UPDATE GLOBAL INDEXES clause when relevant (in this example, it would not be needed on the ADD PARTITION statement since the newly added partition would not have any rows in it)

ops$tkyte@ORA10G> alter table partitioned
2 exchange partition fy_2004
3 with table fy_2004
4 including indexes
5 without validation
6 UPDATE GLOBAL INDEXES
7 /
Table altered.

ops$tkyte@ORA10G> alter table partitioned
2 drop partition fy_2004
3 UPDATE GLOBAL INDEXES
4 /
Table altered.

ops$tkyte@ORA10G> alter table partitioned
2 add partition fy_2006
3 values less than ( to_date('01-jan-2007','dd-mon-yyyy') )
4 /
Table altered.

ops$tkyte@ORA10G> alter table partitioned
2 exchange partition fy_2006
3 with table fy_2006
4 including indexes
5 without validation
6 UPDATE GLOBAL INDEXES
7 /
Table altered.

we would have discovered the indexes to be perfectly valid and usable both during and after the operation:

ops$tkyte@ORA10G> select index_name, status from user_indexes;

INDEX_NAME STATUS
------------------------------ --------
FY_2006_IDX VALID
FY_2004_IDX VALID
PARTITIONED_IDX_GLOBAL VALID
PARTITIONED_IDX_LOCAL N/A

6 rows selected.

ops$tkyte@ORA10G> set autotrace on explain
ops$tkyte@ORA10G> select count(*)
2 from partitioned
3 where timestamp between sysdate-50 and sysdate;

COUNT(*)
----------
6750

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=9 Card=1 Bytes=9)
1 0 SORT (AGGREGATE)
2 1 FILTER
3 2 INDEX (RANGE SCAN) OF 'PARTITIONED_IDX_GLOBAL' (INDEX) (Cost=9...

But there is a tradeoff: we are performing the logical equivalent of INSERT and DELETE operations on the global index structures. When we drop a partition, we have to delete all of the global index entries that might be pointing to that partition. When we did the exchange of a table with a partition, we had to delete all of the global index entries pointing to the original data and then insert all of the new ones that we just slid in there. So the amount of work performed by the ALTER commands was significantly increased.

In fact, using runstats and a slightly modified version of the preceding example, we can measure the amount of "extra" work performed to maintain the global indexes during the partition operation. We'll slide out FY_2004 and slide in FY_2006 as before, adding in the requisite index rebuild. This will be the sliding window implementation that causes the data to become unavailable due to the need to rebuild the global indexes. We'll also then slide out FY_2005 and slide in FY_2007 using the UPDATE GLOBAL INDEXES clause, to emulate the sliding window implementation that provides for full data availability, even during the partition operations. In that manner, we can measure the same operations using the different techniques and compare them. Our expected outcome will be that the first approach will consume fewer database resources and therefore perform "faster" but incur a measurable period of "downtime." The second approach, while consuming more resources and perhaps taking longer overall, will not incur downtime. As far as the end users are concerned, their ability to work never ceased. They might have been processing a bit slower (since we were competing with them for resources), but they were still processing, and they never stopped.

So, if we take the earlier example, but also create an empty FY_2005 table like FY_2004, and a full FY_2007 table like FY_2006, we can measure the differences between the index rebuild approaches, starting first with the "less available approach":

exec runStats_pkg.rs_start;

alter table partitioned exchange partition fy_2004
with table fy_2004 including indexes without validation;

alter table partitioned drop partition fy_2004;

alter table partitioned add partition fy_2006
values less than ( to_date('01-jan-2007','dd-mon-yyyy') );

alter table partitioned exchange partition fy_2006
with table fy_2006 including indexes without validation;

alter index partitioned_idx_global rebuild;

exec runStats_pkg.rs_middle;

Here is the highly available UPDATE GLOBAL INDEXES approach:

alter table partitioned exchange partition fy_2005
with table fy_2005 including indexes without validation
update global indexes;

alter table partitioned drop partition fy_2005
update global indexes;

alter table partitioned add partition fy_2007
values less than ( to_date('01-jan-2008','dd-mon-yyyy') );

alter table partitioned exchange partition fy_2007
with table fy_2007 including indexes without validation
update global indexes;

exec runStats_pkg.rs_stop;

We might observe the following:
ops$tkyte@ORA10G> exec runStats_pkg.rs_stop;
Run1 ran in 81 hsecs
Run2 ran in 94 hsecs
run 1 ran in 86.17% of the time

Name Run1 Run2 Diff
...
STAT...CPU used when call star 39 59 20
...
STAT...redo entries 938 3,340 2,402
STAT...db block gets 1,348 5,441 4,093
STAT...session logical reads 2,178 6,455 4,277
...
LATCH.cache buffers chains 5,675 27,695 22,020
...
STAT...table scan rows gotten 97,711 131,427 33,716
STAT...undo change vector size 35,100 3,404,056 3,368,956
STAT...redo size 2,694,172 6,197,988 3,503,816

The index rebuild approach did run faster, both as observed by the elapsed time and the CPU time. This fact has caused many a DBA to pause and say, "Hey, I don't want to use UPDATE GLOBAL INDEXES -- it's slower." That is too simplistic of a view, however. What you need to remember is that while the operations overall took longer, processing on your system was not necessarily interrupted. Sure, you as the DBA might be looking at your screen for a longer period of time, but the really important work that takes place on your system was still taking place. What you need to do is see if this tradeoff makes sense for you. If you have an eight-hour maintenance window overnight in which to load new data, then by all means, use the rebuild approach if that makes sense. However, if you have a mandate to be available continuously, then the ability to maintain the global indexes will be crucial.

Looking at the redo generated by each approach, we can see that UPDATE GLOBAL INDEXES generated considerably more -- over 230 percent more -- and we would expect that to only go up as we add more and more global indexes to the table. The redo generated by UPDATE GLOBAL INDEXES is unavoidable and cannot be turned off via NOLOGGING, since the maintenance of the global indexes is not a complete rebuild of their structure but more of an incremental "maintenance." Additionally, since we are maintaining the live index structure, we must generate undo for that -- in the event the partition operation fails, we must be prepared to put the index back the way it was. And remember, undo is protected by redo itself, so some of the redo you see generated is from the index updates and some is from the rollback. Add another global index or two and you would reasonably expect these numbers to increase.

So, UPDATE GLOBAL INDEXES is an option that allows you to trade off availability for resource consumption. If you have the need to provide continuous availability, it will be the option for you. But you will have to understand the ramifications and size other components of your system appropriately. Specifically, many data warehouses have been crafted over time to use bulk, direct path operations, bypassing undo generation and, when permitted, redo generation as well. Using UPDATE GLOBAL INDEXES cannot bypass either of those two elements. You¿ll need to examine the rules you used to size these components before using this feature, so you can assure yourself it can work on your system.