EMS logo


Choisissez votre outil SQL

Notre statut de partenariat

Microsoft Certified Partner
Oracle Certified Partner
Embarcadero Technology Partner

Articles sur SQL

Tous les articles SQL

Arup Nanda
Oracle Database Partitions - Partition Virtually

Save space by deactivating index partitions, and use virtual columns to partition more logically.

Kelly throws up her hands in disgust. As the chief application architect at Acme Insurance, she is visibly frustrated with the database problems she is facing. For years Acme used the Social Security number (SSN) as a unique customer identifier and until recently had no problems at all, given that it is a U.S.-based company and every customer has an SSN. But with recent changes in the company’s sales model, a single person can now be identified as multiple customers because of various types of products (or insurance plans) the person can buy. In addition, a person can now buy plans for a spouse and children, thereby associating multiple customers with a single SSN. Clearly, Kelly needs to replace the SSN as a unique identifier.

To accommodate the new sales model, Kelly has considered creating a new customer identifier: CUST_ID. This new identifier would concatenate PLAN_ID, SSN, and a code for the person covered (such as self, spouse, or child). This idea solves the uniqueness problem and was therefore approved by the business leaders, but it also means significant rework. All of the tables need to reference the new CUST_ID column, and all SQL statements using that table need to be modified. With thousands of applications, it will likely take several months to complete, likely meaning a missed deadline for the new sales model launch. They need a new approach—one that will be both quick and reliable.

Another complication is that many of the tables that reference this new column are partitioned. Because the new column will often be used as a foreign key, Kelly assumes that this column will need to be added to all child tables as well. To further complicate matters, the modelers also want to repartition the CUSTOMERS table with a new column called CATEGORY, whose value depends on the person insured (such as self or child) and the plan ID.

As Kelly presents these issues to Jill, the DBA in charge of the database, they also discuss one more problem: space. In addition to making the CUST_ID change, the business owners have decided to increase the retention period of data warehouse information and to pump more data to other tables. As a result, they estimate that some large tables and indexes might triple their space consumption, and Jill has just gotten the word back from the storage group that no further storage expansion will happen unless the company buys a new storage-area network (SAN)—which won’t occur anytime soon. Jill eyes Kelly incredulously. “How am I supposed to store all this data? I can’t pull space out of thin air!”

In this article, you will learn how John—the lead DBA at Acme—solves these problems by using the tools and technology available in Oracle Database 11g Release 2.

Optimizing Partitions

John first turns his attention to Jill’s storage issue. Jill has done a wonderful job of designing her tables to take full advantage of partitioning features such as partition pruning and faster purge and archiving. (For more information on partitioning features, see my article “More Partitioning Choices,” Oracle Magazine, September/October 2007). However, some of the partitioned tables are still very large, and the space consumed by their indexes makes them even larger. Jill’s earlier usage pattern analysis showed that most of the older partitions are rarely, if ever, accessed in user queries. The indexes are useful, but because older partitions are rarely used, the corresponding index partitions are deadweight. If they can somehow be eliminated, Jill’s space issue will be instantly resolved. But Jill can’t just drop the indexes, because they are in use for the queries on the newer partitions.

Fortunately, John has the answer. With Oracle Database 11g Release 2, they can implement a middle-ground solution to keep only the indexes for the newer partitions. With this release, the database will immediately drop the segment when an index partition is marked unusable, freeing up the space for other uses. John proceeds to explain the concept with a demonstration by using the partitioned SALES table in the SH schema, which is one of the example schemas installed during Oracle Database installation. Jill mentions that at Acme, the partition containing sales data from Q1 1999 isn’t used much, so John starts his analysis with the Q1 1999 SALES partition in the sample SH schema.

First John determines the query plan against that partition (see Listing 1 for details). He starts by issuing SET AUTOTRACE ON EXPLAIN, so that output will show the plan in addition to the query results. He then issues a query that selects the data from the Q1 1999 partition only. John notes that he does so by selecting dates in the predicate that fall within the range for that partition. The plan clearly shows the partitions of the indexes—SALES_CUST_BIX and SALES_TIME_BIX—the optimizer has chosen, as reflected in the Pstart and Pstop columns. The value 9 in both columns indicates that the optimizer has performed partition pruning and accessed only partition #9, rather than the entire index.

Code Listing 1: Execution plan before index partition drop

set autot on explain
select count(1)
from sales
where cust_id = 611
and time_id between to_date('01-jan-1999', 'dd-mon-yyyy') and to_date('01-mar-1999', 'dd-mon-yyyy')

Execution Plan
| Id|Operation                      |Name           |Rows|Bytes |Cost|Time    |Pstart|Pstop|
|  0|SELECT STATEMENT               |               |   1|    13|5(0)|00:00:01|      |     |
|  1| SORT AGGREGATE                |               |   1|    13|    |        |      |     |
|  2|  PARTITION RANGE SINGLE       |               |  19|   247|5(0)|00:00:01|     9|    9|
|  3|   BITMAP CONVERSION COUNT     |               |  19|   247|5(0)|00:00:01|      |     |
|  4|    BITMAP AND                 |               |    |      |    |        |      |     |
|* 5|     BITMAP INDEX SINGLE VALUE |SALES_CUST_BIX |    |      |    |        |     9|    9|
|  6|     BITMAP MERGE              |               |    |      |    |        |      |     |
|* 7|      BITMAP INDEX RANGE SCAN  |SALES_TIME_BIX |    |      |    |        |     9|    9|

Predicate Information (identified by operation id):
   5 - access("CUST_ID"=611)
   7 - filter("CUST_ID"=611 AND "TIME_ID"<=TO_DATE(' 1999-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND "TIME_ID"<=TO_DATE(' 1999-03-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))

Next John checks for all the segments of the specific partition relating to Q1 1999. The partitions are named in the format SALES_Q<quarter>_<4-digit year>, so he issues the following query:

select segment_name, bytes
from user_segments
where segment_name like 'SALES%'
and partition_name = 'SALES_Q1_1999'

and gets the following output:

-------------------   ---------
SALES                  2097152
SALES_CHANNEL_BIX        65536
SALES_CUST_BIX          393216
SALES_PROD_BIX           65536
SALES_PROMO_BIX          65536
SALES_TIME_BIX           65536

John draws everyone’s attention to the presence of a segment called SALES_CUST_BIX. He then makes the SALES_Q1_1999 partition of the SALES_CUST_BIX index inactive, or unusable, by issuing the statement

alter index sales_cust_bix
modify partition sales_q1_1999 unusable;

After he makes the index partition unusable, John runs the previous query once again:

select segment_name, bytes
from user_segments
where segment_name like 'SALES%'
and partition_name = 'SALES_Q1_1999'

and gets the following output:

-------------------   ---------
SALES                  2097152
SALES_CHANNEL_BIX        65536
SALES_PROD_BIX           65536
SALES_PROMO_BIX          65536
SALES_TIME_BIX           65536

The segment SALES_CUST_BIX is gone. When the index partition was made unusable, Oracle Database dropped the segment as well. This drop released the space that was used by the segment—393,216 bytes—to the tablespace to be used for other purposes. The space released may not seem significant, but this is merely an example. In real-world tables, index partition segments are much larger, John explains, and hence the space savings will be substantial. When the space savings are realized over many index partitions, the savings will be even more significant.

Jill wonders about the other partitions of the index. “Will they be affected in any manner?” she asks. John assures her that they won’t. He issues the following query to check on the other partitions of the index:

select partition_name, status,
from user_ind_partitions
where index_name = 'SALES_CUST_BIX'

and gets the following output:

---------------   --------    -------
SALES_1995         USABLE       YES
SALES_1996         USABLE       YES
SALES_H1_1997      USABLE       YES
SALES_H2_1997      USABLE       YES
SALES_Q1_1998      USABLE       YES
SALES_Q1_1999      UNUSABLE     NO
SALES_Q1_2000      USABLE       YES
. . . and so on . . .

The status of the partition SALES_Q1_1999 is shown as UNUSABLE. The column SEGMENT_CREATED shows whether a segment exists for that partition. In the case of the inactivated partition, the segment was dropped. John stresses that the partition information still remains in the data dictionary; only the storage portion is gone. As a result, the database is able to free up unused space, but it also preserves the metadata about the partition so that it can be queried (and possibly reactivated) later.

“It makes perfect sense,” remarks Jill. “An unusable index partition isn’t updated by DML [data manipulation language], so there’s no reason to keep the segment around. Oracle Database drops it and gives the space back for others. But won’t it cause queries against that unusable partition to fail?”

John reassures Jill that it won’t. To demonstrate, he redisplays the execution plan of the previously issued query against the partition (as shown in Listing 2). The query works fine, but it now uses a full table scan instead of an index scan. That was to be expected, notes John, because the corresponding index partition is unusable and cannot be used by the optimizer.

Code Listing 2: Execution plan for a query against a dropped partition

select count(1)
from sales
where cust_id = 611
and time_id between to_date('01-jan-1999', 'dd-mon-yyyy') and to_date('01-mar-1999', 'dd-mon-yyyy')

Execution Plan
Plan hash value: 642363238
| Id|Operation                 |Name   |Rows|Bytes|Cost |Time    |Pstart|Pstop|
|  0|SELECT STATEMENT          |       |  1 |  13 |36(3)|00:00:01|      |     |
|  1| SORT AGGREGATE           |       |  1 |  13 |     |        |      |     |
|  2|  PARTITION RANGE SINGLE  |       |  19|  247|36(3)|00:00:01|     9|    9|
|* 3|   TABLE ACCESS FULL      |SALES  |  19|  247|36(3)|00:00:01|     9|    9|

Predicate Information (identified by operation id):
   3 - filter("CUST_ID"=611 AND "TIME_ID"<=TO_DATE(' 1999-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND "TIME_ID"<=TO_DATE(' 1999-03-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))

Jill appears concerned. “Well,” she muses, “all the queries against that table—including those against active partitions—will use full table scans as well. That’s not acceptable.”

“Not true,” explains John. The queries against other active partitions will continue to use index scans. To demonstrate, he examines the execution plan of a similar query with different predicates that chooses an active partition—Q1 of 2000—as shown in Listing 3. As the output shows, the optimizer chooses the correct index partitions and does not perform a full table scan. Jill and Kelly are impressed.

Code Listing 3: Execution plan for a query against an active partition

select count(1)
from sales
where cust_id = 611
and time_id between to_date('01-jan-2000', 'dd-mon-yyyy') and to_date('01-mar-2000', 'dd-mon-yyyy')

Execution Plan
Plan hash value: 1549846902
| Id|Operation                      |Name          |Rows|Bytes|Cost|Time    |Pstart|Pstop|
|  0|SELECT STATEMENT               |              |  1 |  13 |5(0)|00:00:01|      |     |
|  1| SORT AGGREGATE                |              |  1 |  13 |    |        |      |     |
|  2|  PARTITION RANGE SINGLE       |              |  21|  273|5(0)|00:00:01|    13|   13|
|  3|   BITMAP CONVERSION COUNT     |              |  21|  273|5(0)|00:00:01|      |     |
|  4|    BITMAP AND                 |              |    |     |    |        |      |     |
|* 5|     BITMAP INDEX SINGLE VALUE |SALES_CUST_BIX|    |     |    |        |    13|   13|
|  6|     BITMAP MERGE              |              |    |     |    |        |      |     |
|* 7|      BITMAP INDEX RANGE SCAN  |SALES_TIME_BIX|    |     |    |        |    13|   13|

Predicate Information (identified by operation id):
   5 - access("CUST_ID"=611)
   7 - access("TIME_ID">=TO_DATE(' 2000-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND "TIME_ID"<=TO_DATE(' 2000-03-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))

With a solution in place, John warns Jill about one caveat. If a table is truncated, the unusable index partition will become usable again and Oracle Database will re-create the segment.

Virtual Columns

With Jill’s issue resolved, John now turns to Kelly. In summary, Kelly wants to add two new columns—CUST_ID and CATEGORY—but doesn’t want to have to update all the application code to reflect those changes. She also wants to make the CUST_ID column the primary key and to use referential partitioning on all the child tables, so that they will inherit the same partitioning scheme as the parent table. Finally, she wants to change the partitioning method from range to list, using the new CATEGORY column.

To solve the issue of application rework due to the two new columns, John points out one of the most powerful yet underutilized features of Oracle Database 11g Release 2: using virtual columns as a primary key and partitioning key. As Listing 4 shows, he starts by creating the CUSTOMER table with the CUST_ID and CATEGORY columns specified as VIRTUAL, which means that the values for these two columns will always be derived during program execution, as opposed to being stored in the database. Because these values are always computed at runtime, application code does not need to populate these columns and therefore does not have to be modified.

Code Listing 4: Customers table with virtual columns

create table customers
    cust_id     number(14)
        generated always as
            DECODE(plan_id, 'MEDICAL', 100, 'DENTAL', 200, 'HOSPITAL ONLY', 300, 999)
                || ssn ||
            DECODE(member_type, 'SELF', '01', 'SPOUSE', '02', 'CHILD', '03', '99')
        ) virtual,
    cust_name       varchar2(20),
    ssn                 varchar(9),
    plan_id           varchar2(15),
    member_type   varchar2(10),
    category         varchar2(1)
        generated always as
            when member_type = 'SELF' then
                 case when plan_id = 'MEDICAL' then 'A' else 'B' end
            when member_type = 'SPOUSE' then
                 case when plan_id = 'MEDICAL' then 'B' else 'C' end
            when member_type = 'CHILD' then 'C' else 'X'
         end) virtual,
        constraint pk_customers primary key (cust_id)
)partition by list (category) 
        partition A values ('A'),
        partition B values ('B'),
        partition C values ('C'),
        partition DEF values (default)
-- Insert some rows
insert into customers (cust_name, ssn, plan_id, member_type) values ('Jill', '123456789',

insert into customers (cust_name, ssn, plan_id, member_type) values ('John', '123456789', 'MEDICAL', 'SPOUSE')
insert into customers (cust_name, ssn, plan_id, member_type) values ('Charlie', '123456789', 'MEDICAL',' CHILD') 

-- select rows from the table to check the virtual columns

select * from customers;

----------      ----------    -------    ---------    -----------    ------
10012345678901  Jill          123456789   MEDICAL     SELF           A
10012345678902  John          123456789   MEDICAL     SPOUSE         B
10012345678903  Charlie       123456789   MEDICAL     CHILD          C

John inserts a few records and selects from them to demonstrate how the columns are automatically calculated by Oracle Database (see Listing 4 for details). To demonstrate that the primary key constraint is actually enforced by Oracle Database even when the column is virtual, he inserts a row that would result in a duplicate value in the CUST_ID column, as shown in Listing 5. The statement fails, as expected. Next, John creates the SALES table, partitioned with the referential integrity constraint shown in Listing 6. Note that even though John has partitioned the SALES table (like the CUSTOMERS table) based on the CATEGORY column, no new CATEGORY column was created in the SALES table. With referential partitioning, it was automatically partitioned to match the partitioning scheme of its parent table (CUSTOMERS), due to the referential integrity constraint. What’s more, this partitioning approach works even when the partitioning column of the parent is virtual.

Code Listing 5: Primary key constraint on a virtual column

SQL> insert into customers (cust_name, ssn, plan_id, member_type)
  2    values ('A', '123456789', 'MEDICAL', 'SELF');  

insert into customers (cust_name, ssn, plan_id, member_type) values ('A', '123456789', 'MEDICAL', 'SELF')
ERROR at line 1:
ORA-00001: unique constraint (ARUP.PK_CUSTOMERS) violated

To validate that the SALES partitioning scheme has actually mapped to that of the CUSTOMERS table, John selects the partitions and their high values for the SALES table, as shown in Listing 6. The query result shows that the partitions are created in exactly the same way as for the parent table, with one exception: the high values are all null. This result is expected: because the partitioning column—CATEGORY—doesn’t actually exist in the SALES table, having actual high values there wouldn’t make any sense.

Code Listing 6: Sales table, referentially partitioned

create table sales
   sales_id       number primary key,
   cust_id        number not null,
   sales_amt    number,
   constraint    fk_sales_01
   foreign key   (cust_id)
      references customers
partition by reference (fk_sales_01)

-- insert some data
insert into sales values (1,10012345678901,100)
insert into sales values (2,10012345678902,200)
insert into sales values (3,10012345678903,50)

--select the partitions
SQL> set null ?
SQL> l
  1  select partition_name, high_value
  2  from user_tab_partitions
  3* where table_name = 'SALES'
SQL> /

---------------      -------
A                    ?
B                    ?
C                    ?
DEF                  ?

Kelly is both impressed and relieved that they won’t have to add code to populate the new column in all the tables. She now asks, “When I want to select rows from a specific partition of a table, I use a query such as SELECT * FROM TableName partition (PartitionName). But I often don’t remember the exact name of the partition, and I don’t even have access to the data dictionary to check on the name. Is there an easier way to select the partition?”

“Yes, there is,” John replies. “Extended partition syntax is what you need.” As Listing 7 shows, Kelly can use this syntax to access any partition by supplying the value it contains, as opposed to its exact name. This approach also makes development easier, because the code doesn’t have to specify partition names in advance. This makes Jill happy, because she can change the partition-naming convention to accommodate more changes without requiring the application code to change.

Code Listing 7: Extended partition syntax

select * from customers partition for ('A')

--------------    -----------   ----------   ---------   ------------   --------
10012345678901    Jill          123456789    MEDICAL     SELF           A


At the end of the day, John sums up how he used the new features of Oracle Database 11g Release 2 to address the challenging needs of Acme Insurance, as shown in Table 1. Kelly and Jill murmur their appreciation at the relative ease of the solution—especially because no changes to application code were needed. Meeting adjourned.

Requirement Addressed by
A new column needs to be added as the primary key and be updated automatically without apps’ specifically referencing it. A virtual column as the primary key constraint
Older partitions are hardly ever used, but the corresponding index partitions consume a lot of space. Making those partitions unusable, enabling Oracle Database to drop the segments and release the disk space

Table 1: Customer requirements and their solutions

Next Steps: READ more about partitions

About the article author: Arup Nanda has been an Oracle DBA for more than 16 years, handling all aspects of database administration.

Source: Oracle Magazine (March/April 2010)