A brief and basic overview of Basic Compression (though Advanced Compression use the same method but also caters for DML)
‘With standard compression Oracle Database compresses data by eliminating duplicate values in a database block. Standard compression only works for direct path operations (CTAS or IAS).
If the data is modified using any kind of conventional DML operation (for example updates), the data within that database block is uncompressed to make the modifications and is written back to disk uncompressed.’
What does eliminate duplicate values really mean?
Within a database block, a structure called a symbol table will a hold lookup data for duplicate values for rows and columns. The database then replaces occurrences of these values with a reference to the symbol table to that lookup structure.
If we take this a set further maybe we could derive candidate tables for Basic compress based on table statistics in dba_tab_columns, we would look for large tables that have columns with a low number of distinct values. Also we could review dba_tab_modidications for confirm if table or partition does have not or little change making it a better candidate for compression.
What is a direct path insert operation?
‘The APPEND
hint instructs the optimizer to use direct-path INSERT
with the subquery syntax of the INSERT
statement.’
For SQL, the direct path insert is enabled using the /*+ APPEND */ hint, as an example:
INSERT /*+ append */ into child_test1 select * from source_num1;
From this we gather that Direct Path Insert can be more efficient as it:
- By passes db cache, by writing directly to datafiles
- Any free space in a table that would normally be used for insert is not used
- Data is inserted above the table high water mark
Requires use of ‘append’ hint
- INSERT generates both redo and undo for metadata changes
- Always bypasses undo generation for data changes
- If the database is not in ARCHIVELOG or FORCE LOGGING mode, then no redo is generated for data change.
- If the database is in ARCHIVELOG and FORCE LOGGING mode, then direct-path SQL generate data redo, this implies we are not compromising recovery.
Also we note following restrictions (there are others):
If any of the restrictions are violated the operation still runs but as a conventional INSERT
- The Table cannot have any triggers
- The Table cannot have referential integrity constraints
- You cannot query or modify direct-path inserted data immediately after the insert is complete. If you attempt to do so, an ORA-12838 error is generated. You must first issue a COMMIT.
In Summary Direct Path will not compromise recoverability if we have our database in Archive log and Force Logging mode, the performance gain in this situation is from avoiding access to the database cache and inserting directly into data files above the table high water mark. The restrictions that stand out for are no triggers or referential integrity.
Update on Advanced Compression:
- Maintains Compression During DML (which BASIC compression does not)
- Minimises overhead of Write operations on compressed data => good for OLTP & DW
- Algorithm removes duplicates in a block, a symbol table contains metadata reference to data place
- compression is self contained in a block vs other db’s that have global symbol tables => better performance
- reduce data sizes depends in cardinalilty but between 2x to 4x compression ratio
- Oracle reads compressed blocks direct to memory without uncompression=> more blocks in SGA less i/o
- Minimal overhead on reads, some additional impact on writes
- Blocks are compressed in batches not per row, once a block is at a threshold its compressed
- more data is added and its compressed at another threshold until no benifit in compressing