There are four ways to delete massive amounts of data, some more careful than others:
- Truncate
The easiest and most destructive way of deleting data is truncating a table. A truncate command is like cutting down a tree: it simply cuts away all data from a table and leaves it empty. It is irreversible; no undo is saved.
Truncate has two options: either release all the space reserved by the table, or keep it so it can be reused when new data arrives.
SQL> truncate table massive_table drop storage;
This empties the table and releases all occupied storage in the datafiles, to be used by other objects. When new data is inserted into the table, it may end up less fragmented, depending on the overall fragmentation of the datafiles.
The other option is:
SQL> truncate table massive_table reuse storage;
This empties the table but keeps the allocated storage. This is quite useful for tables that hold huge amounts of data during a batch or recurring process: when storage is reused, the database does not need to reallocate it when data is inserted into the table again.
Newly inserted data ends up in the old extents, so the storage stays as fragmented as it was before.
There is also an extra option, "cascade". This can be added to the statement when other tables reference this one through foreign keys:
SQL> truncate table massive_table drop storage cascade;
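Note that the cascade option only truncates child tables whose foreign keys are defined with on delete cascade; otherwise the statement fails. A minimal sketch of such a child table (the names and columns are made up for illustration):
SQL> create table massive_child (
       id        number primary key,
       -- hypothetical: assumes massive_table has a primary key column "id"
       parent_id number references massive_table (id) on delete cascade
     );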
- CTAS
Create Table As Select (CTAS) is probably the safest way to delete huge amounts of data. It can be used to save the data to be deleted into a backup table before a truncate is executed. It is also possible to create a new table holding only the data to keep and use that table instead of the old one, e.g. by renaming it:
SQL> create table data_to_keep as
select * from massive_table
where year >= to_date('01-01-2010','dd-mm-yyyy');
SQL> drop table massive_table purge;
SQL> alter table data_to_keep rename to massive_table;
Remember to re-create any indexes that existed on the old table and to recalculate the statistics afterwards.
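For example, a minimal sketch (the index name and indexed column are assumptions for illustration):
SQL> create index massive_table_ix1 on massive_table (year);  -- hypothetical index
SQL> exec dbms_stats.gather_table_stats(user, 'MASSIVE_TABLE')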
It is also possible to truncate massive_table and insert the saved data back into it:
SQL> create table data_to_keep as
select * from massive_table
where year >= to_date('01-01-2010','dd-mm-yyyy');
SQL> truncate table massive_table drop storage;
SQL> insert into massive_table select * from data_to_keep;
SQL> commit;
- Partitioning
Another way to delete massive amounts of data on a more regular basis is the use of partitioning. Partitioning is only available in an Enterprise Edition database and needs a separately licensed option.
It requires thinking the partitioning clause over: if, in the future, data is to be deleted by year, it is not smart to partition on, e.g., the company name.
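For example, a minimal sketch of a table range-partitioned by year (the columns are assumptions for illustration); deleting a whole year then becomes a single partition operation, as shown further below:
SQL> create table massive_table (
       year         date,           -- hypothetical columns
       company_name varchar2(100)
     )
     partition by range (year) (
       partition p2009 values less than (to_date('01-01-2010','dd-mm-yyyy')),
       partition p2010 values less than (to_date('01-01-2011','dd-mm-yyyy'))
     );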
If interval partitioning is not used (see this article), the partitions to be deleted can be found, and the drop statements generated, using:
SQL> select 'alter table massive_table drop partition '||partition_name||';'
     from user_tab_partitions
     where table_name = 'MASSIVE_TABLE'
     and partition_name like '%2010';
This generates a list of statements that can be executed to drop those partitions from the table.
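One of the generated statements would look like this (the partition name is illustrative); when global indexes exist on the table, adding the update global indexes clause keeps them usable after the drop:
SQL> alter table massive_table drop partition p2010 update global indexes;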
- Filtered Move
Probably the easiest and quickest way to delete massive amounts of data is a filtered move. The table is rebuilt with only the data that needs to be kept.
The statement to do this also has options to do it "online" (Enterprise Edition only) and to update the indexes automatically.
The statement:
SQL> alter table massive_table
move online
update indexes
including rows
where year >= to_date('01-01-2010','dd-mm-yyyy');
This statement rebuilds massive_table and deletes all rows with a year before January 1st, 2010.
Since an "alter table move" is a DDL action, an implicit commit is executed.
So, there is no way back. It is as effective as the CTAS option, but way more easier and faster.
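To see how much space a cleanup released, the segment size can be compared before and after, for example with a quick query on the user_segments dictionary view:
SQL> select bytes/1024/1024 as size_mb
     from user_segments
     where segment_name = 'MASSIVE_TABLE';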