4 main methods of
deleting data in bulk from tables.
Command
|
Description
|
Permanently removes a table and its
definition. Optionally removes associated views and projections as
well.
|
|
Marks
rows with delete vectors and stores them so data can be rolled back to a
previous epoch.
The data must
eventually be purged before the database
can reclaim disk space.
|
|
Removes
all storage and history associated with a table. The table structure is
preserved for future use.
The
results of this command cannot be rolled back.
|
|
Removes
one partition from a partitioned table. Each partition contains a related
subset of data in the table.
Partitioned
data can be dropped efficiently, and provides query performance benefits.
|
Choosing the Right
Technique for Deleting Data :
- To delete both table data and definitions and start from scratch, use the DROP TABLE [CASCADE] command.
- To drop data, while preserving table definitions so that you can quickly and easily reload data, use TRUNCATE TABLE. Note that unlike DELETE, TRUNCATE does not have to mark each row with delete vectors, so it runs much more quickly. Explains why TRUNCATE is faster.
- To perform bulk delete operations on a regular basis, HP Vertica recommends using Partitioning.
- To perform occasional small deletes or updates with the option to roll back or review history, use DELETE FROM TABLE.
A] DROP TABLE
- Used to remove tables and optionally related projections.
- Syntax:
DROP TABLE [ IF EXISTS ]
[[db-name.]schema.]table [, ...] [ CASCADE ]
Notice the IF EXISTS clause. It is optional.
If specified, DROP TABLE does not report an error if one or more
of the tables to be dropped does not exist. This clause is useful in SQL
scripts where you want to drop a table if it exists before recreating it.
Also notice the CASCADE option. This is also optional.
Drops all projections that include the table. We need to use
this option if the table to be dropped has projections.
Following shows an example of using DELETE on a table with
projections on it.
In 1st statement we create unsegmented projections on the table. Note that we have created unsegmented projection on all nodes.
sukul1=> CREATE PROJECTION
USER_30_DAY_TABLES.p1_emp AS select FIRSTNAME FROM USER_30_DAY_TABLES.EMPTABLE
UNSEGMENTED ALL NODES;
WARNING 4468:
Projection <USER_30_DAY_TABLES.p1_emp_node0001> is not available
for query processing. Execute the select start_refresh() function to copy data
into this projection.
The projection must have a sufficient number of buddy projections and
all nodes must be up before starting a refresh
WARNING 4468:
Projection <USER_30_DAY_TABLES.p1_emp_node0002> is not available
for query processing. Execute the select start_refresh() function to copy data
into this projection.
The projection must have a sufficient number of buddy projections and
all nodes must be up before starting a refresh
WARNING 4468:
Projection <USER_30_DAY_TABLES.p1_emp_node0003> is not available
for query processing. Execute the select start_refresh() function to copy data
into this projection.
The projection must have a sufficient number of buddy projections and
all nodes must be up before starting a refresh
In next statement we can see that the drop fails because of
number of projections dependent on the table.
sukul1=> DROP TABLE
USER_30_DAY_TABLES.EMPTABLE;
NOTICE 4927:
The Projection p1_emp_node0001 depends on Table EMPTABLE
NOTICE 4927:
The Projection p1_emp_node0002 depends on Table EMPTABLE
NOTICE 4927:
The Projection p1_emp_node0003 depends on Table EMPTABLE
ROLLBACK 3128: DROP failed due
to dependencies
DETAIL: Cannot drop Table
EMPTABLE because other objects depend on it
HINT: Use DROP ... CASCADE to
drop the dependent objects too
In next statement we can see how using CASCADE option works.
sukul1=> DROP TABLE
USER_30_DAY_TABLES.EMPTABLE CASCADE;
DROP TABLE
sukul1=>
B] DELETE
- Marks tuples(records) as no longer valid in the current epoch, marking the records for deletion in the WOS, rather than deleting data from disk storage.
- You cannot delete records from a projection.
- Syntax
DELETE [ /*+ DIRECT */ ] [ /*+ LABEL(label-name)*/ ]
... FROM [[db-name.]schema.]table
... WHERE Clause
... FROM [[db-name.]schema.]table
... WHERE Clause
/*+ DIRECT */
|
Writes
the data directly to disk (ROS) bypassing memory (WOS).
Note: If you delete using the direct hint, you still need to
issue a COMMIT or ROLLBACK command to finish the transaction.
|
- DELETE statements support subqueries and joins, which is useful for deleting values in a table based on values that are stored in other tables.
- To remove all rows from a temporary table, use a DELETE statement with no WHERE clause. In this case, the rows are not stored in the system, which improves performance.(Similar to truncate)
- If you include a WHERE clause with a DELETE statement on a temporary table, DELETE behaves the same as for anchor tables, marking all delete vectors for storage, and you lose any performance benefits.
- To truncate a temporary table, without ending the
transaction, use DELETE FROM temp_table.
Truncate table is a DDL and will commit the current transaction.
C] TRUNCATE TABLE
- Removes all storage associated with a table, while preserving the table definitions.
- TRUNCATE TABLE auto-commits the current transaction after statement execution and cannot be rolled back. TRUNCATE TABLE is a DDL.
- Syntax
TRUNCATE TABLE [[db-name.]schema.]table
- To
truncate an ON COMMIT DELETE ROWS temporary table without ending
the transaction, use DELETE FROM temp_table syntax.
Note: The effect of DELETE FROM depends on the table type. If the table is specified as ON COMMIT DELETE ROWS, then DELETE FROM works like TRUNCATE TABLE; otherwise it behaves like a normal delete in that it does not truncate the table.
- After truncate operations complete, the data recovers from that current epoch onward. Because TRUNCATE TABLE removes table history, AT EPOCH queries return nothing.
D] DROP_PARTITION
- Forces
the partition of projections (if needed) and then drops the specified
partition.
REMEMBER DROP PARTITION is not a statement, its a function. Called using SELECT.
- Syntax
DROP_PARTITION ( table_name , partition_value [ , ignore_moveout_errors, reorganize_data ])
table-name
|
Specifies
the name of the table
|
partition_value
|
The key of the partition to drop. For example: DROP_PARTITION('trade', 2006);
|
ignore_moveout_errors
|
Optional Boolean, defaults to false.
Note: If you set this
parameter to true and the WOS includes data for the partition in WOS, partition
data in WOS is not dropped.
|
reorganize_data
|
Optional Boolean, defaults to false.
|
- The results of a DROP_PARTITION call go into effect immediately. If you drop a partition using DROP_PARTITION and then try to add data to a partition with the same name, HP Vertica creates a new partition.
- In general, if a ROS container has data that belongs to n+1 partitions and you want to drop a specific partition, the DROP_PARTITION operation:
- Forces the partition of data into two containers where
- One container holds the data that belongs to the partition that is to be dropped.
- Another container holds the remaining n partitions.
- Drops the specified partition.
- DROP_PARTITION acquires an exclusive lock on the table to prevent DELETE | UPDATE | INSERT | COPY statements from affecting the table, as well as any SELECT statements issued at SERIALIZABLE isolation level.
- You cannot perform a DROP_PARTITION operation on tables with projections that are not up to date
- Example:
SELECT DROP_PARTITION('trade', 2009);DROP_PARTITION-------------------Partition dropped(1 row)
Note that the table needs to be partitioned in the 1st place for the DROP_PARTITION to work.
No comments:
Post a Comment