COPY: Purpose of using the STREAM
NAME Parameter.
The LOAD_STREAMS system
table includes information about load stream
metrics from COPY and COPY FROM VERTICA statements
We can
query this table to get the COPY metrics.
The LOAD_STREAMS system table includes stream names for every COPY statement that takes
more than 1-second to run. The 1-second
duration includes the time to plan and execute the statement.
HP
Vertica maintains system table metrics until they reach a designated size
quota (in kilobytes). The quota is set through internal processes and cannot be
set or viewed directly.
Using the STREAM NAME parameter as part of the COPY
statement labels COPY streams explicitly so
they are easier to identify in the LOAD_STREAMS system table. This is the main
purpose of the STREAM NAME option.
To use
the STREAM NAME parameter:
=> COPY mytable FROM myfile DELIMITER '|' DIRECT STREAM NAME 'My stream name';
Following
are few imp columns in the LOAD_STREAMS Table.
- ACCEPTED_ROW_COUNT
- REJECTED_ROW_COUNT
- PARSE_COMPLETE_PERCENT
- SORT_COMPLETE_PERCENT
When
a COPY statement using the DIRECT option is in progress,
the ACCEPTED_ROW_COUNT value can increase to the maximum number of
rows in the input file as the rows are being parsed.
In a
typical load, the PARSE_COMPLETE_PERCENT value can either increase
slowly to 100%, or jump to 100% quickly if you are loading from named pipes or
STDIN, while SORT_COMPLETE_PERCENT is at 0. Once
PARSE_COMPLETE_PERCENT reaches
100%, SORT_COMPLETE_PERCENT increases to 100%. Depending on the data
sizes, a significant lag can occur between the
time PARSE_COMPLETE_PERCENT reaches 100% and the
timeSORT_COMPLETE_PERCENT begins to increase.
If COPY
reads input data from multiple named pipes,
the PARSE_COMPLETE_PERCENT value will remain at zero (0)
until all named pipes return an EOF.
While COPY awaits an EOF from multiple pipes, it can appear to be hung.
However, before canceling the COPY statement, check your system
CPU and disk accesses to see if any activity is in progress.
Expanded display is on.
=> SELECT stream_name, table_name, load_start, accepted_row_count,
rejected_row_count, read_bytes, unsorted_row_count, sorted_row_count,
sort_complete_percent FROM load_streams;
-[ RECORD 1 ]----------+---------------------------
stream_name | fact-13
table_name | fact
load_start | 2010-12-28 15:07:41.132053
accepted_row_count | 900
rejected_row_count | 100
read_bytes | 11975
input_file_size_bytes | 0
parse_complete_percent | 0
unsorted_row_count | 3600
sorted_row_count | 3600
sort_complete_percent | 100
=> SELECT stream_name, table_name, load_start, accepted_row_count,
rejected_row_count, read_bytes, unsorted_row_count, sorted_row_count,
sort_complete_percent FROM load_streams;
-[ RECORD 1 ]----------+---------------------------
stream_name | fact-13
table_name | fact
load_start | 2010-12-28 15:07:41.132053
accepted_row_count | 900
rejected_row_count | 100
read_bytes | 11975
input_file_size_bytes | 0
parse_complete_percent | 0
unsorted_row_count | 3600
sorted_row_count | 3600
sort_complete_percent | 100
No comments:
Post a Comment