Vertica Learning: 1.26 COPY : Using Parallel Load Streams

Saturday, November 14, 2015

1.26 COPY : Using Parallel Load Streams

By default, when a COPY statement uses the default parser, the initiator node parses each large load file (1 GB or more) in parallel.

Parsing such a file is a multi-threaded operation in which each thread starts at a different offset in the file.

In addition, you can create multiple parallel load streams.

Creating parallel load streams can significantly increase performance and use resources more efficiently.

We can create parallel load streams with the following multi-node COPY parameters:

Use the nodename parameter (ON nodename) with each file to load files that reside on different nodes.

Use the ON ANY NODE parameter to load files from any node. When it is used with a wildcard, the matching files are distributed across the nodes, and the nodes can load them in parallel.
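The two parameters above can be sketched as follows. This is a minimal illustration, not a tested load script: the table name sales, the file paths, the delimiter, and the node names v_vmart_node0001 and v_vmart_node0002 are all assumptions made for the example, so substitute the names from your own cluster.

```sql
-- Sketch 1: one COPY statement, each file read by the node it resides on.
-- Table, paths, and node names are hypothetical.
COPY sales
FROM '/data/sales_part1.dat' ON v_vmart_node0001,
     '/data/sales_part2.dat' ON v_vmart_node0002
DELIMITER '|';

-- Sketch 2: let any node load the files matched by the wildcard.
-- This assumes the path is visible to every node (for example, shared storage).
COPY sales
FROM '/data/sales_*.dat' ON ANY NODE
DELIMITER '|';
```

In the first form you decide which node reads which file. In the second, the files matched by the wildcard are spread across the nodes for you, which is usually simpler when all nodes can see the same directory.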

COPY LOCAL parses files serially, and does not support parallel load streams.

While there is no restriction on the number of files you can load, the optimal number of load streams depends on several factors, including the number of nodes, the physical and logical schemas, host processors, memory, and disk space. Too many load streams can deplete or reduce the system memory required for optimal query processing.
