Using Parallel Load Streams
By default, when a COPY statement uses the default parser, the initiator node parses each large load file (1 GB or more) in parallel. Parsing such a file is a multi-threaded operation in which the threads start at different offsets in the file.
In addition, you can create multiple parallel load streams. Creating parallel load streams can significantly increase performance and use resources more efficiently.
You can create parallel load streams by using the following multi-node COPY parameters:
- Use the nodename parameter with each file to load files from different nodes.
- Use the ON ANY NODE parameter to load files from any node. When used with a wildcard, the matching files are distributed across nodes so that the nodes can load them in parallel.
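
The two parameters above might be used as in the following sketch. The table name (sales), the file paths, and the node names (v_vmart_node0001, v_vmart_node0002) are placeholders chosen for illustration, not values from this article. The sketch also assumes that each file is readable at the given path on the node that loads it.

```sql
-- Load two files in parallel, each read from the node named after ON.
-- Table, paths, and node names are hypothetical.
COPY sales
FROM '/data/sales_part1.dat' ON v_vmart_node0001,
     '/data/sales_part2.dat' ON v_vmart_node0002
DELIMITER '|';

-- Let any node load the files matched by the wildcard.
-- This assumes the path is visible to every node, for example on shared storage.
COPY sales
FROM '/data/sales_part*.dat' ON ANY NODE
DELIMITER '|';
```

In the first statement you decide which node reads which file. In the second, the files matching the wildcard are spread across the nodes for you.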
COPY LOCAL parses files serially and does not support parallel load streams.
While there is no limit on the number of files you can load, the optimal number of load streams depends on several factors, including the number of nodes, the physical and logical schemas, host processors, memory, and disk space. Too many load streams can reduce or deplete the system memory required for optimal query processing.