COPY: Loading with Wildcards (glob) ON ANY NODE
COPY fully supports the ON ANY NODE clause with a wildcard (glob). We can invoke COPY for a large number of files in a shared directory with a single statement such as:
COPY myTable FROM '/mydirectory/ofmanyfiles/*.dat' ON ANY NODE
Advantage of using * with the ON ANY NODE option:
When a wildcard is used with the ON ANY NODE clause, the file list is expanded on the initiator node. The command then distributes the individual files among all nodes, so that the COPY workload is spread evenly across the entire cluster.
This technique is commonly used across our system to load huge files. Files are created on shared storage, and * is used to load them.
We can also use a wildcard when loading files with COPY LOCAL; however, that does not distribute the files among nodes.
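A rough side-by-side sketch of the two forms may help. The first statement is the one shown earlier in this post. The client-side path in the second statement is a made-up placeholder, not a real location, and both statements assume the files match the table's default load format.

```sql
-- Server-side load: the glob is expanded on the initiator node and the
-- matching files are distributed among the nodes. The directory must be
-- on storage that the nodes share.
COPY myTable FROM '/mydirectory/ofmanyfiles/*.dat' ON ANY NODE;

-- Client-side load: the glob is matched against files on the client
-- machine (hypothetical path), and the load is not distributed among nodes.
COPY myTable FROM LOCAL '/home/dbadmin/ofmanyfiles/*.dat';
```

The only syntactic difference is LOCAL versus ON ANY NODE, but the first form is the one that spreads the work across the cluster.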