COPY: Loading Data
from specific Paths.
Use the 'pathToData' option to indicate the location of the load
file.
Optionally we can indicate a node name or ON ANY
NODE to indicate which node (or nodes) should parse the load file
We can specify multiple formats like GZIP,BZIP or
UNCOMPRESSED (default).
Using the ON ANY NODE clause (without indicates that the source file to load is on
all of the nodes, so COPY opens the file and parses it from any node in the
cluster.
Be sure that
the source file you specify is available and accessible on each cluster node.
If pathToData resolves to a storage location, and the user invoking COPY is not a
superuser, these are the required permissions:
- The storage location must have been created with the USER option
- The user must already have been granted READ access to the storage location where the file(s) exist
COPY: Loading
BZIP and GZIP Files
We can load files compressed with BZIP or GZIP.
To do so, we must indicate the compression format
for each file when loading multiple files.
Following examples show how to load compressd files.
We need to specify the compression technique used after
the file name.
sukul1=> COPY USER_30_DAY_TABLES.EMPLOYEE_1
sukul1-> (
sukul1(> NAME,
sukul1(> AGE,
sukul1(> JOINING_DATE FILLER VARCHAR(10),
sukul1(> JOINING_YEAR AS
TO_NUMBER(TO_CHAR(TO_DATE(JOINING_DATE,'YYYY-MM-DD'),'YYYY')),
sukul1(> SALARY
sukul1(> )
sukul1-> FROM LOCAL '/home/sukul1/employeedet1.dat.bz2' BZIP
sukul1-> DELIMITER '|'
sukul1-> ABORT ON ERROR
sukul1-> DIRECT;
Rows
Loaded
-------------
2
(1 row)
sukul1=> select * from
USER_30_DAY_TABLES.EMPLOYEE_1;
NAME | AGE | JOINING_YEAR |
SALARY
--------+-----+--------------+--------
RUTUJA
| 24 | 2005 |
20000
SUKUL
| 28 | 2010 |
10000
(2 rows)
sukul1=> COPY USER_30_DAY_TABLES.EMPLOYEE_1
sukul1-> (
sukul1(> NAME,
sukul1(> AGE,
sukul1(> JOINING_DATE FILLER VARCHAR(10),
sukul1(> JOINING_YEAR AS
TO_NUMBER(TO_CHAR(TO_DATE(JOINING_DATE,'YYYY-MM-DD'),'YYYY')),
sukul1(> SALARY
sukul1(> )
sukul1-> FROM LOCAL '/home/sukul1/employeedet2.dat.gz' GZIP
sukul1-> DELIMITER '|'
sukul1-> ABORT ON ERROR
sukul1-> DIRECT;
Rows
Loaded
-------------
2
(1 row)
sukul1=> select * from
USER_30_DAY_TABLES.EMPLOYEE_1;
NAME | AGE | JOINING_YEAR |
SALARY
----------+-----+--------------+--------
EMMANUEL
| 28 | 2010 |
10000
ALVARO
| 24 | 2005 |
20000
SUKUL
| 28 | 2010 |
10000
RUTUJA
| 24 | 2005 |
20000
(4 rows)
sukul1=>
In case we have multiple files we can list them all one
after another separated by ,.
However note
that compression technique for all files should be same.
We cannot load two files- one .gz and another .bz2 in
the same load.
Following example show the error we would get.
sukul1=> COPY USER_30_DAY_TABLES.EMPLOYEE_1
sukul1-> (
sukul1(> NAME,
sukul1(> AGE,
sukul1(> JOINING_DATE FILLER VARCHAR(10),
sukul1(> JOINING_YEAR AS
TO_NUMBER(TO_CHAR(TO_DATE(JOINING_DATE,'YYYY-MM-DD'),'YYYY')),
sukul1(> SALARY
sukul1(> )
sukul1-> FROM LOCAL '/home/sukul1/employeedet2.dat.gz' GZIP,
'/home/sukul1/employeedet1.dat.bz2' BZIP
sukul1-> DELIMITER '|'
sukul1-> ABORT ON ERROR
sukul1-> DIRECT;
ERROR 2724: COPY ... LOCAL can
read files with same compression only
No comments:
Post a Comment