Saturday, November 14, 2015

1.19 COPY : Using ENCLOSED BY clause

COPY : Using the ENCLOSED BY

If the input data contains the Delimiter characters, then one way to escape them is to use the escape character.
Another way is to enclose text in characters specified by the ENCLOSED BY clause.
Delimiter characters inside the enclosed characters are not treated as delimiters.

You can use any ASCII value in the range E'\001' to E'\177' inclusive (any ASCII character except NULL:E'\000') for the ENCLOSED BY value. Using double quotation marks (") is the most commonly used quotation character. 




For instance, the following parameter specifies that input data to the COPY statement is enclosed within double quotes:

ENCLOSED BY '"'


With the following input (using the default DELIMITER (|) character), specifying:

"vertica | value"

Results in:
  • Column 1 containing "vertica
  • Column 2 containing value"

However if we had specified ENCLOSED BY '"' then the | would not have been interpreted as delimiter.
And Column1 would have got the entire value .i.e vertica | value



Enclosed by basically means that delimiters within the enclosed by characters are not interpreted as delimiters.


Following example shows that the file has comma as the delimiter. But the 1st field as the character comma in the tezt itself.
So we have enclosed it in double quotes.

<> /home/sukul1 $ cat employeedet1.dat
"SHANE,WARNE",25,2015-01-01,133010
"SACHIN,TENDULKAR",35,2014-02-02,1234
"BRFET,LEE",43,2014-02-02,1234
<> /home/sukul1 $

sukul1=> COPY USER_30_DAY_TABLES.EMPLOYEE_1
sukul1-> (
sukul1(> NAME,
sukul1(> AGE,
sukul1(> JOINING_DATE FILLER VARCHAR(10),
sukul1(> JOINING_YEAR AS TO_NUMBER(TO_CHAR(TO_DATE(JOINING_DATE,'YYYY-MM-DD'),'YYYY')),
sukul1(> SALARY
sukul1(> )
sukul1-> FROM LOCAL '/home/sukul1/employeedet1.dat'
sukul1-> ENCLOSED BY '"'
sukul1-> DELIMITER ','
sukul1-> DIRECT;
 Rows Loaded
-------------
           3
(1 row)

sukul1=> select * from USER_30_DAY_TABLES.EMPLOYEE_1;
       NAME       | AGE | JOINING_YEAR | SALARY
------------------+-----+--------------+--------
 BRFET,LEE        |  43 |         2014 |   1234
 SHANE,WARNE      |  25 |         2015 | 133010
 SACHIN,TENDULKAR |  35 |         2014 |   1234
(3 rows)

However note that not all rows should have values enclosed in double quotes.
In below example the 1st column in 2nd row does not have the delimiter character in it.
So its okay to not have it enclosed in double quotes.

<> /home/sukul1 $ vi employeedet1.dat
"employeedet1.dat" 3 lines, 104 characters
"SHANE,WARNE",25,2015-01-01,133010
SACHIN,35,2014-02-02,1234
"BRFET,LEE",43,2014-02-02,1234




Note that its possible to specify the ENCLOSED BY character only for a specific column.
This will allow us to enclose a specific column in double quotes and still allow literal double quotes in other columns.

The following example uses double quotes to enclose a single column

=> COPY Retail.Dim (Dno, Dname ENCLOSED BY '"', Dstore) FROM '/home/dbadmin/dim3.txt'
   DELIMITER ','
   EXCEPTIONS '/home/dbadmin/exp.txt';


1 comment:

  1. Congratulation for the great post. Those who come to read your Information will find lots of helpful and informative tips. Enclosed

    ReplyDelete