Postgresql : load csv data
One of the most common action that one has to do with a database is to load data into it. Usually the data comes either in a dump/sql file or in csv format. Today I am going to brag about loading the latter format into a PostgreSQL database. The first thing to keep in mind is that PostgreSQL is able to do it without magic . The second thing to put attention on is which is the user who is connecting to the database to load the data.
The easy pattern : you have superuser privileges
In this scenario, the user is superuser ( postgres ) or is inheriting from the superuser role. If that's the case then all you need to do is open a psql shell and issue the command
copy <table> from <file> with delimiter '<delimiter>' ;
where
- <table> is the name of the table that you want to populate with the data
- <file> is the path to the file that contains your csv data
- <delimiter> is the delimiter used in your csv file to separate the fields ( usually ',' or ';' )
The way you open a connection depends on what you are using : if you are using a gui ( such as pgAdmin ) you simply double click on the database that contains the table that you want to populate and then hit the SQL button in the toolbar. If you are using a console then you have to use pql command :
psql -h <servername> -U <username> <databasename>
where
- <servername> is the name or ip address of the PostgreSQL server you are connecting to
- <username> is the username that you are using to connect
- <databasename> is the name of the database that contains the table you want to work on
The not so obvious pattern : you are a regular user
In this scenario you are connecting to the database via a user that has ( at least ) connect and insert privileges on the table that you want to populate. In this situation you cannot "simply" use the copy command telling postgres to get the data from a file because you need superuser privileges in order to import from a file . The work around here is to use stdin since everybody can use it ! This is going to limit the options a little but still ... Open a terminal and issue the commad :
cat <filename> | psql -U <username> -h <servername> <database> -c "copy <table> from stdin ;"
if you are on Windows system replace cat with type
Variation of the problem
- The file contains different number of columns than the table does
copy <table>(field1,field2, ..., fieldn) from ...
- The file contains headers : I don't want to import that too !
copy <table> from [stdin|<filename>] with CSV HEADER;
- The file contains special characters :
\encoding ISO_8859_1 ; copy <table> from [stdin|<filename>] with CSV HEADER;
If you want to deepen your knowledge of the powerful copy command here's the official documentation
Recent Comments