Populating a Database
Author
Written by Tom Lane, from an e-mail message dated 1999-12-05.
One may need to do a large number of table insertions when first
populating a database. Here are some tips and techniques for making that as
efficient as possible.
Disable Auto-commit
Turn off auto-commit and just do one commit at
the end. Otherwise Postgres is doing a
lot of work for each record
added. In general when you are doing bulk inserts, you want
to turn off some of the database features to gain speed.
Use COPY FROM
Use COPY FROM STDIN to load all the records in one
command, instead
of a series of INSERT commands. This reduces parsing, planning, etc
overhead a great deal. If you do this then it's not necessary to fool
around with autocommit.
Remove Indices
If you are loading a freshly created table, the fastest way is to
create the table, bulk-load with COPY, then create any indexes needed
for the table. Creating an index on pre-existing data is quicker than
updating it incrementally as each record is loaded.
If you are augmenting an existing table, you can DROP
INDEX, load the table, then recreate the index. Of
course, the database performance for other users may be adversely
affected during the time that the index is missing.