Hi there!
Since version 12, PostgreSQL has supported a WHERE clause within COPY ... FROM statements:
COPY table_name [ ( column_name [, ...] ) ]
FROM { 'filename' | PROGRAM 'command' | STDIN }
[ [ WITH ] ( option [, ...] ) ]
[ WHERE condition ]
It functions similarly to the WHERE clause used within SELECT statements, subject to some restrictions regarding subqueries.
Recently I've found myself in a situation where I am performing a bulk insert of CSV-formatted data from a remote machine, but would like to skip certain rows - such as those containing null values. From what I can tell, the existing Connection.copy_to_table helper doesn't support this, and I'd need to drop down to private API (the underlying Connection._protocol._copy_in machinery)
I'd appreciate some publicly exposed way of doing this - the simplest approach that comes to mind would be adding a where kwarg to Connection.copy_to_table as well as the Connection._format_copy_opts method. Usage could then look something along the lines of
await conn.copy_to_table("cool_table", source="important_file.csv", where="important_value = 42")
I'm not sure if this is satisfactory, though. There would also need to be some consideration regarding earlier Postgres versions that don't support COPY ... FROM using a WHERE clause.
Please let me know your thoughts - I'm happy to open a PR for implementation if necessary.
Hi there!
Since version 12, PostgreSQL has supported a
WHEREclause withinCOPY ... FROMstatements:It functions similarly to the
WHEREclause used withinSELECTstatements, subject to some restrictions regarding subqueries.Recently I've found myself in a situation where I am performing a bulk insert of CSV-formatted data from a remote machine, but would like to skip certain rows - such as those containing null values. From what I can tell, the existing
Connection.copy_to_tablehelper doesn't support this, and I'd need to drop down to private API (the underlyingConnection._protocol._copy_inmachinery)I'd appreciate some publicly exposed way of doing this - the simplest approach that comes to mind would be adding a
wherekwarg toConnection.copy_to_tableas well as theConnection._format_copy_optsmethod. Usage could then look something along the lines ofI'm not sure if this is satisfactory, though. There would also need to be some consideration regarding earlier Postgres versions that don't support
COPY ... FROMusing aWHEREclause.Please let me know your thoughts - I'm happy to open a PR for implementation if necessary.