Using PostgresHook.copy_expert() method

Another day, another interesting data source to perform ETL on. This time, it’s a CSV file that’s downloaded from a remote server. Not going to focus on the part about using requests to download the file, but jumping right into using PostgresHook.copy_expert() to dump the CSV file into a staging table.

Even the actual code_export method code doesn’t offer much insight, so I had to resort to looking through the psycopg2 docs (scroll to bottom of the page) to find out how to use this method!

Here’s what makes it work: you provide the method with a SQL statement (starting with COPY …) but instead of passing the file name–which just so happens to also be the second parameter you pass to copy_expert()–you use STDIN with COPY FROM (or STDOUT with COPY TO)!

Pretty trivial in hindsight but it tripped me up for a few minutes as to why I don’t need to pass the file name twice (once as part of SQL statement & again as a parameter).

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.