How to Work with PostgreSQL in Python

PostgreSQL is one of the most popular open source database. If you are building a web application you need a database. Python community likes PostgreSQL as well as PHP community likes MySQL. In order "to speak" with a PostgreSQL database pythonistas usually use psycopg2 library. It is written in C programming language using libpq. Let's dive into it.

Installation

pip install psycopg2

If you install the psycopg2 you have to have additional source files and compiler (gcc):

  • libpq
  • libssl

But you can install the precompiled binary, in this case you need to execute:

pip install psycopg2-binary

Getting started

In order to query a database first we need to connect to it and get a cursor:

import psycopg2
conn = psycopg2.connect(dbname='database', user='db_user', 
                        password='mypassword', host='localhost')
cursor = conn.cursor()

Now you have to use cursor to make queries:

cursor.execute('SELECT * FROM airport LIMIT 10')
records = cursor.fetchall()
...
cursor.close()
conn.close()

When a query is sent you can get the results using the following methods:

  • cursor.fetchone() — returns a single row
  • cursor.fetchall() — returns a list of rows
  • cursor.fetchmany(size=5) — returns the provided number of rows

Cursor is an iterable object, so you can use the for loop:

for row in cursor:
    print(row)

If you want to follow best practices you need to close a cursor and a connection. Let's do it in a pythonic way using the context manager:

from contextlib import closing

with closing(psycopg2.connect(...)) as conn:
    with conn.cursor() as cursor:
        cursor.execute('SELECT * FROM airport LIMIT 5')
        for row in cursor:
            print(row)

By default when you iterate over a cursor (or using the methods mentioned above) you will get a tuple, each column corresponds to its index. If you want to get a value by column name you can use NamedTupleCursor or DictCursor:

from psycopg2.extras import DictCursor
with psycopg2.connect(...) as conn:
    with conn.cursor(cursor_factory=DictCursor) as cursor:
        ...

Building dynamic queries

Did you know that you can form a query using psycopg2? You can do it using format-like placeholders:

cursor.execute('SELECT * FROM airport WHERE city_code = %s', ('ALA', ))
for row in cursor:
    print(row)

You should keep in mind the following rules while working with placeholders:

  • Placeholder should be %s for all data types
  • Do not use single quotes for string values
  • If you need to use the character % you have to write it as %%

You can use named arguments as well.

cursor.execute(
    'SELECT * FROM engine_airport WHERE city_code = %(city_code)s', 
    {'city_code': 'ALA'}
)

Also psycopg2 provides the module called sql which can be used to securely form an SQL query. It was introduced in the version 2.7

from psycopg2 import sql
...
>>> with conn.cursor() as cursor:
        columns = ('country_name_ru', 'airport_name_ru', 'city_code')

        stmt = sql.SQL('SELECT {} FROM {} LIMIT 5').format(
            sql.SQL(',').join(map(sql.Identifier, columns)),
            sql.Identifier('airport')
        )
        cursor.execute(stmt)

        for row in cursor:
            print(row)

Transactions

By default all instructions are executed inside a transaction when the code leaves the context manager (with) or explicitly commited by invoking .commit() method. If an exception is raised inside a context manager psycopg2 executes a rollback to revert changes. You can change this behaviour by setting autocommit attribute to True. In this case every .execute call will have immediate effect on a database. Please be aware of long running transactions due to their bad impact on database performance.

with conn.cursor() as cursor:
    conn.autocommit = True
    values = [
        ('ALA', 'Almaty', 'Kazakhstan'),
        ('TSE', 'Astana', 'Kazakhstan'),
        ('PDX', 'Portland', 'USA'),
    ]
    insert = sql.SQL('INSERT INTO city (code, name, country_name) VALUES {}').format(
        sql.SQL(',').join(map(sql.Literal, values))
    )
    cursor.execute(insert)