How to Work with PostgreSQL in Python
PostgreSQL is one of the most popular open source database. If you are building a web application you need a database. Python community likes PostgreSQL as well as PHP community likes MySQL. In order "to speak" with a PostgreSQL database pythonistas usually use psycopg2 library. It is written in C programming language using libpq. Let's dive into it.
Installation
pip install psycopg2
If you install the psycopg2 you have to have additional source files and compiler (gcc):
- libpq
- libssl
But you can install the precompiled binary, in this case you need to execute:
pip install psycopg2-binary
Getting started
In order to query a database first we need to connect to it and get a cursor:
import psycopg2
conn = psycopg2.connect(dbname='database', user='db_user',
password='mypassword', host='localhost')
cursor = conn.cursor()
Now you have to use cursor
to make queries:
cursor.execute('SELECT * FROM airport LIMIT 10')
records = cursor.fetchall()
...
cursor.close()
conn.close()
When a query is sent you can get the results using the following methods:
cursor.fetchone()
— returns a single rowcursor.fetchall()
— returns a list of rowscursor.fetchmany(size=5)
— returns the provided number of rows
Cursor is an iterable object, so you can use the for loop:
for row in cursor:
print(row)
If you want to follow best practices you need to close a cursor and a connection. Let's do it in a pythonic way using the context manager:
from contextlib import closing
with closing(psycopg2.connect(...)) as conn:
with conn.cursor() as cursor:
cursor.execute('SELECT * FROM airport LIMIT 5')
for row in cursor:
print(row)
By default when you iterate over a cursor (or using the methods mentioned above) you will get a tuple, each column corresponds to its index. If you want to get a value by column name you can use NamedTupleCursor
or DictCursor
:
from psycopg2.extras import DictCursor
with psycopg2.connect(...) as conn:
with conn.cursor(cursor_factory=DictCursor) as cursor:
...
Building dynamic queries
Did you know that you can form a query using psycopg2? You can do it using format-like placeholders:
cursor.execute('SELECT * FROM airport WHERE city_code = %s', ('ALA', ))
for row in cursor:
print(row)
You should keep in mind the following rules while working with placeholders:
- Placeholder should be
%s
for all data types - Do not use single quotes for string values
- If you need to use the character
%
you have to write it as%%
You can use named arguments as well.
cursor.execute(
'SELECT * FROM engine_airport WHERE city_code = %(city_code)s',
{'city_code': 'ALA'}
)
Also psycopg2 provides the module called sql
which can be used to securely form an SQL query. It was introduced in the version 2.7
from psycopg2 import sql
...
>>> with conn.cursor() as cursor:
columns = ('country_name_ru', 'airport_name_ru', 'city_code')
stmt = sql.SQL('SELECT {} FROM {} LIMIT 5').format(
sql.SQL(',').join(map(sql.Identifier, columns)),
sql.Identifier('airport')
)
cursor.execute(stmt)
for row in cursor:
print(row)
Transactions
By default all instructions are executed inside a transaction when the code leaves the context manager (with
) or explicitly commited by invoking .commit()
method. If an exception is raised inside a context manager psycopg2 executes a rollback to revert changes. You can change this behaviour by setting autocommit
attribute to True
. In this case every .execute
call will have immediate effect on a database. Please be aware of long running transactions due to their bad impact on database performance.
with conn.cursor() as cursor:
conn.autocommit = True
values = [
('ALA', 'Almaty', 'Kazakhstan'),
('TSE', 'Astana', 'Kazakhstan'),
('PDX', 'Portland', 'USA'),
]
insert = sql.SQL('INSERT INTO city (code, name, country_name) VALUES {}').format(
sql.SQL(',').join(map(sql.Literal, values))
)
cursor.execute(insert)