New grammar for SQLAlchemy to make handling the crosstab() tablefunc (i.e. pivot tables) in Postgresql easy peasy. Jump down to the usage example if you're impatient.
This is a work-in-progress and not all that clean right now, but you're welcome to bring in your fixes and patches!
Author: Mali Akmanalp
Crosstab, a.k.a. pivot tables, is best described by example. Let's say you have a table of population records:
Country Year Count India 2009 100 India 2010 150 India 2011 200 Czechoslovakia 2008 200 Czechoslovakia 2010 400and turning it into:
Country 2008 2009 2010 2011 India 100 150 200 Czechoslovakia 200 400Another way to think about it is that it takes tuples of (y, x, value) and makes a new table where it puts value in location (x, y) e.g. 400 in (2010, "Czechoslovakia").
So a sample query would look like this:
select * from crosstab('select country, year, count from pop order by 1', 'select distinct year from pop order by 1') as derp (country text, y1 int, y2 int, y3 int, y4 int)
where the first parameter is the input of form (key, thing_to_turn_into_columns, value) (e.g. India, 2009, 100 etc.) and the second is a list of possible column values (eg. 2008, 2009, 2010, 2011). The from clause needs to declare the expected return types, which usually are the types of (key, col1, col2, col3 ...) etc.
For more, read the tablefunc docs.
Things I wish people had told me about crosstabcrosstab.py handles most of the grossness for you so you can get work done. Check example.py for the full runnable code.
crosstab_input = \ select([ raw.c.country, raw.c.year, raw.c.quantity]) categories = \ select([distinct(raw.c.year)]) ret_types = Table('ct', m, Column('country', Text), Column('y1', Integer), Column('y2', Integer), Column('y3', Integer), Column('y4', Integer), ) q = select(['*']).select_from(crosstab(crosstab_input, ret_types, categories=categories))
generates the query:
SELECT * FROM crosstab( $$SELECT raw.country, raw.year, raw.quantity FROM raw ORDER BY 1,2$$, $$SELECT DISTINCT raw.year FROM raw ORDER BY 1$$) AS ct(country TEXT, y1 INTEGER, y2 INTEGER, y3 INTEGER, y4 INTEGER)
crosstab.py also supplies row_total(), which allows you to sum a bunch of fields in a row while ignoring NULLs and pretending they were 0s. Otherwise, the NULLs would eat up the numbers and the total would be NULL. It does this under the hood by calling coalesce(x, 0) on each field and then summing them. This is meant to be used on the select part of a crosstab, such as:
select(['header', 'y1', 'y2', row_total('y1', 'y2').label('sum')])\ .select_from(crosstab(...))\ .order_by('sum')
Which generates:
select header, y1, y2, coalesce(y1, 0) + coalesce(y2, 0) as sum from crosstab(...) order by sum
This is useful for row totals and subtotals.
RetroSearch is an open source project built by @garambo | Open a GitHub Issue
Search and Browse the WWW like it's 1997 | Search results from DuckDuckGo
HTML:
3.2
| Encoding:
UTF-8
| Version:
0.7.4