PostgreSQL | A practical use for the HSTORE datatype
The HSTORE extension has been around quite a while, but until recently I've never found a situation where I can justify using it - here's a quick writeup of a simple use-case.
The application which runs this website is a homebrew one which I've been maintaining on-and-off for over a decade, initially so I could have a Perl'n'PostgreSQL-powered website which provided some functionality not otherwise available at the time; but also as a platform for experimenting with various database and web technologies. As such the underlying database schema has suffered some sprawl over the years, and recently I've been tidying things up.
One source of the sprawl are a couple of key/value tables I've created at some point to store arbitrary attributes to associate with records in other tables. For example, this application is basically a CMS which runs multiple sites from the same database; it has (and who'd've thought it) a table called "site", and associated with each site are a number of ad-hoc options which I've added as I've needed them. The table isn't very large and basically looks like this:
site_id | name | value ---------+------------------------+-------------------- 1 | admin_email | firstname.lastname@example.org 2 | admin_email | email@example.com 2 | outgoing_email_address | firstname.lastname@example.org 2 | rss_style | snippet 3 | rss_style | default 5 | admin_email | email@example.com 6 | admin_email | firstname.lastname@example.org 7 | admin_email | email@example.com 7 | site_login_path | /login.html 7 | site_register_path | /register.html 7 | outgoing_email_address | firstname.lastname@example.org (11 rows)
Now, in the grand scheme of things this table is not causing any administrative or performance problems, and could happily stay as-is forever more, but it would also be an ideal candidate for conversion to an HSTORE field on the main site table. It also means I can remove a mildly ugly application method which simulates an "UPSERT", as storing a key/value pair in HSTORE simply overwrites an existing key's value.
A rundown of the steps needed to implement this shiny new HSTORE field:
- Install the HSTORE extension on the PostgreSQL server:
For me this step is basically
cd contrib/hstore && make && sudo make install, YMMV
- As superuser, install the extension HSTORE in the desired database:
sitedb=# CREATE EXTENSION hstore; CREATE EXTENSION
Note: this extension installs a number of functions - ensure they get installed in the correct schema for the application user
- Add an HSTORE field to the site table:
ALTER TABLE site ADD options HSTORE NOT NULL DEFAULT '';
DEFAULT '' ensures an empty HSTORE will be created, which ensures "upserting" values into the HSTORE via concatenation will work
- Copy the key/value pairs from the site_options table into the options HSTORE field:
This sounds simple but we need to flatten multiple key/value records into the matching HSTORE field.To do this in a single query, we'll need to use the HSTORE(TEXT,TEXT) function which constructs an HSTORE from separate key/value arrays; it draws the arrays from a CTE thusly:
WITH attrs AS ( SELECT so.site_id , ARRAY_AGG( name ORDER BY name) attr_key, ARRAY_AGG( value ORDER BY name) attr_value FROM site_option so GROUP BY so.site_id ) UPDATE site si SET options = HSTORE(attrs.attr_key, attrs.attr_value) FROM attrs WHERE si.site_id =attrs.site_id
(Thanks to David J. and Andreas for the suggestions pn using ARRAY_AGG())
- Update the application queries used to manage the site attributes:
BEFORE AFTER Get attribute value
SELECT value FROM site_option WHERE site_id=? AND name=?
SELECT options->? FROM site WHERE site_id=?
Get all attributes
SELECT name, value FROM site_option WHERE site_id=?
SELECT key AS name, value FROM EACH( (SELECT options FROM site WHERE site_id=?) )
SELECT 1 FROM site_option WHERE name=? AND site_id=?
UPDATE site_option SET value=? WHERE name=? AND site_id=?
INSERT INTO site_option (site_id, name, value) VALUES(?, ?, ?)
UPDATE site SET options = options || HSTORE(?,?) WHERE site_id=?
Et voila. One less table.