"PostgreSQL Backup and Restore How-to" review
PostgreSQL 9.3: feature preview articles
Logging changes to postgresql.conf
Long time no post...
Introducing PostgreNoSQL
Capture and store a Twitter search in a single SQL query using twitter_fdw
Installing MySQL on a Ubuntu/Debian server
SQL Notes
Having worked with PostgreSQL continuously since 2001, I'd like to think there's nothing I don't know about backing up and restoring. However, experience shows that a) it's all too easy to develop "muscle memory" for a certain way of doing things, and b) PostgreSQL has a pesky habit of developing useful new features which fly under the radar if you're not paying sufficient attention to the release notes, so any opportunity to review things from a fresh perspective is a welcome one.
I ordered the paper version of "PostgreSQL Backup and Restore How-to" by Shaun Thomas from Amazon Japan for a tad under 2,000 yen, which is a little on the expensive side for what was described as a "booklet" (and four times the price of the eBook version), but I have a (meagre) book budget to burn and an old-fashioned habit of scribbling notes in margins and making little bookmarks from Post-It notes etc. It's also nice to spend some time not staring at a screen. To my surprise it arrived less than 48 hours after ordering; upon closer examination it turned out the book was actually printed by Amazon in Japan, which is kind of nifty.
First impression: it's a very thin volume - 42 actual content pages - so is genuinely a booklet. On the other hand, bearing in mind I've got a bookshelf full of largely unread weighty tomes, less can be more.
The booklet's table of contents, as lifted directly from the publisher's site, is:
- Getting a basic export (Simple)
- Partial database exports (Simple)
- Restoring a database export (Simple)
- Obtaining a binary backup (Simple)
- Stepping into TAR backups (Intermediate)
- Taking snapshots (Advanced)
- Synchronizing backup servers (Intermediate)
- Restoring a binary backup (Simple)
- Point in time recovery (Intermediate)
- Warm and hot standby restore (Intermediate)
- Streaming replication (Advanced)
It's Sunday morning here in Japan, which in my case means it's an excellent time for a round of database server updates without interrupting production flow (lucky me). None of the databases in question are directly vulnerable to the recent security issue as for some crazy reason I prefer not to have port 5432 swinging in the Internet breeze for all and sundry to probe. However updates are updates, and the sooner applied the better - you never know what creative attack vectors all and sundry will dream up.
While I was updating, I took the opportunity to perform the odd bit of administrative TLC, which involves editing the postgresql.conf file, which in turn involves manually checking the changed version into source control, which is mildly onerous. It's also not convenient for tracking changes to individual configuration items over time, and it would be kind of handy to record the database settings in the database itself. Also, if it's possible to record exactly which configuration file a setting was taken from (a potential issue if the 'include' directive is used), it might be helpful when tracking down errors.
Anyway, it occurred to me that pg_settings displays information about which configuration items were set from values explicitly defined in postgresql.conf (and also notes which line in which file they were set on), so it should be possible to track changes on the basis of pg_settings' output:
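As a rough illustration of the idea, the change tracking could be sketched in Python along the following lines. This is a minimal sketch under stated assumptions, not the approach the post itself goes on to use: the function name diff_settings, the sample values and the file paths are all hypothetical. The query relies on the name, setting, source, sourcefile and sourceline columns of pg_settings, and assumes snapshots are fetched as plain tuples by whatever database driver is to hand.

```python
# Hypothetical helper: compares two snapshots of pg_settings rows.
# Each snapshot would come from a query such as SNAPSHOT_QUERY below;
# all names and sample values here are illustrative assumptions.

SNAPSHOT_QUERY = """
    SELECT name, setting, sourcefile, sourceline
      FROM pg_settings
     WHERE source = 'configuration file'
"""


def diff_settings(old_rows, new_rows):
    """Return a list of (name, old, new) tuples describing what changed.

    Each row is a (name, setting, sourcefile, sourceline) tuple. `old`
    or `new` is None when the item appeared in or disappeared from the
    configuration files between the two snapshots.
    """
    old = {row[0]: row[1:] for row in old_rows}
    new = {row[0]: row[1:] for row in new_rows}
    changes = []
    for name in sorted(old.keys() | new.keys()):
        before, after = old.get(name), new.get(name)
        if before != after:
            changes.append((name, before, after))
    return changes


if __name__ == "__main__":
    # Made-up sample snapshots standing in for real query results.
    before = [
        ("shared_buffers", "16384", "/etc/postgresql/postgresql.conf", 113),
        ("work_mem", "1024", "/etc/postgresql/postgresql.conf", 120),
    ]
    after = [
        ("shared_buffers", "32768", "/etc/postgresql/postgresql.conf", 113),
        ("log_min_duration_statement", "250", "/etc/postgresql/logging.conf", 4),
    ]
    for name, old_value, new_value in diff_settings(before, after):
        print(name, old_value, "->", new_value)
```

Because the source file and line number are part of each row, a setting that moves from postgresql.conf into an included file shows up as a change even if its value stays the same.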
The venerable SQL language has been in existence for a good four decades now, tracing its origins to an era where the punch card was still a viable input method and a computer was something you filled a room with, not put in your pocket. While SQL was a ground-breaking technology at the time and has served the data storage industry well during the intervening years, its heritage from the era of character-orientated terminals, line-feed printers and COMMANDS IN UPPER CASE is proving an increasing impediment in our modern world of cloud-hosted distributed global networks pushing social content to always-connected portable touchscreen devices.
I therefore propose that it is time for the PostgreSQL project to let SQL fade into a well-deserved retirement and get in on the ground floor to ride the coming NoSQL wave. This is a radical step, but it will not be the first time PostgreSQL has switched to a new core language, and I feel the PostgreSQL code base is in an excellent position to handle the transition especially once NoSQL evangelists have reached out to the core developers.
Details on mapping the reduced complexity provided by NoSQL are still being hashed out, but it's likely PostgreNoSQL's NoSQL functionality will coalesce around the HSTORE datatype, currently available as a contrib module but which will form a streamlined, distributed core implemented in Node.js and communicating exclusively via the JSON protocol (with 90's-style XML support being available for an interim period). The confusing plethora of index types will be removed except for the hash type, because that sounds cool. This will enable application developers to create their own index methods as required, as they will no longer be restricted by the fuddy-duddy "Daddy knows best" attitude inherent in legacy RDBMSs.

The PostgreNoSQL mascot
Type checking and constraints will also be removed, further reducing complexity and the overhead they entail, while empowering application developers to manage their data in the way they see best. The venerable command-line orientated client application psql will be deprecated, to be replaced by a touch-screen compatible app available for both iOS and Android, while a legacy browser version curated in JavaScript with "magic pixels" in the corners of the screen will provide accessibility to old-fashioned users who have not yet converted away from the dated mouse-orientated paradigm.
Of course, many users will be wondering what will happen to the many applications and projects which are written with PostgreSQL's historical SQL capability in mind. It's not unreasonable to expect a transition period of as long as 18 months for existing application code to be ported, during which time the current PostgreSQL version will be maintained under the title "PostgreSQLegacy". Meanwhile the future branch of the project, known as "PostgreNoSQL", or "ReNo" for short, will be marketed with the confidence-inspiring slogan:
"What goes into ReNo stays in ReNo"
There have been quite a few excellent articles and blog posts in the last few months previewing features in the upcoming PostgreSQL 9.3 release. I've been collating them as they scroll off the bottom of Planet PostgreSQL pretty quickly, and thought it might be useful to share the list. I'll continue updating the list with any new articles; if I've missed any, please let me know in the comments. (Note that I haven't yet confirmed the current status of all the features listed.)
Some quick notes on installing MySQL 5 on a Debian or Ubuntu server using the command line.
1. Determine the available MySQL version(s)
This step is optional; to find out what MySQL versions are available, execute
root@server ~ # apt-cache search mysql-server
auth2db - Powerful and eye-candy IDS logger, log viewer and alert generator
mysql-cluster-server - MySQL database server (metapackage depending on the latest version)
mysql-cluster-server-5.1 - MySQL database server binaries
torrentflux - web based, feature-rich BitTorrent download manager
mysql-server - MySQL database server (metapackage depending on the latest version)
mysql-server-5.1 - MySQL database server binaries
mysql-server-core-5.1 - MySQL database core server files
cacti - Frontend to rrdtool for monitoring systems and services
Normally the mysql-server package will exist as a meta package which will install the latest MySQL version available on the system - in this case 5.1.
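If the version check needs to be scripted rather than eyeballed, something like the following Python sketch could pick the versioned server packages out of the search results. It is only an illustration: the function name server_versions is hypothetical, and the sample text is a subset of the apt-cache output shown above rather than live output.

```python
import re

# Sample lines copied from the `apt-cache search mysql-server` output above.
SAMPLE_OUTPUT = """\
mysql-cluster-server - MySQL database server (metapackage depending on the latest version)
mysql-cluster-server-5.1 - MySQL database server binaries
mysql-server - MySQL database server (metapackage depending on the latest version)
mysql-server-5.1 - MySQL database server binaries
mysql-server-core-5.1 - MySQL database core server files
cacti - Frontend to rrdtool for monitoring systems and services
"""


def server_versions(apt_output):
    """Return the version suffixes of packages named mysql-server-<version>."""
    versions = []
    for line in apt_output.splitlines():
        # Each result line has the form "<package> - <description>".
        package, _, _description = line.partition(" - ")
        match = re.fullmatch(r"mysql-server-(\d+(?:\.\d+)*)", package.strip())
        if match:
            versions.append(match.group(1))
    return versions


if __name__ == "__main__":
    print(server_versions(SAMPLE_OUTPUT))
```

The pattern deliberately skips the unversioned mysql-server metapackage as well as the -core and -cluster variants, so only the concrete server versions are returned.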
One of the many things I've been wanting to do with this site is add a Planet PostgreSQL feed. However, as I've mentioned previously, this is a custom application, and while knocking together a feed reader is pretty routine stuff, it's not going to leave me any more enlightened than I was before.
However recently at the PostgreSQL "Unconference" in Tokyo, one of the talks was by Hitoshi Harada who demonstrated twitter_fdw, and it occurred to me that as Planet PostgreSQL twitters the updates to its own Twitter account, it might be simple to grab the feed that way.
And it is - follow the instructions in the README.md file, execute a query like the below and back comes a list of recent tweets - no setup, login, API key etc. required.
The HSTORE extension has been around quite a while, but until recently I've never found a situation where I can justify using it - here's a quick writeup of a simple use-case.
The application which runs this website is a homebrew one which I've been maintaining on-and-off for over a decade, initially so I could have a Perl'n'PostgreSQL-powered website which provided some functionality not otherwise available at the time; but also as a platform for experimenting with various database and web technologies. As such the underlying database schema has suffered some sprawl over the years, and recently I've been tidying things up.
One source of the sprawl is a couple of key/value tables I created at some point to store arbitrary attributes associated with records in other tables. For example, this application is basically a CMS which runs multiple sites from the same database; it has (and who'd've thought it) a table called "site", and associated with each site are a number of ad-hoc options which I've added as I've needed them. The table isn't very large and basically looks like this:
As mentioned previously, the OSPN 2013 Tokyo/Spring open source conference was held on February 22/23, and I was able to attend a few sessions on both days. While there was a wide range of sessions covering everything from open source licensing to a project to make available every obscure Chinese character variant (all 60,000 of them), I was interested primarily in the database-related sessions. The open-source database ecosystem was represented by PostgreSQL and the MySQL'n'derivatives family.
