PostgreSQL | Converting tsearch2 to 8.3
One of the major features in the upcoming 8.3 release is the integration of
the tsearch2 full text search extension as a
core PostgreSQL feature.
While there are no fundamental changes, there are some differences which make upgrading from an
existing installation a little tricky. The following are my notes from upgrading a test version of the database
which powers this website.
First off, to make administration easier, I've always kept the tsearch2
tables, functions etc. in their own schema (cunningly named tsearch) - this makes
the job of separating the legacy tsearch2 elements from the rest of the database
a lot easier.
-
DML dump: Using 8.3's
pg_dump, extract the DML except for thetsearchschema:~/pg83b2/bin/pg_dump -p5431 -U postgres -N tsearch -s cms > /home/ian/outgoing/cms-dml.sql
(The-Noption, introduced in PostgreSQL 8.1, specifies schemas to exclude).
-
Schema modification: remove any references to the
tsearchschema; change index definitions fromGISTtoGIN. Note that the names oftsearch2functions have changed, and are now prefixed withts_, e.g.headline()is nowts_headline(). Any backend functions which use these will have to be modified as well. Import the modified schema into the new 8.3 database.
-
Data dump: as with the DML, excluding the
tsearchschema:~/pg83b2/bin/pg_dump -p5431 -U postgres -N tsearch -a cms -Fc > /home/ian/outgoing/cms-data.pg
Restore will probably require the--disable-triggersoption:~/pg83b2/bin/pg_restore --disable-triggers -U postgres -Fc -d cms -p5431 cms-data.pg
-
Code modification:
As mentioned above,
tsearch2function names have changed; any which are used in application code will need to be changed too. (A good argument for encapsulating this kind of stuff into server-side functions, but that's life with "legacy" software... Fortunately the application only uses a few standardized full text search queries, neatly contained in their own library, so modifying them was fairly simple).
And voila, it works - at least more-or-less. What's still missing here are any configuration
changes to the original database's tsearch schema, e.g. the
pg_ts_config table. (And having done all this, no doubt someone will come along
and tell me about the easier way...)
(Added) some useful links:
- Tsearch2 8.3 changes from Oleg Bartunov, including list of renamed functions
- Thread on upgrading issues from the pgsql-hackers list: full text search in 8.3