2014-01-27 06:28:00

PostgreSQL | TO_DATE() / TO_TIMESTAMP() gotchas

A recent thread on -general sparked my interest as I'm working with a large and heterogeneous code base recently converted from Oracle which makes heavy use of TO_DATE(), and it seems this function has a couple of gotchas worthy of consideration.

1. TO_DATE() does not really validate dates.

Even if passed an invalid date, it will happily munge it into a nominally valid date - which is probably not what you want:

    # SELECT to_date('2014/02/31', 'YYYY/MM/DD');
    (1 row)

A possible workaround is to explicitly cast the string literal to DATE, e.g.:

    # SELECT '2014/02/31'::DATE;
    ERROR:  date/time field value out of range: "2014/02/31"
    LINE 1: SELECT '2014/02/31'::DATE;

    # SELECT '2014/02/28'::DATE;
    (1 row)

but this won't work with as wide a range of date formats as TO_DATE().

If you really need to parse arbitrary date strings, a function like this might make things safer by converting the output date to the original format, comparing input and output strings and raising an error if the value fails a round-trip:

    CREATE OR REPLACE FUNCTION to_date_safe(p_date TEXT, p_format TEXT, OUT r_date DATE)
      LANGUAGE plpgsql
    AS $$
      r_date = TO_DATE(p_date, p_format);
      IF TO_CHAR(r_date, p_format) != p_date THEN
        RAISE EXCEPTION 'Input date % does not match output date %', p_date, r_date;
      END IF;

This will mean a slight performance hit, and there might be side-effects I haven't thought about yet, but it seems preferable to silently accepting an incorrect date.

    # SELECT to_date_safe('2014/02/31', 'YYYY/MM/DD');
    ERROR:  Input date 2014/02/31 does not match output date 2014-03-03

2. PostgreSQL's TO_DATE() is not Oracle's TO_DATE()

In Oracle, the DATE datatype is actually equivalent to PostgreSQL's TIMESTAMP(0) - meaning Oracle's TO_DATE() is actually working with timestamps - whereas PostgreSQL's TO_DATE() is working with dates only. Crucially, PostgreSQL's TO_DATE() will ignore any time specifications, meaning the following comparison will not return the result you might be expecting if familiar with the Oracle version:

    # SELECT TO_DATE('2014-01-26 10:00:00', 'YYYY-MM-DD HH24:MI:SS') <
             TO_DATE('2014-01-26 22:00:00', 'YYYY-MM-DD HH24:MI:SS');

If you are converting TO_DATE() to PostgreSQL, use TO_TIMESTAMP() unless you are comparing calendar dates only. However TO_TIMESTAMP() suffers from the same issue as TO_DATE().

    # SELECT TO_TIMESTAMP('2014-02-31 25:99:00', 'YYYY-MM-DD HH24:MI:SS');
     2014-03-04 02:39:00+09
    (1 row)

Again, a simple wrapper function doing reverse verification should provide some protection:

    CREATE OR REPLACE FUNCTION to_timestamp_safe(p_date TEXT, p_format TEXT, OUT r_date TIMESTAMP)
      LANGUAGE plpgsql
    AS $$

      r_date = TO_TIMESTAMP(p_date, p_format);
      IF TO_CHAR(r_date, p_format) != p_date THEN
        RAISE EXCEPTION 'Input date % does not match output date %', p_date, r_date;
      END IF;

Not exhaustively tested but seems to work:

    # SELECT to_timestamp_safe('2014/02/31 10:00:00', 'YYYY/MM/DD HH24:MI:SS');
    ERROR:  Input date 2014/02/31 10:00:00 does not match output date 2014-03-03 10:00:00

    # SELECT to_timestamp_safe('2014/02/28 10:00:00', 'YYYY/MM/DD HH24:MI:SS');
     2014-02-28 10:00:00
    (1 row)

See also this page for some background on why this issue is still around

Posted at 2014-01-27 06:28:00

Thanks for posting this. Reminded me to go back and review an Oracle conversion we'd done and sure enough I found 3 places where people had used TO_DATE and we're losing timestamp data. Luckily nothing where it really mattered :)
Posted by: Brian Dunavant | 2014-01-27 16:12
Helpful workaround.
Posted by: Navin Chakraborty | 2014-01-28 04:29