diff options
author | Tom Lane <tgl@sss.pgh.pa.us> | 2014-10-16 15:22:10 -0400 |
---|---|---|
committer | Tom Lane <tgl@sss.pgh.pa.us> | 2014-10-16 15:22:10 -0400 |
commit | b2cbced9eef20692b51a84d68d469627f4fc43ac (patch) | |
tree | 21774c6c010312abd2045c5d7b73f63a6828ec2c /src/timezone/localtime.c | |
parent | 90063a7612e2730f7757c2a80ba384bbe7e35c4b (diff) | |
download | postgresql-b2cbced9eef20692b51a84d68d469627f4fc43ac.tar.gz postgresql-b2cbced9eef20692b51a84d68d469627f4fc43ac.zip |
Support timezone abbreviations that sometimes change.
Up to now, PG has assumed that any given timezone abbreviation (such as
"EDT") represents a constant GMT offset in the usage of any particular
region; we had a way to configure what that offset was, but not for it
to be changeable over time. But, as with most things horological, this
view of the world is too simplistic: there are numerous regions that have
at one time or another switched to a different GMT offset but kept using
the same timezone abbreviation. Almost the entire Russian Federation did
that a few years ago, and later this month they're going to do it again.
And there are similar examples all over the world.
To cope with this, invent the notion of a "dynamic timezone abbreviation",
which is one that is referenced to a particular underlying timezone
(as defined in the IANA timezone database) and means whatever it currently
means in that zone. For zones that use or have used daylight-savings time,
the standard and DST abbreviations continue to have the property that you
can specify standard or DST time and get that time offset whether or not
DST was theoretically in effect at the time. However, the abbreviations
mean what they meant at the time in question (or most recently before that
time) rather than being absolutely fixed.
The standard abbreviation-list files have been changed to use this behavior
for abbreviations that have actually varied in meaning since 1970. The
old simple-numeric definitions are kept for abbreviations that have not
changed, since they are a bit faster to resolve.
While this is clearly a new feature, it seems necessary to back-patch it
into all active branches, because otherwise use of Russian zone
abbreviations is going to become even more problematic than it already was.
This change supersedes the changes in commit 513d06ded et al to modify the
fixed meanings of the Russian abbreviations; since we've not shipped that
yet, this will avoid an undesirably incompatible (not to mention incorrect)
change in behavior for timestamps between 2011 and 2014.
This patch makes some cosmetic changes in ecpglib to keep its usage of
datetime lookup tables as similar as possible to the backend code, but
doesn't do anything about the increasingly obsolete set of timezone
abbreviation definitions that are hard-wired into ecpglib. Whatever we
do about that will likely not be appropriate material for back-patching.
Also, a potential free() of a garbage pointer after an out-of-memory
failure in ecpglib has been fixed.
This patch also fixes pre-existing bugs in DetermineTimeZoneOffset() that
caused it to produce unexpected results near a timezone transition, if
both the "before" and "after" states are marked as standard time. We'd
only ever thought about or tested transitions between standard and DST
time, but that's not what's happening when a zone simply redefines their
base GMT offset.
In passing, update the SGML documentation to refer to the Olson/zoneinfo/
zic timezone database as the "IANA" database, since it's now being
maintained under the auspices of IANA.
Diffstat (limited to 'src/timezone/localtime.c')
-rw-r--r-- | src/timezone/localtime.c | 108 |
1 files changed, 106 insertions, 2 deletions
diff --git a/src/timezone/localtime.c b/src/timezone/localtime.c index 85b227c9255..19a24e1d960 100644 --- a/src/timezone/localtime.c +++ b/src/timezone/localtime.c @@ -1292,9 +1292,9 @@ increment_overflow(int *number, int delta) } /* - * Find the next DST transition time after the given time + * Find the next DST transition time in the given zone after the given time * - * *timep is the input value, the other parameters are output values. + * *timep and *tz are input arguments, the other parameters are output values. * * When the function result is 1, *boundary is set to the time_t * representation of the next DST transition time after *timep, @@ -1445,6 +1445,110 @@ pg_next_dst_boundary(const pg_time_t *timep, } /* + * Identify a timezone abbreviation's meaning in the given zone + * + * Determine the GMT offset and DST flag associated with the abbreviation. + * This is generally used only when the abbreviation has actually changed + * meaning over time; therefore, we also take a UTC cutoff time, and return + * the meaning in use at or most recently before that time, or the meaning + * in first use after that time if the abbrev was never used before that. + * + * On success, returns TRUE and sets *gmtoff and *isdst. If the abbreviation + * was never used at all in this zone, returns FALSE. + * + * Note: abbrev is matched case-sensitively; it should be all-upper-case. + */ +bool +pg_interpret_timezone_abbrev(const char *abbrev, + const pg_time_t *timep, + long int *gmtoff, + int *isdst, + const pg_tz *tz) +{ + const struct state *sp; + const char *abbrs; + const struct ttinfo *ttisp; + int abbrind; + int cutoff; + int i; + const pg_time_t t = *timep; + + sp = &tz->state; + + /* + * Locate the abbreviation in the zone's abbreviation list. We assume + * there are not duplicates in the list. + */ + abbrs = sp->chars; + abbrind = 0; + while (abbrind < sp->charcnt) + { + if (strcmp(abbrev, abbrs + abbrind) == 0) + break; + while (abbrs[abbrind] != '\0') + abbrind++; + abbrind++; + } + if (abbrind >= sp->charcnt) + return FALSE; /* not there! */ + + /* + * Unlike pg_next_dst_boundary, we needn't sweat about extrapolation + * (goback/goahead zones). Finding the newest or oldest meaning of the + * abbreviation should get us what we want, since extrapolation would just + * be repeating the newest or oldest meanings. + * + * Use binary search to locate the first transition > cutoff time. + */ + { + int lo = 0; + int hi = sp->timecnt; + + while (lo < hi) + { + int mid = (lo + hi) >> 1; + + if (t < sp->ats[mid]) + hi = mid; + else + lo = mid + 1; + } + cutoff = lo; + } + + /* + * Scan backwards to find the latest interval using the given abbrev + * before the cutoff time. + */ + for (i = cutoff - 1; i >= 0; i--) + { + ttisp = &sp->ttis[sp->types[i]]; + if (ttisp->tt_abbrind == abbrind) + { + *gmtoff = ttisp->tt_gmtoff; + *isdst = ttisp->tt_isdst; + return TRUE; + } + } + + /* + * Not there, so scan forwards to find the first one after. + */ + for (i = cutoff; i < sp->timecnt; i++) + { + ttisp = &sp->ttis[sp->types[i]]; + if (ttisp->tt_abbrind == abbrind) + { + *gmtoff = ttisp->tt_gmtoff; + *isdst = ttisp->tt_isdst; + return TRUE; + } + } + + return FALSE; /* hm, not actually used in any interval? */ +} + +/* * If the given timezone uses only one GMT offset, store that offset * into *gmtoff and return TRUE, else return FALSE. */ |