author: Tom Lane <tgl@sss.pgh.pa.us> 2014-06-10 22:48:16 -0400
committer: Tom Lane <tgl@sss.pgh.pa.us> 2014-06-10 22:48:31 -0400
commit: 7f9fbb842b09da851e003a70c6c53fd8ca6c6f82
tree: ce3811eb8fdb53706e11789c5ac113c1b3b6278b /src
parent: ab76208e3df6841b3770edeece57d0f048392237
Fix ancient encoding error in hungarian.stop.

When we grabbed this file off the Snowball project's website, we mistakenly supposed that it was in LATIN1 encoding, but evidently it was actually in LATIN2. This resulted in ő (o-double-acute, U+0151, which is code 0xF5 in LATIN2) being misconverted into õ (o-tilde, U+00F5), as complained of in bug #10589 from Zoltán Sörös. We'd have messed up u-double-acute too, but there aren't any of those in the file. Other characters used in the file have the same codes in LATIN1 and LATIN2, which no doubt helped hide the problem for so long.

The error is not only ours: the Snowball project also was confused about which encoding is required for Hungarian. But dealing with that will require source-code changes that I'm not at all sure we'll wish to back-patch. Fixing the stopword file seems reasonably safe to back-patch however.
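
The misconversion described above can be reproduced with a few lines of Python. This is only an illustrative sketch, not part of the commit: the helper names (`misdecode`, `repair`, `differing_bytes`) are hypothetical, and it assumes Python's standard `iso8859-1` and `iso8859-2` codecs correspond to LATIN1 and LATIN2.

```python
# Illustrative sketch only; helper names are hypothetical, not from PostgreSQL.
LATIN1 = "iso8859-1"  # assumed equivalent of LATIN1
LATIN2 = "iso8859-2"  # assumed equivalent of LATIN2


def misdecode(text):
    """Encode text as LATIN2, then wrongly decode those bytes as LATIN1."""
    return text.encode(LATIN2).decode(LATIN1)


def repair(text):
    """Undo the mistake: recover the original bytes, then decode as LATIN2."""
    return text.encode(LATIN1).decode(LATIN2)


def differing_bytes():
    """Byte values in 0xA0..0xFF that decode differently in the two encodings."""
    return [
        b
        for b in range(0xA0, 0x100)
        if bytes([b]).decode(LATIN1) != bytes([b]).decode(LATIN2)
    ]


if __name__ == "__main__":
    for word in ["először", "ők", "elég", "össze"]:
        print(word, "->", misdecode(word))
    print([hex(b) for b in differing_bytes()])
```

Words containing only é or ö pass through unchanged because those characters share the same byte in both encodings, which is consistent with the problem going unnoticed for so long; only bytes such as 0xF5 (ő) and 0xFB (ű) are affected.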
Diffstat (limited to 'src')
-rw-r--r-- src/backend/snowball/stopwords/hungarian.stop | 14
1 file changed, 7 insertions, 7 deletions
diff --git a/src/backend/snowball/stopwords/hungarian.stop b/src/backend/snowball/stopwords/hungarian.stop
index 94e9f9a0b07..abfd35ce976 100644
--- a/src/backend/snowball/stopwords/hungarian.stop
+++ b/src/backend/snowball/stopwords/hungarian.stop
@@ -55,10 +55,10 @@ ekkor
el
elég
ellen
-elõ
-elõször
-elõtt
-elsõ
+elő
+először
+előtt
+első
én
éppen
ebben
@@ -149,9 +149,9 @@ nincs
olyan
ott
össze
-õk
-õket
+ők
+őket
pedig
persze