Hi. I have some data in my database that is a string representation of integers (mostly, there are some alphabetic characters here and there). I want to query this field using postgres’ collation, specifically I want to use
create collation numeric (provider = icu, locale = 'en-u-kn-true');
select '20' between '10' and '100' collate numeric;
and I expect to get true. But I get false instead. This seems to contradict the postgres documentation
CREATE COLLATION numeric (provider = icu, locale = 'en-u-kn-true');
CREATE COLLATION numeric (provider = icu, locale = 'en@colNumeric=yes');
Numeric ordering, sorts sequences of digits by their numeric value, for example: A-21
< A-123
(also known as natural sort).
It seems that icu support is present because I can see icu entries when I run select * from pg_collation;
Running the same code locally and within github actions works. Render seems to be the odd man out, but I’m open to the possibility that I’m missing something.
Here’s a minimal reproduction that works on all the other postgres databases I’ve tried so far:
=> create collation numeric (provider = icu, locale = 'en-u-kn-true');
CREATE COLLATION
=> select '20' between '10' and '100';
┌──────────┐
│ ?column? │
╞══════════╡
│ f │
└──────────┘
(1 row)
=> select '20' between '10' and '100' collate numeric;
┌──────────┐
│ ?column? │
╞══════════╡
│ t │
└──────────┘
(1 row)
On render’s DB, the final query returns false.
Hi @PaulOstazeski! Thanks for the detailed description and repro steps here! I was able to replicate the behavior on my end as well. It seems that the PostgreSQL image is using an older libicu version, which doesn’t support specifying that locale. We do expect this to be addressed once we roll out full support for PostgreSQL 12/13, which is currently in progress (early access now).
1 Like
Great! I’ll email support and see about getting early access.
Hi, I’m still actually having issues with custom collation for case-insensitivity. Do we know whether this is actually working?
I did the following and the select only selects one row instead of the expected three…
CREATE COLLATION public.case_insensitive2 (provider = icu, locale = 'und-u-ks-level2', deterministic = false);
CREATE TABLE public.case_test_opt5_2 (
id SERIAL PRIMARY KEY,
first_name TEXT NOT NULL COLLATE case_insensitive2,
last_name TEXT NOT NULL COLLATE case_insensitive2,
email TEXT NOT NULL COLLATE case_insensitive2
);
INSERT INTO public.case_test_opt5_2 (first_name, last_name, email) VALUES ('foo', 'bar', 'foo.bar@example.com'), ('foo', 'BAR', 'foo.BAR@EXAMPLE.COM'), ('foo', 'BaR', 'foo.BaR@EXAMPLE.COM');
SELECT * FROM public.case_test_opt5_2 WHERE last_name='bar';
I started another forum thread and posted a current solution there; but I would hope that Render upgrades ICU version soon
The issue I descibed has been correctly working on render for several years now. So far I’ve seen postgres versions >= 12 on render have all been fine.
We actually added tests to our product’s suite for exactly this (so we’d know if we needed to handle sorting in application code or not) and I haven’t encountered the problem since August of 2021.
Interesting. I ran this on my Render DB and get a really old version:
select oid, collname, collversion "orig_version" , pg_collation_actual_version(oid) "curr_version" from pg_collation
where collname = 'case_insensitive'
On Render the version is 58.0.0.50
and locally it’s 153.120
.
I did migrate to Render from Heroku… I wonder if that migration kept the old version.