Hello, I’m trying to implement case-insensitivity via collation on a Postgres 15 database but it doesn’t seem to be working. This is the recommended approach these days I believe from Postgres and other providers.
Example:
CREATE COLLATION public.case_insensitive2 (provider = icu, locale = 'und-u-ks-level2', deterministic = false);
CREATE TABLE public.case_test_opt5_2 (
id SERIAL PRIMARY KEY,
first_name TEXT NOT NULL COLLATE case_insensitive2,
last_name TEXT NOT NULL COLLATE case_insensitive2,
email TEXT NOT NULL COLLATE case_insensitive2
);
INSERT INTO public.case_test_opt5_2 (first_name, last_name, email) VALUES ('foo', 'bar', 'foo.bar@example.com'), ('foo', 'BAR', 'foo.BAR@EXAMPLE.COM'), ('foo', 'BaR', 'foo.BaR@EXAMPLE.COM');
SELECT * FROM public.case_test_opt5_2 WHERE last_name='bar';
The above should return three rows but instead, at least in my Postgres instance on Render, it returns one. I also tested this locally, with Postgres 15 as well, and it works fine.
Please help!
I found a dba stack exchange talking about this and some builds only support very old versions of ICU I guess, and they suggested this instead (note, the different collation definition):
CREATE COLLATION public.case_insensitive_old (
provider = 'icu',
locale = '@colStrength=secondary',
deterministic = false
);
CREATE TABLE public.case_test_opt5_3 (
id SERIAL PRIMARY KEY,
first_name TEXT NOT NULL COLLATE case_insensitive_old,
last_name TEXT NOT NULL COLLATE case_insensitive_old,
email TEXT NOT NULL COLLATE case_insensitive_old
);
INSERT INTO public.case_test_opt5_3 (first_name, last_name, email) VALUES ('foo', 'bar', 'foo.bar@example.com'), ('foo', 'BAR', 'foo.BAR@EXAMPLE.COM'), ('foo', 'BaR', 'foo.BaR@EXAMPLE.COM');
SELECT * FROM public.case_test_opt5_3 WHERE last_name='bar';
This seems to work, which is great, but why is Render using such an old ICU version?
Is it going to be easier to accomplish this by using lower()
in either your application’s language or in SQL queries directly in 100% of cases? I know a lot of data models where you preserve entered case but always lowercase (or uppercase) the inputs on comparisons.