Closed
Description
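substr should count positions and lengths in Unicode characters, not bytes. With a multi-byte character such as 'é', a byte offset can fall inside the character, which crashes the server instead of returning a result.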
To reproduce:
RisingWave:
=> select substr('Mér', 1, 2);
SSL SYSCALL error: EOF detected
PSQL:
=> select substr('Mér', 1, 2);
substr
--------
Mé
(1 row)
The same applies to 'Mér'::char(3).
Resources:
src/expr/src/vector_op/substr.rs:48:23
https://stackoverflow.com/questions/4249745/does-postgresql-varchar-count-using-unicode-character-length-or-ascii-character
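A minimal sketch of character-based slicing, for illustration only. `substr_chars` is a hypothetical helper, not the function in src/expr/src/vector_op/substr.rs, and its signature is an assumption. It follows the PostgreSQL convention of 1-based positions, where a start before position 1 uses up part of the count. One deliberate difference: a negative count returns an empty string here, whereas PostgreSQL raises an error.

```rust
/// Hypothetical helper: substring by Unicode scalar values (Rust `char`s),
/// not by bytes. `start` is 1-based, as in PostgreSQL's substr.
fn substr_chars(s: &str, start: i32, count: i32) -> String {
    // Exclusive end position (1-based). A negative count is clamped to 0,
    // which is an assumption of this sketch; PostgreSQL rejects it instead.
    let end = start.saturating_add(count.max(0));
    // Positions before 1 do not exist, so the real start is at least 1.
    let begin = start.max(1);
    if end <= begin {
        return String::new();
    }
    // Iterating over chars never splits a multi-byte character.
    s.chars()
        .skip((begin - 1) as usize)
        .take((end - begin) as usize)
        .collect()
}
```

With this helper, `substr_chars("Mér", 1, 2)` yields "Mé". By contrast, the byte slice `&"Mér"[0..2]` panics, because 'é' occupies bytes 1 and 2 in UTF-8 and byte index 2 is not a character boundary.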