8000 Fix tolower()/toupper() implementation to not lead to undefined behavior by rouault · Pull Request #4539 · OSGeo/PROJ · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

Fix tolower()/toupper() implementation to not lead to undefined behavior #4539

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jul 4, 2025

Conversation

rouault
Copy link
Member
@rouault rouault commented Jul 3, 2025

Fixes #4537

@rouault rouault added this to the 9.7.0 milestone Jul 3, 2025
@rouault rouault added the funded through GSP Work funded through the GDAL Sponsorship Program label Jul 3, 2025
ret[i] = static_cast<char>(::tolower(ret[i]));
for (char &ch : ret)
ch =
(ch >= 'A' && ch <= 'Z') ? static_cast<char>(ch + ('a' - 'A')) : ch;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will likely fail with EBCDIC but probably that's ok :-)

But more seriously, I wonder why the cast to unsigned char is bad. It allows ctype(3) implementations that use lookup tables and are thus fast, in general. (I get it that this isn't used much and that its performance isn't important.)

I don't object, though, and will test this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that use lookup tables and are thus fast,

I'm quite skeptical that lookup based code would be faster than branch testing. Modern processors are much better at doing branch prediction than fetching bytes from slow memory. Anyway this implementation or the alternate can only deal with ASCII, and that's fine given its intended purpose.

@gdt
Copy link
Contributor
gdt commented Jul 3, 2025

With this PR on top of 9.6.2, proj has one test failure (vs 2), and qgis builds.

The remaining test issue is

Command: "/tmp/work/geography/proj/work/proj-9.6.2/cmake-pkgsrc-build/bin/gie_self_tests"
Directory: /tmp/work/geography/proj/work/proj-9.6.2/cmake-pkgsrc-build/test/unit
"gie_self_tests" start time: Jul 03 18:08 EDT
Output:
----------------------------------------------------------
[==========] Running 18 tests from 2 test suites.
[----------] Global test environment set-up.
[----------] 15 tests from gie
[ RUN      ] gie.cart_selftest
utm: Invalid latitude
utm: Invalid latitude
utm: Invalid latitude
[       OK ] gie.cart_selftest (9 ms)
[ RUN      ] gie.info_functions
Unrecognized vertical grid format for filename 'proj.db'
Unrecognized horizontal grid format for filename 'proj.db'
ctype(3) isgraph: invalid input: -62
<end of output>
Test time =   0.02 sec
----------------------------------------------------------
Test Failed.
"gie_self_tests" end time: Jul 03 18:08 EDT
"gie_self_tests" time elapsed: 00:00:00
----------------------------------------------------------

which looks to be in src/dmstor.cpp:dmstor_ctx.

Adding a cast to unsigned char gets me a clean pass.

--- src/dmstor.cpp.orig 2025-04-01 21:34:48.000000000 +0000
+++ src/dmstor.cpp
@@ -46,7 +46,7 @@ double dmstor_ctx(PJ_CONTEXT *ctx, const
      * It is possible that a really odd input (like lots of leading zeros)
      * could be truncated in copying into work.  But ...
      */
-    while ((isgraph(*p) || *p == DEG_SIGN1 || *p == DEG_SIGN2) && --n)
+    while ((isgraph((unsigned char) *p) || *p == DEG_SIGN1 || *p == DEG_SIGN2) && --n)
         *s++ = *p++;
     *s = '\0';
     int sign = *(s = work);

@jjimenezshaw
Copy link
Contributor

Interesting comment in the "notes" in https://en.cppreference.com/w/cpp/string/byte/isgraph

@rouault
Copy link
Member Author
rouault commented Jul 4, 2025

isgraph() use now fixed

@rouault rouault merged commit a3cf3ca into OSGeo:master Jul 4, 2025
27 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
funded through GSP Work funded through the GDAL Sponsorship Program
Projects
None yet
Development

Successfully merging this pull request may close these issues.

proj can invoke ctype(3) functions with invalid input, leading to UB
3 participants
0