FreeTDS Forces UTF-8 Conversion Despite CP936 Setting in Sybase ASE · Issue #645 · FreeTDS/freetds

Open
ShiroiSkyy opened this issue Apr 9, 2025 · 4 comments
ShiroiSkyy commented Apr 9, 2025

Environment:

  • Database: Sybase ASE 12.0 (CP850 charset)
  • Driver: FreeTDS 1.6.dev.20250329
  • OS: Windows Server 2022
  • Connection: PHP PDO_ODBC
  • DSN:
    odbc:Driver={FreeTDS};
    TDS_Version=5.0;
    ServerName=Sybase;
    Server=127.0.0.1,5000;
    Database=pos;
    ClientCharset=UTF-8;
    

Problem Description:
When executing Chinese character updates via PHP PDO_ODBC:

  1. Update statement:
    $sql = "update product set pro_scxkz = '中文字符' where pro_id='1234567'";
    $mbde = mb_detect_encoding($sql, ['GB18030','GBK','UTF-8','ASCII','CP850','CP936','BIG5'], true);
    $sql = mb_convert_encoding($sql, "CP936", $mbde);
    $result = $conn->query($sql);
  2. Subsequent SELECT returns UTF-8 encoded data, despite explicit CP936 conversion

Observed Behavior:

  • Update HEX result: e4b8ade69687e5ad97e7aca6 (UTF-8 bytes)

    • GB18030/GBK/CP936 decode attempts fail (mojibake output: 涓枃瀛楃)
    • Only UTF-8 decoding works: "中文字符"
  • Expected behavior (as seen in PowerBuilder 6.5):

    • Normal HEX value: cbd53230313630333132 (CP936)
    • Correctly decodes to: "苏20160312" in GB18030/GBK/CP936
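The two hex dumps above can be checked offline. The following is a minimal Python sketch, assuming Python's `gbk` codec is a close enough stand-in for CP936; it only decodes the byte strings quoted in this report and does not talk to the database or to FreeTDS.

```python
# Hex dumps quoted in the report.
utf8_hex = "e4b8ade69687e5ad97e7aca6"   # value stored after the update through FreeTDS
cp936_hex = "cbd53230313630333132"      # value stored by PowerBuilder 6.5

utf8_bytes = bytes.fromhex(utf8_hex)
cp936_bytes = bytes.fromhex(cp936_hex)

# The first dump is valid UTF-8.
as_utf8 = utf8_bytes.decode("utf-8")

# Reading the same bytes as GBK gives mojibake; errors="replace" keeps the
# script running if a byte pair has no mapping in Python's GBK table.
as_gbk = utf8_bytes.decode("gbk", errors="replace")

# The PowerBuilder dump decodes cleanly as GBK (assumed equivalent to CP936 here).
pb_value = cp936_bytes.decode("gbk")

print(as_utf8)
print(as_gbk)
print(pb_value)
```

The exact mojibake printed for the GBK attempt may differ from what a Windows CP936 tool shows, since codec tables vary in how they treat unassigned byte pairs.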

Key Observations:

  1. Sybase DB (CP850) can store CP936 Chinese chars (confirmed via PB6.5)
  2. FreeTDS appears to force UTF-8 conversion despite:
    • Explicit CP936 conversion in PHP
    • ClientCharset=UTF-8 in DSN

Request:
How to properly configure FreeTDS to:

  1. Disable automatic UTF-8 conversion
  2. Maintain original CP936 encoding for Chinese characters
  3. Achieve parity with PowerBuilder's behavior

Additional Notes:

  • The database's nominal CP850 charset shouldn't prevent CP936 Chinese storage (as demonstrated by PB6.5)
  • Need solution that works with TDS 5.0 (Sybase 12.0 requirement)

No matter how I change ClientCharset in the DSN (UTF-8, CP936, or GB2312), FreeTDS always seems to send the SQL to the Sybase server in UTF-8.

@freddy77
Contributor

I will try to explain why what you are trying to do is wrong. As you stated, your database is installed with the CP850 charset. That is, the characters it should handle are those of CP850 (see https://en.wikipedia.org/wiki/Code_page_850). As you can see, this character set does not contain any Chinese characters. What you are doing is inserting some binary code and expecting to get the same bytes back. This can seem to work: you "mangle" your charset and then "un-mangle" it back, hoping for the best. However, tools that handle the encoding as the server expects will show "weird" characters. Queries can seem to work too, until you realize that string comparisons are not what you expect, because the server is using CP850 collation. You could end up corrupting your data when using tools like backup and restore.

Now... one question is "Can I get the same dirty hack I used with PowerBuilder with FreeTDS?". I don't endorse this, and if anyone asks I didn't suggest it, but what you want is to set the FreeTDS client encoding to the same one as the server's. This way the bytes you send to the FreeTDS library will be copied byte-for-byte to the server. The exception is when the PHP library uses wide characters: in that case FreeTDS will do the right thing and allow you to use only CP850, not any random binary data.
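The "mangle" and "un-mangle" behaviour described above can be illustrated without a server. This is a rough Python sketch under stated assumptions: Python's `gbk` codec stands in for CP936, and the pass-through case is modelled simply as "the bytes are stored unchanged"; it does not exercise FreeTDS itself.

```python
text = "中文字符"

# Bytes the application produces after its own CP936 conversion.
cp936_bytes = text.encode("gbk")

# If client and server charsets match, the bytes are stored unchanged.
# A tool that honours the server's CP850 charset then shows unrelated characters.
seen_as_cp850 = cp936_bytes.decode("cp850")

# A client that knows about the hack can still recover the original text,
# because CP850 assigns a character to every byte value.
restored = seen_as_cp850.encode("cp850").decode("gbk")

# A genuine conversion to CP850 has no target for Chinese characters,
# so each one is replaced with "?".
converted = (text + "123").encode("cp850", errors="replace")

print(seen_as_cp850)
print(restored)
print(converted)
```

The last line is the kind of output to expect when a driver really converts to CP850 instead of passing bytes through, which is consistent with the question marks reported later in this thread.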

@ShiroiSkyy
Author
ShiroiSkyy commented Apr 16, 2025

@freddy77

If I set the client DSN charset to CP850 and do not perform encoding conversion in the PHP script, the SQL statement is UTF-8 encoded. When executing the following code directly:

$SQL = "update product set pro_scxkz='中文字符123' where pro_id='8124560'";
$conn->query($SQL);

The result in freetds is ??????123. Attempting to encode ??????123 as GBK or UTF-8 yields the same result, while PowerBuilder retrieves 娑擃厽鏋冪€涙顑?23. I don’t understand what happened in between.

If I encode the SQL statement as CP850 in PHP, this clearly fails—only CP936 works, but the stored value in the Sybase database remains non-standard.
After encoding the SQL as CP936 and executing it:

  • freetds: ????123
  • PowerBuilder: 涓枃瀛楃123

When checking the freetds log, it appears freetds treats both Sybase and PHP-DSN as UTF-8.
Below is the debug code I added to tds_iconv() in FreeTDS (iconv.c):

tdsdump_log(TDS_DBG_INFO1, "Converting from %s to %s\n", from->charset.name, to->charset.name);
tdsdump_log(TDS_DBG_INFO1, "Conversion direction: %s\n", (io == to_server) ? "to_server" : "to_client");
tdsdump_log(TDS_DBG_INFO1, "Client charset: %s\n", conv->from.charset.name);
tdsdump_log(TDS_DBG_INFO1, "Server charset: %s\n", conv->to.charset.name);

Log Output:

13:53:37.009 218732 (iconv.c:643): Converting from UTF-8 to UTF-8
13:53:37.009 218732 (iconv.c:644): Conversion direction: to_server
13:53:37.009 218732 (iconv.c:645): Client charset: UTF-8
13:53:37.009 218732 (iconv.c:646): Server charset: UTF-8

I have no further way to debug freetds—the encoding behavior is strange. I tried recompiling freetds to force CP850/CP936/GB18030 encoding, but it still doesn’t work. I’m not very familiar with this and am just trying to make it function.

When I set the DSN ClientCharset=UTF-8, Freetds can use the UPDATE statement to modify field values, but the stored data is incorrect. Freetds (via SELECT) receives data that can be properly converted and displays correct results. However, PowerBuilder shows abnormal display results.

Due to non-standard programming practices at my company, I must insert Chinese characters into a Sybase database with CP850 charset. Please help.

  • Option 1: Disable freetds’ iconv and handle encoding conversion myself.
  • Option 2: Specify ServerCharset (though freetds DSN doesn’t seem to provide this option).
  • Any alternative solutions to resolve the encoding mismatch?

@freddy77
Contributor

Can you post some more logs? You can send them to me privately. It would also be helpful to enable and send ODBC traces (refer to your driver manager documentation; see for instance https://www.freetds.org/userguide/logging.html#Logging.odbc).

@ShiroiSkyy
Author

The logs have been sent to your e-mail.
@freddy77
