-
8000
-
Notifications
You must be signed in to change notification settings - Fork 172
Remove WinRT deps from cctz::local_time_zone
#316
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This reworks our previous commit [1], which aimed to address google#53 by implementing cctz::local_time_zone with WinRT API on Windows. The issue is there is no guarantee that RoInitialize can be safely called when cctz::local_time_zone gets called [2]. In theory cctz can have its own thread to safely call RoInitialize and WinRT APIs, but such an approach would make the code much more complex and fragile. This commit takes advantage of newly exposed ICU APIs that do not require COM/WinRT initialization on Windows 10 1903 and later [3]. To maintain binary compatibility with earlier versions of Windows, the DLL and function pointer are still loaded dynamically. [1]: 1bb996c [2]: google/mozc#856 [3]: https://devblogs.microsoft.com/oldnewthing/20210527-00/?p=105255
reinterpret_cast<const UChar*>(info.TimeZoneKeyName), -1, nullptr, | ||
buffer, ARRAYSIZE(buffer), &status); | ||
if (status != U_ZERO_ERROR || num_chars_in_buffer <= 0 || | ||
num_chars_in_buffer > ARRAYSIZE(buffer)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://unicode-org.github.io/icu-docs/apidoc/dev/icu4c/ucal_8h.html#ac345a8561828554c092663e541b1c7bd says the return value does not include the terminating NUL, so I think this check is actually off by 2. But it would also be an error if ucal_getTimeZoneIDForWindowsIDFunc ignored the ARRAYSIZE parameter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I locally tested several buffer size for the same local time ("Asia/Tokyo"
) and here is what I observed in my Windows 11 24H2 environment.
ARRAYSIZE(buffer) |
num_chars_in_buffer |
status |
content of buffer |
---|---|---|---|
12 | 10 | U_ZERO_ERROR |
fully written with NUL termination |
11 | 10 | U_ZERO_ERROR |
fully written with NUL termination |
10 | 10 | U_STRING_NOT_TERMINATED_WARNING |
fully written without NUL termination |
9 | 10 | U_BUFFER_OVERFLOW_ERROR |
nothing is written |
Note that the subsequent code expect num_chars_in_buffer
to not include NUL, so I think num_chars_in_buffer > ARRAYSIZE(buffer)
makes sense in this case. status != U_ZERO_ERROR
check effectively covers such cases already though.
Do you have any preference on this condition? Realistically if (status != U_ZERO_ERROR)
seems to be sufficient. Happy to update the patch if you have any preference.
static_cast<int>(num_chars_in_buffer), nullptr, 0, nullptr, nullptr); | ||
std::string local_time_str; | ||
local_time_str.resize(static_cast<size_t>(num_bytes_in_utf8)); | ||
::WideCharToMultiByte(CP_UTF8, 0, reinterpret_cast<const wchar_t*>(buffer), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://learn.microsoft.com/en-us/windows/win32/api/stringapiset/nf-stringapiset-widechartomultibyte
Maybe this is somewhat pedantic because it will still work, but it seems like this writes the NUL character to [local_time_str.size()-1]. Which I would think would mean you should erase the last char to make local_time_str.size() consistent with the real size.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this particular case, num_chars_in_buffer
is the UTF-16 char count without NUL so it falls into the following rule.
https://learn.microsoft.com/en-us/windows/win32/api/stringapiset/nf-stringapiset-widechartomultibyte
If this parameter is set to a positive integer, the function processes exactly the specified number of characters. If the provided size does not include a terminating null character, the resulting character string is not null-terminated, and the returned length does not include this character.
So I think this is working as expected, right?
if (!windows_get_string_raw_buffer) { | ||
return result; | ||
// True if we have already failed to load the API. | ||
static std::atomic_bool g_ucal_getTimeZoneIDForWindowsIDUnavailable; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to factor the initialization code into some function called once to avoid these atomics?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to factor the initialization code into some function called once to avoid these atomics?
I guess here is what you were wondering about.
std::string win32_local_time_zone() {
// To avoid frequent DLL loadings and unloading, we obtain the API only once
// then allow "icu.dll" to be "leaked". Given that it is a system library,
// keeping it on memory is supposed to have no major drawback.
const static auto ucal_getTimeZoneIDForWindowsIDFunc =
[]() -> decltype(&::ucal_getTimeZoneIDForWindowsID) {
const HMODULE icudll =
::LoadLibraryExW(L"icu.dll", nullptr, LOAD_LIBRARY_SEARCH_SYSTEM32);
if (icudll == nullptr) {
return nullptr;
}
return reinterpret_cast<decltype(&::ucal_getTimeZoneIDForWindowsID)>(
::GetProcAddress(icudll, "ucal_getTimeZoneIDForWindowsID"));
}();
DYNAMIC_TIME_ZONE_INFORMATION info = {};
if (::GetDynamicTimeZoneInformation(&info) == TIME_ZONE_ID_INVALID) {
return "";
}
...
One concern in this approach is that thread-safe local static initialization does internally use a lock, which may cause dead lock when combined with Win32 Loader lock.
Suppose there are two running thread X and thread Y.
- Thread X enters
DllMain
, where the operating system implicitly acquires Loader lock. - Thread Y enters
win32_local_time_zone()
then acquires the lock to initializeucal_getTimeZoneIDForWindowsIDFunc
variable. It then get blocked atLoadLibraryExW
because the thread X still holds the Loader Lock. - Thread X also enters
win32_local_time_zone()
then gets blocked becauseucal_getTimeZoneIDForWindowsIDFunc
is not yet fully initialized.
The current approach with std::atomic_*
doesn't have the above dead lock issue, as it is just an optimistic optimization that does not try to enforce the initialization happens exactly once.
Let me know if you still prefer the thread-safe local static initialization approach. I'll update the patch accordingly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You've addressed all of my concerns. Thanks.
When I try to import this into Abseil, I get
|
Hmm, that's strange. In my local environment, the following build command actually succeeds with the equivalent change.
Does the following condition work? #if defined(_WIN32)
// Include only when <icu.h> is available.
// https://learn.microsoft.com/en-us/windows/win32/intl/international-components-for-unicode--icu-
// https://devblogs.microsoft.com/oldnewthing/20210527-00/?p=105255
#if defined(__has_include)
#if __has_include(<icu.h>)
#include <sdkddkver.h>
// ucal_getTimeZoneIDForWindowsID is available on Windows 10 RS3 and later.
#if defined(NTDDI_WIN10_RS3) && NTDDI_VERSION >= NTDDI_WIN10_RS3
#define USE_WIN32_LOCAL_TIME_ZONE
#include <windows.h>
#include <icu.h>
#include <timezoneapi.h>
#include <atomic>
#endif // NTDDI_VERSION >= NTDDI_WIN10_RS3
#endif // __has_include(<icu.h>)
#endif // __has_include
#endif // _WIN32 |
Another thing that may work is to define #if defined(_WIN32)
// Include only when <icu.h> is available.
// https://learn.microsoft.com/en-us/windows/win32/intl/international-components-for-unicode--icu-
// https://devblogs.microsoft.com/oldnewthing/20210527-00/?p=105255
#if defined(__has_include)
#if __has_include(<icu.h>)
#define USE_WIN32_LOCAL_TIME_ZONE
#pragma push_macro("_WIN32_WINNT")
#pragma push_macro("NTDDI_VERSION")
// ucal_getTimeZoneIDForWindowsID is available on Windows 10 RS3 and later.
#define _WIN32_WINNT 0x0A00 // == _WIN32_WINNT_WIN10
#define NTDDI_VERSION 0x0A000004 // == NTDDI_WIN10_RS3
#include <windows.h>
#include <icu.h>
#include <timezoneapi.h>
#pragma pop_macro("NTDDI_VERSION")
#pragma pop_macro("_WIN32_WINNT")
#include <atomic>
#endif // __has_include(<icu.h>)
#endif // __has_include
#endif // _WIN32 |
This commit attempts to fix the build failure reported in google#316 (comment), which was a consequence of my previous commit [1]. With this commit relevant Win32 macros are overridden only while including <icu.h>. This commit also overrides _WIN32_WINNT to be consistent with NTDDI_VERSION. [1]: 2ebbe0d
@derekmauro Can you check if the following change works for abseil? |
It appears to work. Thanks! |
This commit attempts to fix the build failure reported in #316 (comment), which was a consequence of my previous commit [1]. With this commit relevant Win32 macros are overridden only while including <icu.h>. This commit also overrides _WIN32_WINNT to be consistent with NTDDI_VERSION. [1]: 2ebbe0d
This reworks our previous commit (1bb996c), which aimed to address #53 by implementing
cctz::local_time_zone
with WinRT API on Windows.The issue is there is no guarantee that
RoInitialize
can be safely called whencctz::local_time_zone
gets called. In theory cctz can have its own thread to safely callRoInitialize
and WinRT APIs, but such an approach would make the code much more complex and fragile.This commit takes advantage of newly exposed ICU APIs that do not require COM/WinRT initialization on Windows 10 1903 and later.
To maintain binary compatibility with earlier versions of Windows, the DLL and function pointer are still loaded dynamically.
Closes #315.