[C++] compute::LocalTimestamp() Performs incorrect conversion #45751
Just to add, experimenting with a different timezone library (link) gets the expected 2222982812 value:
#include <iostream>
#include <chrono>
#include <date/date.h>
#include <date/tz.h>
int main() {
    date::sys_seconds utc_time{std::chrono::seconds(2222997212)};
    date::zoned_time ny_time{"America/New_York", utc_time};
    std::cout << "Epoch seconds: " << ny_time.get_sys_time().time_since_epoch().count() << std::endl;
    std::cout << "UTC time: " << date::format("%F %T %Z", utc_time) << '\n';
    std::cout << "NY time: " << date::format("%F %T %Z", ny_time) << '\n';
    date::local_seconds naive_local = ny_time.get_local_time();
    std::cout << "NY time naive: " << naive_local.time_since_epoch().count() << "\n";
}
Output:
BTW, why do you want to get offset-ed seconds? FYI: The document of
Hi @kou thank you for your time and reply,
Apologies, I am confused; from the documentation I thought that was the exact purpose of the
At least the implication from the way it's written is that it is performing the following calculation (which is what I am looking for):
I should also note for ~99% of cases I've tested so far the
I am trying to write a small CLI tool that converts Parquet data to XPT format. XPT has no support for timezones, so how to store timestamp data correctly depends on the use case: some users prefer to store the data as timezone-naive, whilst others (myself included) prefer to store the UTC-relative timestamps. To this end I am just providing an option for the user to choose.
EDIT -- After spending more time than I care to admit reading about timestamps, I think this might be a bug in how the tzdata information is consumed. At least, the issue only seems to occur after 2038 and seems to be mostly with daylight saving time, which implies some problem with the timezone rules not being applied correctly. 2038 is a common failure point because it is where signed 32-bit integers overflow. I haven't looked at the underlying code here, but it seems suspicious that this error occurs at this specific year and that the Arrow-produced value is exactly 1 hour off the expected value.
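For reference, the 2038 boundary mentioned above can be checked with a few lines of Python. This is only a sketch of the signed 32-bit limit itself; it says nothing about what Arrow does internally, and the helper name `fits_int32` is made up for illustration. The sample timestamps are ones quoted elsewhere in this thread.

```python
import datetime

# Largest value a signed 32-bit integer can hold.
INT32_MAX = 2**31 - 1


def fits_int32(epoch_seconds: int) -> bool:
    """Hypothetical helper: True if the value fits in a signed 32-bit integer."""
    return -2**31 <= epoch_seconds <= INT32_MAX


# The last second representable in a signed 32-bit epoch counter.
limit = datetime.datetime.fromtimestamp(INT32_MAX, datetime.timezone.utc)
print(limit.isoformat())  # 2038-01-19T03:14:07+00:00

# Timestamps quoted in this thread, before and after the boundary.
for ts in (2095940701, 2127476701, 2159012701, 2190548701, 2222997212):
    print(ts, fits_int32(ts))
```

Only the first two values fit in 32 bits, which is consistent with the observation that the problem shows up from 2038 onwards, although that alone does not prove the cause.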
Ah, sorry. I misunderstood this. I thought that
Is this a duplicate of #36110?
Hmm, I'm not sure to be honest. On the surface it definitely appears to be related, but I'm not sure it's exactly the same. In that ticket the issue seems to be a mismatch in how Arrow and Python interpolate missing rules when going into the future. Here, however, I can clearly see that the rules in my local tzdata database extend up until 2499:
I also get the correct expected behaviour from R, Python and C++, which as far as I can tell are all using the system tzdata source as well, so they should be consistent.
import zoneinfo
import datetime
def printtime(time: int):
    ny_time = datetime.datetime.fromtimestamp(time, zoneinfo.ZoneInfo("America/New_York"))
    print(f"Time: {ny_time} ({ny_time.tzname()})")
print(zoneinfo.TZPATH) # ('/usr/share/zoneinfo', '/usr/lib/zoneinfo', '/usr/share/lib/zoneinfo', '/etc/zoneinfo')
printtime(2095940701) # Time: 2036-06-01 09:45:01-04:00 (EDT)
printtime(2127476701) # Time: 2037-06-01 09:45:01-04:00 (EDT)
printtime(2159012701) # Time: 2038-06-01 09:45:01-04:00 (EDT)
printtime(2190548701) # Time: 2039-06-01 09:45:01-04:00 (EDT)
as.POSIXct(2095940701, tz = "America/New_York") # "2036-06-01 09:45:01 EDT"
as.POSIXct(2127476701, tz = "America/New_York") # "2037-06-01 09:45:01 EDT"
as.POSIXct(2159012701, tz = "America/New_York") # "2038-06-01 09:45:01 EDT"
as.POSIXct(2190548701, tz = "America/New_York") # "2039-06-01 09:45:01 EDT"
#include <iostream>
#include <chrono>
#include <format>
void printme(long long x) {
    std::chrono::sys_seconds utc_time{std::chrono::seconds(x)};
    std::chrono::zoned_time ny_time{"America/New_York", utc_time};
    std::cout << "Local time: " << std::format("{:%F %T %Z}", ny_time) << '\n';
}
int main() {
    std::cout << "C++ Standard Version: " << __cplusplus << std::endl; // 2020
    printme(2095940701); // Local time: 2036-06-01 09:45:01 EDT
    printme(2127476701); // Local time: 2037-06-01 09:45:01 EDT
    printme(2159012701); // Local time: 2038-06-01 09:45:01 EDT
    printme(2190548701); // Local time: 2039-06-01 09:45:01 EDT
}
--- EDIT --- Apologies in case that wasn't clear: the issue with the Arrow implementation is that it is not applying the correct daylight saving adjustment after 2038. For example, a UTC value of 2159012701, which in "America/New_York" falls in EDT, is being adjusted as if it were in EST instead. The above examples show that R, C++ and Python all correctly recognise that the 2159012701 value should be EDT.
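To make the EDT versus EST discrepancy concrete, here is a minimal Python sketch of the arithmetic (not code from the thread, and not how Arrow or any timezone library implements it). It assumes the current US rule for "America/New_York" (DST from the second Sunday in March to the first Sunday in November) and hard-codes it instead of reading tzdata, so it only holds while that rule applies. The function names are hypothetical.

```python
import datetime

UTC = datetime.timezone.utc


def nth_sunday(year: int, month: int, n: int) -> datetime.date:
    """Return the n-th Sunday (1-based) of the given month."""
    first = datetime.date(year, month, 1)
    # weekday(): Monday == 0 ... Sunday == 6
    days_to_sunday = (6 - first.weekday()) % 7
    return first + datetime.timedelta(days=days_to_sunday + 7 * (n - 1))


def new_york_offset_seconds(epoch_seconds: int) -> int:
    """UTC offset in seconds, ASSUMING the post-2007 US DST rule (hard-coded)."""
    year = datetime.datetime.fromtimestamp(epoch_seconds, UTC).year
    start = nth_sunday(year, 3, 2)   # second Sunday in March
    end = nth_sunday(year, 11, 1)    # first Sunday in November
    # DST starts at 02:00 EST (07:00 UTC) and ends at 02:00 EDT (06:00 UTC).
    dst_start = datetime.datetime(year, 3, start.day, 7, tzinfo=UTC).timestamp()
    dst_end = datetime.datetime(year, 11, end.day, 6, tzinfo=UTC).timestamp()
    in_dst = dst_start <= epoch_seconds < dst_end
    return -4 * 3600 if in_dst else -5 * 3600


def local_timestamp(epoch_seconds: int) -> int:
    """Timezone-naive local epoch seconds: UTC value plus the UTC offset."""
    return epoch_seconds + new_york_offset_seconds(epoch_seconds)


for utc in (2222997212, 2159012701):
    with_edt = local_timestamp(utc)   # correct summer offset (UTC-4)
    with_est = utc - 5 * 3600         # what applying EST (UTC-5) would give
    print(utc, with_edt, with_est, with_edt - with_est)
```

For 2222997212 the EDT result is 2222982812, the expected value quoted earlier in the thread; applying the EST offset instead lands exactly 3600 seconds lower, which matches the one-hour error being reported.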
Describe the bug, including details regarding any error messages, version, and platform.
Apologies in advance if I've made a mistake here; I am relatively new to the Arrow C++ API and also to managing datetime stamps. That being said, I think there might be a bug with the compute::LocalTimestamp() function (at least it appears to be producing results I wouldn't have expected):
For example, take a timestamp (seconds) of
Assuming that the value was stored in a Timestamp array with a timezone of EDT, I would have expected, after running compute::LocalTimestamp(), a value to be produced of:
However, in practice when doing this I am observing an actual value of:
I tried searching but I couldn't see any other issues (open or closed) related to this.
I am running on Fedora 41 using libarrow-16.1.0-12.fc41.x86_64 (latest available from the fedora package manager)
--- EDIT --- Just tested against arrow-19.0.1 and am still getting the same behavior ---
Code I am running to reproduce this:
Component(s)
C++