How to handle time is one of those tricky issues where it is all too easy to get it wrong. So let’s dive in. (Note: We learned these lessons when implementing the scheduling system in Arrow.)
Why UTC Sometimes Fails
First off, storing everything in UTC (often loosely called Greenwich Mean Time) is frequently not the correct solution. Yet many programmers think that if they store everything that way, they have it covered. (This mistake is why, when Congress changed the start of DST in the U.S. in 2007, you had to run a hotfix on Outlook for it to adjust recurring events.)
So let’s start with the key question – what do we mean by time? When a user says they want something to run at 7:00 am, what do they mean? In most cases they mean 7:00 am where they are located – but not always. In some cases, to accurately compare, say, web server statistics, they want each “day” to end at the same time, unadjusted for DST. At the other end, someone who takes medicine at certain times of the day and has that set in their calendar will want it always to be on local time, so a 3:00 pm event does not become 3:00 am when they have travelled halfway around the world.
So we have three main use cases here (there are some others, but they can generally be handled by the following):
- The same absolute (for lack of a better word) time.
- The time in a given time zone, shifting when DST goes on/off (including double DST which occurs in some regions).
- The local time, the time where the person is presently located.
The first is trivial to handle – you set it as UTC. By doing this every day of the year will have 24 hours. (Interesting note, UTC only matches the time in Greenwich during standard time. When it is DST there, Greenwich and UTC are not identical.)
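A minimal Python sketch of the 24-hour point, assuming Python 3.9+ with the standard zoneinfo module and an installed tz database. The helper name day_length_hours is hypothetical, and the dates are the 2024 U.S. DST transitions. It measures the elapsed time from one local midnight to the next:

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo


def day_length_hours(year, month, day, tz):
    """Elapsed hours from local midnight on the given date to the next
    local midnight in tz (hypothetical helper, for illustration only)."""
    start = datetime(year, month, day, tzinfo=tz)
    end = start + timedelta(days=1)  # wall-clock arithmetic: next local midnight
    # Convert both ends to UTC before subtracting so real elapsed time is measured.
    elapsed = end.astimezone(timezone.utc) - start.astimezone(timezone.utc)
    return elapsed.total_seconds() / 3600


denver = ZoneInfo("America/Denver")
print(day_length_hours(2024, 3, 10, timezone.utc))  # 24.0
print(day_length_hours(2024, 3, 10, denver))        # 23.0 (DST starts)
print(day_length_hours(2024, 11, 3, denver))        # 25.0 (DST ends)
```

Only the UTC day is a fixed length, which is why it suits the "same absolute time" use case.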
The second requires storing a time and a time zone. However, the time zone is the geographical zone, not the present offset (the offset is the difference from UTC). In other words, you store “Mountain Time,” not “Mountain Standard Time” or “Mountain Daylight Time.” So 7:00 am in “Mountain Time” will be 7:00 am in Colorado regardless of the time of year.
Using a UTC offset is generally an incorrect solution. If you truly want the same time year round, you would use UTC. If you want the time in Colorado, using a UTC offset will be wrong half of the year. Even for the case where you say you want 7:00am in Colorado on Feb 1, and you know the UTC offset for that day – it can change. And it can change on very short notice. So don’t use UTC offset.
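To make the difference concrete, here is a small Python sketch (Python 3.9+ zoneinfo assumed) comparing a geographical zone with a fixed offset. The zone ID America/Denver stands in for “Mountain Time”, and the helper seven_am_utc is a hypothetical name used only for this illustration:

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

DENVER = ZoneInfo("America/Denver")        # geographical zone ("Mountain Time")
FIXED_MST = timezone(timedelta(hours=-7))  # fixed UTC-7 offset, no DST rules


def seven_am_utc(year, month, day, tz):
    """UTC instant of 7:00 am on the given date in tz (hypothetical helper)."""
    return datetime(year, month, day, 7, 0, tzinfo=tz).astimezone(timezone.utc)


feb_zone = seven_am_utc(2024, 2, 1, DENVER)
feb_fixed = seven_am_utc(2024, 2, 1, FIXED_MST)
jul_zone = seven_am_utc(2024, 7, 1, DENVER)
jul_fixed = seven_am_utc(2024, 7, 1, FIXED_MST)

print(feb_zone == feb_fixed)  # True: in winter the offset happens to match
print(jul_zone == jul_fixed)  # False: in summer Colorado is on UTC-6
# Shown on a Colorado wall clock, the fixed-offset event is an hour late:
print(jul_fixed.astimezone(DENVER).strftime("%H:%M"))  # 08:00
```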
The third is similar to the second in that it has a time zone, called “Local Time.” However, you need to know which time zone that actually is in order to determine when the event occurs. And this gets tricky. I’m on my laptop in Sweden hitting a server in Colorado. What’s local time? And more importantly, since local time is where my laptop is, we have two problems. First, did I change my laptop to Central European Time? Second, if I’m using a browser, how does the server know the time zone I’m in, given that the browser does not pass the local timezone to the web server?
Putting This to Use
Ok, so how do you handle this? It’s actually pretty simple. Every time needs to be stored one of two ways:
- As UTC. Generally when stored as UTC, you will still set/display it in local time.
- As a datetime plus a geographical timezone (which can be “local time”).
Now the trick is knowing which to use. Here are some general rules. You will need to figure this out for additional use cases, but most do fall into these categories.
- When something happened – UTC. This is a singular event or series of events and when it occurred is unchangeable. You may have a timezone used to display the time of the event, but store the event itself using UTC.
- When the user selects a timezone of UTC, then obviously you use UTC. (Always provide UTC as a timezone when the user is prompted for a timezone as some people legitimately find this best for them.)
- An event in the future – datetime plus a timezone. Now it might be safe to use UTC if it will occur in the next several months (timezone rule changes generally come with that much warning, although sometimes it’s just 8 days), but at some point out you need to do this, so you should do it for all cases. In this case you display what you stored.
- For a scheduled event, when it will next happen – UTC. This is a performance requirement where you want to be able to get all “next events” whose run time is before now. It is much faster to search against stored dates than to recalculate each one. However, you do need to recalculate all scheduled events regularly in case the rules have changed, for example for an event that runs every quarter.
- For events that are on “local time” the recalculation should occur anytime the user’s timezone changes. And if an event is skipped in the change, it needs to occur immediately.
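The storage rules above can be sketched in Python as follows. This is an illustration under stated assumptions, not the scheduler the article describes: DailyEvent, compute_next_run_utc, refresh and due_events are hypothetical names, the event is assumed to repeat daily, and Python 3.9+ zoneinfo is assumed. The wall-clock time and zone ID are the source of truth; the UTC value is only a cache for fast "what is due?" queries.

```python
from dataclasses import dataclass
from datetime import datetime, time, timedelta, timezone
from typing import Optional
from zoneinfo import ZoneInfo


@dataclass
class DailyEvent:                              # hypothetical record layout
    name: str
    local_time: time                           # what the user asked for, e.g. 07:00
    zone_id: str                               # geographical zone, e.g. "America/Denver"
    next_run_utc: Optional[datetime] = None    # derived cache, safe to recompute


def compute_next_run_utc(event, now_utc):
    """Next occurrence of the event's wall-clock time strictly after now_utc."""
    zone = ZoneInfo(event.zone_id)
    today = now_utc.astimezone(zone).date()
    for offset_days in (0, 1):
        local = datetime.combine(today + timedelta(days=offset_days),
                                 event.local_time, tzinfo=zone)
        candidate = local.astimezone(timezone.utc)
        if candidate > now_utc:
            return candidate
    return candidate  # fallback; not expected for ordinary daily wall times


def refresh(events, now_utc):
    """Recompute the cache, e.g. after a tz database update or a zone change."""
    for event in events:
        event.next_run_utc = compute_next_run_utc(event, now_utc)


def due_events(events, now_utc):
    """Cheap comparison against the cached UTC values."""
    return [e for e in events
            if e.next_run_utc is not None and e.next_run_utc <= now_utc]
```

For a “local time” event, a change in the user's timezone would amount to updating zone_id and calling refresh again. Running an occurrence that was skipped by that change is not handled in this sketch.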
With .NET 3.5 Microsoft added the TimeZoneInfo class. This provides full information for all timezones and works off the timezone settings in Windows, which are updated regularly. Use TimeZoneInfo.Id to save/find the timezone.
This is presently not great in Java. The SimpleTimeZone class is a mess that, among other things, returns 616 timezones. I’ve also found it a royal PITA to work with, generally not giving me what I really want. Joda-Time is an open source library that is a really good implementation. Both of these use the tz database, which you need to update regularly (and yes, they should pull this info from the O/S, but they don’t).
Coming with Java 8 is a new datetime API. It’s cleaned up, done right, and years are no longer counted from 1900 (yes, the old java.util.Date constructors and getYear() work in years since 1900). For anything new, if you are using Java 8, switch to this new set of classes.
The one thing we have not figured out is how to know a user’s location if they are using a browser to hit our web application. For most countries the locale can be used to determine the timezone – but not for the U.S. (6 zones), Canada, or Russia (11 zones). So you have to ask a user to set their timezone – and to change it when they travel.
You can get the time zone offset (in minutes, positive when the browser is west of UTC), which helps, but not the timezone itself, with the following:
<input id="timezone_offset" type="hidden" name="timezone_offset" value="">
<script>
document.getElementById('timezone_offset').value = new Date().getTimezoneOffset();
</script>
Geolocation based on IP address is also iffy. I was at a hotel in D.C. when I got a report of our demo download form having a problem. We pre-populate the form with city, state, & country based on the geo of the IP address. It said I was in Cleveland, OH. So again, usually right but not always. My take is we can use the offset, and for cases where there are multiple timezones with that offset (on that given day), follow up with the geo of the IP address. But I sure wish the powers that be would add a tz= to the header info sent with an HTTP request.
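The server-side half of that idea, narrowing the candidates by the offset on a given day, might look like the Python sketch below. The candidate list and the function name zones_matching_offset are assumptions made for illustration; a real application would choose its own list of supported zones. The only browser fact relied on is that getTimezoneOffset() reports UTC minus local time in minutes, so the sign has to be flipped.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

# Hypothetical shortlist of zones the application supports.
CANDIDATE_ZONES = [
    "America/New_York",
    "America/Chicago",
    "America/Denver",
    "America/Phoenix",
    "America/Los_Angeles",
]


def zones_matching_offset(js_offset_minutes, at_utc, zone_ids=CANDIDATE_ZONES):
    """Zones whose UTC offset at the instant at_utc matches the value the
    browser reported via getTimezoneOffset()."""
    target = timedelta(minutes=-js_offset_minutes)  # JS sign is inverted
    return [zone_id for zone_id in zone_ids
            if at_utc.astimezone(ZoneInfo(zone_id)).utcoffset() == target]


winter = datetime(2024, 1, 15, 12, 0, tzinfo=timezone.utc)
summer = datetime(2024, 7, 15, 12, 0, tzinfo=timezone.utc)
print(zones_matching_offset(420, winter))  # ['America/Denver', 'America/Phoenix']
print(zones_matching_offset(360, summer))  # ['America/Denver']
```

When more than one zone comes back, as in the winter case, that is the point where the IP geolocation (or simply asking the user) would break the tie.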
A Few Final Notes
Ok, so you’ve implemented your code using a time and timezone. You’ve tested and you’re handling it all right. Good to go. Except…
What happens if an event is set for 1:30am? Do you run it twice on the day DST ends? And if an event is set for 2:30am, does it just not happen on the day DST starts? This can be critical for many uses (like submitting the day’s VISA charges) and a problem for almost all uses. And telling users not to schedule anything between 1:00am and 3:00am is not user friendly.
- For the day DST starts, where there is no 2:00am – 3:00am, there’s an easy solution that you should implement even if this problem did not exist. Don’t run only the events that are set for the present time; always run every event that is set for the present time or earlier. This also handles the case of the program not running when an event would have occurred: it runs all past-due events when the program starts.
- For the day DST ends, it’s also a pretty simple solution. You need to record whether you ran an event, but you need that anyway for the case where the program was not running (or DST started), so that you can tell which events are past due and not yet run from which ones recur in the future but have already had today’s run. The one thing you need to add is handling for an event that is due now but already ran an hour ago.
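Both rules can be sketched together in Python. This is a simplified illustration, assuming daily jobs and Python 3.9+ zoneinfo; DailyJob, is_due and mark_ran are hypothetical names, and recording only the local date of the last run is one possible way to remember that today's run has happened.

```python
from dataclasses import dataclass
from datetime import date, datetime, time, timezone
from typing import Optional
from zoneinfo import ZoneInfo


@dataclass
class DailyJob:                              # hypothetical record layout
    wall_time: time                          # scheduled wall-clock time
    last_run_date: Optional[date] = None     # local date of the latest run


def is_due(job, now_utc, zone):
    local_now = now_utc.astimezone(zone)
    if job.last_run_date == local_now.date():
        # Already ran today, even if the clock is showing this time again.
        return False
    # "Present time or earlier": a 2:30 job becomes due as soon as the
    # wall clock jumps from 1:59 to 3:00.
    return job.wall_time <= local_now.time()


def mark_ran(job, now_utc, zone):
    job.last_run_date = now_utc.astimezone(zone).date()


denver = ZoneInfo("America/Denver")

# DST starts 2024-03-10: 2:30 am never appears on the wall clock.
spring_job = DailyJob(time(2, 30))
print(is_due(spring_job, datetime(2024, 3, 10, 8, 59, tzinfo=timezone.utc), denver))  # False (1:59 MST)
print(is_due(spring_job, datetime(2024, 3, 10, 9, 0, tzinfo=timezone.utc), denver))   # True  (3:00 MDT)

# DST ends 2024-11-03: 1:30 am appears twice.
fall_job = DailyJob(time(1, 30))
first_pass = datetime(2024, 11, 3, 7, 30, tzinfo=timezone.utc)   # 1:30 MDT
second_pass = datetime(2024, 11, 3, 8, 30, tzinfo=timezone.utc)  # 1:30 MST
print(is_due(fall_job, first_pass, denver))   # True
mark_ran(fall_job, first_pass, denver)
print(is_due(fall_job, second_pass, denver))  # False
```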
For your internal calculations, do everything in UTC. For anything stored as a local timezone, immediately convert it to UTC when you read it in. If you try to perform calculations where some datetimes are in UTC, others are in the server’s timezone, and the remaining are in the browser’s timezone – you will have bugs. Really nasty ones.
Author: David Thielen