Date Management in Java

Considering how crucial date representation and computation are to so many of the commercial transactions that computers manage, you'd expect computers to be better at handling dates. Contrary to reasonable expectation, though, date representation, parsing, and storage have been a thorn in the side of programmers since the dawn of computing. In the early days, when space was a very scarce resource, programmers stored dates as six-digit numeric strings: two digits for the year, two for the month, and two for the day. The values were stored as strings so that string sorting and comparison routines would work correctly on them, although date arithmetic was difficult. Worse was the decision to store (on disk!) dates using two-digit years. Sorting and comparison worked just fine as long as the dates themselves fell on or before 12/31/99; a year after that wrapped around to 00 and was interpreted as 1900. I worked as a developer at a life insurance company in the mid-90s; the Y2K bug was already causing problems there, since they carried life insurance policies (on people still living) that dated back to 1899.

If you want to do any sort of arithmetic on dates, such as adding them or comparing them, you need a canonical form to represent them in. Americans and Europeans can't even agree on whether the month should come before the day, and most computer-friendly date formats put the year first so that pure-text sorting will work correctly. To avoid a repeat of the Y2K problem, modern computer systems get away from the concept (and the problems associated with the concept) of days, months and years and instead assign each point in time its own unique integer code. Such a code has three varying aspects: the granularity, the range, and the starting point. The granularity refers to which points in time get their own codes. Earlier coding-based systems had second-based granularity: every second in time that they were capable of representing was assigned its own unique code. The granularity, of course, directly impacts the range. An n-bit integer can take on 2^n unique values without loss of precision, so the range that can be represented in this scheme is 2^n divided by the granularity (the number of codes per second).

C programs used a 32-bit integer and assigned each value to a unique second (not millisecond). This meant that there were 2^32 unique seconds that could be represented. This works out to:

2^32 = 4,294,967,296 seconds
2^32 / 60 = 71,582,788 minutes
2^32 / 60 / 60 = 1,193,046 hours
2^32 / 60 / 60 / 24 = 49,710 days
2^32 / 60 / 60 / 24 / 365 = 136 years

This is sort of a short range. To be usefully portable, there must be some agreed-upon "starting point"; in other words, what actual time does the value 0 represent? Obviously, the value 1 would represent the second after that, the value 2 the second after that, and so on, since time moves forward linearly (at least until December 21st of this year, and then all bets are off, at least according to the Mayans...). Arbitrarily, early C library implementers chose the date January 1, 1970. A back-of-the-envelope calculation would suggest that this covers dates in the range from January 1, 1970 to March 12, 2106 (there are 70 days left over when the counter hits 136 years). However, this fails to take into account leap years. There are 33 leap days in that 136-year range (2100 is not a leap year), so an unsigned 32-bit date can only climb up to Feb 7, 2106.
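The range arithmetic above can be reproduced with a few lines of Java. This is only a sketch: the class name TickRange and the method years are made up for illustration, and it uses the same 365-day year as the back-of-the-envelope figures in the text.

```java
import java.math.BigInteger;

class TickRange {
    // 365-day years, matching the rough arithmetic in the text (leap years ignored).
    private static final long SECONDS_PER_YEAR = 60L * 60 * 24 * 365;

    // Whole years covered by a counter with the given bit width and codes per second.
    static long years(int bits, long ticksPerSecond) {
        BigInteger ticks = BigInteger.ONE.shiftLeft(bits); // 2^bits distinct codes
        BigInteger ticksPerYear = BigInteger.valueOf(ticksPerSecond)
                .multiply(BigInteger.valueOf(SECONDS_PER_YEAR));
        return ticks.divide(ticksPerYear).longValue();
    }

    public static void main(String[] args) {
        System.out.println(years(32, 1)); // 136
    }
}
```

BigInteger is used so that the same method keeps working for counters wider than a long can hold in the intermediate step.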

There's a problem with this, though: what about all of the dates before Jan 1, 1970? There's quite a bit of historical information of interest that occurred before 1970. To make maximal use of their 136-year range, C programmers instead consider the 32-bit counter to be a signed integer; the value 0 is still Jan 1, 1970, but negative numbers represent dates prior to that. This means that, following the canonical standard, C dates can represent any date in the range Dec 13, 1901 to Jan 19, 2038 (in GMT). (Meaning that those life insurance policies from 1899 still caused problems).
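To see where the 32-bit limits actually land, here is a small sketch that feeds the extreme counter values to Java's own date formatter. The class and method names are hypothetical, and the output is pinned to UTC so that it doesn't depend on the machine's time zone.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

class EpochLimits32 {
    // Formats a count of seconds since Jan 1, 1970 as a UTC timestamp.
    static String formatUtc(long epochSeconds) {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss", Locale.US);
        fmt.setTimeZone(TimeZone.getTimeZone("UTC"));
        return fmt.format(new Date(epochSeconds * 1000L)); // Date wants milliseconds
    }

    public static void main(String[] args) {
        System.out.println(formatUtc(Integer.MIN_VALUE)); // 1901-12-13 20:45:52
        System.out.println(formatUtc(Integer.MAX_VALUE)); // 2038-01-19 03:14:07
        System.out.println(formatUtc(0xFFFFFFFFL));       // 2106-02-07 06:28:15 (unsigned limit)
    }
}
```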

Incidentally, this is why I don't worry too much about retirement. The Y2038 problem that will occur when the old 32-bit counters reach their end of life will be even bigger than the Y2K problem, and nastier to fix. I'll turn 65 in the year 2039, and since I know how to fix Y2038 problems, I plan on billing $300/hour and working 60 hours a week for the three or four years leading up to retirement. Assuming, that is, that the computer that keeps track of my bank account doesn't crash due to Y2038 problems...

Java implementers foresaw the Y2038 problem and gave dates in Java a range of 2^64 values and a granularity of one millisecond (one one-thousandth of a second). That means that Java can represent 2^64 unique milliseconds. That works out to an astronomical-seeming 18,446,744,073,709,551,616 values. This represents:

2^64 / 1000 = 18,446,744,073,709,551.616 seconds
2^64 / 1000 / 60 = 307,445,734,561,825 minutes
2^64 / 1000 / 60 / 60 = 5,124,095,576,030 hours
2^64 / 1000 / 60 / 60 / 24 = 213,503,982,334 days
2^64 / 1000 / 60 / 60 / 24 / 365 = 584,942,417 years

This 584 million year range centers on Jan. 1, 1970, for optimal backwards compatibility. That means that, in Java, dates can range from Dec 2, 292,269,055 B.C. to Aug. 17, 292,278,994 A.D. That reaches back only about 1/50th of the age of the universe and 1/15th of the age of the earth, so it's not perfect, but it's probably good enough for banking transactions. The Java calendar will roll over four-and-a-half billion years before the predicted end of time, so future programmers will have a Y292278994 problem to contend with, but I plan to be comfortably retired well before that time.
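You can ask the JDK itself where the two ends of the 64-bit range fall. The sketch below is one way to do it, assuming a GregorianCalendar pinned to UTC; the class and method names are invented for the example.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.TimeZone;

class JavaDateLimits {
    // Builds a UTC calendar positioned at the given millisecond code.
    static Calendar utcCalendar(long millis) {
        Calendar c = new GregorianCalendar(TimeZone.getTimeZone("UTC"));
        c.setTimeInMillis(millis);
        return c;
    }

    // Renders month/day/year plus the era, since the year field alone is ambiguous.
    static String describe(long millis) {
        Calendar c = utcCalendar(millis);
        String era = c.get(Calendar.ERA) == GregorianCalendar.BC ? "BC" : "AD";
        return (c.get(Calendar.MONTH) + 1) + "/" + c.get(Calendar.DAY_OF_MONTH)
                + "/" + c.get(Calendar.YEAR) + " " + era;
    }

    public static void main(String[] args) {
        System.out.println(describe(Long.MIN_VALUE)); // earliest representable instant
        System.out.println(describe(Long.MAX_VALUE)); // latest representable instant
    }
}
```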

So, if the code 0 represents January 1, 1970, you'd expect this snippet:

Date d = new Date( 0L );  // initialize a date from a long integer code
System.out.println( d );
To output "Thu Jan 1 1970", right? Instead, on my computer, it outputs:
Wed Dec 31 18:00:00 CST 1969

Wait, what? Isn't 0 supposed to be January 1, 1970? Well, it is... but in which time zone? The JVM will go ahead and translate dates into your local time zone for you — since I'm 6 hours behind GMT, I see 6 PM on Dec. 31, 1969. This suggests that "zero time" in Java is midnight, January 1, 1970, in GMT (Greenwich Mean Time).
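To confirm that the surprise comes from the display time zone and not from the code 0 itself, you can format the same Date in two zones. In this sketch the zone ID "America/Chicago" is my assumption for a machine that reports CST, and the class and method names are made up.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

class EpochZero {
    // Formats the same millisecond code as seen from a particular time zone.
    static String format(long millis, String zoneId) {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss", Locale.US);
        fmt.setTimeZone(TimeZone.getTimeZone(zoneId));
        return fmt.format(new Date(millis));
    }

    public static void main(String[] args) {
        System.out.println(format(0L, "UTC"));             // 1970-01-01 00:00:00
        System.out.println(format(0L, "America/Chicago")); // 1969-12-31 18:00:00
    }
}
```

The Date object is identical in both calls; only the formatter's zone changes.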

GMT is the time zone that longitude 0 falls in. Navigators divide the Earth into 360 degrees of longitude (vertical slices) and 180 degrees of latitude (horizontal slices) so that points on the Earth's surface can be described in a standard way. Since the Earth has a stable axis, it made perfect sense to standardize 0 latitude as the equator. However, there's no logical standard for 0 longitude. For some time, 0 longitude varied from one country to the next (usually being the longitude that the capital of the reporter's home country fell in). In 1884, the International Meridian Conference established Greenwich, in London, as the location of the standard 0 longitude.

So far, so good. Assigning a code to each millisecond in a 584-million-year range makes date arithmetic and sorting simple, since computers are pretty good at numeric operations. However, computers still need to interface with people, who don't typically work well with 64-bit integers. (Do you know your 64-bit integer birthdate? Mine is 171781200000.) So any date system has to be able to convert from the year-month-day, hour-minute-second format that humans are used to into the canonical 64-bit long integer format. This turns out to be quite a bit more difficult than it sounds. It's easy enough to convert seconds to canonical time: just multiply by 1000. Minutes? Multiply by 60,000. Hours? Multiply by 3,600,000. Days? Multiply by 86,400,000. Months? Hmmm... well, that depends on which month. (I spent some time looking at this last January if you're curious). Since this conversion is complex, and must take into account a lot of little details such as leap years, Java includes a conversion from year/month/day to canonical format in the java.util.Date class. You can instantiate a Date with a three-integer constructor which will convert to the canonical 64-bit standard representation. It does so, that is, under the assumption that what you want is a Western-style date with the year given relative to 1900 and a zero-based month, in the time zone of the currently running system. To support proper internationalization, Sun/Oracle would prefer that you instead use the java.util.Calendar class to specify the time zone and locale, and create dates from that (the three-integer Date constructor is deprecated). So, instead of saying:

Date d = new Date( 111, 3, 24 ); // 3 = April; 0-based months.

Listing 1: Simple Date creation

For 4/24/2011 (the date of my first blog entry, if you're curious), you should instead say:

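Calendar c = Calendar.getInstance();
c.clear();  // getInstance() starts at the current date and time; reset so the result is midnight
c.set( Calendar.YEAR, 2011 );
c.set( Calendar.MONTH, Calendar.APRIL );  // 3; months are 0-based
c.set( Calendar.DAY_OF_MONTH, 24 );
Date d = c.getTime();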

Listing 2: "Correct" date creation

Although this is more i18n friendly (give getInstance a Locale and it will take into account locale-specific settings), this does involve the instantiation of two objects rather than just one. Each call to Calendar.getInstance returns a unique object; it's not a singleton class. A few quick experiments indicate that listing 2, while easier to internationalize, is about three times slower than listing 1.
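Calendar.getInstance() starts out holding the current date and time in the default zone, so a sketch that wants a reproducible millisecond code has to pin the zone and reset the fields first. The helper below is hypothetical (the name midnightUtc is mine), and it passes Locale.US only to make sure a Gregorian calendar comes back.

```java
import java.util.Calendar;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;
import java.util.concurrent.TimeUnit;

class CalendarDates {
    // Midnight UTC on the given day; month is zero-based, as Calendar expects.
    static Date midnightUtc(int year, int month, int day) {
        Calendar c = Calendar.getInstance(TimeZone.getTimeZone("UTC"), Locale.US);
        c.clear();               // drop the current time-of-day that getInstance() fills in
        c.set(year, month, day);
        return c.getTime();
    }

    public static void main(String[] args) {
        long millis = midnightUtc(2011, Calendar.APRIL, 24).getTime();
        System.out.println(millis);                              // 1303603200000
        System.out.println(millis / TimeUnit.DAYS.toMillis(1));  // 15088 days since the epoch
    }
}
```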

So, given the three principal components (year, month and day) of a "human-friendly" date, it's a simple matter to convert, although you see quite a few date input boxes in interactive applications that look like Figure 1:

Date: [    ] / [    ] / [    ]

Figure 1: Not very user-friendly date input form

most users would prefer to type a date in as an atomic unit — and you'll undoubtedly end up dealing with automated forms that consider dates to be one single element, with each subelement separated by a text separator such as a slash or a dash. This means that, to deal with input dates, you need some means of converting dates to and from strings. To make this even trickier, users in different locales prefer different date formats. Europeans are used to day-month-year ordering, but Americans expect month-day-year. Java provides a utility class to deal with exactly this in java.text.DateFormat. This class, however, is pretty picky. If you instantiate one via:
DateFormat fmt = DateFormat.getInstance();
and try to parse a date with it, you'll probably end up with a runtime exception. All of these input dates fail:
fmt.parse( "4/24/2011" );
fmt.parse( "4/24/11" );
fmt.parse( "2011-04-24" );
fmt.parse( "2011-4-24" );
fmt.parse( "Apr 24, 2011" );
fmt.parse( "April 24, 2011" );
As it turns out, the default instance expects (and requires) both a date and a time, in a very specific format. This parses correctly:
fmt.parse( "4/24/11 12:00 AM" );
But these don't:
fmt.parse( "4/24/11 12:00:00 AM" );
fmt.parse( "4/24/11 12:00AM" );
fmt.parse( "4/24/11 12:00" );
fmt.parse( "4-24-11 12:00 AM" );
fmt.parse( "12:00 4/24/11" );
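If you want to repeat these experiments, a small probe that catches the ParseException makes it easier to try many inputs in one run. This is only a sketch with made-up names, and I've left out expected results on purpose: what the default instance accepts depends on the JDK version and the default locale.

```java
import java.text.DateFormat;
import java.text.ParseException;

class DefaultFormatProbe {
    // Returns true if the formatter accepts the text, false if it throws ParseException.
    static boolean accepts(DateFormat fmt, String text) {
        try {
            fmt.parse(text);
            return true;
        } catch (ParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        DateFormat fmt = DateFormat.getInstance();
        String[] candidates = {
            "4/24/2011", "2011-04-24", "April 24, 2011",
            "4/24/11 12:00 AM", "4/24/11 12:00:00 AM", "12:00 4/24/11"
        };
        for (String candidate : candidates) {
            System.out.println(candidate + " -> " + accepts(fmt, candidate));
        }
    }
}
```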
In short — DateFormat.getInstance() returns a parser that's not particularly useful. (DateFormat is really designed to be used the other way — given a date in canonical milliseconds-since-the-epoch format, present it in a human-readable way, according to the current user's locale). There are some other instantiator methods that do permit just dates to be supplied, but they're still picky — for example, there's no way to get DateFormat to parse (or display) a date in the format YYYY-MM-DD. If you're curious, here's a table showing the formats of the various locales that JDK 1.6 supports out-of-the-box:
Locale | Short | Medium | Long | Full
Japanese (Japan) | 11/04/24 | 2011/04/24 | 2011/04/24 | 2011年4月24日
Spanish (Peru) | 24/04/11 | 24/04/2011 | 24 de abril de 2011 | domingo 24 de abril de 2011
English | 4/24/11 | Apr 24, 2011 | April 24, 2011 | Sunday, April 24, 2011
Japanese (Japan,JP) | H23.04.24 | H23.04.24 | H23.04.24 | 平成23年4月24日
Spanish (Panama) | 04/24/11 | 04/24/2011 | 24 de abril de 2011 | domingo 24 de abril de 2011
Serbian (Bosnia and Herzegovina) | 11-04-24 | 2011-04-24 | 24. април 2011. | недеља, 24. април 2011.
Macedonian | 24.4.11 | 24.4.2011 | 24, април 2011 | недела, 24, април 2011
Spanish (Guatemala) | 24/04/11 | 24/04/2011 | 24 de abril de 2011 | domingo 24 de abril de 2011
Arabic (United Arab Emirates) | 24/04/11 | 24/04/2011 | 24 أبريل, 2011 | 24 أبريل, 2011
Norwegian (Norway) | 24.04.11 | 24.apr.2011 | 24. april 2011 | 24. april 2011
Albanian (Albania) | 11-04-24 | 2011-04-24 | 2011-04-24 | 2011-04-24
Bulgarian | 11-4-24 | 2011-4-24 | Неделя, 2011, Април 24 | Неделя, 2011, Април 24
Arabic (Iraq) | 24/04/11 | 24/04/2011 | 24 أبريل, 2011 | 24 أبريل, 2011
Arabic (Yemen) | 24/04/11 | 24/04/2011 | 24 أبريل, 2011 | 24 أبريل, 2011
Hungarian | 2011.04.24. | 2011.04.24. | 2011. április 24. | 2011. április 24.
Portuguese (Portugal) | 24-04-2011 | 24/Abr/2011 | 24 de Abril de 2011 | Domingo, 24 de Abril de 2011
Greek (Cyprus) | 24/04/2011 | 24 Απρ 2011 | 24 Απρίλιος 2011 | Κυριακή, 24 Απρίλιος 2011
Arabic (Qatar) | 24/04/11 | 24/04/2011 | 24 أبريل, 2011 | 24 أبريل, 2011
Macedonian (Macedonia) | 24.4.11 | 24.4.2011 | 24, април 2011 | недела, 24, април 2011
Swedish | 2011-04-24 | 2011-apr-24 | den 24 april 2011 | den 24 april 2011
German (Switzerland) | 24.04.11 | 24.04.2011 | 24. April 2011 | Sonntag, 24. April 2011
English (United States) | 4/24/11 | Apr 24, 2011 | April 24, 2011 | Sunday, April 24, 2011
Finnish (Finland) | 24.4.2011 | 24.4.2011 | 24. huhtikuuta 2011 | 24. huhtikuuta 2011
Icelandic | 24.4.2011 | 24.4.2011 | 24. apríl 2011 | 24. apríl 2011
Czech | 24.4.11 | 24.4.2011 | 24. duben 2011 | Neděle, 24. duben 2011
English (Malta) | 24/04/2011 | 24 Apr 2011 | 24 April 2011 | Sunday, 24 April 2011
Slovenian (Slovenia) | 24.4.11 | 24.4.2011 | Nedelja, 24 april 2011 | Nedelja, 24 april 2011
Slovak (Slovakia) | 24.4.2011 | 24.4.2011 | Nedeľa, 2011, apríl 24 | Nedeľa, 2011, apríl 24
Italian | 24/04/11 | 24-apr-2011 | 24 aprile 2011 | domenica 24 aprile 2011
Turkish (Turkey) | 24.04.2011 | 24.Nis.2011 | 24 Nisan 2011 Pazar | 24 Nisan 2011 Pazar
Chinese | 11-4-24 | 2011-4-24 | 2011年4月24日 | 2011年4月24日 星期日
Thai | 24/4/2011 | 24 เม.ย. 2011 | 24 เมษายน 2011 | วันอาทิตย์ที่ 24 เมษายน ค.ศ. 2011
Arabic (Saudi Arabia) | 24/04/11 | 24/04/2011 | 24 أبريل, 2011 | 24 أبريل, 2011
Norwegian | 24.04.11 | 24.apr.2011 | 24. april 2011 | 24. april 2011
English (United Kingdom) | 24/04/11 | 24-Apr-2011 | 24 April 2011 | Sunday, 24 April 2011
Serbian (Serbia and Montenegro) | 24.4.11. | 24.04.2011. | 24.04.2011. | недеља, 24.април.2011.
Lithuanian | 11.4.24 | 2011-04-24 | Sekmadienis, 2011, Balandžio 24 | Sekmadienis, 2011, Balandžio 24
Romanian | 24.04.2011 | 24.04.2011 | 24 aprilie 2011 | 24 aprilie 2011
English (New Zealand) | 24/04/11 | 24/04/2011 | 24 April 2011 | Sunday, 24 April 2011
Norwegian (Norway,Nynorsk) | 24.04.11 | 24.apr.2011 | 24. april 2011 | 24. april 2011
Lithuanian (Lithuania) | 11.4.24 | 2011-04-24 | Sekmadienis, 2011, Balandžio 24 | Sekmadienis, 2011, Balandžio 24
Spanish (Nicaragua) | 04-24-11 | 04-24-2011 | 24 de abril de 2011 | domingo 24 de abril de 2011
Dutch | 24-4-11 | 24-apr-2011 | 24 april 2011 | zondag 24 april 2011
Irish (Ireland) | 24/04/2011 | 24 Aib 2011 | 24 Aibreán 2011 | Dé Domhnaigh 24 Aibreán 2011
French (Belgium) | 24/04/11 | 24-avr.-2011 | 24 avril 2011 | dimanche 24 avril 2011
Spanish (Spain) | 24/04/11 | 24-abr-2011 | 24 de abril de 2011 | domingo 24 de abril de 2011
Arabic (Lebanon) | 24/04/11 | 24/04/2011 | 24 نيسان, 2011 | 24 نيسان, 2011
Korean | 11. 4. 24 | 2011. 4. 24 | 2011년 4월 24일 (일) | 2011년 4월 24일 일요일
French (Canada) | 11-04-24 | 2011-04-24 | 24 avril 2011 | dimanche 24 avril 2011
Estonian (Estonia) | 24.04.11 | 24.04.2011 | pühapäev, 24. Aprill 2011. a | pühapäev, 24. Aprill 2011
Arabic (Kuwait) | 24/04/11 | 24/04/2011 | 24 أبريل, 2011 | 24 أبريل, 2011
Serbian (Serbia) | 24.4.11. | 24.04.2011. | 24.04.2011. | недеља, 24.април.2011.
Spanish (United States) | 4/24/11 | abr 24, 2011 | 24 de abril de 2011 | domingo 24 de abril de 2011
Spanish (Mexico) | 24/04/11 | 24/04/2011 | 24 de abril de 2011 | domingo 24 de abril de 2011
Arabic (Sudan) | 24/04/11 | 24/04/2011 | 24 أبريل, 2011 | 24 أبريل, 2011
Indonesian (Indonesia) | 24/04/11 | 24 Apr 11 | 24 April 2011 | Minggu 24 April 2011
Russian | 24.04.11 | 24.04.2011 | 24 Апрель 2011 г. | 24 Апрель 2011 г.
Latvian | 11.24.4 | 2011.24.4 | svētdiena, 2011, 24 aprīlis | svētdiena, 2011, 24 aprīlis
Spanish (Uruguay) | 24/04/11 | 24/04/2011 | 24 de abril de 2011 | domingo 24 de abril de 2011
Latvian (Latvia) | 11.24.4 | 2011.24.4 | svētdiena, 2011, 24 aprīlis | svētdiena, 2011, 24 aprīlis
Hebrew | 24/04/11 | 24/04/2011 | 24 אפריל 2011 | יום ראשון 24 אפריל 2011
Portuguese (Brazil) | 24/04/11 | 24/04/2011 | 24 de Abril de 2011 | Domingo, 24 de Abril de 2011
Arabic (Syria) | 24/04/11 | 24/04/2011 | 24 نيسان, 2011 | 24 نيسان, 2011
Croatian | 2011.04.24 | 2011.04.24 | 2011. travanj 24 | 2011. travanj 24
Estonian | 24.04.11 | 24.04.2011 | pühapäev, 24. Aprill 2011. a | pühapäev, 24. Aprill 2011
Spanish (Dominican Republic) | 04/24/11 | 04/24/2011 | 24 de abril de 2011 | domingo 24 de abril de 2011
French (Switzerland) | 24.04.11 | 24 avr. 2011 | 24. avril 2011 | dimanche, 24. avril 2011
Hindi (India) | २४/४/११ | २४ अप्रैल, २०११ | २४ अप्रैल, २०११ | रविवार, २४ अप्रैल, २०११
Spanish (Venezuela) | 24/04/11 | 24/04/2011 | 24 de abril de 2011 | domingo 24 de abril de 2011
Arabic (Bahrain) | 24/04/11 | 24/04/2011 | 24 أبريل, 2011 | 24 أبريل, 2011
English (Philippines) | 4/24/11 | 04 24, 11 | April 24, 2011 | Sunday, April 24, 2011
Arabic (Tunisia) | 24/04/11 | 24/04/2011 | 24 أبريل, 2011 | 24 أبريل, 2011
Finnish | 24.4.2011 | 24.4.2011 | 24. huhtikuuta 2011 | 24. huhtikuuta 2011
German (Austria) | 24.04.11 | 24.04.2011 | 24. April 2011 | Sonntag, 24. April 2011
Spanish | 24/04/11 | 24-abr-2011 | 24 de abril de 2011 | domingo 24 de abril de 2011
Dutch (Netherlands) | 24-4-11 | 24-apr-2011 | 24 april 2011 | zondag 24 april 2011
Spanish (Ecuador) | 24/04/11 | 24/04/2011 | 24 de abril de 2011 | domingo 24 de abril de 2011
Chinese (Taiwan) | 2011/4/24 | 2011/4/24 | 2011年4月24日 | 2011年4月24日 星期日
Arabic (Jordan) | 24/04/11 | 24/04/2011 | 24 نيسان, 2011 | 24 نيسان, 2011
Belarusian | 24.4.11 | 24.4.2011 | нядзеля, 24, красавіка 2011 | нядзеля, 24, красавіка 2011
Icelandic (Iceland) | 24.4.2011 | 24.4.2011 | 24. apríl 2011 | 24. apríl 2011
Spanish (Colombia) | 24/04/11 | 24/04/2011 | 24 de abril de 2011 | domingo 24 de abril de 2011
Spanish (Costa Rica) | 24/04/11 | 24/04/2011 | 24 de abril de 2011 | domingo 24 de abril de 2011
Spanish (Chile) | 24-04-11 | 24-04-2011 | 24 de abril de 2011 | domingo 24 de abril de 2011
Arabic (Egypt) | 24/04/11 | 24/04/2011 | 24 أبريل, 2011 | 24 أبريل, 2011
English (South Africa) | 2011/04/24 | 24 Apr 2011 | 24 April 2011 | Sunday 24 April 2011
Thai (Thailand) | 24/4/2554 | 24 เม.ย. 2554 | 24 เมษายน 2554 | วันอาทิตย์ที่ 24 เมษายน พ.ศ. 2554
Greek (Greece) | 24/4/2011 | 24 Απρ 2011 | 24 Απρίλιος 2011 | Κυριακή, 24 Απρίλιος 2011
Italian (Italy) | 24/04/11 | 24-apr-2011 | 24 aprile 2011 | domenica 24 aprile 2011
Catalan | 24/04/11 | 24/04/2011 | 24 / abril / 2011 | diumenge, 24 / abril / 2011
Hungarian (Hungary) | 2011.04.24. | 2011.04.24. | 2011. április 24. | 2011. április 24.
French | 24/04/11 | 24 avr. 2011 | 24 avril 2011 | dimanche 24 avril 2011
English (Ireland) | 24/04/11 | 24-Apr-2011 | 24 April 2011 | 24 April 2011
Ukrainian (Ukraine) | 24.04.11 | 24 квіт 2011 | 24 квітня 2011 | неділя, 24 квітня 2011 р.
Polish (Poland) | 24.04.11 | 2011-04-24 | 24 kwiecień 2011 | niedziela, 24 kwiecień 2011
French (Luxembourg) | 24/04/11 | 24 avr. 2011 | 24 avril 2011 | dimanche 24 avril 2011
Dutch (Belgium) | 24/04/11 | 24-apr-2011 | 24 april 2011 | zondag 24 april 2011
English (India) | 24/4/11 | 24 Apr, 2011 | 24 April, 2011 | Sunday, 24 April, 2011
Catalan (Spain) | 24/04/11 | 24/04/2011 | 24 / abril / 2011 | diumenge, 24 / abril / 2011
Arabic (Morocco) | 24/04/11 | 24/04/2011 | 24 أبريل, 2011 | 24 أبريل, 2011
Spanish (Bolivia) | 24-04-11 | 24-04-2011 | 24 de abril de 2011 | domingo 24 de abril de 2011
English (Australia) | 24/04/11 | 24/04/2011 | 24 April 2011 | Sunday, 24 April 2011
Serbian | 24.4.11. | 24.04.2011. | 24.04.2011. | недеља, 24.април.2011.
Chinese (Singapore) | 24/04/11 | 24-四月-11 | 24 四月 2011 | 24 四月 2011
Portuguese | 24-04-2011 | 24/Abr/2011 | 24 de Abril de 2011 | Domingo, 24 de Abril de 2011
Ukrainian | 24.04.11 | 24 квіт 2011 | 24 квітня 2011 | неділя, 24 квітня 2011 р.
Spanish (El Salvador) | 04-24-11 | 04-24-2011 | 24 de abril de 2011 | domingo 24 de abril de 2011
Russian (Russia) | 24.04.11 | 24.04.2011 | 24 Апрель 2011 г. | 24 Апрель 2011 г.
Korean (South Korea) | 11. 4. 24 | 2011. 4. 24 | 2011년 4월 24일 (일) | 2011년 4월 24일 일요일
Vietnamese | 24/04/2011 | 24-04-2011 | Ngày 24 tháng 4 năm 2011 | Chủ nhật, ngày 24 tháng tư năm 2011
Arabic (Algeria) | 24/04/11 | 24/04/2011 | 24 أبريل, 2011 | 24 أبريل, 2011
Vietnamese (Vietnam) | 24/04/2011 | 24-04-2011 | Ngày 24 tháng 4 năm 2011 | Chủ nhật, ngày 24 tháng tư năm 2011
Serbian (Montenegro) | 24.4.11. | 24.04.2011. | 24.04.2011. | недеља, 24.април.2011.
Albanian | 11-04-24 | 2011-04-24 | 2011-04-24 | 2011-04-24
Arabic (Libya) | 24/04/11 | 24/04/2011 | 24 أبريل, 2011 | 24 أبريل, 2011
Arabic | 24/04/11 | 24/04/2011 | 24 أبريل, 2011 | 24 أبريل, 2011
Chinese (China) | 11-4-24 | 2011-4-24 | 2011年4月24日 | 2011年4月24日 星期日
Belarusian (Belarus) | 24.4.11 | 24.4.2011 | нядзеля, 24, красавіка 2011 | нядзеля, 24, красавіка 2011
Chinese (Hong Kong) | 11年4月24日 | 2011年4月24日 | 2011年04月24日 星期日 | 2011年04月24日 星期日
Japanese | 11/04/24 | 2011/04/24 | 2011/04/24 | 2011年4月24日
Hebrew (Israel) | 24/04/11 | 24/04/2011 | 24 אפריל 2011 | יום ראשון 24 אפריל 2011
Bulgarian (Bulgaria) | 11-4-24 | 2011-4-24 | Неделя, 2011, Април 24 | Неделя, 2011, Април 24
Indonesian | 11/04/24 | 2011 Apr 24 | 2011 April 24 | Minggu, 2011 April 24
Maltese (Malta) | 24/04/2011 | 24 Apr 2011 | 24 ta’ April 2011 | Il-Ħadd, 24 ta’ April 2011
Spanish (Paraguay) | 24/04/11 | 24/04/2011 | 24 de abril de 2011 | domingo 24 de abril de 2011
Slovenian | 24.4.11 | 24.4.2011 | Nedelja, 24 april 2011 | Nedelja, 24 april 2011
French (France) | 24/04/11 | 24 avr. 2011 | 24 avril 2011 | dimanche 24 avril 2011
Czech (Czech Republic) | 24.4.11 | 24.4.2011 | 24. duben 2011 | Neděle, 24. duben 2011
Italian (Switzerland) | 24.04.11 | 24-apr-2011 | 24. aprile 2011 | domenica, 24. aprile 2011
Romanian (Romania) | 24.04.2011 | 24.04.2011 | 24 aprilie 2011 | 24 aprilie 2011
Spanish (Puerto Rico) | 04-24-11 | 04-24-2011 | 24 de abril de 2011 | domingo 24 de abril de 2011
English (Canada) | 24/04/11 | 24-Apr-2011 | April 24, 2011 | Sunday, April 24, 2011
German (Germany) | 24.04.11 | 24.04.2011 | 24. April 2011 | Sonntag, 24. April 2011
Irish | 11/04/24 | 2011 Aib 24 | 2011 Aibreán 24 | Dé Domhnaigh, 2011 Aibreán 24
German (Luxembourg) | 24.04.11 | 24.04.2011 | 24. April 2011 | Sonntag, 24. April 2011
German | 24.04.11 | 24.04.2011 | 24. April 2011 | Sonntag, 24. April 2011
Spanish (Argentina) | 24/04/11 | 24/04/2011 | 24 de abril de 2011 | domingo 24 de abril de 2011
Slovak | 24.4.2011 | 24.4.2011 | Nedeľa, 2011, apríl 24 | Nedeľa, 2011, apríl 24
Malay (Malaysia) | 24/04/2011 | 24 April 2011 | 24 April 2011 | Ahad 24 Apr 2011
Croatian (Croatia) | 24.04.11. | 24.04.2011. | 2011. travanj 24 | 2011. travanj 24
English (Singapore) | 4/24/11 | Apr 24, 2011 | April 24, 2011 | Sunday, April 24, 2011
Danish | 24-04-11 | 24-04-2011 | 24. april 2011 | 24. april 2011
Maltese | 24/04/2011 | 24 Apr 2011 | 24 ta’ April 2011 | Il-Ħadd, 24 ta’ April 2011
Polish | 11-04-24 | 2011-04-24 | 24 kwiecień 2011 | niedziela, 24 kwiecień 2011
Arabic (Oman) | 24/04/11 | 24/04/2011 | 24 أبريل, 2011 | 24 أبريل, 2011
Turkish | 24.04.2011 | 24.Nis.2011 | 24 Nisan 2011 Pazar | 24 Nisan 2011 Pazar
Thai (Thailand,TH) | ๒๔/๔/๒๕๕๔ | ๒๔ เม.ย. ๒๕๕๔ | ๒๔ เมษายน ๒๕๕๔ | วันอาทิตย์ที่ ๒๔ เมษายน พ.ศ. ๒๕๕๔
Greek | 24/4/2011 | 24 Απρ 2011 | 24 Απρίλιος 2011 | Κυριακή, 24 Απρίλιος 2011
Malay | 11/04/24 | 2011 Apr 24 | 2011 April 24 | Ahad, 2011 April 24
Swedish (Sweden) | 2011-04-24 | 2011-apr-24 | den 24 april 2011 | den 24 april 2011
Danish (Denmark) | 24-04-11 | 24-04-2011 | 24. april 2011 | 24. april 2011
Spanish (Honduras) | 04-24-11 | 04-24-2011 | 24 de abril de 2011 | domingo 24 de abril de 2011

Table 1: JDK 1.6 locales and date formats

There are some pretty sophisticated conversions going on in there: Japan,JP actually converts the year to the era name (年号), an older style of tracking years based on the reign of the emperor. DateFormat.parse wants pretty much the exact input format of the current locale. There's a "lenient" setting whose behavior varies from one locale to the next; it's not documented exactly how lenient this behavior is or how it works, so the only way to determine what may be accepted is trial and error. Instead, it's usually easier to use java.text.SimpleDateFormat. SimpleDateFormat allows you to request a specific input format; the layout is fixed by the pattern you supply, but you (the developer) have wide latitude in which date formats you choose to accept. SimpleDateFormat is a concrete class (it has public constructors, unlike DateFormat), and when you instantiate one, you supply a date/time format template. The most useful such template (IMHO) is yyyy-MM-dd, which will allow you to parse a four-digit year, followed by a dash, a month (one-based, as a user would expect, not zero-based), another dash, and a day of the month. There are options to allow the user to input an era (such as "AD" or "BC"), a time zone, milliseconds, days of the week, and so on, but I've never had occasion to use any of these.
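As a minimal sketch of the yyyy-MM-dd template in use, the class below parses and formats with a SimpleDateFormat pinned to UTC. The class and method names are hypothetical, and wrapping ParseException in an unchecked exception is just a convenience for the example.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

class IsoDates {
    // A fresh formatter per call; SimpleDateFormat instances are cheap enough for this.
    private static SimpleDateFormat newFormat() {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd", Locale.US);
        fmt.setTimeZone(TimeZone.getTimeZone("UTC"));
        return fmt;
    }

    // Parses yyyy-MM-dd text into milliseconds since the epoch (midnight UTC).
    static long parseUtcMillis(String text) {
        try {
            return newFormat().parse(text).getTime();
        } catch (ParseException e) {
            throw new IllegalArgumentException("not a yyyy-MM-dd date: " + text, e);
        }
    }

    // Formats a millisecond code back into yyyy-MM-dd text.
    static String formatUtc(long millis) {
        return newFormat().format(new Date(millis));
    }

    public static void main(String[] args) {
        long millis = parseUtcMillis("2011-04-24");
        System.out.println(millis);            // 1303603200000
        System.out.println(formatUtc(millis)); // 2011-04-24
    }
}
```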

One thing to be careful about with DateFormat and its subclasses such as SimpleDateFormat is that, for some odd reason, they're not thread-safe. If two threads try to use the same instance to parse (or even format) dates concurrently, behavior is undefined. This is not an idle warning: I ran some tests and found that about 75% of the time, two threads invoking a shared SimpleDateFormat threw nonsensical exceptions such as:

Exception in thread "Thread-2" java.lang.NumberFormatException: multiple points
  at sun.misc.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:1082)
  at java.lang.Double.parseDouble(Double.java:510)
  at java.text.DigitList.getDouble(DigitList.java:151)
  at java.text.DecimalFormat.parse(DecimalFormat.java:1302)
  at java.text.SimpleDateFormat.subParse(SimpleDateFormat.java:1589)
  at java.text.SimpleDateFormat.parse(SimpleDateFormat.java:1311)
  at java.text.DateFormat.parse(DateFormat.java:335)
  at DateTest$1.run(DateTest.java:48)
  at java.lang.Thread.run(Thread.java:680)

Exception in thread "Thread-2" java.lang.NumberFormatException: For input string: ""
  at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
  at java.lang.Long.parseLong(Long.java:431)
  at java.lang.Long.parseLong(Long.java:468)
  at java.text.DigitList.getLong(DigitList.java:177)
  at java.text.DecimalFormat.parse(DecimalFormat.java:1297)
  at java.text.SimpleDateFormat.subParse(SimpleDateFormat.java:1934)
  at java.text.SimpleDateFormat.parse(SimpleDateFormat.java:1311)
  at java.text.DateFormat.parse(DateFormat.java:335)
  at DateTest$1.run(DateTest.java:48)
  at java.lang.Thread.run(Thread.java:680)

Exception in thread "Thread-1" java.lang.NumberFormatException: For input string: ".2233E2.2233E2"
  at sun.misc.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:1222)
  at java.lang.Double.parseDouble(Double.java:510)
  at java.text.DigitList.getDouble(DigitList.java:151)
  at java.text.DecimalFormat.parse(DecimalFormat.java:1302)
  at java.text.SimpleDateFormat.subParse(SimpleDateFormat.java:1934)
  at java.text.SimpleDateFormat.parse(SimpleDateFormat.java:1311)
  at java.text.DateFormat.parse(DateFormat.java:335)
  at DateTest$1.run(DateTest.java:48)
  at java.lang.Thread.run(Thread.java:680)
Good luck trying to figure out what went wrong from a log file entry there. Worse, sometimes both threads would fail with errors, but other times one would throw an exception and the other would succeed. "Worse", you say? Well, in those cases, the parsed date for "2011-04-24" was, at various times:
  • Fri Apr 23 00:00:00 CDT 2011
  • Mon Oct 24 00:00:00 CST 1200
  • Fri Apr 24 00:00:00 CST 1917
  • Sun Apr 24 00:00:00 CST 1197
In other words, don't ever share a SimpleDateFormat between multiple threads.

The easiest solution is to just create a new SimpleDateFormat for each thread; if you're feeling particularly lazy, instantiate a new one just before you need to parse a date, and let the garbage collector pick it up after the parsing function returns. Conventional wisdom suggests that this is a major performance drain. However, benchmarking tests I've run on several systems suggest that instantiating a SimpleDateFormat every time you need one is only about 2-3 times slower than trying to share one. Depending on the platform, the cost is probably in the microsecond (1/1,000,000 of a second) range, so I don't worry too much about it. I've seen developers do some pretty funky things with ThreadLocal storage to try to beat this by reusing instances, but the application requirements rarely warrant this. (Remember: good enough is good engineering.)
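Both approaches fit in a few lines. The sketch below shows a new-instance-per-call parser next to a ThreadLocal variant, plus a small multi-threaded check; all of the names are made up for illustration, and the check is a smoke test, not a benchmark.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.concurrent.atomic.AtomicInteger;

class SafeDateParsing {
    private static final String PATTERN = "yyyy-MM-dd";

    // One formatter per thread, created lazily the first time that thread asks for it.
    private static final ThreadLocal<SimpleDateFormat> PER_THREAD =
            ThreadLocal.withInitial(() -> new SimpleDateFormat(PATTERN, Locale.US));

    // Simplest option: a fresh formatter per call, never shared.
    static Date parseWithNewInstance(String text) {
        return parse(new SimpleDateFormat(PATTERN, Locale.US), text);
    }

    static Date parseWithThreadLocal(String text) {
        return parse(PER_THREAD.get(), text);
    }

    private static Date parse(SimpleDateFormat fmt, String text) {
        try {
            return fmt.parse(text);
        } catch (ParseException e) {
            throw new IllegalArgumentException("not a " + PATTERN + " date: " + text, e);
        }
    }

    // Parses the same text on several threads and counts wrong or failed results.
    static int countMismatches(int threads, int iterations) {
        Date expected = parseWithNewInstance("2011-04-24");
        AtomicInteger mismatches = new AtomicInteger();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < iterations; j++) {
                    try {
                        if (!expected.equals(parseWithThreadLocal("2011-04-24"))) {
                            mismatches.incrementAndGet();
                        }
                    } catch (RuntimeException e) {
                        mismatches.incrementAndGet();
                    }
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return mismatches.get();
    }

    public static void main(String[] args) {
        System.out.println(countMismatches(4, 10000)); // 0
    }
}
```

Because no formatter is ever visible to more than one thread, the garbled results described above can't occur with either method.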

In reality, if you're really pinched for performance, though, you should completely forgo DateFormat and its subclasses and just do some simple date parsing yourself. The code in listing 3 is about twice as fast as SimpleDateFormat.parse, even when shared.

StringTokenizer tok = new StringTokenizer( in, "-" );
Date d = new Date( Integer.parseInt( tok.nextToken() ) - 1900,
        Integer.parseInt( tok.nextToken() ) - 1,
        Integer.parseInt( tok.nextToken() ) );

Listing 3: Fast fixed-format date parsing

The drawback to this approach is, of course, that it's not at all i18n compliant. However, if you're trying to squeeze every last drop of performance out of your application, you're going to have to bend somewhere.
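For completeness, here is a self-contained version of the same hand-rolled approach wrapped in a method. The class name FastDateParser and the token-count check are my additions; it keeps the deprecated three-integer Date constructor, so the compiler will warn about it, and I haven't measured whether the extra check changes the timing.

```java
import java.util.Date;
import java.util.StringTokenizer;

class FastDateParser {
    // Parses strictly "yyyy-MM-dd" text into midnight of that day in the default time zone.
    @SuppressWarnings("deprecation")
    static Date parse(String in) {
        StringTokenizer tok = new StringTokenizer(in, "-");
        if (tok.countTokens() != 3) {
            throw new IllegalArgumentException("expected yyyy-MM-dd: " + in);
        }
        int year = Integer.parseInt(tok.nextToken());
        int month = Integer.parseInt(tok.nextToken());
        int day = Integer.parseInt(tok.nextToken());
        // Date wants the year relative to 1900 and a zero-based month.
        return new Date(year - 1900, month - 1, day);
    }

    public static void main(String[] args) {
        System.out.println(parse("2011-04-24"));
    }
}
```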


Joshua Davies
