GHCN v3 station location metadata – a “report card”

Just over two years ago, when notifying NASA Goddard (GISTEMP) of errors in station coordinates in their v2.inv file, derived from the corresponding GHCN file, I also contacted Russell Vose at NOAA directly, rather than assuming he would be contacted by Goddard. He replied promptly, saying that he appreciated the feedback, that he was no longer working on the GHCN temperature data, and that he would pass on the information to those working on a new version of the temperature dataset, adding that “Hopefully some of these can be fixed quickly, but others may take a little longer”.

So, two years later, is the location metadata for GHCN v3 an improvement on that for GHCN v2? Unfortunately, at least as far as non-USA data is concerned, the answer must be “not really”. For this “report card” I will consider the changes made for non-US station locations, and in particular the response to the errors I reported. Since that e-mail I have found many more location errors. These have been mainly for non-USA stations, although I have also found some wrongly located USA stations. This may be because the location metadata for GHCN v2 and GHCN v3 stations in the USA is generally more accurate, which does seem to be the case. The fact that I have less alternative location information available to me for US stations may also be a relevant factor. Although both GHCN versions contain a large number of US stations, these cover a relatively small part of the Earth’s surface, at a higher density than the stations in the rest of the world, and I have concentrated my checking efforts on stations in the rest of the world.

The changes made to GHCN location metadata with v3 for non-US stations can be quickly summarised: all GHCN v3 latitude and longitude values are reported to 0.0001 degrees, rather than to 0.01 degrees as in GHCN v2. For non-US stations, with the single exception of ZARAGOZA/AEROPUERTO, this is achieved simply by appending 00 to the previous values. The unwarranted claim in Hansen et al 2010 that

“Station location in the meteorological data records is provided with a resolution of 0.01 degrees of latitude and longitude, corresponding to a distance of about 1 km. This resolution is useful for investigating urban effects on regional atmospheric temperature.”

notwithstanding, the provision of this data with a resolution of 0.01 degrees does not mean that this data is accurate to 0.01 degrees, and if “useful for investigating urban effects” has the commonsense meaning of “accurate enough to use for classification of the location as urban or rural” the data for a substantial number of stations (not just those listed below) fails to meet such a criterion.
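As a rough check on the “about 1 km” figure in the quotation, the following minimal Python sketch converts a step in degrees into kilometres. It assumes a spherical Earth with a mean radius of 6371 km; the function name `km_per_step` is purely illustrative and does not come from GHCN or GISTEMP.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean radius of a spherical Earth


def km_per_step(step_deg, latitude_deg):
    """Return (north-south km, east-west km) spanned by step_deg at a latitude."""
    arc = math.radians(step_deg) * EARTH_RADIUS_KM
    # East-west distances shrink with the cosine of the latitude.
    return arc, arc * math.cos(math.radians(latitude_deg))


for lat in (0, 40, 60):
    ns, ew = km_per_step(0.01, lat)
    print(f"lat {lat:2d}: 0.01 deg = {ns:.2f} km N-S, {ew:.2f} km E-W")
```

On these assumptions a step of 0.01 degrees is a little over 1 km north-south, and less than that east-west away from the equator. This is a statement about the resolution of the reported figures only; it says nothing about whether a given station's coordinates are accurate to that level.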

The locations of two non-US stations have been changed, both of which were included in my e-mail two years ago (which I have added at the end of this post):

64308160000 TENKODOGO has been returned to Spain from Burkina Faso as 64308160000 ZARAGOZA/AEROPUERTO, with appropriate latitude/longitude

64308373001 ALBACETE has had its longitude changed from 1.8 degrees (east) to -1.8 degrees (west). There is, however, a problem with this “correction”. 08373 is the WMO station IBIZA/ES CODOLA, and, as the GISTEMP team at Goddard concluded, 08373001 is unlikely to be located at Albacete, and likely to be located at Ibiza.

And so, as regards non-US station locations, that’s it!

62316197001 ISOLA GORGONA is still located just over one degree north of the GHCN location

61017600001 LIMASSOL is still located about one degree east of the GHCN location, and has not yet become a new Atlantis

30187311000 SAN JUAN AERO is still more than forty kilometres in error

30382244001 SANTAREM/TAPERINHA is still probably mislocated

30984564000 HUANUCO is still about fifty kilometres in error

30984691000 PISCO (notified as a smaller error) is still about seven kilometres from where the WMO believes it to be

64308171001 SORIA is likely, as the GISTEMP team believe, to actually be a station at Lerida/Lleida (and the identification of this station as Soria is still likely to have arisen as a result of use of a map based on the Madrid meridian)

64308330001 SINTRA/GRANJA remains Portuguese, not Spanish (and is probably in error by about five kilometres)

And to complete the list of errors mentioned, the port and airport of Cherbourg have not yet migrated out to sea.

Two of the reported errors were acted upon (and one of these is probably still in error). The remainder of the reported errors were ignored, and there is no evidence of any effort to find and correct errors for other non-US stations (of which there are many).
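To put the one-degree errors in the list above into kilometres, here is a minimal Python sketch using the haversine formula. It assumes a spherical Earth of radius 6371 km. The “corrected” positions are only the approximate figures given in the e-mail reproduced at the end of this post, with the other coordinate assumed unchanged (“similar longitude”, “similar latitude”), so the results are indicative only.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean radius of a spherical Earth


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# (GHCN position, approximate position from the e-mail); the unchanged
# coordinate in each pair is an assumption, not a surveyed value.
cases = {
    "ISOLA GORGONA": ((42.40, 9.90), (43.43, 9.90)),
    "LIMASSOL": ((34.70, 32.00), (34.70, 33.05)),
}

for name, (ghcn, approx) in cases.items():
    print(f"{name}: about {haversine_km(*ghcn, *approx):.0f} km")
```

Both offsets come out at roughly 100 km, about two orders of magnitude larger than the 0.01-degree resolution at which the coordinates are reported.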

While accuracy of these coordinates may perhaps be less important to NOAA than for example to GISTEMP, where they are used to classify stations as urban or rural, which may be difficult to do by examining the night time luminance of a point at some distance from the station, the appropriate grade nevertheless seems clear:

Grade: FAIL

(It may be interesting to compare the response of another organisation, the WMO, to a similar report. Batch of errors reported Saturday March 3, 2012. Response Monday March 5, 08:57. Corrections included in the flatfile of data on Observing Stations (with higher precision) – 05 March 2012 edition, later that day)


E-mail of 3 March 2010:

I am assuming from the contacts information on the NOAA website for GHCN-Monthly that you are the appropriate person, but if not, please redirect.

I have recently examined the changes resulting from the recent GISS decision to extend the use of nightlight radiance in Gistemp to determine rural/urban classification of stations from just USHCN stations previously to the rest of the world now (which has changed the classification of about a quarter of all stations). To do this I have examined the latitudes and longitudes in v2.inv, and quickly discovered some major errors while initially examining only a very small subset of file entries which were easily checked (details below). I then proceeded to check some more stations, in particular some of those located at airports, where the general location of the station could be inferred from the airport boundaries, and found that a considerable number of the coordinates in v2.inv were located at a considerable distance from the airport boundary, where I have regarded “a considerable distance” as “by multiples of the length of the main runway”, and considerably more than the precision implied by coordinates given as degrees to two decimal places. I also was amused to find that two of the Spanish stations were located by longitude values relative to the Madrid Prime Meridian rather than Greenwich!

As I have already found a considerable number of location errors, examining only a small subset of file entries, I would assume that a much larger number of similar errors remain undetected among the remaining file entries which I have not even attempted to examine.

While the nightlight radiance values in the version of v2.inv used by GISS have presumably been added by GISS, I have checked coordinates found in that GISS version against your original v2.inv, and found that the coordinates are the same. I have already e-mailed Reto Ruedy last week about this, but not yet received a reply from him, unusually, as he has previously replied within one working day when I have alerted him to some coding errors, or made comments regarding Gistemp. As I assume he is on leave, I am now bringing this directly to your notice, rather than assuming that he will pass it on to you in due course.

The three major errors which I found within a couple of minutes were:

ALBACETE            SPAIN       39.00    1.80   43    0U   83FLxxno-9x-9WATER           A

Which I at first thought was simply an east/west error as -1.80 degrees relative to Greenwich does in fact fall relatively close to the true location of Albacete, but when I looked at the remaining Spanish stations and found another, Soria, which is relocated eastwards by a similar distance, and recalled that I had used some Spanish maps 25 years ago which used the Madrid meridian, I checked and found that Madrid is 3.687911111 degrees west of Greenwich, realised that the shift from 1.8 to -1.8 was just a coincidence and corrected for the difference between Madrid and Greenwich instead, and found that Albacete now fell even closer to that city, and of course Soria also moved to the true location.

62316197001 ISOLA GORGONA                   42.40    9.90  254    0R   -9HIxxCO 1x-9WATER           A

This small island is located at approximately 43.43 N, similar longitude

61017600001 LIMASSOL            CYPRUS      34.70   32.00    8    0U   82HIxxCO 1x-9WATER           A

Limassol is located at approximately 33.05 E, similar latitude

These relocations are so distant that no knowledge of the true coordinates was needed to find them within minutes. I had already noticed some time ago that the two stations at Cherbourg and Cherbourg airport, an area I am familiar with, had been relocated out into La Manche/English Channel, albeit not as extremely as the above stations, and it was in fact this which prompted me to check other stations now, and suggested an easy way to check at least some stations – I simply imported the coordinates of all European stations into Microsoft AutoRoute, and looked at the relatively few pushpins which appeared to be out at sea, and checked these to see if there was a “supporting island”. The three above were obviously wrong. There were also some others closer to shore, and in areas such as Turkey where the map detail was insufficient.

The next obvious group of stations to examine were airports, as the location of the station could be roughly determined without additional location information. To examine these I opened v2.inv in Excel, and wrote a short macro to produce a Google Earth kml file from selected stations, setting up a tour of these stations in Google Earth, with a pushpin and identifying information at each station location, and a short pause to allow zooming out if necessary if the airport runways were not in view. I have not attached these kml files now, as you may quite rightly be wary of mail from an unknown person with attachments, but I can send these on request when you have had a chance to verify from the examples in this email that the errors I describe are real. Sample airports you might look at include:

30187311000 SAN JUAN AERO                  -31.57  -68.87  598  688U  298MVxxno-9A 5WARM GRASS/SHRUBA

30382244001 SANTAREM/TAPERINHA              -2.40  -54.30   22   13U  102FLxxno-9A 1WARM GRASS/SHRUBA

30984564000 HUANUCO                         -9.90  -75.75 1860 3087U   52MVxxno-9A 1WARM GRASS/SHRUBA

30984691000 PISCO                          -13.75  -76.28    7    5U   53FLxxCO 1A 1WATER           B (smaller error)
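For readers who want to repeat this kind of check, a comparable export can be sketched in Python. This is not the Excel macro mentioned above, and it produces plain placemarks rather than a tour. The parsing is an assumption based only on the sample lines shown here: an 11-digit ID, then the station name, then latitude and longitude as the first two decimal numbers on the line. The function names are illustrative.

```python
import re
from xml.sax.saxutils import escape

# Assumed layout (inferred from the sample lines, not from a format spec):
# 11-digit ID, station name, then latitude and longitude in decimal degrees.
LINE_RE = re.compile(r"^(\d{11})\s+(.+?)\s+(-?\d+\.\d+)\s+(-?\d+\.\d+)\s")


def parse_station(line):
    """Return (id, name, lat, lon), or None if the line does not match."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    station_id, name, lat, lon = m.groups()
    return station_id, name.strip(), float(lat), float(lon)


def to_kml(stations):
    """Build a KML document with one placemark per station."""
    placemarks = []
    for station_id, name, lat, lon in stations:
        placemarks.append(
            "  <Placemark>\n"
            f"    <name>{escape(station_id + ' ' + name)}</name>\n"
            # KML expects longitude first, then latitude, then altitude.
            f"    <Point><coordinates>{lon},{lat},0</coordinates></Point>\n"
            "  </Placemark>"
        )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<kml xmlns="http://www.opengis.net/kml/2.2">\n<Document>\n'
        + "\n".join(placemarks)
        + "\n</Document>\n</kml>\n"
    )


sample = [
    "30187311000 SAN JUAN AERO                  -31.57  -68.87  598  688U  298MVxxno-9A 5WARM GRASS/SHRUBA",
    "30984564000 HUANUCO                         -9.90  -75.75 1860 3087U   52MVxxno-9A 1WARM GRASS/SHRUBA",
]
stations = [s for s in map(parse_station, sample) if s is not None]
kml_text = to_kml(stations)
print(kml_text)
```

Saving `kml_text` to a file with a `.kml` extension should give something that can be opened in Google Earth, with each pushpin labelled by station ID and name.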

And many more in other regions. These just happen to be among the first which caught my eye. And I mentioned Soria above:

64308171001 SORIA               SPAIN       41.70    1.20 1068  424R   -9HIFOno-9x-9MED. GRAZING    A

I looked at the remaining Spanish stations to see if any others were relocated like Albacete, and found Soria. But I also noticed two other trivial (from the GHCN point of view, although possibly not from a political point of view) errors. The 35 “Spanish” stations include

64308160000 TENKODOGO                       11.77    0.38 -999  288U  449HIxxno-9A10WARM GRASS/SHRUBA

Which is actually in Burkina Faso, and

64308330001 SINTRA/GRANJA                   38.80   -9.30  133  196U 1100HIxxCO 5A10COASTAL EDGES   C

Which is actually in Portugal, and far from the border. As Portugal already has a genuine historic border dispute with Spain (Olivença), perhaps this is one you should correct quickly!
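The Madrid-meridian correction described in the e-mail for Albacete and Soria amounts to a single subtraction, sketched below in Python. The offset is the value quoted in the e-mail. The sketch assumes that the v2.inv longitudes for these two stations are east-positive values measured from Madrid, which is the hypothesis under test rather than an established fact. The function name is illustrative.

```python
# Offset of the Madrid meridian west of Greenwich, as quoted in the e-mail.
MADRID_WEST_OF_GREENWICH_DEG = 3.687911111


def madrid_to_greenwich(lon_madrid_deg):
    """Convert an east-positive longitude from Madrid to one from Greenwich."""
    return lon_madrid_deg - MADRID_WEST_OF_GREENWICH_DEG


# Longitudes as listed in v2.inv for the two Spanish stations.
for name, lon in (("ALBACETE", 1.80), ("SORIA", 1.20)):
    print(f"{name}: {lon:.2f} (Madrid) -> {madrid_to_greenwich(lon):.2f} (Greenwich)")
```

For Albacete the corrected value is close to, but not the same as, the -1.80 obtained by simply flipping the sign, which is why the two explanations are easy to confuse.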

