GHCN-M Raw Data from Ireland

(now subtitled) The Little Known Tropical Rain-forest of Ireland

Updated:

From the latest GHCN-M v3 status.txt file:

*************************************************************

03/11/2017

User feedback indicated a problem with some mean temperature data for select stations in Ireland.  The problems were traced to a particular data source (MCDW), and for the time being until that source is corrected, the data are now being sourced to the UK Met Office “Climat” data (“K” source flag”), which are believed to be the correct values.  The data changeover to the UK Met Office has occurred, but the source flag (“K”) for the corrected values was inadvertently left out.  Those source flags should be added within the next production cycle.

*************************************************************

“Select” stations in Ireland presumably means those stations for which GHCN-M v3 has continued to include data in recent years. How NOAA can be confident that the problem is confined to stations in Ireland, without having discovered the cause of these errors in MCDW (one of their own products), is something which escapes me.

Normally you might expect that some care would be taken to get the correction right. Not here, however. They have botched it again, leaving many rogue values unchanged. Some, but not all, of these are flagged as probably erroneous and to be omitted from further analysis. Others, however, pass their quality control. And they have not reverted to the (correct) values which they had earlier shown as received from the UK Met Office.

One absurd record caught my eye. Cork Airport in 2013 has been corrected, but 2014 still has six identical rogue values, including July, August and December. Five of these are flagged, but August still slips through, since the rogue value of 10.4° is not sufficiently outlying to be caught by their quality control. I asked myself how probable six identical monthly means would be. Not impossible for a tropical rain-forest climate zone, I thought. Singapore, for example, has monthly means of daily mean temperatures lying throughout the year within a range of less than 2°C, but could not match the record six identical monthly means of Cork Airport. Examining the complete GHCN-M v3 raw data file I found 24 stations with six or more identical monthly means, all well within the tropics. So book your next holiday in the tropical rain-forest climatic zone of Cork. Just beware of the crocodiles, and be aware that no guarantee is given that Cork will match the temperatures of the other 24 stations.
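The "how many identical monthly means" check is easy to reproduce. What follows is only a minimal sketch, not NOAA's quality control: it assumes the twelve monthly values have already been extracted as integers in hundredths of a degree Celsius, with -9999 marking a missing month. The function name `most_repeated` is made up for illustration, and the 2014 values are those in the file tail quoted in the comments below, which appears to be the Cork Airport record.

```python
from collections import Counter

MISSING = -9999  # GHCN-M sentinel for a missing monthly value


def most_repeated(values):
    """Return (value, count) for the most frequent non-missing monthly value."""
    counts = Counter(v for v in values if v != MISSING)
    if not counts:
        return None, 0
    return counts.most_common(1)[0]


# 2014 monthly means in hundredths of a degree C, January to December,
# copied from the ghcnm.tavg.v3.3.0.20170205.qcu.dat tail quoted in the comments.
cork_2014 = [560, 1040, 690, 1040, 1040, 1450, 1040, 1040, 1480, 1120, 840, 1040]

value, count = most_repeated(cork_2014)
print(f"{value / 100:.1f} C occurs {count} times")
```

Running the same function over every station-year in the raw file, and keeping those with a count of six or more, would give a list of the kind described above.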

Their quality control procedures allow for manual flagging of erroneous values not caught by their automated procedure. Somehow I think it would have been prudent to take more care having admitted their own MCDW values were wrong, and if necessary resort to manual flagging until the cause of this MCDW problem had been determined and corrected.

END OF UPDATE

[Figure: corkcomposite]

Errors are not confined to 2013 to 2016, and not confined to this one station.

You can easily verify this. The Met Eireann most recent four year monthly data can be found at Monthly Data

The longer (not necessarily full length record however) data can be found at Historical Data  (I’ll return to add navigation advice here when I have completed other sections of this post. Navigation on this section of the Met Eireann site may not be intuitively obvious)

The GHCN-M version used above was ghcnm.tavg.v3.3.0.20161230.qcu.dat (which of course had not had a December 2016 value added, whereas Met Eireann calculates and shows a month-to-date mean, 7.4°C up to December 30th)

Station history

As shown below, the correct 2013 values were shown by GHCN-M for a time in 2013, corrupted for a time later in 2013, and briefly reappeared again in 2014, before settling down again as corrupted values.

Now follow the history of the April 2013 value (7.4°C according to Met Eireann). In the first GHCN-M file below (dated May 19th 2013) it is correctly recorded, and attributed to a CLIMAT report as source (740  P). This value is the most recent value to reach GHCN-M, and in this case the CLIMAT report has been correctly decoded. (I will return to this question of correct or incorrect decoding of CLIMAT reports below.)

By July 9th the still correct value has as data source “received by the UK Met Office” (740  K). This has been the usual change of data source, first CLIMAT report, then the UK Met Office. As seen on May 19th the March 2013 value had already been processed in this way (430  K).

On (or before) the 8th November the data source changed to “Monthly Climatic Data of the World (MCDW) QC completed but value is not yet published” (1040 WC). The value had now become the rogue value 10.4°C. The “W” quality flag indicates “monthly value is duplicated from the previous month, based upon regional and spatial criteria”. My experience of my region would suggest that duplicating the mean temperature of the previous month would very rarely produce a correct estimate for the following month. What “regional and spatial criteria” have required the replacement of a recorded monthly mean by a rogue value?

After that this rogue value has been retained, except for a brief return to the correct value and UK Met Office as source on (and possibly around) June 28th 2014. In mid 2015 the data source changed to “Final (Published) Monthly Climatic Data of the World (MCDW)”.

[Figure: corkcomposite2013]

Each monthly value above is followed by either one or two letters. A single letter, or the second of two letters, gives the data source:

C = Monthly Climatic Data of the World (MCDW) QC completed but value is not yet published
K = received by the UK Met Office
M = Final (Published) Monthly Climatic Data of the World (MCDW)
P = CLIMAT (Data transmitted over the GTS, not yet fully processed for the MCDW)
W = World Weather Records (WWR), 9th series 1991 through 2000

The first letter of two letters is a quality control flag:

S = monthly value has failed spatial consistency check. Any value found to be between 2.5 and 5.0 bi-weight standard deviations from the bi-weight mean, is more closely scrutinized by examining the 5 closest neighbors (not to exceed 500.0 km) and determine their associated distribution of respective z-scores.  At least one of the neighbor stations must have a z score with the same sign as the target and its z-score must be greater than or equal to the z-score listed in column B, where column B is expressed as a function of the target z-score ranges (column A). See GHCN-M README for table.
W = monthly value is duplicated from the previous month, based upon regional and spatial criteria and is only applied from the year 2000 to the present.
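The flag conventions above can be written down as a small lookup. This is a hedged sketch only: it covers just the flags listed in this post (the GHCN-M README defines more), it assumes values are stored in hundredths of a degree Celsius, and the names `SOURCE_FLAGS`, `QC_FLAGS` and `decode` are my own, not part of any NOAA tool.

```python
# Data source flags, as listed in this post (not the complete README list).
SOURCE_FLAGS = {
    "C": "MCDW, QC completed but value is not yet published",
    "K": "received by the UK Met Office",
    "M": "Final (Published) MCDW",
    "P": "CLIMAT (transmitted over the GTS, not yet fully processed for the MCDW)",
    "W": "World Weather Records (WWR), 9th series 1991 through 2000",
}

# Quality control flags, as listed in this post.
QC_FLAGS = {
    "S": "failed spatial consistency check",
    "W": "duplicated from the previous month",
}


def decode(value, flags):
    """Decode e.g. (1040, "WC") into (temperature in C, QC meaning, source meaning).

    A single letter is a source flag; with two letters the first is the
    QC flag and the second is the source flag.
    """
    if len(flags) == 2:
        qc, source = flags[0], flags[1]
    else:
        qc, source = None, flags
    qc_text = QC_FLAGS.get(qc, qc) if qc else None
    return value / 100, qc_text, SOURCE_FLAGS.get(source, source)


# The April 2013 value at three stages of its history, as described above.
for value, flags in [(740, "P"), (740, "K"), (1040, "WC")]:
    print(decode(value, flags))
```

Note that “W” means different things in the two positions: World Weather Records as a source flag, but a duplicated month as a quality control flag.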

CLIMAT reports

to be added

As noted above, the April 2013 value for Cork Airport was initially decoded correctly from a CLIMAT report. But this correct decoding has not always been the case.

[Figure: valentia2012a]

When Valentia Observatory (62103953000) became an AWS station in April 2012, the first April values entering GHCN-M from decoded CLIMAT reports were rogue values.


9 Responses to GHCN-M Raw Data from Ireland

  1. robinedwards36 says:

    What on earth have they been doing? It’s almost beyond belief that an official site should let such nonsense slip through. It undermines their credibility regarding other data that may also contain errors but which are less blatant. Do those in charge really care?
    Have been meaning to contact you, but assorted (medical) problems have intervened, as they often do these days :-((
    I envy you your trip. Hope you witnessed the displays.

    Robin

    • When I first spotted this I complacently assumed that it was a transient issue which would be corrected once the GHCN-M data was finalised. Now I see that this is far from being the case. When I looked at it first I only looked at the four most recent years, those most readily found on the Met Eireann website. I’m not even sure whether the longer historical records were on the site at that time (and the longer historical records on the site are not the complete records, but only go back to a 20th century date, whereas Valentia Observatory for example was set up in 1868, and the GHCN-M record starts from 1869). That station is also of interest when considering GHCN-M pairwise homogenization. A station history has been maintained from the early 1900s, and parallel observations were made at old and new sites when the station was relocated. None of the GHCN-M breakpoints are confirmed by that station history, and a change of 0.3 degrees due to a station relocation, found in the station history, is not found in the adjusted GHCN-M record. An impressive homogenization performance.

      This time I did get to see the aurora (last time I was in northern Lapland the only night they were visible was the one night I did not go out)

  2. Here is the tail of the current unadjusted file, ghcnm.tavg.v3.3.0.20170205.qcu.dat

    2007TAVG 640 M 640 K 680 M 1110 M 1180 M 1370 M 1430 M 1480 M 1390 M 1180 M 860 M 730 M
    2008TAVG 640 M 640 M 630 M 780 M 1210 M 1340 M 1460 M 1460 M 1220 M 930 M 750 M 550 M
    2009TAVG 480 M 520 M 690 M 840 M 1040 M 1380 M 1420 M 1410 M 1290 M 1170 M 740 M 390 M
    2010TAVG 300 M 340 M 570 M 850 M 1060 K 1470 M 1490 M 1440 K 1350 M 1030 M 560 M 170 M
    2011TAVG 410 M 680 M 680 M 1060 M 1070 M 1200 M 1390 M 1340 M 1270 M 1110 M 960 M 1040 M
    2012TAVG 1040 SM 750 M 840 M 710 M 1050 M 1250 M 1040 M 1040 WM 1040 WM 1040 WM 1040 WM 1040 WM
    2013TAVG 1040 WM 510 M 1040 M 1040 WM 1000 M 1350 M 1040 SM 1540 M 1350 M 1200 M 1040 M 1040 WM
    2014TAVG 560 C 1040 SC 690 C 1040 C 1040 WC 1450 C 1040 SC 1040 C 1480 C 1120 C 840 C 1040 SC
    2015TAVG 1040 C 480 C 1040 SC 880 C 1010 C 1300 C-9999 -9999 1250 C 1100 C 950 C 830 C
    2016TAVG 610 C 510 C 610 C 730 C 1190 C 1440 C 1520 C 1500 C 1340 C 1090 C 590 K 740 K

    which agrees with what you’ve shown in the post.

    The duplicate 1040s, starting in 2012, are flagged as ‘W’ which means a duplicate of the previous month. That means that in the corresponding qca file they are replaced by -9999, although the first of each group of 1040s survives.
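    For anyone wanting to reproduce this, here is a rough sketch of reading rows in the whitespace-collapsed form pasted above and blanking the ‘W’-flagged months. It rests on several assumptions: the real .dat files are fixed-width and need a different parser, only ‘W’ is masked because that is the only flag discussed here, and the names `parse_row` and `mask_duplicates` are invented for illustration.

```python
import re

MISSING = -9999
# A value, optional whitespace, then up to two flag letters (QC flag, source flag).
TOKEN = re.compile(r"(-?\d+)\s*([A-Z]{0,2})")


def parse_row(row):
    """Split a whitespace-collapsed TAVG row into (year, [(value, flags), ...])."""
    head, _, rest = row.partition("TAVG")
    year = int(head[-4:])  # works with or without a leading station id
    months = [(int(v), f) for v, f in TOKEN.findall(rest)]
    return year, months


def mask_duplicates(months):
    """Replace values whose QC flag (first of two letters) is 'W' with -9999."""
    return [MISSING if len(f) == 2 and f[0] == "W" else v for v, f in months]


row_2012 = ("2012TAVG 1040 SM 750 M 840 M 710 M 1050 M 1250 M "
            "1040 M 1040 WM 1040 WM 1040 WM 1040 WM 1040 WM")
year, months = parse_row(row_2012)
print(year, mask_duplicates(months))
```

    In the 2012 row the July 1040 carries no ‘W’ flag, so it is left in place while the five months after it are blanked.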

    The M and C flags refer to the data source, which in both cases is “Monthly Climatic Data of the World (MCDW)”. I wonder if it’s possible to check MCDW? Is the error introduced by MCDW or GHCN?

  3. Answering my own question:

    You can find the MCDW data at https://www1.ncdc.noaa.gov/pub/data/mcdw/

    The Jan 2015 data is at https://www1.ncdc.noaa.gov/pub/data/mcdw/mcdw1501.pdf
    which includes the erroneous 10.4 figure.
    In fact all of the data there for Ireland seems suspiciously high for January.

    The signature at the front signing off these documents may cause some amusement.

    • The data flags are part of what I still have to add to the post. MCDW is also a NOAA product, and that is I believe where the corruption arises in the finalised GHCN-M, again something still to be added to the post. Initially the new data I observed earlier being added to GHCN-M by decoding CLIMAT reports (at the time decoding incorrectly); then the data source was flagged as received from the UK Met Office (and that data was correct); finally the data source changed to MCDW and the corrupted data returned. I’ll expand on all this tonight.

  4. Same problem at Shannon airport, where the repeated rogue number is 11.8, and Belmullet, where it’s 11.1. The rogue numbers appear in similar months but not quite identical places.

    621039620002012TAVG 810 M 810 M 780 M 950 M 1200 M 1360 M 1180 M 1180 WM 1180 WM 1180 WM 1180 WM 1180 WM

    621039760002012TAVG 790 M 840 M 960 M 960 WM 1180 M 1280 M 1110 M 1110 WM 1290 M 1110 M 1110 WM 1110 WM

    • My suspicion is that the GHCN and MCDW decoding is based on the unwise assumption that CLIMAT reports will be encoded exactly following the WMO specification (my reference to “child” programmers). Where the WMO issues a “Practical help” which condones departures from that specification, that is a very unwise assumption. Until mid 2015 Met Eireann CLIMAT reports at Ogimet displayed just such departures from specification, as did, and still do, reports from a coterie of other countries. I suspect these rogue numbers may be uninitialised data picked up when these reports are decoded assuming exact compliance with specification. Something similar to a buffer overflow. More on this in the post later.

  5. Pingback: More GHCN errors – thumb on the scale? | Climate Scepticism
