GISTEMP, GHCN and Valentia

Publishing this post now in advance of completion to be able to discuss some aspects with others. Hoping to complete it soon.

Tuesday July 16. Reformatting and exploration of MCDW has now been moved to a separate post.

Other related posts include “Another look at NASA Gistemp”, “KML data for 434 WMO stations”, “Valentia, Met Éireann and NOAA” and “GHCN-M Raw Data from Ireland”.

The first thing to remember is that GISTEMP IS NOT GHCN and GHCN IS NOT GISTEMP. Keeping those distinctions clear can avoid misunderstanding.

The GISS website provides a “one stop shop” for plotting and downloading GHCN and GISTEMP data. This is convenient, but it is important to be clear about what is being done by GHCN (NOAA) and what is being done by GISTEMP (NASA GISS). The data series for the Irish station, Valentia Observatory, illustrates how both GHCN and GISTEMP go wrong. Other problems with GHCN v3 are illustrated by examining other Irish stations.

This post is prompted by a recent blog post at NoTricksZone: Adjusted “Unadjusted” Data: NASA Uses The “Magic Wand Of Fudging”, Produces Warming Where There Never Was, and a related Twitter conversation by Kirye, one of the authors of that NoTricksZone blog post. Pierre L. Gosselin should know better than to sow confusion by writing:

So what to do? Well, it seems that NASA has decided to adjust its “V3 unadjusted datasets” and rename them as “V4 unadjusted”. That’s right, the adjusted data has become the new V4 “unadjusted” data.

Back in 2015, in a comment at NoTricksZone I (and others) already addressed similar confusion regarding GHCN and GISS. Both GHCN and GISTEMP present a flawed data series for Valentia Observatory, but not in the way suggested in the NoTricksZone post.

NASA has NOT “decided to adjust its ‘V3 unadjusted datasets’ and rename them as ‘V4 unadjusted’.” As far back as December 2011 NASA GISS made the decision to use adjusted GHCN v3 data as its input data.

This decision had consequences, discussed below, but clearly GISTEMP v4, introduced more than seven years later in 2019 to follow the introduction of GHCN v4 with its extended inventory, is not the product of a decision to adjust ‘V3 unadjusted datasets’ and rename them as ‘V4 unadjusted’. Such nonsense only serves to deflect attention from the real problems with GHCN and GISTEMP.

Valentia Observatory: brief summary

This is how NASA uses its magic wand of fudging to turn past cooling into (fake) warming.

6 examples of scandalous mischief by NASA

What follows are 6 examples, scattered across the globe and going back decades, which demonstrate this scandalous mischief taking place at NASA.

Valentia Observatory is a rural station with a long record, and one of those six examples. The (unjustified) adjustment of this record is something carried out by GHCN (NOAA), not by GISTEMP (GISS), other than in a passive sense (where NASA starts from the adjustments already made by NOAA but makes no further adjustments of its own). Describing this in terms such as “how NASA uses its magic wand of fudging”, an example of “scandalous mischief by NASA” and “scandalous mischief taking place at NASA” demonstrates a lack of understanding of which agency has carried out the adjustment.

Data Sources

Discussions of temperature homogenisation adjustments rarely seem to discuss the starting point, the raw data. Hamlet without the prince you might say. Quality control procedures may detect and remove invalid raw data, but are not always successful. GHCN raw data comes from different sources.

Irish station data for GHCN v3 initially comes from CLIMAT reports (data transmitted over the GTS, not yet fully processed for the MCDW). It is later replaced by data received by the UK Met Office, then by data from the Monthly Climatic Data of the World (MCDW) for which QC is completed but the value not yet published, and finally by data from the final (published) MCDW. These MCDW stages, however, introduce errors into the Irish station raw data.

Data from many countries, but not all, ends up being taken from MCDW. Australian data for example appears to be taken from Australian sources only, and not taken from MCDW. (I mention this here as Australia is the only country for which I have seen a study comparing the GHCN raw data with that provided by the originating Australian Met Office source — and unlike the Irish case, the Australian raw data in GHCN v3 does not have errors introduced).

MCDW

I have already discussed these MCDW errors in GHCN-M Raw Data from Ireland. Now consider how these have been handled by NOAA in GHCN. Here is that 2017 status note again:

03/11/2017

User feedback indicated a problem with some mean temperature data for select stations in Ireland.  The problems were traced to a particular data source (MCDW), and for the time being until that source is corrected, the data are now being sourced to the UK Met Office “Climat” data (“K” source flag”), which are believed to be the correct values.  The data changeover to the UK Met Office has occurred, but the source flag (“K”) for the corrected values was inadvertently left out.  Those source flags should be added within the next production cycle.

To be continued

Note on Japan Meteorological Agency data

To be added

GHCN Adjustment

First, to deal with a couple of myths regarding GHCN/PHA homogenisation:

Steven Mosher, February 13, 2017

Pha code has been posted for years. Along with the tests showing how it reduces bias.

Not like the old Steven Mosher to accept such claims without checking. Code has been posted for years (since 2012, in fact), but it is not the code used for GHCN. It is code related to those “tests”, by which I presume you mean Benchmarking the performance of pairwise homogenization of surface temperatures in the United States (note: in the United States). It is a specially crafted USHCN PHA version for “Peter’s World” (Peter Thorne, not me), with a FORTRAN bug which makes it a matter of luck whether it will run “out of the box” or not, and it will not replicate GHCN PHA processing for any subversion of GHCN v3 when reconfigured for GHCN processing. Here is an extract from the output, which appears even when reconfigured for GHCN PHA processing.

   ——— Peter”s World (Switchboard) ———-
Number of closest neighbors input:          100
Number of neighbor stations output:           40
nstns buffer limit set to:           60
Correlation Type: 1diff
Minimum desired neighbors per year-mth:            7
SNHT significance threshold:            5
Bayesian penalty function: bic
CONFIRM chgpt hit threshold:            2
Toggle to use SHF metadata:            1
Method to estimate chgpt adjustment: med
Method for estimate filter: conf
Toggle to keep/remove chgpt outliers:            1
Months to avg for adj est(0=no limit):            0
Toggle to remove Non-sig segments:            1
Minimum data values in segment to est adj:           18
Minimum station pairs to est adjustment:            2
Amp vs Loc percent inclusion & index: 92               2

(more detail to be added)

Peter Thorne, February 5, 2017

The GHCN homogenisation algorithm is fully available to the public and bug fixes documented.   …   The land data homogenisation software is publically available (although I understand a refactored and more user friendly version shall appear with GHCNv4) and all known bugs have been identified and their impacts documented.  …  For Karl et al., 2015 the source code and data for much, if not all, of the process is available.

Paul Matthews said…

In this blog post you say that the software is available.
But the link provided is to Fortran code dating from 2012.
This is not the code currently used.

PeterThorne said…

Rgraf and Paul Matthews,  …  However, the PHA code has been refactored in fortran and made considerably more user friendly. My understanding is that this refactored code shall be released in coming weeks or months.

To be continued

GISTEMP Adjustment

To be added


3 Responses to GISTEMP, GHCN and Valentia

  1. Nick Stokes says:

    Peter,
    I noticed that the aberrant readings you noticed in GHCN V3 for Jan 2015 for Irish stations are actually identical to those for May 2009, for which they seem much more appropriate.

    I see too that GHCN V4 seems to have corrected these errors, presumably by going back to the GHCN Daily data. Those corrections seem to be the basis for Kirye’s complaint about data fiddling. They can’t win.

    • Well caught. I had not spotted that coincidence. Checking back, all Irish stations with data in 2016 (and 2019) have identical temperatures in January 2015 (unlikely January values) and May 2009 (likely May values which, while I have still to check them against the original Met Éireann data, I take to be correct unless that check shows otherwise).
      Unfortunately this pattern does not hold for the later corrupted values, but does give me a pattern to explore further. The corrupted values were introduced when NOAA took values from their own MCDW. While they initially took values from Met Éireann CLIMAT messages, and after that from the UK Met Office, this corruption did not arise.
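
      A check of this kind can be automated. The following is only a minimal sketch, assuming the monthly means have already been read into a dictionary keyed by station and (year, month), with values in hundredths of a degree Celsius and -9999 for missing data. The function name, the data layout and the station values are hypothetical and are not taken from GHCN-M.

```python
from itertools import combinations

MISSING = -9999  # assumed sentinel for a missing monthly value


def identical_months(records, min_stations=3):
    """Find pairs of (year, month) with identical values at every shared station.

    records: {station_id: {(year, month): value in hundredths of a degree C}}
    A pair is reported only if at least min_stations stations have valid
    data for both months and each of them shows the same value in both.
    """
    all_months = sorted({ym for series in records.values() for ym in series})
    matches = []
    for a, b in combinations(all_months, 2):
        shared = [
            station
            for station, series in records.items()
            if series.get(a, MISSING) != MISSING
            and series.get(b, MISSING) != MISSING
        ]
        if len(shared) >= min_stations and all(
            records[s][a] == records[s][b] for s in shared
        ):
            matches.append((a, b))
    return matches


# Made-up values for three hypothetical stations, for illustration only.
records = {
    "A": {(2009, 5): 1210, (2015, 1): 1210, (2015, 2): 560},
    "B": {(2009, 5): 1150, (2015, 1): 1150, (2015, 2): 510},
    "C": {(2009, 5): 1300, (2015, 1): 1300, (2015, 2): 600},
}
print(identical_months(records))  # [((2009, 5), (2015, 1))]
```

      Requiring several stations to agree guards against a single station happening to record the same mean in two different months.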

      Here is the “corrected” GHCN-M v3 data for Cork Airport as it appears now. (the final column shows the count of missing values)

      and here is the GHCN-M v3 data for Cork Airport as it appeared at the end of 2016, when I noticed the corruption.

      Note those “1040”s still present, and in particular the absurd 2014 data showing six of the 12 months with the same 10.4°C month mean. This just does not occur at over 50°N! When I brought this corruption of Irish data to the attention of Met Éireann in January 2017, leading to the March 2017 GHCN-M status update shown above in the post, and suggested that they report it to NOAA, I had singled out Cork Airport as the most extreme example, and provided an image example comparing the GHCN-M data for the previous four years with their own (Met Éireann) data. That image even showed the 2014 errors above which NOAA failed to correct.
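
      A simple screen for this kind of absurdity is to count how often the same monthly mean recurs within one year. This is a sketch under stated assumptions, not anything used by NOAA: values are in hundredths of a degree Celsius (so 1040 is 10.4°C), -9999 is assumed to mark missing data, and the function name and threshold are my own. Apart from the six 1040s, the example values are made up.

```python
from collections import Counter

MISSING = -9999  # assumed sentinel for a missing monthly value


def repeated_values(monthly, min_repeats=3):
    """Return {value: count} for values occurring at least min_repeats times.

    monthly: the twelve monthly means for one station-year, in hundredths
    of a degree C. Missing values are ignored.
    """
    counts = Counter(v for v in monthly if v != MISSING)
    return {value: n for value, n in counts.items() if n >= min_repeats}


# Illustrative year with six months at 1040; the other values are invented.
year_2014 = [1040, 620, 1040, 890, 1040, 1040, 1520, 1040, 1310, 1040, 780, 590]
print(repeated_values(year_2014))  # {1040: 6}
```

      A threshold of three is arbitrary. Two identical monthly means in a year can occur naturally, so a lower threshold would flag many valid years.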

      I chose to report the corruption through Met Éireann as experience taught me that a direct report might not lead to action for many years. (When I reported metadata errors in the GHCN-M v2 inventory in 2010 a few were corrected promptly, but others persisted through GHCN-M v3, finally corrected in GHCN-M v4. I had been thanked back in 2010, and told that the other errors might take a little longer to correct. I had not expected the NOAA dictionary to define six or seven years as “a little longer”).

      Had this been a very junior undergraduate asked to correct errors I would have expected that he or she would take care to see that all errors were properly corrected. That a more senior person could fail to see the absurdity of leaving errors such as six identical month means in the same year at northern latitude, or in fact any errors at all when all that was immediately needed was to revert to the data values which they already had in GHCN-M prior to introduction of MCDW values, astounds me, and leaves me with very little confidence that NOAA had grounds to assert confidently that errors were confined to “select stations in Ireland”.

  2. Pingback: MCDW Exploration | Peter O'Neill's Blog
