As completion of this post was long delayed, I’m posting the discussion part separately here as an ‘aside’ post so that it will show up as a new post for anyone who has given up waiting.
The most striking aspect of this Krasnojarsk adjustment study is the variation in the trend imposed on the urban record by the different choices made for adjusting rural stations. This is not unique to Krasnojarsk, but the number of stations being reclassified by the different choices is greater here than for other stations which I have examined. This already suggests to me that it might be worthwhile examining all stations adjusted to see how many stations move in and out of the set of rural adjusting stations for the different choices involved, and to see if these changes are in any way geographically clustered (I expect that US stations, for which the metadata is more reliable, will show few changes).
|Urban/rural classification||Trend (full record)||Trend (final 30 years)|
|Raw (GHCN adjusted: ghcnm.tavg.v126.96.36.19940411.qca.dat)||0.45||3.68|
|Raw (2009: GHCN v2)||0.69||4.57|
|GHCN classification (2009)||0.48||5.35|
|GISS nightlights classification||1.32||4.13|
|My nightlights classification||0.79||5.26|
|(removing GHCN non-rural)||0.37||5.50|
|(removing GHCN non-rural, adding 3)||0.57||5.18|
Changes in the identification of urban stations and in the mix of rural adjusting stations arise in two ways:
- Due to the extension worldwide by GISS of urban/rural classification by examination of nighttime luminance in 2010
- Due to the replacement of suspect latitude/longitude data by corrected values.
These will now be discussed separately.
Extension worldwide of urban/rural classification by examination of nighttime luminance
Hansen et al. (2010) extend the use of satellite-observed nightlights worldwide to identify measurement stations as rural, peri-urban or urban, but fail to discuss the impact of this change on the number of stations classified as rural or urban. The magnitude of this change can be seen in Table 1:
|Stations||STEP2 (input)||Rural||Urban||STEP2 (used)||Rural||Urban|
GISS analysis station classification. Eighty-three stations with records ending before 1880, or marked as strange, are dropped before STEP2, the UHI adjustment step.
The distinction between urban and peri-urban stations is ignored, both being regarded here as urban, as the UHI adjustment described in this paper treats both identically. The data used for analysis here is that for the August 2010 GISS analysis of global surface temperature change, up to and including July 2010.
It seems strange that Hansen et al 2010 makes no reference to the extent of the urban/rural reclassification resulting from their worldwide extension of the use of nighttime luminance, and is content to claim that
Station location in the meteorological data records is provided with a resolution of 0.01 degrees of latitude and longitude, corresponding to a distance of about 1 km.
although Reto Ruedy at least was aware prior to publication that this was not strictly true, belatedly confirming, on August 23rd 2010, coincidentally the same date that Hansen et al 2010 was accepted for publication by Reviews of Geophysics, that
I’m not surprised at all that there are serious mistakes in this inventory file. It has been traditionally treated with less than the proper care; e.g. it took years after I notified them until they fixed the error of systematically dropping the 1000s in all altitudes.
It should be evident that an adjustment procedure for urban stations based on nearby rural stations requires an ability to correctly identify urban and rural stations. Each of the reclassified stations noted above must be incorrectly classified in one or the other of the two classifications, and we do not know which. Other stations which were not reclassified may potentially be incorrectly classified under both classifications. A large number of stations were reclassified, and there appears to be no evidence in Hansen et al 2010 to show that this classification was improved by the worldwide extension of the use of nighttime luminance. Certainly some at least of the stations reclassified were incorrectly reclassified as a result of incorrect latitude and longitude values (see next section). It is possible indeed that the worldwide extension of the use of nighttime luminance could have introduced more classification errors than it corrected. The original classification on the basis of population could possibly be more accurate. We do not know, and this is a question which should have been addressed by the authors, but was not.
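The disagreement between the two classifications can be tallied mechanically. The following is a minimal sketch only, not the GISS code: the station IDs and the direction of each change are taken from the examples in the December 2009 e-mail quoted further down, "U" stands for anything non-rural (small town, peri-urban and urban merged, as elsewhere in this post), and the function name `tally_changes` is purely illustrative.

```python
from collections import Counter

# Labels follow the examples in the December 2009 e-mail quoted in this post.
# "R" = rural, "U" = any non-rural class (an assumption made to keep the sketch simple).
population_class = {
    "62103962000": "R",  # Shannon Airport
    "65103038000": "R",  # Fort William
    "65103038001": "R",  # Ben Nevis
    "62103957000": "R",  # Rosslare
    "61507039001": "U",  # Cherbourg/Chantereyne (GHCN "S", treated as non-rural)
    "61507039002": "U",  # Cherbourg-Maupertus (GHCN "S", treated as non-rural)
    "64308025000": "U",  # Bilbao
    "65103355001": "U",  # York
}
nightlight_class = {
    "62103962000": "U",
    "65103038000": "U",
    "65103038001": "R",
    "62103957000": "U",
    "61507039001": "R",
    "61507039002": "R",
    "64308025000": "R",
    "65103355001": "R",
}


def tally_changes(old, new):
    """Count (old_label, new_label) pairs for stations present in both classifications."""
    counts = Counter()
    for station_id in old.keys() & new.keys():
        counts[(old[station_id], new[station_id])] += 1
    return counts


changes = tally_changes(population_class, nightlight_class)
for (before, after), n in sorted(changes.items()):
    print(f"{before} -> {after}: {n}")
```

Run over the full inventory, the off-diagonal counts (R -> U and U -> R) give the number of stations that must be wrong under one classification or the other; the tally alone cannot say which.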
This is also a question which I had raised with Reto Ruedy in December 2009, before the implementation of the change (I had spotted a trial run on the GISS FTP site, a transparency which I regret to say has since been reversed), and before the appearance of the first draft of Hansen et al 2010 with an invitation for comments. It may be worth showing that initial December 2009 e-mail on this topic here to show that this was no mere incidental comment, but explicitly drew attention to the possible results of using the GHCN metadata for station classification:
I have some comments on efforts to use the Global Lights for urban/rural classification, and the quality of the data you are trying to use for this. I have looked at the data for a few Irish locations, and a few European locations I would be familiar with (I use location rather than station here as the latitude/longitude values are so coarse that it makes little sense to talk of stations if these values are being used to locate lighting for these locations. For example the two Cherbourg stations:
61507039001 CHERBOURG/CHANTEREYNE FRAN 49.70 -1.60 12 8S 31HIxxCO 1x-9COASTAL EDGES B 7
61507039002 CHERBOURG-MAUPERTUS 49.70 -1.50 139 12S 31HIxxCO 2A 5WATER A 0
Are both well out to sea based on these values, rather than at the port and the airport.
So here are some of the classification issues I have noted for a small subset of locations, and I am rather sure that such issues will arise again and again elsewhere.
Shannon Airport ceases to be classified as rural when lights are used, which is as it should be, considering that this is an airport which has had considerable industry located nearby for many years. Lights classification makes sense here.
62103962000 SHANNON AIRPO 52.70 -8.92 20 14R -9FLxxCO 1A-9WARM CROPS B 12
Fort William (Scotland) changes from rural to urban, while Ben Nevis remains rural. At first sight this might seem to make sense, until you notice that the GHCN data for these stations only covers the period from 1884-1903, at which time Fort William was fairly certainly still rural. Lights classification fails. (These two stations are of interest as they are within 7 km of each other, but vertically separated by more than 1300 m, and provided early reliable comparison of conditions at altitude with those at sea level. But I wonder why they came to be included in the GHCN data, particularly in view of the otherwise surprisingly poor Scottish representation.)
65103038000 FORT WILLIAM 56.83 -5.10 20 229R -9MVxxCO 1x-9WARM GRASS/SHRUBC 13
65103038001 BEN NEVIS UK 56.80 -5.10 1343 229R -9MVxxCO 1x-9WARM GRASS/SHRUBB 0
I’ve travelled to France a few times on the ferry from Rosslare to Cherbourg. Using lights, Rosslare changes from rural to urban, while both Cherbourg stations change from urban to rural. Compare the surroundings on the maps pasted below.
62103957000 ROSSLARE 52.25 -6.33 25 2R -9FLxxCO 1x-9WATER B 11
Bilbao changing from urban to rural is a surprise
64308025000 BILBAO SPAIN 43.30 -2.80 16 378U 450HIxxCO 6A 2WARM DECIDUOUS C 10
As is York
65103355001 YORK UK 53.90 -1.10 -999 22U 102FLxxno-9x-9HEATHS, MOORS A 7
It looks to me as if changing to global lights is likely to introduce as many new misclassified stations as it corrects, and that the only way this classification can really be improved is by actually examining each location, painful as that exercise may be.
I’ve added an obvious extra line to PApars.GHCN.CL.1000.20.log which you might find useful when investigating this classification issue:
all 3515 0.826953342816498 0.125331076856742
urb warm 1939 5.09911191335741 0.75478098233507
urb cool 1557 4.48326075786769 0.701657234759009
and also suggested summarising the urban cooling effect in their output as well as the urban warming effect already summarised (bold and red in the original e-mail – I already suspected that the classification of rural/urban stations might also affect the balance between urban warming and urban cooling effects). Hansen et al 2010 suggests/notes that:
The urban influence on long‐term global temperature change is generally found to be small. It is possible that the overall small urban effect is, in part, a consequence of partial cancellation of urban warming and urban cooling effects.
It seems premature to draw the conclusion that “The urban influence on long‐term global temperature change is generally found to be small” before a reliable classification of urban/rural has been established. The overall small urban effect may indeed, in part, be a consequence of partial cancellation of urban warming and urban cooling effects, and as shown by my added line above these two effects are quite similar in size, and do indeed partly cancel each other. Is this still the case if latitude/longitude errors are corrected? Lacking a complete set of corrected latitude/longitude values I cannot offer a definitive answer to this question. I have so far collected “corrected” values for about half the station inventory (“corrected” in the sense that the sources of these corrections may themselves contain some errors, but mostly come from the meteorological services operating the stations in question, rather than some third party). In the context of GHCN data this of course may introduce a new error: the most recent location known to the owning meteorological service may not be the location corresponding to the most recent GHCN data where, for whatever reason, GHCN no longer collects current data. There may have been a subsequent station relocation. There may also be other issues in relation to GHCN data collection, which I will describe soon in another post in the specific context of Irish stations. As it is rather unlikely that there has been any conspiracy to distort the Irish record, I would suspect that similar problems can be found if other countries are examined. Being Irish, I happen to have the corresponding data from the Irish meteorological service to hand for comparison purposes, and have from time to time made such comparisons. I suggest that you make a similar comparison for your own country, and this may well show similar problems.
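The partial cancellation can be checked directly from the three log lines quoted in the e-mail above. The sketch below rests on an assumption about the log layout that is not documented in this post: that the first number on each line is a station count and the second is the mean magnitude of the adjustment for that group. Under that reading the "all" line is reproduced by subtracting the cooling total from the warming total.

```python
# Values copied from the log lines quoted in the e-mail above.
# ASSUMPTION: first field = station count, second field = mean adjustment magnitude.
n_all, mean_all = 3515, 0.826953342816498
n_warm, mean_warm = 1939, 5.09911191335741
n_cool, mean_cool = 1557, 4.48326075786769

warm_total = n_warm * mean_warm   # summed urban warming adjustments
cool_total = n_cool * mean_cool   # summed urban cooling adjustments (as a magnitude)

# Net effect averaged over all stations; this should match the "all" line.
net_mean = (warm_total - cool_total) / n_all

# Share of the gross adjustment that cancels when warming and cooling are combined.
cancelled_fraction = 1.0 - (warm_total - cool_total) / (warm_total + cool_total)

# Stations counted in "all" but in neither the warming nor the cooling group.
n_neither = n_all - n_warm - n_cool

print(f"net mean over all stations: {net_mean:.6f} (log reports {mean_all:.6f})")
print(f"fraction of gross adjustment cancelled: {cancelled_fraction:.2f}")
print(f"stations in neither group: {n_neither}")
```

If the assumed reading of the columns is right, more than four fifths of the gross adjustment cancels, so the small net figure is the difference of two much larger numbers and is correspondingly sensitive to stations moving between the warming and cooling groups.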
When the change to worldwide use of nighttime luminance went online I followed up with a further list of substantial errors which could be found with just a few minutes work, and, in April 2010, in response to James Hansen’s invitation for comments on the draft Hansen et al 2010, I sent similar comments directly to James Hansen, including the comment:
As it is highly unlikely that I found all the gross location errors in v2.inv in just a few minutes, or that the relatively few airports I viewed in Google Earth just happened to include all those with incorrect coordinates, these and all other such errors in v2.inv need to be corrected before nightlights can provide the objective approach you suggest.
receiving the response “Peter, thanks much — will look into this soon. Jim”.
As Hansen et al 2010 still appeared with the misleading claim that “Station location in the meteorological data records is provided with a resolution of 0.01 degrees” (misleading in that the resolution provided is indeed 0.01 degrees, but this is an unreal resolution, divorced from the actual accuracy of the data), I find it difficult to reach any other conclusion than that this is an issue which GISS would prefer not to address. Accordingly, while I have continued to accumulate further corrected data, I have refrained from disturbing GISS with unwelcome corrections.
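The gap between nominal resolution and real precision is easy to put in kilometres. The following is a rough sketch assuming a spherical Earth of radius 6371 km; the function name `grid_step_km` is illustrative. It compares the stated 0.01 degree resolution with a 0.1 degree step, which is what the Cherbourg entries quoted above (49.70, -1.60 and 49.70, -1.50) appear to have been rounded to.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius, spherical approximation


def grid_step_km(step_deg, lat_deg):
    """Return (north-south, east-west) ground distance in km for one coordinate step."""
    km_per_degree = math.pi * EARTH_RADIUS_KM / 180.0
    north_south = step_deg * km_per_degree
    # Meridians converge towards the poles, so the east-west step shrinks with cos(latitude).
    east_west = north_south * math.cos(math.radians(lat_deg))
    return north_south, east_west


for step in (0.01, 0.1):
    ns, ew = grid_step_km(step, 49.7)  # latitude of the Cherbourg entries
    print(f"step {step:>4} deg at 49.7N: {ns:5.2f} km north-south, {ew:5.2f} km east-west")
```

A step of 0.01 degrees does correspond to roughly 1 km, as the paper says, but a coordinate that is really only good to 0.1 degrees leaves the station anywhere within a cell several kilometres across, which is ample to move it from a lit area to a dark one or the reverse.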
However strange it may seem that GISS failed to address this issue, it should seem even stranger that no peer reviewer thought fit to demand that this omission be addressed. I am however less than impressed by peer review at AGU journals, with the honourable exception of Water Resources Research, at least during the period when I was a regular reader, or indeed by the AGU itself. I doubt for example that Michael Mann’s two smoothing papers, [2004 and 2008, another blog post to justify this negative comment following as you may suspect] would have passed peer review in any statistical journal. I have yet to see a satisfactory account of the provenance of the forged “Heartland” document circulated by Peter Gleick, then chair of the AGU Ethics committee, textual analysis of which led to his outing as the identity thief involved, and so am less than impressed by his subsequent rehabilitation. I am of course open to consideration of any serious explanation of the provenance of that document, but please refrain from linking to any material from PR outfits with dubious financial backing.
One of the positive advantages of writing a blog post is that wandering into anecdote is permitted. I can thank the shortcomings of an article in a geophysical journal for my first job offer after graduation (unfortunately, 47 years later, I cannot recall which – if calculation of geophysical resistivity curves by a Mexican and US author pair, circa 1965, rings a bell, please post below. It may be totally irrelevant here, and now, but it would be nice to be reminded of the exact reference). As a trainee/stagiaire/Praktikant (choose one – it was a non English speaking environment) I was given the supplementary information for the article to work with. This was unfortunately typeset rather than offset printed, which at that time would have been the sensible choice, and may well have been produced by the authors’ university rather than by the journal. In any case, the tabulated resistivity curves were riddled with typesetting errors. Even at that time my training was first to examine that data for fitness for purpose rather than simply thinking that fitness for purpose was a matter for someone else. My objection to using the data as published was endorsed further up the company hierarchy, and I was assigned the task of first eliminating the many typesetting errors, replacing them by interpolated values (time constraints and the computer resources then available ruled out replication of the original calculations). That attitude, and another similar detection of faulty data before further use, played, I was told, a significant part in the decision to offer me a permanent post shortly afterwards, a month after I started work as a summer trainee. I regret to say that a willingness to use data without questioning the fitness for purpose of that data does not seem to be confined to this specific use of GHCN metadata by GISS – it has also appeared elsewhere in the literature I have examined. It should also be noted that fitness for purpose of course depends on purpose.
The metadata collected by GHCN for the purposes of GHCN may be fully fit for those purposes. It is when it is used elsewhere for purposes which were not initially envisaged by the collecting agency that problems may arise. The location errors in the GHCN metadata may not matter for the use of that data by GHCN (or may matter – this is not a question I have examined), but it certainly matters when GISS uses that same data for urban/rural classification. As GHCN had not collected that data for that purpose, the onus is on GISS to ensure that the data is fit for this purpose, and to correct it where necessary. The excuse that “Unfortunately, we don’t have the manpower to check out all entries of that file” does not impress.
Replacement of suspect latitude/longitude data by corrected values
In the last section I considered the effect of the extension of the urban/rural reclassification resulting from the worldwide extension of the use of nighttime luminance, and touched on the question of the validity of the latitude/longitude metadata provided by GHCN for this purpose. While GISS uses the excuse that “Unfortunately, we don’t have the manpower to check out all entries of that file” to avoid any checking of the entries in that file, it should not be beyond even the “limited” manpower of GISS to make a start on that process. Roughly half the stations in the GHCN inventory are also WMO stations. The WMO is collecting higher precision location metadata for these stations. See WMO programme. Not all stations have revised coordinates yet, and there have been errors which I have spotted, generally an offset of one or more degrees. Those which I have notified to the WMO have been promptly corrected (US agencies “please note”). Errors in the minutes or seconds are obviously also possible, but harder to detect. On one occasion the WMO published updates on their website with a number of coordinates such as “12 02 88E” and “56 29 60N”, suggesting an automatic process with inadequate validity checking (to spare the blushes of the national meteorological agencies involved I have chosen latitude and longitude values here from different countries). Again, the erroneous values were immediately corrected by the WMO when I brought these errors to their attention. The fact that such erroneous coordinates entered the system at all points to weaknesses at some point(s) in the data collection process.
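The kind of validity check that would have caught values such as “12 02 88E” and “56 29 60N” takes only a few lines. This is a minimal sketch, assuming the coordinates arrive as "degrees minutes seconds" followed by a hemisphere letter, exactly as printed above; the real WMO file layout may differ, and the function name `is_valid_dms` is illustrative.

```python
import re

# ASSUMED layout: "D[DD] MM SS" followed immediately by one of N, S, E, W.
DMS_PATTERN = re.compile(r"^(\d{1,3}) (\d{2}) (\d{2})([NSEW])$")


def is_valid_dms(text):
    """Return True if text is a well-formed degrees/minutes/seconds coordinate."""
    match = DMS_PATTERN.match(text.strip())
    if match is None:
        return False
    degrees, minutes, seconds = (int(match.group(i)) for i in (1, 2, 3))
    hemisphere = match.group(4)
    # Minutes and seconds run from 0 to 59.
    if minutes >= 60 or seconds >= 60:
        return False
    # Latitude is limited to 90 degrees, longitude to 180 degrees.
    max_degrees = 90 if hemisphere in "NS" else 180
    if degrees > max_degrees:
        return False
    # Exactly at the limit, no minutes or seconds may be added.
    if degrees == max_degrees and (minutes or seconds):
        return False
    return True


for sample in ("12 02 88E", "56 29 60N", "56 29 59N"):
    print(sample, "valid" if is_valid_dms(sample) else "INVALID")
```

A range check of this sort only catches impossible values. An offset of a whole degree, or a plausible but wrong minutes value, passes untouched and needs comparison against an independent source.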
The second obvious tool available for location checking is visual checking in Google Earth or other similar mapping program. Land based stations located out at sea, or airport stations located distant from the relevant airport, are obvious candidates for correction. This approach is obviously more manpower intensive.
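Part of the visual check can be narrowed down beforehand by computing how far each inventory position lies from an independent reference position, and only inspecting the stations that disagree. The sketch below uses the standard haversine great-circle formula on a spherical Earth; the station names, coordinates and the 5 km threshold are all hypothetical, chosen only to show the mechanics.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = math.sin(d_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def flag_suspect(inventory, reference, threshold_km=5.0):
    """List (station, distance_km) where inventory and reference positions disagree."""
    flagged = []
    for station, (lat, lon) in inventory.items():
        if station in reference:
            distance = haversine_km(lat, lon, *reference[station])
            if distance > threshold_km:
                flagged.append((station, round(distance, 1)))
    # Largest discrepancies first, so the worst cases are inspected first.
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# HYPOTHETICAL data: STATION_B carries a one-degree latitude offset.
inventory = {"STATION_A": (10.00, 20.00), "STATION_B": (10.00, 20.00)}
reference = {"STATION_A": (10.00, 20.00), "STATION_B": (11.00, 20.00)}

suspects = flag_suspect(inventory, reference)
print(suspects)
```

This does not replace looking at the map, since the reference positions may themselves be wrong, but it puts the stations most worth the manual effort at the top of the list.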
Earlier posts on this blog have documented such location errors, and I do not propose to cover the same ground here. If my “corrected” coordinates, for somewhere between a quarter (if only stations for which higher precision coordinates have already been submitted to the WMO are considered) and a half (if all WMO coordinates are used to replace GHCN coordinates even if only lower precision coordinates are available) of the stations in the GHCN inventory are correct, a revised Gistemp analysis would suggest that the revised UHI contribution may potentially double the contribution claimed by GISS. Even if some of these revised coordinates are erroneous, this would still indicate that the sensitivity of this urban contribution to station location changes needs to be considered.
As almost all the station locations corrected are also WMO stations it is possible, if unlikely, that the quality of location metadata for the non-WMO stations is superior to that for the WMO stations, or that corrections to the metadata for these stations may affect the urban contribution differently, possibly even cancelling the increased contribution noted for corrected WMO stations. For now, until sufficient corrected location metadata becomes available, it is only possible to suggest that the urban contribution may be underestimated, and that there is an obvious need for accurate metadata.
Hansen et al 2010 may report that “The effect of urban adjustment on global temperature change is only of the order of 0.01°C for either night light or population adjustment”, but this may also result from adjustments which do not reflect the true urban/rural classification of the stations used.
Writing this post has suggested further aspects for investigation, which may lead to future posts, either specifically related to these Siberian stations, or more widely based.
A final point which I will shoehorn into this post is that the night light satellite data obtained from Marc L. Imhoff and used by GISS seems to be a version which I understand is deprecated. I have used the F16 radiance calibrated image instead. If radiance contours for the night light image used by GISS are examined some very improbable artifacts will be found, leading to dubious classification of some stations.