GISTEMP Urban Station Adjustment (post 4 of 6 in GISTEMP example)

The last post discussed the combination of rural station records close (within 500 km) to NAR’JAN-MAR, our urban station to be adjusted. This post shows how that combined rural record is used to adjust the urban record.

The combined rural record is shown in the first four columns of Table 1 below, starting from 1927, the first year in the urban record, with the urban temperature anomaly series in the fifth column. The “tail” years in the combined rural record, although disregarded for least squares line fitting, are still checked when building the data for the least squares fit, and so have been included in the table, but distinguished by a coloured cell background. Also, the weight (WT[IY]) and station count (IWT[IY]) have been included below, but de-emphasised using grey text, as a reminder that these final weights and counts play no further role. The least squares fit is not a weighted least squares fit. Should it be weighted? Removing the “head” and “tail” years with fewer than three contributing stations does not mean that there will be no years, or even runs of years, with only one or two contributing stations within the remaining data.
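The point about “head” and “tail” years can be illustrated with a short sketch. This is Python written for this post, not the GISTEMP Fortran: the function name trim_head_tail and its min_stations parameter are hypothetical, and the station counts are a made-up series rather than the IWT column of Table 1.

```python
def trim_head_tail(counts, min_stations=3):
    """Return (first, last) indices of the span left after dropping leading
    and trailing years with fewer than `min_stations` contributing stations.
    Returns None if no year qualifies. Interior years are never dropped."""
    good = [i for i, n in enumerate(counts) if n >= min_stations]
    if not good:
        return None
    return good[0], good[-1]


# Made-up station counts, one per year (not taken from Table 1).
counts = [1, 2, 4, 5, 2, 4, 4, 1, 4, 2, 2, 1]
first, last = trim_head_tail(counts)
kept = counts[first:last + 1]
print(first, last, kept)
```

In this made-up series the retained span still contains years with only one or two stations, which is exactly the situation the question about weighting is concerned with.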

For the least squares fit, AVG[IY] – URB[IY] gives the value of F[NXY], the predicted variable, and YR[NXY] – 1950, rather than the year itself, is used as the value of X[NXY], the predictor variable.
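As a small check on this definition, the first three rows of Table 1 can be reproduced as follows. This is an illustrative Python sketch only; the helper name least_squares_data and its base_year parameter are assumptions made for the example, not names from the GISTEMP code.

```python
# First three rows of Table 1: (year, AVG[IY], URB[IY]), anomalies in 0.1°C.
rows = [
    (1927, -4.38560, -6.3),
    (1928, -5.60458, -4.8),
    (1929, -7.64562, -7.9),
]


def least_squares_data(rows, base_year=1950):
    """Return parallel lists X (year - base_year) and F (rural - urban)."""
    xs = [year - base_year for year, _, _ in rows]
    fs = [avg - urb for _, avg, urb in rows]
    return xs, fs


xs, fs = least_squares_data(rows)
for x, f in zip(xs, fs):
    print(x, round(f, 5))
```

The printed values should agree with the X[NXY] and F[NXY] columns of Table 1 for 1927 to 1929.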

The least squares fit is not a simple straight line fit (although a simple straight line is also fitted). Subroutines GETFIT and TREND2 look for a “broken line” fit: two straight line segments meeting at a “knee”. The broken line is fitted for each candidate knee year in turn (the knee is not permitted to occur in the first or last five years of the data series), and the year giving the minimum RSS (residual sum of squares), listed in the RMS column of Table 2 below, is selected. The knee is located at 1970, highlighted in the table, with slope 0.069 before the knee and -0.009 after the knee. (The slope of the simple straight line is 0.051.)
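The knee search can be sketched in Python as below. This is not a translation of GETFIT or TREND2: the function names solve3, fit_broken_line and find_knee are hypothetical, the fit is done through the normal equations for the three unknowns (value at the knee and the two slopes), and the data is a synthetic, noise-free series with a knee placed at X = 20 rather than the Table 1 values.

```python
def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            factor = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= factor * m[col][c]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, 3))
        x[r] = (m[r][3] - tail) / m[r][r]
    return x


def fit_broken_line(xs, fs, knee):
    """Unweighted least squares fit of two segments meeting at x = knee.

    Returns (value_at_knee, slope_before, slope_after, rss)."""
    basis = [(1.0, min(x - knee, 0.0), max(x - knee, 0.0)) for x in xs]
    ata = [[sum(b[i] * b[j] for b in basis) for j in range(3)] for i in range(3)]
    atf = [sum(b[i] * f for b, f in zip(basis, fs)) for i in range(3)]
    c0, s1, s2 = solve3(ata, atf)
    rss = sum((f - (c0 + s1 * b[1] + s2 * b[2])) ** 2 for b, f in zip(basis, fs))
    return c0, s1, s2, rss


def find_knee(xs, fs, margin=5):
    """Fit a broken line at every permitted knee and keep the smallest RSS."""
    best_knee, best_fit = None, None
    for knee in xs[margin:len(xs) - margin]:
        fit = fit_broken_line(xs, fs, knee)
        if best_fit is None or fit[3] < best_fit[3]:
            best_knee, best_fit = knee, fit
    return best_knee, best_fit


# Synthetic, noise-free series (not the Table 1 data): knee at x = 20.
xs = list(range(-23, 40))
fs = [3.0 + 0.069 * min(x - 20, 0) - 0.009 * max(x - 20, 0) for x in xs]
knee, (c0, s1, s2, rss) = find_knee(xs, fs)
print(knee, round(s1, 3), round(s2, 3))
```

With 63 points running from X = -23 to X = 39, the slice used in find_knee gives candidate knees from 1932 to 1984, the same range as in Table 2. Feeding in the F[NXY] column of Table 1 instead of the synthetic series would be the natural next step, but whether the RSS values then match Table 2 digit for digit has not been checked here.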

Before adjustment, a flag value (iflag) is calculated in flags.f by applying various tests to the coefficients of the fitted broken line. If this flag has a value other than 0 or 100, the broken line is replaced by the simple straight line which was fitted. Here, the flag has value 0, so we proceed with the broken line in subroutine adj.

(text continues after Table 2)

Table 1: gathering data for least squares fit

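Columns: rural (combined) IY, AVG[IY], WT[IY], IWT[IY]; urban URB[IY]; least squares data NXY, F[NXY], YR[NXY], X[NXY]. Temperature anomalies are in units of 0.1°C.
IY AVG[IY] WT[IY] IWT[IY] URB[IY] NXY F[NXY] YR[NXY] X[NXY]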
48 -4.38560 1.10933 4 -6.3 1 1.91440 1927 -23
49 -5.60458 1.10933 4 -4.8 2 -0.80458 1928 -22
50 -7.64562 1.10933 4 -7.9 3 0.25438 1929 -21
51 -3.75131 1.10933 4 -6.0 4 2.24869 1930 -20
52 8.88734 1.10933 4 12.0 5 -3.11266 1931 -19
53 4.79684 1.55001 5 4.3 6 0.49684 1932 -18
54 -6.73714 1.10933 4 -12.9 7 6.16286 1933 -17
55 8.24160 1.55001 5 11.6 8 -3.35840 1934 -16
56 11.11399 1.55001 5 12.9 9 -1.78601 1935 -15
57 11.29227 1.55001 5 13.0 10 -1.70773 1936 -14
58 24.74355 1.55001 5 27.1 11 -2.35645 1937 -13
59 14.80244 1.55001 5 16.0 12 -1.19756 1938 -12
60 -2.22015 1.55001 5 1.2 13 -3.42015 1939 -11
61 -1.15643 1.55001 5 0.4 14 -1.55643 1940 -10
62 -23.11685 1.55001 5 -25.2 15 2.08315 1941 -9
63 -16.77739 1.55001 5 -16.4 16 -0.37739 1942 -8
64 24.11956 1.55001 5 23.4 17 0.71956 1943 -7
65 20.23912 1.55001 5 20.3 18 -0.06088 1944 -6
66 7.03154 1.55001 5 0.2 19 6.83154 1945 -5
67 -11.86949 1.55001 5 -16.4 20 4.53051 1946 -4
68 2.45577 1.55001 5 0.9 21 1.55577 1947 -3
69 11.46470 1.38867 4 9.7 22 1.76470 1948 -2
70 -0.93749 1.55001 5 -4.3 23 3.36251 1949 -1
71 7.65636 1.55001 5 4.3 24 3.35636 1950 0
72 5.13485 1.32520 4 8.0 25 -2.86515 1951 1
73 1.37486 1.3252 4 1.5 26 -0.12514 1952 2
74 -0.65729 1.3252 4 -1.4 27 0.74271 1953 3
75 19.56510 1.32520 4 24.7 28 -5.13490 1954 4
76 3.04861 1.3252 4 0.2 29 2.84861 1955 5
77 -16.04342 1.3252 4 -15.7 30 -0.34342 1956 6
78 6.57394 1.3252 4 7.1 31 -0.52606 1957 7
79 -10.69421 1.3252 4 -16.1 32 5.40579 1958 8
80 0.17612 1.3252 4 0.9 33 -0.72388 1959 9
81 -11.56438 1.3252 4 -11.0 34 -0.56438 1960 10
82 18.40791 1.3252 4 19.6 35 -1.19209 1961 11
83 19.63526 1.3252 4 18.0 36 1.63526 1962 12
84 -8.26412 1.3252 4 -11.9 37 3.63588 1963 13
85 -17.11237 1.3252 4 -27.1 38 9.98763 1964 14
86 -9.59998 1.3252 4 -10.9 39 1.30002 1965 15
87 -17.10145 1.3252 4 -22.0 40 4.89855 1966 16
88 20.43326 1.3252 4 22.2 41 -1.76674 1967 17
89 -17.84025 1.3252 4 -21.4 42 3.55975 1968 18
90 -29.94853 1.3252 4 -32.4 43 2.45147 1969 19
91 -1.61561 1.3252 4 -4.3 44 2.68439 1970 20
92 -13.10021 1.3252 4 -15.9 45 2.79979 1971 21
93 -4.53604 1.3252 4 -7.0 46 2.46396 1972 22
94 5.79313 1.3252 4 5.0 47 0.79313 1973 23
95 4.51698 1.3252 4 0.1 48 4.41698 1974 24
96 11.96291 1.3252 4 10.2 49 1.76291 1975 25
97 3.78195 1.3252 4 5.0 50 -1.21805 1976 26
98 6.7072 1.32520 4 3.9 51 2.80720 1977 27
99 -10.41259 1.3252 4 -14.8 52 4.38741 1978 28
100 -21.46030 1.3252 4 -24.6 53 3.13970 1979 29
101 -4.03497 1.3252 4 -5.3 54 1.26503 1980 30
102 18.69017 1.3252 4 14.5 55 4.19017 1981 31
103 -0.96253 1.3252 4 -3.6 56 2.63747 1982 32
104 10.71142 1.3252 4 7.1 57 3.61142 1983 33
105 9.61215 1.3252 4 12.3 58 -2.68785 1984 34
106 -13.18452 1.3252 4 -14.4 59 1.21548 1985 35
107 -5.29707 1.3252 4 -9.4 60 4.10293 1986 36
108 -16.05559 1.3252 4 -20.7 61 4.64441 1987 37
109 4.20952 1.3252 4 -0.7 62 4.90952 1988 38
110 18.64379 1.3252 4 20.1 63 -1.45621 1989 39
“Tail” years, based on fewer than three stations, which are dropped before the least squares fit:
111 3.22453 0.60203 2 3.6 64 -0.37547 1990 40
112 15.55083 0.60203 2 12.7 65 2.85083 1991 41
113 -22.30930 0.60203 2 -18.8 66 -3.50930 1992 42
114 20.16842 0.60203 2 19.5 67 0.66842 1993 43
115 -2.42933 0.16135 1 -4.4 68 1.97067 1994 44
116 32.82268 0.60203 2 25.1 69 7.72268 1995 45
117 12.90292 0.60203 2 10.7 70 2.20292 1996 46
118 -6.98870 0.60203 2 -11.0 71 4.01130 1997 47
119 -30.60344 0.60203 2 -32.2 72 1.59656 1998 48
120 -20.32622 0.60203 2 -22.2 73 1.87378 1999 49
121 21.30125 0.60203 2 21.1 74 0.20125 2000 50
122 0.90410 0.60203 2 -9.3 75 10.20410 2001 51
123 -1.72933 0.16135 1 -11.0 76 9.27067 2002 52
124 9.17067 0.16135 1 6.3 77 2.87067 2003 53
125 15.97067 0.16135 1 9.6 78 6.37067 2004 54
126 15.47067 0.16135 1 21.5 79 -6.02933 2005 55
127 2.74238 0.6679 2 -1.6 80 4.34238 2006 56
128 23.64065 0.6679 2 24.5 81 -0.85935 2007 57
129 14.77067 0.16135 1 18.1 82 -3.32933 2008 58
130 13.00302 0.6679 2 8.1 83 4.90302 2009 59

Table 2: search for knee year

Year Least squares: RMS*
1927 * broken line fitting (knee not permitted in the first five years)
1928
1929
1930
1931
1932 457.62230
1933 458.08158
1934 456.21843
1935 455.97691
1936 456.41962
1937 457.22522
1938 458.28603
1939 459.18345
1940 460.02514
1941 460.41520
1942 460.38794
1943 460.10739
1944 459.65298
1945 459.01179
1946 458.84310
1947 458.99310
1948 459.17391
1949 459.38541
1950 459.69697
1951 460.03503
1952 460.15343
1953 460.21602
1954 460.25971
1955 460.13407
1956 460.03592
1957 459.85736
1958 459.55047
1959 459.41226
1960 459.12411
1961 458.63830
1962 457.81005
1963 456.78845
1964 455.79265
1965 455.74040
1966 455.60129
1967 455.85482
1968 455.59056
1969 455.52538
1970 455.50816
1971 455.57617
1972 455.75285
1973 455.98654
1974 455.97783
1975 456.37439
1976 456.70142
1977 456.31340
1978 455.92857
1979 456.00261
1980 456.30756
1981 456.26698
1982 456.89054
1983 457.73444
1984 459.18667
1985 * (knee not permitted in the last five years)
1986
1987
1988
1989

The knee is located at 1970, highlighted in the table, with slope 0.069 to the left of the knee and -0.009 to the right of the knee (in units of 0.1°C per year). How do we interpret these slopes? A positive slope, such as the 0.069 before the knee, represents a “negative” UHI adjustment, boosting any temperature trend by cooling past temperature values, while a negative slope, such as the -0.009 after the knee, represents a “conventional” UHI adjustment, reducing any temperature trend by warming past temperature values. In the case of NAR’JAN-MAR these slopes lead to small adjustments. The earliest temperature values are reduced by 0.3°C, and the adjustment has shrunk to zero before the knee at 1970 is reached. As temperatures are rounded to 0.1°C, the small negative slope is insufficient to warm any past values.

Figure: Least squares fit to rural/urban difference (anomalies in 0.1°C)

The slopes alone are not sufficient to completely specify the fitted least squares broken line, and of course the remaining equation coefficients are also calculated and returned. But for adjustment purposes only the slopes and the location of the knee are used. This is because only past temperature values are adjusted: the fitted broken line is in effect shifted so that the adjustment made to the most recent year will be zero.

While this alignment of all adjustments to be zero in the last year to which the adjustment is applied may seem to be “rewriting past history”, it has the advantage of leaving current adjusted and unadjusted values in agreement. When combining stations to produce gridded data this is taken into account (a future worked example post to show the gridded data calculations in Gistemp STEP3 is planned).

The adjustment is calculated on a year-by-year basis (for NAR’JAN-MAR the adjustment is -0.3°C from 1927 to 1931, -0.2°C from 1932 to 1945, -0.1°C from 1946 to 1960, and zero from 1961 on). The adjustment is calculated as an integer value, in units of 0.1°C. This adjustment is applied on a month-by-month basis to the unadjusted station data (temperature, not anomaly), and the result is saved as an integer value, again in units of 0.1°C. The year-by-year calculation of the adjustment, which is then applied to monthly data, and the rounded integer calculations give rise to the distinctive stepped appearance of the adjustment and of the difference between adjusted and unadjusted data whenever these are plotted.
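The stepped adjustment can be reproduced from the knee and the two slopes alone. The Python sketch below is an illustration, not the GISTEMP code: it uses the rounded slopes quoted in this post, assumes the broken line is shifted to be zero in 1989 (the last year of the fitted data), and uses Python's round, which may treat exact halves differently from the Fortran (no exact halves arise here). The names yearly_adjustment and adjust_month are hypothetical.

```python
from itertools import groupby

KNEE, FIRST, LAST = 1970, 1927, 1989       # years quoted in the post
SLOPE_BEFORE, SLOPE_AFTER = 0.069, -0.009  # rounded slopes, 0.1°C per year


def yearly_adjustment(year):
    """Integer adjustment (units of 0.1°C) for one year.

    The broken line is shifted so that it is zero in LAST, which is an
    assumption about where the alignment is made."""
    def line(y):
        return SLOPE_BEFORE * min(y - KNEE, 0) + SLOPE_AFTER * max(y - KNEE, 0)
    return round(line(year) - line(LAST))


adjustments = {y: yearly_adjustment(y) for y in range(FIRST, LAST + 1)}

# Collapse consecutive years with the same adjustment into (first, last, value).
steps = []
for value, run in groupby(adjustments.items(), key=lambda item: item[1]):
    years = [y for y, _ in run]
    steps.append((years[0], years[-1], value))

for first, last, value in steps:
    print(first, last, value)


def adjust_month(temp_tenths, year):
    """Apply the year's adjustment to one monthly temperature (integer 0.1°C)."""
    return temp_tenths + adjustments[year]
```

Under these assumptions the runs come out as -3 for 1927 to 1931, -2 for 1932 to 1945, -1 for 1946 to 1960 and 0 from 1961, in agreement with the values given above. Because one integer is added to all twelve months of a year, the difference between adjusted and unadjusted monthly data can only change at year boundaries, which is where the stepped appearance comes from.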

The station temperature records resulting at the end of Gistemp STEP2 are saved as binary files rather than text files, but still contain integer temperatures, in units of 0.1°C. The results from STEP0 and STEP1 were similarly integer valued, but were saved as text files. In STEP3 the gridded data is saved as floating point values in binary files.

As the adjustments for NAR’JAN-MAR are small, the next post, before post 5 in this series, will illustrate a station with larger adjustments, my “home” station, Dublin Airport.

6 Responses to GISTEMP Urban Station Adjustment (post 4 of 6 in GISTEMP example)

  1. Pingback: Roman’s Anomaly Regression and GISTEMP | Peter O'Neill's Blog

  2. vjones says:

    Peter,
    having spent quite a bit of time reading this and related posts, I just wanted to say thanks. It is one thing to read what the method is supposed to be and understand it, but it is much better to see a worked example. I do have one query, but I’ll wait and see if it resolves in your remaining post(s).

    • oneillp says:

      If the query is related to the posts so far rather than the trial substitution of Roman’s anomaly regression it may well be something I overlooked or glossed over, and it might be better to look at it now. I prepared this worked example to help refresh myself on the details of Gistemp, as I had implemented them last year, as an aid to integrating Roman’s procedure as an alternative option.

      • V says:

        No it isn’t an error – it relates to post 3 of 6 – I just wondered that the WT seems to be cumulative every time a station is added and that you end up with a WT of 1.55. I’m curious to see what happens when the combined rural station data is then combined with the urban station – does this WT have any effect?

        IF it does, then does that mean that the adjusting stations have a very strong action and that the more adjusting stations there are (e.g. Europe) the less any actual data from the station will filter through? Of course there would be huge consequences of this for use of rural stations that aren’t pristine. Or perhaps I am just reading too much too far forward into the programme.

      • oneillp says:

        WT has no effect after the rural stations are combined. See the start of this post:

        Also, the weight (WT[IY]) and station count (IWT[IY]) have been included below, but de-emphasised using grey text, as a reminder that these final weights and counts play no further role. The least squares fit is not a weighted least squares fit.

  3. Verity Jones says:

    Ah I thought that, but there was doubt in my mind.
