## 9 Reconstructions

The previous sections give the physical principles that allow us to reconstruct annual means of heliospheric parameters from geomagnetic activity. By combining geomagnetic activity indices that have differing sensitivities to the substorm phenomenon (for example, one range index and one based on interdiurnal variability), both the IMF magnitude and the solar wind speed can be derived. From Parker spiral theory the modulus of the radial field can then be deduced and, using the Ulysses result, this means that the open solar flux can be reconstructed too. The first paper to exploit this possibility was Lockwood et al. (1999a), who reconstructed the open solar flux and found considerable variation, with 11-year running means around 1985 more than twice those found around 1900. A wide variety of different procedures have subsequently been used. The most obvious difference between them is the basis geomagnetic data employed. However, there are other differences. In this section we look at the resulting reconstructions of the near-Earth IMF, the solar wind speed, and the open solar flux.

### 9.1 Results for the near-Earth IMF

Figure 26 shows the reconstructions of annual means of the IMF magnitude and compares them to annual means of the near-Earth measurements. The reconstructions by Lockwood et al. (2009d) (LEA09, red line) and Rouillard et al. (2007) (REA07, blue line) are based on the (range) index and the index. The Svalgaard and Cliver (2010) reconstruction (SC10, thin black line) is based on the index alone. Because the method of SC10 is relatively straightforward, making use of a single correlation, it is much easier to study the propagation of uncertainties; the grey area shows the error in the SC10 reconstruction, as evaluated by Lockwood and Owens (2011) from the regression error estimates by Svalgaard and Cliver (2010). These estimates are made using the standard equations for uncertainty in the slope and intercept of a linear regression, which are approximate (Richter, 1995) and do not allow for experimental uncertainties in the data used. A more rigorous examination of uncertainty is presented in Section 9.4.2. The orange line (LEA99) was not published in the paper by Lockwood et al. (1999a), which focussed on open solar flux. However, Lockwood and Owens (2011) have extended the procedure of Lockwood et al. (1999a) to evaluate the IMF, the results of which are shown here. In this procedure, only the index was used, and the effect of the solar wind speed on this range index was allowed for using its recurrence index (based on the autocorrelation of the index at 27-day lag). This works because, in the streamer belt, annual means of the solar wind speed are enhanced by the intersection with fast solar wind streams emanating from isolated low-latitude coronal holes or from low-latitude extensions to polar coronal holes (Sheeley Jr et al., 1976; Wang and Sheeley Jr, 1990).
These fast streams interact with the slow solar wind ahead of them and, because the coronal holes generally persist for several solar rotations, cause co-rotating interaction regions which give recurrent geomagnetic disturbances, so increasing the recurrence index. As a result, recurrence indices (e.g., Sargent, 1986) are correlated with the average solar wind speed.
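
A recurrence index of the kind described above can be sketched as the autocorrelation of a daily geomagnetic index at a lag of 27 days. The function name `autocorrelation` and the synthetic series are illustrative assumptions, not the procedure or data of the cited papers.

```python
import math

def autocorrelation(series, lag):
    """Pearson correlation between series[t] and series[t + lag]."""
    if lag <= 0 or lag >= len(series) - 1:
        raise ValueError("lag must be positive and shorter than the series")
    x = series[:-lag]
    y = series[lag:]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("series has zero variance")
    return cov / math.sqrt(var_x * var_y)

# Synthetic daily index (hypothetical): a disturbance that recurs every 27 days,
# standing in for the effect of a persistent fast stream.
recurrent = [10.0 + 5.0 * math.sin(2.0 * math.pi * day / 27.0) for day in range(365)]
recurrence_index = autocorrelation(recurrent, 27)
```

For a perfectly recurrent series the value approaches unity; real indices give lower values that rise in years dominated by recurrent fast streams.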

It can be seen that, despite the wide variety of implementations and the variety of different geomagnetic data used, the reconstructions of agree remarkably well and all agree well with the observed values from the space age. As expected, the differences get somewhat larger as one goes back in time. The year 1901 appears to either contain some error in one or more of the data series that is propagated into some of the reconstructions, or has exposed a limitation in one of the procedures. The paper by Rouillard et al. (2007) offers some insight into this as they applied both ordinary least-squares regression (OLS) and Bayesian least-square regression (BLS) techniques to two combinations of indices: with and with . Their Figure 5 reveals that a shallow minimum is found for 1901 using the with pairing with OLS. The same pairing gives a somewhat deeper minimum if BLS is used and the variation shown in Figure 26 is found using the with pairing and, again, is somewhat deeper if BLS rather than OLS is used. The figure shows that using alone gave a lower value but Figure 33 shows the index with a non-linear fit and full error analysis (Lockwood et al., 2013b) gives a similar shallower 1901 minimum to that from an OLS fit to (Svalgaard and Cliver, 2010). In itself the larger spread of estimates for this one year (1901) is not significant, other than it does highlight that uncertainties are larger when the number of available stations is lower.

### 9.2 Results for the near-Earth solar wind speed

The SC10 and LEA99 papers do not reconstruct the solar wind speed variation, but Svalgaard et al. (2003), Svalgaard and Cliver (2007a) and REA07 did. Furthermore, REA07 used a number of different procedures to check how robust their reconstructions are: their results for the solar wind speed are shown in Figure 27. Specifically, they employed two different regression procedures: ordinary least squares (OLS) and Bayesian least squares (BLS). They also used two different combinations of geomagnetic indices: with and with , which, as illustrated in Figures 3 and 16, have differing dependencies on the solar wind speed.
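
For orientation, the OLS procedure named above can be sketched together with the textbook standard errors of slope and intercept. These formulas assume that all the error is in the dependent variable, so they do not account for experimental uncertainty in both quantities. The function name `ols_fit` and the sample values are hypothetical, and the BLS alternative is not shown.

```python
import math

def ols_fit(x, y):
    """Ordinary least-squares line y = intercept + slope * x.

    Returns (slope, intercept, se_slope, se_intercept) using the
    standard formulas that treat x as error-free.
    """
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least three paired values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual variance with n - 2 degrees of freedom
    rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    s2 = rss / (n - 2)
    se_slope = math.sqrt(s2 / sxx)
    se_intercept = math.sqrt(s2 * (1.0 / n + mean_x ** 2 / sxx))
    return slope, intercept, se_slope, se_intercept

# Hypothetical annual means: a geomagnetic index (x) against a heliospheric parameter (y)
index_values = [8.0, 9.5, 11.0, 12.5, 14.0, 15.5]
parameter_values = [5.1, 5.6, 6.4, 6.9, 7.7, 8.1]
slope, intercept, se_slope, se_intercept = ols_fit(index_values, parameter_values)
```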

The derived variations in annual means of the solar wind speed are very similar in all four cases, as is the Svalgaard and Cliver (2007a) reconstruction (not shown). All show a weak upward trend on average during the past century, but this trend is not as strong as that in the IMF. This agrees with recent inferences from Wang and Sheeley Jr (2012) that the lower IMF in low-activity cycles would be accompanied by lower solar wind number density, but not significantly lower solar wind speed. Peaks in the solar wind speed are seen in the declining phase of the solar cycles. In the space age there has been a marked tendency for these to be larger for even-numbered cycles than odd-numbered ones (Hapgood, 1993), but this is not seen in the reconstruction of the solar wind speed before the space age. This agrees with the inference from the 27-day recurrence in the index (see the lower panel of Figure 1 of Lockwood et al., 1999a). As yet we have no explanation for this difference between even- and odd-numbered cycles, nor for why it appears to be intermittent on centennial timescales.

### 9.3 Results for the open solar flux

Figure 28 shows the open solar flux reconstructions corresponding to the IMF reconstructions shown in Figure 26. Note that Svalgaard and Cliver (2010) did not compute because the main focus of their paper was the IMF , but their reconstruction (and the uncertainty band around it) has here been converted into open solar flux using the polynomial fit in Figure 29 (see below). Again the agreement is generally good, but larger differences do exist than for the IMF reconstructions. In particular, the original reconstruction by Lockwood et al. (1999a) (LEA99, the orange line in Figure 28) was derived using the index and hourly mean data of the IMF, with no kinematic correction to the values. It can be seen this gives larger values than the best open solar flux values from in-situ data which do deploy the kinematic correction (the black dots in Figure 28). The green line in the figure (labelled LEA) shows the results of applying the LEA99 procedure to the index and using kinematically-corrected IMF values. Comparing the green and orange lines it can be seen that applying these corrections has lowered the open solar flux estimates at all times, but the effect is greatest for the modern data (after 1957). Before 1957 (which is when the move of the northern hemisphere station from Abinger to Hartland generates the major difference between and ) the difference between the two is not as great. As in Figure 26, the red and blue lines are from Lockwood et al. (2009d) (LEA09) and Rouillard et al. (2007) (REA07). It can be seen that the LEA99 procedure, when applied to the same data as used by LEA09 and REA07 generates very similar results, despite being based on and the 27-day recurrence index, whereas LEA09 and REA07 are based on combining and . Given it is based on their , it is not surprising that the SC10 reconstruction is, as for , slightly larger than the others in the earliest years; nevertheless, agreement is remarkably close overall.

Figure 29 shows the variation of with used to convert the SC10 data. This is a scatter plot of the data from the LEA09 reconstruction for 1905 – 2009 (black dots) and from the in-situ spacecraft data (open triangles). The black line is a polynomial fit (given by equation 8 of Lockwood and Owens, 2011), constrained to pass through the origin because if the open solar flux ever fell to zero, the near-Earth IMF would necessarily also fall to zero. This fit varies considerably from the best fit linear regression, shown by the dot-dash line. The form of the polynomial fit is readily understood in terms of the competition between two effects (Lockwood et al., 2009d). The first is what would be seen for uniform solar wind flow (over a 27-day period), as predicted by Parker spiral theory. Sections 9.1 and 9.2 show that as the average IMF rose over the past 150 years, the average solar wind speed also rose slightly. This causes the spiral field to unwind such that the ratio rises and, hence, the ratio also rises as rises. This is consistent with the sense of the curvature in non-linear behavior seen in the data and the polynomial fit at below about 6 nT. However, at above 6 nT, the ratio falls slightly as continues to increase. This is consistent with an increased kinematic effect due to increased longitudinal structure in the solar wind at higher solar activity which will increase the at for a given .
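
The contrast between a fit constrained to pass through the origin and an unconstrained linear regression can be illustrated with a short sketch. The quadratic form used here is an assumption chosen for simplicity; the actual polynomial is equation 8 of Lockwood and Owens (2011), which is not reproduced here, and the data are synthetic.

```python
def fit_quadratic_through_origin(x, y):
    """Least-squares fit of y = a*x + b*x**2 (no constant term)."""
    s2 = sum(xi ** 2 for xi in x)
    s3 = sum(xi ** 3 for xi in x)
    s4 = sum(xi ** 4 for xi in x)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sx2y = sum(xi ** 2 * yi for xi, yi in zip(x, y))
    det = s2 * s4 - s3 * s3
    if det == 0:
        raise ValueError("degenerate data: cannot solve the normal equations")
    # Solve the 2x2 normal equations by Cramer's rule
    a = (sxy * s4 - sx2y * s3) / det
    b = (s2 * sx2y - s3 * sxy) / det
    return a, b

def fit_line(x, y):
    """Unconstrained least-squares line; returns (slope, intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Synthetic curved relation sampled only over a limited range of x,
# as happens when no observations exist near the origin.
x_values = [3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
y_values = [0.9 * xi - 0.03 * xi ** 2 for xi in x_values]

a, b = fit_quadratic_through_origin(x_values, y_values)
line_slope, line_intercept = fit_line(x_values, y_values)
```

In this synthetic case the straight line returns a non-zero intercept even though the underlying relation passes through the origin, so the intercept reflects the curvature of the relation rather than any physical offset.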

There is a point that has caused some confusion and needs clarifying here. The non-linearity of open solar flux and near-Earth IMF means that the radial component of the near-Earth IMF is not linearly related to . The original reconstruction of by Lockwood et al. (1999a) used not only a linear relationship but proportionality between and (by assuming that on annual mean timescales the gardenhose angle was constant) and so approximated the data in Figure 29 with . However, this approximation was used to derive an analytic form for that was then fitted to the and Lockwood and Owens (2011) have shown that, although this influences the fit coefficients, it does not greatly alter the derived . In other words, was a reasonable approximation to make in this context. However, Figure 29 shows that proportionality is an approximation and cannot be relied upon in general.
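
To see why a constant gardenhose angle implies proportionality, and why a changing solar wind speed breaks it, the ideal Parker spiral relations can be sketched numerically. Everything here is a simplifying assumption for illustration: the sidereal rotation period of 25.38 days, the use of the ideal spiral with no kinematic correction, the convention that the signed open flux is half the unsigned flux through a heliocentric sphere (relying on the latitude-independence of the radial field found by Ulysses), and all function names.

```python
import math

AU_M = 1.495978707e11                              # 1 AU in metres
OMEGA_SUN = 2.0 * math.pi / (25.38 * 86400.0)      # assumed sidereal rotation rate (rad/s)

def gardenhose_angle(v_sw_kms, r_m=AU_M):
    """Ideal Parker spiral angle (radians) between the field and the radial direction."""
    return math.atan(OMEGA_SUN * r_m / (v_sw_kms * 1.0e3))

def radial_field(b_nt, v_sw_kms):
    """Modulus of the radial field (nT) for field magnitude b_nt on an ideal spiral."""
    return b_nt * math.cos(gardenhose_angle(v_sw_kms))

def open_flux_wb(b_r_nt, r_m=AU_M):
    """Signed open flux (Wb), taken as half the unsigned flux through a sphere of radius r_m."""
    return 2.0 * math.pi * r_m ** 2 * abs(b_r_nt) * 1.0e-9

angle_400 = math.degrees(gardenhose_angle(400.0))
ratio_slow = radial_field(1.0, 350.0)   # |B_r| / B at 350 km/s
ratio_fast = radial_field(1.0, 450.0)   # |B_r| / B at 450 km/s
flux_example = open_flux_wb(3.0)        # open flux for |B_r| = 3 nT
```

At fixed solar wind speed the ratio of radial field to field magnitude is constant, which is the proportionality assumed by Lockwood et al. (1999a); a faster wind unwinds the spiral and raises that ratio.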

Thus, there is considerable agreement between the various reconstructions of both the open solar flux and the near-Earth interplanetary field. The main difference is that the SC10 reconstruction gives slightly but persistently higher values in the early years, but we should expect agreement to be less good at these times because the number of stations available, and the long-term stability of their instrumentation, are necessarily lower for the early data. SC10 extend their sequence back to 1835, just three years after the establishment of the first geomagnetic observatory: discussion of the validity of this extension is presented in Section 9.4.2. The LEA99 reconstruction extends back to 1868 because they used the index only and this is the date at which Mayaud began his analysis of range at two antipodal stations. This sequence has been extended back to 1844 using data from a single station (Helsinki) by Nevanlinna and Kataja (1993) and Nevanlinna (2004); Lockwood (2003) used this to extend the open solar flux back to this date. LEA09 are more conservative in that they used the index which only uses datasets and composites that extend into the era of space measurements, and they argued that the hourly mean (or hourly spot value) data that meet this criterion are too few and of insufficient accuracy before 1902 (for example, giving the uncertainty in 1901). There is considerable (if not complete) agreement after 1901, and so this gives more than 100 years of reconstruction that can be used to train and evaluate models of the long-term variation in the IMF and open solar flux; these are discussed in Section 11. These models are based on the longest series of as-it-happened observations available to us, which is of sunspot number (see the Living Review by Hathaway, 2010).

### 9.4 Discussion and uncertainty analysis

#### 9.4.1 Comparison of reconstructions and the concept of “floor” values

Figure 30 compares the open solar flux reconstructions to the group sunspot number. This data sequence was initially compiled by Hoyt and Schatten (1998). A number of possible adjustments to this sequence have been proposed recently, based on newly discovered historic observations. Reviewing these, once a consensus view has been reached, will be an important update to this and other living reviews. However, one adjustment, by Vaquero et al. (2011), is already included in Figure 30 as this makes the decline into Maunder minimum conditions more consistent with cosmogenic isotope data (Lockwood et al., 2011b). The upper dashed line in the top panel shows the “floor” in annual means of the near-Earth IMF of 4.6 nT postulated by Svalgaard and Cliver (2007b) (SC07). In this context, the author believes it is important to make a clear distinction between a genuine floor value (set by mechanisms which prevent the value of a given parameter from going any lower) and the minimum value detected since a certain date. The point is that, without firm and quantitative understanding of the postulated mechanisms, one can never be sure that lower values have not been seen only because the required conditions have not prevailed within the period for which one has data. Whilst it is almost certainly true that there will always be some flux emergence (which means that there would always be some open flux and a non-zero near-Earth IMF) there is, as yet, no known physical reason that would allow one to quantify a minimum floor value. The estimate of 4.6 nT by Svalgaard and Cliver (2007a) was based on the fact that their reconstruction (SC07) did not go below this value. In fact, in the recent low solar minimum, annual values of the IMF fell to 3.9 nT in 2009 and so Svalgaard and Cliver (2010) (SC10) revised their floor value down to 4.0 nT (which is the lowest value for calendar years), which is also shown in Figure 30.
Subsequently, Cliver and Ling (2011) have generated a new estimate of a floor IMF value of about 2.8 nT in annual means, based on more sophisticated empirical arguments, but the physical origin of any such quantified limit remains unknown. The middle panel of Figure 30 shows the open solar flux reconstructions, and the floor values have been mapped from the upper panel using the polynomial fit shown in Figure 29. One point to note is that a linear fit to the data shown in Figure 29 does not set a floor value at the intercept. The reason is that this intercept is at zero open solar flux and a non-zero near-Earth IMF. No source for the near-Earth IMF, other than the coronal source flux, has ever been suggested and, hence, if the open solar flux is zero then the near-Earth IMF must be zero also. Hence, if a linear fit to Figure 29 is argued to be evidence for a floor value, then an explanation is needed of where this residual near-Earth field comes from, as it cannot be from the Sun. Much more realistic is that it does come from the Sun and that the relationship between open solar flux and near-Earth IMF is not linear. In Section 10, the non-linear fit in Figure 29 is used to estimate the open solar flux during the Maunder minimum from cosmogenic isotope data.

All the reconstructions in Figure 30 show general variations with sunspot number, not just over the solar cycle but on centennial scales as well. The key difference between the sunspot variation and those derived for and is that sunspot activity indices return to a value close to zero every minimum (not exactly zero, there is a small long term drift in the minimum values that mirrors those in the maxima and in the 11-year running means). In contrast, both and show variability in the cycle minimum values which almost matches that in the solar maximum values. The realisation that the Sun does not return to the same baselevel state at each solar cycle minimum, even though it is (almost) clear of spots then, is an important change in our understanding of long-term solar variability. In using the two reconstructions ( and ), two points should be remembered. (1) The open solar flux has the advantage of being a global value that applies to the whole heliosphere whereas the IMF is a local value that applies only near the Earth (so, for example, it varies as the solar wind speed increases/decreases, making the Parker spiral unwind/tighten, respectively). (2) On the other hand, mapping from the near-Earth measurements back to the coronal source surface causes, as discussed in Section 7, its own complications and uncertainties. Hence, the IMF has the advantage of being much more straightforward observationally.

As noted above, the major differences between the reconstructions are before 1880, when the SC10 reconstruction is slightly, but consistently, higher than the others, thereby giving less long-term trend and a higher floor value (at least over the period since 1835). Note, however, that the other reconstructions do still (just) agree within the computed uncertainty in SC10. Considerable effort is being expended deploying more datasets to try to resolve this discrepancy. However, I urge some caution here. Some of the early data are of higher quality and better long-term stability than others, and so great care must be taken to ensure that bad data are not allowed to corrupt good data. The author’s personal view is that it may well be better to look at the present and the future to evaluate the reconstructions. As predicted as early as 2005 from the observed polar fields by Svalgaard et al. (2005), cycle 24 is proving to be a very weak cycle (e.g., Lockwood et al., 2012) and it is instructive to look at the latest 12-month means available at the time of writing (31 March 2013). These are shown by the black dots in the three panels of Figure 30. The IMF and open solar flux values are taken from the IMF observations by the ACE spacecraft, and the group sunspot number value is taken from the daily means of the International Sunspot Number (compiled by SIDC, Belgium), linearly regressed against the group sunspot number for the years when both are available. All the current indications are that this value is close to the maximum value (Lockwood et al., 2012), which means that the current cycle (number 24) is similar in magnitude to cycle number 14 (which peaked around 1908). Hence, it is illuminating to compare the current observed values of the IMF and open solar flux with the reconstructions for close to the peak of cycle 14. The best agreement is with LEA09 (in red), and SC10 is already significantly higher at this time, a trend that continues as one goes back in time.
Thus, the recent long and low minimum between solar cycles 23 and 24 (Russell et al., 2010; Lockwood, 2010) and the weakness of cycle 24 so far (Lockwood et al., 2012) are likely to discriminate between the reconstructions much more effectively than the implementation of many corrections of the pre-1900 data. The evolution of cycle 24 will be monitored and updated in Section 12. The sunspot numbers seen already in cycle 24 are still considerably larger than were seen during the Dalton minimum (marked DM in the bottom panel of Figure 30) and, of course, the Maunder minimum (MM). It therefore seems highly unlikely that the IMF and the open solar flux did not dip under any minimum values in data recorded after 1835.

#### 9.4.2 Analysis of uncertainty

The homogeneous construction of the composite by Lockwood et al. (2013a) allows a detailed analysis of uncertainties in the reconstructions that are based on it. In evaluating these uncertainties we need to allow for errors in both the interplanetary data and in the geomagnetic index, their effect on the regression fits and the subsequent effect on the reconstructions. Lockwood et al. (2013b) have carried out a comprehensive evaluation of errors in the reconstruction of IMF from .

The largest error in the interplanetary data is associated with the fact that the geomagnetic index responds to (or some equivalent coupling function that quantifies the southward IMF component in GSM) but we are attempting to reconstruct . As discussed in Section 5, the average of the ratio of the two tends to a constant on annual time scales, but part (c) of Figure 10 demonstrates that there is an error associated with employing this average that is of order 10%. There is also a much smaller measurement error which has been estimated from comparisons of measurements of from different spacecraft to be of order 0.2 nT.

Figure 31 presents an analysis of the errors in . Because is compiled from over 50 stations in modern times, it is reasonable to assume that most of the differences between and the appropriately scaled are due to errors in , hence the distribution of the residuals of the fit of onto gives us an uncertainty estimate in . This distribution for the space age is shown in Figure 31 and has a standard deviation of .

Lockwood et al. (2013b) use a Monte-Carlo method to carry out a non-linear regression fit between and and evaluate the uncertainties. The points shown in Figure 32 are annual means (with piecewise removal of data during datagaps and, hence, the parameters are denoted with a prime) which are fitted with a polynomial of form given in Equation (9).

In each fit, the fit coefficients that yield the minimum r.m.s. difference between the observed and predicted IMF values are determined using the Nelder–Mead search method (Nelder and Mead, 1965). This fit was carried out 100 000 times, with each point perturbed individually each time by randomly selected errors in both the IMF and the geomagnetic index, such that the errors in the IMF follow the normal distribution shown in part (c) of Figure 10 and the errors in the index follow the normal distribution shown in Figure 31. An additional error, drawn at random from a normal distribution of standard deviation 0.2 nT, is added to the IMF values to allow for IMF measurement uncertainties. For the full range of potential index values, the median, 95-percentile, and 5-percentile were evaluated from the 100 000 fits and taken to be the best fit (the blue line in Figure 32) and the uncertainty limits (which bound the grey area in Figure 32), respectively. The correlation between the observed IMF and the best-fit IMF derived from the index is 0.947. The maximum possible correlation is set by the correlation between and , which is 0.957, and hence part of the unexplained variation is caused by the variation in the IMF orientation factor.
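
The Monte-Carlo logic of this paragraph can be sketched as follows. To keep the example self-contained, a closed-form straight-line fit is swapped in for the Nelder–Mead polynomial fit of the source, and only 2000 iterations are used. The 10% fractional IMF error and the 0.2 nT measurement error follow the text; the index error `sigma_index`, the function names, and the data are hypothetical.

```python
import random

def linear_fit(x, y):
    """Closed-form least-squares line; returns (slope, intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def percentile(sorted_values, pct):
    """Nearest-rank style percentile of an already sorted list."""
    position = int(round(pct / 100.0 * (len(sorted_values) - 1)))
    return sorted_values[position]

def monte_carlo_band(index, imf, grid, sigma_index, frac_sigma_imf=0.10,
                     sigma_meas=0.2, n_fits=2000, seed=1):
    """Return (5th percentile, median, 95th percentile) of the fitted IMF at each grid value."""
    rng = random.Random(seed)
    predictions = [[] for _ in grid]
    for _ in range(n_fits):
        # Perturb every point independently in both variables
        index_p = [v + rng.gauss(0.0, sigma_index) for v in index]
        imf_p = [b * (1.0 + rng.gauss(0.0, frac_sigma_imf)) + rng.gauss(0.0, sigma_meas)
                 for b in imf]
        slope, intercept = linear_fit(index_p, imf_p)
        for j, g in enumerate(grid):
            predictions[j].append(intercept + slope * g)
    band = []
    for values in predictions:
        values.sort()
        band.append((percentile(values, 5), percentile(values, 50), percentile(values, 95)))
    return band

# Synthetic annual means (hypothetical): index values and IMF in nT
index = [6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0, 13.0]
imf = [4.4, 4.8, 5.2, 5.6, 6.0, 6.4, 6.8, 7.2]
grid = [6.0, 9.5, 13.0]
band = monte_carlo_band(index, imf, grid, sigma_index=0.5)
```

Taking percentiles of the ensemble at each index value, rather than of the fit coefficients, gives a band that directly bounds the predicted IMF.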

The uncertainty band is wide at low values of the index as there are no data to constrain the fit there. The procedure does produce quasi-linear fits (for which is close to unity), but these are rare in the ensemble and so are close to, or beyond, the level. These linear fits produce a non-zero intercept in the IMF when the index falls to zero. This would mean that geomagnetic activity falls to zero when the annual mean IMF falls below about 3 nT, and there is no known reason why this would occur. In contrast, the best non-linear fits give an intercept in the index if the IMF fell to zero: this does make sense as it means that there is a baselevel of geomagnetic activity driven by solar wind buffeting and phenomena such as Kelvin–Helmholtz waves on the boundary, on top of which reconnection-driven effects are added.

Because the grey band in Figure 32 is bounded by the 5- and 95-percentiles, 90% of the observed data points should lie within it if the error estimations are correct (with 5% above the band and 5% below the band). In fact, this is true for only 22 out of the 30 data points (73%). However, there is an additional factor which has not yet been allowed for: it affects the fit to the space-age data but would not be a factor in reconstructing the IMF from the composite. Although data gaps have been allowed for by piecewise removal of data, they still have an effect because there are (semi)annual and UT variations in the geomagnetic activity response to a given set of interplanetary conditions due to the effects of Earth’s dipole tilt. If we have full data coverage, these variations are not a factor as they are averaged out in annual means. However, data gaps mean they will have an effect, depending on the UT and time-of-year at which those data gaps occur. To simulate this, the ratio of annual means of and was evaluated for the continuous interplanetary data after 1995 but with data gaps synthetically introduced at random in such a way as to reproduce the observed distribution of gap durations in the OMNI2 dataset. Repeating this many times over allows statistical evaluation of the uncertainty in caused by gaps in the IMF data, as a function of the total data coverage. Using the observed coverage, uncertainties can be assigned to annual means and these are shown by the error bars in Figure 32. Allowing for these error bars, 27 of the 30 points (90%) are consistent with the grey band, which meets the design criterion.
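
The consistency check described here amounts to counting the points that fall within the band, with or without their error bars. A minimal sketch, assuming each point carries a symmetric error bar and using hypothetical values and a hypothetical function name `coverage`:

```python
def coverage(observed, lower, upper, errors=None):
    """Fraction of points consistent with the band [lower, upper].

    A point counts as consistent if its error bar overlaps the band.
    """
    if errors is None:
        errors = [0.0] * len(observed)
    inside = sum(
        1
        for value, low, high, err in zip(observed, lower, upper, errors)
        if value + err >= low and value - err <= high
    )
    return inside / len(observed)

# Hypothetical points (nT) against a constant band, for illustration only
observed = [5.0, 6.0, 7.5, 4.0]
band_low = [4.5, 4.5, 4.5, 4.5]
band_high = [7.0, 7.0, 7.0, 7.0]
error_bars = [0.1, 0.1, 0.6, 0.2]

without_errors = coverage(observed, band_low, band_high)
with_errors = coverage(observed, band_low, band_high, error_bars)

# The fractions quoted in the text
fraction_before = 22 / 30   # without error bars
fraction_after = 27 / 30    # allowing for data-gap error bars
```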

Figure 33 shows the reconstruction and its uncertainty from this fit. The tacit assumption is that the relationship between the index and found in the space age (as shown in Figure 32) applies at all other times. This is where the fact that the construction of is homogeneous is so important as it gives the greatest possible confidence that this is true. The best-fit reconstruction of using the polynomial fit is the black line. The grey area surrounding this black line is the uncertainty band associated with this and is derived using the grey band in Figure 32. In addition, the uncertainties introduced into by the intercalibration of the stations are allowed for. This is achieved by applying the upper fit to the upper limit of and the lower fit to the lower limit of . As discussed above, these confidence limits are defined to be at the 95% level. For comparison, the red line shows the result of using the linear fit: it is very similar to the results of the polynomial fit for the observed range of . The green line shows the reconstruction of Svalgaard and Cliver (2010), including the early extension using Bartels’ index. The blue dots show the annual means of the IMF data. It can be seen that agreement between the two reconstructions is exceptionally good between 1880 and the present day (including 1901). This is despite the fact that different geomagnetic indices and different fit procedures were used by Lockwood et al. (2013b) and Svalgaard and Cliver (2010). Therefore, there is a real and strong consensus about the IMF reconstruction after this date. However, before 1880 there are some differences. Before 1872, the Svalgaard and Cliver (2010) reconstruction is using the Bartels’ index about which Bartels himself expressed some reservations. On the other hand, the Lockwood et al. 
(2013b) reconstruction is based on data from the Helsinki observatory (Nevanlinna, 2004) which has passed a number of self-consistency checks (Lockwood et al., 2013a) and is very well correlated with corresponding data from Russian observatories operating at the same time (Nevanlinna and Häkkinen, 2010).