
Quantifying Representation and Using Representation Weights to Interpolate Flux Tower Measurements across the United States

Goal

To determine the degree to which the existing network of carbon eddy flux towers within the AmeriFlux network (http://public.ornl.gov/ameriflux/) is representative of flux environments across major biomes within the conterminous United States. This information can be used to determine how many additional towers will be required, and where new towers should be placed. In addition, the importance and uniqueness of each existing eddy covariance tower to the AmeriFlux network can be calculated.

Objective

The overarching objective of this project, begun in 2003, is to determine whether the AmeriFlux sites currently funded by DOE’s Office of Biological and Environmental Research are effectively operating as an integrated national network, and to direct the addition of flux sites as needed to fill critical gaps in biomes, developmental stages, and climate space. We are using a multivariate statistical approach on a parallel supercomputer to determine whether the current distribution of sites in the network is representative of the dominant combinations of vegetation, soils, and climate present in the conterminous US. Statistical indices based on multivariate representativeness and site importance indicate how well the current network of towers "samples" the population of flux environments within the nation. The same multivariate statistical approach can provide an objective and defensible rationale for the selection of additional sites, since it can identify any number of additional locations whose addition maximizes the representation of the overall network.

Approach

William W. Hargrove and Forrest M. Hoffman have developed a method based on a multivariate statistical approach to divide any map, based on a set of multivariate characteristics, into a set of regions which are relatively homogeneous with respect to those conditions. The number of regions which are to be produced is under user control. Once produced, any two regions taken from the map will have roughly equivalent within-region heterogeneity with regard to the combinations of characteristics used as input. Regions that are produced may be spatially disjoint; for example, environments on geographically distant mountaintops, if similar enough, will be classified into the same ecoregion.

We have statistically created a series of nine sets of flux-relevant ecoregions which divide the conterminous U.S. into a set of areas within which the carbon flux from terrestrial ecosystems is expected to be relatively uniform and homogeneous. Starting with digital GIS layers of factors deemed important in regulating carbon fixation and loss from terrestrial ecosystems, we assembled a set of maps of multivariate factors which describe and characterize the flux environment in each map cell. Then, we used a k-means clustering procedure to classify each map cell into the group whose cells have the most similar environments, repeating the assignment until group membership changed little from the previous pass.
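
The clustering step can be illustrated with a minimal sketch. This is not the project's parallel implementation. It assumes each map cell is a row of numeric descriptors, that variables are standardized before clustering, and that iteration stops when no cell changes group; the cell values, function names, and the choice of k = 3 are all hypothetical.

```python
import math
import random

def standardize(rows):
    """Scale each column to zero mean and unit variance so no variable dominates."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]

def nearest(point, centroids):
    """Index of the centroid closest to point in Euclidean data space."""
    return min(range(len(centroids)), key=lambda k: math.dist(point, centroids[k]))

def kmeans(cells, k, max_iter=100, seed=0):
    """Reassign cells and recompute centroids until assignments stop changing."""
    rng = random.Random(seed)
    centroids = [list(c) for c in rng.sample(cells, k)]
    labels = None
    for _ in range(max_iter):
        new_labels = [nearest(c, centroids) for c in cells]
        if new_labels == labels:  # no cell changed group since the last pass
            break
        labels = new_labels
        for j in range(k):
            members = [c for c, lab in zip(cells, labels) if lab == j]
            if members:  # keep the old centroid if a group empties out
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Hypothetical cells: (mean temperature, annual precipitation, soil nitrogen)
cells = [[5.0, 900.0, 0.20], [5.5, 950.0, 0.22], [22.0, 250.0, 0.05],
         [21.0, 300.0, 0.06], [14.0, 1400.0, 0.30], [13.5, 1350.0, 0.28]]
scaled = standardize(cells)
labels, centroids = kmeans(scaled, k=3)
print(labels)  # one group index per map cell
```

Cells that receive the same index belong to the same ecoregion regardless of where they sit on the map, which is why the resulting regions can be spatially disjoint.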

Each variable used to describe the flux environment is itself a national map that was developed specifically for this project. These maps are at 1 km resolution over the conterminous United States, and consist of nearly 8 million cells each. Because there were as many as 30 environmental descriptors, each with nearly 8 million cells, it was necessary to perform the clustering process on a parallel supercomputer.

Because the statistical process is quantitative, the similarity of a selected flux-ecoregion to every other ecoregion in the map can be calculated. Maps can be produced that show the degree of similarity to the chosen flux-ecoregion as a series of gray shades. By sequentially selecting each flux-ecoregion that currently contains an AmeriFlux tower, we can produce maps showing the geographic area represented by measurements from that tower.

Results to Date

In late 2003, a proof-of-concept paper was published in Eos describing the representativeness and network design analysis of the existing AmeriFlux network. For each quantitative ecoregion in the map, we found the Euclidean distance in data space to the single closest ecoregion that contains a site from the network. This distance was coded to a gray level, so that darker areas showed regions that are poorly represented by the existing AmeriFlux network. Network analysis showed how well the sampled environments represented the rest of the map, and identified the best locations for new sites or installations. The best locations for additional AmeriFlux sites are the places that are least well represented by the network of existing sites. Environments in the central, midwestern, and northeastern portions of the U.S. were well represented by existing AmeriFlux tower sites. Southern, southwestern, and Pacific Northwestern environments were more poorly represented. It was concluded that adding AmeriFlux sites in the Pacific Northwest and south Texas would significantly increase the marginal representation of the network.
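
The distance calculation described above can be sketched as follows. This is an illustration only: the centroid values, the list of sampled ecoregions, and the linear gray-level scaling are assumptions, not the published procedure.

```python
import math

def representativeness(centroids, sampled):
    """For each ecoregion, distance in data space to the nearest sampled ecoregion."""
    return [min(math.dist(c, centroids[s]) for s in sampled) for c in centroids]

def gray_levels(distances, levels=256):
    """Map distances to gray values: 255 (white) is well represented, 0 (black) is poorly."""
    worst = max(distances) or 1.0  # avoid dividing by zero if every region is sampled
    return [round((levels - 1) * (1.0 - d / worst)) for d in distances]

# Hypothetical standardized ecoregion centroids in a two-variable data space.
centroids = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0], [30.0, 40.0]]
sampled = [0, 1]  # hypothetical: ecoregions 0 and 1 each contain a tower

dist = representativeness(centroids, sampled)
print(dist)               # [0.0, 0.0, 5.0, 45.0]
print(gray_levels(dist))  # [255, 255, 227, 0]
```

The darkest ecoregion in such a map is the one farthest in data space from any sampled environment, and is therefore the strongest candidate location for a new tower.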

However, this proof-of-concept analysis used generic quantitative ecoregions produced for a different purpose. Because the ecoregionalization forms the basis for the entire network analysis, the results of this earlier work must be considered preliminary. Needed was a set of regionalizations created specifically to capture the spatial variability in flux environments across the nation. Once such flux-customized quantitative ecoregions were produced, the AmeriFlux representativeness and network design analysis could be repeated to see if the preliminary results changed or remained the same.

In FY2004, we produced nine separate sets of custom flux ecoregions for the conterminous U.S. (three types of flux ecoregions x three seasons for each type), adding each time to the factors used as input, and increasing in sophistication. We began by statistically constructing flux-ecoregions based on climatic, edaphic, and physiographic factors which might be considered, a priori, to impose limits on the amount and direction of carbon flux for a particular ecosystem. Consideration of primary forcing factors alone produces ecoregions which reflect the flux of potential vegetation. However, anthropogenic effects and disturbance history may have altered or reset this potential vegetation. The middle of this conceptual spectrum adds information about the characteristics of the vegetation that actually exists in this ecosystem at this time. Finally, in a third set of flux regionalizations, we added information about ecosystem performance, in terms of Gross Primary Production (GPP) and respiration.

We used data from NASA's MODIS sensor to provide information about extant vegetation across the US. MODIS was also used to provide indices of ecosystem performance for the third set of flux regionalizations. To calculate the flux-ecoregions, we processed every MODIS granule over the United States, for every 8-day interval in 2001, 2002, and 2003, for almost every MODIS product from the Terra platform.

In FY2005, we repeated the AmeriFlux network analysis using all nine of the custom flux ecoregions. As in the preliminary analysis, the Pacific Northwest, the Sierra Nevada mountains, and the Sonoran desert region were again shown to be poorly represented by existing AmeriFlux towers. Substantially increased representation was indicated in Texas, while there was a general decrease in representativeness in Florida. Whereas the original analysis showed most of the Florida peninsula to be well represented, with the exception of the Everglades, the updated analysis shows relatively poor representation for all of central Florida. The large-scale patterns of representativeness indicated for the AmeriFlux network are robust, regardless of which of the new statistical flux-ecoregions is selected as the basis of the analysis.

We also used a hierarchical multivariate clustering method to develop and plot a Similarity Tree among all 60 existing AmeriFlux sites. Similarity Trees show how similar the flux environment at each AmeriFlux site is to the flux environments at the other sites. In an ideal network, all of the sites should be highly unique in order to best capture different aspects of the continent.
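
A Similarity Tree of this kind can be sketched with simple agglomerative clustering. The source does not name the linkage rule, so average linkage is assumed here, and the site names and environment vectors are made up for illustration.

```python
import math
from itertools import combinations

def average_linkage(a, b, sites):
    """Mean pairwise Euclidean distance between the members of two clusters."""
    return sum(math.dist(sites[x], sites[y]) for x in a for y in b) / (len(a) * len(b))

def similarity_tree(sites):
    """Repeatedly merge the two most similar clusters; return the merge history."""
    clusters = [(name,) for name in sorted(sites)]
    merges = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: average_linkage(pair[0], pair[1], sites))
        merges.append((a, b, average_linkage(a, b, sites)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges

# Hypothetical standardized flux environments for four made-up sites.
sites = {"site_A": [0.0, 0.0], "site_B": [0.0, 1.0],
         "site_C": [5.0, 0.0], "site_D": [5.0, 2.0]}
merges = similarity_tree(sites)
for a, b, d in merges:
    print(a, "+", b, f"at distance {d:.2f}")
```

Sites that merge early, at small distances, sample nearly the same flux environment; sites that join the tree only near the root are the most distinctive members of the network.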

By calculating network representativeness with and without a particular site, the marginal contribution of that site to the representativeness of the network can be quantified. This difference is called the Site Representativeness Importance Value. These values allow direct comparison of sites in terms of their contribution toward network representativeness. We have produced lists of all AmeriFlux tower sites sorted by Site Representativeness Importance Value, and we expect these lists to be of great value for the practical management of the AmeriFlux network.
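
The with-and-without comparison can be sketched as a leave-one-out calculation. The source does not say how network representativeness is reduced to a single number, so this example assumes the mean distance from every ecoregion to its nearest sampled ecoregion (lower is better). The centroids, tower names, and tower-to-ecoregion mapping are hypothetical.

```python
import math

def network_distance(centroids, sampled):
    """Mean distance from every ecoregion to its nearest sampled ecoregion."""
    return sum(min(math.dist(c, centroids[s]) for s in sampled)
               for c in centroids) / len(centroids)

def importance_values(centroids, sites):
    """Leave-one-out: how much the network score worsens when each site is dropped."""
    baseline = network_distance(centroids, list(sites.values()))
    values = {}
    for name in sites:
        rest = [region for other, region in sites.items() if other != name]
        values[name] = network_distance(centroids, rest) - baseline
    return sorted(values.items(), key=lambda item: item[1], reverse=True)

# Hypothetical centroids and a made-up mapping of tower name to ecoregion index.
centroids = [[0.0, 0.0], [0.0, 1.0], [3.0, 4.0], [6.0, 8.0]]
sites = {"tower_1": 0, "tower_2": 1, "tower_3": 3}

ranking = importance_values(centroids, sites)
for name, value in ranking:
    print(f"{name}: {value:.3f}")
```

In this toy network, tower_3 ranks first because no other tower samples an environment close to its own, while tower_1 and tower_2 largely duplicate each other. The sketch requires at least two sites, since the network must still contain a site after one is removed.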

Deliverables

Five most important publications:
Hargrove, W.W., F.M. Hoffman, and B.E. Law. 2003. New Analysis Reveals Representativeness of the AmeriFlux Network. Eos Trans. AGU 84(48), December 2003.

Hargrove, W.W., and F.M. Hoffman. 2004. The potential of multivariate quantitative methods for delineation and visualization of ecoregions. Environmental Management 34(5):S39-S60.

Hargrove, W.W., and F.M. Hoffman. 2004. A Flux Atlas for Representativeness and Statistical Extrapolation of the AmeriFlux Network. ORNL Technical Memorandum ORNL/TM-2004/112. Available at http://geobabble.ornl.gov:/flux-ecoregions

Hargrove, W.W., and F.M. Hoffman. 2004. Using Quantitative Flux Ecoregions to Determine the Extent of the NACP Mid-Continent Intensive. Web page at http://research.esd.ornl.gov/~hnw/mid-continent/

Hargrove, W.W., and F.M. Hoffman. 2003. Representativeness and Network Site Analysis Based on Quantitative Ecoregions. Web page at http://research.esd.ornl.gov/~hnw/networks/

Other Publications

Hoffman, F.M., W.W. Hargrove, D.J. Erickson, III, and R. Oglesby. 2004. Using clustered climate regimes to analyze and compare predictions from fully coupled general circulation models. Earth Interactions (in press).

Saxon, E., B. Baker, W.W. Hargrove, F.M. Hoffman, and C. Zganjar. 2005. Mapping environments at risk under different global climate change scenarios. Ecology Letters 8: 53-60.

White, M.A., F.M. Hoffman, W.W. Hargrove, and R.R. Nemani. 2005. A global framework for monitoring phenological responses to climate change. Geophysical Research Letters 32(4):L04705, doi:10.1029/2004GL021961.

Hargrove, W.W., and F.M. Hoffman. 2003. Recent Changes in the Configuration of the DOE AmeriFlux Network and Their Effects on Network Representation. Web page at http://research.esd.ornl.gov/~hnw/networks2/

For more information, contact:
William W. Hargrove (hargroveww@ornl.gov, 865-241-2748)
Forrest M. Hoffman (hoffmanfm@ornl.gov, 865-576-7680)

Revised: 8/04/05