Terrestrial gross primary production: Using NIRV to scale from site to globe
Abstract
Terrestrial photosynthesis is the largest and one of the most uncertain fluxes in the global carbon cycle. We find that near-infrared reflectance of vegetation (NIRV), a remotely sensed measure of canopy structure, accurately predicts photosynthesis at FLUXNET validation sites at monthly to annual timescales (R2 = 0.68), without the need for difficult to acquire information about environmental factors that constrain photosynthesis at short timescales. Scaling the relationship between gross primary production (GPP) and NIRV from FLUXNET eddy covariance sites, we estimate global annual terrestrial photosynthesis to be 147 Pg C/year (95% credible interval 131–163 Pg C/year), which falls between bottom-up GPP estimates and the top-down global constraint on GPP from oxygen isotopes. NIRV-derived estimates of GPP are systematically higher than existing bottom-up estimates, especially throughout the midlatitudes. Progress in improving estimated GPP from NIRV can come from improved cloud screening in satellite data and increased resolution of vegetation characteristics, especially details about plant photosynthetic pathway.
1 INTRODUCTION
Terrestrial photosynthesis (or gross primary production [GPP]) is responsible for fixing somewhere between 119 and 169 Pg C/year, making GPP both the largest and most uncertain component of the global carbon cycle (Anav et al., 2015). Carbon fixed by photosynthesis in turn provides the basis for practically all life on land, providing a strong motivation for improving global estimates of GPP. It is especially important to understand how photosynthesis might respond to global environmental change, as minor perturbations in terrestrial productivity have implications for global biodiversity, agriculture, and climate change (Rockström et al., 2009; Running, 2012).
A global network of eddy covariance measurements of land surface CO2 exchange serves as the primary basis for quantifying terrestrial photosynthesis at both the site and global scale (Baldocchi, 2008; Baldocchi et al., 2001). Despite their utility, eddy covariance measurements are limited in both time and space; individual flux sites measure CO2 fluxes over approximately 1 km2 and in any given year fewer than 100 sites operate globally (Kumar, Hoffman, Hargrove, & Collier, 2016). Nevertheless, these sparse measurements are the best available data both for studying ecosystem-scale photosynthetic processes at the global scale and for validating terrestrial ecosystem models, which operate globally at resolutions typically much greater than a single kilometer and need to integrate over processes with time constants from a fraction of a second to many years.
In response to the sparseness of photosynthesis observations, many semi-empirical upscaling approaches have emerged for translating site-level CO2 fluxes to globally gridded photosynthesis estimates. Upscaling depends on the assumption that functional relationships between driver variables and GPP operate the same way at measured and unmeasured sites. Although many upscaling schemes exist, two approaches are by far the most widely used: machine learning (Beer et al., 2010; Tramontana et al., 2016) and satellite-driven mechanistic models (Running et al., 2004; Ryu et al., 2011). Both approaches integrate some combination of site-level abiotic characteristics, plant traits, and meteorology to estimate photosynthesis, using in situ fluxes from eddy covariance installations to calculate scaling factors that allow an estimation of photosynthesis beyond tower footprints. Such approaches have been quite successful, allowing for both the investigation of the drivers of global photosynthesis (Jung et al., 2017; Zhao & Running, 2010) and more extensive benchmarking of photosynthesis models by expanding the temporal and spatial availability of photosynthesis estimates (Bonan et al., 2011; Williams et al., 2009).
Any upscaling introduces uncertainties into GPP estimates, stemming both from model formulation and input data. Machine learning approaches, for example, provide the best possible constraint on GPP based on available data, but they functionally operate as black boxes. Such complexity makes it difficult to diagnose the causes and consequences of uncertainty. Upscaling approaches are also limited by the availability of and the uncertainties contained within input datasets (e.g., meteorological data). Combined, these challenges limit the utility of existing upscaling approaches for improving our process-based understanding of photosynthesis and determining the true value of global GPP. Of particular concern is the large and persistent disconnect between upscaled estimates of global GPP and higher estimates derived from top-down isotopic constraints (Welp et al., 2011).
Here, we report a novel approach for estimating global GPP using the near-infrared reflectance of vegetation (NIRV) that takes conceptual root in ideas going back more than 40 years. Even before the widespread use of remote sensing in vegetation analyses, Monteith (1977) observed that the annual increment in biomass growth (NPP, net primary production) can be estimated as the product of the annual absorption of solar radiation and a radiation use efficiency that is relatively constant across species. Several early remote sensing studies built on this idea, documenting the strong correlation between biomass accumulation and the annual integral of the normalized difference vegetation index (NDVI; Goward, Tucker, & Dye, 1985; Tucker, Li Vanpraet, Sharman, & Van Ittersum, 1985). While these approaches for estimating NPP worked well at the annual scale, short-term responses were inconsistent and variable across sites (Running & Nemani, 1988). Progress in improving the performance of NDVI-based productivity models came from a mix of incorporating additional information about vegetation type, meteorology, and physiological stress. As a result, integration approaches gradually transitioned to more physiologically grounded models, which attempt to represent the biochemical processes (e.g., carbon fixation by rubisco) and physiological stress responses (e.g., stomatal closure due to low soil moisture) that control photosynthesis (Field, Randerson, Carolyn, & Malmström, 1995; Myneni, Hall, Sellers, & Marshak, 1995; Potter et al., 1993; Running et al., 2004; Sellers et al., 1996). Although inclusion of biochemical and physiological processes made photosynthesis models more robust at shorter timescales, it introduced the vexing problem of needing to independently specify key physiological parameters, such as the maximum rate of carboxylation of rubisco (VCmax).
Inconsistencies in model parameterization schemes, in turn, give rise to large divergences in model-based estimates of GPP and reveal fundamental uncertainties in our understanding of the controls on photosynthesis at the global scale (Schaefer et al., 2012).
We revisit the early strategies for directly relating integrated satellite measurements to plant productivity. Our approach employs NIRV, a new satellite product that approximates the proportion of near-infrared (NIR) light reflected by vegetation. NIRV offers several advantages over existing satellite vegetation indices. Namely, NIRV has a robust physical interpretation, as it relates directly to the number of NIR photons reflected by plants (Badgley, Field, & Berry, 2017). As a result, NIRV minimizes both the effects of soil contamination and variable viewing geometry on satellite-derived spectra. Consequently, NIRV serves as a comprehensive index of light capture, integrating the influence of leaf area, leaf orientation, and overall canopy structure. We hypothesize that, to the extent plants allocate resources efficiently (Bloom, Stuart Chapin, & Mooney, 1985; Field et al., 1995), this integrated measure of investment in light capture should scale with the capacity to fix CO2, providing a strong basis for new, satellite-derived estimates of GPP.
To test this hypothesis, we use the relationship between NIRV and in situ measurements of GPP derived from eddy covariance. We present our results in three parts. First, we validate the NIRV–GPP relationship at the site scale, contrasting the NIRV approach with other remote sensing, statistical, and physiological models of GPP. Second, we extend the relationship to consider global GPP. Third, we evaluate some of the limitations in the global dataset of NIRV and discuss options for refining the approach.
2 MATERIALS AND METHODS
2.1 Data
We compared NIRV, which is the product of the normalized difference vegetation index (NDVI) and NIR reflectance (NDVI × NIR), against monthly and annual GPP fluxes at 105 flux sites contained in the FLUXNET2015 Tier 1 dataset that met quality control requirements and fell within the time frame of the MODIS record (2003–present). We calculated median NDVI and NIR for all daily scenes overlapping a 1 km2 circle around each flux site, using 500 m, daily red (620–670 nm) and NIR (841–876 nm) nadir-adjusted reflectances from MODIS collection MCD43A4.006 hosted on Google Earth Engine for the years spanning 2003–2015 (Schaaf & Wang, 2015). Prior to estimating mean NIRV, gaps in reflectance data of up to 7 days were filled using linear interpolation. We calculated the average of all NIRV observations for each month and compared them with monthly estimates of GPP from the FLUXNET2015 dataset (variable name: GPP_VUT_MEAN). We required all site-months to have over 75% valid GPP observations and required site-years to have a minimum of 9 months of data. We gridded the MCD43A4.006 dataset to 0.5° by averaging all 500 m pixels whose center fell within each 0.5° grid cell for the global upscaling. No additional gap filling, apart from those procedures inherent in the production of the underlying daily reflectance values (see Schaaf et al., 2002), was used in regridding. Missingness of NIRV data at both the site and global scale due to quality control issues (e.g., clouds) was minimal (Figure S1).
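As a minimal sketch of the index defined above, the following Python computes NDVI and NIRV from red and NIR reflectances and fills short gaps in a daily series by linear interpolation. The function names, the use of `None` to mark missing days, and the sample reflectance values are illustrative assumptions; this is not the processing code used for the analysis.

```python
from typing import List, Optional


def ndvi(red: float, nir: float) -> float:
    """Normalized difference vegetation index from red and NIR reflectance."""
    return (nir - red) / (nir + red)


def nirv(red: float, nir: float) -> float:
    """NIRV as defined in the text: NDVI multiplied by NIR reflectance."""
    return ndvi(red, nir) * nir


def fill_short_gaps(series: List[Optional[float]],
                    max_gap: int = 7) -> List[Optional[float]]:
    """Linearly interpolate runs of missing values (None) of at most max_gap days.

    Longer gaps, and gaps touching either end of the series, stay as None.
    """
    filled = list(series)
    n = len(filled)
    i = 0
    while i < n:
        if filled[i] is not None:
            i += 1
            continue
        start = i                      # first missing index of this run
        while i < n and filled[i] is None:
            i += 1
        end = i                        # first valid index after the run (or n)
        if start > 0 and end < n and (end - start) <= max_gap:
            left, right = filled[start - 1], filled[end]
            span = end - (start - 1)
            for k in range(start, end):
                frac = (k - (start - 1)) / span
                filled[k] = left + frac * (right - left)
    return filled


# Hypothetical daily reflectances (red, NIR); None marks a missing day.
daily = [(0.05, 0.40), None, None, (0.04, 0.46)]
daily_nirv = [None if d is None else nirv(*d) for d in daily]
print(fill_short_gaps(daily_nirv))
```

Monthly NIRV would then be the mean of the gap-filled daily values within each calendar month.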
In addition to the site-level comparisons, we evaluated NIRV-based GPP estimates against two existing models of GPP: FLUXCOM, a machine learning approach for upscaling FLUXNET observations (Tramontana et al., 2016), and the Breathing Earth System Simulator (BESS), a physiologically based land surface model that has been extensively benchmarked against eddy covariance measurements of GPP (Jiang & Ryu, 2016; Ryu et al., 2011). For FLUXCOM, we used the mean ensemble of annual GPP HB fluxes from FLUXCOM CRUNCEPv6, available from http://www.fluxcom.org/CF-Download/. For BESS, we used GPP from BESS V1, downloaded from http://environment.snu.ac.kr/bess_flux/. Site-level RMSE values for FLUXCOM and BESS were derived from data provided by the authors (Jiang & Ryu, 2016; Tramontana et al., 2016). We compared models using an Akaike information criterion (AIC)-based approach that simultaneously evaluates model accuracy and penalizes model complexity (see Supplementary Text 1 for details). AIC values were calculated for NIRV, BESS, and FLUXCOM using only site-years shared across all three products.
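To illustrate how an AIC-based comparison can favor a simpler model even when its RMSE is higher, the sketch below uses a common least-squares form of the criterion, AIC = n ln(RSS/n) + 2k, which assumes Gaussian errors. The exact formulation used here is given in Supplementary Text 1 and may differ; the RMSE values, sample size, and parameter counts in the example are placeholders, not values from this study.

```python
import math


def aic_least_squares(rmse: float, n_obs: int, n_params: int) -> float:
    """AIC for a least-squares fit: n * ln(RSS / n) + 2k, with RSS = n * RMSE^2."""
    rss = n_obs * rmse ** 2
    return n_obs * math.log(rss / n_obs) + 2 * n_params


# Placeholder numbers: a complex model with lower RMSE vs. a simple model.
n = 200
aic_complex = aic_least_squares(rmse=0.30, n_obs=n, n_params=500)
aic_simple = aic_least_squares(rmse=0.40, n_obs=n, n_params=5)
print(aic_complex, aic_simple)  # the lower AIC is preferred
```

With these placeholder inputs the simpler model has the lower AIC despite its higher RMSE, because the penalty term 2k dominates for the heavily parameterized model.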
2.2 Calibration
We used Bayesian estimation to relate NIRV and ecosystem type to GPP at both monthly and annual timescales. Bayesian estimation allowed us to fit slope and intercept, as well as hierarchical variance terms capturing site-level random effects (random deviations from the global slope and intercept per site) and error variance (Gelman et al., 2013). Because Bayesian estimation yields a joint posterior distribution of parameter estimates, upscaling from the model posterior allows us to accurately propagate multiple sources of uncertainty, including joint uncertainty in the model fixed structure (i.e., slope and intercept of the GPP–NIRV relationship) and the random effects (i.e., unexplained site-to-site variation and residual variation in the training dataset). The best model, according to the deviance information criterion (DIC; an AIC-like score modified for Bayesian models), consists of a single, near-zero y-intercept and differing slopes for evergreen, deciduous, and crop ecosystem types. The model includes two additional terms: a random site-level intercept term and an error term, both of which were specified as normal distributions with mean of 0 and variance exponentially related to NIRV. See Supplementary Text and Table S1 for a full description of the model structure and the Markov chain Monte Carlo fitting procedure, as well as alternative model structures tested. We performed ecosystem type-stratified fivefold cross-validation at the site level (e.g., leaving out 20% of sites from each ecosystem type) to confirm that the final model was not overfit (Figure S2). Calibration sites were distributed throughout the global range of observed annual NIRV, though there were only three sites with annual NIRV above 2.5 (Figure S3). In total, the final calibration dataset included data from 105 eddy covariance sites, comprising 526 site-years.
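The site-level, ecosystem type-stratified cross-validation can be sketched as follows: sites are grouped by ecosystem type and dealt into folds so that each fold holds out roughly 20% of the sites of every type. The function name, the site identifiers, the number of sites per type, and the fixed random seed are illustrative assumptions rather than details of the procedure used here.

```python
import random
from collections import defaultdict
from typing import Dict, List


def stratified_site_folds(site_types: Dict[str, str],
                          n_folds: int = 5,
                          seed: int = 0) -> List[List[str]]:
    """Split sites into n_folds held-out groups, balanced by ecosystem type.

    site_types maps a site identifier to its ecosystem type. Whole sites
    (not individual site-years) are assigned to folds.
    """
    rng = random.Random(seed)
    by_type = defaultdict(list)
    for site, eco in sorted(site_types.items()):
        by_type[eco].append(site)

    folds: List[List[str]] = [[] for _ in range(n_folds)]
    for eco in sorted(by_type):
        sites = by_type[eco]
        rng.shuffle(sites)
        # Deal sites of this type round-robin across the folds.
        for i, site in enumerate(sites):
            folds[i % n_folds].append(site)
    return folds


# Hypothetical site list: identifiers and counts are placeholders.
sites = {f"EVG-{i}": "evergreen" for i in range(10)}
sites.update({f"DEC-{i}": "deciduous" for i in range(10)})
sites.update({f"CRO-{i}": "crop" for i in range(5)})
folds = stratified_site_folds(sites)
print([len(f) for f in folds])
```

Holding out entire sites, rather than individual site-years, keeps every year of a held-out site out of the training data, which is what a test of generalization to unmeasured locations requires.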
2.3 Upscaling
We produced global annual estimates of GPP using 1,000 samples from the joint model posterior for all 0.5° vegetated land pixels from 2003 to 2015. For each posterior sample (i.e., each joint set of scaling and variance parameter estimates), we calculated per-pixel GPP using the scaling parameters for the ecosystem type, a random draw from the site-level error distribution for each pixel and a random draw from the residual error distribution for each pixel-year. Using the site-level model for our global upscaling captured correlations between parameter estimates (scaling slope and site-level variance estimates were often correlated), resulting in GPP estimates that appropriately represent statistical, site, and residual uncertainty from the full joint posterior distribution of the model. We present the median and 95% credible intervals from the distribution of the 1,000 global GPP estimates.
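The upscaling procedure can be sketched for a single year as below. Each posterior sample supplies an intercept, per-type slopes, and two standard deviations; each pixel receives one site-level draw and one residual draw, and area-weighted GPP is summed to a global total. This is a simplified illustration under stated assumptions: the dictionary keys, the `area_m2` field, and all numeric values are hypothetical, and the standard deviations are held constant per sample, whereas the fitted model lets both variances depend on NIRV.

```python
import random
import statistics
from typing import Dict, List, Tuple


def simulate_global_gpp(pixels: List[Dict], posterior: List[Dict],
                        seed: int = 0) -> List[float]:
    """Return one global annual GPP total (Pg C/year) per posterior sample."""
    rng = random.Random(seed)
    totals = []
    for draw in posterior:
        total_kg = 0.0
        for px in pixels:
            mean = draw["intercept"] + draw["slope"][px["eco"]] * px["nirv"]
            site_dev = rng.gauss(0.0, draw["site_sd"])    # site-level draw
            resid = rng.gauss(0.0, draw["resid_sd"])      # residual draw
            gpp = mean + site_dev + resid                 # kg C m-2 year-1
            total_kg += gpp * px["area_m2"]
        totals.append(total_kg / 1e12)                    # 1 Pg = 1e12 kg
    return totals


def credible_interval(samples: List[float],
                      level: float = 0.95) -> Tuple[float, float, float]:
    """Return (lower, median, upper) from sorted samples by index lookup."""
    ordered = sorted(samples)
    last = len(ordered) - 1
    lo_idx = int(round((1 - level) / 2 * last))
    hi_idx = int(round((1 + level) / 2 * last))
    return ordered[lo_idx], statistics.median(ordered), ordered[hi_idx]


# Hypothetical inputs: two pixels and 1,000 made-up posterior samples.
param_rng = random.Random(1)
pixels = [
    {"nirv": 1.2, "eco": "evergreen", "area_m2": 2.5e9},
    {"nirv": 0.8, "eco": "crop", "area_m2": 2.9e9},
]
posterior = [
    {"intercept": 0.0,
     "slope": {"evergreen": param_rng.gauss(1.0, 0.05),
               "crop": param_rng.gauss(1.3, 0.05)},
     "site_sd": 0.2, "resid_sd": 0.1}
    for _ in range(1000)
]
low, med, high = credible_interval(simulate_global_gpp(pixels, posterior))
print(low, med, high)
```

Drawing the site and residual terms inside the loop over posterior samples is what lets parameter, site, and residual uncertainty accumulate in the same distribution of global totals.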
3 RESULTS
3.1 Site-level validation
NIRV, combined with information on ecosystem type (deciduous, evergreen, and crop), explained 68% of the variation in annual GPP at 105 eddy covariance monitoring sites (526 site-years that passed quality control and data completeness requirements) and had an RMSE of 0.36 kg C m−2 year−1 (Figure 1). At the monthly scale, the same model explained 56% of monthly variation in GPP with an RMSE of 0.08 kg C m−2 month−1 (Figure 1, inset). At the annual scale, we found that the normalized difference vegetation index (NDVI) and the fraction of absorbed photosynthetic radiation (fPAR; two popular vegetation indices) were worse predictors than NIRV, explaining 59% and 52% of the variation in annual GPP fluxes. The accuracy of NIRV far exceeded both NDVI and fPAR in terms of RMSE (Table S2). Importantly, the NIRV–GPP relationship was consistently linear across all values of GPP (Figure S4). The most parsimonious model included just three ecosystem types, with a single intercept and separate NIRV–GPP slopes for sites with (a) evergreen, (b) deciduous, and (c) crop ecosystem types. The model also accounted for variance in both residual error and site-level random intercepts that increased as a function of NIRV (Figure S5). Dividing ecosystems into a greater number of types resulted in minor model improvements, but an almost identical DIC with more parameters, causing us to adopt the simpler three ecosystem type model.

The site-level performance of NIRV-derived GPP compared favorably against BESS and FLUXCOM, when evaluated across overlapping site-years (Figure 1b). The RMSE of site-level NIRV-based GPP estimates was 42% lower than estimates from BESS and 57% higher than estimates from FLUXCOM, a machine learning-based upscaling product. However, taking model complexity into account with AIC and using conservatively low estimates for number of fitted parameters in the alternative approaches, the NIRV approach had a far lower AIC than either BESS or FLUXCOM. This indicates that NIRV better balances model accuracy against model complexity and thereby has a lower likelihood of overfitting the site-level data. Strong performance at validation sites, especially relative to leading statistical- and physiological-based estimates of GPP, suggests that NIRV provides a robust basis for global estimates of GPP.
Furthermore, the NIRV approach requires no additional information on meteorological conditions, such as site temperature, vapor pressure deficit, or incoming radiation. Residuals in observed GPP relative to NIRV-derived GPP estimates showed only weak relationships with meteorological variables (Figure 2). For site-years with especially high values of annual precipitation, model accuracy was slightly improved by including precipitation in the model. Similarly, compound meteorological indices, like the ratio of precipitation to potential evapotranspiration (“aridity index”), had only a weak relationship with GPP residuals (Figure S7). Including all available meteorological data boosted R2 by only 0.04, from 0.68 to 0.72 (Table S3), but led to a higher DIC, which indicates that the base NIRV model better generalizes for predictive purposes. Models combining individual meteorological variables with NIRV showed similar small improvements in R2 and RMSE, accompanied by increased DIC.

Interestingly, model residuals had only a weak relationship with annual PAR (Figure 2d, p = 0.01, R2 = 0.01). Light is the primary driver of photosynthesis at shorter timescales, suggesting that it should be the leading candidate for improving model predictions. This was not the case for estimates based on integrated NIRV. In fact, including data on integrated PAR decreased the strength of the NIRV–GPP relationship (Figures S4d and S6). Such a pattern could result from NIRV already integrating relevant information about site-level radiation, or it could have more to do with the uncertainties inherent in global radiation observations. We also found that model residuals at the annual timescale had no relationship with site-level cloudiness, indicating that NIRV alone captured the integrated effect of seasonal variation in sunny and cloudy conditions without the need for separately considering PAR (Figure S8). By requiring fewer inputs, NIRV-based upscaling of GPP reduces uncertainty from those inputs. It also allows the approach to be applied across a wide range of spatial and temporal scales where such data might not be available.
3.2 Global upscaling
Applying the site-level scaling to globally resolved measurements of NIRV, we estimated the median value of global annual GPP from 2003 to 2015 to be 147 Pg C/year, with a 95% credible interval of 131–163 Pg C/year. This median GPP estimate is intermediate between estimates from bottom-up models and constraints from O2 isotopes. FLUXCOM places annual GPP at 118 Pg C/year, while BESS puts mean global GPP at 122 Pg C/year. Based on a meta-analysis, the full range of terrestrial ecosystem models estimate annual GPP to be between 119 and 169 Pg C/year (Anav et al., 2015). The Multi-Scale Synthesis and Terrestrial Model Intercomparison Project (MsTMIP) provides a similar range of estimates across 15 different terrestrial ecosystem models, with our NIRV-derived GPP estimate falling on the high side of those model estimates (Figure S9). O2 isotopic measurements are consistent with global annual GPP in the range of 150–175 Pg C/year (Welp et al., 2011).
The spatial distribution of NIRV-derived GPP is broadly consistent with previous global GPP estimates (Figure 3). As expected, GPP is concentrated in the tropics and declines toward the poles. On a per biome basis, tropical forests contribute the most, accounting for 31% of global GPP; FLUXCOM and BESS attribute 34% and 33% of GPP to tropical forests, respectively. Although lower in relative terms, NIRV-derived GPP in tropical forests is 15% higher than both FLUXCOM and BESS GPP estimates. Differences were even larger at higher latitudes, where NIRV assigns higher productivity to midlatitude mixed forests, grasslands, and shrub-dominated ecosystems (Figure 3b; Table S4). One explanation for this pattern is that NIRV minimizes soil contamination that might cause alternative remote sensing approaches to systematically underestimate leaf area across the midlatitudes. Consistent with this view, a recent study that combined solar-induced chlorophyll fluorescence with a terrestrial ecosystem model reports similar relative increases in extratropical GPP (Norton et al., 2018).

On a per pixel basis, NIRV GPP estimates are strongly linear with GPP estimates from both FLUXCOM and BESS at the annual timescale. R2 exceeds 0.90 and RMSE is below 0.4 kg C m−2 year−1 for both products (Figure S11). Comparison of NIRV to GPP estimates from the MODIS GPP algorithm shows similar performance (Figure S12). This consistency is striking, given that the NIRV approach requires only two inputs (NIRV and ecosystem type). By contrast, both FLUXCOM and BESS require numerous environmental inputs. While broadly consistent, the comparison also emphasizes that NIRV-derived GPP estimates are typically higher, exceeding FLUXCOM GPP by a median value of 0.24 kg C m−2 year−1 and BESS GPP by 0.21 kg C m−2 year−1. There is no obvious reason that NIRV might be biased high. It might be tempting to think that physiological stress, which is not explicitly accounted for by NIRV, might explain the higher GPP from this approach. However, the NIRV-based approach uses the annual sum of both NIRV and measured GPP, meaning NIRV-derived GPP estimates are calibrated to include all of the stress effects at FLUXNET sites, when integrated to the annual scale. Such an interpretation is supported by the weak correlations between model residual GPP and numerous meteorological variables. If NIRV failed to capture the effects of lower precipitation or higher VPD on plant productivity, we would expect these environmental variables to explain additional variations in annual GPP. Yet meteorological variables provide little additional predictive power, meaning the annual NIRV-based GPP estimates could be biased upward only if FLUXNET sites are systematically biased toward low-stress locations or the FLUXNET2015 GPP estimates are biased toward good years where stress did not limit photosynthesis. Of course, such biases would affect any upscaling approach calibrated to the FLUXNET2015 dataset.
Similarly, using the same satellite data at both the site and global scales minimizes the likelihood that systematic errors or biases in the retrieval of NIRV affect our estimates of GPP; any error or bias in NIRV should be accounted for by our site-level calibration. There is little evidence for systematic biases in our model fit (Figure 1; Figure S10). However, even in two worst-case scenarios of systematic bias (overprediction at low productivity sites or underprediction at high productivity sites), neither maximum credible bias would affect our annual global estimate by more than 10%, which is considerably smaller than the 32 Pg C/year credible interval around our median estimate and the differences between our estimate and either BESS or FLUXCOM (Figure S10). Alternatively, both BESS and FLUXCOM might systematically underestimate true GPP, an interpretation consistent with the constraint from oxygen isotopes (Welp et al., 2011). Resolving this discrepancy represents an important next step in the study of photosynthesis at the global scale.
3.3 Uncertainty analysis
Model parsimony, combined with Bayesian estimation, allows us to propagate three sources of uncertainty for each pixel based on the uncertainties quantified in model calibration: statistical (variation in per ecosystem type scaling in the model posterior distribution), site (deviation of each pixel's intercept from the global relationship for that ecosystem type), and residual (otherwise unexplained error). Median per pixel uncertainty is 0.20 kg C m−2 year−1. Total uncertainty, comprising all three sources of error, peaks in the tropics where total annual NIRV is highest. In the worst case, the 95% credible interval of GPP exceeds 0.75 kg C m−2 year−1 in the Amazon basin and Indonesia (Figure 4a). Given that tropical forests constitute the highest proportion of GPP (exceeding 30%) and have relatively few flux measurements, high uncertainty throughout the tropics significantly contributes to the overall uncertainty of global GPP estimates, regardless of approach.

Bayesian upscaling allows the uncertainties in parameter estimation from the site-level calibration to be projected globally; two examples of pixel-level uncertainties are shown in Figure 4b. GPP estimated for each pixel fully contains the uncertainties present in the FLUXNET2015 dataset, providing added confidence in the robustness of the credible range of estimated GPP. Outside of pixels with especially low NIRV, statistical uncertainty is always lowest in both relative and absolute terms, indicating minimal uncertainty in per ecosystem type scaling. On average, site uncertainty is largest, meaning there is more uncertainty in the NIRV–GPP relationship from site to site (primarily in the site-level intercept, Figure S5) than interannual variation (encompassed by residual uncertainty) in the NIRV–GPP relationship at a single site. Site-to-site variability is randomly distributed, showing no relationship with site climate (Figure S13), thus highlighting retrieval errors (e.g., soil reflectance, clouds) in NIRV and inherent uncertainties in eddy covariance derived GPP estimates as the likely cause of site-level uncertainty.
4 DISCUSSION
NIRV provides a novel approach for estimating GPP that combines a very simple formulation with excellent performance at validation sites (Figure 1). As such, the NIRV approach is largely independent of existing semi-empirical and process-based upscaling approaches. Furthermore, the NIRV approach achieves strong quantification of uncertainties while maintaining parsimony. This combination of simple calculation plus straightforward analysis and partitioning of uncertainty between model structure and inputs makes NIRV a useful tool for revisiting and revising long-standing assumptions about the global controls of photosynthesis.
The strong correlation of NIRV and GPP at FLUXNET calibration sites provides prima facie evidence for the hypothesis that plants allocate resources such that the potential to harvest light (controlled by canopy architecture) and the potential for CO2 fixation (controlled by physiology and biochemical capacity) are held in balance. To further test this hypothesis, we examined differences in the strength of the NIRV–GPP relationship at successively longer integration times for evergreen (of which all but one was located in the temperate latitudes) and deciduous validation sites. Relative to evergreens, deciduous leaves have higher photosynthetic rates and must recoup the cost of constructing leaves over a short period of time. In contrast, evergreen canopies amortize the cost of leaf construction and maintenance over a year or more and, as a result, have less flexibility to respond to short-term perturbations in resource availability (Chabot & Hicks, 1982). Given these contrasting strategies, we expect that NIRV at deciduous sites should track GPP just as well at short timescales as it does at longer timescales, while as integration time increases from days to months, the performance of NIRV as a predictor of GPP should increase at evergreen sites. This is exactly the pattern found at the FLUXNET validation sites, which we tested using Bonferroni adjusted t tests (Figure 5). At deciduous sites, NIRV is no more powerful at explaining daily GPP fluxes than it is at explaining fluxes integrated to 90 days (p > 0.05, Bonferroni adjusted). At evergreen sites, by contrast, NIRV is a significantly stronger predictor of GPP at 90 days than at the daily timescale (p < 0.001; Bonferroni adjusted). Interestingly, by 7 days, the performance of NIRV at deciduous and evergreen sites is statistically indistinguishable (p > 0.05; Bonferroni adjusted).
As noted above, the analysis only included one evergreen tropical forest site (GF-Guy), meaning these results should primarily be interpreted as applying to temperate ecosystems.
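For readers unfamiliar with the Bonferroni adjustment applied to these comparisons, it multiplies each raw p value by the number of tests performed and caps the result at 1. The sketch below shows the arithmetic only; the raw p values are placeholders, not results from this analysis.

```python
from typing import List


def bonferroni_adjust(p_values: List[float]) -> List[float]:
    """Multiply each p value by the number of tests, capping at 1.0."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Placeholder raw p values from three hypothetical pairwise t tests.
raw = [0.01, 0.04, 0.5]
print(bonferroni_adjust(raw))
```

A comparison is then judged significant only if its adjusted p value falls below the chosen threshold, which keeps the family-wise error rate at or below that threshold.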

The coupling of NIRV and GPP even holds during drought events. During the 2012 North American drought, NIRV showed characteristic early spring green-up, conforming with the spring-ward shift of both carbon and water fluxes documented by Wolf et al. (2016). With the onset of drought at the severely drought-affected site US-MMS, both NIRV and GPP rapidly declined in parallel, resulting in a NIRV–GPP relationship similar to that of nondrought years (Figure S14a,b). Thus, the coupling between the components of canopy structure that influence NIR reflectance and stress-constrained canopy photosynthetic capacity remains strong even at the short timescale of acute stress events. Notably, NDVI showed little deviation compared to nondrought years during the same period (Figure S14c). The extent of the coupling between canopy structure and productivity at subannual timescales likely varies by ecosystem type, making the study of NIRV–GPP dynamics under drought conditions an important area of future study.
On an instantaneous basis, environmental factors like water, light, and temperature combine with leaf-level biochemical capacity to dictate the rate of photosynthesis (Farquhar, von Caemmerer, & Berry, 1980). The accuracy of NIRV for estimating GPP, without the need for additional inputs like total incoming radiation (Figure 2), does not imply that environmental factors are irrelevant to photosynthesis, but rather that, when integrated over the appropriate time interval, canopy architecture and the physiological controls on photosynthesis are coordinated. This interpretation of the NIRV–GPP relationship also helps explain why including meteorological data does little to improve the accuracy of NIRV-derived GPP estimates. If integrated levels of temperature, light, and water availability (as well as nutrients) jointly determine canopy development and physiological potential, then canopy structure, as summarized by NIRV, should contain the information necessary to accurately estimate GPP. The minor improvement from including meteorological data likely indicates that no single linear relationship between one or even multiple meteorological variables accounts for the large number of possible combinations of meteorology and plant response (Figure 2; Table S3).
A major strength of the NIRV approach is that it allows statistically valid error propagation (Figure 4). More complicated approaches for upscaling GPP make it difficult to accurately partition sources of error, especially model structural errors and errors due to input uncertainties. FLUXCOM, for example, functionally operates as a black box, limiting our ability to draw biological inferences about the global controls of GPP from the model itself. With the NIRV-based approach, three sources of error warrant consideration. First, it could be the case that even though NIRV captures many of the controls of GPP, the slowly shifting integrator of NIRV might contain delays and inconsistencies that introduce uncertainties in the NIRV–GPP relationship. Second, the coordination of structure and physiology might be imprecise, failing to account for some of the factors that influence GPP. Third, there are almost certainly measurement errors in the NIRV and GPP datasets used for calibration. The latter two possibilities are strongly suggested by the predominance of site-level error (Figure 4b; Figure S5), which indicates that either the physiology controlling the NIRV–GPP relationship varies from site to site or that the NIRV measurements and/or GPP measurements used for calibration lack consistency across space. As a result, efforts to improve both the robustness of measurements of NIRV (e.g., better cloud filtering) and eddy covariance derived estimates of GPP (e.g., how GPP is partitioned from net ecosystem exchange, the mismatch between flux footprints and satellite measurements) are essential to minimizing site-level error.
A clear illustration of problems with the MODIS data used to calculate NIRV comes from GF-Guy, an eddy covariance site in French Guiana. GPP fluxes at GF-Guy varied less than 20% month to month, while NIRV varied by a factor of 3 (Figure 6a), which suggests errors in MODIS observations at the site. A likely explanation is cloud contamination, as remote sensing in the tropics is notoriously plagued by clouds. To investigate this, we used the multiangle implementation of atmospheric correction for MODIS (MAIAC) data product, newly available for selected sites.

MAIAC uses atmospheric modeling to remove aerosols, subpixel clouds, and other artifacts from MODIS satellite imagery (Lyapustin, Martonchik, Wang, Laszlo, & Korkin, 2011). The variability of NIRV dramatically decreased with the MAIAC data (Figure 6a). In fact, MAIAC-derived NIRV had a smaller dynamic range than measured GPP, strongly indicating cloud contamination of the baseline MODIS dataset at GF-Guy and, in all likelihood, throughout the tropics. Unfortunately, the 250 m resolution MAIAC data needed to perform site-level calibration are not yet available for all FLUXNET sites. Cloud contamination in the MODIS data likely causes systematic underestimation of NIRV throughout the tropics, which in turn would bias our median global GPP estimate downward and make 147 Pg C/year a conservative estimate of global GPP.
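The GF-Guy diagnosis rests on comparing how much NIRV and GPP each vary from month to month. A minimal sketch of that comparison follows. The function names, the `tolerance` threshold, and the monthly values are all hypothetical and are not data from the site.

```python
def dynamic_range(monthly_values):
    """Ratio of the largest to the smallest value in a monthly series."""
    smallest, largest = min(monthly_values), max(monthly_values)
    if smallest <= 0:
        raise ValueError("dynamic range is undefined for non-positive values")
    return largest / smallest


def flag_suspect_reflectance(gpp, nirv, tolerance=1.5):
    """Flag a site when NIRV swings much more than tower GPP does.

    The tolerance is an arbitrary illustrative threshold.
    """
    return dynamic_range(nirv) > tolerance * dynamic_range(gpp)


# Made-up monthly series: GPP varies by under 20%, raw NIRV by about 3x.
gpp = [8.0, 8.5, 9.0, 9.4]
nirv_raw = [0.10, 0.18, 0.30, 0.22]
nirv_corrected = [0.27, 0.28, 0.30, 0.29]

raw_is_suspect = flag_suspect_reflectance(gpp, nirv_raw)
corrected_is_suspect = flag_suspect_reflectance(gpp, nirv_corrected)
```

A screen like this needs only the two time series, so it could in principle be run at every calibration site to find candidates for better atmospheric correction.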
Fundamental differences in plant physiology can also contribute to site uncertainty. One clear candidate is the difference between C3 and C4 photosynthesis. C4 plants fix CO2 more efficiently than C3 plants, which should cause a steeper slope in the NIRV–GPP relationship, all else equal. Tests at a trio of Nebraskan eddy covariance towers that annually rotate between soy (C3) and corn (C4) crops reveal significant differences in the NIRV–GPP slope with crop type (Figure 6b). Including information on the distribution of C3 and C4 vegetation across both wild and managed ecosystems should decrease uncertainty. It would also likely increase the median estimate of GPP, as C3 sites comprise the majority of the calibration dataset, further emphasizing the conservative nature of the 147 Pg C/year estimate of GPP.
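The crop-rotation test amounts to fitting a separate NIRV–GPP slope for each photosynthetic pathway and comparing the two. A minimal sketch follows. The zero-intercept least-squares fit is an assumption made here for brevity and may differ from the regression used in the study. The function names and records are hypothetical.

```python
from collections import defaultdict


def slope_through_origin(nirv, gpp):
    """Least-squares slope of gpp = slope * nirv with no intercept."""
    sum_xx = sum(x * x for x in nirv)
    if sum_xx == 0:
        raise ValueError("need at least one non-zero NIRV value")
    return sum(x * y for x, y in zip(nirv, gpp)) / sum_xx


def slopes_by_pathway(records):
    """Fit one slope per pathway from (pathway, nirv, gpp) records."""
    grouped = defaultdict(lambda: ([], []))
    for pathway, x, y in records:
        grouped[pathway][0].append(x)
        grouped[pathway][1].append(y)
    return {p: slope_through_origin(xs, ys) for p, (xs, ys) in grouped.items()}


# Made-up records in arbitrary units, not measurements from the Nebraska towers.
records = [
    ("C3", 0.10, 4.0), ("C3", 0.20, 8.0), ("C3", 0.30, 12.0),
    ("C4", 0.10, 6.0), ("C4", 0.20, 12.0), ("C4", 0.30, 18.0),
]
slopes = slopes_by_pathway(records)
```

If a map of C3 and C4 cover were available, each pixel's NIRV could be scaled with its own pathway's slope rather than with a pooled slope that C3 sites dominate.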
A third advantage of the NIRV approach is that it can be calculated from existing high-resolution, widely available satellite imagery. This makes NIRV immediately available for detailed studies and trend analyses at a wide variety of spatial and temporal scales, from individual study sites to the entire globe (Figures 1 and 3). Our approach for estimating GPP from NIRV could also be applied to the full Landsat and MODIS records, as well as the 39-year record of the Advanced Very High Resolution Radiometer series of sensors (Tucker et al., 2005). Finally, the ease of measuring NIRV allows researchers to make inexpensive, canopy-scale spectral measurements that are directly comparable with satellite data, facilitating efforts to bridge spatial scales.
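The simplicity of the calculation can be shown in a few lines. The sketch below assumes the commonly used definition of NIRV as NDVI multiplied by total near-infrared reflectance. This section does not restate the formula, so any offset or band choice should be checked against the methods. The function names and reflectance values are illustrative.

```python
def ndvi(red, nir):
    """Normalized difference vegetation index from red and NIR reflectance."""
    total = nir + red
    if total == 0:
        raise ValueError("red and NIR reflectance cannot both be zero")
    return (nir - red) / total


def nirv(red, nir):
    """NIRV under the assumed definition: NDVI times NIR reflectance."""
    return ndvi(red, nir) * nir


# Made-up surface reflectances for a dense canopy (unitless, 0 to 1).
example_nirv = nirv(red=0.05, nir=0.45)
```

Only red and near-infrared reflectance are required, so the same function applies to any sensor that records those two bands, whether on a satellite or on a handheld spectrometer.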
To conclude, NIRV provides a new, largely independent approach for estimating global GPP with excellent performance at FLUXNET calibration sites. The median estimate from this approach, 147 Pg C/year, is higher than recent estimates from bottom-up process-based models but lower than global constraints from oxygen isotopes. Correcting known sources of uncertainty will likely increase the median estimate. In addition to high accuracy at calibration sites, the approach combines simple calculation, robust error propagation, and the ability to utilize decades of historical remote sensing data. Future refinements of the NIRV-based approach can come from improved remote sensing inputs and inclusion of additional physiological processes.
CONFLICT OF INTEREST
The authors declare no conflicts of interest.