Carbon-based phytoplankton size classes retrieved via ocean color estimates of the particle size distribution

Owing to their important roles in biogeochemical cycles, phytoplankton functional types (PFTs) have been the aim of an increasing number of ocean color algorithms. Yet, none of the existing methods are based on phytoplankton carbon (C) biomass, which is a fundamental biogeochemical and ecological variable and the “unit of accounting” in Earth system models. We present a novel bio-optical algorithm to retrieve size-partitioned phytoplankton carbon from ocean color satellite data. The algorithm is based on existing methods to estimate particle volume from a power-law particle size distribution (PSD). Volume is converted to carbon concentrations using a compilation of allometric relationships. We quantify absolute and fractional biomass in three PFTs based on size – picophytoplankton (0.5–2 μm in diameter), nanophytoplankton (2–20 μm) and microphytoplankton (20–50 μm). The mean spatial distributions of total phytoplankton C biomass and individual PFTs, derived from global SeaWiFS monthly ocean color data, are consistent with current understanding of oceanic ecosystems, i.e., oligotrophic regions are characterized by low biomass and dominance of picoplankton, whereas eutrophic regions have high biomass to which nanoplankton and microplankton contribute relatively larger fractions. Global climatological, spatially integrated phytoplankton carbon biomass standing stock estimates using our PSD-based approach yield ∼ 0.25 Gt of C, consistent with analogous estimates from two other ocean color algorithms and several state-of-the-art Earth system models. Satisfactory in situ closure observed between PSD and POC measurements lends support to the theoretical basis of the PSD-based algorithm.
Uncertainty budget analyses indicate that absolute carbon concentration uncertainties are driven by the PSD parameter N_o, which determines particle number concentration to first order, while uncertainties in PFTs’ fractional contributions to total C biomass are mostly due to the allometric coefficients. The C algorithm presented here, which is not empirically constrained a priori, partitions biomass in size classes and introduces improvement over the assumptions of the other approaches. However, the range of phytoplankton C biomass spatial variability globally is larger than estimated by any other models considered here, which suggests an empirical correction to the N_o parameter is needed, based on PSD validation statistics. These corrected absolute carbon biomass concentrations validate well against in situ POC observations.


T. S. Kostadinov et al.: Carbon-based phytoplankton size classes
fluences (e.g., Falkowski and Oliver, 2007) and can be influenced by (e.g., Marinov et al., 2013; Cabré et al., 2014) climate (and shorter-term processes such as seasonality; e.g., Kostadinov et al., 2016a). Therefore, detailed characterization of the structure and function of oceanic ecosystems (i.e., descriptive and predictive understanding of the PFTs) is required as a crucial component of Earth system and climate modeling.
Operational quantification of the PFTs on the required spatiotemporal scales can only be achieved via remote sensing. Remote-sensing reflectance as a function of wavelength, R_rs(λ), quantifies ocean color; the canonical derived variable has been chlorophyll concentration (Chl) in surface waters (e.g., O'Reilly et al., 1998; Maritorena et al., 2002), interpreted as a proxy for phytoplankton biomass. However, total Chl does not provide a full description of the state of the ecosystem, since physiological acclimation to differing light levels can cause the ratio of intracellular Chl to carbon (C) concentrations to change, confounding interpretation of changes in Chl (Geider et al., 1987, 1998; Behrenfeld et al., 2005). It is carbon biomass in the living phytoplankton that is the variable of more direct relevance to the carbon cycle, other biogeochemical cycles and climate. It is also the tracer variable most commonly used in biogeochemical routines of climate models (e.g., Gregg, 2008; Dunne et al., 2013). In addition, a more complete characterization of an oceanic ecosystem also necessitates partitioning of the carbon biomass into the different PFTs comprising the ecosystem. The Chl : C ratio itself can be used as a proxy for physiological status and independent assessments of Chl and C allow for the building of carbon-based productivity models (Behrenfeld et al., 2005; Westberry et al., 2008).
In light of the above, recent ocean color algorithm developments have provided products beyond Chl. First, multiple algorithms for the estimation of various PFTs have been developed (IOCCG, 2014). Some algorithms retrieve multiple PFT groups using differential absorption (Bracher et al., 2009) or second-order anomalies of the reflectance spectra (Alvain et al., 2008). Others (e.g., Brewin et al., 2010; Hirata et al., 2011; Uitz et al., 2006) are based on total (Chl) abundance and the ecological premise that smaller cells are associated with oligotrophic conditions whereas larger cells are associated with eutrophic conditions (Chisholm, 1992). Yet another class of algorithms relies on various spectral features, either absorption (Ciotti and Bricaud, 2006; Mouw and Yoder, 2010; Roy et al., 2013), or backscattering (Kostadinov et al., 2009, 2010) or both (Fujiwara et al., 2011). A summary of the available algorithms and their technical basis can be found in IOCCG (2014) and Hirata (2015). Of particular importance is that none of the existing algorithms retrieve C or base their PFT/PSC retrievals on total or fractional C content per PFT. Second, algorithms have been developed to retrieve particulate organic carbon (POC, e.g., Stramski et al., 2008; henceforth, S08). However, these are empirical band-ratio algorithms whose output is expected to be tightly correlated with Chl, which is derived in a similar way. The retrieval of just the living phytoplankton carbon concentration represents significant progress (Behrenfeld et al., 2005; henceforth, B05). However, the B05 method is based on a constant empirical scaling with particulate backscattering at 440 nm (b_bp(440)), which does not take into account the effects of variable particle size distributions (PSDs). Changes in the PSD will change the backscattering per unit C biomass due to different scattering efficiencies (e.g., Stramski and Kiefer, 1991; Kostadinov et al., 2009).
Recent advances allow for the quantification of the PSD from ocean color satellite data and thus the estimation of particulate volume in any size class (Kostadinov et al., 2009; henceforth, KSM09; Kostadinov et al., 2010). Henceforth, this PSD algorithm is referred to as the KSM09 algorithm. Here, we leverage the KSM09 algorithm and an existing compilation of allometric relationships that link cellular C content to cellular volume (Menden-Deuer and Lessard, 2000; henceforth, MDL2000), in order to (1) estimate total phytoplankton C biomass using the power-law PSD parameters as input and (2) recast the volume-based PSCs of the KSM09 algorithm in terms of C biomass instead of biovolume. The effects of variable PSD have been taken into account for the first time, relaxing the assumption of a constant backscattering to carbon relationship. Importantly, to our knowledge this is the first attempt to provide size class partitioning of phytoplankton C biomass from space. We first present the methodology and apply the algorithm to SeaWiFS global monthly reflectance data, focusing on climatological patterns and comparison with existing phytoplankton carbon estimates and Earth system model results. We then assess global mixed-layer phytoplankton biomass stock and compare to existing estimates. We quantify partial uncertainties on a per-pixel basis by propagating existing input parameter uncertainties to the C-based products. In addition, we present an in situ POC-PSD closure analysis as verification of the method, propose an empirical correction to the algorithm to improve absolute carbon estimates and validate our results using in situ POC measurements.

Step 1: retrieval of suspended particulate volume from ocean color remote sensing data
We first quantify the volume concentration of suspended particulate matter from ocean color data by applying the KSM09 algorithm to estimate the parameters of an assumed power-law particle size distribution. These parameters are retrieved using lookup tables (LUTs) constructed using Mie theory of scattering (Mie, 1908). The LUTs relate the spectral shape and magnitude of the particulate backscattering coefficient at blue-green wavelengths (b_bp(λ), m⁻¹) to the power-law slope ξ (unitless) of the PSD and the differential number concentration of suspended particles at a reference diameter (here, D_o = 2 µm), N_o (m⁻⁴) (Junge, 1963; Boss et al., 2001; KSM09):
$$N(D) = N_o \left(\frac{D}{D_o}\right)^{-\xi} \qquad (1)$$
In Eq. (1), D (m) is the equivalent spherical diameter (ESD) (Jennings and Parslow, 1988) and N(D) (m⁻⁴) is the differential number concentration of particles of diameter D.
Volume concentration (m³ of particles per m³ of seawater) can be computed from the PSD as (Kostadinov et al., 2010):
$$V = \int_{D_{\min}}^{D_{\max}} \frac{\pi}{6} D^{3} N_o \left(\frac{D}{D_o}\right)^{-\xi} \mathrm{d}D \qquad (2)$$
Note that Eq. (2) is an estimate of the volume of all backscattering in-water constituents in a given size range because the KSM09 algorithm uses total backscattering for the retrieval.
Even though the power-law PSD is considered a simple two-parameter model, in reality it has four parameters, because in practical applications the upper and lower limits of integrals such as Eq. (2) need to be known (Boss et al., 2001).
Assuming biogenic origin of scattering particles, Kostadinov et al. (2010) developed a novel method of estimating three PSCs, defining each class as its fractional contribution to total biovolume.
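As an illustration of Eqs. (1) and (2), the following minimal Python sketch evaluates the power-law volume integral in closed form and derives volume-based size-class fractions. It is not the KSM09 implementation: the function names and the example values of N_o and ξ are hypothetical, and only the reference diameter (2 µm) and the size-class limits are taken from the text.

```python
import math

D_O = 2e-6  # reference diameter D_o (m), as stated in the text

# Size-class limits (m) quoted in the text
SIZE_CLASSES = {
    "pico": (0.5e-6, 2e-6),
    "nano": (2e-6, 20e-6),
    "micro": (20e-6, 50e-6),
}


def power_law_volume(n_o, xi, d_min, d_max, d_o=D_O):
    """Volume concentration (m^3 particles per m^3 seawater), Eq. (2).

    Closed-form integral of (pi/6) D^3 N_o (D/D_o)^(-xi) over [d_min, d_max].
    """
    prefactor = (math.pi / 6.0) * n_o * d_o ** xi
    p = 4.0 - xi  # exponent obtained after integrating D^(3 - xi)
    if abs(p) < 1e-12:
        integral = math.log(d_max / d_min)  # special case xi = 4
    else:
        integral = (d_max ** p - d_min ** p) / p
    return prefactor * integral


def volume_fractions(n_o, xi):
    """Fractional contribution of each size class to 0.5-50 um biovolume."""
    volumes = {name: power_law_volume(n_o, xi, lo, hi)
               for name, (lo, hi) in SIZE_CLASSES.items()}
    total = sum(volumes.values())
    return {name: v / total for name, v in volumes.items()}


if __name__ == "__main__":
    # n_o and xi are arbitrary illustrative inputs, not retrieved values
    print(volume_fractions(n_o=1e16, xi=4.0))
```

Because N_o multiplies every class equally, it drops out of the fractions; only ξ and the integration limits matter for the volume-based PSCs in this sketch.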

Step 2: retrieval of size-partitioned absolute and fractional phytoplankton carbon biomass
Estimation of carbon concentration follows the methodology first outlined in Kostadinov (2009). The volume-to-carbon allometric relationships compiled by MDL2000 are used to quantify POC by converting the volume estimates of Eq. (2) to C concentration. The relationships in MDL2000 have the general form:
$$C_{\mathrm{cell}} = a V_{\mathrm{cell}}^{\,b} \qquad (3)$$
where C_cell is cellular carbon content (pg C cell⁻¹), a and b are group-specific constants and V_cell is cell volume (µm³).
Incorporating the allometric relationship of Eq. (3) into Eq. (2) yields an estimate of particulate carbon mass concentration (i.e., POC) in a given size range, D_min to D_max. The carbon biomass of living phytoplankton only (C, mg m⁻³) can then be estimated by multiplication by 1/3:
$$C = \frac{1}{3} \times 10^{-9} \int_{D_{\min}}^{D_{\max}} a \left(10^{18}\,\frac{\pi}{6} D^{3}\right)^{b} N_o \left(\frac{D}{D_o}\right)^{-\xi} \mathrm{d}D \qquad (4)$$
The factor of 1/3 is used because it is approximately in the middle of the published range for the phytoplankton C : POC ratio in ocean regions of variable trophic status (0.14-0.49) (B05; DuRand et al., 2001; Eppley et al., 1992; Gundersen et al., 2001; Oubelkheir et al., 2005). The factors 10⁻⁹ and 10¹⁸ are applied in Eq. (4) for conversion from picogram (Eq. 3) to milligram of C and from m³ to µm³, respectively. The formulation of Eq. (4) allows phytoplankton carbon biomass to be estimated for any size range.
Here, we partition the biomass in three classical phytoplankton size classes (PSCs, Sieburth et al., 1978): picoplankton (0.5 µm ≤ D ≤ 2 µm), nanoplankton (2 µm ≤ D ≤ 20 µm) and microplankton (20 µm ≤ D ≤ 50 µm). The three PSCs are expressed as relative fractions of total phytoplankton C biomass, by dividing the PSC's biomass by total biomass in the 0.5-50 µm range. This expression of the PSCs is a recast of the volume-fraction-based PSCs of KSM09 in terms of carbon biomass. Further details of application of the MDL2000 allometric relationships are given in Sect. S1.1 in the Supplement.
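The carbon conversion and size partitioning described in this step can be sketched numerically as follows. This is a hedged illustration, not the operational algorithm: the allometric coefficients `A_ALLO` and `B_ALLO` are placeholder values chosen for demonstration (the group-specific values actually applied are described in the Supplement and are not reproduced here), the function names are hypothetical, and the integral is approximated with a simple midpoint rule in ln(D).

```python
import math

D_O = 2e-6      # reference diameter D_o (m), from the text
A_ALLO = 0.216  # placeholder allometric coefficient a (assumption)
B_ALLO = 0.939  # placeholder allometric exponent b (assumption)

SIZE_CLASSES = {
    "pico": (0.5e-6, 2e-6),
    "nano": (2e-6, 20e-6),
    "micro": (20e-6, 50e-6),
}


def cell_carbon_pg(d, a=A_ALLO, b=B_ALLO):
    """Eq. (3): carbon per cell (pg C) for a sphere of diameter d (m)."""
    v_cell_um3 = 1e18 * (math.pi / 6.0) * d ** 3  # m^3 -> um^3
    return a * v_cell_um3 ** b


def phyto_carbon(n_o, xi, d_min, d_max, a=A_ALLO, b=B_ALLO,
                 d_o=D_O, n_steps=4000):
    """Eq. (4): phytoplankton carbon (mg C m^-3) in [d_min, d_max].

    Midpoint rule in u = ln(D), which suits a power-law integrand.
    """
    u_min, u_max = math.log(d_min), math.log(d_max)
    du = (u_max - u_min) / n_steps
    total_pg = 0.0
    for i in range(n_steps):
        d = math.exp(u_min + (i + 0.5) * du)
        n_d = n_o * (d / d_o) ** (-xi)                       # Eq. (1), m^-4
        total_pg += cell_carbon_pg(d, a, b) * n_d * d * du   # dD = D du
    return (1.0 / 3.0) * 1e-9 * total_pg  # pg -> mg, then C : POC = 1/3


def carbon_size_classes(n_o, xi):
    """Absolute (mg C m^-3) and fractional carbon per size class."""
    absolute = {name: phyto_carbon(n_o, xi, lo, hi)
                for name, (lo, hi) in SIZE_CLASSES.items()}
    total = sum(absolute.values())
    fractional = {name: c / total for name, c in absolute.items()}
    return absolute, fractional


if __name__ == "__main__":
    # n_o and xi are arbitrary illustrative inputs, not retrieved values
    absolute, fractional = carbon_size_classes(n_o=1e16, xi=4.0)
    print(absolute)
    print(fractional)
```

The fractions are computed against the 0.5-50 µm total, mirroring the definition of the carbon-based PSCs given above.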

Input ocean color satellite data
Global mapped monthly composites of remote sensing reflectance R_rs(λ) (sr The monthly R_rs(λ) maps were used to retrieve the spectral particulate backscattering coefficient (b_bp(λ), m⁻¹, λ same as for the input reflectances), using the algorithm of Loisel and Stramski (2000) and Loisel et al. (2006) (henceforth, the LAS2006 algorithm), with a solar zenith angle (SZA) of 0° because the input R_rs(λ) are fully normalized. The spectral slope of b_bp(λ), η, was calculated using a linear regression on the log-transformed data at the 490, 510 and 555 nm bands. The KSM09 algorithm (Sect. 2.1.1) was then applied to η and b_bp at 443 nm in order to obtain the PSD parameters ξ and N_o, which were subsequently used in Eq. (4) (specifically as shown in Eq. (S1) in the Supplement) to obtain monthly 9 km maps of total and PSC-partitioned absolute and fractional C biomass.
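The spectral-slope step can be illustrated with a short sketch. It assumes the common convention b_bp(λ) ∝ λ^(-η), so that η is the negative of the slope of an ordinary least-squares fit of log10(b_bp) against log10(λ); the function name and the synthetic spectrum are hypothetical and serve only to show the regression on the three bands named above.

```python
import math


def backscatter_slope(wavelengths_nm, bbp):
    """Spectral slope eta, assuming bbp(lambda) ~ lambda^(-eta).

    Ordinary least squares of log10(bbp) on log10(lambda); eta = -slope.
    """
    x = [math.log10(w) for w in wavelengths_nm]
    y = [math.log10(b) for b in bbp]
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    return -sxy / sxx


if __name__ == "__main__":
    bands = [490.0, 510.0, 555.0]  # bands named in the text (nm)
    # Synthetic spectrum with a known slope of 1.5 (illustrative only)
    bbp = [0.002 * (w / 443.0) ** -1.5 for w in bands]
    print(backscatter_slope(bands, bbp))
```

In the workflow described above, η and b_bp at 443 nm would then be passed to the KSM09 lookup tables, which are not reproduced in this sketch.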

Additional methods information
Additional details of the methodology are provided in the Supplement Sect. S1. Specifically, Sect. S1.1 presents details of the application of the MDL2000 allometric relationships. Total phytoplankton carbon was also derived from the output of a group of Earth system simulations from the recent Coupled Model Intercomparison Project CMIP5 (Taylor et al., 2012). Details of the methods are provided in Sect. S1.2. Section S1.3 presents the methods for an entirely in situ POC-PSD closure analysis. Section S1.4 details the propagation of uncertainty to the carbon-based products and the composite (averaged) images calculated from monthly input data. Section S1.5 provides details of algorithm output analyses and additional ancillary data sets used. Importantly, details are presented on the computation of global carbon biomass stock within the mixed layer, using the PSD/allometric phytoplankton carbon retrievals presented here. Section S1.6 describes the methodology used for validation of total phytoplankton carbon using matchups between empirically corrected (see Sect. 3.7) SeaWiFS retrievals and in situ POC measurements provided by the SeaBASS database. Results of this validation are discussed in Sect. 3.7.

Global phytoplankton carbon biomass from SeaWiFS observations and CMIP5 models
The mission climatology of total phytoplankton carbon (C) (Fig. 1a) indicates that biomass is lowest in the oligotrophic subtropical gyres, while higher values occur in more eutrophic regions, such as the equatorial and eastern-boundary currents, other upwelling regions and high-latitude oceans. This general pattern corresponds to first order to the climatological Chl spatial patterns (Fig. S2 in the Supplement) and is consistent with current oceanic ecosystem understanding (e.g., Longhurst, 2007). Comparisons with two existing satellite methods (the B05 values, Fig. 1b; the S08 POC retrievals divided by 3, Fig. 1c) reveal that the PSD-based approach for quantifying C biomass results in a significantly wider range of spatial variability, as illustrated also by the histograms in Fig. S3. The PSD-based biomass estimates are the lowest in the subtropical oligotrophic gyres (by about an order of magnitude) and generally highest (typically by less than an order of magnitude) in more productive areas. The three methods are in relatively good agreement in the Pacific equatorial upwelling region. A considerable difference also exists between the B05 and the S08-based values: the former vary the least spatially, mostly due to relatively high biomass estimates in the subtropical oligotrophic gyres.
While it is likely that the PSD-based values in the oligotrophic gyres are underestimated and values in some eutrophic areas are overestimated, a global validation with concurrent field measurements of phytoplankton C biomass (total or partitioned) is not feasible at present, since in situ analytical measurements of phytoplankton carbon are difficult and have been made possible only recently by emerging techniques (Graff et al., 2012, 2015). The S08 method was developed with in situ POC and reflectance data, and the constant conversion factor in B05 was chosen empirically, so these algorithms are designed a priori to match in situ measurements. The method presented here is derived mostly from theory (apart from the allometric relationships) and is not subject to such constraints (Sect. 3.6). Importantly, even if the absolute carbon concentration values are inaccurate, the PSCs expressed as percent contribution to C biomass should still be reliable and subject to much less uncertainty (Sects. 3.3 and 3.6). The PSC fractions can also be used with other absolute carbon estimates. An empirical correction to address the spatial exaggeration of absolute carbon concentrations is presented in Sect. 3.7 together with a validation for corrected total phytoplankton carbon estimates using in situ POC measurements.
Some degree of exaggeration of the global range of values of the PSD-based mean algal biomass field (Fig. 1a) as compared to the approach of B05 (Fig. 1b) is expected because the former relaxes the assumption of a constant conversion factor in B05 by taking into account the varying backscattering per unit cell volume and carbon. According to Mie theory calculations, b_bp(λ) normalized to volume of particles in the 0.5-50 µm range is 3 orders of magnitude higher when the PSD slope ξ = 6, as compared to when ξ = 3 (not shown). Thus, the same backscattering coefficient will be attributed to less particle total volume if the particles are relatively smaller in size (higher ξ). Since PSD slopes are highest in the oligotrophic gyres (KSM09), the PSD-based approach is expected to exhibit smaller total volume of particles and thus smaller carbon concentrations as compared to the direct scaling with b_bp(443) in B05.
The CMIP5 models' ensemble mean of phytoplankton C biomass (Fig. 1d) is independent of the satellite data sets (refs. in Table S3) and resembles the S08 POC-based estimate the most in spatial patterns and values, with somewhat lower values in the subtropical gyres, but not quite as low as the PSD-based method (Fig. 1a). Notably, the models yield higher values in the Pacific equatorial upwelling zone than any of the satellite data sets.

Global phytoplankton biomass stock
Estimates of total global phytoplankton biomass stock (Sect. S1.5) from the three satellite methods and the CMIP5 models (using the SeaWiFS mission climatological fields) are remarkably consistent (Fig. 2), yielding ∼ 0.2-0.3 Gt C standing biomass stock (1 gigaton (Gt) = 10¹² kg). Biomass in open ocean areas (with the continental shelves excluded) accounts for most global biomass according to all estimates. However, the models attribute very little biomass to the shelves as compared to the satellite methods, which is probably due to the lower underlying spatial resolution of the models. Since satellite algorithms are generally subject to higher uncertainties in coastal zones, it is best to develop technology to measure C biomass in situ (Graff et al., 2012, 2015) and inform both satellite algorithms and biogeochemical models. The satellite estimates in Fig. 2

Figure 2. Global spatially integrated mixed-layer phytoplankton carbon biomass stock (Gt C), as estimated with three different satellite algorithms (as in Fig. 1a-c) from the SeaWiFS mission composite and from the CMIP5 model ensemble mean (Fig. 1d), using the same climatological MLD estimate for all estimates. Horizontal black lines within each bar on all panels represent the estimate when continental shelves (< 200 m depth) are excluded. The sum of the areas of valid pixels used in the estimates is given as a percentage of total ocean area (3.608 × 10⁸ km²) and area excluding the shelves (∼ 3.4 × 10⁸ km²), respectively.
respective ocean areas participate in the estimate (Fig. 2). However, some bias remains because high latitudes are observable only in summer months (Fig. S4). Monthly estimates of the global phytoplankton carbon stock are discussed in Supplement Sect. S2. It is notable that the novel PSD-based method is not empirically restricted or tuned a priori and yields reasonable estimates. Admittedly, this global spatially integrated result may be fortuitous due to cancellation of uncertainties with opposite signs in the oligotrophic vs. eutrophic areas, so it is not claimed that this result necessarily constitutes algorithm verification (also see Sect. 3.7 and Fig. S7). Previous estimates of global phytoplankton C stock use different methodologies and range from 0.30 to 0.86 Gt C (Antoine et al., 1996; Behrenfeld and Falkowski, 1997b; Le Quéré et al., 2005). Further discussion of these estimates and the effects of MLD assumptions is provided in Sect. S2.
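The spatial integration behind a mixed-layer stock estimate can be sketched generically as a sum over pixels of concentration × mixed-layer depth × pixel area. The sketch below is not the procedure of Sect. S1.5, which is not reproduced here; it assumes that the surface concentration is uniform throughout the mixed layer, that invalid pixels are marked with `None`, and that pixel areas are supplied by the caller. The function name is hypothetical.

```python
MG_PER_GT = 1e18  # 1 Gt = 1e12 kg = 1e18 mg


def global_stock_gt(carbon_mg_m3, mld_m, area_m2):
    """Mixed-layer carbon stock (Gt C) and the area of valid pixels (m^2).

    Each pixel contributes C (mg m^-3) x MLD (m) x area (m^2); pixels with
    a missing concentration or mixed-layer depth are skipped.
    """
    total_mg = 0.0
    valid_area = 0.0
    for c, h, a in zip(carbon_mg_m3, mld_m, area_m2):
        if c is None or h is None:
            continue
        total_mg += c * h * a
        valid_area += a
    return total_mg / MG_PER_GT, valid_area


if __name__ == "__main__":
    # One illustrative "pixel" covering the total ocean area quoted in the
    # Fig. 2 caption (3.608e8 km^2 = 3.608e14 m^2); C and MLD are made up.
    stock, area = global_stock_gt([10.0], [50.0], [3.608e14])
    print(stock, area)
```

Returning the valid area alongside the stock makes it straightforward to report coverage as a percentage of total ocean area, as is done in Fig. 2.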

Size-partitioned biomass
Maps of absolute C biomass partitioned among picoplankton (Fig. 3a), nanoplankton (Fig. 3b) and microplankton (Fig. 3c) reveal a general global spatial pattern for all three size classes similar to the global total distribution (Fig. 1a), namely the lowest biomass values are encountered in the oligotrophic gyres, whereas higher latitudes, coastal and upwelling areas exhibit higher biomass. According to contemporary understanding of oceanic ecosystems (e.g., Uitz et al., 2010), we expect large cells (such as diatoms) to be opportunistic, responding via strong localized blooms to changes in nutrient inputs or grazing. This opportunistic response, which contrasts with the adaptation of the smaller picoplankton to constant environmental conditions, explains the widely different spatial and temporal variability of these groups. Accordingly, we find that the range of spatial variability of carbon for picoplankton (< 3 orders of magnitude) is much smaller than the range of variability for nanoplankton (∼ 4) and especially microplankton (∼ 5 orders of magnitude) (Fig. S5). Negligible biomass is found in microplankton for most of the ocean area, except for eutrophic areas characterized by seasonal blooms and/or higher overall productivity such as the Equatorial Upwelling, whereas picoplankton are more globally ubiquitous.
The fractional contribution of each PSC to total C biomass reveals the climatological dominance of each group in the various oceanic regions (Fig. 4). Picoplankton emerge as the dominant size group in oligotrophic areas (Fig. 4a), because their large cellular surface-area-to-volume ratio enables them to acquire scarce nutrients very efficiently (Agawin et al., 2000; Falkowski and Oliver, 2007). By contrast, larger phytoplankton contribute relatively more biomass in the regions where nutrients are generally more abundant, because they can take up nutrients at a faster rate and store them inside vacuoles as a reserve for less favorable spells (e.g., Eppley and Peterson, 1979; Chisholm, 1992; Falkowski et al., 1998; Falkowski and Oliver, 2007). Together, nano- and microplankton achieve dominance (between 50 and 90 %) along the Antarctic coastline, in much of the zone between ∼ 40° S and ∼ 50° S (in the South Atlantic, the southwestern Indian Ocean, southeast of Australia and east of New Zealand), along the eastern boundaries of the Pacific and Atlantic oceans, in the northwestern Arabian Sea and almost everywhere north of ∼ 40° N.
The total biomass patterns in the Southern Ocean (Fig. 1a) are characterized by more or less continuous bands of high biomass (a) along the frontal structures around 40-45° S, a transitional region from the iron-limited upwelling regime in the south to the nitrate-limited downwelling subtropical gyres in the north and (b) in the marginal sea ice regions next to the Antarctic continent, where continental iron (Fe) inputs likely result in biomass and production spikes during the spring and summer. Both these large-biomass bands tend to be dominated by the larger opportunistic groups of nano- and microplankton (Fig. 4b-c). In between these two bands of high production we find a relatively lower biomass band from roughly 50-60° S, where picoplankton thrive (Fig. 4a). The lower total biomass here is probably due to a combination of iron limitation and deep summertime mixed layers, resulting in strong light limitation during the growing season. Large areas in the Southern Hemisphere are characterized by lower total (Fig. 1a) and group-specific C biomass (Fig. 3a-c), as compared to the Northern Hemisphere. This interhemispheric disproportionality is dominated by high-latitude summer values (not shown) and is in agreement with findings that the Southern Ocean sustains relatively low phytoplankton biomass, in spite of high ambient macronutrient concentrations (e.g., Dugdale and Wilkerson, 1991).
We emphasize that our methodology is unique in its ability to partition phytoplankton carbon biomass in any desired size classes. It essentially represents a recast of the biovolume-based PSC/PFT definition of Kostadinov et al. (2010) that is also based on the KSM09 PSD retrieval. The effect of recasting to carbon using the allometric relationships is illustrated in Fig. 5, and further discussion is provided in Sect. S3. Comparison with other PFT algorithms is outside the scope of this work, but summaries of the available algorithms can be found in IOCCG (2014) and Hirata (2015). Kostadinov et al. (2016a) compare phenological parameters among 10 PFT algorithms and 7 CMIP5 models as part of the PFT Intercomparison Project (Hirata et al., 2012; Hirata, 2015).

In situ POC-PSD closure
[…], nanoplankton (green) and microplankton (blue), to total phytoplankton carbon biomass (solid lines) and to total biovolume concentration (dashed lines), as functions of the PSD slope ξ. Limits of integration are the operational limits as indicated in Figs. 3 and 4, and Sect. 2.1.2 (also see Sect. S1.1). Also shown is the histogram of PSD slopes ξ from the mapped image of SeaWiFS mission climatology (September 1997-December 2010), normalized to the highest count bin.
As a verification of the phytoplankton C retrieval methodology presented here, we test the closure between in situ determinations of POC and the PSD; specifically, we compare two different ways to compute phytoplankton carbon: (1) using a chemical POC determination, multiplied by 1/3, and (2) using Coulter counter PSD measurements in the same way as satellite PSDs (Sect. 2.1). Two different sets of integration limits (Eq. 4) for the power-law PSD are tested: 0.5-50 µm (Fig. 6a) and 0.7-200 µm (Fig. 6b). The first set of limits matches the operational satellite carbon algorithm (Table S1), and the second matches the operational POC measurement. Both closure regressions are highly significant (p < 0.01), indicating that the PSD method can reasonably predict carbon content of particles in natural seawater samples. However, the smaller size limits (Fig. 6a) exhibit a better R² value (in log10 space), while the slope, bias and rms are better for the larger limits (Fig. 6b). Clearly, the PSD method is sensitive to the chosen limits of integration, and the satellite operational limits underestimate the POC values. Better agreement is found when the 0.7-200 µm limits are used (matching the nominal pore size of the filters used for the POC measurements). Kostadinov et al. (2012) similarly found a relatively good agreement between in situ POC and PSD measurements for a semi-arid coastal site, the Santa Barbara Channel (SBC) off the coast of California. Both sets of results suggest that reasonable internal agreement exists between these two very different methods of in situ assessment of living carbon, even in optically complex coastal sites such as the SBC, where terrigenous material can contribute to the PSD and affect optical properties (Toole and Siegel, 2001; Otero and Siegel, 2004; Kostadinov et al., 2007). This PSD-POC closure analysis uses no satellite data or bio-optical algorithms and is thus not subject to the associated uncertainties, e.g., mismatch of the scales of measurement. However, the estimation of phytoplankton carbon from the total PSD or from POC in situ does share some of the same uncertainties and limitations as the satellite algorithm, e.g., the PSD does not always have to conform closely to a power law (Reynolds et al., 2010), although this is assumed here. Section 3.6 discusses such assumptions and uncertainties in detail.
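The closure statistics mentioned above (slope, R², bias and rms in log10 space) can be computed as in the following sketch. The regression type and the exact definitions of bias and rms used for Fig. 6 are not stated in this section, so the sketch assumes an ordinary least-squares fit of PSD-based carbon against POC-based carbon, bias as the mean log10 difference and rms as the root-mean-square log10 difference. The function name and the input values are hypothetical.

```python
import math


def closure_stats(c_poc, c_psd):
    """Compare POC-based and PSD-based carbon estimates in log10 space.

    c_poc: phytoplankton carbon from chemical POC x 1/3 (mg C m^-3)
    c_psd: phytoplankton carbon from the PSD via Eq. (4) (mg C m^-3)
    """
    x = [math.log10(v) for v in c_poc]
    y = [math.log10(v) for v in c_psd]
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    syy = sum((yi - y_mean) ** 2 for yi in y)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    diffs = [yi - xi for xi, yi in zip(x, y)]
    return {
        "slope": slope,
        "intercept": y_mean - slope * x_mean,
        "r2": sxy ** 2 / (sxx * syy),
        "bias": sum(diffs) / n,                          # mean log10 offset
        "rms": math.sqrt(sum(d * d for d in diffs) / n),
    }


if __name__ == "__main__":
    # Made-up values in which the PSD estimate is uniformly twice the
    # POC-based one, so the slope is 1 and the bias is log10(2)
    print(closure_stats([10.0, 20.0, 40.0], [20.0, 40.0, 80.0]))
```

Running the same function for both sets of integration limits would allow the trade-off described above (better R² for one, better slope, bias and rms for the other) to be tabulated side by side.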

Relationship between phytoplankton carbon biomass and chlorophyll concentration
The spatial distributions of Chl (Fig. S2) and total C biomass (Fig. 1a) and nano- and microplankton fractions (Fig. 4b-c) suggest strong positive correlations between these variables, whereas the picoplankton fraction (Fig. 4a) is negatively correlated with Chl. The bivariate histogram of Chl vs. total C biomass (Fig. 7a) confirms this strong correlation. However, for a given Chl value, total biomass can vary considerably (rarely, over an order of magnitude). For example, for the common Chl value of ∼ 0.25 mg m⁻³, biomass frequently varies between 10 and 30 mg m⁻³ and less frequently between 1 and 100 mg m⁻³. Although some of this spread may stem from underlying uncertainties in C biomass (Sect. 3.6) and Chl (Gregg et al., 2009; Sathyendranath, 2000), some of it is likely attributable to ecological variability that is captured by estimating C biomass and taking into account the PSD, indicating that the biomass retrieval contains new information and is not merely a deterministic function of Chl. Indeed, to first order, Chl can serve as an indicator of phytoplankton C biomass (e.g., Behrenfeld and Falkowski, 1997a), but their relationship can also be affected by physiological changes in Chl without accompanying biomass changes (Behrenfeld et al., 2005, 2006) in response to variability in the ambient levels of light (i.e., photoacclimation), nutrients and temperature (e.g., Geider et al., 1998). Notably, the histogram of Fig. 7a exhibits a pronounced sigmoidal shape in logarithmic space. At low and medium Chl values, increases in Chl do not lead to large biomass increases, which is consistent with the idea that Chl variability in oligotrophic areas is due mostly to physiological adaptation, rather than biomass growth and loss. Conversely, at higher Chl values in more eutrophic areas, Chl variability is accompanied by biomass changes (B05; Behrenfeld et al., 2006; Siegel et al., 2013). B05 also observe that for low Chl, "background" low values of b_bp(440) do not covary strongly with Chl; then for higher Chl values there is a positive linear correlation which tends to level off a bit for high Chl values (see their Fig. 1), broadly consistent with the sigmoid curve of Fig. 7a. This confirms their (and our) choice to use backscattering as a first order proxy of biomass.
Ocean Sci., 12, 561-575, 2016
Bivariate histograms between Chl and the fractional PSCs (Fig. 7b-d) indicate that the picoplankton fraction (Fig. 7b) decreases with increasing Chl, whereas nanoplankton (Fig. 7c) and microplankton (Fig. 7d) fractions increase. The pico- and nanoplankton relationships also exhibit the sigmoidal shape. The considerable noise in these relationships is likely due to natural ecosystem variability that occurs for a given Chl value, illustrating that PFT algorithms based on Chl abundance (IOCCG, 2014; e.g., Brewin et al., 2010; Hirata et al., 2011; Uitz et al., 2006) may miss this variability. In spite of that, to first order the relationships of Fig. 7b-d

Algorithm assumptions and uncertainty budget
There are multiple steps involved in the retrieval of the carbon-based biomass products presented here. Namely, R_rs(λ) is obtained from top of the atmosphere radiance after atmospheric correction, then spectral b_bp(λ) is retrieved and used to estimate the power-law PSD parameters; the PSD is then used to estimate particle volume, which is finally converted to phytoplankton carbon. Each of the above steps is associated with a set of assumptions and uncertainties which combine and propagate to the final products. Only some of these uncertainties are quantifiable at present. Below, we (1) make a quantitative assessment of propagated partial uncertainties of the retrieved carbon-based products, and (2) offer a general discussion of algorithm assumptions and other unquantified uncertainties. In addition, in Sect. S.5, we assess the sensitivity of the carbon products to the input PSD parameters (including the limits of integration of Eq. 4).
Quantified uncertainties propagated (Sect. S1.4, Eqs. S2 and S3) to the final C products include (1) partial uncertainties of the PSD algorithm products (ξ and N_o) that are due to natural variability of the complex index of refraction and the maximum diameter of the particles considered (KSM09), and (2) uncertainties in the allometric coefficients of MDL2000. The resulting partial uncertainty estimate for the total phytoplankton C biomass mission composite (Fig. 8a) is generally less than 1 mg C m⁻³ in the oligotrophic subtropics, higher in more productive regions, and exceeds ∼ 10 mg C m⁻³ only in some limited high-latitude and coastal areas. Examination of relative uncertainty for the global composite image indicates that it rarely exceeds 20 %, except at the very high latitudes (prominently south of 60° S and in the Arctic Ocean) and in the oligotrophic gyres, where some pixels exceed ∼ 50 % relative uncertainty (not shown). The gyres are characterized by noisy uncertainty patterns (large variability on the pixel scale, not shown). The relative uncertainty of a typical individual monthly image is between 85 % and 115 % globally, illustrating the significant uncertainty reduction for the mission composite product (Eq. S3).
The uncertainty of the mission composite fractional picoplankton contribution to carbon biomass is very low (Fig. 8b): less than ∼ 1 % over most of the ocean, and not exceeding ∼ 7 % anywhere. The uncertainties for the other PSCs are similar (somewhat higher for microplankton, but only at the very high latitudes, not shown). Uncertainties for individual images of the picoplankton fraction vary between ∼ 3 % and ∼ 8 % (1-7 % for nanoplankton fractions, and ∼ 0-2 % for microplankton, higher in eutrophic areas), illustrating that even for individual images fractional PSC uncertainties are quite low. This result is expected because the N_o parameter, which is a large source of error (Sects. 3.6 and S6), cancels in the computation of fractional PSCs (Eq. S1) and thus does not contribute to error in the PSCs. Thus, the carbon-based PSCs are likely to be a reliable product even if absolute carbon concentrations are not accurate. In fact, these PSCs can readily be used to partition other, independent estimates of phytoplankton carbon, such as those from the algorithms of B05 and S08, or even climate model data.

Figure 9. SeaWiFS mission composite mean (September 1997-December 2010) of total phytoplankton carbon biomass (mg C m⁻³ in log10 space) calculated with the PSD method described here, as in Fig. 1a, but with an empirical correction applied to the N_o parameter first (see Sect. 3.7).

Analytical error propagation (Eq. S2) permits tracing the relative contribution of the various input variables to the uncertainty (variance) of the dependent variable. Calculations for the example month of May 2004 indicate that almost the entire variance (> 95 % nearly everywhere) in total carbon is driven by uncertainties in N_o (Fig. S6a). The remainder is mostly due to the allometric coefficients in oligotrophic areas (Fig. S6b), and only in some eutrophic areas does the PSD slope ξ make a non-negligible contribution to total C variance. For the oligotrophic gyres and some transitional areas around them, most of the uncertainty in the picoplankton fractional contribution to carbon biomass is due to the allometric coefficients (Fig. S6c), whereas for the higher latitudes and more productive areas ∼ 80 % of the variance is due to the PSD slope. For the nanoplankton fraction, the uncertainty is due primarily to the allometric coefficients almost everywhere. For the microplankton fraction in oligotrophic areas, the error is due almost exclusively to the allometric coefficients, but in eutrophic areas it is usually about equally due to the allometric coefficients and the PSD slope.
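The idea of attributing output variance to individual inputs can be sketched generically. The Python example below applies standard first-order (delta-method) error propagation with numerical derivatives and assumes uncorrelated inputs; Eq. S2 itself is not reproduced in this section, so this should be read as an illustration of the concept rather than the exact formula used. The function names, the toy carbon model and any input values passed to it are hypothetical.

```python
import numpy as np


def variance_contributions(func, x, sigma, rel_step=1e-6):
    """First-order variance of func(x) and each input's share of it.

    Assumes uncorrelated inputs: var(y) ~= sum_i (df/dx_i * sigma_i)**2.
    Returns (total variance, array of fractional contributions).
    """
    x = np.asarray(x, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    terms = np.empty_like(x)
    for i in range(x.size):
        step = rel_step * max(abs(x[i]), 1.0)
        x_plus, x_minus = x.copy(), x.copy()
        x_plus[i] += step
        x_minus[i] -= step
        # Central finite difference approximates the partial derivative.
        derivative = (func(x_plus) - func(x_minus)) / (2.0 * step)
        terms[i] = (derivative * sigma[i]) ** 2
    total = terms.sum()
    return total, terms / total


def toy_total_carbon(params):
    """Toy stand-in for total carbon, in arbitrary units.

    params = [log10(N_o), xi, b]; integrates D**(3*b - xi) from 0.5 to 50 µm
    in closed form, which requires 3*b - xi + 1 != 0.
    """
    log10_n0, xi, b = params
    p = 3.0 * b - xi + 1.0
    return 10.0 ** log10_n0 * (50.0 ** p - 0.5 ** p) / p
```

Calling `variance_contributions(toy_total_carbon, [16.0, 4.0, 0.9], sigmas)` with a chosen set of input standard deviations returns the share of variance attributable to log10(N_o), ξ and the allometric exponent, which is the kind of breakdown mapped in Fig. S6.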
The propagated quantified uncertainties presented above are only partial estimates. There are other (not necessarily quantifiable) factors that contribute to the total uncertainty budget. For example, uncertainties in the spectral b_bp(λ) retrieval are not taken into account. The assumptions of a single-slope power-law PSD (that applies across a wide range of particle sizes) and the sphericity and homogeneity assumptions of the KSM09 algorithm contribute to uncertainty as well and are discussed elsewhere (KSM09). For absolute C retrievals, we assume that all particles belong to the POC pool (i.e., that they are biogenic in origin), that the proportion of phytoplankton in POC is constant (i.e., equal to 1/3 of POC by mass), and that the allometric coefficients apply to the heterotrophic and nonliving (detrital) pools as well.
The assumption of the biogenic nature of the particle assemblage is most likely to be violated in shallow coastal waters, where processes such as river discharge, wind-driven dust deposition and tidal mixing can introduce large and variable amounts of inorganic particles into the water column (e.g., Otero and Siegel, 2004). Additional uncertainties also exist that are external to the MDL2000 data set and therefore not included in their variance estimates. Finally, the assumption of equal contributions of diatoms and non-diatoms to the total carbon pool for cells larger than 3000 µm³ is not expected to hold globally, and should be relaxed in the future by combining with other PFT methods capable of detecting diatoms (e.g., Hirata et al., 2011) and/or integrated ecosystem approaches based on regional knowledge (Raitsos et al., 2008; Fay and McKinley, 2014). A more detailed discussion of algorithm assumptions and additional uncertainty sources is provided in Sect. S6.

Empirical correction of absolute carbon concentrations. Validation with in situ total POC
As discussed in Sects.and S3). Using the empirically corrected values to estimate total global phytoplankton carbon stock yields a value of ∼ 0.17 Gt of C from the SeaWiFS mission climatology. This value is lower than the S08, B05 and CMIP5 model estimates, and it is also lower than the value using the uncorrected N_o (Fig. S7). This indicates that the lowered values of the more eutrophic regions dominate the global biomass result. Indeed, the contribution by the shallow shelf regions is considerably reduced compared to the uncorrected estimate. The empirically corrected allometric/PSD-based determinations of phytoplankton carbon validate well against in situ POC measurements from the SeaBASS data set (Werdell et al., 2003) that are multiplied by 1/3 (Fig. 10). The validation statistics are highly significant in log10 space (p < 0.01). Compared to the validation results for the S08 and B05 methods, the PSD method exhibits similar results. Namely, the PSD method slope is somewhat worse than the others, the R² value is about as good as that of S08 and better than the one for B05, and the same holds for the rms values. The PSD method exhibits no overall bias, but a few of the lowest POC values still exhibit underestimation. In addition, about 10 % fewer retrievals are available from the PSD/allometric method. The same validation performed on PSD/allometric retrievals using uncorrected N_o values yields a slope of 2.20, an rms of 0.48 and a bias of 0.25 (not shown), indicating that the proposed empirical correction greatly improves algorithm performance for total absolute phytoplankton C concentrations. Because these empirically corrected values are more realistic and validate much better at the POC level, they are used in the published data set (see "Data availability and archival" below). Importantly, this is a validation of only total phytoplankton carbon, and uses in situ POC measurements as a proxy for it. The fractional contributions to the total phytoplankton carbon by the PSCs do not depend on the value of N_o and are thus not affected by the empirical N_o correction. Finally, Eq. (5) is based on PSD validation in KSM09, which has few matchups (N = 22) from a single type of PSD measurement. Many more measurements of the PSD are needed to make this empirical correction robust and possibly regionalize it.
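The validation statistics quoted above (slope, R², rms and bias in log10 space) can be computed along the following lines. This is a hedged sketch: it assumes an ordinary least-squares fit of satellite against in situ values, with rms and bias defined on the log10 differences, whereas the regression type and exact metric definitions used for Fig. 10 are not stated in this section. The function name is hypothetical, and the in situ input is assumed to have already been multiplied by 1/3.

```python
import numpy as np


def log10_validation_stats(in_situ, satellite):
    """Matchup statistics in log10 space (satellite regressed on in situ).

    Both inputs must be positive and of equal length. Ordinary least squares
    is an assumption here; a Type II regression would give a different slope.
    """
    x = np.log10(np.asarray(in_situ, dtype=float))
    y = np.log10(np.asarray(satellite, dtype=float))
    slope, intercept = np.polyfit(x, y, 1)       # degree-1 least-squares fit
    r_squared = np.corrcoef(x, y)[0, 1] ** 2     # squared Pearson correlation
    difference = y - x                           # log10 retrieval error
    return {
        "slope": float(slope),
        "intercept": float(intercept),
        "r2": float(r_squared),
        "rms": float(np.sqrt(np.mean(difference ** 2))),
        "bias": float(np.mean(difference)),
        "n": int(x.size),
    }
```

With these definitions, a retrieval that is uniformly a factor of 2 too high has a slope of 1 and a bias of log10(2) ≈ 0.30, which gives a sense of scale for the uncorrected-N_o bias of 0.25 reported above.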

Summary and Conclusions
We presented a novel method to retrieve phytoplankton carbon biomass from ocean color satellite data, based on combining volume determinations using backscattering-based PSD retrievals of Kostadinov et al. (2009) with carbon-to-volume allometric relationships compiled by Menden-Deuer and Lessard (2000). We use monthly SeaWiFS data to estimate total and size-partitioned absolute and fractional C biomass in three PSCs: pico-, nano- and microplankton. These PSCs can be treated as PFTs to first order. The climatological spatial patterns of the C-based PSCs broadly agree with current knowledge of phytoplankton biogeography and ecology.
While there are other remote sensing methods capable of producing algal biomass or PFT estimates, our methodology is unique and novel in the following key ways: (1) it can partition algal community biomass into any number of desired size classes in terms of absolute or fractional carbon concentration, which is the most relevant variable of interest in terms of biogeochemistry and is the unit of quantification of phytoplankton in Earth system models; (2) it is overall less empirical in nature and is based more on first principles of bio-optics, i.e., it builds on the concept of a constant backscattering-to-carbon relationship of Behrenfeld et al. (2005) by explicitly taking into account the underlying PSD that produced the backscattering, thus relaxing the assumed constant relationship. Satisfactory in situ closure is observed between a limited number of observations of PSD and POC on AMT cruises, which supports the PSD/allometric approach we take here. Detailed uncertainty analysis indicates that total carbon concentration retrievals are sensitive to assumptions about the underlying bulk particle index of refraction, which may lead to exaggeration of the spatial range of concentration, calling for caution when interpreting absolute concentrations. This exaggeration is reduced by an empirical correction, which leads to satisfactory validation of total phytoplankton carbon determinations against in situ POC measurements. Fractional PSCs, which are more reliable than the absolute carbon values, are subject to much smaller uncertainties, due mostly to uncertainties in the allometric coefficients. The bio-optical algorithm presented here is a first-order, global, proof-of-concept approach that can be further improved in multiple ways by addressing its assumptions and sources of uncertainty and incorporating new advancements in laboratory and satellite techniques (e.g., in situ phytoplankton carbon measurements and space-borne hyperspectral ocean color sensors).

Data availability and archival
The SeaWiFS data set produced for and used in this publication has been archived in the PANGAEA data repository (Kostadinov et al., 2016b; doi:10.1594/PANGAEA.859005) and is publicly available at https://doi.pangaea.de/10.1594/PANGAEA.859005. The following variables are provided: slope of the power-law PSD (unitless), the N_o parameter (Eq. (1), units of m⁻⁴, decimal logarithm of the data before the empirical correction of Eq. (5) is applied), total carbon biomass (mg C m⁻³) and carbon biomass in the three PSCs (mg C m⁻³) with the empirical N_o correction applied (Sect. 3.7), and the fractional contribution of the three PSCs (picoplankton (0.5-2 µm ESD), nanoplankton (2-20 µm) and microplankton (20-50 µm)) to the total biomass (unitless). Partial propagated uncertainties quantified here are also provided for all variables (1 standard deviation in the units of the respective variable). The effects of the empirical correction Eq. (5) on propagated uncertainty have been ignored, i.e., we assume that the corrected N_o parameter has the same uncertainty as the uncorrected one. The monthly and overall composite imagery (i.e., climatologies) are also provided, with the respective propagated uncertainties for the composite imagery (Sect. S1.4). Important: the provided data set uses the empirically corrected N_o parameter (see Sect. 3.7 and Figs. 9 and 10) in order to provide more realistic absolute phytoplankton concentration values. Note that analyses in this paper use mostly the uncorrected N_o values, unless otherwise indicated. Use of this data set is subject to the appropriate license as indicated in PANGAEA, and the SeaWiFS input data set (http://oceancolor.gsfc.nasa.gov/cms/citations) and this paper must be properly cited and acknowledged.
The Supplement related to this article is available online at doi:10.5194/os-12-561-2016-supplement.

Figure 1. SeaWiFS mission composite mean (September 1997-December 2010) of total phytoplankton carbon biomass (mg C m⁻³ in log10 space), derived from monthly data using (a) the allometric PSD method presented here, (b) the method of Behrenfeld et al. (2005) and (c) the Stramski et al. (2008) POC retrieval, multiplied by 1/3. (d) Ensemble mean of the CMIP5 models' (Table S3) climatologies (1990-2010) of the surface phytoplankton carbon biomass (mg m⁻³). The white contours are the 0.08 mg m⁻³ isoline of Chl. Both model and satellite composite means are computed from monthly data in linear space.

Figure 3. SeaWiFS mission composite (September 1997-December 2010) of size-partitioned phytoplankton carbon biomass, C (mg C m⁻³ in log10 space), estimated with the PSD/allometric method for (a) picoplankton, (b) nanoplankton and (c) microplankton. The white contours are the 0.08 mg m⁻³ isoline of Chl. Note that the color scale is different from that of Fig. 1.

Figure 4. SeaWiFS mission composite (September 1997-December 2010) of percentage contributions of three PSCs to total phytoplankton carbon biomass, estimated with the PSD/allometric method: (a) picoplankton, (b) nanoplankton and (c) microplankton. This mission composite is computed by averaging the fractional contributions to C biomass for each available month (Fig. S4). The white contours are the 0.08 mg m⁻³ isoline of Chl.

Figure 5. Fractional contribution of the three PSCs, picoplankton (red), nanoplankton (green) and microplankton (blue), to total phytoplankton carbon biomass (solid lines) and to total biovolume concentration (dashed lines), as functions of the PSD slope ξ. Limits of integration are the operational limits as indicated in Figs. 3 and 4, and Sect. 2.1.2 (also see Sect. S1.1). Also shown is the histogram of PSD slopes ξ from the mapped image of SeaWiFS mission climatology (September 1997-December 2010), normalized to the highest count bin.

Figure 6. Matchups between phytoplankton carbon estimated by applying allometric relationships to in situ measurements of the PSD (x axis) and by multiplying in situ chemical POC determinations by 1/3 (y axis). Measurements are coincident in time and space and were conducted on AMT cruises 2, 3 and 4. Two different limits of integration are used for the allometric estimate: (a) 0.5-50 µm, as in the operational satellite algorithm presented here, and (b) 0.7-200 µm, matching the GF/F filter pore size used in POC measurements.

Figure 7. Smoothed bivariate histograms of chlorophyll concentration and (a) total phytoplankton C biomass, (b) picoplankton, (c) nanoplankton and (d) microplankton fractional contributions to the total algal C biomass. The histograms were computed from the global mission composite of standard mapped SeaWiFS observations (September 1997-December 2010). The colors indicate the number of pixels that fall into each bivariate bin. The counts are shown in linear space, whereas the bins themselves are in logarithmic space. Data from continental shelf regions (< 200 m depth) are excluded.
are broadly consistent with the observations of Hirata et al. (2011), who use global in situ HPLC measurements to also derive Chl-PSC relationships. Further details of comparison with Hirata et al. (2011) are provided in Sect. S4.

Figure 8. (a) Propagated uncertainty in the mission mean of total phytoplankton carbon concentration (1 standard deviation in mg C m⁻³, shown in log10 space). This is a partial uncertainty estimate due to the quantifiable PSD parameter uncertainties and the uncertainties of the allometric coefficients. Uncertainties are propagated to the individual monthly images using Eq. (S2) and then composite imagery uncertainty is estimated using Eq. (S3) (Sect. S1.4). Panel (b) is the same as (a) but shows uncertainty for the mission mean of percent picoplankton contribution to carbon biomass (1 standard deviation in percent).

Figure 10. Validation of the total phytoplankton carbon satellite estimates discussed here, using SeaWiFS matchups of in situ POC measurements from the SeaBASS database. The three methods compared are: the allometric PSD method with the N_o empirical correction applied (Sect. 3.7) (green circles and line), the S08 POC retrieval multiplied by 1/3 (red crosses and line) and the B05 retrieval (blue triangles and line). All available matchups are used, including those from shallow waters (< 200 m depth).
are based on mission composites and are globally representative since 99-100 % of the