Main

Wood density, defined as the dry mass per fresh volume of wood, is a fundamental functional trait which reflects the carbon investment of trees. It is closely linked to the life history and functional attributes of trees, including mechanical and physiological properties1,2,3. Wood density plays a crucial role in determining the competitive ability of tree species and shapes the composition, structure and function of forest ecosystems4,5,6,7. These dynamics affect the rate of tree mortality8 and wood decomposition1, which are central to how ecosystems respond to environmental changes. Furthermore, the strong link between wood density and biomass production1,9 makes it a vital factor in quantifying terrestrial carbon uptake and storage10,11,12,13. Over one-third of the total variation in aboveground biomass in tropical forests can be explained by spatial differences in wood density1,14. Yet, until now, we lack a spatially continuous understanding of the variation in wood density in angiosperms and gymnosperms that would be necessary for representing this information in global forest carbon storage estimates.

In recent decades, empirical and theoretical studies have identified a wide range of factors that shape global variation in tree wood densities, including abiotic variation, biotic conditions, successional stages and human disturbances1,9,10,15,16,17,18. The evolution of wood density is fundamentally shaped by the cost for wood construction and the need for biomechanical and hydraulic safety2,19,20. Denser wood offers enhanced mechanical support and greater resistance to drought conditions in the xylem but this advantage may be offset by the higher resource allocation required for wood production, resources that could otherwise support growth or reproduction21,22,23. Consequently, in ecosystems with higher vapour pressure deficits, such as warm and dry forests, trees are likely to develop denser wood to maintain xylem resistance against implosion and rupture21,23. In contrast, in warm and humid ecosystems with lower vapour pressure deficit, life history strategies may lean towards rapid growth, characterized by reduced carbon investment in wood, to maximize competitive ability21,22. In colder regions, gymnosperms with low-density tracheids have a competitive advantage over angiosperms. Tracheids of gymnosperm trees, being narrower than the cavitation threshold of 30 μm, are capable of functioning under water and freezing stress, which allows them to resume transpiration early in the spring24,25. Additionally, factors such as reduced canopy height or a lower prevalence of pathogens26 in colder regions may reduce the need for high investment in wood construction27. As a result, the balance between the investment of trees in wood construction and their mechanical and physiological safety is expected to lead to notable geographic variations in wood density worldwide, affecting the structure, function and diversity of ecosystems.

Wood density also varies with the successional stage of forests28 and is influenced by disturbances from both natural processes and human activities1,29,30,31,32,33,34,35,36, such as wildfires17,37,38,39. For example, in parts of the Amazon rainforest, wood density in secondary forests was found to be 33% lower compared to predisturbance conditions40,41. This reduction is attributed to the prevalence of early-successional species with less dense wood in disturbed tropical wet and moist forests1,14,17,33,37,40,41,42,43,44,45. Conversely, in tropical dry forests, wood density often increases postdisturbance as a result of the establishment of more conservative, slow-growing species which are resistant to environmental stresses9,46,47,48,49,50. This implies that forest wood density responds unevenly to disturbances under different environmental conditions33,51. Yet, such context-dependency remains untested at a global scale. Understanding the global distribution of forest wood density and the various influencing factors, including climate and ecosystem disturbances, is vital for predicting and managing the responses of forest ecosystems to environmental shifts and for formulating effective strategies to mitigate and adapt to climate change impacts.

Here we paired ~1.1 million ground-sourced forest inventory plots (Fig. 1d) from the global forest biodiversity initiative (GFBi) database52 with collated species-level wood density data1,53,54,55,56,57,58,59,60 to explore global variation in wood density among both angiosperm and gymnosperm trees. Using this large-scale observation approach, we tested competing hypotheses about the dominant factors driving wood density variation across global forests, including temperature, water availability, species composition and disturbances. This approach allowed us to test theoretical predictions of geographic variation and to create a global model of wood density (Fig. 1 and Methods). We calculated community-wide mean wood density (CWD) by weighting the wood density of each individual observed in a forest plot by its basal area. To explore responses to anthropogenic and natural disturbance gradients, we integrated our observations with global information on human disturbance61 and fire frequency62. Finally, we estimated the total live forest biomass by integrating our CWD map with spatially explicit data on live tree volume63,64, root mass fraction65 and biome-level biomass expansion factors (Supplementary Table 1).

Fig. 1: Observed wood densities across the global forest inventory plots and within gymnosperms, angiosperms, forest types and biomes.
figure 1

a–c, Wood density distribution of gymnosperm (a) and angiosperm (c) species and influence of the proportion of angiosperms on CWD (b). The wood density distribution in gymnospermous species is narrower and has a smaller mean (~20% lower) than in angiospermous species. b, CWD increases with increasing proportion of angiospermous species in forest communities. We included 8,249 taxa with information on angiosperms and gymnosperms comprising 8,036 angiosperms and 213 gymnosperms, each with wood density information available at the species or genus level. d, Map of CWD observations for the ~1.1 million plots from the GFBi database. e,f, Box plots of observed CWD at the forest type (e) or biome level (f). Box plot shows the median, interquartile range and whiskers for data spread, excluding outliers.

Spatial and phylogenetic wood density variation

Gymnosperm trees, which are dominant in boreal and high elevation regions, had 20% lower wood density than angiosperms, with mean densities of 0.47 ± 0.07 g cm−3 and 0.59 ± 0.14 g cm−3, respectively. Accordingly, the CWDs of the global forests were positively related to the proportion of angiosperms within a plot (Fig. 1b).

Our global CWD data reveal strong differences in wood density across the major forest regions (‘Plot-level wood density metrics’ in Methods). Compared to boreal regions, which have a mean CWD of 0.46 ± 0.05 g cm−3, the average CWDs in temperate (0.52 ± 0.09 g cm−3; mean ± s.d.), tropical (0.57 ± 0.10 g cm−3) and dryland (0.59 ± 0.09 g cm−3) regions were 13%, 24% and 28% higher, respectively (Fig. 1e and Supplementary Table 2). At the biome level, tropical coniferous and Mediterranean forests had the densest wood, each with a wood density of 0.6 g cm−3. The standard deviations are ± 0.14 g cm−3 and ± 0.09 g cm−3, respectively. The lowest wood densities were observed in boreal (0.46 ± 0.05 g cm−3) and temperate (0.49 ± 0.07 g cm−3) coniferous forests and flooded savanna (0.46 ± 0.08 g cm−3) regions, with densities 23% to 32% lower than in tropical coniferous and Mediterranean forests (Fig. 1f and Supplementary Table 3). There was also considerable variation in CWD within biomes, which can rival the amount of variation across biomes.

To examine how phylogenetic position affects wood density variation across different species, we used a dated phylogeny on 4,298 species in 189 families and 55 orders. We found a pronounced phylogenetic signal, supporting niche conservatism in wood density among these evolutionary distinct linages (Pagel’s lambda = 0.92, P < 0.01 and Blomberg’s K = 0.01, P < 0.01)66,67. Similarly, ref. 68 reported a lambda value of 0.77 using wood density information from 2,261 species worldwide. This evolutionary signal persists at the order level69, indicating that higher wood densities in the angiosperm orders Myrtales (0.74 g cm−3), Fabales (0.69 g cm−3), Ericales (0.68 g cm−3) and Fagales (0.64 g cm−3) and lower wood densities in the Pinales (0.45 g cm−3), Cupressales (0.50 g cm−3), Araucariales (0.50 g cm−3), Malvales (0.50 g cm−3), Rosales (0.53 g cm−3) and Laurales (0.54 g cm−3) are phylogenetically conserved over evolutionary time (Fig. 2).

Fig. 2: Phylogenetic tree and wood density information of global tree species.
figure 2

The phylogenetic tree was constructed using the R package V.PhyloMaker, with wood density information available for 4,298 species (189 families from 55 orders). Wood density exhibits a strong phylogenetic signal (Pagel’s lambda = 0.92, P < 0.01, Blomberg’s K = 0.01, P < 0.01). The colours of the branches and the grey bars at the tips represent the wood density of each species. To identify orders that have significantly different wood densities compared to all other tree species, we conducted a two-tailed significance test by comparing the order-level wood density with 999 randomized wood density values from the entire phylogenetic tree. The coloured circle surrounding the phylogeny represents different orders. The filled blue/red circles inside the phylogeny indicate orders that show significantly (P < 0.05) lower (blue) or higher (red) wood densities relative to all the species. Numbers inside the circles represent the average wood density of the respective order.

Geospatial mapping

To map the geographic variation of wood density based on its relationship with environmental factors, we developed random-forest models using 62 global layers of climate, topography, soil, vegetation and human activity (Supplementary Table 4). These models were applied to all tree species (Fig. 3a), as well as separately to angiosperms (Fig. 3b) and gymnosperms (Fig. 3c). We observed spatial autocorrelation in model residuals70 up to a distance of 50 km (Supplementary Fig. 1). To mitigate the effect of spatial autocorrelation and ensure the reliability of our model predictions, we used a spatial bootstrapping procedure: we created 200 bootstrapped training subsets, each with data points at least 50 km apart (Methods). We then built individual models for each subset. Our final model, with 62 predictors, achieved a global average R2 of 0.53 (tenfold cross-validation; Supplementary Fig. 2). This model was used to map global wood density trends, revealing lower densities at higher latitudes and elevations (Fig. 3). For example, forests in Canada, Siberia, the Alps and the Qinghai-Tibetan plateau showed low wood density (<0.5 g cm−3), whereas high-density areas (>0.6 g cm−3) included warm, arid regions like the African Savanna and Australian open forests.

Fig. 3: Global maps of wood density.
figure 3

a,c,e, Wood density maps for all species (a), angiosperms-only (c) and gymnosperms-only (e). a, The community-level wood density map was derived from an ensemble approach, averaging the global predictions from the 200 best random-forest models. c,e, Angiosperm-only (c) and gymnosperm-only (e) wood density maps were derived from ensemble averaging of the global predictions from the 100 best random-forest models, respectively. b,d,f, Corresponding latitudinal trends in wood density aggregated for each 0.1 arc degree latitude: all species (b), angiosperms (d) and gymnosperms (f). Error ranges represent 1 s.d. either side of the mean. Maps are projected at 30 arcsec (~1 km2) resolution. Non-forested areas are displayed in grey. In the wood density maps for angiosperms (c) and gymnosperms (e), we correspondingly excluded pixels where angiosperms and gymnosperms constituted <5% of the entire community.

To assess the predictive uncertainty of our models, we calculated the bootstrapped coefficients of variation (standard deviation divided by mean) for CWD values. These results showed high confidence in predictions across all models, with coefficients of variation <5% for all pixels in existing forest areas (Supplementary Fig. 3). Furthermore, we distinguished between model interpolation (predictions within the environmental range of the training data) and extrapolation (predictions outside this range) using a principal component analysis (PCA)-based approach. Our analysis indicated that >95% of the forested areas fell within the environmental range of our training data in >95% of cases. Most of the outliers were located in African savanna regions, probably due to lower sampling density in these regions (Supplementary Fig. 4).

Drivers of global wood density variation

To assess the relative importance of climatic, soil, vegetation and disturbance factors in driving global CWD, we used partial regression and random-forest modelling (Fig. 4). We selected nine variables including environmental factors and functional traits based on previous research1, including mean annual temperature, soil moisture, soil carbon-to-nitrogen (C:N) ratio (indicating nitrogen availability71), leaf area index (LAI; indicating growth and canopy light competition72), tree diversity (species richness), forest age, diameter at breast height (DBH), human modification and fire frequency. Differences in the relative occurrence of angiosperms versus gymnosperms were accounted for by including the plot-level angiosperm ratio as an additional predictor (Fig. 4a,b). Overall, this analysis revealed that mean annual temperature is the most influential factor on CWD. Specifically, a 1 °C increase in temperature correlates with an average 0.5% increase in wood density (Supplementary Table 5). This trend was consistent in separate analyses of angiosperm and gymnosperm communities (Fig. 4c–f) and across forest types and biomes (Fig. 4d,f). The effect of water availability, nutrient resources and temperature on CWD is in alignment with the study conducted by ref. 73, which used soil water-holding capacity, soil basicity index and elevation as proxies for these factors.

Fig. 4: Variable importance of the selected environmental metrics.
figure 4

af, The environmental metrics are based on random-forest models (a,c,e) and linear partial regression models (b,d,f). a,b, Variable importance of the selected covariates across global forests, including angiosperm ratio to control for wood density differences between angiosperms and gymnosperms. c,d, Variable importance within angiosperm-only communities. e,f, Variable importance within gymnosperm-only communities. Mean decrease in accuracy values in a, c and e represents the relative contribution of each variable to CWD variation, whereby we averaged the values of 100 bootstrapped random-forest models. Bootstrapped partial regression coefficients for each variable (b,d,f) were calculated by averaging the partial regression coefficients from 100 multivariate linear models. All variables were standardized to allow for direct effect size comparison. In addition, we quantified the absolute effects of these covariates using partial regression analysis, as detailed in Supplementary Table 5.

The relationships between other tested variables and CWD varied considerably across forest types. High soil moisture correlated with low CWD in tropical and temperate forests but led to higher CWD in boreal and dryland forests (Supplementary Fig. 5). In tropical regions, LAI was positively correlated with CWD, whereas a negative correlation was observed in temperate, boreal and dryland regions. The soil C:N ratio generally correlated with slight decreases in CWD but an inverse relationship was observed in boreal forests. Forest age, while less influential on a global scale, displayed negative effects across all forest types (Supplementary Fig. 5). This pattern reflects the consistent impacts of forest age in communities dominated by angiosperms and gymnosperms (Fig. 4). Plot-level mean DBH, which may also reflect forest age to some extent, had a minor impact on wood density globally (Fig. 4a).

The impact of major disturbances, specifically human activity and fire frequency, on CWD was highly context-dependent. Our analysis across all plots showed human modification as the third most important factor affecting CWD (Fig. 4a) but its importance diminished in gymnosperm-only communities (Fig. 4c,e). This suggests that human activities indirectly influence CWD, primarily by altering the proportion of coniferous and broadleaved trees. Fire frequency was the least impactful factor among the nine variables (Fig. 4a). The limited global effect of fires is probably due to their infrequent occurrence in forest worldwide, with 96% of global forests not experiencing fires in the past 20 years62. However, the long-term impacts of fires on forest composition may be underestimated in our analysis, as their influence can extend beyond the 20 year period we considered. Additionally, the intensity of fire, a crucial aspect of fire disturbance74, was not captured in our frequency data, probably explaining why we did not find a stronger effect of fire frequency74.

To further explore how environmental variables modulate the relationships between disturbance processes and CWD, we conducted recursive partitioning analyses. These analyses show that in cold regions (<10 °C), CWD increases with human disturbance, whereas in warmer areas, it decreases (Supplementary Fig. 6a). Similarly, the effect of fire frequency on CWD also varied with temperature: it slightly reduces CWD in colder climates but increases it in warmer ones (Supplementary Fig. 6b). The relationships between disturbances and CWD were also dependent on the proportion of angiosperms versus gymnosperms in a forest. These findings underscore the context-specific nature of the effects of human disturbance and fire frequency on wood density, influenced by factors such as temperature and forest taxonomic composition.

Wood density and global biomass estimates

To assess the impact of wood density variations on global forest biomass estimates, we integrated our wood density map with the latest global maps of live tree volume63, root mass fraction65 and biomass expansion factors75 (Fig. 5a and Supplementary Table 1). This analysis revealed a total tree biomass of 374 GtC (Supplementary Fig. 7), of which 200 GtC (53.3%) is stored in tree stems, 93 GtC (24.9%) in branches, foliage and other aboveground living parts and 81 GtC (21.7%) belowground as roots. This global estimate aligns well with previous estimates based on remote sensing, ground-sourced models or harmonized ensemble approaches63,76,77, estimating total tree biomass in the range of 354–445 GtC (Supplementary Fig. 7). However, our wood density-based biomass estimations present spatial deviations compared to previous studies, showing an agreement in spatial variation ranging from 45% to 93% with earlier research13,63,76,77,78 (Supplementary Fig. 7a). Our estimates were most closely aligned (93%) with those from GlobBiomass63, as both used the same live tree volume data (Supplementary Fig. 7c).

Fig. 5: Comparison of global living tree biomass distribution using spatially explicit wood density data versus a universal wood density value.
figure 5

a, The global distribution of living tree biomass (in tonnes per hectare), derived by integrating our wood density map with spatially explicit data on living tree volume, root mass fraction and biomass expansion factors. b, Percentage difference in estimated living tree biomass when comparing results derived using the global wood density map (from a) with estimates using a single, universal wood density value. The difference is calculated as the percentage change by subtracting the spatially explicit estimate from the universal estimate and then dividing by the spatially explicit estimate. Blue areas show regions where the universal estimate is higher, and red/orange areas indicate where the spatial estimate is higher. c, Percentage difference between the two biomass estimation methods across biomes. Box plots show the median, interquartile range and whiskers for data spread, excluding outliers.

To isolate the influence of wood density variation on global tree biomass distribution, we compared our wood density-informed biomass model with a model using a constant wood density value of 0.53 g cm−3 (the global average). We found that the constant wood density model estimated the global biomass to be about 4% lower than the spatially explicit wood density model (359 GtC compared to 374 GtC; Fig. 5b). However, significant differences emerged within various biomes (Fig. 5b), where the constant density model underestimated carbon stocks in specific biomes like tropical moist, tropical dry, tropical savanna and Mediterranean forests by, on average, 12%, 17%, 17% and 21%, respectively, and overestimated them in temperate coniferous and boreal forests by 10% and 13%.

These findings underscore the critical role of spatially explicit wood density estimates in accurately predicting forest carbon stocks, taking into account the variations across different regions and forest types. Our detailed wood density map therefore allows for a more accurate representation of the geographic variation in tree carbon storage.

Discussion

Owing to high heterogeneity in wood density across forest types79,80 and along successional stages, modelling the spatial variation in forest functional traits has remained a major research challenge16,79,80. Yet, recent advancements in big data and remote sensing tools have begun to provide detailed high-resolution information on environmental, human disturbance and vegetation characteristics. In this study, we integrated these features to model global variations in wood density and identify the factors driving this variation.

Our analysis identifies temperature as the dominant driver of global wood density variation, with a more than three times greater effect than any other variable (Fig. 4a), highlighting temperature as a selective pressure and filter of global wood density2. Other factors, such as soil moisture, also affect wood densities but are more noticeable at regional scales. In addition, mean annual temperature is highly correlated with evapotranspiration and potential evapotranspiration (Pearson’s correlations around 0.9). Warmer climates thus often coincide with significant water stress. These global-scale trends support the hypothesis that the need for hydraulic safety drives high wood densities in stressful environments2,19,20.

The observed positive correlation between community-level wood density and temperature2,10 leads to distinct latitudinal and elevational trends (Fig. 3a,b). While changes in gymnosperm versus angiosperm species composition amplified these trends, the environmental factors driving global wood density variation were remarkably similar among these distantly related groups (Fig. 3c,e). Temperature and water availability had a consistent effect on wood density across angiosperms and gymnosperms (Fig. 4c,e), indicating similar wood anatomical adaptations to environmental factors.

Our findings on the drivers of global forest wood density align with ref. 81, which highlighted the importance of leaf habit, temperature seasonality, cloud cover and annual precipitation. This study81 calculated pixel-level average wood density by averaging individual tree-level measurements within pixels. In contrast, we integrated forest inventory data to represent CWD. Overall, our wood density estimates show slightly greater variation across the globe. We estimate higher densities than ref. 81 in warm tropical regions and lower densities in cold boreal regions. At the biome level, our predictions are higher in denser wood biomes and lower in regions with low wood density. Despite these differences, both studies show similar patterns of wood density across global forests, with higher wood densities at lower latitudes and lower densities at higher latitudes. The overall R2 between the two models is 0.58 (Supplementary Fig. 9). This consistency in global wood density patterns and climate responses is crucial for enhancing our confidence in predicting the impact of climate change on global forest biomass distribution and shifts in forest composition.

The importance of balancing hydraulic safety and growth efficiency has been well-documented in tropical dry forests82. Indeed, our model predicts the highest wood densities for hot environments with low water availability, such as the dryland regions of South America, Africa and Australia. Conversely, in tropical moist forests, wood density correlates more strongly with growth and mortality rates than with resistance to cavitation. This is evident in pioneer trees, which have low wood density and high growth rates, whereas non-pioneer rainforest trees tend to have higher wood density, largely due to their longevity and competitive advantage, rather than drought stress80. Overall, wood density serves as a multifaceted proxy for environmental interactions and responses, as demonstrated by the divergent trends in community wood density observed in dry versus wet tropical forests9.

The negative correlation between soil moisture and wood density lends support to observations in previous local-scale analyses1,2,83,84, a trend that can largely be explained by the high abundance of slow-growing species in dry environments9,50,85,86,87. These species typically adopt a conservative resource-uptake strategy and exhibit high water-use efficiency. Previous studies have indicated variability in the relationship between wood density and water availability, with differences in the direction and magnitude depending on the research scale and species sampled2,84,88. Our global-scale sampling now shows that dry forests have wood densities up to 31% higher than those found in more humid regions89 (Figs. 1e and 3). Nevertheless, we also find regional differences in the effect of soil moisture (Supplementary Fig. 5). For example, a positive relationship between wood density and water availability was found within boreal forests, while the opposite was true for tropical and temperate regions. Such variation in the effect of soil moisture may be driven by variation in functional properties of species. For instance, water availability tends to be less limiting in broadleaved trees than in conifers because angiosperm vessels can be more efficient in water conduction than tracheids24.

Although biodiversity (species richness), forest age, DBH, LAI and soil C:N ratio are important factors influencing local variation in wood density, these effects were overwhelmed by the impacts of temperature, soil moisture and angiosperm ratio at a global scale. This may suggest that biotic interactions play a relatively smaller role in shaping broad-scale variation in wood density, relative to abiotic environmental factors. Within gymnosperms, we observed a negative correlation between C:N ratio and wood density. This might reflect an increased investment in xylem safety in nutrient-rich environments, where the construction of tissues (carbon acquisition) is less of a limiting factor90. Forest age had opposite effects on wood density variation in communities dominated by angiosperms and gymnosperms. This indicates divergent trajectories in wood density development during the maturation of these two types of forests (Fig. 4b,c).

The relationship between wood density and disturbance was highly context-dependent, with regional contingencies being dependent on the environmental background conditions of the region9,17,33,37,41,45,46,47,50. While previous studies have emphasized the role of water availability51, our recursive partitioning analyses suggest mean annual temperature as the main driver shaping the relationship between wood density and forest disturbances, such as human disturbance and fire frequency (Supplementary Fig. 6). This might be driven by regional differences in the relative trade-off between acquisitive and conservative resource-uptake strategies9,50,85,86,87. In tropical wet forests, seral communities on disturbed forest margins are often dominated by short-lived, light-demanding species which tend to have low wood densities41,91, while in tropical dry forests, seral communities often consist of drought-resistant species with dense wood91,92,93. Controlled experiments and long-term field observations will be needed to further disentangle the context-dependent responses of wood density to disturbances.

Our analysis highlights the strong role of species composition in shaping wood density variation1,2,84,94,95,96. Owing to the strong phylogenetic signal in wood density, related species display similar wood densities, even when growing under different environmental conditions84. For example, gymnosperm tree species have, on average, 20% lower wood densities than angiosperm species (Fig. 2). The anatomical differences between angiosperms and gymnosperms play a crucial role in this disparity. Gymnosperms are characterized by thinner cell walls and smaller pits, decreasing the risk of xylem cavitation at the expense of hydraulic conductivity. In humid, hydraulically less stressful environments, angiosperm trees with higher hydraulic conductivity might thus tend to outcompete gymnosperm trees2. This may add to the observed decreases in wood density from low to high latitudes (Fig. 3a) or from tropical to boreal forests (Fig. 3b)1,2. Interestingly, we find consistent biogeographical trends in wood density for gymnosperms and angiosperms, with temperature being the key regulator. This is particularly evident in the pronounced increase in wood density from boreal to tropical forests within gymnosperm species (Fig. 3f). This trend illustrates the strong selective pressures and filters of temperature on tree wood density patterns globally (Fig. 4a,b).

Conclusion

The integration of global ground-sourced forest inventory data with wood density measurements allowed us to quantitatively assess the environmental factors driving the wood density distribution on a global scale. This integration has resulted in a high-resolution global model, providing critical information on the structure and biomass distribution of the forests of the world. Our analysis identifies taxonomic composition—particularly the distinction between angiosperms and gymnosperms—as the primary biotic driver influencing global wood density variations. Temperature, in conjunction with water availability, emerges as the dominant abiotic factor that controls the global variation in wood density. This pattern is probably attributable to the role of denser wood in enhancing competitive ability, hydraulic efficiency and transpiration efficiency in warmer environments. We also observed that community-level wood density responses to disturbances vary across forest types, biomes and environmental conditions. By integrating our wood density map with other key metrics such as live tree volume63, biomass expansion factors and root mass fractions65, we could benchmark existing forest biomass stocks, estimating a total living biomass of 374 GtC. Our research also showed that biomass estimates within biomes could vary by as much as 21%, depending on whether the variability of wood density was considered or if wood density was assumed to be uniform worldwide. Our findings contribute to an improved understanding of the structure and biomass distribution in global forests and highlight the effects of human and environmental disturbances on global forest communities and functional traits.

Methods

Data sources

Wood density is commonly measured as the ratio of the oven-dry mass of a wood sample to its green volume. Most wood density data stem from wood core samples59, while some are derived from fresh volumes and dry weights of whole tree components58,60. We compiled wood density measurements of individual trees or aggregated to the species level from several databases or publications. The majority of observations came from the Global Wood Density Database by refs. 1,57, encompassing a total of 16,468 records in 8,412 species and the TRY database59 with a total of 46,668 records for 7,514 species. Additionally, 1,117 wood density records in 937 species came from ref. 53, 4,022 records in 872 tree species from ref. 55, 618 records in 615 species from ref. 56, 624 records in 250 species from ref. 60, 3,529 records in 179 tree species from ref. 54, 3,092 records in 58 species from ref. 58 and 1,234 records in 1,061 species from published research articles by searching for ‘wood density’ in Google Scholar (publications listed in Supplementary Data 1). After standardizing the taxonomic names using the Taxonomic Name Resolution Service97 (R package TNRS v.0.1.0) and removing synonyms, we obtained 77,372 wood density observations across 10,703 species and 2,026 genera (data are available at GitHub https://github.com/LidongMo/GlobalWoodDensityProject).

To test the compatibility in wood density estimates among databases, we conducted an analysis to quantify their similarity. By constructing a linear regression model based on common species pairs, we calculated an R2 value of 0.78, indicating high consistency among all nine data sources and minimal bias introduced by different wood density determination methods (Supplementary Figs. 8 and 10).

Phylogenetic and trait analysis

To test whether wood density is phylogenetically conserved, we computed common phylogenetic metrics as well as random-effects models including taxonomic information. We built a phylogenetic tree using the R package V.PhyloMaker98, with a total of 4,298 species (189 families from 55 orders) with wood density information matching the species in the phylogenetic database. To test for phylogenetic signal in wood density, we computed Pagel’s lambda and Blomberg’s K, using the phylosig function in the R package phytools99. To further test for trait conservatism at the order level, we used the ph_aot function from the R package phylocomr100. Order-level wood density values were calculated by averaging across all descendent terminal taxa69 and comparing the means with 999 trait value randomizations across the tips of the full phylogeny to obtain significance estimates101. Only orders for which we had data on at least 50 species were tested. We further quantified the extent of within-species and within-genus variation in wood density by running a random-effects model on all 77,372 observations, including species and genus as random effects and wood density as response variable. The model showed that ~81% of the individual variation in wood density is explained by taxonomic information on family, genus and species levels, with 24% of the variation explained by family information, 30% by genus information and an additional 27% explained by species information.

Generating species- and genus-level wood density information

To quantify wood density variation across the world’s forests, we assigned species-level wood density values to individual tree observations from the GFBi. The GFBi database consists of 1,188,771 unique forest census plots, containing data for all tree individuals with a DBH > 5 cm. Each plot contains information on geospatial coordinates (latitude and longitude in decimal degrees), individual-level species binomial name and DBH, plot size (median plot size = 25 m2) and measurement year. For remeasured plots, we kept only the latest observation year for our analysis. Across all plots, the mean observation year was 2003. To assess the impact of the temporal changes of forest community on the community-level wood density, we applied a random-effects model to plots with time-series information. The model, including wood density as dependent variables and plot and year information as random effects, showed that variance in wood density was predominantly (97.9%) attributable to differences across plots, with only 0.2% due to variations across years.

We then used the binomial names to assign species-level wood density information to the individuals in the GFBi database. As for the wood density information, species binomials in the GFBi database were standardized using the TNRS97. For species with more than one wood density record, we used the average of all available records. If no wood density information was available at the species level or if the GFBi individual was only identified to the genus level, mean genus-level wood density values were used instead. Because of the strong phylogenetic signal in wood density values, these genus-level estimates introduce only little error compared to species-level wood densities2,10,96,102. In total, species-level wood density data could be matched to 4,428 species included in the GFBi database, while genus-level data were matched to 1,192 GFBi genera. We excluded plots representing 0.4% of the total, where <75% of the individuals had wood density information at either the species or genus level. Consequently, 1,183,070 plots were included in our geospatial analysis.

According to the GFBi inventory plots, the global average tree diameters of angiosperms and gymnosperms were similar, at 21.9 and 21.8 cm, respectively, with 95% quantile ranges of 5.6–56.5 cm for angiosperms and 6.4–56.4 cm for gymnosperms.

Plot-level wood density metrics

We allocated species-level wood density to each individual tree in the GFBi plots and calculated the CWD. This approach is supported by the phylogenetic conservatism of wood density (Fig. 2) and the small impact of individual-level wood density variations on community-level estimates73. The average community-wide wood density for each plot CWD was calculated as the wood density of all tree individuals weighted by tree basal area:

$${\rm{CWD}}=\frac{\mathop{\sum }\nolimits_{i=1}^{n}({\rm{{WD}}_{\rm{tree}}}\times {B}_{{\rm{tree}}})}{{\sum }_{i=1}^{n}{B}_{{\rm{tree}}}}$$
(1)

where WDtree is the wood density of each tree and Btree is the basal area of each tree.

The spatial modelling of community-wide wood density properties was performed at 30 arcsec (~1 km2) resolution and we therefore aggregated CWD values within each 30 arcsec pixel by calculating the mean, resulting in 506,630 pixel-level observations for the modelling.

To quantify CWD within biomes and forest types, we used all observations in each biome or forest type. Biome and forest types were classified using the WWF Biome map103. Forests were divided into the four broad categories (tropical, temperate, boreal and dryland), with tropical regions including six biomes (tropical and subtropical moist broadleaf forest, tropical and subtropical dry broadleaf forest, tropical and subtropical coniferous forest, tropical and subtropical grassland, savanna and shrubland, flooded grassland and savanna and mangroves), temperate regions including four biomes (temperate broadleaf and mixed forest, conifer forest, temperate grassland, savanna and shrubland and montane grassland and shrubland), boreal regions including two biomes (boreal forest/taiga and tundra) and dryland including two biomes (Mediterranean forest, woodland and scrub and desert and xeric shrubland).

Environmental and human disturbance covariates

We used 62 covariates, representing information on climate, topography, soil, vegetation characteristics, fire frequency and human disturbances, to test for the effects of environment and anthropogenic disturbance on the global variation in CWD and create spatially explicit models which allow us to interpolate CWD across the globe. All covariates were available as global layers at 30 arcsec resolution: layers for 19 bioclimatic variables came from the CHELSA open climate database (www.chelsa-climate.org)104; topographic information (elevation, roughness, slope, profile curvature, northness, eastness and topographic position index) from the EarthEnv database (www.earthenv.org/topography)105; cloud cover properties (annual mean, interannual standard deviation and intra-annual standard deviation) from the EarthEnv (www.earthenv.org/cloud) database and ref. 106; depth to the water table from ref. 107; the annual mean of solar radiation, wind speed and vapour pressure from the WorldClim database (v.2)108; absolute depth to bedrock and soil texture (clay content, coarse fragments, sand content, silt content and soil pH), averaged for soil depths from 0 to 100 cm below surface, from the soil grids database109; soil moisture was down-scaled from 10 km resolution maps sourced from GLDAS2.0 (ref. 110), ERA5 (ref. 111) and MERRA2 (ref. 112), soil nutrient information (cation exchange, C:N ratio and nitrogen) from the WISE30sec database71 and soil grids109; normalized difference vegetation index, enhanced vegetation index (upscaled from 250 m resolution), FPAR, LAI (upscaled from 500 m resolution) and annual net primary productivity from MODIS data113,114,115; aridity index and potential evapotranspiration from refs. 116,117; and current forest tree cover, tree density, canopy height and forest age from refs. 118,119,120,121, respectively.

To represent human and natural disturbances in our model, we used eight global layers that directly reflect anthropogenic disturbances: cultivated and managed vegetation and urban built-up122, agricultural land use (cropland, grazing, pasture and rangeland transformed to pixel-level percentages)123,124, human modification, reflecting the intensity of human activity61 and natural disturbances of forests: fire frequency62. Human modification is the most comprehensive and representative human activity variable integrating five major human disturbance categories: human settlement, agriculture, transportation, mining and energy production and electrical infrastructure61. The map of fire frequency was generated from yearly observations of fire occurrence62, by calculating the proportion of years with fire in each 30 arc degree resolution pixel. All covariates were extracted via Google Earth Engine125. The eight disturbance variables were uniformly scaled to represent a continuous gradient of human activity or fire frequency, whereby values of 0 indicate no disturbances in the respective pixel and values of 1 indicate maximum disturbance.

Representation of training data

To evaluate the extent of interpolation versus extrapolation in our models, that is, how well our training data represents the full multivariate environmental covariate space, we performed a PCA-based approach following ref. 126. We projected the covariates composite into the same space using the centring values, scaling values and eigenvectors from the PCA of the training data. Then, we created convex hulls for each of the bivariate combinations from the top principal components (which collectively covered >90% of the sample space variation). We used 22 principal components with 231 combinations for all covariates. Using the coordinates of these convex hulls, we classified whether each pixel falls within or outside each of these convex hulls. This analysis revealed that 95.2% of land pixels excluding Antarctica are covering at least 95% of the environmental conditions present in our training data locations (Supplementary Fig. 4).

Geospatial modelling of global forest wood density properties

To train spatially explicit CWD models across the world’s forests, we ran a series of random-forest machine learning models. The models included 62 predictor variables representing climate, soil, topography, vegetation, fire frequency and human disturbances. Parameter tuning for each model was performed through the grid search function of the H2O R package127 to iteratively explore the results of a suite of machine learning models trained on the 62 covariates.

To test for spatial autocorrelation of model residuals, we trained a generalized additive model (GAM) using the same 62 covariates and then extracted Moran’s I values of the GAM residuals at spatial scales of 0–1,000 km. This analysis revealed positive spatial autocorrelation up to a distance of 50 km (Supplementary Fig. 1). To minimize the influence of spatial autocorrelation in our random-forest model, we thus applied a spatially buffer-zone-based bootstrapping procedure, subsampling the training data during the grid search procedure to make sure the distance between any two data plots is always >50 km. This buffer-zone-based bootstrap subsampling was applied 200 times and, for each subsample (~2,000 observations), we ran 48 random discrete parameter sets covering the total grid space of 240 possible parameter combinations to perform the grid search. Model performance of each model was assessed using the coefficient of determination128 based on tenfold cross-validation and, for each subsample, we retained the model with the highest R2.

To create the final community wood density maps, we used an ensemble approach, whereby we averaged the global predictions from the 200 best random-forest models based on our bootstrapping procedure. By taking the average prediction across multiple models, ensemble methods minimize the influence of any single prediction, thereby stabilizing variation and minimizing bias that can otherwise arise from extrapolation or in-fit overfitting when using a single machine learning model129. Moreover, by quantifying the variation across these ensemble predictions, we can identify areas that have low consensus across multiple models and which thus have higher uncertainty than would otherwise be predicted by any single model. To implement this ensemble approach, the mean predicted value across the 200 best-fitting models was used as the final model prediction for each pixel and the variation coefficient across these 200 models was used to characterize intermodel consistency (paragraph on Supplementary Fig. 3).

Model consistency and uncertainty

Our ensemble approach allowed us to obtain spatially explicit estimates of the uncertainty associated with our random-forest models of global community-wide wood density. This was done by computing the pixel-wise variation coefficient (standard deviation divided by the mean) of the 200 bootstrapped models126 (Supplementary Fig. 3), whereby the coefficient of variation represents the uncertainty of our wood density estimates.

Geographic variation and drivers of community wood density properties

Variable selection

To explore the effect magnitude and direction of the main environmental drivers of CWD across the globe, we included variables of high ecological importance which have shown significant relationships with wood density in previous studies9,17,84,89,130 and performed hierarchical cluster analysis to remove highly similar variables. We then tested for multicollinearity among the retained covariates, by calculating variance inflation factors (VIFs) using the R package HH 3.1-52 (ref. 131). All VIFs of these selected variables were <5, indicating sufficient independence among predictor variables. Mean annual temperature, soil moisture, DBH, species richness, soil C:N ratio, forest age, canopy height, LAI, human modification and fire frequency were selected for the final analysis. In addition, we included two biotic variables: angiosperm ratio and biodiversity. Specifically, the angiosperm ratio represents the proportion of angiosperm individuals within the plot, which we used to account for differences in CWD between angiosperms and gymnosperms (Fig. 1b). Biodiversity is represented by richness, calculated by scaling the observed number of species to the plot size. We excluded precipitation from our analysis due to its strong collinearity with plant water availability on a global scale and the previously established weak correlation between wood density and precipitation1.

Variable importance

To test the variable importance of the selected covariates, we ran linear multivariate regression and random-forest models. To control for the potential effects of spatial autocorrelation, we ran a bootstrapping procedure, subsampling the full dataset 100 times, with each subsample randomly selecting one observation per 0.25 arc degree grid. For each subsample, we quantified the variable importance of the nine selected variables based on mean decrease in accuracy values from random-forest models using the H2O R package127. The average values across the 100 submodels were then used to evaluate the results (Fig. 4). Similarly, for each subsample, we fitted a multivariate regression model for the ten selected variables and calculated the corresponding regression coefficients, whereby both response and predictor variables were standardized to allow for direct effect size comparison. We then aggregated the results by calculating the mean regression coefficient and standard deviation across all submodels (Fig. 4). Furthermore, we ran the same models including only data for angiosperms or gymnosperms (Fig. 4c–f). We also tested the effects of the variables on CWD within forest types using partial linear regression models (Supplementary Fig. 5). As for the global analyses, we controlled for the effect of spatial autocorrelation by running a bootstrapping procedure, subsampling the full dataset 100 times, with each subsample randomly selecting one observation per 0.25 arc degree grid.

Context-dependency of human disturbance and fire frequency effects

To explore the effects of human disturbances and fire frequency on CWD under different environmental conditions, we ran recursive partitioning analyses using the packages partykit132 and ggparty133. We used a decision tree algorithm to explore the context-dependency of the slope and intercept of a univariate linear model for the effect of disturbance variables on community wood density properties, whereby the top four covariates based on a random-forest model (Fig. 4a) were evaluated as potential splitting points (Supplementary Fig. 6). The minimum node size (minimum number of observations contained in each terminal node) was set to 500 (~3% of the data) and the significance level was set to 0.01.

Estimation of the living biomass in global forest

To generate a global map of aboveground tree biomass, we combined our wood density map with an existing map of live tree volume63. The live tree volume represents the total volume of all living trees with a DBH >10 cm, measured over bark from ground or stump height to a top stem diameter of 0 cm (ref. 63).

$${\rm{TGB}}=\mathop{\sum }\limits_{i=1}^{n}{\rm{CWD}}_{\rm{mean}}\times {\rm{GSV}}\times {\rm{BEFs}}\times \left(\frac{1}{1-{\rm{RMF}}}\right)$$
(2)

Total living tree biomass (TGB) was then calculated by multiplying our CWD estimates with live tree volume (GSV)63. To match the resolution of our wood density map, the GSV map was aggregated from ~100 m to ~1 km resolution. GSV represents the volume of all living trees with a diameter greater than 10 cm at breast height, measured from the ground or stump height to a top stem diameter of 0 cm, including the bark. We then used biomass expansion factors (BEFs) from the literature75,134 (Supplementary Table 1) to convert stem biomass into aboveground tree biomass, representing the biomass of tree stems, branches, foliage, flowers and seeds63. Root mass fraction (RMF) is the relative proportion of plant biomass distributed to roots65.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.