Data-driven solutions

In an age with more climate data than ever before, how can we harness it to drive genuinely evidence-based climate action?

9th November 2018

Hurricane Florence approaching the east coast of the USA, as viewed from the International Space Station. Although climate is a ‘big data’ science, only a small amount of the data has been systematically analysed
© ESA/NASA–A. Gerst

By Auroop R. Ganguly, Evan Kodra, Udit Bhatia, Mary Elizabeth Warner, Kate Duffy, Arindam Banerjee and Sangram Ganguly

The fourth and fifth assessment reports by the Intergovernmental Panel on Climate Change have declared global warming to be “unequivocal” and anthropogenic drivers to be “extremely likely” as the dominant cause. These conclusions, expressed in cautious scientific language governed by strict criteria, need to serve as a clarion call for action – from urban communities to world bodies – to prepare for what is likely, and to prevent the worst from happening.

Climate adaptation and mitigation have been respectively called “managing the unavoidable” and “avoiding the unmanageable”.1 While individuals, communities and nations may understand that there are inherent costs to both climate action and inaction, developing a comprehensive, evidence-based, scientifically credible and risk-informed action framework is not straightforward.

National and global climate mitigation policies include investments in renewable energy, carbon capture and storage solutions, divestment from fossil fuels, and environmental and land-use regulations. The will of nations to act may depend on how they perceive climate impacts on, and the vulnerability of, their respective societies and institutions – and on the challenges in adaptation.

Impacted sectors include natural resources (such as food, energy, water and ecosystems3,4,5,6), hazards and humanitarian aid (for example, critical infrastructures resilience7), as well as population growth and movement (such as environmental refugees).

However, while predictive insights from climate models and data are usually more credible at aggregate scales in space and time, climate action may be better suited to the local scale2, such as within urban communities. Urbanisation contributes significantly to emissions and land-use change, and hence to climate change, while urban areas are themselves significantly impacted by climate change.

Communities in urban, peri-urban or rural regions need to understand, adapt to and mitigate risk elements. These include global-scale and locally exacerbated hazards (such as global warming and urban heatwaves), vulnerabilities of infrastructures and lifelines (including natural-built and grey–green infrastructures), as well as exposure of economic assets, ecosystem services and human populations.

Data-driven understanding and predictive insights can improve risk-informed adaptation and mitigation in three ways: (i) through improved understanding of earth systems science and engineering, translating to the probabilities and attributes of stresses and shocks; (ii) through risk frameworks, including risk assessments, which consider threat, vulnerability and exposure, emergency management (including preparedness and recovery), as well as time-phased and flexible adaptation strategies; and (iii) through mitigation at multiple scales, from global and national to urban and community.
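To make the risk-assessment idea concrete, here is a minimal sketch of how threat, vulnerability and exposure might be combined into a single number. The multiplicative form (expected annual loss = threat probability × vulnerability × exposure) is a common simplification and an assumption here, not a formula given in this article. The class, function and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RiskInputs:
    """Hypothetical inputs for one hazard affecting one asset group."""
    threat_probability: float  # annual probability that the hazard occurs (0 to 1)
    vulnerability: float       # fraction of exposed value lost if it occurs (0 to 1)
    exposure: float            # value of the exposed assets, in any currency


def expected_annual_loss(inputs: RiskInputs) -> float:
    """Combine the three terms multiplicatively (an assumed, simplified risk model)."""
    if not 0.0 <= inputs.threat_probability <= 1.0:
        raise ValueError("threat_probability must lie between 0 and 1")
    if not 0.0 <= inputs.vulnerability <= 1.0:
        raise ValueError("vulnerability must lie between 0 and 1")
    if inputs.exposure < 0.0:
        raise ValueError("exposure must be non-negative")
    return inputs.threat_probability * inputs.vulnerability * inputs.exposure


if __name__ == "__main__":
    # Illustrative numbers only: a 1-in-100-year flood, half the value lost, 1m at stake.
    flood = RiskInputs(threat_probability=0.01, vulnerability=0.5, exposure=1_000_000)
    print(f"Expected annual loss: {expected_annual_loss(flood):,.0f}")
```

In such a sketch, adaptation would show up as a lower vulnerability or exposure term, whereas mitigation would aim to reduce the threat probability over time.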

Climate model simulations and remotely sensed observations already exceed petabyte scales (one petabyte equals 1,000 terabytes) and are expected to reach a few hundred petabytes within the next couple of decades. But even though climate is now a ‘big data’ science, only a small fraction of the available data has been systematically analysed. Furthermore, even as ‘big models’ are increasing in space-time resolutions and complexity, this is not necessarily leading to more certainty in stakeholder-relevant insights.11,12

Machine learning
So, a group of climate modellers13 have resorted to machine learning (ML) – a subfield of artificial intelligence – to estimate parameters for high-resolution atmospheric processes14 such as convection15. Others have explored ML-based post-processing of model simulations16,17,18,19, often guided by physics20,21,22 to obtain finer-scale projections. Beyond atmospheric science, terrestrial ecology has benefited from ML through the creation of a global plant attribute database23,24, which in turn used an advanced data-driven parameter estimation method within numerical models25.

Despite these efforts, our lack of understanding of complex climate processes and feedbacks, as well as sources of irreducible uncertainty26, may persist.

First, greenhouse gas emissions and land-use change scenarios that drive the models are not precise predictions with probabilities, but are what-if scenarios.

Second, gaps in our knowledge of the climate system may not be easily plugged. Uncertainties result from the spread across model simulations as well as from the mismatch between model hindcasts (simulations of the past) and observations.

Third, inherent variability exists in the climate system, including extreme sensitivity to initial conditions26, which contributes to the irreducible component of uncertainty.

These three components contribute to the overall uncertainty. In the crucial 0–30 year near term, the projected climate change signal may be within the bounds of this overall uncertainty, which may in turn be dominated by the inherent variability27,28.
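The three components can be illustrated with a simple variance partition over an ensemble of projections. The sketch below is an assumption-laden toy and not the method used in the cited studies. The ensemble is synthetic, the offsets and noise level are invented for illustration, and the function names are hypothetical. It shows how a near-term signal can sit inside an overall spread that is dominated by inherent (internal) variability.

```python
import random
import statistics


def make_synthetic_ensemble(seed=0):
    """Build changes[scenario][model] = list of runs (synthetic values, in degrees C)."""
    rng = random.Random(seed)
    scenario_offsets = [-0.05, 0.0, 0.05]        # invented what-if scenario differences
    model_offsets = [-0.15, -0.05, 0.05, 0.15]   # invented model-to-model differences
    return [
        [
            # Ten initial-condition runs per model; the noise stands in for internal variability.
            [0.5 + s + m + rng.gauss(0.0, 0.3) for _ in range(10)]
            for m in model_offsets
        ]
        for s in scenario_offsets
    ]


def partition_uncertainty(changes):
    """Return the mean signal and each component's share of the summed variances."""
    # Internal variability: spread across runs of the same model and scenario.
    internal = statistics.fmean(
        [statistics.pvariance(runs) for by_model in changes for runs in by_model]
    )
    # Model uncertainty: spread across model means within each scenario.
    model = statistics.fmean(
        [
            statistics.pvariance([statistics.fmean(runs) for runs in by_model])
            for by_model in changes
        ]
    )
    # Scenario uncertainty: spread across scenario means.
    scenario_means = [
        statistics.fmean([x for runs in by_model for x in runs]) for by_model in changes
    ]
    scenario = statistics.pvariance(scenario_means)
    total = internal + model + scenario
    return {
        "signal": statistics.fmean(scenario_means),
        "internal": internal / total,
        "model": model / total,
        "scenario": scenario / total,
    }


if __name__ == "__main__":
    for name, value in partition_uncertainty(make_synthetic_ensemble()).items():
        print(f"{name}: {value:.3f}")
```

With real model output, the same bookkeeping would be applied per region and per time horizon, since the balance between the three components shifts with both.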

Climate challenges go beyond ‘big’ models and big data. Indeed, climate science is also dominated by what may be viewed as ‘small data’ challenges. Historical records from the data-poor eras of earth science are sparse, making historical reconstructions difficult to validate. Climate signals exhibit temporal fluctuations ranging from sub-seasonal to multi-decadal and even longer time scales (low-frequency variability).

The lack of historical data makes it challenging to understand changes or delineate signals from longer-term variations, although reconstructions based on models, instrumented records (even if relatively sparse), and proxy data (such as tree rings, fossils and ice cores) help to partially address aspects of the information gap.

Furthermore, climate data challenges are made worse by complex dependence. Tobler’s first law of geography states that “everything is related to everything else, but near things are more related than distant things”. While climate data exhibit this property in space and in time, long-range spatial dependence and persistence in time are also common.
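The temporal analogue of Tobler's law can be sketched with a lagged autocorrelation. The example below assumes a synthetic first-order autoregressive series as a stand-in for a climate anomaly record, and its function names are hypothetical. It captures only the short-range case, in which dependence decays with separation. The long-range dependence and persistence mentioned above would need other models.

```python
import random
import statistics


def simulate_ar1(n, phi, seed=0):
    """Generate a synthetic AR(1) series: x[t] = phi * x[t-1] + Gaussian noise."""
    rng = random.Random(seed)
    series = [0.0]
    for _ in range(n - 1):
        series.append(phi * series[-1] + rng.gauss(0.0, 1.0))
    return series


def autocorrelation(series, lag):
    """Sample autocorrelation of a series at a non-negative integer lag."""
    mean = statistics.fmean(series)
    deviations = [value - mean for value in series]
    denominator = sum(d * d for d in deviations)
    numerator = sum(
        deviations[i] * deviations[i + lag] for i in range(len(deviations) - lag)
    )
    return numerator / denominator


if __name__ == "__main__":
    anomalies = simulate_ar1(n=5000, phi=0.8, seed=1)
    for lag in (1, 5, 20):
        print(f"lag {lag:2d}: autocorrelation = {autocorrelation(anomalies, lag):.2f}")
```

The spatial version would replace the time lag with the distance between grid cells or stations and correlate the series at each pair of locations.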

Finally, climate change is not just a matter of mean change (for example, global warming) but also about changes in the patterns of extremes such as heatwaves, cold snaps, heavy rain, droughts, floods and hurricanes. Weather extremes turn into catastrophic disasters when hazards (e.g., a hurricane) are aligned with infrastructural (e.g., an inadequately designed dam or levee) and societal (e.g., economic disparity) vulnerability, exposure of people and assets (e.g., businesses and natural resources), as well as a lack of emergency management plans. The relative rarity of such extremes adds to the ‘small data’ challenge and brings to the fore the need to manage, assimilate, analyse and interpret heterogeneous information.

Nonetheless, extracting predictive insights about the statistics of change and extremes is possible based on specialised data-driven methods such as extreme value theory, network science and signal processing. Thus, our research has examined complex dependence patterns29,30,31, low-frequency variability32 in climate (including for extremes33) and developed predictive insights. We have studied heatwaves and cold snaps34,35,36, heavy precipitation37,38,39, high winds40, droughts41 and urban climate extremes42.
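As a small illustration of extreme value theory, the sketch below fits a Gumbel distribution (the zero-shape special case of the generalised extreme value family) to annual maxima and reads off return levels. This is a deliberately simplified stand-in and not the method of the cited papers. The data are synthetic, the fit uses the method of moments rather than maximum likelihood, and all function names are hypothetical.

```python
import math
import random
import statistics

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def synthetic_annual_maxima(n_years=60, days_per_year=365, seed=0):
    """Annual maxima of synthetic daily values (exponential, mean 8; units arbitrary)."""
    rng = random.Random(seed)
    return [
        max(rng.expovariate(1.0 / 8.0) for _ in range(days_per_year))
        for _ in range(n_years)
    ]


def fit_gumbel_moments(annual_maxima):
    """Method-of-moments estimates of the Gumbel location and scale parameters."""
    scale = statistics.stdev(annual_maxima) * math.sqrt(6.0) / math.pi
    location = statistics.fmean(annual_maxima) - EULER_GAMMA * scale
    return location, scale


def return_level(location, scale, return_period_years):
    """Value exceeded on average once every return_period_years (must be > 1)."""
    non_exceedance = 1.0 - 1.0 / return_period_years
    return location - scale * math.log(-math.log(non_exceedance))


if __name__ == "__main__":
    location, scale = fit_gumbel_moments(synthetic_annual_maxima())
    for period in (10, 50, 100):
        print(f"{period}-year return level: {return_level(location, scale, period):.1f}")
```

Only 60 annual maxima feed the fit, which is the ‘small data’ problem in miniature: the 100-year return level is an extrapolation beyond the length of the record. A non-stationary climate would additionally require parameters that change over time.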

Our work on droughts and heavy precipitation has examined long-memory processes43 and teleconnections44. We have discussed45 deep uncertainty (i.e., where probabilities cannot be easily assigned) and non-stationarity (i.e., significant and fundamental change) in climate and hydrology, as well as in climate adaptation and resilient engineering, and the possibility of blending physics and data sciences to address these challenges48.

Figure 1 illustrates how simulations and observations from earth system science combined with ancillary information may help generate predictive insights in climate and develop risk assessments.

Attribution studies, where change patterns are related to possible causes46, are typically based on observations and model simulations. This is one area where we believe the climate science community can benefit significantly by interacting with a wider group of interdisciplinary scientists.

Stakeholders such as the US Department of Defense indicate that climate change is a threat multiplier across many sectors, and hence adaptation and mitigation are urgent and necessary. Data challenges in adaptation and mitigation sectors are diverse and disparate – ranging from big data to small data, information gaps and confidentiality issues – and are exacerbated by gaps in understanding processes and the possibility of cascading failures.

Purely data-driven methods, data-informed process-based approaches and physics-informed data science methods have all been found to be useful. Robust decisions and flexible-planning pathways have been suggested47. We have examined coastal processes49, water-energy nexus50,45, transportation networks51,52, public health and urban heatwaves53,54, and regulatory principles2. Figure 2, for example, shows how recovery strategies designed in anticipation of weather extremes can help save lives and money.

Future work may need to creatively leverage data from the public domain or from well-crafted simulations and testbeds. It may also incorporate confidential data that could still be used either through anonymisation or by following privacy regulations. The state of the art in critical infrastructures resilience7 offers specific examples.

Finally, the importance of economic incentives to overcome hurdles to best practice or to engineering innovation, as well as to policy myopia, cannot be overemphasised. These in turn may require analysis of financial, demographic and socio-economic data. Figure 3 suggests how a vicious cycle of maladaptation may be transformed to a virtuous cycle through improved incentives and innovations.

AR Ganguly, Bhatia, Warner and Duffy are at Sustainability and Data Sciences Laboratory (SDS Lab) of Northeastern University (NU) in Boston, MA, USA; Kodra is at the startup risQ (spinout of the SDS Lab) in Cambridge, MA; Banerjee is at the University of Minnesota (computer science) in Twin Cities, MN; S Ganguly is with the NASA Ames Research Center at Moffett Field, CA.


1. Scientific Expert Group on Climate Change (Rosina M. Bierbaum, John P. Holdren, Michael C. MacCracken, Richard H. Moss and Peter H. Raven, eds.), 2007. Confronting Climate Change: Avoiding the Unmanageable and Managing the Unavoidable. Sigma Xi and the United Nations Foundation, April 2007, 144pp.
2. Rolland, S.E., Pimentel, A. and Ganguly, A., 2014. Taking Climate Change by Storm: Theorizing Global and Local Policy-Making in Response to Extreme Weather Events. Buffalo Law Review, 62, 933.
3. Wheeler, T. and Von Braun, J., 2013. Climate change impacts on global food security. Science, 341(6145), pp.508-513.
4. Vörösmarty, C.J., McIntyre, P.B., Gessner, M.O., Dudgeon, D., Prusevich, A., Green, P., Glidden, S., Bunn, S.E., Sullivan, C.A., Liermann, C.R. and Davies, P.M., 2010. Global threats to human water security and river biodiversity. Nature, 467(7315), p.555.
5. Miara, A., Macknick, J.E., Vorosmarty, C.J., Tidwell, V.C., Newmark, R.L. and Fekete, B., 2017. Climate and water resource change impacts and adaptation potential for US power supply. Nature Climate Change, 7(NREL/JA-6A20-67678).
6. Kolbert, E., 2014. The sixth extinction: An unnatural history. A&C Black, 319pp.
7. Ganguly, A.R., Bhatia, U. and Flynn, S.E., 2018. Critical Infrastructures Resilience: Policy and Engineering Principles. Routledge (Taylor & Francis), 154pp.
8. Grimm, N.B., Faeth, S.H., Golubiewski, N.E., Redman, C.L., Wu, J., Bai, X. and Briggs, J.M., 2008. Global change and the ecology of cities. Science, 319(5864), pp.756-760.
9. Rosenzweig, C., Solecki, W., Hammer, S.A. and Mehrotra, S., 2010. Cities lead the way in climate–change action. Nature, 467(7318), p.909.
10. Overpeck, J.T., Meehl, G.A., Bony, S. and Easterling, D.R., 2011. Climate data challenges in the 21st century. Science, 331(6018), pp.700-702.
11. Maslin, M. and Austin, P., 2012. Uncertainty: Climate models at their limit? Nature, 486(7402), p.183.
12. Kumar, D., Kodra, E. and Ganguly, A.R., 2014. Regional and seasonal intercomparison of CMIP3 and CMIP5 climate model ensembles for temperature and precipitation. Climate Dynamics, 43(9-10), pp.2491-2518.
13. Voosen, P., 2018. The Earth Machine: Science insurgents plot a climate model driven by artificial intelligence. Science, 361(6400), pp.344-347.
14. Rasp, S., Pritchard, M.S. and Gentine, P., 2018. Deep learning to represent subgrid processes in climate models. Proceedings of the National Academy of Sciences.
15. O’Gorman, P.A. and Dwyer, J.G., 2018. Using machine learning to parameterize moist convection: Potential for modeling of climate, climate change and extreme events. arXiv preprint, arXiv:1806.11037.
16. Vandal, T., Kodra, E. and Ganguly, A.R., 2018a. Intercomparison of machine learning methods for statistical downscaling: the case of daily and extreme precipitation. Theoretical and Applied Climatology.
17. Vandal, T., Kodra, E., Dy, J., Ganguly, S., Nemani, R. and Ganguly, A.R., 2018b. Quantifying Uncertainty in Discrete-Continuous and Skewed Data with Bayesian Deep Learning. arXiv preprint arXiv:1802.04742. (Accepted in the 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, London, UK, August 2018).
18. Vandal, T., Kodra, E., Ganguly, S., Michaelis, A., Nemani, R. and Ganguly, A.R., 2017, August. Deepsd: Generating high resolution climate change projections through single image super-resolution. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 1663-1672), ACM.
19. Salvi, K., Ghosh, S. and Ganguly, A.R., 2016. Credibility of statistical downscaling under nonstationary climate. Climate Dynamics, 46(5-6), pp.1991-2023.
20. Das, D., Dy, J., Ross, J., Obradovic, Z. and Ganguly, A.R., 2014. Non-parametric Bayesian mixture of sparse regressions with application towards feature selection for statistical downscaling. Nonlinear Processes in Geophysics, 21(6), pp.1145-1157.
21. Ganguly, A.R., Kodra, E.A., Agrawal, A., Banerjee, A., Boriah, S., Chatterjee, Sn., Chatterjee, So., Choudhary, A., Das, D., Faghmous, J., Ganguli, P., Ghosh, S., Hayhoe, K., Hays, C., Hendrix, W., Fu, Q., Kawale, J., Kumar, D., Kumar, V., Liao, W., Liess, S., Mawalagedara, R., Mithal, V., Oglesby, R., Salvi, K., Snyder, P.K., Steinhaeuser, K., Wang, D. and Wuebbles, D., 2014. Toward enhanced understanding and projections of climate extremes using physics-guided data mining techniques. Nonlinear Processes in Geophysics, 21, pp.777-795.
22. Sahana, A.S. and Ghosh, S., 2018. An improved prediction of Indian summer Monsoon Onset from State of the Art Dynamic Model using Physics Guided Data Driven Approach. Geophysical Research Letters. doi: 10.1029/2018GL078319.
23. Schrodt, F., Kattge, J., Shan, H., Fazayeli, F., Joswig, J., Banerjee, A., Reichstein, M., Bönisch, G., Díaz, S., Dickie, J. and Gillison, A., 2015. BHPMF–a hierarchical Bayesian approach to gap‐filling and trait prediction for macroecology and functional biogeography. Global Ecology and Biogeography, 24(12), pp.1510-1521.
24. Shan, H., Kattge, J., Reich, P., Banerjee, A., Schrodt, F. and Reichstein, M., 2012. Gap Filling in the Plant Kingdom—Trait Prediction Using Hierarchical Probabilistic Matrix Factorization. Proceedings of the 29th International Conference on Machine Learning, ICML 2012, 1303-1310.
25. Butler, E.E., Datta, A., Flores-Moreno, H., Chen, M., Wythers, K.R., Fazayeli, F., Banerjee, A., Atkin, O.K., Kattge, J., Amiaud, B. and Blonder, B., 2017. Mapping local and global variability in plant trait distributions. Proceedings of the National Academy of Sciences.
26. Deser, C., Phillips, A., Bourdette, V. and Teng, H., 2012. Uncertainty in climate change projections: the role of internal variability. Climate Dynamics, 38(3-4), pp.527-546.
27. Kumar, D. and Ganguly, A.R., 2018. Intercomparison of model response and internal variability across climate model ensembles. Climate Dynamics, 51(1-2), pp.207-219.
28. Kirtman B., Power S.B., Adedoyin A.J., Boer G.J., Bojariu R., Camilloni I., Doblas-Reyes F., Fiore A.M., Kimoto, M., Meehl, G.A., Prather, M., Sarr, A., Schär, C., Sutton, R., van Oldenborgh, G.J., Vecchi, G. and Wang, H.J. (2013). Chapter 11 – Near-term climate change: Projections and predictability. In: Climate Change 2013: The Physical Science Basis. IPCC Working Group I Contribution to AR5. Eds. IPCC, Cambridge: Cambridge University Press.
29. Liess, S., Kumar, A., Snyder, P.K., Kawale, J., Steinhaeuser, K., Semazzi, F.H., Ganguly, A.R., Samatova, N.F., Kumar, V., 2014. Different modes of variability over the Tasman Sea: Implications for regional climate. Journal of Climate, 27(22), pp.8466-8486.
30. Steinhaeuser, K., Ganguly, A.R. and Chawla, N.V., 2012. Multivariate and multiscale dependence in the global climate system revealed through complex networks. Climate Dynamics, 39(3-4), pp.889-895.
31. Chatterjee, S., Steinhaeuser, K., Banerjee, A., Chatterjee, S. and Ganguly, A., 2012, April. Sparse group lasso: Consistency and climate applications. In Proceedings of the 2012 SIAM International Conference on Data Mining (pp. 47-58). Society for Industrial and Applied Mathematics.
32. Kodra, E., Ghosh, S. and Ganguly, A.R., 2012. Evaluation of global climate models for Indian monsoon climatology. Environmental Research Letters, 7(1), p.014012. (See online supplement).
33. Kuhn, G., Khan, S., Ganguly, A.R. and Branstetter, M.L., 2007. Geospatial–temporal dependence among weekly precipitation extremes with applications to observations and climate model simulations in South America. Advances in Water Resources, 30(12), pp.2401-2423.
34. Kodra, E. and Ganguly, A.R., 2014. Asymmetry of projected increases in extreme temperature distributions. Scientific Reports, Nature Publishing Group, 4, p.5884.
35. Ganguly, A.R., Steinhaeuser, K., Erickson, D.J., Branstetter, M., Parish, E.S., Singh, N., Drake, J.B. and Buja, L., 2009. Higher trends but larger uncertainty and geographic variability in 21st century temperature and heat waves. Proceedings of the National Academy of Sciences, 106(37), pp.15555-15559.
36. Kodra, E., Steinhaeuser, K. and Ganguly, A.R., 2011. Persisting cold extremes under 21st‐century warming scenarios. Geophysical Research Letters, 38(8).
37. Ghosh, S., Das, D., Kao, S.C. and Ganguly, A.R., 2012. Lack of uniform trends but increasing spatial variability in observed Indian rainfall extremes. Nature Climate Change, 2(2), pp. 86-91.
38. Kao, S.C. and Ganguly, A.R., 2011. Intensity, duration, and frequency of precipitation extremes under 21st‐century warming scenarios. Journal of Geophysical Research: Atmospheres, 116, D16119: 14 pp.
39. Khan, S., Kuhn, G., Ganguly, A.R., Erickson, D.J. and Ostrouchov, G., 2007. Spatio‐temporal variability of daily and weekly precipitation extremes in South America. Water Resources Research, 43(11).
40. Kumar, D., Mishra, V. and Ganguly, A.R., 2015. Evaluating wind extremes in CMIP5 climate models. Climate Dynamics, 45(1-2), pp.441-453.
41. Ganguli, P. and Ganguly, A.R., 2016a. Space-time trends in US meteorological droughts. Journal of Hydrology: Regional Studies, 8, pp.235-259.
42. Mishra, V., Ganguly, A.R., Nijssen, B. and Lettenmaier, D.P., 2015. Changes in observed climate extremes in global urban areas. Environmental Research Letters, 10(2), p.024005.
43. Ganguli, P. and Ganguly, A.R., 2016. Robustness of meteorological droughts in dynamically downscaled climate simulations. JAWRA Journal of the American Water Resources Association, 52(1), pp.138-167.
44. Das, D., Dy, J., Ross, J., Obradovic, Z. and Ganguly, A.R., 2014. Non-parametric Bayesian mixture of sparse regressions with application towards feature selection for statistical downscaling. Nonlinear Processes in Geophysics, 21(6), pp.1145-1157.
45. Ganguly, A.R., Kumar, D., Ganguli, P., Short, G. and Klausner, J., 2015. Climate adaptation informatics: water stress on power production. Computing in Science & Engineering, 17(6), pp.53-60.
46. Min, S.K., Zhang, X., Zwiers, F.W. and Hegerl, G.C., 2011. Human contribution to more-intense precipitation extremes. Nature, 470(7334), pp.378-381.
47. Kates, R.W., Travis, W.R. and Wilbanks, T.J., 2012. Transformational adaptation when incremental adaptations to climate change are insufficient. Proceedings of the National Academy of Sciences.
48. Shen, C., 2018. A trans‐disciplinary review of deep learning research and its relevance for water resources scientists. Water Resources Research.
49. Wang, D., Gouhier, T.C., Menge, B.A. and Ganguly, A.R., 2015. Intensification and spatial homogenization of coastal upwelling under climate change. Nature, 518(7539), pp.390-394.
50. Ganguli, P., Kumar, D. and Ganguly, A.R., 2017. US Power Production at Risk from Water Stress in a Changing Climate. Scientific Reports, Nature Publishing Group, 7(1), p.11983.
51. Bhatia, U., Kumar, D., Kodra, E. and Ganguly, A.R., 2015. Network-science based quantification of resilience demonstrated on the Indian Railways Network. PLOS ONE, 10(11), p. e0141890.
52. Clark, K.L., Bhatia, U., Kodra, E.A. and Ganguly, A.R., 2018. Resilience of the US National Airspace System Airport Network. IEEE Transactions on Intelligent Transportation Systems. DOI: 10.1109/TITS.2017.2784391.
53. Fard, B.J., Hassanzadeh, H., Warner, M.E., Bhatia, U. and Ganguly, A.R., 2018. Integrated climate risk assessment: A practical approach to inform action plan for heatwave threat to public health. Town of Brookline and AGU Thriving Earth Exchange.
54. Boston Research Advisory Group (BRAG), 2016. Climate change and sea level rise projections for Boston. Climate Ready Boston, 60pp.