List of my publications in reverse chronological order.
Projecting the climate penalty on PM2.5 pollution with spatial deep learning (short)
Workshop on Tackling Climate Change with Machine Learning at ICLR, 2023
The climate penalty measures the effects of a
changing climate on air quality due to the interaction
of pollution with climate factors, independently of
future changes in emissions. This work introduces a
statistical framework for estimating the climate
penalty on soot pollution (PM2.5), which has been
linked to respiratory and cardiovascular diseases
and premature mortality.
The framework is used to evaluate the disparities in
future PM2.5 exposure across racial/ethnic and income groups.
The findings of this study have the potential to
inform mitigation policy aiming to protect public
health and promote environmental equity in addressing
the effects of climate change.
The proposed methodology significantly improves upon
existing statistical methods for estimating the
climate penalty. It will use higher-resolution
climate inputs, which current statistical approaches
cannot accommodate, by means of an expressive and scalable
predictive model based on spatial deep learning with
spatiotemporal trend estimation. It will also integrate
additional predictive data sources such as demographics
and geology. This approach allows us to consider
regional dependencies and synoptic weather patterns
that influence PM2.5, and to deconvolve them from the
effects of exogenous factors, such as the trend of
increasing air quality regulation and other sources
of unmeasured spatial heterogeneity.
CRE: An R package for interpretable discovery and inference of heterogeneous treatment effects [UNDER REVIEW]
Journal of Open Source Software
In health and social sciences, it is critically
important to identify interpretable subgroups of the
study population where a treatment has notable
heterogeneity in the causal effects with respect to
the average treatment effect (ATE). Several
approaches have already been proposed for heterogeneous
treatment effect (HTE) discovery, either by first
estimating the conditional average treatment effect (CATE)
and identifying heterogeneous subgroups in a second
stage, or through a direct data-driven procedure.
Many of these methodologies are based on decision
trees. Tree-based approaches rely on efficient and
easily implementable recursive mathematical programming
(e.g., HTE maximization), can be easily adapted to
different scenarios depending on the research question
of interest, and guarantee a high degree of
interpretability, i.e., the degree to which a human
can understand the cause of a decision. Despite these
appealing features, single-tree heterogeneity
discovery is characterized by two main limitations:
instability in the identification of the subgroups and
reduced exploration of the potential heterogeneity.
To address these shortcomings, Bargagli et al.
(2023) proposed Causal Rule Ensemble, a new method
for interpretable HTE characterization in terms of
decision rules, via an extensive exploration of
heterogeneity patterns by an ensemble-of-trees
approach, enforcing high stability in the discovery.
CRE is an R package providing a flexible
implementation of the Causal Rule Ensemble algorithm.
Causal Rule Ensemble: Interpretable Discovery and Inference of Heterogeneous Treatment Effects
In health and social sciences, it is critically important to identify subgroups of the study population where a treatment has notable heterogeneity in the causal effects with respect to the average treatment effect. Data-driven discovery of heterogeneous treatment effects (HTE) via decision tree methods has been proposed for this task. Despite its high interpretability, single-tree discovery of HTE tends to be highly unstable and to find an oversimplified representation of treatment heterogeneity. To address these shortcomings, we propose Causal Rule Ensemble (CRE), a new method to discover heterogeneous subgroups through an ensemble-of-trees approach. CRE has the following features: it 1) provides an interpretable representation of the HTE; 2) allows extensive exploration of complex heterogeneity patterns; and 3) guarantees high stability in the discovery. The discovered subgroups are defined in terms of interpretable decision rules, and we develop a general two-stage approach for estimating subgroup-specific conditional causal effects, providing theoretical guarantees. Via simulations, we show that the CRE method has a strong discovery ability and a competitive estimation performance when compared to state-of-the-art techniques. Finally, we apply CRE to discover the subgroups most vulnerable to the effects of exposure to air pollution on mortality among 35.3 million Medicare beneficiaries across the contiguous U.S.
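As a rough illustration of what "heterogeneity with respect to the average treatment effect" means for a single decision rule, the sketch below compares the overall effect with the effect inside and outside a hand-written rule. It is not the CRE algorithm: the rule, the toy data, and the names `diff_in_means` and `rule_effects` are all hypothetical, and a plain difference in means is only a valid effect estimate under the assumption of randomized treatment.

```python
from statistics import mean


def diff_in_means(rows):
    """Difference in mean outcome between treated (z=1) and control (z=0) rows."""
    treated = [r["y"] for r in rows if r["z"] == 1]
    control = [r["y"] for r in rows if r["z"] == 0]
    return mean(treated) - mean(control)


def rule_effects(rows, rule):
    """Return (overall effect, effect where the rule holds, effect where it does not)."""
    inside = [r for r in rows if rule(r)]
    outside = [r for r in rows if not rule(r)]
    return diff_in_means(rows), diff_in_means(inside), diff_in_means(outside)


# Hypothetical toy data: the treatment raises the outcome by 2 only when age > 65.
data = [
    {"age": 70, "z": 1, "y": 5.0}, {"age": 72, "z": 0, "y": 3.0},
    {"age": 80, "z": 1, "y": 6.0}, {"age": 81, "z": 0, "y": 4.0},
    {"age": 30, "z": 1, "y": 1.0}, {"age": 35, "z": 0, "y": 1.0},
    {"age": 40, "z": 1, "y": 2.0}, {"age": 45, "z": 0, "y": 2.0},
]

# The decision rule "age > 65" is an assumption made up for this example.
ate, effect_in_rule, effect_out_of_rule = rule_effects(data, lambda r: r["age"] > 65)
print(ate, effect_in_rule, effect_out_of_rule)  # 1.0 2.0 0.0
```

Here the overall effect (1.0) hides the fact that the whole effect is concentrated in the subgroup selected by the rule, which is the kind of pattern a rule-based discovery method aims to surface.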
Towards Robust and Adaptive Motion Forecasting: A Causal Representation Perspective (long)
IEEE Conference on Computer Vision and Pattern Recognition, 2022
Learning behavioral patterns from observational data has been a de facto approach to motion forecasting. Yet, the current paradigm suffers from two shortcomings: it is brittle under covariate shift and inefficient for knowledge transfer. In this work, we propose to address these challenges from a causal representation perspective. We first introduce a causal formalism of motion forecasting, which casts the problem as a dynamic process with three groups of latent variables, namely invariant mechanisms, style confounders, and spurious features. We then introduce a learning framework that treats each group separately: (i) unlike the common practice of merging datasets collected from different locations, we exploit their subtle distinctions by means of an invariance loss encouraging the model to suppress spurious correlations; (ii) we devise a modular architecture that factorizes the representations of invariant mechanisms and style confounders to approximate a causal graph; (iii) we introduce a style consistency loss that not only enforces the structure of style representations but also serves as a self-supervisory signal for test-time refinement on the fly. Experimental results on synthetic and real datasets show that our three proposed components significantly improve the robustness and reusability of the learned motion representations, outperforming prior state-of-the-art motion forecasting models for out-of-distribution generalization and low-shot transfer.
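The abstract does not specify the form of the invariance loss, so the following is only a sketch of the general idea of keeping environments separate instead of merging them. It uses a stand-in penalty, the variance of per-environment risks, which is one common choice and not necessarily the one used in the paper. The names `invariance_objective`, `env_a`, and `env_b`, and the toy features, are hypothetical.

```python
from statistics import mean, pvariance


def mse(preds, targets):
    """Mean squared error between two equally long sequences."""
    return mean((p - t) ** 2 for p, t in zip(preds, targets))


def invariance_objective(env_batches, predict, weight=1.0):
    """Average risk across environments plus a penalty on how much the risks differ.

    The variance-of-risks penalty is an assumed stand-in for an invariance loss.
    """
    risks = [mse([predict(x) for x in xs], ys) for xs, ys in env_batches]
    return mean(risks) + weight * pvariance(risks), risks


# Two hypothetical environments. Feature x[0] relates to y in the same way in both
# (y = 2 * x[0]); feature x[1] equals y in one environment and -y in the other.
env_a = ([(1.0, 2.0), (2.0, 4.0)], [2.0, 4.0])
env_b = ([(1.0, -2.0), (2.0, -4.0)], [2.0, 4.0])


def invariant_predictor(x):
    return 2.0 * x[0]  # relies on the stable feature


def spurious_predictor(x):
    return x[1]  # relies on the feature whose sign flips across environments


obj_invariant, _ = invariance_objective([env_a, env_b], invariant_predictor)
obj_spurious, _ = invariance_objective([env_a, env_b], spurious_predictor)
print(obj_invariant, obj_spurious)  # 0.0 420.0
```

The predictor that leans on the environment-specific feature fits one location perfectly but is penalized heavily once risks are compared across locations, which is the intuition behind exploiting the distinctions between datasets.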
Towards Robust and Adaptive Motion Forecasting: A Causal Representation Perspective (short)
Workshop on Distribution Shifts at NeurIPS, 2021
Learning behavioral patterns from observational data has been a de facto approach to motion forecasting. Yet, the current paradigm suffers from two fundamental shortcomings: it is brittle under covariate shift and inefficient for knowledge transfer. In this work, we propose to address these challenges from a causal representation perspective. We first introduce a causal formalism of motion forecasting, which casts the problem as a dynamic process with physical mechanisms, style confounders, and spurious correlations. We then propose two components that explicitly promote the robustness and reusability of the learned motion representations: (i) unlike the common practice of merging datasets collected from different locations, we exploit their subtle distinctions by means of an invariance loss function, which encourages the model to suppress spurious correlations and capture physical mechanisms; (ii) we devise a modular architecture that factorizes the representations of physical laws and motion styles in a structured way, and progressively prune their dense connections during training to approximate a sparse causal graph. We empirically validate the strength of the proposed method for robust generalization in controlled real-world experiments. We finally discuss the challenges and opportunities in the presence of style shifts through synthetic simulations.
Quantification of the available area for rooftop photovoltaic installation from overhead imagery using convolutional neural networks
Journal of Physics: Conference Series, 2021
The integration of solar technology in the built environment is realized mainly through rooftop-installed panels. In this paper, we combine state-of-the-art machine learning and computer vision techniques with high-resolution overhead images to geo-localize the rooftop surfaces available for solar panel installation. We further associate them with the corresponding buildings by means of a geospatial post-processing approach. The stand-alone convolutional neural network used to segment suitable rooftop areas reaches an intersection over union of 64% and an accuracy of 93%, while a post-processing step using a building database improves the rejection of false positives. The model is applied to a case study area in the canton of Geneva, and the results are compared with another recent method used in the literature to derive the available area.
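For readers unfamiliar with the two segmentation metrics quoted above, the sketch below computes intersection over union and pixel accuracy for a pair of binary masks. The function name `iou_and_accuracy` and the tiny masks are hypothetical and serve only to show how the two numbers are defined, not how the paper's evaluation was run.

```python
def iou_and_accuracy(pred, truth):
    """Intersection over union and pixel accuracy for two same-shaped binary masks."""
    tp = fp = fn = tn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1  # predicted rooftop, actually rooftop
            elif p:
                fp += 1  # predicted rooftop, actually background
            elif t:
                fn += 1  # missed rooftop pixel
            else:
                tn += 1  # background correctly left out
    union = tp + fp + fn
    iou = tp / union if union else 1.0  # two empty masks count as a perfect match
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return iou, accuracy


# Hypothetical 2x4 masks: 1 marks a pixel labeled as available rooftop area.
pred_mask = [[1, 1, 0, 0],
             [1, 0, 0, 0]]
true_mask = [[1, 0, 0, 0],
             [1, 1, 0, 0]]

iou, accuracy = iou_and_accuracy(pred_mask, true_mask)
print(iou, accuracy)  # 0.5 0.75
```

Accuracy counts correctly predicted background pixels while intersection over union ignores them, so accuracy is typically the higher of the two when background dominates the image.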