Alexander D'Amour

Staff Research Scientist, Google DeepMind
Cambridge, MA
alexdamour@google.com

I am a Staff Research Scientist at Google DeepMind in Cambridge, MA. Formerly, I was a Neyman Visiting Assistant Professor in the Department of Statistics at UC Berkeley. I did my PhD in the Department of Statistics at Harvard University, where I was advised by Edoardo Airoldi.

I work primarily on problems in causal inference, distribution shift, domain generalization, and fairness for building reliable, trustworthy Machine Learning/AI systems. More generally, I am interested in problems where simple prediction is not enough.

I am interested in all kinds of applications. In research and in consulting, I've worked on problems in sports, healthcare, education, social network analysis, marketing, finance, microfinance, and entertainment.

Papers

Note: This section is no longer maintained. Please see Google Scholar for up-to-date publications.

To Appear

On Multi-Cause Causal Inference with Unobserved Confounding: Counterexamples, Impossibility, and Alternatives
Contrary to some recent claims, evaluating many causes (e.g., treatments) simultaneously cannot eliminate unobserved confounding in observational causal inference. This paper demonstrates this point by counterexample.
Alexander D'Amour
To appear in Proceedings of AISTATS 2019

arXiv Preprint

Flexible sensitivity analysis for observational studies without observable implications
In causal inference, sensitivity analysis is meant to probe unidentifiable assumptions that are necessary for causal identification, but many sensitivity analysis methods inadvertently impose restrictions that have observable implications. We propose a flexible, interpretable framework that does not have this problem and is compatible with modern observed data modeling techniques.
Alexander Franks, Alexander D'Amour, and Avi Feller
To appear in the Journal of the American Statistical Association

arXiv Preprint

Under Review

Overlap in Observational Studies with High-Dimensional Covariates
In high dimensions, overlap is a stronger assumption than most people realize. This paper presents some implications.
Alexander D'Amour, Peng Ding, Avi Feller, Lihua Lei, and Jasjeet Sekhon

arXiv Preprint

Published

Reducing Reparameterization Gradient Variance
Control variate technique for reducing the variance of stochastic gradients used in Monte Carlo variational inference.
Andrew C. Miller, Nicholas J. Foti, Alexander D'Amour, and Ryan P. Adams
Advances in Neural Information Processing Systems (NIPS), 2017

arXiv Preprint

Meta-Analytics: Tools for Understanding the Statistical Properties of Sports Metrics
Introduces an ensemble of r-squared-style statistics to quantify the reliability and uniqueness of sports metrics.
Alexander Franks, Alexander D'Amour, Daniel Cervone, and Luke Bornn
Journal of Quantitative Analysis in Sports

A Multiresolution Stochastic Process Model for Predicting Basketball Possession Outcomes
Methodology for computing Expected Possession Value, an instantaneous expected point value for a basketball possession.
Daniel Cervone, Alexander D'Amour, Luke Bornn, and Kirk Goldsberry
Journal of the American Statistical Association

Disambiguation and Co-authorship Networks of the U.S. Patent Inventor Database
A supervised learning approach to adding unique inventor identifiers to the US patent database.
G. Li, R. Lai, Alexander D'Amour, D. Doolin, Y. Sun, V. Torvik, A. Yu, and L. Fleming
Research Policy, 2014.

Journal Paper

Estimating Rates of Carriage Acquisition and Clearance and Competitive Ability for Pneumococcal Serotypes in Kenya With a Markov Transition Model
Markov model approach to estimating epidemiological properties of pneumococcal serotypes using periodic testing data from Kenyan schoolchildren.
M. Lipsitch, O. Abdullahi, Alexander D'Amour, W. Xie, D. Weinberger, E. Tchetgen, and J. Scott
Epidemiology, 2012.

Journal Paper

Improving Major League Park Factor Estimates
An ANOVA approach to estimating park factors in Major League Baseball. Written in conjunction with the Harvard Sports Analysis Collective.
R. Acharya, A. Ahmed, Alexander D'Amour, H. Lu, C. Morris, B. Oglevee, A. Peterson, and R. Swift

Journal Paper

Dissertation

The Effective Estimand
A framework for characterizing the scientific usefulness of an estimator derived from a misspecified model. Generalizes the work on networks to general modeling tasks.
Alexander D'Amour and Edoardo Airoldi

In preparation.
Working Draft

Misspecification, Sparsity, and Superpopulation Inference for Sparse Social Networks
Theoretical characterization of how the sparse scaling of social networks undermines superpopulation investigations when the sparsity is not modeled exactly. Proposes sparsity-invariant modeling and inference methodology.
Alexander D'Amour and Edoardo Airoldi

In preparation.
Working Draft
Slides

Causal Inference with Sparse Social-Interaction-Valued Outcomes
Extension of sparsity-invariant methodology for network data to causal settings.
Alexander D'Amour and Edoardo Airoldi

In preparation.
Working Draft

Talks, Posters, Other Media

Talks

Overlap in High Dimensions
Surprisingly strong implications of the overlap assumption that is usually invoked in high-dimensional causal inference. Upshot: in high dimensions, the overlap assumption approaches a balance assumption.
Invited talk at the Berkeley Division of Biostatistics Seminar, October 2017 at UC Berkeley.
Invited talk at the Atlantic Causal Inference Conference, May 2017 at UNC Chapel Hill.

Slides

Advances in Basketball Analytics Using Player-Tracking Data
High-level overview of new quantitative methods for understanding basketball, implemented by the XY Research group using player-tracking data from the NBA.
Invited talk at Consortium for Data Analytics and Risk, October 2017 at UC Berkeley.
Invited talk at Boston ML meetup, July 2016 in Boston, MA.

Video

Prediction is Not Enough: Designing decision-support statistics for causal inference and attribution
Exploration of statistical applications where the objective requires more than the ability to predict future replications of the observed data stream.
Invited talk at Clarify Health Solutions in San Francisco, CA.
Invited talk at Lumos Labs in San Francisco, CA.

Slides

A Design-Based Perspective on Variable Selection
An approach to variable selection that treats it as a design choice, namely choosing which conditional distribution to model. Some preliminary thoughts on optimal data-splitting.
Talk given in the Harvard Statistics Department's Research in Statistics student colloquium.

Colloquium Slides

Posters

Extrapolation Parameterizations for Assessing Sensitivity to Unmeasured Confounding
Proposes the extrapolation factorization for sensitivity analysis in causal inference, which explicitly separates identified and unidentified parts of the potential outcomes model.

Poster

Move or Die: How Ball Movement Creates Open Shots in the NBA
Uses summaries of a Markov model for basketball possessions to show that ball movement is effective only inasmuch as it introduces unpredictability into an NBA offense.
Winner: Best Poster, 2015 Sloan Sports Analytics Conference.

Poster

Popular Media

Bayesian Statistician
You're the Expert (radio show)

Podcast

Behind Databall: A Discussion on the Methodology of Expected Possession Value
Grantland

Article

Teaching

Classes

At Berkeley, I have taught the following courses:

Statistics 153: Time Series Analysis (Spring 2017, Fall 2017)
Statistics 298, 278B: Causal Inference Reading Group (Fall 2016, Spring 2017, Fall 2017, Spring 2018)
Statistics 88: Probability and Mathematical Statistics for Data Science (Fall 2016)

At Harvard, I was a teaching fellow for the following courses:

Statistics 220: Bayesian Data Analysis (Fall 2011, Fall 2012)
Statistics 221: Statistical Computation and Visualization (Spring 2013)
Statistics 225: Spatial Statistics (Spring 2014)
Statistics 121/Computer Science 109: Data Science (Fall 2013, Fall 2014)
Statistics 107: Financial Statistics (Spring 2012)

Awards

2014 David Pickard Memorial Teaching Fellow.
Four-time awardee of the Certificate for Distinction in Teaching.

Consulting

In a previous life, I fielded many applied statistical problems from industry in a Data Science consulting practice. I was a founding partner of Damyata, LLC, a consultancy that I started with two tech industry veterans. Our mission was to establish best practices in Data Science by delivering state-of-the-art data-driven systems to our clients. A core part of that mission was to foster academic-industry research partnerships.

Former consulting clients include