# Alexander D'Amour

Neyman Visiting Assistant Professor, Department of Statistics, UC Berkeley

alexdamour@berkeley.edu

I am currently the Neyman Visiting Assistant Professor in the Department of Statistics at UC Berkeley. I did my PhD in the Department of Statistics at Harvard University, where I was advised by Edoardo Airoldi. I was a member of the Harvard Laboratory for Applied Statistical Methodology & Data Science.

At Berkeley, I am focusing on causal inference in observational studies with high-dimensional covariates. I am co-instructing a causal inference reading group with Peng Ding, Avi Feller, and Will Fithian.

Very broadly, I am interested in developing foundational principles for applied statistics and Data Science that unify themes in design, modeling, inference, and decision rules. I am particularly interested in:

- Decision problems where simple prediction is not enough. These problems include **causal inference**, **attribution**, **hypothesis generation**, and **experimental design**.
- Applying **machine learning** to causal inference problems.
- Problems that require **transportability** to new contexts.
- Network, event, and spatial data.
- Improving statistical education and statistical practice.

I am an active member of the XY Research group, which conducts research in sports statistics with a focus on player-tracking data.

## Papers

Under Review

**Overlap in Observational Studies with High-Dimensional Covariates**

In high dimensions, overlap is a stronger assumption than most people realize. This paper presents some implications.

**Alexander D'Amour**, Peng Ding, Avi Feller, Lihua Lei, and Jasjeet Sekhon

Dissertation

**The Effective Estimand**

A framework for characterizing the scientific usefulness of an estimator derived from a misspecified model.
Generalizes the work on networks to general modeling tasks.

**Alexander D'Amour** and Edoardo Airoldi

- In preparation.
- Working Draft

**Misspecification, Sparsity, and Superpopulation Inference for Sparse Social Networks**

Theoretical characterization of how the sparse scaling of social networks undermines superpopulation investigations when the sparsity is not modeled exactly.
Proposes sparsity-invariant modeling and inference methodology.

**Alexander D'Amour** and Edoardo Airoldi

- In preparation.
- Working Draft
- Slides

**Causal Inference with Sparse Social-Interaction-Valued Outcomes**

Extension of sparsity-invariant methodology for network data to causal settings.

**Alexander D'Amour** and Edoardo Airoldi

- In preparation.
- Working Draft

Published

**Reducing Reparameterization Gradient Variance**

Control variate technique for reducing the variance of stochastic gradients used in Monte Carlo variational inference.

Andrew C. Miller, Nicholas J. Foti, **Alexander D'Amour**, and Ryan P. Adams

*Advances in Neural Information Processing Systems (NIPS), 2017*

**Meta-Analytics: Tools for Understanding the Statistical Properties of Sports Metrics**

Introduces an ensemble of r-squared-style statistics to quantify the reliability and uniqueness of sports metrics.

Alexander Franks, **Alexander D'Amour**, Daniel Cervone, and Luke Bornn

*Journal of Quantitative Analysis in Sports*

**A Multiresolution Stochastic Process Model for Predicting Basketball Possession Outcomes**

Methodology for computing Expected Possession Value, an instantaneous expected point value for a basketball possession.

Daniel Cervone, **Alexander D'Amour**, Luke Bornn, and Kirk Goldsberry

*Journal of the American Statistical Association*

**Disambiguation and Co-authorship Networks of the U.S. Patent Inventor Database**

A supervised learning approach to adding unique inventor identifiers to the US patent database.

G. Li, R. Lai, **Alexander D'Amour**, D. Doolin, Y. Sun, V. Torvik, A. Yu, and L. Fleming

*Research Policy*, 2014.

**Estimating Rates of Carriage Acquisition and Clearance and Competitive Ability for Pneumococcal Serotypes in Kenya With a Markov Transition Model**

Markov model approach to estimating epidemiological properties of *Pneumococcal* serotypes using periodic testing data from Kenyan schoolchildren.

M. Lipsitch, O. Abdullahi, **Alexander D'Amour**, W. Xie, D. Weinberger, E. Tchetgen, and J. Scott

*Epidemiology*, 2012.

**Improving Major League Park Factor Estimates**

An ANOVA approach to estimating park factors in Major League Baseball. Written in conjunction with the Harvard Sports Analysis Collective.

R. Acharya, A. Ahmed, **Alexander D'Amour**, H. Lu, C. Morris, B. Oglevee, A. Peterson, and R. Swift

## Talks, Posters, Other Media

Talks

**Overlap in High Dimensions**

Surprisingly strong implications of the overlap assumption that is usually invoked in high-dimensional causal inference. Upshot: in high dimensions, the overlap assumption approaches a balance assumption.

Invited talk at the *Berkeley Division of Biostatistics Seminar*, October 2017 at UC Berkeley.

Invited talk at the *Atlantic Causal Inference Conference*, May 2017 at UNC Chapel Hill.

**Advances in Basketball Analytics Using Player-Tracking Data**

High-level overview of new quantitative methods for understanding basketball, implemented by the XY Research group using player-tracking data from the NBA.

Invited talk at *Consortium for Data Analytics and Risk*, October 2017 at UC Berkeley.

Invited talk at *Boston ML* meetup, July 2016 in Boston, MA.

**Prediction is Not Enough: Designing decision-support statistics for causal inference and attribution**

Exploration of statistical applications where the objective requires more than the ability to predict future replications of the observed data stream.

Invited talk at *Clarify Health Solutions* in San Francisco, CA.

Invited talk at *Lumos Labs* in San Francisco, CA.

**A Design-Based Perspective on Variable Selection**

An approach to variable selection that treats it as a design choice, namely choosing which conditional distribution to model. Some preliminary thoughts on optimal data-splitting.

Talk given in the Harvard Statistics Department's Research in Statistics student colloquium.

Posters

**Extrapolation Parameterizations for Assessing Sensitivity to Unmeasured Confounding**

Proposes the extrapolation factorization for sensitivity analysis in causal inference, which explicitly separates identified and unidentified parts of the potential outcomes model.

**Move or Die: How Ball Movement Creates Open Shots in the NBA**

Uses summaries of a Markov model for basketball possessions to show that ball movement is effective only inasmuch as it introduces *unpredictability* into an NBA offense.

Winner: Best Poster, 2015 Sloan Sports Analytics Conference.

Popular Media

**Bayesian Statistician**

*You're the Expert* (radio show)

**Behind Databall: A Discussion on the Methodology of Expected Possession Value**

*Grantland*

## Teaching

Classes

At Berkeley, I have taught the following courses:

- **Statistics 153**: Time Series Analysis (Spring 2017, Fall 2017)
- **Statistics 298, 278B**: Causal Inference Reading Group (Fall 2016, Spring 2017, Fall 2017, Spring 2018)
- **Statistics 88**: Probability and Mathematical Statistics for Data Science (Fall 2016)

At Harvard, I was a teaching fellow for the following courses:

- **Statistics 220**: Bayesian Data Analysis (Fall 2011, Fall 2012)
- **Statistics 221**: Statistical Computation and Visualization (Spring 2013)
- **Statistics 225**: Spatial Statistics (Spring 2014)
- **Statistics 121/Computer Science 109**: Data Science (Fall 2013, Fall 2014)
- **Statistics 107**: Financial Statistics (Spring 2012)

Awards

- 2014 David Pickard Memorial Teaching Fellow.
- Four-time awardee of the Certificate for Distinction in Teaching.

## Consulting

I field many applied statistical problems from industry in an active Data Science consulting practice. I am a founding partner of Damyata, LLC, a consultancy that I co-founded with two tech industry veterans. Our mission is to establish best practices in Data Science by delivering state-of-the-art data-driven systems to our clients. A core part of our mission is to foster academic-industry research partnerships.

Former and current consulting clients include