Spring 2023 Seminars

List of Seminars

Sufficient Cause Urn Analysis of Principal Stratification Methods

Jaffer Zaidi

Assistant Professor

Department of Global and Community Health

George Mason University

 

Date: Friday, April 28, 2023

Abstract

The analysis of causal effects when the outcome of interest is possibly truncated by death has a long history in statistics. The survivor average causal effect is commonly identified either under assumptions stronger than those guaranteed by the design of a randomized clinical trial or by means of sensitivity analysis. This paper demonstrates that individual-level causal effects in the 'always survivor' principal stratum can be identified with no stronger identification assumptions than randomization. We extend Rothman's sufficient cause model to derive further results, providing a unified framework for sensitivity analysis of different identification strategies for principal stratification causal effects.

About the Speaker

Dr. Jaffer Zaidi is an assistant professor in the Department of Global and Community Health. His research interests are primarily within causal inference, including but not limited to sufficient cause methods, sensitivity analysis, principal stratification, and interaction analysis. Before coming to Mason, Zaidi was a postdoctoral research fellow at the University of North Carolina at Chapel Hill, funded through SAMSI (Statistical and Applied Mathematical Sciences Institute), and has conducted research in Somkhele, South Africa at the Africa Health Research Institute (AHRI).

Event Organizers

Abolfazl Safikhani

Nicholas Rios

Causal Inference with Interference

Michael Hudgens

Professor and Associate Chair

Department of Biostatistics

University of North Carolina

Date: Friday, April 21, 2023

Abstract

A fundamental assumption usually made in causal inference is that of no interference between individuals (or units), i.e., the potential outcomes of one individual are assumed to be unaffected by the treatment assignment of other individuals. However, in many settings, this assumption obviously does not hold. For example, in infectious diseases, whether one person becomes infected may depend on who else in the population is vaccinated. In this talk we will discuss recent approaches to assessing treatment effects in the presence of interference.

About the Speaker

Dr. Michael Hudgens is a Professor and Associate Chair of the Department of Biostatistics at the University of North Carolina. He also serves as the Co-Director of the UNC Causal Inference Research Lab. Professor Hudgens has co-authored approximately 300 peer-reviewed papers in statistical journals such as Biometrics, Biometrika, JASA, and JRSS-B, as well as biomedical journals such as the Lancet, Nature, and New England Journal of Medicine. He currently serves as an associate editor for Biometrics. He is an elected fellow of the American Statistical Association and has taught graduate-level biostatistics courses at UNC for over 15 years.

Event Organizers

Abolfazl Safikhani

Nicholas Rios

Generating Space-Filling Designs for Computer Experiments

Lin Wang

Assistant Professor

Department of Statistics

Purdue University 

Date: Friday, April 14, 2023

Abstract

Space-filling designs are commonly used in controlled experiments for investigating complex simulation systems. The Latin hypercube design is a popular type of space-filling design because it studies as many levels as the design size for each variable and therefore achieves one-dimensional uniformity. In this talk, I will introduce a series of new methods for generating large and high-dimensional Latin hypercube designs. The generated designs are shown to be optimal under the maximin distance criterion and have small pairwise correlations between variables. When that many levels are not needed to learn the simulation system, the proposed methods can also be used to generate space-filling designs with fewer, balanced levels.
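As a rough illustration of the two ideas named in the abstract (one-dimensional uniformity and the maximin distance criterion), the sketch below builds random Latin hypercubes and keeps the one with the largest minimum pairwise distance. It is not the speaker's construction: the random search, the function names, and the values of `n`, `d`, and `tries` are all assumptions made for the example.

```python
import itertools
import math
import random


def random_latin_hypercube(n, d, rng):
    """Return n points in d dimensions; each column is a permutation of 0..n-1."""
    columns = []
    for _ in range(d):
        levels = list(range(n))
        rng.shuffle(levels)
        columns.append(levels)
    # Transpose so that each tuple is one design point.
    return [tuple(col[i] for col in columns) for i in range(n)]


def min_pairwise_distance(design):
    """Smallest Euclidean distance between any two design points."""
    return min(math.dist(a, b) for a, b in itertools.combinations(design, 2))


def best_of_random_search(n, d, tries, seed=0):
    """Keep the random Latin hypercube with the largest minimum distance."""
    rng = random.Random(seed)
    candidates = (random_latin_hypercube(n, d, rng) for _ in range(tries))
    return max(candidates, key=min_pairwise_distance)


# Hypothetical sizes chosen only for illustration.
design = best_of_random_search(n=8, d=3, tries=200)
```

Every variable takes each of its 8 levels exactly once, which is the one-dimensional uniformity property. A random search like this one only approximates a maximin design and does not scale to the large, high-dimensional settings the talk addresses.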

About the Speaker

Lin Wang is an Assistant Professor of Statistics at Purdue University. Prior to joining Purdue, she was an Assistant Professor of Statistics at George Washington University from 2019 to 2022. She obtained her PhD in Statistics in 2019 from University of California, Los Angeles. Her research interests include sampling, subsampling, experimental design, and causal inference.

Event Organizers

Abolfazl Safikhani

Nicholas Rios

Power and Sample Size Calculations for Rerandomized Experiments

Zach Branson

Assistant Teaching Professor

Statistics and Data Science

Carnegie Mellon University

Date: Friday, April 7, 2023

Abstract

Power analyses are an important aspect of experimental design, because they help determine how experiments are implemented in practice. It is common to specify a desired level of power and compute the sample size necessary to obtain that power. Such calculations are well-known for completely randomized experiments, but there can be many benefits to using other experimental designs. For example, it has recently been established that rerandomization, where subjects are randomized until covariate balance is obtained, increases the precision of causal effect estimators. This work establishes the power of rerandomized treatment-control experiments, thereby allowing for sample size calculators. We find the surprising result that, while power is often greater under rerandomization than complete randomization, the opposite can occur for very small treatment effects. The reason is that inference under rerandomization can be relatively more conservative, in the sense that it can have a lower type-I error at the same nominal significance level, and this additional conservativeness adversely affects power. This surprising result is due to treatment effect heterogeneity, a quantity often ignored in power analyses. We find that heterogeneity increases power for large effect sizes but decreases power for small effect sizes.
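The rerandomization step described above, redrawing the assignment until covariate balance is acceptable, can be sketched as follows. This is a minimal illustration under stated assumptions, not the speaker's procedure: it uses a single simulated covariate, measures balance by the standardized difference in means, and the threshold of 0.05 and the helper names are hypothetical.

```python
import random
import statistics


def complete_randomization(n, n_treated, rng):
    """Assign exactly n_treated of n subjects to treatment, uniformly at random."""
    assignment = [1] * n_treated + [0] * (n - n_treated)
    rng.shuffle(assignment)
    return assignment


def imbalance(covariate, assignment):
    """Absolute difference in covariate means, in units of the covariate's SD."""
    treated = [x for x, w in zip(covariate, assignment) if w == 1]
    control = [x for x, w in zip(covariate, assignment) if w == 0]
    diff = statistics.mean(treated) - statistics.mean(control)
    return abs(diff) / statistics.stdev(covariate)


def rerandomize(covariate, n_treated, threshold, rng, max_draws=10_000):
    """Redraw the assignment until the imbalance is at most the threshold."""
    for draws in range(1, max_draws + 1):
        assignment = complete_randomization(len(covariate), n_treated, rng)
        if imbalance(covariate, assignment) <= threshold:
            return assignment, draws
    raise RuntimeError("no acceptable assignment found; loosen the threshold")


rng = random.Random(1)
covariate = [rng.gauss(0.0, 1.0) for _ in range(50)]  # simulated baseline covariate
assignment, draws = rerandomize(covariate, n_treated=25, threshold=0.05, rng=rng)
```

With several covariates, a multivariate balance measure such as the Mahalanobis distance between group means would typically replace the single standardized difference used here.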

About the Speaker

Zach Branson is an Assistant Teaching Professor in Statistics and Data Science at Carnegie Mellon University. His main research interests are experimental design and causal inference, where the goal is to assess if treatments (randomized or not) cause a change in outcomes. In addition to theoretical and methodological work, he works on applying causal inference methods in criminology, medicine, mental health, and text analysis. Beyond research, his main teaching interests are in statistical communications, e.g. training PhD students to write papers and undergraduates to give statistical presentations.

Event Organizers

Abolfazl Safikhani

Nicholas Rios

A Multiple Imputation Procedure for Record Linkage and Causal Inference to Estimate the Effects of Home-Delivered Meals

Roee Gutman

Associate Professor

Department of Biostatistics

Brown University

Date: Friday, March 31, 2023

Abstract

Causal analysis of observational studies requires data that comprise a set of covariates, a treatment assignment indicator, and the observed outcomes. However, data confidentiality restrictions or the nature of data collection may distribute these variables across two or more datasets. In the absence of unique identifiers to link records across files, probabilistic record linkage algorithms can be leveraged to merge the datasets. Current applications of record linkage are concerned with the estimation of associations between variables that are exclusive to one file and not causal relationships. We propose a Bayesian framework for record linkage and causal inference where one file comprises all the covariate and observed outcome information, and the second file consists of a list of all individuals who receive the active treatment. Under certain ignorability assumptions, the procedure properly propagates the error in the record linkage process, resulting in valid statistical inferences. To estimate the causal effects, we devise a two-stage procedure. The first stage of the procedure performs Bayesian record linkage to multiply impute the treatment assignment for all individuals in the first file, while adjustments for covariates' imbalance and imputation of missing potential outcomes are performed in the second stage. This procedure is used to evaluate the effect of Meals on Wheels services on mortality and healthcare utilization among homebound older adults in Rhode Island. In addition, an interpretable sensitivity analysis is developed to assess potential violations of the ignorability assumptions.

About the Speaker

Dr. Roee Gutman is an Associate Professor in the Department of Biostatistics at Brown University, where he also serves as the director for the Undergraduate Statistics Concentration. His areas of expertise are causal inference, file linkage, missing data, Bayesian data analysis, and their application to big data sources in health services research. He has been involved in many comparative effectiveness studies where he contributed both in terms of the statistical theory and its implementation.

Event Organizers

Abolfazl Safikhani

Nicholas Rios

Bayesian Modeling with Spatial Curvature Processes

Aritra Halder

Assistant Professor

Department of Biostatistics

Drexel University

Date: Friday, March 10, 2023

Abstract

Spatial process models are widely used for modeling point-referenced variables arising from diverse scientific domains. Analyzing the resulting random surface provides deeper insights into the nature of latent dependence within the studied response. We develop Bayesian modeling and inference for rapid changes on the response surface to assess directional curvature along a given trajectory. Such trajectories or curves of rapid change, often referred to as wombling boundaries, occur in geographic space in the form of rivers in a flood plain, roads, mountains or plateaus or other topographic features leading to high gradients on the response surface. We demonstrate fully model based Bayesian inference on directional curvature processes to analyze differential behavior in responses along wombling boundaries. We illustrate our methodology with a number of simulated experiments followed by multiple applications featuring the Meuse river data; temperature data from the Northeastern United States; and Boston Housing data.

About the Speaker

Aritra Halder is an Assistant Professor in the Department of Biostatistics at Drexel University’s Dornsife School of Public Health. He completed his PhD in Statistics at the University of Connecticut in July 2020. His research interests are Bayesian modeling, Spatial and Spatial-temporal Statistics, and Statistical Computation.

Event Organizers

Abolfazl Safikhani

Nicholas Rios

Prognostic Digital Twins: Current and Future Applications

Arman Sabbaghi

Head of Biostatistics Research

Unlearn.AI

Date: Friday, February 10, 2023

Abstract

Clinical trials are established as the gold standard for evaluating the causal effects of new medical treatments, interventions, or therapies. However, modern clinical trials are becoming increasingly difficult to conduct due to enrollment challenges, long trial durations, and significant costs. Existing methods based on external controls can help to address some of these difficulties, but they are typically unreliable. We shall present Unlearn's TwinRCT technology, which is a novel trial design that combines historical data, machine learning, and randomization to deliver smaller, faster clinical trials and yield results that are more reliable than external controls. The core technology underlying the TwinRCT is a Digital Twin Generator (DTG), which is developed from historical data and then applied to baseline data for new clinical trial participants to create Prognostic Digital Twins. Prognostic scores for each clinical outcome of interest are derived from the Prognostic Digital Twins and used to improve the efficiency of the analysis. We shall demonstrate the relative advantages of the TwinRCT technology with respect to studies in which external controls or supplemental controls are considered options, and describe new Bayesian extensions of TwinRCT based on Prognostic Digital Twins. Ultimately, as described by the European Medicines Agency (EMA) qualification of this approach, the TwinRCT technology can yield unbiased treatment effect estimation in the primary analysis of pivotal studies, and preserve strict Type I error control even in circumstances where the historical data have known differences versus the clinical trial patients enrolled in the TwinRCT.

About the Speaker

Prior to becoming the Head of Biostatistics Research at Unlearn.AI, Arman Sabbaghi was an Associate Professor of Statistics at Purdue University. He obtained his Ph.D. in Statistics from Harvard University in 2014. At Purdue, his research focused on the development of new causal inference methodology for the analysis of observational data and clinical trials, the creation of statistical tools for assessing experimental designs, and the development of ML algorithms for quality control.

Event Organizers

Abolfazl Safikhani

Nicholas Rios

Garbage In, Einstein Out: A Mathematical Study of "Einstein from Noise"

I-Ping Tu

Research Fellow

The Institute of Statistical Science

Academia Sinica, Taipei

Date: Friday, February 24, 2023

Abstract

A cryo-EM 3D structure is solved from many noisy 2D projections of individual molecules. Two key factors that make this 3D reconstruction a challenging computational task are the high level of noise and the unknown pose parameters of each individual molecule. Often, a reference is used to initiate the search for orientation, which incurs the risk of coalescing images with low or no signal to the reference, known as the ‘Einstein from noise’ problem. Here, we investigate this phenomenon from a model-bias viewpoint in terms of image dimensionality and sample size. By using mathematical modeling, we derive a surprisingly simple form that accurately predicts the correlation between the Einstein face and the spurious image arising from averaging the top-sorted images drawn from purely Gaussian noise images. This theoretical value increases with n (the number of images) and m (the number of images sorted for averaging) but decreases with p (the dimensionality of the image). To avoid the ‘Einstein from noise’ pitfall, we propose a denoising method as a data pre-processing tool to increase the SNR. We observe that this tool yields a significant improvement in either computation time or the quality of cluster averages in the 2D clustering step of various cryo-EM analysis packages.
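The phenomenon described above can be reproduced in a small simulation: generate n pure-noise images, sort them by correlation with a reference, average the top m, and measure how strongly that average correlates with the reference. The sketch below is only an illustration and does not implement the closed-form result from the talk. The reference here is a random vector standing in for the Einstein image, and the values of n, m, and p are hypothetical.

```python
import math
import random


def correlation(x, y):
    """Pearson correlation between two equal-length vectors."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def einstein_from_noise(reference, n, m, rng):
    """Average the m of n pure-noise images most correlated with the reference."""
    p = len(reference)
    # Each "image" is a flattened vector of p independent Gaussian pixels.
    noise = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
    noise.sort(key=lambda img: correlation(img, reference), reverse=True)
    top = noise[:m]
    average = [sum(img[j] for img in top) / m for j in range(p)]
    return correlation(average, reference)


rng = random.Random(0)
reference = [rng.random() for _ in range(100)]  # stand-in for the reference image, p = 100
rho = einstein_from_noise(reference, n=500, m=50, rng=rng)
```

Although none of the images contains any signal, `rho` comes out clearly positive, because the selection step alone pulls the average toward the reference. Rerunning with different n, m, and p gives an empirical feel for the dependence that the abstract characterizes analytically.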

About the Speaker

Dr. I-Ping Tu received her Ph.D. in Statistics from Stanford University in 1997. She was a senior statistician at the Stanford Functional Genomics Facility until 2003. She later moved to become a Research Fellow at the Institute of Statistical Science, Academia Sinica. Her research has mainly focused on developing statistical methods to analyze cryo-electron microscopy (cryo-EM) image data. In recent years, technical breakthroughs have transformed cryo-EM into a main tool for determination of molecular structure to atomic resolution without crystals or in solution. However, the process of structural determination from single-particle cryo-EM images is still very challenging because it involves processing extremely noisy images of unknown orientation. She has developed a 2D classification package called RE2DC, together with a processing platform, ASCEP, which integrates RE2DC with other packages to execute a pipeline for 3D structure determination from cryo-EM data. She will continue developing efficient and robust statistical methods to improve the analysis.

Event Organizers

Abolfazl Safikhani

Nicholas Rios

Efficient and Targeted COVID-19 Border Testing via Reinforcement Learning

Hamsa Bastani

Assistant Professor of Operations, Information, and Decisions

University of Pennsylvania

Date: Friday, February 17, 2023

Abstract

Throughout the COVID-19 pandemic, countries relied on a variety of ad-hoc border control protocols to allow for non-essential travel while safeguarding public health: from quarantining all travellers to restricting entry from select nations based on population-level epidemiological metrics such as cases, deaths or testing positivity rates. Here we report the design and performance of a reinforcement learning system, nicknamed ‘Eva’. In the summer of 2020, Eva was deployed across all Greek borders to limit the influx of asymptomatic travellers infected with SARS-CoV-2, and to inform border policies through real-time estimates of COVID-19 prevalence. In contrast to country-wide protocols, Eva allocated Greece’s limited testing resources based upon incoming travellers’ demographic information and testing results from previous travellers. By comparing Eva’s performance against modelled counterfactual scenarios, we show that Eva identified 1.85 times as many asymptomatic, infected travellers as random surveillance testing, with up to 2-4 times as many during peak travel, and 1.25-1.45 times as many asymptomatic, infected travellers as testing policies that only utilize epidemiological metrics. We demonstrate that this latter benefit arises, at least partially, because population-level epidemiological metrics had limited predictive value for the actual prevalence of SARS-CoV-2 among asymptomatic travellers and exhibited strong country-specific idiosyncrasies in the summer of 2020. Our results raise serious concerns on the effectiveness of country-agnostic internationally proposed border control policies that are based on population-level epidemiological metrics. Instead, our work represents a successful example of the potential of reinforcement learning and real-time data for safeguarding public health.

Paper Link: https://www.nature.com/articles/s41586-021-04014-z 

About the Speaker

Hamsa Bastani is an Assistant Professor of Operations, Information, and Decisions at the Wharton School, University of Pennsylvania. Her research focuses on developing novel machine learning algorithms for data-driven decision-making, with applications to healthcare operations and social good. Her work has received several recognitions, including the Wagner Prize for Excellence in Practice (2021), the Pierskalla Award for the best paper in healthcare (2016, 2019, 2021), the Behavioral OM Best Paper Award (2021), as well as first place in the George Nicholson and MSOM student paper competitions (2016). 

Event Organizers

Abolfazl Safikhani

Nicholas Rios