Spring 2022 Seminars

List of Seminars

Association and Causation: Attributes and Effects of Judges in Equal Employment Opportunity Commission Litigation Outcomes

Date: Friday, May 6, 2022
Speaker: Michael Sobel
Affiliation: Columbia University

Federal Communications Commission (FCC)—"Addressing the Spectrum Crunch at the Federal Communications Commission"

Karla Hoffman
Professor, George Mason University

Date: Friday, April 29, 2022

Abstract

From mobile devices to improved Wi-Fi, from connected cars to satellites, the demand for more wireless connectivity grows every day. These uses require more dedicated spectrum, but much of the available spectrum has already been allocated. Resolving the “spectrum crunch” requires creative solutions and new approaches. In this talk, we will discuss different ways the Federal Communications Commission has attempted to use the available spectrum more efficiently: repacking television stations and then using optimization to determine and dynamically re-schedule the TV transition, reconfiguring 37 and 39 GHz licensees, and creating dynamic licenses in 3.5 GHz. We will discuss how mathematical optimization could be used in each of these scenarios. This is joint work with the optimization team led by Brian Smith, who received his Master’s degree from the SEOR department of George Mason University.

About the speaker

Karla Hoffman is a Professor within the School of Engineering and Computing of George Mason University. She received her BS in mathematics from Rutgers and an M.B.A. and a D.Sc. from GWU. Previously, she worked as a mathematician at the National Institute of Standards and Technology (NIST). She has received NIST’s Applied Research Award, a Commerce Department Silver Medal, and GMU’s Distinguished Faculty Award, and is a Fellow of the Institute for Operations Research and the Management Sciences (INFORMS), where she also received the Edelman Prize and the Kimball Medal. Dr. Hoffman’s primary areas of research are optimization and auction design and testing. Her research focuses on the development of new algorithms for solving complex problems arising in industry and government. She consults to the FCC on auction design and testing and has served as a consultant on combinatorial optimization problems for the telecommunications, transportation, and military industries.

Least Squares Estimation of a Quasiconvex Regression Function

Rohit Patra
Assistant Professor, University of Florida

Date: Friday, April 22, 2022

Abstract

We develop a new approach for the estimation of a multivariate function based on the economic axioms of quasiconvexity (and monotonicity). On the computational side, we prove the existence of the quasiconvex constrained least squares estimator (LSE) and provide a characterization of the function space to compute the LSE via a mixed integer quadratic programme. On the theoretical side, we provide finite sample risk bounds for the LSE via a sharp oracle inequality. Our results allow for errors to depend on the covariates and to have only two finite moments. We illustrate the superior performance of the LSE against some competing estimators via simulation. Finally, we use the LSE to estimate the production function for the Japanese plywood industry and the cost function for hospitals across the US.

About the speaker

I am an assistant professor in the Department of Statistics at the University of Florida. My research centers around semiparametric/nonparametric methodology and large sample theory: efficient estimation in semiparametric models, nonparametric function estimation (with special emphasis on shape constrained estimation), and likelihood and bootstrap based inference in (non-standard) parametric and nonparametric models. The main motivation of the research is to develop nonparametric procedures that are automated (free from tuning parameters) but still flexible enough to incorporate data-driven features.

My research has applications in broad areas such as genetics (multiple testing problems), economics (utility and production function estimation and binary response models), causal inference (conditional independence) and astronomy (analysis of accretion of galaxies), among other fields.

Spatiotemporal modeling of an estuarine decapod using Bayesian inference: environmental drivers of juvenile blue crab abundance

Grace Chiu
Professor of Environmental Statistics at the Virginia Institute of Marine Science, College of William & Mary

Date: Friday, April 15, 2022

Abstract

Nursery grounds substantially enhance secondary production of commercially exploited fish and crustacean populations by providing food and refugia for their juveniles. Previous small-scale studies for blue crabs have emphasized seagrass meadows as highly productive nurseries. Yet, to generalize inference of nursery function, identify highly productive regions, and inform regional management, it is vital to unify digitized data on structurally complex habitats with survey data over larger spatiotemporal scales. Thus, we construct five Bayesian hierarchical models with various spatiotemporal dependence structures on 21 years of data across temperate estuaries in Virginia to infer nursery habitat value for blue crabs. Our results indicate that 1) the nonseparable spatiotemporal model outperformed the simpler models in cross validations, and 2) salt marsh surface area and turbidity, not seagrass, are the strongest determinants of local juvenile blue crab production. These results highlight the need to consider nursery function at multiple spatiotemporal resolutions, and therefore spatiotemporal dependence in large scale fisheries catch data, to support robust inference on local productivity. Details of our work can be found in Hyman et al. (2022) in Frontiers in Marine Science, DOI: 10.3389/fmars.2022.834990.

About the speaker

Grace Chiu is Professor of Environmental Statistics at the Virginia Institute of Marine Science (VIMS), home to William & Mary's Graduate School of Marine Science. Her career has spanned three countries (US, Canada, Australia) as an academic and a federal government scientist. In her research, she develops computationally intensive Bayesian models to understand complex natural phenomena from human societies and the environment. At W&M, she advises and teaches statistics to VIMS graduate students, and advises honors students in the Computational & Applied Mathematics & Statistics (CAMS) program. For 25 years, she has been a devoted educator of statistics to undergraduate and graduate students from a wide range of disciplines. Since joining VIMS in 2019, she has been actively developing an advanced statistics curriculum for the School of Marine Science. Grace is also an affiliate faculty member at VCU, University of Washington, University of Waterloo, and the Australian National University.

New measures for assessing non-ignorable selection bias in non-probability samples and low response rate probability samples

Brady T. West
Research Associate Professor
University of Michigan-Ann Arbor

Date: Friday, April 8, 2022

Abstract

Recent developments in survey statistics have yielded simple, novel measures of the non-ignorable selection bias in estimates of means, proportions, and regression coefficients that may arise due to deviations from ignorable sample selection, where these deviations might be introduced by the sampling mechanism (e.g., non-probability sampling) or survey nonresponse. This presentation will review the computation of these indicators, the data required to compute them, software tools for computing them, and examples of their use and interpretation based on real survey data. Future directions for research in this area, including ongoing work to assess selection bias in pre-election polls conducted for the 2020 presidential election, will be provided in conclusion.

About the speaker

Brady T. West is a research associate professor in the Survey Methodology Program, located within the Survey Research Center at the Institute for Social Research (ISR) on the University of Michigan-Ann Arbor (U-M) campus. He earned his PhD from the Michigan Program in Survey and Data Science (formerly the Michigan Program in Survey Methodology) in 2011. Before that, he received an MA in Applied Statistics from the U-M Statistics Department in 2002, being recognized as an Outstanding First-year Applied Masters student, and a BS in Statistics with Highest Honors and Highest Distinction from the U-M Statistics Department in 2001. His current research interests include total survey error / total data quality, responsive and adaptive survey design, interviewer effects, survey paradata, the analysis of complex sample survey data, and multilevel regression models for clustered and longitudinal data. He has developed short courses on statistical analysis using SAS, SPSS, R, Stata, and HLM, and regularly consults on the use of procedures in these software packages for the analysis of longitudinal and clustered data. The author or co-author of more than 180 peer-reviewed publications in survey statistics, survey methodology, applied statistics, and public health, in addition to three edited volumes on survey methodology, he is also the lead author of Linear Mixed Models: A Practical Guide Using Statistical Software (Third Edition, with Kathy Welch and Andrzej Galecki), and a co-author of a book entitled Applied Survey Data Analysis (with Steven Heeringa and Patricia Berglund), the second edition of which was published by Chapman Hall in 2017.

“Geographer’s” perspectives on analyzing spatial data

David Wong
Professor, George Mason University

Date: Friday, April 1, 2022

Abstract

Instead of focusing on one specific research topic, this talk shares some geographers’ views on analyzing spatial data. Statisticians and geographers may approach spatial data analysis differently, partly due to the differences in their statistical skills. But the differences may also reflect how they perceive space differently. Thus, some geographers’ perspectives may be ignored by statisticians. In this talk, I will review some challenges that geographers have encountered in analyzing spatial data. While some challenges are well-known and have been investigated for decades, others have been raised only recently. The goal of the talk is to share different views and to facilitate cross-disciplinary communication.

About the speaker

David Wong is a Professor in the Geography & Geoinformation Science Department. Except for two years spent teaching at the University of Hong Kong between 2013 and 2015, he has been teaching at Mason since 1993. He has broad research interests, ranging from geovisualization to the more socially oriented issues in spatial epidemiology and aging. His primary research interest is in population analysis, particularly in measuring segregation. His publications include three co-authored books and more than 90 papers in peer-reviewed journals. Some of his research funding has been provided by HUD, the U.S. Census Bureau and NIH (both NICHD and NCI through R01, R03 and contracts). He has served on the editorial boards of seven international journals in GIS, spatial analysis and population.

Equivariant machine learning, structured like classical physics

Soledad Villar
Assistant Professor, Department of Applied Mathematics & Statistics, and Mathematical Institute for Data Science, Johns Hopkins University

Date: Friday, March 25, 2022

Abstract

There has been enormous progress in the last few years in designing conceivable (though not always practical) neural networks that respect the gauge symmetries – or coordinate freedom – of physical law. Some of these frameworks make use of irreducible representations, some make use of higher order tensor objects, and some apply symmetry-enforcing constraints. Different physical laws obey different combinations of fundamental symmetries, but a large fraction (possibly all) of classical physics is equivariant to translation, rotation, reflection (parity), boost (relativity), and permutations. Here we show that it is simple to parameterize universally approximating polynomial functions that are equivariant under these symmetries, or under the Euclidean, Lorentz, and Poincaré groups, at any dimensionality d. The key observation is that nonlinear O(d)-equivariant (and related-group-equivariant) functions can be expressed in terms of a lightweight collection of scalars — scalar products and scalar contractions of the scalar, vector, and tensor inputs. These results demonstrate theoretically that gauge-invariant deep learning models for classical physics with good scaling for large problems are feasible right now.
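The key observation in the abstract can be illustrated with a minimal sketch. It is not the speaker's implementation: the polynomial coefficient function, the helper names, and the choice of three dimensions are all assumptions made for illustration. The map below combines the input vectors with coefficients that depend only on their pairwise scalar products, and the check compares f(Qv) with Q f(v) for an orthogonal matrix Q that includes a reflection.

```python
import math
import random


def dot(u, v):
    """Scalar product of two vectors given as lists of floats."""
    return sum(a * b for a, b in zip(u, v))


def matvec(matrix, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [dot(row, v) for row in matrix]


def equivariant_map(vectors):
    """Vector-valued function built only from scalar products of the inputs.

    Output = sum_i g_i(Gram) * v_i. Each coefficient g_i is a polynomial in
    the Gram matrix entries; this particular polynomial is hypothetical.
    """
    n = len(vectors)
    d = len(vectors[0])
    gram = [[dot(u, v) for v in vectors] for u in vectors]
    coeffs = [1.0 + gram[i][i] + sum(g * g for g in gram[i]) for i in range(n)]
    return [sum(coeffs[i] * vectors[i][k] for i in range(n)) for k in range(d)]


def rotation_reflection_3d(angle):
    """Orthogonal 3x3 matrix: rotation about the z-axis plus a flip of z."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, -1.0]]


def max_equivariance_error(seed=0):
    """Largest componentwise gap between f(Qv) and Q f(v) for random inputs."""
    rng = random.Random(seed)
    vectors = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(4)]
    q = rotation_reflection_3d(rng.uniform(0, 2 * math.pi))
    transformed_first = equivariant_map([matvec(q, v) for v in vectors])
    transformed_last = matvec(q, equivariant_map(vectors))
    return max(abs(a - b) for a, b in zip(transformed_first, transformed_last))
```

Because scalar products are unchanged by any orthogonal transformation, the coefficients are the same before and after applying Q, so the two results should agree up to floating-point rounding.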

Pathfinder: Parallel quasi-Newton variational inference

Bob Carpenter
Center for Computational Mathematics, Flatiron Institute

Date: Friday, March 11, 2022

Abstract

In this talk, I'll introduce Pathfinder, a variational method for approximately sampling from differentiable log densities. Starting from a random initialization, Pathfinder locates normal approximations to the target density along a quasi-Newton optimization path, with local covariance estimated using the inverse Hessian estimates produced by the optimizer. Pathfinder returns draws from the approximation with the lowest estimated Kullback-Leibler (KL) divergence to the true posterior. We evaluate Pathfinder on a wide range of posterior distributions, demonstrating that its approximate draws are better than those from automatic differentiation variational inference (ADVI) and comparable to those produced by short chains of dynamic Hamiltonian Monte Carlo (HMC), as measured by 1-Wasserstein distance. Compared to ADVI and short dynamic HMC runs, Pathfinder requires one to two orders of magnitude fewer log density and gradient evaluations, with greater reductions for more challenging posteriors. Importance resampling over multiple runs of Pathfinder improves the diversity of approximate draws, reducing 1-Wasserstein distance further and providing a measure of robustness to optimization failures on plateaus, saddle points, or in minor modes. The Monte Carlo KL-divergence estimates are embarrassingly parallelizable in the core Pathfinder algorithm, as are multiple runs in the resampling version, further increasing Pathfinder's speed advantage with multiple cores.

[joint work with Lu Zhang, Aki Vehtari, and Andrew Gelman]; Preprint available on arXiv.

About the speaker

Bob Carpenter joined the Flatiron Institute’s Center for Computational Mathematics in March 2020. He previously was a research scientist at Columbia University, Alias-I (LingPipe), SpeechWorks, and Lucent Bell Labs. Carpenter was also previously a professor of computational linguistics at Carnegie Mellon University. Carpenter is known for developing Stan, a probabilistic programming language, and is one of the Stan core developers. In addition to numerous publications, Carpenter has written two books on computational linguistics. Carpenter has also received grants from the NSF, ONR, Sloan, IES, and NIH for his various programming. Carpenter has a B.A. in Math and Computer Science from Michigan State University and a Ph.D. in Cognitive Science and Computer Science from the University of Edinburgh.

Statistical hurdles and swamps in predicting future forests and winegrowing regions

Elizabeth Wolkovich
Associate Professor, Department of Forest & Conservation Sciences, University of British Columbia

Date: Friday, March 4, 2022

Abstract

Climate change is having large impacts on natural and agricultural systems around the globe. Mitigating the worst consequences requires models that mechanistically predict changes. Towards that goal, my lab (Temporal Ecology Lab) works on models to better predict the most reported biological impact---shifts in phenology, the timing of recurring life history events such as leafout and flowering. Phenological records of cherry blossoms are the longest written records on earth, yet we still struggle to accurately predict them across space, time and climatic change. Here I review several major areas of research where statistical inference has been critical to my lab's insights and advances, but which also highlight some of the deep methodological issues in the field: plant sensitivity to warming temperatures over time and space, timing mismatches between critical species interactions (for example, plants and pollinators) and predicting shifting winegrowing regions with warming.

About the speaker

Elizabeth Wolkovich is an Associate Professor in Forest and Conservation Sciences and Canada Research Chair at the University of British Columbia. She runs the Temporal Ecology Lab, which focuses on understanding how climate change shapes plants and plant communities, with a focus on shifts in the timing of seasonal development (e.g., budburst, flowering and fruit maturity)---known as phenology. Her lab both collects new data on forest trees and winegrapes and collates existing data to provide global estimates of shifts in phenology with warming from plants to birds and other animals, and to understand how human choices will impact future winegrowing regions. Her research benefits from an interdisciplinary team of collaborators from agriculture, biodiversity science, climatology, evolution and viticulture, as well as from shared long-term datasets from across North America and Europe.

Improved Small Domain Estimation via Compromise Regression Weights

Thomas A. Louis
Professor Emeritus of Biostatistics, Johns Hopkins Bloomberg School of Public Health

Date: Friday, February 25, 2022

Abstract

Shrinkage estimates of small domain attributes combine a noisy direct estimate with a more stable, regression-based estimate. When the regression model is misspecified, estimation performance for the noisier domains can suffer due to substantial shrinkage towards a poorly estimated regression surface. To address this issue, we introduce a class of empirically determined weights used to estimate the regression that improve performance for the noisy domains. The weights are a convex combination of those that produce the best linear unbiased predictor (BLUP) and those that produce the observed best predictor (OBP) of Jiang and co-authors. The convex combination is found by minimizing an unbiased estimate of the summed mean-squared prediction error, producing the “compromise best predictor” (CBP). This data-adaptive mixture of regression weights retains the robustness of the OBP while maintaining much of the advantage of the BLUP when the regression model is correct. We compare the BLUP, OBP and CBP via simulation and demonstrate their output in estimating gait speed in older adults. Joint work with Nick Henderson and Ravi Varadhan.
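The shrinkage step described in the first sentence of the abstract can be sketched as follows. This is a minimal illustration under assumptions, not the compromise weights from the talk: it uses the standard area-level form in which domain i has a direct estimate y_i with known sampling variance D_i, a regression prediction r_i, and a common model variance A. The function name and arguments are hypothetical.

```python
def shrinkage_estimates(direct, variances, regression, model_variance):
    """Shrink each direct estimate toward its regression-based prediction.

    The weight B_i = D_i / (A + D_i) grows with the sampling variance D_i,
    so noisier domains lean more heavily on the regression estimate.
    """
    estimates = []
    for y, d, r in zip(direct, variances, regression):
        b = d / (model_variance + d)  # shrinkage weight, between 0 and 1
        estimates.append(b * r + (1.0 - b) * y)
    return estimates
```

With this form, a domain whose sampling variance is large relative to the model variance is pulled almost entirely onto the regression surface, which is why a misspecified regression hurts the noisiest domains most.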

Marginal and Conditional Sufficient Variable Screening for Ultrahigh Dimensional Data

Chenlu Ke
Assistant Professor, Department of Statistical Sciences and Operations Research, Virginia Commonwealth University

Date: Friday, February 18, 2022

Abstract

Many contemporary research problems in diverse fields are characterized by ultrahigh dimensional datasets, where the number of variables can be much higher than the sample size. Extracting core information by identifying low-dimensional representations of predictive features is very challenging given the interrelations, redundancy and noise embedded in ultrahigh dimensional data. Traditional variable selection and regularization methods are no longer applicable or favorable in terms of computational expediency, statistical accuracy and algorithmic stability. Variable screening aims to swiftly filter out redundant variables through independence learning. In this talk, we will introduce a novel unified framework of variable screening for ultrahigh dimensional data based on the notion of sufficiency. Candidate variables are ranked according to their marginal and conditional contributions to the response measured by "kernel" inverse regression statistics. Our screening procedure is model-free and applicable to continuous and categorical responses. When prior information is available or when potential confounding exists, the method can be readily extended to achieve conditional variable screening, where the conditional set can also be ultrahigh dimensional. The proposed framework enjoys the sure screening property and the rank consistency property in the regime of sufficient variable selection, with which its superiority over existing methods is well-established. We will also demonstrate the advantages of our method through simulation studies and real data applications.

About the speaker

Dr. Chenlu Ke is an Assistant Professor in the Department of Statistical Sciences and Operations Research at Virginia Commonwealth University. Chenlu obtained her PhD in Statistics in 2019 from the University of Kentucky. Her research focuses on developing variable selection and dimension reduction methods for ultrahigh dimensional data as well as their applications in survival analysis.

Let’s Talk About Data Ethics

Wendy Martinez
Director, Mathematical Statistics Research Center, Bureau of Labor Statistics

Date: Friday, February 11, 2022

Abstract

I have had the honor of giving talks on data ethics at various events around the world, including the US, Bangladesh, Hong Kong, and Japan. These talks sparked very interesting conversations about the ethical use of data. These conversations made me realize that statisticians and data scientists must be intentional in our application of ethical guidelines for statistical practice. I also learned that data ethics is something we all need to worry about, regardless of where we work and live in the world. I will begin my presentation by offering my definition of data ethics and will then provide a few real-world examples where ethical concerns arose. I will conclude the discussion by providing examples of data ethics frameworks and efforts from around the world.

About the speaker

Wendy Martinez has been serving as the Director of the Mathematical Statistics Research Center at the Bureau of Labor Statistics (BLS) for over ten years. Before that, she worked in several research positions throughout the Department of Defense. She held the position of Science and Technology Program Officer at the Office of Naval Research, where she established a research portfolio comprised of academia and industry performers developing data science products for the future Navy and Marine Corps. Her areas of interest include computational statistics, exploratory data analysis, and text data mining. She is the lead author of three books on MATLAB and statistics. Dr. Martinez was elected as a Fellow of the American Statistical Association (ASA) in 2006 and is an elected member of the International Statistical Institute. She was honored by the American Statistical Association when she received the ASA Founders Award at the JSM 2017 conference. Wendy is also proud and grateful to have been elected as the 2020 ASA President.