Recent & Upcoming Talks

2018

Exploring Finite-sample Bias in Propensity Score Weights

The principle limitation of all observational studies is the potential for unmeasured confounding. Various study designs may perform similarly in controlling for bias due to measured confounders while differing in their sensitivity to unmeasured confounding. Design sensitivity (Rosenbaum, 2004) quantifies the strength of an unmeasured confounder needed to nullify an observed finding. In this presentation, we explore how robust certain study designs are to various unmeasured confounding scenarios. We focus particularly on two exciting new study designs - ATM and ATO weights. We illustrate the performance in a large electronic health records based study and provide recommendations for sensitivity to unmeasured confounding analyses in ATM and ATO weighted studies, focusing primarily on the potential reduction in finite-sample bias.

Making Causal Claims as a Data Scientist: Tips and Tricks Using R

Making believable causal claims can be difficult, especially with the much repeated adage “correlation is not causation”. This talk will walk through some tools often used to practice safe causation, such as propensity scores and sensitivity analyses. In addition, we will cover principles that suggest causation such as the understanding of counterfactuals, and applying Hill’s criteria in a data science setting. We will walk through specific examples, as well as provide R code for all methods discussed.

2017

R-Ladies Panel: Improving Gender Diversity in a Male-dominated Community

R-Ladies is a worldwide organization whose mission is to promote gender diversity in the R community. We are interested in presenting a panel of regional leaders in the R-Ladies movement. We will discuss topics such as diversity data in the R community, best practices for starting up a meetup in your own community, best practices for running and continued success of a meetup in your community, and funding opportunities. We will also diagnose different obstacles and discuss how we attack them, for example increasing women’s competence versus confidence versus recognition in the R community. Finally we will provide resources and details about how to get involved with local meetups.

Contextualizing Sensitivity Analysis in Observational Studies: Calculating Bias Factors for Known Covariates

The strength of evidence provided by epidemiological and observational studies is inherently limited by the potential for unmeasured confounding. While methods exist to quantify the potential effect of a specified unmeasured confounder, these methods should be anchored and contextualized within each study. We put forward a method for merging sensitivity to unmeasured confounding analyses with the impacts of the observed covariates. We graphically display what we call the observed bias factors with the tipping point sensitivity analysis. We illustrate the method under various study designs and provide an application created to simplify the implementation of this methodology.

papr: Tinder for pre-prints, a Shiny Application for collecting gut-reactions to pre-prints from the scientific community

papr is an R Shiny web application and social network for evaluating bioRxiv pre-prints. The app serves multiple purposes, allowing the user to quickly swipe through pertinent abstracts as well as find a community of researchers with similar interests. It also serves as a portal for accessible “open science”, getting abstracts into the hands of users of all skill levels. Additionally, the data could help build a general understanding of what research the community finds exciting.

We allow the user to log in via Google to track multiple sessions and have implemented a recommender engine, allowing us to tailor which abstracts are shown based on each user’s previous abstract rankings. While using the app, users view an abstract pulled from bioRxiv and rate it as “exciting and correct”, “exciting and questionable”, “boring and correct”, or “boring and questionable” by swiping the abstract in a given direction. The app includes optional social network features, connecting users who provide their twitter handle to users who enjoy similar papers.

This presentation will demonstrate how to incorporate tactile interfaces, such as swiping, into a Shiny application using a package we created for this functionality shinysense, store real-time user data on Dropbox using drop2, login in capabilities using googleAuthR and googleID, how to implement a recommender engine using principle component analysis, and how we have handled issues of data safety/security through proactive planning and risk mitigation. Finally, we will report the app activity, summarizing both the user traffic and what research users are finding exciting.

An R + GitHub Journey

Join us for a GitHub journey, guided by Lucy D’Agostino McGowan! We’ll answer questions like:

What is so great about GitHub?
How can I make it work for me and my workflow?
How can I show the world some of the cool things I’m working on?

This will be a hands-on workshop that will give you all the tools to have a delightful time incorporating version control & R (and blogdown ( https://github.com/rstudio/blogdown) if you are so inclined). All levels are welcome!

Streamline Your Workflow: Integrating SAS, LaTeX, and R into a Single Reproducible Document

There is an industry-wide push toward making workflows seamless and reproducible. Incorporating reproducibility into the workflow has many benefits; among them are increased transparency, time savings, and accuracy. We walk through how to seamlessly integrate SAS®, LaTeX, and R into a single reproducible document. We also discuss best practices for general principles such as literate programming and version control.

Simplifying and Contextualizing Sensitivity to Unmeasured Confounding Tipping Point Analyses

The strength of evidence provided by epidemiological and observational studies is inherently limited by the potential for unmeasured confounding. Thus, we would expect every observational study to include a quantitative sensitivity to unmeasured confounding analysis. However, we reviewed 90 recent studies with statistically significant findings, published in top tier journals, and found 41 mentioned the issue of unmeasured confounding as a limitation, but only 4 included a quantitative sensitivity analysis. Moreover, the rule of thumb that considers effects 2 or greater as robust can be misleading in being too low for studies missing an important confounder and too high for studies that extensively control for confounding. We simplify the seminal work of Rosenbaum and Rubin (1983) and Lin, Pstay, and Kronmal (1998). We focus on three key quantities: the observed bound of the confidence interval closest to the null, a plausible residual effect size for an unmeasured binary confounder, and a realistic prevalence difference for this hypothetical confounder. We offer guidelines to researchers for anchoring the tipping point analysis in the context of the study and provide examples.

2016

Assessing the Association Between Accident Injury Severity and NCAP Car Safety Ratings”

The U.S. New Car Assessment Program (NCAP) evaluates the safety of new cars through their 5-Star Safety Ratings program. In 2010, this program enhanced their protocol, making the ratings more stringent for cars in model years 2011 and onwards. We are interested in assessing this rating system’s ability to predict accident injury severity. To evaluate this question, we use data reported in the National Highway Traffic Safety Administration’s (NHTSA) General Estimates System (GES) database for the years 2011 to 2014, matched to NCAP overall safety ratings for 291 unique make, model, model year combinations. We fit a proportional odds regression model predicting injury severity for 23,641 individual passengers involved in car crashes, adjusting for accident-level covariates, such as the speed of the car and point of impact, and individual-level covariates, such as age and seating position.

Integrating SAS and R to Perform Optimal Propensity Score Matching

In studies where randomization is not possible, imbalance in baseline covariates (confounding by indication) is a fundamental concern. Propensity score matching (PSM) is a popular method to minimize this potential bias, matching individuals who received treatment to those who did not, to reduce the imbalance in pre-treatment covariate distributions. PSM methods continue to advance, as computing resources expand. Optimal matching, which selects the set of matches that minimizes the average difference in propensity scores between mates, has been shown to outperform less computationally intensive methods. However, many find the implementation daunting. SAS/IML® software allows the integration of optimal matching routines that execute in R, e.g. the R optmatch package. This presentation walks through performing optimal PSM in SAS® through implementing R functions, assessing whether covariate trimming is necessary prior to PSM. It covers the propensity score analysis in SAS, the matching procedure, and the post-matching assessment of covariate balance using SAS/STAT® 13.2 and SAS/IML procedures.