Hill’s Criteria for the Data Scientist: Incorporating Causal Inference Techniques
This talk will walk through Sir Austin Bradford Hill’s viewpoints for causality, using XKCD comics along the way.
This talk will walk through building a self-contained randomized study using Shiny and learnr modules. We will discuss combining informed consent, the randomization process, demographic surveys, and R-based studies into a single online framework so that users can seamlessly enroll and participate in randomized studies via a single URL. The talk will include both practical recommendations and technical code snippets.
In both data science and academic research, prediction modeling is often not enough; to answer many questions, we need to approach them causally. In this workshop, we’ll teach the essential elements of answering causal questions in R through causal diagrams and causal modeling techniques such as propensity scores and inverse probability weighting. We’ll also show that by distinguishing predictive models from causal models, we can better take advantage of both tools. You’ll be able to use the tools you already know (the tidyverse, regression models, and more) to answer the questions that are important to your work.
This talk will focus on the tipr R package.
This talk will focus on an application, ConTESSA, along with the accompanying R package, tti, designed to help quantify the efficacy of contact tracing programs. The talk will walk through the technical aspects of the underlying model and highlight how R, and in particular Shiny, was used to create this product.
This talk will focus on bringing pedagogical best practices into the data science classroom. We will begin by focusing on building confident coders, followed by an exploration of developing quantitative intuition, with a particular focus on understanding uncertainty. Finally, we will wrap up with tips for empowering strong data science communicators.
In the age of “big data” there is an information overload. It is increasingly important for people to be able to sift what is important from what is noise, and what is evidence from what is anecdote. Accordingly, the effective communication of statistical concepts to diverse audiences is currently an education and public health priority. This talk focuses on techniques for communicating complex statistical concepts in an engaging manner without sacrificing truth and content, specifically addressing how to help the general public read past headlines to the actual evidence, or lack thereof. We will discuss engaging with the public via organizations such as TED-Ed, focusing on both best practices and lessons learned.
We are interested in studying best practices for introducing students in statistics or data science to the programming language R. The “tidyverse” is a suite of R packages that follow a consistent philosophy and were created to help with common statistics and data science tasks. We have created two sets of online learning modules: one introduces tidyverse concepts first and then dives into the idiosyncrasies of R as a programming language; the other takes a more traditional approach, first introducing R broadly and then following with an introduction to a particular suite of packages, the tidyverse. We have created a randomized study to examine whether the order in which certain concepts are introduced affects whether learning objectives are met and how engaged students are with the material. This talk will focus on the mechanics of this study: how it was designed, how we enrolled participants, and how we evaluated outcomes.