Recent & Upcoming Talks

2016

Integrating SAS and R to Perform Optimal Propensity Score Matching

In studies where randomization is not possible, imbalance in baseline covariates (confounding by indication) is a fundamental concern. Propensity score matching (PSM) is a popular method to minimize this potential bias: individuals who received treatment are matched to those who did not, reducing the imbalance in pre-treatment covariate distributions. PSM methods continue to advance as computing resources expand. Optimal matching, which selects the set of matches that minimizes the average difference in propensity scores between mates, has been shown to outperform less computationally intensive methods. However, many find the implementation daunting. SAS/IML® software allows the integration of optimal matching routines that execute in R, such as the R optmatch package. This presentation walks through performing optimal PSM in SAS® by calling R functions, and through assessing whether covariate trimming is necessary prior to PSM. It covers the propensity score analysis in SAS, the matching procedure, and the post-matching assessment of covariate balance using SAS/STAT® 13.2 and SAS/IML procedures.
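To make the contrast between optimal and greedy matching concrete, here is a minimal sketch in Python (standard library only). It is not the SAS/IML or optmatch workflow from the talk: the propensity scores and function names are hypothetical, and the brute-force search is only practical for tiny examples.

```python
from itertools import permutations

def total_distance(treated, controls, assignment):
    """Sum of absolute propensity score differences for a 1:1 assignment."""
    return sum(abs(t - controls[c]) for t, c in zip(treated, assignment))

def greedy_match(treated, controls):
    """Match each treated unit, in order, to its nearest unused control."""
    unused = set(range(len(controls)))
    assignment = []
    for t in treated:
        best = min(unused, key=lambda c: abs(t - controls[c]))
        unused.remove(best)
        assignment.append(best)
    return tuple(assignment)

def optimal_match(treated, controls):
    """Brute-force search over every 1:1 assignment (tiny inputs only)."""
    candidates = permutations(range(len(controls)), len(treated))
    return min(candidates, key=lambda a: total_distance(treated, controls, a))

# Hypothetical propensity scores, chosen so the two methods disagree.
treated = [0.50, 0.60]
controls = [0.55, 0.30, 0.95]

greedy = greedy_match(treated, controls)
optimal = optimal_match(treated, controls)
print("greedy :", greedy, round(total_distance(treated, controls, greedy), 2))
print("optimal:", optimal, round(total_distance(treated, controls, optimal), 2))
```

In this toy data the greedy pass gives the first treated unit its closest control, which leaves the second treated unit with a poor match; the optimal assignment accepts a slightly worse first match to lower the total distance.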

April 20, 2016

11:00 AM – 11:45 AM

SAS Global Forum 2016


By Lucy D'Agostino McGowan, Robert A. Greevy, Jr in Invited Oral Presentation

pdf

2015

Census Tract-Level Disparities: Examining Food Swamps and Food Deserts

Examining disparities in resources at the census tract level is currently a public health priority. The Modified Retail Food Environment Index (mRFEI), provided by the CDC, incorporates two food environment metrics: ‘food deserts’, areas with no access to healthy foods, and ‘food swamps’, areas in which the quantity of unhealthy food options overwhelms healthy ones. We assess the association between the census tract racial make-up and food environment. Multiple logistic regression models are fit, controlling for census tract-level covariates from 2008-2012 ACS estimates, as well as state. Percent black is significantly associated with food swamps, with an absolute increase of 14.4 percent black living in food swamps (p < 0.01). Percent Hispanic is associated with food swamps, with an absolute increase of 9.1 percent Hispanic living in food swamps (p < 0.01), but inversely related to food deserts (absolute difference -6.8, p < 0.01). After adjustment, all associations remain significant. The strong association between the census tract-level racial make-up and food swamps shown here will allow for targeted interventions to census tracts where these disparities exist.

Using PROC SURVEYREG and PROC SURVEYLOGISTIC to Assess Potential Bias

The Behavioral Risk Factor Surveillance System (BRFSS) collects data on health practices and risk behaviors via telephone survey. This study focuses on the question, "On average, how many hours of sleep do you get in a 24-hour period?" Recall bias is a potential concern in interviews and questionnaires such as BRFSS. The 2013 BRFSS data is used to illustrate the proper methods for implementing PROC SURVEYREG and PROC SURVEYLOGISTIC, using the complex weighting scheme that BRFSS provides.

2014

Using SAS/STAT® Software to Validate a Health Literacy Prediction Model in a Primary Care Setting

Existing health literacy assessment tools developed for research purposes have constraints that limit their utility for clinical practice. The measurement of health literacy in clinical practice can be impractical due to the time requirements of existing assessment tools. Single Item Literacy Screener (SILS) items, which are self-administered brief screening questions, have been developed to address this constraint. We developed a model to predict limited health literacy that consists of two SILS and demographic information (for example, age, race, and education status) using a sample of patients in a St. Louis emergency department. In this paper, we validate this prediction model in a separate sample of patients visiting a primary care clinic in St. Louis. Using the prediction model developed in the previous study, we use SAS/STAT® software to validate this model based on three goodness of fit criteria: rescaled R-squared, AIC, and BIC. We compare models using two different measures of health literacy, Newest Vital Sign (NVS) and Rapid Assessment of Health Literacy in Medicine Revised (REALM-R). We evaluate the prediction model by examining the concordance, area under the ROC curve, sensitivity, specificity, kappa, and gamma statistics. Preliminary results show 69% concordance when comparing the model results to the REALM-R and 66% concordance when comparing to the NVS. Our conclusion is that validating a prediction model for inadequate health literacy would provide a feasible way to assess health literacy in fast-paced clinical settings. This would allow us to reach patients with limited health literacy with educational interventions and better meet their information needs.
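As a rough illustration of three of the evaluation statistics named above (sensitivity, specificity, and kappa), the following Python sketch computes them from a 2x2 table of predicted versus measured limited health literacy. The counts and the function name are hypothetical and are not results from the study.

```python
def screening_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, and Cohen's kappa from a 2x2 table.

    tp/fp/fn/tn are counts of true positives, false positives,
    false negatives, and true negatives.
    """
    n = tp + fp + fn + tn
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    observed = (tp + tn) / n
    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    kappa = (observed - expected) / (1 - expected)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "observed_agreement": observed,
        "kappa": kappa,
    }

# Hypothetical counts for illustration only.
metrics = screening_metrics(tp=40, fp=20, fn=10, tn=30)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```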

March 26, 2014

9:30 AM – 10:15 AM

SAS Global Forum 2014


By Lucy D'Agostino McGowan, Melody S. Goodman, Kimberly A. Kaphingst in Invited Oral Presentation

pdf

2013

Developing County-Level Estimates of Racial Disparities in Obesity Using Multilevel Reweighted Regression

Background: The agenda to reduce racial health disparities has been set primarily at the national and state levels. These levels may be too far removed from the individual level where health outcomes are realized. This disconnect may be slowing the progress made in reducing these disparities. We use a small area analysis technique to fill the void for county-level disparities data. Methods: Behavioral Risk Factor Surveillance System data is used to estimate the prevalence of obesity by county among Non-Hispanic Whites and Non-Hispanic Blacks. A modified weighting system was developed based on demographics at the county level. A multilevel reweighted regression model is fit to obtain county-level prevalence estimates by race. To examine whether racial disparities exist at the county level, these rates are compared using risk difference and rate ratio. Results: Gulf County, Florida was ranked as having the largest disparity in absolute terms (risk difference). New York County, New York was ranked as having the largest disparity in relative terms (risk ratio). Based on the average risk difference, the top five states with the largest average disparity were: Oklahoma, Kentucky, Ohio, Washington D.C., and Kansas. The top five states with the largest average relative disparity were: Washington D.C., Massachusetts, Colorado, Kentucky, and New York. Conclusions: Addressing disparities based on factors such as race/ethnicity, geographic location, and socioeconomic status is a current public health priority. This study takes a first step in developing the statistical infrastructure needed to target disparities interventions and resources to the local areas with greatest need.
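The absolute and relative measures can rank the same counties differently, which is why the abstract reports both. A minimal Python sketch with made-up prevalence values (the county names and numbers are hypothetical, not study estimates):

```python
def disparity_measures(prev_black, prev_white):
    """Return (risk difference, risk ratio) for two prevalence estimates."""
    return prev_black - prev_white, prev_black / prev_white

# Hypothetical county-level obesity prevalence estimates (Black, White).
counties = {
    "County A": (0.40, 0.30),
    "County B": (0.15, 0.08),
}

measures = {name: disparity_measures(b, w) for name, (b, w) in counties.items()}

# Rank counties by each measure, largest disparity first.
by_difference = sorted(measures, key=lambda c: measures[c][0], reverse=True)
by_ratio = sorted(measures, key=lambda c: measures[c][1], reverse=True)

print("largest absolute disparity:", by_difference[0])
print("largest relative disparity:", by_ratio[0])
```

Here County A has the larger gap in percentage points, while County B has the larger ratio because its baseline prevalence is low.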

November 6, 2013

12:00 PM – 1:00 PM

141st American Public Health Association Annual Meeting 2013


By Lucy D'Agostino McGowan, Melody S. Goodman in Contributed Poster

poster

Small Areal Estimation of Racial Disparities in Diabetes Using Multilevel Reweighted Regression

Introduction: The agenda to reduce racial health disparities has been set primarily at the national and state levels. These levels may be too far removed from the individual level where health outcomes are realized. This disconnect may be slowing the progress made in reducing these disparities. We use a small area analysis technique to fill the void for county-level disparities data. Methods: Behavioral Risk Factor Surveillance System data is used to estimate the prevalence of diabetes by county among Non-Hispanic Whites and Non-Hispanic Blacks. A modified weighting system was developed based on demographics at the county level. A multilevel reweighted regression model is fit to obtain county-level prevalence estimates by race. To examine whether racial disparities exist at the county level, these rates are compared using risk difference and rate ratio. Results: The District of Columbia was ranked as having the largest average disparity in both absolute and relative terms (risk difference and risk ratio). Based on the average risk difference of counties within a state, the next five states with the largest average disparity are: Massachusetts, Kansas, Ohio, North Carolina, and Kentucky. The next five states with the largest average relative disparity, calculated with rate ratio, were: Massachusetts, Colorado, Kansas, Illinois, and Ohio. Discussion: Addressing disparities based on factors such as race/ethnicity, geographic location, and socioeconomic status is a current public health priority. This study takes a first step in developing the statistical infrastructure needed to target disparities interventions and resources to the local areas with greatest need.

November 5, 2013

2:30 PM – 2:45 PM

141st American Public Health Association Annual Meeting 2013


By Lucy D'Agostino McGowan, Melody S. Goodman in Contributed Oral Presentation

video

Mining Through Resumes: Utilizing SAS to Increase Efficiency and Objectivity in the Hiring Process

In the current job market, it is common to be inundated with resumes and applications. It has become increasingly important to streamline the evaluation process in order to sift through these candidates. Anecdotally, we recently received 50 resumes for 2 positions, many of which did not meet the minimum qualifications for employment. In order to minimize the time spent evaluating these resumes, and maximize the objectivity and efficiency of the process, we developed a SAS macro to determine which candidates should progress to a first round interview.

October 22, 2013

12:00 PM – 1:00 PM

SAS Analytics 2013


By Lucy D'Agostino McGowan, Patrick J. McGowan in Contributed Poster

poster

Using PROC GLIMMIX and PROC SGPLOT to Demonstrate County-level Racial Disparities in Obesity in North Carolina

The agenda to reduce racial health disparities has been set primarily at the national and state levels. These levels may be too far removed from the individual level where health outcomes are realized. This disconnect may be slowing the progress in reducing these disparities. Behavioral Risk Factor Surveillance System data is used to estimate the prevalence of obesity by county among Non-Hispanic Whites and Non-Hispanic Blacks. A modified weighting system was developed based on demographics at the county level, and a multilevel reweighted regression model using PROC GLIMMIX is fit to obtain county-level prevalence estimates by race. To examine whether racial disparities exist at the county level, these rates are compared using risk difference and rate ratio. These county-level estimates are then compared graphically using PROC SGPLOT. The distribution of prevalence estimates for Blacks is shifted to the right in comparison to the distribution for Whites; based on a two-sample test for differences in proportions, the mean of the distribution of obesity prevalence estimates for Blacks is 35.7% higher than for Whites in North Carolina. This difference is statistically significant (p<.0001). Addressing disparities based on factors such as race/ethnicity, geographic location, and socioeconomic status is a current public health priority. This study takes a first step in developing the statistical infrastructure needed to target disparities interventions and resources to the local areas with greatest need, and provides a graphical representation of disparities, allowing interventions to be implemented and information to be disseminated more effectively and efficiently.

SAS® for Budgeting an Ideal Wedding

When considering beverages at a wedding reception, there are often two possible payment options: (1) a set price per person per hour; (2) a fixed price per drink. We developed a SAS macro to help choose the most cost-effective option.
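The comparison between the two payment options comes down to simple arithmetic. The sketch below is a Python illustration, not the SAS macro from the poster; the function name, prices, and the assumed number of drinks per guest per hour are all hypothetical.

```python
def beverage_costs(guests, hours, price_per_person_hour,
                   price_per_drink, drinks_per_guest_hour):
    """Return (flat-rate cost, per-drink cost, name of the cheaper option)."""
    flat_rate = guests * hours * price_per_person_hour
    per_drink = guests * hours * drinks_per_guest_hour * price_per_drink
    cheaper = "flat rate" if flat_rate <= per_drink else "per drink"
    return flat_rate, per_drink, cheaper

# Hypothetical reception: 100 guests, 4 hours, $10 per person per hour,
# or $6 per drink with an assumed 1.5 drinks per guest per hour.
flat_rate, per_drink, cheaper = beverage_costs(
    guests=100, hours=4, price_per_person_hour=10,
    price_per_drink=6, drinks_per_guest_hour=1.5,
)
print(f"flat rate: ${flat_rate:,.0f}, per drink: ${per_drink:,.0f}, "
      f"cheaper: {cheaper}")
```

The answer is sensitive to the assumed consumption rate, so it is worth rerunning the comparison over a range of plausible values.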

September 9, 2013

3:00 PM – 4:00 PM

Northeast SAS Users Group 2013


By Lucy D'Agostino McGowan in Contributed Poster

poster