I am a data and social science consultant with a passion for using data and technology to understand the ways that culture and social structures shape our world.
I completed my Sociology PhD at Duke in 2023, where my research focused on measuring and modeling aspects of culture such as stereotypes, social cohesion, narratives, and moral frames, using methods such as social/organizational network analysis, interpretive content analysis, surveys, experiments, and NLP/computational text analysis. In my work as a data and social science consultant, I perform analyses and advise on approaches to identify challenges and opportunities for the implementation of business activities and programs. More recently, my work has expanded to advising on strategies for using generative AI to integrate data insights into everyday business processes.
My consulting work broadly falls into two categories.
I believe that investment in AI is about more than investing in AI technology itself - it is about investing in the ecosystem of technologies and human processes that make organizations unique.
I want to help your business invest in these ways:
My technical skills span a broad range of areas including text analysis/NLP, social network analysis, statistical modeling, data engineering, visualization, and machine learning. I primarily implement these methods in Python, and I also have several years of experience teaching and working with them in R. As a former engineer, I also have experience creating data-oriented APIs, designing database schemas and pipelines, implementing signal processing algorithms (images, audio, etc.), and working with a range of other languages and technologies. See my GitHub profile, data science blog, and Python packages (below) for examples of my work.
Package for maintaining chatbots with tools and conversation history. Built on LangChain.
GitHub / Docs
This package offers a powerful interface for compiling Markdown documents to a number of other formats. It combines Pandoc document conversion with the Jinja templating engine, adding custom functions that insert tables from CSV or Excel files, insert SVG or PDF images directly, and apply conditional logic when compiling to different document formats.
GitHub / Docs
Package for parsing, storing, and accessing text documents and models for large scale text analysis.
GitHub / Docs
Tools for working with video and image files that I've used in a number of projects.
GitHub
Template for Python package repositories with a Makefile for publishing example Jupyter notebooks on a website based on MkDocs Material, setting up virtual environments, linting, testing, and more.
GitHub
This Python project allows you to create a photomosaic from a target image and a large set of candidate images.
GitHub
Here I highlight some of my academic work. See my CV for a full list of my publications.
Ph.D. Dissertation, Duke University (2023)
The emergence and persistence of communities have long been of interest to social scientists, and the increasingly digital landscape in which these communities exist presents important theoretical and methodological challenges. In this dissertation, I develop methods for identifying and characterizing communities on Twitter and examine the kinds of interactions that affect social cohesion. Using the Fat Liberation community as a case study, I find that there is a core set of users engaged in conversations around criticizing conceptions of Fatness, and I observe partitions in the community differentiated by stylistic approaches to discussion rather than topical focus. I next operationalize hypotheses from Randall Collins' Interaction Ritual Chain theory using novel methods for measuring the effects of engaging in particular types of interactions. I find support for several hypotheses generated directly from this theory in online settings and further find that high-status users play a particularly important role in producing group cohesion - a perhaps underplayed aspect of the theory that may be particularly important in online settings. Finally, I build on conflict theories to hypothesize that exposure to toxic interactions will affect social cohesion - particularly when they involve other high-status users. I do not find support for these hypotheses, however, suggesting that further work should investigate the role of toxic behavior by accounting for the situational dynamics produced by interactions.
Journal Article: Social Forces (2023)
Gender stereotypes have important consequences for boys' and girls' academic outcomes. In this article, we apply computational word embeddings to a 200-million-word corpus of American print media (1930-2009) to examine how these stereotypes changed as women’s educational attainment caught up with and eventually surpassed men’s. This transformation presents a rare opportunity to observe how stereotypes change alongside the reversal of an important pattern of stratification. We track six stereotypes that prior work has linked to academic outcomes. Our results suggest that stereotypes of socio-behavioral skills and problem behaviors—attributes closely tied to the core stereotypical distinction between women as communal and men as agentic—remained unchanged. The other four stereotypes, however, became increasingly gender-polarized: as women’s academic attainment increased, school and studying gained increasingly feminine associations, whereas both intelligence and unintelligence gained increasingly masculine ones. Unexpectedly, we observe that trends in the gender associations of intelligence and studying are near-perfect mirror opposites, suggesting that they may be connected. Overall, the changes we observe appear consistent with contemporary theoretical accounts of the gender system that argue that it persists partly because surface stereotypes shift to reinterpret social change in terms of a durable hierarchical distinction between men and women.
Journal Article: Sociological Forum (2021)
M.A. Thesis, University of California Santa Barbara (2019)
When are politicians influential in shifting party discourse? This study explores how same-party politicians influence one another, and how this influence leads to changes in a party's larger discourse. I suggest that the extent to which politicians are able to influence other party politicians depends on how their messages situate them within the party’s discursive field. I further suggest that certain messages are particularly influential when they are distinctive within a given time period. To assess this effect, I use a case study of just under 1 million Tweets posted from 2015 to 2017 by politicians in the Colombian political party Centro Democrático. I use topic modeling and network analysis to measure influence within a dynamic discursive field, and a genetic learning algorithm to identify the types of messages, represented as topics, that constitute the field under which we observe the strongest linkage between field position and influence. I find that politicians are influential when posting about current events and when creating symbolic distinctions that are central to the party ideology - in the case of Centro Democrático, distinctions between the concept of peace itself and the peace process developing in Colombia. These results suggest that the discursive field can be a powerful tool for analysis of influence and political discourse.