Devin J. Cornell, Ph.D.

Data and Social Science Research Consultant
devin@devinjcornell.com
Resume (PDF)

I am a data and social science consultant with a passion for using data and technology to understand the ways that culture and social structures shape our world.

I completed my Sociology Ph.D. at Duke in 2023, where my research focused on measuring and modeling aspects of culture such as stereotypes, social cohesion, narratives, and moral frames. My methods included social/organizational network analysis, interpretive content analysis, surveys, experiments, and NLP/computational text analysis. In my work as a data and social science consultant, I perform analyses and advise on approaches to identify challenges and opportunities for the implementation of business activities and programs. More recently, my work has expanded to advising on strategies for using generative AI to integrate data insights into everyday business processes.


Consulting

My consulting work broadly falls into two categories.

Social Science and Data Science Consulting

I help organizations apply rigorous methodologies to turn data into insights.
  • Modeling: machine learning and statistical modeling for inference with survey, text, and digital trace data.
  • Measurement: develop novel measures that balance precision and interpretability to use as indicators or descriptors.
  • Network analysis: collect and analyze social and organizational network data from survey or digital trace data.
  • Content analysis: manually analyze large quantities of text to develop hypotheses and modeling strategies.
  • Data engineering: build efficient and error-free data pipelines using good software design principles.

Generative AI for Businesses

I help organizations invest in a new AI-driven world.

I believe that investment in AI is about more than investing in AI technology itself - it is about investing in the ecosystem of technologies and human processes that make organizations unique.

I want to help your business invest in these ways:

  • Data Lakes / Warehouses. Reshape your data infrastructure to support AI-enabled interfaces from collection to insight.
  • Institutional Knowledge. Integrate the values and commitments of your organization into everyday work.
  • Organizational Policies. Give your employees easier access to information about policies and organizational structure.
  • Communication/Information Services. Collaborate with team members by connecting to existing communication services.

Skills

GitHub

My technical skills span a broad range of areas including text analysis/NLP, social network analysis, statistical modeling, data engineering, visualization, and machine learning. I primarily implement these methods in Python, and I also have several years of experience teaching and working with them in R. As a former engineer, I also have experience creating data-oriented APIs, designing database schemas and pipelines, implementing signal processing algorithms (images, audio, etc.), and working with a range of other languages and technologies. See my GitHub profile, data science blog, and Python packages (below) for examples of my work.


Python Packages

SimpleChatbot

Package for maintaining chatbots with tools and conversation history. Built on LangChain.
GitHub / Docs

PyMdDoc

Package for compiling Markdown documents to a number of other formats using Pandoc document conversion and the Jinja templating engine. Custom template functions can insert tables from CSV or Excel files, insert SVG or PDF images directly, and apply conditional logic when compiling to different document formats.
GitHub / Docs

DocTable


Package for parsing, storing, and accessing text documents and models for large scale text analysis.
GitHub / Docs

CoProc


Provides building blocks for running stateful concurrent processes.
GitHub / Docs

Media-Tools

Tools for working with video and image files that I've used in a number of projects.
GitHub

Devin-Package-Template

Template for Python package repositories, with a makefile for publishing example Jupyter notebooks on an MkDocs Material-based website, setting up virtual environments, linting, testing, and more.
GitHub

Personal Projects

Photomosaic


This Python project allows you to create a photomosaic from a target image and a large set of candidate images.
GitHub

Highlighted Academic Work

Curriculum Vitae

Here I highlight some of my academic work. See my CV for a full list of my publications.

Social Cohesion in the Fat Liberation Community on Twitter

Devin J. Cornell

Ph.D. Dissertation, Duke University (2023)

The emergence and persistence of communities has long been of interest to social scientists, and the increasingly digital landscape in which these communities exist presents some important theoretical and methodological challenges. In this dissertation, I develop methods for identifying and characterizing communities on Twitter and examine the kinds of interactions that affect social cohesion. Using the Fat Liberation community as a case study, I find that there is a core set of users engaged in conversations around criticizing conceptions of Fatness, and I observe partitions in the community differentiated by stylistic approaches to discussion rather than topical focus. I next operationalize hypotheses from Randall Collins' Interaction Ritual Chain theory using novel methods for measuring the effects of engaging in particular types of interactions. I find support for several hypotheses generated directly from this theory in online settings and further find that high-status users play a particularly important role in producing group cohesion - a perhaps underplayed aspect of the theory that may be particularly important in online settings. Finally, I build on conflict theories to hypothesize that exposure to toxic interactions will affect social cohesion - particularly when they involve other high-status users. I do not find support for these hypotheses, however, suggesting further work should investigate the role of toxic behavior by accounting for the situational dynamics produced by interactions.

School, Studying, and Smarts: Gender Stereotypes and Education Across 80 Years of American Print Media, 1930-2009

Andrei Boutyline, Alina Arseniev-Koehler, Devin Cornell

Journal Article: Social Forces (2023)

Gender stereotypes have important consequences for boys' and girls' academic outcomes. In this article, we apply computational word embeddings to a 200-million-word corpus of American print media (1930-2009) to examine how these stereotypes changed as women’s educational attainment caught up with and eventually surpassed men’s. This transformation presents a rare opportunity to observe how stereotypes change alongside the reversal of an important pattern of stratification. We track six stereotypes that prior work has linked to academic outcomes. Our results suggest that stereotypes of socio-behavioral skills and problem behaviors—attributes closely tied to the core stereotypical distinction between women as communal and men as agentic—remained unchanged. The other four stereotypes, however, became increasingly gender-polarized: as women’s academic attainment increased, school and studying gained increasingly feminine associations, whereas both intelligence and unintelligence gained increasingly masculine ones. Unexpectedly, we observe that trends in the gender associations of intelligence and studying are near-perfect mirror opposites, suggesting that they may be connected. Overall, the changes we observe appear consistent with contemporary theoretical accounts of the gender system that argue that it persists partly because surface stereotypes shift to reinterpret social change in terms of a durable hierarchical distinction between men and women.

All Roads Lead to Polenta: Cultural Attractors at the Junction of Public and Personal Culture

Andrei Boutyline, Devin J. Cornell, Alina Arseniev-Koehler

Journal Article: Sociological Forum (2021)


Discursive Fields and Intra-party Influence in Colombian Politics

Devin J. Cornell

M.A. Thesis, University of California Santa Barbara (2019)

When are politicians influential in shifting party discourse? This study explores how same-party politicians influence one another, and how this influence leads to changes to a party's larger discourse. I suggest that the extent to which politicians are able to influence other party politicians depends on how their messages situate them within the party’s discursive field. I further suggest that certain messages are particularly influential when distinctive within a given time period. To assess this effect, I use a case study of just under 1 million Tweets from politicians in the Colombian political party Centro Democrático from 2015 to 2017. I use topic modeling and network analysis to measure influence within a dynamic discursive field, and a genetic learning algorithm to identify types of messages, as topics, which constitute the field under which we observe the strongest linkage between field position and influence. I find that politicians are influential when posting about current events and when creating symbolic distinctions which are central to the party ideology - in the case of Centro Democrático, distinctions between the concept of peace itself and the peace process developing in Colombia. These results suggest that the discursive field can be a powerful tool for analysis of influence and political discourse.