This past summer, I had the privilege of working closely with Jordan Pettyjohn, an undergraduate student from the Colorado School of Mines, as part of the BigDataX REU program. His summer research project focused on using an interpretability framework to identify and address toxicity and bias in transformer-based language models. This research was awarded the 1st place prize in the ACM Student Research Competition (SRC).

This work was also featured at the 2024 BlackBoxNLP Workshop hosted at this year’s EMNLP conference.

  • The poster for this work, presented at both the Supercomputing conference and the BlackBoxNLP workshop, can be found here.
  • A white paper with more information on the project can be found here.
Jordan Pettyjohn winning 1st prize at the Supercomputing conference.