Text data cleaning workshop for February 13, 2025 and April 1, 2025.
The main notebook is Persnickety Python - Text Data Cleaning.
Notebook A provides an in-depth look at a particularly messy dataset and shows how convoluted data cleaning can be.
Notebooks B-D each cover specific topics related to data cleaning.
These materials work best in an in-person workshop, but they are also intended to serve as a self-paced tutorial on the topics covered.
If you have questions or suggestions, please don't hesitate to reach out! My email is david.merten-jones@claremont.edu
© 2025. This work is openly licensed via CC BY 4.0