Tabular data is widely used, and it is often entered and formatted so that it is easy for the human eye to read. However, to analyze and visualize tabular data simply and accurately, or to process it further with programming languages such as R or Python, the dataset should first be cleaned and organized according to the principles of tidy data.
This workshop covers the following topics, which are practiced using sample data from library and information science:
- Best practices for data entry and formatting
- Avoiding common formatting errors
- Handling dates in spreadsheets
- Basics of quality control and data manipulation in spreadsheets
- Exporting data from spreadsheets
- Reconciliation with external sources, e.g., authority files
The workshop is based on the curricula of The Carpentries.
- Instructors: Claudia Engelhardt (TU Dresden/Center for Interdisciplinary Digital Sciences)
- Format: Workshop (in-person, online)
- Target groups: Students (B.A., M.A.), researchers
- Languages: German, English