
A Data Carpentry Workshop

Wits University

30 Jan - 1 Feb, 2018

30 Jan - 31 Jan 08:30 to 16:30, 1 Feb 08:30 to 12:00

Instructors: Martin Mafunda, Juan Steyn, Caroline Ajilogba

Helpers: Alphonse Bere, Lactatia Motsoku

General Information

Thank you for your interest in this Data Carpentry workshop.

The ever-increasing digital nature of research requires researchers, postgraduate students, and research-support staff to equip themselves with the skills to create, manipulate and manage data in digital format. This can involve complex research data management techniques.

However, researchers and students can often handle data management tasks, from simple to complex, by mastering tools and techniques that require neither pricey software licenses nor highly specialised skills.

This workshop will help researchers, postgraduate students, and research-support staff learn more about such tools and techniques.

The workshop is organised and funded as part of the DST-funded National e-Science Postgraduate Teaching and Training Platform (NEPTTP), in cooperation with the South African Center for Digital Language Resources (SADiLaR).

Workshop Aims

This workshop aims to provide a broad introduction to the following concepts and tools:

  • Data formatting, cleaning, and manipulation using OpenRefine
  • Research Data Management
  • Introduction to R for analysing data

Registration and other information

Please register for the workshop through the online registration form by 28 January 2018. Registration is free of charge, although a penalty fee of R500 will be charged for no-shows. Space is limited. If you have any questions, please contact juan.steyn@nwu.ac.za.


Data Carpentry workshops are for any graduate student, staff member or researcher who has data they want to analyze, and no prior computational experience is required. This hands-on workshop teaches basic concepts, skills and tools for working more effectively with data.

We will cover data organisation in spreadsheets, data cleaning with OpenRefine, and data analysis and visualisation in R, as well as concepts of research data management. Participants should bring their laptops and plan to participate actively. By the end of the workshop, learners should be able to manage and analyze data more effectively and to apply the tools and approaches directly to their ongoing research.

Who: The course is aimed at postgraduate students, staff members and researchers.

Where: MSL111, Mathematical Sciences Laboratories, Lower Ground Floor, Mathematical Sciences Building, West Campus, Wits, Enoch Sontonga Avenue, Braamfontein. Get directions with Google Maps.

Requirements: Participants must bring a laptop with a Mac, Linux, or Windows operating system (not a tablet, Chromebook, etc.) on which they have administrative privileges. They should have a few specific software packages installed (listed below). They are also required to abide by Data Carpentry's Code of Conduct.

Contact: Please email juan.steyn@nwu.ac.za for more information or if you have any trouble registering for the workshop.


Preliminary Schedule

Surveys

Please be sure to complete these surveys before and after the workshop.

Pre-workshop Survey

Post-workshop Survey

Day 1

30 January 2018

Morning Data organisation in spreadsheets
Afternoon Data cleaning with OpenRefine

Day 2

31 January 2018

Morning Data analysis and visualisation in R
Afternoon Data analysis and visualisation in R

Day 3

1 February 2018

Morning Data analysis and visualisation in R

Meals:

Lunch will not be catered for.

Tea/Coffee will be provided according to schedule.

Etherpad: http://pad.software-carpentry.org/2018-01-30-WITS
We will use this Etherpad for chatting, taking notes, and sharing URLs and bits of code.


Detailed Programme

Day 1 - 30 January 2018

08:30 Software installations and setup
09:00 Introductions
09:15 Data organisation in spreadsheets
10:15 Coffee and tea break
10:35 Data organisation in spreadsheets
12:00 Lunch will not be catered for
13:00 Introduction to OpenRefine
14:20 Coffee and tea break
14:40 OpenRefine
16:15 Wrap-up

Day 2 - 31 January 2018

08:30 Introduction to R
10:15 Coffee and tea break
10:35 Starting with data in R
12:00 Lunch will not be catered for
13:00 Manipulating data in R
14:20 Coffee and tea break
14:40 Visualising data in R
16:15 Wrap-up

Day 3 - 1 February 2018

08:30 Visualising data in R
10:00 Coffee and tea break
10:30 Visualising data in R
12:00 Finish

Syllabus

Using spreadsheet programs for scientific data

OpenRefine

OpenRefine (previously Google Refine) is a data-cleaning tool that runs in a web browser. Any browser (Safari, Firefox, Chrome, Explorer) should work fine. You will need to download and install OpenRefine. When you open it, it runs in the browser, but you do not need an internet connection, and all of your data is stored on your own computer.

R for data analysis and visualisation

  • Introduction to R
  • Starting with data
  • Aggregating and analyzing data with dplyr
  • Data visualisation with ggplot2
  • R and Databases

Setup

To participate in a Data Carpentry workshop, you will need working copies of the software described in the setup instructions. Please make sure to install everything (or at least to download the installers) before the start of the workshop. Please bring and use your own laptop to ensure that your tools are properly set up for an efficient workflow once you leave the workshop.

Please follow these Setup Instructions.

We maintain a list of common installation issues on the Configuration Problems and Solutions wiki page. It is written as a reference for instructors, but participants may find it useful too.