Course goals and objectives
The aim of this course is to introduce several methods, tools and techniques for finding, analysing and visualising 'big' data. Both qualitative and quantitative methods will be applied to participants’ datasets, whether these already exist from their own research, are newly gathered for the course, or both. The course will also reflect on the role of analytics tools in research and on the role (and power) of scientific visualisations. After this course, students will be able to work with data that is relevant for their own PhD research.
'Big Data' sets, whether one finds them on the Internet and social media, in economic and medical databases or in newspaper and population archives, are becoming available for every research topic imaginable. For academic and marketing researchers alike, big datasets offer valuable sources of information. Free tools for mining data and text, such as word cloud generators or the so-called N-gram viewer, make it possible for everyone to automatically analyse text and other data types for their own purposes.
The discipline of researching and (meta-)analysing data, called data science, is rapidly emerging. It calls for dedicated training of researchers who work with big datasets (read: bigger than what was previously considered a big volume). Reaching an appropriate skill level requires more than an explanation of new tools and methods. To be able to crunch data is one thing, but to make datasets accessible (searchable, usable, comprehensible) for a specific audience is another. And as databases grow, so do the questions about their purpose, their usability, and their reach and potential. This course aims to support students in taking the initial steps in data processing and to stimulate reflection on what would be appropriate next steps.
The course is not meant to turn students into data scientists, but it will provide a multifaceted introduction to methods to scrape, capture, collect, analyse, clean up and parse data. A specific focus will be on social network and Web data. This serves two purposes: (i) to introduce the theory of network analysis and (ii) to provide hands-on experience in gathering and analysing network data. In addition, the course will zoom in on methods of visualising data. There will be ample opportunities for hands-on experience with several types of tools, for both qualitative and quantitative data.
The course is tailored to PhD students interested in knowing more about the subject of big data and visualisation. It aims to let participants go through the necessary steps of data gathering, cleaning and visualising. It does not aim to improve your programming skills or to teach you a particular type of software. In fact, programming skills are not required, although any you have could come in handy.
We will have four meetings of three hours, each consisting of two parts: in part I, theories are presented and discussed, and in part II, the students engage in a data project themselves. Part II will consist of a tutorial and/or a specific assignment, and there will also be time to work on the project, ask questions and get feedback on progress.
A list of obligatory and additional readings will be announced by email. The principle of the course is ‘self-directed learning’, but weekly exercises will be provided. Students are advised to bring a laptop or computer on which they have admin rights to install and uninstall software. All readings will be made available via the course website.
The project can be done individually or in teams of two, depending on topic, reach and personal learning goals. As deliverables for the course, we ask the following:
1. a written report on the project
2. a final presentation on process and findings
After course completion, the participants can:
- use data and text mining techniques to explore or answer a research question in their own PhD research
- use data and text mining techniques to visualise and analyse data
- reflect on the usability and scope of methods for big data analysis
- draw on a set of transferable skills that enhances their employability outside academia, where, for instance, data journalism and predictive analytics are in high demand
- distinguish between automatic and human forms of analysis and the tools involved in specific use cases