This course focuses on the newest techniques and methodologies for the digital collection of human behavioral data from various sources. As such, participation requires some basic programming skills in Python, to be able to implement and operate online data collection modules and to manage, prepare, and analyze the collected datasets. As a prerequisite, we ask students to either (a) pass the DNDS 6007 course on Scientific Python (or equivalent), (b) show a certificate from an online Python programming course, or (c) show and discuss a project they developed in Python. During the first class, we will hand out a test to check the prerequisites among all registered students. Those who do not reach the minimum threshold will not be able to take the course, even if regularly registered. The test does not count toward the final grade of the course.
Background and the overall aim of the course: In the age of the digital data revolution, the collection of human behavioral datasets is a very important issue and requires thorough training in the appropriate design of collection methods. While researchers commonly take data as given at the outset, without control over the data collection pipeline one can never be sure about intrinsic biases, hidden correlations, or unrepresentative sampling. All of these can induce misleading noise or undermine any conclusion drawn from data-driven observations. The aim of this course is to provide proper training in the methodological paths of digital data collection, to understand how to translate a scientific hypothesis into a data collection pipeline that precisely addresses the question at hand with the least possible noise and environmental effects. During the course we will learn in depth about the latest techniques for collecting individual or collective human behavioral data using tracking, monitoring, or crawling methods, or transactional data techniques. We will also learn how to design online surveys and how to set up controlled online social experiments. All these methods are at the forefront of computational social science and are pivotal for the coming generation of researchers and data scientists working on related questions. The course will have a hands-on approach, with homework, practical classes, and the development of a project. We also plan a course-long data collection experiment in which students may collect their own social, mobility, or health data using their smartphones and readily available applications. This way they will learn first-hand the difficulties and potential of such techniques, and because they will analyze only their own data, privacy issues are avoided during this exercise.
The learning outcomes of the course: The aim of the course is to provide a comprehensive introduction to digital data collection methods. By the end of the course students will be able to:
- Design basic data collection strategies and obtain data from a number of data sources
- Program API streams to collect data from online sources
- Crawl existing online textual or profile data
- Design simple online or offline digital data-driven experiments using collective platforms or personal logging devices
- Apply basic techniques to curate, filter, or couple various types of datasets, including temporal, spatial, or social relational data
- Recognize potential sampling biases and privacy concerns and account for them during data collection
What you will NOT learn in this course: This course is about the methods and algorithms used to collect data. It will not provide advanced coding or data visualization skills, nor training in data handling and database management. To learn to code, consider attending DNDS 6007 Scientific Python. To learn to visualize data, consider attending DNDS 6007 Data and Network Visualization. This course focuses on human behavioral data and thus will not teach methodologies for collecting financial, geospatial, neural, etc. data.
- Instructor: Marton Karsai