Learning from Big Data

How To Learn and Interact with Consumers in the Big-Data Age

Name minor:

Learning from Big Data



Teaching language:


Programme which has the coordinating role for this minor:

Rotterdam School of Management, Erasmus University (RSM); Department of Marketing Management

Other programmes which are contributing to the minor:



See admissions matrix



Every day, millions of consumers make online purchases and voice their opinions on product-review websites, blogs and chat rooms. They spontaneously produce massive amounts of user-generated data (also called user-generated content, or UGC), most of which is textual and freely available for downloading and analysis. In this big-data era you can quickly collect large amounts of rich, valuable, and reliable data on consumers. All you need is the right set of tools and the proper training to know how to use them.

However, the volume of UGC data is so large that it can be hard to know how to track the right information and how to separate signal from noise. To make matters more difficult, UGC data is often unstructured and textual. Do you know how to analyze it? How do you go from consumer-generated text to marketing decisions?

The near-universal adoption of the Internet has also made it easy to run large-scale online experiments to identify the best ways to communicate and interact with consumers. These online test/control experiments are often called “A/B tests” because consumers are randomly assigned either to a test group, exposed to one of the treatments of interest (such as an innovative website design), or to a control group, exposed to a baseline condition (such as a traditional website design). Comparing the groups shows which condition is most effective in terms of sales, click-through or another metric of interest. Firms run thousands of such online experiments every day to find the most effective banners, website designs, emails, promotions and even product recommendations.
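
As a rough illustration of the test/control comparison described above, here is a minimal Python sketch of a two-proportion z-test on conversion counts. It is only one common way to analyze an A/B test, not necessarily the method taught in this course (which works with Excel, SPSS and R). The function name `ab_test` and all visitor and conversion counts are hypothetical.

```python
import math

def ab_test(conv_control, n_control, conv_test, n_test):
    """Two-sided two-proportion z-test on conversion rates (illustrative sketch)."""
    p_c = conv_control / n_control   # conversion rate in the control group
    p_t = conv_test / n_test         # conversion rate in the test group
    # Pooled rate under the null hypothesis that both designs convert equally
    p_pool = (conv_control + conv_test) / (n_control + n_test)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_control + 1 / n_test))
    z = (p_t - p_c) / se
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p_c, p_t, z, p_value

# Hypothetical data: 5000 visitors per group, 200 vs. 250 purchases
p_c, p_t, z, p_value = ab_test(conv_control=200, n_control=5000,
                               conv_test=250, n_test=5000)
print(f"control={p_c:.3f} test={p_t:.3f} z={z:.2f} p={p_value:.4f}")
```

A small p-value suggests that the difference between the two designs is unlikely to be due to chance alone; whether the difference is large enough to matter commercially is a separate question.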

Thus, learning about consumers has never been so easy. Anyone with the right tools and skills can listen to what millions of consumers are saying, find out what suggestions they have, identify unmet needs and preferences, and experiment to learn the best ways to interact with them. Listening to what consumers are saying is valuable because it can lead to profitable new market opportunities (such as opportunities for new products) and anticipate major problems (such as a recall crisis). 

In this course you will find tools and conceptual frameworks needed to identify and exploit the opportunities from these big data sources.

Learning objectives

After taking this course you will be able to design and analyze simple experiments such as A/B tests to find out the most effective approaches to interact with consumers, such as the types of display banners, promotions or website designs that are more successful in a specific market or segment. You will also be able to select and analyze UGC data and use basic text-mining tools to listen to the voice-of-the-consumer expressed in unstructured data such as texts in blogs and product reviews.

The objectives of this course are to equip students with the tools and methods necessary to:

(1)    Obtain insights on consumer preferences and behavior from user-generated data

(2)    Design and run experiments to learn the most effective ways of interacting with a specific set of consumers

(3)    Understand major challenges and steps in optimizing online experiments with methods such as website morphing

(4)    Understand the potential of user-generated content (UGC) as a way to listen to the voice-of-the-consumer expressed in unstructured data such as texts in blogs and product reviews.

(5)    Analyze a real-world online experiment from a real company that has partnered with the Erasmus Center for Optimization of Digital Experiments (e-Code)

This course will be based on lectures and hands-on activities. Students will be encouraged to apply the methods, Excel templates and R scripts seen during the lectures, use text mining to replicate the findings, and perform at least one real-world online A/B experiment on their own. Most of the tools used in this course are freely available as open-source software.

Specific characteristics

Both sides of your brain will be heavily used in this course :-). This course will push you to develop new analytical skills and to be creative when devising solutions and when interpreting your results to understand the consumer behavior that generated the data you have in your hands.

The course has three modules. In the first module (“Mine your Own Business in Blogs, Reviews and Tweets”) we will work with various types of UGC, with a strong emphasis on textual data. We will discuss and use tools to help you automate the detection of both sentiment and content in texts produced by consumers. The other two modules focus on online experiments: in the second module (“Learning from Experience: Introduction to Online Experiments”) we will provide an introduction to the popular method of A/B testing (https://en.wikipedia.org/wiki/A/B_testing) and help you run a test on your own. The third and last module (“Learning While Earning: Advanced Methods for Online Experimentation”) focuses on adaptive methods, which are the state of the art in online experimentation because they allow firms to run experiments faster and at lower cost. There is one major “module assignment” to be handed in at the end of each module.

All data analysis in this course can be done on a small scale using Excel and SPSS. For example, the analysis of a single movie review from the IMDb website can be done in Excel with the template that is provided in this course. Students interested in achieving greater efficiency (such as processing thousands of reviews automatically) can benefit from learning R, which allows you to be far more productive with data.

Preparation is Important!

This summer, prepare yourself for this course with three simple tasks:

  1. Read the chapters on “Regression” and “Comparing Two Means” in this book: Field, Andy. Discovering Statistics using SPSS (2009 or more recent). London: Sage Publications.
  2. Study this simple R tutorial (www.biostat.jhsph.edu/~ajaffe/docs/undergradguidetoR.pdf). Please make sure you install R on your computer and work through the exercises in the tutorial.
  3. If you are not familiar with MS Excel, please at least learn how to use the VLOOKUP function.


Maximum number of students that can participate in the minor: 50
Minimum number of students that can participate in the minor: 12

Contact hours: This course comprises about 40 contact hours over 8 weeks. Each week consists of two classroom sessions of about 2.5 hours.

Overview modules

Module 1: Mine your Own Business in Blogs, Reviews and Tweets


In this module students will learn how UGC is produced, which tools are available to collect it, how to clean it, and how to use simple text-analysis tools to derive information that supports decision-making.

  • What questions UGC can and cannot answer, and how to ask the right questions
  • The big-data side of UGC
    • Data planning: sources, aggregation level, timeline, integration
    • Downloading and organizing data
  • Text mining made easy: tools and tricks 
    • Data preparation: cleaning and tokenization
    • Sentiment analysis
    • Topic analysis
  • Predictive analytics and UGC: data purification and the silence of content
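
To give a feel for the data-preparation and sentiment-analysis steps listed above, here is a minimal Python sketch that tokenizes a review and scores it against a tiny word list. It is an illustration under simplifying assumptions, not the course's own template or R scripts: the lexicons, the stopword list, the function names and the sample review are all hypothetical, and real lexicons contain thousands of entries.

```python
import re
from collections import Counter

# Tiny illustrative word lists (hypothetical)
POSITIVE = {"great", "good", "love", "excellent", "fun"}
NEGATIVE = {"bad", "boring", "poor", "terrible", "hate"}
STOPWORDS = {"the", "a", "an", "and", "but", "was", "is", "it", "i", "this"}

def tokenize(text):
    """Cleaning and tokenization: lowercase, keep word tokens, drop stopwords."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]

def sentiment_score(tokens):
    """Lexicon-based sentiment: positive word count minus negative word count."""
    counts = Counter(tokens)
    pos = sum(counts[w] for w in POSITIVE)
    neg = sum(counts[w] for w in NEGATIVE)
    return pos - neg

# Hypothetical review text
review = "I love this movie. The acting was excellent, but the ending was boring."
tokens = tokenize(review)
score = sentiment_score(tokens)
print(tokens)
print("sentiment score:", score)
```

A score above zero indicates more positive than negative words. Simple counts like this ignore negation and sarcasm, which is one reason careful cleaning and validation matter.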

 Teaching method: Lectures, guest lectures, various hands-on laboratory days, business case discussions and a large hands-on assignment per module. Students will work on the module assignment in small groups. Each group will run a UGC project to answer one of the provided research questions. This includes locating the most appropriate sources of UGC data, downloading it, running the appropriate analyses, and delivering a robust report with conclusions.

Teaching materials:  lectures, cases, assignments

Contact hours: 4 sessions of 2.5h = 10h

Module 2: Learning from Experience: Introduction to Online Experiments


In this module we will equip you with a conceptual framework and basic tools to design simple online experiments, run them, and perform the appropriate statistical analyses to interpret their results. The main topics are:

  • What questions experiments can and cannot answer, and how to ask the right questions
  • Why and when we run experiments
  • Main elements of experiments
  • Planning and designing experiments: basic concepts and tools
    • Goals of experiments, design, and treatments
    • Internal and external validity
    • Analysis of experimental data: simple but robust statistical tools
  • Running your experiments on an online platform
  • Pools of respondents: Amazon Mechanical Turk and other platforms
  • Reporting technical information to managers and clients
  • The consumer perspective on experimentation: engagement, buying behavior and privacy 
    • Annoyance, intrusiveness, privacy
    • Consumer engagement: measures, approaches
    • Considering consumer buying behavior when designing your experiment
    • The ethics and limits of digital experimentation: the Facebook case

Teaching method: Lectures, guest lectures, various hands-on laboratory days, business case discussions and a large hands-on assignment per module. Students will work on the module assignment in small groups. Each group will design, run, analyze and report an online A/B experiment to address one of the research questions provided by the lecturers. Groups should use Google Analytics (formerly GWO) or an equivalent platform, and Amazon Mechanical Turk or an equivalent pool of respondents.

Teaching materials:  lectures, examples, practical exercises, and assignments. Students are expected to become familiar with the concepts in this book: Field, Andy. Discovering Statistics using SPSS. (2009). London: Sage Publications.

Contact hours: 6 sessions of 2.5h = 15h

Module 3: Learning While Earning: Advanced Methods for Online Experimentation


In this module you will be exposed to the most advanced methods of online experimentation. Special emphasis will be given to adaptive methods that accelerate learning and reduce costs, such as website morphing and the knowledge gradient. Website morphing is a method for adapting a website to match the style of each individual consumer. Morphing works by continuously running an experiment that balances learning and sales in order to obtain the maximum possible revenue at minimal cost. Morphing started in the website context but has been extended to display banners and other applications. The main topics of this module are:

  • Multi-Armed Bandits and Morphing Bandits made easy
  • The Morphing Bandits Concept and Process
  • Designing a Morphing Bandit Project and Key Decisions
  • The Analytics of Morphing Bandits and the Transition from Exploration to Exploitation 
  • Interpreting results of a morphing bandit project, sanity checks, and statistical robustness
  • The Demand Side
  • The Firm Side
  • Applications and Examples
  • Monitored Webpages and Links
  • Morphs, Cognitive Styles, and Outcome Variable
  • Change A/B Testing Culture
    • From Testing Creatives to Testing Methods
    • From Testing on Aggregate Data to Testing on Individual-level Data
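
To illustrate the exploration/exploitation trade-off behind multi-armed bandits, here is a minimal Python sketch of a generic Bernoulli bandit solved with Thompson sampling. This is a standard textbook technique chosen for brevity; it is not the morphing algorithm covered in this module, and it ignores cognitive styles entirely. The function name, the conversion rates and the number of visitors are hypothetical.

```python
import random

def thompson_sampling(true_rates, n_visitors, seed=0):
    """Simulate a Bernoulli bandit: each arm is a banner with an unknown conversion rate."""
    rng = random.Random(seed)
    n_arms = len(true_rates)
    successes = [0] * n_arms
    failures = [0] * n_arms
    for _ in range(n_visitors):
        # Draw a plausible conversion rate for each arm from its Beta posterior
        samples = [rng.betavariate(successes[a] + 1, failures[a] + 1)
                   for a in range(n_arms)]
        # Show the arm whose sampled rate is highest
        arm = samples.index(max(samples))
        # Simulate the visitor's response (hypothetical ground truth)
        if rng.random() < true_rates[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    pulls = [s + f for s, f in zip(successes, failures)]
    return pulls, successes

# Hypothetical conversion rates for three banners
pulls, successes = thompson_sampling([0.05, 0.10, 0.20], n_visitors=5000)
print("times shown per banner:", pulls)
print("conversions per banner:", successes)
```

Early on, all banners are shown roughly equally often (exploration). As evidence accumulates, most visitors are routed to the banner that appears to convert best (exploitation), so fewer sales are sacrificed than in a fixed 50/50 A/B test.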

Teaching method: Lectures, guest lectures, various hands-on laboratory days, business case discussions and a large hands-on assignment per module. Students will work on the module assignment in small groups. Each group will complete a major exercise to become familiar with the morphing scripts and run a small morphing experiment.

Teaching materials:  lectures, cases, assignments

Contact hours: 6 sessions of 2.5h = 15h 


  • Class attendance and Participation (10%)
  • Group Assignments (40%): these are done in groups of 3 to 4 students, but grades may be adjusted at the individual level to reflect each student's dedication and contribution.
  • Individual Assignments (50%)

Feedback method:

  • Individual feedback via email
  • Additional individual feedback available in person during office hours

Contact person

Name: Gui Liberali
E-mail: liberali@remove-this.rsm.nl
Phone number: 82732
Room: T10-14

Faculty website