Quantitative Text Analysis

Course Dates and Times

Monday 10 to Friday 14 August 2020
2 hours of live teaching per day
Courses will be either morning or afternoon to suit participants’ requirements

Kostas Gemenis

Cyprus University of Technology

This course provides a highly interactive online teaching and learning environment, using state-of-the-art online pedagogical tools. It is designed for a demanding audience (researchers, professional analysts, advanced students) and capped at a maximum of 16 participants so that the teaching team (the Instructor plus one highly qualified Teaching Assistant) can cater to the specific needs of each individual.

Purpose of the course

The course will introduce you to quantitative text analysis methods using examples from political science and related disciplines. It will look at manual content analysis but the emphasis will be on computer-assisted text analysis. 

We will cover methodological and practical aspects of the different methods. 

ECTS Credits

3 credits: Engage fully with class activities
4 credits: Complete a post-class assignment

Instructor Bio

Kostas Gemenis is Senior Researcher in Quantitative Methods at the Max Planck Institute for the Study of Societies.

His research interests include measurement in the social sciences, and content analysis with applications to estimating the policy positions of political actors.

He is currently involved in Preference Matcher, a consortium of researchers who collaborate in developing e-literacy tools designed to enhance voter education.


We begin with some key concepts in content analysis and continue with the basics of coding text manually. Topics will include best practices for defining a coding scheme, selecting the appropriate documents, coding the documents, estimating inter-coder reliability using Krippendorff's alpha, scaling the coded data, and the possibilities brought by crowdcoding. The aim is to give you all the elements for designing a manual content analysis project. From Tuesday to Friday we will focus on computer-assisted text analysis.
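As a rough illustration of the inter-coder reliability step mentioned above, the following is a minimal sketch of Krippendorff's alpha for nominal codes. It is written in Python purely for illustration (the course exercises use R), and the function name, the input layout, and the toy codes from two coders are all assumptions made for this example, not course material.

```python
from collections import Counter
from itertools import permutations


def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal data.

    `units` is a list of lists: each inner list holds the codes that the
    coders assigned to one unit (a missing code is simply left out).
    """
    # Build the coincidence matrix: every ordered pair of codes within a
    # unit counts 1 / (m - 1), where m is the number of codes for that unit.
    coincidences = Counter()
    for codes in units:
        m = len(codes)
        if m < 2:
            continue  # a unit coded only once cannot be paired
        for a, b in permutations(codes, 2):
            coincidences[(a, b)] += 1 / (m - 1)

    # Marginal totals per category and the overall number of pairable codes.
    totals = Counter()
    for (a, _), weight in coincidences.items():
        totals[a] += weight
    n = sum(totals.values())

    # Observed and expected disagreement (nominal metric: 1 if codes differ).
    observed = sum(w for (a, b), w in coincidences.items() if a != b)
    expected = sum(
        totals[a] * totals[b] for a in totals for b in totals if a != b
    ) / (n - 1)
    if expected == 0:
        raise ValueError("alpha is undefined when only one category is used")
    return 1 - observed / expected


# Hypothetical toy data: two coders, ten units, binary codes.
coder_a = [0, 1, 0, 0, 0, 0, 0, 0, 1, 0]
coder_b = [1, 1, 1, 0, 0, 1, 0, 0, 0, 0]
alpha = krippendorff_alpha_nominal([list(pair) for pair in zip(coder_a, coder_b)])
print(round(alpha, 3))  # about 0.095, i.e. barely better than chance
```

Values near 1 indicate strong agreement and values near 0 indicate agreement no better than chance. The two toy coders agree on six of ten units, yet alpha is low because most of that agreement is expected from the skewed category distribution.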


After an introduction on document pre-processing and some basic rules for good practice, this session will cover the construction and validation of dictionaries, and their use in sentiment analysis.
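To make the idea of dictionary-based sentiment analysis concrete, here is a minimal Python sketch (the course exercises themselves use R). The tiny word lists, the tokenisation rule, and the scoring formula are illustrative assumptions; a real dictionary would need to be constructed and validated as the session describes.

```python
import re
from collections import Counter

# Hypothetical toy dictionary, for illustration only.
POSITIVE = {"good", "growth", "support", "improve"}
NEGATIVE = {"bad", "crisis", "oppose", "decline"}


def tokenize(text):
    """Very simple pre-processing: lowercase and keep alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def dictionary_sentiment(text):
    """Count dictionary hits and return a net sentiment score in [-1, 1]."""
    counts = Counter(tokenize(text))
    pos = sum(counts[word] for word in POSITIVE)
    neg = sum(counts[word] for word in NEGATIVE)
    matched = pos + neg
    # Net sentiment among matched words; 0.0 when no dictionary word occurs.
    score = (pos - neg) / matched if matched else 0.0
    return {"positive": pos, "negative": neg, "score": score}


result = dictionary_sentiment(
    "We support growth, but the crisis is bad. Growth will improve."
)
print(result)
```

A sketch like this also shows why validation matters: the score depends entirely on which words are in the lists, and it ignores negation and context.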


We focus on scaling methods in text analysis, covering supervised methods such as Wordscores, and unsupervised methods, such as Wordfish. Much emphasis will be placed on different ways to validate the output of scaling methods. 
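The core logic of a supervised scaling method such as Wordscores can be sketched in a few lines. The Python code below (an illustration only; the course uses R) scores each word by the reference texts in which it appears, then scores a new text as the average of its word scores. The function names, the two toy reference texts, and their positions of -1 and +1 are assumptions made for this example, and the rescaling of raw scores that is usually applied in practice is omitted.

```python
from collections import Counter


def relative_frequencies(tokens):
    """Share of each word in a tokenised text."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def word_scores(reference_texts, reference_positions):
    """Score each word as the position-weighted probability of reading
    each reference text, given that the word is observed."""
    freqs = {name: relative_frequencies(toks) for name, toks in reference_texts.items()}
    vocabulary = set().union(*freqs.values())
    scores = {}
    for word in vocabulary:
        total = sum(f.get(word, 0.0) for f in freqs.values())
        scores[word] = sum(
            f.get(word, 0.0) / total * reference_positions[name]
            for name, f in freqs.items()
        )
    return scores


def score_text(tokens, scores):
    """Mean word score over the tokens that occur in the reference texts."""
    scored = [word for word in tokens if word in scores]
    if not scored:
        raise ValueError("no words in common with the reference texts")
    return sum(scores[word] for word in scored) / len(scored)


# Hypothetical reference texts with assumed positions on a left-right scale.
references = {
    "left": ["tax", "welfare", "welfare", "state"],
    "right": ["tax", "market", "market", "freedom"],
}
positions = {"left": -1.0, "right": 1.0}

scores = word_scores(references, positions)
new_text_score = score_text(["welfare", "tax", "market", "market", "unknown"], scores)
print(new_text_score)
```

Words that appear equally often in both reference texts (here "tax") score zero and pull the new text toward the middle, which is one reason the validation of scaling output receives so much emphasis.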


We will continue with supervised classification methods, discussing different algorithms and evaluation metrics used in the so-called machine learning literature.
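As an example of the evaluation metrics discussed for supervised classification, the sketch below computes precision, recall, and F1 for one class from hand-coded and predicted labels. It is a Python illustration only (the course uses R); the function name and the toy labels are assumptions.

```python
def precision_recall_f1(true_labels, predicted_labels, positive):
    """Precision, recall, and F1 for the class named by `positive`."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    # Guard against division by zero when a class is never predicted or never occurs.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return precision, recall, f1


# Hypothetical hand-coded labels versus classifier output for six documents.
hand_coded = ["econ", "econ", "other", "econ", "other", "other"]
predicted = ["econ", "other", "econ", "econ", "other", "other"]

precision, recall, f1 = precision_recall_f1(hand_coded, predicted, positive="econ")
print(precision, recall, f1)
```

Precision asks how many documents assigned to the class truly belong there, and recall asks how many documents that belong to the class were found. F1 is the harmonic mean of the two.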


In our last session, we discuss unsupervised classification methods and topic models in particular. Again, the focus will be on practical issues as well as the question of validating the classification output. Finally, we will compare the different methods and discuss trade-offs in quantitative text analysis.

How the course will work online

The course will be taught using a combination of lectures and seminars featuring ‘live’ and independent elements. 

Each day will have 60 to 90 minutes of pre-recorded lectures in addition to a set of course readings.

There will also be an hour on Zoom for semi-structured discussion and Q&A with the Instructor on the topics of the day. 

You will have access to a 100-page e-book with annotated exercises illustrating the different text analysis methods in R statistical software. You can work on these on your own or take advantage of the daily Zoom time, when you can ask the Instructor questions, discuss the exercises, troubleshoot, and so on. 

You are also welcome to arrange one-to-one video-chat appointments with the Instructor, to discuss practical aspects of your own project.

You should be familiar with basic statistical concepts such as

  • measures of central tendency (mean, median)
  • dispersion (standard deviation)
  • tests of association (Pearson’s r)
  • inference (χ², t-test).
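For a quick self-check on the first three items in the list above, the following Python sketch computes a mean, median, sample standard deviation, and Pearson's r for a small made-up dataset. The numbers and the helper name `pearson_r` are assumptions for illustration, and the course itself works in R.

```python
import math
import statistics


def pearson_r(xs, ys):
    """Pearson's correlation coefficient between two equal-length lists."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariation / (spread_x * spread_y)


# Hypothetical data: two variables measured on five cases.
x = [2, 4, 6, 8, 10]
y = [1, 3, 7, 9, 10]

print(statistics.mean(x))    # central tendency: mean
print(statistics.median(x))  # central tendency: median
print(statistics.stdev(x))   # dispersion: sample standard deviation
print(pearson_r(x, y))       # association: Pearson's r
```

If each of these quantities is familiar, the statistical prerequisites for the course are likely met.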

This material is covered in the first few chapters of introductory statistics or data analysis textbooks. A useful example is Pollock P.H. III, The Essentials of Political Analysis, fourth edition (Washington, DC: CQ Press, 2012), Chapters 2, 3, 6, and 7.

Some familiarity with R statistical software is also desirable but not necessary. We will use RStudio for all the seminar exercises.

Each course includes pre-course assignments, including readings and pre-recorded videos, as well as daily live lectures totalling at least three hours. The instructor will conduct live Q&A sessions and offer designated office hours for one-to-one consultations.

Please check your course format before registering.

Online courses

Live classes will be held daily for three hours on a video meeting platform, allowing you to interact with both the instructor and other participants in real-time. To avoid online fatigue, the course employs a pedagogy that includes small-group work, short and focused tasks, as well as troubleshooting exercises that utilise a variety of online applications to facilitate collaboration and engagement with the course content.

In-person courses

In-person courses will consist of daily three-hour classroom sessions, featuring a range of interactive in-class activities including short lectures, peer feedback, group exercises, and presentations.


This course description may be subject to subsequent adaptations (e.g. taking into account new developments in the field, participant demands, group size, etc.). Registered participants will be informed at the time of change.

By registering for this course, you confirm that you possess the knowledge required to follow it. The instructor will not teach these prerequisite items. If in doubt, please contact us before registering.