ECPR

Multi-Variate Statistical Analysis and Comparative Cross-National Surveys Data

Bruno Cautrès
bruno.cautres@gmail.com

Sciences Po Paris

Bruno Cautrès is attached to CEVIPOF – Centre de recherches politiques de Sciences Po (Paris), at the Fondation Nationale des Sciences Politiques in Paris.

He is a senior CNRS research fellow with interests in voting behaviour, political attitudes and behaviours, comparative survey research and quantitative techniques.

Bruno is involved in a variety of projects, including the European Social Survey, European Values Studies, International Social Survey Programme and European elections studies; and he participates in the development of elections studies in France. His current research programme concerns political trust and attitudes to democracy in France.

  @BCautres

Course Dates and Times

Monday 1 to Friday 5 August and Monday 8 to Friday 12 August 2016

Generally classes are either 09:00-12:30 or 14:00-17:30

30 hours over 10 days

Prerequisite Knowledge

The course is introductory, even though the methods taught are multivariate. It is designed to help social science students who have difficulty entering the world of multivariate techniques, and to help them overcome these first difficulties during the summer school. Because the course is introductory, it requires no prior skills beyond basic descriptive statistics and hypothesis testing. Students who need a refresher on these points can ask the instructor for documents or advice before the course; if needed, the instructor and/or the TA can offer a remedial session at the very beginning of the summer school.


Short Outline

This course offers an overview and practical experience of the major theoretical and practical issues raised by the statistical analysis of cross-national comparative surveys. It combines learning the fundamentals of the statistical techniques used to analyse such surveys (without going into overly complex mathematics) with learning about the practical problems encountered when using them. The originality of the course is that it presents multivariate methods in the simplest possible format and concentrates on their use for cross-national comparative analysis. This objective fits the introductory level of the course. The course surveys the most widely used techniques, such as linear regression analysis (and ANCOVA), logistic regression, loglinear models, factor analysis and correspondence analysis. Typically, students attending the course wish to use cross-national data sets (ESS, EVS, Eurobarometers, ISSP, macro-comparative databases) for their research, and a key problem for them is the homogeneity or heterogeneity of the statistical relationships between variables across their country cases. The “black box” explanation of country effects can be investigated through multivariate techniques that test these homogeneity/heterogeneity hypotheses. More generally, the statistical methods used in this course are relevant for “multiple groups” analysis: here the groups are countries, but they could be any groups, such as gender, ethnic or regional groups. The course is good preparation for later attending more specific and advanced courses (for instance on multilevel models or structural equation models), which require a strong grounding in the basics of multivariate analysis of the kind this course delivers in only two weeks.


Long Course Outline

A growing concern and professional practice in the social sciences is the comparison of social and political behaviours between social groups. The focus can be on economic, political, cultural or socio-psychological attitudes and behaviours; it can be at an individual or collective level; and the groups of interest can be defined by gender, ethnicity, age or generation. One particular type of group will be at the centre of this course: nations or countries. Indeed, cross-national (and also cross-cultural) research has gained popularity in recent decades, especially since the development of comparative cross-national databases such as Eurobarometers, EVS, ISSP or ESS.

 

Thanks to methodological developments in comparative survey design, such as those achieved by the ESS and equivalent surveys, a significant step forward has been made. A major new challenge for comparative survey analysis is thus to analyse these comparative data sets with statistical techniques and methods that make it possible to test for country effects. Unfortunately, many users still simply juxtapose their results country by country, and few really engage with the logic of statistical control for country effects. When they introduce country effects into their explanatory models, it is most often in the form of dummy coding in regression analysis, which only accounts for the heterogeneity of the dependent variable across countries. But how can we test for the heterogeneity or homogeneity of the beta parameters, those that estimate the effects of the explanatory variables on the dependent one? How can we know that beta parameters estimated by a logistic regression analysis are statistically different across nations? How can we know whether factorial structures diverge between countries or are similar? These are the kinds of issues that this course will cover by looking at different methods, among the most used in statistical comparative analysis.

The course will focus every day on a specific method, with a strong emphasis on two blocks of methods: modelling techniques (regression, both linear and logistic, loglinear and latent class models) and data reduction techniques (factor analysis, multiple correspondence analysis). The complementarities between these types of methods will be emphasized.

Modelling techniques are those in which the user applies a model to the data structure: the model is an algebraic expression, such as the classical linear regression model. This family of methods attaches great importance to hypothesis testing, in particular regarding the significance of the models and their parameters. But how should we cope with country effects in that case? Tests may be sensitive to sample size, for instance if a researcher compares two countries with fairly different N. How can we know that when beta parameters have higher values in one country than in another, this really reflects a country difference?

Data reduction techniques, on the other hand, are of great help in reducing the space of variables and/or of individuals or cases: with large databases such as Eurobarometers, ESS or EVS, which contain so many indicators, reducing the data space to a few dimensions is a very attractive objective. But what constitutes “factorial invariance” across countries? How can we say whether the factorial space is homogeneous across nations? Factor analysis and correspondence analysis offer certain solutions to these problems. The course will concentrate mainly on exploratory factor analysis; confirmatory factor analysis, which is also of interest for comparative analysis, will be explained too, but more briefly.
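To give a concrete feel for the question of factorial invariance raised above, here is a minimal sketch of one descriptive index, Tucker's congruence coefficient, which compares the loadings of the same items on a factor in two countries. This is an illustration only: the course description does not say which index is taught, the course software is SPSS and Stata rather than Python, and the loadings below are hypothetical numbers invented for the example.

```python
import math


def tucker_congruence(loadings_a, loadings_b):
    """Tucker's congruence coefficient between two vectors of factor loadings.

    Both vectors must list the loadings of the same items, in the same order.
    """
    if len(loadings_a) != len(loadings_b):
        raise ValueError("both factors must be measured on the same items")
    cross = sum(a * b for a, b in zip(loadings_a, loadings_b))
    norm_a = math.sqrt(sum(a * a for a in loadings_a))
    norm_b = math.sqrt(sum(b * b for b in loadings_b))
    return cross / (norm_a * norm_b)


# Hypothetical loadings of five survey items on one factor, by country.
country_1 = [0.72, 0.68, 0.75, 0.61, 0.70]
country_2 = [0.70, 0.65, 0.78, 0.58, 0.66]   # very similar structure
country_3 = [0.71, 0.10, 0.74, -0.35, 0.69]  # two items behave differently

phi_12 = tucker_congruence(country_1, country_2)
phi_13 = tucker_congruence(country_1, country_3)
print("country 1 vs 2:", round(phi_12, 3))
print("country 1 vs 3:", round(phi_13, 3))
```

A commonly cited rule of thumb reads values close to 1 (roughly 0.95 and above) as the same factor in both countries and clearly lower values as a sign that the factorial structure differs. The index is descriptive only; it does not replace a formal test of invariance.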

 

A key aim of this course is to offer a (non-exhaustive) panorama of some very important methods that help comparativists deal with country effects. The scope is not exhaustive, but it saves a great deal of time in only two weeks: working from a textbook alone, a student could spend months discovering so many different techniques. Participants should be aware that this is precisely the objective: to offer a panorama of methods and practical solutions, in a non-exhaustive framework that concentrates on the most widely used methods. The objective is that, after the summer school, participants know which methods are best suited to their research perspectives and types of data. The summer school offers other courses specialising in a single method (such as confirmatory factor analysis or structural equation models), which is a different choice. The course will show, through both theoretical presentations and empirical illustrations, the advantages and limits of each method, bearing in mind that no method is perfect.

Day Topic Details
Monday 1 Issues and challenges in comparative cross-national statistical analysis

Lecture : one hour and a half

Lab : one hour and a half

Tuesday 2 Comparing simple statistics across groups : comparisons of means, proportions, odds, correlation and association measures

Lecture : one hour and a half

Lab : one hour and a half

Wednesday 3 Correlation, causality and the linear regression model

Lecture : one hour and a half

Lab : one hour and a half

Thursday 4 Multiple regression analysis and how to introduce and control for country effects (1) : dummy variables and the Chow test

Lecture : one hour and a half

Lab : one hour and a half

Friday 5 Multiple regression analysis and how to introduce and control for country effects (2) : interaction effects for country effects

Lecture : one hour and a half

Lab : one hour and a half

Monday 8 Logistic regression analysis and controlling for country effects

Lecture : one hour and a half

Lab : one hour and a half

Tuesday 9 Logit models and loglinear analysis of comparative cross-classified data

Lecture : one hour and a half

Lab : one hour and a half

Wednesday 10 Data reduction techniques (1) : factorial invariance across countries (factor analysis and PCA)

Lecture : one hour and a half

Lab : one hour and a half

Thursday 11 Data reduction techniques (2) : factorial invariance across countries (correspondence analysis). More on data reduction techniques or on modelling techniques, depending on students' needs : either more on factor/correspondence analysis or an introduction to latent class models for comparative analysis typologies.

Lab : three hours

Friday 12 Presentations by the students of their research work

In lecture room
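As an illustration of the Chow test listed for Thursday 4, the sketch below compares one pooled regression with two separate country regressions and computes the F statistic for the hypothesis that both countries share the same intercept and slope. It is a minimal sketch under stated assumptions: the course itself uses SPSS and Stata, Python is used here only for illustration, the model has a single explanatory variable, and the data for the two countries are hypothetical.

```python
def ols_rss(x, y):
    """Fit y = a + b*x by least squares; return (a, b, residual sum of squares)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    rss = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    return a, b, rss


def chow_f(x1, y1, x2, y2, k=2):
    """Chow F statistic for equal coefficients in two groups.

    k is the number of parameters per regression (intercept + slope = 2).
    Returns (F, numerator df, denominator df).
    """
    _, _, rss_pooled = ols_rss(x1 + x2, y1 + y2)   # one line for both countries
    _, _, rss_1 = ols_rss(x1, y1)                  # country-specific lines
    _, _, rss_2 = ols_rss(x2, y2)
    rss_separate = rss_1 + rss_2
    df_den = len(x1) + len(x2) - 2 * k
    f_stat = ((rss_pooled - rss_separate) / k) / (rss_separate / df_den)
    return f_stat, k, df_den


# Hypothetical data: same x values, but a steeper slope in country B.
x_a = [1, 2, 3, 4, 5, 6, 7, 8]
y_a = [2.6, 2.9, 3.6, 3.9, 4.6, 4.9, 5.6, 5.9]
x_b = [1, 2, 3, 4, 5, 6, 7, 8]
y_b = [3.6, 4.9, 6.6, 7.9, 9.6, 10.9, 12.6, 13.9]

f_stat, df_num, df_den = chow_f(x_a, y_a, x_b, y_b)
print("Chow F =", round(f_stat, 1), "with df =", (df_num, df_den))
```

The statistic is compared with the F distribution with (k, n1 + n2 - 2k) degrees of freedom, here (2, 12), whose 5% critical value is about 3.89. A large F suggests that pooling the two countries hides a real difference in coefficients. The test assumes equal error variances in the two groups, which is worth checking before relying on it.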

Day Readings
Note

The reading list distinguishes between compulsory and recommended readings. Recommended readings can be read or browsed to go further or to complement the compulsory readings. They also serve as an extended bibliography to work through after the summer school.

Monday 1
On general perspectives on comparative survey research and some of the associated statistical problems: Janet A. Harkness, Fons J. R. van de Vijver, Peter Ph. Mohler (eds.). Cross-cultural survey methods. NY, Wiley, 2003 (compulsory)

 

Janet A. Harkness, Michael Braun, Brad Edwards, Timothy P. Johnson, Lars Lyberg, Peter Ph. Mohler, Beth-Ellen Pennell, Tom W. Smith. Survey methods in multinational, multiregional and multicultural contexts. John Wiley, 2010, extracts from chapters 1, 2, 26 (recommended)

 

On the first day, students will also be introduced, if needed, to handling a major comparative database (such as the ESS) in the SPSS environment.

Tuesday 2

Extracts (to be specified) from Alan Agresti, Barbara Finlay. Statistical methods for the social sciences. 4th edition, chapters 5, 6, 7 and 9.

Wednesday 3

Damodar Gujarati. Basic econometrics, 4th ed., pages 37 to 51 (compulsory), pages 58 to 64 (compulsory), pages 65 to 79 (recommended), pages 81 to 87 (compulsory), pages 127-139 (compulsory)

Thursday 4 and Friday 5

Damodar Gujarati. Basic econometrics, 4th ed., pages 202-215 (compulsory), pages 217-223 (compulsory), pages 248-265 (recommended), pages 297-303 (recommended), pages 304-311 (compulsory)

 

Alfred DeMaris. Regression with social data : modelling continuous and limited response variables. Wiley, NJ, 2004, pp. 148-154 (compulsory)

 

Gregory C. Chow, Tests of Equality Between Sets of Coefficients in Two Linear Regressions, Econometrica, vol. 28(3), 1960, p. 591–605 (recommended)

 

 

Damodar Gujarati. Use of Dummy Variables in Testing for Equality between Sets of Coefficients in Two Linear Regressions: A Note. The American Statistician, 1970, 24(1), 50-52; and 1970, 24(5), 18-22 (recommended).

 

If you have time to discover an interesting example of dummy variable techniques in another field:

 

Christer Thrane. In defence of the price hedonic model in wine research. Journal of Wine Research, 15(2), 2004, pp. 123-134

Monday 8

Scott Menard. Applied logistic regression analysis. Sage Publications (Quantitative applications in the Social Sciences, n°106), pages 12-24 (compulsory), pages 37-52 (compulsory)

Tuesday 9

Alan Agresti. Categorical data analysis. NY, Wiley, 2003, chap 8 and 9 (recommended)

Alfred DeMaris. Logit modeling. Practical applications. Sage Publications (Quantitative applications in the Social Sciences, n°86), pages 7-28 (compulsory)

 

McCutcheon, A., and Mills, C. “Categorical data analysis: Log-linear and latent class models”, in E. Scarbrough and E. Tanenbaum (eds.), Research Strategies in the Social Sciences. Oxford University Press, 1998 (recommended)

Wednesday 10

Pennings, Paul, Hans Keman, and Jan Kleinnijenhuis. Doing Research in Political Science: An Introduction to Comparative Methods and Statistics. London, Sage Publications, 1999 (compulsory: pages on factor analysis, to be specified).

 

Rummel R.J. Understanding factor analysis. The Journal of Conflict Resolution, Vol. 11, No. 4, (Dec., 1967), pp. 444-480 (recommended).

Thursday 11

Romney and Weller. Metric Scaling : correspondence analysis, Sage Publications,  1990, pages : 17-26 (compulsory) ; 55-70 (compulsory); 85-90 (compulsory).

 

Jörg Blasius, Victor Thiessen. Assessing Data Quality and Construct Comparability in Cross-National Surveys. European Sociological Review, 22(3), July 2006, pp. 229–242 (recommended)

Friday 12

No readings.

Software Requirements

SPSS 20

Stata 14

Hardware Requirements

None - a computer lab will be used when necessary.

Literature

On substantive and methodological issues in the analysis of cross-national survey data (beyond the core daily readings), participants may find very useful material in:

 

Roger Jowell, Caroline Roberts, Rory Fitzgerald, Gillian Eva. Measuring Attitudes Cross-Nationally. Lessons from the European Social Survey, Sage Publications, 2007.

 

Janet A. Harkness, Fons J. R. van de Vijver, Peter Ph. Mohler (ed.). Cross-cultural survey methods. NY, Wiley, 2003

 

Janet A. Harkness, Michael Braun, Brad Edwards, Timothy P. Johnson, Lars Lyberg, Peter Ph. Mohler, Beth-Ellen Pennell, Tom W. Smith. Survey methods in multinational, multiregional and multicultural contexts. John Wiley, 2010.

 

On the statistical analysis side, an excellent textbook covering most of the course topics is: Alan Agresti, Barbara Finlay. Statistical methods for the social sciences. Prentice Hall, 4th edition, 2008. For regression analysis, the most complete book is certainly: Damodar Gujarati. Basic econometrics. New York: McGraw-Hill, 4th edition, 2004, or 5th edition (with Dawn C. Porter, 2009).

 

Participants are not expected to buy these books, which can be expensive. They simply complement the methodological readings that make up the core readings. Electronic versions of the core readings will be made available as far as possible through the summer school's Moodle platform.

 

Recommended Courses to Cover Before this One

Introduction to SPSS
Introduction to STATA
Comparative Research Designs
Introduction to Statistics
Survey Designs

Recommended Courses to Cover After this One

Multiple Regression Analysis
Multilevel Structural Equation Modelling
Applied Multilevel Models
Advanced Topics in Applied Regression
Data Visualisation
Structural Equation Modelling


Additional Information

Disclaimer

This course description may be subject to subsequent adaptations (e.g. taking into account new developments in the field, participant demands, group size, etc). Registered participants will be informed at the time of change.

By registering for this course, you confirm that you possess the knowledge required to follow it. The instructor will not teach these prerequisite items. If in doubt, please contact us before registering.