Our Assessments Work Anywhere
A recently published study suggests that some of the most common personality assessments (i.e., ones based on the Big Five) don’t work in other countries. The study was published in a prestigious journal (Science Advances, impact factor > 12), and it has already gained prominent media attention. One outlet said that these personality tests don’t hold up around the world. NPR said that personality tests don’t reveal the real you.
Reading these articles might make you conclude that personality assessments just can’t be used in other countries. Fortunately, despite what the economists who contributed to the article and the journalists who are covering it might have you believe, such a conclusion is just wrong. In what follows, we show you why.
Psychologists use a metric called the congruence coefficient to determine the degree to which two instruments are measuring the same thing. Scores on the metric can range from -1.00 to +1.00, with higher scores indicating greater similarity. The accepted standard for declaring two instruments similar is a congruence coefficient > .84. The recently published study found average congruence coefficients of .73 and .71 in survey data gathered in so-called non-WEIRD countries, that is, countries that are not Western, educated, industrialized, rich, and democratic (e.g., Kenya, the Philippines, and Colombia).
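To make the metric concrete, here is a minimal sketch in Python. It assumes the metric in question is Tucker's congruence coefficient, the formula usually used to compare factor loadings. The loading values and the function name are invented for illustration and do not come from any of the studies discussed here.

```python
import math

# Threshold for "similar" as stated in the text above.
SIMILARITY_THRESHOLD = 0.84


def congruence_coefficient(x, y):
    """Tucker's congruence coefficient between two loading vectors.

    Computed as sum(x*y) / sqrt(sum(x^2) * sum(y^2)); the result
    lies between -1 and +1.
    """
    if len(x) != len(y):
        raise ValueError("loading vectors must have the same length")
    numerator = sum(a * b for a, b in zip(x, y))
    denominator = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    if denominator == 0:
        raise ValueError("loading vectors must not be all zeros")
    return numerator / denominator


# Hypothetical loadings of five items on one factor in two samples.
reference_loadings = [0.70, 0.65, 0.60, 0.10, 0.05]
comparison_loadings = [0.68, 0.60, 0.66, 0.15, 0.00]

phi = congruence_coefficient(reference_loadings, comparison_loadings)
print(f"congruence = {phi:.2f}, similar = {phi > SIMILARITY_THRESHOLD}")
```

Because the coefficient divides by the size of both vectors, it reflects the pattern of the loadings rather than their overall magnitude. Multiplying every loading in one sample by the same positive constant leaves the result unchanged.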
Table 1. Average Congruence Coefficient for 52 Countries on the HPI.
Country | Congruence Coefficient | Country | Congruence Coefficient
Canada | .99 | Kenya | .96
Australia | .99 | Norway | .96
South Africa | .98 | Philippines | .96
United Kingdom | .98 | Switzerland | .95
France | .98 | Chile | .95
Sweden | .98 | Malaysia | .95
Germany | .97 | China | .95
Singapore | .97 | Portugal | .95
New Zealand | .97 | Austria | .95
Italy | .97 | Ireland | .95
Czech Republic | .97 | Montenegro | .95
Hong Kong | .97 | Russia | .94
India | .97 | Thailand | .94
Netherlands | .97 | Pakistan | .94
Greece | .97 | Japan | .93
Serbia | .97 | Poland | .93
Denmark | .97 | Taiwan | .93
Finland | .97 | Ukraine | .93
Croatia | .97 | Turkey | .93
Hungary | .97 | Saudi Arabia | .93
Spain | .97 | United Arab Emirates | .92
Brazil | .97 | South Korea | .92
Belgium | .96 | Indonesia | .91
Romania | .96 | Mexico | .91
Argentina | .96 | Colombia | .91
Slovakia | .96 | Peru | .90 |
Every single country exceeds the .84 threshold for similarity. The lowest congruence coefficient we found was for Peru (.90). As a direct comparison with the recently published work, we find much higher congruence coefficients for Kenya (.96 vs. .71), Colombia (.91 vs. .72), the Philippines (.96 vs. .72), and Serbia (.97 vs. .79). These results are in stark contrast to the conclusions drawn by popular media: high-quality personality assessments work, and measure exactly what we think they are measuring, in other countries and languages all around the globe.
We Aren’t the Only Ones
Another paper by De Fruyt and colleagues extended this analysis to adolescents, reporting an average congruence coefficient of .92 across 24 different countries (including Malaysia, Serbia, South Korea, Japan, Iran, Thailand, Hong Kong, Turkey, China, and Uganda, which collectively averaged .90).
The point here is this: the largest, most comprehensive studies and databases speaking to the universality of personality factor structures have all come to the same conclusion, namely that these personality dimensions are universal. So why did this recent study come to the opposite conclusion?
What’s Wrong with that Study?
Research published in academic journals typically must go through a rigorous (and at times, somewhat arbitrary) review process. This involves subjecting the research to review by external experts in the field, who scrutinise the work for potential errors. Despite this process, flawed work, or flawed conclusions, sometimes slip through the cracks. Such is the case with the article in question here. There are two critical problems.
First, the analyses and conclusions of this paper rest on data gathered using a 15-item measure of personality. The 15 items are a subset of items from a medium-length (but well-validated) 44-item measure of personality, known as the Big Five Inventory. It is not clear how these 15 items were chosen (as part of a larger survey), or what their psychometric properties are. However, it is clear that short measures of personality frequently show poor results. By comparison, studies demonstrating the universality of personality structures (including our own data) used longer, and undeniably far superior, measures of personality. Thus, the results of this study could be adequately summarised as garbage in, garbage out.
Second, the study in question regularly notes that many of the people surveyed had trouble understanding the questions they were being asked; in some cases, it was not clear that the participants were even literate. It should come as no surprise that if people cannot read, or understand, the items on a personality assessment, their responses to the questions are necessarily nonsense. If responses to a personality assessment are effectively random, it is certain that there will be no congruence.
Further, even if only a sizable proportion of the respondents cannot read the questionnaire but respond anyway, this will necessarily drive congruence coefficients down, perhaps even below the threshold for similarity. By comparison, the participants in all of the studies demonstrating the universality of personality structures were educated well enough to read and understand the questionnaires. Put another way, the results of this study demonstrate that if people cannot read your test, they will not respond in logically coherent ways.
Summary