An Overview on Issues of Cross-Cultural Research and Back-Translation
Abstract
Numerous studies in the field of sport have relied on instruments that were adapted or translated for use across countries. However, because of cultural and linguistic differences, using an adapted or translated instrument does not ensure that it measures the same constructs as the original. Therefore, researchers who wish to adapt or translate an instrument from English into another language should be cognizant of such potential problems. The purpose of this paper is to provide researchers with an overview of issues in cross-cultural research and in adapting or translating an instrument. In addition, practical guidelines and possible methods for detecting such problems are included.
As the world becomes a global village, more and more fields, such as business, public affairs, and research, are becoming borderless. Frequent interaction and collaboration among researchers all over the world has led to greater interest in cross-cultural and international research (Sireci & Berberoglu, 2000). Numerous tests and questionnaires developed for the population of the United States have been translated or adapted by researchers in non-English-speaking countries (Butcher & Garcia, 1978). This phenomenon is also salient in Asian countries; for example, research instruments translated from English are popular in academia in Taiwan, especially in psychology and sports. Such translations and adaptations seem to assume that the translated instruments have validity and reliability as satisfactory as those of the originals. However, such an assumption could be dangerous, because a variety of factors can influence the validity of scores from an instrument in different cultural settings and languages (Geisinger, 1994; Hambleton, 2001; Van de Vijver & Hambleton, 1996). In addition, bias, including construct bias and item bias, can arise when translating or adapting an instrument from another language (Butcher & Garcia, 1978; Van de Vijver & Hambleton, 1996). Under such circumstances, compromised validity could lead to inaccurate results. Therefore, these issues need careful examination whenever a researcher translates or adapts existing tests or questionnaires from another language. The purpose of this paper is to examine the potential issues that researchers might encounter when translating or adapting instruments or tests from another language. Remedies and practices from existing studies are also discussed.
Generally, there are three types of bias in cross-cultural studies: construct bias, method bias, and item bias (Van de Vijver & Hambleton, 1996). The following is an elaboration of each type of bias, along with possible methods to alleviate the potential problems:
In addition, Geisinger (1994) raised several issues regarding cross-cultural assessment based on translated or adapted instruments. The following are descriptions of each issue, along with some suggestions:
Practical Guidelines for Cross-Cultural Research
This section presents practical guidelines that cross-cultural researchers can follow to ensure satisfactory reliability and validity in cross-cultural studies. The following are the suggested guidelines and principles, adapted from Geisinger (1994) and Van de Vijver and Hambleton (1996).
Literature Concerning the Issues of Cross-Cultural Research
Watkins (1989) pointed out some problems with the traditional exploratory factor analysis and illustrated the advantages and applications of confirmatory factor analysis. Confirmatory factor analysis is based on the statistical theory of structural equation modeling and possesses some good properties, such as allowing researchers to specify the factor loadings, correlated residuals, and correlated factors. The utilization of confirmatory factor analysis can assist interpretation of an instrument, provide a better way of comparing factor structures and testing competing models, and aid the analysis of the multitrait-multimethod matrices when cross-cultural studies are conducted.
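The quantities that a confirmatory factor analysis lets the researcher specify can be written compactly. The notation below is the conventional one for the common factor model and is an assumption here; it is not taken from Watkins (1989).

```latex
\begin{aligned}
\mathbf{x} &= \boldsymbol{\Lambda}\,\boldsymbol{\xi} + \boldsymbol{\delta}, \\
\boldsymbol{\Sigma} &= \boldsymbol{\Lambda}\,\boldsymbol{\Phi}\,\boldsymbol{\Lambda}^{\top} + \boldsymbol{\Theta}_{\delta}
\end{aligned}
```

Here x is the vector of observed item scores, ξ the vector of latent factors, and δ the vector of residuals. The researcher fixes or frees individual entries of the loading matrix Λ (for example, setting a loading to zero where an item is not expected to measure a factor), of the factor covariance matrix Φ (off-diagonal entries are the correlated factors), and of the residual covariance matrix Θ_δ (off-diagonal entries are the correlated residuals). The model is then judged by how closely the implied covariance matrix Σ reproduces the sample covariance matrix.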
Sireci and Berberoglu (2000) used bilingual respondents to evaluate translated-adapted items, because there is no guarantee that different language versions of an instrument are equivalent (their study used English and Turkish versions of a course evaluation form). They pointed out several advantages and disadvantages of using bilinguals to evaluate translated items. Having the same examinees respond to both language versions of an item eliminates the problem of item translation difference. In addition, using bilingual test takers makes it possible to place nontranslated items in both test forms. However, employing bilinguals also has disadvantages. For example, the results may not generalize well, since bilinguals are typically a selected and limited group of people. Moreover, bilinguals' language proficiency may not be homogeneous: some have a better command of the languages than others.
Myers et al. (2000) stated that multi-group structural equation modeling is a reliable method for examining measurement equivalence. They assessed three constructs derived from cross-cultural advertising research across U.S. and Korean samples and found that most, but not all, of the constructs met the requirements for cross-cultural equivalence. In particular, the model did not fit well when the factor loadings were constrained to be equal across groups, and further tests suggested that specific items were the likely source of the problem. In sum, they concluded that multi-group structural equation modeling is a useful tool for assessing model fit in cross-cultural research.
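One common way to judge whether an equality constraint on factor loadings worsens fit is a chi-square difference test between the unconstrained and the constrained multi-group models. The sketch below illustrates that comparison only; it is not necessarily the exact procedure Myers et al. (2000) followed. The function names and the fit statistics in the example are hypothetical, and the models themselves would have to be estimated in separate SEM software.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate with integer df >= 1."""
    if df < 1 or int(df) != df:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    y = x / 2.0
    # Closed-form starting values for df = 1 and df = 2.
    if df % 2 == 1:
        a, q = 0.5, math.erfc(math.sqrt(y))
    else:
        a, q = 1.0, math.exp(-y)
    # Recurrence Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1).
    while a < df / 2.0:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q


def chi_square_difference(chi2_constrained, df_constrained, chi2_free, df_free):
    """Compare two nested models; return (delta_chi2, delta_df, p_value)."""
    delta_df = df_constrained - df_free
    if delta_df <= 0:
        raise ValueError("the constrained model must have more degrees of freedom")
    delta_chi2 = chi2_constrained - chi2_free
    return delta_chi2, delta_df, chi2_sf(delta_chi2, delta_df)


# Hypothetical fit statistics, invented for illustration only.
delta_chi2, delta_df, p_value = chi_square_difference(
    chi2_constrained=131.9, df_constrained=104,  # loadings equal across groups
    chi2_free=112.4, df_free=96,                 # loadings free in each group
)
print(f"delta chi2 = {delta_chi2:.1f}, delta df = {delta_df}, p = {p_value:.4f}")
```

A small p-value indicates that forcing the loadings to be equal significantly worsens fit, which is the pattern that would prompt item-level follow-up tests.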
Ellis (1989) used item response theory (IRT) to evaluate the measurement equivalence of translated American and German intelligence tests. Content analysis was then used to identify probable causes when differential item functioning (DIF) was detected in an item. The study reached two conclusions. First, differential item functioning may be attributable to translation errors, but it is likely due to differences in cultural knowledge or experience. Second, the approach provides cross-cultural psychologists with a culture-free methodology for identifying cultural differences.
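DIF can also be screened without fitting an IRT model, using the Mantel-Haenszel procedure (Holland & Thayer, 1988). The sketch below illustrates that simpler procedure and is not the IRT analysis Ellis (1989) carried out. It matches examinees on a total score, builds one 2x2 table per score level, and pools the tables into a common odds ratio. The function names and the response data are hypothetical.

```python
import math
from collections import defaultdict


def mantel_haenszel_odds_ratio(records):
    """Common odds ratio for one item.

    records: iterable of (group, matching_score, correct) tuples, where
    group is "reference" or "focal" and correct is 1 or 0.
    """
    # One 2x2 table per score level, stored as [A, B, C, D]:
    # A/B = reference correct/incorrect, C/D = focal correct/incorrect.
    tables = defaultdict(lambda: [0, 0, 0, 0])
    for group, score, correct in records:
        if group not in ("reference", "focal"):
            raise ValueError(f"unknown group: {group!r}")
        offset = 0 if group == "reference" else 2
        tables[score][offset + (0 if correct else 1)] += 1

    numerator = denominator = 0.0
    for a, b, c, d in tables.values():
        n = a + b + c + d
        numerator += a * d / n
        denominator += b * c / n
    if denominator == 0:
        raise ValueError("odds ratio is undefined for these data")
    return numerator / denominator


def mh_delta(alpha):
    """Odds ratio on the ETS delta scale: -2.35 * ln(alpha)."""
    return -2.35 * math.log(alpha)


def expand(group, score, n_correct, n_incorrect):
    """Helper that turns counts into individual response records."""
    return [(group, score, 1)] * n_correct + [(group, score, 0)] * n_incorrect


# Hypothetical responses to one translated item, invented for illustration.
records = (
    expand("reference", 1, 10, 10) + expand("focal", 1, 5, 15)
    + expand("reference", 2, 15, 5) + expand("focal", 2, 10, 10)
)
alpha = mantel_haenszel_odds_ratio(records)
print(f"MH odds ratio = {alpha:.2f}, MH delta = {mh_delta(alpha):.2f}")
```

An odds ratio near 1 (delta near 0) suggests that the item behaves similarly in both groups once ability is matched. Values far from 1 flag the item for the kind of content review described above, since the statistic alone cannot say whether translation or cultural experience is responsible.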
Conclusion
Cross-cultural studies have attracted researchers' attention for decades, and translated instruments are an unavoidable tool for conducting such studies. However, literal translation does not ensure that the translated instrument measures the same constructs as the original, because linguistic differences, cultural differences, or both may exist across samples. Therefore, cross-cultural researchers should be cognizant of the numerous potential problems, such as construct, method, and item bias, that could affect the results of their studies. After identifying possible bias, they should use appropriate statistical techniques, including confirmatory factor analysis and item response theory, to examine, avoid, or eliminate it. Further, cross-cultural researchers should pay close attention to the details of administering tests or measurements. For instance, the physical conditions of administration, the use of slang, and the interpretation of score differences across samples are critical factors that could undermine the quality of a study. Consequently, only when the factors that could influence the results of cross-cultural studies are identified and remedied can researchers ensure the accuracy of cross-cultural research.
References
Brislin, R.W. (1980). Translation and content analysis of oral and written material. In H.C. Triandis & J.W. Berry (Eds.), Handbook of cross-cultural psychology (Vol. 1, pp. 389-444). Boston: Allyn & Bacon.
Butcher, J.N., & Garcia, R.E. (1978). Cross-national application of psychological tests. Personnel and Guidance Journal, 56, 472-475.
Ellis, B.B. (1989). Differential item functioning: Implications for test translations. Journal of Applied Psychology, 74(6), 912-921.
Geisinger, K.F. (1992b). The metamorphosis of test validation. Educational Psychologist, 27, 197-222.
Geisinger, K.F. (1994). Cross-cultural normative assessment: Translation and adaptation issues influencing the normative interpretation of assessment instruments. Psychological Assessment, 6(4), 304-312.
Hambleton, R.K. (2001). The next generation of the ITC test translation and adaptation guidelines. European Journal of Psychological Assessment, 17(3), 164-172.
Holland, P.W., & Thayer, D.T. (1988). Differential item performance and the Mantel-Haenszel procedure. In H. Wainer & H.I. Braun (Eds.), Test validity (pp. 129-145). Hillsdale, NJ: Erlbaum.
Marsh, H.W., & Byrne, B.M. (1993). Confirmatory factor analysis of multigroup-multimethod self-concept data: Between-group and within-group invariance constraints. Multivariate Behavioral Research, 28, 313-349.
Myers, M.B., Calantone, R.J., Page, T.J., Jr., & Taylor, C.R. (2000). An application of multiple-group causal models in assessing cross-cultural measurement equivalence. Journal of International Marketing, 8(4), 108-121.
Schmitt, N., Coyle, B.W., & Saari, B.B. (1977). A review and critique of analyses of multitrait-multimethod matrices. Multivariate Behavioral Research, 12, 447-478.
Sireci, S.G., & Berberoglu, G. (2000). Using bilingual respondents to evaluate translated-adapted items. Applied Measurement in Education, 13(3), 229-248.
Van de Vijver, F., & Hambleton, R.K. (1996). Translating tests: Some practical guidelines. European Psychologist, 1(2), 89-99.
Watkins, D. (1989). The role of confirmatory factor analysis in cross-cultural research. International Journal of Psychology, 24, 685-701.