Validation of a translated version of the modified Japanese Orthopedic Association (mJOA) cervical myelopathy score in an Arabic speaking population

Introduction: Degenerative cervical myelopathy (DCM) is a disorder of increasing incidence, and standardization of its assessment tools is an integral part of its management. The modified Japanese Orthopedic Association (mJOA) score is one of the most commonly used tools. Currently, no Arabic translation of any cervical myelopathy functional score is available. This study aimed to translate, culturally adapt, and measure the psychometric properties of an Arabic version of the mJOA (mJOA-Ar).
Methods: After translation of the score using the standard forward-backward translation procedure, a validation study including 100 patients was carried out from June 2019 to June 2020. The following psychometric properties were measured: feasibility, reliability, internal consistency, validity, minimal clinically important difference (MCID), and ceiling and floor effects.
Results: No problems were encountered during translation and cross-cultural adaptation of the score. The mJOA-Ar was found to be a feasible score. It showed high inter-observer reliability (r = 0.833, P < 0.001), high test-retest reliability (r = 0.987, P < 0.001), and good internal consistency using Cronbach's alpha (0.777) and the Pearson inter-item correlation coefficient (r = 0.717). The score showed good convergent and divergent construct validity when correlated with the Arabic validated version of the Neck Disability Index (NDI). The mJOA-Ar had an MCID of 1.506. The ceiling and floor effects of the total score and of the first and second domains were within the acceptable range, while the third and fourth domains had a high ceiling effect (30% and 39%, respectively).
Discussion: Our translated version of the mJOA score was found to be a feasible score with acceptable psychometric properties. It can be used as an outcome measure in Arabic-speaking countries.


Introduction
Degenerative cervical myelopathy (DCM) is the leading cause of spinal cord dysfunction with increasing incidence in countries with an aging population. DCM is an incapacitating disease that hinders individuals from performing their simple daily activities and greatly affects their quality of life. It is also considered a burden from both the economic and social points of view [1,2].
With the growing disease burden, objective assessment of patients with DCM has become pivotal in managing the disease. The modified Japanese Orthopedic Association (mJOA) score is considered one of the most accepted and widely used scores to assess the functional status of patients [3]. The mJOA has been well studied in people suffering from DCM and is considered one of the predictors of outcome after surgical intervention [3-5].
The mJOA has been translated into and validated in Italian, Brazilian Portuguese, Dutch, and Persian, paving the way for other translated versions [2,6-8]. Providing a translated version of a specific score helps ensure equivalence to the original score and reduces bias in the study [9]. No translated version of any cervical myelopathy functional score has been studied within an Arabic-speaking population. This limits the ability to objectively assess DCM and share validated outcomes from a cohort of nearly 400 million individuals, and it necessitates validation studies for the most commonly used scores. This study aimed to translate, culturally adapt, and measure the psychometric properties of an Arabic translated version of the mJOA to justify using this score in Arabic-speaking countries.

Materials and methods
Between June 2019 and June 2020, one hundred patients with DCM were recruited from the spine outpatient clinic.
Patients older than 18 years with clinical and radiological manifestations of DCM were included, while patients with previous surgery were excluded.
The sample size was determined using power analysis (considering an alpha error of 0.05 and a power of 90%). A minimum sample size of 92 patients was needed for strong correlation, which was increased to 100 patients [10].
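The text does not state the formula or the expected correlation coefficient behind the power analysis. As an illustration only, the following minimal sketch shows one common approach, the Fisher z approximation for detecting a correlation with a two-sided test. The function name and the `expected_r` parameter are hypothetical and are not taken from the study.

```python
import math
from statistics import NormalDist


def sample_size_for_correlation(expected_r, alpha=0.05, power=0.90):
    """Approximate n needed to detect a correlation of expected_r (two-sided test).

    Fisher z approximation: n = ((z_{1-alpha/2} + z_{power}) / atanh(r))^2 + 3.
    This is an assumed method; the study does not report which formula was used.
    """
    if not 0 < abs(expected_r) < 1:
        raise ValueError("expected_r must lie strictly between -1 and 1 and be non-zero")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the two-sided test
    z_beta = NormalDist().inv_cdf(power)           # quantile matching the desired power
    fisher_z = math.atanh(abs(expected_r))         # Fisher transform of the expected r
    return math.ceil(((z_alpha + z_beta) / fisher_z) ** 2 + 3)
```

With the study's settings (alpha of 0.05 and power of 90%), the required sample size falls as the expected correlation rises, so the resulting minimum depends entirely on the correlation assumed at the planning stage.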

Translation and cross-cultural adaptation
The process of forward-backward translation with independent translations and counter-translation was performed using the guidelines set by Guillemin et al. [11]. The following steps were performed:
1. The original mJOA score was forward translated into Arabic by two independent professional translators (one native Arabic speaker fluent in English and one native English speaker fluent in Arabic).
2. These translations were reviewed, and the first draft was developed.
3. The first draft was back-translated into English by two professional English translators.
4. The forward and back translations were reviewed, and the second version was developed by an expert linguistic translator specialized in medical questionnaires.
After reaching the final version, the cognitive assessment was performed on ten patients to check for any linguistic or verbal difficulties.

Psychometric properties analysis
To measure the psychometric properties, each patient attended two visits. At the first visit, the score was administered to each patient by two different physicians. A second visit took place 14 days later to assess test-retest reliability.

Feasibility
The time for score completion was calculated. Also, all the data was checked for missing or multiple responses.

Reliability
Test-retest reliability was assessed over a fourteen-day interval. Its results were examined using the Spearman test for the whole questionnaire and for each domain independently. Inter-observer reliability was analyzed using the Kappa statistic to determine the consistency between the two observers for the whole questionnaire and for each domain.
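The two reliability statistics named above can be sketched as follows. This is an illustration under stated assumptions, not the SPSS procedure used in the study: the Kappa shown is the unweighted Cohen's Kappa (the text does not say whether a weighted variant was used), and Spearman's coefficient is computed as the Pearson correlation of tie-averaged ranks. All function names are hypothetical.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's Kappa for two raters scoring the same patients."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    # Agreement expected by chance from each rater's marginal frequencies.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1:
        raise ValueError("Kappa is undefined when both raters use one category only")
    return (observed - expected) / (1 - expected)


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the tie-averaged ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    ss_x = sum((a - mx) ** 2 for a in rx)
    ss_y = sum((b - my) ** 2 for b in ry)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / (ss_x * ss_y) ** 0.5
```

In this setting, `cohen_kappa` would take the two physicians' scores from the first visit, and `spearman_rho` would take the first-visit and second-visit scores of the same patients.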

Internal consistency
Cronbach's alpha was used to measure internal consistency for the whole questionnaire and after removing one domain at a time. Internal consistency was also measured by using the Pearson inter-item correlation coefficient.
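As an illustration of the calculation described above, the following minimal sketch computes Cronbach's alpha for the whole score and again after removing one domain at a time. It assumes sample variances and one list of scores per domain, with patients in the same order in every list. The function names are hypothetical, and the study itself used SPSS.

```python
from statistics import variance


def cronbach_alpha(domains):
    """Cronbach's alpha; `domains` holds one score list per domain (same patients, same order)."""
    k = len(domains)
    if k < 2:
        raise ValueError("at least two domains are required")
    totals = [sum(scores) for scores in zip(*domains)]  # per-patient total score
    sum_domain_variances = sum(variance(d) for d in domains)
    return k / (k - 1) * (1 - sum_domain_variances / variance(totals))


def alpha_if_domain_removed(domains):
    """Alpha recomputed after dropping each domain in turn, in input order."""
    return [cronbach_alpha(domains[:i] + domains[i + 1:]) for i in range(len(domains))]
```

A marked drop in alpha after removing a domain suggests that the domain contributes strongly to the internal consistency of the total score.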

Validity
Both convergent and divergent construct validity were measured. Convergent construct validity was measured by correlating the mJOA-Ar and its domains with the results of the Arabic validated version of the Neck Disability Index (NDI) [12] using Pearson's correlation, while divergent construct validity was measured by correlating the total mJOA-Ar score with items of the NDI that were expected to differ from the mJOA-Ar (reading, headache, concentration, and sleep). The NDI was used because it is the only available neck functional disability index that has been translated into and validated in Arabic.
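The correlation underlying both validity checks can be sketched as follows. This is a minimal illustration with a hypothetical function name and made-up scores, not the study's data or its SPSS output. Because a higher mJOA score indicates better function while a higher NDI score indicates greater disability, a convergent relationship would be expected to appear as a negative coefficient.

```python
def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / (ss_x * ss_y) ** 0.5


# Hypothetical scores for illustration only (not from the study).
mjoa_total = [17, 15, 12, 10, 8]
ndi_total = [6, 10, 18, 22, 30]
convergent_r = pearson_r(mjoa_total, ndi_total)
```

For divergent validity, the same function would be applied to the total mJOA-Ar score and each of the NDI items expected to differ from it, where coefficients close to zero would be anticipated.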

Minimum Clinically Important Difference (MCID)
The MCID was determined using a distribution-based method. Norman et al. proposed the standard deviation method and reported that, in patients with chronic disease, the MCID equals half a standard deviation of baseline scores [13].
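The half standard deviation rule can be written as a short sketch. The function name is hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption, since the text does not specify which estimator was used.

```python
from statistics import stdev


def mcid_half_sd(baseline_scores):
    """Distribution-based MCID: half the standard deviation of baseline scores.

    Uses the sample standard deviation (n - 1 denominator), which is an assumption.
    """
    if len(baseline_scores) < 2:
        raise ValueError("at least two baseline scores are required")
    return 0.5 * stdev(baseline_scores)
```

The input would be the baseline total mJOA-Ar scores of all patients, and the result is expressed in points on the same scale.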

Ceiling and floor effect
The ceiling and floor effects are defined as the proportions of individuals who scored the highest (ceiling) or lowest (floor) possible score, respectively. They were calculated for the whole score and for each domain.
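The definition above translates directly into a short calculation. The sketch below uses a hypothetical function name and made-up scores, and the caller supplies the minimum and maximum possible values of the score or domain being examined.

```python
def ceiling_floor_effect(scores, min_score, max_score):
    """Return (ceiling %, floor %): share of patients at the highest and lowest possible score."""
    if not scores:
        raise ValueError("scores must not be empty")
    n = len(scores)
    ceiling = 100 * sum(s == max_score for s in scores) / n
    floor = 100 * sum(s == min_score for s in scores) / n
    return ceiling, floor


# Hypothetical total scores on a 0-18 scale, for illustration only.
example = ceiling_floor_effect([0, 18, 18, 10], min_score=0, max_score=18)
```

The same function applies to each domain by passing that domain's scores together with its own minimum and maximum.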

Data entry and statistical analysis
Data entry was performed using Microsoft Excel®, and the statistical analysis was done using IBM SPSS® version 27 [14,15].
This study was reviewed and accepted by the Institutional Review Board (IRB) of our institution, and informed written consent was obtained from the patients before enrollment in the study.

Results
Demographic data
Of the 100 patients included in our study, 63 were male and 37 were female. Patient age ranged from 29 to 77 years, with a mean of 50.6 ± 12.9 years (Table 1).

Translation and cross-cultural adaptation
No major linguistic or grammatical problems were observed during forward and back translation of the score. The only expression that was replaced by an equivalent Arabic synonym was in the motor dysfunction of the lower limb domain, in which the term "smooth reciprocation" was replaced by "walk with alternating footsteps".

Feasibility
The score was completed without any major difficulties (mean time required: 4 min, range 3-7 min). None of the investigators or patients reported an inability to complete the score because of linguistic or perceptive problems, and there were no missing or multiple answers.

Reliability
The score showed good inter-observer and test-retest reliability. Inter-observer reliability using the Kappa statistic was outstanding for the total score (r = 0.833, P < 0.001) and for each domain (r = 0.889, r = 0.926, r = 0.939, and r = 0.984, respectively; P < 0.001). Test-retest reliability using Spearman's correlation showed strong correlations for the total score (r = 0.987, P < 0.001) and for each domain (r = 0.979, r = 0.98, r = 0.912, and r = 0.971, respectively; P < 0.001).

Internal consistency
The total score had high internal consistency. Cronbach's alpha was 0.777 for the whole score and 0.740, 0.684, 0.770, and 0.761 after removal of each domain, respectively. The largest drop was observed after removing the MDLE (motor dysfunction of the lower extremity) domain, after which alpha fell to 0.684. Internal consistency was also estimated by the Pearson inter-item correlation coefficient (ICC). The total mJOA-Ar score was highly correlated with both the MDUE (motor dysfunction of the upper extremity) and MDLE domains (r = 0.717 and r = 0.826, respectively) and moderately correlated with the SDUE (sensory dysfunction of the upper extremity) and SD (sphincter dysfunction) domains (r = 0.619 and r = 0.691, respectively). Regarding the individual domains, the MDUE was moderately correlated with all the other domains (r = 0.397, r = 0.395, and r = 0.318, respectively). The MDLE was weakly correlated with the SDUE domain (r = 0.224) and moderately correlated with the SD domain (r = 0.462), and the SDUE domain score was moderately correlated with the SD domain (r = 0.389) (Tables 2 and 3).

Minimum Clinically Important Difference (MCID)
The MCID was found to be 1.506 (half of the baseline standard deviation of 3.012).

Ceiling and floor effect
The ceiling and floor effect of the total score and the first two domains were acceptable, unlike those for the third and fourth domains. The total mJOA-Ar score had a ceiling and floor effect of 0%. Both the MDUE domain and the MDLE domain had a ceiling effect of 6% and a floor effect of 0%, the SDUE domain had a ceiling effect of 30% and a floor effect of 1%, and the SD domain had a ceiling effect of 39% and a floor effect of 0%.

Discussion
In recent years, research on the cross-cultural adaptation and psychometric analysis of patient-reported outcome measures in orthopedics and spine surgery has gained much momentum, especially in Arabic-speaking countries, leading to the emergence of many Arabic translated and validated scores [16,17]. DCM is an increasingly prevalent disease of growing global concern, and research into all aspects of its management has increased exponentially over the last two decades. One of the cornerstones of this research effort is identifying the appropriate tools and outcome measures for assessing and monitoring the disease [18,19]. Overall, the psychometric properties of our translated version were acceptable, and no problems were encountered regarding the acceptability of the score or its comprehension by the patients.
The main limitation of this study was the absence of an intervention. This prevented us from measuring the responsiveness of the score and its sensitivity to change. It also prevented us from calculating the MCID by using anchor-based methods.
During the process of forward and back translation of the original mJOA score into Arabic, the only expression that had to be changed was in the MDLE domain to be properly understood within an Arabic dialect. The same expression was changed in both the Italian and Brazilian Portuguese translations of the score [2,5]. Regarding its feasibility, the score had no missing or multiple answers. It was completed in a relatively short period of time compared to the time required to complete the Italian version [2].
Our study demonstrated high inter-observer and test-retest reliability of the mJOA-Ar and each of its domains, showing an outstanding Kappa statistic and a strong Spearman's correlation, respectively. These findings are consistent with those of the English mJOA score and its Italian and Persian versions [2,8,20]. On the other hand, none of the previously mentioned studies calculated the Kappa coefficient for each domain.
The mJOA-Ar showed good internal consistency using Cronbach's alpha, both for the whole score and after removal of each domain. The largest drop was observed after the removal of the MDLE domain. Both Kopjar et al. and the Italian version reported a modest correlation for the mJOA score with a significant drop after removal of the MDUE domain [2,3], while the Persian version showed similar results to our study with a higher Cronbach's alpha and no significant drop after removal of any domain [8]. Using the Pearson inter-item correlation coefficient, the total mJOA-Ar score was highly correlated with both MDUE and MDLE and moderately correlated with SDUE and SD. With respect to individual scale components, MDUE was moderately correlated with MDLE and SDUE. The SD score was weakly associated with the SDUE components of the mJOA-Ar, which is similar to both the English and Italian mJOA [2,3].
Our score demonstrated both convergent and divergent construct validity when correlated with the Arabic NDI and its individual domains. This contrasts with other studies that demonstrated a weak correlation between the mJOA and the NDI [2,3], but it agrees with the Brazilian Portuguese version, in which the mJOA was strongly correlated with the NDI [6] (Table 4).
These differences in internal consistency and construct validity between studies can be explained by the many other factors that affect both quality of life and patient-reported outcomes in patients with DCM. Proper assessment of these patients therefore requires the use of multiple assessment tools together. No single score or scale is considered a gold standard in the management of DCM, a point supported by multiple other studies [2,3,18,19].
Regarding the MCID, it was consistent with the findings in the study done by Tetreault et al. [21]. The main difference was that in our study, the MCID was calculated using only distribution-based methods, while in the other study it was calculated using three different methods: distribution-based, anchor-based, and the Delphi method.
The ceiling and floor effects of the mJOA have not been reported before. In our study, the ceiling and floor effects of the total score and of the MDUE and MDLE domains were within the desired range, whereas the SDUE and SD domains had a high ceiling effect. This reflects the inability of these two domains to discriminate between individuals at the higher end of the scale, which may be a problem when trying to detect change in these domains. It may be attributed to the limited number of possible answers in these two domains compared to the others.
To the best of our knowledge, this is the first Arabic translation and validation study of the original English version of the mJOA score. It should be used in the clinical and research settings of Arabic-speaking countries to objectively evaluate patients, report outcomes, and compare them with the international literature.

Conclusion
Our translated version of the mJOA score was found to be both a feasible and reliable instrument for assessing people suffering from DCM with good psychometric properties. This score can be utilized as a good outcome tool for use in Arabic-speaking countries.

Conflict of interest
BE certifies that he or she has no financial conflict of interest (e.g., consultancies, stock ownership, equity interest, patent/licensing arrangements, etc.) in connection with this article.
AH certifies that he or she has no financial conflict of interest (e.g., consultancies, stock ownership, equity interest, patent/licensing arrangements, etc.) in connection with this article.
KH certifies that he or she has no financial conflict of interest (e.g., consultancies, stock ownership, equity interest, patent/licensing arrangements, etc.) in connection with this article.
HA certifies that he or she has no financial conflict of interest (e.g., consultancies, stock ownership, equity interest, patent/licensing arrangements, etc.) in connection with this article.