 of students, and through a t test it was shown that two of the classes did not differ significantly in their PET mean scores. Before running the t test, the assumption of normality was checked. The following table shows that their scores were normally distributed:

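Table 4.1. Tests of Normality of PET scores, main administration

| | Kolmogorov-Smirnov^a Statistic | Kolmogorov-Smirnov^a df | Kolmogorov-Smirnov^a Sig. | Shapiro-Wilk Statistic | Shapiro-Wilk df | Shapiro-Wilk Sig. |
|---|---|---|---|---|---|---|
| Class 9.40 | .104 | 21 | .200* | .942 | 23 | .197 |
| Class 11.20 | .135 | 21 | .200* | .946 | 18 | .368 |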

As all the Sig. values are larger than .05, it is concluded that both sets of scores were normally distributed, so the condition for conducting a t test was met. The following tables show the results of the t test:
Table 4.2. Group Statistics of PET scores, main administration

| | Grouping2 | N | Mean | Std. Deviation | Std. Error Mean |
|---|---|---|---|---|---|
| PET scores | 1.00 | 21 | 46.0000 | 9.52986 | 1.98711 |
| | 2.00 | 21 | 39.8333 | 12.36813 | 2.91520 |

Table 4.3. Independent Samples Test on PET scores, main administration

| PET scores | Levene’s Test for Equality of Variances: F | Levene’s Test: Sig. | t-test for Equality of Means: t | df | Sig. (2-tailed) | Mean Difference | Std. Error Difference | 95% Confidence Interval of the Difference: Lower | Upper |
|---|---|---|---|---|---|---|---|---|---|
| Equal variances assumed | .931 | .341 | 1.805 | 39 | .079 | 6.16667 | 3.41718 | -.74523 | 13.07856 |
| Equal variances not assumed | | | 1.748 | 31.254 | .090 | 6.16667 | 3.52803 | -1.02643 | 13.35976 |

As shown in the above table, the difference between the two classes’ PET mean scores was not significant (t = 1.80, p = .079 > .05), with equal variances assumed (F = .931, p = .341 > .05).
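The t values and the confidence interval in Table 4.3 can be reconstructed from the reported summary figures alone. The following Python sketch is only an illustration, not the SPSS procedure itself. The function and variable names are assumptions, and the critical value 2.0227 is the tabled two-tailed .05 value of t for 39 degrees of freedom, hard-coded here rather than computed.

```python
import math

# Figures reported in Tables 4.2 and 4.3
mean_difference = 6.16667   # Mean Difference
se_pooled = 3.41718         # Std. Error Difference, equal variances assumed
se_mean_1 = 1.98711         # Std. Error Mean, group 1
se_mean_2 = 2.91520         # Std. Error Mean, group 2

# Assumption: tabled critical t for df = 39, alpha = .05, two-tailed
T_CRIT_DF39 = 2.0227


def t_statistic(difference, standard_error):
    """t is the mean difference divided by its standard error."""
    return difference / standard_error


# Equal variances assumed
t_pooled = t_statistic(mean_difference, se_pooled)

# Equal variances not assumed: the standard error of the difference is
# the square root of the summed squared standard errors of the two means
se_welch = math.sqrt(se_mean_1 ** 2 + se_mean_2 ** 2)
t_welch = t_statistic(mean_difference, se_welch)

# 95% confidence interval of the difference (equal variances assumed)
margin = T_CRIT_DF39 * se_pooled
ci_lower = mean_difference - margin
ci_upper = mean_difference + margin

print(f"t (equal variances assumed)     = {t_pooled:.3f}")
print(f"t (equal variances not assumed) = {t_welch:.3f}, SE = {se_welch:.5f}")
print(f"95% CI = [{ci_lower:.3f}, {ci_upper:.3f}]")
print("significant" if abs(t_pooled) > T_CRIT_DF39 else "not significant")
```

Because the confidence interval includes zero and the absolute t value stays below the critical value, the sketch reaches the same decision as the table: the difference is not significant.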
The next step was to show that the two classes were homogeneous regarding their translation ability. After their translations were collected prior to the treatment, two raters scored them. First, the inter-rater reliability for the scores of each class was calculated. The following table shows the result of the normality check, a condition for the Pearson correlation:

Table 4.4. Descriptive Statistics of the translation scores given by each rater in the first class at the pre-treatment stage

| | N | Minimum | Maximum | Mean | Std. Deviation | Skewness Statistic | Skewness Std. Error | Skewness ratio |
|---|---|---|---|---|---|---|---|---|
| Rater 2 | 21 | 8.00 | 18.00 | 14.5714 | 2.46113 | -.940 | .501 | 1.88 |
| Rater 1 | 21 | 14.00 | 19.00 | 16.8095 | 1.53685 | -.287 | .501 | .57 |
| Valid N (listwise) | 21 | | | | | | | |

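The skewness ratio column of Table 4.4 is each skewness statistic divided by its standard error, reported in absolute value. The Python sketch below illustrates the calculation. The function names are assumptions, and the standard error formula is the usual small-sample one, which gives .501 for N = 21.

```python
import math


def skewness_standard_error(n):
    """Standard error of skewness for a sample of size n."""
    return math.sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))


def skewness_ratio(skewness, standard_error):
    """Skewness statistic divided by its standard error."""
    return skewness / standard_error


def within_normal_range(ratio, limit=1.96):
    """True when the ratio lies inside the +/-1.96 band."""
    return abs(ratio) <= limit


# Values reported in Table 4.4 (N = 21 for both raters)
se = skewness_standard_error(21)
ratio_rater_2 = skewness_ratio(-0.940, 0.501)
ratio_rater_1 = skewness_ratio(-0.287, 0.501)

print(f"SE of skewness for N = 21: {se:.3f}")
print(f"Rater 2: {ratio_rater_2:.2f}, within range: {within_normal_range(ratio_rater_2)}")
print(f"Rater 1: {ratio_rater_1:.2f}, within range: {within_normal_range(ratio_rater_1)}")
```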
As the skewness ratios of both sets of scores were within the normality range of ±1.96, it is concluded that both were normally distributed. Therefore, the Pearson correlation was used to calculate the relationship between the two sets of scores. The following table shows the result:

Table 4.5. Correlation between the two raters’ scores given to the first class’s translations at the outset

| | | Rater 2 | Rater 1 |
|---|---|---|---|
| Rater 2 | Pearson Correlation | 1 | .850** |
| | Sig. (2-tailed) | | .000 |
| | N | 21 | 21 |
| Rater 1 | Pearson Correlation | .850** | 1 |
| | Sig. (2-tailed) | .000 | |
| | N | 21 | 21 |

**. Correlation is significant at the 0.01 level (2-tailed).
As shown above, the correlation between the scores given by the two raters to the first class was significant (r = .850, p < .001).
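The significance of a Pearson coefficient can be checked by converting r to a t statistic with n - 2 degrees of freedom. The sketch below is illustrative only. The function name is an assumption, and 2.861 is the tabled two-tailed .01 critical value of t for 19 degrees of freedom, hard-coded rather than computed.

```python
import math

# Assumption: tabled critical t for df = 19, alpha = .01, two-tailed
T_CRIT_DF19_01 = 2.861


def correlation_t(r, n):
    """t statistic for testing a Pearson r against zero, df = n - 2."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


# r and N as reported for the first class
t_value = correlation_t(0.850, 21)

print(f"t = {t_value:.2f}, df = {21 - 2}")
print("significant at .01" if abs(t_value) > T_CRIT_DF19_01 else "not significant at .01")
```

The resulting t value is far above the critical value, which agrees with the ** flag in the correlation table.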

The following table shows the descriptive statistics of the scores given by the two raters to the second class, including the skewness ratios:
Table 4.6. Descriptive Statistics of the translation scores given to the second class by two raters at the outset

| | N | Minimum | Maximum | Mean | Std. Deviation | Skewness Statistic | Skewness Std. Error | Skewness ratio |
|---|---|---|---|---|---|---|---|---|
| Rater 1, Class 2 | 21 | 13.00 | 19.00 | 17.1905 | 1.72102 | -.976 | .501 | 1.94 |
| Rater 2, Class 2 | 21 | 11.00 | 19.00 | 15.6667 | 2.19848 | -.715 | .501 | 1.42 |
| Valid N (listwise) | 21 | | | | | | | |

As depicted in the above table, both sets of scores were normally distributed, as the skewness ratios of both were within the range of ±1.96. Therefore, the Pearson correlation coefficient, the parametric test of correlation, was used. The following table shows the result:

Table 4.7. Correlation between the translation scores given to the second class by two raters

| | | Rater 1, Class 2 | Rater 2, Class 2 |
|---|---|---|---|
| Rater 1, Class 2 | Pearson Correlation | 1 | .837** |
| | Sig. (2-tailed) | | .000 |
| | N | 21 | 21 |
| Rater 2, Class 2 | Pearson Correlation | .837** | 1 |
| | Sig. (2-tailed) | .000 | |
| | N | 21 | 21 |

**. Correlation is significant at the 0.01 level (2-tailed).

As displayed above, the correlation between the two sets of scores given by the two raters to the second class was significant (r = .837, p < .001). Therefore, the mean of the scores given by both raters to each individual learner was calculated and used for further analyses.
To check whether there was any significant difference between the translation ability of the two groups of learners prior to the treatment, a t test was needed. First, however, the assumption of normality for both sets of scores was checked. The following table shows the result.

Table 4.8. Descriptive Statistics of the pre-treatment translation scores

| | N | Minimum | Maximum | Mean | Std. Deviation | Skewness Statistic | Skewness Std. Error |
|---|---|---|---|---|---|---|---|
| Translation scores, Cl1 | 21 | 11.50 | 18.00 | 15.4286 | 1.94477 | -.309 | .501 |
| Translation scores, Cl2 | 21 | 12.00 | 19.00 | 16.2857 | 1.76473 | -1.070 | .501 |
| Valid N (listwise) | 21 | | | | | | |

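Table 4.8 does not list the skewness ratios themselves. As a worked step, dividing each skewness statistic by its standard error gives:

```latex
\[
\text{Class 1: } \frac{-0.309}{0.501} \approx -0.617,
\qquad
\text{Class 2: } \frac{-1.070}{0.501} \approx -2.136
\]
```

Only the class 2 ratio falls outside the ±1.96 band.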
As shown in the above table, the scores in class 2 were not normally distributed, as the absolute skewness ratio (2.13) exceeds 1.96. Therefore, as the normality assumption of the t test was not met, the researcher opted for its non-parametric equivalent, the Mann-Whitney U test. The following tables show the result:

Table 4.9. Ranks of pre-treatment translation scores

| | classes | N | Mean Rank | Sum of Ranks |
|---|---|---|---|---|
| Translation mean | class 1 | 21 | 18.88 | 396.50 |
| | class 2 | 21 | 24.12 | 506.50 |
| | Total | 42 | | |

As shown above, the second group obtained a higher mean rank. The following table shows the significance check of the difference between their mean ranks:

Table 4.10. Test Statistics^a of pre-treatment translation scores

| | Translation mean |
|---|---|
| Mann-Whitney U | 165.500 |
| Wilcoxon W | 396.500 |
| Z | -1.392 |
| Asymp. Sig. (2-tailed) | .164 |

a. Grouping Variable: classes

As displayed above, the difference between the two mean ranks was not significant (Z = -1.39, p = .164 > .05). Therefore, it is concluded that the two groups were not significantly different regarding their translation ability at the start.
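The Mann-Whitney U value can be recovered from the rank sums reported for the two classes, and the two-tailed probability follows from the normal approximation. The Python sketch below is illustrative, with assumed function names. It omits the correction for tied scores, which needs the raw data, so its z differs slightly from the tie-corrected Z of -1.392 reported by SPSS.

```python
import math
from statistics import NormalDist


def mann_whitney_u(rank_sum_1, rank_sum_2, n1, n2):
    """Smaller of the two U statistics computed from the rank sums."""
    u1 = rank_sum_1 - n1 * (n1 + 1) / 2
    u2 = rank_sum_2 - n2 * (n2 + 1) / 2
    return min(u1, u2)


def z_without_tie_correction(u, n1, n2):
    """Normal approximation of U, ignoring tied scores."""
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return (u - mean_u) / sd_u


def two_tailed_p(z):
    """Two-tailed probability of a standard normal z."""
    return 2 * NormalDist().cdf(-abs(z))


# Rank sums and group sizes as reported (21 learners per class)
u = mann_whitney_u(396.50, 506.50, 21, 21)
z_approx = z_without_tie_correction(u, 21, 21)

print(f"U = {u}")
print(f"z (no tie correction) = {z_approx:.3f}, p = {two_tailed_p(z_approx):.3f}")
print(f"p for the reported Z of -1.392 = {two_tailed_p(-1.392):.3f}")
```

Both the uncorrected and the reported Z lead to a probability above .05, so the decision does not depend on the tie correction.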
Two raters scored the posttest translation papers of the two groups. In order to use the means of their scores, the researcher had to be sure that there was a significant correlation between them. First, however, the assumptions of normality and linearity had to be checked. The following table shows the normality check of the scores given by both raters to the experimental group.

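Table 4.11. Tests of Normality of the scores given by both raters to the EXG

| | Kolmogorov-Smirnov^a Statistic | Kolmogorov-Smirnov^a df | Kolmogorov-Smirnov^a Sig. | Shapiro-Wilk Statistic | Shapiro-Wilk df | Shapiro-Wilk Sig. |
|---|---|---|---|---|---|---|
| EX G. R1 | .250 | 18 | .004 | .839 | 18 | .006 |
| EX G. R2 | .151 | 18 | .200* | .934 | 18 | .231 |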

As the above table shows, the distribution of the scores given by the first rater is not normal, as the Sig. value is smaller than .05.
The following graph shows the linearity of their relationship visually.

Figure 4.1. Scatter plot representing the relationship between the translation posttest scores given to the experimental group by both raters
As the above figure displays, the dots form a linear pattern stretching from the bottom left to the top right. Hence, the linearity assumption was met. The following table shows the result of the Spearman correlation, the non-parametric equivalent of the Pearson correlation.

Table 4.12. Correlation between the posttest scores given by both raters to the ExG

| | | | EX G. R1 | EX G. R2 |
|---|---|---|---|---|
| Spearman’s rho | EX G. R1 | Correlation Coefficient | 1.000 | .844** |
| | | Sig. (2-tailed) | . | .000 |
| | | N | 18 | 18 |
| | EX G. R2 | Correlation Coefficient | .844** | 1.000 |
| | | Sig. (2-tailed) | .000 | . |
| | | N | 18 | 18 |

**. Correlation is significant at the 0.01 level (2-tailed).

The above table shows that the correlation between the scores given by the two raters to the experimental group’s translation posttest papers was significant (rho = .844, p < .001).
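Spearman’s rho is the Pearson correlation computed on ranks rather than on raw scores, which is why it does not require normally distributed data. The following Python sketch illustrates the idea. The function names are assumptions, and the two score lists are hypothetical values invented for the example, not the study’s data.

```python
import math


def average_ranks(values):
    """Rank values from 1 upward, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value is tied with this one
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical scores from two raters, NOT the study's data
rater_1 = [14.0, 15.5, 15.5, 17.0, 18.0, 19.0]
rater_2 = [13.0, 16.0, 14.5, 16.0, 17.5, 18.5]

print(f"rho = {spearman_rho(rater_1, rater_2):.3f}")
```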
The same procedure was followed for the control group. The following table shows the normality of the distributions.

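Table 4.13. Tests of Normality of the posttest scores given by both raters to the CG

| | Kolmogorov-Smirnov^a Statistic | Kolmogorov-Smirnov^a df | Kolmogorov-Smirnov^a Sig. | Shapiro-Wilk Statistic | Shapiro-Wilk df | Shapiro-Wilk Sig. |
|---|---|---|---|---|---|---|
| CG R1 | .406 | 18 | .000 | .682 | 18 | .000 |
| CG R2 | .178 | 18 | .139 | .939 | 18 | .277 |

a. Lilliefors Significance Correction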

As the above table exhibits, the scores 