Figure 6.
The grading components for each question, color-coded to show where ChatGPT scores higher (blue) or marginally higher (red) than PTs.