For the Automatic Short Answer Grading (ASAG) task in NLP, the spread of neural networks and attention-based transformer models led researchers to shift their focus toward extracting the syntactic and semantic information needed to compare reference (blueprint) answers with those of students. The first part of this discussion explores the shortcomings and potential of transformer models for ASAG, supported by real-life research examples collected in a project group.
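To make the comparison idea concrete, the following is a minimal sketch of similarity-based grading. All names here (`cosine_similarity`, `grade`, the toy 4-dimensional vectors) are hypothetical illustrations: in practice the embeddings would come from a transformer encoder, not be hand-written.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def grade(reference_emb, student_emb, threshold=0.8):
    """Toy rubric: accept the answer when its embedding lies
    close enough to the reference (blueprint) answer's embedding."""
    if cosine_similarity(reference_emb, student_emb) >= threshold:
        return "correct"
    return "incorrect"

# Hypothetical embeddings standing in for transformer outputs.
reference    = [0.9, 0.1, 0.3, 0.5]   # blueprint answer
close_answer = [0.85, 0.15, 0.35, 0.45]  # a close paraphrase
off_topic    = [0.1, 0.9, 0.0, 0.2]   # an unrelated answer

print(grade(reference, close_answer))  # correct
print(grade(reference, off_topic))     # incorrect
```

Real systems replace the single threshold with a regression or classification head trained on graded answers, but the underlying principle of comparing learned representations is the same.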
This introductory part leads into the second, which is dedicated to the use of Large Language Models (LLMs) for the same grading and evaluation task. Bearing in mind the ongoing debates about the latent and visible dangers of deploying LLMs widely across many fields, problems specific to ASAG in higher education will be discussed. Selected examples will illustrate how such problems can be remedied, or at least limited, by later developments in the field.