Abstract: The challenges faculty and schoolteachers face today are unprecedented. Teachers must keep themselves abreast of their subjects to teach the right content at the right time. They write articles, assess answer books, and arrange other work for students’ future development. Amid all this, automating student assessment with AI would be a great help. Regular assessment of students carries its own pressure and need. AI can assess a student’s answer book in structured and unstructured ways, using the plethora of options available in AI and computer-science toolkits. It is time we use these toolkits and start automating student assessment with AI. This holds not just for colleges but also for primary schools, where grading is not needed, and for secondary schools, where grading is required.
Introduction
It is time we start looking at assessments from an AI perspective. Assessments contain a wealth of data that can help young people plan ahead: if AI is used appropriately, students can learn early where they stand and what they can do about it. But where does AI fit in? Teachers are still needed, because humans must review how AI assesses children’s answer books. As in my previous article, students can be divided not just across school, college, and university, but also within school, into kindergarten, primary, and secondary students. The teacher’s job would be to collect the answer books, upload them, and complete the necessary forms so the AI can understand how human teachers assess.
With a growing population in the education sector, AI that handles some of the work at schools and colleges can be a real help. There is a dearth of teachers, not to mention the load they carry in assessing children and students. Grading can still be done by teachers, or by theoretical AI. Teachers are overburdened: keeping up with the latest material to teach, staying abreast of technology, writing research papers, teaching in the best modern ways they can, and doing assessments on top of it all. One way to ease this burden across sections of society is to automate assessments with AI.
For all this to be done with AI, we must understand that there are two ways to do it. The first approach uses AI theory and adopts structured AI; the other uses GPT-like LLMs and adopts unstructured AI. The following sections examine the two approaches in detail. One must select an approach based on data availability. For example, preparing structured data takes time, so theoretical AI may take a while to fit into the shoes of an evaluating teacher, whereas LLM-based AI can be used with little training. With the right setup, an LLM such as GPT or Llama can be deployed soon and tailored for quick use. But it must be taught how to grade, especially how to do relative grading, the system used in most advanced colleges and universities. Grading is not widely used in primary schools, but evaluations are still needed to assess children’s answer books with AI. The same goes for college students: little of their homework is assessed, while examinations are evaluated. How about a world where all of this is assessed, marked, and interpreted? The next two sections describe the two methodologies for assessing answer books, homework, and lab work.
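Relative grading, in particular, can be specified independently of whichever engine produces the raw marks. A minimal sketch in Python, where the percentile cutoffs are illustrative assumptions rather than a recommended policy:

```python
def relative_grades(scores):
    """Curve raw marks by percentile rank within the cohort.

    `scores` maps student id -> raw mark. The percentile cutoffs
    below are illustrative assumptions, not a recommended policy.
    """
    ranked = sorted(scores.values())
    n = len(ranked)
    bands = [(0.75, "A"), (0.50, "B"), (0.25, "C"), (0.00, "D")]
    grades = {}
    for student, mark in scores.items():
        pct = sum(1 for x in ranked if x < mark) / n  # fraction of cohort below
        grades[student] = next(g for cut, g in bands if pct >= cut)
    return grades
```

Whether the raw marks come from an LLM or from theoretical AI, a wrapper like this applies the cohort-relative policy as a separate, auditable step.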
Unstructured AI
We can use AI to assess homework or answer books in either a structured or an unstructured way. An unstructured approach can simply provide a dump of files for the LLM to assess, and the LLM engine can be tailored to perform evaluations. There are several ways for an LLM to read an answer book: it can handle both OCR-extracted text and handwriting, and it can even analyze and assess biology diagrams. In the end, grading can be encoded as an algorithm built on top of the LLM. Here, an agent must be in place to create student accounts, regularly upload answer books to those accounts, and send the results to the teacher in charge or, if the students are adults, to the students themselves. Each uploaded answer book would be part of a group for relative assessment.
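The agent described above can be sketched as a thin layer over whatever evaluation function wraps the LLM. All names here are assumptions for illustration, not an existing API:

```python
class AssessmentAgent:
    """Sketch of the agent: creates accounts, stores uploads, runs evaluations.

    `evaluate_fn` stands in for the LLM-backed scoring call; for this
    sketch it can be any function mapping answer-book text to a mark.
    """

    def __init__(self, evaluate_fn):
        self.accounts = {}  # student id -> list of uploaded answer books
        self.evaluate_fn = evaluate_fn

    def create_account(self, student_id):
        self.accounts.setdefault(student_id, [])

    def upload(self, student_id, answer_book):
        self.accounts[student_id].append(answer_book)

    def assess_group(self):
        # Score every student's latest upload in one batch, so the marks
        # can later be curved relative to the whole group.
        return {sid: self.evaluate_fn(books[-1])
                for sid, books in self.accounts.items() if books}
```

In a deployed system, `evaluate_fn` would prompt an LLM with the answer book and a rubric; the surrounding account and batch logic stays the same.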
Students can learn online if such a system exists. More students can be enrolled, since the system reduces the assessment load on faculty at schools and colleges. However, we must understand the differences between assessments across the various levels of school and college. Student assessments in schools must be analysed and handled by teachers until the software is mature. One must understand that an LLM on its own does not assess a group of students for relative grading. We must build a wrapper around the LLM and teach it to assign grades or correct answer books. In some cases, ideal answer books must be provided to the LLM; ideal images for correction may be provided as well. The LLM must be wrapped in software that computes grades. Options must also be provided for matching exact wording: for example, when evaluating a verse, the exact words from Shakespeare must be matched, and these instructions must be included in the evaluation specification. Alternatively, a “match the gist” option can be selected to evaluate understanding rather than exact terminology.
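The two matching policies can be captured as a switch in the evaluation specification. In the sketch below, the word-overlap check is a crude stand-in for an LLM’s semantic comparison, and the 0.6 threshold is an assumption:

```python
def matches(expected, answer, mode="gist", threshold=0.6):
    """Check an answer against the expected text under a matching policy.

    mode="exact": wording must match verbatim (e.g. quoting Shakespeare).
    mode="gist":  accept sufficient word overlap; in a real system an LLM
                  would judge semantic similarity instead.
    """
    if mode == "exact":
        return answer.strip().lower() == expected.strip().lower()
    want = set(expected.lower().split())
    got = set(answer.lower().split())
    return len(want & got) / len(want) >= threshold
```

The evaluation specification for each question would simply record which mode applies, so the wrapper can route literature questions through `exact` and comprehension questions through `gist`.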
This is a quick solution: it is ready to use and needs little waiting time. Structured AI, discussed in the next section, needs more time to be built, loaded, and processed.
Structured AI
In structured AI, each answer book must be converted into an XML file with sections. We call this XML the Student Document Architecture (SDA), comparable to the Clinical Document Architecture used for medical records. Each section must have a unique ID. User roles can vary from student to teacher to evaluator, or even principal. Each section must be well coded, with documented codes for each role, much like LOINC codes in medical data. Such an XML file can be fed to an AI algorithm, such as Naïve Bayes, to label answers correct or incorrect and, as a post-processing step, grade the file. Classification can be performed with an SVM, since the data is structured. Another benefit of an XML-based evaluation technique is the ability to use theoretical AI, not just LLMs: rule-based algorithms such as Apriori, clustering algorithms such as K-Means, and rule-based expert systems such as fuzzy expert systems can all generate student grades by extracting the necessary information from the XML file. The SDA is handy, and students can carry it to interviews as well. It can be used, for example, to cluster students who belong in the same section or who should join nautical engineering. LLMs can then be applied to clusters of SDA documents for further processing.
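As a sketch of the rule-based route, the snippet below parses a toy SDA file with the standard library and applies simple threshold rules to produce a grade. The tag names, attributes, section codes, and cutoffs are all assumptions, since the SDA schema is yet to be standardized:

```python
import xml.etree.ElementTree as ET

SDA_EXAMPLE = """
<sda student="S001">
  <section id="Q1" code="MATH-ALG-01" max="10" awarded="8"/>
  <section id="Q2" code="MATH-GEO-02" max="10" awarded="4"/>
</sda>
"""

def grade_sda(xml_text):
    """Extract per-section marks from an SDA file and apply grading rules."""
    root = ET.fromstring(xml_text)
    awarded = sum(float(s.get("awarded")) for s in root.iter("section"))
    maximum = sum(float(s.get("max")) for s in root.iter("section"))
    pct = 100 * awarded / maximum
    # Expert-system style rules: percentage band -> grade.
    for cutoff, grade in [(80, "A"), (60, "B"), (40, "C")]:
        if pct >= cutoff:
            return grade
    return "D"
```

A fuzzy expert system would replace the hard cutoffs with membership functions, but the pipeline — parse the SDA, extract section marks, fire rules — stays the same.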
These documents can be created dynamically from uploaded files or from a form filled out by the teacher or faculty member in charge. We need globally recognized codes for all sections. Each section can correspond to a question answered by the student, with maximum marks and negative marking. The document can then be loaded into software that renders a visual representation.
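Creating such a document from a filled-in form is straightforward with the standard library; again, every tag and attribute name here is an assumption pending globally agreed codes:

```python
import xml.etree.ElementTree as ET

def build_sda(student_id, questions):
    """Build an SDA document from form input.

    `questions` is a list of tuples:
    (section id, section code, max marks, negative marking, awarded marks).
    """
    root = ET.Element("sda", student=student_id)
    for qid, code, max_marks, negative, awarded in questions:
        ET.SubElement(root, "section", id=qid, code=code,
                      max=str(max_marks), negative=str(negative),
                      awarded=str(awarded))
    return ET.tostring(root, encoding="unicode")
```

The same element tree can be handed to a renderer for the visual representation, or serialized and stored per semester.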
Conclusion and Future Work
We can start working on both structured and unstructured assessments. Structured student assessment, however, would require additional time and standardization to begin with: we need to create codes for every aspect. This is just the beginning, and we must discuss the issues surrounding structured AI assessment, both globally and within individual institutions. The XML files can be kept per semester for each student, and they can even be helpful during interviews. In the future, then, we must discuss the codes, the structure of the XML, and a global consensus on it, because once created, this XML can be used anywhere in the world. It would be a small world then. For unstructured assessments, we are already there: all we need is a platform for faculty to upload answer books, LLM engines to respond to evaluation requests, and a group for relative grading. Some work is needed in unstructured AI going forward as well. Whichever approach we choose, it would reduce the assessment load on faculty, so that faculty can focus on teaching, not assessing.