Rethinking authentic assessments in the age of generative AI

RMIT Vietnam researchers have introduced a framework to help educators determine the effectiveness of popular generative AI tools such as ChatGPT in solving authentic assessments.

The incorporation of generative AI and large language models (LLMs) into education poses a major challenge for assessment design. According to RMIT researcher Dr Nguyen Thanh Binh, this challenge is especially relevant for authentic assessments which rely heavily on report writing in take-home assignments.

Generative AI tools enable students to produce essays or research reports by simply inputting the assessment instructions into the system. In addition to concerns about academic integrity, the benefits of authentic assessments, such as enhancing students’ higher-order thinking skills, could be compromised.

“It is crucial to identify the types of authentic assessments that generative AI can effectively complete. Based on that, educators can adjust their assessments, particularly for take-home assignments where students have ample time and access to diverse AI tools,” Dr Binh said.

Framework to assess generative AI capabilities

In a recent article published in the Australasian Journal of Educational Technology, Dr Binh and his colleagues from the Economics and Finance Department at The Business School, RMIT Vietnam, proposed a comprehensive framework to systematically assess the capabilities of generative AI tools in solving authentic assessments.

The framework employs Bloom’s taxonomy as a guiding principle to design assessments. This widely used taxonomy categorises cognitive skills into six levels ranging from simple recall to complex creation, namely “remember”, “understand”, “apply”, “analyse”, “evaluate” and “create”.

Assessment questions corresponding to the different cognitive levels, as well as marking rubrics and marking guidelines, are fed into ChatGPT-4, ChatGPT-3.5, Google Bard (now called Gemini) and Microsoft Bing (now called Copilot) to generate answers. Teaching staff then grade the generated answers to measure the proficiency of the different tools.
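The grading step described above lends itself to a simple tabulation. The following is a minimal sketch only, not the research team's actual method: it assumes each graded answer is recorded as a (tool, Bloom level, score) tuple, and the function name `mean_score_by_level` and the record layout are hypothetical choices made for illustration.

```python
from collections import defaultdict
from statistics import mean

# The six cognitive levels of Bloom's taxonomy, from lowest to highest.
BLOOM_LEVELS = ["remember", "understand", "apply", "analyse", "evaluate", "create"]


def mean_score_by_level(grades):
    """Average staff-assigned scores per tool and Bloom level.

    `grades` is an iterable of (tool, level, score) tuples. This record
    layout is an assumption for illustration, not taken from the study.
    Returns {tool: {level: mean_score}} with levels in taxonomy order.
    """
    buckets = defaultdict(lambda: defaultdict(list))
    for tool, level, score in grades:
        if level not in BLOOM_LEVELS:
            raise ValueError(f"unknown Bloom level: {level!r}")
        buckets[tool][level].append(score)

    # Only levels that actually received grades appear in the result.
    return {
        tool: {
            level: mean(scores[level])
            for level in BLOOM_LEVELS
            if level in scores
        }
        for tool, scores in buckets.items()
    }
```

Comparing the resulting averages across levels for each tool would show where a tool's performance begins to drop, which is the kind of pattern the framework is designed to surface.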

Framework for the evaluation of generative AI capabilities (Image: RMIT research team)

The study revealed that generative AI tools perform strongly at the lower tiers of Bloom’s taxonomy (“remember” and “understand”), maintain a decent performance at the levels of “apply”, “analyse” and “evaluate”, but falter significantly at the “create” level.

A likely explanation is that AI models are trained on huge datasets, which helps them retrieve and summarise information well. Such capabilities affect both educational design and objectives, highlighting the need to focus on learning goals beyond these fundamental levels.

RMIT Lecturer Dr Vo Thi Hong Diem pointed out that “interestingly, the generative AI tools are better at addressing numeric-based questions than text-based ones”.

For example, ChatGPT-4, along with its Code Interpreter tool, shows an impressive ability to generate graphs from data sets, and to analyse and interpret those graphs.

This poses a critical challenge for educators and universities, especially in fields heavily reliant on data analysis such as statistics and econometrics.

Dr Diem said, “The ease with which students can now produce reports and data analyses through generative AI necessitates a comprehensive re-evaluation of course learning outcomes, content and assessments.”

Generative AI tools show the weakest level of performance in tasks requiring the ability to create new or original work. (Image: Pexels)

Race with AI, not against AI

Despite their growing capabilities, all the generative AI tools struggle with building arguments based on theoretical frameworks, maintaining the coherence of the arguments, and providing appropriate references.

RMIT Lecturer Dr Nguyen Nhat Minh said, “This gives students a chance to build on the basics provided by AI. Instead of collecting information and forming basic arguments, students should take on a more nuanced role by combining AI outputs with relevant theories and real-world contexts.”

As for educators, it is pivotal that they embrace the integration of AI tools in their teaching methodologies.

Rather than designing assessments and learning activities that solely aim to counteract the capabilities of generative AI, educators should focus on crafting authentic assessments that elevate students' higher-level cognitive skills, such as evaluation and creativity, while harnessing the advantages of AI technologies.

The goal is to develop an assessment environment that genuinely fosters learning and the development of critical skills in harmony with AI tools.

Dr Minh explained, “Such an environment would not only prepare students to think critically and creatively but also equip them with the ability to effectively use AI as a tool for innovation and problem solving.

“By doing so, we can ensure that education evolves in tandem with technological advancements, preparing students for a future where they can successfully collaborate with AI in various domains.”

Dr Binh reiterated that as generative AI becomes more prevalent in the workplace, professionals, especially those in data- and text-based fields, will see their roles transform significantly.

“This shift demands a proactive response. Educators and educational institutions need to review and consider redesigning their assessments, course outcomes and academic programs,” Dr Binh said.

The article “Race with the machines: Assessing the capability of generative AI in solving authentic assessments” is published in the Australasian Journal of Educational Technology (DOI: 10.14742/ajet.8902). 

Story: Ngoc Hoang
