Measuring AI Literacy
- Jace Hargis
- Aug 22

Greetings, and I hope that wherever you are in planning this new academic term, things are going well. I am working with an amazing team of AI Fellows who are creating an online, asynchronous AI Literacy Certificate Course, and one of our discussions centered on impact and direct measurement. So this week I would like to share, for your consideration, several articles I found on the topic, which we will be integrating into our course and into subsequent SoTL (Scholarship of Teaching and Learning) research:
As educational institutions integrate AI and launch AI literacy courses and certificates, the challenge becomes clear: how do we measure their effectiveness in ways that are valid, reliable, and theoretically grounded? Without strong assessment, these programs risk being anecdotal rather than evidence-based. Fortunately, recent scholarship has advanced our ability to evaluate AI literacy through validated instruments such as the AI Literacy Test (AILIT) and its short form (AILIT-S), and through comparative reviews of existing scales. This post integrates findings from Hornberger et al. (2023, 2025) and Lintner (2024) to show how these tools can be used to measure program outcomes, anchored in learning theory and human information-processing models.
Why Measure AI Literacy?
AI literacy encompasses the ability to understand, evaluate, and apply artificial intelligence technologies responsibly (Long & Magerko, 2020). For certificate programs, this means ensuring that learners:
Acquire conceptual knowledge of AI.
Develop critical thinking about AI’s societal and ethical implications.
Apply AI tools in disciplinary and professional contexts.
Assessment instruments provide the evidence needed to demonstrate that these outcomes are achieved, supporting continuous program improvement and accountability.
Validated Instruments: AILIT and AILIT-S
Hornberger et al. (2023) developed the AI Literacy Test (AILIT), a 31-item performance-based instrument rooted in Long and Magerko's (2020) competency framework. Analyzed with Item Response Theory (IRT), the test demonstrated strong psychometric properties and captured variance across disciplines and prior experiences with AI. Findings revealed that students with technical backgrounds or prior exposure to AI outperformed their peers, highlighting the importance of tailoring instruction to diverse prior knowledge.
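For readers new to IRT, a minimal sketch may help. The function below implements the two-parameter logistic (2PL) model, a common IRT specification (not necessarily the exact model the authors fit); it gives the probability that a learner at a given ability level answers an item correctly, based on that item's discrimination and difficulty. The parameter values are purely illustrative, not estimates from the AILIT study.

```python
import numpy as np

def p_correct(theta, a, b):
    """2PL IRT model: probability that a learner with latent AI literacy
    `theta` answers an item with discrimination `a` and difficulty `b` correctly."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

# Illustrative (hypothetical) values, not parameters from the AILIT study
abilities = np.linspace(-3, 3, 7)          # range of latent ability levels
print(p_correct(abilities, a=1.2, b=0.5))  # one item's characteristic curve
```

Plotted across ability, this function traces an item characteristic curve; items that discriminate well rise steeply near their difficulty, which is how a performance-based test can separate learners with different levels of prior exposure to AI.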
Building on this work, Hornberger et al. (2025) introduced the AILIT-S, a validated 10-item short form. While slightly less reliable than the full test, it maintains strong construct and congruent validity and takes under 5 minutes to complete. This makes it ideal for:
Formative assessment at module checkpoints.
Rapid pre/post course comparisons when time is limited.
Large-scale program evaluations where student participation rates are critical.
In information-processing terms (Atkinson & Shiffrin, 1968), shorter tests minimize cognitive fatigue and working memory overload while still probing learners’ schema development.
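As a minimal sketch of how pre/post AILIT-S results might be compared, assuming each learner's score is simply the number of items answered correctly out of 10 (all scores below are hypothetical):

```python
import numpy as np
from scipy import stats

# Hypothetical AILIT-S scores (0-10) for the same ten learners before and after the course
pre  = np.array([4, 5, 3, 6, 5, 4, 7, 5, 6, 4])
post = np.array([6, 7, 5, 8, 6, 7, 8, 6, 8, 6])

gain = post - pre
t, p = stats.ttest_rel(post, pre)    # paired t-test on matched pre/post scores
d = gain.mean() / gain.std(ddof=1)   # Cohen's d for paired samples

print(f"mean gain = {gain.mean():.2f}, t = {t:.2f}, p = {p:.4f}, d = {d:.2f}")
```

Normalized gain, (post - pre) / (10 - pre), is another common summary when stronger students start near the ceiling of the 10-item form.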
Lintner’s (2024) systematic review of 16 AI literacy scales provides essential context. Most available measures are self-report Likert-style surveys, while only three—including AILIT—are performance-based. The review highlights gaps: few scales assess cross-cultural validity, measurement error, or responsiveness to learning interventions.
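One practical response to these gaps is to report basic reliability evidence whenever a scale is reused in a new context. Below is a minimal sketch of Cronbach's alpha for a matrix of scored item responses (rows are learners, columns are items; the 0/1 data are hypothetical):

```python
import numpy as np

def cronbach_alpha(scores):
    """Cronbach's alpha for an item-response matrix (rows = learners, columns = items)."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]                         # number of items
    item_vars = scores.var(axis=0, ddof=1)      # variance of each item
    total_var = scores.sum(axis=1).var(ddof=1)  # variance of learners' total scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical 0/1 responses from six learners on five items
responses = np.array([
    [1, 1, 0, 1, 1],
    [0, 1, 0, 0, 1],
    [1, 1, 1, 1, 1],
    [0, 0, 0, 1, 0],
    [1, 0, 1, 1, 1],
    [0, 1, 0, 0, 0],
])
print(f"alpha = {cronbach_alpha(responses):.2f}")
```

Reporting a figure like this alongside course outcomes is one small step toward the measurement-error evidence Lintner (2024) finds missing from most scales.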
Anchoring Assessment in Learning Theory
The alignment of assessment tools with foundational learning theories strengthens their use:
Constructivism (Piaget, 1970; Vygotsky, 1978): Students build AI literacy through active engagement. AILIT measures whether learners reconstruct misconceptions (e.g., “AI thinks like the human brain”) into more accurate schemas.
Information Processing Model (Atkinson & Shiffrin, 1968): AILIT-S reduces test length, lowering cognitive load, while repeated pre/post testing tracks how students encode, store, and retrieve AI concepts.
Metacognition (Flavell, 1979): Complementary self-report scales capture whether students recognize their own knowledge boundaries, supporting self-regulated learning (Zimmerman, 2002); a simple confidence-calibration check is sketched below.
Together, these theories explain not just whether students know more after the course, but also how their cognitive processes and reflective practices evolve.
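As one way to make that evolution visible, here is a minimal sketch of a confidence-calibration check, assuming each learner's accuracy on a performance measure such as the AILIT can be paired with their mean self-reported confidence on the same items (all values are hypothetical):

```python
import numpy as np

# Hypothetical per-learner summaries: proportion correct on a performance test
# and mean self-reported confidence (0-1 scale) on the same items
accuracy   = np.array([0.55, 0.70, 0.40, 0.80, 0.65])
confidence = np.array([0.75, 0.72, 0.60, 0.78, 0.70])

bias = confidence - accuracy  # positive = overconfidence, negative = underconfidence
print("mean calibration bias:", round(bias.mean(), 2))
print("learners overconfident by more than 10 points:", int((bias > 0.10).sum()))
```

Large positive bias values flag the overconfidence that reflective prompts and self-assessment activities in the course can then target.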
References
Atkinson, R. C., & Shiffrin, R. M. (1968). Human memory: A proposed system and its control processes. Psychology of Learning and Motivation, 2, 47–89.
Flavell, J. H. (1979). Metacognition and cognitive monitoring: A new area of cognitive–developmental inquiry. American Psychologist, 34(10), 906–911.
Hornberger, M., Bewersdorff, A., & Nerdel, C. (2023). What do university students know about artificial intelligence? Development and validation of an AI literacy test. Computers and Education: Artificial Intelligence, 5, 100165. https://doi.org/10.1016/j.caeai.2023.100165
Hornberger, M., Bewersdorff, A., Schiff, D. S., & Nerdel, C. (2025). Development and validation of a short AI literacy test (AILIT-S) for university students. Computers in Human Behavior: Artificial Humans, 5, 100176. https://doi.org/10.1016/j.chbah.2025.100176
Lintner, T. (2024). A systematic review of AI literacy scales. npj Science of Learning, 9, Article 50. https://doi.org/10.1038/s41539-024-00264-4
Long, D., & Magerko, B. (2020). What is AI literacy? Competencies and design considerations. Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems, 1–16.
Piaget, J. (1970). Genetic epistemology. Columbia University Press.
Vygotsky, L. S. (1978). Mind in society: The development of higher psychological processes. Harvard University Press.
Zimmerman, B. J. (2002). Becoming a self-regulated learner: An overview. Theory into Practice, 41(2), 64–70.