
CoT, Agentic AI, and MCP

  • Jace Hargis

Recent developments in AI, including Chain-of-Thought (CoT) prompting, Agentic AI, and the Model Context Protocol (MCP), align with foundational learning theories. As universities seek to implement AI systems tailored to their institutional culture, curriculum, and student profiles, these innovations present an opportunity to develop adaptive, outcomes-driven educational tools. So, this week I would like to share a recent SoTL article entitled “A Chain-of-Thought Prompting Approach with LLMs for Evaluating Students’ Formative Assessment Responses in Science” by Cohn et al. (2024).


This paper explores the use of large language models (LLMs) to score and explain short-answer assessments. While existing methods can score more structured math and computer science assessments, they often do not provide explanations for the scores. This study focuses on employing GPT-4 for automated assessment, combining few-shot and active learning with chain-of-thought (CoT) reasoning. CoT prompting encourages LLMs to reason through problems step by step, an approach that echoes Vygotskian scaffolding and metacognitive strategy training, promoting deeper cognitive engagement and reducing cognitive load. Using a human-in-the-loop approach, the authors score and provide meaningful explanations for formative assessment responses.
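
To make the approach concrete, here is a minimal sketch of a CoT scoring prompt in the style the paper describes, written against the OpenAI Python SDK. The rubric, few-shot example, and student response below are hypothetical placeholders, not materials from the study.

```python
# Minimal sketch of CoT-style scoring with GPT-4 (OpenAI Python SDK).
# The rubric, few-shot example, and student response are hypothetical.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

RUBRIC = (
    "Award 1 point if the response identifies the variable being tested, "
    "and 1 point if it explains why other variables are held constant."
)

FEW_SHOT = (
    "Question: Why change only one variable in an experiment?\n"
    "Student response: So we know which variable caused the change.\n"
    "Reasoning: Identifies the tested variable (1 pt); does not mention "
    "holding other variables constant (0 pt).\n"
    "Score: 1/2"
)

student_response = "You keep everything else the same so the result is fair."

completion = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "system",
         "content": "You grade short science answers. Reason step by step "
                    "against the rubric, then give a score with evidence."},
        {"role": "user",
         "content": f"Rubric:\n{RUBRIC}\n\nExample:\n{FEW_SHOT}\n\n"
                    f"Student response: {student_response}\nReasoning:"},
    ],
)
print(completion.choices[0].message.content)
```

The few-shot example plays double duty: it anchors the score scale and models the step-by-step reasoning format the grader should imitate.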


Across all questions, the model’s scoring largely aligned with the human scorers’. Of the 11 subscores and total scores, nine reached “strong” agreement or better, and four achieved “almost perfect” agreement. All subscores except one resulted in a Macro F1 of 0.90 or greater. The results show that GPT-4, CoT reasoning, and active learning can be effectively leveraged toward accurate grading of formative assessments, and the model generated relevant evidence linked to the rubric to help explain its scoring.
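
For readers who want to run this kind of agreement check on their own grading data, both metrics are one-liners in scikit-learn. The scores below are invented for illustration, not data from the study.

```python
# Illustrative human-vs-model agreement check; the labels are made up.
from sklearn.metrics import cohen_kappa_score, f1_score

human = [2, 1, 0, 2, 1, 2, 0, 1, 2, 2]   # human rubric scores
model = [2, 1, 0, 2, 2, 2, 0, 1, 2, 1]   # model scores, same responses

# Weighted kappa is a common basis for labels like "strong" or
# "almost perfect" agreement on ordinal rubric scores.
print("kappa:", round(cohen_kappa_score(human, model, weights="quadratic"), 3))
print("macro F1:", round(f1_score(human, model, average="macro"), 3))
```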


Connecting AI to Learning Theory

From an educational standpoint, behaviorism, cognitivism, and constructivism provide foundational lenses through which learning is understood. These frameworks align remarkably well with the three major paradigms in machine learning (ML):

  • Supervised Learning mirrors deductive reasoning, often associated with behaviorist models like Skinner’s operant conditioning. Just as learners are trained through reinforcement and feedback, supervised ML uses labeled data to train models by minimizing error in prediction (Goodfellow et al., 2016).

  • Unsupervised Learning, by contrast, resonates with constructivist views, where learners construct meaning from patterns in data. Like Vygotsky’s zone of proximal development or Piaget’s schema theory, unsupervised ML identifies latent patterns in unlabeled data, akin to how students abstract general principles from raw experiences (Siemens, 2005).

  • Reinforcement Learning (RL) parallels experiential learning and trial-and-error, offering systems feedback through reward structures that refine behavior over time—a direct computational analogy to Bandura’s social learning or Kolb’s experiential learning cycle (Sutton & Barto, 2018).
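
A toy contrast of the first two paradigms on synthetic data may make the mapping clearer (reinforcement learning needs an environment loop and is omitted for brevity); the data and models here are illustrative only.

```python
# Toy contrast of supervised vs. unsupervised learning on synthetic data.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)  # labels play the role of feedback

# Supervised: learn a mapping from labeled examples, reinforcement-style.
clf = LogisticRegression().fit(X, y)
print("supervised accuracy:", clf.score(X, y))

# Unsupervised: discover latent structure with no labels, schema-style.
clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print("cluster sizes:", np.bincount(clusters))
```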


Generative models, such as neural networks and Large Language Models (LLMs), represent a shift toward cognitive simulation. These systems model language patterns probabilistically, predicting subsequent tokens based on preceding input. In educational terms, this mimics schema activation and working memory processes. Unlike discriminative models, which classify data, generative models create new content, aligning with constructivist epistemology where learners actively construct knowledge.
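
A toy illustration of that probabilistic prediction step, with an invented vocabulary and invented model scores:

```python
# Toy next-token step: the model scores candidate tokens, a softmax turns
# the scores into probabilities, and one token is sampled. All values are
# invented for illustration.
import numpy as np

vocab = ["learning", "assessment", "pedagogy", "banana"]
logits = np.array([2.1, 1.3, 0.7, -3.0])  # model scores per candidate

probs = np.exp(logits - logits.max())
probs /= probs.sum()                       # softmax: scores -> distribution

rng = np.random.default_rng(0)
next_token = rng.choice(vocab, p=probs)
print(dict(zip(vocab, probs.round(3))), "->", next_token)
```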


Retrieval-Augmented Generation (RAG) integrates retrieval with generation, creating a hybrid approach to learning and problem-solving. In RAG, vector-based retrieval systems locate semantically relevant documents, which are then used to condition the LLM’s output—a method that closely resembles the cognitive apprenticeship model, wherein learners access expert practices and external representations of thinking (Collins et al., 1989). RAG systems, when situated within educational platforms, offer potential for context-sensitive tutoring systems—tools that not only respond accurately but adapt to the learner’s curriculum, previous interactions, and disciplinary lexicon.
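
A minimal sketch of that retrieve-then-condition pattern follows; embed() is a crude hypothetical stand-in for a real sentence-embedding model, and the documents are invented.

```python
# Minimal RAG sketch: embed documents, retrieve the nearest, condition the
# prompt on it. embed() is a hypothetical stand-in for a real embedding model.
import numpy as np

def embed(text: str) -> np.ndarray:
    """Crude character-hash embedding, for illustration only."""
    v = np.zeros(64)
    for i, ch in enumerate(text.lower()):
        v[(i + ord(ch)) % 64] += 1.0
    return v / (np.linalg.norm(v) + 1e-9)

docs = [
    "Cognitive apprenticeship makes expert thinking visible to novices.",
    "Operant conditioning shapes behavior through reinforcement.",
]
doc_vecs = np.stack([embed(d) for d in docs])

query = "How do experts model their reasoning for students?"
scores = doc_vecs @ embed(query)          # cosine similarity on unit vectors
best = docs[int(scores.argmax())]

prompt = f"Context: {best}\n\nQuestion: {query}\nAnswer using the context."
print(prompt)
```

In a production tutor, the document store would hold course materials and the retrieval step is what lets the same base model answer in the institution’s own curricular vocabulary.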


Agentic AI represents a paradigm shift from reactive to proactive systems. These agents leverage LLMs, RAG architectures, and planning mechanisms to autonomously achieve instructional goals with minimal human intervention. They are capable of:

  • Reflection and Metacognition: Evaluating their outputs, refining approaches, and adapting behavior—functions aligned with Zimmerman’s self-regulated learning (Zimmerman, 2002).

  • Planning and Multi-Agent Collaboration: Coordinating subtasks and working collaboratively—analogous to social constructivist approaches and collaborative learning frameworks (Johnson & Johnson, 2004).
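
A schematic of the reflect-and-refine loop such agents run; critique() and revise() below are hypothetical stand-ins for LLM calls.

```python
# Schematic agent loop with a reflection step. critique() and revise() are
# hypothetical stand-ins for LLM calls.
def critique(draft: str) -> str:
    return "too vague" if "example" not in draft else "ok"

def revise(draft: str, feedback: str) -> str:
    return draft + " Include a concrete worked example for the learner."

def agent(task: str, max_rounds: int = 3) -> str:
    draft = f"Plan for: {task}."
    for _ in range(max_rounds):           # act -> reflect -> refine
        feedback = critique(draft)
        if feedback == "ok":
            break
        draft = revise(draft, feedback)
    return draft

print(agent("explain photosynthesis at an introductory level"))
```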


While Agentic AI represents the what of autonomous behavior, the Model Context Protocol (MCP) defines the how. MCP structures, persists, and shares contextual information across sessions and tasks—functioning as a nervous system for AI agents. In educational terms, MCP aligns with situated cognition theory, wherein learning is inseparable from the context in which it occurs. Just as students learn differently depending on environmental cues, prior knowledge, and task framing, AI agents enhanced by MCP can dynamically adjust based on evolving educational contexts.
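
Concretely, MCP exchanges JSON-RPC 2.0 messages between an agent (the client) and context or tool servers. Below is a sketch of a tool-call request: the "tools/call" method is part of the protocol, while the tool name and arguments are hypothetical.

```python
# Sketch of an MCP-style JSON-RPC 2.0 request. "tools/call" is a real MCP
# method; the tool name and arguments here are hypothetical.
import json

request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "lookup_student_context",   # hypothetical course-aware tool
        "arguments": {"student_id": "s-123", "course": "BIO-101"},
    },
}
print(json.dumps(request, indent=2))
```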


Universities, with their unique institutional cultures and diverse student populations, are increasingly tasked with providing personalized, equitable learning experiences. By integrating AI systems grounded in learning theory and powered by Agentic AI and MCP, institutions can:

  • Develop Custom AI Agents that reflect their disciplinary norms, linguistic expectations, and curricular structures.

  • Support Faculty in course design, assessment alignment, and instructional feedback using AI assistants trained on institutional data.

  • Enhance Student Learning through AI tutors that adapt in real-time, support metacognition, and guide learners through complex problem-solving.

  • Ensure Responsible AI Use by embedding context-aware governance and transparent agent behavior through MCP-driven architectures.


References

Cohn, C., Hutchins, N., Le, T., & Biswas, G. (2024). A Chain-of-Thought Prompting Approach with LLMs for Evaluating Students’ Formative Assessment Responses in Science. Proceedings of the AAAI Conference on Artificial Intelligence, 38(21), 23182–23190. https://doi.org/10.1609/aaai.v38i21.30364

Collins, A., Brown, J. S., & Newman, S. E. (1989). Cognitive apprenticeship: Teaching the crafts of reading, writing, and mathematics. In L. B. Resnick (Ed.), Knowing, learning, and instruction: Essays in honor of Robert Glaser. Lawrence Erlbaum Associates.

Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep learning. MIT Press.

Russell, S., & Norvig, P. (2021). Artificial intelligence: A modern approach (4th ed.). Pearson.

Siemens, G. (2005). Connectivism: A learning theory for the digital age. International Journal of Instructional Technology and Distance Learning, 2(1), 3–10.

Sutton, R. S., & Barto, A. G. (2018). Reinforcement learning: An introduction. MIT Press.

Zimmerman, B. J. (2002). Becoming a self-regulated learner: An overview. Theory Into Practice, 41(2), 64–70. https://doi.org/10.1207/s15430421tip4102_2 
