States of Learning

Towards a formal representation of learner knowledge state

On the gap between measuring learning and modelling it, and the architectural choices that gap implies

There is a question that sounds simple and turns out to be surprisingly hard: what does it mean to know something?

Not to have encountered it, or to be able to reproduce it under familiar conditions, but to genuinely know it - in the sense that the knowledge is stable, connected to other knowledge, applicable under novel conditions, and durable under the pressure of time and competing demands. This is the question that every assessment system is trying to answer, and that most assessment systems answer imperfectly.

The gap between measuring learning and modelling it is where most educational technology currently lives. Measurement asks: what can this student produce right now? Modelling asks something harder: what is the current structure of this student's understanding, how is it changing, and what does that trajectory predict? These are not the same question. The instruments adequate to the first are not adequate to the second. And the architectural choices required to build systems capable of the second are substantially different from the choices made by systems optimised for the first.

The three things a test score conceals

A test score tells you something real: a student who consistently scores 85% on fraction problems knows more about fractions than a student who scores 40%. This is useful information. But it systematically conceals three things that matter at least as much.

From measurement to modelling: what a knowledge state representation requires

The instrument that would capture what measurement misses is not a better test. It is a fundamentally different kind of representation - one that tracks not what a student scored at a moment but how their understanding is currently structured and how that structure is evolving.

Formally, such a representation must satisfy four properties. It must be continuous - updated with each new interaction rather than sampled at defined intervals, because the trajectory information is in the rate of change and discretising time destroys it. It must be structured rather than scalar, because understanding is not a single quantity but a graph of connected concepts with varying degrees of consolidation, and the topology of that graph carries information that any scalar reduction loses. It must be longitudinal — trained on interaction sequences of sufficient temporal depth that the model can learn what the temporal signature of developing fragility looks like before observing the failure it predicts. And it must be predictive in a specific sense: not merely predictive of next-question performance, but predictive of future performance on dependent concepts not yet encountered in the current interaction history.
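The first two properties can be made concrete with a minimal sketch. Everything here is illustrative - the class name, the update rule, and the defaults are assumptions for exposition, not a reference to any system cited in this piece. The point is the shape: per-interaction updates (continuous) over a prerequisite graph rather than a single scalar (structured).

```python
from dataclasses import dataclass, field

@dataclass
class KnowledgeState:
    """Illustrative learner state: one estimate per concept, embedded in a
    prerequisite graph, updated on every interaction rather than sampled."""
    prerequisites: dict                            # concept -> list of prerequisites
    mastery: dict = field(default_factory=dict)    # concept -> estimate in [0, 1]
    history: list = field(default_factory=list)    # (timestamp, concept, correct)

    def update(self, timestamp, concept, correct, lr=0.2):
        # Continuous: every interaction moves the estimate; the timestamped
        # history preserves the rate-of-change information a snapshot discards.
        prior = self.mastery.get(concept, 0.5)
        self.mastery[concept] = prior + lr * ((1.0 if correct else 0.0) - prior)
        self.history.append((timestamp, concept, correct))

    def fragility(self, concept):
        # Structured: a concept's stability is capped by its weakest
        # prerequisite, not read off the concept's own score alone.
        own = self.mastery.get(concept, 0.5)
        prereqs = [self.mastery.get(p, 0.5)
                   for p in self.prerequisites.get(concept, [])]
        return own if not prereqs else min(own, min(prereqs))

graph = {"ratios": ["fractions"], "proportional_reasoning": ["ratios"]}
state = KnowledgeState(prerequisites=graph)
state.update(0, "fractions", correct=False)
state.update(1, "ratios", correct=True)
state.fragility("ratios")  # capped by the weak "fractions" estimate, not by the correct "ratios" answer
```

A scalar score would report the correct ratios answer and nothing else; the graph-shaped state reports that the answer rests on a shaky foundation. That difference is what the remaining two properties (longitudinal training, predictive validity) are meant to exploit.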

The deep knowledge tracing literature has made substantial progress on the first three properties. Models from DKT (Piech et al., 2015) through AKT (Ghosh et al., 2020) and SAINT+ (Shin et al., 2021) have progressively improved prediction of next-question correctness on held-out test sets.³ But the systematic review of DKT research covering 2015–2025 found that only 3.6% of studies assessed sequential stability of knowledge estimates over time, and only 11.9% included interpretability measures sufficient for a practitioner to act on the model's outputs.⁴ The field has optimised for short-horizon prediction accuracy on benchmark datasets while the fourth property - longitudinal predictive validity across dependent concepts in real curricula - remains largely unaddressed.
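The fourth property implies a different evaluation split than next-question correctness. A hedged sketch of what that split could look like, under assumptions of my own (the function name and the first-encounter criterion are illustrative): instead of holding out the next item, hold out a student's first encounter with a concept whose prerequisites appear in their history, and ask the model to predict that outcome from the history alone.

```python
def longitudinal_eval_pairs(interactions, prerequisites):
    """Build (history, target) pairs for longitudinal predictive validity:
    predict performance on dependent concepts not yet present in the
    interaction history.  interactions: time-ordered (timestamp, concept,
    correct) triples; prerequisites: concept -> list of prerequisite concepts."""
    pairs = []
    for i, (timestamp, concept, correct) in enumerate(interactions):
        seen = {c for _, c, _ in interactions[:i]}
        prereqs = prerequisites.get(concept, [])
        # Target only first encounters whose prerequisites were practised
        # earlier: exactly the case next-question evaluation never isolates.
        if concept not in seen and prereqs and all(p in seen for p in prereqs):
            pairs.append((interactions[:i], (concept, correct)))
    return pairs

log = [(0, "fractions", 1), (1, "fractions", 1), (2, "ratios", 0)]
pairs = longitudinal_eval_pairs(log, {"ratios": ["fractions"]})
# one pair: two fraction interactions as history, ("ratios", 0) as target
```

Nothing in this split rewards short-horizon pattern-matching on the target concept itself, because the target concept has no prior interactions to pattern-match on.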

The architectural question: why the choice of sequence model matters

The requirement for longitudinal, structured, continuously updated learner representations is not neutral with respect to architectural choice. Different sequence modelling architectures make different tradeoffs that bear directly on which of the four properties above they can satisfy, and at what computational cost.

The FINER paper (Liu et al., 2025) demonstrated a related insight from a different direction: incorporating forward-looking performance trends (what happens to a student's performance in the weeks following a given interaction) into the training signal improves prediction accuracy on six real-world datasets by 8.74% to 84.85% over state-of-the-art KT baselines.⁶ The magnitude of the improvement reflects how much predictive information is present in temporal patterns that standard session-level evaluation cannot access.

Graph neural network components are a natural complement to the temporal backbone for the structured knowledge representation property. The dependency structure of a curriculum — fractions → ratios → proportional reasoning → algebraic fractions — can be represented as a directed graph, with concepts as nodes and prerequisite relationships as edges. Embedding the learner state in this graph, rather than in an unstructured vector space, means that uncertainty about a student's grasp of a foundational concept propagates forward through the graph to dependent concepts in a way that is structurally grounded rather than learned implicitly from co-occurrence statistics. Structure-aware knowledge tracing models (SKT, SINKT) have demonstrated that incorporating the knowledge graph into the state representation improves prediction accuracy and interpretability relative to graph-agnostic baselines.⁷
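The propagation idea can be sketched in a few lines. This is not the SKT or SINKT formulation - those are learned models - but a deliberately simple structural rule of my own, assuming an acyclic prerequisite graph: discount each concept's raw estimate by its weakest effective prerequisite, so low confidence at a foundational node reaches every dependent node.

```python
def effective_mastery(mastery, prerequisites):
    """Propagate estimates forward through a prerequisite DAG: a concept's
    effective estimate is its raw estimate discounted by the weakest
    effective prerequisite.  Assumes the graph is acyclic (a cycle would
    recurse forever); unseen concepts default to 0.5."""
    cache = {}

    def eff(concept):
        if concept not in cache:
            raw = mastery.get(concept, 0.5)
            prereqs = prerequisites.get(concept, [])
            cache[concept] = raw if not prereqs else raw * min(eff(p) for p in prereqs)
        return cache[concept]

    return {c: eff(c) for c in mastery}

# The chain from the text: fractions -> ratios -> proportional reasoning
# -> algebraic fractions, with one weak foundational estimate.
chain = {"ratios": ["fractions"],
         "proportional_reasoning": ["ratios"],
         "algebraic_fractions": ["proportional_reasoning"]}
raw = {"fractions": 0.4, "ratios": 0.9,
       "proportional_reasoning": 0.9, "algebraic_fractions": 0.9}
eff = effective_mastery(raw, chain)
# the 0.4 on "fractions" discounts every dependent concept, most of all
# "algebraic_fractions" at the end of the chain
```

The design point is that the discount is structurally grounded: the weak fractions estimate reaches algebraic fractions through explicit graph edges, not through co-occurrence statistics the model may or may not have learned.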

The training signal problem

The most important and least discussed challenge for longitudinal learner state modelling is not architectural but data-related, and it is worth being precise about why.

Training a model to predict the future stability or fragility of a student's understanding requires training examples where both the trajectory signal and the subsequent outcome are present in the data. This means: longitudinal interaction sequences of sufficient length (months, not weeks), with ground truth outcomes also present in the dataset. Most publicly available educational datasets are either too short (ASSISTments interactions are typically sparse over time), too coarse (single binary correctness labels without timing or hint request data), or lack the curricular structure annotations required to identify dependent concepts.
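The "months, not weeks" requirement can be stated as a concrete filter. The thresholds below are assumptions for illustration, not values from any cited dataset or paper; the point is that both conditions - temporal span and interaction density - must hold for a sequence to carry trajectory signal and outcome together.

```python
from datetime import datetime, timedelta

def usable_for_longitudinal_training(sequence,
                                     min_span=timedelta(days=90),
                                     min_interactions=50):
    """Keep only interaction sequences with enough temporal depth (span)
    and density (count) to train a longitudinal model.  sequence is a list
    of (timestamp, concept, correct) triples; thresholds are illustrative."""
    if len(sequence) < min_interactions:
        return False
    timestamps = [t for t, _, _ in sequence]
    return max(timestamps) - min(timestamps) >= min_span

start = datetime(2025, 1, 6)
# 60 interactions spread over roughly four months: trajectory and outcome coexist.
dense_term = [(start + timedelta(days=2 * i), "fractions", i % 2) for i in range(60)]
# The same 60 interactions crammed into about a week: dense but temporally flat.
crammed_week = [(start + timedelta(hours=3 * i), "fractions", 1) for i in range(60)]

usable_for_longitudinal_training(dense_term)    # True
usable_for_longitudinal_training(crammed_week)  # False
```

Applying a filter like this to most public datasets is exactly what shrinks them below useful size, which is the data constraint the next paragraph describes.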

The implication is that the architectural choices above are necessary but not sufficient. The training data must have the temporal and structural depth that the model requires to learn the longitudinal patterns it is being asked to detect. Building that training data - for example, through institutional partnerships that generate sustained, structured, annotated interaction logs at meaningful scale - is the constraint that determines whether the system described here is buildable in practice, not merely in principle.

The teacher who sees December in October

The question of what it would mean to actually know a student — not measure them at intervals but model the current structure and trajectory of their cognitive development — is the question that the next generation of educational AI needs to answer. The architectural primitives for approaching it exist: selective state space models for efficient long-range temporal representation, graph neural networks for structured knowledge state encoding, forward-looking training objectives for longitudinal predictive validity. What does not yet exist is a system that combines these components, trains them on data of sufficient depth, and deploys them at the scale that would make the detection meaningful.

The teacher who sees December coming in October is not running a better assessment. They are maintaining a richer internal model: one that represents not just where a student is but how they got there and where the trajectory leads. That is the capability that needs to be formalised, trained, and scaled.

¹ For the procedural-conceptual distinction in mathematics education, see: Rittle-Johnson, B., Siegler, R.S., & Alibali, M.W. (2001). Developing conceptual understanding and procedural skill in mathematics: An iterative process. Journal of Educational Psychology, 93(2), 346–362.
² Smith, J.P., diSessa, A.A., & Roschelle, J. (1994). Misconceptions reconceived: A constructivist analysis of knowledge in transition. Journal of the Learning Sciences, 3(2), 115–163.
³ Piech, C. et al. (2015). Deep Knowledge Tracing. NeurIPS 2015. Ghosh, A., Heffernan, N., & Lan, A.S. (2020). Context-aware attentive knowledge tracing. KDD 2020. Shin, D. et al. (2021). SAINT+: Integrating temporal features for EdNet correctness prediction. LAK 2021.
⁴ Preprints.org (2025). A Systematic Review of Deep Knowledge Tracing (2015–2025): Toward Responsible AI for Education.
⁵ Gu, A. & Dao, T. (2023). Mamba: Linear-time sequence modelling with selective state spaces. arXiv:2312.00752.
⁶ Liu, H. et al. (2025). Advancing Knowledge Tracing by Exploring Follow-up Performance Trends (FINER). arXiv:2508.08019.
⁷ Tong, S. et al. (2020). Structure-based Knowledge Tracing: An Influence Propagation View. ICDM 2020. Fu, L. et al. (2024). SINKT: A structure-aware inductive knowledge tracing model with large language model. CIKM 2024.