Can an LLM discover concepts it was never explicitly taught, like the number zero?
This is one of the more philosophically interesting questions in AI research: can a language model generalize to concepts that weren't directly labeled in its training data? Zero is a fascinating test case because it's both mathematically foundational and historically non-obvious — many ancient civilizations had arithmetic without it.
Researchers probe this by training models on number systems or symbolic tasks whose structure implies a zero-like concept, then checking whether the model represents that concept internally or uses it correctly on inputs it has not seen. A commonly reported pattern is that models can develop implicit representations of such concepts through what might be called pattern pressure: the structure of the data demands it, even if no example ever says "this is zero."
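To make the "checking whether the model internally represents it" step concrete, here is a minimal sketch of a difference-of-means probe, one simple way to test whether a concept can be read out of activations. It is an illustration under stated assumptions, not the method of any particular study: the activations are synthetic random vectors rather than hidden states from a real model, and the function names (make_synthetic_activations, fit_probe, probe_accuracy), the "zero direction," and the signal strength are all hypothetical.

```python
import random


def make_synthetic_activations(n, dim, signal, rng):
    """Return (vectors, labels); label 1 means "this token denotes zero".

    Assumption: these are synthetic stand-ins for hidden states that would
    normally be read out of a trained model.
    """
    direction = [1.0 / dim ** 0.5] * dim  # hypothetical unit "zero direction"
    vectors, labels = [], []
    for i in range(n):
        label = i % 2  # alternate labels so the classes are balanced
        vec = [rng.gauss(0.0, 1.0) + label * signal * d for d in direction]
        vectors.append(vec)
        labels.append(label)
    return vectors, labels


def mean_vector(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def fit_probe(vectors, labels):
    """Fit a difference-of-means probe: a direction plus a midpoint threshold."""
    pos = mean_vector([v for v, y in zip(vectors, labels) if y == 1])
    neg = mean_vector([v for v, y in zip(vectors, labels) if y == 0])
    direction = [p - q for p, q in zip(pos, neg)]
    midpoint = [(p + q) / 2 for p, q in zip(pos, neg)]
    threshold = sum(d * m for d, m in zip(direction, midpoint))
    return direction, threshold


def probe_accuracy(probe, vectors, labels):
    """Fraction of vectors whose projection falls on the correct side."""
    direction, threshold = probe
    correct = 0
    for v, y in zip(vectors, labels):
        score = sum(d * x for d, x in zip(direction, v))
        correct += int((score > threshold) == (y == 1))
    return correct / len(labels)


if __name__ == "__main__":
    rng = random.Random(0)
    train_x, train_y = make_synthetic_activations(200, 8, 4.0, rng)
    test_x, test_y = make_synthetic_activations(200, 8, 4.0, rng)
    probe = fit_probe(train_x, train_y)
    print("held-out probe accuracy:", probe_accuracy(probe, test_x, test_y))
```

High held-out accuracy only shows that the concept is linearly decodable from the activations, not that the model actually relies on it. Setting signal to 0.0 gives a simple control: with no zero-related structure in the vectors, the same probe should score near chance.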
This suggests that LLMs do more than memorize: they appear to perform a form of latent concept induction. Whether that counts as true discovery or sophisticated interpolation is still actively debated.