Behavior atlas

Model behavior can be mapped without pretending it is fixed.

WikiLM uses terrain language because model responses shift with prompt pressure, context, source quality, and requested format. A map does not claim the mountain is the same in every season. It gives a reader landmarks for careful movement.

Terrain mark

Confidence ridge

Where fluent structure rises faster than evidence. The question is whether the response shows its footing.

Terrain mark

Refusal gate

Where safety, policy, ambiguity, or missing context blocks the path. The useful detail is the stated reason.

Terrain mark

Compression valley

Where a model shortens source material and quietly drops qualifiers, minority cases, or the order of steps.

Terrain mark

Repair path

Where a correction improves the answer, changes the task, or exposes what the model could not track.

The atlas is useful when a term is too clean for the evidence.

A term like hallucination can be necessary, but it can also hide several different behaviors: unsupported invention, source confusion, stale memory, exaggerated synthesis, or a confident bridge between facts. The atlas tries to keep these distinctions visible. It describes what the model did, what the prompt asked, and which part of the response carried the risk.

This makes the site practical for editors, researchers, product teams, and students. Instead of treating every failure as the same category, a note can point to the exact terrain: the answer compressed away a condition, refused without explaining a path forward, or improved only after the evidence requirement became explicit. Those distinctions are small, but they change how people design prompts, evaluate outputs, and cite model-assisted work.
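
As a rough illustration of how such a note might be recorded, here is a minimal Python sketch. The four terrain names come from the atlas; the `AtlasNote` class, its field names, and the sample values are hypothetical and chosen only for this example. They are not a WikiLM schema.

```python
from dataclasses import dataclass
from enum import Enum


class Terrain(Enum):
    """The four terrain marks named in the atlas."""
    CONFIDENCE_RIDGE = "confidence ridge"
    REFUSAL_GATE = "refusal gate"
    COMPRESSION_VALLEY = "compression valley"
    REPAIR_PATH = "repair path"


@dataclass(frozen=True)
class AtlasNote:
    """Hypothetical note shape; the field names are assumptions."""
    terrain: Terrain
    prompt_asked: str  # what the prompt asked
    model_did: str     # what the model did
    risky_part: str    # which part of the response carried the risk

    def summary(self) -> str:
        # One-line form that keeps all three descriptions visible.
        return (
            f"[{self.terrain.value}] asked: {self.prompt_asked}; "
            f"did: {self.model_did}; risk: {self.risky_part}"
        )


# Invented sample values, for illustration only.
note = AtlasNote(
    terrain=Terrain.COMPRESSION_VALLEY,
    prompt_asked="summarize the policy in two sentences",
    model_did="dropped the condition limiting the policy to new accounts",
    risky_part="second sentence of the summary",
)
print(note.summary())
```

Keeping the terrain as an enumerated value rather than free text is one way to stop notes from drifting back into a single catch-all label, while the three text fields hold the detail that the label alone would hide.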