Large language models deployed as tool-using agents exhibit distinctive behavioural patterns, which we call cognitive fingerprints, that emerge from their training lineage rather than from their explicit instructions. We present a controlled experiment in which thirteen substrates from nine lineages performed the same specification-authoring task with identical tool access (file search, content search, file reading, and task tracking). We measured six dimensions beyond task accuracy: tool-foraging strategy, survey depth, specification quality, convention adherence, interpretive divergence, and reflection quality. We find that (1) tool-use patterns constitute a stable cognitive phenotype per lineage; (2) convention adherence varies independently of task competence; (3) interpretive divergence across substrates maps automation boundaries: where substrates converge, the task is mechanical, and where they diverge into clusters, human judgment is required; and (4) mixing substrates yields complementary coverage that no single substrate achieves alone. We situate these findings within a five-thread literature review spanning behavioural fingerprinting, tool-use benchmarking, multi-agent diversity, beyond-accuracy evaluation, and convention adherence. This is a living survey: we intend to update it as new substrates are tested and as new literature appears.
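To make the controlled design concrete, the sketch below shows one way the shared setup could be represented: every substrate receives the same task and the same four tools, and each run is scored on the six dimensions listed above. All identifiers (`TOOLS`, `DIMENSIONS`, `RunRecord`, the example substrate and lineage names) are our own illustration, not the paper's harness.

```python
from dataclasses import dataclass, field

# Hypothetical sketch of the controlled setup; names are illustrative only.
# The design holds tools and task constant so that differences across runs
# can be attributed to the substrate's lineage rather than the environment.

TOOLS = ("file_search", "content_search", "file_read", "task_tracker")

DIMENSIONS = (
    "tool_foraging_strategy",
    "survey_depth",
    "specification_quality",
    "convention_adherence",
    "interpretive_divergence",
    "reflection_quality",
)


@dataclass
class RunRecord:
    """One substrate's run on the shared specification-authoring task."""
    substrate: str                                        # e.g. "substrate-a" (hypothetical id)
    lineage: str                                          # training lineage the substrate belongs to
    tool_calls: list[str] = field(default_factory=list)   # ordered tool names observed in the run
    scores: dict[str, float] = field(default_factory=dict)  # dimension name -> score

    def validate(self) -> None:
        # Controlled design: only the four shared tools may appear,
        # and every run must be scored on all six dimensions.
        assert all(call in TOOLS for call in self.tool_calls)
        assert set(self.scores) == set(DIMENSIONS)
```

Under this framing, per-lineage phenotypes correspond to regularities in `tool_calls` across substrates of the same lineage, and interpretive divergence corresponds to the spread of `scores` across substrates on the same task.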
...