Essay April 2026 Living document

On Being Cited

Why a documentary archive needs a permanent citation grammar — and what happens when it does not have one

Companion to How the Archive is Organised (the technical reference)

Audience Curators, journalists, funders, future archivists

A photograph that cannot be cited cannot be argued with, cannot be linked to, cannot be quoted in a footnote, cannot be deposited with confidence into a library catalogue, and cannot be found again in a hundred years. It can be looked at, but it cannot be referenced. The difference between the two is the difference between an image and an archive entry.

This page makes the argument for the citation grammar this archive uses: why every photograph carries a permanent IM-NNNN, why every page has a permanent archive ID, and what the citation system does that the photograph by itself cannot. The mechanical details (the prefix system, the registry, the redirect logic) are documented at How the Archive is Organised. This page is about the why.

What citation does

Citation does four things at once, and each of them is work that nothing else can do.

It freezes the reference. A citation says: this specific frame, of this specific subject, at this specific moment. Without an ID the reference is ‘the picture of the millwright’, which leaves every question open: which picture, of which millwright, by which photographer, in which version, before or after the editorial cut, before or after the colour grade? With an IM-0030 the reference is unambiguous. The frame is identified, separable from every other frame in the archive, and stable against future revision of the page that contains it.

It survives migration. Archives move. The current website may be hosted on Vercel today, on a successor in five years, in a museum’s long-term digital preservation system in twenty. URLs will change. Site structures will change. Content management systems will be replaced. What survives the migration is the citation grammar, because it is small, portable, and tied to the entry rather than to any particular delivery medium. A future archivist receiving the deposit can rebuild the redirects in whatever system the entry has migrated into, and the IM-0030 reference made today will still resolve when both the original site and its successor are gone.

It enables external scholarship. A journalist writing about the archive in 2027 can cite IM-0030 in a footnote and have confidence the citation will work. A historian in 2050 can find the photograph by ID even if the website is no longer live, because the ID will resolve to whichever current institutional deposit holds the work. A teacher building a course module on English heritage crafts in 2032 can hand students a list of citations and trust that the references are stable. None of this is possible without the grammar.

It documents the editing. Citations are also the record of what was kept. The archive’s registry retires IDs that are cut from the work; those retired IDs are visible at the archive ID index with the reason for the cut and the date. This is rare in photography projects. It means the archive is honest about its editorial decisions in a way that disappears if the cut images simply vanish without trace. The citation grammar carries the editorial record, not just the published record.
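The migration and editing points above can be made concrete with a small sketch. It assumes a record that maps each permanent ID either to a current location or to a retirement note, and shows that a migration rewrites locations while the IDs stay untouched. The type names, field names, example.org URLs, and the sample retirement reason and date are all invented for illustration; the archive's real redirect logic is documented in How the Archive is Organised and may look quite different.

```typescript
// Hypothetical record shape; every field name here is an illustrative assumption.
interface ArchiveRecord {
  id: string;                                  // permanent, e.g. "IM-0030"
  location?: string;                           // current home; changes on migration
  retired?: { reason: string; date: string };  // set when the entry is cut
}

type Resolution =
  | { kind: "redirect"; to: string }
  | { kind: "retired"; reason: string; date: string }
  | { kind: "unknown" };

// Look up an ID: live entries redirect, retired entries report why they were cut.
function resolve(records: ArchiveRecord[], id: string): Resolution {
  for (const record of records) {
    if (record.id !== id) continue;
    if (record.retired) {
      return { kind: "retired", reason: record.retired.reason, date: record.retired.date };
    }
    if (record.location) {
      return { kind: "redirect", to: record.location };
    }
  }
  return { kind: "unknown" };
}

// Rebuild locations for a new host. IDs and retirement notes are copied unchanged.
function migrate(records: ArchiveRecord[], oldBase: string, newBase: string): ArchiveRecord[] {
  return records.map((record) => {
    if (record.location === undefined || record.location.indexOf(oldBase) !== 0) {
      return record;
    }
    const moved: ArchiveRecord = {
      id: record.id,
      location: newBase + record.location.slice(oldBase.length),
    };
    if (record.retired) moved.retired = record.retired;
    return moved;
  });
}

// Invented sample data, for illustration only.
const RECORDS: ArchiveRecord[] = [
  { id: "IM-0030", location: "https://old.example.org/photographs/millwright" },
  { id: "IM-0031", retired: { reason: "duplicate frame", date: "2026-01-15" } },
];

const MIGRATED = migrate(RECORDS, "https://old.example.org", "https://new.example.org");
```

In this sketch a footnote citing IM-0030 resolves before and after the move, and a citation of the retired IM-0031 returns the reason for the cut rather than a dead link.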

The cost of not having it

Most documentary photography projects do not have a citation grammar. They have books, exhibitions, and gallery prints. They have a photographer’s name and a year, and the rest is fuzzy. The cost of this is not visible while the project is actively published, because the photographer or the gallery can answer questions directly. It becomes visible later, when the photographer is no longer available and the project has migrated through several institutional homes.

Three patterns recur in projects that have skipped citation:

The orphaned image. A photograph that has appeared in publications, exhibitions, and online articles, but where nobody can confirm which print is the canonical version, what the original caption was, who the subject was, or where it was made. The image survives. The metadata around it does not. Stock-photography catalogues are full of these.

The fragmented record. A photographer’s archive split, after death, between several institutions. Each holds a partial set of prints with its own catalogue numbers. There is no master register. To research the photographer’s work, a scholar must visit multiple institutions, cross-reference partial lists, and reconstruct what was once a single body of work. The fragmentation is not the institutions’ fault; they were not given a portable citation grammar to inherit.

The lost edit. A book published with seventy plates. A photographer’s notebook indicates two hundred frames were considered. Without IDs on either set, the relationship between the seventy and the two hundred is lost. The editorial decision that shaped the published work cannot be reconstructed, because there is no way to ask ‘which of the two hundred became the seventy, and which were cut, and why?’

Each of these costs is invisible during the working life of the project and becomes visible only when the project has been handed on. Documentary archives are made for the long view. The grammar is one of the cheapest insurances available against the predictable losses of that handoff.

Why permanence is not negotiable

The hardest part of running a citation grammar is holding the line on permanence. The temptation is constant. A subject changes their preferred name; the slug becomes inaccurate; surely it is fine to update. A category gets renamed; surely the old IDs should be reassigned. A photograph turns out to have been a duplicate; surely the ID should be reused. A page is moved to a different URL; surely the ID can move with it.

Each of those temptations is reasonable in isolation. None of them is acceptable in aggregate. Once the rule has been broken even a single time, the grammar becomes unreliable, and an unreliable citation grammar is no use to anyone. A footnote that says ‘IM-0030’ either always points at the same frame or it does not. There is no middle ground.

The archive’s rules - documented in How the Archive is Organised - are absolute on this point.

Once assigned, never changes. The slug, the URL, the title, the category - any of those can change. The ID stays.
Never reused. A retired ID is retired forever. A new entry that resembles a retired one gets the next sequential number, not the recycled one.
Sequential within type, in order of addition. First in, lowest number. The sequence carries chronology by itself.
Forthcoming entries get IDs too. A page does not have to be live to be registered. The ID is assigned at conception, not at publication.

The build system enforces these rules at the technical level. The archive’s registry throws at module load if it sees two entries with the same ID; the build fails rather than shipping. The discipline is not aspirational; it is structural. The rules cannot be quietly bent because the build will not let them be bent.
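A minimal TypeScript sketch of how such enforcement could look, assuming a registry that is a plain array of entries checked when the module is imported. The type names, field names, sample slugs, and the nextId helper are illustrative assumptions, not the archive's actual code.

```typescript
// Hypothetical registry shape; names are illustrative, not the archive's schema.
type EntryStatus = "live" | "forthcoming" | "retired";

interface RegistryEntry {
  id: string;          // e.g. "IM-0030"; permanent once assigned
  slug: string;        // free to change without touching the ID
  status: EntryStatus; // retired entries stay in the list so their IDs stay taken
}

// Throws on any repeated ID. Because retired entries are kept in the list,
// recycling a retired ID for a new entry also counts as a duplicate.
function buildRegistry(entries: RegistryEntry[]): Record<string, RegistryEntry> {
  const byId: Record<string, RegistryEntry> = {};
  for (const entry of entries) {
    if (Object.prototype.hasOwnProperty.call(byId, entry.id)) {
      throw new Error("Duplicate archive ID: " + entry.id);
    }
    byId[entry.id] = entry;
  }
  return byId;
}

// Next ID for a prefix: one past the highest number ever assigned,
// counting retired and forthcoming entries. Assumes a letters-only prefix
// and four-digit numbers, as in IM-NNNN.
function nextId(registry: Record<string, RegistryEntry>, prefix: string): string {
  const pattern = new RegExp("^" + prefix + "-(\\d{4})$");
  let max = 0;
  for (const id of Object.keys(registry)) {
    const match = pattern.exec(id);
    if (match !== null) {
      max = Math.max(max, parseInt(match[1], 10));
    }
  }
  return prefix + "-" + ("0000" + (max + 1)).slice(-4);
}

// Invented sample entries, for illustration only.
const ENTRIES: RegistryEntry[] = [
  { id: "IM-0001", slug: "first-frame", status: "live" },
  { id: "IM-0002", slug: "cut-frame", status: "retired" },
  { id: "IM-0003", slug: "planned-frame", status: "forthcoming" },
];

// Evaluated at module load: importing this file fails if ENTRIES repeats an ID,
// which in turn fails any build that imports it.
const REGISTRY = buildRegistry(ENTRIES);
```

The design choice worth noting is that the check runs as a side effect of loading the module rather than as a separate lint step, so there is no way to build the site without also running it.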

The bet

The decision to run a citation grammar this strict is, in honest terms, a bet. It costs editorial time. It requires a build system that enforces the rules. It demands a registry that becomes more elaborate over time. It is not the easy choice for a documentary photography project, which traditionally publishes books and walks away.

The bet is that machine-readable archival citation will matter more in the next century than it did in the last. The lineage this archive sits inside - Stone, Evans, Roberts, Sykes - has handled citation as a downstream institutional responsibility, with mixed results. The work that has come through cleanly has come through because a serious institution took on the citation work after the photographer’s death. The work that has been lost has been lost because that institutional handoff did not happen, or happened to a body that did not have the resources to do it well.

Running the citation grammar inside the project from the beginning shifts that work. The archive does not depend on a future institution to invent its identifiers. The IDs are already assigned, already permanent, already documented, already tested by the build system. Whichever future institution receives the deposit inherits a grammar that already works, and the grammar can survive being moved between institutions because it is not tied to any particular catalogue system.

That is the bet. It assumes that the next century’s archive readers - human and machine - will appreciate finding a stable identifier when they look for one, and will appreciate not finding the orphaned image, the fragmented record, and the lost edit. It assumes that the cost of running the grammar now is small compared with the cost of having to invent it retrospectively in fifty years. It assumes that being properly cited is part of what a documentary archive is for.

So far, on the evidence, the assumption holds.
