Help: Naming conventions
From WikiLeague, the free baseball governance encyclopedia.
NAMING
Consistent filenames are how the archive stays browsable, sortable, and machine-parseable. These conventions are mandatory.
1. Pattern
{date}_{doctype}_{slug}.{ext}
The paired metadata file uses the same base name with .md:
{date}_{doctype}_{slug}.{ext} ← the source document
{date}_{doctype}_{slug}.md ← the metadata
Both live in the same category folder under documents/.
2. Components
2.1 {date}
ISO 8601, matching the document's date_precision:
YYYY-MM-DDfor day-precisionYYYY-MMfor month-precisionYYYYfor year-precision
For documents with a date range (CBAs, leases), use the start date or year of execution. The range goes in the slug and in metadata's effective_period.
Examples: 1972-06-19, 2022-03-10, 1921, 1903-09.
2.2 {doctype}
A short, consistent code drawn from this fixed list (kept in sync with TAXONOMY.md):
| Code | Meaning |
|---|---|
cba |
Collective Bargaining / Basic Agreement |
constitution |
League constitution |
agreement |
Other formal multi-party agreement (Major League Agreement, National Agreement, posting agreement) |
contract |
Uniform Player Contract, executed contracts |
policy |
Drug policy, conduct policy, internal policy |
caselaw |
Court ruling or opinion |
legislation |
Statute, public law |
hearing |
Congressional hearing record |
report |
Formal investigation or commissioned report |
memo |
Internal memo, commissioner memo |
letter |
Open letter, correspondence in evidence |
testimony |
Standalone witness testimony |
statement |
Formal public statement |
arbitration |
Grievance or salary arbitration ruling |
lease |
Stadium or facility lease |
filing |
Court filing (complaint, motion, response) |
bylaws |
Organizational bylaws |
rules |
Major League Rules or comparable rulebook |
decision |
Commissioner formal decision |
If a document plausibly fits two doctypes, pick the one closest to the document's primary character — a Congressional report on steroid hearings is report, not hearing; the hearings themselves are hearing.
2.3 {slug}
Lowercase, hyphen-separated, descriptive. Aim for unambiguity at a glance.
Rules:
- Lowercase only.
- Use hyphens between words.
- Use underscores
_only as separators between filename components (between date, doctype, and slug). Never inside the slug itself. - No spaces, no punctuation, no diacritics (transliterate as needed).
- Aim for 3–7 words.
- Include the most distinguishing element: party name, version year, subject.
- Drop articles (
a,the). - Drop redundant words (no
case,act,agreementwhen the doctype field already says that).
Examples:
flood-v-kuhn(case)curt-flood-act-of-1998(legislation)mlb-cba-2022-2026(CBA — include the term for disambiguation)mlb-constitution-2005-updatevincent-drug-policy-memomitchell-reportblue-ribbon-panel-baseball-economicspalmeiro-statement-positive-testexpos-rico-complaint-minority-ownersdcsec-baseball-stadium-agreement
3. Examples — full filenames
documents/antitrust-and-courts/1922-05-29_caselaw_federal-baseball-club-v-national-league.pdf
documents/antitrust-and-courts/1922-05-29_caselaw_federal-baseball-club-v-national-league.md
documents/antitrust-and-courts/1972-06-19_caselaw_flood-v-kuhn.pdf
documents/antitrust-and-courts/1972-06-19_caselaw_flood-v-kuhn.md
documents/legislation-and-hearings/1998-10-27_legislation_curt-flood-act-of-1998.pdf
documents/legislation-and-hearings/1998-10-27_legislation_curt-flood-act-of-1998.md
documents/cbas/2022-03-10_cba_mlb-cba-2022-2026.pdf
documents/cbas/2022-03-10_cba_mlb-cba-2022-2026.md
documents/governance/1921_agreement_major-league-agreement.pdf
documents/governance/1921_agreement_major-league-agreement.md
documents/governance/2008-03_constitution_mlb-constitution-post-2005-amendment.pdf
documents/governance/2008-03_constitution_mlb-constitution-post-2005-amendment.md
documents/drug-and-conduct/1991-06-07_memo_vincent-drug-policy-program.pdf
documents/drug-and-conduct/1991-06-07_memo_vincent-drug-policy-program.md
documents/reports-and-investigations/2007-12-13_report_mitchell-report.pdf
documents/reports-and-investigations/2007-12-13_report_mitchell-report.md
documents/stadiums-and-relocation/2004_lease_dcsec-baseball-stadium-agreement.pdf
documents/stadiums-and-relocation/2004_lease_dcsec-baseball-stadium-agreement.md
documents/franchise-finance/2002_filing_expos-rico-complaint-minority-owners.pdf
documents/franchise-finance/2002_filing_expos-rico-complaint-minority-owners.md
4. Multi-part documents
Some documents come as multi-file sets (a hearing transcript with separate exhibit volumes, a court ruling with appendices):
- Use a shared base name plus a
_partNor descriptive suffix. - Each part gets its own paired
.mdif it's separately citable; otherwise, one.mddescribes the whole set and lists all parts infile.additional_files.
Examples:
2005-03-17_hearing_house-government-reform-steroids_transcript.pdf
2005-03-17_hearing_house-government-reform-steroids_exhibits-vol1.pdf
2005-03-17_hearing_house-government-reform-steroids_exhibits-vol2.pdf
2005-03-17_hearing_house-government-reform-steroids.md ← single metadata file
5. Versions and updates
When a document is replaced by a new version (e.g., the MLB Constitution gets an update):
- The new version gets its own file with its own date in the filename.
- The old version's metadata
superseded_byfield is updated to point to the new file's slug. - The new version's metadata
supersedesfield points to the old.
Don't overwrite an old version with a new one — preserve both.
6. Slugs in metadata cross-references
When referring to other documents in supersedes, superseded_by, related_documents, etc., use the slug only — the full base filename without extension:
related_documents:
- "1922-05-29_caselaw_federal-baseball-club-v-national-league"
- "1953-11-09_caselaw_toolson-v-new-york-yankees"
- "1998-10-27_legislation_curt-flood-act-of-1998"
This makes cross-linking deterministic and gives the future front-end something predictable to resolve.
7. Edge cases
- Undated documents: use the earliest plausible year you can defend, set
date_precision: year, and note the dating evidence insource.snapshot_notesornotes. - Documents with multiple effective dates (e.g., a law signed one day, effective another): use the date that matches
date_type(usuallysignedoreffective, chosen per the document's character). - Documents you only know by case number: pull the issuance date from the docket; that's the filename date.
- Documents discovered through Wayback Machine: filename uses the document's own date, not the snapshot date. Snapshot date goes in
source.archive_urlandsource.retrieved_date.
8. Collection folders
Some single research entries are document sets rather than single files (e.g., a court case with multiple filings, a hearing with witness-by-witness testimony, a NARA box of case files). These are stored as collection folders named per the standard {date}_{doctype}_{slug}/ convention, with the following internal structure:
documents/<category>/<date>_<doctype>_<slug>/
README.md ← collection-level metadata file (canonical)
MANIFEST.md ← (optional) free-form per-file inventory
{date}_{doctype}_{sub-slug}.pdf ← (optional) individually citable sub-documents,
{date}_{doctype}_{sub-slug}.md ← each with its own paired metadata
*.pdf, *.txt ← (optional) non-citable supporting files,
tracked under collection README's
file.additional_files
Rules:
README.mdis the collection's canonical metadata file. It carries the full YAML frontmatter (same schema as a single-document metadata file), and itsstatusreflects the collection-level verification status.MANIFEST.mdis optional. If present, it is a free-form per-file inventory (file list, SHA256 per file, page counts, descriptive notes). It does NOT need YAML frontmatter — it is a sibling document ofREADME.md, not its own indexable entry.- Sub-document
.mdfiles are reserved for sub-documents that are separately citable in their own right (e.g., the Messersmith-McNally collection has separate sub-documents for the W.D. Mo. and 8th Cir. rulings). Each gets its own paired metadata file following the normal{date}_{doctype}_{slug}pattern, and each appears as its own row inINDEX.md. - Non-citable supporting files (case-file PDFs in a NARA box, individual witness testimony transcripts within a larger hearing, secondary-source PDFs) are tracked under the collection README's
file.additional_filesarray, not as separate.mdentries.
INDEX.md treatment: the collection's README appears as a row using the slug <date>_<doctype>_<slug>/README (with the trailing /README). Separately citable sub-documents appear as additional rows below the README row using the slug <date>_<doctype>_<slug>/<date>_<doctype>_<sub-slug>.
Cross-references (related_documents etc.): use the full slug path including /README for collection-level references and /<sub-slug> for sub-document references.
9. Confirmation copies and current-version markers
Two filename-prefix conventions are reserved for non-canonical companion files stored alongside the primary file:
9.1 _confirm_* prefix
A _confirm_* PDF is an independent second-source confirmation copy of the primary document, stored alongside the primary for verification audit purposes. Naming pattern:
_confirm_{date}_{primary-slug}_{secondary-publisher}.{ext}
Examples:
_confirm_1972-flood-v-kuhn-justia.pdf— Justia copy of Flood v. Kuhn alongside the primary LoC PDF._confirm_constitution_documentcloud.pdf— DocumentCloud copy of the MLB Constitution alongside the primary._confirm_jdptp_mlb_mirror.pdf— MLB-hosted mirror alongside the MLBPA-hosted primary of the Joint Drug Program.
Confirmation copies do NOT get their own .md metadata file. They are referenced from the primary document's source.confirmation_sources[].url (the URL they were retrieved from) and, where relevant, from file.additional_files (the local file inventory).
9.2 _current_* prefix
A _current_* PDF is a moving-target pointer to "whatever the current operative version of this document is, separate from any date-stamped historical version." This pattern is reserved for live policy/rule documents that get updated in place by the publisher.
Example: _current_rules_major-league-rules.pdf points to "the current operative MLR" even when versioned files (e.g., 2008-03-31_rules_major-league-rules-2008.pdf, 2026-03-17_rules_major-league-rules.pdf) are also in the archive.
Like _confirm_* files, _current_* files do NOT get their own metadata. They are tracked under the most recent versioned file's file.additional_files or in a notes field. Use sparingly — version-stamped files are preferred whenever the document has a meaningful version date.