Help: Naming conventions

From WikiLeague, the free baseball governance encyclopedia.

All Help pages

NAMING

Consistent filenames are how the archive stays browsable, sortable, and machine-parseable. These conventions are mandatory.

1. Pattern

{date}_{doctype}_{slug}.{ext}

The paired metadata file uses the same base name with .md:

{date}_{doctype}_{slug}.{ext}      ← the source document
{date}_{doctype}_{slug}.md         ← the metadata

Both live in the same category folder under documents/.

2. Components

2.1 {date}

ISO 8601, matching the document's date_precision:

  • YYYY-MM-DD for day-precision
  • YYYY-MM for month-precision
  • YYYY for year-precision

For documents with a date range (CBAs, leases), use the start date or year of execution. The range goes in the slug and in metadata's effective_period.

Examples: 1972-06-19, 2022-03-10, 1921, 1903-09.

2.2 {doctype}

A short, consistent code drawn from this fixed list (kept in sync with TAXONOMY.md):

Code Meaning
cba Collective Bargaining / Basic Agreement
constitution League constitution
agreement Other formal multi-party agreement (Major League Agreement, National Agreement, posting agreement)
contract Uniform Player Contract, executed contracts
policy Drug policy, conduct policy, internal policy
caselaw Court ruling or opinion
legislation Statute, public law
hearing Congressional hearing record
report Formal investigation or commissioned report
memo Internal memo, commissioner memo
letter Open letter, correspondence in evidence
testimony Standalone witness testimony
statement Formal public statement
arbitration Grievance or salary arbitration ruling
lease Stadium or facility lease
filing Court filing (complaint, motion, response)
bylaws Organizational bylaws
rules Major League Rules or comparable rulebook
decision Commissioner formal decision

If a document plausibly fits two doctypes, pick the one closest to the document's primary character — a Congressional report on steroid hearings is report, not hearing; the hearings themselves are hearing.

2.3 {slug}

Lowercase, hyphen-separated, descriptive. Aim for unambiguity at a glance.

Rules:

  • Lowercase only.
  • Use hyphens between words.
  • Use underscores _ only as separators between filename components (between date, doctype, and slug). Never inside the slug itself.
  • No spaces, no punctuation, no diacritics (transliterate as needed).
  • Aim for 3–7 words.
  • Include the most distinguishing element: party name, version year, subject.
  • Drop articles (a, the).
  • Drop redundant words (no case, act, agreement when the doctype field already says that).

Examples:

  • flood-v-kuhn (case)
  • curt-flood-act-of-1998 (legislation)
  • mlb-cba-2022-2026 (CBA — include the term for disambiguation)
  • mlb-constitution-2005-update
  • vincent-drug-policy-memo
  • mitchell-report
  • blue-ribbon-panel-baseball-economics
  • palmeiro-statement-positive-test
  • expos-rico-complaint-minority-owners
  • dcsec-baseball-stadium-agreement

3. Examples — full filenames

documents/antitrust-and-courts/1922-05-29_caselaw_federal-baseball-club-v-national-league.pdf
documents/antitrust-and-courts/1922-05-29_caselaw_federal-baseball-club-v-national-league.md

documents/antitrust-and-courts/1972-06-19_caselaw_flood-v-kuhn.pdf
documents/antitrust-and-courts/1972-06-19_caselaw_flood-v-kuhn.md

documents/legislation-and-hearings/1998-10-27_legislation_curt-flood-act-of-1998.pdf
documents/legislation-and-hearings/1998-10-27_legislation_curt-flood-act-of-1998.md

documents/cbas/2022-03-10_cba_mlb-cba-2022-2026.pdf
documents/cbas/2022-03-10_cba_mlb-cba-2022-2026.md

documents/governance/1921_agreement_major-league-agreement.pdf
documents/governance/1921_agreement_major-league-agreement.md

documents/governance/2008-03_constitution_mlb-constitution-post-2005-amendment.pdf
documents/governance/2008-03_constitution_mlb-constitution-post-2005-amendment.md

documents/drug-and-conduct/1991-06-07_memo_vincent-drug-policy-program.pdf
documents/drug-and-conduct/1991-06-07_memo_vincent-drug-policy-program.md

documents/reports-and-investigations/2007-12-13_report_mitchell-report.pdf
documents/reports-and-investigations/2007-12-13_report_mitchell-report.md

documents/stadiums-and-relocation/2004_lease_dcsec-baseball-stadium-agreement.pdf
documents/stadiums-and-relocation/2004_lease_dcsec-baseball-stadium-agreement.md

documents/franchise-finance/2002_filing_expos-rico-complaint-minority-owners.pdf
documents/franchise-finance/2002_filing_expos-rico-complaint-minority-owners.md

4. Multi-part documents

Some documents come as multi-file sets (a hearing transcript with separate exhibit volumes, a court ruling with appendices):

  • Use a shared base name plus a _partN or descriptive suffix.
  • Each part gets its own paired .md if it's separately citable; otherwise, one .md describes the whole set and lists all parts in file.additional_files.

Examples:

2005-03-17_hearing_house-government-reform-steroids_transcript.pdf
2005-03-17_hearing_house-government-reform-steroids_exhibits-vol1.pdf
2005-03-17_hearing_house-government-reform-steroids_exhibits-vol2.pdf
2005-03-17_hearing_house-government-reform-steroids.md          ← single metadata file

5. Versions and updates

When a document is replaced by a new version (e.g., the MLB Constitution gets an update):

  • The new version gets its own file with its own date in the filename.
  • The old version's metadata superseded_by field is updated to point to the new file's slug.
  • The new version's metadata supersedes field points to the old.

Don't overwrite an old version with a new one — preserve both.

6. Slugs in metadata cross-references

When referring to other documents in supersedes, superseded_by, related_documents, etc., use the slug only — the full base filename without extension:

related_documents:
  - "1922-05-29_caselaw_federal-baseball-club-v-national-league"
  - "1953-11-09_caselaw_toolson-v-new-york-yankees"
  - "1998-10-27_legislation_curt-flood-act-of-1998"

This makes cross-linking deterministic and gives the future front-end something predictable to resolve.

7. Edge cases

  • Undated documents: use the earliest plausible year you can defend, set date_precision: year, and note the dating evidence in source.snapshot_notes or notes.
  • Documents with multiple effective dates (e.g., a law signed one day, effective another): use the date that matches date_type (usually signed or effective, chosen per the document's character).
  • Documents you only know by case number: pull the issuance date from the docket; that's the filename date.
  • Documents discovered through Wayback Machine: filename uses the document's own date, not the snapshot date. Snapshot date goes in source.archive_url and source.retrieved_date.

8. Collection folders

Some single research entries are document sets rather than single files (e.g., a court case with multiple filings, a hearing with witness-by-witness testimony, a NARA box of case files). These are stored as collection folders named per the standard {date}_{doctype}_{slug}/ convention, with the following internal structure:

documents/<category>/<date>_<doctype>_<slug>/
    README.md                            ← collection-level metadata file (canonical)
    MANIFEST.md                          ← (optional) free-form per-file inventory
    {date}_{doctype}_{sub-slug}.pdf      ← (optional) individually citable sub-documents,
    {date}_{doctype}_{sub-slug}.md       ← each with its own paired metadata
    *.pdf, *.txt                         ← (optional) non-citable supporting files,
                                           tracked under collection README's
                                           file.additional_files

Rules:

  • README.md is the collection's canonical metadata file. It carries the full YAML frontmatter (same schema as a single-document metadata file), and its status reflects the collection-level verification status.
  • MANIFEST.md is optional. If present, it is a free-form per-file inventory (file list, SHA256 per file, page counts, descriptive notes). It does NOT need YAML frontmatter — it is a sibling document of README.md, not its own indexable entry.
  • Sub-document .md files are reserved for sub-documents that are separately citable in their own right (e.g., the Messersmith-McNally collection has separate sub-documents for the W.D. Mo. and 8th Cir. rulings). Each gets its own paired metadata file following the normal {date}_{doctype}_{slug} pattern, and each appears as its own row in INDEX.md.
  • Non-citable supporting files (case-file PDFs in a NARA box, individual witness testimony transcripts within a larger hearing, secondary-source PDFs) are tracked under the collection README's file.additional_files array, not as separate .md entries.

INDEX.md treatment: the collection's README appears as a row using the slug <date>_<doctype>_<slug>/README (with the trailing /README). Separately citable sub-documents appear as additional rows below the README row using the slug <date>_<doctype>_<slug>/<date>_<doctype>_<sub-slug>.

Cross-references (related_documents etc.): use the full slug path including /README for collection-level references and /<sub-slug> for sub-document references.

9. Confirmation copies and current-version markers

Two filename-prefix conventions are reserved for non-canonical companion files stored alongside the primary file:

9.1 _confirm_* prefix

A _confirm_* PDF is an independent second-source confirmation copy of the primary document, stored alongside the primary for verification audit purposes. Naming pattern:

_confirm_{date}_{primary-slug}_{secondary-publisher}.{ext}

Examples:

  • _confirm_1972-flood-v-kuhn-justia.pdf — Justia copy of Flood v. Kuhn alongside the primary LoC PDF.
  • _confirm_constitution_documentcloud.pdf — DocumentCloud copy of the MLB Constitution alongside the primary.
  • _confirm_jdptp_mlb_mirror.pdf — MLB-hosted mirror alongside the MLBPA-hosted primary of the Joint Drug Program.

Confirmation copies do NOT get their own .md metadata file. They are referenced from the primary document's source.confirmation_sources[].url (the URL they were retrieved from) and, where relevant, from file.additional_files (the local file inventory).

9.2 _current_* prefix

A _current_* PDF is a moving-target pointer to "whatever the current operative version of this document is, separate from any date-stamped historical version." This pattern is reserved for live policy/rule documents that get updated in place by the publisher.

Example: _current_rules_major-league-rules.pdf points to "the current operative MLR" even when versioned files (e.g., 2008-03-31_rules_major-league-rules-2008.pdf, 2026-03-17_rules_major-league-rules.pdf) are also in the archive.

Like _confirm_* files, _current_* files do NOT get their own metadata. They are tracked under the most recent versioned file's file.additional_files or in a notes field. Use sparingly — version-stamped files are preferred whenever the document has a meaningful version date.