docs: remove docs code reference #674

Open
andreatgretel wants to merge 1 commit into main from andreatgretel/docs/remove-code-reference-docs

@andreatgretel andreatgretel commented May 18, 2026

📋 Summary

Removes the generated code reference docs from both MkDocs and Fern so the docs no longer publish or link to the retired API reference surface. This also removes the generation plumbing and adds publish-branch cleanup for archived Fern versions so stale reference pages do not survive in docs-website archives.

🔗 Related Issue

N/A

🔄 Changes

🗑️ Removed

  • Deleted the MkDocs docs/code_reference/** pages, Fern fern/versions/latest/pages/code_reference/** pages, mkdocstrings CSS, and py2fern normalization script.
  • Removed code reference nav/config, dependency entries, Make targets, workflow env, and ignored Fern artifacts.
  • Removed stale reference links from MkDocs/Fern concept and plugin docs, plus contributor and agent docs.

🔧 Changed

🔍 Attention Areas

⚠️ Reviewers: Please pay special attention to the following:

  • fern/scripts/fern-published-branch.py - During publish sync, archived Fern versions receive copies of the cleaned current versions of the affected concept/plugin pages, so stale reference links are removed from historical docs.

🧪 Testing

  • .venv/bin/ruff check --fix .
  • .venv/bin/ruff format .
  • make check-fern-docs passes with 0 errors and 2 existing warnings
  • .venv/bin/mkdocs build passes with existing docs warnings
  • git diff --check
  • Source keyword sweep for retired reference strings
  • docs-website dry-run sync plus make check-fern-docs
  • Claude review and follow-up found no actionable findings
  • make test passes (N/A - docs-only; not run)
  • Unit tests added/updated (N/A - no testable logic)
  • E2E tests added/updated (N/A - docs-only)

✅ Checklist

  • Follows commit message conventions
  • Commits are signed off (DCO)
  • Architecture docs updated (N/A - no architecture changes)

@andreatgretel andreatgretel marked this pull request as ready for review May 18, 2026 21:37
@andreatgretel andreatgretel requested a review from a team as a code owner May 18, 2026 21:37
@github-actions
Contributor

MkDocs preview: https://237e0a14.dd-docs-preview.pages.dev

Fern preview: https://nvidia-preview-pr-674.docs.buildwithfern.com/nemo/datadesigner

Fern previews include the docs-website version archive with PR changes synced into latest. Notebook tutorials are rendered without execution outputs in previews.

@github-actions
Contributor

PR #674 Review — docs: remove docs code reference

Summary

This is a docs-only PR (79 additions / 1690 deletions) that removes the
generated API reference surface from both publishing pipelines:

  • MkDocs: deletes docs/code_reference/**, drops mkdocstrings /
    mkdocstrings-python from pyproject.toml and uv.lock, removes
    the mkdocstrings CSS, and trims mkdocs.yml nav.
  • Fern: deletes fern/versions/latest/pages/code_reference/**, removes
    the Code Reference nav section from fern/versions/latest.yml, drops
    the libraries: block and all /code-reference/* redirect rules from
    fern/docs.yml, removes py2fern from deps, and deletes
    fern/scripts/normalize-py2fern-indexes.py.
  • Plumbing: removes the generate-fern-api-reference[-native] Make
    targets, the DOCS_PY2FERN workflow env, and the fern/code-reference/
    gitignore entry.
  • Concept/plugin pages: rewrites stale /code-reference/... links into
    short prose mentions (columns, custom_columns, model-configs,
    person_sampling, security, tool_use_and_mcp, validators,
    plugins/example, plugins/overview).
  • Agent docs: updates .agents/, CONTRIBUTING.md, fern/AGENTS.md,
    and fern/README.md so they no longer reference the retired surface.

The only behavioral change is in fern/scripts/fern-published-branch.py,
which now strips the Code Reference nav and code_reference page tree
from archived Fern versions during publish sync, and refreshes the
affected concept/plugin pages in those archives so the inline links
stripped on latest also disappear from historical docs.
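
As a rough illustration of the directory deletion and page refresh described above, here is a minimal sketch. It is not the script's actual code: the function name clean_archives, the constants, and the two example page paths are hypothetical, and the nav editing step is left out.

```python
from __future__ import annotations

import shutil
from pathlib import Path

# Hypothetical constants; the real script's names and values may differ.
RETIRED_DIR = "code_reference"
REFRESHED_PAGES = ["concepts/columns.mdx", "plugins/overview.mdx"]


def clean_archives(versions_root: Path) -> list[Path]:
    """Delete retired page trees, then refresh selected pages in v* archives."""
    removed: list[Path] = []
    # Step 1: drop every <version>/pages/code_reference directory.
    for retired in sorted(versions_root.glob(f"*/pages/{RETIRED_DIR}")):
        if retired.is_dir():
            shutil.rmtree(retired)
            removed.append(retired)
    # Step 2: overlay the cleaned pages from latest onto each archived version.
    latest_pages = versions_root / "latest" / "pages"
    for archive_pages in sorted(versions_root.glob("v*/pages")):
        for rel in REFRESHED_PAGES:
            source = latest_pages / rel
            if not source.is_file():
                continue  # nothing to refresh from
            target = archive_pages / rel
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(source, target)
    return removed
```

The sketch only overwrites pages that exist on latest, so an archive never loses a page it already had; whether the real script makes the same choice is not stated in the PR.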

Findings

Correctness

  • remove_retired_reference_archive flow looks sound. It runs after
    clear_published_tree + source copy + merge_preserved_versions, so
    the published tree at this point is: source latest (no
    code_reference) + preserved v* versions (which may still have
    code_reference). The script (a) strips the nav block from every
    v*.yml, (b) deletes any */pages/code_reference directory under
    versions/, and (c) overlays the cleaned-on-latest versions of the
    9 affected concept/plugin pages into each v*/pages/. That's a
    consistent end state.
  • remove_navigation_section shares the same end-of-block heuristic
    as extract_/replace_navigation_section (the next non-empty line
    that startswith(" - ")). For a section that is last in the file,
    end falls through to len(lines), which is the desired behavior. ✅
  • glob("v*/pages") is intentionally narrow — it only matches
    version directories whose names start with v, matching the
    REDIRECT_VERSION_RE convention elsewhere in this file. If a future
    archive uses an older-versions/... shape, page refreshes there would
    be skipped silently. Not a regression for this PR; worth noting if
    archive naming ever broadens.
  • glob(f"*/pages/{RETIRED_REFERENCE_DIR}") would also match
    latest/pages/code_reference, but latest no longer contains that
    directory after the source copy, so the broader glob is harmless and
    keeps the cleanup robust against a stray re-add.
  • Redirect removal is a deliberate trade-off. All the
    /nemo/datadesigner/code_reference/* → /code-reference/* redirects
    in fern/docs.yml are deleted. Users following indexed search
    results to /code_reference/... will now get 404s instead of being
    redirected to the (nonexistent) new code-reference pages. Since the
    destination is also gone, redirecting wouldn't help, but you may want
    to consider a single redirect of the code_reference root to
    /concepts/columns or the API overview page. Not blocking; a product
    call.
  • fern-published-branch.py lines 18-19 use split string literals
    ("Code " + "Reference", "code" + "_reference"). This is a
    workaround for the "source keyword sweep for retired reference
    strings" check listed in the PR's testing checklist. It works, but
    it's the kind of cleverness that future maintainers will revert
    without realizing why. A one-line # noqa-style comment explaining
    the sweep would prevent that, e.g.
    # Split to satisfy the retired-reference keyword sweep; do not collapse.
    Optional.
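
To make the end-of-block heuristic concrete, here is a minimal sketch of how such a helper could work. The signature, the item_prefix default, and the "section:" header format are assumptions for illustration; the real remove_navigation_section may differ.

```python
from __future__ import annotations


def remove_navigation_section(text: str, title: str, item_prefix: str = "  - ") -> str:
    """Drop the top-level nav item titled `title`, including its nested lines."""
    lines = text.splitlines(keepends=True)
    header = f"{item_prefix}section: {title}"
    start = next(
        (i for i, line in enumerate(lines) if line.rstrip() == header),
        None,
    )
    if start is None:
        return text  # nothing to remove
    # The block ends at the next sibling item. Nested lines are indented
    # deeper, so they never start with the sibling prefix.
    end = len(lines)  # default covers a section that is last in the file
    for i in range(start + 1, len(lines)):
        if lines[i].startswith(item_prefix):
            end = i
            break
    return "".join(lines[:start] + lines[end:])
```

Working on raw lines rather than a parsed YAML tree keeps comments and formatting in the rest of the file untouched, at the cost of relying on consistent indentation.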

Conventions

  • Matches the surrounding style of fern-published-branch.py:
    module-level constants, from __future__ import annotations, modern
    type annotations (list[str], re.Pattern[str]), no relative
    imports, PublishedBranchError for failures. ✅
  • Concept-page rewrites preserve voice and Markdown link conventions
    used elsewhere in fern/versions/latest/pages/concepts/.
  • pyproject.toml and uv.lock are kept in sync; transitive removal
    of astroid (mkdocstrings → griffe → astroid) is correctly reflected
    in the lockfile.
  • Makefile .PHONY list is kept in sync with the deleted targets.
  • Typo in the existing source line at
    fern/versions/latest/pages/concepts/person_sampling.mdx:43
    ("For mor details") is replaced rather than corrected — fine for this
    PR, but a free fix you could land alongside.

Performance

  • No runtime/library performance impact. Dropping the API reference
    build step should modestly speed up make check-fern-docs and the
    docs-preview workflow.

Test coverage

  • Docs-only; no logic tests are required for the deletions.
  • fern-published-branch.py has no unit tests in this repo (pre-existing
    state). The new remove_retired_reference_archive is therefore
    exercised only by the publish dry-run noted in the PR checklist
    ("docs-website dry-run sync plus make check-fern-docs"). Adding a
    small pytest around the YAML mutation helpers would be a low-cost
    follow-up but is out of scope here.
  • The PR explicitly verifies make check-fern-docs (0 errors, 2
    pre-existing warnings) and mkdocs build. Adequate for a docs PR.

Security

  • No secrets, no new network calls, no executable changes outside the
    publish-sync script. shutil.rmtree is constrained to paths inside
    published_root / "fern" / "versions" (the script's own temp
    workspace), so no risk of overreach.
  • No prompt-injection / external-content concerns.
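
For reference, the containment property noted for shutil.rmtree can be enforced explicitly. The following is a sketch only: rmtree_within is a hypothetical helper, and ValueError stands in for the script's PublishedBranchError.

```python
from __future__ import annotations

import shutil
from pathlib import Path


def rmtree_within(root: Path, target: Path) -> None:
    """Remove `target` only if it resolves to a location strictly inside `root`."""
    resolved_root = root.resolve()
    resolved_target = target.resolve()
    # Resolving first defeats ".." segments and symlinks that point elsewhere.
    if resolved_target == resolved_root or not resolved_target.is_relative_to(resolved_root):
        raise ValueError(f"refusing to delete {resolved_target}: outside {resolved_root}")
    shutil.rmtree(resolved_target)
```

Path.is_relative_to requires Python 3.9 or newer.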

Risks / things to double-check after merge

  1. Inbound links from external sources (Google, blog posts,
    internal NVIDIA wiki) pointing at /code_reference/... or
    /code-reference/... will 404. If telemetry shows non-trivial hits,
    consider a single catch-all redirect to a relevant concept page.
  2. docs-website archive cleanup runs only at next publish. Until
    then, archived versions on the live site still surface broken
    /code-reference/... links from their concept pages. The PR's
    approach (refresh from latest on publish) handles this on the
    next run; just be aware the gap is one publish cycle.
  3. fern/AGENTS.md still references code-reference/ in some
    commentary (worth a final grep before merging).
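
The final grep suggested in item 3 could be scripted along these lines. This is a hypothetical sketch, not the sweep the PR actually ran: the function name, the needle list, and the default suffixes are assumptions. The needles are assembled from fragments so the script does not flag itself.

```python
from __future__ import annotations

from pathlib import Path

# Needles are built from fragments so this file does not match its own sweep.
RETIRED_NEEDLES = ("code" + "_reference", "code" + "-reference")


def sweep(
    root: Path, suffixes: tuple[str, ...] = (".md", ".mdx", ".yml")
) -> list[tuple[Path, int, str]]:
    """Return (path, line number, line) for every line mentioning a retired string."""
    hits: list[tuple[Path, int, str]] = []
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if any(needle in line for needle in RETIRED_NEEDLES):
                hits.append((path, number, line.strip()))
    return hits
```

An empty result from sweep(Path("fern")) would indicate that no file with a matching suffix under fern/ still mentions the retired surface.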

Verdict

Approve / non-blocking comments only. This is a clean, well-scoped
removal of a retired surface. The sole logic change in
fern-published-branch.py is straightforward and consistent with the
existing nav-mutation helpers. Two optional follow-ups: (a) add a
brief comment explaining the split string literals, and (b) consider a
single catch-all redirect to soften the 404 cliff for external
inbound links. Neither blocks merge.

@greptile-apps
Contributor

greptile-apps Bot commented May 18, 2026

Greptile Summary

This PR removes the generated code reference documentation from both MkDocs and Fern, retiring the py2fern/mkdocstrings pipeline and all associated nav entries, redirects, dependency declarations, and Makefile targets. The fern-published-branch.py publish script is updated to strip the retired reference section from archived version nav files, delete archived code_reference page directories, and refresh the concept/plugin pages that previously contained now-broken links.

  • Deleted 30+ docs/code_reference/** and fern/versions/latest/pages/code_reference/** files, removed mkdocstrings CSS, the py2fern normalization script, and all related pyproject.toml/Makefile/workflow dependencies.
  • Replaced in-doc hyperlinks to the retired code reference with plain text or tutorial references in concept and plugin pages.
  • Added remove_retired_reference_archive to the publish sync pipeline, which removes Code Reference nav entries from all archived version .yml files, deletes their code_reference page dirs, and copies the cleaned concept/plugin pages into each archived version.

Confidence Score: 5/5

Safe to merge; all changes are documentation and tooling removals with no impact on runtime code.

The publish script's new remove_retired_reference_archive correctly sequences nav cleanup, directory deletion, and page refresh before materialize_version_nav_pages runs, so archived versions will have no dangling code-reference nav entries or page files. All dependency, config, and nav removals are internally consistent across all 86 changed files.

fern/scripts/fern-published-branch.py is the only file with meaningful logic changes and is worth a close read, but the implementation is straightforward.

Important Files Changed

| Filename | Overview |
| --- | --- |
| fern/scripts/fern-published-branch.py | Core logic change: replaces sync_code_reference_archive with remove_retired_reference_archive; new remove_navigation_section helper correctly identifies YAML nav section boundaries. String constants deliberately obfuscated to avoid self-matching in keyword sweep. |
| fern/docs.yml | Removes the libraries: block and all code_reference redirects; surviving redirects are not affected. |
| fern/versions/latest.yml | Removes the entire Code Reference section (70 lines) from navigation; no other nav entries affected. |
| mkdocs.yml | Removes mkdocstrings plugin config, code_reference nav entries, watch paths, and mkdocstrings.css; clean removal with no dangling references. |
| pyproject.toml | Drops mkdocstrings-python, mkdocstrings, and py2fern from docs dependencies; consistent with removal of all generation tooling. |
| .github/workflows/docs-preview.yml | Removes DOCS_PY2FERN env var arg from check-fern-docs make call; safe removal. |
| Makefile | Removes generate-fern-api-reference targets and associated variable definitions; prepare-fern-docs now only depends on generate-fern-notebooks. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[sync_source called] --> B[clear_published_tree]
    B --> C[copytree source to published]
    C --> D[merge_preserved_versions\nrestores archived v*.yml + page dirs]
    D --> E[remove_retired_reference_archive]
    E --> E1[For each archived v*.yml:\nremove_navigation_section\nCode Reference]
    E --> E2[Delete pages/code_reference dirs\nfrom archived versions]
    E --> E3[Copy clean concept/plugin pages\nfrom latest to v*/pages/]
    E1 --> F[materialize_version_nav_pages]
    E2 --> F
    E3 --> F
    F --> G[restore_versions_block]
    G --> H[validate_redirect_targets]
    H --> I[write_publish_metadata]

Reviews (1): Last reviewed commit: "docs: remove docs code reference"
