Backfill Cable Bacteria related_ingredients via re-fetched abstract (#30) #147

Merged: realmarcin merged 1 commit into main from backfill/related-ingredients-cable-refetch on Jun 14, 2026

@realmarcin

Contributor

Part (b) — re-fetch fuller abstracts for high-value un-backfillable files.

Screened the un-backfilled set for references cached only as stubs/missing, then re-fetched 4 PMIDs via communitymech.literature:

| PMID | File | Outcome |
| --- | --- | --- |
| 27058505 | Cable_Bacteria_Photosynthetic_Biofilm_Sediment | Unlocked: full abstract has sulfur/oxygen chemistry |
| 38228683 | CeMbio | No chemistry (behaviour paper) |
| 34135464 | PMI_Variovorax | No chemistry (phenotypic-variation paper) |
| 19395564 | Yogurt | No chemistry (in-silico HGT/genomics paper) |

For the one that unlocked, I folded the full abstract into references_cache/PMID_27058505.md (the validator's primary cache, previously a stub) and added 4 CHEBI-grounded ingredients to the cable-bacteria community: sulfide, dioxygen, sulfate, and iron sulfide. The other 3 are confirmed un-backfillable: their re-fetched abstracts name no metabolite chemistry, so the scratch fetches were discarded and no files were changed.
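The screening step (finding references cached only as stubs or missing) could look roughly like the sketch below. It is an illustration, not the script used in this PR: the `PMID_<id>.md` file naming follows the cache path mentioned above, while the function names and the length threshold that defines a "stub" are hypothetical.

```python
from pathlib import Path

# Hypothetical threshold: a cache body shorter than this counts as a stub.
STUB_MAX_CHARS = 200


def classify_cache(cache_dir: Path, pmid: str,
                   stub_max_chars: int = STUB_MAX_CHARS) -> str:
    """Return 'missing', 'stub', or 'full' for one PMID's cache file."""
    path = cache_dir / f"PMID_{pmid}.md"
    if not path.exists():
        return "missing"
    text = path.read_text(encoding="utf-8").strip()
    return "stub" if len(text) < stub_max_chars else "full"


def refetch_candidates(cache_dir, pmids):
    """Keep only the PMIDs whose cache is missing or a stub, in input order."""
    cache_dir = Path(cache_dir)
    return [p for p in pmids if classify_cache(cache_dir, p) != "full"]
```

A length cutoff is only one possible definition of a stub; a real screen might instead look for a missing abstract section in the cached file.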

Validation

  • 4/4 labels OAK-canonical; 4/4 snippets verbatim substrings of the cited+cached refs
  • linkml-validate → exit 0
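
The "verbatim substring" check above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's validator: the function names, the `(label, pmid, snippet)` tuple layout, and the in-memory `cache` dictionary are all hypothetical.

```python
def snippet_is_verbatim(snippet: str, reference_text: str) -> bool:
    """True only if the non-empty snippet occurs character-for-character."""
    return bool(snippet) and snippet in reference_text


def check_snippets(entries, cache):
    """Return the labels whose snippet is not found in the cited reference.

    entries: iterable of (label, pmid, snippet) tuples (hypothetical layout).
    cache:   dict mapping pmid -> cached reference text (hypothetical layout).
    """
    failures = []
    for label, pmid, snippet in entries:
        # A reference absent from the cache is treated as empty text,
        # so any snippet citing it fails the check.
        text = cache.get(pmid, "")
        if not snippet_is_verbatim(snippet, text):
            failures.append(label)
    return failures
```

An exact `in` test is deliberately strict: differences in whitespace, case, or punctuation count as failures, which matches the "verbatim" wording of the check.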

Adoption: 181 → 182 / 265.

🤖 Generated with Claude Code

…ct (#30)

Part (b) of the backfill effort: re-fetch fuller abstracts for high-value
un-backfillable files, then curate.

Re-fetched PMID:27058505 (was a stub cache) and folded the full abstract into
references_cache/PMID_27058505.md (the validator's primary cache), unlocking the
cable-bacteria sulfur/oxygen chemistry. Added 4 CHEBI-grounded ingredients to
Cable_Bacteria_Photosynthetic_Biofilm_Sediment: sulfide (CHEBI:15138), dioxygen
(CHEBI:15379), sulfate (CHEBI:16189), iron sulfide (CHEBI:75896) — all snippets
verbatim from cited+cached references, all labels OAK-canonical.

Also re-fetched PMID:38228683 (CeMbio), PMID:34135464 (PMI Variovorax), and
PMID:19395564 (Yogurt) but their full abstracts name no metabolite chemistry
(behaviour / phenotypic-variation / in-silico HGT papers) — confirmed genuinely
un-backfillable, no changes made.

Verified: 4/4 labels canonical, 4/4 snippets exact, linkml-validate passes.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@realmarcin merged commit 0976eff into main on Jun 14, 2026
3 checks passed
@realmarcin deleted the backfill/related-ingredients-cable-refetch branch on June 14, 2026 at 07:39