Backfill related_ingredients in 8 more communities (#30) #144

Merged
realmarcin merged 1 commit into main from backfill/related-ingredients-batch8e on Jun 14, 2026

@realmarcin

Contributor

Continues the #30 related_ingredients backfill: 22 CHEBI-grounded ingredients across 8 communities, under the strict no-fabrication protocol. These files are the thinner tail (abstract-only caches), so several correctly yield only 1–3 entries.

| File | # |
| --- | --- |
| Saccharomyces_Acinetobacter_Lignocellulose_Detox_Coculture | 3 |
| Clostridium_Saccharomyces_Cellulose_Ethanol_Coculture | 3 |
| Synechococcus_Halomonas_Light_Driven_PHB_Coculture | 3 |
| MAMC_M48_Lignocellulose | 4 |
| Synechococcus_Bacillus_SPC | 1 |
| Synechococcus_Ecoli_SPC | 2 |
| Synechococcus_Yarrowia_SPC | 1 |
| Sulfide_Spring_Autotrophic_CPR_Biofilm | 5 |

Protocol (enforced + independently verified)

  • CHEBI ids OAK-verified, canonical labels exact → 22/22 labels canonical.
  • Every snippet is a verbatim contiguous substring of a cited+cached reference (full-text .md preferred) → 22/22 snippets exact.
  • Compounds not named in the cited+cached text were omitted (no fabrication).
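The protocol checks above could be sketched roughly as follows. This is an illustration only, not the script used for this PR: the `Ingredient` class, the `check_entries` helper, the placeholder id `CHEBI:EXAMPLE1`, and the sample reference text are all hypothetical, and the canonical-label map is assumed to have been fetched beforehand (for example via OAK).

```python
from dataclasses import dataclass


@dataclass
class Ingredient:
    """Hypothetical in-memory view of one related_ingredients entry."""
    chebi_id: str
    label: str
    snippet: str


def is_verbatim(snippet: str, reference_text: str) -> bool:
    """True only if the snippet is a non-empty, character-exact,
    contiguous substring of the cached reference text."""
    return bool(snippet) and snippet in reference_text


def check_entries(entries, reference_text, canonical_labels):
    """Return a list of (chebi_id, reason) for every failed check.

    canonical_labels maps CHEBI id -> canonical label and is assumed
    to come from an ontology lookup done elsewhere.
    """
    failures = []
    for entry in entries:
        if canonical_labels.get(entry.chebi_id) != entry.label:
            failures.append((entry.chebi_id, "label not canonical"))
        if not is_verbatim(entry.snippet, reference_text):
            failures.append((entry.chebi_id, "snippet not found verbatim"))
    return failures


# Hypothetical sample data; neither the id nor the text is from the PR.
reference_text = "Sucrose secreted by the cyanobacterium supports heterotroph growth."
canonical_labels = {"CHEBI:EXAMPLE1": "sucrose"}
entries = [
    Ingredient("CHEBI:EXAMPLE1", "sucrose", "Sucrose secreted by the cyanobacterium"),
]
print(check_entries(entries, reference_text, canonical_labels))
```

Plain substring matching with no whitespace or case normalization is deliberate in this sketch: any normalization would let a paraphrased snippet pass as "exact".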

Validation

  • linkml-validate all 8 → exit 0
  • 22/22 labels canonical; 22/22 snippets exact
  • additions-only (316 insertions)
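One way to confirm an additions-only change is to inspect the output of `git diff --numstat`, which prints one `added<TAB>deleted<TAB>path` line per file. The sketch below assumes that output has already been captured as text. The helper names and the sample paths and counts are hypothetical, not taken from this PR.

```python
def deletions_by_file(numstat: str) -> dict[str, int]:
    """Map each path with at least one deleted line to its deletion count.

    Expects text in `git diff --numstat` format. Binary files are
    reported by git as "-<TAB>-<TAB>path"; they are skipped here, so
    this sketch does not judge binary changes.
    """
    result = {}
    for line in numstat.splitlines():
        if not line.strip():
            continue
        added, deleted, path = line.split("\t", 2)
        if deleted == "-":
            continue
        if int(deleted) > 0:
            result[path] = int(deleted)
    return result


def is_additions_only(numstat: str) -> bool:
    """True when no text file in the diff has deleted lines."""
    return not deletions_by_file(numstat)


# Hypothetical numstat output for two files.
sample = "40\t0\tcommunities/example_a.yaml\n12\t0\tcommunities/example_b.yaml\n"
print(is_additions_only(sample))
```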

Adoption: 165 → 173 / 265.

🤖 Generated with Claude Code

Adds 22 CHEBI-grounded related_ingredients across 8 communities, strict
no-fabrication protocol (OAK-verified canonical CHEBI labels; every snippet a
verbatim substring of a reference already cited + cached in the file). Thinner
tail (abstract-only caches), so several files yield only 1-3 entries rather than
fabricated extras.

- Saccharomyces_Acinetobacter_Lignocellulose_Detox_Coculture: 3 (furfural, HMF,
  lactic acid)
- Clostridium_Saccharomyces_Cellulose_Ethanol_Coculture: 3 (cellulose, ethanol, O2)
- Synechococcus_Halomonas_Light_Driven_PHB_Coculture: 3 (sucrose, PHB, alginate)
- MAMC_M48_Lignocellulose: 4 (lignocellulose, cellulose, hemicellulose, lignin)
- Synechococcus_Bacillus_SPC: 1 (sucrose)
- Synechococcus_Ecoli_SPC: 2 (sucrose, PHB)
- Synechococcus_Yarrowia_SPC: 1 (sucrose)
- Sulfide_Spring_Autotrophic_CPR_Biofilm: 5 (elemental sulfur, sulfide,
  thiosulfate, sulfite, H2)

Verified: 22/22 labels canonical, 22/22 snippets exact substrings, all 8 pass
linkml-validate, additions-only.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@realmarcin realmarcin merged commit 21910eb into main Jun 14, 2026
3 checks passed
@realmarcin realmarcin deleted the backfill/related-ingredients-batch8e branch June 14, 2026 07:01