Reading Medium content fails with HTTP 403 #23

@clojj

When I start the guide app locally, loading content from Medium always fails with HTTP 403.
(I can browse these blog posts from the same machine.)

2025-12-31T16:38:13.125+01:00  INFO 12107 --- [           main] .e.a.r.i.UrlSpecificContentRefreshPolicy : Checking whether to reread content at uri=https://medium.com/@springrod/context-engineering-needs-domain-understanding-b4387e8e4bf8 : existsInRepository=false, shouldRefreshUri=false => shouldReread=true
2025-12-31T16:38:13.169+01:00  WARN 12107 --- [           main] c.e.a.r.i.TikaHierarchicalContentReader  : Received HTTP 403 for URL: https://medium.com/@springrod/context-engineering-needs-domain-understanding-b4387e8e4bf8
2025-12-31T16:38:13.169+01:00 ERROR 12107 --- [           main] com.embabel.guide.rag.DataManager        : ❌ Failure loading URL https://medium.com/@springrod/context-engineering-needs-domain-understanding-b4387e8e4bf8: Server returned HTTP response code: 403 for URL: https://medium.com/@springrod/context-engineering-needs-domain-understanding-b4387e8e4bf8

java.io.IOException: Server returned HTTP response code: 403 for URL: https://medium.com/@springrod/context-engineering-needs-domain-understanding-b4387e8e4bf8
	at com.embabel.agent.rag.ingestion.TikaHierarchicalContentReader.parseUrl(TikaHierarchicalContentReader.kt:92)
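
One possible cause, which the log above does not confirm: Java's `URLConnection` sends a default `User-Agent` of the form `Java/<version>`, and some sites reject requests that do not look like they come from a browser. A minimal sketch to check this outside the app follows. `BrowserLikeFetch`, `open`, and the `USER_AGENT` value are hypothetical names and values chosen for illustration; they are not part of the guide app or of `TikaHierarchicalContentReader`.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

// Hypothetical helper for reproducing the request outside the guide app.
final class BrowserLikeFetch {

    // Assumed browser-like value; no particular string is guaranteed to be accepted.
    static final String USER_AGENT =
            "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
            + "(KHTML, like Gecko) Chrome/120.0 Safari/537.36";

    // Prepares a GET request with explicit headers. Nothing is sent until the
    // caller reads from the connection (for example via getResponseCode()).
    static HttpURLConnection open(String url) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(url).toURL().openConnection();
        conn.setRequestMethod("GET");
        // Replace the JDK default "Java/<version>" User-Agent.
        conn.setRequestProperty("User-Agent", USER_AGENT);
        conn.setRequestProperty("Accept", "text/html,application/xhtml+xml");
        conn.setConnectTimeout(10_000);
        conn.setReadTimeout(10_000);
        return conn;
    }
}
```

Calling `BrowserLikeFetch.open(url).getResponseCode()` with the URL from the log would show whether the 403 depends on the request headers. If it still returns 403, the block is probably based on something other than the `User-Agent`, such as bot protection that a header change alone will not get past.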
