Operational crawler platform monorepo. The active application is XCrawler — a Laravel 12 app that crawls metadata, normalizes catalog data into MySQL, indexes movies into Elasticsearch, and exposes a React/Inertia operations UI.
Companion package Scrapy is a Phase 1 standalone Python extractor in scrapy/. Observability is an active NestJS downstream service in observability/. XCrawler remains source of truth for catalog data, crawl logs, queues, and search indexing; production parsing still runs in Modules/Crawler PHP adapters.
Package root: composer, npm, and artisan run from xcrawler/. Docker lifecycle: scripts/docker-local.sh and scripts/docker-testing.sh from monorepo root.
git clone https://github.com/jooservices/XCrawler.git
cd XCrawler/xcrawler
cp .env.example .env
docker context use desktop-linux
bash ../scripts/docker-local.sh

App: http://127.0.0.1:8080 · Horizon: http://127.0.0.1:8080/horizon
Full install (Docker + host), prerequisites, and env wiring: Monorepo docs → XCrawler Docker.
| Package | Path | Status | Install |
|---|---|---|---|
| XCrawler | xcrawler/ | Active | Docker · Host |
| Scrapy | scrapy/ | Phase 1 | Docker · Host |
| Observability | observability/ | Active | docs |
Quality gate (required before commit/push):
bash scripts/docker-testing.sh gate

Details: CI and quality gates. Contributing: xcrawler/CONTRIBUTING.md.
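Because the gate is required before every commit or push, it can help to wrap it in a small helper that fails loudly when run from the wrong directory. The following is a minimal sketch, not something shipped in the repository: `run_gate` is a hypothetical function name, and the only assumption taken from this README is that `scripts/docker-testing.sh gate` is run from the monorepo root.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository): run the quality gate
# from a given monorepo root and propagate its exit status.
run_gate() {
  local root="${1:-.}"
  local script="$root/scripts/docker-testing.sh"

  # Fail early with a distinct status if the gate script is missing,
  # which usually means the path is not the monorepo root.
  if [ ! -f "$script" ]; then
    echo "gate script not found: $script" >&2
    return 2
  fi

  # Run in a subshell so the caller's working directory is unchanged.
  (cd "$root" && bash scripts/docker-testing.sh gate)
}
```

One possible use is a local Git `pre-push` hook that sources this function and calls `run_gate "$(git rev-parse --show-toplevel)" || exit 1`, so that Git aborts the push whenever the gate fails.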
AI-assisted work: AGENTS.md → xcrawler/AGENTS.md.
| Audience | Start here |
|---|---|
| Monorepo (install, integration, CI) | docs/README.md |
| XCrawler app (architecture, ops, dev) | xcrawler/docs/README.md |
| Roadmap | xcrawler/ROADMAP.md |
.
├── docs/ # monorepo docs (install, integration, CI)
├── AGENTS.md # monorepo agent router
├── xcrawler/ # Laravel app (package root)
├── scrapy/ # Python extractor (Phase 1 standalone)
├── observability/ # NestJS observability service
├── scripts/ # docker-local.sh, docker-testing.sh (monorepo Docker)
└── docker-compose.yml # includes xcrawler, observability, scrapy compose
Copyright (c) 2026 JOOservices. MIT License.