Multi-platform job scraper for Upwork and Freelancer.com with automatic Cloudflare bypass. Extracts job listings via browser automation (DrissionPage) and API calls (curl_cffi), stores results in SQLite.
- Upwork scraping: DrissionPage + Chromium, bypasses Cloudflare JS challenges, extracts `window.__NUXT__` state
- Freelancer.com scraping: curl_cffi with Chrome TLS fingerprint impersonation, no browser needed
- 20 keyword categories: auto-scrapes across AI, automation, scraping, and development niches
- SQLite storage: deduplication, status tracking, filtering
- CLI interface: `python -m src.main <command>`
- REST API: optional FastAPI server for querying results
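
The SQLite deduplication and status tracking listed above could look roughly like the sketch below. It is not the project's actual `database.py`: the table layout, column names, and the helper functions `open_db`, `save_job`, `set_status`, and `jobs_by_status` are all hypothetical. It assumes a job is uniquely identified by its platform and platform-specific ID, so a composite primary key plus `INSERT OR IGNORE` is enough to skip duplicates.

```python
import sqlite3

# Hypothetical schema: one row per (platform, job_id) pair.
SCHEMA = """
CREATE TABLE IF NOT EXISTS jobs (
    platform TEXT NOT NULL,
    job_id   TEXT NOT NULL,
    title    TEXT NOT NULL,
    url      TEXT,
    status   TEXT NOT NULL DEFAULT 'new',
    PRIMARY KEY (platform, job_id)
)
"""


def open_db(path=":memory:"):
    """Open (or create) the database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def save_job(conn, platform, job_id, title, url=None):
    """Insert a job; return True if it was new, False if already stored."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO jobs (platform, job_id, title, url) "
        "VALUES (?, ?, ?, ?)",
        (platform, job_id, title, url),
    )
    conn.commit()
    # rowcount is 0 when the primary key already existed and the row was ignored.
    return cur.rowcount == 1


def set_status(conn, platform, job_id, status):
    """Update the tracking status of a stored job (e.g. 'new' -> 'applied')."""
    conn.execute(
        "UPDATE jobs SET status = ? WHERE platform = ? AND job_id = ?",
        (status, platform, job_id),
    )
    conn.commit()


def jobs_by_status(conn, status):
    """Return (platform, job_id, title) tuples for jobs with the given status."""
    rows = conn.execute(
        "SELECT platform, job_id, title FROM jobs "
        "WHERE status = ? ORDER BY platform, job_id",
        (status,),
    )
    return rows.fetchall()
```

Calling `save_job` twice with the same platform and ID keeps the first row and returns `False` the second time, so a scraper can run repeatedly without producing duplicate rows.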
# Install
python -m venv venv && source venv/bin/activate
pip install -r requirements.txt
# Configure
# Edit config.yaml to set keywords & delay
# Scrape Freelancer.com (fast, no browser)
python -m src.main freelancer
# Scrape Upwork (needs Chromium + display)
python -m src.main upwork
# Scrape a single job detail
python -m src.main detail <url>
# View results
python -m src.main stats
python -m src.main jobs

src/
├── main.py              # CLI entry point
├── config.py            # Config loader
├── database.py          # SQLite operations
├── job_filter.py        # Relevance filtering
├── scrapers/
│   ├── base.py          # Shared browser setup (DrissionPage)
│   ├── upwork.py        # Upwork Nuxt extraction
│   ├── freelancer.py    # Freelancer.com API (curl_cffi)
│   └── job_detail.py    # Individual job detail scrape
├── models/
│   └── job.py           # Job dataclass + parsers
└── utils/
    ├── logger.py
    ├── clean.py
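
To illustrate how the `Job` dataclass and the relevance filter in the layout above might fit together, here is a minimal sketch. It does not reproduce the real `job.py` or `job_filter.py`: the `Job` fields, the `CATEGORIES` mapping, and the functions `matched_categories` and `is_relevant` are hypothetical, and the three categories shown are placeholders for the project's 20 keyword categories.

```python
import re
from dataclasses import dataclass


@dataclass
class Job:
    """Hypothetical minimal job record; the real dataclass likely has more fields."""
    title: str
    description: str = ""


# Placeholder categories and keywords, chosen only for illustration.
CATEGORIES = {
    "ai": ["machine learning", "llm", "chatbot"],
    "scraping": ["web scraping", "scraper", "crawler"],
    "automation": ["automation", "workflow"],
}


def matched_categories(job, categories=CATEGORIES):
    """Return the names of categories whose keywords appear in the job text."""
    text = f"{job.title} {job.description}".lower()
    hits = []
    for name, keywords in categories.items():
        # Word boundaries avoid matching a keyword inside a longer word.
        if any(re.search(rf"\b{re.escape(kw)}\b", text) for kw in keywords):
            hits.append(name)
    return hits


def is_relevant(job, categories=CATEGORIES):
    """A job is considered relevant if it matches at least one category."""
    return bool(matched_categories(job, categories))
```

Matching on whole words keeps the filter simple and predictable; a scoring scheme that weights title hits above description hits would be a natural refinement.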
- Python 3.10+
- DrissionPage (Chromium browser automation)
- curl_cffi (Chrome TLS fingerprint impersonation)
- FastAPI (optional API server)
- SQLite
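
As a rough sketch of the curl_cffi dependency in use, the snippet below sends a GET request with a Chrome TLS fingerprint and returns the decoded JSON. The endpoint URL and the `query`, `limit`, and `offset` parameters are assumptions based on Freelancer.com's public projects API, not something this project confirms, and `build_params` and `fetch_projects` are hypothetical helper names. It also assumes a curl_cffi version that accepts the generic `"chrome"` impersonation target.

```python
# Assumed endpoint (Freelancer.com public "active projects" search);
# the project's own freelancer.py may call a different URL.
API_URL = "https://www.freelancer.com/api/projects/0.1/projects/active/"


def build_params(keyword, limit=20, offset=0):
    """Build the query-string parameters for a keyword search (assumed names)."""
    return {"query": keyword, "limit": limit, "offset": offset}


def fetch_projects(keyword, limit=20):
    """Fetch one page of search results and return the decoded JSON body."""
    # Imported lazily so build_params stays usable without curl_cffi installed.
    from curl_cffi import requests

    resp = requests.get(
        API_URL,
        params=build_params(keyword, limit),
        impersonate="chrome",  # present a Chrome-like TLS fingerprint
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()
```

Because no browser is involved, a call such as `fetch_projects("web scraping", limit=5)` is a single HTTP request, which is why the Freelancer.com path is the fast one.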