This script automates the extraction of car listings from Wallapop using Selenium. It collects data from 10 product pages and saves it into a structured JSON file.
- Install dependencies
  Make sure you have Python 3.11+ and Chrome installed. Then install Selenium: pip install selenium
- Run the script
  From the project folder, run: python main_wallapop.py
- Output
  A file named products_wallapop.json will be saved in the /output folder.
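As a rough illustration of the output step, the sketch below writes a list of product dictionaries to the output folder. The function name `save_products` and its parameters are hypothetical and not taken from main_wallapop.py; only the file name comes from the description above, and the folder is assumed to be relative to the project directory.

```python
import json
from pathlib import Path


def save_products(products, out_dir="output", filename="products_wallapop.json"):
    """Write the scraped records to <out_dir>/<filename> and return the path."""
    out_path = Path(out_dir) / filename
    # Create the output folder on first run instead of failing.
    out_path.parent.mkdir(parents=True, exist_ok=True)
    with out_path.open("w", encoding="utf-8") as fh:
        # ensure_ascii=False keeps characters such as the euro sign readable.
        json.dump(products, fh, ensure_ascii=False, indent=2)
    return out_path
```

Calling `save_products(products)` once, after all pages have been visited, keeps the file valid JSON even if some pages were skipped.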
- Selenium + ChromeDriver chosen for compatibility with JavaScript-rendered pages.
- Explicit waits (WebDriverWait) used instead of fixed delays to ensure element presence.
- Resilient scraping: uses find_elements + fallback defaults to avoid breaking on missing content.
- Simple JSON structure: clean and portable for further processing.
- Randomized user behavior: delay and interaction simulation for realism.
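The waiting and fallback ideas above could look roughly like the following sketch. The helper names (`wait_for`, `first_text`, `extract_product`) and every CSS selector are placeholders invented for illustration; the real selectors depend on Wallapop's current markup and are not given here. The image field is left out to keep the example short.

```python
CSS = "css selector"  # same string value as selenium's By.CSS_SELECTOR


def wait_for(driver, css, timeout=10):
    """Block until an element matching `css` is present, or raise TimeoutException."""
    # Imported here so the helpers below can be used without Selenium installed.
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    return WebDriverWait(driver, timeout).until(
        EC.presence_of_element_located((CSS, css))
    )


def first_text(parent, css, default="N/A"):
    """Return the stripped text of the first match, or `default` if none exists."""
    # find_elements returns an empty list instead of raising when nothing matches.
    matches = parent.find_elements(CSS, css)
    if not matches:
        return default
    return matches[0].text.strip() or default


def extract_product(driver, url):
    """Build one record; the selectors are hypothetical placeholders."""
    return {
        "title": first_text(driver, "h1"),
        "price": first_text(driver, "[class*='price']"),
        "description": first_text(driver, "[class*='description']"),
        "url": url,
    }
```

A typical call order would be `wait_for(driver, "h1")` right after loading a product page, followed by `extract_product(driver, url)`, so that a missing description or price yields a default value rather than an exception.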
To simulate human behavior and reduce detection:
- User-Agent spoofing: Custom desktop browser signature is applied.
- Real-time navigation: Browser accepts cookies, scrolls, and opens each product page individually.
- Randomized delays: A random pause of 1–4 seconds is inserted between requests.
- Browser dimensions: Configured to mimic real user screens (1920x1080).
- Try/except for fault tolerance: Script continues even if one page fails.
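One way the measures listed above might be wired together is sketched below. The user-agent string, the function names, and the use of `random.uniform` are assumptions for illustration; the list above only states that a custom desktop signature, a 1920x1080 window, and 1–4 second delays are used.

```python
import random
import time

# Placeholder desktop signature; the string the script actually sends is not documented.
USER_AGENT = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
)


def chrome_arguments(user_agent=USER_AGENT, width=1920, height=1080):
    """Return the Chrome command-line switches for the spoofed UA and window size."""
    return [f"--user-agent={user_agent}", f"--window-size={width},{height}"]


def make_driver():
    """Start Chrome with the switches above (requires Selenium and Chrome)."""
    from selenium import webdriver  # imported lazily; only needed when scraping

    options = webdriver.ChromeOptions()
    for arg in chrome_arguments():
        options.add_argument(arg)
    return webdriver.Chrome(options=options)


def human_pause(min_s=1.0, max_s=4.0, sleep=time.sleep):
    """Sleep for a random interval in [min_s, max_s] and return its length."""
    delay = random.uniform(min_s, max_s)
    sleep(delay)
    return delay
```

Calling `human_pause()` before each product page is opened spreads requests unevenly over time; the `sleep` parameter exists only so the pause can be replaced in tests.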
Example JSON entry:
[
{
"title": "BMW Serie 3 1997",
"price": "4500 €",
"description": "BMW 318TDS in good condition, always in garage. Everything original. Includes 17\" wheels.",
"image": "https://cdn.wallapop.com/images/...",
"url": "https://es.wallapop.com/item/..."
}
]
Optional screenshot snippet:
driver.save_screenshot(f"output/screenshot_{i}.png")
- Console logs inform about cookies, extracted product count, and any errors.
- No unit tests included, since the script relies on interactive, stateful page content.
- Exceptions are handled per product to prevent full script failure.
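The per-product error handling and console logging described above could be structured roughly as follows. `scrape_all` and its `scrape_one` callback are hypothetical names; the callback stands in for whatever function opens a product page and returns its record.

```python
def scrape_all(urls, scrape_one, log=print):
    """Scrape each URL in turn, logging and skipping any page that fails."""
    products = []
    for i, url in enumerate(urls, start=1):
        try:
            products.append(scrape_one(url))
        except Exception as exc:  # broad on purpose: one bad page must not stop the run
            log(f"[{i}/{len(urls)}] failed: {url} ({exc})")
    log(f"Extracted {len(products)} of {len(urls)} products")
    return products
```

Because the `try` block wraps a single product, a timeout or missing element on one listing costs only that record, and the final log line reports how many of the pages succeeded.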
This solution fulfills the test’s requirements:
- Scrapes a real-world JavaScript site
- Demonstrates basic anti-bot evasion
- Includes user interaction (clicks, navigation, scroll)
- Produces clean JSON output
The script is ready to be extended with additional features such as proxy rotation, screenshot capture, or headless execution.