Efficiently extract product, category, and search data from Costco using Python and Playwright. These scrapers are designed to handle dynamic content and modern web architectures, ensuring reliable data collection for e-commerce analysis and monitoring.
This directory contains Python scrapers built with Playwright.
Playwright is a powerful browser automation library that offers several advantages for scraping modern, JavaScript-heavy sites like Costco:
- Dynamic Content Rendering: Unlike static scrapers, Playwright executes JavaScript, allowing it to capture content that is loaded dynamically or rendered client-side.
- Headless & Headed Modes: Run scrapers in headless mode for maximum performance or headed mode for debugging and visual verification of the scraping process.
- Auto-Wait Functionality: Playwright automatically waits for elements to be actionable before performing tasks, significantly reducing flakiness caused by slow network responses or elements loading at different speeds.
- Network Interception: Easily monitor or block specific network requests (like images or trackers) to speed up scraping and reduce bandwidth usage.
- Browser Contexts: Supports isolated browser contexts, which are faster to create than full browser instances and help in managing sessions and cookies effectively.
- Performance: Generally faster and more resource-efficient than older automation tools like Selenium, with built-in support for modern web features.
- When to Use: Choose Playwright over BeautifulSoup when the target data is hidden behind JavaScript execution, and over Selenium when you need faster execution and a more modern developer experience.
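As a rough illustration of the features listed above, the following sketch combines headless mode, an isolated browser context, and network interception in a single function. It is not the code used by the scrapers in this directory: the names `should_block` and `fetch_rendered_html` and the set of blocked resource types are assumptions chosen for the example.

```python
# Resource types to skip. This particular set is an assumption for illustration.
BLOCKED_RESOURCE_TYPES = {"image", "media", "font"}


def should_block(resource_type: str) -> bool:
    """Return True if a request of this resource type should be aborted."""
    return resource_type in BLOCKED_RESOURCE_TYPES


def fetch_rendered_html(url: str, headless: bool = True) -> str:
    """Load a page in Chromium and return the HTML after JavaScript has run."""
    # Imported here so should_block() can be used without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=headless)
        # An isolated context keeps cookies and storage separate per run.
        context = browser.new_context()
        page = context.new_page()

        # Network interception: abort blocked resource types, pass the rest.
        page.route(
            "**/*",
            lambda route: route.abort()
            if should_block(route.request.resource_type)
            else route.continue_(),
        )

        page.goto(url, wait_until="domcontentloaded")
        html = page.content()
        context.close()
        browser.close()
        return html


# Example call (requires network access and installed browser binaries):
# html = fetch_rendered_html("https://www.costco.com/", headless=False)
```

Running `fetch_rendered_html` needs the Playwright browser binaries, which `playwright install chromium` downloads. Passing `headless=False` opens a visible browser window, which is useful when debugging.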
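- Python: Python 3.7 or higher (recent Playwright releases may require a newer Python version; check the Playwright documentation)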
- pip: Python's package installer
- ScrapeOps API Key: For anti-bot protection (free tier available)
- Navigate to the specific scraper directory:
    cd product_category # or product_data, product_search
- Install dependencies:
    pip install playwright beautifulsoup4
- Get your ScrapeOps API key from https://scrapeops.io/app/register/ai-scraper
- Update the API key in the scraper file:
    API_KEY = 'YOUR-API-KEY'
All scrapers can integrate with ScrapeOps to help handle Costco's anti-bot measures:
- Proxy rotation (may help reduce IP blocking)
- Request header optimization (can help reduce detection)
- Rate limiting management
Note: Anti-bot measures vary by site and may change over time. CAPTCHA challenges may occur and cannot be guaranteed to be resolved automatically. Using proxies and browser automation can help reduce blocking, but effectiveness depends on the target site's specific anti-bot measures.
Free Tier Available: ScrapeOps offers a generous free tier perfect for testing and small-scale scraping.
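One way to route a Playwright browser through a proxy is the `proxy` option of the browser launch call. The sketch below only builds that option as a dictionary. The helper name `build_launch_options`, the `SCRAPEOPS_API_KEY` environment variable, and the use of the API key as the proxy password are all assumptions for illustration. The proxy server address and username are not stated in this README, so they are left as required parameters; take the actual values from the ScrapeOps documentation.

```python
import os
from typing import Optional


def build_launch_options(
    proxy_server: str,
    proxy_username: str,
    api_key: Optional[str] = None,
    headless: bool = True,
) -> dict:
    """Build keyword arguments for Playwright's browser launch call.

    proxy_server and proxy_username are placeholders supplied by the caller.
    Using the API key as the proxy password is an assumption.
    """
    # Fall back to a (hypothetical) environment variable if no key is passed.
    key = api_key or os.environ.get("SCRAPEOPS_API_KEY", "")
    if not key:
        raise ValueError("No API key provided")
    return {
        "headless": headless,
        "proxy": {
            "server": proxy_server,
            "username": proxy_username,
            "password": key,
        },
    }


# Example use with Playwright (not executed here):
# browser = p.chromium.launch(**build_launch_options(server, username))
```

Reading the key from the environment keeps it out of version control, which is an alternative to editing `API_KEY` in the scraper file directly.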
All scrapers output data in JSONL format (one JSON object per line):
- Each line represents one product/result
- Efficient for large datasets
- Easy to process line-by-line
- Can be imported into databases or data processing tools
Example output files:
costco_com_product_category_page_scraper_data_20260114_120000.jsonl
costco_com_product_page_scraper_data_20260114_120000.jsonl
costco_com_product_search_page_scraper_data_20260114_120000.jsonl
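The JSONL format described above can be written and read with the standard library alone. The following is a minimal sketch, not the scrapers' own code: the helper names are hypothetical, the timestamp pattern is inferred from the example file names, and the record fields `name` and `price` are made up for illustration.

```python
import json
from datetime import datetime


def output_filename(prefix: str, now: datetime) -> str:
    """Build a timestamped file name such as <prefix>_YYYYMMDD_HHMMSS.jsonl."""
    return f"{prefix}_{now.strftime('%Y%m%d_%H%M%S')}.jsonl"


def append_record(path: str, record: dict) -> None:
    """Append one record to the file as a single JSON line."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record, ensure_ascii=False) + "\n")


def read_records(path: str):
    """Yield records one at a time, so large files need not fit in memory."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank lines
                yield json.loads(line)


# Example use (field names are hypothetical):
# path = output_filename("costco_com_product_page_scraper_data", datetime.now())
# append_record(path, {"name": "Example item", "price": 9.99})
# for record in read_records(path):
#     print(record["name"])
```

Because each line is an independent JSON object, a partially written file is still readable up to the last complete line.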
This repository provides multiple implementations for different use cases:
playwright/
- product_category/
  - example-data/
    - product_category.json
  - README.md
  - scraper/
    - costco_scraper_product_category_v1.py
- product_data/
  - example-data/
    - product_data.json
  - README.md
  - scraper/
    - costco_scraper_product_data_v1.py
- product_search/
  - example-data/
    - product_search.json
  - README.md
  - scraper/
    - costco_scraper_product_search_v1.py
- Respect Rate Limits: Use appropriate delays and concurrency settings
- Monitor ScrapeOps Usage: Track your API usage in the ScrapeOps dashboard
- Handle Errors Gracefully: Implement proper error handling and logging
- Validate URLs: Ensure URLs are valid Costco pages before scraping
- Update Selectors: Costco may change HTML structure; update selectors as needed
- Test Regularly: Test scrapers regularly to catch breaking changes early
- Handle Missing Data: Some products may not have all fields; handle null values appropriately
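Several of the practices above can be expressed as small helpers. The sketch below shows one possible approach to URL validation, retrying with exponential backoff, and filling in missing fields. All function names are hypothetical, and the check accepts only `costco.com` and its subdomains, which is an assumption based on the output file names.

```python
import time
from urllib.parse import urlparse


def is_costco_url(url: str) -> bool:
    """Return True for http(s) URLs on costco.com or one of its subdomains."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    on_costco = host == "costco.com" or host.endswith(".costco.com")
    return parsed.scheme in ("http", "https") and on_costco


def with_retries(func, attempts: int = 3, base_delay: float = 1.0, sleep=time.sleep):
    """Call func(), retrying on any exception with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return func()
        except Exception as exc:
            last_error = exc
            if attempt < attempts - 1:
                # Delay doubles each time: base_delay, 2 * base_delay, ...
                sleep(base_delay * 2 ** attempt)
    raise last_error


def normalise_record(raw: dict, fields) -> dict:
    """Keep only the expected fields, using None where a value is missing."""
    return {field: raw.get(field) for field in fields}
```

The `sleep` parameter exists so the backoff can be replaced in tests. In a real run a scraper might wrap its page-loading call, for example `with_retries(lambda: scrape(url))`, where `scrape` stands in for whatever function does the fetching.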
- ScrapeOps Documentation: https://scrapeops.io/docs/intro/
- Playwright Documentation: https://playwright.dev/
- Example Outputs: See the example-data/ folders in each scraper directory
These scrapers are provided as-is for educational and commercial use. Please ensure compliance with Costco's Terms of Service and robots.txt when using them.