AliExpress Product Scraper

A web interface for scraping product data from AliExpress through its unofficial API.

Screenshots

Search Interface

Results and Field Selection

Features

🌐 Web Interface: Clean and intuitive UI for easy interaction
🚀 API-Based Scraping: Fast and efficient data collection using AliExpress's unofficial API
🔒 Smart Session Management: Uses browser automation only for initial cookie collection
🛡️ Anti-Block Protection:
- Configurable delay between requests (0.2-10 seconds)
- Sequential request processing to avoid overwhelming the server
- Session caching to minimize browser automation
📊 Flexible Data Export:
- JSON format for full data preservation
- CSV format for easy spreadsheet import
🎯 Customizable Fields: Select exactly which product details to extract
🔍 Advanced Filtering:
- Price range filtering
- Discount deals filter
- Free shipping filter
📝 Real-time Progress: Live logging of the scraping process

How It Works

Smart Session Handling:
- First visit uses a headless browser to collect necessary cookies
- Subsequent requests use cached session data (30-minute validity)
- Minimizes the need for browser automation
Efficient API Scraping:
- Uses AliExpress's internal API for data collection
- Typically faster and less brittle than parsing rendered HTML
- Reduces the chance of being blocked
Data Processing:
- Extracts clean, structured data
- Handles currency formatting
- Processes URLs and image links
- Manages pagination automatically
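
The session handling described above can be sketched as a small time-based cache. This is a minimal illustration under stated assumptions, not the project's actual code: `SessionCache`, `fetch_cookies`, and `clock` are hypothetical names, and `fetch_cookies` stands in for whatever launches the headless browser. Only the 30-minute validity comes from this README.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional

SESSION_TTL_SECONDS = 30 * 60  # 30-minute validity, as stated in this README


@dataclass
class SessionCache:
    """Caches cookies and refreshes them only once they have expired."""

    # Hypothetical callback: in the real tool this would drive the headless browser.
    fetch_cookies: Callable[[], Dict[str, str]]
    ttl: float = SESSION_TTL_SECONDS
    # Injectable clock so expiry can be tested without waiting.
    clock: Callable[[], float] = time.monotonic
    _cookies: Optional[Dict[str, str]] = field(default=None, init=False)
    _fetched_at: float = field(default=0.0, init=False)

    def get(self) -> Dict[str, str]:
        """Return cached cookies, collecting fresh ones if none exist or the TTL has passed."""
        now = self.clock()
        expired = self._cookies is None or now - self._fetched_at >= self.ttl
        if expired:
            self._cookies = self.fetch_cookies()
            self._fetched_at = now
        return self._cookies
```

With this shape, every API request calls `cache.get()`, and the expensive browser step runs at most once per 30 minutes.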

Installation

Clone the repository:

git clone https://github.com/ImranDevPython/aliexpress-scraper.git
cd aliexpress-scraper

Install required packages:

pip install -r requirements.txt

Usage

Start the web interface:

python app.py

Open your browser and navigate to:

http://localhost:5000

In the web interface:
- Enter your search keyword
- Select number of pages to scrape (1-60)
- Choose which fields to include
- Set optional filters (price range, discounts, shipping)
- Adjust request delay (recommended: 1 second)
- Start scraping and monitor progress
Results will be saved in the results folder as:
- aliexpress_[keyword]_extracted.json
- aliexpress_[keyword]_extracted.csv
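
As a rough sketch of how the selected fields might end up in those two files, the following uses only the standard library. The function name `export_results`, the shape of each product record (a dict keyed by field name), and the keyword sanitization are assumptions made for illustration; only the file-name pattern and the results folder come from this README.

```python
import csv
import json
import re
from pathlib import Path
from typing import Dict, List, Sequence, Tuple


def export_results(
    products: List[Dict[str, object]],
    keyword: str,
    fields: Sequence[str],
    out_dir: str = "results",
) -> Tuple[Path, Path]:
    """Write the chosen fields of each product to JSON and CSV files."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    # Assumption: replace characters that are awkward in file names with "_".
    safe = re.sub(r"[^\w-]+", "_", keyword.strip())

    # Keep only the selected fields; missing values become empty strings.
    rows = [{name: product.get(name, "") for name in fields} for product in products]

    json_path = out / f"aliexpress_{safe}_extracted.json"
    csv_path = out / f"aliexpress_{safe}_extracted.csv"

    json_path.write_text(json.dumps(rows, ensure_ascii=False, indent=2), encoding="utf-8")
    with csv_path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(fields))
        writer.writeheader()
        writer.writerows(rows)

    return json_path, csv_path
```

For example, `export_results(products, "usb cable", ["Product ID", "Title"])` would write `results/aliexpress_usb_cable_extracted.json` and the matching `.csv`.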

Available Fields

Product ID
Title
Sale Price
Original Price
Discount (%)
Currency
Rating
Orders Count
Store Name
Store ID
Store URL
Product URL
Image URL
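
The Discount (%) field relates the Sale Price to the Original Price. Whether the tool reads this value from the API response or computes it is not stated here; the sketch below shows one plausible way to derive it, with `discount_percent` being a hypothetical helper and the rounding rule an assumption.

```python
from decimal import ROUND_HALF_UP, Decimal


def discount_percent(sale_price: str, original_price: str) -> int:
    """Return the discount as a whole-number percentage of the original price."""
    sale = Decimal(sale_price)
    original = Decimal(original_price)

    # No meaningful discount if the original price is missing or not higher.
    if original <= 0 or sale >= original:
        return 0

    pct = (original - sale) / original * 100
    # Assumption: round half up to the nearest whole percent.
    return int(pct.quantize(Decimal("1"), rounding=ROUND_HALF_UP))
```

Prices are passed as strings and parsed with `Decimal` to avoid binary floating-point rounding surprises on currency values.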

Best Practices

Request Delay:
- Default: 1 second between requests
- Lower values (0.2-0.5s) may work but risk temporary IP blocks
- Adjust based on your needs and risk tolerance
Page Count:
- Maximum: 60 pages per search
- Recommended: Start with fewer pages to test
- Use filters to get more relevant results
Session Management:
- Session data is cached for 30 minutes
- Clear browser cookies if you encounter issues
- Let the automated browser handle cookie collection
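
The delay and page-count limits above can be illustrated with a small sequential fetch loop. This is a sketch under stated assumptions rather than the project's implementation: `fetch_pages`, `clamp_delay`, and the `fetch_page` callback are hypothetical names, while the 0.2-10 second range, the 1 second default, and the 60-page cap come from this README.

```python
import time
from typing import Callable, Iterator, TypeVar

T = TypeVar("T")

MIN_DELAY, MAX_DELAY = 0.2, 10.0  # allowed delay range in seconds
MAX_PAGES = 60                    # maximum pages per search


def clamp_delay(delay: float) -> float:
    """Force the delay into the supported 0.2-10 second range."""
    return max(MIN_DELAY, min(MAX_DELAY, delay))


def fetch_pages(
    fetch_page: Callable[[int], T],  # hypothetical: performs one API request
    pages: int,
    delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[T]:
    """Fetch pages one at a time, pausing between consecutive requests."""
    delay = clamp_delay(delay)
    pages = max(1, min(MAX_PAGES, pages))
    for page in range(1, pages + 1):
        if page > 1:
            sleep(delay)  # pause only between requests, not before the first
        yield fetch_page(page)
```

Because the generator issues one request at a time, a caller can stop early (for example, on an empty page) without any further requests being sent.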

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Disclaimer

This tool is for educational purposes only. Use responsibly and in accordance with AliExpress's terms of service.
