Added support for naturalharry.au by JackSun815 · Pull Request #1430 · hhursev/recipe-scrapers

JackSun815 · 2024-12-09T01:23:42Z

Pull Request: Add New Scraper for Natural Harry Recipes

This pull request adds a scraper for recipes hosted on naturalharry.au. The scraper is implemented as a subclass of AbstractScraper and extracts the recipe details listed below. JSON test cases cover every supported field.

Features Added:

Scraper Functionality:
The scraper extracts the following details for recipes:
- Host URL: Identifies the source website.
- Author: Captures the recipe author, e.g., "Harry."
- Title: Extracts the title of the recipe.
- Languages: Determines the language of the recipe, e.g., "en-US."
- Description: Extracts a concise description of the recipe.
- Category: (if available) Identifies the recipe category.
- Total Time: Parses and calculates the total preparation and cooking time.
- Ingredients: Accurately extracts and formats ingredients from the recipe content.
- Instructions: Captures the step-by-step instructions, ensuring no extraneous content is included.
- Image: Retrieves the main image associated with the recipe.
- Yields: Extracts the yield/serving size, e.g., "about 10 tacos."
- Cuisine: (if available) Identifies the cuisine type.
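
The PR text does not show how Total Time is computed. As a minimal sketch only, assuming the page exposes preparation and cooking times as ISO 8601 durations (for example `PT15M`), the two values could be summed as below. The helper names `duration_to_minutes` and `total_time` are hypothetical and are not taken from the PR or from recipe-scrapers.

```python
import re

# Hypothetical helper, not part of the PR: matches ISO 8601 durations
# limited to days, hours, minutes and seconds, e.g. "PT1H30M" or "P1DT2H".
_DURATION = re.compile(
    r"^P(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?)?$"
)


def duration_to_minutes(value):
    """Convert an ISO 8601 duration string into whole minutes."""
    text = value.strip()
    match = _DURATION.match(text)
    # "P" and "PT" satisfy the pattern but carry no time components.
    if match is None or text in ("P", "PT"):
        raise ValueError(f"unsupported duration: {value!r}")
    parts = {name: int(num) if num else 0 for name, num in match.groupdict().items()}
    return (
        parts["days"] * 24 * 60
        + parts["hours"] * 60
        + parts["minutes"]
        + parts["seconds"] // 60  # partial minutes are dropped
    )


def total_time(prep, cook):
    """Sum preparation and cooking durations, in minutes."""
    return duration_to_minutes(prep) + duration_to_minutes(cook)
```

For instance, `total_time("PT15M", "PT1H")` adds a 15-minute preparation time to a one-hour cooking time. If the site publishes times as free text instead, a different parser would be needed.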

Testing:
Test cases have been added to validate the scraper's functionality:
- JSON test cases for all supported fields, ensuring accurate parsing and alignment with expected outputs.
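
The idea of checking extracted fields against an expected JSON file can be sketched as follows. This is an illustration only: the field names and the `diff_fields` helper are hypothetical, the sample values reuse the examples quoted in the description above, and the project's real test harness may work differently.

```python
import json


def diff_fields(actual, expected):
    """Return {field: (actual_value, expected_value)} for each expected field that differs."""
    return {
        key: (actual.get(key), want)
        for key, want in expected.items()
        if actual.get(key) != want
    }


# Hypothetical expected output, in the spirit of a JSON test file.
expected = json.loads(
    '{"author": "Harry", "language": "en-US", "yields": "about 10 tacos"}'
)

# Hypothetical scraper output with one field that does not match.
actual = {"author": "Harry", "language": "en-US", "yields": "10 tacos"}

mismatches = diff_fields(actual, expected)
for field, (got, want) in mismatches.items():
    print(f"{field}: got {got!r}, expected {want!r}")
```

Reporting every mismatched field at once, rather than stopping at the first, makes it easier to see how far a scraper has drifted from its recorded output.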

How to Test:

Run the test cases for the naturalharry.au scraper using the following command:
```
python -m unittest -k naturalharry
```
Validate that all test cases pass and extracted fields match the expected outputs in the JSON test files.
Ensure the scraper handles variations in recipe formatting gracefully.

Future Improvements:

Dynamic error handling for unexpected changes in the site's HTML structure.
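
One possible shape for such error handling, offered as a sketch and not as the planned implementation, is a small wrapper that turns a failed field extraction into a logged warning and a default value. The name `safe_extract` and the chosen exception list are assumptions.

```python
import logging

logger = logging.getLogger("naturalharry_sketch")  # hypothetical logger name


def safe_extract(name, extractor, default=None):
    """Call extractor(); on a lookup-style failure, log it and return default."""
    try:
        value = extractor()
    except (AttributeError, IndexError, KeyError, TypeError, ValueError) as exc:
        # Typical symptoms of a changed page layout: a missing element
        # (None has no attributes), a missing key, or an empty result list.
        logger.warning("could not extract %s: %s", name, exc)
        return default
    # Treat empty results the same way as missing ones.
    return default if value in (None, "", []) else value
```

A field method could then wrap its lookup, for example `safe_extract("category", lambda: data["category"])`, so that one missing element does not abort the whole scrape. Whether silently defaulting is preferable to raising is a design choice the PR leaves open.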

sonarqubecloud · 2024-12-21T03:42:42Z

Quality Gate passed

Issues
4 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

hhursev · 2025-01-06T01:18:20Z

remove this file altogether

JackSun815 added 3 commits December 8, 2024 19:09

added scrape functionality for all except for ingredients and instructions

6e32f62

Modified scrape to work for ingredients and instructions, all tests passing

f8cc81b

added link to naturalharry under README

5fe43d1

hhursev reviewed Jan 6, 2025

Comment thread tests/test_data/naturalharry.com.au/naturalharryTest.py

hhursev (Owner) · Jan 6, 2025

remove this file altogether

Merge remote-tracking branch 'upstream/main' into pr/1430

0775c1a

JackSun815 wants to merge 4 commits into hhursev:main from JackSun815:naturalharry_scraper

3 participants
