Adds support for albert.cz #1891

Merged
jknndy merged 2 commits into hhursev:main from zdenek-stursa:site/albert-cz
Apr 24, 2026

@zdenek-stursa
Contributor

@zdenek-stursa zdenek-stursa commented Apr 22, 2026

Adds a scraper for albert.cz, the recipe website of the Czech supermarket Albert.

The site uses schema.org/Recipe, with the following customizations:

  • instructions() — filters out numbered step markers (1., 2., etc.) from HowToStep names
  • description() — reads from <meta name="description"> (not present in schema)
  • author() and site_name() — hardcoded to Albert (schema author name is empty)
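
As an illustration of the `description()` and `author()` customizations listed above, here is a minimal standalone sketch. It is not the code from this PR: it uses only the standard library's `html.parser` rather than the project's own scraper base class, and the names `meta_description` and `_MetaDescriptionParser` are hypothetical.

```python
from html.parser import HTMLParser


class _MetaDescriptionParser(HTMLParser):
    """Collects the content of the first <meta name="description"> tag."""

    def __init__(self):
        super().__init__()
        self.description = None

    def handle_starttag(self, tag, attrs):
        # Ignore everything except the first matching <meta> tag.
        if tag != "meta" or self.description is not None:
            return
        attr_map = dict(attrs)
        if (attr_map.get("name") or "").lower() == "description":
            self.description = attr_map.get("content")


def meta_description(html):
    """Return the page's meta description, or None if it has none."""
    parser = _MetaDescriptionParser()
    parser.feed(html)
    return parser.description


def author():
    # Hardcoded, mirroring the PR description: the schema author name is empty.
    return "Albert"
```

A call such as `meta_description(page_html)` returns the tag's `content` attribute when the tag is present and `None` otherwise, so a caller can decide how to handle pages without a description.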

Recipes:

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Collaborator

@jknndy jknndy left a comment

Hi @zdenek-stursa, thanks for the PR! I've made two comments for your review.

Comment thread recipe_scrapers/albertcz.py Outdated
filtered = [
    line
    for line in instructions.split("\n")
    if not re.fullmatch(r"\d+\.", line.strip())
Collaborator

Suggested change
- if not re.fullmatch(r"\d+\.", line.strip())
+ if not line.strip().endswith(".") or not line.strip()[:-1].isdigit()

Instead of importing re, we can use a string check: a line is a step marker if it ends with a period and the rest is numeric. This produces the same output.
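
To make the comparison concrete, here is a small sketch that puts the two checks side by side. The helper names (`is_step_marker_regex`, `is_step_marker_str`, `drop_step_markers`) and the sample text are hypothetical and not taken from the PR; only the two conditions themselves come from the suggested change.

```python
import re


def is_step_marker_regex(line):
    """Original check: the stripped line is digits followed by a period."""
    return re.fullmatch(r"\d+\.", line.strip()) is not None


def is_step_marker_str(line):
    """Suggested check: ends with a period and everything before it is numeric."""
    stripped = line.strip()
    return stripped.endswith(".") and stripped[:-1].isdigit()


def drop_step_markers(instructions):
    """Remove lines that consist only of a step marker such as '1.' or '12.'."""
    return "\n".join(
        line for line in instructions.split("\n") if not is_step_marker_str(line)
    )


sample = "1.\nPreheat the oven.\n2.\nMix 2.5 dl of milk with flour.\n10."
cleaned = drop_step_markers(sample)
```

For ordinary ASCII step markers the two predicates agree. One difference worth knowing: `str.isdigit()` also accepts a few characters, such as superscript digits, that `\d` does not match, which is unlikely to matter for step numbering of this kind.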

Comment thread recipe_scrapers/albertcz.py Outdated
Comment on lines +1 to +2
import re

Collaborator

Suggested change
- import re

Replace re.fullmatch(r"\d+\.", ...) with string-based check
using endswith(".") and isdigit() as suggested by reviewer.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@zdenek-stursa
Contributor Author

Thank you so much @jknndy for taking the time to review this! 🙏 Your suggestion is spot on — the string-based check is much cleaner and avoids an unnecessary import re. I've applied both changes.

This scraper is particularly close to my heart as it covers my wife's favorite recipe site, so I'm really happy someone had a look at it. Much appreciated! 😊

@jknndy jknndy merged commit b1ce047 into hhursev:main Apr 24, 2026
17 checks passed

2 participants