Text inside nobr-tags is dropped #825

@kod-kristoff

I noticed that trafilatura drops text in <nobr> tags.

Example: "Sjätte <nobr>AP-fonden</nobr>" gets extracted as "Sjätte " and not the expected "Sjätte AP-fonden".
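
A possible stopgap, not a fix in trafilatura itself, is to unwrap `<nobr>` elements in the raw HTML string before handing it to the extractor, so their text stays in the surrounding flow. The sketch below uses only the standard library. The helper name `unwrap_nobr` is made up for illustration and is not part of trafilatura. It assumes the input is available as a string and that a regex-level rewrite is acceptable for your pages.

```python
import re

# Matches opening and closing <nobr> tags, with optional attributes, case-insensitively.
# Only the tags are matched; the text between them is left untouched.
NOBR_TAG = re.compile(r"</?nobr\b[^>]*>", re.IGNORECASE)


def unwrap_nobr(html: str) -> str:
    """Remove <nobr> and </nobr> tags while keeping the text they enclose.

    Hypothetical pre-processing helper (an assumption, not a trafilatura API).
    """
    return NOBR_TAG.sub("", html)


if __name__ == "__main__":
    sample = "Sjätte <nobr>AP-fonden</nobr>"
    print(unwrap_nobr(sample))
```

The cleaned string can then be passed to the extractor in place of the original HTML. This only sidesteps the problem on the caller's side; whether `<nobr>` should be handled inside the library is what this issue is about.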
