feat(pt-BR): add number parsing support and ordinal numeric formatting by Kaikygabriel · Pull Request #1785 · Humanizr/Humanizer

Kaikygabriel · 2026-05-24T01:07:28Z

Enhances the pt-BR locale with improved localization and number parsing support.

Changes include:

- Added token-based number parsing (cardinal and ordinal)
- Added support for ordinal numeric formatting (e.g., 1º, 2º)
- Improved time unit symbols for better clarity
- Adjusted grammatical gender for better linguistic accuracy
- General consistency improvements across phrases and units

No runtime code changes were made. This PR focuses only on locale improvements.
This brings the pt-BR locale closer to feature parity with the en locale while preserving natural Portuguese usage.

coderabbitai · 2026-05-24T01:07:42Z

📝 Walkthrough

Summary by CodeRabbit

Improvements
- Portuguese (pt-BR) number parsing: more robust handling of case, periods, negatives, connector words, and ordinal abbreviations for more accurate numeric interpretation.
- Expanded token mappings for units, tens, hundreds and ordinals, plus parse options to better handle terminal ordinals and hundred multipliers.
- More natural clock/time expressions in Portuguese: added singular/plural articles and updated minute-offset templates for grammatical correctness.

Walkthrough

Adds a token-map number parser under surfaces.number.parse for pt-BR and updates clock templates to use {nextArticle} with new singularArticle/pluralArticle keys.

Changes

Portuguese-Brazil Locale Updates

| Layer / File(s) | Summary |
| --- | --- |
| Numeric parsing configuration `src/Humanizer/Locales/pt-BR.yml` | Added `surfaces.number.parse` block using `token-map` engine with lowercase and period-removal normalization, cardinal and ordinal token-to-value maps (units, tens, hundreds, ordinals), `menos` negative prefix, ignored token `e`, ordinal suffixes `º`/`ª`, and parse option flags for terminal ordinals, hundred multiplication, and invariant integer input. |
| Time formatting: clock phrase templates `src/Humanizer/Locales/pt-BR.yml` | Introduced `singularArticle: "a"` and `pluralArticle: "as"`, and updated `min40`, `min45`, `min50`, and `min55` templates to use `{nextArticle}` instead of a hardcoded `as`. |
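
To illustrate the clock-template change summarized above, here is a minimal Python sketch of how a `{nextArticle}` placeholder could be filled. The template text, the `HOUR_WORDS` table, and the rule "hour 1 takes the singular article" are assumptions made for illustration; only the `singularArticle`/`pluralArticle` values and the `{nextArticle}` placeholder come from the walkthrough, and this is not Humanizer's actual rendering code.

```python
# Article values as stated in the walkthrough.
SINGULAR_ARTICLE = "a"
PLURAL_ARTICLE = "as"

# Hypothetical template text; the real min40 template in pt-BR.yml may differ.
MIN40_TEMPLATE = "vinte para {nextArticle} {nextHour}"

# Hypothetical lookup of feminine hour words, trimmed for the example.
HOUR_WORDS = {1: "uma", 2: "duas", 3: "três"}


def render_min40(next_hour: int) -> str:
    """Fill the template, choosing the article by the hour that follows."""
    # Assumption: only hour 1 is grammatically singular ("a uma").
    article = SINGULAR_ARTICLE if next_hour == 1 else PLURAL_ARTICLE
    return MIN40_TEMPLATE.format(
        nextArticle=article,
        nextHour=HOUR_WORDS[next_hour],
    )


print(render_min40(2))  # vinte para as duas
print(render_min40(1))  # vinte para a uma
```

With a hardcoded `as`, the second call would have produced the ungrammatical "vinte para as uma", which is the case the placeholder is meant to avoid.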

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 In pt‑BR the leap is precise,
without shortening the step of the warning,
Numbers learn to speak properly,
Clocks ask for articles as adornment,
A happy scribble celebrates the job.

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly describes the main changes: adding number parsing support and ordinal numeric formatting to the pt-BR locale, which aligns directly with the primary modifications in the changeset. |
| Description check | ✅ Passed | The description provides relevant context about the locale enhancements, including number parsing and ordinal formatting, which directly relate to the changes made in the YAML configuration file. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 05ffb8e3a4

chatgpt-codex-connector · 2026-05-24T01:11:26Z

+        noventa: 90
+        cem: 100
+        mil: 1000
+        milhão: 1000000


Add missing milhões token to pt-BR cardinal map

The new token-map parser for pt-BR only includes milhão but omits the plural milhões, even though this locale’s number-to-words output uses plural million forms (for example, values like 2,000,000 are rendered with milhões). TokenMapWordsToNumberConverter resolves scale words via exact token lookup, so dois milhões is treated as unrecognized and parsing fails for common million-range inputs.
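
The failure mode described above can be sketched in a few lines of Python. This is a simplified stand-in, not the actual `TokenMapWordsToNumberConverter`: the function `parse_words`, the trimmed `CARDINAL_MAP`, and the "values of 1000 or more act as scale words" rule are assumptions chosen only to show why an exact-lookup parser rejects a plural it has no key for.

```python
# Hypothetical, trimmed-down cardinal map; the real pt-BR map has more entries.
CARDINAL_MAP = {"um": 1, "dois": 2, "mil": 1000, "milhão": 1_000_000}
IGNORED = {"e"}  # connector word, as listed in the walkthrough


def parse_words(text: str, cardinal_map: dict) -> int:
    """Parse a number phrase using exact token lookup only."""
    total, current = 0, 0
    # Normalization assumed here: lowercase and strip periods.
    for token in text.lower().replace(".", "").split():
        if token in IGNORED:
            continue
        if token not in cardinal_map:
            raise ValueError(f"unrecognized token: {token!r}")
        value = cardinal_map[token]
        if value >= 1000:
            # Scale word: multiply the pending group (or 1 if there is none).
            total += max(current, 1) * value
            current = 0
        else:
            current += value
    return total + current


print(parse_words("um milhão", CARDINAL_MAP))  # 1000000

try:
    parse_words("dois milhões", CARDINAL_MAP)
except ValueError as err:
    print(err)  # unrecognized token: 'milhões'

# Adding the plural key is enough for this sketch to accept the phrase.
fixed_map = {**CARDINAL_MAP, "milhões": 1_000_000}
print(parse_words("dois milhões", fixed_map))  # 2000000
```

Under these assumptions, the singular and plural forms simply map to the same value, so the fix is a data change in the map rather than a change to the parser.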

chatgpt-codex-connector · 2026-05-24T01:11:27Z

+        setenta: 70
+        oitenta: 80
+        noventa: 90
+        cem: 100


Add cento/hundreds tokens needed for canonical pt-BR parsing

This cardinal map defines cem but not cento (nor other hundred words), while the same locale’s number-to-words surface emits forms like cento e um and duzentos. With token-map parsing, those words must exist in cardinalMap to be recognized, so many standard pt-BR numbers in the 101–999 range will fail to parse even though they are canonical outputs of the locale.
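
One way to catch gaps like this before review is a coverage check: take phrases the number-to-words surface is expected to emit and list every token the parse map does not know. The Python sketch below assumes a hand-written sample of emitted phrases and a deliberately incomplete map; the helper `missing_tokens` is hypothetical and not part of Humanizer.

```python
def missing_tokens(phrases, cardinal_map, ignored=("e",)):
    """Return the sorted tokens that appear in phrases but not in the map."""
    missing = set()
    for phrase in phrases:
        for token in phrase.lower().split():
            if token not in ignored and token not in cardinal_map:
                missing.add(token)
    return sorted(missing)


# Hypothetical sample of canonical outputs in the 100-999 range.
emitted = ["cem", "cento e um", "duzentos", "novecentos e noventa"]

# Deliberately incomplete map, mirroring the gap the comment describes.
cardinal_map = {"um": 1, "noventa": 90, "cem": 100}

print(missing_tokens(emitted, cardinal_map))
# ['cento', 'duzentos', 'novecentos']
```

Run against the full set of hundreds words, a check of this kind would be expected to report an empty list once `cento` and the other hundreds entries are added to `cardinalMap`.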

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/Humanizer/Locales/pt-BR.yml`:
- Around line 342-378: Add the missing numeric word keys to the cardinalMap so
parsing handles plural and hundred forms: include plural large-number keys
("milhões", "trilhões", "quadrilhões", "quintilhões"), add hundreds entries
matching number.words.cardinal.hundredsMap ("duzentos", "trezentos",
"quatrocentos", "quinhentos", "seiscentos", "setecentos", "oitocentos",
"novecentos"), and ensure "milhões" maps to 1000000 while the plural
large-number keys map to their corresponding powers of 1,000; update the
cardinalMap block (the YAML mapping named cardinalMap) to contain these keys and
their numeric values to mirror the singular forms already present (e.g.,
"milhão":1000000).
- Around line 441-444: Add Portuguese articles for nextHour in the pt-BR clock
templates: define singularArticle and pluralArticle entries in the pt-BR clock
section (same keys used by other locales) and update min40, min45, min50 and
min55 to include the nextArticle token before nextHour (use "{nextArticle}
{nextHour}" pattern) so phrases like "vinte para as duas" are produced; modify
the existing min40/min45/min50/min55 templates and add
singularArticle/pluralArticle keys accordingly in the pt-BR locale block.

📥 Commits

Reviewing files that changed from the base of the PR and between f9292aa and 05ffb8e.

📒 Files selected for processing (1)

src/Humanizer/Locales/pt-BR.yml

coderabbitai

🧹 Nitpick comments (1)

src/Humanizer/Locales/pt-BR.yml (1)
238-238: 💤 Low value

Consider using an abbreviated symbol for week.

Other time unit symbols use short abbreviations (ms, s, min, h, d, m, a), but week uses the full word semana. While Portuguese lacks a universally accepted abbreviation for "week," using sem or sem. would maintain consistency with other symbols. If the full word is intentional for clarity, feel free to disregard.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/Humanizer/Locales/pt-BR.yml` at line 238, The pt-BR locale uses the full
word for the week symbol ("symbol: 'semana'"); change that value to a short
abbreviation (e.g., "sem" or "sem.") to match the other short unit symbols (ms,
s, min, h, d, m, a) and keep consistency across time unit symbols—update the
"symbol: 'semana'" entry accordingly.

📥 Commits

Reviewing files that changed from the base of the PR and between d25e491 and 28d6505.

📒 Files selected for processing (1)

src/Humanizer/Locales/pt-BR.yml

feat(pt-BR): add number parsing support and ordinal numeric formatting

05ffb8e

chatgpt-codex-connector Bot reviewed May 24, 2026

coderabbitai Bot requested changes May 24, 2026

Ajusts in pt-br yml

d25e491

coderabbitai Bot approved these changes May 26, 2026

Ajusts in pt-br yml

28d6505

coderabbitai Bot reviewed May 27, 2026

Kaikygabriel wants to merge 3 commits into Humanizr:main from Kaikygabriel:main
