bbbbiiii-commits/global-business-categories

Global Business Categories

Russian version

Open-source hierarchical taxonomy of 1500+ business categories in 21 languages with country-specific localization. Ready-to-use dataset for marketplaces, directories, CRM systems, location apps, and AI/ML classification tasks.

Not just translation — each language version includes local business types, government agencies, and cultural adaptations. Inspired by real-world directories: Google Maps, 2GIS, Yelp, Yellow Pages, Zoon.

Why Use This Dataset

  • Structured taxonomy — 3-level hierarchy (24 sections > 260+ subsections > 1500+ items) ready for dropdowns, filters, and search
  • Multilingual — 21 languages with native terminology, not machine translation
  • Country-localized — each version adapted for local market (Japanese izakaya, German Handwerker, Brazilian lanchonete, etc.)
  • Multiple formats — Markdown, JSON, YAML, CSV — pick what fits your stack
  • Documented diffs — every localization change tracked in diff.json
  • No dependencies — plain data files, use with any language or framework
  • AI/ML ready — structured labeled data for business classification, NER training, and category prediction
  • Free — CC BY 4.0, use commercially

Available Languages

| Language | Code | Catalog | JSON | YAML | CSV | Localization |
|---|---|---|---|---|---|---|
| Russian | ru | catalog.md | JSON | YAML | CSV | Original |
| English | en | catalog.md | JSON | YAML | CSV | diff |
| Chinese (Simplified) | zh-cn | catalog.md | JSON | YAML | CSV | diff |
| Spanish | es | catalog.md | JSON | YAML | CSV | diff |
| Hindi | hi | catalog.md | JSON | YAML | CSV | diff |
| Arabic | ar | catalog.md | JSON | YAML | CSV | diff |
| Portuguese (Brazil) | pt-br | catalog.md | JSON | YAML | CSV | diff |
| French | fr | catalog.md | JSON | YAML | CSV | diff |
| German | de | catalog.md | JSON | YAML | CSV | diff |
| Japanese | ja | catalog.md | JSON | YAML | CSV | diff |
| Korean | ko | catalog.md | JSON | YAML | CSV | diff |
| Turkish | tr | catalog.md | JSON | YAML | CSV | diff |
| Italian | it | catalog.md | JSON | YAML | CSV | diff |
| Vietnamese | vi | catalog.md | JSON | YAML | CSV | diff |
| Thai | th | catalog.md | JSON | YAML | CSV | diff |
| Indonesian | id | catalog.md | JSON | YAML | CSV | diff |
| Polish | pl | catalog.md | JSON | YAML | CSV | diff |
| Dutch | nl | catalog.md | JSON | YAML | CSV | diff |
| Swedish | sv | catalog.md | JSON | YAML | CSV | diff |
| Czech | cs | catalog.md | JSON | YAML | CSV | diff |
| Ukrainian | uk | catalog.md | JSON | YAML | CSV | diff |

What Makes This Different

This is not a machine translation of one catalog into 21 languages. Each version is localized:

  • Japan: izakaya, ramen shops, pachinko, onsen, konbini, juku prep schools
  • Korea: jjimjilbang, PC bang, hagwon, noraebang, chimaek restaurants
  • Germany: Handwerker, Meisterbetrieb, Drogerie, Schornsteinfeger, TÜV/Dekra
  • Brazil: lanchonete, açaí shops, churrascaria, lotérica, despachante
  • China: hot pot, bubble tea, live-stream commerce, community group buying
  • Thailand: Thai massage, som tam shops, noodle shops, temple services
  • Italy: trattoria, osteria, enoteca, alimentari, tabaccheria
  • Turkey: lokanta, çay bahçesi, simit sarayı, nargile cafe

Each diff.json file documents exactly what was added, removed, or renamed for each country.
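
As a rough illustration of what "added, removed, or renamed" could mean in practice, here is a minimal Python sketch. The diff layout (`added`, `removed`, `renamed` keys), the `apply_diff` helper, and the sample entries are all hypothetical; the actual diff.json schema is not described in this README and may differ, so check a real diff.json before relying on any of these names.

```python
# HYPOTHETICAL diff layout and made-up entries, for illustration only.
# The real diff.json files in this repository may use a different structure.
diff = {
    "added": ["Izakaya"],
    "removed": ["Example removed item"],
    "renamed": [{"from": "Example old name", "to": "Example new name"}],
}


def apply_diff(items: list[str], diff: dict) -> list[str]:
    """Return a new item list with removals, renames, and additions applied."""
    renames = {r["from"]: r["to"] for r in diff.get("renamed", [])}
    removed = set(diff.get("removed", []))
    # Drop removed items first, then rename the ones that survive.
    result = [renames.get(item, item) for item in items if item not in removed]
    # Append additions, skipping anything already present.
    result.extend(item for item in diff.get("added", []) if item not in result)
    return result


base = ["Example old name", "Example removed item", "Pizzeria"]
localized = apply_diff(base, diff)
print(localized)
```

The function returns a new list rather than mutating the base catalog, so the original-language data stays untouched while each localized version is derived from it.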

Catalog Hierarchy

24 Sections (Level 1)
├── Food & Dining
├── Beauty & Personal Care
├── Healthcare & Medical
├── Education & Development
├── Auto & Transportation
├── Construction & Real Estate
├── Retail & Shopping
├── Consumer Services
├── IT & Internet
├── Finance & Insurance
├── Legal Services
├── Entertainment & Leisure
├── Sports & Fitness
├── Tourism & Hotels
├── Culture & Arts
├── Manufacturing & Industry
├── Agriculture & Food Production
├── Mining & Energy
├── Transportation & Logistics
├── Advertising & Marketing
├── Business & Professional Services
├── Telecommunications
├── Pet Services
└── Government & Public Organizations

260+ Subsections (Level 2)
├── Restaurants, Cafes, Fast Food, Delivery...
├── Clinics, Dentistry, Pharmacy, Labs...
└── ...

1500+ Items (Level 3)
├── Fine dining, Sushi bar, Pizzeria...
├── Dental clinic, MRI center, Veterinary...
└── ...
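
The three levels can be walked with nested loops. The sketch below assumes the `sections` / `subsections` / `items` layout shown under Data Schema; the `iter_paths` helper and the small inline sample are illustrative and not part of the repository.

```python
from typing import Iterator

# Tiny inline sample following the documented JSON layout (illustrative data only).
SAMPLE = {
    "sections": [
        {
            "id": 1,
            "name": "Food and Dining",
            "subsections": [
                {
                    "id": "1.1",
                    "name": "Restaurants and Cafes",
                    "items": ["Fine dining restaurants", "Family restaurants"],
                }
            ],
        }
    ]
}


def iter_paths(catalog: dict) -> Iterator[tuple[str, str, str]]:
    """Yield (section, subsection, item) for every Level 3 item."""
    for section in catalog["sections"]:
        for subsection in section["subsections"]:
            for item in subsection["items"]:
                yield section["name"], subsection["name"], item


# Join each path into a breadcrumb string, e.g. for search indexes or labels.
paths = [" > ".join(path) for path in iter_paths(SAMPLE)]
for line in paths:
    print(line)
```

Breadcrumb strings like these are one convenient shape for dropdown labels or full-text search; the tuple form is easier to work with when each level has to be filtered separately.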

Quick Start

git clone https://github.com/bbbbiiii-commits/global-business-categories.git
cd global-business-categories

Use JSON directly

const catalog = require('./catalog/en/catalog.json');
console.log(catalog.sections[0].name); // "Food and Dining"
console.log(catalog.metadata.items_count); // 1567

Python

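import json

# Read as UTF-8 explicitly: most language versions contain non-ASCII text.
with open('catalog/en/catalog.json', encoding='utf-8') as f:
    catalog = json.load(f)

# Flatten the 3-level hierarchy into a single list of Level 3 items.
categories = [item for s in catalog['sections'] for sub in s['subsections'] for item in sub['items']]
print(len(categories))  # 1567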

Convert from Markdown

node scripts/convert.js --all          # Convert all languages
node scripts/convert.js catalog.md     # Convert single file

Apply localization diff

node scripts/apply-diff.js --all                    # Apply all diffs
node scripts/apply-diff.js catalog/en/diff.json     # Apply single diff

Formats

| Format | Best for | Example |
|---|---|---|
| Markdown | Reading, contributing, GitHub preview | catalog.md |
| JSON | APIs, web apps, programmatic access | catalog.json |
| YAML | Configuration files, CI/CD | catalog.yaml |
| CSV | Excel, data analysis, database imports, pandas | catalog.csv |

Use Cases

  • Marketplaces & Directories — categorize businesses for discovery (like Yelp, Yellow Pages)
  • CRM Systems — organize clients by industry and business type
  • Registration Forms — dropdown menus for business category selection
  • Location Apps — filter and display businesses on maps (like Google Maps, 2GIS)
  • Search Engines — structured business categorization and faceted search
  • Mobile Apps — ready-made business taxonomy for directory features
  • AI/ML Training — labeled dataset for business classification, named entity recognition, category prediction
  • Chatbots & Assistants — business context understanding for recommendations
  • Analytics & BI — segment businesses for market research and reporting

Data Schema

JSON structure

{
  "metadata": {
    "language": "en",
    "sections_count": 24,
    "subsections_count": 262,
    "items_count": 1567,
    "generated_at": "2026-02-16"
  },
  "sections": [
    {
      "id": 1,
      "name": "Food and Dining",
      "subsections": [
        {
          "id": "1.1",
          "name": "Restaurants and Cafes",
          "items": ["Fine dining restaurants", "Family restaurants", "..."]
        }
      ]
    }
  ]
}
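
Because `metadata` repeats counts that can also be derived from `sections`, a loaded file can be sanity-checked by recounting. The following is a minimal sketch assuming the structure above; `count_levels` and `metadata_mismatches` are hypothetical helper names, and the sample deliberately overstates `items_count` to show a mismatch.

```python
def count_levels(catalog: dict) -> dict:
    """Count sections, subsections, and items actually present in the data."""
    sections = catalog["sections"]
    subsections = [sub for s in sections for sub in s["subsections"]]
    items = [item for sub in subsections for item in sub["items"]]
    return {
        "sections_count": len(sections),
        "subsections_count": len(subsections),
        "items_count": len(items),
    }


def metadata_mismatches(catalog: dict) -> dict:
    """Return {field: (declared, actual)} for every count that disagrees."""
    actual = count_levels(catalog)
    declared = catalog["metadata"]
    return {
        key: (declared.get(key), value)
        for key, value in actual.items()
        if declared.get(key) != value
    }


# Illustrative sample: the metadata claims 3 items but only 2 are listed.
sample = {
    "metadata": {"language": "en", "sections_count": 1,
                 "subsections_count": 1, "items_count": 3},
    "sections": [
        {"id": 1, "name": "Food and Dining", "subsections": [
            {"id": "1.1", "name": "Restaurants and Cafes",
             "items": ["Fine dining restaurants", "Family restaurants"]},
        ]},
    ],
}

print(metadata_mismatches(sample))
```

An empty result means the declared counts match the data, which makes this a cheap check to run after editing a catalog or regenerating the JSON.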

CSV columns

section_id, section_name, subsection_id, subsection_name, item
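
Each CSV row carries the full path to one item, so the hierarchy can be rebuilt by grouping rows. This sketch uses Python's standard `csv` module and assumes the file starts with a header row using exactly the column names listed above; the inline rows and the `group_items` helper are illustrative only.

```python
import csv
import io
from collections import defaultdict

# Two illustrative rows using the documented column names.
CSV_TEXT = """section_id,section_name,subsection_id,subsection_name,item
1,Food and Dining,1.1,Restaurants and Cafes,Fine dining restaurants
1,Food and Dining,1.1,Restaurants and Cafes,Family restaurants
"""


def group_items(csv_file) -> dict:
    """Map (section_name, subsection_name) to the list of items under it."""
    grouped = defaultdict(list)
    for row in csv.DictReader(csv_file):
        key = (row["section_name"], row["subsection_name"])
        grouped[key].append(row["item"])
    return dict(grouped)


grouped = group_items(io.StringIO(CSV_TEXT))
for (section, subsection), items in grouped.items():
    print(f"{section} / {subsection}: {len(items)} items")
```

To read a file from disk instead of the inline string, pass an open file object, for example `open(path, newline='', encoding='utf-8')`, where `path` points at one of the catalog CSV files.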

Contributing

We welcome contributions! See CONTRIBUTING.md for details.

  • Add a new language: Translate + localize the catalog
  • Improve categories: Suggest additions or corrections
  • Report issues: Found a problem? Open an issue

Related Projects & Standards

License

Creative Commons Attribution 4.0 International

Free to use, modify, and distribute — even commercially. Just give credit.


Star this repository if you find it useful!

Keywords

business categories, business directory, business taxonomy, industry classification, multilingual categories, localized business types, business catalog dataset, open data, JSON business categories, CSV business directory, marketplace categories, CRM categories, business types list, NAICS alternative, Google Maps categories, business classification AI, NER training data, category prediction dataset
