Unidecode

Unicode to ASCII transliteration for Python: converts Unicode text to a close ASCII approximation. Unidecode provides unidecode() for general transliteration (an alias of unidecode_expect_ascii(), which is fastest on mostly-ASCII input), unidecode_expect_nonascii() for mostly non-ASCII input, per-character mapping tables, CJK romanization (Han characters are read as Mandarin), Cyrillic/Greek/Hebrew/Arabic transliteration, and accent removal across most of Unicode. Converts 'Ångström' to 'Angstrom', 'Café' to 'Cafe', '日本語' to 'Ri Ben Yu'.

Evaluated Mar 06, 2026 · v1.3.x

Category: Developer Tools. Tags: python, unidecode, transliteration, ascii, unicode, slug, text-normalization

Score Breakdown

⚙ Agent Friendliness

MCP Quality

Documentation

Error Messages

Auth Simplicity

Rate Limits

🔒 Security

TLS Enforcement

Auth Strength

Scope Granularity

Dep. Hygiene

Secret Handling

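Transliteration library with no network calls. With the default settings the output is always ASCII, but it is not sanitized: spaces, path separators, and other ASCII punctuation pass through unchanged, so further sanitization is needed before filesystem or URL use. The transformation is lossy, so different Unicode inputs can map to the same ASCII string; check for collisions and look-alike spoofing when unidecode output feeds security-sensitive identifiers.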
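The collision risk noted above can be checked before identifiers are assigned. The following is a minimal sketch, not part of Unidecode: `ascii_key` and `find_collisions` are hypothetical helper names, and the `except ImportError` branch is only a stand-in so the sketch runs where the package is not installed (it strips accents but does not romanize other scripts).

```python
from collections import defaultdict

try:
    from unidecode import unidecode
except ImportError:
    # Stand-in (assumption): accent stripping via the standard library.
    # Unlike Unidecode, it drops non-Latin scripts instead of romanizing them.
    import unicodedata

    def unidecode(text):
        decomposed = unicodedata.normalize("NFKD", text)
        return decomposed.encode("ascii", "ignore").decode("ascii")


def ascii_key(name):
    """Collision key: the transliterated, lowercased form of a name."""
    return unidecode(name).lower()


def find_collisions(names):
    """Return the groups of distinct inputs that share one ASCII key."""
    groups = defaultdict(set)
    for name in names:
        groups[ascii_key(name)].add(name)
    return {key: sorted(group) for key, group in groups.items() if len(group) > 1}
```

For example, `find_collisions(["Café", "Cafe", "Bob"])` groups 'Café' and 'Cafe' under the key 'cafe', which signals that the two display names cannot both receive the same derived identifier.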

⚡ Reliability

Uptime/SLA

Version Stability

Breaking Changes

Error Recovery

Best When

Generating ASCII slugs, filenames, and identifiers from Unicode text — Unidecode provides consistent, predictable ASCII output for any Unicode input without external dependencies.

Avoid When

Round-trip required (use original Unicode), semantic translation (use translation API), or CJK-specific romanization conventions (use language-specific libraries).

Use Cases

• Agent URL slug generation — from unidecode import unidecode; import re; def slugify(text): ascii_text = unidecode(text); slug = re.sub(r'[^a-z0-9]+', '-', ascii_text.lower()).strip('-'); return slug; slugify('Crème brûlée') — 'creme-brulee' — URL-safe slug; agent generates URL slugs from international titles; consistent ASCII output for any Unicode input
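• Agent filename sanitization: import os, re; from unidecode import unidecode; name = unidecode(os.path.basename(user_filename)); safe_name = re.sub(r'[^A-Za-z0-9._-]+', '_', name); safe_path = os.path.join(base_dir, safe_name); agent creates files from user-provided names; ASCII filenames work across operating systems; unidecode removes diacritics but passes '/', '..', and other ASCII punctuation through unchanged, so strip separators explicitly and reject names that are empty or consist only of dots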
• Agent text normalization for search — from unidecode import unidecode; normalized_query = unidecode(user_query.lower()); results = search_index.query(normalized_query) — normalize search query; agent search supports international input matching ASCII-indexed content; 'München' matches 'Munchen' in index
• Agent username generation: import re; from unidecode import unidecode; base_username = unidecode(display_name).lower(); username = re.sub(r'[^a-z0-9]', '', base_username)[:20]; agent derives ASCII usernames from Unicode display names in a consistent identifier format; distinct display names can yield the same username, so check uniqueness before assigning
• Agent CSV/spreadsheet export — from unidecode import unidecode; rows = [[unidecode(str(v)) for v in row] for row in data]; csv_writer.writerows(rows) — ASCII CSV; agent exports data to systems expecting ASCII; international text remains readable in ASCII approximation
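The one-line snippets in the list above can be written out as ordinary functions. This is a sketch under stated assumptions: `slugify` follows the slug use case, while `safe_filename` and its `fallback` parameter are hypothetical additions that also drop directory components, which Unidecode itself does not do. The `except ImportError` branch is only a stand-in so the sketch runs without the package.

```python
import os
import re

try:
    from unidecode import unidecode
except ImportError:
    # Stand-in (assumption): accent stripping via the standard library.
    # Unlike Unidecode, it drops non-Latin scripts instead of romanizing them.
    import unicodedata

    def unidecode(text):
        decomposed = unicodedata.normalize("NFKD", text)
        return decomposed.encode("ascii", "ignore").decode("ascii")


def slugify(text):
    """Lowercase ASCII slug; may be empty if nothing alphanumeric survives."""
    ascii_text = unidecode(text).lower()
    return re.sub(r"[^a-z0-9]+", "-", ascii_text).strip("-")


def safe_filename(user_filename, fallback="file"):
    """ASCII filename with no directory components or separators."""
    # Treat Windows separators like POSIX ones, then drop any directory part.
    name = os.path.basename(user_filename.replace("\\", "/"))
    # Transliterate first: some characters expand to text containing '/'.
    name = unidecode(name)
    # Replace everything outside a conservative allow-list, then trim
    # leading and trailing dots and underscores (rules out '.' and '..').
    name = re.sub(r"[^A-Za-z0-9._-]+", "_", name).strip("._")
    return name or fallback
```

Running the allow-list substitution after transliteration matters because the transliterated text, not the input, is what reaches the filesystem.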

Not For

• Translation — unidecode does transliteration (sound), not semantic translation; 'Café' → 'Cafe' not 'Coffee Shop'
• Round-trip fidelity — unidecode is lossy; original Unicode cannot be recovered from ASCII output; don't use for display purposes
• Preserving semantic meaning — CJK transliteration may produce unexpected romanizations; for CJK-specific romanization use pinyin/kakasi libraries

Interface

REST API

GraphQL

gRPC

MCP Server

SDK

Yes

Webhooks

Authentication

Methods: none

OAuth: No Scopes: No

No auth — local text processing library.

Pricing

Model: open_source

Free tier: Yes

Requires CC: No

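Unidecode (the Python package) is licensed under GPL-2.0-or-later; the Artistic License option belongs to the Perl module it derives from, Text::Unidecode. Check license compatibility for commercial use.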

Agent Metadata

Pagination

none

Idempotent

Full

Retry Guidance

Not documented

Known Gotchas

⚠ Transliteration is lossy — unidecode('Café') → 'Cafe'; original accent information is lost; agent code must NOT use unidecode() for display to users; only use for slug/identifier generation; store original Unicode separately if needed
⚠ CJK transliteration is pronunciation-based: unidecode('日本') → 'Ri Ben ' (each Han character's reading is followed by a space, so strip the result); this is the Mandarin reading of the characters, not the Japanese reading 'Nihon'; agent code doing Japanese or Korean transliteration should use language-specific libraries (kakasi for Japanese)
⚠ Empty string for some characters: characters with no entry in the mapping tables are dropped by default (errors='ignore'), so output can be shorter than expected or empty; always validate that a slug is non-empty after unidecode; pass errors='strict' to raise UnidecodeError instead, or errors='replace' to substitute replace_str (default '?')
⚠ GPL license check for commercial use: the Python package is GPL-2.0-or-later; the Artistic License option applies to the original Perl module, not to this package; distributing a derived work generally requires releasing it under the GPL; for closed-source distribution consider alternatives such as anyascii (ISC) or text-unidecode (Artistic License or GPL)
⚠ Spaces preserved as spaces — unidecode('hello world') → 'hello world' (spaces unchanged); for slug generation must separately replace spaces: unidecode(text).replace(' ', '-'); unidecode does not produce URL-safe output alone
⚠ unidecode_expect_ascii() vs unidecode_expect_nonascii(): both always return a str and give identical results; they differ only in speed; unidecode_expect_ascii() first tries a fast pure-ASCII path and falls back to full transliteration when non-ASCII characters are present, so it is quickest on mostly-ASCII text; unidecode() is an alias of unidecode_expect_ascii(); neither returns None
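Two of the gotchas above (keep the original text for display, and validate that the derived slug is non-empty) can be handled together. A minimal sketch, assuming a hypothetical `Page` record and `make_page` helper that are not part of Unidecode; the `except ImportError` branch is only a stand-in so the sketch runs without the package.

```python
import re
from dataclasses import dataclass

try:
    from unidecode import unidecode
except ImportError:
    # Stand-in (assumption): accent stripping via the standard library.
    # Unlike Unidecode, it drops non-Latin scripts instead of romanizing them.
    import unicodedata

    def unidecode(text):
        decomposed = unicodedata.normalize("NFKD", text)
        return decomposed.encode("ascii", "ignore").decode("ascii")


@dataclass(frozen=True)
class Page:
    title: str  # original Unicode, the only form shown to users
    slug: str   # lossy ASCII identifier derived from the title


def make_page(title):
    """Build a Page, refusing titles that yield an empty slug."""
    slug = re.sub(r"[^a-z0-9]+", "-", unidecode(title).lower()).strip("-")
    if not slug:
        raise ValueError(f"no ASCII slug could be derived from {title!r}")
    return Page(title=title, slug=slug)
```

Storing both fields means the lossy form is never the only copy: the slug can be regenerated if the rules change, and the title is displayed exactly as entered.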

Alternatives

ftfy-python-api

Scores are editorial opinions as of 2026-03-06.