Which characters need percent-encoding in URLs? Reserved vs unreserved explained

RFC 3986 divides URL characters into three sets: unreserved (always safe, never encode), reserved (have special meaning, encode when used as data), and everything else (always encode). Knowing which set a character belongs to is the difference between a URL that works and one that silently breaks in production.
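
The three sets can be written out directly. The following is a minimal sketch, with the character lists taken from RFC 3986 sections 2.2 and 2.3; the helper name classify is made up for illustration and is not part of any library.

```python
import string

# Character sets as defined in RFC 3986, sections 2.2 and 2.3.
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")
GEN_DELIMS = set(":/?#[]@")
SUB_DELIMS = set("!$&'()*+,;=")


def classify(ch: str) -> str:
    """Return the set a single character falls into (hypothetical helper)."""
    if ch in UNRESERVED:
        return "unreserved"  # never needs encoding
    if ch in GEN_DELIMS or ch in SUB_DELIMS:
        return "reserved"  # encode when used as data
    return "other"  # spaces, quotes, %, non-ASCII: always encode


for ch in ["a", "~", "&", "/", " ", "%"]:
    print(repr(ch), classify(ch))
```

Note that the percent sign itself lands in the "other" set: as data it has to be written as %25, because a literal % introduces an escape sequence.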


How this is calculated

Unreserved characters: A-Z, a-z, 0-9, hyphen (-), underscore (_), period (.), tilde (~). These never need encoding. Reserved characters are split into gen-delims (: / ? # [ ] @) and sub-delims (! $ & ' ( ) * + , ; =). A reserved character should be percent-encoded whenever it appears as data in a URL component where it could otherwise be read as a delimiter. An & inside a query parameter value must be encoded as %26; an & separating two query parameters must remain literal. This context sensitivity is why you should use a proper URL builder, or call encodeURIComponent() on each individual component, rather than regex-replacing characters across the whole URL. Note that encodeURIComponent() leaves ! ' ( ) * unencoded, which is harmless in most query strings but worth knowing if a downstream parser treats those characters specially.
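
To illustrate the difference between an & used as data and an & used as a delimiter, here is a small sketch using Python's urllib.parse.quote. The parameter names and the value are made up for the example.

```python
from urllib.parse import quote

# Hypothetical parameter value that contains reserved characters as data.
value = "salt&pepper=50/50"

# safe="" tells quote() to encode every reserved character, including "/".
encoded_value = quote(value, safe="")
print(encoded_value)  # salt%26pepper%3D50%2F50

# The "&" and "=" that delimit parameters stay literal.
query = "item=" + encoded_value + "&page=2"
print(query)  # item=salt%26pepper%3D50%2F50&page=2

# Unreserved characters pass through unchanged.
print(quote("a-b_c.d~e", safe=""))  # a-b_c.d~e
```

Each value is encoded on its own before the query string is assembled, so the delimiters that hold the string together are never touched.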

Verdict

Use a library or built-in function (encodeURIComponent, URLSearchParams, Python's urllib.parse.urlencode) rather than manually deciding which characters to encode. The spec is subtle, the bugs are silent, and the standard libraries are well tested.
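
As a rough sketch of the library approach, the snippet below builds a query string with urllib.parse.urlencode and parses it back with parse_qs. The parameter names and values are hypothetical.

```python
from urllib.parse import urlencode, parse_qs

# Hypothetical parameters; the values contain characters that would
# otherwise be read as delimiters.
params = {"q": "rock&roll", "redirect": "https://example.com/a?b=c"}

query = urlencode(params)
print(query)
# q=rock%26roll&redirect=https%3A%2F%2Fexample.com%2Fa%3Fb%3Dc

# Parsing the query string recovers the original values.
decoded = {key: values[0] for key, values in parse_qs(query).items()}
print(decoded == params)  # True
```

One detail to be aware of: urlencode follows form-encoding conventions and writes a space as + rather than %20. Passing quote_via=urllib.parse.quote switches it to %20 if the receiving side expects that.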


Frequently asked questions

How do I convert text to Base64?
Paste your string into the Text field and the Base64 output appears instantly. The tool uses standard Base64 (RFC 4648), so the output matches every major language's built-in Base64 encoder and Linux's base64 command (which additionally wraps its output at 76 characters by default).
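
For comparison outside the browser, the same conversion can be sketched with Python's standard base64 module. This illustrates RFC 4648 encoding in general and is not the tool's own code; the sample text is arbitrary.

```python
import base64

text = "Hello, world!"

# Base64 operates on bytes, so the text is encoded as UTF-8 first.
encoded = base64.b64encode(text.encode("utf-8")).decode("ascii")
print(encoded)  # SGVsbG8sIHdvcmxkIQ==
```
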
What's the difference between Base64 and hex encoding?
Both represent binary data as text, but with different alphabets. Base64 uses 64 characters and needs roughly 4 chars per 3 bytes (33% overhead). Hex uses 16 characters and needs exactly 2 chars per byte (100% overhead). Base64 is denser, while hex is easier to read byte by byte.
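
The size difference is easy to check. A minimal sketch using 30 arbitrary bytes:

```python
import base64

data = bytes(range(30))  # 30 arbitrary bytes

b64 = base64.b64encode(data)  # 4 characters per 3 bytes
hexed = data.hex()            # 2 characters per byte

print(len(data), len(b64), len(hexed))  # 30 40 60
```
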
Why does my UTF-8 text break when converted to binary?
UTF-8 encodes non-ASCII characters as multibyte sequences, so a single emoji or accented letter becomes 2-4 bytes. The binary output will be longer than the character count suggests. That is correct behavior, not a bug.
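
The byte counts can be seen directly. Below is a small sketch; utf8_bits is a made-up helper name, and the sample characters are written as escapes so the code points are unambiguous.

```python
def utf8_bits(text: str) -> str:
    """Return the UTF-8 bytes of text as space-separated 8-bit groups."""
    return " ".join(f"{byte:08b}" for byte in text.encode("utf-8"))


# "A", e with acute accent, euro sign, grinning-face emoji
for ch in ["A", "\u00e9", "\u20ac", "\U0001F600"]:
    print(ch, len(ch.encode("utf-8")), utf8_bits(ch))
```

The four characters take 1, 2, 3, and 4 bytes respectively, which is why the binary output grows faster than the visible character count.
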
Is it safe to paste sensitive data into the converter?
Yes. The encoding conversion runs entirely in your browser with JavaScript; nothing is sent to our servers, logged, or stored. You can verify this with your browser's Network tab: no requests fire when you type.
What is URL-safe Base64?
A variant that replaces `+` with `-` and `/` with `_` so the result can be safely placed in URLs without percent-encoding. JWTs use URL-safe Base64, typically with the trailing `=` padding removed. Standard Base64 is fine for most other uses.
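
A short sketch of the two alphabets side by side, using Python's base64 module. The input bytes are chosen purely so that the output contains the characters that differ.

```python
import base64

raw = bytes([0xFB, 0xFF, 0xBF])  # chosen so the output contains + and /

standard = base64.b64encode(raw).decode("ascii")
url_safe = base64.urlsafe_b64encode(raw).decode("ascii")
print(standard)  # +/+/
print(url_safe)  # -_-_

# Formats such as JWT also drop the trailing "=" padding.
unpadded = base64.urlsafe_b64encode(bytes([0xFB, 0xFF])).decode("ascii").rstrip("=")
print(unpadded)  # -_8
```
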
Can I decode Base64 back to the original text?
Yes, the converter is bidirectional. Paste Base64 into the Base64 field and you'll get the original UTF-8 string back. If decoding fails silently, the input isn't valid Base64 (wrong characters, bad padding, or it was double-encoded).
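
To surface decoding failures instead of letting them pass silently, strict validation helps. The following is a minimal sketch in Python, not the converter's own code; decode_base64_text is a hypothetical helper name.

```python
import base64
import binascii


def decode_base64_text(encoded: str) -> str:
    """Decode Base64 to a UTF-8 string, raising ValueError on bad input."""
    try:
        # validate=True rejects characters outside the Base64 alphabet
        # instead of silently skipping them.
        raw = base64.b64decode(encoded, validate=True)
    except binascii.Error as exc:
        raise ValueError(f"not valid Base64: {exc}") from exc
    # Raises UnicodeDecodeError (a ValueError subclass) if the bytes
    # are not valid UTF-8.
    return raw.decode("utf-8")


print(decode_base64_text("SGVsbG8sIHdvcmxkIQ=="))  # Hello, world!
```
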