Which characters need percent-encoding in URLs? Reserved vs unreserved explained
RFC 3986 sorts URL characters into three sets: unreserved (always safe, never need encoding), reserved (act as delimiters, so encode them when they are used as data), and everything else, such as spaces, a literal percent sign, and non-ASCII characters (always encode). Knowing which set a character belongs to is the difference between a URL that works and one that silently breaks in production.
How this is calculated
Unreserved characters: A-Z, a-z, 0-9, hyphen (-), underscore (_), period (.), tilde (~). These never need encoding. Reserved characters are split into gen-delims (: / ? # [ ] @) and sub-delims (! $ & ' ( ) * + , ; =). A reserved character must be percent-encoded when it appears in a URL component as data rather than in its delimiter role. An & inside a query parameter value must be encoded as %26; otherwise the parser treats it as the start of a new parameter and cuts the value short. An & separating two query parameters must remain literal. This context sensitivity is why you should use a proper URL builder, or call encodeURIComponent() on each individual value, rather than run a blanket regex replacement over the whole URL.
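As a rough illustration of the three sets and the context rule, here is a minimal Python sketch. The helper names classify and encode_value, and the sample value "Tom & Jerry", are made up for this example; only urllib.parse.quote comes from the standard library. It treats every character outside the two RFC 3986 sets as "other", which is a simplification.

```python
import string
from urllib.parse import quote

# Character sets as listed in RFC 3986.
UNRESERVED = set(string.ascii_letters + string.digits + "-_.~")
GEN_DELIMS = set(":/?#[]@")
SUB_DELIMS = set("!$&'()*+,;=")


def classify(ch: str) -> str:
    """Return the set a single character belongs to (hypothetical helper)."""
    if ch in UNRESERVED:
        return "unreserved"
    if ch in GEN_DELIMS or ch in SUB_DELIMS:
        return "reserved"
    return "other"


def encode_value(value: str) -> str:
    """Percent-encode a string that is used purely as data.

    safe="" tells quote() to leave no reserved character literal,
    so only unreserved characters pass through unchanged.
    """
    return quote(value, safe="")


# The & inside the value is data and gets encoded; the & we write
# between the two parameters is a delimiter and stays literal.
query = "name=" + encode_value("Tom & Jerry") + "&page=" + encode_value("2")

print(classify("~"), classify("&"), classify(" "))  # unreserved reserved other
print(query)  # name=Tom%20%26%20Jerry&page=2
```

The encoding decision is made per value, before the pieces are joined. Encoding the finished string would also escape the delimiter & and break the query.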
Verdict
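Use a library or built-in function (encodeURIComponent, URLSearchParams, Python's urllib.parse.urlencode) rather than deciding by hand which characters to encode. The spec is subtle, the bugs are silent, and the standard libraries already apply the rules to each component for you. They do differ in small ways: encodeURIComponent leaves ! ' ( ) * unencoded, and URLSearchParams and urlencode write a space as + rather than %20.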
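For instance, a query string can be built and read back with Python's standard library alone. This is a small sketch, not a complete URL builder; the parameter names and the example.com URL are placeholders chosen for illustration.

```python
from urllib.parse import parse_qs, quote, urlencode

# Values containing reserved characters, including a whole URL as data.
params = {
    "q": "a&b=c",
    "redirect": "https://example.com/?x=1#top",
}

# urlencode() encodes each key and value, then joins them with literal & and =.
query = urlencode(params)
print(query)
# q=a%26b%3Dc&redirect=https%3A%2F%2Fexample.com%2F%3Fx%3D1%23top

# Parsing splits on the literal delimiters first, then decodes each piece,
# so the original values come back intact (each wrapped in a list).
decoded = parse_qs(query)
print(decoded["q"])  # ['a&b=c']

# By default a space becomes "+"; pass quote_via=quote to get %20 instead.
print(urlencode({"a": "x y"}))                   # a=x+y
print(urlencode({"a": "x y"}, quote_via=quote))  # a=x%20y
```

The round trip through parse_qs is a cheap way to check that no value was split or truncated by an unencoded delimiter.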
