URLs can only contain a limited set of ASCII characters. Spaces, special characters, Unicode text, and even some punctuation marks must be converted to a safe format before they can appear in a URL. This process is called URL encoding (or percent-encoding), and it's one of the most common sources of bugs in web applications. This guide explains how it works, when to use it, and the mistakes to avoid.
What Is URL Encoding?
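URL encoding replaces each unsafe character with a % followed by two hexadecimal digits for every byte of the character's UTF-8 encoding. A character that takes several bytes in UTF-8 produces several %XX groups, so é becomes %C3%A9. For example: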
Character Encoded Why
--------- -------- ----------------------------------
(space) %20 Spaces are not valid in URLs
& %26 & separates query parameters
= %3D = separates keys from values
# %23 # starts a fragment identifier
? %3F ? starts the query string
/ %2F / separates path segments
@ %40 @ is used in userinfo
+ %2B + has special meaning in forms
% %25 % is the escape character itself
The % character itself must be encoded as %25. If you see double-encoded output (like %2520 instead of %20), something is encoding a URL that was already encoded.
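The byte-by-byte rule can be sketched in a few lines. This is a minimal illustration, not a replacement for the built-in functions: percentEncode is a hypothetical helper name, and it keeps only the RFC 3986 unreserved characters unescaped.

```javascript
// Percent-encode every byte of a string's UTF-8 representation,
// leaving only the RFC 3986 unreserved characters untouched.
// `percentEncode` is a hypothetical helper used for illustration.
function percentEncode(text) {
  const unreserved = /[A-Za-z0-9\-._~]/;
  const bytes = new TextEncoder().encode(text); // UTF-8 bytes
  let out = '';
  for (const byte of bytes) {
    const ch = String.fromCharCode(byte);
    if (byte < 0x80 && unreserved.test(ch)) {
      out += ch; // safe ASCII character, copy as-is
    } else {
      out += '%' + byte.toString(16).toUpperCase().padStart(2, '0');
    }
  }
  return out;
}

const once = percentEncode('é 100%'); // "%C3%A9%20100%25"
const twice = percentEncode(once);    // "%25C3%25A9%2520100%2525"
```

The second call shows how double encoding arises: every % produced by the first pass is itself escaped to %25.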
Anatomy of a URL
Different parts of a URL have different encoding rules:
https://user:pass@example.com:8080/path/to/page?key=value&q=hello+world#section
\___/   \_______/ \_________/ \__/\___________/ \_____________________/ \_____/
scheme  userinfo     host     port    path               query          fragment
- Path: Encode everything except letters, digits, and /, -, ., _, ~
- Query string: Encode everything except letters, digits, and -, ., _, ~ (plus & and = where they act as delimiters)
- Fragment: Same rules as query string
- Host: International domain names use Punycode, not percent-encoding
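To see these components programmatically, the standard URL class (available in browsers and Node.js) splits an address into the same parts. The address below is a made-up example shaped like the one in the diagram.

```javascript
// Parse a URL into its components with the WHATWG URL API.
const parsed = new URL(
  'https://user:pass@example.com:8080/path/to/page?key=value&q=hello+world#section'
);

const parts = {
  scheme: parsed.protocol,   // "https:"
  username: parsed.username, // "user"
  password: parsed.password, // "pass"
  host: parsed.hostname,     // "example.com"
  port: parsed.port,         // "8080"
  path: parsed.pathname,     // "/path/to/page"
  query: parsed.search,      // "?key=value&q=hello+world"
  fragment: parsed.hash,     // "#section"
};

// searchParams applies form decoding, so + is read back as a space.
const q = parsed.searchParams.get('q'); // "hello world"

// Hosts are converted to Punycode rather than percent-encoded.
const idnHost = new URL('https://münchen.example/').hostname;
// "xn--mnchen-3ya.example"
```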
JavaScript Encoding Functions
JavaScript provides multiple encoding functions — and choosing the wrong one is a common mistake.
encodeURIComponent() — For Values
Use this to encode a single value that will be placed into a URL. It encodes everything except A-Z a-z 0-9 - _ . ~ ! * ' ( ).
// Encoding a search query for a URL parameter
const query = 'price > 100 & category = "electronics"';
const encoded = encodeURIComponent(query);
// "price%20%3E%20100%20%26%20category%20%3D%20%22electronics%22"
const url = `https://api.example.com/search?q=${encoded}`;
// Works correctly — & and = inside the value don't break the URL
encodeURI() — For Full URLs
Use this to encode a complete URL. It preserves the characters that give a URL its structure (: / ? # @ ! $ & ' ( ) * + , ; =) and encodes everything else outside the unreserved set, including spaces, non-ASCII text, square brackets, and a literal %.
// Encoding a full URL with a space in the path
const url = 'https://example.com/my page/résumé.pdf';
const encoded = encodeURI(url);
// "https://example.com/my%20page/r%C3%A9sum%C3%A9.pdf"
// Notice: :// and / are preserved
The Critical Difference
const value = 'a=1&b=2';
encodeURI(value); // "a=1&b=2" ← WRONG for a query value!
encodeURIComponent(value); // "a%3D1%26b%3D2" ← Correct
// Rule: Use encodeURIComponent for VALUES
// Use encodeURI for COMPLETE URLs
The escape() function is deprecated and doesn't handle Unicode correctly. Its output differs from the URL standard. Always use encodeURIComponent() or encodeURI().
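Because encodeURIComponent() leaves ! ' ( ) * unescaped, some callers wrap it to escape those as well, and apply it per path segment. The sketch below assumes that stricter behavior is wanted; encodeRFC3986 and buildPath are hypothetical helper names, not built-ins.

```javascript
// Escape the five characters encodeURIComponent leaves alone: ! ' ( ) *
// RFC 3986 lists them as reserved, so some APIs expect them encoded.
function encodeRFC3986(value) {
  return encodeURIComponent(value).replace(
    /[!'()*]/g,
    (ch) => '%' + ch.charCodeAt(0).toString(16).toUpperCase()
  );
}

// Encode each path segment separately so a "/" inside a segment
// is not mistaken for a path separator.
function buildPath(segments) {
  return '/' + segments.map((segment) => encodeRFC3986(segment)).join('/');
}

const strict = encodeRFC3986("it's (new)!");
// "it%27s%20%28new%29%21"

const path = buildPath(['docs', 'a/b', 'my file']);
// "/docs/a%2Fb/my%20file"
```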
URL Encoding in Other Languages
Python
from urllib.parse import quote, quote_plus, urlencode
# Encode a path segment
quote('my page/résumé.pdf') # 'my%20page/r%C3%A9sum%C3%A9.pdf'
quote('my page/résumé.pdf', safe='') # 'my%20page%2Fr%C3%A9sum%C3%A9.pdf'
# Encode a query parameter value (spaces become +)
quote_plus('hello world & goodbye') # 'hello+world+%26+goodbye'
# Encode an entire query string from a dict
urlencode({'q': 'price > 100', 'page': '2'}) # 'q=price+%3E+100&page=2'
PHP
// Encode a URL component
urlencode('hello world'); // "hello+world" (spaces as +)
rawurlencode('hello world'); // "hello%20world" (spaces as %20)
// For query parameters, use http_build_query
$params = ['q' => 'price > 100', 'page' => 2];
http_build_query($params); // "q=price+%3E+100&page=2"
Java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
// URLEncoder applies form encoding, so spaces become +. The Charset overload needs Java 10+.
String encoded = URLEncoder.encode("hello world & more", StandardCharsets.UTF_8);
// "hello+world+%26+more"
Spaces: %20 vs + (Plus Sign)
This is the most confusing aspect of URL encoding. Both %20 and + can represent a space, but they're used in different contexts:
Context Space Encoding Standard
------------------------- ---------------- ---------------------------
URL path segments %20 RFC 3986
Query string (URL) %20 RFC 3986
HTML form (GET/POST) + application/x-www-form-urlencoded
In practice: %20 is always safe. + only means "space" in form-encoded data (application/x-www-form-urlencoded). In a URL path, a literal + is a plus sign, not a space.
When in doubt, use %20 for spaces. It works everywhere. The + convention is a legacy from early HTML forms.
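The two conventions can be compared side by side in JavaScript. In this short sketch, encodeURIComponent() stands in for the RFC 3986 style and URLSearchParams for form encoding; the sample value is made up.

```javascript
// The same value encoded with both conventions.
const value = 'hello world+more';

const percentStyle = encodeURIComponent(value);
// "hello%20world%2Bmore"

const formStyle = new URLSearchParams({ q: value }).toString();
// "q=hello+world%2Bmore"

// Decoding: decodeURIComponent leaves + alone,
// while URLSearchParams turns it into a space.
const viaDecode = decodeURIComponent('hello+world');             // "hello+world"
const viaParams = new URLSearchParams('q=hello+world').get('q'); // "hello world"

// %20 is read as a space by both decoders.
const alsoSpace = new URLSearchParams('q=hello%20world').get('q'); // "hello world"
```

A literal plus sign survives only when it is sent as %2B, which both encoders above do automatically.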
Common Pitfalls
1. Double Encoding
// ❌ Double encoding — %20 becomes %2520
const path = '/search?q=hello%20world';
const broken = encodeURI(path); // '/search?q=hello%2520world'
// ✅ Encode the value, then build the URL
const query = 'hello world';
const url = '/search?q=' + encodeURIComponent(query);
2. Not Encoding User Input in URLs
// ❌ User input breaks the URL structure
const username = 'user&admin=true';
const unsafeUrl = `/api/users?name=${username}`;
// /api/users?name=user&admin=true ← injects a parameter!
// ✅ Encode the value
const safeUrl = `/api/users?name=${encodeURIComponent(username)}`;
// /api/users?name=user%26admin%3Dtrue ← safe
3. Encoding the Entire URL Instead of Just Values
// ❌ Breaks the URL structure
const brokenUrl = encodeURIComponent('https://example.com/path?q=test');
// "https%3A%2F%2Fexample.com%2Fpath%3Fq%3Dtest" ← unusable as a URL
// ✅ Only encode the dynamic parts
const q = encodeURIComponent('test value');
const url = `https://example.com/path?q=${q}`;
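One way to sidestep the first three pitfalls is to let the URL API do the encoding instead of concatenating strings. The sketch below assumes an absolute base URL; buildSearchUrl is a hypothetical helper and the parameter names are made up.

```javascript
// Build a URL from parts and let the URL API encode each value exactly once.
function buildSearchUrl(base, params) {
  const target = new URL(base);
  for (const [key, value] of Object.entries(params)) {
    target.searchParams.set(key, value); // encodes both key and value
  }
  return target.toString();
}

const result = buildSearchUrl('https://example.com/path', {
  q: 'test value',
  name: 'user&admin=true',
});
// "https://example.com/path?q=test+value&name=user%26admin%3Dtrue"

// Reading the value back returns the original string.
const roundTrip = new URL(result).searchParams.get('name'); // "user&admin=true"
```

Note that searchParams uses form encoding, so spaces appear as + rather than %20.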
4. Forgetting to Decode
// Server (Express-style handler) receives: ?q=hello%20world
const raw = req.query.q; // "hello world" (most frameworks decode for you)
// In the browser, URLSearchParams decodes values as it parses:
const params = new URLSearchParams(window.location.search);
const q = params.get('q'); // "hello world" (already decoded)
// Or decode a single value explicitly:
decodeURIComponent('hello%20world'); // "hello world"
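Manual decoding has a failure mode of its own: decodeURIComponent() throws a URIError on malformed input such as a lone %. A small wrapper can turn that into a fallback value; safeDecode is a hypothetical helper name.

```javascript
// Decode a value, returning a fallback instead of throwing on malformed input.
function safeDecode(encoded, fallback = null) {
  try {
    return decodeURIComponent(encoded);
  } catch (err) {
    if (err instanceof URIError) return fallback;
    throw err; // anything else is unexpected, so rethrow
  }
}

const ok = safeDecode('hello%20world'); // "hello world"
const lonePercent = safeDecode('100%'); // null (malformed escape)
const truncated = safeDecode('%E0%A4%A'); // null (incomplete sequence)

// Decode once only: a second pass would corrupt values that
// legitimately contain percent signs.
const stillEncoded = safeDecode('hello%2520world'); // "hello%20world"
```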
URL Encoding and Security
URL encoding is not a security mechanism — but incorrect encoding creates security vulnerabilities:
- Open Redirect: Unvalidated redirect URLs can be crafted with encoded characters to bypass naive checks.
- Path Traversal: Encoded ../ (%2e%2e%2f) can bypass path filters that only check the raw string.
- Parameter Injection: Unencoded & and = in values can inject additional query parameters.
- XSS via URL: Malicious JavaScript in URL fragments or query parameters must be properly encoded/escaped when reflected in HTML.
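The path traversal case shows why validation should run on the decoded value. The following is a minimal sketch, not a complete defense: hasTraversal is a hypothetical helper, and a real service would normally also resolve the path against a fixed base directory.

```javascript
// Decode first, then validate.
function hasTraversal(rawPath) {
  let decoded;
  try {
    decoded = decodeURIComponent(rawPath);
  } catch {
    return true; // treat malformed escapes as suspicious
  }
  // Look for ".." segments using either path separator.
  return decoded.split(/[\\/]/).includes('..');
}

const plain = hasTraversal('images/logo.png');         // false
const literal = hasTraversal('../etc/passwd');         // true
const encoded = hasTraversal('%2e%2e%2fetc%2fpasswd'); // true

// A check on the raw string alone misses the encoded form.
const naiveCheck = '%2e%2e%2fetc%2fpasswd'.includes('..'); // false
```

This decodes exactly once. If a later layer decodes again, a doubly encoded payload such as %252e%252e%252f would pass this check, so the validation should sit after the last decoding step.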
Quick Reference
Task Function / Method
---------------------------- ----------------------------------
Encode a query value (JS) encodeURIComponent(value)
Encode a full URL (JS) encodeURI(url)
Decode a value (JS) decodeURIComponent(value)
Build query string (JS) new URLSearchParams({key: value})
Encode value (Python) urllib.parse.quote(value, safe='')
Encode form value (Python) urllib.parse.quote_plus(value)
Build query (Python) urllib.parse.urlencode(dict)
Encode value (PHP) rawurlencode($value)
Encode form value (PHP) urlencode($value)