HTML Entity Encode / Decode

Why HTML Entity Encoding Matters

When user-supplied text is inserted into an HTML page without encoding, characters like < and & can break the page structure or — more critically — allow attackers to inject executable script tags. This is the root cause of Cross-Site Scripting (XSS), one of the most common web vulnerabilities.

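Always encode at the point of output (when rendering), not at the point of input (when storing). Storing pre-encoded data corrupts it when it is later used in non-HTML contexts such as JSON APIs, CSV exports, or plain-text email, where a stored `&amp;` shows up literally instead of as `&`.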
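
As a minimal sketch of encoding at output time, the snippet below uses Python's standard `html.escape`. The `render_comment` helper and the sample string are hypothetical, chosen only for illustration; the stored value stays raw and is encoded only when it is placed into markup.

```python
import html


def render_comment(comment: str) -> str:
    """Encode untrusted text at render time, just before it enters the HTML."""
    # html.escape replaces &, < and >; with quote=True (the default)
    # it also replaces " and ' so the result is safe in quoted attributes.
    return f"<p>{html.escape(comment)}</p>"


# Stored raw, exactly as the user typed it (hypothetical sample input).
stored = '<script>alert("xss")</script>'

print(render_comment(stored))
# <p>&lt;script&gt;alert(&quot;xss&quot;)&lt;/script&gt;</p>
```

Because the stored string is untouched, the same value can still be sent unchanged to a JSON API or a plain-text email.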

Common HTML entities quick reference

Character	Named entity	Numeric entity	When to use
`&`	`&amp;`	`&#38;`	Always (it starts every entity)
`<`	`&lt;`	`&#60;`	In text content and attribute values
`>`	`&gt;`	`&#62;`	In text content (technically optional but good practice)
`"`	`&quot;`	`&#34;`	Inside double-quoted HTML attributes
`'`	`&apos;`	`&#39;`	Inside single-quoted HTML attributes
non-breaking space	`&nbsp;`	`&#160;`	Prevent line breaks between words
`©`	`&copy;`	`&#169;`	Copyright symbol
`—`	`&mdash;`	`&#8212;`	Em dash in prose
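
Named and numeric entities for the same character decode to the same result. The following sketch checks this for a few common entities with Python's standard `html.unescape`; the `pairs` list and the sample string are illustrative.

```python
import html

# (named entity, numeric entity) pairs for a few common characters
pairs = [
    ("&amp;", "&#38;"),
    ("&lt;", "&#60;"),
    ("&gt;", "&#62;"),
    ("&quot;", "&#34;"),
    ("&nbsp;", "&#160;"),
    ("&copy;", "&#169;"),
    ("&mdash;", "&#8212;"),
]

for named, numeric in pairs:
    # Both forms decode to the same single character.
    assert html.unescape(named) == html.unescape(numeric)

decoded = html.unescape("&lt;p&gt; &copy; 2024")
print(decoded)  # <p> © 2024
```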

Encode non-ASCII characters

Characters outside the basic ASCII range (accented letters, emoji, CJK) are safe in UTF-8 HTML pages, but some legacy systems or email clients may mangle them. Encoding them as numeric entities (for example, `&#128512;` for 😀) keeps the markup pure ASCII, so the characters survive transports that mishandle the character encoding.
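
One way to produce such output, sketched here in Python, is to escape the HTML-special characters first and then let the built-in `xmlcharrefreplace` error handler turn every non-ASCII character into a decimal numeric entity. The function name `encode_for_ascii` is hypothetical.

```python
import html


def encode_for_ascii(text: str) -> str:
    """Escape HTML-special characters, then make the result pure ASCII."""
    # Step 1: escape &, <, >, " and ' so the text is safe inside HTML.
    escaped = html.escape(text)
    # Step 2: replace each non-ASCII character with &#NNNN; (decimal).
    # Doing this after step 1 keeps the new ampersands from being re-escaped.
    return escaped.encode("ascii", "xmlcharrefreplace").decode("ascii")


print(encode_for_ascii("café 😀"))  # caf&#233; &#128512;
print(encode_for_ascii("a < b"))    # a &lt; b
```

The order matters: running the two steps the other way round would turn `&#233;` into `&amp;#233;`.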