Middleware

markdown-for-agents includes middleware for serving Markdown responses via content negotiation. When an AI agent (or any client) sends Accept: text/markdown, the middleware intercepts the HTML response and converts it to Markdown automatically. Normal browser requests pass through untouched.

Each middleware is a separate package. Install only the ones you need:

npm install @markdown-for-agents/express
npm install @markdown-for-agents/fastify
npm install @markdown-for-agents/hono
npm install @markdown-for-agents/nextjs
npm install @markdown-for-agents/web

All middleware packages depend on markdown-for-agents (the core library), which is installed automatically as a dependency.

Python: The Python package includes middleware for FastAPI, Flask, and Django. See the Python package docs for details.

How It Works

1. The client sends a request with an Accept: text/markdown header.
2. Your server generates an HTML response as usual.
3. The middleware intercepts the response and converts the HTML to Markdown.
4. The client receives Content-Type: text/markdown; charset=utf-8.
5. The response includes an x-markdown-tokens header with the estimated token count, an ETag for cache validation, and a content-signal header with publisher consent signals (when configured).

All responses (converted or not) include Vary: Accept so that CDNs and proxies cache HTML and Markdown representations separately.

If the Accept header doesn't include text/markdown, or the upstream response isn't HTML, the middleware passes through without modification.
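The pass-through rule above can be sketched as a small decision function. This is a minimal illustration, not the library's implementation: the helper names `acceptsMarkdown`, `isHtml`, and `shouldConvert` are hypothetical, and treating `q=0` as "not acceptable" is an assumption about how negotiation might be handled.

```typescript
// Hypothetical helpers: the library's real negotiation logic may differ.
function acceptsMarkdown(accept: string | undefined): boolean {
    if (!accept) return false;
    return accept
        .split(',')
        .map(part => part.trim().toLowerCase())
        .some(part => {
            const [mediaType, ...params] = part.split(';').map(s => s.trim());
            if (mediaType !== 'text/markdown') return false;
            // Assumption: an explicit q=0 means the client refuses Markdown.
            const q = params.find(p => p.startsWith('q='));
            return q === undefined || Number(q.slice(2)) > 0;
        });
}

// True when the upstream Content-Type is text/html, with or without parameters.
function isHtml(contentType: string | undefined): boolean {
    return /^\s*text\/html\s*(;|$)/i.test(contentType ?? '');
}

// Convert only when the client asked for Markdown AND the upstream response is HTML.
function shouldConvert(accept: string | undefined, contentType: string | undefined): boolean {
    return acceptsMarkdown(accept) && isHtml(contentType);
}
```

For example, `shouldConvert('text/markdown', 'text/html; charset=utf-8')` is true, while a JSON response or a browser-style Accept header yields false and the response would pass through unchanged.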

Express

npm install @markdown-for-agents/express

import express from 'express';
import { markdown } from '@markdown-for-agents/express';

const app = express();

// Apply globally
app.use(markdown());

// Or with options
app.use(
    markdown({
        extract: true,
        baseUrl: 'https://example.com'
    })
);

app.get('/', (req, res) => {
    res.send(`
    <nav><a href="/">Home</a></nav>
    <main>
      <h1>Article</h1>
      <p>Content here...</p>
    </main>
    <footer>Copyright 2025</footer>
  `);
});

app.listen(3000);

The Express middleware intercepts res.send() calls. When the client sends Accept: text/markdown and the response content type is text/html, it converts the HTML body to Markdown before sending. Non-HTML responses and non-string bodies pass through untouched.
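The `res.send()` interception described above can be illustrated without Express. The sketch below is a simplified model under stated assumptions: `ResponseLike`, `interceptSend`, and the regex-based converter are hypothetical stand-ins, and it assumes the content type is already set when `send` is called (real Express may set it inside `res.send()` itself).

```typescript
// Minimal stand-in for the parts of an Express response used here.
interface ResponseLike {
    headers: Record<string, string>;
    send(body: unknown): void;
}

// Placeholder converter type; the real package delegates to the core converter.
type Converter = (html: string) => string;

function interceptSend(res: ResponseLike, wantsMarkdown: boolean, toMarkdown: Converter): void {
    const originalSend = res.send.bind(res);
    res.send = (body: unknown): void => {
        const contentType = res.headers['content-type'] ?? '';
        if (wantsMarkdown && contentType.includes('text/html') && typeof body === 'string') {
            res.headers['content-type'] = 'text/markdown; charset=utf-8';
            originalSend(toMarkdown(body));
            return;
        }
        // Non-HTML responses and non-string bodies pass through untouched.
        originalSend(body);
    };
}

// Usage with a fake response that records what was sent.
const sent: unknown[] = [];
const res: ResponseLike = {
    headers: { 'content-type': 'text/html' },
    send(body: unknown): void {
        sent.push(body);
    }
};
interceptSend(res, true, html => html.replace(/<h1>(.*?)<\/h1>/, '# $1'));
res.send('<h1>Article</h1>'); // converted before reaching the original send
res.send({ ok: true }); // non-string body is forwarded as-is
```

Wrapping the original method keeps route handlers unchanged: they still call `res.send()` with HTML, and the decision to convert is made at send time.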

Test it:

# Normal HTML response
curl http://localhost:3000/

# Markdown response
curl -H "Accept: text/markdown" http://localhost:3000/

Fastify

npm install @markdown-for-agents/fastify

import Fastify from 'fastify';
import { markdown } from '@markdown-for-agents/fastify';

const fastify = Fastify();

// Register as a plugin
fastify.register(markdown());

// Or with options
fastify.register(
    markdown({
        extract: true,
        baseUrl: 'https://example.com'
    })
);

fastify.get('/', async (request, reply) => {
    reply.type('text/html');
    return `
    <nav><a href="/">Home</a></nav>
    <main>
      <h1>Article</h1>
      <p>Content here...</p>
    </main>
    <footer>Copyright 2025</footer>
  `;
});

fastify.listen({ port: 3000 });

The Fastify middleware uses the onSend hook to intercept the response payload before it's sent to the client. This is the idiomatic Fastify approach for response transformation.

Hono

npm install @markdown-for-agents/hono

import { Hono } from 'hono';
import { markdown } from '@markdown-for-agents/hono';

const app = new Hono();

// Apply globally
app.use(markdown());

// Or with options
app.use(
    markdown({
        extract: true,
        baseUrl: 'https://example.com'
    })
);

app.get('/', c => {
    return c.html(`
    <nav><a href="/">Home</a></nav>
    <main>
      <h1>Article</h1>
      <p>Content here...</p>
    </main>
    <footer>Copyright 2025</footer>
  `);
});

export default app;

Test it:

# Normal HTML response
curl http://localhost:3000/

# Markdown response
curl -H "Accept: text/markdown" http://localhost:3000/

The Hono middleware uses MiddlewareHandler from Hono, so it integrates natively with Hono's middleware chain.

Next.js

npm install @markdown-for-agents/nextjs

Use a Next.js proxy for site-wide conversion. The proxy checks the Accept header and fetches the page as HTML before converting:

// proxy.ts
import { NextRequest, NextResponse, NextFetchEvent } from 'next/server';
import { withMarkdown } from '@markdown-for-agents/nextjs';

const options = {
    extract: true,
    deduplicate: true,
    contentSignal: { aiTrain: true, search: true, aiInput: true }
};

export async function proxy(request: NextRequest, event: NextFetchEvent) {
    const accept = request.headers.get('accept') ?? '';
    if (!accept.includes('text/markdown')) {
        return NextResponse.next();
    }

    // Self-fetch the page as HTML, then let withMarkdown convert the response.
    const handler = withMarkdown(
        async (req: NextRequest) => fetch(req.url, { headers: { accept: 'text/html' } }),
        { ...options, baseUrl: request.nextUrl.origin }
    );

    return (await handler(request, event)) ?? NextResponse.next();
}

export const config = {
    matcher: ['/', '/about', '/blog/:slug*']
};

How it works

The inner fetch sends accept: 'text/html', so when the request re-enters the proxy it hits the early return NextResponse.next() and renders the page normally — no infinite loop. Only Accept: text/markdown requests take this path; all other traffic passes straight through.
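The re-entry behaviour can be modelled with a toy function. This is only a simulation of the control flow, assuming hypothetical `render` and `toMarkdown` callbacks in place of Next.js page rendering and the converter; the recursive call stands in for the self-fetch.

```typescript
// Toy model of the proxy. `render` and `toMarkdown` are hypothetical placeholders.
function handle(
    accept: string,
    render: () => string,
    toMarkdown: (html: string) => string,
    log: string[]
): string {
    log.push(`proxy saw accept=${accept}`);
    if (!accept.includes('text/markdown')) {
        // Early return: the equivalent of NextResponse.next(), the page renders normally.
        return render();
    }
    // Self-fetch with an HTML Accept header; this re-enters the proxy exactly once.
    const html = handle('text/html', render, toMarkdown, log);
    return toMarkdown(html);
}
```

A Markdown request passes through `handle` twice (once as `text/markdown`, once as `text/html`), while an ordinary request passes through once. That matches the single extra round trip described above.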

Tradeoffs

This pattern makes a second HTTP request to your own server for every Markdown conversion. The Next.js proxy runs before page rendering and has no access to the response body, so there is no way to avoid this round trip within Next.js itself.

In practice this is usually fine:

- Latency: the second request is localhost-to-localhost (or edge-to-edge on Vercel), so it adds minimal overhead.
- Compute: your page renders twice for AI agent requests. For static or ISR pages this is a cache hit. For dynamic pages the extra render is the main cost.
- Scope control: use config.matcher to limit which routes are eligible, so non-content pages (API routes, auth, assets) are never double-fetched.

Measuring the overhead

Enable serverTiming: true to get a breakdown of where time is spent. The Next.js middleware sets two metrics in both the Server-Timing and x-markdown-timing headers:

Server-Timing: mfa.fetch;dur=32.1;desc="Proxy fetch", mfa.convert;dur=4.7;desc="HTML to Markdown"
x-markdown-timing: mfa.fetch;dur=32.1;desc="Proxy fetch", mfa.convert;dur=4.7;desc="HTML to Markdown"

- mfa.fetch: time spent on the proxy self-fetch (the second HTTP request)
- mfa.convert: time spent converting HTML to Markdown

Server-Timing surfaces in browser devtools (Network > Timing) and can be read programmatically via PerformanceServerTiming. However, some CDNs strip Server-Timing from cached responses. The x-markdown-timing header carries the same data under a custom name that survives CDN caching, so the timing from the original render remains observable.

Use this to monitor the real overhead in production, since local benchmarks underestimate mfa.fetch (localhost skips DNS, TLS, and CDN routing that happen on Vercel Edge).
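For production monitoring, the timing header value has to be parsed into numbers. The following is a minimal sketch of such a parser; `parseServerTiming` and `TimingMetric` are hypothetical names, and the parser assumes descriptions contain no commas or semicolons, which holds for the values shown above but not for every valid Server-Timing header.

```typescript
interface TimingMetric {
    name: string;
    dur?: number;
    desc?: string;
}

// Parses a Server-Timing style value such as the x-markdown-timing header.
// Simplification: quoted descriptions must not contain ',' or ';'.
function parseServerTiming(value: string): TimingMetric[] {
    return value
        .split(',')
        .map(entry => entry.trim())
        .filter(entry => entry.length > 0)
        .map(entry => {
            const [name, ...params] = entry.split(';').map(s => s.trim());
            const metric: TimingMetric = { name: name ?? '' };
            for (const param of params) {
                const eq = param.indexOf('=');
                if (eq === -1) continue;
                const key = param.slice(0, eq).trim().toLowerCase();
                // Strip surrounding double quotes from the value, if present.
                const raw = param.slice(eq + 1).trim().replace(/^"(.*)"$/, '$1');
                if (key === 'dur') metric.dur = Number(raw);
                if (key === 'desc') metric.desc = raw;
            }
            return metric;
        });
}
```

Applied to the example header above, this yields two metrics, `mfa.fetch` with `dur` 32.1 and `mfa.convert` with `dur` 4.7, which can then be forwarded to whatever metrics system you use.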

withMarkdown automatically includes nextImageRule, which unwraps /_next/image optimization URLs back to their original paths. For example, /_next/image?url=%2Fphoto.png&w=640&q=75 becomes /photo.png in the markdown output.

You can also use nextImageRule standalone with the core convert function:

import { nextImageRule } from '@markdown-for-agents/nextjs';
import { convert } from 'markdown-for-agents';

const { markdown } = convert(html, { rules: [nextImageRule] });
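To make the unwrapping concrete, here is a rough sketch of the URL transformation on its own. `unwrapNextImageUrl` is a hypothetical helper written for illustration; it is not the implementation of `nextImageRule`, which operates on image elements during conversion.

```typescript
// Hypothetical helper: maps a /_next/image optimization URL back to its source path.
function unwrapNextImageUrl(src: string): string {
    // A placeholder base lets the URL parser handle relative paths.
    const parsed = new URL(src, 'http://placeholder.invalid');
    if (parsed.pathname !== '/_next/image') return src;
    // searchParams.get() returns the percent-decoded value of the url parameter.
    return parsed.searchParams.get('url') ?? src;
}
```

With this sketch, `unwrapNextImageUrl('/_next/image?url=%2Fphoto.png&w=640&q=75')` returns `/photo.png`, and any URL that is not an optimization URL is returned unchanged.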

Full working example: See examples/nextjs/ for a complete Next.js app demonstrating the proxy pattern with integration tests.

Web Standard (Generic)

npm install @markdown-for-agents/web

For any server that uses the Web Standard Request/Response API (Cloudflare Workers, Deno, Bun, etc.):

import { markdownMiddleware } from '@markdown-for-agents/web';

const mw = markdownMiddleware({ extract: true });

// Cloudflare Workers
export default {
    async fetch(request: Request): Promise<Response> {
        return mw(request, async req => {
            const html = await renderPage(req);
            return new Response(html, {
                headers: { 'content-type': 'text/html' }
            });
        });
    }
};

Deno

import { markdownMiddleware } from '@markdown-for-agents/web';

const mw = markdownMiddleware({ extract: true });

Deno.serve(async request => {
    return mw(request, async () => {
        return new Response('<h1>Hello from Deno</h1>', {
            headers: { 'content-type': 'text/html' }
        });
    });
});

Bun

import { markdownMiddleware } from '@markdown-for-agents/web';

const mw = markdownMiddleware({ extract: true });

Bun.serve({
    async fetch(request) {
        return mw(request, async () => {
            return new Response('<h1>Hello from Bun</h1>', {
                headers: { 'content-type': 'text/html' }
            });
        });
    }
});

Options

All middleware functions accept MiddlewareOptions, which extends ConvertOptions with middleware-specific properties:

interface MiddlewareOptions extends ConvertOptions {
    tokenHeader?: string; // Default: "x-markdown-tokens"
    contentSignal?: ContentSignalOptions;
}

You can pass any ConvertOptions (extraction, rules, baseUrl, etc.) and they are forwarded to the converter:

const mw = markdownMiddleware({
    // Conversion options
    extract: true,
    baseUrl: 'https://example.com',
    headingStyle: 'atx',
    rules: [
        /* custom rules */
    ],

    // Publisher consent signals
    contentSignal: { aiTrain: true, search: true, aiInput: true },

    // Performance observability
    serverTiming: true, // Adds Server-Timing header with mfa.convert duration

    // Middleware-specific
    tokenHeader: 'x-token-count' // Custom header name
});

Response Headers

When the middleware converts a response, it sets these headers:

| Header | Value | Description |
| --- | --- | --- |
| `Content-Type` | `text/markdown; charset=utf-8` | Replaces the original `text/html` |
| `x-markdown-tokens` | `123` | Estimated token count (configurable header name) |
| `ETag` | `"2f-1a3b4c5"` | Content hash of the markdown output for cache validation |
| `Vary` | `Accept` | Ensures caches store separate entries per content type (always set, even on non-converted responses) |
| `content-signal` | `ai-train=yes, search=yes, ai-input=yes` | Publisher consent signals (only set when `contentSignal` option is configured) |
| `Server-Timing` | `mfa.convert;dur=4.7;desc="HTML to Markdown"` | Conversion duration in ms (only set when `serverTiming: true`) |
| `x-markdown-timing` | `mfa.convert;dur=4.7;desc="HTML to Markdown"` | Same as `Server-Timing`, but survives CDN caching (only set when `serverTiming: true`) |

The Next.js middleware includes an additional mfa.fetch metric in both timing headers, measuring the proxy self-fetch duration:

Server-Timing: mfa.fetch;dur=32.1;desc="Proxy fetch", mfa.convert;dur=4.7;desc="HTML to Markdown"
x-markdown-timing: mfa.fetch;dur=32.1;desc="Proxy fetch", mfa.convert;dur=4.7;desc="HTML to Markdown"

Server-Timing is a W3C standard header that surfaces automatically in browser devtools (Network tab > Timing) and is accessible via the PerformanceServerTiming API. However, some CDNs strip it from cached responses because the values are tied to a specific execution. The x-markdown-timing header carries the same data but uses a custom name that passes through CDN caching untouched, preserving the timing from the original render.

You can customise the header name via the timingHeader option:

markdown({ serverTiming: true, timingHeader: 'x-my-timing' });

Note that local benchmarks will underestimate the mfa.fetch overhead since the self-fetch goes to localhost; in production (e.g. Vercel Edge), the request goes through DNS, TLS, and CDN routing.
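The content-signal value listed among the response headers can be read as a serialization of the `contentSignal` option. The sketch below shows one plausible mapping. The `ContentSignalSketch` shape is inferred from the examples in this document, and rendering `false` as `no` and omitting unset keys are assumptions rather than documented behaviour.

```typescript
// Assumed option shape, inferred from the examples in this document.
interface ContentSignalSketch {
    aiTrain?: boolean;
    search?: boolean;
    aiInput?: boolean;
}

// Hypothetical serializer: true becomes "yes", false becomes "no" (assumption),
// and keys that are not set are left out of the header value.
function formatContentSignal(signal: ContentSignalSketch): string {
    const entries: Array<[string, boolean | undefined]> = [
        ['ai-train', signal.aiTrain],
        ['search', signal.search],
        ['ai-input', signal.aiInput]
    ];
    return entries
        .filter((entry): entry is [string, boolean] => entry[1] !== undefined)
        .map(([name, allowed]) => `${name}=${allowed ? 'yes' : 'no'}`)
        .join(', ');
}
```

Under these assumptions, `{ aiTrain: true, search: true, aiInput: true }` serializes to `ai-train=yes, search=yes, ai-input=yes`, matching the sample header value.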

Caching

The middleware sets two headers that enable efficient caching out of the box:

- Vary: Accept tells CDNs and proxies that the response varies by Accept header. Without this, a CDN could cache the HTML variant and serve it to an AI agent requesting Markdown (or vice versa). This header is set on all responses, not just converted ones.
- ETag is a deterministic content hash of the Markdown output. It enables conditional requests (If-None-Match) so CDNs and clients can validate cached responses without re-downloading the full body.
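The ETag validation flow can be sketched as follows. This is an illustration only: the hash (32-bit FNV-1a) and the `"<length>-<hash>"` format are assumptions chosen for brevity, since the library's actual hashing scheme is not stated here, and `computeEtag` and `isNotModified` are hypothetical names.

```typescript
// 32-bit FNV-1a over UTF-16 code units: a stand-in hash, not the library's.
function fnv1a(text: string): number {
    let hash = 0x811c9dc5;
    for (let i = 0; i < text.length; i++) {
        hash ^= text.charCodeAt(i);
        hash = Math.imul(hash, 0x01000193) >>> 0;
    }
    return hash;
}

// Hypothetical ETag format: "<length in hex>-<hash in hex>".
// Deterministic: identical Markdown always produces the same tag.
function computeEtag(markdown: string): string {
    return `"${markdown.length.toString(16)}-${fnv1a(markdown).toString(16)}"`;
}

// Returns true when the client's cached copy is still valid (respond with 304).
function isNotModified(ifNoneMatch: string | undefined, etag: string): boolean {
    if (!ifNoneMatch) return false;
    if (ifNoneMatch.trim() === '*') return true;
    return ifNoneMatch
        .split(',')
        .map(tag => tag.trim().replace(/^W\//, '')) // weak comparison ignores the W/ prefix
        .includes(etag);
}
```

A server or CDN would compare the incoming If-None-Match value against the tag for the current Markdown output and answer 304 Not Modified on a match, skipping the response body.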

To control cache lifetime, add Cache-Control at your infrastructure layer:

// Example: cache Markdown responses for 1 hour at the CDN
app.use((req, res, next) => {
    if (req.headers.accept?.includes('text/markdown')) {
        res.setHeader('cache-control', 'public, max-age=3600');
    }
    next();
});
app.use(markdown());

Import Paths

Each middleware is a separate npm package:

// Express
import { markdown } from '@markdown-for-agents/express';

// Fastify
import { markdown } from '@markdown-for-agents/fastify';

// Hono — requires hono as peer dependency
import { markdown } from '@markdown-for-agents/hono';

// Next.js — requires next as peer dependency
import { withMarkdown, nextImageRule } from '@markdown-for-agents/nextjs';

// Generic Web Standard — no framework dependency
import { markdownMiddleware } from '@markdown-for-agents/web';

How Each Middleware Intercepts Responses

| Framework | Mechanism |
| --- | --- |
| Express | Overrides `res.send()` to intercept the HTML body |
| Fastify | Uses the `onSend` hook to transform the payload |
| Hono | Uses Hono's native `MiddlewareHandler` with `c.res` replacement |
| Next.js | Wraps the route handler and replaces the `Response` object |
| Web Standard | Wraps the `next` handler and replaces the `Response` object |
