How to Generate LLMs.txt Automatically: CMS & CI/CD Integration Guide

14 min read · Tutorial

What you'll learn: How to automate llms.txt generation for WordPress, Next.js, Astro, Hugo, headless CMS platforms, and custom CI/CD pipelines -- so your AI-facing content always stays in sync with your site.

Why Automate LLMs.txt?

Manually maintaining an llms.txt file works fine when your website has a handful of pages that rarely change. But the moment your site grows -- dozens of documentation pages, weekly blog posts, a constantly evolving API reference -- manual updates become a liability. Pages get added without being reflected in llms.txt, URLs change without anyone updating the file, and broken links silently degrade the experience for AI systems trying to understand your content.

Automation solves these problems by generating your llms.txt file directly from your content source of truth -- whether that is a CMS database, a folder of Markdown files, or an API endpoint. When content changes, llms.txt changes with it, automatically and accurately.

Platforms Covered in This Guide

This guide covers WordPress (with Yoast SEO), Next.js, Astro, Hugo, headless CMS platforms (Contentful, Sanity, Strapi), and custom CI/CD pipelines with GitHub Actions. Pick the section that matches your stack and start automating.

Key benefits of automating llms.txt generation:

  • Always in sync: Your llms.txt automatically reflects your latest published content -- no manual updates required.
  • Fewer broken links: Generated links come from live data, eliminating typos and outdated URLs that plague manually maintained files.
  • Scales effortlessly: Whether you have 10 pages or 10,000, the generation script handles it identically.
  • Built-in validation: Add a validation step to your pipeline and catch formatting issues before they reach production.

WordPress: Using the Yoast SEO Plugin

If you run WordPress, the fastest path to an automated llms.txt is through the Yoast SEO plugin. Yoast added built-in llms.txt support, allowing you to generate and manage the file directly from your WordPress dashboard without writing a single line of code.

Step-by-Step Setup

  1. Install or update Yoast SEO: Make sure you are running the latest version. Navigate to Plugins → Installed Plugins → Yoast SEO and click "Update" if an update is available.
  2. Open Yoast Settings: Go to Yoast SEO → Settings in your WordPress admin sidebar.
  3. Navigate to the LLMs.txt tab: Under the "General" section, find and click the "LLMs.txt" tab. Enable the feature by toggling it on.
  4. Configure sections: Choose which post types to include (posts, pages, custom post types), how to group them into H2 sections, and which metadata to pull for link descriptions.
  5. Preview and publish: Yoast shows a live preview of the generated file. Once you are satisfied, save your settings and the file is automatically served at yoursite.com/llms.txt.

What Yoast Generates

Yoast builds your llms.txt with the following structure:

  • H1 heading pulled from your site title in Settings → General
  • Blockquote description from your site tagline
  • H2 sections grouped by category, post type, or custom taxonomy
  • Links with titles from post titles and descriptions from meta descriptions or excerpts
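Putting those pieces together, the generated file follows this shape. The site name, tagline, sections, and URLs below are placeholders, not actual Yoast output:

```markdown
# Example Site

> Practical tutorials and product documentation for Example Site.

## Guides
- [Getting Started](https://example.com/getting-started): First steps with the product
- [Advanced Configuration](https://example.com/advanced-config): Tuning options for power users

## Blog
- [Launch Announcement](https://example.com/blog/launch): Introducing version 2.0
```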

Customization Options

Yoast lets you fine-tune which content appears in your llms.txt. You can exclude specific post types, filter by category, add custom H2 sections with manual links, and choose whether to use the meta description or post excerpt as the link description. For advanced customization beyond what the UI offers, you can edit the generated output using WordPress hooks.

Other WordPress options: Several other WordPress plugins are emerging with llms.txt support, including dedicated plugins on the WordPress plugin directory. Yoast remains the most widely adopted solution, but explore alternatives if your needs are highly specialized.

Next.js: Build-Time Generation

For Next.js projects, the recommended approach is to generate llms.txt at build time using a Node.js script. This ensures the file is always up to date whenever you deploy, with zero runtime overhead.

Create a Generation Script

Create a file at scripts/generate-llms-txt.js in your project root:

// scripts/generate-llms-txt.js
const fs = require('fs');
const path = require('path');

const config = {
  siteName: 'My Documentation Site',
  siteUrl: 'https://docs.example.com',
  description: 'Comprehensive API documentation and developer guides',
  sections: [
    {
      title: 'Getting Started',
      links: [
        { title: 'Quick Start', path: '/docs/quickstart', desc: 'Get up and running in 5 minutes' },
        { title: 'Installation', path: '/docs/install', desc: 'Detailed installation instructions' },
      ]
    },
    {
      title: 'API Reference',
      links: [
        { title: 'REST API', path: '/api/rest', desc: 'Complete REST API documentation' },
        { title: 'Authentication', path: '/api/auth', desc: 'API authentication guide' },
      ]
    }
  ]
};

function generateLlmsTxt(config) {
  let content = `# ${config.siteName}\n\n`;
  content += `> ${config.description}\n\n`;

  for (const section of config.sections) {
    content += `## ${section.title}\n`;
    for (const link of section.links) {
      content += `- [${link.title}](${config.siteUrl}${link.path}): ${link.desc}\n`;
    }
    content += '\n';
  }

  fs.writeFileSync(path.join(__dirname, '../public/llms.txt'), content);
  console.log('llms.txt generated successfully!');
}

generateLlmsTxt(config);

Wire It Into Your Build

Add the script to your package.json so it runs automatically before every build:

{
  "scripts": {
    "prebuild": "node scripts/generate-llms-txt.js",
    "build": "next build",
    "dev": "next dev"
  }
}

Dynamic Content Sources

For sites with dynamic content, extend the script to read from your actual data sources. If you use MDX files, parse the frontmatter to extract titles and descriptions. If your content lives in a headless CMS, fetch it via the CMS API during the build step. The key principle is the same: read from the source of truth, output a valid llms.txt to your public/ directory.
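For the MDX case, a minimal frontmatter parser is enough to get titles and descriptions out of your files. This is a sketch: it assumes simple `key: value` frontmatter, and a production setup would typically use a library such as gray-matter to handle quoting and nested YAML:

```javascript
// Minimal frontmatter parser -- a sketch that assumes flat `key: value` pairs.
function parseFrontmatter(src) {
  const match = src.match(/^---\n([\s\S]*?)\n---/);
  if (!match) return {};
  const fields = {};
  for (const line of match[1].split('\n')) {
    const idx = line.indexOf(':');
    if (idx === -1) continue;
    const key = line.slice(0, idx).trim();
    const value = line.slice(idx + 1).trim().replace(/^['"]|['"]$/g, '');
    fields[key] = value;
  }
  return fields;
}

const mdx = `---
title: Quick Start
description: Get up and running in 5 minutes
---

# Quick Start`;

const meta = parseFrontmatter(mdx);
console.log(`- [${meta.title}](/docs/quickstart): ${meta.description}`);
```

Loop over your content directory with `fs.readdirSync`, run each file through the parser, and feed the results into the `sections` array from the generation script.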

Pro Tip: Use the same data source that powers your sitemap.xml to generate llms.txt. This ensures consistency and means you only maintain content metadata in one place.

Astro & Hugo: Static Site Generators

Static site generators are ideal candidates for llms.txt automation because they already process all your content at build time. Both Astro and Hugo offer straightforward approaches to generating the file alongside the rest of your site.

Astro: API Endpoint Approach

In Astro, create a static endpoint that outputs your llms.txt at build time:

// src/pages/llms.txt.ts
import { getCollection } from 'astro:content';

export async function GET() {
  const docs = await getCollection('docs');
  const posts = await getCollection('blog');

  let content = '# My Astro Site\n\n';
  content += '> Developer documentation and tutorials\n\n';

  content += '## Documentation\n';
  for (const doc of docs) {
    content += `- [${doc.data.title}](https://example.com/docs/${doc.slug}): ${doc.data.description}\n`;
  }

  content += '\n## Blog\n';
  for (const post of posts) {
    content += `- [${post.data.title}](https://example.com/blog/${post.slug}): ${post.data.description}\n`;
  }

  return new Response(content, {
    headers: { 'Content-Type': 'text/plain; charset=utf-8' },
  });
}

Hugo: Custom Output Format

Hugo supports custom output formats, which you can use to generate llms.txt from your content:

# config.toml
[outputFormats.LLMS]
  baseName = "llms"
  mediaType = "text/plain"
  isPlainText = true

[outputs]
  home = ["HTML", "RSS", "LLMS"]

Then create a template at layouts/index.llms.txt:

# {{ .Site.Title }}

> {{ .Site.Params.description }}

{{ range .Site.Sections }}
## {{ .Title }}
{{ range .Pages }}
- [{{ .Title }}]({{ .Permalink }}): {{ .Description }}
{{ end }}
{{ end }}

Key principle: With static site generators, always generate llms.txt during the build step so it is included in your final deployment output. Never rely on runtime generation for static sites.

Headless CMS Integration

If your content lives in a headless CMS like Contentful, Sanity, or Strapi, you can use webhooks to trigger llms.txt regeneration whenever content is published or updated. This ensures your AI-facing file always reflects the latest editorial changes without any manual intervention.

The Webhook Flow

  1. Content is created, updated, or published in your CMS
  2. The CMS fires a webhook to your build service (Vercel, Netlify, or a custom endpoint)
  3. Your build script fetches all published content from the CMS API
  4. The script generates a fresh llms.txt from the fetched content
  5. The updated file is deployed alongside the rest of your site

Filtering Content

Not all CMS content belongs in your llms.txt. Apply these filters when querying your CMS API:

  • Status: Only include published content, never drafts or archived items
  • Content type: Focus on documentation, guides, and key landing pages rather than every content type
  • Priority: If your CMS supports priority or featured flags, use them to curate the most important pages
  • Exclusion list: Maintain a list of slugs or IDs to explicitly exclude (login pages, admin screens, etc.)
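The filters above can be sketched as a single selection function. The field names here (`status`, `contentType`, `slug`, `featured`) are assumptions about your CMS schema, not any particular vendor's API:

```javascript
// Selects CMS entries eligible for llms.txt -- a sketch over an assumed schema.
const EXCLUDED_SLUGS = new Set(['login', 'admin', 'account']);
const INCLUDED_TYPES = new Set(['documentation', 'guide', 'landing-page']);

function selectForLlmsTxt(entries) {
  return entries
    .filter((e) => e.status === 'published')          // never drafts or archived items
    .filter((e) => INCLUDED_TYPES.has(e.contentType)) // only curated content types
    .filter((e) => !EXCLUDED_SLUGS.has(e.slug))       // explicit exclusion list
    .sort((a, b) => (b.featured === true) - (a.featured === true)); // featured first
}

const entries = [
  { slug: 'quickstart', status: 'published', contentType: 'documentation', featured: true },
  { slug: 'draft-post', status: 'draft', contentType: 'documentation' },
  { slug: 'login', status: 'published', contentType: 'landing-page' },
];
console.log(selectForLlmsTxt(entries).map((e) => e.slug)); // only 'quickstart' survives
```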

Example Webhook Handler

// api/regenerate-llms-txt.js
// Note: verifyWebhookSignature, cmsClient, writeFile, and triggerDeploy are
// placeholders -- wire them to your CMS SDK and deployment tooling.
export default async function handler(req, res) {
  // Verify webhook signature from your CMS
  if (!verifyWebhookSignature(req)) {
    return res.status(401).json({ error: 'Unauthorized' });
  }

  // Fetch published content from CMS
  const docs = await cmsClient.getEntries({
    content_type: 'documentation',
    'fields.status': 'published',
    order: '-fields.priority',
  });

  // Generate llms.txt content
  let llmsTxt = '# My Site\n\n> Site description\n\n';
  llmsTxt += '## Documentation\n';
  for (const doc of docs.items) {
    llmsTxt += `- [${doc.fields.title}](https://example.com/${doc.fields.slug}): ${doc.fields.summary}\n`;
  }

  // Write to public directory or upload to CDN
  await writeFile('public/llms.txt', llmsTxt);

  // Trigger redeployment
  await triggerDeploy();

  return res.status(200).json({ success: true });
}

Tip: Add a debounce or cooldown to your webhook handler. If editors publish multiple content updates in quick succession, you do not want to trigger a rebuild for every single change. A 60-second cooldown batches rapid-fire edits into a single regeneration.
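A cooldown gate can be as simple as remembering the last accepted trigger time. This in-memory sketch works for a long-lived server; on serverless platforms, where each invocation may be a cold start, you would store the timestamp in a durable store (Redis, a KV namespace) instead:

```javascript
// Cooldown gate for the webhook handler -- a sketch. Calls arriving within
// COOLDOWN_MS of the last accepted trigger are skipped, batching rapid edits
// into a single regeneration.
const COOLDOWN_MS = 60_000; // 60-second cooldown
let lastTriggeredAt = -Infinity;

function shouldTrigger(now = Date.now()) {
  if (now - lastTriggeredAt < COOLDOWN_MS) return false;
  lastTriggeredAt = now;
  return true;
}
```

In the handler, an early `if (!shouldTrigger()) return res.status(202).json({ skipped: true });` before the CMS fetch would acknowledge the webhook without rebuilding.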

CI/CD Pipeline Integration

For maximum reliability and auditability, integrate llms.txt generation into your CI/CD pipeline. This approach works regardless of your CMS, framework, or hosting provider -- and gives you a single place to generate, validate, and deploy the file.

GitHub Actions Example

Here is a complete GitHub Actions workflow that generates and validates your llms.txt on every push to the main branch when content files change:

name: Update LLMs.txt
on:
  push:
    branches: [main]
    paths: ['content/**', 'docs/**']

jobs:
  generate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: '20'
      - run: npm ci
      - run: node scripts/generate-llms-txt.js
      - run: npx llms-txt-validator public/llms.txt  # Validate
      - uses: stefanzweifel/git-auto-commit-action@v5
        with:
          commit_message: "chore: update llms.txt"
          file_pattern: public/llms.txt

Key Steps in the Pipeline

Every llms.txt CI/CD pipeline should follow three core steps:

1. Generate

Run your generation script to produce a fresh llms.txt from current content sources.

2. Validate

Run a validator to check structure, URLs, and formatting. Fail the build if validation fails.

3. Deploy

Commit the file back to the repo, or deploy directly to your hosting platform.

Scheduled Regeneration

In addition to triggering on content changes, run a weekly cron job to catch stale links and ensure freshness:

on:
  schedule:
    - cron: '0 9 * * 1'  # Every Monday at 9:00 AM UTC
  push:
    branches: [main]
    paths: ['content/**', 'docs/**']

Adding Validation

The validation step is critical. It catches issues like missing H1 headings, invalid URLs, broken Markdown formatting, and files that exceed recommended size limits. Use our free validator tool to check files manually, or add a CLI validation command to your pipeline for automated checks. If validation fails, the pipeline should stop and alert your team before a broken llms.txt reaches production.
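The core structural checks are straightforward to implement yourself if you prefer not to depend on an external tool. This minimal sketch covers exactly one H1, a blockquote description, and well-formed absolute link URLs; it is not a complete implementation of the llms.txt spec:

```javascript
// Minimal llms.txt structural validator -- a sketch, not a full spec check.
function validateLlmsTxt(content) {
  const errors = [];
  const lines = content.split('\n');

  // Exactly one H1 heading is expected.
  const h1Count = lines.filter((l) => /^# /.test(l)).length;
  if (h1Count !== 1) errors.push(`expected exactly one H1, found ${h1Count}`);

  // A blockquote description should follow the H1.
  if (!lines.some((l) => l.startsWith('> '))) {
    errors.push('missing blockquote description');
  }

  // Every link line must be well-formed Markdown with a parseable URL.
  for (const line of lines.filter((l) => l.startsWith('- ['))) {
    const m = line.match(/^- \[[^\]]+\]\(([^)]+)\)/);
    if (!m) { errors.push(`malformed link line: ${line}`); continue; }
    try { new URL(m[1]); } catch { errors.push(`invalid URL: ${m[1]}`); }
  }
  return errors;
}
```

In CI, `process.exit(1)` when the returned array is non-empty, so a broken file fails the build rather than shipping.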

Monitoring & Maintenance

Automation handles the heavy lifting, but you still need monitoring to catch edge cases. Spot-check the generated file after significant deployments, watch your pipeline for validation failures, review the output whenever you add a new content type or change your CMS schema, and periodically confirm that every linked URL still resolves.
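For the stale-link check, the first step is pulling every linked URL out of the generated file. A scheduled job can then issue a HEAD request per URL and alert on non-200 responses; this sketch covers only the extraction:

```javascript
// Extracts every absolute URL linked from an llms.txt -- input for a
// scheduled stale-link check (e.g. the weekly cron job above).
function extractUrls(content) {
  return [...content.matchAll(/\]\((https?:\/\/[^)]+)\)/g)].map((m) => m[1]);
}

const sample = [
  '# My Site',
  '',
  '> Site description',
  '',
  '## Docs',
  '- [Quick Start](https://example.com/docs/quickstart): Get started',
  '- [REST API](https://example.com/api/rest): API reference',
].join('\n');

console.log(extractUrls(sample));
// -> ['https://example.com/docs/quickstart', 'https://example.com/api/rest']
```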

Conclusion

Automating llms.txt generation is the difference between a file that starts strong and slowly decays, and one that stays accurate and useful indefinitely. Whether you use a WordPress plugin like Yoast SEO, a build-time script in Next.js or Astro, a headless CMS webhook, or a full CI/CD pipeline with GitHub Actions, the approach is the same: generate from your source of truth, validate the output, and deploy with confidence.

Start with the solution that matches your current stack. If you are on WordPress, enable Yoast's llms.txt feature today -- it takes less than five minutes. If you are on a custom stack, begin with a simple generation script and add validation and CI/CD integration as your workflow matures. The important thing is to stop maintaining llms.txt manually and let automation keep it in sync with your content.

And always validate. Automated generation eliminates human error in content updates, but it can introduce its own issues -- a misconfigured template, a changed API response format, a new content type that was not accounted for. A validation step in your pipeline catches these problems before they reach production.

