Structured Data: From Manual Root.js to a JSON-LD Plugin
How my work on structured data eventually turned into a Docusaurus plugin
In my previous article, "I Took Control of My Metadata", I explained how I explored two different ways of injecting structured data into Docusaurus.
That work led me to a simple conclusion: for a bilingual, content-rich website designed for modern discoverability (SEO, GEO, and generative engines), none of the existing approaches was sustainable in the long run.
At that point, I had a choice: accept growing complexity… or change my approach entirely. That's how a third method emerged — and eventually, a plugin.
The work described below was not outsourced to AI, and the plugin itself was not developed by AI either.

Recap of the Methods
In my previous article, I presented two ways to manage structured data in Docusaurus and inject it into the HTML <head>, while hinting that I had arrived at a third method 😁.
Method 1 via docusaurus.config.js
This approach consists of defining a global schema, valid for the entire site, and placing it in Docusaurus's main customization entry point: the configuration file.
➕ Convenient for shared information (organization, website, person).
➖ Insufficient to describe individual pages.
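As a reminder, here is a minimal sketch of this approach using Docusaurus's headTags config option (the organization values below are illustrative placeholders, not my actual config):

// docusaurus.config.js: a site-wide JSON-LD block, injected on every page
export default {
  // ...title, url, i18n, presets...
  headTags: [
    {
      tagName: 'script',
      attributes: { type: 'application/ld+json' },
      // One schema for the whole site: fine for Organization or WebSite,
      // but it cannot describe individual pages.
      innerHTML: JSON.stringify({
        '@context': 'https://schema.org',
        '@type': 'Organization',
        name: 'CoffeeCup.tech',
        url: 'https://coffeecup.tech/',
      }),
    },
  ],
};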
Method 2 via a Root.js File
This method consists of creating a global React component* that injects page-specific JSON-LD.
To do so, a Root.js file must be populated with all desired properties for both base entities and each Markdown page of the site.
➕ Powerful, centralized, detailed, and exhaustive.
➖ Manual: a detailed schema must be written for every page.
➖ Hard to maintain and evolve over time.
* Global React component injected at the Docusaurus theme level.
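To make the pattern concrete, here is a heavily trimmed sketch of such a component (the route keys and schema fields are illustrative; my real file described every page of the site):

// src/theme/Root.js: wraps the whole app once swizzled into the theme
import React from 'react';
import Head from '@docusaurus/Head';
import { useLocation } from '@docusaurus/router';

// One hand-written schema per page. On my site, this map is what
// eventually grew to nearly 2,800 lines.
const SCHEMAS = {
  '/contact': {
    '@context': 'https://schema.org',
    '@type': 'ContactPage',
    headline: 'Contact CoffeeCup.tech',
    // ...author, dates, canonical URL, image...
  },
  // ...one entry per page, per language...
};

export default function Root({ children }) {
  const { pathname } = useLocation();
  const schema = SCHEMAS[pathname];
  return (
    <>
      {schema && (
        <Head>
          <script type="application/ld+json">{JSON.stringify(schema)}</script>
        </Head>
      )}
      {children}
    </>
  );
}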
The Concrete Limits of Root.js
In practice, this approach worked: the JSON-LD was correctly interpreted by Docusaurus, and the properties were properly defined.
But very quickly, the work became tedious, and the file threatened to become unmanageable.
When a File Hits Its Limits
Once completed, the method ran into structural limits: a file of nearly 2,800 lines, centralizing the description of about fifty pages… per language. And my site has two.
This approach works as long as the site content remains stable.
However, every new page requires a manual update, with a risk of errors proportional to the file's size.
This is a classic trap: what is acceptable at a small scale quickly becomes a burden once the site starts to grow.
The Multilingual Problem
My site is bilingual (French and English), and I enabled Docusaurus i18n. I therefore need to inject language-specific structured data so that the site stays visible to French search, even though it is English-first.
However, the Root.js file lives in src/theme and is, by nature, monolingual:
- Docusaurus does not duplicate this component per locale (there is no copy in i18n/fr).
- Translations are not supported, unlike content in src/components, for example.
Maintaining two Root.js files is therefore not supported, and thankfully so. Duplicating this kind of logic would only multiply maintenance issues and create a real Rube Goldberg machine 😁.
Redundancy at the Heart of the Problem
Method 2 introduced significant redundancy:
- Repeating base entities (organization, website, person).
- Manually declaring canonical URLs for every page.
- Copy-pasting titles, descriptions, and dates already present in the front matter.
This duplication was not only tedious — it was dangerous for overall site consistency.
The @graph Best Practice
To limit these repetitions, I used an @graph.
This approach groups all entities related to a page (organization, author, page, article…) into a single JSON-LD block, improving readability and partially simplifying maintenance. Search engines interpret a well-structured graph more easily than a succession of independent scripts.
On this topic, you can watch the YouTube video "JSON-LD: How to Structure Your Site for Google and AI" (the @graph method starts at 4:34).
The @graph method is an excellent foundation for structuring JSON-LD cleanly.
But on a real, multilingual, constantly evolving site, it is not sufficient on its own.
The core issue remained: URLs and duplication with front matter continued to create real consistency problems.
It was precisely this realization that led me to design a dedicated plugin — the subject of Part 2️⃣ 😉.
See a real @graph example used on CoffeeCup.tech:
"@graph": [
{
"@type": "Organization",
"@id": "https://coffeecup.tech/#organization",
"name": "CoffeeCup.tech",
"legalName": "Florence Venisse EI",
"legalRepresentative": {
"@id": "https://coffeecup.tech/#person"
},
"location": "France",
(...)
},
{
"@type": "Person",
"@id": "https://coffeecup.tech/#person",
(...)
},
{
"@type": "WebSite",
"@id": "https://coffeecup.tech/#website",
"url": "https://coffeecup.tech/",
"name": "CoffeeCup.tech",
(...)
}
]
Front Matter at the Core of the Solution
💡 That's when the breakthrough happened: front matter is an existing source of truth.
Present on every Markdown (and MDX) page, it is structured, reliable, and already used by Docusaurus to generate HTML metadata.
The key was to fully leverage the fields that already map to schema.org properties, instead of duplicating them elsewhere.
Front matter describes each page like an identity card:
---
id: contact
title: Contact CoffeeCup.tech
language: EN
author: Florence Venisse
description: "Contact CoffeeCup.tech xxxx."
date: 2026-01-19
---
Before Docusaurus, I used Hugo, a static site generator that relies heavily on front matter to type pages, manage content, and structure metadata.
Designing Method 3
Method 3 transforms front matter into JSON-LD. It combines:
- method 1 for global data,
- front matter for page-specific data,
- and method 2, since it dynamically generates a Root.js.
At this point, the goal was clear: scale up and definitively move away from hand-written JSON-LD and its redundancies.
Some mappings are obvious:
- title ⬌ headline
- author ⬌ author
- date ⬌ datePublished
But a schema that modern engines can truly exploit also requires:
- an explicit page type (WebPage, Article, BlogPosting…),
- relevant keywords,
- a reliable canonical URL,
- images optimized for snippets.
💡 By progressively enriching front matter, most of the JSON-LD could be sourced directly from where the information already lived.
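As a sketch of the idea (the names schemaType, siteUrl, and path are illustrative, not the plugin's actual API), the mapping boils down to a pure function from front matter to a JSON-LD node:

// Sketch: derive a page's JSON-LD node from its front matter.
function frontMatterToJsonLd(frontMatter, { siteUrl, path }) {
  return {
    '@context': 'https://schema.org',
    '@type': frontMatter.schemaType || 'WebPage', // explicit page type
    '@id': `${siteUrl}${path}#webpage`,
    url: `${siteUrl}${path}`, // reliable canonical URL
    headline: frontMatter.title, // title ⬌ headline
    description: frontMatter.description,
    author: { '@type': 'Person', name: frontMatter.author },
    datePublished: frontMatter.date, // date ⬌ datePublished
    keywords: frontMatter.keywords, // relevant keywords
    image: frontMatter.image, // snippet-ready image
    inLanguage: frontMatter.language,
  };
}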
Splitting Responsibilities
To eliminate redundancy and achieve a clear, maintainable architecture, the specific had to be strictly separated from the global.
- Specific data — title, description, dates, type, image, keywords — lives in front matter.
- Shared data — organization, recurring author, publisher, global graph — is centralized in docusaurus.config.js.
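Concretely, the shared half can sit under a single key in the config, for example via Docusaurus's customFields option (the shape below is illustrative, not my actual configuration):

// docusaurus.config.js: the "global" half of the split
export default {
  // ...
  customFields: {
    structuredData: {
      organization: { name: 'CoffeeCup.tech', legalName: 'Florence Venisse EI' },
      person: { name: 'Florence Venisse' },
      website: { url: 'https://coffeecup.tech/' },
    },
  },
};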
A Systematic Preparatory Effort
I therefore enriched all my Markdown pages:
- Unique descriptions aligned with Google guidelines.
- Non-redundant keywords.
- Snippet-ready images.
- Explicitly defined page types.
This work laid the foundation for the plugin.
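For illustration, here is what the contact page's front matter could look like once enriched (schemaType, keywords, and image are example field names; Part 2 details the exact schema):

---
id: contact
title: Contact CoffeeCup.tech
language: EN
author: Florence Venisse
description: "Contact CoffeeCup.tech xxxx."
date: 2026-01-19
schemaType: ContactPage
keywords: [technical writing, documentation, contact]
image: /img/social/contact-card.png
---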
What I Take Away From This Experience
I loved working behind the scenes. This project combined strategic reflection, technical learning, and real product design.
- A reflection on my positioning and what I want to say about CoffeeCup.tech, my work, and myself as an expert technical writer.
- A deeper understanding of SEO, structured data, and the Docusaurus ecosystem.
- A product design approach: identifying a real need and building a durable, reusable solution.
If you are working on a content-rich, multilingual Docusaurus site designed to last, this reflection can help you avoid costly technical missteps.
This plugin is not just a tool. It is a way of thinking about documentation as a coherent data graph, readable by humans and machines alike — and ready for generative engines.
In Part 2️⃣, I'll get into the concrete details:
- The plugin architecture.
- How front matter is read and validated.
- Multilingual handling.
- And automatic generation of coherent, maintainable, AI-ready JSON-LD schemas.
The next part is coming soon.