Intro
For reasons I won't go into, I wanted to build a Gatsby plugin which wraps elements / blocks of my choice in a div with a custom class name applied. I'm using Markdown as the source material for my Gatsby project, so here are the basics of how I wanted it to work – this…
<!-- CUSTOM_CLASS_WRAPPER="special-heading-style" -->
# Hello World
…would get transformed to this…
<div class="special-heading-style">
<h1>Hello World</h1>
</div>
I also wanted it to work with Markdown components that generate multiple HTML elements. For example, this…
<!-- CUSTOM_CLASS_WRAPPER="special-list-style" -->
- My List
  - Sub Item
…would become
<div class="special-list-style">
<ul>
<li>
<p>My List</p>
<ul>
<li><p>Sub Item</p></li>
</ul>
</li>
</ul>
</div>
I thought this would be simple. After all, this is actually a tiny modification, and Gatsby is designed around extensibility and plugins. But things weren't as simple as they seemed…
Background on Markdown Conversion
For the uninitiated, the majority of Gatsby projects that take Markdown as the source content, such as the go-to starter blog, under the hood use a plugin called gatsby-transformer-remark to convert the markdown to HTML. This, in turn, can accept hundreds of other plugins that extend it and modify how it handles the markdown conversion.
When you are writing a plugin, you basically are writing a callback function that gatsby-transformer-remark will call with, among other arguments, a Markdown AST (more on this later), which is a representation of what was extracted out of the markdown source. Your plugin can modify that content, by adding, deleting, or changing nodes in the AST.
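To make the callback shape concrete, here is a minimal sketch with no dependencies. The tree literal is a hand-written approximation of the MDAST for the first example in the intro (position data omitted), and `examplePlugin` and its `suffix` option are hypothetical names used only for illustration. A real plugin would receive `markdownAST` from gatsby-transformer-remark rather than building it by hand.

```javascript
// Hand-written approximation of the MDAST for:
//   <!-- CUSTOM_CLASS_WRAPPER="special-heading-style" -->
//   # Hello World
const sampleAst = {
  type: "root",
  children: [
    { type: "html", value: '<!-- CUSTOM_CLASS_WRAPPER="special-heading-style" -->' },
    {
      type: "heading",
      depth: 1,
      children: [{ type: "text", value: "Hello World" }],
    },
  ],
};

// Hypothetical plugin: the first argument carries the AST (among other things),
// the second carries the options configured for the plugin.
const examplePlugin = ({ markdownAST }, pluginOptions = {}) => {
  const suffix = pluginOptions.suffix || "!";
  // Walk the tree recursively and change every text node in place.
  const walk = (node) => {
    if (node.type === "text") {
      node.value += suffix;
    }
    (node.children || []).forEach(walk);
  };
  walk(markdownAST);
  return markdownAST;
};

const result = examplePlugin({ markdownAST: sampleAst }, { suffix: "!!" });
console.log(result.children[1].children[0].value);
```

The point is only that the plugin mutates the tree it was handed and returns it; what it does to each node is up to you.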
The Problem
Going back to what we want to accomplish, it seems like writing our plugin should be straightforward. Receive the AST, look for our special HTML comment, and if found, wrap the first adjacent sibling in a new HTML node. Otherwise, do nothing. Here is a rough attempt:
const visit = require('unist-util-visit');

const plugin = (args, options) => {
  const markdownAST = args.markdownAST;
  // Matches a trigger comment such as <!-- ADD_CLASS="my-class" -->
  const commentClassPatt = /^[ ]*<!--\s*ADD_CLASS="([^\r\n>"]*)"\s*-->[ ]*$/;
  visit(markdownAST, `html`, (node, index, parent) => {
    const commentClassMatch = node.value.match(commentClassPatt);
    if (commentClassMatch) {
      const className = commentClassMatch[1];
      const siblings = parent.children || [];
      const nextSibling = siblings[index + 1];
      if (nextSibling) {
        visit(nextSibling, (node, index, parent) => {
          // Build an HTML wrapper and try to nest the original node inside it
          const wrapperNode = {
            type: 'html',
            value: `<div class="${className}"></div>`
          };
          wrapperNode.children = [
            {
              ...node
            }
          ];
          node = wrapperNode;
        });
      }
    }
  });
  return markdownAST;
}
module.exports = plugin
So, once again, our steps look like:
- Take the Markdown AST spit out by the main transformer, and look for our special magic HTML comment
- If found, check if it has an immediate adjacent sibling
- If sibling is found, wrap with a <div>, with the desired class name
- Replace the sibling with the wrapper, with itself as child
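The first two steps can be tried in isolation. This is a small sketch, assuming nodes shaped like the MDAST `html` and `heading` nodes; `getWrapperClass` and `getNextSibling` are hypothetical helper names. Note that the pattern, as written in the rough attempt, looks for the keyword `ADD_CLASS`, so a comment using a different keyword will not trigger it.

```javascript
// Pattern copied from the rough attempt above.
const commentClassPatt = /^[ ]*<!--\s*ADD_CLASS="([^\r\n>"]*)"\s*-->[ ]*$/;

// Hypothetical helper: returns the class name, or null when the node is not a trigger.
const getWrapperClass = (node) => {
  if (node.type !== "html" || typeof node.value !== "string") {
    return null;
  }
  const match = node.value.match(commentClassPatt);
  return match ? match[1] : null;
};

// Hypothetical helper: the node right after position `index`, or null if there is none.
const getNextSibling = (parentNode, index) =>
  (parentNode.children || [])[index + 1] || null;

const root = {
  type: "root",
  children: [
    { type: "html", value: '<!-- ADD_CLASS="special-heading-style" -->' },
    { type: "heading", depth: 1, children: [{ type: "text", value: "Hello World" }] },
  ],
};

console.log(getWrapperClass(root.children[0])); // class name taken from the comment
console.log(getNextSibling(root, 0).type); // the node that would be wrapped
console.log(getWrapperClass({ type: "html", value: '<!-- CUSTOM_CLASS_WRAPPER="x" -->' })); // null
```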
If I add breakpoints and logging output, I can see that my code is finding the right triggers and modifying nodes in place. However, this does not work! Why? Well, it comes down to a misunderstanding of how Gatsby, the Markdown AST, and HTML all fit together.
Finding the Solution
The missing piece of understanding here is twofold.
- In a Markdown AST, you can have nested markdown (AST) nodes, but you can't really have an HTML node that contains nested markdown
  - You can see this yourself in the handy AST Explorer tool – sample
- Using a Markdown node within an HTML node requires conversion first
So, re-evaluating things… if the AST node we want to wrap in a custom div…
- …is markdown
  - We first need to convert it to HTML, then wrap
- …is HTML
  - We could wrap it in place
BUT, either way, we must replace the node we want to wrap with a single HTML node, with the value as a string.
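In other words, the replacement is always a flat `html` node whose `value` already contains the finished markup. A minimal sketch of that target shape follows; `buildWrapperNode` is a hypothetical helper, and the `<h1>` string simply stands in for whatever the markdown-to-HTML step would return.

```javascript
// Hypothetical helper: build the single `html` node that takes the place of
// the original node. The wrapped content must already be an HTML string.
const buildWrapperNode = (className, innerHtml) => ({
  type: "html",
  value: `<div class="${className}">${innerHtml}</div>`,
});

// Case 1: the sibling is already an `html` node, so its value can be reused.
const htmlSibling = { type: "html", value: "<span>Already HTML</span>" };
const wrappedHtml = buildWrapperNode("special-style", htmlSibling.value);

// Case 2: the sibling is markdown. This string stands in for the converted
// output of `# Hello World`.
const convertedHeading = "<h1>Hello World</h1>";
const wrappedMarkdown = buildWrapperNode("special-heading-style", convertedHeading);

console.log(wrappedHtml.value);
console.log(wrappedMarkdown.value);
```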
Markdown to HTML?
So far, so good. But how do we actually convert a snippet of markdown to HTML in Gatsby? This is where there is a bit of a gap in documentation, as most plugins are generating HTML from internal logic, rather than from markdown.
Option A) Use remark / unified library from scratch
If you peek inside the gatsby-transformer-remark plugin, you will see that Gatsby actually uses remark for markdown to HTML transformation, and that in turn is heavily powered by the unified suite of tools – there is a reason Gatsby is a gold sponsor of unified 🙂
There is nothing stopping you, as a plugin developer, from using these libraries directly. In fact, I would probably recommend it for situations like these; that way if some of the Gatsby internal APIs change, you are not forced to update.
Option B) Use methods from gatsby-transformer-remark
Rather than rolling our own code for converting a bit of markdown to HTML, shouldn't we be able to just use transformer-remark itself? After all, that is the main point of the plugin; converting markdown to HTML.
At first glance, it seems like we are in luck; there is a method exposed in the plugin arguments – args.compiler.generateHTML, and if we look at the source for it, it says it takes a markdown node and returns html. PERFECT! Let's try passing in a markdown node from visit() that we want to convert to HTML and then wrap…
UNHANDLED REJECTION Cannot read property 'contentDigest' of undefined
… now is the time when angry words are muttered.
OK, what now? Well, it turns out that we have mixed up some terminology. There are really two types of markdown nodes.
- First, there is a markdown AST node – which is a node of the markdown AST generated from the markdown file
  - This is what we just passed to generateHTML()
- But, then there is a Gatsby markdown node, which is a special node specific to the Gatsby ecosystem, and has more to do with GraphQL and Gatsby internals than ASTs
  - That is what the function is expecting
  - You can see more about that data structure here
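The difference is easier to see side by side. The sketch below uses trimmed-down, hand-written objects (the `id` and digest values are placeholders), and `looksLikeGatsbyMarkdownNode` is a hypothetical guard of my own, not something Gatsby provides.

```javascript
// An MDAST node, as handed to a visitor by unist-util-visit.
const mdastNode = {
  type: "heading",
  depth: 1,
  children: [{ type: "text", value: "Hello World" }],
};

// A Gatsby markdown node, trimmed down to the fields discussed in this post.
const gatsbyNode = {
  id: "placeholder-id",
  children: [],
  parent: null,
  internal: {
    type: "MarkdownRemark",
    content: "# Hello World",
    contentDigest: "placeholder-digest",
  },
};

// Hypothetical guard: true only for objects shaped like a Gatsby markdown node.
const looksLikeGatsbyMarkdownNode = (node) =>
  Boolean(node && node.internal && typeof node.internal.content === "string");

console.log(looksLikeGatsbyMarkdownNode(mdastNode));
console.log(looksLikeGatsbyMarkdownNode(gatsbyNode));

// Reading `internal.contentDigest` on the MDAST node fails in the same way as
// the rejection above, because `internal` is undefined there.
let message = "";
try {
  mdastNode.internal.contentDigest;
} catch (err) {
  message = err.message;
}
console.log(message);
```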
Further digging: Can we create a node to pass to generateHTML()?
Aha! Yes we can!
You can construct Gatsby nodes manually – either by building the raw object by hand, or using actions.createNode – however, you need to have the string value of the node (raw markdown string) stored as node.internal.content, because that is what the method uses as input to remark.parse().
So, how do we get the raw markdown string back out of the MDAST node? We can use unified and their module remark-stringify:
– unified().use(remarkStringify).stringify(node)
Let's say we have an MDAST node that we want to convert to a Gatsby MD node – here is how we could do so:
const unified = require('unified');
const remarkStringify = require('remark-stringify');

const plugin = (args, options) => {
  const { createContentDigest, createNodeId } = args
  // Assume we have MDAST `node` from unist-util-visit, or something similar
  const rawMarkdown = unified().use(remarkStringify).stringify(node);
  const gatsbyMdNode = {
    id: createNodeId(rawMarkdown),
    children: [],
    parent: null,
    fields: {},
    internal: {
      content: rawMarkdown,
      contentDigest: createContentDigest(rawMarkdown),
      owner: "gatsby-transformer-remark",
      type: "MarkdownRemark",
    },
  }
}
For the above, you could also use the actions.createNode utility method (and probably should…)
Now, we can pass that Gatsby MD node to the generateHTML function without the contentDigest error.
New Problem
The HTML generator introduces a new problem to be aware of… it is an async function! How are we supposed to use this inside a Gatsby plugin for gatsby-transformer-remark?
Luckily, we have a great blog post by Huy Nguyen that discusses this exact issue.
Based on that post, we know that all we really have to do is have our plugin return a promise to Gatsby. There are a bunch of ways you can do this; wrap the entire body in a promise and return it, write it as an async function, return Promise.all(), etc.
I like the async / await syntax, so I just made my entire plugin function an async function.
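The general pattern can be sketched without any dependencies: keep the tree walk synchronous, collect a promise for each node that needs async work, and await them all before returning. In the sketch below, `fakeToHtml` is a stand-in for the real async HTML generator and `walk` is a hand-rolled substitute for unist-util-visit; both are assumptions for illustration only.

```javascript
// Stand-in for an async markdown-to-HTML call (assumption: the real one
// returns a promise that resolves to an HTML string).
const fakeToHtml = (node) =>
  new Promise((resolve) => {
    setTimeout(() => resolve(`<p>${node.value}</p>`), 10);
  });

// Minimal synchronous walker, standing in for unist-util-visit.
const walk = (node, visitor) => {
  visitor(node);
  (node.children || []).forEach((child) => walk(child, visitor));
};

// Hypothetical async plugin: collect promises while walking, await them at the end.
const asyncPlugin = async ({ markdownAST }) => {
  const pending = [];
  walk(markdownAST, (node) => {
    if (node.type === "text") {
      // Do not await here; the visitor itself stays synchronous.
      pending.push(
        fakeToHtml(node).then((html) => {
          Object.assign(node, { type: "html", value: html });
        })
      );
    }
  });
  await Promise.all(pending);
  return markdownAST;
};

const tree = {
  type: "root",
  children: [
    { type: "text", value: "one" },
    { type: "text", value: "two" },
  ],
};

const done = asyncPlugin({ markdownAST: tree }).then((ast) => {
  console.log(ast.children.map((child) => child.value));
  return ast;
});
```

Because the function is declared `async`, it returns a promise automatically, which is what lets the caller wait for the queued work to finish.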
If you are looking for some resources on writing an async plugin, here are some helpful links I've found:
- https://www.huy.dev/2018-05-remark-gatsby-plugin-part-3/
- https://spectrum.chat/unified/remark/async-plugins~e3916709-9c3e-494c-9d8e-af3efbf08e80
- https://spectrum.chat/unified/syntax-tree/is-there-any-way-to-execute-async-work-when-visiting-a-node-in-unist-util-visit~28177826-628e-44e3-ac3e-0ffb53c195c6
- https://swizec.com/blog/buildremark-plugin-supercharge-static-site/swizec/8860
Other Issues:
- Replacing an AST node in-place
  - According to everything I could find, assigning my new node to the reference to the old one should replace it; however, I could not seem to get this to work
  - Replacing it in the parent.children array also seemed troublesome
  - The best working method I found was to leave the reference in place, but override the props with Object.assign(existingNode, replacementNode)
- Irregular Gatsby types
  - Be warned: it can be hard to find up-to-date type definitions for all the internals of Gatsby
  - It helps to look through Gatsby's source code, and existing TypeScript plugins
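On the first point (replacing a node in place), plain JavaScript semantics explain at least part of the behaviour. This sketch uses hand-built objects rather than a real MDAST; it shows that reassigning a function parameter only rebinds the local variable, while Object.assign mutates the object the tree actually points to.

```javascript
const makeTree = () => ({
  type: "root",
  children: [{ type: "heading", depth: 1, children: [{ type: "text", value: "Hi" }] }],
});

const replacement = {
  type: "html",
  value: "<div><h1>Hi</h1></div>",
  children: undefined,
};

// Attempt 1: reassigning the parameter only rebinds the local variable.
const reassign = (node) => {
  node = replacement;
  return node;
};
const treeA = makeTree();
reassign(treeA.children[0]);
console.log(treeA.children[0].type); // still the original heading

// Attempt 2: copy the replacement's properties onto the existing object.
const treeB = makeTree();
Object.assign(treeB.children[0], replacement);
console.log(treeB.children[0].type, treeB.children[0].children);

// Properties missing from the replacement are left behind on the node.
console.log(treeB.children[0].depth);
```

Object.assign only overwrites the properties present on the replacement, so leftovers such as `depth` survive. Explicitly including `children: undefined` in the replacement is one way to make sure the old child nodes do not linger.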
Solution – Final Working Version
After all that fuss and troubleshooting, the final code is actually pretty simple and short:
const visit = require("unist-util-visit");
const unified = require("unified");
const mdStringify = require("remark-stringify");

const plugin = async (args, options) => {
  const { createContentDigest, markdownAST, createNodeId } = args;
  const markToHtml = args.compiler.generateHTML;
  const transformPromises = [];

  const wrapMdAstNode = async (node, index, parent, className) => {
    let generatedNode = {};
    const wrap = htmlContent => {
      return {
        type: "html",
        value: `<div class="${className}">${htmlContent}</div>`,
        children: undefined
      };
    };
    // Process
    if (node.type === "html") {
      generatedNode = wrap(node.value);
    } else {
      const rawMarkdown = unified()
        .use(mdStringify)
        .stringify(node);
      const dummyNode = {
        id: createNodeId(rawMarkdown),
        children: [],
        parent: null,
        fields: {},
        internal: {
          content: rawMarkdown,
          contentDigest: createContentDigest(rawMarkdown),
          owner: "gatsby-transformer-remark",
          type: "MarkdownRemark"
        }
      };
      const html = await markToHtml(dummyNode);
      generatedNode = wrap(html);
    }
    Object.assign(node, generatedNode);
    return node;
  };

  const commentClassPatt = /^[ ]*<!--\s*ADD_CLASS="([^\r\n>"]*)"\s*-->[ ]*$/;
  visit(markdownAST, `html`, (node, index, parent) => {
    const commentClassMatch =
      typeof node.value === "string" && node.value.match(commentClassPatt);
    if (commentClassMatch) {
      const className = commentClassMatch[1];
      const siblings = parent.children || [];
      const nextSibling = siblings[index + 1];
      if (nextSibling) {
        transformPromises.push(
          wrapMdAstNode(nextSibling, index, parent, className)
        );
      }
    }
  });
  await Promise.all(transformPromises);
  return markdownAST;
};

module.exports = plugin;
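To try the plugin, it has to be registered as a sub-plugin of gatsby-transformer-remark. The fragment below is a sketch of what that could look like in gatsby-config.js, assuming the code above is saved as a local plugin; the name `gatsby-remark-custom-class-wrapper` and its location are hypothetical, so substitute whatever you actually call it.

```javascript
// gatsby-config.js
// Assumption: the plugin above lives at
// plugins/gatsby-remark-custom-class-wrapper/index.js (hypothetical name),
// next to a minimal package.json.
const config = {
  plugins: [
    {
      resolve: "gatsby-transformer-remark",
      options: {
        // Sub-plugins listed here are called with the markdown AST.
        plugins: ["gatsby-remark-custom-class-wrapper"],
      },
    },
  ],
};

module.exports = config;
```

With that in place, a comment matching the pattern in the code, such as `<!-- ADD_CLASS="special-heading-style" -->`, placed directly above a block should cause that block to be wrapped.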