Gatsby – Wrapping Markdown in HTML Divs with Custom Classes

Disclaimer: This post is over a year old (first published about 4 years ago). As such, please keep in mind that some of the information may no longer be accurate, best practice, or a reflection of how I would approach the same thing today.

Date Posted: Feb. 18, 2020
Last Updated: Feb. 18, 2020

Intro

For reasons I won't go into, I wanted to build a Gatsby plugin which wraps elements / blocks of my choice in a div with a custom class name applied. I'm using Markdown as the source material for my Gatsby project, so here are the basics of how I wanted it to work – this…

<!-- ADD_CLASS="special-heading-style" -->
# Hello World

…would get transformed to this…

<div class="special-heading-style">
    <h1>Hello World</h1>
</div>

I also wanted it to work with Markdown components that generate multiple HTML elements. For example, this…

<!-- ADD_CLASS="special-list-style" -->
- My List
    - Sub Item

…would become

<div class="special-list-style">
    <ul>
        <li>
            <p>My List</p>
            <ul>
                <li><p>Sub Item</p></li>
            </ul>
        </li>
    </ul>
</div>

I thought this would be simple. After all, this is actually a tiny modification, and Gatsby is designed around extensibility and plugins. But things weren't as simple as they seemed…

Background on Markdown Conversion

For the uninitiated, the majority of Gatsby projects that take Markdown as the source content, such as the go-to starter blog, under the hood use a plugin called gatsby-transformer-remark to convert the markdown to HTML. This, in turn, can accept hundreds of other plugins that extend it and modify how it handles the markdown conversion.

When you are writing a plugin, you basically are writing a callback function that gatsby-transformer-remark will call with, among other arguments, a Markdown AST (more on this later), which is a representation of what was extracted out of the markdown source. Your plugin can modify that content, by adding, deleting, or changing nodes in the AST.
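To make the callback shape concrete, here is a minimal sketch of a plugin that edits the AST in place. The `walk` helper and the uppercase transform are hypothetical stand-ins (a real plugin would typically use `unist-util-visit`), and the hand-built `sampleAst` only imitates what gatsby-transformer-remark passes in as `markdownAST`.

```javascript
// Hypothetical depth-first walker, standing in for unist-util-visit.
const walk = (node, callback) => {
  callback(node);
  (node.children || []).forEach(child => walk(child, callback));
};

// A plugin is a function that receives the AST (among other arguments),
// mutates it, and returns it.
const shoutPlugin = ({ markdownAST }, pluginOptions = {}) => {
  walk(markdownAST, node => {
    if (node.type === "text") {
      node.value = node.value.toUpperCase();
    }
  });
  return markdownAST;
};

// Hand-built AST roughly matching the markdown source `# Hello World`.
const sampleAst = {
  type: "root",
  children: [
    {
      type: "heading",
      depth: 1,
      children: [{ type: "text", value: "Hello World" }]
    }
  ]
};

const result = shoutPlugin({ markdownAST: sampleAst });
console.log(result.children[0].children[0].value); // HELLO WORLD
```

The heading node itself is untouched; only the nested text node's `value` changes, which is the kind of in-place edit the transformer picks up.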

The Problem

Going back to what we want to accomplish, it seems like writing our plugin should be straightforward. Receive the AST, look for our special HTML comment, and if found, wrap the first adjacent sibling in a new HTML node. Otherwise, do nothing. Here is a rough attempt:

const visit = require('unist-util-visit');

const plugin = (args, options) => {
	const markdownAST = args.markdownAST;
	const commentClassPatt = /^[ ]*<!--\s*ADD_CLASS="([^\r\n>"]*)"\s*-->[ ]*$/;
	
	visit(markdownAST, `html`, (node, index, parent) => {
		const commentClassMatch = node.value.match(commentClassPatt);
		if (commentClassMatch) {
			const className = commentClassMatch[1];
			const siblings = parent.children || [];
			const nextSibling = siblings[index + 1];
			
			if (nextSibling) {
				visit(nextSibling, (node, index, parent) => {
					const wrapperNode = {
						type: 'html',
						value: `<div class="${className}"></div>`
					};
					wrapperNode.children = [
						{
							...node
						}
					];
					node = wrapperNode;
				});
			}
		}
	});
	return markdownAST;
}

module.exports = plugin

So, once again, our steps look like:

  1. Take the Markdown AST spit out by the main transformer, and look for our special magic HTML comment
  2. If found, check if it has an immediate adjacent sibling
  3. If sibling is found, wrap with a <div>, with the desired class name
  4. Replace the sibling with the wrapper, with itself as child
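Step 1 depends entirely on the comment pattern, so it is worth checking it in isolation. The sketch below reuses the regular expression from the attempt above; the `extractClassName` helper and the sample strings are made up for illustration.

```javascript
const commentClassPatt = /^[ ]*<!--\s*ADD_CLASS="([^\r\n>"]*)"\s*-->[ ]*$/;

// Hypothetical helper: returns the class name, or null when the string
// is not one of our trigger comments.
const extractClassName = htmlValue => {
  const match =
    typeof htmlValue === "string" && htmlValue.match(commentClassPatt);
  return match ? match[1] : null;
};

const fromTrigger = extractClassName('<!-- ADD_CLASS="special-heading-style" -->');
const fromCompact = extractClassName('<!--ADD_CLASS="a b"-->');
const fromPlainComment = extractClassName("<!-- just a normal comment -->");
const fromOtherHtml = extractClassName("<div>not a comment</div>");

console.log(fromTrigger); // special-heading-style
console.log(fromCompact); // a b
console.log(fromPlainComment); // null
console.log(fromOtherHtml); // null
```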

If I add breakpoints and logging output, I can see that my code is finding the right triggers and modifying nodes in place. However, this does not work! Why? Well, it comes down to a misunderstanding of how Gatsby, the Markdown AST, and HTML all fit together.

Finding the Solution

The missing piece of understanding here is twofold.

  1. In a Markdown AST, you can have nested markdown (AST) nodes, but you can't really have an HTML node that contains nested markdown

    • You can see this yourself in the handy AST Explorer tool – sample
  2. Using a Markdown node within an HTML node requires conversion first

So, re-evaluating things… if the AST node we want to wrap in a custom div…

  • …is markdown

    • We first need to convert it to HTML, then wrap
  • …is HTML

    • We could wrap it in place

BUT, either way, we must replace the node we want to wrap with a single HTML node, with the value as a string.
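As a rough illustration of that rule, the sketch below wraps a node that is already HTML. The `wrapHtmlNode` helper and the sample node are hypothetical; the point is only that the result is one `html` node whose `value` is a plain string, with no nested children.

```javascript
// Hypothetical helper: build a single html node whose value is one string.
const wrapHtmlNode = (htmlNode, className) => ({
  type: "html",
  value: `<div class="${className}">${htmlNode.value}</div>`
});

const existingHtmlNode = { type: "html", value: "<p>Already HTML</p>" };
const wrapped = wrapHtmlNode(existingHtmlNode, "special-style");

console.log(wrapped.value);
// <div class="special-style"><p>Already HTML</p></div>
```

A markdown node (a heading or a list, say) has no such string yet, which is why it has to be rendered to HTML before this step.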

Markdown to HTML?

So far, so good. But how do we actually convert a snippet of markdown to HTML in Gatsby? This is where there is a bit of a gap in documentation, as most plugins are generating HTML from internal logic, rather than from markdown.

Option A) Use remark / unified library from scratch

If you peek inside the gatsby-transformer-remark plugin, you will see that Gatsby actually uses remark for markdown to HTML transformation, and that in turn is heavily powered by the unified suite of tools – there is a reason Gatsby is a gold sponsor of unified 🙂

There is nothing stopping you, as a plugin developer, from using these libraries directly. In fact, I would probably recommend it for situations like these; that way if some of the Gatsby internal APIs change, you are not forced to update.

Option B) Use methods from gatsby-transformer-remark

Rather than rolling our own code for converting a bit of markdown to HTML, shouldn't we be able to just use transformer-remark itself? After all, that is the main point of the plugin; converting markdown to HTML.

At first glance, it seems like we are in luck; there is a method exposed in the plugin arguments – args.compiler.generateHTML, and if we look at the source for it, it says it takes a markdown node and returns HTML. PERFECT! Let's try passing in a markdown node from visit() that we want to convert to HTML and then wrap…

UNHANDLED REJECTION Cannot read property 'contentDigest' of undefined

… now is the time when angry words are muttered.

OK, what now? Well, it turns out that we have mixed up some terminology. There are really two types of markdown nodes.

  • First, there is a markdown AST node – which is a node of the markdown AST generated from the markdown file

    • This is what we just passed to generateHTML()
  • But, then there is a Gatsby markdown node, which is a special node specific to the Gatsby ecosystem, and has more to do with GraphQL and Gatsby internals than ASTs

    • That is what the function is expecting
    • You can see more about that data structure here
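The difference is easier to see side by side. The objects below are simplified, hand-written examples (real nodes carry more fields), and `readDigest` is a hypothetical stand-in for any code that, like the HTML generator, expects a Gatsby node with an `internal` block.

```javascript
// A markdown AST (MDAST) node: describes document structure only.
const mdastNode = {
  type: "heading",
  depth: 1,
  children: [{ type: "text", value: "Hello World" }]
};

// A Gatsby markdown node: carries the raw source plus Gatsby bookkeeping.
// The id and digest values here are placeholders, not real output.
const gatsbyMarkdownNode = {
  id: "placeholder-id",
  children: [],
  parent: null,
  internal: {
    type: "MarkdownRemark",
    content: "# Hello World",
    contentDigest: "placeholder-digest"
  }
};

// Hypothetical stand-in for code that expects a Gatsby node.
const readDigest = node => node.internal.contentDigest;

console.log(readDigest(gatsbyMarkdownNode)); // placeholder-digest

let errorName = null;
try {
  readDigest(mdastNode); // mdastNode.internal is undefined
} catch (err) {
  errorName = err.name;
}
console.log(errorName); // TypeError
```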

Further digging: Can we create a node to pass to generateHTML()?

Aha! Yes we can!

You can construct Gatsby nodes manually, either by building the raw object by hand or by using actions.createNode. However, you need to have the string value of the node (the raw markdown string) stored as node.internal.content, because that is what the method uses as input to remark.parse().

So, how do we get the raw markdown string back out of the MDAST node? We can use unified and their module remark-stringify:

unified().use(remarkStringify).stringify(node)

Let's say we have an MDAST node that we want to convert to a Gatsby MD node – here is how we could do so:

const unified = require('unified');
const remarkStringify = require('remark-stringify');

const plugin = (args, options) => {
	const { createContentDigest, createNodeId } = args
	// Assume we have MDAST `node` from unist-util-visit, or something similar
	const rawMarkdown = unified().use(remarkStringify).stringify(node);
	const gatsbyMdNode = {
		id: createNodeId(rawMarkdown),
		children: [],
		parent: null,
		fields: {},
		internal: {
			content: rawMarkdown,
			contentDigest: createContentDigest(rawMarkdown),
			owner: "gatsby-transformer-remark",
			type: "MarkdownRemark",
		},
	}
}

For the above, you could also use the actions.createNode utility method (and probably should…)

Now, we can pass that Gatsby MD node to the generateHTML function without the contentDigest error.

New Problem

The HTML generator introduces a new problem to be aware of… it is an async function! How are we supposed to use this inside a Gatsby plugin for transformer-remark?

Luckily, we have a great blog post by Huy Nguyen that discusses this exact issue.

Based on that post, we know that all we really have to do is have our plugin return a promise to Gatsby. There are a bunch of ways you can do this; wrap the entire body in a promise and return it, write it as an async function, return Promise.all(), etc.

I like the async / await syntax, so I just made my entire plugin function an async function.
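A stripped-down sketch of that pattern follows. Here `fakeGenerateHtml` is a made-up async function standing in for the real HTML generator, and the AST is hand-built; the part that matters is that the plugin queues promises while walking and awaits them all before returning.

```javascript
// Made-up async converter, standing in for the real HTML generator.
const fakeGenerateHtml = async text => `<p>${text}</p>`;

const asyncPlugin = async ({ markdownAST }) => {
  const pending = [];

  // The walk itself is synchronous, so each async job is queued, not awaited.
  markdownAST.children.forEach(node => {
    if (node.type === "paragraph") {
      const text = node.children[0].value;
      pending.push(
        fakeGenerateHtml(text).then(html => {
          // Mutate the existing node so the parent keeps its reference.
          Object.assign(node, { type: "html", value: html, children: undefined });
        })
      );
    }
  });

  // Resolve only once every queued transformation has finished.
  await Promise.all(pending);
  return markdownAST;
};

const demoAst = {
  type: "root",
  children: [
    { type: "paragraph", children: [{ type: "text", value: "Hello" }] }
  ]
};

const pluginResult = asyncPlugin({ markdownAST: demoAst });
console.log(pluginResult instanceof Promise); // true
pluginResult.then(ast => console.log(ast.children[0].value)); // <p>Hello</p>
```

Because an async function always returns a promise, the caller gets something it can wait on without any extra wrapping.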

If you are looking for some resources on writing an async plugin, here are some helpful links I've found:

Other Issues:

  • Replacing an AST node in-place

    • According to everything I could find, assigning my new node to the reference to the old one should replace it; however, I could not seem to get this to work

      • Replacing it in the parent.children array also seemed troublesome
      • The best working method I found was to leave the reference in place, but override the props with Object.assign(existingNode, replacementNode)
  • Irregular Gatsby types

    • Be warned: it can be hard to find up-to-date type definitions for all the internals of Gatsby
    • It helps to look through Gatsby's source code, and existing TypeScript plugins
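The first bullet comes down to how JavaScript handles object references. In this small sketch (all names are illustrative), reassigning the parameter only rebinds a local variable, while `Object.assign` mutates the object that the parent array still points to.

```javascript
const makeTree = () => ({
  type: "root",
  children: [{ type: "heading", depth: 1 }]
});

const replacement = { type: "html", value: "<div>wrapped</div>" };

// Attempt 1: reassign the local parameter. The tree never sees this.
const treeA = makeTree();
const reassign = node => {
  node = replacement;
  return node;
};
reassign(treeA.children[0]);
console.log(treeA.children[0].type); // heading

// Attempt 2: copy the replacement's props onto the existing object.
const treeB = makeTree();
Object.assign(treeB.children[0], replacement);
console.log(treeB.children[0].type); // html
console.log(treeB.children[0].depth); // 1 (old props are not removed)
```

Note that `Object.assign` leaves old props such as `depth` behind, so anything that must disappear (like `children`) has to be overwritten explicitly.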

 

Solution – Final Working Version

After all that fuss and troubleshooting, the final code is actually pretty simple and short:

 

const visit = require("unist-util-visit");
const unified = require("unified");
const mdStringify = require("remark-stringify");

const plugin = async (args, options) => {
  const { createContentDigest, markdownAST, createNodeId } = args;
  const markToHtml = args.compiler.generateHTML;
  const transformPromises = [];

  const wrapMdAstNode = async (node, index, parent, className) => {
    let generatedNode = {};
    const wrap = htmlContent => {
      return {
        type: "html",
        value: `<div class="${className}">${htmlContent}</div>`,
        children: undefined
      };
    };
    // Process
    if (node.type === "html") {
      generatedNode = wrap(node.value);
    } else {
      const rawMarkdown = unified()
        .use(mdStringify)
        .stringify(node);
      const dummyNode = {
        id: createNodeId(rawMarkdown),
        children: [],
        parent: null,
        fields: {},
        internal: {
          content: rawMarkdown,
          contentDigest: createContentDigest(rawMarkdown),
          owner: "gatsby-transformer-remark",
          type: "MarkdownRemark"
        }
      };
      const html = await markToHtml(dummyNode);
      generatedNode = wrap(html);
    }
    Object.assign(node, generatedNode);
    return node;
  };

  const commentClassPatt = /^[ ]*<!--\s*ADD_CLASS="([^\r\n>"]*)"\s*-->[ ]*$/;

  visit(markdownAST, `html`, (node, index, parent) => {
    const commentClassMatch =
      typeof node.value === "string" && node.value.match(commentClassPatt);

    if (commentClassMatch) {
      const className = commentClassMatch[1];
      const siblings = parent.children || [];
      const nextSibling = siblings[index + 1];

      if (nextSibling) {
        transformPromises.push(
          wrapMdAstNode(nextSibling, index, parent, className)
        );
      }
    }
  });

  await Promise.all(transformPromises);
  return markdownAST;
};

module.exports = plugin;
