Developing with Astro - Notes on Some Random Rough Edges
I recently transitioned my site from WordPress to Astro, and although the experience was positive overall and relatively pain-free, there were a few rough edges that I wanted to document and share, just in case they help anyone else.
Running Scripts in the Astro Environment / Importing Astro Virtual Modules Outside Astro
Astro’s content layer (e.g., content collections) is both a strength and a weakness. Most of the time it feels like wonderful magic - a few lines of code and suddenly you have a full content-to-HTML pipeline integrated, complete with schema validation. However, this magic abstraction comes at a cost (like every magical abstraction) - its opaque nature makes it perplexing to deal with when things go south.
For example, I started off trying to do something that I assumed would be straightforward: creating a one-off script that could iterate over all my entries (posts, pages, etc.) in Astro and run some checks against them.
However, when I tried to run my script (e.g. npx tsx my_astro_script.mjs), I immediately ran into issues:
Error [ERR_UNSUPPORTED_ESM_URL_SCHEME]: Only URLs with a scheme in: file, data, and node are supported by the default ESM loader. Received protocol 'astro:'
What is going on here? Well, the core API / content layer methods are imported from astro:content, which, as the error points out, is not an import path that Node’s default ESM loader understands. At build (or dev/run) time this is not an issue, as Astro ships a Vite plugin (i.e., a “virtual module” resolver) that tells Vite how to handle these imports.
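To make the “virtual module” idea concrete, here is a minimal sketch of the resolveId / load hook pair that a Vite plugin uses to serve a module that does not exist on disk. This is not Astro’s actual plugin: the module ID virtual:site-data, the plugin name, and the generated source are all made up for illustration, and the hook types are declared locally so the snippet runs without Vite installed.

```typescript
// Minimal shape of the two plugin hooks used here, declared locally
// so the sketch runs without installing Vite.
interface VirtualModulePlugin {
    name: string;
    resolveId(id: string): string | undefined;
    load(id: string): string | undefined;
}

// Hypothetical module ID, standing in for something like `astro:content`.
const VIRTUAL_ID = 'virtual:site-data';
// Vite convention: prefix the resolved ID with "\0" so other plugins leave it alone.
const RESOLVED_ID = '\0' + VIRTUAL_ID;

function siteDataPlugin(): VirtualModulePlugin {
    return {
        name: 'site-data',
        // Claim the import specifier so the bundler never looks for it on disk.
        resolveId(id) {
            return id === VIRTUAL_ID ? RESOLVED_ID : undefined;
        },
        // Generate the module's source code on the fly.
        load(id) {
            return id === RESOLVED_ID
                ? 'export const posts = ["hello-world"];'
                : undefined;
        },
    };
}

const plugin = siteDataPlugin();
const resolved = plugin.resolveId(VIRTUAL_ID);
console.log(JSON.stringify(resolved)); // the "\0"-prefixed ID
console.log(plugin.load(resolved ?? '')); // the generated module source
```

Plain Node never runs these hooks, which is why the same import fails when the script is executed outside of Astro’s build or dev pipeline.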
Before I fully understood the problem space, I first thought I could work around this by writing a quick custom loader for Node.js that I could use with my script. Custom module resolution in Node is its own cool can of worms - done correctly, it can allow you to import almost anything. But it’s not the right tool for this job - not exactly.
If you want to learn more about Node’s support for custom module resolution, aka loaders, aka module customization hooks - this is a good intro.
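For a sense of what that first loader idea involves, here is a rough sketch of a resolve / load hook pair in the shape Node’s module customization hooks expect. It is an illustration under assumptions, not a working replacement for Astro’s module: the stub source is invented, the result types are declared locally, and in a real hooks file both functions would be exported and registered through register() from node:module.

```typescript
// Local types for the hook results, so the sketch runs without @types/node.
interface ResolveResult { url: string; shortCircuit?: boolean }
interface LoadResult { format: string; source?: string; shortCircuit?: boolean }
type NextResolve = (specifier: string, context: object) => ResolveResult | Promise<ResolveResult>;
type NextLoad = (url: string, context: object) => LoadResult | Promise<LoadResult>;

// In a real hooks file (say, a hypothetical hooks.mjs), both functions would be
// exported, and a small entry file would call
// register('./hooks.mjs', import.meta.url) from 'node:module'.

function resolve(specifier: string, context: object, nextResolve: NextResolve) {
    if (specifier.startsWith('astro:')) {
        // Accept the `astro:` scheme here instead of letting the default resolver reject it.
        return { url: specifier, shortCircuit: true };
    }
    // Everything else goes through the normal resolution chain.
    return nextResolve(specifier, context);
}

function load(url: string, context: object, nextLoad: NextLoad) {
    if (url === 'astro:content') {
        // Placeholder source (an assumption for illustration): the real module
        // is generated by Astro's Vite plugins, so a stub cannot see any content.
        return {
            format: 'module',
            shortCircuit: true,
            source: 'export async function getCollection() { return []; }',
        };
    }
    return nextLoad(url, context);
}

console.log(resolve('astro:content', {}, (s) => ({ url: s })));
console.log(load('astro:content', {}, () => ({ format: 'module' })));
```

The stub is where the approach falls short: a loader can make the import succeed, but the behavior behind astro:content still lives inside Astro’s Vite plugins.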
The better solution is to re-use the existing Vite plugin architecture that Astro ships, but that is easier said than done - in practice, I found that Astro keeps most of that code internal, without a clean interface for getting access to the Vite “container” with its module system loaded.
In the end, I was able to get a system that worked with minimal wrapper code, but it wasn’t the easiest to figure out:
GitHub Gist: https://gist.github.com/joshuatz/60c0ff82154a318b49e03d60ec08ac38
I now have a function, runInAstroEnvironment, which I can use in any script I like to give it access to the content layer data. For example, I have a script like this:
commands/audit.mjs
import { writeFileSync } from 'fs';
import { runInAstroEnvironment } from '../astro-script-runner.mjs';

runInAstroEnvironment(async (env) => {
    // Import through the Astro-aware runner so that `astro:content` resolves
    const { auditCollections } = /** @type {import('../src/server')} */ (await env.runner.import('./src/server'));
    const results = await auditCollections();
    console.log(results);
    writeFileSync('./audit_results.json', JSON.stringify(results, null, 4));
    if (results.hasErrors) {
        console.warn(Object.values(results.errorsPerCollection).map((epc) => epc.errors));
        // Non-zero exit code so callers (e.g. CI) can detect the failure
        process.exit(1);
    }
});
And to run it, all I need to do is run npx tsx commands/audit.mjs.
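The auditCollections function itself is not shown here, so the following is only a guess at the general shape, based on the hasErrors and errorsPerCollection fields the script reads. The entry type, the auditEntries name, and the missing-title check are all hypothetical, and the entries are passed in as plain objects instead of being loaded with getCollection from astro:content, so the sketch can run on its own.

```typescript
// Hypothetical entry shape; real entries would come from astro:content.
interface Entry {
    id: string;
    data: { title?: string };
}

interface CollectionErrors {
    errors: string[];
}

// Mirrors the fields the audit script reads: hasErrors and errorsPerCollection.
interface AuditResults {
    hasErrors: boolean;
    errorsPerCollection: Record<string, CollectionErrors>;
}

function auditEntries(collections: Record<string, Entry[]>): AuditResults {
    const errorsPerCollection: Record<string, CollectionErrors> = {};
    for (const [name, entries] of Object.entries(collections)) {
        const errors: string[] = [];
        for (const entry of entries) {
            // Hypothetical check: every entry needs a non-empty title.
            if (!entry.data.title?.trim()) {
                errors.push(`${name}/${entry.id}: missing title`);
            }
        }
        errorsPerCollection[name] = { errors };
    }
    const hasErrors = Object.values(errorsPerCollection).some((c) => c.errors.length > 0);
    return { hasErrors, errorsPerCollection };
}

const sample = auditEntries({
    posts: [
        { id: 'hello-world', data: { title: 'Hello' } },
        { id: 'draft', data: {} },
    ],
    pages: [{ id: 'about', data: { title: 'About' } }],
});
console.log(JSON.stringify(sample, null, 4));
```

Keeping the checks in a function that takes plain data also makes them easy to test without booting the Astro environment at all.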
Images
Another issue I ran into was with the default handling of Markdown images -> HTML.
I found that Astro was breaking image tags if the original Markdown encoded the image with the typical Markdown syntax of ![](../my_image.jpg). I was seeing resulting HTML that looked something like:
<img __astro_image_="{"src":"../my_image.jpg","alt":"","index":0}">
Hmm. That doesn’t look right!
The best fix here would be to find the culprit in the Markdown converter library, patch it, and submit a PR to the upstream repo. However, as a much quicker fix, I slapped together a workaround using a simple regular expression find-and-replace:
// Assumption: decodeHTML is the helper exported by the `entities` package
import { decodeHTML } from 'entities';

entry.rendered.html = entry.rendered.html.replace(
    /<img __astro_image_="(?<astro_image_json>{[^>}]+})">/gi,
    (_match, _p1, _offset, _wholeString, groups) => {
        let astroImageJSON: {
            src: string;
            alt: string;
            index: number;
        };
        let astroImageJSONRaw = groups.astro_image_json as string;
        // The JSON may be HTML-entity-encoded; decode it so it reads like:
        // `{"src":"/media/my-image.jpg","alt":"alt text","index":0}`
        astroImageJSONRaw = decodeHTML(astroImageJSONRaw);
        try {
            astroImageJSON = JSON.parse(astroImageJSONRaw);
        } catch (err) {
            throw new Error(`Could not parse raw astro image JSON\n\n${astroImageJSONRaw}\n\nErr: ${err}`);
        }
        // Re-serialize as HTML attributes instead of JSON, escaping quotes in values
        const stringifiedImageProps = Object.entries(astroImageJSON)
            .map(([key, val]) => `${key}="${String(val).replace(/"/g, '&quot;')}"`)
            .join(' ');
        return `<img ${stringifiedImageProps} />`;
    },
);
RegEx is one of those things that I think is highly underrated as a versatile tool to be familiar with. Is this the most robust solution? No. Does it get me 99% of the way there, with 1% of the effort? Yes.
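For anyone who wants to try the pattern outside of Astro, here is a small standalone variant that can be run against a sample string. It uses the same pattern (with an unnamed capture group), but the fixAstroImages name, the tiny decodeQuotes stand-in for a real entity decoder, and the escapeAttr helper are assumptions added for this sketch, and the sample input assumes the quotes arrive as &#x22; entities.

```typescript
// Tiny stand-in for a real entity decoder; it only handles the quote and
// ampersand forms needed for this sample.
function decodeQuotes(input: string): string {
    return input.replace(/&(quot|#x22|#34);/gi, '"').replace(/&amp;/gi, '&');
}

// Escape a value so it is safe inside a double-quoted HTML attribute.
function escapeAttr(value: unknown): string {
    return String(value).replace(/&/g, '&amp;').replace(/"/g, '&quot;');
}

function fixAstroImages(html: string): string {
    return html.replace(
        /<img __astro_image_="({[^>}]+})">/gi,
        (_match: string, json: string) => {
            // Decode the entities first, then parse the embedded JSON.
            const props = JSON.parse(decodeQuotes(json)) as Record<string, unknown>;
            const attrs = Object.entries(props)
                .map(([key, val]) => `${key}="${escapeAttr(val)}"`)
                .join(' ');
            return `<img ${attrs} />`;
        },
    );
}

const broken =
    '<p><img __astro_image_="{&#x22;src&#x22;:&#x22;../my_image.jpg&#x22;,&#x22;alt&#x22;:&#x22;&#x22;,&#x22;index&#x22;:0}"></p>';
console.log(fixAstroImages(broken));
// <p><img src="../my_image.jpg" alt="" index="0" /></p>
```

Note that the index field is carried over as an attribute too, just as in the snippet above; dropping it would be a one-line filter on the entries.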
Nothing Is Perfect
Astro may have its rough edges, but I’m still finding it an overall very pleasant experience to work with!