Using Binary Data with Front-End JavaScript and the Web

Date Posted: Nov. 20, 2020
Last Updated: Nov. 20, 2020

I’m making this post because, despite binary files being a huge part of what makes up both your local and online filesystem, I kept having difficulties finding explanations on how (and why) to deal with binary files on the web and in JavaScript. In addition, a lot has changed in the world of JavaScript over the past years, so a lot of information out there is rather dated.

Intro

First, some things that should be clarified before getting into the details:

  • Binary file / data: I’m using this to refer to pretty much any file or data that is not text-based. Examples: raw contents of image and video files, blobs, etc.
  • This post is mostly about front-end manipulation of binary data, not server-side (although some parts also apply to NodeJS)
  • Many examples used throughout the post require a somewhat modern web browser; e.g., they use async functions

Converting Binary Data to Strings

JavaScript: Binary Data to Base64 (including Data URLs)

Note: If you are trying to convert from binary to Base64 in order to do something like display an image, you might not need to do this in the first place; browsers now support using raw blobs for media loading via Object URLs. I have instructions for this in the “Loading Data via Object URLs” section of this post.

In general, loading raw binary data via Object URL is recommended instead of using Base64 URLs.

/**
 * Convert a raw binary blob into a Base64 String
 * @param {Blob} blob Raw binary blob
 * @param {boolean} [asDataUrl] Should full DataURI be returned
 * @returns {Promise<string>}
 */
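const blobToBase64 = (blob, asDataUrl = false) => {
    return new Promise((res, rej) => {
        const reader = new FileReader();
        reader.onload = () => {
            // readAsDataURL() always produces a string result
            const dataUrl = /** @type {string} */ (reader.result);
            if (asDataUrl) {
                res(dataUrl);
            } else {
                // Remove the `data:{mimeType};base64,` prefix: everything after
                // the first comma is the Base64 payload. Slicing also handles
                // MIME types such as image/svg+xml, which a \w+ pattern misses.
                res(dataUrl.slice(dataUrl.indexOf(',') + 1));
            }
        };
        // Surface read failures instead of leaving the promise pending
        reader.onerror = () => rej(reader.error);
        reader.readAsDataURL(blob);
    });
};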

async function runExample() {
    const binaryBlob = await (await fetch('https://www.wikipedia.org/portal/wikipedia.org/assets/img/Wikipedia-logo-v2.png')).blob();
    const base64Url = await blobToBase64(binaryBlob, true);
    document.querySelector('img').src = base64Url;
}
runExample();

JavaScript: Base64 to Binary Data Blob

First, I’ll point out that there are not a whole lot of good reasons to need to convert Base64 data to a binary blob in the browser to begin with:

  • If your data is in Base64, you can already use it for images, videos, etc. – by using Data URLs (aka Data URIs)
  • AFAIK, converting to a binary blob if you already have the data as a stored string is going to basically be double-allocating space for it

However, if you really need to do this, there are some common approaches. The newest, and easiest to use, is the fetch() web API, together with the .blob() method of the Response it resolves to:

/**
 * Convert base64 string to blob
 * @param {string} base64
 * @param {string} [mimeType] Mime type used by data held in base64 string
 * @returns {Promise<Blob>}
 */
const base64ToBlob = (base64, mimeType = 'application/octet-stream') => {
    return fetch(`data:${mimeType};base64,${base64}`)
        .then(res => res.blob());
}

// Example:
async function runExample() {
    const imgStr = `Qk1aAAAAAAAAAEIAAAAoAAAABgAAAAYAAAABAAQAAAAAAAAAAADEDgAAxA4AAAMAAAADAAAAAAAA/wAA//8A2P//IAEgAAASAAABIAEAEgASACABIAAAEgAA`;
    const blob = await base64ToBlob(imgStr, 'image/bmp');
    document.querySelector('img').src = URL.createObjectURL(blob);
}
runExample();

You can also use fetch(resource).then(res => res.blob()) to convert any fetch-able object to a blob.

If you are targeting an older browser, you might not be able to use fetch(), but there are still other solutions you can use. I would recommend looking at this StackOverflow question and reading through the responses to find what works best for you: “Creating a BLOB from a Base64 string in JavaScript”
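As a rough idea of what the fetch-free route looks like, here is a minimal sketch built on atob(), Uint8Array, and the Blob constructor. The function name base64ToBlobLegacy is my own placeholder, and the code sticks to older syntax (var, no default parameters) on the assumption that you are targeting older browsers that still provide those three APIs.

```javascript
/**
 * Convert a Base64 string to a Blob without using fetch()
 * (hypothetical helper name, shown for illustration only)
 * @param {string} base64
 * @param {string} [mimeType]
 * @returns {Blob}
 */
function base64ToBlobLegacy(base64, mimeType) {
    // atob() decodes Base64 into a "binary string": one character per byte
    var binaryStr = atob(base64);
    var bytes = new Uint8Array(binaryStr.length);
    for (var i = 0; i < binaryStr.length; i++) {
        // Each character code is the value (0-255) of the original byte
        bytes[i] = binaryStr.charCodeAt(i);
    }
    return new Blob([bytes], { type: mimeType || 'application/octet-stream' });
}
```

Unlike the fetch() version, this one is synchronous, so very large strings will block the main thread while the loop runs.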

Converting Binary Files to Base64 Strings on your Local Computer / CLI

If you have access to standard *nix utilities, the easiest tool to use is the aptly-named base64 utility.

For example, if I had a JPG I wanted to get as a Base64 encoded string, I could use:

base64 --wrap=0 my-image.jpg > my-image-base64.txt
# You can also pipe to it
cat my-image.jpg | base64 --wrap=0

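You’ll note that I always use --wrap=0; this is because the default behavior is to wrap (add line breaks) at 76 columns. This wastes space, and is completely unnecessary, especially if I’m just going to be pasting into a JS or HTML file. Keep in mind that --wrap is the GNU coreutils flag; other base64 implementations (such as the one that ships with macOS) may use different options.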

On Windows, if you are in a shell that provides these utilities (Git Bash, for example), you can also get the string put right in your clipboard, with:

cat my-image.jpg | base64 --wrap=0 | clip

Converting Binary Files to Base64 – Online Tools

If you don’t have access to a terminal, or just prefer using a web browser, there are dozens of online file-to-base64 converters, such as this one by browserling.

As discussed above, base64 conversion can be done entirely in front-end JavaScript, so I wouldn’t trust any online tool that actually “uploads” your file anywhere.


Using Binary Data as Source For Elements

Loading Data via Object URLs

If you already have your data stored in a binary format, in memory, and need to load it into an element (for example an img element), there is an alternative to Base64 DataURLs that is much preferable: Object URLs.

These are essentially virtual URLs that point to raw data; this could be a blob of binary data stored in memory, or even a reference to a user’s local file from their OS. Because they are references to an existing data location, they use less memory than Base64 strings (and you can even release the pointer after loading).

// Example: Loading an image blob
// (top-level `await` needs a module script; otherwise wrap this in an async function)
const blob = await (await fetch('https://www.wikipedia.org/portal/wikipedia.org/assets/img/Wikipedia-logo-v2.png')).blob();

const imageElem = document.querySelector('img');
// Good practice is to release once loaded
imageElem.onload = () => {
    URL.revokeObjectURL(imageElem.src);
}
imageElem.src = URL.createObjectURL(blob);

Just like Data URLs, loading data through Object URLs works for a variety of media elements, including <video> elements!
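Since it is easy to forget the release step, one option is to wrap the create / use / revoke lifecycle in a small helper. This is only a sketch: withObjectUrl is a name I made up, and it assumes the callback returns (or resolves) once the URL is no longer needed.

```javascript
/**
 * Create an Object URL for a blob, hand it to a callback, and always
 * release it afterwards, even if the callback throws.
 * (`withObjectUrl` is a hypothetical helper, not a built-in API.)
 * @param {Blob} blob
 * @param {(objectUrl: string) => any} useUrl
 */
async function withObjectUrl(blob, useUrl) {
    const objectUrl = URL.createObjectURL(blob);
    try {
        // Wait for the callback, in case it returns a promise
        return await useUrl(objectUrl);
    } finally {
        URL.revokeObjectURL(objectUrl);
    }
}

// Browser usage sketch: img.decode() resolves once the image data is ready
// withObjectUrl(blob, (url) => {
//     const img = document.querySelector('img');
//     img.src = url;
//     return img.decode();
// });
```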

Loading Data via Base64 String Data URLs

Data URLs (formerly called Data URIs) are essentially just strings that contain the data itself, as opposed to a pointer to where the data resides (how normal URLs or Object URLs work).

They follow a standardized syntax:

data:{mimeType};base64,{dataStr}

# Example


# If the data is *not* base64 encoded, you can leave off the `;base64` section
# Example:
data:text/plain,hello
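To make the syntax concrete, here is a minimal sketch that splits a Data URL back into its parts. The name parseDataUrl and the shape of the returned object are my own choices; the fallback media type is the default that the data URL scheme (RFC 2397) defines for when the type is omitted.

```javascript
/**
 * Split a Data URL into its media type, encoding flag, and payload.
 * (`parseDataUrl` is a hypothetical helper for illustration.)
 * @param {string} dataUrl
 * @returns {{ mediaType: string, isBase64: boolean, data: string }}
 */
function parseDataUrl(dataUrl) {
    if (!dataUrl.startsWith('data:')) {
        throw new TypeError('Not a data URL');
    }
    // The first comma separates the header from the payload
    const commaIndex = dataUrl.indexOf(',');
    if (commaIndex === -1) {
        throw new TypeError('Data URL is missing the "," separator');
    }
    const header = dataUrl.slice('data:'.length, commaIndex);
    const data = dataUrl.slice(commaIndex + 1);
    const isBase64 = /;base64$/i.test(header);
    // Strip the trailing ";base64" marker (7 characters) if present
    const mediaType = isBase64 ? header.slice(0, -';base64'.length) : header;
    return {
        mediaType: mediaType || 'text/plain;charset=US-ASCII',
        isBase64,
        data
    };
}
```

Splitting on the first comma, rather than pattern-matching the MIME type, means types that contain characters like + or - (image/svg+xml, application/octet-stream) are handled the same as any other.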

They can be used in (AFAIK) pretty much any element with a src attribute, even including <iframe> and <script> tags!

<script src="data:text/plain,alert('hi!')"></script>
<!-- Base64: alert('hi from base64') -->
<script src="data:text/plain;base64,YWxlcnQoJ2hpIGZyb20gYmFzZTY0JykgDQo="></script>

If you are not using base64 encoding with Data URIs, you should be extra careful about characters that are not URL safe; some browsers might reject these. You can always use something like data:image/svg+xml;charset=utf-8,${encodeURIComponent(svgStr)}.
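Here is a small sketch of that approach. The helper name svgToDataUrl and the sample SVG are made up for the example, and it spells the encoding parameter as charset=utf-8. Note how characters like < and # would otherwise end up raw inside the URL.

```javascript
/**
 * Wrap an SVG string in a (non-Base64) Data URL.
 * (`svgToDataUrl` is a hypothetical helper for illustration.)
 * @param {string} svgStr
 * @returns {string}
 */
function svgToDataUrl(svgStr) {
    // encodeURIComponent() percent-encodes everything that is not URL safe,
    // including "#", which would otherwise start a URL fragment
    return `data:image/svg+xml;charset=utf-8,${encodeURIComponent(svgStr)}`;
}

const exampleSvg = '<svg xmlns="http://www.w3.org/2000/svg" width="10" height="10">' +
    '<rect width="10" height="10" fill="#f00"/></svg>';
const exampleSvgUrl = svgToDataUrl(exampleSvg);
// In a browser: document.querySelector('img').src = exampleSvgUrl;
```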

Loading Data via Buffers

This is getting into the advanced part of data loading in JavaScript, but I’d like to point out that certain elements (HTMLMediaElement based, i.e. video and audio) now support loading data more directly, via streams, buffers, and the MediaSource interface. Collectively, most of these technologies fall under the “Media Source Extensions API” (aka MSE), which is an evolving specification for more advanced and dynamic media sources, such as append-able buffers.

The advantage to this approach is that it allows you to load in data in chunks, as opposed to through a single resource URL or blob. This makes it ideal for streaming applications, where video or audio is progressively loaded (and/or rendered).

Certain elements can even emit streams, which can be then captured and fed back into other media elements. A great example of this: you can stream data directly from a <canvas> element to a <video> element on the same page (demo). Or stream from one video to another.
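The canvas-to-video case can be sketched in a few lines. This is my own minimal version, not the code from the demo: mirrorCanvasToVideo is a made-up name, and it assumes the <video> element either has the autoplay attribute or that you call play() on it yourself afterwards.

```javascript
/**
 * Feed whatever is drawn on a canvas into a video element.
 * (`mirrorCanvasToVideo` is a hypothetical helper for illustration.)
 * @param {HTMLCanvasElement} canvasElem
 * @param {HTMLVideoElement} videoElem
 * @param {number} [frameRate] Maximum capture rate, in frames per second
 * @returns {MediaStream}
 */
function mirrorCanvasToVideo(canvasElem, videoElem, frameRate = 30) {
    // captureStream() returns a MediaStream that follows the canvas contents
    const stream = canvasElem.captureStream(frameRate);
    // A MediaStream is attached through srcObject, not through src
    videoElem.srcObject = stream;
    return stream;
}
```

Returning the stream lets the caller stop its tracks later, once the mirroring is no longer needed.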

I’m not going to get too in-depth into this, as it is an advanced topic (and I have a future blog post in progress on it), but the general minimal steps required to load data into a (video) element with a dynamic MediaSource are:

  1. Create a new MediaSource instance (new MediaSource())
  2. Create an object URL that points to that source, and point the video to it
    • Example: videoElem.src = URL.createObjectURL(mediaSource)
  3. On the MediaSource instance, listen for the sourceopen event. Once it fires, you can begin preparing to load data into it
  4. Before you start loading data, you need to create a buffer to hold it. You do so by creating a SourceBuffer attached to the source, with mediaSource.addSourceBuffer(mimeType)
    • Example: const sourceBuffer = mediaSource.addSourceBuffer('video/webm; codecs="vp9"');
  5. Once you have the buffer, you can finally start loading in raw chunks of binary data
  6. One way to load in a chunk is with ArrayBuffer(s), via the sourceBuffer.appendBuffer(ArrayBuffer) method
    • If you have a blob instead of an ArrayBuffer, you can use myBlob.arrayBuffer() to get Promise<ArrayBuffer>
  7. Eventually, you also need to deal with signaling that the end of data has been reached, and controlling the playback
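The steps above can be roughly sketched as a single function. Treat this as an outline under assumptions rather than a complete player: playChunks is a made-up name, the chunks are assumed to already be ArrayBuffers that together form a stream the browser can parse for the given MIME type, and playback control and detailed error handling are left out.

```javascript
/**
 * Append a list of binary chunks to a video through a MediaSource.
 * (`playChunks` is a hypothetical helper; browser-only.)
 * @param {HTMLVideoElement} videoElem
 * @param {ArrayBuffer[]} chunks Binary chunks, in playback order
 * @param {string} mimeType e.g. 'video/webm; codecs="vp9"'
 * @returns {Promise<void>} Resolves once every chunk has been appended
 */
function playChunks(videoElem, chunks, mimeType) {
    return new Promise((resolve, reject) => {
        if (!MediaSource.isTypeSupported(mimeType)) {
            reject(new Error(`Unsupported type: ${mimeType}`));
            return;
        }
        // Steps 1 and 2: create the source and point the video at it
        const mediaSource = new MediaSource();
        videoElem.src = URL.createObjectURL(mediaSource);

        // Step 3: wait for the source to open
        mediaSource.addEventListener('sourceopen', () => {
            // Step 4: create the buffer that will hold the data
            const sourceBuffer = mediaSource.addSourceBuffer(mimeType);
            let nextIndex = 0;

            const appendNext = () => {
                if (nextIndex < chunks.length) {
                    // Steps 5 and 6: appendBuffer() works asynchronously and
                    // fires 'updateend' when it is ready for the next chunk
                    sourceBuffer.appendBuffer(chunks[nextIndex++]);
                } else {
                    // Step 7: signal that no more data is coming
                    sourceBuffer.removeEventListener('updateend', appendNext);
                    mediaSource.endOfStream();
                    resolve();
                }
            };

            sourceBuffer.addEventListener('updateend', appendNext);
            sourceBuffer.addEventListener('error', () => reject(new Error('Append failed')));
            appendNext();
        }, { once: true });
    });
}
```

Appending one chunk per updateend event matters because calling appendBuffer() while a previous append is still in progress throws an error.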

You can find a full example of these steps put together on MDN’s page for SourceBuffer, and an even more advanced example in the MediaSource / MSE spec. As I mentioned, I’m also planning to write and release a blog post on the topic soon.

Disclaimers

I wrestled with the decision to include this section, as I don’t want to make any developers feel like I am trying to “shame” them, but I also feel that it is important to include it due to the level of abuse that Data URLs and Base64 often receive. This is an attempt to point out situations in which these alternative loading techniques are used, when maybe they shouldn’t be. Or ways in which they can be used incorrectly. So, without further delay:

  • Base64 is not a form of encryption. It cannot be used to hide passwords, secrets, or prevent assets from being downloaded
  • Base64 is generally a less efficient way of storing binary data (the encoded text is roughly a third larger than the raw bytes), and converting Base64 strings into other formats takes a non-zero amount of processing power
    • This is (part of) why taking a 30 second video and storing it directly as a Base64 encoded string in the page and loading via Data URL, instead of just using a normal URL as the src, is very much a bad idea
  • Data URLs are not interchangeable with regular URLs, or Object URLs. There are caveats to how they work
  • Although MediaSource / MSE has been worked on since early 2013, it is a constantly evolving spec, and it can be difficult to find info on it. Not all browsers support it equally either.
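To put a number on the efficiency point from the list above: standard Base64 (with padding) turns every group of 3 input bytes into 4 output characters, so the size overhead is easy to estimate. The helper name base64Length is my own, and the 3 MiB figure is just an example value.

```javascript
/**
 * Length, in characters, of the padded Base64 encoding of `byteCount` bytes.
 * (`base64Length` is a hypothetical helper for illustration.)
 * @param {number} byteCount
 * @returns {number}
 */
function base64Length(byteCount) {
    // Every 3 bytes become 4 characters; a final partial group is padded with "="
    return 4 * Math.ceil(byteCount / 3);
}

const rawByteCount = 3 * 1024 * 1024; // a 3 MiB file
const encodedCharCount = base64Length(rawByteCount); // 4 MiB worth of characters
```

That is before counting the data:{mimeType};base64, prefix, or the fact that the string may be held in memory alongside the decoded copy.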
