Using Binary Data with Front-End JavaScript and the Web

Date Posted: Nov. 20, 2020
Last Updated: Nov. 25, 2020

Using Binary Data as Source For Elements

Loading Data via Object URLs

If you already have your data stored in a binary format, in memory, and need to load it into an element (for example an img element), there is an alternative to Base64 DataURLs that is much preferable: Object URLs.

These are essentially virtual URLs that point to raw data; this could be a blob of binary data stored in memory, or even a reference to a user’s local file from their OS. Because they are references to an existing data location, they use less memory than Base64 strings (and you can even release the pointer after loading).

// Example: Loading an image blob
// (top-level await needs an ES module; otherwise wrap this in an async function)
const blob = await (await fetch('https://www.wikipedia.org/portal/wikipedia.org/assets/img/Wikipedia-logo-v2.png')).blob();

const imageElem = document.querySelector('img');
// Good practice is to release once loaded
imageElem.onload = () => {
    URL.revokeObjectURL(imageElem.src);
};
imageElem.src = URL.createObjectURL(blob);

Just like Data URLs, loading data through Object URLs works for a variety of media elements, including <video> elements!
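
If your binary data is sitting in a typed array rather than a Blob, one way to get an Object URL is to wrap the bytes in a Blob first. This is only a minimal sketch: createObjectUrlFromBytes is a hypothetical helper name, and the bytes below are just the PNG file signature used as stand-in data (not a complete image).

```javascript
// Hypothetical helper: wrap raw bytes in a Blob and return an Object URL for it
function createObjectUrlFromBytes(bytes, mimeType) {
    const blob = new Blob([bytes], { type: mimeType });
    return URL.createObjectURL(blob);
}

// Stand-in data: the 8-byte PNG file signature (not a full, displayable image)
const pngSignature = new Uint8Array([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);
const objectUrl = createObjectUrlFromBytes(pngSignature, 'image/png');

// In a page, you would assign objectUrl to an element's src at this point,
// and then release the URL once the element has finished loading
URL.revokeObjectURL(objectUrl);
```

A File from an <input type="file"> is also a Blob, so it can be passed to URL.createObjectURL() directly, without the wrapping step.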

Loading Data via Base64 String Data URLs

Data URLs (formerly called Data URIs) are essentially just strings that contain the data itself, as opposed to a pointer to where the data resides (which is how normal URLs and Object URLs work).

They follow a standardized syntax:

data:{mimeType};base64,{dataStr}

# Example
data:image/bmp;base64,Qk1aAAAAAAAAAEIAAAAoAAAABgAAAAYAAAABAAQAAAAAAAAAAADEDgAAxA4AAAMAAAADAAAAAAAA/wAA//8A2P//IAEgAAASAAABIAEAEgASACABIAAAEgAA

# If the data is *not* base64 encoded, you can leave off the `;base64` section
# Example:
data:text/plain,hello
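
As a rough sketch of how a string in this syntax could be built from bytes you already have in memory, here is one approach using btoa(). The helper name bytesToBase64DataUrl is made up for this example, and the byte-by-byte loop is meant for small payloads.

```javascript
// Hypothetical helper: build a Base64 Data URL from raw bytes
function bytesToBase64DataUrl(bytes, mimeType) {
    // btoa() expects a "binary string" with one character per byte
    let binaryStr = '';
    for (const byte of bytes) {
        binaryStr += String.fromCharCode(byte);
    }
    return `data:${mimeType};base64,${btoa(binaryStr)}`;
}

const helloBytes = new TextEncoder().encode('hello');
const helloDataUrl = bytesToBase64DataUrl(helloBytes, 'text/plain');
console.log(helloDataUrl); // data:text/plain;base64,aGVsbG8=
```

If you are starting from a Blob in the browser, FileReader's readAsDataURL() method produces the same kind of string for you.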

They can be used in (AFAIK) pretty much any element with a src attribute, even including <iframe> and <script> tags!

<script src="data:text/plain,alert('hi!')"></script>
<!-- Base64: alert('hi from base64') -->
<script src="data:text/plain;base64,YWxlcnQoJ2hpIGZyb20gYmFzZTY0JykgDQo="></script>

If you are not using Base64 encoding with Data URLs, you should be extra careful about characters that are not URL safe; some browsers might reject these. You can always percent-encode the payload, with something like data:image/svg+xml;charset=utf-8,${encodeURIComponent(svgStr)}.
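
Here is a small sketch of that percent-encoding approach. The svgToDataUrl helper and the sample SVG markup are made up for illustration, and the charset parameter is written out in full as charset=utf-8.

```javascript
// Hypothetical helper: build a non-Base64 Data URL from SVG markup
function svgToDataUrl(svgStr) {
    // encodeURIComponent escapes characters such as <, >, " and #,
    // which would otherwise be unsafe inside a URL
    return `data:image/svg+xml;charset=utf-8,${encodeURIComponent(svgStr)}`;
}

// Sample markup, only for demonstration
const svgStr = '<svg xmlns="http://www.w3.org/2000/svg" width="10" height="10"><rect width="10" height="10" fill="#f00"/></svg>';
const svgDataUrl = svgToDataUrl(svgStr);

// In a page: document.querySelector('img').src = svgDataUrl;
```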

Loading Data via Buffers

This is getting into the advanced part of data loading in JavaScript, but I’d like to point out that certain elements (HTMLMediaElement based, i.e. video and audio) now support loading data more directly, via streams, buffers, and the MediaSource interface. Collectively, most of these technologies fall under the “Media Source Extensions API” (aka MSE), which is an evolving specification for more advanced and dynamic media sources, such as append-able buffers.

The advantage to this approach is that it allows you to load in data in chunks, as opposed to through a single resource URL or blob. This makes it ideal for streaming applications, where video or audio is progressively loaded (and/or rendered).
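
Before relying on this approach, it is probably worth checking that the browser has MediaSource at all, and that it accepts the format you plan to feed it. A minimal sketch, where canUseMediaSource is a hypothetical helper name and the WebM / VP9 MIME string is only an example:

```javascript
// Hypothetical helper: true only if MediaSource exists and accepts this MIME type
function canUseMediaSource(mimeType) {
    return typeof MediaSource !== 'undefined' && MediaSource.isTypeSupported(mimeType);
}

const mseSupported = canUseMediaSource('video/webm; codecs="vp9"');
if (!mseSupported) {
    // Fall back to a plain URL or Object URL as the src instead
    console.log('MediaSource is not available for this format');
}
```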

Certain elements can even emit streams, which can then be captured and fed back into other media elements. A great example of this: you can stream data directly from a <canvas> element to a <video> element on the same page (demo). Or stream from one video to another.
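
One way to wire up the canvas-to-video case is with the canvas captureStream() method and the video's srcObject property. This is a sketch under assumptions, not necessarily how the linked demo does it: mirrorCanvasToVideo is a hypothetical helper, and the 30 frames per second default is an arbitrary choice.

```javascript
// Hypothetical helper: feed whatever is drawn on a canvas into a video element
function mirrorCanvasToVideo(canvasElem, videoElem, frameRate = 30) {
    // captureStream() returns a MediaStream of the canvas contents
    const stream = canvasElem.captureStream(frameRate);
    // Streams are attached through srcObject, not through src
    videoElem.srcObject = stream;
    return stream;
}

// Only runs in a page that actually has both elements
if (typeof document !== 'undefined') {
    const canvasElem = document.querySelector('canvas');
    const videoElem = document.querySelector('video');
    if (canvasElem && videoElem) {
        mirrorCanvasToVideo(canvasElem, videoElem);
    }
}
```

The video still needs to be playing (via the autoplay attribute or a play() call) before the mirrored frames show up.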

I’m not going to get too in-depth into this, as it is an advanced topic (and I have an entire separate blog post on it), but the general minimal steps required to load data into a (video) element with a dynamic MediaSource are:

  1. Create a new MediaSource instance (new MediaSource())
  2. Create an object URL that points to that source. Point the video to it
    • Example: videoElem.src = URL.createObjectURL(mediaSource)
  3. On the MediaSource instance, listen for the sourceopen event. Once it fires, you can begin preparing to load data into it
  4. Before you start loading data, you need to create a buffer to hold it. You do so by creating a SourceBuffer attached to the source, with mediaSource.addSourceBuffer(mimeType)
    • Example: const sourceBuffer = mediaSource.addSourceBuffer('video/webm; codecs="vp9"');
  5. Once you have the buffer, you can finally start loading in raw chunks of binary data
  6. One way to load in a chunk is with ArrayBuffer(s), via the sourceBuffer.appendBuffer(ArrayBuffer) method
    • If you have a blob instead of an ArrayBuffer, you can use myBlob.arrayBuffer() to get Promise<ArrayBuffer>
  7. Eventually, you also need to deal with signaling that the end of data has been reached, and controlling the playback
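
Put together, the steps above might look something like the sketch below. Treat it as a rough outline rather than a complete player: playChunks is a hypothetical helper, chunks is assumed to be an array of ArrayBuffers that already hold valid media segments for the given MIME type, and error handling is minimal.

```javascript
// Hypothetical helper: append ArrayBuffer chunks to a video through a MediaSource.
// Assumes `chunks` holds valid media segments for `mimeType`, in playback order.
function playChunks(videoElem, chunks, mimeType) {
    return new Promise((resolve, reject) => {
        const mediaSource = new MediaSource(); // step 1
        videoElem.src = URL.createObjectURL(mediaSource); // step 2

        mediaSource.addEventListener('sourceopen', () => { // step 3
            try {
                const sourceBuffer = mediaSource.addSourceBuffer(mimeType); // step 4
                const queue = [...chunks];

                const appendNext = () => {
                    if (queue.length === 0) {
                        mediaSource.endOfStream(); // step 7: signal the end of data
                        resolve();
                        return;
                    }
                    sourceBuffer.appendBuffer(queue.shift()); // steps 5 and 6
                };

                // appendBuffer() is asynchronous, so wait for updateend
                // before appending the next chunk
                sourceBuffer.addEventListener('updateend', appendNext);
                sourceBuffer.addEventListener('error', () => {
                    reject(new Error('SourceBuffer reported an error'));
                });
                appendNext();
            } catch (err) {
                reject(err);
            }
        }, { once: true });
    });
}

// Usage in a page (not run here), with chunks obtained elsewhere:
// playChunks(document.querySelector('video'), chunks, 'video/webm; codecs="vp9"');
```

The queue is there because a SourceBuffer only accepts one append at a time; calling appendBuffer() again while the previous append is still in progress throws an error.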

You can find a full example of these steps put together on MDN’s page for SourceBuffer, and an even more advanced example in the MediaSource / MSE spec. I also have a separate blog post that goes into depth on this topic!

Disclaimers

I wrestled with the decision to include this section, as I don’t want to make any developers feel like I am trying to “shame” them, but I also feel that it is important to include it due to the level of abuse that Data URLs and Base64 often receive. This is an attempt to point out situations in which these alternative loading techniques are used when maybe they shouldn’t be, or ways in which they can be used incorrectly. So, without further delay:

  • Base64 is not a form of encryption. It cannot be used to hide passwords, secrets, or prevent assets from being downloaded
  • Base64 is generally a less efficient way of storing binary data (the encoded text is roughly a third larger than the raw bytes), and converting Base64 strings into other formats takes a non-zero amount of processing power
    • This is (part of) why taking a 30 second video and storing it directly as a Base64 encoded string in the page and loading via Data URL, instead of just using a normal URL as the src, is very much a bad idea
  • Data URLs are not interchangeable with regular URLs, or Object URLs. There are caveats to how they work
  • Although MediaSource / MSE has been worked on since early 2013, it is a constantly evolving spec, and it can be difficult to find info on it. Not all browsers support it equally either.
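
To make the first two points a bit more concrete, here is a quick sketch you can run in a browser console. The password string and the 300-byte buffer are arbitrary stand-ins.

```javascript
// Base64 is trivially reversible: no key or secret is involved
const encodedSecret = btoa('my-secret-password');
const decodedSecret = atob(encodedSecret);
console.log(decodedSecret); // my-secret-password

// Base64 spends 4 output characters on every 3 input bytes
const rawBytes = new Uint8Array(300);
let binaryStr = '';
for (const byte of rawBytes) {
    binaryStr += String.fromCharCode(byte);
}
const encodedLength = btoa(binaryStr).length;
console.log(rawBytes.length, encodedLength); // 300 400
```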
