Investigating Stalled Video Segments in MediaSource Buffers

I just finished tracking down the culprit behind a very vexing problem, and I want to share my findings in the hope that they help someone else.

I’ve been working on constructing an HTML video dynamically, using the MediaSource APIs (MSE) and the methods outlined in my recent post on MediaSource approaches. I’m trying to build a seekable video out of distinct WEBM files, as opposed to byte chunks out of a stream. Loading the files directly via src=file.webm, or one at a time with MediaSource, worked fine, but appending multiple segments together triggered some bizarre behavior.
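For context, the general approach looks roughly like this. This is a hedged sketch, not my exact code; the MIME string and the list of segment URLs are placeholder assumptions:

```javascript
// Sketch: append several WEBM files into a single SourceBuffer.
// Assumes all segments share the same container/codecs.
const MIME = 'video/webm; codecs="vp9, opus"';

async function appendSegments(videoEl, urls) {
  const mediaSource = new MediaSource();
  videoEl.src = URL.createObjectURL(mediaSource);
  await new Promise((res) =>
    mediaSource.addEventListener('sourceopen', res, { once: true })
  );

  const sb = mediaSource.addSourceBuffer(MIME);
  for (const url of urls) {
    const buf = await (await fetch(url)).arrayBuffer();
    sb.appendBuffer(buf);
    // appendBuffer is asynchronous; wait for 'updateend'
    // before issuing the next append
    await new Promise((res) =>
      sb.addEventListener('updateend', res, { once: true })
    );
  }
  mediaSource.endOfStream();
}
```

With well-formed input files, this is all it should take to get a seekable, multi-segment video.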

Issue

After loading multiple segments via MediaSource’s SourceBuffer.appendBuffer() and closing off the stream (MediaSource.endOfStream()), there would be no fatal errors in the console, but the video element would exhibit the following behavior (not always all at the same time):

  • The player element would let me press play, but then would “stall” – time would not move forward; the UI showed the video as “playing”, but it clearly was not
    • Sometimes it would play for a very brief amount of time (milliseconds) before stalling
  • The player element would be seek-able, and I could scrub through the timeline with correct previews. However, manually scrubbing to a position and then trying to resume playback would still result in a stalled state
  • Occasionally I would get the well-known play() request was interrupted by a call to pause() error, despite my code never calling pause()
    • I think this was actually caused by the stalled buffer internally (not my code) triggering a pause. This seems to be backed up by this comment: “Say you have audio+video in a single <video> tag. If just audio underflows, chrome will immediately pause playback”

These issues occurred across both Chromium and Firefox, but again, not always together or consistently.
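As an aside, the interrupted-play() error surfaces as a rejected promise, so it can at least be caught to keep the console clean while debugging the real stall. A generic pattern (not a fix for the underlying buffer issue):

```javascript
// Catch the AbortError raised when playback is paused (possibly by the
// browser itself, e.g. on audio underflow) before play() resolves.
function safePlay(videoEl) {
  const p = videoEl.play();
  if (p !== undefined) {
    p.catch((err) => {
      console.warn('play() interrupted:', err.name);
    });
  }
  return p;
}
```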

Investigating

I’ll admit it; my first attempts at investigation did not get me very far. I tried a bunch of trial-and-error, searched through the MediaSource docs, and switched SourceBuffer.mode between “sequence” and “segments”. Nothing helped.

Chrome Media Inspector

I next stumbled across an article that mentioned that Chrome has a “Media” inspector tab in DevTools. Holy smokes is this thing awesome! Firing up the Media inspector and running my code again gave me much better insight into what to focus my investigation on: audio buffering.

Looking at the timeline view, this is what I pretty consistently saw when the video stalled:

Chrome Dev Tools - Media Inspector - Stalled Audio Buffering

Uh… that’s not good 😰. Let’s look at some of the events that are firing when it starts stalling:

kBufferingStateChanged

event = {
    audio_buffering_state: {
        reason: "DEMUXER_UNDERFLOW",
        state: "BUFFERING_HAVE_NOTHING"
    }
}

kBufferingStateChanged

event = {
    pipeline_buffering_state: {
        for_suspended_start: false,
        reason: "DEMUXER_UNDERFLOW",
        state: "BUFFERING_HAVE_NOTHING"
    }
}

Hmm. “DEMUXER_UNDERFLOW” doesn’t exactly have a nice sound to it… Also, a message like this would occasionally show up in the “Messages” panel:

Skipping audio splice trimming at PTS=11626999us. Found only 1us of overlap, need at least 1000us. Multiple occurrences may result in loss of A/V sync.

That definitely seems related, and might also point to an audio buffer issue. What’s the deal?

To make this more confusing, there were a lot of things I could rule out:

  • Buffer / memory overflow / auto-eviction: a SourceBuffer is not an endless tube you can keep filling; it is more like a temporary reservoir. As such, it has limits and is subject to garbage collection. However, I could rule this out because my tests were using very small files, and the overall buffer should have been well below the 12 MB audio / 150 MB video limits.
  • Mismatched codecs: Although I couldn’t rule out issues with my source files (more on this shortly), I could rule out that I was using codecs the browser did not support, as:
    • There were no errors to that effect
    • Calling MediaSource.isTypeSupported(mimeStr) showed support
    • My codecs matched the MSE specification / WEBM specs
    • Video playback worked just fine if videos were loaded directly, bypassing combined appendBuffer calls
  • Network issues
    • Although there were no indicators of issues with fetch() or any other network requests, to be safe I even tried loading the videos directly as base64 strings embedded in the JS. Same issues.

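The codec-support check above is easy to script. A small sketch (the candidate MIME strings are my own examples, and the function degrades gracefully outside a browser):

```javascript
// Sanity check: confirm the browser claims support for the exact
// MIME + codec strings before blaming the source files.
const candidates = [
  'video/webm; codecs="vp9, opus"',
  'video/webm; codecs="vp8, vorbis"',
];

function supportedTypes(types) {
  if (typeof MediaSource === 'undefined') return []; // non-browser environment
  return types.filter((t) => MediaSource.isTypeSupported(t));
}
```

Note that isTypeSupported answering true is a claim about the codec string, not the file: a malformed file with a supported codec string will pass this check and still fail to play, which is exactly what happened here.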
To rule out something wrong with my actual JavaScript loading code (no errors in the console does not rule this out), I swapped out my source files for some others that are commonly used around the web as WEBM samples. The issues immediately went away, signaling that this was indeed a problem with my source files.

At this point though, that still didn’t clear up much. These files played just fine in standard media players and, to reiterate, also played just fine in <video> elements when loaded one at a time.

Checking Files with FFmpeg

I should have tried this from the very start, but at this point I realized I absolutely needed to inspect the video files themselves. Although I had never done this kind of thing before (other than viewing what VLC provides in their info popup, and that sort of thing), FFmpeg makes it super easy to do, with the -i (input) flag and no output file, or with the ffprobe tool.

To inspect a file, use:

ffmpeg -i FILE_NAME

# Or, with ffprobe:
ffprobe FILE_NAME

For more advanced file inspection, check out ffprobe tips.

When I inspected my source files, I saw something reallllly funky:

Duration: 00:00:01.65, start: -0.001000, bitrate: 22 kb/s
Stream #0:0: Video: vp9 (Profile 0), yuv420p(tv, bt709/unknown/bt709), 144x96, SAR 1:1 DAR 3:2, 10 fps, 10 tbr, 1k tbn, 1k tbc (default)
    Metadata:
        DURATION        : 00:00:00.207000000
Stream #0:1: Audio: opus, 48000 Hz, mono, fltp (default)
    Metadata:
        DURATION        : 00:00:01.648000000

So, not only is it a little strange that each stream has its own duration metadata (I don’t think that’s required), but it is alarming that the duration of the video stream completely fails to match the audio! These numbers don’t have to match exactly, but they are so wildly off that it seems suspect.
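If you want to flag this kind of mismatch programmatically, ffprobe can emit per-stream JSON (ffprobe -v quiet -print_format json -show_streams file.webm), where the per-stream DURATION tags show up under streams[i].tags.DURATION. A sketch of a checker; the 0.5-second tolerance is my own guess, not from any spec:

```javascript
// Parse "00:00:01.648000000"-style DURATION tags and compare streams.
function parseDuration(hms) {
  const [h, m, s] = hms.split(':').map(Number);
  return h * 3600 + m * 60 + s;
}

function durationsMatch(streams, toleranceSec = 0.5) {
  const secs = streams
    .map((st) => st.tags && st.tags.DURATION)
    .filter(Boolean)
    .map(parseDuration);
  if (secs.length < 2) return true; // nothing to compare
  return Math.max(...secs) - Math.min(...secs) <= toleranceSec;
}
```

Feeding in the broken file’s values (0.207 s video vs 1.648 s audio) flags a mismatch, while the fixed file’s near-identical values pass.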

Note: A negative “start time” is actually OK; it typically comes from encoder priming / codec delay (common with Opus audio)

Fix

I first tried the command suggested here, to patch the duration with FFmpeg:

ffmpeg -i broken.webm -c copy -fflags +genpts fixed.webm

However, that didn’t work.

At this point, it seemed likely that the file was just seriously malformed. If there are chunks of missing data or corrupted bytes, that is not something a stream copy (-c copy) through FFmpeg is going to be able to fix.

The file was created with a video editor (Shotcut), which has a lot of control over video output encoding, but rather than tweak every setting under the sun, my first thought was to update the program to the latest version. That… actually ended up fixing the issue. My guess is internally they upgraded their bundled version of the VP9 (or Opus) encoder, or how they were using it, and this in turn resulted in a “more valid” export file for my project.

Here is a sample of the inspection of the “fixed” file:

Duration: 00:00:01.71, start: -0.001000, bitrate: 10 kb/s
Stream #0:0: Video: vp9 (Profile 0), yuv420p(tv, bt709), 144x96, SAR 1:1 DAR 3:2, 10 fps, 10 tbr, 1k tbn, 1k tbc (default)
    Metadata:
        DURATION        : 00:00:01.707000000
Stream #0:1: Audio: opus, 48000 Hz, mono, fltp (default)
    Metadata:
        DURATION        : 00:00:01.708000000

Other Remarks

I also want to point out that in my previous MediaSource post and set of demos, I ran into a variant of this issue that only occurred in Firefox, and ffprobe results looked fine. The issue was specifically around multi-file appends, and I ended up just switching all my input files from VP9 to VP8, which seemed to resolve the issue.
