Getting gapless audio when concatenating a sequence of videos A, B, C on top of a single audio track #7086

Closed
brAzzi64 opened this issue Mar 14, 2020 · 3 comments

@brAzzi64

Hi,

I'm trying to play a single audio track on top of a concatenation of video segments, as described originally in #6103. After the fix 0a89d0e recently linked on that issue, the approach suggested by @tonihei on Jul 1, 2019 is working really well :)

The only problem remaining is that despite making sure the "segment" duration is set to a multiple of the sample rate of the audio file (in microseconds), you can still hear a small audio gap during playback.

Is there a way to get rid of this?

Here's a patch with my current code, that applies on top of ExoPlayer's sample code (branch dev-v2):
https://gist.github.com/brAzzi64/3a0b37c0cc9e9f079af8041ce9959f92

Please let me know if you see anything off here.

Thanks!

@ojw28
Contributor

ojw28 commented Apr 9, 2020

@tonihei - Any guidance?

@tonihei
Collaborator

tonihei commented Apr 9, 2020

Sorry, I haven't gotten around to checking this yet. I'll update the issue once I have something.

@tonihei
Collaborator

tonihei commented Apr 20, 2020

I finally found the time to check what's happening with your example. Sorry for the delay!

It looks as if the "gap" you can hear is the player briefly going into the BUFFERING state just after the transition. This happens because the clip start positions for the video don't align with keyframes, so the player needs to decode all frames from the previous keyframe before it can continue rendering from the requested start position. For reference, the keyframes are at 0, 5_333_333 and 10_666_666 (in microseconds) in your file. If you use any of those keyframe positions as the start position for the video, everything is nice and smooth. Note that if you use enableInitialDiscontinuity=false, you must ensure that the first sample is a keyframe. Otherwise the decoder will see timestamps jumping backwards, and some decoders may break completely when this happens.
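
As a rough illustration of picking a keyframe-aligned start position, here is a minimal sketch in plain Java. The class and method names (`KeyframeSnap`, `snapToPreviousKeyframeUs`) are hypothetical and not part of ExoPlayer; the keyframe list is simply the three timestamps quoted above, and the helper assumes the array is sorted ascending and non-empty.

```java
import java.util.Arrays;

/** Hypothetical helper, not an ExoPlayer API: snaps a clip start to a keyframe. */
public final class KeyframeSnap {

  // Keyframe timestamps in microseconds, as quoted in this thread for the sample file.
  public static final long[] KEYFRAMES_US = {0L, 5_333_333L, 10_666_666L};

  /**
   * Returns the latest keyframe timestamp at or before positionUs.
   * Assumes sortedKeyframesUs is sorted ascending and has at least one entry.
   * Positions before the first keyframe are clamped to the first keyframe.
   */
  public static long snapToPreviousKeyframeUs(long[] sortedKeyframesUs, long positionUs) {
    int index = Arrays.binarySearch(sortedKeyframesUs, positionUs);
    if (index >= 0) {
      // Exact match: the position already is a keyframe.
      return sortedKeyframesUs[index];
    }
    // binarySearch returns (-(insertionPoint) - 1) when the key is not found.
    int insertionPoint = -index - 1;
    return sortedKeyframesUs[Math.max(0, insertionPoint - 1)];
  }

  public static void main(String[] args) {
    // A requested start of 6 seconds falls between the second and third keyframes.
    System.out.println(snapToPreviousKeyframeUs(KEYFRAMES_US, 6_000_000L));
  }
}
```

The snapped value would then be used as the start position passed to ClippingMediaSource for the video clip, so that the first decoded sample is a keyframe.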

I also discovered a small bug in our fMP4 extractor that causes rounding errors for those clipping points (we round the sample timestamps to milliseconds and then convert them back to microseconds). Not super bad, but I'll fix that too.
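
To make the size of that rounding error concrete, here is a small standalone sketch. It is not the extractor code: `TimestampRounding` and `roundTripThroughMillis` are hypothetical names, and it assumes round-to-nearest when converting to milliseconds (the thread doesn't say whether the extractor rounded or truncated).

```java
/** Hypothetical illustration of a microseconds -> milliseconds -> microseconds round trip. */
public final class TimestampRounding {

  /** Rounds timeUs to the nearest millisecond, then scales back to microseconds. */
  public static long roundTripThroughMillis(long timeUs) {
    long timeMs = Math.round(timeUs / 1000.0); // assumption: round to nearest
    return timeMs * 1000L;
  }

  public static void main(String[] args) {
    long[] keyframesUs = {0L, 5_333_333L, 10_666_666L};
    for (long timeUs : keyframesUs) {
      long roundTrippedUs = roundTripThroughMillis(timeUs);
      // The difference is the mismatch against a seek table kept in microseconds.
      System.out.println(timeUs + " -> " + roundTrippedUs
          + " (error " + (roundTrippedUs - timeUs) + " us)");
    }
  }
}
```

Under that assumption, 5_333_333 becomes 5_333_000 and 10_666_666 becomes 10_667_000, so the sample timestamps no longer match keyframe positions expressed in microseconds.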

One final remark: You don't need to use the long ClippingMediaSource constructor for the audio tracks because they don't use the initial discontinuity by default (because audio samples are always "keyframes"). This parameter only makes a difference for video tracks.

icbaker pushed a commit that referenced this issue Apr 27, 2020
The sample timestamps are currently rounded to milliseconds, only to
be multiplied by 1000 later. This causes rounding errors where the sample
timestamps don't match the timestamps in the seek table (which are already
in microseconds).

issue:#7086
PiperOrigin-RevId: 307630559
ojw28 pushed a commit that referenced this issue May 28, 2020
@ojw28 ojw28 closed this as completed Jun 9, 2020
@google google locked and limited conversation to collaborators Aug 8, 2020