Add more default internal audio format. #3008

Merged
merged 1 commit into from
Apr 29, 2023

@toots (Member) commented Apr 15, 2023

Rationale

Issue #2998 pointed out large memory usage when buffering data.

After doing some experiments, it appears that the most important factor for memory usage with audio content is our internal data representation, namely 64 bit float. See below for a discussion on memory usage.

While 64 bit floats are quite useful for internal computations, they take up 4 times more space than 16 bit integers, which are considered acceptable for CD quality.

This matters for large buffers, but it also applies to all the smaller intermediary buffers used throughout our processing, e.g. the harbor input buffer, the file decoder buffer, etc.

Changes

This PR introduces ways to mitigate the issue by adding new internal formats pcm_s16 and pcm_f32. These two formats should be usable the same way as the native pcm format.

At this moment, only the FFmpeg decoder and encoder support them. Sources can also be converted back and forth (see below).

These formats can be selected using the ffmpeg encoder:

%ffmpeg(
  %audio(pcm_s16, codec="aac")
)

Type annotations also work:

s = (single("..."):source(audio=pcm_f32))

Of course, operators working with audio data still need native float. To that end, conversion operators are introduced at the track level (track.{encode,decode}.audio.{pcm_s16,pcm_f32}) and at the source level (audio.{encode,decode}.{pcm_s16,pcm_f32}).
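To make the idea of converting between native float and 16 bit integer samples concrete, here is a minimal conceptual sketch in Python. It is not Liquidsoap's implementation: the function names and the 32767 scaling constant are assumptions chosen for illustration. It only shows the general technique (clamp, scale, quantize, then expand back) and the resulting 4x difference in per-sample storage.

```python
from array import array

INT16_MAX = 32767  # assumed scaling constant; the PR does not specify one


def float_to_s16(samples):
    """Clamp floats to [-1.0, 1.0] and quantize to signed 16 bit integers."""
    return array("h", (round(max(-1.0, min(1.0, x)) * INT16_MAX) for x in samples))


def s16_to_float(samples):
    """Expand signed 16 bit integers back to 64 bit floats in [-1.0, 1.0]."""
    return array("d", (x / INT16_MAX for x in samples))


# Hypothetical sample values; 1.7 is out of range and gets clamped.
native = array("d", [0.0, 0.25, -0.5, 1.0, -1.0, 1.7])
packed = float_to_s16(native)
restored = s16_to_float(packed)

print(native.itemsize, packed.itemsize)  # bytes per sample for each representation
```

The round trip is lossy: each restored sample can differ from the original by up to about one quantization step, which is why operators that compute on audio data still work on native float.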

Conventions and implementation specifics

All APIs to specify audio pcm types now have an explicit pcm_kind argument and encoders/decoders must explicitly specify which pcm implementation they are working with.

On the typing side of things, the convention is now to use (kind, format), with format being Content.Audio.format for all pcm implementations. This makes it possible to share and unify audio params across pcm types. Typically:

% liquidsoap -h audio.decode.pcm_s16
Decode audio track to pcm_s16

Type: (?id : string?, source(audio=pcm_s16('a), 'b)) -> source(audio=pcm('a), 'b)

Discussion on memory usage

The following script was used as the basic script to test memory usage under different conditions:

s = input.http("https://icecast.radiofrance.fr/fip-hifi.aac")

s = buffer(buffer=100., s)

output.file(fallible=true,
  %ffmpeg(%audio(codec="aac")),
  "/tmp/bla.aac", 
  s
)

thread.run(every=3., fun () -> begin
  let {process_physical_memory} = runtime.memory()
  print("#{time()} #{process_physical_memory}")
end)

Here's the default:

[Figure: memory usage graph with the default pcm format]

That's about 150 MB of buffered data, roughly twice the expected raw size of ~70 MB. It's not clear yet what causes the difference; most likely it is overhead added by OCaml boxing.

Next, I tested with the new pcm_s16 format:

[Figure: memory usage graph with the pcm_s16 format]

Here, we see an initial drop from a garbage collection cycle. Overall, usage seems to stay below ~20 MB, which is close to the expected raw data size of ~17 MB.
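The expected raw sizes quoted for the two measurements can be reproduced with simple arithmetic. The sketch below uses Python purely as a calculator and assumes a 44.1 kHz stereo stream, which the PR does not state explicitly; the 100 second duration comes from buffer(buffer=100., s) in the test script. The constant and function names are illustrative.

```python
# Assumed stream parameters (not stated in the PR): 44.1 kHz, stereo.
SAMPLE_RATE = 44_100
CHANNELS = 2
BUFFER_SECONDS = 100  # matches buffer(buffer=100., s) in the test script

# Bytes needed to store one sample in each internal format.
BYTES_PER_SAMPLE = {"pcm": 8, "pcm_f32": 4, "pcm_s16": 2}


def raw_buffer_bytes(fmt):
    """Raw payload size of the buffer for a given format, excluding any runtime overhead."""
    return BUFFER_SECONDS * SAMPLE_RATE * CHANNELS * BYTES_PER_SAMPLE[fmt]


for fmt in BYTES_PER_SAMPLE:
    print(f"{fmt}: {raw_buffer_bytes(fmt) / 1e6:.2f} MB")
```

Under these assumptions the native format needs about 70.6 MB and pcm_s16 about 17.6 MB, in line with the ~70 MB and ~17 MB figures above. pcm_f32 would land halfway between, at about 35.3 MB.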

The underlying implementation for this data type is a bigarray with a pointer outside of the OCaml memory to store data. Most likely, this is adding much less overhead!

It's worth noting, though, that the garbage collector can be pretty lazy about collecting bigarrays. This setting can help but will increase CPU usage as a trade-off:

runtime.gc.set(runtime.gc.get().{custom_major_ratio = 10})

@toots toots requested a review from smimram April 15, 2023 00:44
@toots toots marked this pull request as ready for review April 16, 2023 04:22
@toots toots force-pushed the pcm_s16 branch 3 times, most recently from e238245 to 338b472 April 16, 2023 20:18
@toots toots force-pushed the pcm_s16 branch 4 times, most recently from 33bfc85 to 0d93125 April 28, 2023 15:40
@toots toots merged commit b71a889 into main Apr 29, 2023
@toots toots deleted the pcm_s16 branch April 29, 2023 15:20