Cue more! #3634

Merged
merged 5 commits into from
Jan 15, 2024

Conversation

toots
Member

@toots toots commented Jan 13, 2024

More stuff for cue after @Moonbase59's remarks

  • Rename max_length to max_tracks. Even with the documentation, the parameter name is confusing.
  • Switch data to an unescaped string to allow more flexibility w.r.t. the spec.
  • Switch back to metadata for detecting new track marks. In most cases this will do the same but I think that this is closest to what people will expect in most situations.
  • Add a deduplicate_using to deduplicate metadata based on title/artist/album.
  • Add tests!
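
As a rough illustration of the deduplication idea in the list above, here is a minimal Python sketch. It is not the Liquidsoap implementation: the function name `deduplicate`, the dict representation, and the rule that an entry is a duplicate only when its selected fields equal those of the previously kept entry are all assumptions made for the example.

```python
def deduplicate(entries, keys=("title", "artist", "album")):
    """Drop an entry when its selected fields match the previously kept entry.

    `entries` is a list of metadata dicts in arrival order (hypothetical shape).
    """
    kept = []
    last_signature = None
    for meta in entries:
        # Missing fields count as empty strings so sparse metadata still compares.
        signature = tuple(meta.get(key, "") for key in keys)
        if signature != last_signature:
            kept.append(meta)
            last_signature = signature
    return kept
```

Under this rule a track that is played again later still produces a new entry; only immediate repeats, such as the same metadata arriving once per HLS segment, are collapsed.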

@Moonbase59

Moonbase59 commented Jan 13, 2024

Phew, this goes fast—everyone working instead of clubbing on a Saturday night!

On LS 2.3.0+git@4a2d37e, when using a stream as input, the TITLEs were missing:

TITLE "[2024-01-13][23:32:00]"
PERFORMER "Recorded with Liquidsoap"
REM COMMENT "http://server14613.streamplus.de:35194/;.mp3"
FILE "aufzeichnung.2024-01-13.233200.mp3" MP3
  TRACK 01 AUDIO
    INDEX 01 00:00:00
  TRACK 02 AUDIO
    INDEX 01 00:04:24
  TRACK 03 AUDIO
    INDEX 01 00:04:27
  TRACK 04 AUDIO
    INDEX 01 03:40:60
  TRACK 05 AUDIO
    INDEX 01 03:55:09
  TRACK 06 AUDIO
    INDEX 01 03:55:11

Stream definitely has metadata, example:

2024/01/13 23:36:01 [nowplaying:4] META: [("icy-br", "128"), ("icy-genre", "Rock, Hardrock, Metal"), ("icy-name", "Radio Paranoid - Skyrider - Nightflight 666"), ("icy-notice1", "<BR>This stream requires <a href=\"http://www.winamp.com/\">Winamp</a><BR>"), ("icy-notice2", "SHOUTcast Distributed Network Audio Server/Linux v1.9.8<BR>"), ("icy-pub", "1"), ("icy-url", "http://www.radio-paranoid.net/"), ("title", "Dio - Caught in the Middle"), ("url", "http://www.audiorealm.com")]

@toots
Member Author

toots commented Jan 13, 2024

It's still early here!

Yeah. Tracks and metadata have been rewritten in main, so you should expect this to need a little more time to adjust. I'd suggest testing my branch, where I've switched back to using metadata and added a metadata de-duplicator.

@Moonbase59

The "map_metadata" function also receives no metadata (just [] when I log m).

Your branch: That’d be cue-more?
Does that generate a .deb so I can test?

@Moonbase59

Moonbase59 commented Jan 13, 2024

Using your branch did it, thanks for the hint. Using LS 2.3.0+git@d1d3bbb now.

Now we’re talking:

TITLE "[2024-01-14][00:50:14]"
PERFORMER "Recorded with Liquidsoap"
REM COMMENT "https://radio.niteradio.net/listen/niteradio/radio.mp3"
DATE 2024-01-14
FILE "aufzeichnung.2024-01-14.005014.mp3" MP3
  TRACK 01 AUDIO
    TITLE "All Nights"
    PERFORMER "Asuntar"
    INDEX 01 00:09:15

DATE working (have to look up if it should be DATE or REM DATE on the album level).
Metadata mapping working (splitting the TITLE to TITLE and PERFORMER).

Could this super-hacky brute-force metadata mapping be written more elegantly?

def mapmeta(m)
  artist = ref(m["artist"])
  title = ref(m["title"])
  #log(level=4, label="cue", "META: #{metadata.cover.remove(m)}")
  log(level=4, label="cue", "META: #{m}")
  if string.length(m["artist"]) <= 0 then
    if string.contains(substring=" - ", m["title"]) then
      let (a, t) = string.split.first(separator=" - ", m["title"])
      artist := a
      title := t
      log(level=4, "a=#{a}, t=#{t}")
    end
  end
  [("title", "#{title()}"), ("artist", "#{artist()}")]
end
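
For comparison, the same mapping logic can be written without mutable references. The following is a minimal Python sketch of the idea rather than Liquidsoap code; the name `map_meta` and the choice to keep all other fields untouched are assumptions made for the example.

```python
def map_meta(meta):
    """Split an "Artist - Title" title when no artist field is present."""
    artist = meta.get("artist", "")
    title = meta.get("title", "")
    if not artist and " - " in title:
        # Split on the first separator only, so titles containing " - " survive.
        artist, title = title.split(" - ", 1)
    # Keep every other field and override title/artist with the resolved values.
    return {**meta, "title": title, "artist": artist}
```

For example, `map_meta({"title": "Dio - Caught in the Middle"})` yields the artist `Dio` and the title `Caught in the Middle`, while metadata that already carries an artist passes through unchanged.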

@Moonbase59

Moonbase59 commented Jan 14, 2024

@toots Did some more testing re DATE on CD and TRACK level. It seems most cue sheet parsers recognise REM DATE YYYY on both CD and TRACK level, but only a very few accept DATE …. Also, only a very few understand the YYYY-MM and YYYY-MM-DD formats (probably due to ID3v2 initially having only YEAR before we got the more detailed date formats). Tested with audio & video players, cue sheet cutters (Flacon), and some others (tools, media centers like LMS, etc.). Not tested with foobar2000, a popular Windows application, since I have no Windows machine.

So I propose to use REM DATE instead of DATE.

We don’t get all possible metadata with all inputs, but probably should think of feasible data to identify tracks with some formats.

  • Shoutcast/Icecast streams usually have only one "Title" field, most often formatted as "Artist - Title".
  • Exceptions are OGG streams; as far as I know, these can contain tag data.
  • HLS can contain tag data like Artist, Title, Album, Date, ISRC, as far as I know.
  • Recording from (internal) sources that were generated from annotated or tagged files/playlists could in theory have lots of tags.

The main goal of using a cue sheet together with a recording MP3/FLAC/WAV will normally be

  • legal reasons
  • air checks
  • keeping a show for later retransmission
  • record and list what a DJ transmitted, for licensing reasons
  • auto-generating DJ set lists—after the fact—to be published, for instance in a forum.

To reliably identify a track, it might be nice to include these metadata if available:

  • "CD" basis (file, album)
    • TITLE
    • PERFORMER
    • REM DATE (YYYY, YYYY-MM, or YYYY-MM-DD)
    • REM COMMENT
  • on a TRACK basis (shows can have a multitude of albums, unlike the original CD)
    • TITLE
    • PERFORMER
    • REM ALBUM
    • REM DATE (formats as above)
    • ISRC (this is becoming ever more important, and is often tagged). Must come after TRACK and before any INDEX commands.
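
The per-track layout proposed above could be rendered along these lines. This is a minimal Python sketch, not the operator's code: the helper names `cue_time` and `cue_track`, the metadata keys (`title`, `artist`, `album`, `year`, `isrc`), and the replacement of double quotes inside values are assumptions made for illustration. INDEX times use the MM:SS:FF format with 75 frames per second.

```python
FRAMES_PER_SECOND = 75  # cue sheet INDEX times are MM:SS:FF, 75 frames per second


def cue_time(seconds):
    """Format an offset in seconds as MM:SS:FF."""
    frames = round(seconds * FRAMES_PER_SECOND)
    minutes, rest = divmod(frames, 60 * FRAMES_PER_SECOND)
    secs, ff = divmod(rest, FRAMES_PER_SECOND)
    return f"{minutes:02d}:{secs:02d}:{ff:02d}"


def cue_track(number, offset, meta):
    """Render one TRACK block; optional fields are skipped when absent."""

    def quoted(value):
        # Assumption: no escaping scheme is relied on, so double quotes are replaced.
        return '"' + value.replace('"', "'") + '"'

    lines = [f"  TRACK {number:02d} AUDIO"]
    if meta.get("title"):
        lines.append(f"    TITLE {quoted(meta['title'])}")
    if meta.get("artist"):
        lines.append(f"    PERFORMER {quoted(meta['artist'])}")
    if meta.get("album"):
        lines.append(f"    REM ALBUM {quoted(meta['album'])}")
    if meta.get("year"):
        lines.append(f"    REM DATE {meta['year']}")
    if meta.get("isrc"):
        # ISRC is placed after TRACK and before the INDEX command.
        lines.append(f"    ISRC {meta['isrc']}")
    lines.append(f"    INDEX 01 {cue_time(offset)}")
    return "\n".join(lines)
```

Fields that a given input does not provide, which is the common case for Shoutcast/Icecast streams, simply produce no line, so the sheet stays valid with sparse metadata.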

A cue sheet can start with a track number higher than 1 (originally for multi-disc sets), and the numbers must count up sequentially, but I guess we can ignore that for our purposes (except maybe the mysterious "max_tracks" setting).

Guess we can also ignore most other options like UPC/EAN etc., since we’re not really talking about CD/DVD/BluRay media here. It’s mainly to create usable recordings and being able to see the titles broadcast, and jump around in a recording.

@Moonbase59

Moonbase59 commented Jan 14, 2024

Did several 2-hour testing blocks overnight, and the code (cue-more branch) works robustly, apart from my additional notes above of course:

  • Shoutcast 1.9.8 MP3 stream, fed by sc_trans
  • Icecast MP3 stream, fed by AzuraCast*
  • HLS stream, fed by AzuraCast (4s segments, overlap 2)*
  • Local playlist

* = On these, the cue points are often a few seconds off. I think this might be an AzuraCast issue, since I see metadata changing a few seconds off in their web player, too. Maybe it’s just due to the crossfading…

I will now do a 2-hour test from a local playlist, to verify this. Using a playlist and no crossfade, the cue points should be exact.

test8.liq, in case you want to cross-verify:

# test8.liq

# For timed tests, use `at` and/or `timeout`, i.e.
#   at 11:30
#   at> timeout 120m liquidsoap test8.liq
#   at> Ctrl+D

settings.log.level := 4

# A local playlist
uri = "~/Musik/Playlists/Radio/Electronic.m3u"
radio = mksafe(playlist(uri))

# Icecast stream shows burst mode and intro as first 3 tracks
#uri = "https://radio.niteradio.net/listen/niteradio/radio.mp3"
#radio = mksafe(input.http(uri))

# Another (http:) radio (Shoutcast 1.9.8 w/ sc_trans)
#uri = "http://server14613.streamplus.de:35194/;.mp3"
#radio = mksafe(input.http(uri))

# HLS produces MANY tracks, one per segment (4s in my case)
# and DUPLICATE track entries for each "Segments Overhead" specified (2 in my case).
#uri = "https://radio.niteradio.net/hls/niteradio/live.m3u8"
#radio = mksafe(input.http(uri))

date = time.string("%Y-%m-%d")
dt_start = time.string("[#{date}][%H:%M:%S]")
file = time.string("aufzeichnung.#{date}.%H%M%S")
# cut DATE down to YYYY, to test with software not recognising YYYY-MM-DD
date = string.sub(date, start=0, length=4)  # understood by most parsers

# UGLY UGLY
def mapmeta(m)
  artist = ref(m["artist"])
  title = ref(m["title"])
  album = ref(m["album"])
  #log(level=4, label="cue", "META: #{metadata.cover.remove(m)}")
  log(level=4, label="cue", "META: #{m}")
  if string.length(m["artist"]) <= 0 then
    if string.contains(substring=" - ", m["title"]) then
      let (a, t) = string.split.first(separator=" - ", m["title"])
      artist := a
      title := t
      log(level=4, "a=#{a}, t=#{t}")
    end
  end
  [("title", "#{title()}"), ("artist", "#{artist()}"), ("album", "#{album()}")]
end

radio =
  source.cue(
    title = dt_start,
    performer = "Recorded with Liquidsoap",
    date = date,
    comment = uri,
    # max_tracks leaves only the last "n" tracks in the cue sheet,
    # numbered track 01 .. n, but with correct INDEX 01 offsets.
    # No idea what the use case for this might be.
    # max_tracks = 10,
    map_metadata = mapmeta,
    file = file ^ ".mp3",
    file ^ ".cue",
    radio
  )

output.file(%mp3, file ^ ".mp3", radio)

@Moonbase59

Moonbase59 commented Jan 14, 2024

Local playlist to MP3+CUE roughly okay, but some non-exact cues. This seems to depend heavily on the player, for instance Audacious and VLC gave very different results for the same cue point.

I believe the CUE file to be correct, but it seems exact seeking in an MP3 file is not easy for every MP3 audio player…

Retrying the playlist 2-hour test using FLAC as target now.

Btw, file_type isn’t taken from the file extension in this version; it’s always MP3, even when outputting to a .flac file. Fortunately, we can set it separately and thus satisfy parsers that expect, for instance, WAVE with FLAC files.

EDIT: Results: With FLAC and WAV, all cue points are exact, and playback, too. So definitely a problem with some players not being able to seek MP3 correctly.

Member

@smimram smimram left a comment

It would be nice to use the file extension to get the default file type as suggested.

@@ -426,5 +430,6 @@ def source.cue(
end
end

source.on_track(s, handle_metadata)
Member

Not sure I understand here: why don't we use tracks rather than metadata?

Member Author

Metadata can happen without a track mark. In most situations, e.g. when working with files, track marks and metadata happen at the same time but it's not guaranteed.

Member

I agree with that but I thought it would be more logical to generate cue entries for tracks rather than arbitrary metadata

Member Author

I'm trying to capture the most general use case. I think metadata+deduplicate gives us the largest use-case cross-section.

@toots
Member Author

toots commented Jan 15, 2024

I've pushed the following:

  • Renamed max_tracks to last_tracks to clarify
  • Changed date to year, made it an int again. Since you reported that YYYY is the most common format, I think we should capture that and be explicit about it. Advanced users can typically re-use our code.
  • Added "isrc" and "cue_year" to map to ISRC and REM DATE per-track. Since date metadata formats are most likely a mess, I thought an explicit "cue_year" that you have to specify manually would be the most appropriate.
  • Changed DATE to REM DATE
  • Added doc about recognized meta in the operator's description.

Seems to me that this is plenty for now!

@toots
Member Author

toots commented Jan 15, 2024

Sorry I forgot:

  • Pick file type from file extension when it is not provided.
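
A minimal Python sketch of what picking the file type from the extension might look like. This is not the merged Liquidsoap code: the mapping table, the `MP3` fallback for unknown extensions, and the name `cue_file_type` are assumptions; FLAC is mapped to WAVE only because the thread mentions parsers that expect that.

```python
import os

# Assumed mapping from file extension to the cue sheet FILE type.
CUE_FILE_TYPES = {
    ".mp3": "MP3",
    ".wav": "WAVE",
    ".flac": "WAVE",  # some parsers expect WAVE for FLAC files
    ".aiff": "AIFF",
    ".aif": "AIFF",
}


def cue_file_type(filename, file_type=None, fallback="MP3"):
    """Return the explicit file type if given, else guess from the extension."""
    if file_type is not None:
        return file_type
    extension = os.path.splitext(filename)[1].lower()
    return CUE_FILE_TYPES.get(extension, fallback)
```

An explicit `file_type` always wins, so the guess only applies when the caller leaves it unset.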

@toots toots added this pull request to the merge queue Jan 15, 2024
Merged via the queue into main with commit 008f275 Jan 15, 2024
26 checks passed
@toots toots deleted the cue-more branch January 15, 2024 17:52
@Moonbase59

Moonbase59 commented Jan 15, 2024

Yep, plenty. Unfortunately, some things are now impossible (long date).

Since no one will have cue_year in their file’s tags, I’d need to create that from the many other possible metadata in a file.
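
One way such a derivation might look, as a minimal Python sketch: scan a list of candidate fields and take the first four-digit number found. The key names in `CANDIDATE_KEYS` and the function name `derive_cue_year` are hypothetical; which keys actually reach the mapping function depends on the decoder.

```python
import re

# Hypothetical candidate fields, checked in order of preference.
CANDIDATE_KEYS = ("cue_year", "year", "date", "originalyear")

# A run of exactly four digits, not embedded in a longer number.
YEAR_PATTERN = re.compile(r"(?<!\d)(\d{4})(?!\d)")


def derive_cue_year(meta):
    """Return the first four-digit year found in the candidate fields, or None."""
    for key in CANDIDATE_KEYS:
        match = YEAR_PATTERN.search(meta.get(key, ""))
        if match:
            return int(match.group(1))
    return None
```

The candidate list should only contain fields known to hold a year or a full date: a day/month value such as TDAT=2805 also has four digits and would be misread as a year.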

But the map_metadata function does not seem to receive a file’s complete metadata:

If I do a

def mapmeta(m)
  log(level=4, label="cue", "META: #{m}")
  # Return the metadata unchanged so the function can be used as a mapper
  m
end

...

radio =
  source.cue(
  ...
  map_metadata = mapmeta,
  ...
  )

I only see:

2024/01/15 21:08:25 [cue:4] META: [("album", "10 000 Hz Legend"), ("artist", "Air"), ("title", "Caramel Prisoner")]

but the file has much more metadata:

IDv2 tag info for Air - Caramel Prisoner.mp3
APIC=cover front,  (image/jpeg, 42678 bytes)
COMM==eng=Release Date: 2001-05-28 FR

Eager to prove their songwriting smarts and knowledge of traditionalist pop on their sophomore work, French band Air pulled back slightly from the milky synth pop of their 1998 debut, Moon Safari. 10,000 Hz Legend is a darker work, just as contemplative and unhurried as its predecessor, but part of a gradual move from drifting, almost pastoral melancholia to a downright post-modern helplessness in league with Radiohead. Air are still tremendously effective producers, and have actually expanded their palate with a surprising array of pop instrumentation (acoustic guitars, flutes, pianos, a harmonica, harps, and many strings) to file alongside the countless trilling synthesizers and machine sequencers. The two lead-off tracks, »Electronic Performers« and »How Does It Make You Feel,« are breathtaking productions that exploit the same robot-weariness tendencies that made »Sexy Boy« (from Moon Safari) an alternative hit. Still, those detached retro-vocoder treatments sound so much more passé in 2001 than when the duo first tried them out in 1996. Jason Falkner and Beck, a pair of equally hardworking slacker-pop icons, appear (respectively) on the next two tracks, the tongue-in-cheek single »Radio #1« and an excellent morning-after jam named »The Vagabond.«

Again, the production is stellar, but these find Air stranded between art rock and pop, caught in the trap of trying to make great pop music yet never sounding particularly studied or concerned about it. Falkner pops up again on »Lucky and Unhappy« and »People in the City,« a pair of album standouts that subvert any pop inclinations with a raft of bridges and breakdowns among the layers of production. »Wonder Milky Bitch« is another precisely studied track, a haze of lunar-desert synth pop directly evocative of country-pop classicist Lee Hazlewood, and »Radian« brings Air back to the instrumental textures of their early work. Fans and involved listeners are definitely rewarded with increased dividends after multiple listens, but even they may wish for an album that harked back to the simpler days of the Premiers Symptomes EP and Moon Safari.

(John Bush)
COMM=Songs-DB_Custom1=eng=2000s
COMM=Songs-DB_Custom2=eng=Beautiful; Female Vocalists; Singer-Songwriter; Male Vocalists
COMM=Songs-DB_Occasion=eng=Chillout; Sleep; Love
IPLS=[unrepresentable data]
TALB=10 000 Hz Legend
TBPM=133
TCOM=Nicolas Godin/Jean-Benoît Dunckel
TCON=Electronica; Ambient; Instrumental; Trip-Hop; Electro
TDAT=2805
TIT1=Electronica
TIT2=Caramel Prisoner
TKEY=Am
TMED=CD
TORY=2001
TPE1=Air
TPE2=Air
TPOS=1/1
TPUB=Virgin
TRCK=11/11
TSO2=Air
TSRC=FRS630100011
TXXX=ASIN=B00005IABM
TXXX=BARCODE=724381033227
TXXX=CATALOGNUMBER=CDV2945
TXXX=MusicBrainz Album Artist Id=cb67438a-7f50-4f2b-a6f1-2bb2729fd538
TXXX=MusicBrainz Album Id=8feeede3-e708-476e-b6f0-eb974f3437ee
TXXX=MusicBrainz Album Release Country=GB
TXXX=MusicBrainz Album Status=official
TXXX=MusicBrainz Album Type=album
TXXX=MusicBrainz Artist Id=cb67438a-7f50-4f2b-a6f1-2bb2729fd538
TXXX=MusicBrainz Release Group Id=21062c6a-bf91-3d5d-866a-7f54f6918861
TXXX=MusicBrainz Work Id=d95ac445-6731-4c6b-970d-caae51edcec9
TXXX=MusicIP PUID=0591fe33-2853-1be0-ac7f-72b0847ff414
TXXX=MusicMagic Data=AgEAAAKupUKzxOIzg+Go4TP/gAGOXoDpyMyPqlRHgni37tfGgAHD26n+13mqG4FAg+aAAbmBzDEhG+YNgAHOIt+vDTOY5LyhgAKAAbPTp7DK1ILz+ehPYowEhpWGdNtalURdHr461mCglojW28WtArwCyvqe/IW6gAHjnb2mADr+Z4YM83vvKQG3rFXWkILXgaY=
TXXX=MusicMagic Fingerprint=AULlQiw7LybuHtYfixESBzwGRgQ+BygEIwMIBPoCOwMXAa4CLgFHALUFBAIVAM0BCwD7AG0AnADHAJMAhgBeACoAUQBfAI4AxgCFAEQAUQBaoW8fZk5m+0IFy/KmA3j6xwCh/7X+YgEb/7kFGv6dADYAQ/+8/8cAAP9kAGAAL//kAAz/8//w/7z/5f+//+D/7//i/9z/rf+q/8H/9f/n/98a9Eoo9yrCcNbwv+fwC/dr/C/83fcf++f8Zf/E/br9dP4E/gr+9P+O/dkAR//IAAP/Nf/E/8z/Qv+g/2T/kv/U/5r/hP8W/rT/Fv+V/3f/Y9VDRIu09/oTKIcvHRBU/Mb/sQABBBUA9wB+/Xz/VwEvADUAgP+v/9f9cPxc/x/+7v+3/4YAGgAGAFAAfABvAAoANQBgALYBIgC2AGsAdgB8D1TqYh3NmYMW8TyM/fsD8f1m/H8Fef+pAGUDNv51Al4A2P7//3cAVf6K/xn//P9CAG0ALgBRADAAbQBZADYAGgBpAFoAngC/AJ0AMgBtAHD2Ug/jA5YNdJcGQ3z6JgmH/o38XgcG/hQBbfiF/4MAjwMJ/+MBL/+/AM0AaABK//3/iwAGAH0A7/+pAJ8AggAfAEsAXQCWAP8AhQBSAFkASgFK9tL9OvDl6ujv6Xj/ACARFgcaCjgI/QNZCf0BagP1AWIEJQHUALAE8f6CABYAVABU/+sAqgFgANkBfwENABoAVwBnALYBWQDHAIIAmgB1JB8YDA==
TXXX=SCRIPT=Latn
TXXX=XFade=&bmp=132,51
TXXX=originalyear=2001
TXXX=replaygain_album_gain=-9.51 dB
TXXX=replaygain_album_peak=1.374292
TXXX=replaygain_album_range=9.51 dB
TXXX=replaygain_reference_loudness=-18.00 LUFS
TXXX=replaygain_track_gain=-7.17 dB
TXXX=replaygain_track_peak=1.114091
TXXX=replaygain_track_range=6.77 dB
TXXX=time_signature=4/4
TYER=2001
UFID=http://musicbrainz.org=b'3aa51f18-8ec5-4fdd-bab4-108cddaae98a'

@toots
Member Author

toots commented Jan 18, 2024

@Moonbase59 Make sure to disable taglib. The set of metadata it can read is limited. I'm about to deprecate it I believe.

@Moonbase59

Moonbase59 commented Jan 18, 2024

@toots Ah, ok, thanks for the pointer! Do we have a "disable" or should I simply try

settings.decoder.file_extensions.taglib := []

Do you by chance know if taglib is still/again being developed? (I use it in loudgain, and at that time it was the best/most flexible lib to handle tags uniformly in many file formats.)

@toots
Member Author

toots commented Jan 18, 2024

I'm about to push the deprecation. It will make sure that taglib is tried last. The settings could work too.

It looks like it's still active. It was cool back when there were a gazillion formats popping up but, nowadays, a handful of formats cover most people's needs, and we support all of them in ocaml-metadata with many more available fields.

@Moonbase59

Yeah, it was really cool back in the days, especially when working with C… I hope they continue it, if ever I come round to updating loudgain… sigh.

Since "fields" are so different in different file types, do you internally use some mapping like MusicBrainz Picard does? So we’d easily find a tag by a "common name" in various file types like M4A, MP3, ID3v2.3, ID3v2.4, FLAC, etc.? (Hint: If not, I strongly suggest to use Picard’s Tag Mapping—it has proved worthy for more than a decade.)

3 participants