Live harbor - Bad song title encoding only harbor #411

Closed
desavil opened this issue Mar 23, 2017 · 14 comments

Comments

@desavil

desavil commented Mar 23, 2017

Original (broadcast by SHOUTcast DSP): start ę€ół.śążźćń end (Polish characters and Euro symbol)
Send to SHOUTcast 2 via input.harbor/output.shoutcast: start ê�ó³.�¹¿�æñ end

Filename: start ę€ół.śążźćń end.mp3

Liquidsoap log:
2017/03/23 14:37:56 [input(dot)harbor_5820:3] New metadata chunk ? -- start ê�ó³.�¹¿�æñ end.

If I load this file from a liquidsoap playlist, there is no encoding problem; it only occurs when I broadcast through the harbor.

Screens:
Good
Bad

@toots
Member

toots commented Mar 23, 2017

You should look at this parameter in input.harbor:

 * icy_metadata_charset : string (default: "")
     ICY (shoutcast) metadata charset. Guessed if empty. Default for
     shoutcast is ISO-8859-1. Set to that value if all your clients send
     metadata using this charset and automatic detection is not working for
     you.

Try setting it to "latin1" or "utf8"

@desavil
Author

desavil commented Mar 23, 2017

I've checked this before.

"latin1" - start ê�ó³.�¹¿�æñ end

2017/03/23 16:58:29 [camomile:3] Failed to convert "start \234\128\243\179.\156\185\191\159\230\241 end": unknown input encoding latin1
2017/03/23 16:58:29 [camomile:3] Failed to convert "song": unknown input encoding latin1
2017/03/23 16:58:29 [camomile:3] Failed to convert "sehes": unknown input encoding latin1
2017/03/23 16:58:29 [camomile:3] Failed to convert "pass": unknown input encoding latin1
2017/03/23 16:58:29 [input(dot)harbor_5820:3] New metadata chunk ? -- start ▒▒.▒▒▒▒▒▒ end.

"utf8" - start ê�ó³.�¹¿�æñ end

2017/03/23 16:52:51 [camomile:3] Failed to convert "start \234\128\243\179.\156\185\191\159\230\241 end": unknown input encoding utf8
2017/03/23 16:52:51 [camomile:3] Failed to convert "song": unknown input encoding utf8
2017/03/23 16:52:51 [camomile:3] Failed to convert "sehes": unknown input encoding utf8
2017/03/23 16:52:51 [camomile:3] Failed to convert "pass": unknown input encoding utf8
2017/03/23 16:52:51 [input(dot)harbor_5820:3] New metadata chunk ? -- start ▒▒.▒▒▒▒▒▒ end.
2017/03/23 16:52:53 [map_metadata_5822:3] Inserting missing metadata.
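
The camomile lines above print the raw metadata bytes as decimal escapes (`\234` is byte 0xEA, and so on). As a side check outside Liquidsoap, a minimal Python sketch can try a few candidate charsets on those bytes. The candidate list is an assumption of this sketch: Windows-1250 (`cp1250`) is not mentioned in the thread and is included only because it is a common charset for Polish text on Windows. The names `RAW`, `CANDIDATES` and `try_decodings` are illustrative.

```python
# Raw bytes copied from the camomile log line ("\234\128..." are decimal byte escapes).
RAW = b"start " + bytes([234, 128, 243, 179, 46, 156, 185, 191, 159, 230, 241]) + b" end"

# Candidate charsets to try. cp1250 (Windows-1250) is an assumption of this sketch.
CANDIDATES = ["utf-8", "latin-1", "cp1250"]


def try_decodings(raw, charsets):
    """Return {charset: decoded text, or None when the bytes are invalid in it}."""
    results = {}
    for name in charsets:
        try:
            results[name] = raw.decode(name)
        except UnicodeDecodeError:
            results[name] = None
    return results


decoded = try_decodings(RAW, CANDIDATES)
for name, text in decoded.items():
    print(f"{name:8} -> {text!r}")
```

With these bytes, the UTF-8 attempt is rejected, the Latin-1 reading gives the garbled title from the log, and only the `cp1250` reading gives back `start ę€ół.śążźćń end`. That suggests, without proving it, that the source client sent Windows-1250 bytes.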

@desavil
Author

desavil commented Aug 18, 2018

Is there any solution? I am using the latest (compiled) liquidsoap and the problem is still there.

Additionally, I sometimes see something like this in the logs (it is not related to the harbor):

2018/08/18 14:20:00 [95551(dot)m3u:3] Prepared "/home/liquid/mp3/1/Andy Black - We Dont Have To Dance.mp3" (RID 5).
2018/08/18 14:20:00 [camomile:3] Failed to convert "We Don\226\128\153t Have To Dance" from auto(UTF-8,ISO-8859-1) to Latin-1 (CamomileLibrary__UChar.Out_of_range)!
2018/08/18 14:20:00 [camomile:3] Failed to convert "Andy Black - We Don\226\128\153t Have To Dance" from auto(UTF-8,ISO-8859-1) to Latin-1 (CamomileLibrary__UChar.Out_of_range)!

@toots
Member

toots commented Aug 18, 2018

For Polish, it looks like you should try ISO-8859-2. Let me know if that works; I might add it to the defaults.

@desavil
Author

desavil commented Aug 18, 2018

Not working - https://i.snag.gy/Hi7xMO.jpg :(

2018/08/18 21:46:57 [input(dot)harbor_6446:3] New metadata chunk ? -- start ę�ół.�šż�ćń end.
2018/08/18 21:47:05 [camomile:3] Failed to convert "start \196\153\194\128\195\179\197\130.\194\156\197\161\197\188\194\159\196\135\197\132 end" from auto(UTF-8,ISO-8859-1) to Latin-1 (CamomileLibrary__UChar.Out_of_range)!
2018/08/18 21:47:05 [camomile:3] Failed to convert "start \196\153\194\128\195\179\197\130.\194\156\197\161\197\188\194\159\196\135\197\132 end" from auto(UTF-8,ISO-8859-1) to Latin-1 (CamomileLibrary__UChar.Out_of_range)!
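
The escaped string in these two lines is already UTF-8. A small Python sketch can check one possible reading of it: that it is the byte sequence from the March 2017 log, decoded as ISO-8859-2 and re-encoded as UTF-8. Two assumptions are made here and neither is confirmed in the thread: that the client still sends the same bytes as in 2017, and that its real charset is Windows-1250 (`cp1250`). The names `SENT`, `LOGGED` and `mismatches` are illustrative.

```python
# Bytes the client sent, taken from the March 2017 log (decimal escapes).
SENT = b"start " + bytes([234, 128, 243, 179, 46, 156, 185, 191, 159, 230, 241]) + b" end"

# UTF-8 bytes shown in the August 2018 log line.
LOGGED = b"start " + bytes([
    196, 153, 194, 128, 195, 179, 197, 130, 46,
    194, 156, 197, 161, 197, 188, 194, 159,
    196, 135, 197, 132,
]) + b" end"

as_latin2 = SENT.decode("iso8859_2")  # what an ISO-8859-2 reading produces
as_cp1250 = SENT.decode("cp1250")     # assumption: the client's real charset

# Characters that come out differently under the two charsets.
mismatches = [(good, bad) for good, bad in zip(as_cp1250, as_latin2) if good != bad]

print(as_latin2.encode("utf-8") == LOGGED)
print([(good, f"U+{ord(bad):04X}") for good, bad in mismatches])
```

Under those assumptions the ISO-8859-2 reading matches the logged bytes exactly, and it differs from the intended title in four places (€, ś, ą, ź). These are positions where ISO-8859-2 and Windows-1250 assign different characters, which would explain why the setting improved the title without fixing it.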

Is the encoding being handled properly here?

admin.cgi?pass=xxx&mode=updinfo&song=start ę€ół.śążźćń end
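
Whether the title survives this request depends on which bytes the client puts in the `song` parameter. The Python sketch below shows how the same text percent-encodes under two charsets. That the DSP percent-encodes the value at all, and in which charset, is an assumption of this sketch; the logs do not show the request as sent on the wire.

```python
from urllib.parse import quote, unquote_to_bytes

TITLE = "ę€ół"

# The same text yields different bytes on the wire depending on the client's charset.
as_cp1250 = quote(TITLE, encoding="cp1250")
as_utf8 = quote(TITLE, encoding="utf-8")

print(as_cp1250)  # %EA%80%F3%B3
print(as_utf8)    # %C4%99%E2%82%AC%C3%B3%C5%82

# The receiver only sees bytes; it has to know or guess the charset to get text back.
received = unquote_to_bytes(as_cp1250)
print(received)
```

The request carries no charset label, so the receiving side can only guess from the bytes or rely on a configured value.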

@toots
Member

toots commented Aug 21, 2018

Who's the base client? If that client encodes metadata with the wrong encoding before sending it to liquidsoap, then there isn't much liquidsoap can do. Also, your logs still show automatic encoding detection from UTF-8 and ISO-8859-1.

@desavil
Author

desavil commented Aug 21, 2018

SHOUTcast Source DSP v2.3.5 (latest). I've checked now, and it's probably a problem with the base client. So that matter is resolved. Sorry!

And what about this?:

2018/08/18 14:20:00 [95551(dot)m3u:3] Prepared "/home/liquid/mp3/1/Andy Black - We Dont Have To Dance.mp3" (RID 5).
2018/08/18 14:20:00 [camomile:3] Failed to convert "We Don\226\128\153t Have To Dance" from auto(UTF-8,ISO-8859-1) to Latin-1 (CamomileLibrary__UChar.Out_of_range)!
2018/08/18 14:20:00 [camomile:3] Failed to convert "Andy Black - We Don\226\128\153t Have To Dance" from auto(UTF-8,ISO-8859-1) to Latin-1 (CamomileLibrary__UChar.Out_of_range)!

@toots
Member

toots commented Aug 21, 2018

Hmm. Okay, what is your configuration now?

@desavil
Author

desavil commented Aug 21, 2018

set("log.stdout",false)
set("init.daemon",true)
set("init.daemon.pidfile.path","/home/liquid/liquidsoap.pid")
set("log.file",true)
set("log.file.path","/home/liquid/liquidsoap.log")
# When the title tag is empty, use the file name (without path and .mp3 extension) as the title.
def update_songtitle(m) =
  if m["title"] == "" then
    [("title",list.hd(default="",string.split(separator="\.mp3",(list.hd(default="",list.rev(string.split(separator="/",m["filename"])))))))]
  else
    [("title",m["title"])]
  end
end
radio = mksafe(audio_to_stereo(playlist(mode="normal","/home/liquid/playlist.m3u")))

radio = map_metadata(update_songtitle,radio)
output.shoutcast(%fdkaac(bitrate=96,samplerate=44100,channels=2),name="Name",genre="Genre",url="http://website.tld",public=true,host="localhost",port=8000,password="pass",on_error=(fun (_) -> 10.),radio)

@toots
Member

toots commented Aug 21, 2018

Try adding:

set("tag.encodings",["UTF-8","ISO-8859-1","ISO-8859-2"])

and possibly more of the ISO-8859 ones.

@desavil
Author

desavil commented Aug 21, 2018

Unfortunately, none of these work. I tried ISO-8859-1 through ISO-8859-10. I even have an English song with no unusual characters in its name, and it has the same encoding problem. Should I send you these MP3 files?

@toots
Member

toots commented Aug 21, 2018

OK, sorry, now I realize what's going on. Historically, shoutcast metadata is encoded using the latin1 encoding. However, latin1 has no representation for many characters, such as the Polish ones you're trying to send, so the conversion fails.
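
The failure can be reproduced outside Liquidsoap with the title bytes from the camomile log. This is a minimal Python sketch, not Liquidsoap's own conversion code; the helper name `latin1_or_error` is illustrative.

```python
# Title bytes from the camomile log ("\226\128\153" are decimal byte escapes).
RAW = b"We Don" + bytes([226, 128, 153]) + b"t Have To Dance"

# In UTF-8 those three bytes are a single character, U+2019.
title = RAW.decode("utf-8")
print(repr(title))


def latin1_or_error(text):
    """Return the Latin-1 bytes, or a message naming the first character that has none."""
    try:
        return text.encode("latin-1")
    except UnicodeEncodeError as exc:
        return f"cannot encode U+{ord(text[exc.start]):04X}"


print(latin1_or_error(title))
print(latin1_or_error("Le temps de la rentrée"))
```

The typographic apostrophe U+2019 lies outside the 256 code points of Latin-1, so the first call reports it, while the accented French title converts without trouble.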

The only option is to force output.shoutcast to send metadata using the utf8 encoding. However, some of your listeners' clients may not know about that and still expect latin1 strings, resulting in issues displaying metadata. I'm afraid there's not much more that can be done here, except perhaps cleaning up the metadata by mapping each character to the nearest latin1 character, for instance ę to e.
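
The "nearest latin1 character" idea could look roughly like the Python sketch below. It is an illustration of the approach, not something Liquidsoap provides: the `FALLBACKS` table and the function names are made up for this example, and the table would need extending for real use.

```python
import unicodedata

# Hand-picked replacements for characters that have no Unicode decomposition.
# Illustrative only; extend as needed.
FALLBACKS = {"ł": "l", "Ł": "L", "\u2019": "'", "€": "EUR"}


def _fits_latin1(text):
    """True when every character of text exists in Latin-1."""
    try:
        text.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False


def nearest_latin1(text):
    """Replace characters missing from Latin-1 with a close approximation."""
    out = []
    for ch in text:
        if _fits_latin1(ch):
            out.append(ch)  # already representable (e.g. ó), keep as is
        elif ch in FALLBACKS:
            out.append(FALLBACKS[ch])
        else:
            # Split into base letter + combining marks, then drop the marks.
            decomposed = unicodedata.normalize("NFKD", ch)
            base = "".join(c for c in decomposed if not unicodedata.combining(c))
            out.append(base if base and _fits_latin1(base) else "?")
    return "".join(out)


print(nearest_latin1("start ę€ół.śążźćń end"))
print(nearest_latin1("That\u2019s What I Like"))
```

Characters that Latin-1 already covers, such as ó, are left untouched; only the ones that would make the conversion fail are approximated.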

I've just added a change that makes it possible to change output.shoutcast's metadata encoding. I think this is the best I can do for you here. Feel free to test, either with the latest code or by adding this at the top of your script:

def output.shoutcast(
  ~id="output.shoutcast",~start=true,
  ~host="localhost",~port=8000,
  ~user="",~password="hackme",
  ~genre="",~url="",~name="",~encoding="ISO-8859-1",
  ~public=true,~icy_id=1, ~format="",~dj={""},
  ~dumpfile="", ~icy_metadata="guess",
  ~on_connect={()}, ~on_disconnect={()},
  ~aim="",~icq="",~irc="",~icy_reset=true,
  ~fallible=false,~on_start={()},~on_stop={()},
  ~on_error=fun(_)->3., e,s) =

  icy_reset = if icy_reset then "1" else "0" end

  headers = [("icy-aim",aim),("icy-irc",irc),
             ("icy-icq",icq),("icy-reset",icy_reset)]

  def map(m) =
    dj = dj()
    if dj != "" then
      list.add(("dj",dj),m)
    else
      m
    end
  end
  s = map_metadata(map,s)

  output.icecast(
    e, format=format, icy_id=icy_id,
    id=id, headers=headers,
    start=start,icy_metadata=icy_metadata,
    on_connect=on_connect, on_disconnect=on_disconnect,
    host=host, port=port, user=user, password=password,
    genre=genre, url=url, description="UNUSED",
    public=public, dumpfile=dumpfile,encoding=encoding,
    name=name, protocol="icy",on_error=on_error,
    fallible=fallible,on_start=on_start,on_stop=on_stop,
    s)
end

And then:

output.shoutcast(%fdkaac(bitrate=96,samplerate=44100,channels=2),name="Name",genre="Genre",url="http://website.tld",public=true,host="localhost",port=8000,password="pass",encoding="UTF-8",on_error=(fun (_) -> 10.),radio)

@toots toots closed this as completed in a1602f7 Aug 21, 2018
@desavil
Author

desavil commented Aug 21, 2018

There is no error now. It looks like it works. Thanks!

In general, I had not noticed before that some characters were displayed badly. They are still displayed the same way; the only difference is that there is no longer an error in the logs.

This error appeared even with files whose names contain no characters other than English ones, for example "Bruno Mars - Thats What I Like.mp3". I think it may have something to do with ID3 tags, although I don't see any special characters there either.

EDIT:
In the file "Bruno Mars - Thats What I Like.mp3", the ID3 title contains this character: ’ (full title in ID3: That’s What I Like)
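
Characters like this are easy to miss by eye. A short Python sketch can list every character of a title that falls outside Latin-1, together with its Unicode name. The function name `non_latin1_chars` is illustrative, and reading the ID3 tag itself is left out.

```python
import unicodedata


def non_latin1_chars(text):
    """List (index, character, Unicode name) for characters outside Latin-1."""
    return [
        (i, ch, unicodedata.name(ch, "UNKNOWN"))
        for i, ch in enumerate(text)
        if ord(ch) > 0xFF  # Latin-1 covers U+0000 through U+00FF only
    ]


found = non_latin1_chars("That\u2019s What I Like")
print(found)
```

For this title the only hit is the typographic apostrophe (RIGHT SINGLE QUOTATION MARK) at index 4.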

@toots
Member

toots commented Aug 21, 2018

Glad to hear! UTF8 characters can pop up in surprising ways: here it's the ’ character, but there are also the non-breaking space, the long dash, etc. :-)

icbaker added a commit to google/ExoPlayer that referenced this issue Dec 13, 2019
Also change IcyInfo.rawMetatadata from String to byte[]

ICY doesn't specify the character encoding, and there are streams
not using UTF-8 (issue:#6753). It seems the default of at least one
server is ISO-8859-1 so let's support that as a fallback:
savonet/liquidsoap#411 (comment)

Also update IcyDecoder to skip strings it doesn't recognise at all
instead of decoding invalid characters.

The feed from issue:#6753 now decodes accents correctly:
EventLogger:   ICY: title="D Pai - Le temps de la rentrée", url="null"
PiperOrigin-RevId: 285388522
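
The fallback described in this commit message can be sketched in Python as strict UTF-8 first, then ISO-8859-1. This only illustrates the idea; it is not ExoPlayer's code, and the function name `decode_icy` is made up for the example.

```python
def decode_icy(raw):
    """Decode ICY metadata bytes: strict UTF-8 first, then ISO-8859-1 as a fallback."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # ISO-8859-1 maps every byte to a character, so this step never fails.
        return raw.decode("iso-8859-1")


title = "D Pai - Le temps de la rentrée"
print(decode_icy(title.encode("utf-8")))
print(decode_icy(title.encode("iso-8859-1")))
```

Because the ISO-8859-1 step accepts any byte sequence, a stream that actually uses a third charset still decodes without error but shows the wrong characters, as with the Polish title earlier in this thread.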
ojw28 pushed a commit to google/ExoPlayer that referenced this issue Jan 17, 2020