Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Setting global text coding in Liquidsoap #3231

Closed
rsp4jack opened this issue Jul 14, 2023 · 10 comments · Fixed by #3238
Closed

Setting global text coding in Liquidsoap #3231

rsp4jack opened this issue Jul 14, 2023 · 10 comments · Fixed by #3238

Comments

@rsp4jack
Copy link

I am using Windows 10 and system language is set to Simplified Chinese. The default code page is 936.

Liquidsoap writes logs (including file and console) and playlog in GB2312 (The default text coding under code page 936). But I want Liquidsoap to write them in a specific coding. i.e. set a global text coding.

For example:

settings.defaultcoding.set('utf-8')
@rsp4jack rsp4jack changed the title Setting global text coding in Liquidsoap on Windows Setting global text coding in Liquidsoap Jul 14, 2023
@toots
Copy link
Member

toots commented Jul 14, 2023

Hi,

Have you tried setting environment variables? LANG, LC_CTYPE or LC_ALL are usually used for that.

@toots
Copy link
Member

toots commented Jul 14, 2023

Nevermind, I found the function we're using. Hold on!

toots added a commit that referenced this issue Jul 14, 2023
@toots
Copy link
Member

toots commented Jul 14, 2023

@rsp4jack any change you could confirm if #3232 fixes your issue? You should be able to do:

runtime.locale.set("...")

@rsp4jack
Copy link
Author

I can't test #3232 because the CI that builds win32 binaries is fail, and I can't build Liquidsoap on my computer right now. @toots

toots added a commit that referenced this issue Jul 15, 2023
@toots
Copy link
Member

toots commented Jul 15, 2023

@rsp4jack
Copy link
Author

It fails on a very simple test.

> .\liquidsoap.exe -v 'output.ao(noise())'
Fatal error: exception Dtools__Dtools_impl.Conf.Undefined(_)
Raised at Dtools__Dtools_impl.Conf.make.object#get in file "src/dtools_impl.ml", line 150, characters 26-51
Called from Liquidsoap_runtime__Main.check_directories.subst in file "src/runtime/main.ml", line 548, characters 46-54
Called from Liquidsoap_runtime__Main.check_directories in file "src/runtime/main.ml", line 549, characters 2-44
Called from Liquidsoap_runtime__Main.(fun) in file "src/runtime/main.ml", line 672, characters 6-26
Called from Stdlib__List.iter in file "list.ml", line 110, characters 12-15
Called from Lifecycle.make_action.action in file "src/core/tools/lifecycle.ml", line 43, characters 4-20
Called from Stdlib__List.iter in file "list.ml", line 110, characters 12-15
Called from Stdlib__List.iter in file "list.ml", line 110, characters 12-15
Called from Dune__exe__Liquidsoap in file "src/bin/liquidsoap.ml", line 1, characters 9-41

toots added a commit that referenced this issue Jul 16, 2023
toots added a commit that referenced this issue Jul 16, 2023
toots added a commit that referenced this issue Jul 17, 2023
@toots
Copy link
Member

toots commented Jul 17, 2023

@toots
Copy link
Member

toots commented Jul 18, 2023

On second thoughts I'm not sure that this would fix your issue. Also, looking at the code, I'm not sure if the fact that liquidsoap is returning log and playlog in GB2312 is really coming from us..!

By default, strings in liquidsoap are actually assumed to be utf8. This is the encoding expected to be returned by string recoding functions (string.recode and all the places using Charset.convert in the code) and the encoding expected in string escaping functions.

In both cases, if we fail to convert or escape, we fallback to the original string.

I would imagine that, in your case, the strings are GB2312 because the program is getting filenames, metadata etc in GB2312 from the system itself.

In this case, the only approach we could do is to fix the logs only. For the rest of the internal logic, we would have to try & convert all strings entering the program through any API (file system calls, etc), which seems pretty impossible.

I think the most reasonable thing to do is to allow conversion of log entries. This is done in #3238, which adds the following:

settings.log.recode := true
settings.log.recode.encoding := "UTF-8"

Windows builds are available here: https://github.com/savonet/liquidsoap/actions/runs/5583458223 let me know if this works for you.

@rsp4jack
Copy link
Author

rsp4jack commented Jul 19, 2023

I would imagine that, in your case, the strings are GB2312 because the program is getting filenames, metadata etc in GB2312 from the system itself.

That's right. I found that some data in the log is in UTF-8, some is in GB2312. I think Liquidsoap should try to detect strings' coding (like metadata, paths, files, etc.) and convert them into UTF-8. This seems impossible now, but may be possible in the future.

Also, playlog needs a recode setting, too.

let me know if this works for you.

I can't reproduce it again, even recode is disabled. I can't run the script that I found the problem with because of another weird bug #3242 of Liquidsoap this version. I have no idea about it.

toots added a commit that referenced this issue Jul 19, 2023
* Add settings.metadata.recode
* Add doc.
Fixes: #3231
@toots
Copy link
Member

toots commented Jul 19, 2023

I would imagine that, in your case, the strings are GB2312 because the program is getting filenames, metadata etc in GB2312 from the system itself.

That's right. I found that some data in the log is in UTF-8, some is in GB2312. I think Liquidsoap should try to detect strings' coding (like metadata, paths, files, etc.) and convert them into UTF-8. This seems impossible now, but may be possible in the future.

This is already done but cannot be done everywhere. Typically, for filenames and path, we do need to keep them in the system's encoding to prevent errors when calling file manipulation functions.

Also, playlog needs a recode setting, too.

I've added a systematic string recode to metadata extractors, this should take care of most of the cases. Also added some doc to explain the situation and document the new settings.

let me know if this works for you.

I can't reproduce it again, even recode is disabled. I can't run the script that I found the problem with because of another weird bug #3242 of Liquidsoap this version. I have no idea about it.

I'll look at this very shortly.

toots added a commit that referenced this issue Jul 19, 2023
* Add settings.metadata.recode
* Add doc.
Fixes: #3231
toots added a commit that referenced this issue Jul 19, 2023
* Add settings.metadata.recode
* Add doc.
Fixes: #3231
toots added a commit that referenced this issue Jul 19, 2023
* Add settings.metadata.recode
* Add doc.
Fixes: #3231
toots added a commit that referenced this issue Jul 20, 2023
* Add settings.metadata.recode
* Add doc.
Fixes: #3231
toots added a commit that referenced this issue Jul 20, 2023
* Add settings.metadata.recode
* Add doc.

Fixes: #3231
github-merge-queue bot pushed a commit that referenced this issue Jul 20, 2023
* Cleanup tutils and tests logic.

* * Add settings.log.recode and settings.log.recode.encoding.
* Add settings.metadata.recode
* Add doc.

Fixes: #3231
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants