libtokenizers.so problems on alpine #2946

Closed
fenss opened this issue Jan 17, 2024 · 4 comments
Labels
bug Something isn't working

Comments

@fenss

fenss commented Jan 17, 2024

Description

Hello, I want to use DJL in my project because the backend is written in Java.
I only use the HuggingFaceTokenizer, and everything works well on Windows.
But it does not work when we deploy to Alpine:

java.lang.UnsatisfiedLinkError: /root/.djl.ai/tokenizers/0.15.0-0.26.0-linux-x86_64/libtokenizers.so: Error relocating /root/.djl.ai/tokenizers/0.15.0-0.26.0-linux-x86_64/libtokenizers.so: __register_atfork: symbol not found
at java.lang.ClassLoader$NativeLibrary.load(Native Method)
at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1946)
at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1828)
at java.lang.Runtime.load0(Runtime.java:810)
at java.lang.System.load(System.java:1088)
at ai.djl.huggingface.tokenizers.jni.LibUtils.loadLibrary(LibUtils.java:76)
at ai.djl.huggingface.tokenizers.jni.LibUtils.<clinit>(LibUtils.java:41)
at ai.djl.huggingface.tokenizers.HuggingFaceTokenizer.newInstance(HuggingFaceTokenizer.java:173)
at ai.djl.huggingface.tokenizers.HuggingFaceTokenizer.newInstance(HuggingFaceTokenizer.java:138)
at ai.djl.huggingface.tokenizers.HuggingFaceTokenizer.newInstance(HuggingFaceTokenizer.java:121)

It seems that libtokenizers.so and the Alpine image are incompatible. Can we fix this?

Environment Info

System: runtime-jre8u302-alpine
DJL version:

        <dependency>
            <groupId>ai.djl.huggingface</groupId>
            <artifactId>tokenizers</artifactId>
            <version>0.26.0</version>
        </dependency>

I checked the source code that loads libtokenizers.so:
LibUtils.java
Platform.java
I also checked the system info with:

System.out.println("System: " + System.getProperty("os.name") + "; OSArch: " + System.getProperty("os.arch"));

Output:

System: Linux; OSArch: amd64
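
Note that os.name and os.arch only describe the kernel and CPU, so an Alpine (musl) image and an Ubuntu (glibc) image print the same line. A minimal sketch for telling them apart from the shell follows. It assumes that musl systems expose their dynamic loader as /lib/ld-musl-<arch>.so.1; detect_libc is a hypothetical helper name, and the check is only a heuristic (a glibc system with a musl package installed would also match).

```sh
#!/bin/sh
# Hypothetical helper: guess which C library a Linux root filesystem uses.
# Assumption: musl installs its loader as /lib/ld-musl-<arch>.so.1.
detect_libc() {
    root="${1:-}"   # optional root prefix, e.g. an unpacked image directory
    for loader in "$root"/lib/ld-musl-*.so.1; do
        # If the glob matched nothing, $loader is the literal pattern
        # and the -e test fails.
        if [ -e "$loader" ]; then
            echo "musl"
            return 0
        fi
    done
    echo "glibc-or-other"
}

# Inspect the running system.
detect_libc
```

If this prints musl, a native library built on Ubuntu or CentOS may fail to load even though os.name and os.arch look the same.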

So how can I solve this? Do I need to compile the Rust tokenizer on Alpine?

Any help is appreciated, thanks.

@fenss fenss added the bug Something isn't working label Jan 17, 2024
@frankfliu
Contributor

We have only tested on Ubuntu 18.04+ and CentOS 7+. It looks like glibc is missing from your system. Can you try installing glibc-2.30-r0.apk?

@fenss
Author

fenss commented Jan 17, 2024

We have only tested on Ubuntu 18.04+ and CentOS 7+. It looks like glibc is missing from your system. Can you try installing glibc-2.30-r0.apk?

I tried this:

bash-5.1# apk add glibc-bin-2.35-r1.apk 
(1/3) Upgrading musl (1.2.3-r0 -> 1.2.3-r3)
(2/3) Installing libc6-compat (1.2.3-r3)
(3/3) Installing glibc-bin (2.35-r1)
Executing glibc-bin-2.35-r1.trigger
OK: 184 MiB in 105 packages

Then I checked the glibc version:

bash-5.1# ldd --version
musl libc (x86_64)
Version 1.2.3
Dynamic Program Loader
Usage: /lib/ld-musl-x86_64.so.1 [options] [--] pathname

It still reported musl, so I changed the PATH:

bash-5.1# export PATH="/usr/glibc-compat/bin/:$PATH"
bash-5.1# ldd --version
ldd (GNU libc) 2.35
Copyright (C) 2022 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Written by Roland McGrath and Ulrich Drepper.

But I still got the same error:

Exception in thread "main" java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.springframework.boot.loader.MainMethodRunner.run(MainMethodRunner.java:48)
at org.springframework.boot.loader.Launcher.launch(Launcher.java:87)
at org.springframework.boot.loader.Launcher.launch(Launcher.java:50)
at org.springframework.boot.loader.JarLauncher.main(JarLauncher.java:51)
Caused by: java.lang.UnsatisfiedLinkError: /root/.djl.ai/tokenizers/0.15.0-0.26.0-linux-x86_64/libtokenizers.so: Error relocating /root/.djl.ai/tokenizers/0.15.0-0.26.0-linux-x86_64/libtokenizers.so: __register_atfork: symbol not found
at java.lang.ClassLoader$NativeLibrary.load(Native Method)
at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1946)
at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1828)
at java.lang.Runtime.load0(Runtime.java:810)
at java.lang.System.load(System.java:1088)
at ai.djl.huggingface.tokenizers.jni.LibUtils.loadLibrary(LibUtils.java:76)
at ai.djl.huggingface.tokenizers.jni.LibUtils.<clinit>(LibUtils.java:41)
at ai.djl.huggingface.tokenizers.HuggingFaceTokenizer.newInstance(HuggingFaceTokenizer.java:173)
at ai.djl.huggingface.tokenizers.HuggingFaceTokenizer.newInstance(HuggingFaceTokenizer.java:138)
at ai.djl.huggingface.tokenizers.HuggingFaceTokenizer.newInstance(HuggingFaceTokenizer.java:121)

😭😭
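
One way to confirm where the error comes from is to list the undefined symbols of the downloaded library. This is a rough sketch, assuming GNU nm from binutils is installed (it may need to be added on Alpine) and that the library sits at the cache path from the stack trace. references_register_atfork and TOKENIZERS_LIB are hypothetical names used only for this example.

```sh
#!/bin/sh
# Hypothetical filter: succeeds when the symbol listing on stdin contains
# an undefined ("U") reference to __register_atfork.
references_register_atfork() {
    grep -q 'U __register_atfork'
}

# Path taken from the stack trace; override with TOKENIZERS_LIB if needed.
lib="${TOKENIZERS_LIB:-$HOME/.djl.ai/tokenizers/0.15.0-0.26.0-linux-x86_64/libtokenizers.so}"

# Only run the check when both the library and nm are available.
if [ -f "$lib" ] && command -v nm >/dev/null 2>&1; then
    if nm -D --undefined-only "$lib" | references_register_atfork; then
        echo "$lib references __register_atfork (probably built against glibc)"
    else
        echo "$lib does not reference __register_atfork"
    fi
fi
```

If the symbol shows up as undefined, the prebuilt library expects a C library that provides it, and the musl loader used by the Alpine JVM reports that it cannot find it. Changing PATH only changes which ldd binary runs; it does not change the loader the JVM process uses.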

@frankfliu
Contributor

Please take a look at this: https://stackoverflow.com/questions/69607005/cannot-run-executables-with-alpine-and-busybox-docker-images

Can you use ubuntu or centos based image?

@fenss
Author

fenss commented Jan 18, 2024

Please take a look at this: https://stackoverflow.com/questions/69607005/cannot-run-executables-with-alpine-and-busybox-docker-images

Can you use ubuntu or centos based image?

I finally solved this by compiling libtokenizers.so on the Alpine system.
Here is my solution:

Prepare the compile environment

First, we need to build an image:

FROM openjdk:18-jdk-alpine3.13

RUN apk add --update --no-cache \
               libstdc++ \
               bash \
               openssl \
               curl \
               build-base \
               perl

ENV DOCKERIZE_VERSION v0.4.0
RUN wget https://github.com/jwilder/dockerize/releases/download/$DOCKERIZE_VERSION/dockerize-alpine-linux-amd64-$DOCKERIZE_VERSION.tar.gz \
    && tar -C /usr/local/bin -xzvf dockerize-alpine-linux-amd64-$DOCKERIZE_VERSION.tar.gz \
    && rm dockerize-alpine-linux-amd64-$DOCKERIZE_VERSION.tar.gz

Note that we need Alpine 3.13; otherwise we get this problem while compiling.

Second, we start a container from this image:

sudo docker run -it -v /usr/local/djl_demo/:/opt/djl_demo/ --rm djl-jdk-alpine:1.0 /bin/bash

Inside the container, we install Rust following this:

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
source $HOME/.cargo/env

Compile the libtokenizers.so

Basically follow this.

Download the DJL source code:

git clone --branch v0.26.0 --depth 1 https://github.com/deepjavalibrary/djl.git djl-0.26.0

Go to the tokenizers source folder:

cd ./djl-0.26.0/extensions/tokenizers

Download the Hugging Face tokenizers source code:

git clone --branch v0.15.0 --depth 1 https://github.com/huggingface/tokenizers.git

Compile the libtokenizers.so:

cargo build --manifest-path rust/Cargo.toml --release

Maybe you will get this error. If so, set this variable before running cargo build:

export RUSTFLAGS="-C target-feature=-crt-static"
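
As far as I understand, Rust's musl targets link the C runtime statically by default, which does not fit building a shared library such as libdjl.so; the flag above turns static linking off. If RUSTFLAGS is already set in your environment, a plain export would overwrite it. Here is a small sketch that appends the flag instead; dynamic_crt_flags is a hypothetical helper name.

```sh
#!/bin/sh
# Hypothetical helper: print the given RUSTFLAGS value with
# "-C target-feature=-crt-static" appended, unless it is already present.
dynamic_crt_flags() {
    current="${1:-}"
    flag="-C target-feature=-crt-static"
    case "$current" in
        *"target-feature=-crt-static"*) echo "$current" ;;   # already set
        "") echo "$flag" ;;                                  # nothing set yet
        *) echo "$current $flag" ;;                          # keep existing flags
    esac
}

# Update RUSTFLAGS for the cargo build that follows.
RUSTFLAGS="$(dynamic_crt_flags "${RUSTFLAGS:-}")"
export RUSTFLAGS
echo "RUSTFLAGS=$RUSTFLAGS"
```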

The build produces rust/target/release/libdjl.so. Rename it to libtokenizers.so.

Replace the library in the cache dir

mkdir -p /root/.djl.ai/tokenizers/0.15.0-0.26.0-linux-x86_64/
cp libtokenizers.so /root/.djl.ai/tokenizers/0.15.0-0.26.0-linux-x86_64/

DJL then loads the new libtokenizers.so from the cache directory instead of the prebuilt one.
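
The rename and copy steps can also be wrapped in one small function, for example for use in an image build script. This is only a sketch: install_tokenizers_lib is a hypothetical helper, and the cache layout (<cache root>/tokenizers/<flavor>/libtokenizers.so) is taken from the paths shown in this thread, not from DJL documentation.

```sh
#!/bin/sh
# Hypothetical helper: copy a locally built library into the DJL cache
# under the file name DJL looks for.
#   $1 = path to the built library (e.g. rust/target/release/libdjl.so)
#   $2 = cache root (e.g. /root/.djl.ai)
#   $3 = flavor directory (e.g. 0.15.0-0.26.0-linux-x86_64)
install_tokenizers_lib() {
    built_lib="$1"
    cache_root="$2"
    flavor="$3"
    target_dir="$cache_root/tokenizers/$flavor"
    if [ ! -f "$built_lib" ]; then
        echo "built library not found: $built_lib" >&2
        return 1
    fi
    mkdir -p "$target_dir" || return 1
    cp "$built_lib" "$target_dir/libtokenizers.so"
}

# Usage with the paths from this thread (run from extensions/tokenizers):
# install_tokenizers_lib rust/target/release/libdjl.so /root/.djl.ai 0.15.0-0.26.0-linux-x86_64
```

The flavor directory has to match the tokenizers and DJL versions in use; otherwise DJL will presumably look in a different directory.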

@fenss fenss closed this as completed Jan 19, 2024