Releases: daulet/tokenizers

v0.9.0

09 Aug 23:20

Full Changelog: v0.8.0...v0.9.0

v0.8.0

12 Jun 01:43
d503b5b

Breaking change:

The path to the compiled Rust library must now be specified via -ldflags. I found it most convenient to set the CGO_LDFLAGS environment variable so the flag does not have to be passed on every build. See #18 for more details.
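
As a rough sketch of the CGO_LDFLAGS approach, the snippet below exports the variable once per shell session. The directory `/opt/tokenizers/lib` is a hypothetical location for the compiled library, not something this release prescribes; substitute wherever your build of the library actually lives.

```shell
# Hypothetical directory containing the compiled Rust library;
# adjust to wherever you built or unpacked it.
TOKENIZERS_LIB_DIR="/opt/tokenizers/lib"

# cgo adds CGO_LDFLAGS to the linker flags, so -L extends the library search path.
export CGO_LDFLAGS="-L${TOKENIZERS_LIB_DIR}"

# With the variable exported, a plain build should no longer need -ldflags:
#   go build ./...
echo "CGO_LDFLAGS=${CGO_LDFLAGS}"
```

Putting the export in a shell profile or a project-local env file keeps builds and tests consistent without repeating the flag.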

What's Changed

  • Update to allow for platform dependent libs in CGO by @jmoney in #18

Full Changelog: v0.7.1...v0.8.0

v0.7.1

10 Apr 23:30
  • Update the core tokenizers library to the latest version, v0.15.2;
  • Expose an init-time parameter that controls whether special tokens are encoded;

Full Changelog: v0.7.0...v0.7.1

v0.7.0

07 Jan 00:38

What's Changed

  • support more attributes from the Encoding structure by @clems4ever in #5

Full Changelog: v0.6.1...v0.7.0

v0.6.1

09 Nov 23:26
  • Only changes Bazel target names

v0.6.0

09 Nov 02:01
5e367fe
  • Update the underlying core library to v0.14.1 (the latest at the time of release);
  • Support the Bazel build system so downstream projects can easily consume this library;
  • Artifacts are also smaller because the OpenSSL dependency has been dropped;

v0.5.1

22 Sep 16:19
315fa52
  • Fix a tokenizer memory leak
  • Fix a panic in encode/decode when given an invalid UTF-8 string

v0.5.0

07 Jul 05:09
  • Encode now returns token string representations;
  • Properly free Rust strings in Decode;

v0.4.3

08 May 19:33
  • Release artifact for darwin-x86_64

v0.4.2

07 May 22:25
  • Make libtokenizers.a location relative to the source location;
  • Handle empty inputs to encode/decode;