Changelog

All notable changes to this project will be documented in this file.

The format is based on Keep a Changelog and this project adheres to Semantic Versioning.

[0.2.0] - 2020-06-14

Added

  • Added UnicodeSentenceTokenizer, which splits text into sentences following Unicode segmentation rules, using the unicode-segmentation crate #66
  • Added PunctuationTokenizer, which splits text into sentences delimited by punctuation #70
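
To illustrate what punctuation-delimited sentence splitting means, here is a minimal pure-Python sketch. It is not this project's implementation or API: the function name `punctuation_sentences`, its `punctuation` parameter, and the choice to keep trailing whitespace with each sentence are assumptions made for the example.

```python
import re


def punctuation_sentences(text, punctuation=(".", "!", "?")):
    """Split text into sentences that end after a run of punctuation marks.

    Hypothetical helper for illustration only; each entry in `punctuation`
    is assumed to be a single character.
    """
    if not punctuation:
        return [text] if text else []
    # Build a character class from the delimiter characters.
    marks = "".join(re.escape(p) for p in punctuation)
    # Either: text up to and including a run of delimiters plus trailing
    # whitespace, or: a final remainder that has no delimiter at all.
    pattern = re.compile(rf"[^{marks}]*[{marks}]+\s*|[^{marks}]+\Z")
    return [m.group(0) for m in pattern.finditer(text)]


if __name__ == "__main__":
    for sentence in punctuation_sentences("Here is one. Here is another! This is the last"):
        print(repr(sentence))
```

Because each sentence keeps its delimiter and trailing whitespace, joining the pieces reproduces the input string unchanged.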

Changed

  • Updated the Python wrapper to use PyO3 0.10, which in particular raises Rust panics as Python exceptions #69
  • Added Python 3.8 wheel generation #65
  • Tokenizers can now be pickled in Python #73
  • Only Python 3.6+ is now supported in the Python package.
  • Renamed UnicodeSegmentTokenizer to UnicodeWordTokenizer #75
  • Improved error handling. In particular, error::VTextError is replaced by error::EstimatorErr #76

Contributors

  • Josh Bowles
  • Josh Levy-Kramer
  • Roman Yurchak