Formatting issue #3799

Closed
anthonymakela opened this issue May 31, 2019 · 3 comments
Labels: training (Training and updating models), usage (General spaCy usage)

Comments

@anthonymakela

anthonymakela commented May 31, 2019

Hi! I'm having some trouble with the JSON format. I'm training an NER model for Finnish with the spaCy CLI train command, and I'm getting a "ValueError: Expected object or value".

Here's an example of the JSON I'm trying to input.

[( 'Kairossa on samalla #äänestä parittajaa-viestillä tuhrittu myös Abdul Fattah al-Sisin vaalimainoksia ', {'entities': [(0, 8, 'B-LOC'), (64, 69, 'B-PER'), (70, 76, 'I-PER'), (77, 85, 'I-PER')]} )]

And here is the full error.

Training pipeline: ['tagger', 'parser', 'ner']
Starting with blank model 'fi'
Counting training words (limit=0)
{'entities': [(138, 145, 'B-ORG'), (207, 214, 'B-ORG')]}
Traceback (most recent call last):
  File "/Users/johanna/anaconda2/envs/py37/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/Users/johanna/anaconda2/envs/py37/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/Users/johanna/anaconda2/envs/py37/lib/python3.7/site-packages/spacy/__main__.py", line 35, in <module>
    plac.call(commands[command], sys.argv[1:])
  File "/Users/johanna/anaconda2/envs/py37/lib/python3.7/site-packages/plac_core.py", line 328, in call
    cmd, result = parser.consume(arglist)
  File "/Users/johanna/anaconda2/envs/py37/lib/python3.7/site-packages/plac_core.py", line 207, in consume
    return cmd, self.func(*(args + varargs + extraopts), **kwargs)
  File "/Users/johanna/anaconda2/envs/py37/lib/python3.7/site-packages/spacy/cli/train.py", line 196, in train
    corpus = GoldCorpus(train_path, dev_path, limit=n_examples)
  File "gold.pyx", line 112, in spacy.gold.GoldCorpus.__init__
  File "gold.pyx", line 123, in spacy.gold.GoldCorpus.write_msgpack
  File "gold.pyx", line 163, in read_tuples
  File "gold.pyx", line 321, in read_json_file
  File "gold.pyx", line 378, in _json_iterate
  File "gold.pyx", line 375, in spacy.gold._json_iterate
  File "/Users/johanna/anaconda2/envs/py37/lib/python3.7/site-packages/srsly/_json_api.py", line 37, in json_loads
    return ujson.loads(data)
ValueError: Expected object or value

Any ideas what might be causing this?

Your Environment

  • Operating System: Ubuntu 18.04
  • Python Version Used: Python 3.7.2
  • spaCy Version Used: 2.1.3
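
One thing worth checking with input like the example above: it is written as a Python literal (tuples, single-quoted strings), and a JSON parser rejects that syntax. The following is a minimal sketch, assuming the file on disk looks like the snippet in this report. It uses the standard-library `json` module rather than the `ujson` call shown in the traceback, so the exact error message differs, and `is_valid_json` is a hypothetical helper, not part of spaCy.

```python
import ast
import json

# Shortened version of the reported snippet, treated as the raw text of the
# training file (an assumption about what is actually stored on disk).
raw = "[('Kairossa on samalla', {'entities': [(0, 8, 'B-LOC')]})]"


def is_valid_json(text):
    """Return True if the standard-library JSON parser accepts the text."""
    try:
        json.loads(text)
        return True
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False


# Tuples and single-quoted strings are Python syntax, not JSON.
print(is_valid_json(raw))

# The same text is a valid Python literal, so ast.literal_eval can read it.
data = ast.literal_eval(raw)

# Serialising with json.dumps gives syntactically valid JSON
# (tuples become arrays, strings get double quotes).
as_json = json.dumps(data, ensure_ascii=False)
print(is_valid_json(as_json))
```

Passing this check only shows that the text is syntactically valid JSON. The `train` command also expects a particular nested structure, so a list of `[text, annotations]` pairs may still be rejected further along.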
@honnibal
Member

honnibal commented Jun 1, 2019

It does look like your training format is wrong. Here are the docs for the correct format: https://spacy.io/api/annotation#json-input
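
As a rough illustration of the gap between the two formats, here is a sketch that converts character-offset annotations like the ones in the report into a nested structure with `paragraphs`, `sentences` and per-token `ner` tags in the BILUO scheme. It rests on several assumptions: the field names (`id`, `paragraphs`, `raw`, `sentences`, `tokens`, `orth`, `ner`) follow my reading of the linked documentation and should be checked against it; tokenization is plain whitespace splitting rather than spaCy's tokenizer; each text is treated as a single sentence; and the `B-`/`I-` prefixes on the reported span labels are taken to mark pieces of one entity. All function names are hypothetical helpers, not spaCy APIs.

```python
import json
import re


def merge_iob_spans(spans):
    """Merge (start, end, 'B-X' / 'I-X') spans into (start, end, 'X') entities."""
    merged = []
    for start, end, label in sorted(spans):
        prefix, sep, name = label.partition("-")
        if not sep:  # label without a prefix, e.g. 'LOC'
            prefix, name = "B", label
        if prefix == "I" and merged and merged[-1][2] == name:
            # Continue the entity that is currently open.
            merged[-1] = (merged[-1][0], end, name)
        else:
            merged.append((start, end, name))
    return merged


def tokenize(text):
    """Whitespace tokens with character offsets (assumption: not spaCy's tokenizer)."""
    return [(m.start(), m.end(), m.group()) for m in re.finditer(r"\S+", text)]


def biluo_tags(tokens, entities):
    """Assign one BILUO tag per token; entities that cover no whole token are skipped."""
    tags = ["O"] * len(tokens)
    for ent_start, ent_end, name in entities:
        inside = [i for i, (start, end, _) in enumerate(tokens)
                  if start >= ent_start and end <= ent_end]
        if not inside:
            continue
        if len(inside) == 1:
            tags[inside[0]] = "U-" + name
        else:
            tags[inside[0]] = "B-" + name
            for i in inside[1:-1]:
                tags[i] = "I-" + name
            tags[inside[-1]] = "L-" + name
    return tags


def to_nested_json(examples):
    """Build one document per (text, annotations) pair, with a single sentence each."""
    docs = []
    for doc_id, (text, annotations) in enumerate(examples):
        tokens = tokenize(text)
        tags = biluo_tags(tokens, merge_iob_spans(annotations["entities"]))
        sentence = {"tokens": [{"id": i, "orth": orth, "ner": tag}
                               for i, ((_, _, orth), tag) in enumerate(zip(tokens, tags))]}
        docs.append({"id": doc_id,
                     "paragraphs": [{"raw": text, "sentences": [sentence]}]})
    return docs


examples = [(
    "Kairossa on samalla #äänestä parittajaa-viestillä tuhrittu myös "
    "Abdul Fattah al-Sisin vaalimainoksia ",
    {"entities": [(0, 8, "B-LOC"), (64, 69, "B-PER"),
                  (70, 76, "I-PER"), (77, 85, "I-PER")]},
)]

docs = to_nested_json(examples)
print(json.dumps(docs, ensure_ascii=False, indent=2))
```

For the example sentence, this tags "Kairossa" as `U-LOC` and "Abdul Fattah al-Sisin" as `B-PER`, `I-PER`, `L-PER`. The tokens carry only NER annotation; the reported pipeline also includes a tagger and parser, which would need their own per-token fields as described in the documentation.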

@honnibal honnibal added training Training and updating models usage General spaCy usage labels Jun 1, 2019
@ines ines closed this as completed Jun 1, 2019
@ines
Member

ines commented Jun 1, 2019

We're also working on making the format simpler and more intuitive – see #2928 for detail and progress.

@lock

lock bot commented Jul 1, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@lock lock bot locked as resolved and limited conversation to collaborators Jul 1, 2019