Order inconsistency of output candidate file with original test.json when testing bertSum Extractive #129

cece00 · 2022-07-21T07:36:23Z

In "test" mode, two files are written: xxx.candidate and xxx.gold.
The texts in these two files are in the same order as each other, but that order does not match the original test.json.
I have checked that "shuffle=False" is set in the dataloader, so what is going wrong?
Has anyone else encountered this problem? Any help would be appreciated.

ashokurlana · 2022-07-29T17:48:10Z

@cece00 Modify line 89 of src/model/data_loader.py.
The following code fixed a similar issue for me:

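import glob
import re

def atoi(text):
    # Convert digit-only chunks to int so they compare numerically
    return int(text) if text.isdigit() else text

def natural_keys(text):
    # Split into digit and non-digit chunks, e.g. 'test.10.bert' -> ['test.', 10, '.bert']
    return [atoi(c) for c in re.split(r'(\d+)', text)]

# Sort shard files by their numeric index rather than lexicographically
pts = sorted(glob.glob(args.bert_data_path + 'cnndm.' + corpus_type + '.[0-9]*.bert.pt'))
pts.sort(key=natural_keys)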
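
To illustrate why this probably helps: a plain sorted() compares file names as strings, so shard 10 is loaded before shard 2 and the output order drifts away from test.json. Below is a minimal standalone sketch of that effect. The shard names are made up for illustration and only assume the cnndm.<corpus_type>.<index>.bert.pt pattern from the glob above; it has not been checked against the actual repository data.

```python
import re

def natural_keys(text):
    # Split into digit and non-digit chunks; digit chunks become ints
    return [int(c) if c.isdigit() else c for c in re.split(r'(\d+)', text)]

# Hypothetical shard names, listed in the order they were written
shards = [f"cnndm.test.{i}.bert.pt" for i in (0, 1, 2, 10, 11)]

# Plain string sort: '10' and '11' sort before '2'
lexicographic = sorted(shards)

# Natural sort: the numeric index is compared as an integer
natural = sorted(shards, key=natural_keys)

print([name.split('.')[2] for name in lexicographic])  # ['0', '1', '10', '11', '2']
print([name.split('.')[2] for name in natural])        # ['0', '1', '2', '10', '11']
```

With only ten or fewer shards (indices 0 to 9) both orderings coincide, which may be why the problem only shows up on larger test sets.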
