Extractive Setting? #127

Open
acc-galenicum opened this issue Jun 23, 2022 · 2 comments

Comments

@acc-galenicum

Hi,
This might be a dumb question, but I am not getting it.

This model is supposed to perform extractive summarization.
But when I look at the raw data (cnn_stories), each file contains a text with some highlights at the end (I assume these are the summary). The problem is that these highlights are not sentences taken from the original text, so I don't understand the raw data.

As a specific example, I have attached a story file.
00a308681faf9c82a0e62a89b21fcdefb84b88fa.txt

Can anyone help me out with this?
Thanks in advance

@acc-galenicum
Author

OK, answering my own question in case anyone else wonders: I missed the part of the paper that explains this. The extractive summary is built from the highlights (the abstractive summary) by selecting the sentences from the text that maximize the ROUGE metric against them.
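
For anyone who wants to see the idea concretely, here is a minimal sketch of that kind of sentence selection. It is not the code from this repo or the paper: it assumes a greedy search, whitespace tokenization, and a simplified ROUGE-1 + ROUGE-2 F1 score, and the names `rouge_n_f1`, `greedy_oracle`, and `max_sentences` are made up for illustration.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_f1(candidate, reference, n):
    """Simplified ROUGE-N F1 from clipped n-gram overlap (no stemming)."""
    cand, ref = ngrams(candidate, n), ngrams(reference, n)
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def greedy_oracle(sentences, highlights, max_sentences=3):
    """Greedily pick sentence indices whose union best matches the highlights.

    Hypothetical helper: stops when no remaining sentence improves the score.
    """
    reference = " ".join(highlights).lower().split()
    tokenized = [s.lower().split() for s in sentences]
    selected, best = [], 0.0
    while len(selected) < max_sentences:
        best_idx = None
        for i in range(len(tokenized)):
            if i in selected:
                continue
            # Score the already selected sentences plus this candidate.
            candidate = [t for j in sorted(selected + [i]) for t in tokenized[j]]
            score = rouge_n_f1(candidate, reference, 1) + rouge_n_f1(candidate, reference, 2)
            if score > best:
                best, best_idx = score, i
        if best_idx is None:
            break
        selected.append(best_idx)
    return sorted(selected)


sentences = [
    "the cat sat on the mat",
    "stocks fell sharply on monday",
    "a dog barked at the mailman",
]
highlights = ["cat sat on mat", "dog barked at mailman"]
oracle_ids = greedy_oracle(sentences, highlights)
# Binary per-sentence labels: 1 if the sentence is in the oracle summary.
labels = [1 if i in oracle_ids else 0 for i in range(len(sentences))]
```

The selected indices become the binary labels an extractive model is trained on, which is why the highlights themselves never need to appear verbatim in the story text.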

@roronoazoro29

Excuse me, may I ask two questions?

  1. Is it possible to use other datasets whose summaries are truly extractive (a few sentences taken from the original text), such as the BBC News dataset https://www.kaggle.com/datasets/pariza/bbc-news-summary ?
  2. How do I create the .pt files (bert_data) and the .story files when using another dataset? (In this repo they already exist and only need to be downloaded.)
    Thank you
