Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve output of the tutorials #1675

Closed
wants to merge 28 commits into from
Closed

Improve output of the tutorials #1675

wants to merge 28 commits into from

Conversation

ZanSara
Copy link
Contributor

@ZanSara ZanSara commented Oct 28, 2021

Fixes #1595

@ZanSara ZanSara marked this pull request as draft October 28, 2021 15:59
@review-notebook-app
Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@ZanSara ZanSara marked this pull request as ready for review October 29, 2021 16:40

# from pprint import pprint

# pprint(prediction)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought we also wanted to show the actual output here (at least the first answer in the json and then ...) 🤔

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right! I put it in the .py script but somehow forgot to put it here 🤦

@ZanSara ZanSara marked this pull request as draft November 3, 2021 09:16
print_answers(prediction, details="minimal")

# Or directly print the object
from pprint import pprint
pprint(prediction)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the console output, the minimal printout is followed immediately by the full printout. I think it would be good if we added headings like

Minimal Output
============

and

Full Output
========

So that new users aren't overwhelmed by how much is being printed out

# Answer(answer='Joffrey', type='extractive', score=0.6753827035427094, }),
# Answer(answer='Robb', type='extractive', score=0.6665983200073242, })
# ],
# 'documents': [
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe its worth explaining exactly which documents these are. I assume they are the ones returned by the Retriever. But I assume also multiple answers could come from the one document, meaning that some of these candidate documents might not contain an answer.

@@ -64,7 +64,7 @@
},
{
"cell_type": "code",
"execution_count": 3,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More general point about Tutorial 3. This is essentially the same as Tutorial 1 except that we're using a different document store. Do we want to make the same printing changes to this tutorial too?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point! 👍

print(f"\n * Generating questions for document {idx}: {document.content[:50]}...")
result = question_generation_pipeline.run(documents=[document])

print("Generated questions:")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like these titles but the printout seems a bit dense. Could you add a line break before "Generated questions"?

@@ -46,8 +52,14 @@
# RetrieverQuestionGenerationPipeline
retriever = ElasticsearchRetriever(document_store=document_store)
rqg_pipeline = RetrieverQuestionGenerationPipeline(retriever, question_generator)

print(f"\n * Generating questions for documents matching the query 'Arya Stark'")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should also create titles for each pipeline we are using and print them to the console so that the outputs don't get confused so easily e.g.

QuestionGenerationPipeline
=====================

@brandenchan
Copy link
Contributor

Also just want to ask whether you evaluated the necessity of these steps that were suggested in the original issue?

  • Adjust print_answers / print_documents utils
  • Adjust the str representation of Document / Answer and utilize this in the tutorials
  • Use a custom print to only show certain attributes like here

@ZanSara ZanSara closed this Nov 4, 2021
@ZanSara ZanSara deleted the tutorials_output branch November 4, 2021 11:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Prettify printed results in Tutorials
3 participants