Add execute_eval_run example to Tutorial 5 #2459
I tested the changes and the results are stored here: https://public-mlflow.deepset.ai/#/experiments/698. That looks good to me! 👍 @tstadel, I will go ahead and merge this PR, but I'll also leave some comments here on how I think we could improve the tutorial further.
One thing I noticed is that it is difficult to compare the scores returned by pipeline.eval(add_isolated_node_eval=True) and by reader.eval().
The reader.eval() output is:
Reader Top-4-Accuracy: 99.09208819714657
Reader Top-1-Exact Match: 95.71984435797665
Reader Top-1-F1-Score: 95.73510337987335
Reader Top-4-Accuracy (without no_answers): 72.0
Reader Top-4-Exact Match (without no_answers): 44.0
Reader Top-4-F1-Score (without no_answers): 59.71344537815126
and the pipeline.eval() output is:
0.48 #print(metrics["Reader"]["exact_match"])
0.6027426153741944 #print(metrics["Reader"]["f1"])
Here, we could add a sentence explaining that the "without no_answers" metrics are the ones that are expected to be similar.
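To make that comparison concrete, here is a minimal sketch that puts the two outputs quoted above on the same scale. It assumes, based only on the printed values, that reader.eval() reports percentages while pipeline.eval() reports fractions between 0 and 1. The dictionary names and the compare() helper are hypothetical and not part of Haystack.

```python
# Values copied from the outputs quoted in this comment.
# Assumption: reader.eval() prints percentages (0-100).
reader_eval_percent = {
    "exact_match": 44.0,       # Reader Top-4-Exact Match (without no_answers)
    "f1": 59.71344537815126,   # Reader Top-4-F1-Score (without no_answers)
}

# Assumption: pipeline.eval() metrics are fractions (0-1).
pipeline_eval_fraction = {
    "exact_match": 0.48,       # metrics["Reader"]["exact_match"]
    "f1": 0.6027426153741944,  # metrics["Reader"]["f1"]
}


def compare(reader_percent, pipeline_fraction):
    """Hypothetical helper: rescale reader.eval() values and pair them up."""
    rows = {}
    for name, percent in reader_percent.items():
        reader_value = percent / 100.0  # bring percentages onto the 0-1 scale
        pipeline_value = pipeline_fraction[name]
        rows[name] = (reader_value, pipeline_value, abs(reader_value - pipeline_value))
    return rows


comparison = compare(reader_eval_percent, pipeline_eval_fraction)
for name, (reader_value, pipeline_value, diff) in comparison.items():
    print(f"{name}: reader.eval={reader_value:.4f} "
          f"pipeline.eval={pipeline_value:.4f} abs diff={diff:.4f}")
```

After rescaling, the "without no_answers" exact match and F1 values land close to the pipeline.eval() numbers, whereas the Top-1 metrics that include no_answers (around 95) do not.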
As another small improvement, we could add a sentence in the tutorial after the headline "Run experiments". We should briefly explain here that an experiment consists of several executions of pipeline.eval() (evaluation runs) so that the user knows what to expect.
Proposed changes:
Storing results in MLflow to Tutorial 5
Status (please check what you already did):