Streamlit UI Evaluation mode #920

Merged
merged 10 commits into from
Apr 22, 2021
2 changes: 2 additions & 0 deletions ui/Dockerfile
@@ -7,6 +7,8 @@ RUN apt-get update && apt-get install -y curl git pkg-config cmake
# copy code
COPY utils.py /home/user/
COPY webapp.py /home/user/
COPY eval_labels_example.csv /home/user/
COPY st_state_patch.py /home/user/

# install as a package
COPY requirements.txt /home/user/
17 changes: 16 additions & 1 deletion ui/README.md
@@ -6,7 +6,12 @@ This is a minimal UI that can spin up to test Haystack for your prototypes. It's

## Usage

### Get started with Haystack

The UI interacts with the Haystack REST API. To get started with Haystack, please visit the [README](https://github.com/deepset-ai/haystack/tree/master#key-components) or check out our [tutorials](https://haystack.deepset.ai/docs/latest/tutorial1md).

### Option 1: Local

Execute in this folder:
```
streamlit run webapp.py
@@ -21,4 +26,14 @@ Just run
docker-compose up -d
```
in the root folder of the Haystack repository. This will start three containers (Elasticsearch, Haystack API, Haystack UI).
You can find the UI at `http://localhost:8501`

## Evaluation Mode

The evaluation mode uses the feedback REST API endpoint of Haystack. The user can give feedback with options such as "Wrong answer" and "Wrong answer and wrong passage".

To enter the evaluation mode, select the checkbox "Evaluation mode" in the sidebar. The UI will load the predefined questions from the file `eval_labels_example.csv`. The file needs to be prefilled with your data. The user then gets a random question from the set and can give feedback with the buttons below the question. To load a new question, click the button "Get random question".
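
As a rough illustration of the question-loading step described above, the following sketch parses rows in the same semicolon-delimited format as the example labels file and picks one at random. The helper names `load_eval_labels` and `pick_random_question` are hypothetical and are not taken from `webapp.py`; the actual UI may load the file differently.

```python
import csv
import io
import random

# Sample rows in the same format as the example labels file
# (semicolon-delimited, quoted text fields).
SAMPLE_CSV = '''"ID";"Question Text";"Category";"Answer"
81561;"Who is the father of Arya Stark?";"SHORT";"Ned Stark"
81562;"Who is the mother of Arya Stark?";"SHORT";"Catelyn Stark"
'''


def load_eval_labels(handle):
    """Parse semicolon-delimited evaluation labels into a list of dicts."""
    reader = csv.DictReader(handle, delimiter=";", quotechar='"')
    return list(reader)


def pick_random_question(rows, rng=random):
    """Return one random row (hypothetical helper, not part of webapp.py)."""
    return rng.choice(rows)


rows = load_eval_labels(io.StringIO(SAMPLE_CSV))
row = pick_random_question(rows)
print(row["Question Text"], "->", row["Answer"])
```

To read a real file instead, pass an open file handle (for example `open("eval_labels_example.csv", newline="")`) to `load_eval_labels`.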

The feedback can be exported with the API endpoint `export-doc-qa-feedback`. To learn more about finetuning a model with user feedback, please check out our [docs](https://haystack.deepset.ai/docs/latest/domain_adaptationmd#User-Feedback).

![Screenshot](https://raw.githubusercontent.com/deepset-ai/haystack/master/docs/_src/img/streamlit_ui_screenshot_eval_mode.png)
3 changes: 3 additions & 0 deletions ui/eval_labels_example.csv
@@ -0,0 +1,3 @@
"ID";"Question Text";"Category";"Answer"
81561;"Who is the father of Arya Stark?";"SHORT";"Ned Stark"
81562;"Who is the mother of Arya Stark?";"SHORT";"Catelyn Stark"
199 changes: 199 additions & 0 deletions ui/st_state_patch.py
@@ -0,0 +1,199 @@
"""Another prototype of the State implementation.

Usage
-----

How to import this:

    import streamlit as st
    import st_state_patch

When you do that, you will get 3 new commands in the "st" module:

    * st.State
    * st.SessionState
    * st.GlobalState

The important class here is st.State. The other two are just an alternate API
that provides some syntax sugar.

Using st.State
--------------

Just call st.State() and you'll get a session-specific object to add state into.
To initialize it, just use an "if" block, like this:

    s = st.State()
    if not s:
        # Initialize it here!
        s.foo = "bar"

If you want your state to be global rather than session-specific, pass the
"is_global" keyword argument:

    s = st.State(is_global=True)
    if not s:
        # Initialize it here!
        s.foo = "bar"

Alternate API
-------------

If you think this reads better, you can create session-specific and global State
objects with these commands instead:

    s0 = st.SessionState()
    # Same as st.State()

    s1 = st.GlobalState()
    # Same as st.State(is_global=True)

Multiple states per app
-----------------------

If you'd like to instantiate several State objects in the same app, this will
actually give you 2 different State instances:

    s0 = st.State()
    s1 = st.State()
    print(s0 == s1)  # Prints False

If that's not what you want, you can use the "key" argument to specify which
exact State object you want:

    s0 = st.State(key="user metadata")
    s1 = st.State(key="user metadata")
    print(s0 == s1)  # Prints True
"""

import inspect
import os
import threading
import collections

import streamlit as st

try:
    import streamlit.ReportThread as ReportThread
    from streamlit.server.Server import Server
except Exception:
    # Streamlit >= 0.65.0
    import streamlit.report_thread as ReportThread
    from streamlit.server.server import Server

# Normally we'd use a Streamlit module, but I want a module that doesn't live in
# your current working directory (since local modules get removed in between
# runs), and Streamlit devs are likely to have Streamlit in their cwd.
import sys
GLOBAL_CONTAINER = sys


class State(object):
    def __new__(cls, key=None, is_global=False):
        if is_global:
            states_dict, key_counts = _get_global_state()
        else:
            states_dict, key_counts = _get_session_state()

        if key is None:
            key = _figure_out_key(key_counts)

        if key in states_dict:
            return states_dict[key]

        state = super(State, cls).__new__(cls)
        states_dict[key] = state

        return state

    def __init__(self, key=None, is_global=False):
        pass

    def __bool__(self):
        return bool(len(self.__dict__))

    def __contains__(self, name):
        return name in self.__dict__


def _get_global_state():
    if not hasattr(GLOBAL_CONTAINER, '_global_state'):
        GLOBAL_CONTAINER._global_state = {}
        GLOBAL_CONTAINER._key_counts = collections.defaultdict(int)

    return GLOBAL_CONTAINER._global_state, GLOBAL_CONTAINER._key_counts


def _get_session_state():
    session = _get_session_object()

    curr_thread = threading.current_thread()

    if not hasattr(session, '_session_state'):
        session._session_state = {}

    if not hasattr(curr_thread, '_key_counts'):
        # Put this in the thread because it gets cleared on every run.
        curr_thread._key_counts = collections.defaultdict(int)

    return session._session_state, curr_thread._key_counts


def _get_session_object():
    # Hack to get the session object from Streamlit.

    ctx = ReportThread.get_report_ctx()

    this_session = None
    current_server = Server.get_current()
    if hasattr(current_server, '_session_infos'):
        # Streamlit < 0.56
        session_infos = Server.get_current()._session_infos.values()
    else:
        session_infos = Server.get_current()._session_info_by_id.values()

    for session_info in session_infos:
        s = session_info.session
        if (
            # Streamlit < 0.54.0
            (hasattr(s, '_main_dg') and s._main_dg == ctx.main_dg)
            or
            # Streamlit >= 0.54.0
            (not hasattr(s, '_main_dg') and s.enqueue == ctx.enqueue)
            or
            # Streamlit >= 0.65.2
            (not hasattr(s, '_main_dg') and s._uploaded_file_mgr == ctx.uploaded_file_mgr)
        ):
            this_session = s

    if this_session is None:
        # The two literals are concatenated, so the first must end with a separator.
        raise RuntimeError(
            "Oh noes. Couldn't get your Streamlit Session object. "
            'Are you doing something fancy with threads?')

    return this_session


def _figure_out_key(key_counts):
    stack = inspect.stack()

    for stack_pos, stack_item in enumerate(stack):
        filename = stack_item[1]
        if filename != __file__:
            break
    else:
        stack_item = None

    if stack_item is None:
        return None

    # Just breaking these out for readability.
    # frame_id = id(stack_item[0])
    filename = stack_item[1]
    # line_no = stack_item[2]
    func_name = stack_item[3]
    # code_context = stack_item[4]

    key = "%s :: %s :: %s" % (filename, func_name, stack_pos)

    count = key_counts[key]
    key_counts[key] += 1

    key = "%s :: %s" % (key, count)

    return key


class SessionState(object):
    def __new__(cls, key=None):
        return State(key=key, is_global=False)


class GlobalState(object):
    def __new__(cls, key=None):
        return State(key=key, is_global=True)


st.State = State
st.GlobalState = GlobalState
st.SessionState = SessionState
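
The keyed-instance behaviour of `State` can be tried without Streamlit. The sketch below is a simplified stand-in, not the patch itself: `KeyedState` and its class-level `_registry` are hypothetical names, the registry replaces the per-session and global dictionaries, and an explicit key is required because the automatic call-site key is left out.

```python
class KeyedState(object):
    """Namespace object that is created once per key and then reused."""

    # Stands in for the per-session or global dict used by the real patch.
    _registry = {}

    def __new__(cls, key):
        # Return the stored instance if this key was seen before.
        if key in cls._registry:
            return cls._registry[key]
        state = super(KeyedState, cls).__new__(cls)
        cls._registry[key] = state
        return state

    def __init__(self, key):
        # Intentionally empty: __init__ runs on every call, so setting
        # attributes here would overwrite previously stored state.
        pass

    def __bool__(self):
        # An object with no attributes yet counts as "not initialized".
        return bool(self.__dict__)


s = KeyedState("user metadata")
if not s:
    # First use of this key: initialize it here.
    s.foo = "bar"

same = KeyedState("user metadata")
other = KeyedState("something else")
print(same is s, same.foo, bool(other))  # prints: True bar False
```

Keeping `__init__` empty is what makes the `if not s:` initialization idiom safe, since Python calls `__init__` again each time `__new__` returns an existing instance.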
49 changes: 34 additions & 15 deletions ui/utils.py
@@ -5,21 +5,40 @@

API_ENDPOINT = os.getenv("API_ENDPOINT", "http://localhost:8000")
DOC_REQUEST = "query"

DOC_FEEDBACK = "feedback"

# Old version (the deleted lines of this hunk)
@st.cache(show_spinner=False)
def haystack_query(query, filters=None, top_k_reader=5, top_k_retriever=5):
    url = f"{API_ENDPOINT}/{DOC_REQUEST}"
    req = {"query": query, "filters": filters, "top_k_retriever": top_k_retriever, "top_k_reader": top_k_reader}
    response_raw = requests.post(url, json=req).json()

    result = []
    answers = response_raw["answers"]
    for i in range(len(answers)):
        answer = answers[i]["answer"]
        if answer:
            context = "..." + answers[i]["context"] + "..."
            meta_name = answers[i]["meta"].get("name")
            relevance = round(answers[i]["probability"] * 100, 2)
            result.append({"context": context, "answer": answer, "source": meta_name, "relevance": relevance})
    return result, response_raw


# New version (the added lines of this hunk)
@st.cache(show_spinner=False)
def retrieve_doc(query,filters=None,top_k_reader=5,top_k_retriever=5):
    # Query Haystack API
    url = f"{API_ENDPOINT}/{DOC_REQUEST}"
    req = {"query": query, "filters": filters, "top_k_retriever": top_k_retriever, "top_k_reader": top_k_reader}
    response_raw = requests.post(url,json=req).json()

    # Format response
    result = []
    answers = response_raw["answers"]
    for i in range(len(answers)):
        answer = answers[i]['answer']
        if answer:
            context = '...' + answers[i]['context'] + '...'
            meta_name = answers[i]['meta']['name']
            relevance = round(answers[i]['probability']*100,2)
            document_id = answers[i]['document_id']
            offset_start_in_doc = answers[i]['offset_start_in_doc']
            result.append({'context':context,'answer':answer,'source':meta_name,'relevance':relevance, 'document_id':document_id,'offset_start_in_doc':offset_start_in_doc})
    return result, response_raw

def feedback_doc(question,is_correct_answer,document_id,model_id,is_correct_document,answer,offset_start_in_doc):
    # Feedback Haystack API
    url = f"{API_ENDPOINT}/{DOC_FEEDBACK}"
    req = {
        "question": question,
        "is_correct_answer": is_correct_answer,
        "document_id": document_id,
        "model_id": model_id,
        "is_correct_document": is_correct_document,
        "answer": answer,
        "offset_start_in_doc": offset_start_in_doc
    }
    response_raw = requests.post(url,json=req).json()
    return response_raw
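
The response formatting and the feedback payload can be exercised offline, without a running Haystack API. The sketch below is an assumption-laden illustration: `SAMPLE_RESPONSE` is made-up data that only mimics the fields read by `retrieve_doc`, and `format_answers` and `build_feedback_payload` are hypothetical helper names that do not exist in `utils.py`.

```python
# Made-up response that mimics the fields read by retrieve_doc.
SAMPLE_RESPONSE = {
    "answers": [
        {
            "answer": "Ned Stark",
            "context": "the father of Arya is Ned Stark, lord of",
            "meta": {"name": "doc1.txt"},
            "probability": 0.5,
            "document_id": "abc",
            "offset_start_in_doc": 22,
        },
        {
            # Empty answers are skipped, as in retrieve_doc.
            "answer": None,
            "context": "",
            "meta": {},
            "probability": 0.0,
            "document_id": "def",
            "offset_start_in_doc": 0,
        },
    ]
}


def format_answers(response_raw):
    """Flatten the raw answers into the dicts the UI displays."""
    results = []
    for item in response_raw["answers"]:
        if not item["answer"]:
            continue
        results.append({
            "context": "..." + item["context"] + "...",
            "answer": item["answer"],
            "source": item["meta"].get("name"),
            "relevance": round(item["probability"] * 100, 2),
            "document_id": item["document_id"],
            "offset_start_in_doc": item["offset_start_in_doc"],
        })
    return results


def build_feedback_payload(question, result, is_correct_answer,
                           is_correct_document, model_id=None):
    """Assemble the JSON body with the same keys that feedback_doc sends."""
    return {
        "question": question,
        "is_correct_answer": is_correct_answer,
        "document_id": result["document_id"],
        "model_id": model_id,
        "is_correct_document": is_correct_document,
        "answer": result["answer"],
        "offset_start_in_doc": result["offset_start_in_doc"],
    }


formatted = format_answers(SAMPLE_RESPONSE)
payload = build_feedback_payload(
    "Who is the father of Arya Stark?", formatted[0],
    is_correct_answer=True, is_correct_document=True,
)
print(formatted[0]["relevance"], payload["document_id"])
```

Separating the pure formatting step from the HTTP call makes the logic testable without network access; the value to pass as `model_id` is not specified in this change and is left as `None` here.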