-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WIP: Feature/api list document #880
Conversation
rest_api/controller/documents.py
Outdated
from fastapi import APIRouter, HTTPException | ||
import logging | ||
|
||
sql = sql.SQLDocumentStore() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we already initialising document store at https://github.com/deepset-ai/haystack/blob/master/rest_api/controller/search.py#L38 , so we can skip initialisation here or we can refactor that piece of code a bit
rest_api/controller/documents.py
Outdated
sql = sql.SQLDocumentStore() | ||
now = datetime.now() | ||
SUCCESS: int = 200 | ||
CREATED: int = 201 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can directly use HTTPStatus enum.
rest_api/controller/documents.py
Outdated
:param payload: input from the client to create a document. | ||
""" | ||
logger.info("Constructing document object") | ||
created_document = Document.from_dict(payload) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you yet to add implementation to add document
rest_api/controller/documents.py
Outdated
""" | ||
try: | ||
logger.info("Retrieving documents") | ||
retrieved_documents = sql.get_all_documents() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can build pagination on top of generator.
def get_all_documents_generator( |
rest_api/controller/documents.py
Outdated
raise Exception(f"Something went wrong while retrieving documents", e.message) | ||
|
||
|
||
@router.get("/documents/{id}", status_code=SUCCESS) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can return response_model
so open api spec will render correctly also to other APIs as well
rest_api/controller/documents.py
Outdated
try: | ||
retrieved_document = sql.get_document_by_id(id) | ||
if not retrieved_document: | ||
raise HTTPException(status_code=404, detail=f"No document with id {id} could be found") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exception raise here will be caught by line no 63, and that will convert it into normal Exception at line no 67
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
HTTPException is not a generic Python exception, meaning an exception will be raised and the code will terminate with exit code != 0. We won't go further in the calling chain
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure about but FastAPI HTTPException is sub class of Exception.
https://github.com/tiangolo/fastapi/blob/4d208b2b9035e24bdf80505571b5b1bac8f9ae7a/fastapi/exceptions.py#L8
https://github.com/encode/starlette/blob/bfa61ad92b63b6929e485f2189fd299f94189b09/starlette/exceptions.py
rest_api/controller/documents.py
Outdated
:param payload: input from the client to create a Document | ||
""" | ||
try: | ||
if not id: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
id
can never be None
so no need to check for that here.
rest_api/controller/documents.py
Outdated
:param id: UUID pointing to the document to retrieve | ||
""" | ||
try: | ||
if not id: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rest_api/controller/documents.py
Outdated
raise HTTPException(status_code=404, detail=f"No document with id {id} could be found") | ||
sql.delete_document_by_id(id) | ||
cursor = sql.SQLDocumentStore(update_existing_documents=True) | ||
document_to_be_updated = cursor.write_documents(payload.to_dict()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can add update_existing_documents
to write_documents
as an optional param.
Also we need to make sure payload["id"] == id
otherwise document store will generate random UUID
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
write_document() does not accept update_existing_document as an input parameter. That is the reason why I had to instantiate another instance of the SQLDocumentStore. Plus, setting it to True makes sure that any existing document with an id gets updated (per SQLDocumentStore docstring).
Is that correct ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I mean how about updating write_documents function which accept update_existing_documents as optional param
@ankh6 Thank you for working on it. I have few review comments.
@tholor We can fix coding style, which user can integrate with IDE. Also CI can also run style check. https://realpython.com/python-code-quality/ |
An error is raised when trying to run the test suite with pytest test -m "not elasticsearch" ImportError while loading conftest '/home/andy/opensource/deepset_ai_haystack/haystack/test/conftest.py'. |
Hey @ankh6 thanks for the PR, this is a really useful feature. Do you, ankh6 want to work on those suggestions? Otherwise @oryx1729 would take over soon. |
@Timoeller I am working on it. I'll make another PR this evening. |
@Timoeller Changes have been pushed. |
Hey cool, thanks for the changes. If you work on those comments we can resolve them and let Lalit or @oryx1729 do another round of review. But we are getting into good shape here. Thanks again for the work! |
@Timoeller. Yes, I'll keep working on that. I want my branch to be merged :) All comments written by Lalit have been tackled in the latest commit. Which points are you referring to ? |
Hey, yes you are right, I must have looked at an older version of your code. @oryx1729 could you please take a look at the pydantic error and rest api changes? We will need your review before we can merge here. |
@ankh6 what error you get after you install |
@lalitpagaria I'll install the elasticsearch server and create another PR. |
@oryx1729 Have you had the time to review the PR ? |
Hi @ankh6, thank you for working on this PR, and apologies for the delay in getting back on it. This already looks in very good shape. It needs minor adjustments(mostly related to recently merged #922) and fixing the CI - I'm happy to take over from here to take it to the finish line in the next days! |
@oryx1729 Sure thing! |
…sers to interact with the document store through method requests DELETE, GET, POST, PUT
…ng document, updating write_documents method with a parameter
Hi @ankh6, a quick update. I revisited the PR and realized it currently only has an implementation for the |
@ankh6 just checking if you able to work on it. |
Hey !
Here's the pull request regarding issue #797
I am eagerly waiting for your feedback !