Add loggers aimed at Elasticsearch #48

Open
danielballan opened this issue Dec 18, 2018 · 9 comments

@danielballan
Contributor

We need a Python logging handler that submits data to Elastic. Specifically, it should submit a PUT request like this:

curl -X "PUT" cmb03.cs.nsls2.local:9200/SOME_SENSIBLE_INDEX_NAME/_doc/1 -d '{"hello": "world"}' -H "Content-Type: application/json" 

There may already be a nice library or StackOverflow snippet for making HTTP requests from Python loggers. If not, I would just roll something using requests.

@mrakitin
Member

+1 for requests. Also, should authentication be supported?

@mrakitin
Member

In my case, I had to prepend http:// to the address for the request to be accepted rather than refused.
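
A small guard along these lines could keep a bare host:port address from being refused. This is only a sketch: the helper name ensure_scheme and the choice of http as the default scheme are assumptions, not anything already in the codebase.

```python
def ensure_scheme(address, default_scheme="http"):
    """Return address unchanged if it already has a scheme, else prepend one.

    ensure_scheme is a hypothetical helper; the default scheme is an assumption.
    """
    if "://" in address:
        return address
    return f"{default_scheme}://{address}"


url = ensure_scheme("cmb03.cs.nsls2.local:9200/test/_doc/1")
```

Addresses that already start with http:// or https:// pass through untouched.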

@danielballan
Contributor Author

Moving security discussion to a private channel.

@mrakitin
Member

This seems to work:

In [1]: import requests

In [2]: topic = 'test'

In [3]: r = requests.put(f'http://cmb03.cs.nsls2.local:9200/{topic}/_doc/1', json={'hello': 'DAMA tester'})

In [4]: r
Out[4]: <Response [200]>

In [5]: r.text
Out[5]: '{"_index":"test","_type":"_doc","_id":"1","_version":3,"result":"updated","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":2,"_primary_term":1}'
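
The "_version":3 and "result":"updated" fields in that response suggest the document with ID 1 is being overwritten on every PUT, which is probably not what we want for log records. As far as I know, POSTing to the index's _doc endpoint without an ID lets Elasticsearch generate one. A minimal sketch of inspecting the response, using the body copied from the session above (the function name summarize_index_response is made up for illustration):

```python
import json

# Response body copied verbatim from the IPython session above.
body = (
    '{"_index":"test","_type":"_doc","_id":"1","_version":3,"result":"updated",'
    '"_shards":{"total":2,"successful":1,"failed":0},"_seq_no":2,"_primary_term":1}'
)


def summarize_index_response(text):
    """Pull out the fields that tell us what the server did with the document."""
    doc = json.loads(text)
    return {
        "index": doc["_index"],
        "id": doc["_id"],
        "version": doc["_version"],
        # "updated" means an existing document with this ID was replaced.
        "overwrote_existing": doc["result"] == "updated",
        "no_failed_shards": doc["_shards"]["failed"] == 0,
    }


summary = summarize_index_response(body)
```

With a live response object, r.json() would give the same dict as json.loads(r.text).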

@danielballan
Contributor Author

Good. As a note to future readers, the URL http://cmb03.cs.nsls2.local:9200/ should now be http://elasticsearch.cs.nsls2.local/. The old one may not work.

Also, refer to this Dropbox Paper for suggested index names: https://paper.dropbox.com/doc/Kafka-Topics--AT8pvSkZTo5zs_yP40M6R5HFAg-Kedt0QGwc0DhH9cZzXkDy

@danielballan
Contributor Author

danielballan commented Jan 3, 2019

We should wrap the usage demonstrated in @mrakitin's comment above in a Python logging handler. Something like:

import logging

import requests


class ElasticHandler(logging.Handler):
    def __init__(self, url):
        self.url = url
        super().__init__()

    def emit(self, record):
        # Extract useful info from the record and put them into a dict.
        try:
            response = requests.put(...)
            # Raise an exception if the server we PUT to returns a bad status code.
            response.raise_for_status()
        except Exception:
            # emit() has to catch errors itself; handleError() then lets the
            # logging framework print a message instead of crashing the caller.
            self.handleError(record)

Read up on LogRecord to understand what to expect in record.
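
One way the outline above might be filled in, offered as a sketch rather than the final design. The field names in the document, the record_to_document helper, the injectable session argument, the timeout value, and the choice to POST (so the server assigns document IDs) are all assumptions. The LogRecord attributes used (created, name, levelname, module, funcName, lineno, getMessage()) are standard.

```python
import logging
from datetime import datetime, timezone


def record_to_document(record):
    """Flatten a LogRecord into a JSON-serializable dict (field names are illustrative)."""
    return {
        "timestamp": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
        "logger": record.name,
        "level": record.levelname,
        "message": record.getMessage(),  # applies %-style args to the message
        "module": record.module,
        "function": record.funcName,
        "line": record.lineno,
    }


class ElasticHandler(logging.Handler):
    def __init__(self, url, session=None, timeout=5):
        super().__init__()
        self.url = url
        self.timeout = timeout
        if session is None:
            # Imported lazily so a stand-in session can be passed in tests.
            import requests
            session = requests.Session()
        self.session = session

    def emit(self, record):
        try:
            response = self.session.post(
                self.url, json=record_to_document(record), timeout=self.timeout
            )
            response.raise_for_status()
        except Exception:
            # Same pattern the standard-library handlers use: report the
            # failure via handleError() instead of raising into the caller.
            self.handleError(record)
```

Passing the session in makes it possible to exercise the handler without a running Elasticsearch server by substituting an object with a compatible post() method.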

Test like so:

from bluesky import RunEngine
RE = RunEngine({})
handler = ElasticHandler(URL)
RE.log.addHandler(handler)
RE.log.setLevel("DEBUG")
RE([])  # Run an empty plan for simplicity's sake. Will still generate some log messages.

@danielballan
Contributor Author

@ke-zhang-rd If you are interested in getting involved with the Kibana stuff, this might be a good issue for you to work on. We don't strictly need it for the deployment, but we wish we had it, and we will roll it out to select beamlines as soon as it's ready.

Let us know if you are interested; if not I expect @mrakitin or I will be happy to take it.

@danielballan
Contributor Author

danielballan commented Jan 4, 2019

As a separate but related issue that may be tackled in the same PR, we agreed in our meeting today that we would have both an ElasticHandler and a simpler RotatingFileHandler while we experiment. We may remove one or the other depending on how things go.

@mrakitin
Member

mrakitin commented Jan 4, 2019

Awesome. I like that the RotatingFileHandler is very configurable, so we can fully control how large the logs get and how many of them to keep. We should experiment with those numbers at each particular beamline and for each particular package we are logging (e.g. caproto may produce 100x more data than bluesky, even with an idle IPython session).
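
To make the tuning concrete, here is a rough sketch of per-package limits. The numbers in LIMITS, the helper name add_rotating_file_handler, and the log format are placeholders to experiment with, not recommendations. maxBytes and backupCount are the standard RotatingFileHandler parameters.

```python
import logging
from logging.handlers import RotatingFileHandler

# Placeholder limits per package: (bytes per file, number of backups to keep).
LIMITS = {
    "bluesky": (1_000_000, 5),
    "caproto": (100_000_000, 10),
}


def add_rotating_file_handler(logger_name, path, limits=LIMITS):
    """Attach a size-limited file handler to the named logger and return it."""
    max_bytes, backup_count = limits[logger_name]
    handler = RotatingFileHandler(path, maxBytes=max_bytes, backupCount=backup_count)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
    )
    logging.getLogger(logger_name).addHandler(handler)
    return handler
```

Each logger then keeps at most backupCount rotated files (path.1, path.2, ...) alongside the active one, so total disk use per package is bounded by roughly maxBytes * (backupCount + 1).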
