Potential performance improvements? #1842

Closed
pfbuxton opened this issue Oct 22, 2019 · 3 comments

I have profiled a simple heatmap here:

profile_code.py

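# Note: werkzeug.contrib was removed in Werkzeug 1.0; newer releases provide
# ProfilerMiddleware in werkzeug.middleware.profiler instead.
from werkzeug.contrib.profiler import ProfilerMiddleware
from app import server

server.config['PROFILE'] = True
# Report only the 30 most expensive entries for each request
server.wsgi_app = ProfilerMiddleware(server.wsgi_app, restrictions=[30])
server.run(debug=True)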

app.py

import numpy as np

import dash
import dash_core_components as dcc
import dash_html_components as html
import plotly.graph_objects as go

import flask

# Heatmap
Z = np.random.rand(1000,1000)

server = flask.Flask(__name__)
app = dash.Dash(__name__, server=server)

app.layout = html.Div(children=[
    dcc.Graph(
        id='example-graph',
        figure=dict(
            data=[go.Heatmap(
                z=Z
            )],
            layout=dict()
        )
    )
])

Result (Python 3.7 Windows):

         1068 function calls (1060 primitive calls) in 4.032 seconds

   Ordered by: cumulative time
   List reduced from 243 to 30 due to restriction <30>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    4.032    4.032 C:\Python37-32\lib\site-packages\werkzeug\contrib\profiler.py:95(runapp)
        1    0.000    0.000    4.032    4.032 C:\Python37-32\lib\site-packages\flask\app.py:2262(wsgi_app)
        1    0.000    0.000    4.032    4.032 C:\Python37-32\lib\site-packages\flask\app.py:1801(full_dispatch_request)
        1    0.000    0.000    2.473    2.473 C:\Python37-32\lib\site-packages\flask\app.py:1779(dispatch_request)
        1    0.002    0.002    2.472    2.472 C:\Python37-32\lib\site-packages\dash\dash.py:467(serve_layout)
      2/1    0.013    0.006    2.462    2.462 C:\Python37-32\lib\json\__init__.py:183(dumps)
        1    0.002    0.002    2.451    2.451 C:\Python37-32\lib\site-packages\_plotly_utils\utils.py:35(encode)
        2    0.000    0.000    1.995    0.998 C:\Python37-32\lib\json\encoder.py:182(encode)
        2    1.856    0.928    1.980    0.990 C:\Python37-32\lib\json\encoder.py:204(iterencode)
        1    0.000    0.000    1.559    1.559 C:\Python37-32\lib\site-packages\flask\app.py:1818(finalize_request)
        1    0.000    0.000    1.559    1.559 C:\Python37-32\lib\site-packages\flask\app.py:2091(process_response)
        1    0.000    0.000    1.559    1.559 C:\Python37-32\lib\site-packages\flask_compress.py:78(after_request)
        1    0.000    0.000    1.557    1.557 C:\Python37-32\lib\site-packages\flask_compress.py:113(compress)
        1    0.001    0.001    1.553    1.553 C:\Python37-32\lib\gzip.py:247(write)
        1    1.535    1.535    1.535    1.535 {method 'compress' of 'zlib.Compress' objects}
        1    0.000    0.000    0.452    0.452 C:\Python37-32\lib\json\__init__.py:299(loads)
        1    0.000    0.000    0.452    0.452 C:\Python37-32\lib\json\decoder.py:332(decode)
        1    0.452    0.452    0.452    0.452 C:\Python37-32\lib\json\decoder.py:343(raw_decode)
        4    0.000    0.000    0.124    0.031 C:\Python37-32\lib\site-packages\_plotly_utils\utils.py:66(default)
        1    0.000    0.000    0.093    0.093 C:\Python37-32\lib\site-packages\_plotly_utils\utils.py:123(encode_as_list)
        1    0.093    0.093    0.093    0.093 {method 'tolist' of 'numpy.ndarray' objects}
        1    0.000    0.000    0.017    0.017 C:\Python37-32\lib\site-packages\_plotly_utils\utils.py:131(encode_as_sage)
        3    0.000    0.000    0.017    0.006 C:\Python37-32\lib\site-packages\_plotly_utils\optional_imports.py:15(get_module)
        1    0.000    0.000    0.016    0.016 C:\Python37-32\lib\importlib\__init__.py:109(import_module)
      2/1    0.000    0.000    0.016    0.016 <frozen importlib._bootstrap>:994(_gcd_import)
      2/1    0.000    0.000    0.016    0.016 <frozen importlib._bootstrap>:978(_find_and_load)
      2/1    0.000    0.000    0.016    0.016 <frozen importlib._bootstrap>:948(_find_and_load_unlocked)
        1    0.000    0.000    0.016    0.016 <frozen importlib._bootstrap>:211(_call_with_frames_removed)
        1    0.000    0.000    0.016    0.016 <frozen importlib._bootstrap>:882(_find_spec)
        1    0.000    0.000    0.016    0.016 <frozen importlib._bootstrap_external>:1272(find_spec)

Results with Python 2.7 on Linux are almost identical.

Looking through the profile, the main cost appears to be creating the JSON, with
C:\Python37-32\lib\site-packages\_plotly_utils\utils.py taking 2.45 s out of a total of 4 s.
I know that orjson (Python 3 only) can be faster than Python's default JSON module. Would you expect switching to orjson to improve performance, and would it be possible to implement?
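The two largest entries in the profile above are JSON encoding and gzip compression. As a rough way to reproduce that split outside of Dash, here is a minimal sketch using only the standard library. The function name `time_serialization` and the default grid size are assumptions made for illustration, and the timings will differ from the numbers reported above.

```python
import gzip
import json
import random
import time


def time_serialization(rows=200, cols=200):
    """Time JSON encoding and gzip compression of a rows x cols float grid."""
    # Plain nested lists stand in for the heatmap's z values.
    grid = [[random.random() for _ in range(cols)] for _ in range(rows)]

    start = time.perf_counter()
    body = json.dumps({"z": grid}).encode("utf-8")
    encode_seconds = time.perf_counter() - start

    start = time.perf_counter()
    compressed = gzip.compress(body)
    compress_seconds = time.perf_counter() - start

    return {
        "encode_seconds": encode_seconds,
        "compress_seconds": compress_seconds,
        "raw_bytes": len(body),
        "compressed_bytes": len(compressed),
    }


if __name__ == "__main__":
    for key, value in time_serialization().items():
        print(key, value)
```

Raising `rows` and `cols` to 1000 approximates the payload size used in the report.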

Thanks for any insight.

pfbuxton commented Oct 22, 2019

Looking into this further, the slow part is where the JSON is created, loaded, and then dumped again to convert NaN to null:

def encode(self, o):
    """
    Load and then dump the result using parse_constant kwarg
    Note that setting invalid separators will cause a failure at this step.
    """
    # this will raise errors in a normal-expected way
    encoded_o = super(PlotlyJSONEncoder, self).encode(o)
    # now:
    #    1. `loads` to switch Infinity, -Infinity, NaN to None
    #    2. `dumps` again so you get 'null' instead of extended JSON
    try:
        new_o = _json.loads(encoded_o, parse_constant=self.coerce_to_strict)
    except ValueError:
        # invalid separators will fail here. raise a helpful exception
        raise ValueError(
            "Encoding into strict JSON failed. Did you set the separators "
            "valid JSON separators?"
        )
    else:
        return _json.dumps(
            new_o,
            sort_keys=self.sort_keys,
            indent=self.indent,
            separators=(self.item_separator, self.key_separator),
        )
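To make the cost of that round trip concrete, the same three steps can be sketched with only the standard library. This is an illustration rather than the Plotly encoder itself: `to_strict_json` is a hypothetical helper name, and a lambda stands in for `coerce_to_strict`.

```python
import json


def to_strict_json(obj):
    """Encode, decode, and re-encode so NaN and +/-Infinity become null."""
    # 1. The first pass may emit the non-standard tokens NaN, Infinity, -Infinity.
    extended = json.dumps(obj)
    # 2. parse_constant is called for each of those tokens; returning None drops them.
    decoded = json.loads(extended, parse_constant=lambda name: None)
    # 3. The second pass writes every None as null.
    return json.dumps(decoded)


print(to_strict_json({"z": [1.0, float("nan"), float("-inf")]}))
# {"z": [1.0, null, null]}
```

Every value is therefore serialized twice and parsed once, which matches the `dumps`, `loads`, and second `encode` entries in the profile.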

I tried a very unsafe method:

encoded_o = super(PlotlyJSONEncoder, self).encode(o).replace('NaN', 'null')
return encoded_o

and found the time went from 4 seconds down to 2.5 seconds, so it looks like there is potential for a good performance improvement?

(this method is unsafe because if you wanted to have NaN in any part of the page, then it would be converted to null)
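A small standard-library sketch of why the string replacement is unsafe. The payload and the name `unsafe_encode` are made up for illustration.

```python
import json


def unsafe_encode(obj):
    """Blindly replace the NaN token in the encoded text (not recommended)."""
    return json.dumps(obj).replace("NaN", "null")


payload = {"title": "NaN handling", "z": [1.0, float("nan"), float("inf")]}
print(unsafe_encode(payload))
# {"title": "null handling", "z": [1.0, null, Infinity]}
```

The replacement rewrites the title string as well as the data, and it leaves `Infinity` in place, so the output is still not strict JSON.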

pfbuxton commented Nov 4, 2019

A safe (and fast!) solution is to make the object JSON compliant when converting it to a list, here:

    @staticmethod
    def encode_as_list(obj):
        """Attempt to use `tolist` method to convert to normal Python list."""
        if not hasattr(obj, "tolist"):
            raise NotEncodable
        # Only float arrays are handled here; more data types (e.g. integers)
        # still need to be added.
        if isinstance(obj, np.ndarray) and (
            obj.dtype == "float64" or obj.dtype == "float32"
        ):
            # Replace NaN and +/- infinity with None so they serialize as null
            obj_json_compliant = np.where(np.isnan(obj) | np.isinf(obj), None, obj)
            return obj_json_compliant.tolist()
        return obj.tolist()

(You have to be a bit careful with the NumPy types, because text arrays are allowed to contain "nan" and "inf" and must be left unchanged.)

Then you would simply do a single return (no re-loading JSON and re-exporting):

encoded_o = super(PlotlyJSONEncoder, self).encode(o)
return encoded_o
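The proposal can be tried outside of Plotly with a standalone sketch. `to_json_compliant_list` is a hypothetical helper rather than part of `_plotly_utils`, and using `np.issubdtype` to cover all float types is an assumption that goes slightly beyond the float32/float64 check above.

```python
import json

import numpy as np


def to_json_compliant_list(arr):
    """Convert an ndarray to nested lists, mapping NaN and +/-inf to None."""
    if np.issubdtype(arr.dtype, np.floating):
        # Mixing floats with None yields an object array, so tolist()
        # returns plain Python floats and None values.
        return np.where(np.isfinite(arr), arr, None).tolist()
    # Integer and text arrays cannot hold float NaN or inf; pass them through.
    return arr.tolist()


z = np.array([[1.0, np.nan], [np.inf, 4.0]])
# allow_nan=False makes json.dumps raise if a non-finite float slipped through.
print(json.dumps(to_json_compliant_list(z), allow_nan=False))
# [[1.0, null], [null, 4.0]]
```

Because the data is cleaned before encoding, a single `json.dumps` call is enough and the load-and-dump round trip is not needed.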

Would it be possible to implement my solution?

@gvwilson

Hi - we are trying to tidy up the stale issues and PRs in Plotly's public repositories so that we can focus on things that are still important to our community. Since this one has been sitting for several years, I'm going to close it; if it is still a concern, please add a comment letting us know what recent version of our software you've checked it with so that I can reopen it and add it to our backlog. Thanks for your help - @gvwilson