Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make RGB serialization more robust #14011

Open
Murat-U-Saglam opened this issue Aug 3, 2024 · 8 comments · May be fixed by #14012
Open

Make RGB serialization more robust #14011

Murat-U-Saglam opened this issue Aug 3, 2024 · 8 comments · May be fixed by #14012
Assignees
Labels
python Issues that should only require updating Python code type: task
Milestone

Comments

@Murat-U-Saglam
Copy link

Murat-U-Saglam commented Aug 3, 2024

Software versions

Python version : 3.10.12 (main, Jul 29 2024, 16:56:48) [GCC 11.4.0]
IPython version : 8.26.0
Tornado version : 6.4.1
Bokeh version : 2.4.3
BokehJS static path : READACTED/ETL/venv/lib/python3.10/site-packages/bokeh/server/static
node.js version : (not installed)
npm version : (not installed)
Operating system : Linux-6.9.3-76060903-generic-x86_64-with-glibc2.35

Browser name and version

N/A

Jupyter notebook / Jupyter Lab version

N/A

Expected behavior

Aim is to

Deserialize JSON String to Document format

Use appropriate deserialization methods to convert the JSON string to a document object.

Serialize Document in FastAPI:

Before sending the document to the frontend, serialize it properly in your FastAPI endpoint.

Model Document in Streamlit:

On the frontend, use Streamlit to model and render the serialized document.

Note

I am using a library that uses bokeh unfortunately they're still stuck at version 2.4.3 with no intentions of upgrading.

type(plot) initially is bokeh.models.plots.GridPlot

I then send this information to my frontend where I need to render it to a plot to use via st.bokeh_chart

Observed behavior

Error Code

TypeError('list indices must be integers or slices, not str')

Example code

Dependencies

pip3 install streamlit==1.36.0 backtesting==0.3.4.dev30+g0ce24d8 bokeh==2.4.3

import pandas as pd
from backtesting.lib import SignalStrategy, TrailingStrategy
from backtesting.test import SMA
from bokeh.document import Document
import bokeh
from bokeh.core.json_encoder import serialize_json
from backtesting.test import GOOG
import streamlit as st
from backtesting import Backtest
from backtesting.test import GOOG

print(bokeh.__version__)

class SmaCross(SignalStrategy,
               TrailingStrategy):
    n1 = 10
    n2 = 25
    
    def init(self):
        # In init() and in next() it is important to call the
        # super method to properly initialize the parent classes
        super().init()
        
        # Precompute the two moving averages
        sma1 = self.I(SMA, self.data.Close, self.n1)
        sma2 = self.I(SMA, self.data.Close, self.n2)
        
        # Where sma1 crosses sma2 upwards. Diff gives us [-1,0, *1*]
        signal = (pd.Series(sma1) > sma2).astype(int).diff().fillna(0)
        signal = signal.replace(-1, 0)  # Upwards/long only
        
        # Use 95% of available liquidity (at the time) on each order.
        # (Leaving a value of 1. would instead buy a single share.)
        entry_size = signal * .95
                
        # Set order entry sizes using the method provided by 
        # `SignalStrategy`. See the docs.
        self.set_signal(entry_size=entry_size)
        
        # Set trailing stop-loss to 2x ATR using
        # the method provided by `TrailingStrategy`
        self.set_trailing_sl(2)
        


bt = Backtest(GOOG, SmaCross, commission=.002)
bt.run()
plot=bt.plot()
plot = plot.document
plot = plot.to_json()
plot = serialize_json(plot)
print(type(plot))

cart = Document().from_json_string(plot)

st.bokeh_chart(cart, use_container_width=True)

Stack traceback or browser console output

frontend | File "/usr/local/lib/python3.10/site-packages/streamlit/runtime/scriptrunner/exec_code.py", line 75, in exec_func_with_error_handling
frontend | result = func()
frontend | File "/usr/local/lib/python3.10/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 574, in code_to_exec
frontend | exec(code, module.dict)
frontend | File "/frontend/app/pages/1_⏪_Backtest.py", line 137, in
frontend | chart = Document().from_json_string(result[1])
frontend | File "/usr/local/lib/python3.10/site-packages/bokeh/document/document.py", line 476, in from_json_string
frontend | return cls.from_json(json_parsed)
frontend | File "/usr/local/lib/python3.10/site-packages/bokeh/document/document.py", line 449, in from_json
frontend | root_ids = roots_json['root_ids']

New Error

ValueError: failed to validate CategoricalColorMapper(id='1775', ...).palette: expected an element of Seq(Color), got seq with invalid items ['rgb(178.5, 27.163043478260846, 0.0)', 'rgb(0.0, 178.5, 0.0)'] Opening in existing browser session.

Screenshots

No response

@bryevdv
Copy link
Member

bryevdv commented Aug 3, 2024

@Murat-U-Saglam we cannot evaluate anything without a complete minimal reproducing example. We need code that we can take and run directly, without any modifications, that reproduces the issue you want us to investigate.

@Murat-U-Saglam
Copy link
Author

@bryevdv Sorry about that I've made changes as appropriate. I've added the associated library that is generating the charts, the charts render but don't serialise and deserialise as intended. I have more verbose logs generated when making the abovementioned changes in "New Error".

@Murat-U-Saglam
Copy link
Author

This part of the json seems to be the offender.
"transform":{"type":"object","name":"CategoricalColorMapper","id":"p1045","attributes":{"palette":["rgb(178.5, 27.163043478260846, 0.0)","rgb(0.0, 178.5, 0.0)"]

Can you provide some documentation on how I can parse the deeply nested object and convert it?
"rgb(178.5, 27.163043478260846, 0.0)","rgb(0.0, 178.5, 0.0)" to CategoricalColorMapper types

@bryevdv
Copy link
Member

bryevdv commented Aug 3, 2024

@Murat-U-Saglam This appears to be a bug related to deserializing the CSS style rgb(...) color values. If save the JSON string to a file then manually edit those to be hex color values e.g.

"palette":["#884400","#008800"]}

Then the document loads without error:

In [7]: json_str = open("out.json").read()

In [8]: Document().from_json_string(json_str)
Out[8]: <bokeh.document.document.Document at 0x104cdcbf0>

It's possible this bug is already fixed in future versions of Bokeh. I'd really like to test that, in fact, but that unfortunately the backtesting library does not work with new version, making it impossible as-is. For future reference, please always attempt to provide a "pure" MRE that is not affected by the limitations of external libraries outside our control. i.e. in this case it would be far preferable for an MRE to generate the required plot all on its own using Bokeh APIs directly, in a self-contained way.

In any case, there will definitely never been any more 2.4.x releases. So even if this still needs to be fixed, it won't be fixed in branches that old. My best suggestion is to update the tool that generates the Bokeh content to use hex color values instead of CSS color values.

cc @mattpap do you happen to recall if this seems like something that ever came up and was fixed? I didn't see any likely PRs at a quick search.

@bryevdv
Copy link
Member

bryevdv commented Aug 3, 2024

For reference, even just this code fails to validate on either Bokeh 2.4.3 or latest branch-3.6

In [4]: from bokeh.models import LinearColorMapper
   ...: color = LinearColorMapper(
   ...:     low=0, high=4,
   ...:     palette=["rgb(178.5, 27.163043478260846, 0.0)","rgb(0.0, 178.5, 0.0)"],
   ...: )

ValueError: failed to validate LinearColorMapper(id='p1001', ...).palette: expected an element of Seq(Color), got seq with invalid items ['rgb(178.5, 27.163043478260846, 0.0)', 'rgb(0.0, 178.5, 0.0)']

However this code, with integer RGB values, does work just fine in old or new Bokeh:

In [5]: color = LinearColorMapper(
   ...:     low=0, high=4,
   ...:     palette=["rgb(178, 27, 0)","rgb(0, 178, 0)"],
   ...: )

The problem is the floating point RGB values in the rgb(...) values. Only integers between 0 and 255 are allowed. I am not even clear how the output with floating rgb values in a color mapper could even have been produced in the first place, given the validation error. Something somewhere in backtester is doing something very strange and unsupported. Unfortunately without a pure Bokeh MRE I don't think there is anything else for us to look into on our end.

@bokeh/core I would propose to close this a working as intended unless someone wants to re-purpose the issue to try to make a more specific error message.

@bryevdv
Copy link
Member

bryevdv commented Aug 4, 2024

I spent a few more minutes. There is an another color mapper, and the "live" (unserialized) object does have bad RGB values:

In [84]: cm = plot.children[0].children[2][0].renderers[4].glyph.line_color["transform"]

In [85]: cm.palette
Out[85]: [rgb(178.5, 27.163043478260846, 0.0), rgb(0.0, 178.5, 0.0)]

AFAIK this should not be possible at all (see the validation errors above) so backtester is definitely doing something weird the circumvents the expected property validation. Someone would have to dig into backtester code to try and understand what that might be.

@bryevdv
Copy link
Member

bryevdv commented Aug 4, 2024

OK. The problem is that bokeh.colors.RGB will allow non-integer components, but then the actual Color property will accept an RGB object without any further scrutiny. Then, RGB objects get serialized to rgb(...) strings, and those do get validated for integer components.

Here is an actual minimal reproducer that displays a plot without issue but then explodes with the validation error when trying to do a serialization round-trip:

from bokeh.colors import RGB
from bokeh.core.json_encoder import serialize_json
from bokeh.document import Document
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure, show
from bokeh.transform import linear_cmap

source = ColumnDataSource(data={
    "x": [1, 2, 3, 4],
    "y": [2, 4, 4, 2],
    "v": [1, 3, 3, 1],
})

color = linear_cmap(
    field_name="v", low=0, high=4, 

    # this works "on the way out" but fails "on the way back in"
    palette=[RGB(178.5, 27.163043478260846, 0.0), RGB(0.0, 178.5, 0.0)],
)

p = figure()
p.circle("x", "y", size=20, color=color, source=source)
show(p)

json_str = serialize_json(p.document.to_json())
Document().from_json_string(json_str)

I have to presume that backtester is using the RGB class in a similar way.

Possible resolutions:

  • make things like RGB(0.0, 178.5, 0.0) raise an error immediately
  • have the color property cast components to ints when the value is an RGB object
  • allow floating point rgb(...) components at the property level? The CSS spec actually seems to say this is ok

@bryevdv bryevdv added type: bug python Issues that should only require updating Python code and removed TRIAGE NEEDS MORE INFO labels Aug 4, 2024
@bryevdv bryevdv added this to the 3.6 milestone Aug 4, 2024
@bryevdv
Copy link
Member

bryevdv commented Aug 4, 2024

Just noting that the docs for RGB do state that only integer values are expected:

class RGB(Color):
    def __init__(self, r: int, g: int, b: int, a: float = 1.0) -> None:
        '''
        Args:
            r (int) :
                The value for the red channel in [0, 255]
            g (int) :
                The value for the green channel in [0, 255]
            b (int) :
                The value for the blue channel in [0, 255]
        '''

But backtesting does not ensure it passes int values to RGB so this is technically a usage error on their part, resulting in undefined behavior, not a bug.

That said, I think we can improve the situation on our end. Adding a validation to RGB.__init__ would be trivial to do, but unnecessrily limiting, I think. My preference is to loosen the stated type restriction to float and then have the property also handle floating components in rgb(...) strings.

@bryevdv bryevdv changed the title from_json_string() giving incorrect and weird error. Make RGB serialization more robust Aug 4, 2024
@bryevdv bryevdv self-assigned this Aug 4, 2024
@mattpap mattpap linked a pull request Aug 5, 2024 that will close this issue
@bryevdv bryevdv assigned mattpap and unassigned bryevdv Aug 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
python Issues that should only require updating Python code type: task
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants