Single-bar histograms don't respect manual bins specs when data lies in final bin #1229

Closed
andrewkfiedler opened this issue Dec 6, 2016 · 16 comments
Labels: bug (something broken)


@andrewkfiedler

Here are the relevant codepens to reproduce:
http://codepen.io/andrewkfiedler/pen/ENLgJg
http://codepen.io/andrewkfiedler/pen/zojKgK

Both show different behavior, but both show that manual binning gets ignored.

@etpinard
Contributor

etpinard commented Dec 7, 2016

What is the desired behavior here in your mind?

@andrewkfiedler
Author

In my mind, and judging by the docs (see https://plot.ly/python/histograms/#colored-and-styled-histograms), if a user gives manual bins then data should be binned accordingly. In my example, the bins should be 0 to 20000, and 20000 to 40000 regardless of the data.

Without this behavior, it's hard to make useful overlaid histograms. I went ahead and made another codepen example showing this: http://codepen.io/andrewkfiedler/pen/YpLBdR?editors=0010

Notice that in the second histogram, there doesn't appear to be an overlay when there should be. The bin farthest to the right on the histogram should be twice as dark. However, the data is getting binned as seen in the first codepen I posted here: http://codepen.io/andrewkfiedler/pen/ENLgJg. Instead of getting binned into the 20000 to 40000 bin, it makes its own bin from 29.9996k to 30.0004k.

My particular use case is similar to what you see here: https://plot.ly/r/shinyapp-linked-brush/

I have a dataset plotted on a histogram, and from another view I'm allowing users to select specific results that make up the histogram. When they select those results, I'm trying to highlight the bins that those results belong to (and what share of each bin they make up).
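
For readers following along, the highlighting described above only works if both traces share the same fixed bins. A minimal sketch of that idea in plain Python, assuming left-closed bins and using made-up sample values; `bin_counts` and `selected_share` are hypothetical helper names, not part of plotly:

```python
def bin_counts(data, start, end, size):
    """Count values per fixed bin [start + i*size, start + (i+1)*size)."""
    n_bins = int(round((end - start) / size))
    counts = [0] * n_bins
    for x in data:
        if start <= x < end:
            # Clamp guards against floating-point rounding at the upper edge.
            idx = min(int((x - start) // size), n_bins - 1)
            counts[idx] += 1
    return counts


def selected_share(all_values, selected, start, end, size):
    """Fraction of each bin's total count that comes from the selection."""
    totals = bin_counts(all_values, start, end, size)
    picked = bin_counts(selected, start, end, size)
    return [p / t if t else 0.0 for p, t in zip(picked, totals)]


# Illustrative data only (not taken from the codepens).
all_values = [5000, 12000, 25000, 30000, 30000]
selected = [30000]
shares = selected_share(all_values, selected, start=0, end=40000, size=20000)
print(shares)
```

If the selected trace gets its own automatically chosen bin instead of the shared one, the two sets of counts no longer line up and the overlay cannot show this share.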

@andrewkfiedler
Author

Anything else I can provide to help out on this? I looked into plotly's code a bit but couldn't figure out where the bins are being overridden.

@etpinard
Contributor

Our apologies, we've been a little busy lately and haven't had the time to look at this report in detail.

At first glance it seems like your example over-determines the bin specifications: how can a bin start at 0, end at 40000 and have a size of 20000?

To help us out a little more, would you mind looking at other cases, e.g. cases with more than 1 bin?

@andrewkfiedler
Author

It's all good, I figured you guys are busy (especially with the holidays).

The bin specifications are fine (at least judging by docs: https://plot.ly/python/histograms/#colored-and-styled-histograms). In the example linked they use: start=-1.8, end=4.2, size=0.2
The start and end values correspond to the range of all bins, not a single bin. The size is what determines how to go from the start to the end.

I went ahead and made another codepen showing the behavior when more data is supplied:
http://codepen.io/andrewkfiedler/pen/WoYwaJ
The trace is [10000, 30000], with manual binning specified as before: start=0, end=40000, size=20000.
Yet the final histogram is binned from 2500 to 17500 and 22500 to 37500 (start=2500, end=37500, size=15000). So neither the start and end nor the size are respected.
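
The reading of start, end and size given in this comment can be sketched as follows. This is a plain-Python illustration of the expected binning, not plotly's implementation; `bin_intervals` and `find_bin` are hypothetical helper names, and left-closed bins are an assumption:

```python
def bin_intervals(start, end, size):
    """List the (low, high) intervals implied by start/end/size."""
    n_bins = int(round((end - start) / size))
    return [(start + i * size, start + (i + 1) * size) for i in range(n_bins)]


def find_bin(x, intervals):
    """Return the interval containing x, assuming low <= x < high."""
    for low, high in intervals:
        if low <= x < high:
            return (low, high)
    return None


intervals = bin_intervals(start=0, end=40000, size=20000)
print(intervals)                    # [(0, 20000), (20000, 40000)]
print(find_bin(10000, intervals))   # (0, 20000)
print(find_bin(30000, intervals))   # (20000, 40000)
```

Under this reading, start and end describe the span of all bins together, and size sets how many bins fit inside that span.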

@etpinard
Contributor

etpinard commented Dec 14, 2016

Ah I see. You had both bargap and bargroupgap set. Commenting those out as in http://codepen.io/etpinard/pen/woQWgE? gets you the desired result (I think).
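
A rough sketch of why gap settings can make a correctly binned bar look like it covers a narrower range. It assumes the drawn bar is centered in its bin and that its width is the bin size scaled by (1 - bargap) * (1 - bargroupgap); that formula and the helper name `drawn_bar_extent` are assumptions for illustration. The gap value 0.25 is hypothetical, chosen only because it reproduces the 2500 to 17500 extent reported earlier; the values actually used in the codepen are not stated in this thread.

```python
def drawn_bar_extent(bin_start, bin_end, bargap=0.0, bargroupgap=0.0):
    """Approximate x-extent of a bar drawn for the bin [bin_start, bin_end)."""
    size = bin_end - bin_start
    center = bin_start + size / 2
    # Assumed model: both gap settings shrink the bar around the bin center.
    width = size * (1 - bargap) * (1 - bargroupgap)
    return (center - width / 2, center + width / 2)


# Hypothetical gap value; the bins themselves are still 0-20000 and 20000-40000.
print(drawn_bar_extent(0, 20000, bargap=0.25))      # (2500.0, 17500.0)
print(drawn_bar_extent(20000, 40000, bargap=0.25))  # (22500.0, 37500.0)
print(drawn_bar_extent(0, 20000))                   # (0.0, 20000.0)
```

In other words, the bars can shrink visually while the underlying bins stay where the manual spec put them.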

@andrewkfiedler
Author

andrewkfiedler commented Dec 14, 2016

You're my hero, I was really hoping it was a configuration mistake!

I'll try it out in my app and make sure it works.

@andrewkfiedler
Author

andrewkfiedler commented Dec 14, 2016

Ah dang, I tried your pen with less data and it didn't work.
See http://codepen.io/andrewkfiedler/pen/MbzeOW.

I updated the trace to only have one point at 22000, but it still bins around 29.9996k to 30.0004k.

Interestingly, if I update the trace to only have one point at 5000 it works correctly, see http://codepen.io/andrewkfiedler/pen/JbeKMW.

@andrewkfiedler
Author

I think I've tracked down the issue. The manual binning appears to work correctly, so long as the data is not limited to only the final bin.

I've made some new codepens to demonstrate this. I've updated the manual binning to be start=0, end=80000, size=20000. So there are four possible bins. I've added a range on the xaxis from 0 to 80000 as well.
Data in Bin 1: http://codepen.io/andrewkfiedler/pen/ENOyLN?editors=0010
Data in Bin 2: http://codepen.io/andrewkfiedler/pen/ENOyRN?editors=0010
Data in Bin 3: http://codepen.io/andrewkfiedler/pen/VmVjdy?editors=0010
Data in Final Bin: http://codepen.io/andrewkfiedler/pen/BQGzVv?editors=0010

As you can see, it's only the final bin that appears to experience the issue.
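
The triggering condition described above (all data inside the final bin) can be checked with a small helper. This is only a sketch for reasoning about the report, not plotly code; `bin_number` and `only_in_final_bin` are hypothetical names, bins are assumed left-closed, and the sample points other than 5000 and 22000 are made up:

```python
import math


def bin_number(x, start, size):
    """1-based bin number of x for bins of width size starting at start."""
    return math.floor((x - start) / size) + 1


def only_in_final_bin(data, start, end, size):
    """True when data is non-empty and every point falls in the last bin."""
    n_bins = math.ceil((end - start) / size)
    return len(data) > 0 and all(
        bin_number(x, start, size) == n_bins for x in data
    )


# Four bins: 0-20000, 20000-40000, 40000-60000, 60000-80000.
print(only_in_final_bin([5000], 0, 80000, 20000))    # False (bin 1)
print(only_in_final_bin([70000], 0, 80000, 20000))   # True (final bin)
# With end=40000 the second bin is the final one, as in the earlier pens.
print(only_in_final_bin([22000], 0, 40000, 20000))   # True
```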

Do note that I also added a manual range, as it appears the range gets determined from the data rather than the manual binning. Not sure if that's as big of an issue, since I can specify range to get around it.

@etpinard
Contributor

> it appears the range gets determined from the data rather

That's correct. This is consistent with all our other trace types.

@etpinard
Contributor

Ok @andrewkfiedler looks like http://codepen.io/andrewkfiedler/pen/BQGzVv?editors=0010 does indeed demonstrate a bug. Big thanks for hunting that one down!

I'll change the issue title accordingly. Please note that single-bar histograms aren't very common, so this bug won't take up a high priority on our list.

@etpinard changed the title from "Histograms do not respect manual binning" to "Single-bar histograms don't respect manual bins specs when data lies in final bin" on Dec 15, 2016
@etpinard added the "bug (something broken)" label on Dec 15, 2016
@andrewkfiedler
Author

No problem, thanks for helping!

No big deal on the priority, a workaround is to specify the end of the bins one size more than you'll actually need (so nothing ends up in the "last" bin).
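
A minimal sketch of that workaround: build the bin spec with the end pushed out by one extra bin width, so real data never lands in the last bin. `padded_xbins` is a hypothetical helper name; the returned dict uses the start/end/size keys discussed in this thread.

```python
def padded_xbins(start, end, size):
    """Return an xbins-style dict whose end is one bin width past the real end."""
    return dict(start=start, end=end + size, size=size)


xbins = padded_xbins(start=0, end=40000, size=20000)
print(xbins)  # {'start': 0, 'end': 60000, 'size': 20000}
```

The resulting dict would be passed as the histogram trace's xbins setting, together with an explicit x-axis range (0 to 40000 here) so the empty padding bin stays out of view.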

@etpinard
Contributor

> a workaround is to specify the end of the bins one size more than you'll actually need (so nothing ends up in the "last" bin).

Great. Thanks again!

andrewkfiedler added a commit to codice/ddf-ui that referenced this issue Dec 22, 2016
 - plotly/plotly.js#1229
 - plotly/plotly.js#1231
 - Also ensures that strings are treated as strings (by adding a zero width space).
pklinef pushed a commit to codice/ddf-ui that referenced this issue Dec 23, 2016
DDF-2644
 - plotly/plotly.js#1229
 - plotly/plotly.js#1231
 - Also ensures that strings are treated as strings (by adding a zero width space).
rzwiefel pushed a commit to codice/ddf-ui that referenced this issue Dec 28, 2016
 - plotly/plotly.js#1229
 - plotly/plotly.js#1231
 - Also ensures that strings are treated as strings (by adding a zero width space).
rzwiefel pushed a commit to rzwiefel/ddf that referenced this issue Jan 5, 2017
 - plotly/plotly.js#1229
 - plotly/plotly.js#1231
 - Also ensures that strings are treated as strings (by adding a zero width space).
shaundmorris pushed a commit to shaundmorris/ddf that referenced this issue Mar 29, 2017
 - plotly/plotly.js#1229
 - plotly/plotly.js#1231
 - Also ensures that strings are treated as strings (by adding a zero width space).
@alexcjohnson
Collaborator

There have been a bunch of histogram-related fixes in the last couple of months (#2413, #2113, #2028, #1944...). It looks to me like one of them fixed this issue, but between all the back-and-forth here I can't exactly tell. @andrewkfiedler or @etpinard can you confirm?

@etpinard
Contributor

etpinard commented Mar 7, 2018

http://codepen.io/andrewkfiedler/pen/BQGzVv?editors=0010 has looked ok to me since 1.31.0, so I'll close this thing. @andrewkfiedler feel free to open another bug report if your use case isn't fully fixed. Thanks!

@etpinard etpinard closed this as completed Mar 7, 2018
rzwiefel pushed a commit to codice/ddf-ui that referenced this issue Nov 8, 2019
 - plotly/plotly.js#1229
 - plotly/plotly.js#1231
 - Also ensures that strings are treated as strings (by adding a zero width space).
@datalifenyc

datalifenyc commented Jul 6, 2020

Since the codepen examples are in JS, I thought a Python note might be helpful for others who come across the same issue. xaxis_range can be used to specify the desired range. In the example below, I needed 8 bins (size = 1/8). Thanks to @andrewkfiedler and @etpinard for the clues.

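import numpy as np
import plotly.graph_objects as go

# Placeholder data (assumption): the original pred_prob was a numpy.ndarray
# of predicted probabilities in [0, 1].
pred_prob = np.random.default_rng(0).random(200)

fig = go.Figure()
fig.add_trace(go.Histogram(
    x=pred_prob,
    xbins=dict(
        start=0.0,
        end=1.0,
        size=1/8,  # 8 equal-width bins across [0, 1]
    ),
))

fig.update_layout(
    title_text='Sampled Results',
    xaxis_title_text='Probability Range',
    yaxis_title_text='Count',
    xaxis_range=[0, 1],  # pin the axis to the full bin range, not the data range
)
fig.show()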
