Histogram() function is unusably slow on large file #1587

Closed
toptensoftware opened this issue Mar 15, 2024 · 4 comments

@toptensoftware

Magick.NET version

Magick.NET-Q16-AnyCPU v13.6.0

Environment (Operating system, version and so on)

Windows 10

Description

Calling Histogram() on a largish image (10,000 x 1,600) is so slow as to be unusable: it can take several minutes, with much of the time spent in MagickColorCollection.ToDictionary().

Besides improving performance here, another idea would be to provide a per-channel histogram function that doesn't depend on dictionaries and just returns an array of counts.

fwiw: in the end I wrote my own function that used GetPixelsUnsafe().GetAreaPointer() and did a per-channel histogram into an array of counts for each channel. This took less than a second.
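For illustration only, a per-channel count of this kind might look like the sketch below. It is not the function described above: the class and method names are hypothetical, and it assumes the pixel data has already been copied into an interleaved 16-bit buffer (R, G, B, R, G, B, ...). How that buffer is obtained from Magick.NET is deliberately left out.

```csharp
using System;

// Hypothetical helper, not part of Magick.NET.
public static class ChannelHistogram
{
    // samples: interleaved 16-bit channel values (assumed layout: R,G,B,R,G,B,...).
    // channelCount: number of interleaved channels per pixel.
    // Returns counts[channel][value] with 65536 bins per channel.
    public static int[][] Compute(ushort[] samples, int channelCount)
    {
        if (samples == null)
            throw new ArgumentNullException(nameof(samples));
        if (channelCount <= 0)
            throw new ArgumentOutOfRangeException(nameof(channelCount));
        if (samples.Length % channelCount != 0)
            throw new ArgumentException("Length must be a multiple of channelCount.", nameof(samples));

        var counts = new int[channelCount][];
        for (var c = 0; c < channelCount; c++)
            counts[c] = new int[65536];

        // One pass over the buffer: each sample indexes directly into its
        // channel's array, so no dictionary or hashing is involved.
        for (var i = 0; i < samples.Length; i += channelCount)
            for (var c = 0; c < channelCount; c++)
                counts[c][samples[i + c]]++;

        return counts;
    }
}
```

The memory cost is fixed at 65536 counters per channel regardless of how many distinct colors the image contains, which is the main difference from a dictionary keyed by full color.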

Steps to Reproduce

Load a large image, call Histogram()

@dlemstra
Owner

Is it possible to share an image that demonstrates this? You might need to zip it before upload.

Not sure what you mean by returning an array of counts? Can you explain in more detail what you mean by that?

@toptensoftware
Author

Hi Dirk,

Here's an image that demonstrates the problem.

Just load it in the Q16 version of ImageMagick and call Histogram(). It takes almost 3 minutes (and this is a 60 MB image; usually I'm working with 2 GB images).

Pretty sure this is related to the sheer number of color combinations possible in a Q16 image: the dictionary gets quite large. But even so, loading just an average-size 8-bit-per-channel jpg and calling Histogram() still takes several seconds.

Sorry if I didn't explain "array of counts" clearly....

What I was trying to do here is produce a typical RGB per-channel histogram like you'd see in most image editing applications. In this case I don't need a histogram of every possible color, but instead a histogram of the separate R, G and B channels.

My first attempt at this was to call Histogram() and then process the returned dictionary to produce separate histograms for each channel, but it was way too slow.

What would be more useful in this case is just an int pixelCounts[channelCount][65536] for the per-channel histograms. It's a different result from the current histogram function, but perhaps more useful, and it would avoid the cost of populating the dictionary. Actually, even a reduced-precision histogram (int pixelCounts[channelCount][256]) would be good enough for display purposes here.
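As a rough sketch of the reduced-precision idea, a 65536-bin channel histogram can be folded into 256 bins by keeping only the high byte of each 16-bit value. The class and method names here are hypothetical, and the high-byte mapping is an assumption about what "reduced precision" should mean; other mappings (for example, rounding) are possible.

```csharp
using System;

// Hypothetical helper, not part of Magick.NET.
public static class HistogramReducer
{
    // counts16: one channel's histogram with 65536 bins (one per 16-bit value).
    // Returns a 256-bin histogram where bin b sums the counts of all
    // 16-bit values whose high byte is b (values b*256 .. b*256 + 255).
    public static int[] To256Bins(int[] counts16)
    {
        if (counts16 == null)
            throw new ArgumentNullException(nameof(counts16));
        if (counts16.Length != 65536)
            throw new ArgumentException("Expected 65536 bins.", nameof(counts16));

        var counts8 = new int[256];
        for (var value = 0; value < counts16.Length; value++)
            counts8[value >> 8] += counts16[value];

        return counts8;
    }
}
```

Because the fold preserves totals, the 256-bin result still sums to the pixel count, so it can be normalized for display the same way as the full-precision version.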

Brad

(PS: I actually don't need this functionality any longer since I've moved my histogram generation to a later stage in the processing pipeline where I'm working with an OpenGL texture. But I thought I should report it anyway in case it's something you want to fix).

@dlemstra
Owner

dlemstra commented Apr 9, 2024

Thanks for reporting this. There is not much I can do to improve the performance when you have that many entries. But I can make some tiny tweaks to improve the performance of that method and I just pushed a patch for that.

@toptensoftware
Author

Hi @dlemstra

No problem. As mentioned, I'm calculating my own histograms now by accessing the pixel data directly. I only reported it because it seemed extreme and thought it might be something you could address.

Brad

@dlemstra dlemstra added this to the 13.7.0 milestone Apr 25, 2024