Idea for Converter functions #1777
Comments
Do you get the same performance boost without the parallel loop? I don't think I want to add that in this spot, but I could make it an optional argument. I wonder what the performance boost would be without the parallel for loop.
I do get a large performance boost without the Parallel loop: it runs in 5 milliseconds without Parallel and 1-2 milliseconds with Parallel. So roughly half of the benefit seems to come from parallelization and half from getting rid of the buffers, perhaps 60-40, given that the old version takes ~12 milliseconds on my computer.

To give another example of what I am talking about, we have leveraged `nint pointer = pixels.GetAreaPointer(0, 0, width, height);` to enormous benefit when leveling images. These numbers are for 16-bit TIFFs. I suspect there are a large number of cases where the overhead of calling the C library (switching contexts) is more expensive than just manipulating the memory in C#. These could be sped up even further by vectorizing the mathematics with System.Numerics.Vector. Our leveling extension runs at quadruple the speed of the current leveling function without Parallel. In Parallel it takes less than a second on an 8-core machine to level an image.
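The vectorization idea mentioned above can be sketched without the library. This is not the leveling extension from this comment: `LevelSketch`, `Level`, and the `black`/`white` parameters are hypothetical names, and the sketch assumes the 16-bit samples are already reachable as a `Span<ushort>` (for example, a span wrapped around the pointer from `GetAreaPointer` in unsafe code). It stretches samples linearly with `System.Numerics.Vector` and finishes the leftover tail with scalar code.

```csharp
using System;
using System.Numerics;

public static class LevelSketch
{
    // Linearly stretches 16-bit samples in place so that `black` maps to 0
    // and `white` maps to 65535; values outside that range are clamped.
    public static void Level(Span<ushort> samples, ushort black, ushort white)
    {
        if (white <= black)
            throw new ArgumentException("white must be greater than black");

        float scale = 65535f / (white - black);
        var vBlack = new Vector<float>(black);
        var vScale = new Vector<float>(scale);
        int lanes = Vector<ushort>.Count;
        int i = 0;

        // Vector path: process `lanes` samples per iteration.
        for (; i <= samples.Length - lanes; i += lanes)
        {
            var v = new Vector<ushort>(samples.Slice(i, lanes));
            // Widen to 32-bit lanes so the samples can be converted to float.
            Vector.Widen(v, out Vector<uint> lo, out Vector<uint> hi);
            Vector<uint> rLo = Stretch(lo, vBlack, vScale);
            Vector<uint> rHi = Stretch(hi, vBlack, vScale);
            // Results are already clamped to 0..65535, so narrowing is lossless.
            Vector.Narrow(rLo, rHi).CopyTo(samples.Slice(i, lanes));
        }

        // Scalar path for the remaining samples; same arithmetic as above.
        for (; i < samples.Length; i++)
        {
            float f = (samples[i] - (float)black) * scale + 0.5f;
            samples[i] = (ushort)Math.Min(Math.Max(f, 0f), 65535f);
        }
    }

    private static Vector<uint> Stretch(Vector<uint> v, Vector<float> black, Vector<float> scale)
    {
        // Adding 0.5 before the truncating conversion rounds to nearest.
        Vector<float> f = (Vector.ConvertToSingle(v) - black) * scale + new Vector<float>(0.5f);
        f = Vector.Min(Vector.Max(f, Vector<float>.Zero), new Vector<float>(65535f));
        return Vector.ConvertToUInt32(f);
    }
}
```

Calling `LevelSketch.Level(samples, 1000, 3000)` maps every sample at or below 1000 to 0 and every sample at or above 3000 to 65535. Whether this beats a plain scalar loop depends on the hardware and the JIT, so it would need measuring.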
Here is what I think is happening for simple functions (like leveling), and why the C# code is so much faster even though it does more or less the same thing.
While calling C code is not as expensive as a full context switch, in many cases the overhead of the call may cost more than the operation itself. I have no doubt that if I ran the C code directly against my image it would be just as fast as (or faster than) my C# code, but in a tight loop the JIT is able to optimize for specific cases.
Is your feature request related to a problem? Please describe
The .ToBitmap code has a few issues and I offer a potential solution. With the code change I propose, on a 2-2.5 year old 12-core AMD CPU the conversion goes from 18 milliseconds to ~1.9 milliseconds.
The two problems with the existing code are:
1. This is a highly parallel problem.
2. Creating several thousand buffers is time consuming (assuming a 16-bit image).
Solution:
We already have a function that allows us to get a pointer to an area, so we should use a pointer instead of the buffers. In the case of RGB we can also go two pixels at a time, packing six bytes into a ulong and assigning two pixels at once.
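A minimal sketch of the two-pixels-at-once packing, under stated assumptions: it is not the code proposed in this issue, the names `PixelPacking`, `PackTwoBgr`, and `ConvertRow` are hypothetical, the input is interleaved 16-bit RGB, the output is 8-bit B,G,R, and a 16-bit channel is reduced to 8 bits by keeping its high byte (a simplification). It uses spans rather than raw pointers so that it compiles without unsafe code.

```csharp
using System;
using System.Buffers.Binary;

public static class PixelPacking
{
    // Packs two 16-bit-per-channel RGB pixels into the low six bytes of a
    // ulong, laid out as B,G,R,B,G,R when stored little-endian.
    public static ulong PackTwoBgr(ushort r0, ushort g0, ushort b0,
                                   ushort r1, ushort g1, ushort b1)
    {
        return (ulong)(b0 >> 8)
             | (ulong)(g0 >> 8) << 8
             | (ulong)(r0 >> 8) << 16
             | (ulong)(b1 >> 8) << 24
             | (ulong)(g1 >> 8) << 32
             | (ulong)(r1 >> 8) << 40;
    }

    // Converts one row of interleaved 16-bit RGB to 8-bit BGR.
    public static void ConvertRow(ReadOnlySpan<ushort> rgb16, Span<byte> bgr8)
    {
        int pixels = rgb16.Length / 3;
        if (bgr8.Length < pixels * 3)
            throw new ArgumentException("destination is too small", nameof(bgr8));

        int p = 0;
        // Each 8-byte store carries 6 useful bytes. The 2 spare bytes land on
        // the next pixel and are overwritten by a later store, so this path
        // is only taken while 8 bytes are still available in the destination.
        for (; p + 1 < pixels && p * 3 + 8 <= bgr8.Length; p += 2)
        {
            int s = p * 3;
            ulong packed = PackTwoBgr(rgb16[s], rgb16[s + 1], rgb16[s + 2],
                                      rgb16[s + 3], rgb16[s + 4], rgb16[s + 5]);
            BinaryPrimitives.WriteUInt64LittleEndian(bgr8.Slice(s), packed);
        }

        // Remaining pixels are written one byte at a time.
        for (; p < pixels; p++)
        {
            int s = p * 3;
            bgr8[s] = (byte)(rgb16[s + 2] >> 8);
            bgr8[s + 1] = (byte)(rgb16[s + 1] >> 8);
            bgr8[s + 2] = (byte)(rgb16[s] >> 8);
        }
    }
}
```

The point of the wide store is fewer memory writes per pixel; the bounds check on the 8-byte path is what keeps the spare bytes from running past the end of the row.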
Describe the solution you'd like
The code I propose is what I use for images at the lab. Usually we don't have an alpha channel, but I wrote code that accounts for both 32bppRgb and 24bppRgb, so I go through and assign two ulongs at once. The biggest gains, though, come from parallel processing and from not materializing buffers (we never call .ToByteArray). Conceptually it's very simple: we just get a pointer to the unsafe pixel collection and start iterating.
My code works on a variety of images. I acknowledge it doesn't have all the error checking and SetBitmapDensity isn't called, but the general idea is sound and could easily be applied to several other slower functions.
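The row-parallel part of the idea can be illustrated on its own. The following is a hypothetical sketch, not the code referred to above: `ParallelRows` and `ToBgr24` are made-up names, plain arrays stand in for the pixel pointer, rows are padded to a multiple of four bytes as in a typical 24bpp bitmap, and 16-bit channels are reduced to 8 bits by keeping the high byte. Each row is independent, so `Parallel.For` can hand rows to different cores without any locking.

```csharp
using System;
using System.Threading.Tasks;

public static class ParallelRows
{
    // Converts interleaved 16-bit RGB samples to 8-bit BGR rows,
    // processing rows in parallel.
    public static byte[] ToBgr24(ushort[] rgb16, int width, int height, out int stride)
    {
        if (rgb16.Length < width * height * 3)
            throw new ArgumentException("source is too small for the given size", nameof(rgb16));

        // Pad each output row to a multiple of 4 bytes.
        int rowBytes = (width * 3 + 3) & ~3;
        var output = new byte[rowBytes * height];

        // Every iteration writes only to its own row, so no synchronization
        // is needed between iterations.
        Parallel.For(0, height, y =>
        {
            int src = y * width * 3;
            int dst = y * rowBytes;
            for (int x = 0; x < width; x++, src += 3, dst += 3)
            {
                output[dst] = (byte)(rgb16[src + 2] >> 8);     // blue
                output[dst + 1] = (byte)(rgb16[src + 1] >> 8); // green
                output[dst + 2] = (byte)(rgb16[src] >> 8);     // red
            }
        });

        stride = rowBytes;
        return output;
    }
}
```

For small images the scheduling overhead of `Parallel.For` may outweigh the gain, which is one argument for making the parallel path optional.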
Describe alternatives you've considered
I tried to avoid using pointers and just run the code in parallel with a single buffer. The increase in speed is only marginal. The problem is that we still need to call ToByteArray() and materialize an enormous byte array with millions of bytes when we should really just be pointing to the bytes.
Additional context
This is something I actually use in production code and I would love to see instantaneous conversion as part of the library. Happy to contribute more if necessary.