Experiment with ImageSearchAll with tolerance to color variation #36

wind0204 · 2023-12-14T04:59:15Z

I didn't clean the comments, nor build 32-bit binaries, Please take a look into this pull request.

Best regards.

iseahound · 2023-12-14T11:25:08Z

I've only read through the code and it seems like it should work. I'll have to do some testing later. Could you change the line endings for ImagePut back? They got switched in your latest commit

wind0204 · 2023-12-14T12:10:12Z

I've only read through the code and it seems like it should work. I'll have to do some testing later. Could you change the line endings for ImagePut back? They got switched in your latest commit

Aha! That was why the diff read so crazy on Github.

iseahound · 2023-12-14T13:56:45Z

Can you explain the algorithm for finding the focused pixel? I'm glad you figured out why this is important, for a sprite the first few pixels are invisible, so the imagesearch would run really slowly.

Should this be s + w * h?
unsigned int * last_pixel = s + w + h * w;

I'll move the algorithm for finding the focused pixel out of the ImageSearch (because there can be many ways to find a focused pixel), and will be sure to credit you once I figure out what yours does.

wind0204 · 2023-12-14T14:12:26Z

Can you explain the algorithm for finding the focused pixel? I'm glad you figured out why this is important, for a sprite the first few pixels are invisible, so the imagesearch would run really slowly.

The algorithm came from no good reasoning, just thought it would be nice to locate a pixel a little bit far from the top-left corner that is not transparent, and use it as a quick reference.

Should this be s + w * h? unsigned int * last_pixel = s + w + h * w;

That looks simple and readable, but then the loop will stop right before the bottommost line.

…mine

iseahound · 2023-12-14T21:24:06Z

Okay, I'll merge but expect major changes to your code.

iseahound · 2023-12-14T22:03:35Z

I'm not sure if your code even runs, but the logic looks like it's there. ImageSearch doesn't seem to work, nor does ImageSearchAll.

Reasoning behind some of the changes:

The code to find a focused pixel was moved because it is a different problem than ImageSearch
I'm not sure why the focused pixel was being stored in the result array
Not sure if there's a difference between storing x | y << 16 or doing y = offset / width and x = offset % width
Finally, it's faster to have separate C code for the 0 variation case. Each case has to be individually vectorized anyways for a 2x or more speedup.

iseahound · 2023-12-14T23:26:57Z

Benchmarks!

Line 1 - Original ImageSearch (No variation)
Line 2 - Your code (No variation) https://godbolt.org/z/qGexdGqMn
Line 3 - Your code (variation = 1)

wind0204 · 2023-12-15T04:29:06Z

I'm not sure if your code even runs, but the logic looks like it's there. ImageSearch doesn't seem to work, nor does ImageSearchAll.

It ran on my wife's Windows system, with the game macro I made...

* The code to find a focused pixel was moved because it is a different problem than ImageSearch

I guess that's better, it adds the maintainability and organized look to the code base.

* I'm not sure why the focused pixel was being stored in the result array

I did it to prevent another search for the focused pixel in a repeatedly used sprite, just in case the sprite is big and bizarre especially with a lot of transparent part in the middle.

* Not sure if there's a difference between storing `x | y << 16` or doing `y = offset / width` and `x = offset % width`
  Finally, it's faster to have separate C code for the 0 variation case. Each case has to be individually vectorized anyways for a 2x or more speedup.

Awesome! Have you managed to vectorize it without using inline assembly? which SIMD instruction did it use?

iseahound · 2023-12-15T05:03:17Z

Oh I forgot to give context for the benchmarks. Line 2 doesn't matter, it will always default to the dedicated ImageSearch for no variation. It's running at 30% of the "max speed" which seems good to me. Thanks!

iseahound · 2023-12-15T05:05:46Z

Awesome! Have you managed to vectorize it without using inline assembly?

Could you expand on your cmov comment? I saw it pop up a few times with or without div, I'm not too familiar with assembly here.

which SIMD instruction did it use?

Ah not yet! I want to get the specific optimizations in place without using SIMD. Vectorized code is pretty brittle, meaning it's hard to modify after it's written.

wind0204 · 2023-12-15T05:27:20Z

Oh I forgot to give context for the benchmarks. Line 2 doesn't matter, it will always default to the dedicated ImageSearch for no variation. It's running at 30% of the "max speed" which seems good to me.

Is the dedicated ImageSearch coming from the code I wrote? The number was horribly low. 14.27 / 137.23 = 10.4%

Awesome! Have you managed to vectorize it without using inline assembly?

Could you expand on your cmov comment? I saw it pop up a few times with or without div, I'm not too familiar with assembly here.

According to godbolt.org, it seems that GCC 11.4 with at minimum -O1 uses CMOV with my code and your code uses DIV, and I read on the other day that DIV is not good at wide parallel execution on stackoverflow.com. I actually don't have any idea if CMP + CMOV can outperform DIV or not. I have to admit that I just believed without knowing.

wind0204 · 2023-12-15T05:36:13Z

* Not sure if there's a difference between storing `x | y << 16` or doing `y = offset / width` and `x = offset % width`

You must be talking about the line with "x << 16 | y" to give the result of search for focus pixel back to the function caller. the code is basically compressing the coordinates into 4 bytes, I could've used 8 bytes (which is the size of a pointer in a modern system) though.

iseahound · 2023-12-15T18:38:23Z

Feel free to continue the conversation at: #35

wind0204 added 4 commits December 14, 2023 13:40

Experiment with ImageSearchAll with tolerance to color variation

7c5977d

Update ImageSearch method of BitmapBuffer

e2a6e6f

Remove an unnecessary assignment to a variable

d0f803c

Recover the lost work on my wife's windows system

c3eac47

wind0204 added 2 commits December 14, 2023 21:01

Accelerate when the argument variation is 0

85059f1

Put back the carriage return characters on each line ending

80917a4

As you learn English, the quality of the comments improves

22e8f78

Fix a bug where random memory location will be touched in focus-deter…

24e1e0a

…mine

iseahound merged commit 2fbbe63 into iseahound:master Dec 14, 2023

iseahound mentioned this pull request Dec 15, 2023

Improve ImageSearch 2 #35

Open

Repository owner locked as resolved and limited conversation to collaborators Dec 15, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Experiment with ImageSearchAll with tolerance to color variation #36

Experiment with ImageSearchAll with tolerance to color variation #36

wind0204 commented Dec 14, 2023 •

edited

Loading

iseahound commented Dec 14, 2023

wind0204 commented Dec 14, 2023 •

edited

Loading

iseahound commented Dec 14, 2023

wind0204 commented Dec 14, 2023 •

edited

Loading

iseahound commented Dec 14, 2023

iseahound commented Dec 14, 2023 •

edited

Loading

iseahound commented Dec 14, 2023

wind0204 commented Dec 15, 2023 •

edited

Loading

iseahound commented Dec 15, 2023 •

edited

Loading

iseahound commented Dec 15, 2023 •

edited

Loading

wind0204 commented Dec 15, 2023 •

edited

Loading

wind0204 commented Dec 15, 2023 •

edited

Loading

iseahound commented Dec 15, 2023

Experiment with ImageSearchAll with tolerance to color variation #36

Experiment with ImageSearchAll with tolerance to color variation #36

Conversation

wind0204 commented Dec 14, 2023 • edited Loading

iseahound commented Dec 14, 2023

wind0204 commented Dec 14, 2023 • edited Loading

iseahound commented Dec 14, 2023

wind0204 commented Dec 14, 2023 • edited Loading

iseahound commented Dec 14, 2023

iseahound commented Dec 14, 2023 • edited Loading

iseahound commented Dec 14, 2023

wind0204 commented Dec 15, 2023 • edited Loading

iseahound commented Dec 15, 2023 • edited Loading

iseahound commented Dec 15, 2023 • edited Loading

wind0204 commented Dec 15, 2023 • edited Loading

wind0204 commented Dec 15, 2023 • edited Loading

iseahound commented Dec 15, 2023

wind0204 commented Dec 14, 2023 •

edited

Loading

wind0204 commented Dec 14, 2023 •

edited

Loading

wind0204 commented Dec 14, 2023 •

edited

Loading

iseahound commented Dec 14, 2023 •

edited

Loading

wind0204 commented Dec 15, 2023 •

edited

Loading

iseahound commented Dec 15, 2023 •

edited

Loading

iseahound commented Dec 15, 2023 •

edited

Loading

wind0204 commented Dec 15, 2023 •

edited

Loading

wind0204 commented Dec 15, 2023 •

edited

Loading