Improve ImageSearch #33

Closed
iseahound opened this issue Sep 26, 2023 · 9 comments
Labels
enhancement New feature or request

Comments

@iseahound
Owner

ImageSearch has two components:

  • Linear search
  • Subimage matching

Prioritize:

  • Speed up linear search
  • Prevent subimage matching (reject candidates before the full comparison runs)
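
For reference, the two components can be sketched in C roughly as below. This is a minimal baseline and not the code ImagePut ships: the function names are hypothetical, and it assumes 32-bit pixels stored row by row with no padding.

```c
#include <stdint.h>
#include <stddef.h>

/* Subimage matching: compare every template pixel against the
 * haystack region whose top-left corner is (x, y). */
static int subimage_matches(const uint32_t *hay, uint32_t hay_w,
                            const uint32_t *needle, uint32_t nw, uint32_t nh,
                            uint32_t x, uint32_t y)
{
    for (uint32_t j = 0; j < nh; j++)
        for (uint32_t i = 0; i < nw; i++)
            if (hay[(size_t)(y + j) * hay_w + (x + i)] != needle[(size_t)j * nw + i])
                return 0;
    return 1;
}

/* Linear search: walk every position where the template fits and run
 * subimage matching only where the first pixel already agrees.
 * Returns a pointer to the top-left pixel of the first match, or NULL. */
const uint32_t *image_search_naive(const uint32_t *hay, uint32_t hw, uint32_t hh,
                                   const uint32_t *needle, uint32_t nw, uint32_t nh)
{
    if (nw == 0 || nh == 0 || nw > hw || nh > hh)
        return NULL;

    for (uint32_t y = 0; y + nh <= hh; y++)
        for (uint32_t x = 0; x + nw <= hw; x++)
            if (hay[(size_t)y * hw + x] == needle[0] &&
                subimage_matches(hay, hw, needle, nw, nh, x, y))
                return &hay[(size_t)y * hw + x];

    return NULL;
}
```

Every position whose first pixel agrees pays for a full comparison here, which is why both priorities above matter: a faster scan, and fewer candidates that reach `subimage_matches`.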

Here is some code to help you:

#include *i ImagePut%A_TrayMenu%.ahk
#include *i ImagePut (for v%true%).ahk
#singleinstance force

pic := ImagePutBuffer("[images]\h.png")                ; Load image into a buffer
pic.show() ; or ImageShow(pic)                         ; Show image
if xy := pic.ImageSearch("[images]\g.png") {           ; Search image
    MouseMove xy[1], xy[2]                             ; Move cursor
    Send "{MButton}"                                   ; MsgBox pic[xy*]
} else Tooltip "no"

https://github.com/iseahound/ImagePut/assets/9779668/3c2587da-d6ed-449d-af78-89633e2a7442
https://github.com/iseahound/ImagePut/assets/9779668/25a63f1b-8fb2-4c2a-84c4-34867b1e5ece

Useful ideas:
Boyer-Moore (and string search algorithms in general)
Multi-dimensional string search
Has ImageSearch been a priority in the public domain?

Some research papers would be immensely valuable.

Current performance is about 19 fps, compared to PixelSearch's 4000 fps. The target is 800 fps or higher.

@iseahound iseahound added the enhancement New feature or request label Sep 26, 2023
@iseahound
Owner Author

Benchmarks (on a slow laptop):

Python's template matching (from OpenCV): 3.43 fps
First Attempt at ImageSearch: 13.43 fps

@iseahound
Owner Author

This one does about 200 fps on the test images by preventing the subimage matching.

https://godbolt.org/z/vP679rhYP

The question is: is matching 2/3/4 best? For this script, matching 1,4 is good. But only 4 can be optimized. 2 and 3 may be too correlated.

@iseahound
Owner Author

iseahound commented Sep 30, 2023

This one is the fastest: https://godbolt.org/z/rGKczzYqs

Rank:

  1. focus
  2. start pixel
  3. range

@iseahound
Owner Author

#include *i ImagePut%A_TrayMenu%.ahk
#include *i ImagePut (for v%true%).ahk
#singleinstance force

; a := imageputwindow("g.png")

pic := ImagePutBuffer("h.png")                ; Load image into a buffer
hwnd := pic.show() ; or ImageShow(pic)        ; Show image
if xy := pic.ImageSearch3("g.png") {          ; Search image
    x := xy[1], y := xy[2]
    MouseMove x, y
    WinGetPos &wx, &wy,,, hwnd
    ImagePutWindow("g.png"
       , x ", " y
       , [wx + x, wy + y]) 
} else Tooltip "no"

@iseahound
Owner Author

or

pic := ImagePutBuffer("h.png")                ; Load image into a buffer
hwnd := pic.show() ; or ImageShow(pic)        ; Show image
if xy := pic.ImageSearch("g.png") {           ; Search image
    x := xy[1], y := xy[2]
    MouseMove x, y
    ImagePutWindow("g.png"
       , x ", " y
       , [x,y,,, hwnd])
} else Tooltip "no"

with

      try dpi := DllCall("SetThreadDpiAwarenessContext", "ptr", -3, "ptr")
      if IsObject(pos) && pos.HasKey(5) {
         pos[5] := (hwnd := WinExist(pos[5])) ? hwnd : pos[5]
         VarSetCapacity(rect, 16, 0)
         DllCall("GetClientRect", "ptr", pos[5], "ptr", &rect)
         DllCall("ClientToScreen", "ptr", pos[5], "ptr", &rect)
         x += NumGet(rect, 0, "int")
         y += NumGet(rect, 4, "int")
      }
      try DllCall("SetThreadDpiAwarenessContext", "ptr", dpi, "ptr")

?

@iseahound
Owner Author

Yeah, the [x, y, w, h, r] notation seems to be ideal, as it would allow screenshots relative to a window as well.

@iseahound
Owner Author

Another note to rank:

  1. focus
  2. start pixel - if removed together with 3, there is a very large speed change.
  3. range - if removed, there is not much speed change.

This is because the start pixel effectively acts as another check!
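
One way to read that ranking is as the order of cheap rejection tests run before the full comparison. The sketch below is an interpretation and not the code behind the godbolt links. It assumes "focus" means a probe pixel at (fx, fy) inside the template, "start pixel" means the template's top-left pixel, and "range" means restricting candidates to positions where the template fits. The function name is hypothetical, and pixels are assumed to be 32-bit with no row padding.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Returns a pointer to the top-left pixel of the first match, or NULL.
 * (fx, fy) is the focus pixel, given in template coordinates. */
const uint32_t *image_search_focus(const uint32_t *hay, uint32_t hw, uint32_t hh,
                                   const uint32_t *needle, uint32_t nw, uint32_t nh,
                                   uint32_t fx, uint32_t fy)
{
    if (nw == 0 || nh == 0 || nw > hw || nh > hh || fx >= nw || fy >= nh)
        return NULL;

    const uint32_t focus = needle[(size_t)fy * nw + fx];
    const uint32_t start = needle[0];

    /* Range (assumed meaning): the loop bounds only admit positions
     * where the whole template lies inside the haystack. */
    for (uint32_t y = 0; y + nh <= hh; y++) {
        for (uint32_t x = 0; x + nw <= hw; x++) {
            const uint32_t *p = hay + (size_t)y * hw + x;

            /* 1. Focus: one probe inside the template rejects most positions. */
            if (p[(size_t)fy * hw + fx] != focus)
                continue;

            /* 2. Start pixel: a second, independent cheap check. */
            if (p[0] != start)
                continue;

            /* 3. Full subimage comparison, one template row at a time. */
            uint32_t j = 0;
            while (j < nh && memcmp(p + (size_t)j * hw, needle + (size_t)j * nw,
                                    (size_t)nw * sizeof(uint32_t)) == 0)
                j++;
            if (j == nh)
                return p;
        }
    }
    return NULL;
}
```

With two independent single-pixel tests in front of it, the full comparison runs only on positions that pass both, which would be consistent with the start pixel acting like another check.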

@iseahound
Owner Author

         ; Search for the address of the first matching image.
         address := DllCall(code, "ptr", this.ptr, "uint", this.width, "uint", this.height
            , "ptr", image.ptr, "uint", image.width, "uint", image.height, "uint", image.width//2, "uint", image.height//2
            , "cdecl ptr")
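
Since the call returns an address and not coordinates, the caller still has to turn it back into an (x, y) pair. A minimal sketch of that conversion in C follows, assuming the address points at the top-left pixel of the match inside the searched buffer, that pixels are 32-bit, and that rows are tightly packed. The `match_xy` struct and the `address_to_xy` name are hypothetical.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical result type: found is 0 when there was no match. */
typedef struct {
    int found;
    uint32_t x;
    uint32_t y;
} match_xy;

/* Convert a match address inside a width x height pixel buffer
 * into pixel coordinates relative to the buffer's top-left corner. */
match_xy address_to_xy(const uint32_t *base, uint32_t width, uint32_t height,
                       const uint32_t *match)
{
    match_xy r = { 0, 0, 0 };
    if (base == NULL || match == NULL || width == 0)
        return r;

    ptrdiff_t offset = match - base;            /* offset in pixels, not bytes */
    if (offset < 0 || (size_t)offset >= (size_t)width * height)
        return r;                               /* address lies outside the buffer */

    r.found = 1;
    r.x = (uint32_t)((size_t)offset % width);   /* column */
    r.y = (uint32_t)((size_t)offset / width);   /* row */
    return r;
}
```

In byte terms this is (address - ptr) // 4 followed by a modulo and a division by the buffer width.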

@iseahound
Owner Author

All of the above has been completed. The only useful data is:

Rank:

  1. focus
  2. start pixel
  3. range
