The puzzle-solving algorithm is plain brute force. (Better ideas are welcome!)
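As a rough illustration of what a brute-force search could look like, here is a minimal sketch. It is not the code used in this repo. It assumes the usual Breach Protocol rules (first pick from the top row, then alternate between the current column and the current row, no cell reused), and the names `enumerateBuffers`, `containsSequence` and `findBufferContaining` are made up for this example.

```javascript
// Hypothetical helper: yield every buffer that can be built from `grid`.
// Assumed rules: the first pick comes from the top row, later picks alternate
// between the current column and the current row, and no cell is used twice.
function* enumerateBuffers(grid, bufferSize) {
  const used = grid.map((row) => row.map(() => false));

  function* walk(buffer, line, pickFromRow) {
    if (buffer.length > 0) yield buffer.slice();
    if (buffer.length === bufferSize) return;
    const limit = pickFromRow ? grid[line].length : grid.length;
    for (let i = 0; i < limit; i++) {
      const r = pickFromRow ? line : i;
      const c = pickFromRow ? i : line;
      if (used[r][c]) continue;
      used[r][c] = true;
      buffer.push(grid[r][c]);
      // A pick in a row locks the next pick to that cell's column, and vice versa.
      yield* walk(buffer, pickFromRow ? c : r, !pickFromRow);
      buffer.pop();
      used[r][c] = false;
    }
  }

  yield* walk([], 0, true);
}

// True when `sequence` appears in `buffer` as a contiguous run.
function containsSequence(buffer, sequence) {
  for (let start = 0; start + sequence.length <= buffer.length; start++) {
    if (sequence.every((byte, i) => buffer[start + i] === byte)) return true;
  }
  return false;
}

// Return the first enumerated buffer that contains `sequence`, or null.
function findBufferContaining(grid, bufferSize, sequence) {
  for (const buffer of enumerateBuffers(grid, bufferSize)) {
    if (containsSequence(buffer, sequence)) return buffer;
  }
  return null;
}
```

For example, `findBufferContaining([["1C", "55"], ["BD", "E9"]], 3, ["E9", "BD"])` walks every legal path on that 2x2 grid. The search space grows quickly with the buffer size, which is the price of brute force.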
The target sequences are weighted from top to bottom as follows: `1`, `1.1`, `1.2`, ... So the solver goes for the most hits first, and when the number of hits is even, the lower sequences have higher priority.
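The weighting can be sketched as a small scoring function. This is only an illustration of the `1`, `1.1`, `1.2`, ... scheme described above, not the repo's actual code. `weightOf`, `hits` and `scoreBuffer` are hypothetical names, and bytes are assumed to be strings without spaces.

```javascript
// Weight of the target at position `index` (0 = top): 1, 1.1, 1.2, ...
function weightOf(index) {
  return 1 + index / 10;
}

// True when `sequence` appears in `buffer` as a contiguous run.
// Assumes each byte is a string without spaces, e.g. "1C" or "55".
function hits(buffer, sequence) {
  return ` ${buffer.join(" ")} `.includes(` ${sequence.join(" ")} `);
}

// Sum the weights of every target sequence that the buffer completes.
function scoreBuffer(buffer, targets) {
  let score = 0;
  targets.forEach((sequence, index) => {
    if (hits(buffer, sequence)) score += weightOf(index);
  });
  return score;
}
```

With three targets, a buffer that completes the top two scores about 2.1, while a buffer that completes only the bottom one scores 1.2. Between two buffers with one hit each, the one hitting the lower target wins.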
The OCR part took the most time. I initially used the default English OCR model provided by tesseract, but it failed randomly (like recognizing "55" as "5") and the success rate was below 50%. Eventually I trained a model myself, using tesstrain. Instead of recognizing single English characters, I let the program treat each byte as a whole, so the computer actually thinks of "55" or "1C" as a single character in a mysterious language. The self-trained model worked better, but it is still not perfect. TBH I think maybe tesseract is not the best option, but since it's the only popular choice in JavaScript and I'm not familiar with WASM, this will be the way to go for now.
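Since the OCR can still misread a byte, one possible safeguard is to check the recognized text before solving. The sketch below is an assumption, not part of this repo: `parseOcrGrid` is a hypothetical helper, the raw text is assumed to be one grid row per line with bytes separated by whitespace, and `KNOWN_BYTES` is an assumed list of valid byte values that may need adjusting.

```javascript
// Assumption: the byte values that can appear in the puzzle. Adjust as needed.
const KNOWN_BYTES = new Set(["1C", "55", "7A", "BD", "E9", "FF"]);

// Hypothetical post-processing step: turn raw OCR text into a grid of bytes
// and report every token that is not a known byte (e.g. "5" instead of "55").
function parseOcrGrid(text) {
  const grid = [];
  const errors = [];
  for (const line of text.split("\n")) {
    const tokens = line.trim().split(/\s+/).filter(Boolean);
    if (tokens.length === 0) continue; // skip blank lines
    const row = grid.length;
    tokens.forEach((token, col) => {
      if (!KNOWN_BYTES.has(token)) errors.push({ row, col, token });
    });
    grid.push(tokens);
  }
  return { grid, errors };
}
```

A non-empty `errors` array would be a signal to ask the user to retake the screenshot or correct the cell by hand rather than solve a wrong puzzle.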
## Local development
Make sure you have `node` and `yarn` installed, then clone the repo and run `yarn start`.
## Screenshots


## Acknowledgement
- https://github.com/kyle-rader/breach for test data
- https://github.com/naptha/tesseract.js which made this web app possible
- https://github.com/tesseract-ocr/tesseract and https://github.com/tesseract-ocr/tesstrain (tesstrain made training the model a lot easier)