-
-
Notifications
You must be signed in to change notification settings - Fork 22k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Core] Optimize String::replace
methods
#92546
Conversation
@@ -3986,54 +3986,161 @@ String String::format(const Variant &values, const String &placeholder) const { | |||
return new_string; | |||
} | |||
|
|||
String String::replace(const String &p_key, const String &p_with) const { | |||
String new_string; | |||
static String _replace_common(const String &p_this, const String &p_key, const String &p_with, bool p_case_insensitive) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the code only differs on which find
method to use I created a helper method to avoid code duplication
Gonna write up a benchmark test in the next few days when I find the time to convert the code to something benchmarkable (won't necessarily catch some of the bottlenecks but should at least catch the benefits of pre-processing and memcpy) |
There managed to write some benchmarks now in fact, using this: Will do some other case tests as well to see exactly how it might improve, but this shows this specific case The improvements will likely be a bit better thanks to other behind the scenes differences this can't test for (COW stuff primarily) |
The results in some cases are hard to evaluate, might be allocation based so hard to benchmark against them and consider the differences with the engine code, will try further benchmarks and see |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tested locally, it works as expected. This provides a slight speedup in a real world scenario (using print_rich()
lots of times, which currently relies on string replacement a lot).
Testing project: test_print_rich.zip
Benchmark
Using optimize=speed lto=full
Linux editor binaries.
Command:
$ hyperfine 'bin/godot.linuxbsd.editor.x86_64.master --path /tmp/4 &> /dev/null' 'bin/godot.linuxbsd.editor.x86_64 --path /tmp/4 &> /dev/null'
Benchmark 1: bin/godot.linuxbsd.editor.x86_64.master --path /tmp/4 &> /dev/null
Time (mean ± σ): 15.030 s ± 0.211 s [User: 10.756 s, System: 3.243 s]
Range (min … max): 14.651 s … 15.292 s 10 runs
Benchmark 2: bin/godot.linuxbsd.editor.x86_64 --path /tmp/4 &> /dev/null <--- This PR
Time (mean ± σ): 14.046 s ± 0.504 s [User: 9.959 s, System: 3.257 s]
Range (min … max): 13.138 s … 14.656 s 10 runs
Summary
bin/godot.linuxbsd.editor.x86_64 --path /tmp/4 &> /dev/null ran
1.07 ± 0.04 times faster than bin/godot.linuxbsd.editor.x86_64.master --path /tmp/4 &> /dev/null
PC specifications
- CPU: Intel Core i9-13900K
- GPU: NVIDIA GeForce RTX 4090
- RAM: 64 GB (2×32 GB DDR5-5800 C30)
- SSD: Solidigm P44 Pro 2 TB
- OS: Linux (Fedora 39)
8031de9
to
53cb3e9
Compare
Performs a single allocation, only when any instances are found, and avoids concatenations and other unnecessary conversions.
53cb3e9
to
cbaec62
Compare
For the record, the unit tests for |
Thanks! |
Thank you! |
Performs a single allocation, only when any instances are found, and avoids concatenations and other unnecessary conversions.
Haven't done any benchmarking yet but as far as I know there's no optimization that can handle the concatenations in a way that avoids repeated allocations.
Will run the code through QuickBench soon with some of the data in the PR below but have to translate some of the code first.
Related: