Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize CowData and LocalVector functions .insert and .remove_at by using move semantics. #100477

Merged
merged 1 commit into from
Dec 20, 2024

Conversation

Ivorforce
Copy link
Contributor

@Ivorforce Ivorforce commented Dec 16, 2024

This PR uses move semantics to move regions in CowData, avoiding repeated copy assignments and destructions.

Safer, but still very fast, version of #100468. #100468 may still be re-opened if needed at some point, but it turns out most of the performance loss from these functions can already be avoided with this simple change.

Beneficiaries of this change include Array, PackedStringArray, and other internal Vector types that use non-trivial elements.

Additionally, some minor inefficiencies are addressed in Vector and LocalVector implementations.

Benchmark

I benchmarked the following code:

	int N = 10000;
	int M = 1000;
	Array vectors[N];
	for (int i = 0; i < N; i++) {
		for (int j = 0; j < M; ++j) {
			vectors[i].push_back("Test");
		}
	}

	{
		auto t0 = std::chrono::high_resolution_clock::now();
		for (int i = 0; i < N; i ++) {
			vectors[i].remove_at(0);
		}
		auto t1 = std::chrono::high_resolution_clock::now();
		std::cout << std::chrono::duration_cast<std::chrono::milliseconds>(t1 - t0).count() << "ms\n";
	}

	{
		auto t0 = std::chrono::high_resolution_clock::now();
		for (int i = 0; i < N; i ++) {
			vectors[i].insert(0, vectors[0][0]);
		}
		auto t1 = std::chrono::high_resolution_clock::now();
		std::cout << std::chrono::duration_cast<std::chrono::milliseconds>(t1 - t0).count() << "ms\n";
	}

This yields:

As such, it can be concluded that remove_at can be 5.5x as fast as before, and insert can be 8x as fast.

Addendum

Arguably, push_back and pop_back() should be added to CowData, because quite a few callers have various ways of implementing these operations themselves. They could be accelerated with move semantics quite easily, and an efficient implementation would be considerably faster than insert and remove_at.

@Ivorforce Ivorforce requested a review from a team as a code owner December 16, 2024 16:33
@Ivorforce Ivorforce force-pushed the cowdata-move-insert-n-remove branch 2 times, most recently from bac918b to 58e0957 Compare December 16, 2024 16:45
@clayjohn clayjohn requested a review from hpvb December 16, 2024 21:51
@Ivorforce Ivorforce force-pushed the cowdata-move-insert-n-remove branch from 58e0957 to a636c04 Compare December 17, 2024 13:10
Copy link
Member

@hpvb hpvb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am happy with this. This will safely use move constructor and assignment overloads where available, and fall back to the normal T& and const T assignment overloads otherwise.

As we add more move constructors and assignment operators to core types this will eventually be as fast as the original version, but without causing any uncertainty about safety in the meantime.

This is at worst a no-op, but since we already have move constructors on some of our more expensive types it won't be!

Copy link
Member

@AThousandShips AThousandShips left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@hpvb
Copy link
Member

hpvb commented Dec 19, 2024

I did some benchmarks using https://github.com/godotengine/godot-benchmarks between master and this pr + #100483 and after running both master and that combination several times I can confirm that emitting signals is roughly ~10% faster with these changes compared to master, I also think that I see an improvement in the ObjectDB benchmarks but with 64 threads on my computer there's a ton of noise in the threads and it is hard to be sure.

Function Callable seems to be 1% slower though, even after repeated runs. I can't explain it.

@Repiteo Repiteo merged commit 151e7fc into godotengine:master Dec 20, 2024
20 checks passed
@Repiteo
Copy link
Contributor

Repiteo commented Dec 20, 2024

Thanks!

@Ivorforce Ivorforce deleted the cowdata-move-insert-n-remove branch December 20, 2024 08:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants