-
-
Notifications
You must be signed in to change notification settings - Fork 22k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Argument passing correctness - part 1 - the 'core' module and dependencies #75819
base: master
Are you sure you want to change the base?
Conversation
86d120b
to
50d9051
Compare
Just one question - how do you prefer to pass the following classes: |
50d9051
to
c4f8890
Compare
General sane rules for input parameters would be: Now, I have a concern about the seeming randomness of criteria in the current codebase. In some cases the author may have deliberately picked one way of passing the argument, aware of the implications. However, that intent is lost if there's no comment about it. I'd like the codebase to be more consistent in this matter, so I appreciate this PR, but I would like to hear others' ideas on how to approach the avoidance of performance regressions in key areas. |
c4f8890
to
a95cafc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not at my PC atm, just two thoughts i had while scrolling through the diff.
About when to use refs:
If a type is pointer size or less and has no copy constructor it should be passed by value, unless it is an out parameter.
Otherwise it is a tradeoff between the size of the type how many parameters the function has what archetecture is targeted and what the copy constructor actually does.
For godot, most large data structures string, vector, array, ... are ref counted and copy on write, so making copies is generally pretty cheap.
Also what i think is happening with the unit tests (tho i am kinda surprised things compile, if i am correct) is that methods bound in ClassDB, see the BIND_METHOD macro, cant have reference parameters, or scripts and other bindings cant call them correctly
@@ -363,7 +363,7 @@ Error JSON::_get_token(const char32_t *p_str, int &index, int p_len, Token &r_to | |||
return ERR_PARSE_ERROR; | |||
} | |||
|
|||
Error JSON::_parse_value(Variant &value, Token &token, const char32_t *p_str, int &index, int p_len, int &line, int p_depth, String &r_err_str) { | |||
Error JSON::_parse_value(Variant &value, const Token &token, const char32_t *p_str, int &index, int p_len, int &line, int p_depth, String &r_err_str) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was this maybe a mutable reference for consistency with other simlar json methods?
@@ -578,7 +578,7 @@ class Geometry3D { | |||
max = x2; \ | |||
} | |||
|
|||
_FORCE_INLINE_ static bool planeBoxOverlap(Vector3 normal, float d, Vector3 maxbox) { | |||
_FORCE_INLINE_ static bool planeBoxOverlap(const Vector3 &normal, float d, const Vector3 &maxbox) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this function taken from third-party code we may want stay similar to?
I agree with your overall analysis, but I'd want to qualify this specific statement a bit. Classes backed by |
While I appreciate your efforts, I would advise against doing such broad passes as a first-time contributor. It will take a lot of effort to review such a PR, and since you don't have a track record with the project yet, we would have to pay closer attention to possible mistakes. So keep doing it in smaller batches, like you do here. Preferably targeting individual areas of the engine, so it can be reviewed by a small number of people. |
a95cafc
to
912afbf
Compare
912afbf
to
b5e3724
Compare
Oh, I don't mean that I want to make a big PR that contains fixes in the whole project. Like in this PR, the following commits will be divided in smaller parts - of course if this one will be accepted :) Little about me - to be more reliable - I started programming around 1995 with the first game published in 1997 (Amiga - Dreenshar), then working as a programmer for various studios or freelancing, as for today. Now, back to this PR - after the previous comments and problems with unit tests the refactor was redone only for obvious types (StringName, PackedByteArray, IPAddress, Vector3, etc.). Classes Object, ObjectID, RID and Ref are left intact for future PR after deeper analysis.
Thanks for the tip, but that's not the problem - there are many other methods that are bounded to ClassDB and using CollisionPolygon3D::set_polygon(const Vector<Point2> &p_polygon);
Path3D::set_curve(const Ref<Curve3D> &p_curve);
RayCast3D::set_target_position(const Vector3 &p_point);
Node::set_name(const String &p_name);
Viewport::set_global_canvas_transform(const Transform2D &p_transform); etc... The main reason of this issue was a Finally, according to all the arguments that are confirming or negating the sense of changing the method of passing big structures as references, please take a look at this very simple code: void pass_stringname_by_ref(const StringName& p_value) {
//static StringName inner;
//inner = p_value;
}
void pass_stringname_by_value(StringName p_value) {
//static StringName inner;
//inner = p_value;
}
void pass_vector_of_stringnames_by_ref(const Vector<StringName>& p_value) {
//static Vector<StringName> inner;
//inner = p_value;
}
void pass_vector_of_stringnames_by_value(Vector<StringName> p_value) {
//static Vector<StringName> inner;
//inner = p_value;
}
void arguments_pass_test()
{
StringName s = "Lorem ipsum dolor sit amet, consectetur adipiscing elit. \
Morbi porta neque ut feugiat finibus. \
Class aptent taciti sociosqu ad litora torquent per conubia nostra, per inceptos himenaeos. \
Mauris vestibulum mollis felis vel semper. \
Nunc interdum finibus lacus vel sodales. \
Orci varius natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus. \
Nullam ornare justo ac euismod fringilla. \
Duis blandit augue vel aliquet dapibus. \
In faucibus nec orci eu bibendum.";
Vector<StringName> v;
for (int i = 0; i < 1024; i++) {
v.push_back(s);
}
const uint64_t count = 10000000000;
uint64_t begin = 0;
uint64_t end = 0;
uint64_t i = 0;
begin = OS::get_singleton()->get_ticks_usec();
for (i = 0; i < count; i++) {
pass_stringname_by_value(s);
}
end = OS::get_singleton()->get_ticks_usec();
OS::get_singleton()->print("Passing %llu StringNames by value took %llu microseconds (%.3f seconds)\n", count, end - begin, static_cast<double>(end - begin) / 1000000.0);
begin = OS::get_singleton()->get_ticks_usec();
for (i = 0; i < count; i++) {
pass_stringname_by_ref(s);
}
end = OS::get_singleton()->get_ticks_usec();
OS::get_singleton()->print("Passing %llu StringNames by const ref took %llu microseconds (%.3f seconds)\n", count, end - begin, static_cast<double>(end - begin) / 1000000.0);
begin = OS::get_singleton()->get_ticks_usec();
for (i = 0; i < count; i++) {
pass_vector_of_stringnames_by_value(v);
}
end = OS::get_singleton()->get_ticks_usec();
OS::get_singleton()->print("Passing %llu Vectors of StringNames by value took %llu microseconds (%.3f seconds)\n", count, end - begin, static_cast<double>(end - begin) / 1000000.0);
begin = OS::get_singleton()->get_ticks_usec();
for (i = 0; i < count; i++) {
pass_vector_of_stringnames_by_ref(v);
}
end = OS::get_singleton()->get_ticks_usec();
OS::get_singleton()->print("Passing %llu Vectors of StringNames by const ref took %llu microseconds (%.3f seconds)\n", count, end - begin, static_cast<double>(end - begin) / 1000000.0);
} The output:
|
Supersedes #51156 Also that benchmark inlines the function calls, after which the by ref loops become empty and get optimized away entirely. If the functions are moved to a different compilation unit to avoid this these loops take about 10s on my system:
|
I'm starting to think that we should have an explicit syntax for copying potentially expensive objects. Just a brainstormy one cent from me. |
Well that already exists in the form of making the (copy) constructor Tho the really high iteration count makes the whole thing seem to be a way bigger issue than it really is, copying one of these things only takes like 10-20 ns. For example
|
I'm wondering how much the atomics/locking cause cross thread performance
degradation in addition to what your benchmarks are telling.
Not easy to realistically measure besides using an actual test project, I
guess.
…On Thu, May 4, 2023, 17:11 RedworkDE ***@***.***> wrote:
I'm starting to think that we should have an explicit syntax for copying
potentially expensive objects. Just a brainstormy one cent from me.
Well that already exists in the form of making the (copy) constructor
explicit and making the assignment operators take its argument by value.
The compiler will then force you to write out the constructor when a copy
is necessary.
Tho the really high iteration count makes the whole thing seem to be a way
bigger issue than it really is, copying one of these things only takes like
10-20 ns. For example LocalVector would be a way better candidate for
such a treatment: (note the decreased iteration count)
Passing 10000000 StringNames by value took 188540 microseconds (0.189 seconds, 18.854 ns each)
Passing 10000000 StringNames by const ref took 19246 microseconds (0.019 seconds, 1.925 ns each)
Passing 10000000 int by value took 14029 microseconds (0.014 seconds, 1.403 ns each)
Passing 10000000 int by const ref took 14860 microseconds (0.015 seconds, 1.486 ns each)
Passing 10000000 Vectors of 1024 int by value took 146112 microseconds (0.146 seconds, 14.611 ns each)
Passing 10000000 Vectors of 1024 int by const ref took 10375 microseconds (0.010 seconds, 1.038 ns each)
Passing 10000000 LocalVectors of 1024 int by value took 7814799 microseconds (7.815 seconds, 781.480 ns each)
Passing 10000000 LocalVectors of 1024 int by const ref took 10034 microseconds (0.010 seconds, 1.003 ns each)
Passing 10000000 Vectors of 1024 StringName by value took 160938 microseconds (0.161 seconds, 16.094 ns each)
Passing 10000000 Vectors of 1024 StringName by const ref took 13868 microseconds (0.014 seconds, 1.387 ns each)
Passing 10000000 LocalVectors of 1024 StringName by value took 185427378 microseconds (185.427 seconds, 18542.738 ns each)
Passing 10000000 LocalVectors of 1024 StringName by const ref took 12711 microseconds (0.013 seconds, 1.271 ns each)
—
Reply to this email directly, view it on GitHub
<#75819 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AC2AFNRERW65XLFWSPFXKULXEPBKVANCNFSM6AAAAAAWXM62BU>
.
You are receiving this because your review was requested.Message ID:
***@***.***>
|
I'm a little confused about your uncertainty regarding these changes... Game development was always a fight for an every ns and always be like that. Passing large structures to function by reference will be always faster than passing them by value. There are no ifs and buts. Of course, I agree, that making that changes on the entire engine at once is highly dangerous, but that's not my point. I just want to show you, how much we can gain just by implementing this in the right way. Let's make a clean rules which classes/structures should be passed in this way and I, with pleasure, prepare separate PRs for appropriate code segments, that will be easy to check. |
As this is an enhancement and the 4.1 release date looms, I will move the PR's milestone to 4.2. The PR needs a rebase too. |
Fixes for all, not optimal function's arguments passing in the 'core' module.