Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve NavigationLink3DGizmoPlugin::redraw performance #101424

Conversation

kiroxas
Copy link
Contributor

@kiroxas kiroxas commented Jan 11, 2025

Reduced sin/cos calls by 4x.
Correctly resize the vector to only have one allocation.

Spotted during the code review here.
Did not increase the number of points for the circle, as the bottleneck mentionned in the PR by @smix8 may be on the rendering side (too many points).

@kiroxas kiroxas requested a review from a team as a code owner January 11, 2025 07:30
@kiroxas kiroxas force-pushed the improveNavigationLink3DGizmoPlugin_redrawPerformance branch 2 times, most recently from 50f3b5e to 6a1faf7 Compare January 11, 2025 08:09
@Chaosus Chaosus added this to the 4.x milestone Jan 11, 2025
Copy link
Member

@Ivorforce Ivorforce left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are making a mistake where you are treating resize as reserve. If you use resize you'll need to set elements in-place instead of calling push_back. Otherwise, you'll end up with twice the elements :)

@smix8
Copy link
Contributor

smix8 commented Jan 11, 2025

If we resize() instead of reserve() could as well use a ptrw() instead of push_back() which should also be slightly faster with the lines CoW Vector. Looking at the debug render PR I did that there as well so it would fit with the update.

@kiroxas
Copy link
Contributor Author

kiroxas commented Jan 11, 2025

You are making a mistake where you are treating resize as reserve. If you use resize you'll need to set elements in-place instead of calling push_back. Otherwise, you'll end up with twice the elements :)

I cannot find a reserve in Vector though. :(. I will manually add in place.

@kiroxas
Copy link
Contributor Author

kiroxas commented Jan 11, 2025

If we resize() instead of reserve() could as well use a ptrw() instead of push_back() which should also be slightly faster with the lines CoW Vector. Looking at the debug render PR I did that there as well so it would fit with the update.

I will do that

@kiroxas kiroxas force-pushed the improveNavigationLink3DGizmoPlugin_redrawPerformance branch 3 times, most recently from 1b8bb3a to 53e154a Compare January 11, 2025 12:38
@kiroxas kiroxas force-pushed the improveNavigationLink3DGizmoPlugin_redrawPerformance branch 2 times, most recently from e92ed8f to 861fab6 Compare January 11, 2025 12:57
@smix8 smix8 modified the milestones: 4.x, 4.4 Jan 11, 2025
@Calinou
Copy link
Member

Calinou commented Jan 11, 2025

Note that many other gizmos use similar (but not identical) circle-drawing code. If you search for (float)i across the whole codebase, you'll find most of them:

editor/plugins/gizmos/audio_stream_player_3d_gizmo_plugin.cpp:
  88  	for (int i = 0; i < 180; i++) {
  89: 		float a = Math::deg_to_rad((float)i);
  90  		float an = Math::deg_to_rad((float)(i + 1));

editor/plugins/gizmos/collision_shape_3d_gizmo_plugin.cpp:
  364  		for (int i = 0; i <= 360; i++) {
  365: 			float ra = Math::deg_to_rad((float)i);
  366: 			float rb = Math::deg_to_rad((float)i + 1);
  367  			Point2 a = Vector2(Math::sin(ra), Math::cos(ra)) * r;

  430  		for (int i = 0; i < 360; i++) {
  431: 			float ra = Math::deg_to_rad((float)i);
  432: 			float rb = Math::deg_to_rad((float)i + 1);
  433  			Point2 a = Vector2(Math::sin(ra), Math::cos(ra)) * radius;

  501  		for (int i = 0; i < 360; i++) {
  502: 			float ra = Math::deg_to_rad((float)i);
  503: 			float rb = Math::deg_to_rad((float)i + 1);
  504  			Point2 a = Vector2(Math::sin(ra), Math::cos(ra)) * radius;

editor/plugins/gizmos/gpu_particles_collision_3d_gizmo_plugin.cpp:
  175  		for (int i = 0; i <= 360; i++) {
  176: 			float ra = Math::deg_to_rad((float)i);
  177: 			float rb = Math::deg_to_rad((float)i + 1);
  178  			Point2 a = Vector2(Math::sin(ra), Math::cos(ra)) * r;

editor/plugins/gizmos/joint_3d_gizmo_plugin.cpp:
  243  	for (int i = 0; i < 360; i += 10) {
  244: 		float ra = Math::deg_to_rad((float)i);
  245: 		float rb = Math::deg_to_rad((float)i + 10);
  246  		Point2 a = Vector2(Math::sin(ra), Math::cos(ra)) * w;

  265  	for (int i = 0; i < int(ts); i += 5) {
  266: 		float ra = Math::deg_to_rad((float)i);
  267: 		float rb = Math::deg_to_rad((float)i + 5);
  268  		float c = i / 720.0;

editor/plugins/gizmos/vehicle_body_3d_gizmo_plugin.cpp:
  61  	for (int i = 0; i <= 360; i += skip) {
  62: 		float ra = Math::deg_to_rad((float)i);
  63: 		float rb = Math::deg_to_rad((float)i + skip);
  64  		Point2 a = Vector2(Math::sin(ra), Math::cos(ra)) * r;

Additionally, some primitive mesh generation code uses it too:

scene/resources/3d/capsule_shape_3d.cpp:
  43  	for (int i = 0; i < 360; i++) {
  44: 		float ra = Math::deg_to_rad((float)i);
  45: 		float rb = Math::deg_to_rad((float)i + 1);
  46  		Point2 a = Vector2(Math::sin(ra), Math::cos(ra)) * c_radius;

scene/resources/3d/cylinder_shape_3d.cpp:
  43  	for (int i = 0; i < 360; i++) {
  44: 		float ra = Math::deg_to_rad((float)i);
  45: 		float rb = Math::deg_to_rad((float)i + 1);
  46  		Point2 a = Vector2(Math::sin(ra), Math::cos(ra)) * c_radius;

scene/resources/3d/sphere_shape_3d.cpp:
  41  	for (int i = 0; i <= 360; i++) {
  42: 		float ra = Math::deg_to_rad((float)i);
  43: 		float rb = Math::deg_to_rad((float)i + 1);
  44  		Point2 a = Vector2(Math::sin(ra), Math::cos(ra)) * r;

Could you check if these could benefit from similar optimizations?

@kiroxas
Copy link
Contributor Author

kiroxas commented Jan 11, 2025

Note that many other gizmos use similar (but not identical) circle-drawing code. If you search for (float)i across the whole codebase, you'll find most of them:

editor/plugins/gizmos/audio_stream_player_3d_gizmo_plugin.cpp:
  88  	for (int i = 0; i < 180; i++) {
  89: 		float a = Math::deg_to_rad((float)i);
  90  		float an = Math::deg_to_rad((float)(i + 1));

editor/plugins/gizmos/collision_shape_3d_gizmo_plugin.cpp:
  364  		for (int i = 0; i <= 360; i++) {
  365: 			float ra = Math::deg_to_rad((float)i);
  366: 			float rb = Math::deg_to_rad((float)i + 1);
  367  			Point2 a = Vector2(Math::sin(ra), Math::cos(ra)) * r;

  430  		for (int i = 0; i < 360; i++) {
  431: 			float ra = Math::deg_to_rad((float)i);
  432: 			float rb = Math::deg_to_rad((float)i + 1);
  433  			Point2 a = Vector2(Math::sin(ra), Math::cos(ra)) * radius;

  501  		for (int i = 0; i < 360; i++) {
  502: 			float ra = Math::deg_to_rad((float)i);
  503: 			float rb = Math::deg_to_rad((float)i + 1);
  504  			Point2 a = Vector2(Math::sin(ra), Math::cos(ra)) * radius;

editor/plugins/gizmos/gpu_particles_collision_3d_gizmo_plugin.cpp:
  175  		for (int i = 0; i <= 360; i++) {
  176: 			float ra = Math::deg_to_rad((float)i);
  177: 			float rb = Math::deg_to_rad((float)i + 1);
  178  			Point2 a = Vector2(Math::sin(ra), Math::cos(ra)) * r;

editor/plugins/gizmos/joint_3d_gizmo_plugin.cpp:
  243  	for (int i = 0; i < 360; i += 10) {
  244: 		float ra = Math::deg_to_rad((float)i);
  245: 		float rb = Math::deg_to_rad((float)i + 10);
  246  		Point2 a = Vector2(Math::sin(ra), Math::cos(ra)) * w;

  265  	for (int i = 0; i < int(ts); i += 5) {
  266: 		float ra = Math::deg_to_rad((float)i);
  267: 		float rb = Math::deg_to_rad((float)i + 5);
  268  		float c = i / 720.0;

editor/plugins/gizmos/vehicle_body_3d_gizmo_plugin.cpp:
  61  	for (int i = 0; i <= 360; i += skip) {
  62: 		float ra = Math::deg_to_rad((float)i);
  63: 		float rb = Math::deg_to_rad((float)i + skip);
  64  		Point2 a = Vector2(Math::sin(ra), Math::cos(ra)) * r;

Additionally, some primitive mesh generation code uses it too:

scene/resources/3d/capsule_shape_3d.cpp:
  43  	for (int i = 0; i < 360; i++) {
  44: 		float ra = Math::deg_to_rad((float)i);
  45: 		float rb = Math::deg_to_rad((float)i + 1);
  46  		Point2 a = Vector2(Math::sin(ra), Math::cos(ra)) * c_radius;

scene/resources/3d/cylinder_shape_3d.cpp:
  43  	for (int i = 0; i < 360; i++) {
  44: 		float ra = Math::deg_to_rad((float)i);
  45: 		float rb = Math::deg_to_rad((float)i + 1);
  46  		Point2 a = Vector2(Math::sin(ra), Math::cos(ra)) * c_radius;

scene/resources/3d/sphere_shape_3d.cpp:
  41  	for (int i = 0; i <= 360; i++) {
  42: 		float ra = Math::deg_to_rad((float)i);
  43: 		float rb = Math::deg_to_rad((float)i + 1);
  44  		Point2 a = Vector2(Math::sin(ra), Math::cos(ra)) * r;

Could you check if these could benefit from similar optimizations?

I definitely can, but the hardest part is checking how it looks ( to check I did not break anything ) and I have no idea where to look for most of them.

@kiroxas kiroxas force-pushed the improveNavigationLink3DGizmoPlugin_redrawPerformance branch 3 times, most recently from bf3b4a4 to 5d78bdc Compare January 11, 2025 20:18
@kiroxas
Copy link
Contributor Author

kiroxas commented Jan 11, 2025

Ok, I found a way to speed it up greatly again. We're now down from inital 320 cos/sin calls to 80 to now ... 8.
This is probably fast enough now, the limiting code is probably the copy into the gizmo now.

@akien-mga akien-mga requested a review from Ivorforce January 11, 2025 21:27
@kiroxas kiroxas force-pushed the improveNavigationLink3DGizmoPlugin_redrawPerformance branch 4 times, most recently from a69a43a to 2060888 Compare January 12, 2025 08:49
@kiroxas
Copy link
Contributor Author

kiroxas commented Jan 12, 2025

Reduced the cos use again to only 4.
Reduced from 19 400 cycles to 4100 (on average after 3000 iterations) now. So bewteen 4-5x speed up. (Measured only the output vertices of the two circles to the vector, not the whole function)

Copy link
Member

@Ivorforce Ivorforce left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your improvement is impressive! I'd still love to see a final benchmark, but it's pretty clear to me this will be substantially faster than the old implementation.
I have some style comments that should probably be addressed before a merge, but generally I think it's ready (assuming you tested it).

@kiroxas
Copy link
Contributor Author

kiroxas commented Jan 12, 2025

Your improvement is impressive! I'd still love to see a final benchmark, but it's pretty clear to me this will be substantially faster than the old implementation. I have some style comments that should probably be addressed before a merge, but generally I think it's ready (assuming you tested it).

I did, here :

Reduced from 19 400 cycles to 4100 (on average after 3000 iterations) now. So bewteen 4-5x speed up. (Measured only the output vertices of the two circles to the vector, not the whole function)

@kiroxas kiroxas force-pushed the improveNavigationLink3DGizmoPlugin_redrawPerformance branch from 2060888 to 460d11e Compare January 12, 2025 18:33
@kiroxas kiroxas force-pushed the improveNavigationLink3DGizmoPlugin_redrawPerformance branch from 460d11e to d36a9ab Compare January 12, 2025 18:48
@akien-mga akien-mga merged commit 06a49a9 into godotengine:master Jan 13, 2025
20 checks passed
@akien-mga
Copy link
Member

Thanks!

@kiroxas kiroxas deleted the improveNavigationLink3DGizmoPlugin_redrawPerformance branch January 14, 2025 07:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants