Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cache String concatenation in _blend_node #101564

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

Nazarwadim
Copy link
Contributor

Caches the slowest part of the AnimationTree. If something changes with the tree (a node is added or removed), the cache for all nodes will become dirty.
Combined with #101548 adds 50% to the FPS for MRP from #101494.

Copy link
Contributor

@kiroxas kiroxas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feels like a bandaid over a bleeding wound. All these StringName manipulations during blending would probably require some serious scrutinity for the next releases.
I don't know if this cache method does correctly work everywhere, as I don't how the animationTree works, and if some codepath allow a call of _blend_node with a different p_new_parent without calling make_dirty first, then it will not be correct. And I only see one call to make_cache_dirty in the entire file so that is not really reassuring. Cannot Approve nor Request changes ( as maybe it works correctly, if somehow every codepath that leads here with a change is funneled through the only make_dirty call ). Just raising some concern.

new_path = String(new_parent->node_state.base_path) + String(p_subpath) + "/";
}

p_node->set_node_state_base_path(new_path);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

p_node->set_node_state_base_path(std::move(new_path)); now that String has move constructors.

new_path = String(node_state.base_path) + String(p_subpath) + "/";
} else {
ERR_FAIL_NULL_V(new_parent, NodeTimeInfo());
new_path = String(new_parent->node_state.base_path) + String(p_subpath) + "/";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a shame that we cannot have access to the StringName ptrs in a const access, because all of these allocations could then only be a memcpy. (Just a remark, nothing to do with your PR)

@Ivorforce
Copy link
Member

Ivorforce commented Jan 15, 2025

Caches the slowest part of the AnimationTree.

Well done, nicely observed!
Your implementation makes sense to me and seems reasonable.
However, i have the same concerns as @kiroxas. We definitely need to test this well to confirm it won't add any problems.

While I'm here, I'll note that i have two open PRs that could help as a stopgap to speed up this path too:

@Nazarwadim Nazarwadim force-pushed the cache_string_concatenation_in_blend_node branch from 1ef98e3 to 616fd00 Compare January 15, 2025 12:35
@Nazarwadim
Copy link
Contributor Author

So I did better as I said by simply caching two Stringnames. The performance is the same as before.

Master:
Project FPS: 70 (14.28 mspf)
Project FPS: 70 (14.28 mspf)
Project FPS: 69 (14.49 mspf)
Project FPS: 69 (14.49 mspf)
Project FPS: 71 (14.08 mspf)
Project FPS: 70 (14.28 mspf)

Current PR:
Project FPS: 93 (10.75 mspf)
Project FPS: 94 (10.63 mspf)
Project FPS: 96 (10.41 mspf)
Project FPS: 95 (10.52 mspf)
Project FPS: 95 (10.52 mspf)

@Nazarwadim
Copy link
Contributor Author

Nazarwadim commented Jan 15, 2025

@Ivorforce One allocation was called for combining these Strings (because there is no realloc on profiling), so I don't think that these PRs will help here. Also need to take into account getting String from StringName, which is slow in multithreading.

@Ivorforce
Copy link
Member

Ivorforce commented Jan 15, 2025

@Ivorforce One allocation was called for combining these Strings (because there is no realloc on profiling), so I don't think that these PRs will help here. Also need to take into account getting String from StringName, which is slow in multithreading.

Well, the current code (+ +) necessitates 7 allocations (4 from scratch, 3 re-, + 7 copies), while the code from #100619 would bring it down to 4 (from scratch) and String::concat from #99929 should bring it down to just 1.
Getting String from StringName is free if the StringName was initialized from a String (rather than a char[] literal). Not sure where you're getting the multithreading thing from, since there's no multithreading primitives involved in StringName.operator String()?
But alas, there's only one way to know for sure :)

@Nazarwadim
Copy link
Contributor Author

Not sure where you're getting the multithreading thing from, since there's no multithreading primitives involved in StringName.operator String()

Godot does not yet have multithreading in the animation system. But if multithreading appears later, very low performance in multithreading will be due to atomics in Memory and RefCount in COW.

Copy link
Contributor

@kiroxas kiroxas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems more robust like this.

@Nazarwadim Nazarwadim force-pushed the cache_string_concatenation_in_blend_node branch from 616fd00 to 205321a Compare January 15, 2025 15:25
@Saul2022
Copy link

So considering that along with the AHashmap gives 50% more performance, would it close the animation performance problem or would it need more?.

@clayjohn
Copy link
Member

So considering that along with the AHashmap gives 50% more performance, would it close the animation performance problem or would it need more?.

We need more! 3.6 has double the FPS, so 50% more only gets us halfway to our goal.

@Saul2022
Copy link

We need more! 3.6 has double the FPS, so 50% more only gets us halfway to our goal.

hehe I like that spirit. I guess multithreading the whole animation system internally might help quite a lot in that, though the mixer seems to be causing the issue based on the nazar comments

Copy link
Member

@TokageItLab TokageItLab left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be noted that AnimationNode is a resource type. For example, the same AnimationNode may be placed multiple, in which case the path must always be modified, thus caching becomes meaningless; The paths of those replicated AnimationNodes must be unique and should not be cached for the same reference / single AnimationNode entity.

image

This is a modified MRP that places the multiple same AnimationNode resource in a BlendTree.

perf_2.zip

This is similar to why AnimationNode does not have its own current time, store their time in the AnimationTree insteads, see also #22887. Also you can refer to the behavior of AnimationNode::set_parameter(). So, if we want to cache the AnimationNode path, we would need to have a map in the AnimationTree and store paths there. Then, I think we can monitor the changed signal to the AnimationRootNode in the AnimationTree and update those paths at that time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants