-
-
Notifications
You must be signed in to change notification settings - Fork 99
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improvements to memory handling for WorkerThreadPools #11201
Comments
I always found the phrasing rather unintuitive, and always thought it should be join, much like Python. It's not too much work to wrap into a static utility function, but as you say, it's a bit hard to tell from the documentation that you should probably consider to do that. |
Yeah that was one of the main reasons I first started looking into a solution, it's a little odd that the wait functions aren't purely just wait functions as their names would suggest but instead also the main memory management method |
I do agree that there should be some way to just auto free resources once the task is done, as it would be nice to just send this off and forget about it. I don't think it should be renamed to join though, since that's more of a convention of older languages whereas wait makes it clearer to newer people that you are actively waiting for the task to finish. |
I'd like to see a real-world case where manually awaiting the tasks is so inconvenient. Moreover, where you don't care about the task being finished or not, if that's the case. That would help shaping any potential fire-and-forget additions to the API. The current API is built around the idea of using tasks in the pool the same way you'd use threads, that is, being in control of what you submitted and its lifetime. That's why "join" would fit the picture. |
As one of the goals is to eventually run as many things as possible on the WorkerThreadPool to take advantage of the performance benefits, and short lived tasks are ideal for usage in the WorkerThreadPool, I’ve identified the kinds of functions (in accordance to the docs, as well as some real world testing and peer review) that can easily be offloaded onto the WorkerThreadPool as long as certain conditions are met. By having a auto clear / resolve system in place however I believe that the barrier of entry for a developer to take advantage of multithreading can be lowered to achievable levels even for the inexperienced.
Not need a return, although this is also true for task in the current system as well From the docs for the WorkerThreadPool we have this kind of example code
Since this is a group task, the other code that depends on the enemy AI already being processed would prevent it from being set and forget. It COULD be though if the group task was turned into a normal task (although that has performance downsides compared to using group tasks), or if the Other code that depends on the enemy AI already being processed wasn’t there. Going back to the previous example, if we made process_enemy_ai into process_enemies_ai (where now all of the enemies are processed on a single thread / normal task) we can get rid of the wait entirely. As all of the code that would have waited can just trigger once the first part of the task is finished.
Following the save dictionary system of saving. It is fairly trivial to offload most of the overhead to a separate thread. By using mutexes while reading the save dictionary (and or making a copy of the dictionary), the relatively heavy process of saving the file to storage can be pushed to a separate thread without any wait operations being needed. In the case when you would want to display a message telling the user to not shut off the game as it is saving, call_defered and set_defered can be used without ever needing to acknowledge the task’s completion outside of the thread. Performance benefits: Usability benefits of auto clear: |
Currently I'm working on adding multiplayer to my game and I have just realized that the send_p2p_packet function listed on godot steam (as well as some other functions listed) would also benefit from the inclusion of auto_clear. They currently send out data through the steamworks api but there is no return or anything else that relies on the function/task being completed at all since it is just sending out data (and as such not interacting with anything that would be thread unsafe). This would work perfectly with the auto_clear function since it never interacts with the scene tree or any resources after being called at all! For fully self contained functions like this (taken from https://godotsteam.com/tutorials/p2p/): func send_p2p_packet(this_target: int, packet_data: Dictionary) -> void:
var send_type: int = Steam.P2P_SEND_RELIABLE
var channel: int = 0
var this_data: PackedByteArray
this_data.append_array(var_to_bytes(packet_data))
if this_target == 0:
if Steamlobbies.lobby_members.size() > 1:
for this_member in Steamlobbies.lobby_members:
if this_member['steam_id'] != Steamworks.steam_id:
Steam.sendP2PPacket(this_member['steam_id'], this_data, send_type, channel)
else:
Steam.sendP2PPacket(this_target, this_data, send_type, channel) You can just toss the send_p2p_packet into the worker threadpool and forget about it I did so here: BrianBHuynh/Mieu-CatChat@3b609c0 I've also added the save_game function into my gdscript implementation of the code from saving your game (more or less) from the docs without any problem as well! |
I wanted to find more in world examples of when auto_clear would be useful, so I searched up "workerthreadpool language:GDScript" on github, and within the first 21 results, a good 6 of them implemented auto clear in their own way, although most of the time it uses either a async wait, or a loop which goes through an array/dict to clear out task once they're done, and out of these 6 none really benefit from the wait other than for memory management reasons Auto clear implemented in these repos: https://github.com/Khaligufzel/Dimensionfall/blob/a1cdf0b0f5ee74db3e772af67edff1df929c6c01/Scripts/Helper/task_manager.gd#L23 Out of the front page 28% of the uses of worker threadpool included some kind of manually done auto clear task which is more resource intensive than just having a in built auto clear task function which shows that there is atleast some demand for such a feature. |
(TD:LR based on all the usages of worker threadpool in godot on github, roughly 52% of people on github are not clearing memory, and atleast 24% of people are just setting and forgetting/fire and forgetting although that's the rough estimate for both based on github users) |
When I first discovered the WorkerThreadPool, I understood from the documentation that every add_task method must have a corresponding wait_for_task_completion, but it did not occur to me that failing to include the wait_for_task_completion step would be prevent system resources from being freed after task completion. Now that I'm aware of that potential, I do see that it is mentioned, but I did not fully comprehend it when I first read through the documentation. Lately I've been working to refactor my code to take advantage of the WorkerThreadPool, and I can definitely imagine a scenario in which I could have added a task without a corresponding wait_for_task_completion method if the main thread method that added the task didn't require any further steps after the task was added. |
I feel like this would be a valid solution for a memory leak issue, especially how developers have tried to patch it themselves as said above. I'd love to see this inside Godot. |
I am a member of the project mentioned at the beginning of the proposal, As someone who is still studying computer science and learning Godot, many of the programs I write are relatively basic. If I were to use WorkerThreadPools, I would likely not need to access a task once it has been finished (I would mainly use it to run basic functions, like save_game(), in parallel to allow the program to run faster). I think having the tasks be cleared automatically by default would make Godot more accessible to amateur developers like myself. |
From the meeting:
|
To add to Pedro's comment, we had some concerns about auto-clearing/merging the task in the meeting:
We agree that the memory leaks noticed by Brian are a problem, but we want to ensure that our solution to the problem isn't worse than the original problem itself. I suspect that Pedro's second and third points should be enough to address the original problem (clearing the task memory without clearing the task_id from the map, and complaining about unfinished tasks on close). Combined, those two items will eliminate the memory leaks as the memory will get cleaned up right away, and they will help educate users about the need to claim claim tasks once they are finished. The final piece is the usability for cases where you want a truly fire and forget task because you don't care about the result. Pedro's first point might help with those cases. But we might want to also consider something else. Personally, I think Pedro's suggestion would be ideal. We can suggest to users to tag their fire and forget threads with a common tag, then just wait on all of them at once during shutdown to avoid the warning. This approach is nice since it doesn't encourage beginners to create race conditions in their code. |
I think I might have a potential solution which might work with most if not all of these concerns, but at the same time I have to ask if there is any way to access the results of a threaded task like described in your first point @clayjohn, at the same time is there any point in keeping the task in memory other than something which list if its completed or not? I was looking over the worker threadpool class and it seems like you can give each task a description, however I can't see anything which would allow you to access said description or return values (or other values within the task) which would mean that the description argument is wholly unneeded. Since there's no way to access said return value / description as it currently is, I think it would be a great idea to turn all task into solely their task ID after being completed in memory like @RandomShaper suggest on his first point, which would mean that the memory leaked from each uncleared task is a lot smaller. Also as I was thinking, could we potentially have a completely "fire and forget" solution with a somewhat threatening name to dissuade people from using it without knowing what they are doing? For example "Async task" (although I'm not sure if it is an accurate name, it was used by someone previously from the listed commits above). Having a more technical name and a big enough warning should dissuade people from using it without knowing the risks and at least reading up on it first, and at the same time it would help in the cases where someone wouldn't need to access the results or confirm completion in the future. Sorry for the wall of text, just jotting down my thoughts for now so we can discuss them in the next meeting! I should have a more organized document for thursday. |
Describe the project you are working on
I am working on holding thursday sessions where I teach students in university (juniors and seniors) how to use github, as well as collaborate on a project in godot (https://github.com/BrianBHuynh/GameDev-lesson-team-project) which is sanctioned by the computer science department head for my university.
I am also working on a cute but super early stage vpet / social game (currently in accessibility stages) https://github.com/BrianBHuynh/Mieu-CatChat
I am also a frequent contributor to the Godot docs.
Because of the first and 3rd projects ease of use and documentation is quite important to me.
Describe the problem or limitation you are having in your project
Currently the implementation of the WorkerThreadPool is slightly unintuitive and hard to teach/use, especially from a documentation standpoint as the names of functions in the object can be misleading or uninformative. Currently memory for task is not automatically managed, and each task needs to manually be cleared/acknowledged/merged through a "wait" function.
There was an issue (godotengine/godot#84888, godotengine/godot#84899) posted around a year ago which was resolved by a documentation change due to lack of knowledge on how Worker ThreadPools work at the time, however even though the issue was resolved, I think implementation of threadpools can be more intuitive.
Currently the documentation officially states that
Warning: Every task must be waited for completion using wait_for_task_completion or wait_for_group_task_completion at some point so that any allocated resources inside the task can be cleaned up.
However this can be confusing as you must wait a task even if it is completed for memory to be cleared, which is unintuitive as for the WorkerThreadPool, the wait functions do not solely freeze the thread it is called from, but instead act as the main form of resource clearing as well!
Some issues to show that people do run into problems (before and after the documentation change): godotengine/godot#79069
godotengine/godot#84888
(after)
godotengine/godot#84899
godotengine/godot#99316 (mine)
As can be seen in 84899 there has been complaints about the current solution as having to call waits to clear resources "pretty much eliminate the ease of use of WorkerThreadPool (fire and forget)." and requires some active effort to manage memory and some people would expect task to happen automatically as there is no function which's name implies memory/resource management so a reading of the docs is needed to use this object intended to simplify the process of multi threading.
Describe the feature / enhancement and how it helps to overcome the problem or limitation
I have multiple solutions which may make memory management via threads easier (and which I've gotten some progress into implementing)
clear_task_if_completed
function toWorkerThreadPool
godot#99316)OR
Both of these solutions would make it so that there is a way to acknowledge the task, and also clear memory without calling a wait function (which may risk freezing a thread, or the main thread, if implemented badly). Also the second name would have a slightly more intuitive name and use case without the need for documentation reading (and it's existence implies that task need to be resource cleared).
By implementing this in the engine, it will make it easier for programmers to manage memory if there is no need for acknowledgement. In the case when the completion of code does not need to be acknowledged (for example in the case of a save system which is both intensive and also should not be waited on).
clear_task_if_completed
function toWorkerThreadPool
godot#99316)My suggested solution:
I think that a hybrid of the 3 solutions would make the WorkerThreadPools far more intuitive to use. By combining the is_task_completed checks with resource clearing (like in the wait functions) to allow for waitless resource management, adding a method to automatically clear memory for less essential task, and renaming the wait function, the worker threadpool would be far more intuitive to use.
Describe how your proposal will work, with code, pseudo-code, mock-ups, and/or diagrams
As the core engine should be kept as lean as possible, I have included the amount of lines AND loops included in each solution in theory (only for singular task, I have not looked at group task yet but they should be similar )
Implementing the 3 suggested solutions is estimated to add anywhere from 25 lines of extra code, to a low of 10 extra lines, although this is only for single task, and assuming similar implementation both group and single task will take anywhere from 20 to 50 extra lines of code.
solution 1 (lines added 27):
godotengine/godot#99316
solution 2 (lines added 6):
replace the current is_task_completed function with the clear_task_if_completed function code in solution 1, this adds only 6 more lines onto the current function that's already implemented.
solution 3 (aprox lines added 19):
Add these lines to the end of the _process_task internal function
Add this function
bool WorkerThreadPool::auto_clear_task(TaskID p_task_id) {
MutexLock task_lock(task_mutex);
Task **taskp = tasks.getptr(p_task_id);
if (!taskp) {
ERR_FAIL_V_MSG(false, "Invalid Task ID"); // Invalid task
}
Task *task = *taskp;
if (task->completed) {
if (task->waiting_pool == 0 && task->waiting_user == 0) {
tasks.erase(p_task_id);
task_allocator.free(task);
}
return true;
}
task->auto_complete = true;
return task->auto_complete;
}
NOTE: 14 of those lines for solution 3 can be removed IF willing to add a argument into the add task function, although this more intensive solution will work with all current code without breaking it.
Solution 4 (0 lines added): Just renaming function from wait to join which may be more accurate.
If this enhancement will not be used often, can it be worked around with a few lines of script?
This enhancement can be worked around in a few lines of code, however the current workaround can be clunky and hard to simply write/explain in the docs to someone without coding knowledge, and the usage of WorkerThreadPools is ideally to simplify the multithreading process for people not familiar and or unwilling with implementing it themselves with the Threads class.
Workaround for checking to see if a task is completed, and then clearing resources:
If is_task_completed(tid):
wait_for_task_completion(tid)
Note how if you want to prevent memory leaks, and no other code relies on acknowledging that your task is finished you must check to see if a task is completed, and then WAIT on said task completing (although it's already completed).
You also need to keep an array or list of active task as well, as task management is not automatically done, and there is no way to get a list of all the currently monitored tasks.
Is there a reason why this should be core and not an add-on in the asset library?
In worse case this could expand the core by ~50 lines in return for easier use of the worker threadpool and overall some performance improvements all around when using the worker threadpool. (Allows automatic memory management, or a way of managing memory without freezing the thread it is called from (which also calls 2 functions instead of 1)).
The text was updated successfully, but these errors were encountered: