Describe the project you are working on
The Godot editor 🙂
Describe the problem or limitation you are having in your project
Some games like Returnal display a preview of CPU resources when the settings menu is open:
From left to right: frames per second, CPU utilization, GPU utilization, GPU temperature, GPU core clock speed, video memory utilization.
This is useful to know whether the CPU is being utilized correctly, without having to install any third-party software such as RTSS or MangoHud.
From the Godot side of things, displaying this information in the editor when the View Frame Time panel is visible could prove useful. In the editor, the CPU may downclock as a result of low utilization, which can cause the displayed CPU frametime to be higher than it should really be. This is especially common on high-end CPUs.
Being able to know the CPU topology more accurately will also be helpful to better use multiple cores in the engine itself. For instance, in some scenarios, a multithreaded workflow is best kept on a smaller number of cores (that matches the number of physical cores available on the system). It may also be worth constraining the workflow to "fast" CPU cores on CPUs with hybrid (big.LITTLE) architectures, such as Intel's 12th and 13th-gen CPUs.
Describe the feature / enhancement and how it helps to overcome the problem or limitation
Add methods to query CPU hardware properties (core topology, utilization, temperatures, core clocks).
Describe how your proposal will work, with code, pseudo-code, mock-ups, and/or diagrams
The set of methods I propose is based on the most commonly needed metrics. In theory, these should be universally available in one way or another on all CPUs. This is the list of methods I propose adding:
OS.get_physical_processor_count() (returns the number of physical CPU cores available, i.e. excluding cores exposed via HyperThreading)
OS.get_physical_performance_processor_count() (returns the number of "performance" physical CPU cores available, i.e. excluding cores exposed via HyperThreading and E-cores on modern Intel CPUs)
OS.get_cpu_utilization_ratios() (returns an array of floats between 0.0 and 1.0, with 1.0 being 100% utilization of a given CPU thread)
OS.get_cpu_core_temperatures() (returns an array of floats with temperature in degrees Celsius for each CPU core. If this method can't be implemented on a given platform, the value of OS.get_cpu_package_temperature() can be returned for all physical CPU cores.)
OS.get_cpu_package_temperature() (returns a float with temperature in degrees Celsius for the CPU package. This is not an average of all cores, but a general temperature which does not take individual cores into account.)
OS.get_cpu_current_core_clocks() (returns an array of integers with the frequency of each CPU thread in MHz)
Methods should return -1 if the value cannot be queried for any reason, so that the project developer can act accordingly. This also applies to unsupported platforms (Android, iOS, HTML5).
The order of elements in each array is not guaranteed to have any meaning, since it varies depending on CPU vendor and OS configuration. If you need an average readout across all cores for any of these methods, use Array.reduce() to your advantage 🙂
Windows
Windows has some APIs for this, although some of them may be vendor-specific.
Linux (and possibly macOS)
On Linux, CPU information is generally obtained by reading files in the /proc filesystem. This may also be feasible on macOS to an extent (when not running in sandbox mode).
For example, to get the number of physical CPU cores available on the system:
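A minimal sketch of one way to do this, written in Python for illustration only (an engine implementation would be C++). It assumes /proc/cpuinfo exposes the `physical id` and `core id` fields, which is typical on x86 Linux but not guaranteed on other architectures. The function names are hypothetical and not part of Godot:

```python
def count_physical_cores(cpuinfo_text: str) -> int:
    """Count unique (physical id, core id) pairs in /proc/cpuinfo-style text.

    Returns -1 when the fields are missing, mirroring the proposal's
    "value cannot be queried" convention.
    """
    cores = set()
    physical_id = None
    core_id = None
    # Each logical CPU has its own block; blocks are separated by blank lines.
    # The trailing "" makes sure the last block is flushed too.
    for line in cpuinfo_text.splitlines() + [""]:
        if not line.strip():
            if physical_id is not None and core_id is not None:
                cores.add((physical_id, core_id))
            physical_id = None
            core_id = None
            continue
        key, _, value = line.partition(":")
        key = key.strip()
        if key == "physical id":
            physical_id = value.strip()
        elif key == "core id":
            core_id = value.strip()
    return len(cores) if cores else -1


def read_physical_core_count(path: str = "/proc/cpuinfo") -> int:
    """Read the file at `path` and count physical cores, or return -1 on failure."""
    try:
        with open(path, encoding="utf-8") as f:
            return count_physical_cores(f.read())
    except OSError:
        return -1
```

Two logical CPUs that share the same (physical id, core id) pair are counted once, which is how threads exposed via HyperThreading get excluded.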
To get a list of CPU cores' temperatures [1] (requires the lm-sensors package to be installed):
This takes about 2 milliseconds to run on my system with an i9-13900K (filtering by chip name reduces this a lot; it was originally 46 ms). This may be significantly slower on lower-end CPUs, even if there are fewer values to report due to them having fewer cores. We should figure out a way to run this in the background to avoid stuttering if this is called during gameplay.
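One possible shape for the background approach is sketched below in Python, for illustration only. A worker thread polls a slow reader at a fixed interval and caches the result, so the caller only ever reads the cached value. `BackgroundSampler` and `read_fn` are hypothetical names, and the polling interval is an arbitrary assumption:

```python
import threading
from typing import Callable, List, Optional


class BackgroundSampler:
    """Polls a slow sensor reader on a worker thread and caches the result."""

    def __init__(self, read_fn: Callable[[], List[float]], interval_s: float = 1.0):
        self._read_fn = read_fn
        self._interval_s = interval_s
        self._lock = threading.Lock()
        self._latest: Optional[List[float]] = None
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self) -> None:
        self._thread.start()

    def stop(self) -> None:
        self._stop.set()
        self._thread.join()

    def latest(self) -> Optional[List[float]]:
        """Return a copy of the most recent reading, or None if none exists yet."""
        with self._lock:
            return None if self._latest is None else list(self._latest)

    def _run(self) -> None:
        while not self._stop.is_set():
            try:
                value = self._read_fn()
            except Exception:
                value = None  # Keep the previous reading if the reader fails.
            if value is not None:
                with self._lock:
                    self._latest = value
            # Sleep until the next poll, waking early if stop() is called.
            self._stop.wait(self._interval_s)
```

With this design, a call made during gameplay costs a lock and a list copy, and the expensive query happens off the main thread.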
sensors also has a -j option for JSON output which may be worth looking into.
To get a list of all CPU cores' current frequencies in MHz:
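As a rough sketch (Python, for illustration only), the per-thread frequencies can be parsed from the `cpu MHz` lines of /proc/cpuinfo. This assumes an x86 Linux kernel that reports that field; it is absent on many other architectures, in which case the list comes back empty. The function name is hypothetical:

```python
from typing import List


def parse_core_clocks_mhz(cpuinfo_text: str) -> List[int]:
    """Return one integer MHz value per "cpu MHz" line in cpuinfo-style text."""
    clocks = []
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "cpu MHz":
            try:
                clocks.append(round(float(value)))
            except ValueError:
                # Unparseable entry: use the proposal's -1 sentinel.
                clocks.append(-1)
    return clocks


def read_core_clocks_mhz(path: str = "/proc/cpuinfo") -> List[int]:
    """Read the file at `path`; return an empty list if it cannot be read."""
    try:
        with open(path, encoding="utf-8") as f:
            return parse_core_clocks_mhz(f.read())
    except OSError:
        return []
```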
Sampling CPU utilization must be done over a period of time, so this will likely require another method to enable measurements and allow you to query values (so that these values are not queried for no reason).
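To illustrate why two samples are needed, here is a hedged Python sketch that derives per-thread utilization from two snapshots of /proc/stat. It assumes the usual Linux field order (user, nice, system, idle, iowait, irq, softirq, steal) and treats idle plus iowait as idle time. The function names are hypothetical:

```python
import time
from typing import Dict, List, Tuple


def parse_proc_stat(stat_text: str) -> Dict[str, Tuple[int, int]]:
    """Map "cpuN" to (idle_ticks, total_ticks) from /proc/stat-style text."""
    result = {}
    for line in stat_text.splitlines():
        fields = line.split()
        # Keep per-thread lines ("cpu0", "cpu1", ...); skip the aggregate "cpu" line.
        if len(fields) < 5 or not fields[0].startswith("cpu") or fields[0] == "cpu":
            continue
        # user..steal only; guest time is already included in user/nice.
        ticks = [int(v) for v in fields[1:9]]
        idle = ticks[3] + (ticks[4] if len(ticks) > 4 else 0)
        result[fields[0]] = (idle, sum(ticks))
    return result


def utilization_ratios(before: str, after: str) -> List[float]:
    """Return utilization per CPU thread (0.0 to 1.0) between two snapshots.

    A thread with no elapsed ticks yields -1.0 (value cannot be determined).
    """
    a = parse_proc_stat(before)
    b = parse_proc_stat(after)
    ratios = []
    for name in sorted(b, key=lambda n: int(n[3:])):
        if name not in a:
            continue
        idle_delta = b[name][0] - a[name][0]
        total_delta = b[name][1] - a[name][1]
        if total_delta <= 0:
            ratios.append(-1.0)
        else:
            ratios.append(1.0 - idle_delta / total_delta)
    return ratios


def sample_utilization(interval_s: float = 0.5, path: str = "/proc/stat") -> List[float]:
    """Take two snapshots `interval_s` apart and return the utilization ratios."""
    with open(path, encoding="utf-8") as f:
        before = f.read()
    time.sleep(interval_s)
    with open(path, encoding="utf-8") as f:
        after = f.read()
    return utilization_ratios(before, after)
```

Since the counters are cumulative, a single read only gives the average since boot, which is why an explicit "start measuring" step is needed before values can be queried.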
Note that the grep/cut logic is only given as a (functional) example, and should be rewritten in C++ for efficiency (this avoids spawning subprocesses).
If this enhancement will not be used often, can it be worked around with a few lines of script?
The approach described here can be implemented by a script on Linux. However, native APIs must be used on other platforms. Also, such a script will be fairly complex in nature, especially since it would need to integrate GDExtension to interact with native APIs.
Is there a reason why this should be core and not an add-on in the asset library?
This is about improving the performance troubleshooting and optimization experience for players and developers alike.
Footnotes
1. On some CPUs, the size of this list may not match the total number of CPU cores. This should be noted in the class reference.