Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create precise_delay_usec() to avoid excessive CPU penalty when not intended #99656

Closed
wants to merge 1 commit into from

Conversation

mrsaturnsan
Copy link
Contributor

@mrsaturnsan mrsaturnsan commented Nov 24, 2024

Avoids the excess CPU penalty for non-frame pacing related sleeps as mentioned here:

#99593

The original PR aimed to fix inconsistent frame limiting on Windows due to the inaccuracy of the Sleep function.

@Calinou did some testing with the previous PR, and found it greatly helped with frame pacing and getting the timing right.

This PR aims to fix the unintended CPU time increase, while adding the improved frame limiting logic. Since sleep is based on hardware/OS, the method needs to be adaptive to both. On Windows, the sleep granularity is guaranteed, but on Unix we need to dynamically compute the upper bound to sleep.

@mrsaturnsan mrsaturnsan requested review from a team as code owners November 24, 2024 23:26
@tetrapod00 tetrapod00 changed the title Create a precise_delay_usec function to avoid excessive CPU penalty w… Create a precise_delay_usec function to avoid excessive CPU penalty when not intended Nov 24, 2024
@tetrapod00 tetrapod00 added this to the 4.4 milestone Nov 24, 2024
@mrsaturnsan mrsaturnsan requested a review from a team as a code owner November 25, 2024 01:46
@mrsaturnsan mrsaturnsan force-pushed the precise_sleep branch 15 times, most recently from 41b6414 to 0e20fc8 Compare November 25, 2024 06:19
@RandomShaper
Copy link
Member

Makes sense.

I'd just want to suggest ways to trying not to clutter the internal, neither the exposed APIs before we know the precise delays are needed by projects. Options (not necessarily exclusive):
a) Not exposing the precise ones.
b) Not having an msec version of the precise one, at all.
c) Refactoring the precise one so it's only related to frame pacing. Is there any other known usage for it yet?
d) To avoid confusion, maybe the precise one could be renamed to something else. The docs can already explain the difference, but I can foresee many would just prefer the precise version just because it looks like the best choice.

@mrsaturnsan
Copy link
Contributor Author

@Faless
I removed exposing the precise functions in the PR. All OSs currently use OS::add_frame_delay in some form so just adding an internal precise function should work with it.

@Faless
Copy link
Collaborator

Faless commented Nov 25, 2024

I removed exposing the precise functions in the PR. All OSs currently use OS::add_frame_delay in some form so just adding an internal precise function should work with it.

Well, actually, the Web platform implements frame skipping and does not use add_frame_delay so the function is useless there (except when PROXY_TO_PTHREAD_ENABLED which is currently broken so it's not used in official builds).

But this is at least an improvement.

Copy link
Collaborator

@Faless Faless left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still not convinced we need a dedicated function for this, can't we implement a simple busy wait logic in add_frame_delay using the functions already at our disposal?

@akien-mga
Copy link
Member

Thanks for the work on fixing the regression and improving the implementation further.

From a release management perspective, so we don't rush a significant change as a hotfix for a previous regression, I would suggest we start by reverting #99178 to fix #99593, and then this PR can be rebased to redo the changes from #99178 plus the further changes here.

This is mainly to restart from a known state, and maybe reduce some of the tension I perceive that #99178 was maybe merged a bit too eagerly without fully assessing its consequences (which is now being done here). That's not to say that the change isn't wanted though, it does seem to fix a significant issue.

Does that sound ok? If so I'll take care of reverting #99178.

@mrsaturnsan
Copy link
Contributor Author

@akien-mga

Yes, that sounds reasonable! If you could revert that PR, I'll rebase this on top. Thanks 🙏

@akien-mga
Copy link
Member

The original PR was reverted by #99688, so this one can be rebased.

I would suggest editing the OP with some more details about what the then-resulting PR aims at addressing, relevant references (e.g. Calinou's test in the previous PR), etc.

@mrsaturnsan
Copy link
Contributor Author

Rebased the changes as suggested.

@mrsaturnsan
Copy link
Contributor Author

The excess CPU time reported in this issue is also resolved with the PR.

#99593 (comment)

@mrsaturnsan mrsaturnsan force-pushed the precise_sleep branch 2 times, most recently from a44f163 to 734040e Compare November 25, 2024 22:03
@Calinou
Copy link
Member

Calinou commented Nov 25, 2024

That reminds me, can we use a high-resolution waitable timer when available? It's available on Windows 10 1803 or newer, so most users should be able to use it now.

@mrsaturnsan mrsaturnsan force-pushed the precise_sleep branch 3 times, most recently from 39eba2f to fde8c8f Compare November 25, 2024 22:21
@mrsaturnsan
Copy link
Contributor Author

mrsaturnsan commented Nov 25, 2024

@Calinou
I'm not sure whether or not having it in will cause issues for Windows 7 users. It would be nice though :)
By the way, this change also fixes timeBeginPeriod(1). This may fail on some systems due to hardware timer precision. Instead, it queries the minimum value and uses that now.

Verified

This commit was signed with the committer’s verified signature.
dalexeev Danil Alexeev
…hen not intended
@mrsaturnsan
Copy link
Contributor Author

Closing this PR for now since the regression causing change was reverted, and there is no agreement on how to fix the incorrect frame-limiting. If someone wants to try fixing it again, feel free to use the changes above.

@AThousandShips AThousandShips removed this from the 4.4 milestone Nov 26, 2024
@Faless
Copy link
Collaborator

Faless commented Nov 26, 2024

@mrsaturnsan

I'm sorry my comments felt overly negative.

I understand you are eager to improve the engine, but the review process is in place to make sure that code modifications respects code quality and properly solve the related issues.
This requires questioning the changes done in a PR especially in cases where they are not well described in the description, commit message, or linked issues.

Rest assured that any critique to the code is not meant as directed to the PR author, but the changes themselves. I am in no way criticizing you as a contributor, but it's part of the review process to question the specific code in a PR to ensure we only merge high quality code, especially in core areas.

I hope you can appreciate that this discussion has led to multiple improvements compared to the original proposed changes, including proper error checking and a general purpose fallback.

I thought the PR was getting close to a mergeable state, although it might still need some minor changes, but I do understand that you might not be willing to put more work into it.

Thanks for the work nonetheless, hopefully, this will be picked up by some other contributors to be finalized, and we'll make sure to acknowledge your initial effort.

@adamscott adamscott changed the title Create a precise_delay_usec function to avoid excessive CPU penalty when not intended Create precise_delay_usec() to avoid excessive CPU penalty when not intended Dec 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants