-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
#7837 paginate with custom identifier types even with enabled DQL query cache #7865
#7837 paginate with custom identifier types even with enabled DQL query cache #7865
Conversation
Since `WhereInWalker` does not run, query parameters are not translated from their in-memory type to the expected SQL type when the paginator is run again with the same DQL string. This is an architectural issue, since (for the sake of simplicity) we moved parameter translation into the SQL walker, we didn't consider that SQL walkers only act when no cache is in place. The translatio needs to be moved into the paginator logic again.
…ng used In order to figure out the paginated query identifier type, we would have to parse the DQL query into an AST+SQL anyway, so we'd have to re-parse it manually: instead of doing that, we can force the `WhereInWalker` to be reached at all times by forcing the `$whereInQuery` to use no query cache. While it is a sad performance regression, it is also not a noticeable one, since we'll be performing an `O(1)` operation around an I/O one (query execution, in this case).
array_map(static function (GH7820Line $line) : string { | ||
return $line->toString(); | ||
}, iterator_to_array(new Paginator($query))), | ||
'Paginator runs once, query cache is populated with DQL -> SQL translation' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks to me this should have been a comment. It doesn't tell me meaning of assertion failure.
instances: 2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Had these as comments initially: moved to assertion message on purpose, since it makes failures much more descriptive
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But this says what happens, not what should have happened
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. Messages like "null is given, 1 was expected" are the usual messages that are shown on failing assertions.
array_map(static function (GH7820Line $line) : string { | ||
return $line->toString(); | ||
}, iterator_to_array(new Paginator($query))), | ||
'Paginator runs once, query cache is populated with DQL -> SQL translation' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. Messages like "null is given, 1 was expected" are the usual messages that are shown on failing assertions.
array_map(static function (GH7820Line $line) : string { | ||
return $line->toString(); | ||
}, iterator_to_array(new Paginator($query))), | ||
'Paginator runs again, SQL parameters are translated again, even with cached DQL -> SQL translation' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as mentioned for the assert-message above.
…nWithCachedQueryParsing
baa1230
to
98e557b
Compare
I have taken liberty to improve these messages. Let me know if it's ok now |
Assertion messages have been changed by @ostrolucky and are much better. Thanks everyone 👍
This PR prevents the Paginator from causing OpCache "wasted memory" to increase _on every request_ when used with Symfony's `PhpFilesAdapter` as the cache implementation for the query cache. Depending on configured thresholds, wasted memory this will either cause periodic opcache restarts or running out of memory and not being able to cache additional scripts ([Details](https://tideways.com/profiler/blog/fine-tune-your-opcache-configuration-to-avoid-caching-suprises)). Fixes doctrine#9917, closes doctrine#10095. There is a long story (doctrine#7820, doctrine#7821, doctrine#7837, doctrine#7865) behind how the Paginator can take care of DBAL type conversions when creating the pagination query. This conversion has to transform identifier values before they will be used as a query parameter, so it has to happen every time the Paginator is used. For reasons, this conversion happens inside `WhereInWalker`. Tree walkers like this are used only during the DQL parsing/AST processing steps. Having a DQL query in the query cache short-cuts this step by fetching the parsing/processing result from the cache. So, to make sure the conversion happens also with the query cache being enabled, this line https://github.com/doctrine/orm/blob/1753d035005c1125c9fb4855c3fa629341e5734d/lib/Doctrine/ORM/Tools/Pagination/Paginator.php#L165 was added in doctrine#7837. It causes `\Doctrine\ORM\Query::parse()` to re-parse the query every time, but will also put the result into the query cache afterwards. At this point, the setup described in doctrine#9917 – which, to my knowledge, is the default in Symfony + DoctrineBundle projects – will ultimately bring us to this code: https://github.com/symfony/symfony/blob/4b3391725f2fc4a072e776974f00a992cbc70515/src/Symfony/Component/Cache/Adapter/PhpFilesAdapter.php#L248-L249 When writing a cache item with an already existing key, the driver has to make sure the opcache will honor the changed PHP file. This is what causes _wasted memory_ to increase. Instead of using `\Doctrine\ORM\Query::expireQueryCache()`, which will force `\Doctrine\ORM\Query::parse()` to parse the query again before putting it into the cache, use `\Doctrine\ORM\Query::useQueryCache(false)`. The subtle difference is the latter will not place the processed query in the cache in the first place. A test case is added to check that repeated use of the paginator does not call the cache to update existing keys. That should suffice to make sure we're not running into the issue, while at the same time not complicating tests by using the `PhpFilesAdapter` directly. Note that in order to observe the described issue in tests, you will need to use the `PhpFilesDriver` and also make sure that OpCache is enabled and also activated for `php-cli` (which is running the unit tests). This particular subquery generated/used by the Paginator is not put into the query cache. The DQL parsing/to-SQL conversion has to happen _every time_ the Paginator is used. This, however, was already the case before this PR. In other words, this PR only changes that we do not store/update the cached result every time, but instead completely omit caching the query.
Make the `Paginator`-internal query (`... WHERE ... IN (id, id2, id3...)`) cacheable in the query cache again. When the Paginator creates the internal subquery that does the actual result limiting, it has to take DBAL type conversions for the identifier column of the paginated root entity into account (doctrine#7820, fixed in doctrine#7821). In order to perform this type conversion, we need to know the DBAL type class for the root entity's identifier, and we have to derive it from a given (arbitrary) DQL query. This requires DQL parsing and inspecting the AST, so doctrine#7821 placed the conversion code in the `WhereInWalker` where all the necessary information is available. The problem is that type conversion has to happen every time the paginator is run, but the query that results from running `WhereInWalker` would be kept in the query cache. This was reported in doctrine#7837 and fixed by doctrine#7865, by making this particular query expire every time. The query must not be cached, since the necessary ID type conversion happens as a side-effect of running the `WhereInWalker`. The Paginator internal query that uses `WhereInWalker` has its DQL re-parsed and transformed in every request. In addition to the processing overhead, this may also waste opcache memory (doctrine#9917). This PR moves the code that determines the DBAL type out of `WhereInWalker` into a dedicated SQL walker class, `RootTypeWalker`. `RootTypeWalker` uses a ~hack~ clever trick to report the type back: It sets the type as the resulting "SQL" string. The benefit is that `RootTypeWalker` results can be cached in the query cache themselves. Only the first time a given DQL query has to be paginated, we need to run this walker to find out the root entity's ID type. After that, the type will be returned from the query cache. With the type information being provided, `Paginator` can take care of the necessary conversions by itself. This happens every time the Paginator is used. The internal query that uses `WhereInWalker` can be cached again since it no longer has side effects.
This PR prevents the Paginator from causing OpCache "wasted memory" to increase _on every request_ when used with Symfony's `PhpFilesAdapter` as the cache implementation for the query cache. Depending on configured thresholds, wasted memory this will either cause periodic opcache restarts or running out of memory and not being able to cache additional scripts ([Details](https://tideways.com/profiler/blog/fine-tune-your-opcache-configuration-to-avoid-caching-suprises)). Fixes #9917, closes #10095. There is a long story (#7820, #7821, #7837, #7865) behind how the Paginator can take care of DBAL type conversions when creating the pagination query. This conversion has to transform identifier values before they will be used as a query parameter, so it has to happen every time the Paginator is used. For reasons, this conversion happens inside `WhereInWalker`. Tree walkers like this are used only during the DQL parsing/AST processing steps. Having a DQL query in the query cache short-cuts this step by fetching the parsing/processing result from the cache. So, to make sure the conversion happens also with the query cache being enabled, this line https://github.com/doctrine/orm/blob/1753d035005c1125c9fb4855c3fa629341e5734d/lib/Doctrine/ORM/Tools/Pagination/Paginator.php#L165 was added in #7837. It causes `\Doctrine\ORM\Query::parse()` to re-parse the query every time, but will also put the result into the query cache afterwards. At this point, the setup described in #9917 – which, to my knowledge, is the default in Symfony + DoctrineBundle projects – will ultimately bring us to this code: https://github.com/symfony/symfony/blob/4b3391725f2fc4a072e776974f00a992cbc70515/src/Symfony/Component/Cache/Adapter/PhpFilesAdapter.php#L248-L249 When writing a cache item with an already existing key, the driver has to make sure the opcache will honor the changed PHP file. This is what causes _wasted memory_ to increase. Instead of using `\Doctrine\ORM\Query::expireQueryCache()`, which will force `\Doctrine\ORM\Query::parse()` to parse the query again before putting it into the cache, use `\Doctrine\ORM\Query::useQueryCache(false)`. The subtle difference is the latter will not place the processed query in the cache in the first place. A test case is added to check that repeated use of the paginator does not call the cache to update existing keys. That should suffice to make sure we're not running into the issue, while at the same time not complicating tests by using the `PhpFilesAdapter` directly. Note that in order to observe the described issue in tests, you will need to use the `PhpFilesDriver` and also make sure that OpCache is enabled and also activated for `php-cli` (which is running the unit tests). This particular subquery generated/used by the Paginator is not put into the query cache. The DQL parsing/to-SQL conversion has to happen _every time_ the Paginator is used. This, however, was already the case before this PR. In other words, this PR only changes that we do not store/update the cached result every time, but instead completely omit caching the query.
Make the `Paginator`-internal query (`... WHERE ... IN (id, id2, id3...)`) cacheable in the query cache again. When the Paginator creates the internal subquery that does the actual result limiting, it has to take DBAL type conversions for the identifier column of the paginated root entity into account (doctrine#7820, fixed in doctrine#7821). In order to perform this type conversion, we need to know the DBAL type class for the root entity's identifier, and we have to derive it from a given (arbitrary) DQL query. This requires DQL parsing and inspecting the AST, so doctrine#7821 placed the conversion code in the `WhereInWalker` where all the necessary information is available. The problem is that type conversion has to happen every time the paginator is run, but the query that results from running `WhereInWalker` would be kept in the query cache. This was reported in doctrine#7837 and fixed by doctrine#7865, by making this particular query expire every time. The query must not be cached, since the necessary ID type conversion happens as a side-effect of running the `WhereInWalker`. The Paginator internal query that uses `WhereInWalker` has its DQL re-parsed and transformed in every request. In addition to the processing overhead, this may also waste opcache memory (doctrine#9917). This PR moves the code that determines the DBAL type out of `WhereInWalker` into a dedicated SQL walker class, `RootTypeWalker`. `RootTypeWalker` uses a ~hack~ clever trick to report the type back: It sets the type as the resulting "SQL" string. The benefit is that `RootTypeWalker` results can be cached in the query cache themselves. Only the first time a given DQL query has to be paginated, we need to run this walker to find out the root entity's ID type. After that, the type will be returned from the query cache. With the type information being provided, `Paginator` can take care of the necessary conversions by itself. This happens every time the Paginator is used. The internal query that uses `WhereInWalker` can be cached again since it no longer has side effects.
Make the `Paginator`-internal query (`... WHERE ... IN (id, id2, id3...)`) cacheable in the query cache again. When the Paginator creates the internal subquery that does the actual result limiting, it has to take DBAL type conversions for the identifier column of the paginated root entity into account (doctrine#7820, fixed in doctrine#7821). In order to perform this type conversion, we need to know the DBAL type class for the root entity's identifier, and we have to derive it from a given (arbitrary) DQL query. This requires DQL parsing and inspecting the AST, so doctrine#7821 placed the conversion code in the `WhereInWalker` where all the necessary information is available. The problem is that type conversion has to happen every time the paginator is run, but the query that results from running `WhereInWalker` would be kept in the query cache. This was reported in doctrine#7837 and fixed by doctrine#7865, by making this particular query expire every time. The query must not be cached, since the necessary ID type conversion happens as a side-effect of running the `WhereInWalker`. The Paginator internal query that uses `WhereInWalker` has its DQL re-parsed and transformed in every request. In addition to the processing overhead, this may also waste opcache memory (doctrine#9917). This PR moves the code that determines the DBAL type out of `WhereInWalker` into a dedicated SQL walker class, `RootTypeWalker`. `RootTypeWalker` uses a ~hack~ clever trick to report the type back: It sets the type as the resulting "SQL" string. The benefit is that `RootTypeWalker` results can be cached in the query cache themselves. Only the first time a given DQL query has to be paginated, we need to run this walker to find out the root entity's ID type. After that, the type will be returned from the query cache. With the type information being provided, `Paginator` can take care of the necessary conversions by itself. This happens every time the Paginator is used. The internal query that uses `WhereInWalker` can be cached again since it no longer has side effects.
Make the `Paginator`-internal query (`... WHERE ... IN (id, id2, id3...)`) cacheable in the query cache again. When the Paginator creates the internal subquery that does the actual result limiting, it has to take DBAL type conversions for the identifier column of the paginated root entity into account (doctrine#7820, fixed in doctrine#7821). In order to perform this type conversion, we need to know the DBAL type class for the root entity's identifier, and we have to derive it from a given (arbitrary) DQL query. This requires DQL parsing and inspecting the AST, so doctrine#7821 placed the conversion code in the `WhereInWalker` where all the necessary information is available. The problem is that type conversion has to happen every time the paginator is run, but the query that results from running `WhereInWalker` would be kept in the query cache. This was reported in doctrine#7837 and fixed by doctrine#7865, by making this particular query expire every time. The query must not be cached, since the necessary ID type conversion happens as a side-effect of running the `WhereInWalker`. The Paginator internal query that uses `WhereInWalker` has its DQL re-parsed and transformed in every request. In addition to the processing overhead, this may also waste opcache memory (doctrine#9917). This PR moves the code that determines the DBAL type out of `WhereInWalker` into a dedicated SQL walker class, `RootTypeWalker`. `RootTypeWalker` uses a ~hack~ clever trick to report the type back: It sets the type as the resulting "SQL" string. The benefit is that `RootTypeWalker` results can be cached in the query cache themselves. Only the first time a given DQL query has to be paginated, we need to run this walker to find out the root entity's ID type. After that, the type will be returned from the query cache. With the type information being provided, `Paginator` can take care of the necessary conversions by itself. This happens every time the Paginator is used. The internal query that uses `WhereInWalker` can be cached again since it no longer has side effects.
Make the `Paginator`-internal query (`... WHERE ... IN (id, id2, id3...)`) cacheable in the query cache again. When the Paginator creates the internal subquery that does the actual result limiting, it has to take DBAL type conversions for the identifier column of the paginated root entity into account (doctrine#7820, fixed in doctrine#7821). In order to perform this type conversion, we need to know the DBAL type class for the root entity's identifier, and we have to derive it from a given (arbitrary) DQL query. This requires DQL parsing and inspecting the AST, so doctrine#7821 placed the conversion code in the `WhereInWalker` where all the necessary information is available. The problem is that type conversion has to happen every time the paginator is run, but the query that results from running `WhereInWalker` would be kept in the query cache. This was reported in doctrine#7837 and fixed by doctrine#7865, by making this particular query expire every time. The query must not be cached, since the necessary ID type conversion happens as a side-effect of running the `WhereInWalker`. The Paginator internal query that uses `WhereInWalker` has its DQL re-parsed and transformed in every request. In addition to the processing overhead, this may also waste opcache memory (doctrine#9917). This PR moves the code that determines the DBAL type out of `WhereInWalker` into a dedicated SQL walker class, `RootTypeWalker`. `RootTypeWalker` uses a ~hack~ clever trick to report the type back: It sets the type as the resulting "SQL" string. The benefit is that `RootTypeWalker` results can be cached in the query cache themselves. Only the first time a given DQL query has to be paginated, we need to run this walker to find out the root entity's ID type. After that, the type will be returned from the query cache. With the type information being provided, `Paginator` can take care of the necessary conversions by itself. This happens every time the Paginator is used. The internal query that uses `WhereInWalker` can be cached again since it no longer has side effects.
Make the `Paginator`-internal query (`... WHERE ... IN (id, id2, id3...)`) cacheable in the query cache again. When the Paginator creates the internal subquery that does the actual result limiting, it has to take DBAL type conversions for the identifier column of the paginated root entity into account (doctrine#7820, fixed in doctrine#7821). In order to perform this type conversion, we need to know the DBAL type class for the root entity's `#[Id]`, and we have to figure it out based on a given (arbitrary) DQL query. This requires DQL parsing and inspecting the AST, so doctrine#7821 placed the conversion code in the `WhereInWalker` where all the necessary information is available. The problem is that type conversion has to happen every time the paginator is run, but the query that results from running `WhereInWalker` would be kept in the query cache. This was reported in doctrine#7837 and fixed by doctrine#7865, by making this particular query expire every time. The query must not be cached, since the necessary ID type conversion happens as a side-effect of running the `WhereInWalker`. The Paginator internal query that uses `WhereInWalker` has its DQL re-parsed and transformed in every request. This PR moves the code that determines the DBAL type out of `WhereInWalker` into a dedicated SQL walker class, `RootTypeWalker`. `RootTypeWalker` uses a ~hack~ clever trick to report the type back: It sets the type as the resulting "SQL" string. The benefit is that `RootTypeWalker` results can be cached in the query cache themselves. Only the first time a given DQL query has to be paginated, we need to run this walker to find out the root entity's ID type. After that, the type will be returned from the query cache. With the type information being provided, `Paginator` can take care of the necessary conversions by itself. This happens every time the Paginator is used. The internal query that uses `WhereInWalker` can be cached again since it no longer has side effects.
Make the `Paginator`-internal query (`... WHERE ... IN (id, id2, id3...)`) cacheable in the query cache again. When the Paginator creates the internal subquery that does the actual result limiting, it has to take DBAL type conversions for the identifier column of the paginated root entity into account (doctrine#7820, fixed in doctrine#7821). In order to perform this type conversion, we need to know the DBAL type class for the root entity's `#[Id]`, and we have to figure it out based on a given (arbitrary) DQL query. This requires DQL parsing and inspecting the AST, so doctrine#7821 placed the conversion code in the `WhereInWalker` where all the necessary information is available. The problem is that type conversion has to happen every time the paginator is run, but the query that results from running `WhereInWalker` would be kept in the query cache. This was reported in doctrine#7837 and fixed by doctrine#7865, by making this particular query expire every time. The query must not be cached, since the necessary ID type conversion happens as a side-effect of running the `WhereInWalker`. The Paginator internal query that uses `WhereInWalker` has its DQL re-parsed and transformed in every request. This PR moves the code that determines the DBAL type out of `WhereInWalker` into a dedicated SQL walker class, `RootTypeWalker`. `RootTypeWalker` uses a ~hack~ clever trick to report the type back: It sets the type as the resulting "SQL" string. The benefit is that `RootTypeWalker` results can be cached in the query cache themselves. Only the first time a given DQL query has to be paginated, we need to run this walker to find out the root entity's ID type. After that, the type will be returned from the query cache. With the type information being provided, `Paginator` can take care of the necessary conversions by itself. This happens every time the Paginator is used. The internal query that uses `WhereInWalker` can be cached again since it no longer has side effects.
Make the `Paginator`-internal query (`... WHERE ... IN (id, id2, id3...)`) cacheable in the query cache again. When the Paginator creates the internal subquery that does the actual result limiting, it has to take DBAL type conversions for the identifier column of the paginated root entity into account (doctrine#7820, fixed in doctrine#7821). In order to perform this type conversion, we need to know the DBAL type class for the root entity's `#[Id]`, and we have to figure it out based on a given (arbitrary) DQL query. This requires DQL parsing and inspecting the AST, so doctrine#7821 placed the conversion code in the `WhereInWalker` where all the necessary information is available. The problem is that type conversion has to happen every time the paginator is run, but the query that results from running `WhereInWalker` would be kept in the query cache. This was reported in doctrine#7837 and fixed by doctrine#7865, by making this particular query expire every time. The query must not be cached, since the necessary ID type conversion happens as a side-effect of running the `WhereInWalker`. The Paginator internal query that uses `WhereInWalker` has its DQL re-parsed and transformed in every request. This PR moves the code that determines the DBAL type out of `WhereInWalker` into a dedicated SQL walker class, `RootTypeWalker`. `RootTypeWalker` uses a ~hack~ clever trick to report the type back: It sets the type as the resulting "SQL" string. The benefit is that `RootTypeWalker` results can be cached in the query cache themselves. Only the first time a given DQL query has to be paginated, we need to run this walker to find out the root entity's ID type. After that, the type will be returned from the query cache. With the type information being provided, `Paginator` can take care of the necessary conversions by itself. This happens every time the Paginator is used. The internal query that uses `WhereInWalker` can be cached again since it no longer has side effects.
This patch enforces re-parsing of DQL queries used in the paginator internals, so that the
WhereInWalker
can correctly process query parameters after evaluating the AST structure (root expression, specifically).While this isn't nice for performance, it is also true that the alternative approach would be to run the parsing manually on the DQL query (to extract the root of the selection) and then operating on that (this was already attempted in #7821 in 39d2113).
Fixes #7837