Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[internal] add failing test for FrozenDict equality issue #13389

Merged
merged 1 commit into from
Dec 4, 2021

Conversation

tdyas
Copy link
Contributor

@tdyas tdyas commented Oct 28, 2021

Add a failing test which shows that FrozenDict's do not compare equally with the same elements but different insertion orders.

[ci skip-rust]

Copy link
Contributor

@Eric-Arellano Eric-Arellano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was intentional. @stuhood is it fine from an engine perspective for the ordering to not matter?

I imagine fixing this could improve cache hit rates a lot?

@stuhood
Copy link
Member

stuhood commented Oct 28, 2021

That was intentional. @stuhood is it fine from an engine perspective for the ordering to not matter?

I imagine fixing this could improve cache hit rates a lot?

I'm surprised by this... but yea, as long as __eq__ and __hash__ are "consistent" (for whichever definition of "consistent" you'd like), we're fine. Choosing to account for order or not is up to an implementation.

Copy link
Member

@stuhood stuhood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably fix this.

(didn't encounter it: just combing through open reviews)

@tdyas tdyas changed the title add failing test for FrozenDict equality issue [internal] add failing test for FrozenDict equality issue Dec 4, 2021
@tdyas tdyas merged commit 0bb6655 into pantsbuild:main Dec 4, 2021
@tdyas tdyas deleted the frozen_dict_different_orders branch December 4, 2021 18:21
illicitonion added a commit that referenced this pull request Dec 7, 2021
Internal changes:

* [internal] Remove superfluous f-string specifiers ([#13821](#13821))

* [internal] scala: extract annotations as consumed types ([#13810](#13810))

* [jvm] Split nailgun digest from input file digests ([#13813](#13813))

* [internal] jvm: add jre_major_version and use stderr to properly extract version ([#13812](#13812))

* [internal] Clean up Go `embed` support's handling of dependencies ([#13801](#13801))

* [internal] scala: handle package object syntax in parser ([#13809](#13809))

* [internal] java: fix junit sentinel rule setup ([#13815](#13815))

* [internal] upgrade to rust v1.57.0 ([#13807](#13807))

* [internal] add failing test for FrozenDict equality issue ([#13389](#13389))

* [internal] Use `PyObject` instead of `Value` in more places ([#13802](#13802))

* Remove MultiPlatform Process abstractions ([#12725](#12725))

* [internal] add `JvmToolBase` for lockfile handling for JVM tools ([#13777](#13777))

* [internal] Port `MergeDigests` to Rust ([#13773](#13773))

* [jvm] Spawn nailgun servers outside the pool lock ([#13796](#13796))

* [internal] DRY loading internal Go binaries ([#13800](#13800))

* [internal] Convert unit tests to use pytest ([#13798](#13798))

* [internal] remove dead code - socket util. ([#13797](#13797))

* [internal] Reorganize Go parser scripts ([#13791](#13791))

* Adds Jackson core/datatype to `JVM_ARTIFACT_MAPPINGS` ([#13792](#13792))

* [internal] go: initial support for embedding resources ([#13743](#13743))

* [internal] Refer to `go.mod` path when downloading packages ([#13786](#13786))

* [internal] More robust Go dependency inference tests ([#13785](#13785))

* [internal] `tailor` doesn't add `go_package` for `testdata` folder ([#13783](#13783))

* [internal] Add Scala backend to dryrun for wheel builds. ([#13772](#13772))

* [internal] Unify JVM thirdparty resolves ([#13771](#13771))

* [internal] scala: infer dependencies from consumed symbols and intermediate scopes ([#13758](#13758))

* [internal] java: infer scala encoded symbols ([#13739](#13739))

* [internal] scala: parse and report package scopes ([#13738](#13738))

* [internal] go: configure included env vars for `GoSdkProcess` ([#13734](#13734))

* Fix some `./cargo audit` vulnerabilities. ([#13728](#13728))

* [internal] fix non-empty __init__.py ([#13730](#13730))

* Compute RepositoryPex directly from addresses. ([#13716](#13716))

* Upgrade to cargo-audit 0.16.0. ([#13729](#13729))

* Simplify `NativeHandler`. ([#13727](#13727))

* [internal] Switch to a maintained fork of the `fuse` crate for `brfs`. ([#13679](#13679))

* [internal] Add infrastructure to support deprecating field names ([#13666](#13666))

* [internal] Introduce OptionalPex/OptionalPexRequest. ([#13712](#13712))

* [internal] tailor adds go_package targets ([#13703](#13703))

* [internal] Remove unused testproject for pants-plugin ([#13704](#13704))

* [internal] Rename ambiguous `subpath` variable for Go code ([#13701](#13701))

* [internal] scala: generate the JVM names seen by Java code for Scala code ([#13696](#13696))

* Use RequirementsPexRequest in run_pex_binary.py. ([#13693](#13693))

* [internal] Refactor finding owning `go_mod` for first-party packages ([#13695](#13695))

* [internal] repl: add append_only_caches / run_in_workspace attributes to ReplRequest ([#13599](#13599))

* [internal] switch back to official `cargo-ensure-prefix` crate ([#13692](#13692))

* [internal] scala: extract type names from all Type.* AST nodes ([#13685](#13685))

* [internal] Convert unit tests to use pytest ([#13652](#13652))

* Unblock codegen support for java export analysis (#13645) ([#13675](#13675))

* [internal] upgrade to Rust 2021 Edition ([#13644](#13644))

* [internal] Don't store derived values on `go_first_party_package` targets ([#13676](#13676))

* Upgrade to py03 0.15.1. ([#13725](#13725))

* Add PyPDF2 to module mapping ([#13717](#13717))

* Upgrade console and indacatif. ([#13726](#13726))
huonw added a commit that referenced this pull request Nov 30, 2023
…#20221)

This adjusts `FrozenDict` to have the comparisons like `__eq__` and
`__lt__` and hashing operate in a slightly less error-prone way, to fix
#20219:

- they're order-insensitive, For instance, `FrozenDict({1: 2, 3: 4}) ==
FrozenDict({3: 4, 1: 2})` now compares `True`, but previously was
`False`
- `__eq__` works against `dict`. For instance, `FrozenDict({1: 2}) ==
{1: 2}` now compares `True`, but previously was `False`. (This matches
behaviour of stdlib `dict` subclasses, e.g.
`__import__('collections').Counter([1, 1]) == {1: 2}` is `True`.)

This requires two notable adjustments:

- `__lt__` is significantly more expensive now, because it has to sort
both LHS and RHS (discussed and potentially revised in #20243)
- Computing the hash changes value and is arguably less "safe", because
it now aggregates multiple independent hashes with xor/`^` rather than
hashing one huge tuple and letting Python's tuple hashing magic handle
it
(https://github.com/python/cpython/blob/3faf8e586d36e73faba13d9b61663afed6a24cb4/Objects/tupleobject.c#L291-L344).
I don't think this matters, as, I believe, this hash is only used for
(effectively) being a dict key, and so collisions are at worst a
denial-of-service, because they're always double-checked with `__eq__`;
cryptographic hashing is handled via sha256 etc. in the Rust code.

This potentially improves `@rule` memoisation, because previously some
rule inputs might've been computed differently, but be semantically
identical.

Reminder of history, in case there's a critical reason why this
behaviour should be preserved:

- `FrozenDict` implemented in #9366, including tests that explicitly
call out "Order matters"
- Order-sensitive `__lt__` added in #12898, including tests that
explicitly call out "Order matters"
- `xfail`ed test about `__eq__` being order dependent added in #13389

In theory, a low-level data structure like this would be a good
candidate for property testing with a tool like Hypothesis, with
properties like "if `a == b`, then `hash(a) == hash(b)`". It seems like
Pants doesn't currently have any tests using Hypothesis so I'm not
inclined to add one here.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants