Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add more conversions between HashSet and HashMap #349

Open
josephcsible opened this issue Feb 2, 2022 · 13 comments
Open

Add more conversions between HashSet and HashMap #349

josephcsible opened this issue Feb 2, 2022 · 13 comments

Comments

@josephcsible
Copy link
Contributor

josephcsible commented Feb 2, 2022

Rough sketch:

{-# LANGUAGE CPP #-}

import Data.Semigroup (Arg(..))
import qualified Data.HashMap.Internal.Array as A
import Data.HashMap.Internal
import Data.HashSet.Internal

#if MIN_VERSION_hashable(1,3,0)
fromArgSet :: HashSet (Arg k a) -> HashMap k a
fromArgSet = go . toMap
  where
    go Empty = Empty
    go (Leaf h (L (Arg k v) _)) = Leaf h (L k v)
    go (BitmapIndexed b ary) = BitmapIndexed b $ A.map go ary
    go (Full ary) = Full $ A.map go ary
    go (Collision h ary) = Collision h $
                           A.map' (\ (L (Arg k v) _) -> L k v) ary
{-# INLINE fromArgSet #-}

argSet :: HashMap k a -> HashSet (Arg k a)
argSet = fromMap . go
  where
    go Empty = Empty
    go (Leaf h (L k v)) = Leaf h (L (Arg k v) ())
    go (BitmapIndexed b ary) = BitmapIndexed b $ A.map go ary
    go (Full ary) = Full $ A.map go ary
    go (Collision h ary) = Collision h $
                           A.map' (\ (L k v) -> L (Arg k v) ()) ary
{-# INLINE argSet #-}
#endif

These are safe since Arg x y and x have the same hash (in hashable 1.3.0.0 and up, hence the #if). See also haskell/containers#814.

@treeowl
Copy link
Collaborator

treeowl commented Feb 2, 2022

Much better to just raise the lower bound on hashable. The only other acceptable (but bad) option is to add an unnecessary Hashable constraint so there can be an #else.

@josephcsible
Copy link
Contributor Author

Much better to just raise the lower bound on hashable.

I personally wouldn't mind this, but I'm not sure if anyone else would. If nobody complains by the time I bother to write the PR, then I'll do that instead.

The only other acceptable (but bad) option is to add an unnecessary Hashable constraint so there can be an #else.

That's not an option. Arg had a Hashable instance even before hashable-1.3.0.0, but the implementation of it was wrong back then.

@treeowl
Copy link
Collaborator

treeowl commented Feb 2, 2022

Oh, right, it wasn't lawful. Yeah, that's no good at all.

@sjakobi
Copy link
Member

sjakobi commented Feb 2, 2022

Please use #55 to discuss fromSet.

Regarding fromArgSet and argSet, what's the use case / motivation?

@sjakobi
Copy link
Member

sjakobi commented Feb 2, 2022

Regarding the proposed implementations for fromArgSet and argSet, it's not even clear that these will produce valid HashMaps: hashable doesn't seem to promise that x and Arg x _ will always produce the same hash.

You can ask the hashable maintainer to document such a promise of course.

@treeowl
Copy link
Collaborator

treeowl commented Feb 2, 2022

@sjakobi yes, that's true, we should ask for that documentation. But it would take a special effort to make them hash to different values, with absolutely no benefit.

@josephcsible
Copy link
Contributor Author

I didn't know about #55. Now that I do, I removed fromSet from this issue.

I opened haskell-unordered-containers/hashable#242 about adding that guarantee.

@sjakobi
Copy link
Member

sjakobi commented Feb 2, 2022

I wonder whether it would be better to add a more general purpose function like

mapHashPreserving :: (k1 -> v1 -> (k2, v2)) -> HashMap k1 v1 -> HashMap k2 v2

Users could then implement the proposed functions like this:

fromArgSet :: HashSet (Arg k a) -> HashMap k a
fromArgSet = mapHashPreserving (\(Arg k v) _ -> (k, v)) . toMap

argSet :: HashMap k a -> HashSet (Arg k a)
argSet = fromMap . mapHashPreserving (\k v -> (Arg k v, ()))

The main advantage would be that users can use mapHashPreserving for other hash-preserving functions too.

@josephcsible
Copy link
Contributor Author

Aside from adding and removing an Arg wrapper around the key, what else would it be safe to do with mapHashPreserving that you can't already do with just mapWithKey?

@treeowl
Copy link
Collaborator

treeowl commented Feb 2, 2022

Wrap/unwrap (almost) any newtype.

@josephcsible
Copy link
Contributor Author

Okay, that makes sense then. And if we go with that, should we add this too?

traverseHashPreserving :: Applicative t => (k1 -> v1 -> t (k2, v2)) -> HashMap k1 v1 -> t (HashMap k2 v2)

@treeowl
Copy link
Collaborator

treeowl commented Feb 2, 2022

Traverse all the things, I think. But now I'm doubting the newtype thing a bit. Mapping keys and mapping values separately, rewrite rules can turn the operation into a coercion. For the combined thing, that may also be possible, but if so will require tricky footwork. So is there any other application?

@sjakobi
Copy link
Member

sjakobi commented Feb 2, 2022

Aside from adding and removing an Arg wrapper around the key, what else would it be safe to do with mapHashPreserving that you can't already do with just mapWithKey?

People may want to use other Argish types, e.g. ones that are strict in one or both of the fields.

But frankly I still don't have an idea what the original argSet and fromArgSet functions would be useful for.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants