Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

lazy foldM #1030

Closed
ceedubs opened this issue May 13, 2016 · 18 comments
Closed

lazy foldM #1030

ceedubs opened this issue May 13, 2016 · 18 comments

Comments

@ceedubs
Copy link
Contributor

ceedubs commented May 13, 2016

Foldable.foldM is a really handy function. For example, the following code will sum the input numbers until the sum reaches 10. Once 10 is reached, it will return a Left, or if 10 is never reached, it will return a Right with the final sum.

(1 to 20).toStream.foldM[Long Xor ?, Long](0L){ (acc, a) =>
  val sum = acc + a
  if (sum >= 10) Xor.left(sum) else Xor.right(sum)}

Unfortunately, in its current form, it will eagerly evaluate the entire input stream, because it's implemented in terms of foldLeft. You can see this if you change (1 to 20).toStream to Stream.continually(1).

The initial proposal of foldM in #356 was lazy in consuming the input list (good), but wasn't stack safe (bad).

I've played around with this a bit, and I've noticed that you can implement a foldM that is both lazy and stack-safe given an Iterator[A] and a Monad that implements something like def flatMapEval[A, B](fa: F[A])(f: A => Eval[F[B]]): Eval[F[B]]. However, the only way you can implement flatMapEval in terms of flatMap is by calling .value on an Eval, so this foldM instance ends up not being stack-safe for types that don't override flatMapEval.

I suspect that the best answer to this issue is to define a more constrained Monad type and require it. It's possible that one with the flatMapEval definition I mentioned earlier would work. I suspect something like #616 might be a better solution.

@ceedubs
Copy link
Contributor Author

ceedubs commented May 13, 2016

cc @eed3si9n @TomasMikula

@eed3si9n
Copy link
Contributor

I suspect that the best answer to this issue is to define a more constrained Monad type and require it.

Sounds exciting. LazyMonad, EvalFlatMap, EvalT?

@ceedubs
Copy link
Contributor Author

ceedubs commented May 13, 2016

@eed3si9n that's a good question. I'm currently re-reading http://functorial.com/stack-safety-for-free/index.pdf and thinking about whether/where Eval fits into the picture here. Especially since I just moved free back into its own module which I'm now hoping wasn't a mistake...

@TomasMikula
Copy link
Contributor

I wasn't able to come up with an implementation that is both stack-safe and able to terminate early. What is yours (given that you have flatMapEval at hand)?

I was only able to imagine one if the return type of f (the accumulation function) indicated that there is no need to process any more elements, for example by returning a M[B] Xor M[B].

Maybe the current foldM should be renamed to foldLeftM?

I think BindRec is a good thing to have regardless, but I don't see how it can help here (without also modifying the signature of the accumulation function).

@ceedubs
Copy link
Contributor Author

ceedubs commented May 13, 2016

@TomasMikula I don't think you can create a general solution that satisfies both requirements via foldLeft/foldRight (even with flatMapEval), but instances for specific types can implement this. For example, I think that this works for any type that can return an iterator:

def foldM[G[_], A, B](it: Iterator[A], z: B)(f: (B, A) => G[B])(implicit G: Monad[G]): G[B] = {
  def loop(z: B): Eval[G[B]] =
    if (it.hasNext) Eval.defer(G.flatMapEval(f(z, it.next))(loop))
    else Always(G.pure(z))
  loop(z).value
}

So I guess that Iterator is a better abstraction than Foldable ;)

@ceedubs
Copy link
Contributor Author

ceedubs commented May 13, 2016

I meant to include in my last comment that the Stack Safety for Free paper specifically mentions foldM as something that can be implemented safely via MonadRec (using FreeT). I haven't given this a try in the scala world yet.

@TomasMikula
Copy link
Contributor

@ceedubs Can your implementation terminate early, e.g. for an infinite Iterator? I don't see how.

@TomasMikula
Copy link
Contributor

TomasMikula commented May 13, 2016

Intuitively, the decision about termination needs to be made by f. I can imagine two ways f can execute/indicate such decision:

  • take a lazy argument that it possibly doesn't evaluate (the foldRight style); or
  • indicate the wish to terminate in the return type (such as G[B] Xor G[B], where Right means "done" and Left means "continue").

None of these is the case in your implementation, so I wonder...

@ceedubs
Copy link
Contributor Author

ceedubs commented May 14, 2016

@TomasMikula yeah as long as the flatMapEval implementation is sufficiently lazy, it terminates early. For example, consider this implementation of flatMapEval on Option:

      override def flatMapEval[A, B](fa: Option[A])(f: A => Eval[Option[B]]): Eval[Option[B]] =
        fa match {
          case None => Now(None)
          case Some(a) => f(a)
        }

If we use the foldM method I pasted above with Stream, then you get this:

scala> Stream.continually(1).foldM(0)((acc, i) => if (acc < 10) Some(acc + i) else None)
res1: Option[Int] = None

Once we return None, the flatMapEval call doesn't evaluate f(a), so we don't loop on the rest of the iterator. This is somewhat similar to the way that foldRight works, but it allows you to use the accumulated result of previous items.

Having said all of that, I don't think this is a very satisfying solution. Even if we were to add a FlatMapEval type class and require it on foldM calls, we wouldn't have a general solution in terms of foldLeft/foldRight, so each Foldable instance would have to override their foldM implementation.

@ceedubs
Copy link
Contributor Author

ceedubs commented May 14, 2016

@TomasMikula I pushed the code I was playing around with here.

@TomasMikula
Copy link
Contributor

Yeah, but can you terminate early and return something else than None?

@ceedubs
Copy link
Contributor Author

ceedubs commented May 14, 2016

@TomasMikula not when working with Option, but I guess I would just view that as inherent to what foldM is. I would use Xor instead of Option if I wanted to terminate the computation early (with a Left) and still return a value.

@TomasMikula
Copy link
Contributor

@ceedubs that doesn't seem very satisfying indeed. You basically use the monadic effect for early termination, which means 1) you cannot use the monadic effect for something else, and 2) Xor is about the only monad you can use for that.

So why not use f: (B, A) => G[B Xor B] and then it should be possible to use BindRec for stack safety. I could flesh out the implementation later this weekend.

@ceedubs
Copy link
Contributor Author

ceedubs commented May 16, 2016

@TomasMikula You should be able to use XorT if you want both the possibility of early termination and another effect. I think there are quite a few use-cases where you want to fold across the entire structure as long as your monadic binds succeed, so it would seem unfortunate to me to force users into Xor if they don't need the "early termination" effect.

I say that specifically with regard to foldM though. For the general case of left-associative folds, I actually recommended something very similar to what you have brought up in scalaz going on two years ago.

I was playing around locally, and it's pretty straightforward to create a stack-safe foldM with laziness if you have BindRec and FreeT (you can take the runSafeT approach described in Stack Safety for Free). I haven't yet tried writing a version of foldM implemented directly with tailRecM without the use of FreeT (currently cats doesn't have FreeT, and other free structures don't live in the core module).

@TomasMikula
Copy link
Contributor

Makes sense. I didn't mean to impose Xor on all users of foldM, I imagined that would be a different method, providing a more explicit support for early termination.

@TomasMikula
Copy link
Contributor

TomasMikula commented May 16, 2016

I haven't yet tried writing a version of foldM implemented directly with tailRecM without the use of FreeT

Given this signature of tailRecM

@typeclass trait FlatMapRec[F[_]] extends FlatMap[F] {
  def tailRecM[A, B](f: A => F[A Xor B])(a: A): F[B]
}

@typeclass trait MonadRec[F[_]] extends FlatMapRec[F] with Applicative[F]

this could be the implementation of foldM for List:

def foldM[M[_], A, B](z: B, as: List[A])(f: (B, A) => M[B])(implicit M: MonadRec[M]): M[B] = {

  def go(x: (B, List[A])): M[(B, List[A]) Xor B] = x match {
    case (b, Nil) => M.pure(Xor.right(b))
    case (b, a::as) => M.map(f(b, a))(b1 => Xor.left((b1, as)))
  }

  M.tailRecM(go)((z, as))
}

@ceedubs
Copy link
Contributor Author

ceedubs commented May 16, 2016

@TomasMikula nice - thanks! I was scratching my head trying to figure out how to do things with tailRecM. I was trying to do this for a generic Foldable, not specifically a List, which added to the confusion. It was so easy when I just delegated to a FreeT of Id :). I hope to follow up on this in the near future.

@TomasMikula
Copy link
Contributor

TomasMikula commented May 16, 2016

Yeah, I don't think we can generalize this to any Foldable, but we sure can to any Uncons (which I'm not sure would be a lawful typeclass, though):

@typeclass trait Uncons[B, A] {
  def uncons(b: B): Option[(A, B)]
}

EDIT: which is essentially the same as Iterator :).

Regarding your FreeT experiment, assuming you are using FreeT[Id, M, A], I think you should be able to replace it by Free[M, A], if you also add Free.foldMapRec, i.e. a stack-safe version of foldMap (or just strengthen the constraints of current Free.foldMap to require M[_]: MonadRec).

ceedubs added a commit to ceedubs/cats that referenced this issue May 18, 2016
Resolves typelevel#1030, though it may be desirable change the default
implementation to be lazy in the future. This can be done with `FreeT`
but as of now I'm not sure how to do it without.
peterneyens pushed a commit to peterneyens/cats that referenced this issue Oct 19, 2016
Resolves typelevel#1030, though it may be desirable change the default
implementation to be lazy in the future. This can be done with `FreeT`
but as of now I'm not sure how to do it without.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants