lazy `foldM` #1030

ceedubs · 2016-05-13T10:59:32Z

Foldable.foldM is a really handy function. For example, the following code will sum the input numbers until the sum reaches 10. Once 10 is reached, it will return a Left, or if 10 is never reached, it will return a Right with the final sum.

(1 to 20).toStream.foldM[Long Xor ?, Long](0L){ (acc, a) =>
  val sum = acc + a
  if (sum >= 10) Xor.left(sum) else Xor.right(sum)}

Unfortunately, in its current form, it will eagerly evaluate the entire input stream, because it's implemented in terms of foldLeft. You can see this if you change (1 to 20).toStream to Stream.continually(1).

The initial proposal of foldM in #356 was lazy in consuming the input list (good), but wasn't stack safe (bad).

I've played around with this a bit, and I've noticed that you can implement a foldM that is both lazy and stack-safe given an Iterator[A] and a Monad that implements something like def flatMapEval[A, B](fa: F[A])(f: A => Eval[F[B]]): Eval[F[B]]. However, the only way you can implement flatMapEval in terms of flatMap is by calling .value on an Eval, so this foldM instance ends up not being stack-safe for types that don't override flatMapEval.

I suspect that the best answer to this issue is to define a more constrained Monad type and require it. It's possible that one with the flatMapEval definition I mentioned earlier would work. I suspect something like #616 might be a better solution.

The text was updated successfully, but these errors were encountered:

ceedubs · 2016-05-13T11:11:55Z

cc @eed3si9n @TomasMikula

eed3si9n · 2016-05-13T11:50:17Z

I suspect that the best answer to this issue is to define a more constrained Monad type and require it.

Sounds exciting. LazyMonad, EvalFlatMap, EvalT?

ceedubs · 2016-05-13T12:05:08Z

@eed3si9n that's a good question. I'm currently re-reading http://functorial.com/stack-safety-for-free/index.pdf and thinking about whether/where Eval fits into the picture here. Especially since I just moved free back into its own module which I'm now hoping wasn't a mistake...

TomasMikula · 2016-05-13T18:17:34Z

I wasn't able to come up with an implementation that is both stack-safe and able to terminate early. What is yours (given that you have flatMapEval at hand)?

I was only able to imagine one if the return type of f (the accumulation function) indicated that there is no need to process any more elements, for example by returning a M[B] Xor M[B].

Maybe the current foldM should be renamed to foldLeftM?

I think BindRec is a good thing to have regardless, but I don't see how it can help here (without also modifying the signature of the accumulation function).

ceedubs · 2016-05-13T22:03:10Z

@TomasMikula I don't think you can create a general solution that satisfies both requirements via foldLeft/foldRight (even with flatMapEval), but instances for specific types can implement this. For example, I think that this works for any type that can return an iterator:

def foldM[G[_], A, B](it: Iterator[A], z: B)(f: (B, A) => G[B])(implicit G: Monad[G]): G[B] = {
  def loop(z: B): Eval[G[B]] =
    if (it.hasNext) Eval.defer(G.flatMapEval(f(z, it.next))(loop))
    else Always(G.pure(z))
  loop(z).value
}

So I guess that Iterator is a better abstraction than Foldable ;)

ceedubs · 2016-05-13T22:05:23Z

I meant to include in my last comment that the Stack Safety for Free paper specifically mentions foldM as something that can be implemented safely via MonadRec (using FreeT). I haven't given this a try in the scala world yet.

TomasMikula · 2016-05-13T23:18:51Z

@ceedubs Can your implementation terminate early, e.g. for an infinite Iterator? I don't see how.

TomasMikula · 2016-05-13T23:24:31Z

Intuitively, the decision about termination needs to be made by f. I can imagine two ways f can execute/indicate such decision:

take a lazy argument that it possibly doesn't evaluate (the foldRight style); or
indicate the wish to terminate in the return type (such as G[B] Xor G[B], where Right means "done" and Left means "continue").

None of these is the case in your implementation, so I wonder...

ceedubs · 2016-05-14T02:27:36Z

@TomasMikula yeah as long as the flatMapEval implementation is sufficiently lazy, it terminates early. For example, consider this implementation of flatMapEval on Option:

      override def flatMapEval[A, B](fa: Option[A])(f: A => Eval[Option[B]]): Eval[Option[B]] =
        fa match {
          case None => Now(None)
          case Some(a) => f(a)
        }

If we use the foldM method I pasted above with Stream, then you get this:

scala> Stream.continually(1).foldM(0)((acc, i) => if (acc < 10) Some(acc + i) else None)
res1: Option[Int] = None

Once we return None, the flatMapEval call doesn't evaluate f(a), so we don't loop on the rest of the iterator. This is somewhat similar to the way that foldRight works, but it allows you to use the accumulated result of previous items.

Having said all of that, I don't think this is a very satisfying solution. Even if we were to add a FlatMapEval type class and require it on foldM calls, we wouldn't have a general solution in terms of foldLeft/foldRight, so each Foldable instance would have to override their foldM implementation.

ceedubs · 2016-05-14T02:31:42Z

@TomasMikula I pushed the code I was playing around with here.

TomasMikula · 2016-05-14T04:04:28Z

Yeah, but can you terminate early and return something else than None?

ceedubs · 2016-05-14T11:50:23Z

@TomasMikula not when working with Option, but I guess I would just view that as inherent to what foldM is. I would use Xor instead of Option if I wanted to terminate the computation early (with a Left) and still return a value.

TomasMikula · 2016-05-14T17:32:31Z

@ceedubs that doesn't seem very satisfying indeed. You basically use the monadic effect for early termination, which means 1) you cannot use the monadic effect for something else, and 2) Xor is about the only monad you can use for that.

So why not use f: (B, A) => G[B Xor B] and then it should be possible to use BindRec for stack safety. I could flesh out the implementation later this weekend.

ceedubs · 2016-05-16T13:57:36Z

@TomasMikula You should be able to use XorT if you want both the possibility of early termination and another effect. I think there are quite a few use-cases where you want to fold across the entire structure as long as your monadic binds succeed, so it would seem unfortunate to me to force users into Xor if they don't need the "early termination" effect.

I say that specifically with regard to foldM though. For the general case of left-associative folds, I actually recommended something very similar to what you have brought up in scalaz going on two years ago.

I was playing around locally, and it's pretty straightforward to create a stack-safe foldM with laziness if you have BindRec and FreeT (you can take the runSafeT approach described in Stack Safety for Free). I haven't yet tried writing a version of foldM implemented directly with tailRecM without the use of FreeT (currently cats doesn't have FreeT, and other free structures don't live in the core module).

TomasMikula · 2016-05-16T18:29:38Z

Makes sense. I didn't mean to impose Xor on all users of foldM, I imagined that would be a different method, providing a more explicit support for early termination.

TomasMikula · 2016-05-16T18:38:27Z

I haven't yet tried writing a version of foldM implemented directly with tailRecM without the use of FreeT

Given this signature of tailRecM

@typeclass trait FlatMapRec[F[_]] extends FlatMap[F] {
  def tailRecM[A, B](f: A => F[A Xor B])(a: A): F[B]
}

@typeclass trait MonadRec[F[_]] extends FlatMapRec[F] with Applicative[F]

this could be the implementation of foldM for List:

def foldM[M[_], A, B](z: B, as: List[A])(f: (B, A) => M[B])(implicit M: MonadRec[M]): M[B] = {

  def go(x: (B, List[A])): M[(B, List[A]) Xor B] = x match {
    case (b, Nil) => M.pure(Xor.right(b))
    case (b, a::as) => M.map(f(b, a))(b1 => Xor.left((b1, as)))
  }

  M.tailRecM(go)((z, as))
}

ceedubs · 2016-05-16T18:48:36Z

@TomasMikula nice - thanks! I was scratching my head trying to figure out how to do things with tailRecM. I was trying to do this for a generic Foldable, not specifically a List, which added to the confusion. It was so easy when I just delegated to a FreeT of Id :). I hope to follow up on this in the near future.

TomasMikula · 2016-05-16T19:26:39Z

Yeah, I don't think we can generalize this to any Foldable, but we sure can to any Uncons (which I'm not sure would be a lawful typeclass, though):

@typeclass trait Uncons[B, A] {
  def uncons(b: B): Option[(A, B)]
}

EDIT: which is essentially the same as Iterator :).

Regarding your FreeT experiment, assuming you are using FreeT[Id, M, A], I think you should be able to replace it by Free[M, A], if you also add Free.foldMapRec, i.e. a stack-safe version of foldMap (or just strengthen the constraints of current Free.foldMap to require M[_]: MonadRec).

Resolves typelevel#1030, though it may be desirable change the default implementation to be lazy in the future. This can be done with `FreeT` but as of now I'm not sure how to do it without.

ceedubs added a commit to ceedubs/cats that referenced this issue May 18, 2016

Add foldMRec

4d08b60

Resolves typelevel#1030, though it may be desirable change the default implementation to be lazy in the future. This can be done with `FreeT` but as of now I'm not sure how to do it without.

ceedubs mentioned this issue May 18, 2016

Add foldMRec TomasMikula/cats#1

Closed

peterneyens mentioned this issue Oct 19, 2016

Lazy foldM for "Iterables" #1414

Merged

kailuowang closed this as completed in 2cc1122 Oct 22, 2016

peterneyens mentioned this issue Jun 3, 2017

Foldable instance for List breaks substitution #1716

Closed

LukaJCB mentioned this issue Oct 20, 2017

Add scanMap and scanM combinators to Observable monix/monix#449

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

lazy `foldM` #1030

lazy `foldM` #1030

ceedubs commented May 13, 2016

ceedubs commented May 13, 2016

eed3si9n commented May 13, 2016

ceedubs commented May 13, 2016

TomasMikula commented May 13, 2016

ceedubs commented May 13, 2016

ceedubs commented May 13, 2016

TomasMikula commented May 13, 2016

TomasMikula commented May 13, 2016 •

edited

Loading

ceedubs commented May 14, 2016

ceedubs commented May 14, 2016

TomasMikula commented May 14, 2016

ceedubs commented May 14, 2016

TomasMikula commented May 14, 2016

ceedubs commented May 16, 2016

TomasMikula commented May 16, 2016

TomasMikula commented May 16, 2016 •

edited

Loading

ceedubs commented May 16, 2016

TomasMikula commented May 16, 2016 •

edited

Loading

lazy foldM #1030

lazy foldM #1030

Comments

ceedubs commented May 13, 2016

ceedubs commented May 13, 2016

eed3si9n commented May 13, 2016

ceedubs commented May 13, 2016

TomasMikula commented May 13, 2016

ceedubs commented May 13, 2016

ceedubs commented May 13, 2016

TomasMikula commented May 13, 2016

TomasMikula commented May 13, 2016 • edited Loading

ceedubs commented May 14, 2016

ceedubs commented May 14, 2016

TomasMikula commented May 14, 2016

ceedubs commented May 14, 2016

TomasMikula commented May 14, 2016

ceedubs commented May 16, 2016

TomasMikula commented May 16, 2016

TomasMikula commented May 16, 2016 • edited Loading

ceedubs commented May 16, 2016

TomasMikula commented May 16, 2016 • edited Loading

lazy `foldM` #1030

lazy `foldM` #1030

TomasMikula commented May 13, 2016 •

edited

Loading

TomasMikula commented May 16, 2016 •

edited

Loading

TomasMikula commented May 16, 2016 •

edited

Loading