
[WIP] Fuse Seq.* calls #1525

Closed
wants to merge 8 commits into from

Conversation

@forki (Contributor) commented Sep 8, 2016

This is highly experimental.
This PR enables the compiler to rewrite

[1; 2] |> Seq.map (fun x -> x + 2) |> Seq.map (fun x -> x * 2)
["hello"; "world"; "!"] |> Seq.map (fun (y:string) -> y.Length) |> Seq.map (fun x -> x * 3)

into:

Seq.map (fun x -> (x + 2) * 2) [1; 2]
Seq.map (fun x -> x.Length * 3) ["hello"; "world"; "!"]

Future rules:

Seq.filter f (Seq.filter g xs)  ==>  Seq.filter (fun x -> g x && f x) xs
Seq.iter f (Seq.map g xs)       ==>  Seq.iter (fun x -> f (g x)) xs
Seq.iter f (Seq.filter g xs)    ==>  Seq.iter (fun x -> if g x then f x) xs
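As a quick illustration (not part of the PR itself), the first rule can be sanity-checked on sample data. Note that a real rewrite must also preserve side-effect ordering, which a plain element comparison like this does not capture:

let xs = [1 .. 10]
let f x = x % 2 = 0
let g x = x > 3

// Nested form and the fused form from the first rule above.
let original = Seq.filter f (Seq.filter g xs)
let fused    = Seq.filter (fun x -> g x && f x) xs

printfn "%b" (Seq.toList original = Seq.toList fused)   // prints true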

@forki changed the title from "WIP Try to deforest Seq.map calls" to "WIP Try to fuse Seq.map calls" on Sep 8, 2016
@forki changed the title from "WIP Try to fuse Seq.map calls" to "WIP Fuse multiple Seq.map calls" on Sep 8, 2016
@dsyme (Contributor) commented Sep 8, 2016

Hmmm... I must admit I'm not totally sure I want to do these kinds of optimizations in F#. There are so many high-level optimizations like this that could be applied, and each one can cause surprise when it stops working (i.e. when it doesn't commute with some other code change). For example, what of Seq.map (Seq.filter (Seq.map ...)) reducing to some Seq.choose, and hundreds of other rewriting optimizations?

For example, an alternative approach may be to improve and broaden the existing sequence state machine compilation to work on a broader range of expressions. Currently that only applies to the desugared form of seq { ... } expressions. But maybe it could also apply to Seq.map f (Seq.map g xs) and so on, transforming them to seq { for x in xs do let y = g x in yield f y } and then applying state machine compilation? That may commute with more combinations.
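For concreteness, here is the pair of shapes being discussed, written out as a small sketch (mine, not code from the thread); both enumerate the same elements:

let xs = [1; 2; 3]
let f x = x + 2
let g x = x * 10

// Combinator form and the equivalent seq-expression form.
let viaCombinators = Seq.map f (Seq.map g xs)
let viaSeqExpr =
    seq {
        for x in xs do
            let y = g x
            yield f y
    }

printfn "%b" (Seq.toList viaCombinators = Seq.toList viaSeqExpr)   // prints true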

@forki (Contributor, Author) commented Sep 8, 2016

I agree that rewriting map and filter into choose would feel weird. Personally, I would only try to reduce two functions into one of the already existing ones. I think this would also ensure that we reach a fixed point.

Rewriting into seq expressions is also interesting. We could do that in a separate step. Does this sound reasonable?

@forki (Contributor, Author) commented Sep 8, 2016

After rereading: do you suggest we should rewrite every Seq.map into seq expressions and just let the seq expression rewriter do its thing?

@liboz (Contributor) commented Sep 8, 2016

@forki Another thing you could try is to do something similar to LINQ: when you use map, you try to downcast the enumerator of the input sequence to a specialized MapEnumerator that keeps track of the previous mapping function. Then I think you can just compose the functions.

That seems like it would be safer.

@forki (Contributor, Author) commented Sep 8, 2016

You mean rewriting Seq.map f xs into List.map f xs when xs is a list? That would change lazy evaluation into eager evaluation. We can't do that.

@liboz (Contributor) commented Sep 8, 2016

No, I mean like:

let map f (ie: seq<'T>) =
    // SpecialMapEnumerator is hypothetical: an enumerator that remembers its
    // original source and mapping function.
    match ie.GetEnumerator() with
    | :? SpecialMapEnumerator<'T> as me -> Seq.map (me.mapFunction >> f) me.source
    | _ -> Seq.map f ie

I don't think this would change the lazy evaluation of map
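A slightly fuller sketch of the same idea (my illustration, not liboz's code and not what FSharp.Core actually does): here the type test is on the sequence itself rather than on its enumerator, which sidesteps the question of how the enumerator would expose its original source type.

open System.Collections
open System.Collections.Generic

// Hypothetical marker interface: "I am already 'source |> Seq.map mapping';
// give me a further mapping and I will return a single fused map."
type IFusableMap<'U> =
    abstract member FuseWith<'V> : ('U -> 'V) -> seq<'V>

// Hypothetical wrapper type (not the real MapEnumerator in FSharp.Core).
type FusedMap<'T, 'U>(source: seq<'T>, mapping: 'T -> 'U) =
    interface IFusableMap<'U> with
        member __.FuseWith<'V>(f: 'U -> 'V) : seq<'V> =
            FusedMap<'T, 'V>(source, mapping >> f) :> seq<'V>
    interface IEnumerable<'U> with
        member __.GetEnumerator() = (Seq.map mapping source).GetEnumerator()
    interface IEnumerable with
        member __.GetEnumerator() =
            (Seq.map mapping source :> IEnumerable).GetEnumerator()

// A 'map' that fuses with a previous fusable map and otherwise just wraps its input.
let fusingMap (f: 'U -> 'V) (xs: seq<'U>) : seq<'V> =
    match box xs with
    | :? IFusableMap<'U> as m -> m.FuseWith f
    | _ -> FusedMap<'U, 'V>(xs, f) :> seq<'V>

With this shape, xs |> fusingMap g |> fusingMap f enumerates the source once and applies g >> f per element, which is the fusion this PR performs at compile time instead.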

@forki (Contributor, Author) commented Sep 8, 2016

And in a final rewriting step we would rewrite everything back to seq again?

@liboz (Contributor) commented Sep 8, 2016

@forki

Seq.map currently uses MapEnumerator<'T>, defined here. It then upcasts that to an IEnumerator before using it to generate the sequence, so it already does the correct upcasting when needed. The only thing we need to do is check, when we call GetEnumerator, whether the return value is a SpecialMapEnumerator, so that we can compose the map functions instead of just passing them along as before.

@forki (Contributor, Author) commented Sep 8, 2016

I think I don't really understand that suggestion yet.

Tomorrow I'll try to rewrite all Seq.map into seq expressions and see how far this gets optimized in the seq expression optimizer.

For lists I will try an optimization similar to the one I proposed here. The only difference I see is that we need to make sure the inner lambda is side-effect free. I think I saw helpers that test this. They are marked as being slow, but hey ;-)

@polytypic commented Sep 9, 2016

A more general rewrite system, like the one described in

Playing by the Rules: Rewriting as a practical optimisation technique in GHC

might be very useful. This could allow optimizations such as these (see the commit messages) to be performed automatically.
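To give a flavour of what rule-driven rewriting could look like inside an optimizer, here is a toy sketch (my illustration; the constructors are made up and are not the compiler's internal TAST representation). A real system would iterate rule application to a fixed point and check side-effect safety before firing a rule:

type Expr =
    | SeqMap of mapping: Expr * source: Expr        // Seq.map mapping source
    | SeqFilter of predicate: Expr * source: Expr   // Seq.filter predicate source
    | Compose of Expr * Expr                        // f >> g
    | Leaf of string                                // anything we do not rewrite

let rec rewrite expr =
    match expr with
    // Rule: Seq.map f (Seq.map g xs)  ==>  Seq.map (g >> f) xs
    | SeqMap (f, SeqMap (g, xs)) -> rewrite (SeqMap (Compose (g, f), xs))
    | SeqMap (f, xs)             -> SeqMap (rewrite f, rewrite xs)
    | SeqFilter (p, xs)          -> SeqFilter (rewrite p, rewrite xs)
    | Compose (a, b)             -> Compose (rewrite a, rewrite b)
    | Leaf _                     -> expr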

@forki (Contributor, Author) commented Sep 9, 2016

So simply rewriting Seq.map f xs as seq { for x in xs do yield f x } doesn't work.

I just compiled

seq { for x in seq { for (y:string) in data do yield y.Length } do yield x * 42 }

and the result is a big state machine with two calls to GetEnumerator, but the (fun y -> y.Length) and (fun x -> x * 42) lambdas end up far apart and don't get optimized any further.

@forki changed the title from "WIP Fuse multiple Seq.map calls" to "WIP Fuse Seq.* calls" on Sep 9, 2016
@forki force-pushed the deforest branch 2 times, most recently from e686d9e to fe97589, on September 9, 2016 08:58
@manofstick (Contributor) commented:

@forki

The state machine creating logic I have found to be quite terrible (performance wise), so unless a complete rewrite of that code is on the cards then I think that path is not going to yield results (IMHO).

@manofstick (Contributor) commented:

Maybe some inspiration could be gained from Nessos's LinqOptimizer (it does what you want, I think, but at runtime)

@dsyme (Contributor) commented Sep 12, 2016

The state machine creating logic I have found to be quite terrible (performance wise)

@manofstick Please link a specific issue with repro - it's important not to leave this as hearsay, and to make sure we know what we're comparing with. While surely imperfect, the perf gains from the state machine rewriting were often around 20x in many common cases. So it's all going to depend on what you're comparing with, and the specific examples. I'd be especially interested in any cases where the introduction of state machines is a net negative over the combinator encoding of seq { ... } expressions.

@forki Perhaps we first need to be clearer about the current approach to performance and optimizations in F#. Some descriptions are in the F# compiler guide, but we need more.

My feeling is that rewriting of combinations of library functions (of the kind in this PR) doesn't fit into the kinds of optimizations we do in the compiler, nor do we want to go there at this stage, except perhaps in an experimental branch, unless we have a really comprehensive approach to the problem. There are things the F# compiler does, there are things it doesn't do.

For example, if we did start doing rewrites of this kind, there are a very large number we'd like applied - there are probably 300 or more useful rewrites over FSharp.Core functions. Ideally we would do these using a general rewriting approach (no doubt inspired by Haskell - see @polytypic's link above - and by Coq, HOL, Isabelle and many other rewriting systems). But that's also a big topic.

There are risks to every optimization, and we've previously shipped critical bugs in new optimizations. Correctness, completeness, predictability, robustness-under-change, "obviousness", and "learnability" ("reducing what a reasonable user needs to know") are all reasons why I'm loath to start adding new rewrites of combinations-of-library-functions into the compiler, especially in cases where the user can reasonably apply the optimization manually once a hotspot is identified.

So overall my feeling is that this is in the "things we don't do" category, unless we're iterating to a much more general solution to the topic, or to a particular sub-domain of optimization.

One exception is the introduction of state machines. However this is an algorithmically non-trivial operation that is extremely difficult to code by hand. Ideally that would be generalized to other computational structures (notably Async), and potentially to further combinations of sequence functions.

@manofstick (Contributor) commented Sep 12, 2016

@dsyme

I agree that a state machine implementation is the way to go (and much faster than a pure function fest), but the current implementation is overly cautious (with regard to Current) and sub-optimal with regard to looping constructs (it uses helper functions). I have whipped up a trivial example for you. In 64-bit the example runs ~5x faster and in 32-bit it runs ~6x faster.

In my code base at work profiling often leads to seq{}s, which I end up ripping out...

Anyway, I must say that it is an area that I have been thinking of attacking for a long while (i.e. state machine generation). But I still want to get my equality/comparison stuff back up off the ground, as well as UseVirtualTag stuff finished... Hmmmm... Anyone got some cloning technology?

(Update: modifying the seq to use "while" (with a mutable index) rather than "for" brings the speed basically up to that of the manual implementation. I do recall having some other issue with "while" under some circumstances, but I can't remember it at the moment. I'll try to remember.)
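Roughly the two shapes being compared above (my approximation, not the gist's code; a ref cell stands in for the mutable index, which was the usual workaround inside seq expressions at the time):

let viaFor (data: int[]) =
    seq { for x in data do yield x * 2 }

let viaWhile (data: int[]) =
    seq {
        let i = ref 0
        while !i < data.Length do
            yield data.[!i] * 2
            i := !i + 1
    }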

@KevinRansom changed the title from "WIP Fuse Seq.* calls" to "[WIP] Fuse Seq.* calls" on Sep 16, 2016
@manofstick mentioned this pull request on Sep 30, 2016
@manofstick (Contributor) commented:

@forki Given the tepid response to the concept behind this PR and the progress of #1570, do you think this PR should be closed?

@forki (Contributor, Author) commented Nov 2, 2016

No, I don't think it should be closed ;-)
Instead, I think we should work really, really hard to come up with a sound system that does fusion on many levels. Of course I know this PR is only the very first step towards such a challenge.

@manofstick (Contributor) commented:

Well, maybe it should change its name to something about general fusing?

Anyway, if you run this gist, which compares "fused" Seq calls via a manual seq {} block vs the modified piping technique described in #1570 (and this is just using the Seq module, not the proposed Seq.Composer module, which would include inlining and thus be even faster), you get the following results:

Bitage    Original    "Fused"    #1570
32-bit    17459       4590       3592
64-bit    5697        2186       1234

Times are in milliseconds.
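For context, the general shape of pipeline used in such comparisons looks something like the following (an assumption on my part; the actual gist may differ):

let data = [| 1 .. 10000 |]

// Piped Seq calls (the form #1570 speeds up).
let piped () =
    data
    |> Seq.map (fun x -> x + 1)
    |> Seq.filter (fun x -> x % 2 = 0)
    |> Seq.sumBy (fun x -> x * 2)

// The same pipeline "fused" by hand into a single seq {} block.
let fusedByHand () =
    seq {
        for x in data do
            let y = x + 1
            if y % 2 = 0 then yield y * 2
    }
    |> Seq.sum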

@forki (Contributor, Author) commented Nov 3, 2016

Oh, maybe I wasn't clear enough. I'm absolutely pro #1570 and related PRs. That is absolutely amazing work! I see this pull request more as research for future versions.

Anyway, the benchmark is interesting. How can #1570 be faster?

@forki closed this on Nov 3, 2016
@forki reopened this on Nov 3, 2016
@forki (Contributor, Author) commented Nov 3, 2016

What if you apply both?

@manofstick (Contributor) commented Nov 3, 2016

Pretty sweet eh? (Because, as I mentioned before, the state machine implementation is less than ideal! But I got snapped at for mentioning it!)

And yes, you could fuse a seq implementation based on the composer. But there is a lot less fat there than there was, and it adds considerable complexity.

And I wasn't against the PR per se, I was just keen for it to change its name if it is research-based. But if it's just going to sit here and rot, then I think it is better closed and worked on in a private repository. That is just my personal preference for how to manage PRs/tasks: once you get past a certain number of open ones, the whole management process collapses. Well, that's my personal experience anyway.

@KevinRansom (Member) commented:

@forki @dsyme

Hey, is there anything left to do on this PR, or should I close it?

Kevin

@forki (Contributor, Author) commented Dec 14, 2016

OK, closing for now. Let's discuss again post-RTM.
