[WIP] Fuse Seq.* calls #1525
Conversation
Hmmm... I must admit I'm not totally sure I want to do these kinds of optimizations in F#. There are so many high-level optimizations like this that can be applied, and each one can cause surprise when it stops working (i.e. when it no longer commutes with some other code change). For example, what of …? An alternative approach may be to improve and broaden the existing sequence state machine compilation to work on a broader range of expressions. Currently that only applies to the desugared form of …
I agree that rewriting `map` and `filter` into `choose` would feel weird. I personally would only try to reduce calls to two functions into one of the already existing ones. I think this would also ensure that we reach a fixed point. Rewriting into seq expressions is also interesting. We could do that in a separate step. Does this sound reasonable?
After rereading: do you suggest we should rewrite every `Seq.map` into a seq expression and just let the seq expression rewriter do its thing?
@forki Another thing you could try is to do something similar to LINQ, in which, when you use map, you try to downcast the enumerator of the input sequence to a specialized … That seems like it would be safer.
You mean to rewrite `Seq.map f xs` into `List.map f xs` when `xs` is a list? That would change from lazy to eager evaluation. We can't do that.
No, I mean like:
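(The snippet here is paraphrased as a minimal sketch; `IFusableMap`, `MapEnumerable`, and `fusedMap` are hypothetical names illustrating the LINQ-style downcast idea, not the actual code posted.)

```fsharp
open System.Collections
open System.Collections.Generic

/// Hypothetical marker interface letting a downstream map fuse with us.
type IFusableMap<'U> =
    abstract ComposeWith<'V> : ('U -> 'V) -> seq<'V>

/// A lazy map whose concrete type downstream combinators can detect.
type MapEnumerable<'T, 'U>(source: seq<'T>, f: 'T -> 'U) =
    interface IFusableMap<'U> with
        // Fuse: reuse the original source with a composed function,
        // so the pipeline keeps a single enumerator chain.
        member _.ComposeWith<'V>(g: 'U -> 'V) : seq<'V> =
            MapEnumerable<'T, 'V>(source, f >> g) :> seq<'V>
    interface IEnumerable<'U> with
        member _.GetEnumerator() = (Seq.map f source).GetEnumerator()
    interface IEnumerable with
        member this.GetEnumerator() =
            (this :> IEnumerable<'U>).GetEnumerator() :> IEnumerator

/// A map that stays lazy but type-tests its input to fuse adjacent maps.
let fusedMap (f: 'T -> 'U) (xs: seq<'T>) : seq<'U> =
    match xs with
    | :? IFusableMap<'T> as m -> m.ComposeWith f
    | _ -> MapEnumerable<'T, 'U>(xs, f) :> seq<'U>

// xs |> fusedMap f |> fusedMap g enumerates xs once with (f >> g);
// nothing runs until the result is actually enumerated.
```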
I don't think this would change the lazy evaluation of …
And in a final rewriting step we would rewrite everything back to seq again?
I think I don't really understand that suggestion yet. Tomorrow I'll try to rewrite all `Seq.map` calls into seq expressions and see how far this gets optimized in the seq expression optimizer. For lists I will try a similar optimization to the one I proposed here. The only difference I see is that we need to make sure that the inner lambda is side-effect free (see the illustration below). I think I saw helpers which test this. They are marked as being slow, but hey ;-)
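(Why purity matters for the eager list case: fusing `List.map g >> List.map f` into `List.map (g >> f)` reorders side effects, so the rewrite is only sound for pure lambdas. A small illustration with hypothetical effectful lambdas:)

```fsharp
// Unfused code runs every g before any f; fused code interleaves them.
let g x = printfn "g %d" x; x + 1
let f x = printfn "f %d" x; x * 2

let unfused = [1; 2] |> List.map g |> List.map f   // prints: g 1, g 2, f 2, f 3
let fused   = [1; 2] |> List.map (g >> f)          // prints: g 1, f 2, g 2, f 3

// Both produce [4; 6], but the observable effect order differs,
// which is why the rewrite needs a side-effect-freeness check.
```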
A more general rewrite system, like the one described in Playing by the Rules: Rewriting as a practical optimisation technique in GHC, might be very useful. This could allow optimizations such as these (see the commit messages) to be performed automatically.
So simply rewriting `Seq.map f xs` as `seq { for x in xs do yield f x }` doesn't work. I just compiled `seq { for x in seq { for (y:string) in data do yield y.Length } do yield x * 42 }` and the result is a big state machine with two calls to `GetEnumerator`, but the `(fun y -> y.Length)` and `(fun x -> x * 42)` end up far apart and don't get optimized any further.
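(Spelled out as a compilable snippet, with the `data` binding an assumption for illustration:)

```fsharp
// Each seq { } layer becomes its own state machine with its own
// GetEnumerator call, keeping the two lambdas in separate MoveNext
// bodies where the optimizer cannot fuse them.
let data = ["one"; "two"; "three"]

let result =
    seq {
        for x in seq { for (y: string) in data do yield y.Length } do
            yield x * 42
    }
```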
The state machine creation logic I have found to be quite terrible performance-wise, so unless a complete rewrite of that code is on the cards, I think that path is not going to yield results (IMHO).
Maybe some inspiration could be gained from Nessos's LinqOptimizer (it does what you want, I think, but at runtime).
@manofstick Please link a specific issue with a repro - it's important not to leave this as hearsay, and to make sure we know what we're comparing with. While surely imperfect, the perf gains from the state machine rewriting were often around 20x in many common cases. So it's all going to depend on what you're comparing with, and the specific examples. I'd be especially interested in any cases where the introduction of state machines is a net negative over the combinator encoding of …

@forki Perhaps we first need to be clearer about the current approach to performance and optimizations in F#. Some descriptions are in the F# compiler guide, but we need more. My feeling is that rewriting combinations of library functions (of the kind in this PR) doesn't fit the kinds of optimizations we do in the compiler, nor do we want to go there at this stage, except perhaps in an experimental branch, unless we have a really comprehensive approach to the problem.

There are things the F# compiler does, and things it doesn't do. For example, if we did start doing rewrites of this kind, there is a very large number we'd like applied - there are probably 300 or more useful rewrites over FSharp.Core functions. Ideally we would do these using a general rewriting approach (no doubt inspired by Haskell - see @polytypic's link above - Coq, HOL, Isabelle and many other rewriting systems). But that's also a big topic.

There are risks to every optimization, and we've previously shipped critical bugs in new optimizations. Correctness, completeness, predictability, robustness-under-change, "obviousness", and "learnability" ("reducing what a reasonable user needs to know") are all reasons why I'm loath to start adding new rewrites of combinations-of-library-functions into the compiler, especially in cases where the user can reasonably apply the optimization manually once a hotspot is identified.

So overall my feeling is that this is in the "things we don't do" category, unless we're iterating towards a much more general solution to the topic, or to a particular sub-domain of optimization. One exception is the introduction of state machines. However, this is an algorithmically non-trivial operation that is extremely difficult to code by hand. Ideally it would be generalized to other computational structures (notably Async), and potentially to further combinations of sequence functions.
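(To make "the user can reasonably apply the optimization manually" concrete, here is a hedged sketch of hand fusion at an identified hotspot; illustrative code, echoing the map/filter-to-choose rewrite mentioned earlier in the thread:)

```fsharp
// Before: two combinator layers, hence two enumerator objects
// per traversal of the pipeline.
let slow (xs: seq<int>) =
    xs |> Seq.map (fun x -> x * 2) |> Seq.filter (fun x -> x > 10)

// After: map and filter fused by hand into a single Seq.choose.
// Sound here because both lambdas are pure.
let fast (xs: seq<int>) =
    xs |> Seq.choose (fun x ->
        let y = x * 2
        if y > 10 then Some y else None)
```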
I agree that a state machine implementation is the way to go (and much faster than a pure function fest), but the current implementation is overly cautious (in regard to Current) and sub-optimal with regard to looping constructs (it uses helper functions). I have whipped up a trivial example for you. In 64-bit the example runs ~5x faster and in 32-bit it runs ~6x faster. In my code base at work, profiling often leads to seq { }s, which I end up ripping out... Anyway, I must say that it is an area I have been thinking of attacking for a long while (i.e. state machine generation). But I still want to get my equality/comparison stuff back up off the ground, as well as get the UseVirtualTag stuff finished... Hmmmm... Anyone got some cloning technology? (Update: modifying the seq to use "while" (with a mutable index) rather than "for" brings the speed basically up to the manual implementation; see the sketch below. I do recall having some other issue with "while" under some circumstances, but I can't remember it at the moment. I'll try to remember.)
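(A hedged sketch of the two encodings from the update above; the actual gist isn't reproduced here, and `viaFor`/`viaWhile` are illustrative names:)

```fsharp
// 'for' over the source goes through the general enumerator helpers
// inside the generated state machine.
let viaFor (data: int[]) =
    seq { for x in data do yield x * 2 }

// 'while' with a mutable index reads the array directly and, per the
// comment above, gets close to a hand-written enumerator.
let viaWhile (data: int[]) =
    seq {
        let mutable i = 0
        while i < data.Length do
            yield data.[i] * 2
            i <- i + 1
    }
```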
No, I don't think it should be closed ;-)
Well, maybe it should change its name to some general fusing? Anyway, if you run this gist, which compares "fused" Seq calls via a manual seq {} block vs the modified piping technique described in #1570 (and this is just using the Seq module, not the proposed Seq.Composer module, which would include inlining and thus be even faster), you get the following results (times in milliseconds):

…
What if you apply both?
Pretty sweet, eh? (Because, as I mentioned before, the state machine implementation is less than ideal! But I got snapped at for mentioning it!) And yes, you could fuse a seq implementation based off the composer, but there's a lot less fat there than there was, and it adds considerable complexity. And I wasn't against the PR per se; I was just keen for it to change name if it is research-based. But if it's just going to sit here and rot, then I think it is better closed and worked on in a private repository. That is just my personal preference for how to manage PRs/tasks: once you get past a certain number of open ones, the whole management process collapses. Well, that's my personal experience anyway.
OK, closing for now. Let's discuss again post-RTM.
This is highly experimental.
This PR enables the compiler to rewrite:

…

into:

…

Future rules:

…
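(As a sketch of the kinds of rewrites under discussion in this thread; the names and the exact rule set are assumptions, not the PR's actual rules:)

```fsharp
// Two lazy maps collapse into one with a composed function:
//   xs |> Seq.map f |> Seq.map g        ~>  xs |> Seq.map (f >> g)
let fuseMaps f g xs = xs |> Seq.map (f >> g)

// Adjacent filters collapse into one conjoined predicate:
//   xs |> Seq.filter p |> Seq.filter q  ~>  xs |> Seq.filter (fun x -> p x && q x)
let fuseFilters p q xs = xs |> Seq.filter (fun x -> p x && q x)

// A map followed by a filter collapses into a single Seq.choose:
let fuseMapFilter f p xs =
    xs |> Seq.choose (fun x -> let y = f x in if p y then Some y else None)
```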