Implement Method Table view #4227
Codecov Report — Attention: Patch coverage is …

```
@@            Coverage Diff             @@
##             main    #4227      +/-   ##
==========================================
- Coverage   88.56%   87.91%   -0.65%
==========================================
  Files         282      283       +1
  Lines       25333    25619     +286
  Branches     6817     6886      +69
==========================================
+ Hits        22435    22523      +88
- Misses       2696     2865     +169
- Partials      202      231      +29
```
Very nice! It seems to work correctly. I have a few comments, about the naming and about the implementation.

First off, using the word "method" in "Method Table" is a bit Java-centric. In other languages, not every piece of executable code is a method. So I would propose a different name instead: "Function List".

Furthermore, I think the implementation should not make use of the CallNodeInfo type. Because we're dealing with a flat list here, I think it's better to avoid all use of "tree" and "node" terminology. Instead, you could put your information into a new FuncListInfo struct.

Also, the current implementation does a lot of work when the preview selection changes, because it iterates over all samples and walks the entire stack for each sample (doing repeated work if the same stack is present in multiple samples), and it allocates a fresh Set object for every sample.
I'm not sure what the best way is to represent the "does stack N contain func M" cache. It feels like a sparse boolean matrix. I guess you could use an array of sets, i.e. one set for each stack, where the set contains the funcs in that stack. But maybe there's a more efficient data structure for this.
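For concreteness, here is a minimal sketch of what a FuncListInfo-based computation could look like. The names (computeFuncListInfo and the returned shape) are made up for illustration; only the samples/stackTable/frameTable layout loosely follows the profiler's processed-profile format:

```js
// A rough sketch, not the PR's code: compute self and total time per function
// for a flat function list, with no tree/node terminology.
function computeFuncListInfo(thread) {
  const { samples, stackTable, frameTable, funcTable } = thread;
  const totalTime = new Array(funcTable.length).fill(0);
  const selfTime = new Array(funcTable.length).fill(0);

  // One reusable Set instead of a fresh allocation per sample.
  const seenFuncs = new Set();
  for (let i = 0; i < samples.length; i++) {
    const stackIndex = samples.stack[i];
    if (stackIndex === null) {
      continue;
    }
    const weight = samples.weight ? samples.weight[i] : 1;

    // Self time is attributed to the function of the topmost frame only.
    selfTime[frameTable.func[stackTable.frame[stackIndex]]] += weight;

    // Total time is attributed once to every distinct function on the
    // stack, so recursive functions are not double-counted.
    seenFuncs.clear();
    for (let s = stackIndex; s !== null; s = stackTable.prefix[s]) {
      seenFuncs.add(frameTable.func[stackTable.frame[s]]);
    }
    for (const func of seenFuncs) {
      totalTime[func] += weight;
    }
  }
  return { totalTime, selfTime }; // the hypothetical FuncListInfo
}
```

Note that this still walks the full stack for every sample; deduplicating that work per stack is exactly the caching question discussed below.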
I have thought about this more and have come to the conclusion that my advice about the cache was probably misguided. Computing a set per stack upfront is not necessarily going to be better; in fact it could even lead to wasted work, because there are many stacks which aren't directly used by samples and only exist as ancestor nodes. And computing a set per stack would mean that all those sets need to be retained, whereas your solution only needs one set at a given time.
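A hypothetical middle ground between the two approaches, sketched below with made-up names: lazily cache the distinct-function list per stack, but only for stacks actually referenced by samples. Repeated samples with the same stack then skip the walk, while ancestor-only stacks never get a set materialized.

```js
// Hypothetical lazy cache: stackIndex -> Array<funcIndex>, populated on
// first use, so only stacks that actually appear in samples pay the cost.
const funcsPerUsedStack = new Map();

function distinctFuncsForStack(stackIndex, stackTable, frameTable) {
  let funcs = funcsPerUsedStack.get(stackIndex);
  if (funcs === undefined) {
    const seen = new Set();
    for (let s = stackIndex; s !== null; s = stackTable.prefix[s]) {
      seen.add(frameTable.func[stackTable.frame[s]]);
    }
    funcs = [...seen];
    funcsPerUsedStack.set(stackIndex, funcs);
  }
  return funcs;
}
```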
I totally forgot that, thanks for catching this.
That would really complicate the implementation. My implementation uses the CallTree and Sidebar components (and therefore has to use the CallNodeInfo type) to reuse most of the existing functionality. So replacing the types is only really possible by using an alias to CallNodeInfo in all FunctionList-specific places. This should clear up most of the confusion.
Oh, I forgot to comment on this. One thing I noticed that isn't supported yet is the context menu. As for the timeline highlighting, it doesn't seem to be highlighting based on the selection in the function list. The highlight seems to be based on the selected node in the call tree tab, i.e. based on a selection that's invisible when the function list is shown. And clicking the graph in the timeline does not change which row in the function list is selected.
I agree with you that all of this is too tricky for this small PR, and I fear that these discussions will prevent this PR from ever getting merged. The sidebar works nonetheless, which is really important.
(force-pushed from 3b870ef to ee9e16a)
@mstange I addressed all your suggestions.
I hope to look at this later this week. While testing this, I found a profile which shows sluggish preview selection handling; maybe this example can help us optimize this view. Dragging the mouse in the timeline is much smoother in the Call Tree tab than it is in the Function Table tab.
Thanks for the sample. I have a few ideas for how to solve this (and will try to test them tomorrow); they are also based on caching information so that less is computed on every change. I think the lag for small selections (< 3 s or so) is quite good, but everything above that is problematic. There are two ideas that come to mind:
These two ideas combined should result in higher memory usage in exchange for a drastically reduced number of (expensive) method table computations from samples. I'll try to implement these and check with the profiler whether they work. But I'm hopeful, and it is far less complicated than the previous ideas. (One remaining problem could be the sheer number of methods. But this could be solved by storing the functions with only one or two usages more efficiently.)
My new version of the computation uses a multi-level cache and trades memory for performance: Firefox Profiler uses around 10-15% more memory with this change, but the parameters of the cache can be configured. The cache is generic and could also be used to speed up the computation of callees and callers in a yet-to-be-proposed feature.
Still haven't found time to look at this in detail, hoping for this week. In the new deploy preview, dragging a selection now works very fast, nice! I noticed that it's still sluggish if the mouse is over the marker track while it's being moved, but that's completely unrelated to this PR. I wonder if the sluggish behavior I saw previously was due to the same issue; I hope I didn't accidentally blame this PR for a sluggish marker track. With the cache, do you mean …? It looks like graph highlighting is working nicely now, though I don't really understand how it can work. How did you address the case I mentioned in a previous comment?
One bug I found is that double-clicking a function does not scroll the source view to the clicked function, for example when opening the source view on … I also noticed that the search filter's current behavior is not very intuitive for the function table view. I would expect it to filter out all functions that don't match the search string. At the moment, it still displays functions that are present elsewhere in the matching stacks. But this is a bigger change and should be deferred to a separate PR.
The remaining sluggishness comes from a too-primitive implementation in the call tree, combined with using React events, which are not triggered that regularly.
I mean the …
It was not that hard: I just walk over all stacks and look for the selected function, where previously I only looked at the top frame. See https://github.com/firefox-devtools/profiler/pull/4227/files#diff-7a963fec8fbae1d005d4cbb324ff7399f707c667f19fc79fef0ceada360a4d1fR531
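A rough reconstruction of that idea (names are approximate, not the linked diff's actual code): mark every stack whose ancestor chain contains the selected function, rather than checking only the top frame. Assuming the processed-profile invariant that a stack's prefix always has a smaller index than the stack itself, one forward pass suffices:

```js
// Returns a boolean per stack: does this stack's chain contain selectedFunc?
function getStacksContainingFunc(selectedFunc, stackTable, frameTable) {
  const containsFunc = new Array(stackTable.length).fill(false);
  for (let s = 0; s < stackTable.length; s++) {
    const prefix = stackTable.prefix[s];
    const func = frameTable.func[stackTable.frame[s]];
    // A stack contains the func if its own frame maps to it, or if any
    // ancestor (already computed, since prefix < s) contains it.
    containsFunc[s] =
      func === selectedFunc || (prefix !== null && containsFunc[prefix]);
  }
  return containsFunc;
}
```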
This is the same for the call tree. I find searching pretty unintuitive in the whole profiler UI. I will tackle this in an upcoming proposal: I want to be able to specify detailed filters based on glob patterns, like …
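The concrete filter syntax is only hinted at above, so the following is just a guess at what glob-based function filtering could look like: translate `*` into `.*`, escape everything else, and keep only the matching function names.

```js
// Hypothetical glob matcher; the actual proposal may differ.
function globToRegExp(glob) {
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, '\\$&');
  return new RegExp('^' + escaped.replace(/\*/g, '.*') + '$');
}

function filterFuncNames(names, glob) {
  const pattern = globToRegExp(glob);
  return names.filter((name) => pattern.test(name));
}

// Example: filterFuncNames(['js::RunScript', 'nsThread::Run'], 'js::*')
// keeps only 'js::RunScript'.
```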
It works correctly for me in almost all examples, so I'd rather fix it in a later PR. I don't think your particular case has anything to do with my implementation.
I've made some progress on the review, but I didn't dig into the actual meaty pieces. I hope to do so tomorrow. I'm posting what I have so far.
Oh, you took the easy way out and made it highlight the same areas for different samples, by always sorting "selected" before "unselected". Here's an example profile where you can compare the highlighting of the functions "Left" and "Right" between the Call Tree and the Function Table tabs. In your current implementation, both "Left" and "Right" highlight the same areas, even though they correspond to different samples.
I think this is because BottomBox isn't adjusting its selectedCallNodeLineTimings:

```js
selectedNodeSelectors.getSourceViewLineTimings(state),
```
I still need to read the "summary" code. I'm getting there, slowly. I have one more general request: Please use JavaScript arrays for the new arrays you're adding, not Typed Arrays such as Int32Array. The typed arrays may have slightly better performance in some cases, but they're more complicated to deal with, because they can't contain null and because you can't specify the Flow type of the array element. We still have a fair amount of them in the profiler code, but we've been slowly phasing them out and are using regular JS arrays in new code. We haven't observed any adverse performance from them. So please change your new arrays into regular JS arrays and use …
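For illustration only, the requested change amounts to this kind of swap (the variable names are placeholders):

```js
const funcCount = 3; // placeholder size

// Before: a typed array; it cannot hold null, and Flow cannot express its
// element type.
const selfTimeTyped = new Int32Array(funcCount);

// After: a regular JS array, pre-filled so it is not holey; the element
// type can be expressed in Flow, e.g. Array<number> or Array<number | null>.
const selfTime = new Array(funcCount).fill(0);
```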
Good to know, I'll change them soon.
The problem with this is that my new code is intertwined with lots of old code which uses these typed arrays and passes them into my code or expects them. I'm not too inclined to do a major refactoring of the other code in this PR. But I'm open to doing this in another PR if need be.
I'm happy for any feedback.
I have read the summary cache tree code now. Let me try to summarize what it does: It reduces the time taken to compute function timings for a sub-range of the samples by re-using previous calculations over other sub-ranges of samples. This way, when the user drags a preview selection, we ideally only need to iterate over the newly-added samples (and add their timings into the timings of existing cached "summary nodes") instead of iterating over all samples in the selection when computing the timings. For now, can you remove the summary cache tree again? I think performance is good enough without it. And then, as a follow-up, we can improve the performance in other ways. Here's how I would reduce the cost of what's currently happening in …
The last optimization idea sounds quite complicated and is probably not worth doing, but I wanted to throw it out there.
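To make the summarized mechanism concrete, here is a generic sketch of that kind of range-summary structure, not the PR's actual code: each node lazily caches per-function totals for a half-open range of samples, so a query for an arbitrary sub-range can combine a few cached nodes instead of re-walking every sample of the selection. `computeLeaf(start, end)` is an assumed callback that walks the samples in `[start, end)` and returns a `Map<funcIndex, time>`.

```js
class SummaryNode {
  constructor(start, end) {
    this.start = start; // first sample index covered by this node
    this.end = end; // one past the last covered sample index
    this.summary = null; // cached Map<funcIndex, time> for [start, end)
    this.children = null; // null until this node is split into two halves
  }
}

function mergeSummaries(a, b) {
  const merged = new Map(a);
  for (const [func, time] of b) {
    merged.set(func, (merged.get(func) || 0) + time);
  }
  return merged;
}

function getSummary(node, computeLeaf, leafSize = 1024) {
  if (node.summary === null) {
    if (node.end - node.start <= leafSize) {
      // Small enough: walk the samples once and cache the result.
      node.summary = computeLeaf(node.start, node.end);
    } else {
      // Split into two adjacent, non-overlapping child ranges and combine.
      const mid = (node.start + node.end) >> 1;
      node.children = [
        new SummaryNode(node.start, mid),
        new SummaryNode(mid, node.end),
      ];
      node.summary = mergeSummaries(
        getSummary(node.children[0], computeLeaf, leafSize),
        getSummary(node.children[1], computeLeaf, leafSize)
      );
    }
  }
  return node.summary;
}
```

A query for a preview selection would then descend the tree and combine the cached summaries of the O(log n) nodes whose ranges tile the selection, which is why dragging the selection mostly touches newly-added samples.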
How about using this version and implementing the other optimizations in another PR, so that this PR is not stalled? I'm willing to commit to further improving the Function Table going forward.
The performance is almost unusable for larger profiles (and my use case has these a lot)...
Yes of course, that's what I meant by "as follow-ups".
Oh, really? I hadn't realized that. That's unfortunate. Ok, then let's keep the tree for now, but please document it. The comments for the cache tree should say which use case is being optimized, and roughly how it works. Furthermore, the invariants of the tree fields should be documented, i.e. what's the sum of what, which sample index ranges are nested/overlapping/adjacent, what it means for children to be null, etc.
I've added the requested documentation (the field invariants are stated in the documentation of each field). Is this now ready for adding tests? Any recommendations on what tests I should add?
Hey @mstange, do you think you'll have a chance to look at this again?
Posting a few more comments about things I noticed while reading. I still haven't completely read the summary cache implementation.
As for tests, let's start with a few basic tests which exercise the code in call-tree.js, and a few which exercise the actions and reducers.
For the call tree tests, make a simple example profile and then check that the counts of each function are correct, including the recursion case where the function must only be counted once.
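A self-contained sketch of the suggested recursion test (it deliberately does not use the repo's profile fixtures; the stack representation here is simplified to arrays of function names):

```js
// One sample whose stack is A -> B -> A, listed root to leaf. The total
// count for A must be 1, not 2, because each function is counted at most
// once per sample, however often it recurses.
const sampleStacks = [['A', 'B', 'A']];

function totalCounts(stacks) {
  const counts = new Map();
  for (const stack of stacks) {
    // Deduplicate within one sample so recursion is counted once.
    for (const func of new Set(stack)) {
      counts.set(func, (counts.get(func) || 0) + 1);
    }
  }
  return counts;
}

const counts = totalCounts(sampleStacks);
console.assert(counts.get('A') === 1, 'recursive function counted once');
console.assert(counts.get('B') === 1, 'non-recursive function counted once');
```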
Oh, Jeff noticed a bug when testing the deploy preview: When a preview selection is applied, the percentages are currently relative to the non-preview-filtered thread, which is inconsistent with the call tree tab.
(force-pushed from d97cb29 to aa05d53)
I rebased it on the current main branch.
Squashed all previous commits (force-pushed from aa05d53 to a2e8359)
I fixed it; it was probably just forgotten.
I noticed another issue: the transform shortcuts work in the function table, but they don't target the right node index. Probably you need to change … But taking a step back, I feel like you're trying a bit too hard to reuse existing code instead of starting from scratch for this panel. I think I need to think about it more; we don't want to copy-paste everything either, especially the call-tree part. I also don't much like that you call the new structure a call node table as well; I think this is confusing things. I need to think a bit more.
(taking this out of my queue for now)
This PR seems to have stalled.
Hi! This feature seems interesting and useful; is the PR abandoned?
Hey, I'm also looking for this feature; it would make the profiler UI more usable for investigating such things.
Reading this PR again, my main concern during the review was the complexity of the implementation, to the degree that it was hard for me as a reviewer to understand what was going on. At the time I also didn't have a lot of time to think of an approach I'd prefer. Now that I've worked on speeding up the inverted call tree, I have a much better understanding of what I'd like the implementation to look like. I've also changed the infrastructure around the call tree and the computation of call tree timings in a way that should make a function table easier to plug in, so that the code reuse with the existing call tree happens at the right abstraction boundary and we don't need to create a fake call node table. I've posted the recommended implementation in this comment: #15 (comment) I'll close this PR for now. The new implementation would look different enough that it should happen in a new PR.
This PR implements a CallTree-based view that presents all methods with their total and self times:
It implements a simplified version of the view first suggested in #15 (Add a "top functions" call tree view) and reiterated in #4205.
It supports all the amenities of the CallTree view, such as the sidebar and highlighting the samples in the timeline.
Sample profile
Deployment preview
As in other PRs: this PR does not yet have any tests, but it has been manually tested, and I'll add the tests once I'm sure what the final implementation is. The tests should be really similar to the tests for the CallTree view itself, as it is essentially the same view, just with a different source of data.