Improve type and namespace lookup using .debug_names parent chain
Summary:
This diff fixes T199918951 by changing LLDB type and namespace lookup to use the `.debug_names` parent chain.
It incorporates the latest feedback from Apple/Google. We need to land it internally to:
* Unblock Ads partners from dogfooding (they currently keep hitting this long delay, which disrupts the dogfooding experience)
* Flush out any new issues through dogfooding
Note: I will keep updating the internal patch the same way as for the single-thread stepping deadlock timeout PR, so that we won't get any merge conflicts.
It includes 4 commits squashed into one diff:
* This reverts the internal patch from @WanYi to "Revert PeekDIEName related patch"
* Note: `PeekDIEName()` has been removed by the new changes, so we won't hit the original crash anymore.
* Improve type lookup review: llvm#108907
* Improve namespace lookup review: llvm#110062
* Use a better algorithm for SameAsEntryContext: Pending
=================== Original PR Summary ===================
A customer has reported that expanding the "this" pointer for a non-trivial class method context takes more than **70–90 seconds**. This issue occurs on a Linux target with the `.debug_names` index table enabled and using split DWARF (DWO) files to avoid debug info overflow.
Profiling shows that the majority of the bottleneck occurs in the `SymbolFileDWARF::FindTypes()` and `SymbolFileDWARF::FindNamespace()` functions, which search for types and namespaces in the `.debug_names` index table by base name, then parse/load the underlying DWO files—an operation that can be very slow.
The whole profile trace is pretty big, but here is a part of the hot path:
```
--61.01%-- clang::ASTNodeImporter::VisitRecordDecl(clang::RecordDecl*)
--61.00%-- clang::ASTNodeImporter::ImportDeclParts(clang::NamedDecl*, clang::DeclContext*&, clang::DeclContext*&, clang::DeclarationName&, clang::NamedDecl*&, clang::SourceLocation&)
clang::ASTImporter::ImportContext(clang::DeclContext*)
clang::ASTImporter::Import(clang::Decl*)
lldb_private::ClangASTImporter::ASTImporterDelegate::Imported(clang::Decl*, clang::Decl*)
lldb_private::ClangASTImporter::BuildNamespaceMap(clang::NamespaceDecl const*)
lldb_private::ClangASTSource::CompleteNamespaceMap(
std::shared_ptr<std::vector<std::pair<std::shared_ptr<lldb_private::Module>, lldb_private::CompilerDeclContext>,
std::allocator<std::pair<std::shared_ptr<lldb_private::Module>, lldb_private::CompilerDeclContext>>> >&,
lldb_private::ConstString,
std::shared_ptr<std::vector<std::pair<std::shared_ptr<lldb_private::Module>, lldb_private::CompilerDeclContext>,
std::allocator<std::pair<std::shared_ptr<lldb_private::Module>, lldb_private::CompilerDeclContext>>> >&) const
)
lldb_private::plugin::dwarf::SymbolFileDWARF::FindNamespace(lldb_private::ConstString, lldb_private::CompilerDeclContext const&, bool)
lldb_private::plugin::dwarf::DebugNamesDWARFIndex::GetNamespaces(lldb_private::ConstString, llvm::function_ref<bool (lldb_private::plugin::dwarf::DWARFDIE)>)
--60.29%-- lldb_private::plugin::dwarf::DebugNamesDWARFIndex::GetDIE(llvm::DWARFDebugNames::Entry const&) const
--46.11%-- lldb_private::plugin::dwarf::DWARFUnit::GetDIE(unsigned long)
--46.11%-- lldb_private::plugin::dwarf::DWARFUnit::ExtractDIEsIfNeeded()
--45.77%-- lldb_private::plugin::dwarf::DWARFUnit::ExtractDIEsRWLocked()
--23.22%-- lldb_private::plugin::dwarf::DWARFDebugInfoEntry::Extract(lldb_private::DWARFDataExtractor const&,
lldb_private::plugin::dwarf::DWARFUnit const&, unsigned long*)
--10.04%-- lldb_private::plugin::dwarf::DWARFFormValue::SkipValue(llvm::dwarf::Form,
lldb_private::DWARFDataExtractor const&, unsigned long*,
lldb_private::plugin::dwarf::DWARFUnit const*)
```
This PR improves `SymbolFileDWARF::FindTypes()` and `SymbolFileDWARF::FindNamespace()` by utilizing the newly added parent chain `DW_IDX_parent` in `.debug_names`. The proposal was originally discussed in [this RFC](https://discourse.llvm.org/t/rfc-improve-dwarf-5-debug-names-type-lookup-parsing-speed/74151).
This PR also introduces a new algorithm in `DebugNamesDWARFIndex::SameAsEntryATName()` that significantly reduces the need to parse DWO files.
To leverage the parent chain for `SymbolFileDWARF::FindTypes()` and `SymbolFileDWARF::FindNamespace()`, this PR adds two new APIs, `GetTypesWithParents` and `GetNamespacesWithParents`, to the `DWARFIndex` base class. These APIs perform the same function as `GetTypes`/`GetNamespaces`, with additional filtering if the `parent_names` parameter is not empty. Since this only introduces filtering, the callback mechanisms at all call sites remain unchanged. A default implementation in the `DWARFIndex` class parses the type context from each DIE that matches the base name, then filters by `parent_names`. In the `DebugNamesDWARFIndex` override, `parent_names` is cross-checked against the parent chain in `.debug_names` for much faster filtering.
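The filtering wrapper described above can be sketched roughly as follows. This is a minimal illustration, not the LLDB implementation: `FakeDie`, `FakeIndex`, and `DieCallback` are hypothetical stand-ins, the callback contract (return `true` to keep iterating) is an assumption, and the membership check is a simplification of the real parent matching.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a DIE: its base name plus the names of its
// enclosing scopes, outermost first. In LLDB the context would have to be
// computed by parsing debug info, which is the expensive part.
struct FakeDie {
  std::string name;
  std::vector<std::string> context;
};

// Assumed callback contract: return true to keep iterating, false to stop.
using DieCallback = std::function<bool(const FakeDie &)>;

class FakeIndex {
public:
  explicit FakeIndex(std::vector<FakeDie> dies) : m_dies(std::move(dies)) {}

  // Base-name lookup: report every DIE whose name matches.
  void GetTypes(const std::string &name, const DieCallback &callback) const {
    for (const FakeDie &die : m_dies)
      if (die.name == name && !callback(die))
        return;
  }

  // Same lookup with extra filtering. An empty parent_names filters nothing,
  // so callers see exactly the GetTypes behavior in that case.
  void GetTypesWithParents(const std::string &name,
                           const std::vector<std::string> &parent_names,
                           const DieCallback &callback) const {
    GetTypes(name, [&](const FakeDie &die) {
      for (const std::string &parent : parent_names)
        if (std::find(die.context.begin(), die.context.end(), parent) ==
            die.context.end())
          return true; // Not a match: skip this DIE but keep iterating.
      return callback(die); // Match: forward to the caller's callback.
    });
  }

private:
  std::vector<FakeDie> m_dies;
};
```

Because the filter only wraps the caller's callback, existing call sites keep the same callback shape; an index with a cheaper source of parent names can override the filtering step without changing callers.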
Unlike the `GetFullyQualifiedType` API, which fully consumes the `DW_IDX_parent` parent chain for exact matching, these new APIs perform partial subset matching for type/namespace queries. This is necessary to support queries involving anonymous or inline namespaces. For instance, a user might request `NS1::NS2::NS3::Foo`, while the index table's parent chain might contain `NS1::inline_NS2::NS3::Foo`, which would fail exact matching.
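One way to picture partial matching is as an ordered subsequence check: every queried parent name must appear in the entry's parent chain in the same relative order, while extra scopes in between are tolerated. The sketch below assumes that interpretation; `ParentsMatchPartially` is a hypothetical helper, not a function from the PR, and the real matching rules may be stricter.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Returns true if every name in `query_parents` appears in `entry_chain` in
// the same relative order. Extra scopes in `entry_chain` (for example an
// inline or anonymous namespace) are allowed in between.
// Both lists run from the outermost scope to the innermost one.
bool ParentsMatchPartially(const std::vector<std::string> &query_parents,
                           const std::vector<std::string> &entry_chain) {
  std::size_t next = 0; // Index of the next queried name still to be found.
  for (const std::string &scope : entry_chain) {
    if (next < query_parents.size() && scope == query_parents[next])
      ++next;
  }
  return next == query_parents.size();
}
```

Under this sketch, a query with parents `{"NS1", "NS3"}` matches a chain `{"NS1", "inline_NS2", "NS3"}`, whereas an exact comparison of the two lists would reject it.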
Another optimization is implemented in `DebugNamesDWARFIndex::SameAsEntryATName()`. The old implementation used `GetNonSkeletonUnit()`, which triggered parsing/loading of underlying DWO files—a very costly operation we aim to avoid. The new implementation minimizes the need to touch DWO files by employing two strategies:
1. Build a `<pool_entry_range => NameEntry>` mapping table (built lazily as needed), allowing us to check the name of the parent entry.
2. Search for the name the entry is trying to match from the current `NameIndex`, then check if the parent entry can be found from the name entry.
In my testing and profiling, strategy (1) did not improve wall-time, as the cost of building the mapping table was high. However, strategy (2) proved to be very fast.
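As a rough illustration of strategy (2), the check "is this entry's parent named X?" can be answered entirely inside the index: look up X, then see whether any entry recorded under X sits at the offset the child's parent link points to. The types below (`FakeEntry`, `FakeNameIndex`, `kNoParent`) are hypothetical simplifications and do not mirror the `llvm::DWARFDebugNames` API.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Sentinel meaning "this entry records no parent" (an assumption of the sketch).
constexpr std::uint64_t kNoParent = UINT64_MAX;

// Hypothetical, simplified index entry: identified by its offset in the entry
// pool, with a link to its parent entry (the role DW_IDX_parent plays).
struct FakeEntry {
  std::uint64_t offset;
  std::uint64_t parent_offset;
};

// Hypothetical stand-in for one name index: name -> entries with that name.
using FakeNameIndex = std::unordered_map<std::string, std::vector<FakeEntry>>;

// Checks whether the parent of `entry` is named `expected_parent` by searching
// the index for that name and comparing entry offsets. No DIE is parsed.
bool ParentHasName(const FakeNameIndex &index, const FakeEntry &entry,
                   const std::string &expected_parent) {
  if (entry.parent_offset == kNoParent)
    return false;
  auto it = index.find(expected_parent);
  if (it == index.end())
    return false;
  for (const FakeEntry &candidate : it->second)
    if (candidate.offset == entry.parent_offset)
      return true;
  return false;
}
```

The lookup cost depends only on how many entries share the expected parent name, which is consistent with the observation above that this strategy avoids the up-front cost of building a full mapping table.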
=========================================================
Test Plan:
* `ninja check-lldb` passes.
* It improves T199918951 from around 110s+ to 1.8s.
Reviewers: toyang, wanyi, jsamouh, jaihari, jalalonde, #lldb_team
Reviewed By: wanyi
Subscribers: wanyi, #lldb_team
Differential Revision: https://phabricator.intern.facebook.com/D63440813
Tasks: T199918951