Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve type lookup using .debug_names parent chain #108907

Merged
merged 2 commits into from
Oct 9, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
25 changes: 25 additions & 0 deletions lldb/source/Plugins/SymbolFile/DWARF/DWARFIndex.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -126,3 +126,28 @@ bool DWARFIndex::GetFullyQualifiedTypeImpl(
return callback(die);
return true;
}

void DWARFIndex::GetTypesWithQuery(
TypeQuery &query, llvm::function_ref<bool(DWARFDIE die)> callback) {
GetTypes(query.GetTypeBasename(), [&](DWARFDIE die) {
return ProcessTypeDIEMatchQuery(query, die, callback);
});
}

bool DWARFIndex::ProcessTypeDIEMatchQuery(
TypeQuery &query, DWARFDIE die,
llvm::function_ref<bool(DWARFDIE die)> callback) {
// Nothing to match from query
if (query.GetContextRef().size() <= 1)
return callback(die);

std::vector<lldb_private::CompilerContext> die_context;
if (query.GetModuleSearch())
die_context = die.GetDeclContext();
else
die_context = die.GetTypeLookupContext();

if (!query.ContextMatches(die_context))
return true;
return callback(die);
}
12 changes: 12 additions & 0 deletions lldb/source/Plugins/SymbolFile/DWARF/DWARFIndex.h
Original file line number Diff line number Diff line change
Expand Up @@ -64,6 +64,13 @@ class DWARFIndex {
virtual void
GetNamespaces(ConstString name,
llvm::function_ref<bool(DWARFDIE die)> callback) = 0;
/// Get type DIEs meeting requires of \a query.
/// in its decl parent chain as subset. A base implementation is provided,
/// Specializations should override this if they are able to provide a faster
/// implementation.
virtual void
GetTypesWithQuery(TypeQuery &query,
llvm::function_ref<bool(DWARFDIE die)> callback);
virtual void
GetFunctions(const Module::LookupInfo &lookup_info, SymbolFileDWARF &dwarf,
const CompilerDeclContext &parent_decl_ctx,
Expand Down Expand Up @@ -115,6 +122,11 @@ class DWARFIndex {
bool
GetFullyQualifiedTypeImpl(const DWARFDeclContext &context, DWARFDIE die,
llvm::function_ref<bool(DWARFDIE die)> callback);

/// Check if the type \a die can meet the requirements of \a query.
bool
ProcessTypeDIEMatchQuery(TypeQuery &query, DWARFDIE die,
llvm::function_ref<bool(DWARFDIE die)> callback);
};
} // namespace dwarf
} // namespace lldb_private::plugin
Expand Down
123 changes: 122 additions & 1 deletion lldb/source/Plugins/SymbolFile/DWARF/DebugNamesDWARFIndex.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -301,7 +301,8 @@ using Entry = llvm::DWARFDebugNames::Entry;
/// If any parent does not have an `IDX_parent`, or the Entry data is corrupted,
/// nullopt is returned.
std::optional<llvm::SmallVector<Entry, 4>>
getParentChain(Entry entry, uint32_t max_parents) {
getParentChain(Entry entry,
uint32_t max_parents = std::numeric_limits<uint32_t>::max()) {
llvm::SmallVector<Entry, 4> parent_entries;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really a job for this patch, but I'm noticed that this function will return nothing if the parent chain was cut short (e.g. because one of the intermediate types is in a type unit). It'd probably be better to return a partial list (which is marked as partial), as that can still be useful for some filtering.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you know what branch in this function will cut short during "intermediate types is in a type unit" situation? Is it "entry.hasParentInformation()" returning false, or "entry.getParentDIEEntry()" return failure?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The first one. Basically, if you try something like clang++ -o - -c -g -gpubnames -fdebug-types-section -x c++ - <<<"struct A { struct B {}; }; A a; A::B b;", you'll see that the entry for B has no DW_IDX_parent attribute (I think it could have it, but we'd have to figure out what exactly it should refer to).


do {
Expand Down Expand Up @@ -374,6 +375,21 @@ void DebugNamesDWARFIndex::GetFullyQualifiedType(
m_fallback.GetFullyQualifiedType(context, callback);
}

bool DebugNamesDWARFIndex::SameAsEntryContext(
const CompilerContext &query_context,
const DebugNames::Entry &entry) const {
// TODO: check dwarf tag matches.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is a long term todo, do we have a github issue that reflects this (and if we do, can we add the number in teh comment)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it is not a long term todo. My 3rd PR in the stack will implement it.

// Peek at the AT_name of `entry` and test equality to `name`.
auto maybe_dieoffset = entry.getDIEUnitOffset();
if (!maybe_dieoffset)
return false;
DWARFUnit *unit = GetNonSkeletonUnit(entry);
if (!unit)
return false;
return query_context.name ==
unit->PeekDIEName(unit->GetOffset() + *maybe_dieoffset);
}

bool DebugNamesDWARFIndex::SameParentChain(
llvm::ArrayRef<llvm::StringRef> parent_names,
llvm::ArrayRef<DebugNames::Entry> parent_entries) const {
Expand Down Expand Up @@ -402,6 +418,45 @@ bool DebugNamesDWARFIndex::SameParentChain(
return true;
}

bool DebugNamesDWARFIndex::SameParentChain(
llvm::ArrayRef<CompilerContext> parent_contexts,
llvm::ArrayRef<DebugNames::Entry> parent_entries) const {
if (parent_entries.size() != parent_contexts.size())
return false;

// If the AT_name of any parent fails to match the expected name, we don't
// have a match.
for (auto [parent_context, parent_entry] :
llvm::zip_equal(parent_contexts, parent_entries))
if (!SameAsEntryContext(parent_context, parent_entry))
return false;
return true;
}

bool DebugNamesDWARFIndex::WithinParentChain(
llvm::ArrayRef<CompilerContext> query_contexts,
llvm::ArrayRef<DebugNames::Entry> parent_chain) const {
if (query_contexts.size() == parent_chain.size())
return SameParentChain(query_contexts, parent_chain);

// If parent chain does not have enough entries, we can't possibly have a
// match.
while (!query_contexts.empty() &&
query_contexts.size() <= parent_chain.size()) {
if (SameAsEntryContext(query_contexts.front(), parent_chain.front())) {
query_contexts = query_contexts.drop_front();
parent_chain = parent_chain.drop_front();
} else {
// Name does not match, try next parent_chain entry if the current entry
// is namespace because the current one can be an inline namespace.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we have a way of guaranteeing it is?
If the comparison that failed to match is with a real (non inline) namespace, we would return a false result here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I would love to but I do not think .debug_names tag has information to filter it out. It simply encodes DW_TAG_namespace tag:

   Name 1748 {
      Hash: 0xB3EF34A4
      String: 0x0000ae01 "inline_ns"
      Entry @ 0xb2d9 {
        Abbrev: 0x5
        Tag: DW_TAG_namespace
        DW_IDX_die_offset: 0x00015df7
        DW_IDX_parent: Entry @ 0xbab8
      }
    }

I guess we could try to PeekDieName() to parse this DIE from dwo files to check if it is an inline vs non-inline namespace but that is the expensive code path this PR wants to avoid.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would that help? I don't see anything in the DIE that let's us identify inline namespaces...

https://godbolt.org/z/GT98Yfbqv

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, did not know godbolt can show dwarfdump output.

DWARFASTParserClang identifies namespace DIE with DW_AT_export_symbols as inline namespace:

bool is_inline =
die.GetAttributeValueAsUnsigned(DW_AT_export_symbols, 0) != 0;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, good to know we have a mechanism to detect inline namespaces.

That still leaves the question of how to avoid false results though. I understand peeking at the DIE would be expensive for dwos, but I don't think we can sacrifice the correctness of the answer for these queries. One possibility to explore is to add another IDX entry for namespaces, which would include whether they export symbols or not.

Copy link
Contributor Author

@jeffreytan81 jeffreytan81 Sep 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The design is made to perform as much filtering as possible in DebugNamesDWARFIndex::GetTypesWithQuery(), it does not have to be accurate, because it will eventually delegate to base class DWARFIndex::ProcessTypeDieMatchQuery() which performs accurate semantics matching as the original callsite.

if (parent_chain.front().tag() != DW_TAG_namespace)
return false;
parent_chain = parent_chain.drop_front();
}
}
return query_contexts.empty();
}

void DebugNamesDWARFIndex::GetTypes(
ConstString name, llvm::function_ref<bool(DWARFDIE die)> callback) {
for (const DebugNames::Entry &entry :
Expand Down Expand Up @@ -444,6 +499,72 @@ void DebugNamesDWARFIndex::GetNamespaces(
m_fallback.GetNamespaces(name, callback);
}

llvm::SmallVector<CompilerContext>
DebugNamesDWARFIndex::GetTypeQueryParentContexts(TypeQuery &query) {
std::vector<lldb_private::CompilerContext> &query_decl_context =
query.GetContextRef();
llvm::SmallVector<CompilerContext> parent_contexts;
if (!query_decl_context.empty()) {
// Skip the last entry as it's the type we're matching parents for.
// Reverse the query decl context to match parent chain order.
llvm::ArrayRef<CompilerContext> parent_contexts_ref(
query_decl_context.data(), query_decl_context.size() - 1);
for (const CompilerContext &ctx : llvm::reverse(parent_contexts_ref)) {
// Skip any context without name because .debug_names might not encode
// them. (e.g. annonymous namespace)
if ((ctx.kind & CompilerContextKind::AnyType) !=
CompilerContextKind::Invalid &&
!ctx.name.IsEmpty())
parent_contexts.push_back(ctx);
}
}
return parent_contexts;
}

void DebugNamesDWARFIndex::GetTypesWithQuery(
TypeQuery &query, llvm::function_ref<bool(DWARFDIE die)> callback) {
ConstString name = query.GetTypeBasename();
std::vector<lldb_private::CompilerContext> query_context =
query.GetContextRef();
if (query_context.size() <= 1)
return GetTypes(name, callback);

llvm::SmallVector<CompilerContext> parent_contexts =
GetTypeQueryParentContexts(query);
// For each entry, grab its parent chain and check if we have a match.
for (const DebugNames::Entry &entry : m_debug_names_up->equal_range(name)) {
if (!isType(entry.tag()))
continue;

// If we get a NULL foreign_tu back, the entry doesn't match the type unit
// in the .dwp file, or we were not able to load the .dwo file or the DWO ID
// didn't match.
std::optional<DWARFTypeUnit *> foreign_tu = GetForeignTypeUnit(entry);
if (foreign_tu && foreign_tu.value() == nullptr)
continue;

std::optional<llvm::SmallVector<Entry, 4>> parent_chain =
getParentChain(entry);
if (!parent_chain) {
// Fallback: use the base class implementation.
if (!ProcessEntry(entry, [&](DWARFDIE die) {
return ProcessTypeDIEMatchQuery(query, die, callback);
}))
return;
continue;
}

if (WithinParentChain(parent_contexts, *parent_chain) &&
!ProcessEntry(entry, [&](DWARFDIE die) {
// After .debug_names filtering still sending to base class for
// further filtering before calling the callback.
return ProcessTypeDIEMatchQuery(query, die, callback);
}))
return;
}
m_fallback.GetTypesWithQuery(query, callback);
}

void DebugNamesDWARFIndex::GetFunctions(
const Module::LookupInfo &lookup_info, SymbolFileDWARF &dwarf,
const CompilerDeclContext &parent_decl_ctx,
Expand Down
34 changes: 34 additions & 0 deletions lldb/source/Plugins/SymbolFile/DWARF/DebugNamesDWARFIndex.h
Original file line number Diff line number Diff line change
Expand Up @@ -52,6 +52,10 @@ class DebugNamesDWARFIndex : public DWARFIndex {
llvm::function_ref<bool(DWARFDIE die)> callback) override;
void GetNamespaces(ConstString name,
llvm::function_ref<bool(DWARFDIE die)> callback) override;
void
GetTypesWithQuery(TypeQuery &query,
llvm::function_ref<bool(DWARFDIE die)> callback) override;

void GetFunctions(const Module::LookupInfo &lookup_info,
SymbolFileDWARF &dwarf,
const CompilerDeclContext &parent_decl_ctx,
Expand Down Expand Up @@ -118,6 +122,36 @@ class DebugNamesDWARFIndex : public DWARFIndex {
bool SameParentChain(llvm::ArrayRef<llvm::StringRef> parent_names,
llvm::ArrayRef<DebugNames::Entry> parent_entries) const;

bool SameParentChain(llvm::ArrayRef<CompilerContext> parent_contexts,
llvm::ArrayRef<DebugNames::Entry> parent_entries) const;

/// Returns true if \a parent_contexts entries are within \a parent_chain.
/// This is diffferent from SameParentChain() which checks for exact match.
/// This function is required because \a parent_chain can contain inline
/// namespace entries which may not be specified in \a parent_contexts by
/// client.
///
/// \param[in] parent_contexts
/// The list of parent contexts to check for.
///
/// \param[in] parent_chain
/// The fully qualified parent chain entries from .debug_names index table
/// to check against.
///
/// \returns
/// True if all \a parent_contexts entries are can be sequentially found
/// inside
/// \a parent_chain, otherwise False.
bool WithinParentChain(llvm::ArrayRef<CompilerContext> parent_contexts,
llvm::ArrayRef<DebugNames::Entry> parent_chain) const;

/// Returns true if .debug_names pool entry \p entry matches \p query_context.
bool SameAsEntryContext(const CompilerContext &query_context,
const DebugNames::Entry &entry) const;

llvm::SmallVector<CompilerContext>
GetTypeQueryParentContexts(TypeQuery &query);

static void MaybeLogLookupError(llvm::Error error,
const DebugNames::NameIndex &ni,
llvm::StringRef name);
Expand Down
5 changes: 3 additions & 2 deletions lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -2748,8 +2748,9 @@ void SymbolFileDWARF::FindTypes(const TypeQuery &query, TypeResults &results) {

std::lock_guard<std::recursive_mutex> guard(GetModuleMutex());

TypeQuery query_full(query);
bool have_index_match = false;
m_index->GetTypes(type_basename, [&](DWARFDIE die) {
m_index->GetTypesWithQuery(query_full, [&](DWARFDIE die) {
// Check the language, but only if we have a language filter.
if (query.HasLanguage()) {
if (!query.LanguageMatches(GetLanguageFamily(*die.GetCU())))
Expand Down Expand Up @@ -2813,7 +2814,7 @@ void SymbolFileDWARF::FindTypes(const TypeQuery &query, TypeResults &results) {
auto type_basename_simple = query_simple.GetTypeBasename();
// Copy our match's context and update the basename we are looking for
// so we can use this only to compare the context correctly.
m_index->GetTypes(type_basename_simple, [&](DWARFDIE die) {
m_index->GetTypesWithQuery(query_simple, [&](DWARFDIE die) {
// Check the language, but only if we have a language filter.
if (query.HasLanguage()) {
if (!query.LanguageMatches(GetLanguageFamily(*die.GetCU())))
Expand Down
Loading