Commit c51b9b6

Auto merge of rust-lang#133832 - madsmtm:apple-symbols.o, r=DianQK
Make `#[used]` work when linking with `ld64`

To make `#[used]` work in static libraries, we use the `symbols.o` trick introduced in rust-lang#95604.

However, the linker shipped with Xcode, ld64, works a bit differently from other linkers; in particular, [it completely ignores undefined symbols by themselves](https://github.com/apple-oss-distributions/ld64/blob/ld64-954.16/src/ld/parsers/macho_relocatable_file.cpp#L2455-L2468), and only considers them if they have relocations (something something atoms something fixups, I don't know the details).

So to make the `symbols.o` file work on ld64, we need to actually insert a relocation. That's kinda cumbersome to do though, since the relocation must be valid, and hence must point to a valid piece of machine code, and is hence very architecture-specific.

Fixes rust-lang#133491, see that for investigation.

---

Another option would be to pass `-u _foo` to the final linker invocation. This has the problem that `-u` causes the linker to not be able to dead-strip the symbol, which is undesirable. (If we did this, we would possibly also want to do it by putting the arguments in a file by itself, and passing that file via `@`, e.g. `@undefined_symbols.txt`, similar to rust-lang#52699, though that [is only supported since Xcode 12](https://developer.apple.com/documentation/xcode-release-notes/xcode-12-release-notes#Linking), and I'm not sure we wanna bump that).

Various other options that are probably all undesirable as they affect link time performance:
- Pass `-all_load` to the linker.
- Pass `-ObjC` to the linker (the Objective-C support in the linker has different code paths that load more of the binary), and instrument the binaries that contain `#[used]` symbols.
- Pass `-force_load` to libraries that contain `#[used]` symbols.

Failed attempt: Embed `-u _foo` in the object file with `LC_LINKER_OPTION`, akin to rust-lang#121293. Doesn't work, both because `ld64` doesn't read that from archive members unless it already has a reason to load the member (which is what this PR is trying to make it do), and because `ld64` only supports the `-l`, `-needed-l`, `-framework` and `-needed_framework` flags in there.

---

TODO:
- [x] Support all Apple architectures.
- [x] Ensure that this works regardless of the actual type of the symbol.
- [x] Write up more docs.
- [x] Wire up a few proper tests.

@rustbot label O-apple
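For readers unfamiliar with the problem, here is a minimal sketch of the kind of library code affected. The names `REGISTRATION` and `unrelated_api` are hypothetical and not taken from the PR. Nothing in the crate refers to the `#[used]` static, so when the crate is linked as an archive, a linker that loads archive members lazily has no reason to pull in the member that contains it unless something (such as `symbols.o`) references the symbol.

```rust
// Hypothetical library crate contents, for illustration only.

/// Data that must end up in the final binary even though no Rust code
/// refers to it, e.g. an entry in a registration table.
#[used]
static REGISTRATION: [u32; 4] = [1; 4];

/// An API that does not touch `REGISTRATION`. Calling it gives the linker
/// no reason to keep the static alive.
pub fn unrelated_api() -> u32 {
    42
}
```

`#[used]` tells the compiler to keep the static in the object file; whether the linker then loads that object file from an archive is the separate question this commit addresses.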
2 parents ad27045 + b202430 commit c51b9b6

7 files changed, +202 -9 lines changed

compiler/rustc_codegen_ssa/src/back/apple.rs

+84
@@ -2,6 +2,7 @@ use std::env;
 use std::fmt::{Display, from_fn};
 use std::num::ParseIntError;
 
+use rustc_middle::middle::exported_symbols::SymbolExportKind;
 use rustc_session::Session;
 use rustc_target::spec::Target;
 
@@ -26,6 +27,89 @@ pub(super) fn macho_platform(target: &Target) -> u32 {
     }
 }
 
+/// Add relocation and section data needed for a symbol to be considered
+/// undefined by ld64.
+///
+/// The relocation must be valid, and hence must point to a valid piece of
+/// machine code, and hence this is unfortunately very architecture-specific.
+///
+///
+/// # New architectures
+///
+/// The values here are basically the same as emitted by the following program:
+///
+/// ```c
+/// // clang -c foo.c -target $CLANG_TARGET
+/// void foo(void);
+///
+/// extern int bar;
+///
+/// void* foobar[2] = {
+///     (void*)foo,
+///     (void*)&bar,
+///     // ...
+/// };
+/// ```
+///
+/// Can be inspected with:
+/// ```console
+/// objdump --macho --reloc foo.o
+/// objdump --macho --full-contents foo.o
+/// ```
+pub(super) fn add_data_and_relocation(
+    file: &mut object::write::Object<'_>,
+    section: object::write::SectionId,
+    symbol: object::write::SymbolId,
+    target: &Target,
+    kind: SymbolExportKind,
+) -> object::write::Result<()> {
+    let authenticated_pointer =
+        kind == SymbolExportKind::Text && target.llvm_target.starts_with("arm64e");
+
+    let data: &[u8] = match target.pointer_width {
+        _ if authenticated_pointer => &[0, 0, 0, 0, 0, 0, 0, 0x80],
+        32 => &[0; 4],
+        64 => &[0; 8],
+        pointer_width => unimplemented!("unsupported Apple pointer width {pointer_width:?}"),
+    };
+
+    if target.arch == "x86_64" {
+        // Force alignment for the entire section to be 16 on x86_64.
+        file.section_mut(section).append_data(&[], 16);
+    } else {
+        // Elsewhere, the section alignment is the same as the pointer width.
+        file.section_mut(section).append_data(&[], target.pointer_width as u64);
+    }
+
+    let offset = file.section_mut(section).append_data(data, data.len() as u64);
+
+    let flags = if authenticated_pointer {
+        object::write::RelocationFlags::MachO {
+            r_type: object::macho::ARM64_RELOC_AUTHENTICATED_POINTER,
+            r_pcrel: false,
+            r_length: 3,
+        }
+    } else if target.arch == "arm" {
+        // FIXME(madsmtm): Remove once `object` supports 32-bit ARM relocations:
+        // https://github.com/gimli-rs/object/pull/757
+        object::write::RelocationFlags::MachO {
+            r_type: object::macho::ARM_RELOC_VANILLA,
+            r_pcrel: false,
+            r_length: 2,
+        }
+    } else {
+        object::write::RelocationFlags::Generic {
+            kind: object::RelocationKind::Absolute,
+            encoding: object::RelocationEncoding::Generic,
+            size: target.pointer_width as u8,
+        }
+    };
+
+    file.add_relocation(section, object::write::Relocation { offset, addend: 0, symbol, flags })?;
+
+    Ok(())
+}
+
 /// Deployment target or SDK version.
 ///
 /// The size of the numbers in here are limited by Mach-O's `LC_BUILD_VERSION`.
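
The placeholder selection in `add_data_and_relocation` can be read in isolation. The following is a minimal sketch, not the compiler's code: `placeholder_pointer_bytes` and `macho_r_length` are hypothetical helpers that take plain integers instead of rustc's `Target`, and they only mirror the byte patterns and `r_length` values visible in the diff above.

```rust
/// Hypothetical stand-in for the `data` match in the diff: returns the
/// pointer-sized placeholder that the relocation will later overwrite.
/// Unsupported widths yield `None` here instead of panicking.
fn placeholder_pointer_bytes(pointer_width: u32, authenticated_pointer: bool) -> Option<Vec<u8>> {
    match pointer_width {
        // Same bytes as in the diff for the arm64e function-pointer case.
        _ if authenticated_pointer => Some(vec![0, 0, 0, 0, 0, 0, 0, 0x80]),
        32 => Some(vec![0; 4]),
        64 => Some(vec![0; 8]),
        _ => None,
    }
}

/// Mach-O encodes the relocated field's size as log2 of its byte length
/// (0 = 1 byte, 1 = 2 bytes, 2 = 4 bytes, 3 = 8 bytes).
fn macho_r_length(byte_len: usize) -> Option<u8> {
    match byte_len {
        1 => Some(0),
        2 => Some(1),
        4 => Some(2),
        8 => Some(3),
        _ => None,
    }
}
```

With these helpers, the 8-byte placeholder corresponds to `r_length: 3` and the 4-byte placeholder to `r_length: 2`, matching the values hard-coded in the two `RelocationFlags::MachO` arms.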

compiler/rustc_codegen_ssa/src/back/link.rs

+56-3
@@ -2058,8 +2058,8 @@ fn add_post_link_args(cmd: &mut dyn Linker, sess: &Session, flavor: LinkerFlavor
 /// linker, and since they never participate in the linking, using `KEEP` in the linker scripts
 /// can't keep them either. This causes #47384.
 ///
-/// To keep them around, we could use `--whole-archive` and equivalents to force rlib to
-/// participate in linking like object files, but this proves to be expensive (#93791). Therefore
+/// To keep them around, we could use `--whole-archive`, `-force_load` and equivalents to force rlib
+/// to participate in linking like object files, but this proves to be expensive (#93791). Therefore
 /// we instead just introduce an undefined reference to them. This could be done by `-u` command
 /// line option to the linker or `EXTERN(...)` in linker scripts, however they does not only
 /// introduce an undefined reference, but also make them the GC roots, preventing `--gc-sections`
@@ -2101,8 +2101,20 @@ fn add_linked_symbol_object(
         file.set_mangling(object::write::Mangling::None);
     }
 
+    // ld64 requires a relocation to load undefined symbols, see below.
+    // Not strictly needed if linking with lld, but might as well do it there too.
+    let ld64_section_helper = if file.format() == object::BinaryFormat::MachO {
+        Some(file.add_section(
+            file.segment_name(object::write::StandardSegment::Data).to_vec(),
+            "__data".into(),
+            object::SectionKind::Data,
+        ))
+    } else {
+        None
+    };
+
     for (sym, kind) in symbols.iter() {
-        file.add_symbol(object::write::Symbol {
+        let symbol = file.add_symbol(object::write::Symbol {
             name: sym.clone().into(),
             value: 0,
             size: 0,
@@ -2116,6 +2128,47 @@ fn add_linked_symbol_object(
             section: object::write::SymbolSection::Undefined,
             flags: object::SymbolFlags::None,
         });
+
+        // The linker shipped with Apple's Xcode, ld64, works a bit differently from other linkers.
+        //
+        // Code-wise, the relevant parts of ld64 are roughly:
+        // 1. Find the `ArchiveLoadMode` based on commandline options, default to `parseObjects`.
+        //    https://github.com/apple-oss-distributions/ld64/blob/ld64-954.16/src/ld/Options.cpp#L924-L932
+        //    https://github.com/apple-oss-distributions/ld64/blob/ld64-954.16/src/ld/Options.h#L55
+        //
+        // 2. Read the archive table of contents (__.SYMDEF file).
+        //    https://github.com/apple-oss-distributions/ld64/blob/ld64-954.16/src/ld/parsers/archive_file.cpp#L294-L325
+        //
+        // 3. Begin linking by loading "atoms" from input files.
+        //    https://github.com/apple-oss-distributions/ld64/blob/ld64-954.16/doc/design/linker.html
+        //    https://github.com/apple-oss-distributions/ld64/blob/ld64-954.16/src/ld/InputFiles.cpp#L1349
+        //
+        //   a. Directly specified object files (`.o`) are parsed immediately.
+        //      https://github.com/apple-oss-distributions/ld64/blob/ld64-954.16/src/ld/parsers/macho_relocatable_file.cpp#L4611-L4627
+        //
+        //     - Undefined symbols are not atoms (`n_value > 0` denotes a common symbol).
+        //       https://github.com/apple-oss-distributions/ld64/blob/ld64-954.16/src/ld/parsers/macho_relocatable_file.cpp#L2455-L2468
+        //       https://maskray.me/blog/2022-02-06-all-about-common-symbols
+        //
+        //     - Relocations/fixups are atoms.
+        //       https://github.com/apple-oss-distributions/ld64/blob/ce6341ae966b3451aa54eeb049f2be865afbd578/src/ld/parsers/macho_relocatable_file.cpp#L2088-L2114
+        //
+        //   b. Archives are not parsed yet.
+        //      https://github.com/apple-oss-distributions/ld64/blob/ld64-954.16/src/ld/parsers/archive_file.cpp#L467-L577
+        //
+        // 4. When a symbol is needed by an atom, parse the object file that contains the symbol.
+        //    https://github.com/apple-oss-distributions/ld64/blob/ld64-954.16/src/ld/InputFiles.cpp#L1417-L1491
+        //    https://github.com/apple-oss-distributions/ld64/blob/ld64-954.16/src/ld/parsers/archive_file.cpp#L579-L597
+        //
+        // All of the steps above are fairly similar to other linkers, except that **it completely
+        // ignores undefined symbols**.
+        //
+        // So to make this trick work on ld64, we need to do something else to load the relevant
+        // object files. We do this by inserting a relocation (fixup) for each symbol.
+        if let Some(section) = ld64_section_helper {
+            apple::add_data_and_relocation(&mut file, section, symbol, &sess.target, *kind)
+                .expect("failed adding relocation");
+        }
     }
 
     let path = tmpdir.join("symbols.o");
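
The loading behaviour described in the comment above can be illustrated with a toy model. This is a deliberately simplified sketch, not ld64's algorithm: the types `ObjectFile` and `LinkerModel` and the function `loaded_archive_members` are hypothetical, and real linkers track far more state (atoms, dead stripping, load modes). It only shows why a bare undefined symbol in `symbols.o` pulls in an archive member under one model but needs a relocation under the other.

```rust
use std::collections::{BTreeSet, HashMap};

/// Toy object file: the symbols it defines, the undefined symbols in its
/// symbol table, and the symbols its relocations refer to.
#[derive(Clone, Debug)]
struct ObjectFile {
    name: String,
    defined: Vec<String>,
    undefined: Vec<String>,
    relocated: Vec<String>,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum LinkerModel {
    /// Undefined symbol table entries are enough to load an archive member.
    UndefinedSymbolsPullMembers,
    /// Only symbols referenced by a relocation load an archive member
    /// (the behaviour the commit attributes to ld64).
    OnlyRelocationsPullMembers,
}

fn object(name: &str, defined: &[&str], undefined: &[&str], relocated: &[&str]) -> ObjectFile {
    let owned = |symbols: &[&str]| symbols.iter().map(|s| s.to_string()).collect();
    ObjectFile {
        name: name.to_string(),
        defined: owned(defined),
        undefined: owned(undefined),
        relocated: owned(relocated),
    }
}

/// Symbols that, under the given model, make the linker look for a definition.
fn needed_symbols(model: LinkerModel, object: &ObjectFile) -> Vec<String> {
    let mut needed = object.relocated.clone();
    if model == LinkerModel::UndefinedSymbolsPullMembers {
        needed.extend(object.undefined.iter().cloned());
    }
    needed
}

/// Returns the names of the archive members that get loaded.
fn loaded_archive_members(
    model: LinkerModel,
    direct_objects: &[ObjectFile],
    archive: &[ObjectFile],
) -> BTreeSet<String> {
    // Archive table of contents: symbol name -> index of the defining member.
    let mut toc: HashMap<&str, usize> = HashMap::new();
    for (index, member) in archive.iter().enumerate() {
        for symbol in &member.defined {
            toc.entry(symbol.as_str()).or_insert(index);
        }
    }

    // Directly specified objects are parsed immediately and seed the worklist.
    let mut worklist: Vec<String> =
        direct_objects.iter().flat_map(|o| needed_symbols(model, o)).collect();
    let mut loaded = BTreeSet::new();

    // Archive members are parsed lazily, only when a needed symbol lives there.
    while let Some(symbol) = worklist.pop() {
        if let Some(&index) = toc.get(symbol.as_str()) {
            let member = &archive[index];
            if loaded.insert(member.name.clone()) {
                worklist.extend(needed_symbols(model, member));
            }
        }
    }
    loaded
}
```

In this model, a `symbols.o` built as `object("symbols.o", &[], &["_STATIC"], &[])` loads nothing under `OnlyRelocationsPullMembers`, while adding `"_STATIC"` to its relocation list loads the defining member, which is the effect the new `add_data_and_relocation` call aims for.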

tests/run-make/include-all-symbols-linking/lib.rs

+2-1
@@ -1,5 +1,6 @@
 mod foo {
-    #[link_section = ".rodata.STATIC"]
+    #[cfg_attr(target_os = "linux", link_section = ".rodata.STATIC")]
+    #[cfg_attr(target_vendor = "apple", link_section = "__DATA,STATIC")]
     #[used]
     static STATIC: [u32; 10] = [1; 10];
 }

tests/run-make/include-all-symbols-linking/rmake.rs

+10-5
@@ -7,15 +7,20 @@
 // See https://github.com/rust-lang/rust/pull/95604
 // See https://github.com/rust-lang/rust/issues/47384
 
-//@ only-linux
-// Reason: differences in object file formats on OSX and Windows
-// causes errors in the llvm_objdump step
+//@ ignore-wasm differences in object file formats causes errors in the llvm_objdump step.
+//@ ignore-windows differences in object file formats causes errors in the llvm_objdump step.
 
-use run_make_support::{dynamic_lib_name, llvm_objdump, llvm_readobj, rustc};
+use run_make_support::{dynamic_lib_name, llvm_objdump, llvm_readobj, rustc, target};
 
 fn main() {
     rustc().crate_type("lib").input("lib.rs").run();
-    rustc().crate_type("cdylib").link_args("-Tlinker.ld").input("main.rs").run();
+    let mut main = rustc();
+    main.crate_type("cdylib");
+    if target().contains("linux") {
+        main.link_args("-Tlinker.ld");
+    }
+    main.input("main.rs").run();
+
     // Ensure `#[used]` and `KEEP`-ed section is there
     llvm_objdump()
         .arg("--full-contents")

@@ -0,0 +1,32 @@
+//! Add a constructor that runs pre-main, similar to what the `ctor` crate does.
+//!
+//! #[ctor]
+//! fn constructor() {
+//!     println!("constructor");
+//! }
+
+//@ no-prefer-dynamic explicitly test with crates that are built as an archive
+#![crate_type = "rlib"]
+
+#[cfg_attr(
+    any(
+        target_os = "linux",
+        target_os = "android",
+        target_os = "freebsd",
+        target_os = "netbsd",
+        target_os = "openbsd",
+        target_os = "dragonfly",
+        target_os = "illumos",
+        target_os = "haiku"
+    ),
+    link_section = ".init_array"
+)]
+#[cfg_attr(target_vendor = "apple", link_section = "__DATA,__mod_init_func,mod_init_funcs")]
+#[cfg_attr(target_os = "windows", link_section = ".CRT$XCU")]
+#[used]
+static CONSTRUCTOR: extern "C" fn() = constructor;
+
+#[cfg_attr(any(target_os = "linux", target_os = "android"), link_section = ".text.startup")]
+extern "C" fn constructor() {
+    println!("constructor");
+}

+16
@@ -0,0 +1,16 @@
+//! Ensure that `#[used]` in archives are correctly registered.
+//!
+//! Regression test for https://github.com/rust-lang/rust/issues/133491.
+
+//@ run-pass
+//@ check-run-results
+//@ aux-build: used_pre_main_constructor.rs
+
+//@ ignore-wasm ctor doesn't work on WASM
+
+// Make sure `rustc` links the archive, but intentionally do not import/use any items.
+extern crate used_pre_main_constructor as _;
+
+fn main() {
+    println!("main");
+}

@@ -0,0 +1,2 @@
+constructor
+main
