|
70 | 70 | //! interpretation of provenance. It's ok if your code doesn't strictly conform to it.**
|
71 | 71 | //!
|
72 | 72 | //! [Strict Provenance][] is an experimental set of APIs that help tools that try
|
73 |
| -//! to validate the memory-safety of your program's execution. Notably this includes [miri][] |
| 73 | +//! to validate the memory-safety of your program's execution. Notably this includes [Miri][] |
74 | 74 | //! and [CHERI][], which can detect when you access out of bounds memory or otherwise violate
|
75 | 75 | //! Rust's memory model.
|
76 | 76 | //!
|
|
136 | 136 | //!
|
137 | 137 | //! The strict provenance experiment is mostly only interested in exploring stricter *spatial*
|
138 | 138 | //! provenance. In this sense it can be thought of as a subset of the more ambitious and
|
139 |
| -//! formal [Stacked Borrows][] research project, which is what tools like [miri][] are based on. |
| 139 | +//! formal [Stacked Borrows][] research project, which is what tools like [Miri][] are based on. |
140 | 140 | //! In particular, Stacked Borrows is necessary to properly describe what borrows are allowed
|
141 | 141 | //! to do and when they become invalidated. This necessarily involves much more complex
|
142 | 142 | //! *temporal* reasoning than simply identifying allocations. Adjusting APIs and code
|
|
170 | 170 | //! Under Strict Provenance, a usize *cannot* accurately represent a pointer, and converting from
|
171 | 171 | //! a pointer to a usize is generally an operation which *only* extracts the address. It is
|
172 | 172 | //! therefore *impossible* to construct a valid pointer from a usize because there is no way
|
173 |
| -//! to restore the address-space and provenance. |
| 173 | +//! to restore the address-space and provenance. In other words, pointer-integer-pointer |
| 174 | +//! roundtrips are not possible (in the sense that the resulting pointer is not dereferenceable). |
174 | 175 | //!
|
175 | 176 | //! The key insight to making this model *at all* viable is the [`with_addr`][] method:
|
176 | 177 | //!
|
|
194 | 195 | //! and then immediately converting back to a pointer. To make this use case more ergonomic,
|
195 | 196 | //! we provide the [`map_addr`][] method.
|
196 | 197 | //!
|
197 |
| -//! To help make it clear that code is "following" Strict Provenance semantics, we also |
198 |
| -//! provide an [`addr`][] method which is currently equivalent to `ptr as usize`. In the |
199 |
| -//! future we may provide a lint for pointer<->integer casts to help you audit if your |
200 |
| -//! code conforms to strict provenance. |
| 198 | +//! To help make it clear that code is "following" Strict Provenance semantics, we also provide an |
| 199 | +//! [`addr`][] method which promises that the returned address is not part of a |
| 200 | +//! pointer-usize-pointer roundtrip. In the future we may provide a lint for pointer<->integer |
| 201 | +//! casts to help you audit if your code conforms to strict provenance. |
201 | 202 | //!
|
202 | 203 | //!
|
203 | 204 | //! ## Using Strict Provenance
|
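To make the `addr` / `with_addr` / `map_addr` workflow described in this hunk concrete, here is a minimal sketch of low-bit pointer tagging. The three `*_sketch` helpers are hypothetical stand-ins written only with long-stable operations, so the example builds without the `strict_provenance` feature; they are not the library's implementation. In particular, `addr_sketch` uses a plain cast and therefore cannot make the promise the real `addr` makes; the sketch simply never casts the address back to a pointer.

```rust
/// Hypothetical stand-in for `pointer::addr`: returns only the address.
/// (A plain cast is used so the sketch builds without unstable features.)
fn addr_sketch<T>(ptr: *const T) -> usize {
    ptr as usize
}

/// Hypothetical stand-in for `pointer::with_addr`: the result is derived from
/// `ptr`, so it keeps `ptr`'s provenance and only the address changes.
fn with_addr_sketch<T>(ptr: *const T, addr: usize) -> *const T {
    let delta = (addr as isize).wrapping_sub(addr_sketch(ptr) as isize);
    (ptr as *const u8).wrapping_offset(delta) as *const T
}

/// Hypothetical stand-in for `pointer::map_addr`.
fn map_addr_sketch<T>(ptr: *const T, f: impl FnOnce(usize) -> usize) -> *const T {
    with_addr_sketch(ptr, f(addr_sketch(ptr)))
}

/// Tags the low bit of a pointer to `value`, clears it again, and reads
/// through the untagged pointer.
fn tag_untag_read(value: &u32) -> u32 {
    let ptr: *const u32 = value;
    // `u32` is 4-byte aligned, so the lowest address bit is free for a tag.
    let tagged = map_addr_sketch(ptr, |a| a | 1);
    assert_eq!(addr_sketch(tagged) & 1, 1);
    let untagged = map_addr_sketch(tagged, |a| a & !1);
    assert_eq!(untagged, ptr);
    // SAFETY: `untagged` has the address and provenance of `ptr`, which
    // points to a live `u32`.
    unsafe { *untagged }
}

fn main() {
    assert_eq!(tag_untag_read(&42), 42);
}
```

The point of the design is that every pointer in the chain is derived from the original one, so no provenance ever has to be recovered from a bare integer.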
|
310 | 311 | //! For instance, ARM explicitly supports high-bit tagging, and so CHERI on ARM inherits
|
311 | 312 | //! that and should support it.
|
312 | 313 | //!
|
| 314 | +//! ## Pointer-usize-pointer roundtrips and 'exposed' provenance |
| 315 | +//! |
| 316 | +//! **This section is *non-normative* and is part of the [Strict Provenance] experiment.** |
| 317 | +//! |
| 318 | +//! As discussed above, pointer-usize-pointer roundtrips are not possible under [Strict Provenance]. |
| 319 | +//! However, there exists legacy Rust code that is full of such roundtrips, and legacy platform APIs |
| 320 | +//! regularly assume that `usize` can capture all the information that makes up a pointer. There |
| 321 | +//! also might be code that cannot be ported to Strict Provenance (which is something we would [like |
| 322 | +//! to hear about][Strict Provenance]). |
| 323 | +//! |
| 324 | +//! For situations like this, there is a fallback plan, a way to 'opt out' of Strict Provenance. |
| 325 | +//! However, note that this makes your code a lot harder to specify, and the code will not work |
| 326 | +//! (well) with tools like [Miri] and [CHERI]. |
| 327 | +//! |
| 328 | +//! This fallback plan is provided by the [`expose_addr`] and [`from_exposed_addr`] methods (which |
| 329 | +//! are equivalent to `as` casts between pointers and integers). [`expose_addr`] is a lot like |
| 330 | +//! [`addr`], but additionally adds the provenance of the pointer to a global list of 'exposed' |
| 331 | +//! provenances. (This list is purely conceptual; it exists for the purpose of specifying Rust but |
| 332 | +//! is not materialized in actual executions, except in tools like [Miri].) [`from_exposed_addr`] |
| 333 | +//! can be used to construct a pointer with one of these previously 'exposed' provenances. |
| 334 | +//! [`from_exposed_addr`] takes only `addr: usize` as its argument, so unlike in [`with_addr`] there is |
| 335 | +//! no indication of what the correct provenance for the returned pointer is -- and that is exactly |
| 336 | +//! what makes pointer-usize-pointer roundtrips so tricky to rigorously specify! There is no |
| 337 | +//! algorithm that decides which provenance will be used. You can think of this as "guessing" the |
| 338 | +//! right provenance, and the guess will be "maximally in your favor", in the sense that if there is |
| 339 | +//! any way to avoid undefined behavior, then that is the guess that will be taken. However, if |
| 340 | +//! there is *no* previously 'exposed' provenance that justifies the way the returned pointer will |
| 341 | +//! be used, the program has undefined behavior. |
| 342 | +//! |
| 343 | +//! Using [`expose_addr`] or [`from_exposed_addr`] (or the equivalent `as` casts) means that code is |
| 344 | +//! *not* following Strict Provenance rules. The goal of the Strict Provenance experiment is to |
| 345 | +//! determine whether it is possible to use Rust without [`expose_addr`] and [`from_exposed_addr`]. |
| 346 | +//! If this is successful, it would be a major win for avoiding specification complexity and for |
| 347 | +//! facilitating adoption of tools like [CHERI] and [Miri] that can be a big help in increasing the |
| 348 | +//! confidence in (unsafe) Rust code. |
313 | 349 | //!
|
314 | 350 | //! [aliasing]: ../../nomicon/aliasing.html
|
315 | 351 | //! [book]: ../../book/ch19-01-unsafe-rust.html#dereferencing-a-raw-pointer
|
|
322 | 358 | //! [`map_addr`]: pointer::map_addr
|
323 | 359 | //! [`addr`]: pointer::addr
|
324 | 360 | //! [`ptr::invalid`]: core::ptr::invalid
|
325 |
| -//! [miri]: https://github.com/rust-lang/miri |
| 361 | +//! [`expose_addr`]: pointer::expose_addr |
| 362 | +//! [`from_exposed_addr`]: from_exposed_addr |
| 363 | +//! [Miri]: https://github.com/rust-lang/miri |
326 | 364 | //! [CHERI]: https://www.cl.cam.ac.uk/research/security/ctsrd/cheri/
|
327 | 365 | //! [Strict Provenance]: https://github.com/rust-lang/rust/issues/95228
|
328 | 366 | //! [Stacked Borrows]: https://plv.mpi-sws.org/rustbelt/stacked-borrows/
|
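The "legacy API" situation that the exposed-provenance section in this hunk describes might look like the following sketch. `legacy_for_each` and `add_to_sum` are hypothetical names invented for illustration. The sketch uses plain `as` casts, which the section states are equivalent to `expose_addr` and `from_exposed_addr`, so it does not need the unstable feature.

```rust
/// Hypothetical legacy-style API: the caller's context travels as a bare
/// `usize`, so the provenance of the original pointer is not carried along.
fn legacy_for_each(items: &[u32], context: usize, callback: fn(usize, u32)) {
    for &item in items {
        callback(context, item);
    }
}

/// Callback that turns the `usize` context back into a pointer.
fn add_to_sum(context: usize, item: u32) {
    // Integer-to-pointer cast: per the docs above, this behaves like
    // `from_exposed_addr_mut` and picks up a previously exposed provenance.
    let sum = context as *mut u64;
    // SAFETY: the caller exposed a pointer to a live `u64` at this address
    // and does not access that `u64` directly while callbacks run.
    unsafe { *sum += u64::from(item) };
}

fn sum_with_legacy_api(items: &[u32]) -> u64 {
    let mut sum: u64 = 0;
    // Pointer-to-integer cast: per the docs above, this behaves like
    // `expose_addr` and marks the pointer's provenance as exposed.
    let context = &mut sum as *mut u64 as usize;
    legacy_for_each(items, context, add_to_sum);
    sum
}

fn main() {
    assert_eq!(sum_with_legacy_api(&[1, 2, 3]), 6);
}
```

Code of this shape is what the section calls opting out of Strict Provenance: it is well-defined only because a suitable provenance was exposed before the integer was cast back.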
@@ -547,6 +585,78 @@ pub const fn invalid_mut<T>(addr: usize) -> *mut T {
|
547 | 585 | addr as *mut T
|
548 | 586 | }
|
549 | 587 |
|
| 588 | +/// Convert an address back to a pointer, picking up a previously 'exposed' provenance. |
| 589 | +/// |
| 590 | +/// This is equivalent to `addr as *const T`. The provenance of the returned pointer is that of *any* |
| 591 | +/// pointer that was previously passed to [`expose_addr`][pointer::expose_addr] or a `ptr as usize` |
| 592 | +/// cast. If there is no previously 'exposed' provenance that justifies the way this pointer will be |
| 593 | +/// used, the program has undefined behavior. Note that there is no algorithm that decides which |
| 594 | +/// provenance will be used. You can think of this as "guessing" the right provenance, and the guess |
| 595 | +/// will be "maximally in your favor", in the sense that if there is any way to avoid undefined |
| 596 | +/// behavior, then that is the guess that will be taken. |
| 597 | +/// |
| 598 | +/// On platforms with multiple address spaces, it is your responsibility to ensure that the |
| 599 | +/// address makes sense in the address space that this pointer will be used with. |
| 600 | +/// |
| 601 | +/// Using this method means that code is *not* following strict provenance rules. "Guessing" a |
| 602 | +/// suitable provenance complicates specification and reasoning and may not be supported by |
| 603 | +/// tools that help you to stay conformant with the Rust memory model, so it is recommended to |
| 604 | +/// use [`with_addr`][pointer::with_addr] wherever possible. |
| 605 | +/// |
| 606 | +/// On most platforms this will produce a value with the same bytes as the address. Platforms |
| 607 | +/// which need to store additional information in a pointer may not support this operation, |
| 608 | +/// since it is generally not possible to actually *compute* which provenance the returned |
| 609 | +/// pointer has to pick up. |
| 610 | +/// |
| 611 | +/// This API and its claimed semantics are part of the Strict Provenance experiment; see the |
| 612 | +/// [module documentation][crate::ptr] for details. |
| 613 | +#[must_use] |
| 614 | +#[inline] |
| 615 | +#[unstable(feature = "strict_provenance", issue = "95228")] |
| 616 | +pub fn from_exposed_addr<T>(addr: usize) -> *const T |
| 617 | +where |
| 618 | + T: Sized, |
| 619 | +{ |
| 620 | + // FIXME(strict_provenance_magic): I am magic and should be a compiler intrinsic. |
| 621 | + addr as *const T |
| 622 | +} |
| 623 | + |
| 624 | +/// Convert an address back to a mutable pointer, picking up a previously 'exposed' provenance. |
| 625 | +/// |
| 626 | +/// This is equivalent to `addr as *mut T`. The provenance of the returned pointer is that of *any* |
| 627 | +/// pointer that was previously passed to [`expose_addr`][pointer::expose_addr] or a `ptr as usize` |
| 628 | +/// cast. If there is no previously 'exposed' provenance that justifies the way this pointer will be |
| 629 | +/// used, the program has undefined behavior. Note that there is no algorithm that decides which |
| 630 | +/// provenance will be used. You can think of this as "guessing" the right provenance, and the guess |
| 631 | +/// will be "maximally in your favor", in the sense that if there is any way to avoid undefined |
| 632 | +/// behavior, then that is the guess that will be taken. |
| 633 | +/// |
| 634 | +/// On platforms with multiple address spaces, it is your responsibility to ensure that the |
| 635 | +/// address makes sense in the address space that this pointer will be used with. |
| 636 | +/// |
| 637 | +/// Using this method means that code is *not* following strict provenance rules. "Guessing" a |
| 638 | +/// suitable provenance complicates specification and reasoning and may not be supported by |
| 639 | +/// tools that help you to stay conformant with the Rust memory model, so it is recommended to |
| 640 | +/// use [`with_addr`][pointer::with_addr] wherever possible. |
| 641 | +/// |
| 642 | +/// On most platforms this will produce a value with the same bytes as the address. Platforms |
| 643 | +/// which need to store additional information in a pointer may not support this operation, |
| 644 | +/// since it is generally not possible to actually *compute* which provenance the returned |
| 645 | +/// pointer has to pick up. |
| 646 | +/// |
| 647 | +/// This API and its claimed semantics are part of the Strict Provenance experiment; see the |
| 648 | +/// [module documentation][crate::ptr] for details. |
| 649 | +#[must_use] |
| 650 | +#[inline] |
| 651 | +#[unstable(feature = "strict_provenance", issue = "95228")] |
| 652 | +pub fn from_exposed_addr_mut<T>(addr: usize) -> *mut T |
| 653 | +where |
| 654 | + T: Sized, |
| 655 | +{ |
| 656 | + // FIXME(strict_provenance_magic): I am magic and should be a compiler intrinsic. |
| 657 | + addr as *mut T |
| 658 | +} |
| 659 | + |
550 | 660 | /// Forms a raw slice from a pointer and a length.
|
551 | 661 | ///
|
552 | 662 | /// The `len` argument is the number of **elements**, not the number of bytes.
|
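The docs for `from_exposed_addr` and `from_exposed_addr_mut` added in this hunk recommend deriving pointers from an existing pointer wherever possible. The sketch below contrasts the two approaches on a small array. Both function names are hypothetical, the first uses `as` casts in place of the unstable methods (the docs describe them as equivalent), and the second uses ordinary pointer arithmetic as a simple case of keeping the original provenance.

```rust
use std::mem::size_of;

/// Reads `buf[index]` by going through a bare integer address.
fn element_via_exposed(buf: &[u16; 4], index: usize) -> u16 {
    assert!(index < buf.len());
    // Like `expose_addr`: the provenance of the base pointer becomes exposed.
    let base_addr = buf.as_ptr() as usize;
    let elem_addr = base_addr + index * size_of::<u16>();
    // Like `from_exposed_addr`: nothing here says which provenance is meant;
    // the exposed one from the line above is the one that justifies the read.
    let elem = elem_addr as *const u16;
    // SAFETY: `elem_addr` lies inside `buf`, whose provenance was exposed.
    unsafe { *elem }
}

/// Reads `buf[index]` through a pointer derived from the base pointer.
fn element_via_base(buf: &[u16; 4], index: usize) -> u16 {
    assert!(index < buf.len());
    // The element pointer is derived from `base`, so it keeps `base`'s
    // provenance and no guessing is involved.
    let base = buf.as_ptr();
    let elem = base.wrapping_add(index);
    // SAFETY: `index` is in bounds, so `elem` points to a live `u16` in `buf`.
    unsafe { *elem }
}

fn main() {
    let buf = [10u16, 20, 30, 40];
    assert_eq!(element_via_exposed(&buf, 2), 30);
    assert_eq!(element_via_base(&buf, 2), 30);
}
```

Both functions read the same element; the difference is only in how the provenance of the final pointer is justified, which is what matters to tools that check the memory model.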
@@ -762,7 +872,7 @@ pub const unsafe fn swap_nonoverlapping<T>(x: *mut T, y: *mut T, count: usize) {
|
762 | 872 | );
|
763 | 873 | }
|
764 | 874 |
|
765 |
| - // NOTE(scottmcm) MIRI is disabled here as reading in smaller units is a |
| 875 | + // NOTE(scottmcm) Miri is disabled here as reading in smaller units is a |
766 | 876 | // pessimization for it. Also, if the type contains any unaligned pointers,
|
767 | 877 | // copying those over multiple reads is difficult to support.
|
768 | 878 | #[cfg(not(miri))]
|
|