Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Specify unwinding #638

Closed
wants to merge 1 commit into from
Closed

Specify unwinding #638

wants to merge 1 commit into from

Conversation

Aatch
Copy link
Contributor

@Aatch Aatch commented Jan 21, 2015

Specifies behaviour around unwinding. Especially with respect to non-Rust code.


Rendered

unsafe when used incorrectly) can be used to mark functions as specifically not unwinding. This is
primarily for performance reasons in cases where the function cannot be inferred to not unwind. The
primary use case for this attribute is the `oom` function that aborts the process in the case of a
out-of-memory condition.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity, I think I remember hearing that code size benefited from this change, do you have some examples or numbers that this could cite to show the benefits that it reaps?

@nikomatsakis
Copy link
Contributor

cc me

@dgrunwald
Copy link
Contributor

Instead of #[unsafe_no_unwind], why not #[no_panic] with the similar semantics as C++ noexcept? (see also: http://discuss.rust-lang.org/t/no-panic/1356)
I think you'd still get most of the optimization benefit, but don't risk undefined behavior.

The semantics of #[no_panic] would be:

  • A caller of a no_panic function is guaranteed that we won't attempt to unwind out of the function.
  • If a panic does happen within a no_panic function, the process is aborted.
  • There is no guarantee how many Drop impls will run when unwinding into a no_panic function (but if a Drop impl does run, all other Drop impls that should have run prior are guaranteed to do so, in the correct order)

Because #[no_panic] allows the compiler to see whether a function might panic in cross-crate calls, it could skip emitting unwinding infos for more functions because we're guaranteed not to unwind into them.
The last point allows the compiler to reuse the same aborting code for all no_panic functions, as it doesn't need to call drop for any locals used by the functions. But it goes further: it allows the compiler to automatically mark any functions as no_panic if they are only reachable from other no_panic functions in the call graph. And any explicit panic!() calls in no_panic functions can be directly turned into an abort() call.
Thus, a program that doesn't use try or threads, and marks the main function as #[no_panic] would always abort() instead of panicking, and wouldn't have to link against the rust unwinding logic at all! (though you'd need LTO for the the compiler to be able to figure out the complete call graph)

@alexcrichton
Copy link
Member

@dgrunwald I think that the #[no_panic] attribute you're describing may be able to work in tandem with the #[unsafe_no_unwind] attribute being proposed here. The compiler would have to emit extra code for a safe version of #[no_panic] (force abort on unwind), but the unsafe version of the attribute has a clear analog in LLVM which may allow for optimizations it may not otherwise be able to deduce.

In general the semantics of #[no_panic] that you're describing sound very similar to the nounwind attribute in LLVM as I believe it already performs much of the propagating analysis.


@Aatch could you update this RFC with some of the feedback in this thread? There hasn't been much discussion on this but I think that it may be ready for merging soon.

@dgrunwald
Copy link
Contributor

AFAIK the only optimizations that #[unsafe_no_unwind] can do and #[no_panic] can't are those where the undefined behavior "travels back in time" and causes the compiler to eliminate whole code paths.

For example, I'd expect that #[unsafe_no_unwind] will optimize away all bounds checking (which should probably be mentioned in the RFC, since it makes the attribute more dangerous than one might naively expect from the current description)

With my no_panic, there's really two directions of propagation:
a) Functions that only calls no_panic functions and does not cause panics itself, can never be observed to unwind, and thus be marked no_panic. (propagation: callees -> caller)
b) Functions that are only called by no_panic functions will always unwind into a no_panic function. Because no_panic doesn't have to run drop impls, it's impossible to observe when exactly during unwinding the process was aborted, and thus we can mark the function no_panic (propagation: callers -> callee)

I was under the impression that this corresponds to the noexcept semantics in C++ -- looking more closely, C++ just says that unwinding into noexcept functions is implementation-defined. g++ seems to handle noexcept the way as I'm proposing; but clang still unwinds within noexcept functions, and LLVM nounwind seems to do only the callee->caller propagation :(

Example:

fn set_index<T>(arr: &mut [T], idx: usize, val: T) {
    arr[idx] = val
}
#[no_panic] // or #[no_unwind], not sure how we should call this
fn main() {
   let arr : &mut[Box<int>] = get_array();
   set_index(arr, 5, Box::new(42));
}

When the index 5 is out-of-range, by my no_panic rules, the compiler does not have to call the Box destructor in set_index. This is especially important when set_index gets inlined into main: I don't want inlining to cause main to have unwinding tables despite being marked #[no_panic], just so that a box can get destroyed before the process is aborted.

@dgrunwald
Copy link
Contributor

Back to the main part of the RFC: catching panics and converting them to error codes.

The mechanism suggested by this RFC is unsafe, because catching panics is unsafe in the presence of TLS. Moreover, one cannot easily wrap the mechanism behind a safe API, because reasoning about the safety requires global knowledge about the usage of TLS in the program.

In my mind, this makes the try mechanism useless where it is needed most: FFI binding libraries that attempt to provide a safe Rust API around a C library that involves callbacks. Aborting the whole process remains the only valid choice for those libraries.

@carllerche
Copy link
Member

@dgrunwald Why would #[unsafe_no_unwind] optimize bound checks away? Bound checking guards against more than segv crashes, for example, it prevents accessing memory that isn't owned by the container (and could contain secrets). Optimizing bound checks away seems more dangerous than one would otherwise expect.

@codyps
Copy link

codyps commented Feb 8, 2015

@carllerche if you can assume a lack of panic, all panics become unreachable markers. And bounds check failures panic. As a result, the compiler is able to assume that all of the bounds checks succeed.

As to #[unsafe_no_unwind]'s safety: it's got unsafe in the name to warn users that they have to be very careful about using it, just like the unsafe keyword we already have.

@alexcrichton
Copy link
Member

@dgrunwald

If a panic does happen within a no_panic function, the process is aborted.

This was one of the points you mentioned for #[no_panic]. I'm curious, how do you envision this being implemented? I would personally implement this via a "try/catch" block, but that would incur a cost whereas #[unsafe_no_unwind] would not incur any cost (it's just an LLVM attribute).

@dgrunwald
Copy link
Contributor

I'm curious, how do you envision this being implemented?

I had hoped that it would be possible to get the unwinding implementation to abort the process when a function is marked in some way in the unwinding tables (or even simpler: if the function is not present in the unwinding tables).

Unfortunately it looks like LLVM has no good way to do this, and indeed needs a try/catch block: http://lists.cs.uiuc.edu/pipermail/cfe-dev/2014-May/036960.html
:-(

@aturon
Copy link
Member

aturon commented Mar 5, 2015

ping @alexcrichton

@alexcrichton alexcrichton added the T-lang Relevant to the language team, which will review and decide on the RFC. label May 27, 2015
@alexcrichton
Copy link
Member

Unassigning myself as I believe this has now moved under the T-lang umbrella

@alexcrichton alexcrichton removed their assignment May 28, 2015
@aturon aturon self-assigned this Jul 16, 2015
@nikomatsakis
Copy link
Contributor

We discussed this RFC today in the language subteam meeting, and decided that it ought to be closed in favor of RFC #1236, which covered roughly the same ground. Thanks @Aatch nonetheless. There are also a few bits in here that may be worth pulling out for a later RFC, such as the annotations for functions that govern whether landing pads are created or not.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T-lang Relevant to the language team, which will review and decide on the RFC.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants