Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ghost types #16

Open
HKalbasi opened this issue Nov 25, 2021 · 7 comments
Open

Ghost types #16

HKalbasi opened this issue Nov 25, 2021 · 7 comments

Comments

@HKalbasi
Copy link
Member

Not sure if here is the right place, but there are some ideas for filling const generic params with runtime values in ZSTs, that enables dependent types for "proof as program" style things.

Ghost types

A type is ghost if it has no runtime data and effect. That is, a ZST annotated with a ghost keyword:

ghost struct Le<const A: usize, const B: usize>(());
// we control instantiating of this type in a way that it has instance iff A <= B

Why someone would annotate type with ghost? Because it will enable for its const generic parameters to depend on runtime values, so this will be accepted:

let x = 2;
let prf: Le<x, 5> = Le::<x, 5>(());

The x in Le<x, 5> will be considered as a const variable, and only unifies with itself. only immutable variables and function parameters are allowed (that types doesn't contain mutable reference or interior mutablity) to be in const generics of ghost types.

This allows to write function with dependent signature:

fn do_smth(x: usize, y: usize, x_le_y: Le<x, y>) {
  // we know x is lesser than y, so we can for example index an array without bound checking.
}

Which is the whole propose of ghost types. Benefits of powerful type system is clear, it can enable more correct and specified code, less unsafe code, less runtime checks, ... This kind of thing is actually the tool for formal verification in languages like Coq. I don't explain it more as you are probably already aware of.

Type checking

How we type check such function calls? It will reuse const generic infrastructures. For example that function will become roughly equivalent to this for type checking propose:

fn do_smth<const x: usize, const y: usize>(x_le_y: Le<x, y>);

And in call site, we consider each immutable binding as a const variable and each parameter as an abstract const expression, so in:

do_smth(f(a), g(2), prf);
  • f and g should be const fn
  • if we assume g(2) is 7, type of prf should unify with Le<f(a), 7>.

so this is actually different with:

let b = f(a);
do_smth(b, g(2), prf);

as here f can be anything and type of prf is Le<b, 7> in this case.

Code generation

After type checking, we replace all ghost types with (), so that function will become:

fn do_smth(x: usize, y: usize, x_le_y: ());

which is normal function today.

I have some extenstions like ghost fn in mind, but I think this is enough for starting discussion.

@lcnr
Copy link
Contributor

lcnr commented Nov 26, 2021

How we type check such function calls? It will reuse const generic infrastructures. For example that function will become roughly equivalent to this for type checking propose:

fn do_smth<const x: usize, const y: usize>(x_le_y: Le<x, y>);

and

After type checking, we replace all ghost types with (), so that function will become:

fn do_smth(x: usize, y: usize, x_le_y: ());

seems to disagree with each other.


More generally, while I would love a language like Rust with dependent types, I do not believe that we should add them to Rust itself.

There are three reasons for this:

  • adding it to Rust without any breaking changes will leave us with large parts of std/the ecosystem at large not supporting it, which makes using it a lot more cumbersome than needed
  • dependent types really do not mesh well with our current query system and would need fairly large changes to how typeck and const evaluation works. Even feature(generic_const_exprs) feels like its pretty much at the edge of what we can support.
  • const fns are not guaranteed to evaluate successfully, so you can't use fn(MyStatement) -> ! as a proof of ¬MyStatement and I don't believe that adding something like ghost fn or pure fn for this is worth it. To my knowledge this is an important part of any useful proof assistant.

Other members of the language or compiler team might be a bit more open to something like your proposal, but even then, I do not assume this to be something we're going to tackle any time soon.

An imo far more promising approach would be unique/existential lifetimes outside of functions. While using them will be more complex than actual dependent types, these lifetimes can already be used to assert things about the runtime behavior of your programs, e.g. https://twitter.com/lcnr7/status/1390653250383355904

edit: for more, section 6.3 of https://raw.githubusercontent.com/Gankra/thesis/master/thesis.pdf is fairly nice

@HKalbasi
Copy link
Member Author

seems to disagree with each other.

Can you explain more? That just means we type check ghost types as const generics, but remove them before monomorphisation and code generation.

adding it to Rust without any breaking changes will leave us with large parts of std/the ecosystem at large not supporting it, which makes using it a lot more cumbersome than needed

There are already three style of api: panicking, fail with Option/Result, and unsafe. A forth style that takes proofs as input can be added?

dependent types really do not mesh well with our current query system and would need fairly large changes to how typeck and const evaluation works. Even feature(generic_const_exprs) feels like its pretty much at the edge of what we can support.

Dependent types (types that depends on value) are already here, but they are limited to compile-time constant, and it isn't because typeck (Correct me if I'm wrong), it is because compiler can't generate code for [i32; runtime_value] and need to monomorphize types for code generation. If we limit usage of types that depend on runtime values to ZSTs, we can remove them in code generation. Ghost type is all about it. It shouldn't make difference in type checking,

const fns are not guaranteed to evaluate successfully, so you can't use fn(MyStatement) -> ! as a proof of ¬MyStatement and I don't believe that adding something like ghost fn or pure fn for this is worth it. To my knowledge this is an important part of any useful proof assistant.

fn(MyStatement) -> ! guarantees that ¬MyStatement is partially correct, i.e. programs that use it to proof other properties will work as intended or won't terminate (successfully). And it is not special to fn(MyStatement) -> !. Existence of Le<2, 5> only proofs that Le<2, 5> is partially correct, as its proof can come from let x: Le<2, 5> = todo!(). So it can't be used for proofing mathematics statements, but it is fine for rust programs. An annotation for total functions is useful (even for non-const functions and regardless of ghost types) but it is another topic.

@lcnr
Copy link
Contributor

lcnr commented Nov 26, 2021

Can you explain more?

if we typeck these as const parameters, they can never be used with runtime values, so there isn't really a need to codegen them at all, they could just be const fn's or even associated consts of a generic type.

There are already three style of api: panicking, fail with Option/Result, and unsafe. A forth style that takes proofs as input can be added?

we not just want things to take proofs as inputs, we also want to add proofs of postconditions as outputs, don't we? These three kinds of api only rarely exist for the same functionality, so I believe you underestimate the necessary amount of duplication. Esp. because the "proof" versions are pretty much a strict improvement over non proof methods, so having both seems suboptimal.

Dependent types (types that depends on value) are already here, but they are limited to compile-time constant, and it isn't because typeck (Correct me if I'm wrong)

To me, the core part of dependent typing is that values from a context can flow into types in that same context. Rusts implementation of const generics seperate computations used in types from the places where these types are used which makes it a lot easier to querify. Right now we can separately compute any const arguments during the typeck where these constants are used. Once these values are used in types which exist in the same context as they do themselves, the separation between const eval and typeck breaks down. We then have to partially evaluate a function while typechecking it which is far from trivial.

Finally, these statements seem contradictory

fn(MyStatement) -> ! guarantees that ¬MyStatement is partially correct, i.e. programs that use it to proof other properties will work as intended or won't terminate (successfully).

After type checking, we replace all ghost types with ()

if we replace all ghost types with () after typeck we have to catch all "false proofs" during compile time, for a function like
fn proof_of_not_P(x: &str) -> fn(P<x>) -> ! this seems pretty much impossible to me.

This is at the edge of my knowledge, but more generally: for ghost types to be erasable before codegen you need proof irrelevance, i.e. the values of your proof objects must not flow into non proof objects. I am not sure how one would add proof irrelevance to rust.

@meowjesty
Copy link

I believe this is related to The pi type trilogy, especially Fully dependent pi types..

@HKalbasi
Copy link
Member Author

Esp. because the "proof" versions are pretty much a strict improvement over non proof methods, so having both seems suboptimal.

They are more cumbersome to use so it isn't improvement in this sense, and because of this most people won't use the proofed versions. So existence of normal version isn't bad imo. Unsafe codes, crictical code and library implementations can use it. Normal versions can be a wrapper over proofed versions.

That said, I understand your concern. Specially, good names are reserved for non proofed version, and this can make using proofed versions even more cumbersome. But the ability of express proofs is still very useful imo, even if it won't be used by default.


Finally, these statements seem contradictory

Function pointers are not zst, so they can not be (part of) ghost types. Whole propose of declaring a type as ghost is to using runtime values for its const generic parameters, not marking a type that will be used as proof. Another type that isn't ghost but can be used as proof is this:

pub enum LogicalOr<A, B> {
    Left(A),
    Right(B),
}

If A and B are ghosts, this still has one bit of data and can affect runtime behavior. We don't remove it (nor function pointers) from runtime. If someone wants to use fn (T) -> ! to proof something else, they need to actually call it, and if it is wrong, we may reach that call and panic (or loop), but no bad thing will happen. For example:

fn example(a: usize, b: usize, not_a_le_b: fn(Le<a, b>) -> !) -> Le<b, a> {
    match decide_le(a, b) {
        Left(a_le_b) => not_a_le_b(a_le_b),
        Right(b_le_a) => b_le_a,
    }
}

If proof of not_a_le_b is cheat, and b_le_a is wrong, we will see panic (or non termination).

This is at the edge of my knowledge, but more generally: for ghost types to be erasable before codegen you need proof irrelevance.

Ghost types are zst, and each two instance of a zst type are equal. For other proofs, there is no proof irrelevance, and we don't erase them.


if we typeck these as const parameters, they can never be used with runtime values, so there isn't really a need to codegen them at all, they could just be const fn's or even associated consts of a generic type.

We consider each function parameter and let binding a const variable, build a new function for each function, typeck these functions, then remove these functions and generic parameters of ghost types, and continue as normal. This type checking is only a check, like borrow checking, and won't affect anything else.

For example:

fn f1(x: usize) -> P<x> {
    todo!();
}

fn f2(a: usize, b: P<a>) -> bool {
    let c = f1(a);
    c == b
}

fn f3() {
    let x = 2;
    let px = f1(x);
    let b = f2(x, px);
}

At first we type check this code:

fn f1<const PARAM_x: usize>(x: usize) -> P<PARAM_x> {
    todo!();
}

fn f2<const PARAM_a: usize>(a: usize, b: P<PARAM_a>) -> bool {
    let c = f1::<PARAM_a>(a);
    c == b
}

fn f3<const LOCAL_x: usize>() {
    let x = 2;
    let px = f1::<LOCAL_x>(x);
    let b = f2::<LOCAL_x>(x, px);
}

And if it succeed, we completely ignore them and go with this:

fn f1(x: usize) -> P {
    todo!();
}

fn f2(a: usize, b: P) -> bool {
    let c = f1(a);
    c == b
}

fn f3() {
    let x = 2;
    let px = f1(x);
    let b = f2(x, px);
}

We then have to partially evaluate a function while typechecking it which is far from trivial.

Ghost types don't change how we unify const generics, so if 1+N won't be unified with N+1, 1+runtime_x won't be unified with runtime_x+1 as well. This harms usability of proofs, but can be worked around with axiomatization.

@lcnr
Copy link
Contributor

lcnr commented Mar 29, 2022

Looked more into dependent types and took some time to think more about the current status of const generics. I disagree with some of my previous statements here. That is less than ideal, considering that I am the lead of the const generics effort. I wasn't as thoughtful of my authority as I should have, sorry for that.

One of the more interesting things I've looked it is an accepted proposal for adding dependent types to Haskell.

While I still believe that this issue is something we don't have the capacity for right now and I don't intend to spend a lot of time on this subject at least until a basic form of #![feature(generic_const_exprs)] is close to stable, I am going to quickly summary my current position here. I am not yet an expert on dependent types, so that opinion may again change in the future.

  • adding the abilities for locals and function parameters to be influence types is something we may add in the future. Given infinite resources we should add it.
    • I don't want to work towards it or flesh out the design for that right now, as there are a lot of - imo more important - parts of const generics we have to finish first.
    • I intend to be careful when stabilizing features so that they do not prevent us from adding this in the future.
  • Adding ghost types does not seem like the right approach to me. The actual things getting erased should be generic parameters, not types. Guaranteeing that a generic parameter does not impact the runtime behavior is fairly straightforward and something I am interested in. This is also related to polymorphization, where this erasure happens implicitly.
    • This is again something I do not have the capacity for rn.

I intend to open a new design doc for "allow locals and function parameters to be used in types" which should contain a summary of this issue in the near future. At that point I will close this issue so that only design docs are open in this repository. I prefer zulip for active discussions.

@HKalbasi
Copy link
Member Author

I also lost my interest on this, particularly because it would not work for proofing properties of the mutable, imperative code, which is idiomatic and performant rust. That said, it would be still helpful in encoding properties at api boundary in the type system.

The actual things getting erased should be generic parameters, not types.

We can simulate erasing generic parameters by wrapping them in some ghost type, so there is some equivalency, I think annotating types is better than annotating generic parameters if it needs some explicit annotation, but both would work anyway.

I don't intend to spend a lot of time on this subject at least until a basic form of #![feature(generic_const_exprs)] is close to stable

And probably dependent const generics, i.e. X<const A: usize, const B: Foo<A>> needs to be fully implemented/near stabilization, which needs ADT in const generics as well (I don't know their feature flags). So yeah, it is not actionable, and summarizing into a design doc and closing this is a good outcome for now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants