Plumb Features through into Routes #433
Conversation
Force-pushed from 1d4f860 to fac80d6.
Force-pushed from f47d4aa to 196fd59.
Overall it's good once #428 is merged, though I have a few questions.
lightning/src/ln/features.rs
Outdated
pub trait FeatureContextInitNode : FeatureContext {}
impl FeatureContextInitNode for FeatureContextInit {}
impl FeatureContextInitNode for FeatureContextNode {}
mod sealed { // You should just use the type aliases instead.
I like it better; you may want to keep at least the comments ("This Context represents when the Feature appear...") to make clear which struct ties to which message.
Note that this got moved into #428, so probably best to keep the conversation there.
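For context before the thread moves on, here is a minimal sketch of the sealed-context pattern referenced in the snippet above, assuming the shape it moved to in #428; the item names beyond those shown in the diff are illustrative, not the actual code:

```rust
// Contexts can only be implemented inside this module, so downstream code
// must use the public type aliases rather than naming contexts directly.
mod sealed {
    /// Only types in this module may implement Context.
    pub trait Context {}

    /// Hypothetical context for features carried in an `init` message.
    pub struct InitContext {}
    impl Context for InitContext {}

    /// Hypothetical context for features carried in a `node_announcement`.
    pub struct NodeContext {}
    impl Context for NodeContext {}
}

/// Feature flags parameterized by the message context they appear in.
pub struct Features<T: sealed::Context> {
    flags: Vec<u8>,
    _context: std::marker::PhantomData<T>,
}

pub type InitFeatures = Features<sealed::InitContext>;
pub type NodeFeatures = Features<sealed::NodeContext>;
```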
Thoughts on marking a PR as "draft" if it is based on an open PR?
Hmm, I can do whatever. I generally just use "draft" to mean "not ready to hit merge if this got acks as-is".
latest_features: init_msg.features.clone(),
}));
},
hash_map::Entry::Occupied(e) => {
Is there any risk of peer collision? An attacker can usurp a pubkey available on the net, but without the privkey it won't get through the noise phase... and if a peer leaks its privkey we can't guarantee consistency of our structs (and we have bigger troubles).
Right. If you can impersonate a peer, that peer is screwed. Not a lot we can do then.
When would this occur? Won't PeerManager disconnect when it sees a duplicate id?
If you're using our PeerManager, yes, this should not be possible, though in theory these functions are public and the user may run their own PeerManager-like thing that may not properly check.
What sort of semantics do we want to provide to users in this case and where do we plan on documenting it? I can see three possibilities:
- Return an error / panic
- Overwrite the peer state with a new entry
- Update the peer state with new data
For (1), my assumption is that (if used properly) there should already have been a peer_disconnected event even if this is a reconnect. So the entry would not exist in that case. Is my understanding accurate?
Currently (2) and (3) are equivalent but may not be in the future. You're implementing this as (3) though the code would be simplified if (2) is the desired behavior (i.e., you could blindly insert into the map instead of matching on the entry).
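As a hypothetical illustration of options (2) and (3) above (the types here are self-contained stand-ins, not the actual rust-lightning definitions):

```rust
use std::collections::hash_map::{self, HashMap};

// Stand-in types so the sketch is self-contained.
type PublicKey = [u8; 33];
struct InitFeatures;
struct PerPeerState { latest_features: InitFeatures }

fn peer_connected(map: &mut HashMap<PublicKey, PerPeerState>,
                  their_node_id: PublicKey, features: InitFeatures) {
    // Option (2), overwrite: blindly insert, dropping any stale entry.
    //     map.insert(their_node_id, PerPeerState { latest_features: features });

    // Option (3), update: keep any existing entry, refresh only the new data.
    match map.entry(their_node_id) {
        hash_map::Entry::Vacant(e) => {
            e.insert(PerPeerState { latest_features: features });
        },
        hash_map::Entry::Occupied(mut e) => {
            e.get_mut().latest_features = features;
        },
    }
}
```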
Good question. I think in general if you don't follow the API, there are no guarantees period. It maybe could be further documented that you must somehow verify that a given message came from something that has proven knowledge of the relevant private key, but if a counterparty's private key leaks, we won't panic, it just may result in the counterparty losing their funds/channels.
Currently (2) and (3) are equivalent but may not be in the future.
Hmmm, not sure I would change it; it's state relative to the connection and shouldn't leak to the ChannelManager layer. If it concerns channel+routing state, then assuming LN messages are ordered, newer ones should take precedence.
It maybe could be further documented that you must somehow verify that a given message came from something that has proven knowledge of the relevant private key
Transport is assumed to be authenticated and encrypted, so verifying that a message came from something which owns the relevant private key is done, I think. That said, how you establish knowledge of the responder's public-key identity is beyond the specs. Key leakage means your funds are lost: just send an HTLC emptying your balance to some exit node; it doesn't fall within the scope of option_upfront_shutdown_script.
lightning/src/ln/channelmanager.rs
Outdated
/// new channel.
/// If we are connected to a peer we always at least have an entry here, to store their features
/// so that we have them available if we open a channel with them and need it for routing.
per_peer_state: RwLock<HashMap<PublicKey, Mutex<PerPeerState>>>,
What parts of ChannelHolder do you plan to move here, by_id?
Further, I'm wondering if we can do better on lock management. Like not putting the lock on the PerPeerState directly, but something like PerPeerState { peer_chans_by_id: RwLock<HashMap<[u8; 32], Mutex<Channel>>>, latest_features: Mutex<InitFeatures> }, though I'm not sure it's worth the complexity if we assume that most of the time you'll have one chan per peer.
I was figuring by_id and pending_msg_events (at least for those message events that are non-broadcast, but I don't think there's harm in having broadcast events in there too). That would mean our message queue and channel state per-peer would be under a lock, and it would all be per-peer, which I think is the right design. Indeed, we could lock per-channel too, but I don't think it's worth it because the caller isn't allowed to have two in-flight message calls for a single peer, so we'd almost never be doing something on both channels at once.
Hmmm, I'm not sure what we win on concurrency by putting pending_msg_events there, unless we change get_and_clear_pending_msg_events to be per-peer too, but isn't that just going to push the bottleneck to PeerManager and its Mutex<PeerHolder>?
but I don't think its worth it because the caller isn't allowed to have two in-flight message calls for a single peer
Could you elaborate? Because I always thought that the ChannelMessageHandler API was flexible enough that you could at least call two handle_* at the same time (even if it's likely you contend for locks inside them).
re: concurrency - the moving of pending_msg_events into per_peer data is more for logical consistency than anything - we have to push events onto that queue while holding the lock we took to edit the Channel data. The only way to keep parallelism is to split the pending_msg_events into per_peer_pending_msg_events and include it in the same Mutex that holds the Channel.
re: elaboration: The ChannelMessageHandler API explicitly states "Messages MAY be called in parallel when they originate from different their_node_ids, however they MUST NOT be called in parallel when the two calls have the same their_node_id". Fundamentally the lightning protocol requires in-order message delivery.
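To make the locking trade-off concrete, a rough sketch under the assumptions above, with simplified stand-in types:

```rust
use std::collections::HashMap;
use std::sync::{Mutex, RwLock};

type PublicKey = [u8; 33];
struct PerPeerState { latest_features: Vec<u8> }

struct Manager {
    per_peer_state: RwLock<HashMap<PublicKey, Mutex<PerPeerState>>>,
}

impl Manager {
    /// Message handling only reads the map structure, so handlers for
    /// *different* peers run in parallel under the outer read lock. Per the
    /// ChannelMessageHandler contract quoted above, calls for the *same*
    /// peer are never concurrent, so the inner Mutex is effectively
    /// uncontended.
    fn handle_message(&self, their_node_id: &PublicKey) {
        let peers = self.per_peer_state.read().unwrap();
        if let Some(peer) = peers.get(their_node_id) {
            let _state = peer.lock().unwrap();
            // ... mutate per-peer state (eventually channels, msg queues) ...
        }
    }

    /// Adding or removing a peer is rare; only these paths take the write
    /// lock, briefly blocking all message handling.
    fn peer_connected(&self, their_node_id: PublicKey, features: Vec<u8>) {
        self.per_peer_state.write().unwrap().insert(
            their_node_id, Mutex::new(PerPeerState { latest_features: features }));
    }
}
```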
lightning/src/ln/channelmanager.rs
Outdated
/// This is particularly useful for routing as many node_announcement-context features are also
/// set in the init-context, and we should almost certainly consider the init-context version
/// to be the latest copy.
pub remote_node_features: InitFeatures,
I'm not sure about assuming the init-context version is the latest copy; a peer may be able to change its advertised services without renewing any connection with us, just by broadcasting a new node_announcement, so between init and node_announcement I think we should reconcile features where we have both states available, i.e. in get_route?
But do we really need to cache them in ChannelManager; can't we store everything in PeerManager? It's harder to see given a lot of features are obviously yet to be designed, so maybe we should at least have an access function returning NodeDetails to let clients select nodes.
Hmm, I was kinda assuming to get new features you'd at least have to restart your node. It may make sense to further clarify the BOLTs in this regard, but certainly today the restart requirement holds (and it may be some time after restart before you create a new announcement, so Init would be the latest-correct).
As for storing it in peer_manager, we could do that, but it just makes for an awkward API - it would mean the client has to call something on ChannelManager, get a list of channels, then go ask PeerHandler for the InitFeatures for each node that we are connected to. It's not exactly an obviously-correct place to store it, but I don't think it's completely absurd for ChannelManager to store the latest InitFeatures we have, especially given it already tracks connectedness state for peers.
I thought at first that features were considered the same as service bits on the base layer, and so stable until restart, but I can't find this explicit requirement in the bolts; maybe we should amend them?
Based on this we may have to adapt the API to reconcile both messages, but if your interpretation holds I'm okay with the API.
The NodeDetails is a different point though, and I think it's needed anyway (e.g. a client may not want to open a channel with nodes not using opt_anchor_output, among others); it can be added in a follow-up PR.
Right, it's definitely not "required" by the BOLTs, but I figured it was a safe assumption. As for init features overriding node_announcement features, I think that's also a reasonable assumption, but I agree that we should make that explicit in the BOLTs.
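A hypothetical helper showing the reconciliation being discussed; none of these names are claimed to be the actual API, and the stand-in types elide real flag handling:

```rust
#[derive(Clone)]
struct InitFeatures(Vec<u8>);
#[derive(Clone)]
struct NodeFeatures(Vec<u8>);

impl NodeFeatures {
    fn empty() -> Self { NodeFeatures(Vec::new()) }
    /// Copy over only init-context flags that are also meaningful in the
    /// node context (the real masking logic is elided here).
    fn from_init(init: &InitFeatures) -> Self { NodeFeatures(init.0.clone()) }
}

/// Prefer the init-context copy when we have a connection, on the assumption
/// (discussed above) that a peer's features are stable until it restarts.
fn effective_node_features(init: Option<&InitFeatures>,
                           announced: Option<&NodeFeatures>) -> NodeFeatures {
    match (init, announced) {
        (Some(i), _) => NodeFeatures::from_init(i),
        (None, Some(a)) => a.clone(),
        (None, None) => NodeFeatures::empty(),
    }
}
```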
}
}
add_entry!(hop.short_channel_id, target, hop, 0);
// BOLT 11 doesn't allow inclusion of features for the last hop hints, which |
Because we may want to use var_onion_optin for routing to the last hop, but we don't have any connection to it, nor have we yet received a node_announcement for it? I even wonder if not allowing this is bad for privacy, because it forces a node to advertise its existence on the net or to allow incoming connections instead of a few selected gateways.
Note that there's currently no reason to need a TLV-based onion for second-to-last hops (and the invoice does provide features for last hops), so it's no big deal. It should probably be fixed eventually, but it doesn't impact B-AMP or so currently. May be worth opening an issue on the BOLT repo.
You mean like amending the r field to provide features with routing hints? You can assume that the receiver will only use nodes feature-compliant with its payment scheme, but we may have more complex schemes though.
Right, ideally r fields could hold feature bits in some form for those nodes, but no reason to need it today.
Force-pushed from 3b748f9 to 7b2276c.
/// State we hold per-peer. In the future we should put channels in here, but for now we only hold
/// the latest Init features we heard from the peer.
struct PerPeerState {
latest_features: InitFeatures,
High-level question: Given this is duplicated in PeerManager's Peer structs, has there been any thought around designing the API in terms of Peer and Channel abstractions?
I'm not sure what you mean - you mean something like replacing the their_node_id argument with some struct that holds more information (e.g. the init message)? We could, I suppose, though note that latest_features here is intended to be available even while disconnected (though maybe it doesn't actually need to be, just nice to have).
I'm mostly thinking whether it would be suitable to design the overall API (not just this module) in terms of Peer and Channel abstractions and use them throughout to build higher-level abstractions like ChannelManager. Admittedly, this is outside the scope of this PR.
Hmm. Maybe. It would certainly be tricky because a Peer "owns" a channel, but also the Peer can disconnect and reconnect and get back the same Channel. We can explore more in the future.
/// Because adding or removing an entry is rare, we usually take an outer read lock and then
/// operate on the inner value freely. Sadly, this prevents parallel operation when opening a
/// new channel.
Not sure that I follow what is meant by "usually" here. Can it be removed?
As in, if you add or remove a connection you'll need the outer write lock, but usually we don't, so we take the read lock.
lightning/src/ln/channelmanager.rs
Outdated
/// If we are connected to a peer we always at least have an entry here, to store their features
/// so that we have them available if we open a channel with them and need it for routing.
This is a bit verbose and may get out of sync with PerPeerState by talking about specific data within it. I'd recommend rewording more succinctly as:
Per-peer state used for opening channels and routing.
This can also replace the first sentence of the documentation since currently that mostly just restates the field name.
Hmm, I mean it's where features get stored, sure, I suppose, though the note that "we'll always have something here if we're connected" is an important thing to document - it indicates when you can rely on an entry vs. having to check whether there is one. I'll just drop the "to store their features..." bit.
My larger point is that if it is important then it should be stated upfront in the documentation (rather than after the locking comments) and done in a succinct manner such that the reader can quickly understand its purpose. IMHO, "Per-peer state storage" doesn't accomplish that as it's essentially restating the field name.
Right, ok, missed the objection to the opening line. I reworded it to "The bulk of our storage will eventually be here (channels and message queues and the like)." I presume that's a bit more descriptive?
Eh, I'd expect the first sentence to use the word "peer" at the very least; "storage" doesn't seem like a primary characteristic of what the field contains. And why is "bulk" relevant to the reader? The fact that this data is persisted seems like an ancillary detail to me. More important is how it is used.
I mean the variable name is pretty clear about it being a peer-indexed object :). As for "storage", it's more of a future-looking thing - once we move channels and pending_msg_events, really almost all of the ChannelManager "stuff" lives there.
I'm not sure what you're referring to about "persisted" there - nothing mentions persistence, only that a precondition exists that there is an entry if the peer is connected (i.e. if you're in a message handler, it's guaranteed to be there, so you don't have to check).
Generally speaking, documentation should primarily state what something is and how it is used (i.e., its purpose). This is the first thing that should be stated. In other words, we are storing it only because it has some primary purpose. By "persisted" I simply meant saved to storage, which seems like an ancillary detail to the fact that this has some purpose.
I can see the argument that some of this documentation belongs on PeerState while the rest belongs on per_peer_state. I would like to see a clear divide between the two. :) In that sense, yes, I see your point: this should mostly talk about locking and how the mapping works. But I'd still argue that storage is not the primary purpose and therefore should not be in the first sentence.
Separately, anything future-looking should not be part of documentation. Just put it in once the code reflects that reality. Or at the very least it should be more of an aside rather than the primary sentence, in my opinion.
Force-pushed from 7b2276c to 865b0ea.
"Fix EnforcingChannelKeys panic when our counterparty burns their $." is #445. The message fuzz target strictness commit I can move into a later PR, but its useful for the TLV stuff and I figured it'd be easier to land it earlier in the stack since the TLV stuff is rather large. |
The message fuzz target strictness commit I can move into a later PR, but its useful for the TLV stuff and I figured it'd be easier to land it earlier in the stack since the TLV stuff is rather large.
If it's related to the TLV stuff, then it should go with it, no? The commit itself is rather small so shouldn't be problematic there.
Force-pushed from 865b0ea to 28abb40.
lightning/src/ln/channelmanager.rs
Outdated
let per_peer_state = self.per_peer_state.read().unwrap();
for chan in res.iter_mut() {
    if let Some(peer_state) = per_peer_state.get(&chan.remote_network_id) {
        chan.remote_node_features = peer_state.lock().unwrap().latest_features.clone();
    }
}
This code is repeated. Could you extract this behavior into a method? Better yet -- since the relationship between list_usable_channels and list_channels is that the former is a subset of the latter -- define list_usable_channels as a filter over list_channels. Then even more code wouldn't be repeated.
Can do one better, instead of filtering after creation we can do it while doing so to avoid pushing ChannelDetails onto the Vec and then removing them :)
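A sketch of that build-time-filter idea, with simplified stand-in types (the real ChannelDetails carries many more fields):

```rust
#[derive(Clone)]
struct ChannelDetails { short_channel_id: Option<u64>, is_live: bool }

struct Manager { chans: Vec<ChannelDetails> }

impl Manager {
    /// One shared helper: the filter runs before the push, so channels that
    /// fail it are never allocated into the result Vec at all.
    fn list_channels_with_filter<F: Fn(&ChannelDetails) -> bool>(&self, f: F)
        -> Vec<ChannelDetails>
    {
        let mut res = Vec::with_capacity(self.chans.len());
        for chan in self.chans.iter() {
            if f(chan) { res.push(chan.clone()); }
        }
        res
    }

    fn list_channels(&self) -> Vec<ChannelDetails> {
        self.list_channels_with_filter(|_| true)
    }

    /// The "usable" list is just the filtered subset.
    fn list_usable_channels(&self) -> Vec<ChannelDetails> {
        self.list_channels_with_filter(|c| c.is_live)
    }
}
```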
lightning/src/ln/channelmanager.rs
Outdated
/// The Features this node provided us upon our last connection with them.
/// This is particularly useful for routing as many node_announcement-context features are also
/// set in the init-context, and we should almost certainly consider the init-context version
/// to be the latest copy.
This documentation is a bit verbose. Some tips:
- Avoid adverbs, especially when they introduce ambiguity. Phrases like "should almost certainly" aren't meaningful to the reader without further explanation as to when what follows does not hold.
- Cut out extraneous words. Does the meaning change if you used "useful" instead of "this is particularly useful"? If not, get rid of the words so the reader has less to parse.
- Avoid first-person pronouns like "we" and "our".
Consider this more concise version:
Features the channel counterparty provided upon last connection. Useful for routing since it contains features also provided by node_announcement messages, only more recent.
I tweaked it some; let me know if you like it. I did want to highlight a bit more that not all routing-relevant features may be present in the init context, but many are.
lightning/src/ln/features.rs
Outdated
/// Takes the flags that we know how to interpret in an init-context features that are also
/// relevant in a channel-context features and creates a channel-context features from them.
pub(crate) fn relevant_init_flags_to_channel(_init_ctx: &InitFeatures) -> Self {
"channel" in the name is redundant with the struct name. Instead, could you use std::convert::From
for this? This is more idiomatic for Rust.
Hmm, at least personally I really hate From unless it's super, duper, absolutely clear, and documentation describing exactly what it does would be wholly redundant. I don't think that's true here because we're only copying over known flags. I changed it to with_known_relevant_init_flags, so callsites read ChannelFeatures::with_known_relevant_init_flags, which I think is pretty clear.
Should this be "supported" rather than "known" features? I'd consider dropping "relevant" as that should be implied by the fact that it returns ChannelFeatures
.
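For illustration, a sketch of the named-constructor shape being settled on here; the mask below is invented (and empty, since no channel-context features are defined yet), not the real flag logic:

```rust
struct InitFeatures { flags: Vec<u8> }
struct ChannelFeatures { flags: Vec<u8> }

impl ChannelFeatures {
    /// Unlike a blanket `From<InitFeatures>` impl, the explicit name makes it
    /// clear at the callsite that only known, channel-relevant flags survive.
    pub(crate) fn with_known_relevant_init_flags(init_ctx: &InitFeatures) -> Self {
        // Hypothetical per-byte mask of flags that are both understood by
        // this implementation and meaningful in the channel context.
        const KNOWN_CHANNEL_RELEVANT_MASK: u8 = 0b0000_0000;
        ChannelFeatures {
            flags: init_ctx.flags.iter()
                .map(|b| b & KNOWN_CHANNEL_RELEVANT_MASK)
                .collect(),
        }
    }
}
```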
lightning/src/ln/channelmanager.rs
Outdated
/// This is particularly useful for routing as many node_announcement-context features are also
/// set in the init-context, and we should almost certainly consider the init-context version
/// to be the latest copy.
pub remote_node_features: InitFeatures,
Can we reconcile the overloading of node_features? Here, it is referring to something of type InitFeatures, but in RouteHop the node_features field has type NodeFeatures. More confusingly, they are used together in Router::get_route.
Took your wording of "counterparty".
Force-pushed from f6328c4 to be74dbf.
Since we want to keep track of the Init-context features for every peer we have channels with, we have to keep them for as long as the peer is connected (since we may open a channel with them at any point). We go ahead and take this opportunity to create a new per-peer-state struct with two levels of mutexes, which is appropriate for moving channel storage to. Since we can't process messages from a given peer in parallel, the inner lock is a regular mutex, but the outer lock is RW so that we can process messages for different peers at the same time with an outer read lock.
if res.last().unwrap().pubkey == *target {
    break;
}
Is there a reason features are set on the last hop but other fields are not? It's unclear to me.
The RouteHop object is a little confusing - some fields are about the channel used to get to the given node and some fields are about the node itself. Every field should be set by the end of the loop, but fee_msat and cltv_expiry_delta are set in the previous iteration.
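A rough sketch of the RouteHop shape being described; the field set is approximated from this discussion and may not match the actual struct, and the types are stand-ins:

```rust
pub type PublicKey = [u8; 33];
pub struct NodeFeatures;
pub struct ChannelFeatures;

pub struct RouteHop {
    /// The node_id of the node this hop arrives at.
    pub pubkey: PublicKey,
    /// Node-context features of that node itself.
    pub node_features: NodeFeatures,
    /// The channel used to get *to* this node from the previous hop.
    pub short_channel_id: u64,
    /// Channel-context features of that channel.
    pub channel_features: ChannelFeatures,
    /// Fee paid at this hop; per the comment above, filled in while
    /// processing the *next* hop's iteration of the loop.
    pub fee_msat: u64,
    /// CLTV delta for this hop; likewise set on the following iteration.
    pub cltv_expiry_delta: u32,
}
```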
Force-pushed from be74dbf to 2804fd3.
This exposes the latest Init-context features in the ChannelDetails passed to the Router during route calculation, which combines those with the Node-context features tracked from node_announcements to provide the latest Node-context features in RouteHop structs. Fields are also added for Channel-context features, though those are only partially used since no such features are defined today anyway. These will be useful when determining whether to use new TLV-formatted onion hop datas when generating onions for peers.
Force-pushed from 2804fd3 to 617a680.
ACK 617a680
Based on #428, this plumbs features through into Routes so that we can use that information when we go to construct onions. This is gonna be required for #430. While we're tracking new info in ChannelManager, we create the map that should eventually hold all channel data and implement #422.