-
Notifications
You must be signed in to change notification settings - Fork 388
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Time out HTLCs before they expire #587
Time out HTLCs before they expire #587
Conversation
Rebase plz? |
d3a418c
to
3a3403a
Compare
Done :). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Partial review
lightning/src/ln/channelmanager.rs
Outdated
pending_msg_events.push(events::MessageSendEvent::SendAnnouncementSignatures { | ||
let res = channel.block_connected(header, height, txn_matched, indexes_of_txn_matched); | ||
if let Ok((chan_res, mut timed_out_pending_htlcs)) = res { | ||
timed_out_htlcs.reserve(timed_out_pending_htlcs.len()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this commit won't compile bc timed_out_htlcs
is declared in the next commit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, every commit seems to compile fine here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oops, think I was looking at the commits in the wrong order (I blame github)
lightning/src/ln/functional_tests.rs
Outdated
@@ -3680,6 +3689,49 @@ fn test_drop_messages_peer_disconnect_dual_htlc() { | |||
claim_payment(&nodes[0], &[&nodes[1]], payment_preimage_2, 1_000_000); | |||
} | |||
|
|||
#[test] | |||
fn test_htlc_timeout() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add a holding cell HTLC that gets failed back, to the test?
I found that case a bit weird since if an HTLC is in the holding cell when it times out, if we are the sender it seems like we could just remove it from the holding cell without failing it backwards.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm. I'm not sure which case you're asking about. This test/the previous commit is really only intended to be useful for inbound HTLCs (which can't be in the holding cell, that's just for outbound stuff or the fail/fulfill action itself) which are to us, as forwarded ones we are mainly focused on going to the chain to broadcast HTLC-Success if it matters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, I was looking at commits out of order.
Thanks for the explanation 👍 though, I thought we still have to close a channel if a forward HTLC times out?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we have the preimage and its outbound, yea, or if its inbound, always, yea. If we don't, its probably not worth the fee to close the channel when we aren't gonna make money for it.
3a3403a
to
a342d32
Compare
}, onion_packet), channel_state, chan) | ||
} { | ||
Some((update_add, commitment_signed, monitor_update)) => { | ||
if let Err(e) = self.monitor.update_monitor(chan.get().get_funding_txo().unwrap(), monitor_update) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Idea for another day, but could there be a way to push a a "test the waters" monitor update, before pushing the real one? to help help dodge these sorts of problems/confusing partial timeouts due to monitor update failure?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, and no. There was a long discussion at #441 (comment) which is somewhat related, but the real issue with this API is its not really possible to implement a client for it. Imagine a hardware wallet that has to implement it - there's no way to guarantee that an update goes through in advance, even if you can send the update to the computer before you need to commit it - what if the user unplugged the device at an inopportune time?
|
||
let (payment_preimage, payment_hash) = get_payment_preimage_hash!(&nodes[0]); | ||
let payment_secret = PaymentSecret([0xdb; 32]); | ||
let mut route = nodes[0].router.get_route(&nodes[3].node.get_our_node_id(), None, &[], 100000, TEST_FINAL_CLTV).unwrap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Worth asserting what route.paths[0] is, to make sure we're not just using 2 of the same path?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pass_along_path does it for us, albeit indirectly.
lightning/src/ln/channelmanager.rs
Outdated
pending_msg_events.push(events::MessageSendEvent::SendAnnouncementSignatures { | ||
let res = channel.block_connected(header, height, txn_matched, indexes_of_txn_matched); | ||
if let Ok((chan_res, mut timed_out_pending_htlcs)) = res { | ||
timed_out_htlcs.reserve(timed_out_pending_htlcs.len()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oops, think I was looking at the commits in the wrong order (I blame github)
lightning/src/ln/channel.rs
Outdated
&HTLCUpdateAwaitingACK::AddHTLC { ref payment_hash, ref source, ref cltv_expiry, ref amount_msat, .. } => { | ||
if cltv_expiry <= &height { | ||
timed_out_htlcs.push((source.clone(), payment_hash.clone(), *amount_msat)); | ||
false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be good to add a test for this case, to ensure holding cell HTLCs are timed out properly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done! Caught a few bugs along the way, of course :/.
lightning/src/ln/channelmanager.rs
Outdated
|
||
channel_state.claimable_htlcs.retain(|&(ref payment_hash, _), htlcs| { | ||
htlcs.retain(|htlc| { | ||
if height >= htlc.cltv_expiry - CLTV_CLAIM_BUFFER - LATENCY_GRACE_PERIOD_BLOCKS { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you help me understand LATENCY_GRACE_PERIOD_BLOCKS
? I would have thought it should be added here if its purpose is to give our peer more time. Specifically, could you tell me if I'm reading this comment incorrectly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or if I'm reading this wrong, maybe it would be clearer if LATENCY_GRACE_PERIOD_BLOCKS
was moved to the other side of the inequality:
if height + LATENCY_GRACE_PERIOD_BLOCKS >= htlc.cltv_expiry - CLTV_CLAIM_BUFFER {
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
claimable_htlcs
isnt about peers, its about the user of RL finding the preimage and claim. Another way that may help would be to look at LATENCY_GRACE_PERIOD_BLOCKS
as "the time it takes us to do a commitment update". I added a comment to this effect.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make sense then to define an appropriately named constant for CLTV_CLAIM_BUFFER + LATENCY_GRACE_PERIOD_BLOCKS
?
I see these constants used together in a few places, so if there is a name that suitably captures this concept, it may be worth defining such a constant to make the code more understandable and document it there instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep!
lightning/src/ln/functional_tests.rs
Outdated
// Expect pending failures, but we don't bother trying to update the channel state with them. | ||
let events = nodes[0].node.get_and_clear_pending_events(); | ||
assert_eq!(events.len(), 1); | ||
match events[0] { | ||
Event::PendingHTLCsForwardable { .. } => { }, | ||
_ => panic!("Unexpected event"), | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment essentially states how the following code block differs from expect_pending_htlcs_forwardable
(i.e., "but we don't bother trying to update the channel state with them"). However, I think the comment would be more useful if it stated why we're not updating the channel state, assuming that this is pertinent to the test.
Also, given that this block of code and preceding comment is repeated a few times, I'd recommend making an appropriately named utility function for it, especially since it is almost identical to expect_pending_htlcs_forwardable
. Then a comment might be unnecessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, not sure what to say aside from "the test doesn't need to do the forwards". I converted it to a macro with "_ignore" so its a bit more DRY, thgouh.
'path_loop: for path in route.paths.iter() { | ||
macro_rules! check_res_push { | ||
($res: expr) => { match $res { | ||
Ok(r) => r, | ||
Err(e) => { | ||
results.push(Err(e)); | ||
continue 'path_loop; | ||
}, | ||
} | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Glad to see a helper method added rather than using this macro! 😀
lightning/src/ln/channelmanager.rs
Outdated
@@ -1227,6 +1227,80 @@ impl<ChanSigner: ChannelKeys, M: Deref, T: Deref, K: Deref, F: Deref> ChannelMan | |||
}) | |||
} | |||
|
|||
// Only public for testing, this should otherwise never be called direcly | |||
pub(crate) fn send_payment_path(&self, path: &Vec<RouteHop>, payment_hash: &PaymentHash, payment_secret: &Option<PaymentSecret>, total_value: u64, cur_height: u32) -> Result<(), APIError> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To others, I found this command helpful for reviewing since the diff contains mostly moved code:
git diff --color-moved=dimmed-zebra --color-moved-ws=allow-indentation-change 9224f01^ 9224f01 lightning/src/ln/channelmanager.rs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
^ this is what i was missing 😄
e799bf6
to
664987b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for adding 9501d64! Much easier to follow the test when it can be broken up into more digestible parts. 😀
lightning/src/ln/channelmanager.rs
Outdated
|
||
channel_state.claimable_htlcs.retain(|&(ref payment_hash, _), htlcs| { | ||
htlcs.retain(|htlc| { | ||
if height >= htlc.cltv_expiry - CLTV_CLAIM_BUFFER - LATENCY_GRACE_PERIOD_BLOCKS { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make sense then to define an appropriately named constant for CLTV_CLAIM_BUFFER + LATENCY_GRACE_PERIOD_BLOCKS
?
I see these constants used together in a few places, so if there is a name that suitably captures this concept, it may be worth defining such a constant to make the code more understandable and document it there instead.
lightning/src/ln/channelmanager.rs
Outdated
|
||
channel_state.claimable_htlcs.retain(|&(ref payment_hash, _), htlcs| { | ||
htlcs.retain(|htlc| { | ||
if height >= htlc.cltv_expiry - CLTV_CLAIM_BUFFER - LATENCY_GRACE_PERIOD_BLOCKS { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Worth adding a check/panic in case the height is much higher than we expect, and we should be closing the channel rather than failing the HTLC backwards?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, we could maybe be more strict about things like "the user is connecting a block at height 20 but we thought we were at height 10", but, in general, there's no reason to fail a channel to the chain just because we Definitely Cannot Claim such an HTLC.
lightning/src/ln/functional_tests.rs
Outdated
commitment_signed_dance!(nodes[0], nodes[1], htlc_timeout_updates.commitment_signed, false); | ||
let events = nodes[0].node.get_and_clear_pending_events(); | ||
match events[0] { | ||
Event::PaymentFailed { payment_hash, rejected_by_dest, error_code } => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Getting this compilation error on this commit: https://i.imgur.com/dXJErz0.png
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be resolved by the later commit: "f rebase on ..."
@@ -3066,7 +3066,12 @@ impl<ChanSigner: ChannelKeys, M: Deref + Sync + Send, T: Deref + Sync + Send, K: | |||
// number of blocks we generally consider it to take to do a commitment update, | |||
// just give up on it and fail he HTLC. | |||
if height >= htlc.cltv_expiry - CLTV_CLAIM_BUFFER - LATENCY_GRACE_PERIOD_BLOCKS { | |||
timed_out_htlcs.push((HTLCSource::PreviousHopData(htlc.prev_hop.clone()), payment_hash.clone(), htlc.value)); | |||
let mut htlc_msat_height_data = byte_utils::be64_to_array(htlc.value).to_vec(); | |||
htlc_msat_height_data.extend_from_slice(&byte_utils::be32_to_array(height)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice catch in this commit
@@ -769,13 +769,19 @@ macro_rules! expect_payment_sent { | |||
} | |||
|
|||
macro_rules! expect_payment_failed { | |||
($node: expr, $expected_payment_hash: expr, $rejected_by_dest: expr) => { | |||
($node: expr, $expected_payment_hash: expr, $rejected_by_dest: expr $(, $expected_error_code: expr, $expected_error_data: expr)*) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not just always check the error code/data?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just because that requires going back and writing in a new error check at every callsite. It could be done in a followup but its quite a bit of work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
finished going through the commits :)
// Pass the first HTLC of the payment along to nodes[3]. | ||
let mut events = nodes[0].node.get_and_clear_pending_msg_events(); | ||
assert_eq!(events.len(), 1); | ||
pass_along_path(&nodes[0], &[&nodes[1], &nodes[3]], 0, payment_hash.clone(), Some(payment_secret), events.pop().unwrap(), false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: could check that we aren't able to claim before the monitor gets successfully updated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pass_along_path checks for us that we don't get the PaymentReceived event (the last false/true bool).
let mut htlc_msat_height_data = byte_utils::be64_to_array(htlc.value).to_vec(); | ||
htlc_msat_height_data.extend_from_slice(&byte_utils::be32_to_array(height)); | ||
timed_out_htlcs.push((HTLCSource::PreviousHopData(htlc.prev_hop.clone()), payment_hash.clone(), HTLCFailReason::Reason { | ||
failure_code: 0x4000 | 15, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see why this failure code is "missing payment details" but above it's "expiry too soon"? expiry too soon makes more sense to me in both situations.. could be wrong but this just doesn't feel like the intended purpose of the "missing payment details" error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The spec for incorrect_or_unknown_payment_details
says "or the CLTV expiry of the htlc is too close to the current block height for safe handling." :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Touché
})); | ||
} | ||
if let Some(funding_locked) = chan_res { | ||
pending_msg_events.push(events::MessageSendEvent::SendFundingLocked { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: could add a log here as well similar to the one just below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm? I'm not sure which you're referring to (the log line three down is sent in either if case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah, missed that
664987b
to
edb93e2
Compare
Addressed all the outstanding feedback, I think. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 👍
edb93e2
to
05904e3
Compare
Great explanation of |
We only do this for incoming HTLCs directly as we rely on channel closure and HTLC-Timeout broadcast to fail any HTLCs which we relayed onwards where our next-hop doesn't update_fail in time.
expect_payment_failed!() was introduced after many of the tests which could use it were written, so we take this opportunity to switch them over now, increasing test coverage slightly by always checking the payment hash expected.
This is a key test for our automatic HTLC time-out logic, as it ensures we don't allow an HTLC which indicates we should wait for additional HTLCs before responding to cause us to force-close a channel due to HTLC near-timeout.
Relatively simple test that, after a monitor update fails, we get the right return value and continue with the bits of the MPP that did not send after the monitor updating is restored.
In case of committing out-of-time outgoing HTLCs, we force ourselves to close the channel to avoid remote peer claims on a non-backed HTLC
05904e3
to
d316f30
Compare
This was broken out from the rest of #441 as @valentinewallace pointed out that it wasn't strictly necessary and #441 was big. I am tagging it 0.0.11 because it is kinda necessary, just not strictly. Namely, if we get a partial payment, but not a full one, we really need to fail back the partial bits before the channel has to go to chain for it. Based on #441, and also includes another test that makes sense in #441 but requires some of the refactors that were already in here so might as well wait.