Jool pool4 flush crashes with debug kernel to use-after-free #368

Closed
terofinn opened this issue Nov 12, 2021 · 5 comments
@terofinn

terofinn commented Nov 12, 2021

Jool's pool4 db flush seems to always crash. It looks like a use-after-free, based on the memory poison values in the registers.
0x6b = POISON_FREE
I added some printk debug output to src/mod/common/db/pool4/db.c.

Jool version is 4.1.5 and kernel version is 4.19.181.

The following memory debugging options are enabled in the kernel:

CONFIG_DEBUG_PAGEALLOC=y
CONFIG_PAGE_POISONING=y

Backtrace from the crash:

...
[  383.079504] Clear TCP mark ffff888037bf35c8
[  383.081727] Clear UDP mark ffff888037bf35d0
[  383.084181] TABLE ffff888024784138
[  383.086095] Clear ICMP mark ffff888037bf35d8
[  383.088352] TABLE ffff888077a78008
[  383.090185] Clear TCP addr ffff888037bf35e0
[  383.092358] Clear TCP addr ffff888037bf35e8
[  383.094532] TABLE ffff88802c0c5778
[  383.096714] TABLE ffff888029996e18
[  383.098555] TABLE ffff888074902198
[  383.100485] stack segment: 0000 [#1] SMP PTI
[  383.102707] CPU: 0 PID: 4986 Comm: jool Kdump: loaded Tainted: G           O      4.19.181+smp-debug #1
[  383.106977] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
[  383.110607] RIP: 0010:rbtree_foreach+0x63/0x90 [jool_common]
[  383.112689] Code: 89 df 41 ff d5 48 85 ed 75 20 eb 3d 48 39 d8 75 2a 48 89 ef 4c 89 e6 48 89 eb 41 ff d5 48 8b 45 00 48 83 e0 fc 48 89 c5 74 1f <48> 39 5d 10 48 8b 45 08 75 d8 48 85 c0 74 d8 eb ac 48 c7 c7 d8 6d
[  383.119453] RSP: 0018:ffffc90001f239d8 EFLAGS: 00010202
[  383.121391] RAX: 6b6b6b6b6b6b6b68 RBX: ffff8880749021a8 RCX: 0000000000140012
[  383.123993] RDX: 0000000000140013 RSI: 0000000000000000 RDI: 0000000000000246
[  383.126629] RBP: 6b6b6b6b6b6b6b68 R08: 0000000000000000 R09: 0000000000000000
[  383.129290] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[  383.131884] R13: ffffffffa125fa28 R14: ffff8880248ad098 R15: ffff88802a1761d8
[  383.134612] FS:  00007ffff783c080(0000) GS:ffff88807d800000(0000) knlGS:0000000000000000
[  383.137582] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  383.139652] CR2: 000055555557d048 CR3: 00000000207b2000 CR4: 00000000000006f0
[  383.142254] Call Trace:
[  383.143209]  rbtree_clear+0xe/0x20 [jool_common]
[  383.144937]  clear_trees+0xbe/0xe0 [jool_common]
[  383.146687]  pool4db_flush+0x1e/0x30 [jool_common]
[  383.148463]  handle_pool4_flush+0x3e/0xd0 [jool_common]
[  383.150424]  ? handling_hairpinning_siit+0xf0/0xf0 [jool_common]
[  383.152637]  ? is_hairpin_nat64+0x40/0x40 [jool_common]
[  383.154620]  genl_family_rcv_msg+0x18a/0x390
[  383.156772]  genl_rcv_msg+0x47/0x90
[  383.158226]  ? genl_family_rcv_msg+0x390/0x390
[  383.159875]  netlink_rcv_skb+0x37/0xf0
[  383.161270]  genl_rcv+0x24/0x40
[  383.162453]  netlink_unicast+0x16c/0x210
[  383.163926]  netlink_sendmsg+0x1ca/0x3e0
[  383.165471]  sock_sendmsg+0x13/0x20
[  383.167492]  ___sys_sendmsg+0x23b/0x280
[  383.169647]  ? ___sys_recvmsg+0x134/0x190
[  383.171984]  ? __handle_mm_fault+0x9f7/0xf20
[  383.174510]  ? _raw_spin_unlock+0x24/0x30
[  383.176904]  ? __handle_mm_fault+0x65d/0xf20
[  383.179558]  __sys_sendmsg+0x47/0x80
[  383.181873]  do_syscall_64+0x50/0x1a0
[  383.184176]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  383.187289] RIP: 0033:0x7ffff7b93431
[  383.189255] Code: ad 9b 00 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b6 0f 1f 80 00 00 00 00 8b 05 1a e0 00 00 85 c0 75 16 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 57 c3 66 0f 1f 44 00 00 41 54 41 89 d4 55 48
[  383.197250] RSP: 002b:00007fffffffe868 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
[  383.200006] RAX: ffffffffffffffda RBX: 000055555557ae10 RCX: 00007ffff7b93431
[  383.202873] RDX: 0000000000000000 RSI: 00007fffffffe8a0 RDI: 0000000000000003
[  383.205560] RBP: 000055555557af30 R08: 00007fffffffe970 R09: ffffffff00000000
[  383.208740] R10: 000055555557a010 R11: 0000000000000246 R12: 000055555557ad20
[  383.211346] R13: 00007fffffffe8a0 R14: 00007fffffffeb08 R15: 0000555555577620
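
One way to read the RAX/RBP value in the dump above, as a sketch rather than a confirmed diagnosis: the kernel's rb_parent() clears the two low bits of a node's parent/color word to get the parent pointer. If that word was read from freed memory filled with 0x6b, the masked result is 0x6b6b6b6b6b6b6b68, which matches the registers. The helper names below are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/* Byte value of the kernel's POISON_FREE, as noted in the report. */
#define POISON_FREE_BYTE 0x6b

/* Build a 64-bit word as it would look inside a freed, poisoned object. */
static uint64_t poisoned_word(void)
{
    uint64_t w = 0;
    for (int i = 0; i < 8; i++)
        w = (w << 8) | POISON_FREE_BYTE;
    return w;
}

/*
 * Same masking that rb_parent() applies: the two low bits of the
 * parent/color word hold the node color, the rest is the parent pointer.
 */
static uint64_t parent_from_parent_color(uint64_t parent_color)
{
    return parent_color & ~(uint64_t)3;
}
```

Feeding poisoned_word() into parent_from_parent_color() yields 0x6b6b6b6b6b6b6b68, so the faulting access is consistent with following the parent link of a node that had already been freed.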

@terofinn terofinn changed the title Jool pool4 flush crashes with debug build Jool pool4 flush crashes with debug kernel to use-after-free Nov 12, 2021
@ydahhrk
Member

ydahhrk commented Nov 12, 2021

The jool's pool4 db flush seem to always crash

When you say "always," you mean even when there's nothing in the table?

And if not, do you have a sample population add/remove/flush sequence?

@ydahhrk
Member

ydahhrk commented Nov 12, 2021

Ok, I think I found the bug: Line 60 or 62 deletes the parent, then lines 68-69 attempt to dereference it. Duh.

I suppose I could fix it, but support for kernels 3.11 and older was abandoned a long time ago, so the right solution is to drop rbtree_foreach() in favor of rbtree_postorder_for_each_entry_safe().
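
The pattern behind rbtree_postorder_for_each_entry_safe() can be sketched in userspace C. This is not Jool's code or the eventual fix: struct node, tree_insert() and tree_clear() are hypothetical names, and a plain unbalanced binary tree stands in for the kernel rbtree. The point it illustrates is that the post-order successor is read while the current node is still valid, and that post-order guarantees a parent is only freed after both of its children.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a tree node; not the kernel's struct rb_node. */
struct node {
    struct node *parent, *left, *right;
    int key;
};

/* Deepest node reachable by preferring left, then right: first in post-order. */
static struct node *leftmost_deepest(struct node *n)
{
    for (;;) {
        if (n->left)
            n = n->left;
        else if (n->right)
            n = n->right;
        else
            return n;
    }
}

static struct node *first_postorder(struct node *root)
{
    return root ? leftmost_deepest(root) : NULL;
}

/* Post-order successor of n. Must be called before n is freed. */
static struct node *next_postorder(struct node *n)
{
    struct node *parent = n->parent;

    /* n was the left child and a right sibling exists: visit that subtree. */
    if (parent && parent->right && parent->right != n)
        return leftmost_deepest(parent->right);
    /* Otherwise both subtrees are done, so the parent is next (NULL at root). */
    return parent;
}

/* Plain binary-search-tree insert (no rebalancing); returns the root. */
static struct node *tree_insert(struct node *root, int key)
{
    struct node *n = calloc(1, sizeof(*n));
    struct node **link = &root, *parent = NULL;

    if (!n)
        return root;
    n->key = key;
    while (*link) {
        parent = *link;
        link = key < parent->key ? &parent->left : &parent->right;
    }
    n->parent = parent;
    *link = n;
    return root;
}

/* Free every node; returns how many were freed. */
static int tree_clear(struct node *root)
{
    struct node *pos, *next;
    int freed = 0;

    for (pos = first_postorder(root); pos; pos = next) {
        next = next_postorder(pos); /* read links while pos is still alive */
        free(pos);
        freed++;
    }
    return freed;
}
```

The loop in tree_clear() never touches a node after freeing it, which is the property the crashing traversal lacked according to the comment above.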

@ydahhrk
Member

ydahhrk commented Nov 12, 2021

How about now?

@terofinn
Author

Thanks, works fine now!

@terofinn
Author

Hmm, I already closed this issue, but maybe it should remain open until the fix is in master?

@terofinn terofinn reopened this Nov 15, 2021
@ydahhrk ydahhrk added this to the 4.1.6 milestone Dec 3, 2021