SIIT-DC with EAM not working on Ubuntu 20.04 #338
Well, why did you switch from iptables to netfilter?
Netfilter Jool is very greedy, and my first hypothesis would be that pool6
is probably eating up all your IPv4 traffic.
For example, from your "ping my router" test:
1. Jool machine writes echo request 172.16.29.6 -> 172.16.29.1
2. Router responds echo reply 172.16.29.1 -> 172.16.29.6
3. Jool translates that into echo reply 64:ff9b::172.16.29.1 ->
64:ff9b::172.16.29.6
4. Packet gets lost because nobody's listening at 64:ff9b::172.16.29.6 or
something like that
Possible solutions:
1. Use blacklist4 [0] to prevent 172.16.29.1 and/or 172.16.29.6 from being
translated
2. Stop using pool6, use instead the EAMT to specify exactly which
addresses should be translated
3. Move setup to iptables so you can filter normally
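For reference, option 1 could look something like the following (a rough sketch; the /24 is just an illustration, and I'm assuming the "default" instance your config file creates):
$ sudo jool_siit blacklist4 add 172.16.29.0/24   # keep the translator's own subnet out of SIIT
$ sudo jool_siit blacklist4 display              # confirm the entry took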
Also, you might want to install Jool 4.1.2 so you can enable debug easily
and see what's happening: [1]
(You will probably want to uninstall 4.0.7 first.)
[0] https://jool.mx/en/usr-flags-blacklist4.html
[1] https://jool.mx/en/usr-flags-global.html#logging-debug
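And on 4.1.x, enabling the debug log is just a global toggle (a sketch; assumes the "default" instance, and the messages land in the kernel log):
$ sudo jool_siit global update logging-debug true
$ sudo dmesg --follow        # watch Jool's per-packet decisions
$ sudo jool_siit global update logging-debug false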
On Sat, Aug 1, 2020 at 11:11 AM Tristan <***@***.***> wrote:
Hi, I'm trying to set up SIIT-DC so that I can offer v4 access to services
in my v6-only network for legacy clients. I was able to get everything
working by compiling jool from source on ubuntu 18.04 and using iptables.
Once my test setup worked, I decided to upgrade to ubuntu 20.04 and use
netfilter and everything broke apart. NAT64 is working great but SIIT is
completely broken. I am running jool 4.0.7 packaged in the ubuntu focal
repository.
Here are my configuration files:
***@***.***:~$ cat /etc/jool/jool.conf
{
"comment": "Configuration for the systemd NAT64 Jool service.",
"instance": "default",
"framework": "netfilter",
"global": {
"comment": "NAT64 prefix",
"pool6": "64:ff9b::/96"
}
}
***@***.***:~$ cat /etc/jool/jool_siit.conf
{
"comment": "Sample full SIIT configuration.",
"instance": "default",
"framework": "netfilter",
"global": {
"comment": "pool6 and the RFC6791v4 pool belong here, ever since Jool 4.",
"pool6": "64:ff9b::/96",
},
"eamt": [
{
"ipv6 prefix": "2607:fa48:6ed8:8a54:3::",
"ipv4 prefix": "172.16.30.2"
}
]
}
***@***.***:~$ sudo jool_siit instance display
+--------------------+-----------------+-----------+
| Namespace | Name | Framework |
+--------------------+-----------------+-----------+
| b95e2100 | default | netfilter |
+--------------------+-----------------+-----------+
***@***.***:~$ sudo jool_siit eamt display
+---------------------------------------------+--------------------+
| IPv6 Prefix | IPv4 Prefix |
+---------------------------------------------+--------------------+
| 2607:fa48:6ed8:8a54:3::/128 | 172.16.30.2/32 |
+---------------------------------------------+--------------------+
***@***.***:~$ sudo jool_siit global display
manually-enabled: true
pool6: 64:ff9b::/96
lowest-ipv6-mtu: 1280
logging-debug: false
zeroize-traffic-class: false
override-tos: false
tos: 0
mtu-plateaus: 65535,32000,17914,8166,4352,2002,1492,1006,508,296,68
amend-udp-checksum-zero: false
eam-hairpin-mode: intrinsic
randomize-rfc6791-addresses: true
rfc6791v6-prefix: (unset)
rfc6791v4-prefix: (unset)
My router has a static route directing 64:ff9b::/96 to
2607:fa48:6ed8:8a51::64 which is my machine running jool. It also has a
route sending traffic destined to 172.16.30.2 through 172.16.29.6.
Here are my ip addresses:
***@***.***:~$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether c2:6a:23:da:d4:c9 brd ff:ff:ff:ff:ff:ff
inet 172.16.29.6/24 brd 172.16.29.255 scope global eth0
valid_lft forever preferred_lft forever
inet6 2607:fa48:6ed8:8a51:c06a:23ff:feda:d4c9/64 scope global dynamic mngtmpaddr noprefixroute
valid_lft 431991sec preferred_lft 3591sec
inet6 2607:fa48:6ed8:8a51::64/64 scope global
valid_lft forever preferred_lft forever
inet6 fe80::c06a:23ff:feda:d4c9/64 scope link
valid_lft forever preferred_lft forever
When jool_siit is running (jool_siit global update manually-enabled true)
ipv4 access seems to break and translations according to the EAMT don't
happen at all. If I ping my router 172.16.29.1 from that machine, it
times out. If I ping my machine 172.16.29.6 from the router, it times
out. However, as soon as I stop jool_siit (jool_siit global update
manually-enabled false) all of those ipv4 pings start working again.
If I run curl http://172.16.30.2 on my router, it just times out instead
of loading the page. Running curl http://[2607:fa48:6ed8:8a54:3::] works
fine.
I thought it was better because distros are moving away from iptables to nftables. I have reverted my setup back to iptables and added the correct iptables and ip6tables prerouting rules with the jool chain. I also removed the global pool6 section from my config. This is all that is left of it.
My ipv4 access to that vm is no longer broken; however, attempting to run the curl test against 172.16.30.2 still fails.
You did the first part of the second option, but ignored the second part:
Since you removed pool6, you now have to provide a replacement for the translation it was performing; without one, the packet flow likely breaks at the address Jool can no longer translate.
Just to clarify: You don't have to do both options simultaneously. Replacing pool6 and moving to iptables are both self-sufficient solutions meant to solve the same problem.
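If you do go with iptables, a minimal sketch could look like this (instance name and rule details are illustrative, not taken from your setup; double-check the flag names against the docs):
$ sudo jool_siit instance add "default" --iptables
$ sudo jool_siit global update pool6 64:ff9b::/96
$ # Hand Jool only the traffic you actually want translated:
$ sudo ip6tables -t mangle -A PREROUTING -d 64:ff9b::/96 -j JOOL_SIIT --instance "default"
$ sudo iptables -t mangle -A PREROUTING -d 172.16.30.2 -j JOOL_SIIT --instance "default"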
I should have realized this sooner: Which router are you talking about? You said this was SIIT-DC, so I thought yours was an IPv6-only network. Is this router outside of your domain? Or does your IPv6 router have an IPv4 route? Does it also have an IPv4 address? What is the purpose of the IPv4 address?
The EAMT is clearly defined in the config snippet I posted above...
Yes, my kubernetes network is completely ipv6-only. My Jool box has ipv4 access because it needs to also run nat64 for my kubernetes services to be able to access the ipv4 internet. My goal here is to be able to get end-users who don't have ipv6 access to connect to my ipv6-only services on my kubernetes load balancer (2607:fa48:6ed8:8a54:3::).
The router has ipv4 and ipv6 access and is controlled by me.
Yes
The purpose is to allow ipv4 clients to connect to my services and also to allow my services to access ipv4 resources while most of my network is ipv6-only.
Yes, you have an EAM entry that can be used to translate the packet's destination address, but you don't have one for the source address. That's what pool6 was meant for. Suppose random Internet node 1.2.3.4 makes a request to your server 172.16.30.2. You want the packet flow to look like this:
1. IPv4 side: 1.2.3.4 -> 172.16.30.2
2. IPv6 side: 64:ff9b::1.2.3.4 -> 2607:fa48:6ed8:8a54:3:: (source translated via pool6, destination via the EAMT)
You currently do not have a means to translate 1.2.3.4 into an IPv6 address. Either revert pool6, or add an EAMT entry that covers the IPv4 source addresses you expect.
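A sketch of both alternatives, in case it helps (the 2001:db8 and 198.51.100 prefixes are placeholders, not recommendations):
$ # Alternative A: bring pool6 back, so any IPv4 source maps into 64:ff9b::/96
$ sudo jool_siit global update pool6 64:ff9b::/96
$ # Alternative B: cover the IPv4 clients you expect with an explicit EAMT mapping
$ sudo jool_siit eamt add 2001:db8:46::/120 198.51.100.0/24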
For what it's worth: #339 and your original problem seem to be the same bug. I'm currently investigating further.
Can you still debug this?
I now agree that this is a bug.
When I wrote this, I had forgotten that Jool has a built-in "generic blacklist" that is supposed to prevent this from happening. The logic is "if the IPv4 packet's destination address belongs to the translator's interface, cancel translation." For some reason, this appears to not be working on your end. Even more unfortunately, I cannot reproduce the problem. To understand what Jool is thinking, we have a couple of options.
Another idea pops to mind: please post the output of the stats command.
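In other words, something along these lines (assuming 4.1.x and the "default" instance; --all also prints the counters that are currently zero):
$ sudo jool_siit stats display --all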
Hey, sorry for the silence. I'm super busy with school right now because I have a final tomorrow. I will respond properly this weekend.
Hey, apologies for the long delay, I am back with a lot of free time. I can certainly keep debugging this as I'd love for this setup to eventually work!
I tried that and the stats command shows no output at all.
I installed Jool 4.1.2-1 and enabled debugging using the userspace tools, but I wasn't able to get any meaningful info because it seems a ton of traffic is being sent to jool and the kernel log gets totally spammed from the SSH traffic. Perhaps I need to edit my iptables rules to only send traffic destined to the EAMT addresses into the jool chain? What kind of debug info are we looking for?
Also, here are my config files if you'd like to try to reproduce this on your end:
My iptables rules:
You can always filter out the logging blocks that do not involve 172.16.29.6 in any way.
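Something like this might keep the noise down while debug logging is on (just a sketch; tune the amount of context to taste):
$ sudo dmesg --follow | grep --line-buffered -B2 -A8 '172\.16\.29\.6'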
Well, that would probably fix the ping too. Which would be great for solving the problem, but not so much for debugging the bug.
WAIT. Wait. Waitwaitwaitwait. I just realized. Why do you have a Stateful NAT64 Jool configuration? SIIT-DC officially relies solely on stateless translators.
Perhaps the NAT64 instance is swallowing all the traffic. This would also explain why you're getting no stats. But it's strange. The stateless translator is listed before the stateful one, so I'd expect the former to have more priority.
Thank you. I'll try to reproduce this again with the new information, and will hopefully have more questions later.
So apparently, SIIT wasn't applying the generic blacklist to the destination address of incoming IPv4 packets. The strangest part is that, even though this should cause untold mayhem immediately, I was somehow unable to reproduce it for the longest time.
From my reading of the history, this bug first appeared during the 2019-10-30 RFC7915 review. Before that, the generic blacklist behavior used to be
- Source address: Always enabled
- Destination address: Always enabled
RFC7915 wanted me to disable the generic blacklist for the source address for ICMP errors, but for some reason what I actually did was
- Source address: Disabled on ICMP errors
- Destination address: Disabled on ICMP errors
Then, during the 7915 graybox testing of 2020-02-14, I ran into some problem with this and nesciently reverted back to
- Source address: Always enabled
- Destination address: Always enabled
Then, during another graybox batch test on 2020-05-20, it became
- Source address: Disabled on ICMP errors
- Destination address: Always disabled
This commit changes it into what I believe is the correct behavior (and which is consistent with RFC 7915):
- Source address: Disabled on ICMP errors
- Destination address: Always enabled
This commit fixes the translator-router ping of #338, and also probably the entirety of #339.
Found the cause of the loss of ping between the router and the translator. I just applied the patch to master. Can you test it? (I still think you should get rid of the Stateful NAT64 instance.)
Just for the sake of completeness, here's a checklist of the routing I configured to make it work (in addition to patching the code):
The Vyos machine needs to route 64:ff9b:: and 172.16.30.2 through the translator:
me@vyos:~$ sudo ip route add 64:ff9b::/96 via 2607:fa48:6ed8:8a51::64
me@vyos:~$ # I'm assuming the entire 172.16.30 network is reserved for EAMT usage,
me@vyos:~$ # but I might be overdoing it. But whatever.
me@vyos:~$ sudo ip route add 172.16.30.0/24 via 172.16.29.6
The Vyos machine needs forwarding enabled:
me@vyos:~$ sudo sysctl -w net.ipv4.conf.all.forwarding=1
me@vyos:~$ sudo sysctl -w net.ipv6.conf.all.forwarding=1
And so does the translator (though it's not as crucial):
me@k8s-natdns64:~$ sudo sysctl -w net.ipv4.conf.all.forwarding=1
me@k8s-natdns64:~$ sudo sysctl -w net.ipv6.conf.all.forwarding=1
The Kubernetes machine needs a 64:ff9b::/96 route towards Vyos. In my case, I just defaulted it:
me@kubernetes:~$ sudo ip route add default via 2607:fa48:6ed8:8a54:1::
Did the same for the translator, for both protocols:
me@k8s-natdns64:~$ sudo ip route add default via 2607:fa48:6ed8:8a51::1
me@k8s-natdns64:~$ sudo ip route add default via 172.16.30.1
With this configuration, I was able to perform the following pings from vyos. Sniffing the traffic, I didn't notice anything out of place:
me@vyos:~$ ping 172.16.29.6 # Answered by the translator
me@vyos:~$ ping 172.16.30.2 # Answered by kubernetes
I think that's all.
I just cloned Jool master, then compiled and installed the userspace tools and kernel module.
I quickly skimmed over this and everything is exactly like my setup. I also disabled the stateful NAT64 translator for the time being. Unfortunately, the connectivity issue is not resolved. However, I did a bunch of tcpdumping and can confirm that the issue seems to be exactly what you speculated in your first reply:
If I ping my router from the jool box, the ping is sent to vyos over ipv4, and vyos responds over ipv4 and sends it back to the jool box. Jool then translates 172.16.29.1 to 64:ff9b::172.16.29.1, so an icmp6 reply is received at 64:ff9b::172.16.29.6 and nothing happens, because ping is expecting a response at 172.16.29.6.
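(For anyone trying to reproduce this, a capture roughly like the following on the translator's eth0 shows both legs of that exchange; the filter is only a sketch:)
$ sudo tcpdump -ni eth0 'icmp or icmp6'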
Did you uninstall the previous v4.1.2 version? If you installed one of them from the .deb package, and the other from the code, both will exist in your system and one of them will have precedence over the other. So it's possible you're still running old code. I just uploaded a commit which bumps Jool's version number from 4.1.2.0 to 4.1.2.1. Try uninstalling the old version, installing this new one, and making sure that it prints the intended version number (in both the userspace client and the kernel module).
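A quick way to check which code will actually be loaded (a sketch; the "filename" field tells you whether the DKMS build or the packaged module takes precedence):
$ modinfo jool_siit | grep -iE 'filename|version'
$ dpkg -l | grep -i jool          # any leftover .deb packages
$ sudo dmesg | grep -i jool       # the kernel log usually shows the version when the module is inserted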
I think I had the wrong version installed. I uninstalled the old version.
I rolled back to the commit before you bumped the version number, made sure the right DKMS module was installed, and everything works now! I even turned back on the stateful NAT64 translator just to test, and both can coexist just fine on the same box. Thanks for helping me out; I think we can close this issue once the kernel module's version number is bumped.
Thanks for the feedback! Currently releasing 4.1.3; closing.
EDIT: Just to clarify, going back to iptables and adding the right iptables rules from the docs makes everything work again, but I would rather use netfilter for simplicity. Also, when in iptables mode I can't interact with my jool box over ipv4 either; all traffic is dropped (but nat64/dns64/siit all work fine). Not sure if that's a bug or intentional.