Message-ID: <CABhP=tYZYC-U1eAp9j8sdnaTUAMQXVSw78XHMkjcyXOdbxRi8Q@mail.gmail.com>
Date: Fri, 13 Jun 2025 10:42:37 +0200
From: Antonio Ojea <antonio.ojea.garcia@...il.com>
To: webmaster@...wa338.de
Cc: Phil Sutter <phil@....cc>, Klaus Frank <vger.kernel.org@...nk.fyi>, 
	netfilter-devel@...r.kernel.org, Pablo Neira Ayuso <pablo@...filter.org>, 
	Florian Westphal <fw@...len.de>, Lukas Wunner <lukas@...ner.de>, netfilter@...r.kernel.org, 
	Maciej Żenczykowski <zenczykowski@...il.com>, 
	netdev@...r.kernel.org
Subject: Re: Status of native NAT64/NAT46 in Netfilter?

On Fri, 13 Jun 2025 at 00:19, <webmaster@...wa338.de> wrote:
>
> On Thu, Jun 12, 2025 at 11:55:06PM +0200, Phil Sutter wrote:
> > On Thu, Jun 12, 2025 at 08:13:02PM +0000, Klaus Frank wrote:
> > > On Thu, Jun 12, 2025 at 09:45:00PM +0200, Phil Sutter wrote:
> > > > On Thu, Jun 12, 2025 at 08:19:53PM +0200, Antonio Ojea wrote:
> > > > > On Thu, 12 Jun 2025 at 15:56, Phil Sutter <phil@....cc> wrote:
> > > > > >
> > > > > > Hi,
> > > > > >
> > > > > > On Thu, Jun 12, 2025 at 03:34:08PM +0200, Antonio Ojea wrote:
> > > > > > > On Thu, 12 Jun 2025 at 11:57, Phil Sutter <phil@....cc> wrote:
> > > > > > > > On Sun, Jun 08, 2025 at 08:37:10PM +0000, Klaus Frank wrote:
> > > > > > > > > I've been looking through the mailing list archives and couldn't find a clear answer.
> > > > > > > > > So I wanted to ask here what the status of native NAT64/NAT46 support in netfilter is?
> > > > >
> > > > > > > we ended up doing a "smart hack", well, really a combination of them,
> > > > > > > to provide a nat64 alternative for kubernetes
> > > > > > > https://github.com/kubernetes-sigs/nat64:
> > > > > > > - a virtual dummy interface to "attract" the nat64 traffic with the
> > > > > > > well-known prefix
> > > > > > > - ebpf tc filters on the dummy interface to do the family conversion,
> > > > > > > using static nat for simplicity
> > > > > > > - and reusing nftables masquerading to avoid reimplementing conntrack
> > > > > >
> > > > > > Oh, interesting! Would you benefit from a native implementation in
> > > > > > nftables?
> > > > >
> > > > > Indeed, we'll benefit a lot; see what we have to do :)
> > > > >
> > > > > > > you can play with it using namespaces (without kubernetes), see
> > > > > > > https://github.com/kubernetes-sigs/nat64/blob/main/tests/integration/e2e.bats
> > > > > > > for a kind of selftest environment
> > > > > >
> > > > > > Refusing to look at the code: You didn't take care of the typical NAT
> > > > > > helper users like FTP or SIP, did you?
> > > > >
> > > > > The current approach does static NAT64 first, switching the IPv6 IPs
> > > > > to IPv4 and adapting the resulting IPv4 packet; the "real nat" is done
> > > > > by nftables on the IPv4 family after that, so ... it may work?
> > > >
> > > > That was my approach as well: The incoming IPv6 packet was translated to
> > > > IPv4 with an rfc1918 source address linked to the IPv6 source, then
> > > > MASQUERADE would translate to the external IP.
> > > >
> > > > In the reverse direction, iptables would derive the right IPv6
> > > > destination address from the given rfc1918 destination address.
> > > >
> > > > The above is a hack which limits the number of IPv6 clients to the size
> > > > of that IPv4 transfer net. Fixing it properly would probably require
> > > > conntrack integration, not sure if going that route is feasible (note
> > > > that I have no clue about conntrack internals).
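
Just to illustrate the limitation being described (a rough sketch, not
anyone's actual implementation): if every IPv6 client is pinned to one
address of the IPv4 transfer net, the bookkeeping and its hard cap look
roughly like this in Go, with a hypothetical 10.64.0.0/24 pool:

package main

import (
    "errors"
    "fmt"
    "net/netip"
)

// mapper pins each IPv6 client to one address of an IPv4 transfer net,
// which is why the number of concurrent IPv6 clients is bounded by the
// size of that net.
type mapper struct {
    free []netip.Addr              // unused transfer-net addresses
    byV6 map[netip.Addr]netip.Addr // IPv6 client -> assigned IPv4 source
}

func newMapper(transferNet netip.Prefix) *mapper {
    m := &mapper{byV6: map[netip.Addr]netip.Addr{}}
    // Collect the host addresses of the prefix (skipping the first and
    // last address).
    for a := transferNet.Addr().Next(); transferNet.Contains(a.Next()); a = a.Next() {
        m.free = append(m.free, a)
    }
    return m
}

// v4SourceFor returns the IPv4 source address to use for a given IPv6
// client, allocating one from the pool on first use.
func (m *mapper) v4SourceFor(client netip.Addr) (netip.Addr, error) {
    if v4, ok := m.byV6[client]; ok {
        return v4, nil
    }
    if len(m.free) == 0 {
        return netip.Addr{}, errors.New("transfer net exhausted")
    }
    v4 := m.free[0]
    m.free = m.free[1:]
    m.byV6[client] = v4
    return v4, nil
}

func main() {
    m := newMapper(netip.MustParsePrefix("10.64.0.0/24")) // hypothetical pool
    fmt.Println("max concurrent IPv6 clients:", len(m.free)) // 254 for a /24
    v4, _ := m.v4SourceFor(netip.MustParseAddr("2001:db8::1"))
    fmt.Println("2001:db8::1 ->", v4)
}

Conntrack integration would presumably replace this per-client address
pinning with per-flow state, the same way masquerading overloads a
single address across many flows.
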
> > >
> > > Well, technically all that needs to be done is NAT66 instead of NAT44
> > > within that hack, and that limitation vanishes.
> >
> > I don't comprehend: I have to use an IPv4 transfer net because I need to
> > set a source address in the generated IPv4 header. The destination IPv4
> > address is extracted from the IPv6 destination address. Simple example:
> >

Yeah, my bad, I was thinking about the Service implementation and the
need to track multiple connections for the same DNAT, and this is
about embedded IPs.
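
To make the embedded-IP part a bit more concrete, here is a minimal Go
sketch of the address mapping in both directions, assuming the RFC 6052
well-known prefix 64:ff9b::/96 (this is not code from
kubernetes-sigs/nat64, just an illustration; the real translation also
has to rewrite the IP headers and recompute checksums):

package main

import (
    "fmt"
    "net/netip"
)

// wkp is the RFC 6052 well-known NAT64 prefix (an assumption here; a
// deployment may use a network-specific prefix instead).
var wkp = netip.MustParsePrefix("64:ff9b::/96")

// extractIPv4 returns the IPv4 address embedded in the last 4 bytes of a
// well-known-prefix IPv6 destination, e.g. 64:ff9b::c000:201 -> 192.0.2.1.
func extractIPv4(v6 netip.Addr) (netip.Addr, bool) {
    if !wkp.Contains(v6) {
        return netip.Addr{}, false
    }
    b := v6.As16()
    return netip.AddrFrom4([4]byte{b[12], b[13], b[14], b[15]}), true
}

// embedIPv4 does the reverse mapping for the return traffic.
func embedIPv4(v4 netip.Addr) netip.Addr {
    b := wkp.Addr().As16()
    o := v4.As4()
    copy(b[12:], o[:])
    return netip.AddrFrom16(b)
}

func main() {
    dst := netip.MustParseAddr("64:ff9b::c000:201")
    if v4, ok := extractIPv4(dst); ok {
        fmt.Println(dst, "->", v4, "->", embedIPv4(v4))
    }
}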

> >
> > > Also, in regards to the above code, it looks like currently only tcp and
> > > udp are supported. All other traffic appears to just be dropped at the
> > > moment instead of being passed through. Is there a particular reason for
> > > this?
> >
> > I guess tcp and udp are simply sufficient in k8s.
>
> doesn't k8s also support sctp? Also, there is still no need to just drop
> everything else; I would have expected it to just be passed through
> without special handling...
>

Ok, k8s support, here be dragons; I'm taking some licenses to keep the
explanation simple:
k8s orchestrates containers with a flat network as its model: every
container needs to be able to talk to every other container in the
cluster without NAT. Kubernetes mandates nothing about protocols here,
only about how the network in the cluster should look; this end-to-end
principle makes the networking "simple" to think about and puts the
complexity on the endpoints.

k8s also offers a mechanism for service discovery, the famous
kube-proxy, which basically does DNAT to some virtual IP and port
configured by the user; that is the thing users configure to expose
their containers, and the API only allows them to set the TCP, UDP or
SCTP protocols. This is what people traditionally mean by "kubernetes
supports SCTP", although the use cases for it are very niche.
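
As a concrete (hypothetical) example of what that API surface looks
like, a Service exposing an SCTP port can be constructed with the
upstream Go types; the Protocol field of the Service port is where TCP,
UDP or SCTP is selected (the name, selector and port below are made up):

package main

import (
    "fmt"

    corev1 "k8s.io/api/core/v1"
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/apimachinery/pkg/util/intstr"
)

func main() {
    // Hypothetical Service named "example-sctp"; the Protocol field only
    // accepts TCP, UDP or SCTP, which is the sense in which "kubernetes
    // supports SCTP".
    svc := corev1.Service{
        ObjectMeta: metav1.ObjectMeta{Name: "example-sctp", Namespace: "default"},
        Spec: corev1.ServiceSpec{
            Selector: map[string]string{"app": "example"},
            Ports: []corev1.ServicePort{{
                Name:       "sctp",
                Protocol:   corev1.ProtocolSCTP,
                Port:       9999,
                TargetPort: intstr.FromInt(9999),
            }},
        },
    }
    fmt.Printf("%s/%s %s:%d\n", svc.Namespace, svc.Name,
        svc.Spec.Ports[0].Protocol, svc.Spec.Ports[0].Port)
}

The DNAT to that virtual IP and port is then programmed by kube-proxy,
as described above.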

The need for NAT64 in k8s exists because, unfortunately, the world is
not IPv6 ready, so people deploy containers that pull images from
registries that are not reachable via IPv6, or use some git repository
that is IPv4 only. This "from container to Internet" traffic is what
needs a translation mechanism and is what this project tries to solve;
it is a stopgap solution, just that.

The "mixing IP families inside the same cluster", so I have some
containers IPv4 and others IPv6 is something I heard people trying to
do, but that basically breaks the networking model of k8s, since means
some containers will only be able to reach other containers through
NAT, I really do not recommend that, as I do not see the benefit, is
something cool to work with, but not something I will use if I have my
company business on that cluster, you better use multiple clusters
with different network domains and ip families and communicate through
the "external to cluster" mechanisms
