Message-ID: <4B4F24AC.70105@trash.net>
Date:	Thu, 14 Jan 2010 15:05:32 +0100
From:	Patrick McHardy <kaber@...sh.net>
To:	Netfilter Development Mailinglist 
	<netfilter-devel@...r.kernel.org>
CC:	Linux Netdev List <netdev@...r.kernel.org>,
	containers@...ts.linux-foundation.org
Subject: RFC: netfilter: nf_conntrack: add support for "conntrack zones"

The attached largish patch adds support for "conntrack zones",
which are virtual conntrack tables that can be used to separate
connections from different zones, making it possible to handle multiple
connections with equal identities in conntrack and NAT.

A zone is simply a numerical identifier associated with a network
device that is incorporated into the various hashes and used to
distinguish entries in addition to the connection tuples. Additionally,
it is used to separate conntrack defragmentation queues. An iptables
target for the raw table could be used as an alternative to the network
device for assigning conntrack entries to zones.
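
To illustrate the idea, here is a minimal Python sketch (not the patch's
kernel code) of a table in which the zone is part of the lookup key. The
names ConnTuple and ZonedTable are hypothetical and exist only for this
illustration; real conntrack tuples carry more fields.

```python
from typing import Dict, NamedTuple, Optional, Tuple


class ConnTuple(NamedTuple):
    """Simplified connection tuple (hypothetical, for illustration only)."""
    src: str
    sport: int
    dst: str
    dport: int
    proto: str


class ZonedTable:
    """Toy conntrack table: entries are keyed by (zone, tuple)."""

    def __init__(self) -> None:
        self._entries: Dict[Tuple[int, ConnTuple], str] = {}

    def insert(self, zone: int, tup: ConnTuple, state: str) -> bool:
        key = (zone, tup)
        if key in self._entries:
            return False  # clash: same tuple within the same zone
        self._entries[key] = state
        return True

    def lookup(self, zone: int, tup: ConnTuple) -> Optional[str]:
        return self._entries.get((zone, tup))


table = ZonedTable()
t = ConnTuple("192.168.0.1", 1024, "10.0.0.1", 80, "tcp")
table.insert(1, t, "flow from network A")
table.insert(2, t, "flow from network B")
```

Both inserts succeed because the identical tuple lives in two different
zones; inserting the same tuple twice into one zone would be rejected.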

This is mainly useful when connecting multiple private networks that use
the same addresses (which unfortunately happens occasionally): the
packets are passed through a set of veth devices and each network is
SNATed to a unique address, after which they can pass through the "main"
zone and be handled like regular non-clashing packets and/or have NAT
applied a second time based, e.g., on the outgoing interface.

Something like this, with multiple tunl and veth devices, each pair
using a unique zone:

  <tunl0 / zone 1>
     |
  PREROUTING
     |
  FORWARD
     |
  POSTROUTING: SNAT to unique network
     |
  <veth1 / zone 1>
  <veth0 / zone 0>
     |
  PREROUTING
     |
  FORWARD
     |
  POSTROUTING: SNAT to eth0 address
     |
  <eth0>
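
The first SNAT step in the diagram above can be sketched in Python as
follows. This is only a toy model of the address rewriting, not of
netfilter itself; the per-zone addresses and the function name first_pass
are made up for illustration.

```python
from typing import Dict, Tuple

# Hypothetical per-zone SNAT addresses; the values are assumptions.
ZONE_SNAT_ADDR: Dict[int, str] = {1: "10.255.0.1", 2: "10.255.0.2"}

Flow = Tuple[str, int, str, int]  # (src, sport, dst, dport)


def first_pass(zone: int, flow: Flow) -> Flow:
    """POSTROUTING in a per-network zone: SNAT to that zone's unique address."""
    _, sport, dst, dport = flow
    return (ZONE_SNAT_ADDR[zone], sport, dst, dport)


# Two private networks sending from the same source address and port.
clashing: Flow = ("192.168.0.1", 1024, "198.51.100.7", 80)
after_a = first_pass(1, clashing)  # arrived via the zone 1 tunnel
after_b = first_pass(2, clashing)  # arrived via the zone 2 tunnel
```

After the first pass the two flows have different source addresses, so
zone 0 sees them as ordinary non-clashing connections and the second SNAT
to the eth0 address is regular NAT.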

As probably everyone has noticed, this is quite similar to what you
can do using network namespaces. The main reason for not using
network namespaces is that it's an all-or-nothing approach: you can't
virtualize just connection tracking. Besides the difficulties in
managing different namespaces from, e.g., an IKE or PPP daemon running
in the initial namespace, network namespaces have quite a large
overhead, especially when used with a large conntrack table.

I'm not too fond of this partial feature duplication myself, but I
couldn't think of a better way to do this without the downsides of
using namespaces. Having partially shared network namespaces would
be great, but it doesn't seem to fit in the design very well.
I'm open to any better suggestions :)

A couple of notes on the patch:

- it's not entirely finished yet (ctnetlink and xt_connlimit are
  missing); I wanted to have a discussion about the general idea first.

- the patch uses ct_extend to avoid increasing the connection tracking
  entry size when this feature is not used. An older version of this
  patch added the zone identifier to the conntrack tuples. That greatly
  simplifies the changes to the code, since the zone doesn't have to be
  passed around (something like 40 lines total), but has the downside
  of increasing the tuple size.

- the overhead should be quite small; it's mainly the extra argument
  passing and an occasional extra comparison. Code size increase with
  all netfilter options enabled on x86_64 is 152 bytes.
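
The trade-off in the second note can be sketched roughly as follows, in
Python rather than kernel C. The zone lives in an optional extension
instead of in the tuple, so comparisons need the zone as a separate
argument. The names Conn, conn_zone and same_conn are hypothetical, and
treating a missing extension as zone 0 is an assumption made for this
sketch.

```python
from dataclasses import dataclass, field
from typing import Any, Dict

DEFAULT_ZONE = 0  # assumption: entries without a zone extension are in zone 0


@dataclass
class Conn:
    """Toy conntrack entry with optional extensions (hypothetical layout)."""
    tuple_: tuple
    ext: Dict[str, Any] = field(default_factory=dict)


def conn_zone(ct: Conn) -> int:
    """Return the entry's zone, falling back to the default when unset."""
    return ct.ext.get("zone", DEFAULT_ZONE)


def same_conn(ct: Conn, tuple_: tuple, zone: int) -> bool:
    """Match on the tuple plus the separately passed zone argument."""
    return ct.tuple_ == tuple_ and conn_zone(ct) == zone


plain = Conn(("192.168.0.1", 1024, "10.0.0.1", 80))
zoned = Conn(("192.168.0.1", 1024, "10.0.0.1", 80), ext={"zone": 1})
```

Entries that never use zones carry no extra data, at the cost of the extra
argument and comparison in same_conn.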

Any comments welcome.

View attachment "01.diff" of type "text/x-patch" (50284 bytes)