lists.openwall.net
Open Source and information security mailing list archives
Date:	Thu, 14 Apr 2016 17:14:07 -0700
From:	Joe Stringer <joe@....org>
To:	Florian Westphal <fw@...len.de>
Cc:	David Laight <David.Laight@...lab.com>,
	"netfilter-devel@...r.kernel.org" <netfilter-devel@...r.kernel.org>,
	"diproiettod@...are.com" <diproiettod@...are.com>,
	"netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: Re: [PATCH nf] netfilter: ipv6: Orphan skbs in nf_ct_frag6_gather()

On 14 April 2016 at 01:40, Florian Westphal <fw@...len.de> wrote:
> David Laight <David.Laight@...LAB.COM> wrote:
>> From: Joe Stringer
>> > Sent: 13 April 2016 19:10
>> > This is the IPv6 equivalent of commit 8282f27449bf ("inet: frag: Always
>> > orphan skbs inside ip_defrag()").
>> >
>> > Prior to commit 029f7f3b8701 ("netfilter: ipv6: nf_defrag: avoid/free
>> > clone operations"), ipv6 fragments sent to nf_ct_frag6_gather() would be
>> > cloned (implicitly orphaning) prior to queueing for reassembly. As such,
>> > when the IPv6 message is eventually reassembled, the skb->sk for all
>> > fragments would be NULL. After that commit was introduced, rather than
>> > cloning, the original skbs were queued directly without orphaning. The
>> > end result is that all frags except for the first and last may have a
>> > socket attached.
>>
>> I'd have thought that the queued fragments would still want to be
>> resource-counted against the socket (I think that is what skb->sk is for).
>
> No, ipv4/ipv6 reasm has its own accounting.
>
>> Although I can't imagine why IPv6 reassembly is happening on an skb
>> associated with a socket.
>
> Right, that's a much more interesting question -- both ipv4 and
> ipv6 orphan skbs before the NF_HOOK prerouting trip.
>
> (That being said, I don't mind the patch, I'm just curious how this
>  can happen).

The topology is quite simple: a veth pair connects a network namespace
to an OVS bridge, and the bridge has an internal port. The bridge is
configured with flows that send packets through conntrack (causing
packet reassembly + refragmentation on output) and then forward packets
between the host and the veth. The internal port and the veth inside
the netns have IP addresses configured in the same subnet.
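For reference, a reproduction of this topology might look roughly like the
following sketch. All names (br0, veth0/veth1, ns0) and addresses are
illustrative, and the OVS flows are elided since the exact conntrack +
forwarding rules depend on the test harness:

```shell
# Hypothetical reproduction sketch; names and addresses are illustrative.
ip netns add ns0
ip link add veth0 type veth peer name veth1
ip link set veth1 netns ns0

ovs-vsctl add-br br0            # the bridge device br0 is the internal port
ovs-vsctl add-port br0 veth0
ip link set veth0 up

ip addr add fc00::1/64 dev br0  # internal port and netns veth share a subnet
ip link set br0 up
ip netns exec ns0 ip addr add fc00::2/64 dev veth1
ip netns exec ns0 ip link set veth1 up
ip netns exec ns0 ip link set lo up

# Flows that send traffic through conntrack and forward between the two
# ports would be installed here (e.g. via ovs-ofctl add-flow br0 ...);
# elided, as the exact ruleset depends on the test.

# Large ping from the namespace to the internal port, forcing fragmentation:
ip netns exec ns0 ping -6 -c 1 -s 2000 fc00::1
```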

In the test case, we send a large ICMPv6 echo request from the
namespace to the internal port. The namespace fragments the IPv6
message and sends the fragments through the veth. OVS processes these,
sends them to conntrack (reassembly), then decides to output to the
internal port (refragmenting). The host stack finally receives the
fragments and processes the ICMPv6 request. On response, the host sends
several fragments to OVS. OVS reassembles these and sends them to
conntrack, then decides to forward to the veth. At that point the skb
frag list is in a state where many skbs (all except the first and last)
have skb->sk populated, and we hit the BUG_ON(skb->sk) in
ip6_fragment() just prior to transmitting to the veth.
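For context, the check that fires sits in the frag-list fast path of
ip6_fragment(), which assumes every queued fragment has already been
orphaned. This is an approximate paraphrase of the kernel code of that
era (net/ipv6/ip6_output.c), not a verbatim or compilable excerpt:

```c
/* Approximate excerpt: ip6_fragment() fast path over an existing
 * frag list. Each fragment is expected to be socket-less already.
 */
skb_walk_frags(skb, frag) {
	if (frag->len > mtu ||
	    skb_headroom(frag) < (hlen + hroom + sizeof(struct frag_hdr)))
		goto slow_path_clean;

	if (skb_shared(frag))
		goto slow_path_clean;

	BUG_ON(frag->sk);	/* trips when a middle fragment kept its socket */
	if (skb->sk) {
		frag->sk = skb->sk;
		frag->destructor = sock_wfree;
	}
	skb->truesize -= frag->truesize;
}
```

This is why only the middle fragments matter: the fast path walks the
frag list, and any entry that still carries a socket reference hits the
BUG_ON.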

Regarding your question about prerouting: does the response even hit
the input path on the host? An ICMPv6 response is generated and needs
to be directed out to the device (output path); then, when the internal
device receives it, OVS processing starts.
