linux-kernel - Re: [PATCH] openvswitch: Orphan frags before sending to userspace via Netlink to avoid guest stall

Message-ID: <5319EC8E.2010606@redhat.com>
Date:	Fri, 07 Mar 2014 16:58:06 +0100
From:	Thomas Graf <tgraf@...hat.com>
To:	Pravin Shelar <pshelar@...ira.com>,
	Zoltan Kiss <zoltan.kiss@...rix.com>
CC:	Jesse Gross <jesse@...ira.com>,
	"dev@...nvswitch.org" <dev@...nvswitch.org>,
	xen-devel@...ts.xenproject.org, netdev <netdev@...r.kernel.org>,
	LKML <linux-kernel@...r.kernel.org>, kvm@...r.kernel.org
Subject: Re: [PATCH] openvswitch: Orphan frags before sending to userspace
 via Netlink to avoid guest stall

On 03/07/2014 05:46 AM, Pravin Shelar wrote:
> But I found a bug in the datapath user-space queue code. I am not sure
> how this can work with skb fragments and an MMAP-netlink socket.
> Here is what happens: OVS allocates a netlink skb and adds fragments to
> the skb using skb_zerocopy(), then calls genlmsg_unicast().
> But if the netlink sock is mmapped, then netlink-send queues only the
> netlink-allocated skb->head (linear data of skb) and ignores skb frags.
>
> Currently this is not a problem with OVS vswitchd since it does not use
> netlink MMAP sockets. But if vswitchd starts using an MMAP-netlink
> socket, it can break.

The secret is out ;-)

I was very surprised too when I noticed that it worked. It's not just
OVS, it's nfqueue as well. The reason is that a netlink mmapped skb is
set up with a giant tailroom in netlink_ring_setup_skb():

	skb->end	= skb->tail + size;

and skb_zerocopy() will consume whatever tailroom is available first:

	/* dont bother with small payloads */
	if (len <= skb_tailroom(to)) {
		skb_copy_bits(from, 0, skb_put(to, len), len);
		return;
	}

I was planning to fix this while adding GSO support to the upcall as
that is the moment when this bug would really surface.
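
To make the interaction concrete, here is a rough user-space model of the
two snippets above. It is only a sketch: struct toy_skb and the toy_*
helpers are hypothetical stand-ins, not kernel code, and the fallback path
is simplified (the real skb_zerocopy() also copies a header portion into
the linear area before attaching frags).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for struct sk_buff: only the fields the argument needs. */
struct toy_skb {
	unsigned char buf[4096];	/* backing store for the linear area */
	size_t tail;			/* end of linear data written so far */
	size_t end;			/* end of the usable linear area */
	size_t frag_len;		/* bytes attached as frags, not copied */
};

/* Models "skb->end = skb->tail + size": the whole ring frame is tailroom. */
static void toy_ring_setup(struct toy_skb *skb, size_t frame_size)
{
	assert(frame_size <= sizeof(skb->buf));
	skb->tail = 0;
	skb->end = skb->tail + frame_size;
	skb->frag_len = 0;
}

static size_t toy_tailroom(const struct toy_skb *skb)
{
	return skb->end - skb->tail;
}

/* Models the quoted shortcut: copy if the payload fits, else use frags. */
static void toy_zerocopy(struct toy_skb *to, const unsigned char *from,
			 size_t len)
{
	if (len <= toy_tailroom(to)) {
		memcpy(to->buf + to->tail, from, len);
		to->tail += len;
		return;
	}
	/* Simplified: the real code would reference the source pages here. */
	to->frag_len += len;
}

/* Bytes an mmapped receiver would see: linear data only, frags ignored. */
static size_t toy_mmap_visible(const struct toy_skb *skb)
{
	return skb->tail;
}
```

With an assumed 4096-byte frame, a 1500-byte payload passed to
toy_zerocopy() lands entirely in the linear area, so toy_mmap_visible()
reports all of it. A payload larger than the frame, as a GSO upcall might
produce, ends up counted in frag_len and is invisible to the mmapped
receiver in this model.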