Message-ID: <5319EC8E.2010606@redhat.com>
Date: Fri, 07 Mar 2014 16:58:06 +0100
From: Thomas Graf <tgraf@...hat.com>
To: Pravin Shelar <pshelar@...ira.com>,
Zoltan Kiss <zoltan.kiss@...rix.com>
CC: Jesse Gross <jesse@...ira.com>,
"dev@...nvswitch.org" <dev@...nvswitch.org>,
xen-devel@...ts.xenproject.org, netdev <netdev@...r.kernel.org>,
LKML <linux-kernel@...r.kernel.org>, kvm@...r.kernel.org
Subject: Re: [PATCH] openvswitch: Orphan frags before sending to userspace
via Netlink to avoid guest stall
On 03/07/2014 05:46 AM, Pravin Shelar wrote:
> But I found a bug in the datapath user-space queue code. I am not sure how
> this can work with skb fragments and an MMAP-netlink socket.
> Here is what happens: OVS allocates a netlink skb and adds fragments to
> the skb using skb_zerocopy(), then calls genlmsg_unicast().
> But if the netlink sock is mmapped, then netlink-send queues the netlink
> allocated skb->head (linear data of the skb) and ignores the skb frags.
>
> Currently this is not a problem with OVS vswitchd since it does not use
> netlink MMAP sockets. But if vswitchd starts using an MMAP-netlink socket,
> it can break it.
The secret is out ;-)
I was very surprised too when I noticed that it worked. It's not just
OVS, it's nfqueue as well. The reason is that a netlink mmapped skb is
set up with a giant tailroom in netlink_ring_setup_skb():

	skb->end = skb->tail + size;

and skb_zerocopy() will consume whatever tailroom is available first:

	/* dont bother with small payloads */
	if (len <= skb_tailroom(to)) {
		skb_copy_bits(from, 0, skb_put(to, len), len);
		return;
	}

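To make the effect concrete, here is a rough user-space sketch of that decision. It is not kernel code: struct fake_skb and the fake_*() helpers are hypothetical stand-ins that model only the linear area and a fragment counter, and the real skb_zerocopy() does more work on the fragment path than this model shows.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, heavily simplified stand-in for struct sk_buff.
 * Only the linear area and a fragment counter are modelled. */
struct fake_skb {
	unsigned char *head;	/* start of the linear buffer */
	size_t tail;		/* offset just past the data written so far */
	size_t end;		/* offset just past the usable linear area */
	unsigned int nr_frags;	/* number of page fragments attached */
};

/* Free space left in the linear area (models skb_tailroom()). */
static size_t fake_tailroom(const struct fake_skb *skb)
{
	assert(skb->tail <= skb->end);
	return skb->end - skb->tail;
}

/* Models the effect of "skb->end = skb->tail + size": the whole
 * ring frame becomes tailroom of the skb. */
static void fake_ring_setup(struct fake_skb *skb, unsigned char *frame,
			    size_t size)
{
	skb->head = frame;
	skb->tail = 0;
	skb->end = skb->tail + size;
	skb->nr_frags = 0;
}

/* Models only the branch quoted above: a payload that fits into the
 * tailroom is copied into the linear area, anything larger ends up
 * in fragments. Returns 1 if the payload was copied, 0 otherwise. */
static int fake_zerocopy(struct fake_skb *to, const unsigned char *payload,
			 size_t len)
{
	if (len <= fake_tailroom(to)) {
		memcpy(to->head + to->tail, payload, len);
		to->tail += len;
		return 1;
	}
	to->nr_frags++;	/* stand-in for attaching page fragments */
	return 0;
}
```

With a 4096-byte frame, a 1500-byte payload takes the copy path and leaves no fragments, so nothing is lost when only the linear data is queued. A payload larger than the remaining tailroom would take the fragment path, which is the case the mmapped send path cannot handle.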
I was planning to fix this while adding GSO support to the upcall as
that is the moment when this bug would really surface.