Message-ID: <531A0A5B.2000104@redhat.com>
Date: Fri, 07 Mar 2014 19:05:15 +0100
From: Thomas Graf <tgraf@...hat.com>
To: Pravin Shelar <pshelar@...ira.com>
CC: Zoltan Kiss <zoltan.kiss@...rix.com>,
Jesse Gross <jesse@...ira.com>,
"dev@...nvswitch.org" <dev@...nvswitch.org>,
xen-devel@...ts.xenproject.org, netdev <netdev@...r.kernel.org>,
LKML <linux-kernel@...r.kernel.org>, kvm@...r.kernel.org
Subject: Re: [PATCH] openvswitch: Orphan frags before sending to userspace
 via Netlink to avoid guest stall

On 03/07/2014 06:19 PM, Pravin Shelar wrote:
> On Fri, Mar 7, 2014 at 7:58 AM, Thomas Graf <tgraf@...hat.com> wrote:
>> On 03/07/2014 05:46 AM, Pravin Shelar wrote:
>>>
>>> But I found bug in datapath user-space queue code. I am not sure how
>>> this can work with skb fragments and MMAP-netlink socket.
>>> Here is what happens, OVS allocates netlink skb and adds fragments to
>>> skb using skb_zero_copy(), then calls genlmsg_unicast().
>>> But if netlink sock is mmapped then netlink-send queues netlink
>>> allocated skb->head (linear data of skb) and ignores skb frags.
>>>
>>> Currently this is not problem with OVS vswitchd since it does not use
>>> netlink MMAP sockets. But if vswitchd starts using MMAP-netlink socket,
>>> it can break it.
>>
>>
>> The secret is out ;-)
>>
>> I was very surprised too when I noticed that it worked. It's not just
>> OVS, it's nfqueue as well. The reason is that a netlink mmapped skb is
>> set up with a giant tailroom in netlink_ring_setup_skb():
>>
>> skb->end = skb->tail + size;
>>
> For OVS use-case, the size is linear part of skb. so I think for
> mmap-netlink socket it will fail.

Could you rephrase? I'm not sure I understand correctly.

The tailroom size equals the configured frame payload size of
the ring buffer. So as long as the frame size chosen is large
enough to hold whatever pieces come out of skb_gso_segment() we are
fine. That said, I agree that we should fix this properly before we
enable mmap on the OVS user space side.
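
To make the tailroom argument concrete, here is a rough userspace sketch of the arithmetic. It is not kernel code: struct toy_skb and the toy_* helpers are made-up names that only model the tail/end offsets, and the single assignment in toy_ring_setup_skb() mirrors the line quoted above from netlink_ring_setup_skb(). The only point is that a copy fits exactly when its length does not exceed the frame payload size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two sk_buff offsets discussed here.
 * This is NOT the kernel's struct sk_buff. */
struct toy_skb {
	size_t tail;	/* offset where the linear data currently ends */
	size_t end;	/* offset where the linear buffer ends */
};

/* Models the quoted line: skb->end = skb->tail + size;
 * frame_payload_size is assumed to be the ring frame payload size. */
static void toy_ring_setup_skb(struct toy_skb *skb, size_t frame_payload_size)
{
	skb->end = skb->tail + frame_payload_size;
}

/* Bytes that can still be appended to the linear area. */
static size_t toy_tailroom(const struct toy_skb *skb)
{
	assert(skb->tail <= skb->end);
	return skb->end - skb->tail;
}

/* True if len bytes (for example one segment) fit in the linear area. */
static bool toy_payload_fits(const struct toy_skb *skb, size_t len)
{
	return len <= toy_tailroom(skb);
}
```

With an assumed 16384 byte frame payload, a 1500 byte segment fits and a 64K packet does not, which is the case that would need a proper fix.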