[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <5323821C.70001@citrix.com>
Date: Fri, 14 Mar 2014 22:26:36 +0000
From: Zoltan Kiss <zoltan.kiss@...rix.com>
To: Thomas Graf <tgraf@...hat.com>, Pravin Shelar <pshelar@...ira.com>
CC: "dev@...nvswitch.org" <dev@...nvswitch.org>,
<xen-devel@...ts.xenproject.org>,
LKML <linux-kernel@...r.kernel.org>,
netdev <netdev@...r.kernel.org>
Subject: Re: [ovs-dev] [PATCH] openvswitch: Orphan frags before sending to
userspace via Netlink to avoid guest stall
On 11/03/14 19:41, Zoltan Kiss wrote:
> On 07/03/14 17:59, Thomas Graf wrote:
>> On 03/07/2014 06:28 PM, Pravin Shelar wrote:
>>> Problem is mapping SKBTX_DEV_ZEROCOPY pages to userspace. skb_zerocopy
>>> is not doing that.
>>>
>>> Unless I missing something, Current netlink code can not handle
>>> skb-frags with zero copy. So we have to copy skb anyways and no need
>>> to orphan-frags here.
>>> If you are planning on handling skb-frags without copying then
>>> skb_orphan_frags should be done in netlink.
>>
>> If you look at the second part of skb_zerocopy() this is exactly what
>> it is doing unless the target skb has sufficient linear space
>> preallocated. At least unless mmap is enabled in which case we would
>> have to copy again until we have implemented a way to pass page refs
>> via the nl ring buffer.
>>
>> So I think Zoltan is correct in orphaning frags that come from f.e.
>> a tun device via zerocopy_sg_from_iovec().
>
> Now as I'm checking how Netlink works, I might be wrong at some parts :)
> skb_zerocopy correctly add the frags to the user_skb we are sending
> upwards, however when the userspace receive it in netlink_recvmsg(), it
> gets copied to the supplied buffer anyway. Is that correct? In which
> case we don't need to worry that userspace will sit on that page
> indefinitely. However we have to worry about userspace not calling recv
> on that Netlink socket, so in the end we still need skb_orphan_frags,
> just for a different reason :)
> We can put skb_orphan_frags into skb_zerocopy, skb_clone also do that.
>
> However with Netlink mmapped IO, we should take a different approach,
> and instead of calling skb_orphan_frags we should make sure user_skb can
> hold any skb we get from the kernel, and copy the frags there. Even if
> we would be able to pass page refs to userspace through the ring buffer
> (AFAIK currently we can't), it would be fragile to just pass kernel
> pages directly to userspace, even if they came without the
> SKBTX_DEV_ZEROCOPY flag. And I think it would be quite rare that we need
> that copy anyway, because the flow setup usually happens with small
> packets without frags.
> If we choose the above approach with Netlink mmap, we don't need
> skb_orphan_frags, in fact
I spent some time to think about this mmaped scenaria, and discussed it
with others: the conclusion is that it shouldn't be a big problem to
expose local kernel pages through the frags array as I thought before.
So OVS can get along with passing refs to those pages in the shared
ring. However skb_orphan_frags would be still necessary in skb_zerocopy,
for the same reason as now.
Should I post a new patch which does calls orphan_frags in zerocopy? Or
do you have any other opinion?
Zoli
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists