Message-ID: <531F66D0.1050000@citrix.com>
Date:	Tue, 11 Mar 2014 19:41:04 +0000
From:	Zoltan Kiss <zoltan.kiss@...rix.com>
To:	Thomas Graf <tgraf@...hat.com>, Pravin Shelar <pshelar@...ira.com>
CC:	Jesse Gross <jesse@...ira.com>,
	"dev@...nvswitch.org" <dev@...nvswitch.org>,
	<xen-devel@...ts.xenproject.org>, netdev <netdev@...r.kernel.org>,
	LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] openvswitch: Orphan frags before sending to userspace
 via Netlink to avoid guest stall

On 07/03/14 17:59, Thomas Graf wrote:
> On 03/07/2014 06:28 PM, Pravin Shelar wrote:
>> Problem is mapping SKBTX_DEV_ZEROCOPY pages to userspace. skb_zerocopy
>> is not doing that.
>>
>> Unless I am missing something, the current netlink code cannot handle
>> skb-frags with zero copy. So we have to copy the skb anyway, and there
>> is no need to orphan frags here.
>> If you are planning on handling skb-frags without copying then
>> skb_orphan_frags should be done in netlink.
>
> If you look at the second part of skb_zerocopy() this is exactly what
> it is doing unless the target skb has sufficient linear space
> preallocated. At least unless mmap is enabled in which case we would
> have to copy again until we have implemented a way to pass page refs
> via the nl ring buffer.
>
> So I think Zoltan is correct in orphaning frags that come from, e.g.,
> a tun device via zerocopy_sg_from_iovec().

Now that I'm checking how Netlink works, I might be wrong in some parts :)
skb_zerocopy correctly adds the frags to the user_skb we are sending
upwards; however, when userspace receives it in netlink_recvmsg(), it
gets copied to the supplied buffer anyway. Is that correct? In that
case we don't need to worry that userspace will sit on that page
indefinitely. However, we do have to worry about userspace not calling
recv on that Netlink socket, so in the end we still need
skb_orphan_frags, just for a different reason :)
We can put skb_orphan_frags into skb_zerocopy; skb_clone also does that.
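
To illustrate the reasoning above, here is a minimal userspace sketch of what orphaning frags achieves. It is a toy model, not kernel code: toy_page, toy_skb, toy_put_page and toy_orphan_frags are hypothetical names, and the zerocopy field merely stands in for SKBTX_DEV_ZEROCOPY. The point is that once the frags are copied into pages we own, the guest gets its pages back even if userspace never calls recv on the socket.

```c
/* Toy userspace model, NOT kernel code: every name here is hypothetical. */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_PAGE_SIZE 64
#define TOY_MAX_FRAGS 4

struct toy_page {
    int refcount;            /* outstanding references to this page */
    int guest_owned;         /* 1 if the guest is waiting to get it back */
    unsigned char data[TOY_PAGE_SIZE];
};

struct toy_skb {
    int nr_frags;            /* must not exceed TOY_MAX_FRAGS */
    int zerocopy;            /* stands in for SKBTX_DEV_ZEROCOPY */
    struct toy_page *frags[TOY_MAX_FRAGS];
};

/* Drop one reference; free the page at zero unless the guest owns it. */
static void toy_put_page(struct toy_page *p)
{
    assert(p->refcount > 0);
    if (--p->refcount == 0 && !p->guest_owned)
        free(p);
}

/*
 * Replace every borrowed frag with a private copy and release the
 * guest's page. Returns 0 on success, -1 on allocation failure.
 */
static int toy_orphan_frags(struct toy_skb *skb)
{
    struct toy_page *copies[TOY_MAX_FRAGS];
    int i;

    if (!skb->zerocopy)
        return 0;            /* nothing borrowed, nothing to do */

    /* Allocate all copies first so a failure leaves the skb untouched. */
    for (i = 0; i < skb->nr_frags; i++) {
        copies[i] = malloc(sizeof(*copies[i]));
        if (!copies[i]) {
            while (i--)
                free(copies[i]);
            return -1;
        }
        copies[i]->refcount = 1;
        copies[i]->guest_owned = 0;
        memcpy(copies[i]->data, skb->frags[i]->data, TOY_PAGE_SIZE);
    }

    /* Swap in the copies; the guest's pages are released here. */
    for (i = 0; i < skb->nr_frags; i++) {
        toy_put_page(skb->frags[i]);
        skb->frags[i] = copies[i];
    }
    skb->zerocopy = 0;
    return 0;
}
```

After toy_orphan_frags() returns, the toy skb can sit on a queue indefinitely without holding any guest page.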

However, with Netlink mmapped IO we should take a different approach:
instead of calling skb_orphan_frags, we should make sure user_skb can
hold any skb we get from the kernel, and copy the frags there. Even if
we were able to pass page refs to userspace through the ring buffer
(AFAIK we currently can't), it would be fragile to pass kernel
pages directly to userspace, even if they came without the
SKBTX_DEV_ZEROCOPY flag. And I think it would be quite rare that we need
that copy anyway, because flow setup usually happens with small
packets without frags.
If we choose the above approach with Netlink mmap, we don't need 
skb_orphan_frags, in fact
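
The copy-instead-of-orphan approach for the mmap case could look roughly like the sketch below. Again this is a toy userspace model with hypothetical names (toy_frag, toy_pkt, toy_pkt_total_len, toy_pkt_flatten), not the Netlink or skb API: the destination is sized for the head plus all frags, and everything is copied into it so no page reference needs to cross to userspace.

```c
/* Toy userspace model, NOT kernel code: every name here is hypothetical. */
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct toy_frag {
    const unsigned char *data;
    size_t len;
};

struct toy_pkt {
    const unsigned char *head;      /* linear part */
    size_t head_len;
    const struct toy_frag *frags;   /* paged part */
    size_t nr_frags;
};

/* Bytes the destination buffer must hold: head plus every frag. */
static size_t toy_pkt_total_len(const struct toy_pkt *pkt)
{
    size_t total = pkt->head_len;

    for (size_t i = 0; i < pkt->nr_frags; i++)
        total += pkt->frags[i].len;
    return total;
}

/*
 * Copy head and frags into dst as one linear buffer.
 * Returns the number of bytes written, or 0 if dst is too small.
 */
static size_t toy_pkt_flatten(const struct toy_pkt *pkt,
                              unsigned char *dst, size_t dst_len)
{
    size_t off = 0;

    if (toy_pkt_total_len(pkt) > dst_len)
        return 0;

    if (pkt->head_len) {
        memcpy(dst, pkt->head, pkt->head_len);
        off = pkt->head_len;
    }
    for (size_t i = 0; i < pkt->nr_frags; i++) {
        memcpy(dst + off, pkt->frags[i].data, pkt->frags[i].len);
        off += pkt->frags[i].len;
    }
    assert(off == toy_pkt_total_len(pkt));
    return off;
}
```

A caller would use toy_pkt_total_len() to size the destination before calling toy_pkt_flatten(); for the small, frag-less packets typical of flow setup, the loop does no work at all.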

Regards,

Zoli