lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Date:	Mon, 7 May 2012 16:53:46 +0300
From:	"Michael S. Tsirkin" <mst@...hat.com>
To:	Ian Campbell <Ian.Campbell@...rix.com>
Cc:	David Miller <davem@...emloft.net>,
	"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
	"eric.dumazet@...il.com" <eric.dumazet@...il.com>
Subject: [PATCH RFC 0/6] copy aside frags with destructors (was [PATCH 7/9]
 net: add skb_orphan_frags to copy aside frags with destructors)

On Fri, May 04, 2012 at 11:51:31AM +0100, Ian Campbell wrote:
> On Fri, 2012-05-04 at 11:03 +0100, Michael S. Tsirkin wrote:
> > On Fri, May 04, 2012 at 02:54:33AM -0400, David Miller wrote:
> > > From: "Michael S. Tsirkin" <mst@...hat.com>
> > > Date: Fri, 4 May 2012 00:10:24 +0300
> > > 
> > > > Hmm we orphan skbs when we loop them back so how about reusing the
> > > > skb->destructor for this?
> > > 
> > > That's one possibility.

So originally I thought it would work: destructor would wake the
original owner which would do data copy and modify the fragments.  But
then I unfortunately realized that would be racy: the new owner could be
using the old frags and there appears no way for us to
make sure it doesn't so that we can put the original page.
And the same logic applies to modifying the frags at any
other time if the skb is cloned. So it seems we must copy if we
want to clone the skb.

Further, destructor itself can't do the copy because this needs
to allocate memory in atomic context, and destructor
itself can't fail.
For this second problem I see two solutions: either pre-allocate the
copy buffer just in case and track it together with the destructor, or
use an skb flag to make the check for destructors quick (if not
completely free).

Second option is what macvtap zero copy uses and it already
does copy on clone too. So I hacked that to make it support tcp/udp
used by sunrpc.

> > > 
> > > But I fear we're about to toss Ian into yet another rabbit hole. :-)
> > > 
> > > Let's try to converge on something quickly as I think integration of
> > > his work has been delayed enough as-is.

...

> > It's weekend here, I'll work on a patch like this
> > Sunday.
> 
> Thanks, I was starting to feel my nose twitching and my ears beginning
> to elongate ;-)
> 
> Ian.

Here's a first stub at a fix. Basically to be able to modify frags on
the fly we must make sure the skb isn't cloned, so the moment someone
clones the skb we need to trigger the frag copy logic.  And this is
exactly what happens with SKBTX_DEV_ZEROCOPY so it seems to make sense
to reuse that logic.

The below patchset replaces patch 7
([PATCH 7/9] net: add skb_orphan_frags to copy aside frags with destructors)
in Ian's patchset and needs to be applied there.


Compiled only but I'd like to hear what people think all the same
because it does add a couple of branches on fast path.  On the other
hand this makes it generic so the same logic will be reusable for packet
sockets (which IIRC are currently buggy in the same way as sunrpc) and
for adding zero copy support to tun.

Please comment,
Thanks!

-- 
MST


Michael S. Tsirkin (6):
  skbuff: support per-page destructors in copy_ubufs
  skbuff: add an api to orphan frags
  skbuff: convert to skb_orphan_frags
  tun: orphan frags on xmit
  net: orphan frags on receive
  skbuff: set zerocopy flag on frag destructor

 drivers/net/tun.c      |    2 ++
 include/linux/skbuff.h |   41 +++++++++++++++++++++++++++++++++++++++++
 net/core/dev.c         |    2 ++
 net/core/skbuff.c      |   43 +++++++++++++++++++++++++------------------
 4 files changed, 70 insertions(+), 18 deletions(-)

-- 
MST
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ