lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <BN6PR15MB1553F7630C04A6334168AE9C9AF40@BN6PR15MB1553.namprd15.prod.outlook.com>
Date:   Thu, 15 Feb 2018 08:57:14 +0000
From:   Jon Maloy <jon.maloy@...csson.com>
To:     David Miller <davem@...emloft.net>
CC:     "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        "Mohan Krishna Ghanta Krishnamurthy" 
        <mohan.krishna.ghanta.krishnamurthy@...csson.com>,
        "Tung Quang Nguyen" <tung.q.nguyen@...tech.com.au>,
        Hoang Huu Le <hoang.h.le@...tech.com.au>,
        Canh Duc Luu <canh.d.luu@...tech.com.au>,
        "Ying Xue" <ying.xue@...driver.com>,
        "tipc-discussion@...ts.sourceforge.net" 
        <tipc-discussion@...ts.sourceforge.net>
Subject: RE: [net-next 1/1] tipc: avoid unnecessary copying of bundled
 messages



> -----Original Message-----
> From: netdev-owner@...r.kernel.org [mailto:netdev-
> owner@...r.kernel.org] On Behalf Of David Miller
> Sent: Wednesday, February 14, 2018 21:27
> To: Jon Maloy <jon.maloy@...csson.com>
> Cc: netdev@...r.kernel.org; Mohan Krishna Ghanta Krishnamurthy
> <mohan.krishna.ghanta.krishnamurthy@...csson.com>; Tung Quang Nguyen
> <tung.q.nguyen@...tech.com.au>; Hoang Huu Le
> <hoang.h.le@...tech.com.au>; Canh Duc Luu
> <canh.d.luu@...tech.com.au>; Ying Xue <ying.xue@...driver.com>; tipc-
> discussion@...ts.sourceforge.net
> Subject: Re: [net-next 1/1] tipc: avoid unnecessary copying of bundled
> messages
> 
> From: Jon Maloy <jon.maloy@...csson.com>
> Date: Wed, 14 Feb 2018 13:50:31 +0100
> 
> > diff --git a/net/tipc/msg.c b/net/tipc/msg.c index 4e1c6f6..a368fa8
> > 100644
> > --- a/net/tipc/msg.c
> > +++ b/net/tipc/msg.c
> > @@ -434,6 +434,9 @@ bool tipc_msg_extract(struct sk_buff *skb, struct
> sk_buff **iskb, int *pos)
> >  	skb_pull(*iskb, offset);
> >  	imsz = msg_size(buf_msg(*iskb));
> >  	skb_trim(*iskb, imsz);
> > +
> > +	/* Scale extracted buffer's truesize to avoid double accounting */
> > +	(*iskb)->truesize = SKB_TRUESIZE(imsz);
> >  	if (unlikely(!tipc_msg_validate(iskb)))
> >  		goto none;
> >  	*pos += align(imsz);
> 
> As Eric said, you have to be really careful here.
> 
> If you clone a 10K SKB 10 times, you really have to account for the full
> truesize 10 times.
> 
> That is unless you explicitly trim off frags in the new clone, then adjust the
> truesize by explicitly decreasing it by the amount of memory backing the
> frags you trimmed off completely (not partially).

The buffers we are cloning are linearized 1 MTU incoming buffers. There are no fragments. 
Each clone normally points to only a tiny fraction of the data area of the base buffer.
I don't claim that copying always is bad, but in this case it happens in the majority of cases, and as I see it completely unnecessarily.

There is actually some under accounting, however, since we now only count the data area of the base buffer (== the sum of the data area of the clones) plus the overhead of the clones.
A more accurate calculation, taking into account even the overhead of the base buffer, would look  like this:
(*iskb)->truesize =  SKB_TRUSIZE(imsz) + (skb->truesize - skb->len)  / msg_msgcnt(msg);

I.e.,  we calculate the overhead of the base buffer and divide it equally among the clones.
Now I really can't see we are missing anything.

BR
///jon

> 
> Finally, you can only do this on an SKB that has never entered a socket SKB
> queue, otherwise you corrupt memory accounting.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ