netdev - RE: [RFC HACK] xfrm: make state refcounting percpu

Message-ID: <VE1PR04MB66701E2D8CE661F280D6A6C48B350@VE1PR04MB6670.eurprd04.prod.outlook.com>
Date:   Fri, 3 May 2019 06:34:29 +0000
From:   Vakul Garg <vakul.garg@....com>
To:     Steffen Klassert <steffen.klassert@...unet.com>
CC:     Florian Westphal <fw@...len.de>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: RE: [RFC HACK] xfrm: make state refcounting percpu



> -----Original Message-----
> From: Steffen Klassert <steffen.klassert@...unet.com>
> Sent: Friday, May 3, 2019 11:52 AM
> To: Vakul Garg <vakul.garg@....com>
> Cc: Florian Westphal <fw@...len.de>; netdev@...r.kernel.org
> Subject: Re: [RFC HACK] xfrm: make state refcounting percpu
> 
> On Fri, May 03, 2019 at 06:13:22AM +0000, Vakul Garg wrote:
> >
> >
> > > -----Original Message-----
> > > From: Steffen Klassert <steffen.klassert@...unet.com>
> > > Sent: Friday, May 3, 2019 11:38 AM
> > > To: Florian Westphal <fw@...len.de>
> > > Cc: Vakul Garg <vakul.garg@....com>; netdev@...r.kernel.org
> > > Subject: Re: [RFC HACK] xfrm: make state refcounting percpu
> > >
> > > On Wed, Apr 24, 2019 at 12:40:23PM +0200, Florian Westphal wrote:
> > > > I'm not sure this is a good idea to begin with, refcount is right
> > > > next to state spinlock which is taken for both tx and rx ops, plus
> > > > this complicates debugging quite a bit.
> > >
> > >
> > > Hm, what would be the usecase where this could help?
> > >
> > > The only thing that comes to my mind is a TX state with wide
> > > selectors. In that case you might see traffic for this state on a
> > > lot of cpus. But in that case we have a lot of other problems too,
> > > state lock, replay window etc. It might make more sense to install a
> > > full state per cpu as this would solve all the other problems too (I've
> > > talked about that idea at the IPsec workshop).
> > >
> > > In fact RFC 7296 allows to insert multiple SAs with the same traffic
> > > selector, so it is possible to install one state per cpu. We did a
> > > PoC for this at the IETF meeting the week after the IPsec workshop.
> > >
> >
> > On 16-core arm64 processor, I am getting very high cpu usage (~ 40 %) in
> > refcount atomics.
> > E.g. in function dst_release() itself, I get 19% cpu usage  in refcount api.
> > Will the PoC help here?
> 
> If your usecase is that what I described above, then yes.
> 
> I guess the high cpu usage comes from cacheline bounces because one SA is
> used from many cpus simultaneously.
> Is that the case?

I don't see the kernel code aligning refcount variables to the reservation granule size
(or to the cacheline size). So it is possible that the atomics are losing their exclusive
reservations needlessly.
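
To illustrate the alignment point, here is a minimal userspace C11 sketch, not kernel code.
The struct, its field names and the 64-byte line size are all assumptions made for
illustration (the arm64 exclusive reservation granule is implementation defined and can be
larger than 64 bytes). It only shows how a hot reference counter can be kept on a line of
its own so that writes to neighbouring fields do not disturb it.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdatomic.h>
#include <stddef.h>

/* Assumption: 64-byte lines. The real reservation granule may differ. */
#define ASSUMED_LINE_SIZE 64

/* Hypothetical object, NOT the kernel's struct xfrm_state. */
struct demo_state {
    /* Hot counter starts on its own line. */
    alignas(ASSUMED_LINE_SIZE) atomic_int refcnt;
    /* Other frequently written fields start on the next line. */
    alignas(ASSUMED_LINE_SIZE) int lock_word;
    unsigned long replay_seq;
};

/* Compile-time checks that the layout is what the comments claim. */
static_assert(alignof(struct demo_state) == ASSUMED_LINE_SIZE,
              "object must start on a line boundary");
static_assert(offsetof(struct demo_state, lock_word) -
              offsetof(struct demo_state, refcnt) >= ASSUMED_LINE_SIZE,
              "refcnt must not share a line with lock_word");

/* Take a reference. */
static inline void demo_state_hold(struct demo_state *s)
{
    atomic_fetch_add_explicit(&s->refcnt, 1, memory_order_relaxed);
}

/* Drop a reference; returns 1 when the last reference was dropped. */
static inline int demo_state_put(struct demo_state *s)
{
    return atomic_fetch_sub_explicit(&s->refcnt, 1,
                                     memory_order_acq_rel) == 1;
}
```

Padding like this can only remove false sharing with neighbouring fields. If many cpus
update the same counter, the line still bounces between them, so it would not by itself
address the case of one SA being used from many cpus.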

> 
> Also, is this a new problem or was it always like that?

It has always been like this. On 4-core and 8-core platforms as well, these atomics consume
significant cpu (usage on 8 cores is higher than on 4 cores).

On a 16-core system, we see no throughput scaling beyond 8 cores.