Date:   Fri, 11 Jun 2021 10:16:52 +0200
From:   Greg KH <gregkh@...uxfoundation.org>
To:     Haakon Bugge <haakon.bugge@...cle.com>
Cc:     Jason Gunthorpe <jgg@...dia.com>,
        Leon Romanovsky <leon@...nel.org>,
        Doug Ledford <dledford@...hat.com>,
        Kees Cook <keescook@...omium.org>,
        Nathan Chancellor <nathan@...nel.org>,
        Adit Ranadive <aditr@...are.com>,
        Ariel Elior <aelior@...vell.com>,
        Christian Benvenuti <benve@...co.com>,
        "clang-built-linux@...glegroups.com" 
        <clang-built-linux@...glegroups.com>,
        Dennis Dalessandro <dennis.dalessandro@...nelisnetworks.com>,
        Devesh Sharma <devesh.sharma@...adcom.com>,
        Gal Pressman <galpress@...zon.com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        OFED mailing list <linux-rdma@...r.kernel.org>,
        Michal Kalderon <mkalderon@...vell.com>,
        Mike Marciniszyn <mike.marciniszyn@...nelisnetworks.com>,
        Mustafa Ismail <mustafa.ismail@...el.com>,
        Naresh Kumar PBS <nareshkumar.pbs@...adcom.com>,
        Nelson Escobar <neescoba@...co.com>,
        Nick Desaulniers <ndesaulniers@...gle.com>,
        Potnuri Bharat Teja <bharat@...lsio.com>,
        Selvin Xavier <selvin.xavier@...adcom.com>,
        Shiraz Saleem <shiraz.saleem@...el.com>,
        VMware PV-Drivers <pv-drivers@...are.com>,
        Yishai Hadas <yishaih@...dia.com>,
        Zhu Yanjun <zyjzyj2000@...il.com>
Subject: Re: [PATCH rdma-next v1 10/15] RDMA/cm: Use an attribute_group on
 the ib_port_attribute intead of kobj's

On Fri, Jun 11, 2021 at 07:25:46AM +0000, Haakon Bugge wrote:
> 
> 
> > On 7 Jun 2021, at 14:50, Jason Gunthorpe <jgg@...dia.com> wrote:
> > 
> > On Mon, Jun 07, 2021 at 02:39:45PM +0200, Greg KH wrote:
> >> On Mon, Jun 07, 2021 at 09:14:11AM -0300, Jason Gunthorpe wrote:
> >>> On Mon, Jun 07, 2021 at 12:25:03PM +0200, Greg KH wrote:
> >>>> On Mon, Jun 07, 2021 at 11:17:35AM +0300, Leon Romanovsky wrote:
> >>>>> From: Jason Gunthorpe <jgg@...dia.com>
> >>>>> 
> >>>>> This code is trying to attach a list of counters grouped into 4 groups to
> >>>>> the ib_port sysfs. Instead of creating a bunch of kobjects simply express
> >>>>> everything naturally as an ib_port_attribute and add a single
> >>>>> attribute_groups list.
> >>>>> 
> >>>>> Remove all the naked kobject manipulations.
> >>>> 
> >>>> Much nicer.
> >>>> 
> >>>> But why do you need your counters to be atomic in the first place?  What
> >>>> are they counting that requires this?  
> >>> 
> >>> The write side of the counter is being updated from concurrent kernel
> >>> threads without locking, so this is an atomic because the write side
> >>> needs atomic_add().
> >> 
> >> So the atomic write forces a lock :(
> > 
> > Of course, but a single atomic is cheaper than the double atomic in a
> > full spinlock.
> > 
> >>> Making them a naked u64 will cause significant corruption on the write
> >>> side, and packet counters that are not accurate after quiescence are
> >>> not very useful things.
> >> 
> >> How "accurate" do these have to be?
> > 
> > They have to be accurate. They are networking packet counters. What is
> > the point of burning CPU cycles keeping track of inaccurate data?
> 
> Consider a CPU with a 32-bit wide datapath to memory, which reads and writes the most significant 4-byte word first:

What CPU is that?

>     Memory                   CPU1                   CPU2
> MSW         LSW        MSW         LSW        MSW         LSW
> 0x0  0xffffffff
> 0x0  0xffffffff        0x0
> 0x0  0xffffffff        0x0  0xffffffff
> 0x0  0xffffffff        0x1         0x0                         cpu1 has incremented its register
> 0x1  0xffffffff        0x1         0x0                         cpu1 has written msw
> 0x1  0xffffffff        0x1         0x0        0x1              cpu2 has read msw
> 0x1  0xffffffff        0x1         0x0        0x1  0xffffffff
> 0x1         0x0        0x1         0x0        0x2         0x0
> 0x2         0x0        0x1         0x0        0x2         0x0
> 0x2         0x0        0x1         0x0        0x2         0x0
> 
> 
> I would say that 0x200000000 vs. 0x100000001 is more than inaccurate!

True, then maybe these should just be 32bit counters :)

thanks,

greg k-h
