Message-Id: <44D2E875-33FF-4756-9FAB-7F2E1ED56139@ilammy.net>
Date:   Wed, 15 Sep 2021 13:24:11 +0900
From:   Alexei Lozovsky <me@...mmy.net>
To:     Thomas Gleixner <tglx@...utronix.de>
Cc:     Alexey Dobriyan <adobriyan@...il.com>,
        Christoph Lameter <cl@...ux.com>,
        LKML <linux-kernel@...r.kernel.org>,
        linux-fsdevel@...r.kernel.org
Subject: Re: [PATCH 0/7] proc/stat: Maintain monotonicity of "intr" and
 "softirq"

Thanks for vetting my ideas!

On Tue, Sep 14, 2021, at 23:11, Thomas Gleixner wrote:
> On Sun, Sep 12 2021 at 21:37, Alexei Lozovsky wrote:
>> On Sun, Sep 12, 2021, at 18:30, Alexey Dobriyan wrote:
>>> How about making everything "unsigned long" or even "u64" like NIC
>>> drivers do?
>> 
>> I see some possible hurdles ahead:
>> 
>> - Not all architectures have atomic operations for 64-bit values
> 
> This is not about atomics.

Yeah, I mixed up the terminology. As you said, an atomic
read-modify-write for the increment is not important here, but what
*is* important is the absence of tearing when doing loads and stores.

If there is no tearing we don't need any barriers to observe counters
that make sense. They might be slightly outdated but we don't care
as long as they are observed to be monotonically increasing and
we don't see the low bits wrap before the high bits are updated
because a 64-bit store got split into two 32-bit ones.
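
To make the tearing concern concrete, here is a minimal userspace sketch.
It is only a model, not kernel code: struct split_counter and both helper
functions are hypothetical names, and the "reader" is simulated by sampling
the value between the two 32-bit stores of a single increment.

```c
#include <stdint.h>

/* Hypothetical model of a 64-bit counter on a machine that can only
 * store 32 bits at a time: the value lives in two separate halves. */
struct split_counter {
    uint32_t lo;
    uint32_t hi;
};

/* Combine the two halves the way a reader would. */
static uint64_t split_read(const struct split_counter *c)
{
    return ((uint64_t)c->hi << 32) | c->lo;
}

/* Increment the counter with two 32-bit stores (low half first) and
 * return what a reader would observe between those two stores. */
static uint64_t torn_increment_observed(struct split_counter *c)
{
    uint64_t next = split_read(c) + 1;
    uint64_t observed;

    c->lo = (uint32_t)next;          /* first 32-bit store */
    observed = split_read(c);        /* simulated concurrent reader */
    c->hi = (uint32_t)(next >> 32);  /* second 32-bit store */
    return observed;
}
```

Starting from lo = 0xffffffff and hi = 0, the simulated reader sees the low
half already wrapped while the high half is still old, so the counter
appears to jump backwards before reaching its final value.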

That said, I believe this rules out updating counter types to u64
because on 32-bit platforms those will tear. However, we can use
unsigned long so that platforms with 64-bit native words get 64-bit
counters and platforms with 32-bit words stay with 32-bit counters
that wrap like they should.
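
As a small illustration of why a wrapping unsigned long counter is still
usable, the sketch below computes the difference between two samples.
The function name counter_delta is made up for this example; the only
assumption is that the counter wrapped at most once between the samples.

```c
#include <limits.h>

/* Difference between two samples of a wrapping counter.
 * Unsigned subtraction is defined modulo ULONG_MAX + 1, so the result
 * is correct even if the counter wrapped once between the samples. */
static unsigned long counter_delta(unsigned long older, unsigned long newer)
{
    return newer - older;
}
```

This holds regardless of whether unsigned long is 32 or 64 bits wide on the
platform.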

I've checked this on Godbolt for a number of archs and it seems that
all of them will emit single loads and stores for unsigned long.
Well, except for 16-bit platforms, but those would certainly not use
PPC or x86 and procfs in the first place, so I think we can ignore
them for this matter.

> On 32bit systems a 32bit load (as long as the compiler does not emit
> load tearing) is always consistent even when there is a concurrent
> increment going on. It either gets the old or the new value.

Regarding tearing, I thought about wrapping counter reads in READ_ONCE()
to signal that they should be performed in one load. __this_cpu_inc()
should probably do WRITE_ONCE() for the sake of pairing, but that
should not be too important.

Is it a good idea to use READ_ONCE() here?
Or should we just assume that the compiler will not emit any weird loads?

(READ_ONCE does not strictly check that reads will not tear. Right now
it allows unsigned long long because reasons. But I guess it will enable
some extra debugging checks.)
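
For reference, the READ_ONCE()/WRITE_ONCE() pairing discussed above can be
approximated in userspace roughly as follows. This is a sketch under
assumptions: the SKETCH_* macros and the sketch_* names are invented for
illustration, they rely on the GCC/Clang __typeof__ extension, and the
kernel's real macros do additional checking. A plain global stands in for
the per-CPU counter that __this_cpu_inc() would update.

```c
/* Userspace approximations: a volatile access asks the compiler to
 * perform exactly one load or store instead of splitting, merging,
 * or repeating it. */
#define SKETCH_READ_ONCE(x)     (*(const volatile __typeof__(x) *)&(x))
#define SKETCH_WRITE_ONCE(x, v) (*(volatile __typeof__(x) *)&(x) = (v))

/* Hypothetical stand-in for a per-CPU interrupt counter. */
static unsigned long sketch_irq_count;

/* Writer side: only the owning CPU increments, so no atomic
 * read-modify-write is needed, just a store that does not tear. */
static void sketch_count_irq(void)
{
    SKETCH_WRITE_ONCE(sketch_irq_count,
                      SKETCH_READ_ONCE(sketch_irq_count) + 1);
}

/* Reader side: may see a slightly stale value, but never half of one. */
static unsigned long sketch_read_irqs(void)
{
    return SKETCH_READ_ONCE(sketch_irq_count);
}
```

Note that volatile by itself does not guarantee a single machine
instruction for types wider than the native word, which is the same caveat
as the unsigned long long case mentioned above.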
