Message-Id: <44D2E875-33FF-4756-9FAB-7F2E1ED56139@ilammy.net>
Date: Wed, 15 Sep 2021 13:24:11 +0900
From: Alexei Lozovsky <me@...mmy.net>
To: Thomas Gleixner <tglx@...utronix.de>
Cc: Alexey Dobriyan <adobriyan@...il.com>,
Christoph Lameter <cl@...ux.com>,
LKML <linux-kernel@...r.kernel.org>,
linux-fsdevel@...r.kernel.org
Subject: Re: [PATCH 0/7] proc/stat: Maintain monotonicity of "intr" and
"softirq"
Thanks for vetting my ideas!
On Tue, Sep 14, 2021, at 23:11, Thomas Gleixner wrote:
> On Sun, Sep 12 2021 at 21:37, Alexei Lozovsky wrote:
>> On Sun, Sep 12, 2021, at 18:30, Alexey Dobriyan wrote:
>>> How about making everything "unsigned long" or even "u64" like NIC
>>> drivers do?
>>
>> I see some possible hurdles ahead:
>>
>> - Not all architectures have atomic operations for 64-bit values
>
> This is not about atomics.
Yeah, I mixed up the terminology. As you said, an atomic
read-modify-write for the increment is not important here; what
*is* important is the absence of tearing in loads and stores.
If there is no tearing, we don't need any barriers to observe counters
that make sense. They might be slightly outdated, but we don't care
as long as they are observed to be monotonically increasing and
we don't see the low bits wrap before the high bits are updated
because a 64-bit store got split into two 32-bit ones.
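To make the tearing problem concrete, here is a minimal userspace sketch
(not kernel code). The helper torn_observation() is hypothetical: it
models what a reader would see if a 64-bit store were split into two
32-bit stores, low half first, and the read landed between them.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: the value a reader observes when the low 32 bits
 * of new_val have already been stored but the high 32 bits still hold
 * the old value.
 */
uint64_t torn_observation(uint64_t old_val, uint64_t new_val)
{
	const uint64_t high_mask = 0xFFFFFFFF00000000ULL;

	return (old_val & high_mask) | (new_val & ~high_mask);
}

/* A counter crossing the 32-bit boundary appears to jump backwards. */
void demo_tearing(void)
{
	uint64_t before = 0x00000000FFFFFFFFULL;
	uint64_t after = before + 1;	/* 0x0000000100000000 */

	/* Low half already wrapped to 0, high half not yet bumped to 1. */
	assert(torn_observation(before, after) < before);
}
```

With a single untorn store the reader can only ever see "before" or
"after", so the observed sequence stays monotonic.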
That said, I believe this rules out updating counter types to u64
because on 32-bit platforms those will tear. However, we can use
unsigned long so that platforms with 64-bit native words get 64-bit
counters and platforms with 32-bit words stay with 32-bit counters
that wrap like they should.
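As a small illustration of "wrap like they should": unsigned long
arithmetic is modulo 2^N, so the difference between two samples stays
correct across a single wraparound on both 32-bit and 64-bit builds.
The names counter_delta() and demo_wraparound() are made up for this
sketch.

```c
#include <assert.h>
#include <limits.h>

/* Number of events between two samples of a wrapping counter. */
unsigned long counter_delta(unsigned long prev, unsigned long cur)
{
	return cur - prev;	/* well-defined modulo 2^N for unsigned types */
}

void demo_wraparound(void)
{
	unsigned long prev = ULONG_MAX - 1;
	unsigned long cur = prev + 5;	/* wraps around to 3 */

	assert(cur == 3);
	assert(counter_delta(prev, cur) == 5);
}
```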
I've checked this on Godbolt for a number of archs and it seems that
all of them will emit single loads and stores for unsigned long.
Well, except for 16-bit platforms, but those are certainly not PPC or
x86 and would not use procfs in the first place, so I think we can
ignore them for this matter.
> On 32bit systems a 32bit load (as long as the compiler does not emit
> load tearing) is always consistent even when there is a concurrent
> increment going on. It either gets the old or the new value.
Regarding tearing, I thought about wrapping counter reads in READ_ONCE()
to signal that they should be performed in one load. __this_cpu_inc()
should probably do WRITE_ONCE() for the sake of pairing, but that
should not be too important.
Is it a good idea to use READ_ONCE here?
Or just assume that the compiler will not emit any weird loads?
(READ_ONCE does not strictly check that reads will not tear. Right now
it allows unsigned long long because reasons. But I guess it will enable
some extra debugging checks.)
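For reference, a rough userspace sketch of the pairing described above.
read_once_ulong() and write_once_ulong() are simplified stand-ins
written for this example, not the kernel's READ_ONCE()/WRITE_ONCE()
macros, and irq_count[] and NR_CPUS_SKETCH are hypothetical. The
volatile access asks the compiler to perform exactly one access per
call; for a naturally aligned machine word that is in practice a single
load or store.

```c
#include <assert.h>

#define NR_CPUS_SKETCH 4	/* assumed CPU count, for the example only */

static unsigned long irq_count[NR_CPUS_SKETCH];

/* Stand-in for READ_ONCE() on an unsigned long: one volatile load. */
static unsigned long read_once_ulong(const unsigned long *p)
{
	return *(const volatile unsigned long *)p;
}

/* Stand-in for WRITE_ONCE() on an unsigned long: one volatile store. */
static void write_once_ulong(unsigned long *p, unsigned long val)
{
	*(volatile unsigned long *)p = val;
}

/* Writer side: only the owning CPU increments its own slot. */
void count_irq(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS_SKETCH);
	write_once_ulong(&irq_count[cpu], irq_count[cpu] + 1);
}

/* Reader side: sum all slots; each slot is read with a single load. */
unsigned long sum_irqs(void)
{
	unsigned long sum = 0;

	for (int cpu = 0; cpu < NR_CPUS_SKETCH; cpu++)
		sum += read_once_ulong(&irq_count[cpu]);
	return sum;
}
```

The increment itself is not an atomic read-modify-write here, which
matches the point above: each slot has a single writer, so only the
absence of tearing matters to the reader.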