linux-kernel - Re: [PATCH] block: Annotate a racy read in blk_do_io

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <d9df8351-7cc2-4562-a8b5-440344bfeb91@paulmck-laptop>
Date: Wed, 15 May 2024 14:51:02 -0700
From: "Paul E. McKenney" <paulmck@...nel.org>
To: Marco Elver <elver@...gle.com>
Cc: Bart Van Assche <bvanassche@....org>, Breno Leitao <leitao@...ian.org>,
	Jens Axboe <axboe@...nel.dk>,
	"open list:BLOCK LAYER" <linux-block@...r.kernel.org>,
	open list <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] block: Annotate a racy read in blk_do_io_stat()

On Wed, May 15, 2024 at 07:40:08PM +0200, Marco Elver wrote:
> On Wed, 15 May 2024 at 17:57, Paul E. McKenney <paulmck@...nel.org> wrote:
> > On Wed, May 15, 2024 at 09:58:35AM +0200, Marco Elver wrote:
> > > On Wed, 15 May 2024 at 01:47, Paul E. McKenney <paulmck@...nel.org> wrote:
> > > > On Mon, May 13, 2024 at 10:13:49AM +0200, Marco Elver wrote:
> > > > > On Sat, 11 May 2024 at 02:41, Paul E. McKenney <paulmck@...nel.org> wrote:
> > > > > [...]
> > > > > > ------------------------------------------------------------------------
> > > > > >
> > > > > > commit 930cb5f711443d8044e88080ee21b0a5edda33bd
> > > > > > Author: Paul E. McKenney <paulmck@...nel.org>
> > > > > > Date:   Fri May 10 15:36:57 2024 -0700
> > > > > >
> > > > > >     kcsan: Add example to data_race() kerneldoc header
> > > > > >
> > > > > >     Although the data_race() kerneldoc header accurately states what it does,
> > > > > >     some of the implications and usage patterns are non-obvious.  Therefore,
> > > > > >     add a brief locking example and also state how to have KCSAN ignore
> > > > > >     accesses while also preventing the compiler from folding, spindling,
> > > > > >     or otherwise mutilating the access.
> > > > > >
> > > > > >     [ paulmck: Apply Bart Van Assche feedback. ]
> > > > > >
> > > > > >     Reported-by: Bart Van Assche <bvanassche@....org>
> > > > > >     Signed-off-by: Paul E. McKenney <paulmck@...nel.org>
> > > > > >     Cc: Marco Elver <elver@...gle.com>
> > > > > >     Cc: Breno Leitao <leitao@...ian.org>
> > > > > >     Cc: Jens Axboe <axboe@...nel.dk>
> > > > > >
> > > > > > diff --git a/include/linux/compiler.h b/include/linux/compiler.h
> > > > > > index c00cc6c0878a1..9249768ec7a26 100644
> > > > > > --- a/include/linux/compiler.h
> > > > > > +++ b/include/linux/compiler.h
> > > > > > @@ -194,9 +194,17 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
> > > > > >   * This data_race() macro is useful for situations in which data races
> > > > > >   * should be forgiven.  One example is diagnostic code that accesses
> > > > > >   * shared variables but is not a part of the core synchronization design.
> > > > > > + * For example, if accesses to a given variable are protected by a lock,
> > > > > > + * except for diagnostic code, then the accesses under the lock should
> > > > > > + * be plain C-language accesses and those in the diagnostic code should
> > > > > > + * use data_race().  This way, KCSAN will complain if buggy lockless
> > > > > > + * accesses to that variable are introduced, even if the buggy accesses
> > > > > > + * are protected by READ_ONCE() or WRITE_ONCE().
> > > > > >   *
> > > > > > - * This macro *does not* affect normal code generation, but is a hint
> > > > > > - * to tooling that data races here are to be ignored.
> > > > > > + * This macro *does not* affect normal code generation, but is a hint to
> > > > > > + * tooling that data races here are to be ignored.  If code generation must
> > > > > > + * be protected *and* KCSAN should ignore the access, use both data_race()
> > > > >
> > > > > "code generation must be protected" seems ambiguous, because
> > > > > protecting code generation could mean a lot of different things to
> > > > > different people.
> > > > >
> > > > > The more precise thing would be to write that "If the access must be
> > > > > atomic *and* KCSAN should ignore the access, use both ...".
> > > >
> > > > Good point, and I took your wording, thank you.
> > > >
> > > > > I've also had trouble in the past referring to "miscompilation" or
> > > > > similar, because that also entirely depends on the promised vs.
> > > > > expected semantics: if the code in question assumes for the access to
> > > > > be atomic, the compiler compiling the code in a way that the access is
> > > > > no longer atomic would be a "miscompilation". Although is it still a
> > > > > "miscompilation" if the compiler generated code according to specified
> > > > > language semantics (say according to our own LKMM) - and that's where
> > > > > opinions can diverge because it depends on which side we stand
> > > > > (compiler vs. our code).
> > > >
> > > > Agreed, use of words like "miscompilation" can annoy people.  What
> > > > word would you suggest using instead?
> > >
> > > Not sure. As suggested above, I try to just stick to "atomic" vs
> > > "non-atomic" because that's ultimately the functional end result of
> > > such a miscompilation. Although I've also had people be confused as in
> > > "what atomic?! as in atomic RMW?!", but I don't know how to remove
> > > that kind of confusion.
> > >
> > > If, however, our intended model is the LKMM and e.g. a compiler breaks
> > > a dependency-chain, then we could talk about miscompilation, because
> > > the compiler violates our desired language semantics. Of course the
> > > compiler writers then will say that we try to do things that are
> > > outside any implemented language semantics the compiler is aware of,
> > > so it's not a miscompilation again. So it all depends on which side
> > > we're arguing for. Fun times.
> >
> > ;-) ;-) ;-)
> >
> > > > > > + * and READ_ONCE(), for example, data_race(READ_ONCE(x)).
> > > > >
> > > > > Having more documentation sounds good to me, thanks for adding!
> > > > >
> > > > > This extra bit of documentation also exists in a longer form in
> > > > > access-marking.txt, correct? I wonder how it would be possible to
> > > > > refer to it, in case the reader wants to learn even more.
> > > >
> > > > Good point, especially given that I had forgotten about it.
> > > >
> > > > I don't have any immediate ideas for calling attention to this file,
> > > > but would the following update be helpful?
> > >
> > > Mentioning __data_racy along with data_race() could be helpful, thank
> > > you. See comments below.
> >
> > I did add a mention of it in "Linux-Kernel RCU Shared-Variable Marking"
> > [1], but just a mention, given that I do not expect that we will use it
> > within RCU.
> >
> > > Thanks,
> > > -- Marco
> > >
> > > >                                                         Thanx, Paul
> > > >
> > > > ------------------------------------------------------------------------
> > > >
> > > > diff --git a/tools/memory-model/Documentation/access-marking.txt b/tools/memory-model/Documentation/access-marking.txt
> > > > index 65778222183e3..690dd59b7ac59 100644
> > > > --- a/tools/memory-model/Documentation/access-marking.txt
> > > > +++ b/tools/memory-model/Documentation/access-marking.txt
> > > > @@ -24,6 +24,12 @@ The Linux kernel provides the following access-marking options:
> > > >  4.     WRITE_ONCE(), for example, "WRITE_ONCE(a, b);"
> > > >         The various forms of atomic_set() also fit in here.
> > > >
> > > > +5.     ASSERT_EXCLUSIVE_ACCESS() and ASSERT_EXCLUSIVE_WRITER().
> > >
> > > Perhaps worth mentioning, but they aren't strictly access-marking
> > > options. In the interest of simplicity could leave it out.
> >
> > Interestingly enough, they can be said to be implicitly marking other
> > concurrent accesses to the variable.  ;-)
> 
> The document starts with "guidelines for marking intentionally
> concurrent normal accesses to shared memory".  The ASSERT_EXCLUSIVE
> macros do capture more of the concurrency rules, and perhaps they
> could be seen as some kind of "negative marking" where concurrent
> access should _not_ happen concurrently with these. But I'm still not
> convinced it's the same kind of marking the document introduces.
> 
> I always considered them in the realm of general assertions that we
> can just use to tell the tool more than can be inferred from the bits
> of C code required for the functional implementation of whatever we're
> doing.
> 
> > I believe that the do need to be mentioned more prominently, though.
> >
> > Maybe a second list following this one?  In that case, what do we name
> > the list?  I suppose the other alternative would be to leave them in
> > this list, and change the preceding sentence to say something like
> > "The Linux kernel provides the following access-marking-related primitives"
> >
> > Thoughts?
> 
> And I just checked the current access-marking.txt to see where we
> might add more, and found the section "ACCESS-DOCUMENTATION OPTIONS"
> already exists. I think that section is perfectly reasonable as is,
> and it does explicitly talk about ASSERT_EXCLUSIVE* macros.
> 
> Did you want to add it more prominently at the top? If so, maybe a
> brief forward-reference to that section might be helpful.

How about like this?

------------------------------------------------------------------------

The Linux kernel provides the following access-marking options:

1.	Plain C-language accesses (unmarked), for example, "a = b;"

2.	Data-race marking, for example, "data_race(a = b);"

3.	READ_ONCE(), for example, "a = READ_ONCE(b);"
	The various forms of atomic_read() also fit in here.

4.	WRITE_ONCE(), for example, "WRITE_ONCE(a, b);"
	The various forms of atomic_set() also fit in here.

5.	__data_racy, for example "int __data_racy a;"

6.	KCSAN's negative-marking assertions, ASSERT_EXCLUSIVE_ACCESS()
	and ASSERT_EXCLUSIVE_WRITER(), are desccribed in the
	"ACCESS-DOCUMENTATION OPTIONS" section below.

------------------------------------------------------------------------

Would that work?

							Thanx, Paul