[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAHk-=wjdVsunbZNq39nv4dbcXiro3fzpt-v2TGV_nR0DUsUCLw@mail.gmail.com>
Date: Fri, 26 Aug 2022 10:19:53 -0700
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Mikulas Patocka <mpatocka@...hat.com>
Cc: Alan Stern <stern@...land.harvard.edu>,
Andrea Parri <parri.andrea@...il.com>,
Will Deacon <will@...nel.org>,
Peter Zijlstra <peterz@...radead.org>,
Boqun Feng <boqun.feng@...il.com>,
Nicholas Piggin <npiggin@...il.com>,
David Howells <dhowells@...hat.com>,
Jade Alglave <j.alglave@....ac.uk>,
Luc Maranget <luc.maranget@...ia.fr>,
"Paul E. McKenney" <paulmck@...nel.org>,
Akira Yokosawa <akiyks@...il.com>,
Daniel Lustig <dlustig@...dia.com>,
Joel Fernandes <joel@...lfernandes.org>,
linux-kernel@...r.kernel.org, linux-arch@...r.kernel.org
Subject: Re: [PATCH v3] wait_on_bit: add an acquire memory barrier
On Fri, Aug 26, 2022 at 9:45 AM Linus Torvalds
<torvalds@...ux-foundation.org> wrote:
>
> So narrowing the load is fine (but you generally never want to narrow
> a *store*, because that results in huge problems with subsequent wider
> loads).
.. so making that statement made me go look around, and it's exactly
what we use for clear_bit() on x86.
Which is actually ok too, because it's a locked operation, and at that
point the whole "store buffer forwarding" issue pretty much goes out
the window anyway.
But because I looked at where the new test_bit_acquire() triggers and
where there are other bitops around it, I found this beauty in
fs/buffer.c: clean_bdev_aliases():
if (!buffer_mapped(bh) ||
(bh->b_blocknr < block))
goto next;
if (bh->b_blocknr >= block + len)
break;
clear_buffer_dirty(bh);
wait_on_buffer(bh);
clear_buffer_req(bh);
where it basically works on four different bits (buffer_mapped, dirty,
lock, req) right next to each other.
That code sequence really doesn't matter, but it was interesting
seeing the generated code. Not pretty, but the ugliest part was
actually how the might_sleep() calls in those helper functions result
in __cond_resched() when PREEMPT_VOLUNTARY is on, and how horrid that
is for register allocation.
And in bh_submit_read() we have another ugly thing:
mov (%rdi),%rax
test $0x4,%al
je <bh_submit_read+0x7d>
push %rbx
mov %rdi,%rbx
testb $0x1,(%rdi)
where we have a mix of those two kinds of "test_bit()" (and test
"testb" version most definitely looks better. But that's due to the
source code being
BUG_ON(!buffer_locked(bh));
if (buffer_uptodate(bh)) {
ie we have that *ancient* BUG_ON() messing things up. Oh well.
None of the buffer-head code matters any more, bit it's certainly
showing the effects of that patch of yours, and the code isn't pretty.
But none of the ugliness I found was actually _due_ to your patch. It
was all due to other things, and your patch just makes code generation
better in the cases I looked at.
Linus
Powered by blists - more mailing lists