[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAHk-=wj2g7kDTKPawbhOKXFsAF+Zayygmp1f64oerQktc_LCYw@mail.gmail.com>
Date: Sat, 22 Jul 2023 11:52:22 -0700
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Heiko Carstens <hca@...ux.ibm.com>
Cc: Alexander Gordeev <agordeev@...ux.ibm.com>,
Vasily Gorbik <gor@...ux.ibm.com>, linux-s390@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [GIT PULL] s390 fixes for 6.5-rc3
On Sat, 22 Jul 2023 at 09:02, Heiko Carstens <hca@...ux.ibm.com> wrote:
>
> - Fix per vma lock fault handling: add missing !(fault & VM_FAULT_ERROR)
> check to fault handler to prevent error handling for return values that
> don't indicate an error
Hmm. The s390 code / people seems to still be a bit confused about the
VM_FAULT flags.
The commit comment says "With per-vma locks, handle_mm_fault() may
return non-fatal error flags".
That's actively misleading.
Why? Because handle_mm_fault() may - and will - return non-fatal error
flags *regardless* of the per-vma locks.
There's VM_FAULT_COMPLETED, there's VM_FAULT_MAJOR, there are all
those kinds of "informational" bits.
So honestly, when that patch then does
+ if (likely(!(fault & VM_FAULT_ERROR)))
+ fault = 0;
I feel like the code is very confused about what is going on, and is
papering over the real bug.
The *real* bug seems to be that do_protection_exception() and
do_dat_exception() do this:
fault = do_exception(regs, access);
if (unlikely(fault))
do_fault_error(regs, fault);
which is basically nonsensical. And the reason that s390 does that
seems to be that s390 (and arm, for that matter) seem, to have added a
few extra VM_FAULT_xyz bits that aren't part of VM_FAULT_ERROR, so
then in do_fault_error() you have that nonsensical "test some of the
fields as values, and other fields as bits".
Anyway, I have pulled this, since it clearly fixes a problem. But I do
think that the *deeper* problem is that s390 treats those bits as
errors in the first place, when they really aren't. Yes, the error
bits are *common*, but that field really shouldn't be seen as just
errors, and I really think that the deeper problem is that
if (unlikely(fault))
do_fault_error(regs, fault);
logic. It's simply wrong.
Of course, it looks like the reason you found this is that the s390
do_fault_error() then does a BUG() on any bits it doesn't understand.
You have that nonsensical "clear flags" in other places too. So it's
not like this work-around is new. But it's a workaround, and a sign of
confusion, I feel.
Maybe the extra s390 fault conditions should be added to the generic
list and added to the VM_FAULT_ERROR mask. I dunno.
Linus
Powered by blists - more mailing lists