Message-ID: <CAA5enKaUYehLZGL3abv4rsS7caoUG-pN9wF3R+qek-DGNZufbA@mail.gmail.com>
Date: Mon, 31 Jul 2023 22:08:03 +0100
From: Lorenzo Stoakes <lstoakes@...il.com>
To: Will Deacon <will@...nel.org>
Cc: Mike Galbraith <efault@....de>,
lkml <linux-kernel@...r.kernel.org>,
Mark Rutland <mark.rutland@....com>,
wangkefeng.wang@...wei.com, catalin.marinas@....com,
ardb@...nel.org
Subject: Re: arm64: perf test 26 rpi4 oops

On Mon, 31 Jul 2023 at 12:52, Will Deacon <will@...nel.org> wrote:
>
> On Mon, Jul 31, 2023 at 11:43:40AM +0100, Will Deacon wrote:
> > [+Lorenzo, Kefeng and others]
> >
> > On Sun, Jul 30, 2023 at 06:09:15PM +0200, Mike Galbraith wrote:
> > > On Fri, 2023-07-28 at 15:18 +0100, Will Deacon wrote:
> > > >
> > > > Looking at this quickly with Mark, the most likely explanation is that
> > > > a bogus kernel address is being passed as the source pointer to
> > > > copy_to_user().
> > >
> > > 'start' in read_kcore_iter() is bogus a LOT when running perf test 26,
> > > and that back to at least 5.15. Seems removal of bogon-proofing gave a
> > > toothless old bug teeth, but seemingly only to perf test 26. Rummaging
> > > around with crash vmlinux /proc/kcore seems to be bogon free anyway.
> > >
> > > Someone should perhaps take a peek at perf. Bogons aside, it also
> > > doesn't seem to care deeply about kernel response. Whether the kernel
> > > oops or I bat 945 bogons aside, it says 'OK'. That seems a tad odd.
> >
> > Aha, so I think I triggered the issue you're seeing under QEMU (log
> > below). perf (unhelpfully) doesn't have stable test numbers, so it's
> > test 21 in my case. However, it only explodes if I run it as root, since
> > /proc/kcore is 0400 on my system.
> >
> > The easiest way to trigger the problem is simply:
> >
> > # objdump -d /proc/kcore
> >
> > Looking at the history, I wonder whether this is because of a combination
> > of:
> >
> > e025ab842ec3 ("mm: remove kern_addr_valid() completely")
> >
> > which removed the kern_addr_valid() check on the basis that kcore used
> > copy_from_kernel_nofault() anyway, and:
> >
> > 2e1c0170771e ("fs/proc/kcore: avoid bounce buffer for ktext data")
> >
> > which replaced the copy_from_kernel_nofault() with _copy_to_user().
> >
> > So with both of those applied, we're missing the address check on arm64.
>
> Digging into this a little more, the fault occurs because kcore is
> treating everything from '_text' to '_end' as KCORE_TEXT and expects it
> to be mapped linearly. However, there's plenty of stuff we _don't_ map
> in that range on arm64 (e.g. .head.text, the pKVM hypervisor, the entry
> trampoline) so kcore is broken.
>
> One hack is to limit KCORE_TEXT to actually point at the kernel text
> (see below), but this is a user-visible change in behaviour for things
> like .data so I think it would be better to restore the old behaviour
> of handling the faults.
>
> Lorenzo?

FYI there is a parallel discussion at
https://lore.kernel.org/all/ZHc2fm+9daF6cgCE@krava/ :)

[Sorry, lei isn't playing ball, so I will have to reply from Gmail;
apologies if this breaks formatting.]

It'd be a real pity to have to revert that behaviour, as using a
bounce buffer is such a hack and means you have to iterate through a
page at a time... Either that, or a change such that for KCORE_TEXT
specifically we reinstate the bounce buffer and use
copy_from_kernel_nofault().

It is definitely a bug in kcore to mark ranges of memory that are not
mapped as readable. What kind of behaviour changes do you anticipate
exactly with your prospective change re: .data? The fallthroughs?

kcore as a whole needs some love and attention, I think.

An alternative is to implement some version of
copy_from_kernel_nofault() in the iterator code.

However, TL;DR: I think we probably do need a semi-revert, making just
the KCORE_TEXT path use a bounce buffer again. I definitely want to keep
the use of iterators, so I would really not want to revert anything else.
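
To make the bounce-buffer idea concrete, here is a minimal userspace
sketch, not kernel code and not an actual patch. It assumes a simulated
region with per-page "mapped" flags; struct sim_region,
sim_copy_nofault() and sim_read_ktext() are hypothetical names invented
for illustration, with sim_copy_nofault() standing in for
copy_from_kernel_nofault(). Zero-filling the output for pages that fail
is also an assumption of this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define SIM_PAGE_SIZE 4096u

/* Hypothetical stand-in for a kernel range with holes:
 * page i is "mapped" iff mapped[i] is non-zero. */
struct sim_region {
	const unsigned char *base;   /* backing bytes for the whole range */
	const unsigned char *mapped; /* one flag per page */
	size_t npages;
};

/* Stand-in for copy_from_kernel_nofault(): returns -EFAULT for an
 * unmapped page instead of faulting. The chunk must not cross a page. */
static int sim_copy_nofault(void *dst, const struct sim_region *r,
			    size_t off, size_t len)
{
	size_t page = off / SIM_PAGE_SIZE;

	assert(len > 0 && (off % SIM_PAGE_SIZE) + len <= SIM_PAGE_SIZE);
	if (page >= r->npages || !r->mapped[page])
		return -EFAULT;
	memcpy(dst, r->base + off, len);
	return 0;
}

/* Read [off, off + len) through a bounce buffer, at most one page per
 * iteration. Chunks that cannot be read come back as zeroes.
 * Returns the number of chunks that failed. */
static size_t sim_read_ktext(unsigned char *out, const struct sim_region *r,
			     size_t off, size_t len)
{
	unsigned char bounce[SIM_PAGE_SIZE];
	size_t faults = 0;

	while (len > 0) {
		size_t in_page = SIM_PAGE_SIZE - (off % SIM_PAGE_SIZE);
		size_t chunk = len < in_page ? len : in_page;

		if (sim_copy_nofault(bounce, r, off, chunk) == 0) {
			memcpy(out, bounce, chunk);
		} else {
			memset(out, 0, chunk);
			faults++;
		}
		out += chunk;
		off += chunk;
		len -= chunk;
	}
	return faults;
}
```

The page-at-a-time loop is exactly the cost complained about above: each
chunk is copied twice, but a hole in the range turns into zeroes rather
than an oops.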
>
> Will
>
> --->8
>
> diff --git a/fs/proc/kcore.c b/fs/proc/kcore.c
> index 9cb32e1a78a0..3696a209c1ec 100644
> --- a/fs/proc/kcore.c
> +++ b/fs/proc/kcore.c
> @@ -635,7 +635,7 @@ static struct kcore_list kcore_text;
> */
> static void __init proc_kcore_text_init(void)
> {
> - kclist_add(&kcore_text, _text, _end - _text, KCORE_TEXT);
> + kclist_add(&kcore_text, _stext, _etext - _stext, KCORE_TEXT);
> }
> #else
> static void __init proc_kcore_text_init(void)
>
--
Lorenzo Stoakes
https://ljs.io