linux-kernel - Re: Bug: broken /proc/kcore in 6.13

Message-ID: <ecb22f03-ca7f-4212-9f02-cceafb9cfb7f@lucifer.local>
Date: Fri, 17 Jan 2025 18:13:25 +0000
From: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
To: Alexandre Ferrieux <alexandre.ferrieux@...il.com>
Cc: Steven Rostedt <rostedt@...dmis.org>, linux-trace-users@...r.kernel.org,
        LKML <linux-kernel@...r.kernel.org>, linux-mm@...ck.org,
        Mike Rapoport <rppt@...nel.org>
Subject: Re: Bug: broken /proc/kcore in 6.13

+cc Mike

OK so nothing to worry about here - the feature that causes this problem
has been completely disabled. The change may not be in Linus's tree yet but
will be there for the 6.13 release [0].

I think the vread_iter() check for 0 can wait for 6.14: once the area of
memory has been identified, a 0 return should never happen. We do still want
to pick up on it though, with a WARN_ON_ONCE() so that stuff like this gets
caught right away.

Thanks so much for the repro, though I observed the 'core /proc/kcore'
command freezing up before any 'disass' in my qemu setup, interestingly!

[0]:https://lore.kernel.org/all/20250113112934.GA8385@noisy.programming.kicks-ass.net/

On Fri, Jan 17, 2025 at 04:31:54PM +0000, Lorenzo Stoakes wrote:
> On Fri, Jan 17, 2025 at 04:28:32PM +0100, Alexandre Ferrieux wrote:
> >
> >
> > On 17/01/2025 16:19, Alexandre Ferrieux wrote:
> > > On 17/01/2025 15:44, Lorenzo Stoakes wrote:
> > >>> Alexandre Ferrieux <alexandre.ferrieux@...il.com> wrote:
> > >>>
> > >>>> Hi,
> > >>>>
> > >>>> Somewhere in the 6.13 branch (not bisected yet, sorry), it stopped being
> > >>>> possible to disassemble the running kernel from gdb through /proc/kcore.
> > >> Thanks for the report! Much appreciated.
> > >>
> > >> I may try to bisect here also unless you're close to finding the commit that
> > >> broke this?
> > >
> > > I'm currently homing in on copy_page_to_iter_nofault(), will report shortly :)
> >
> > Hmm, actually, that baby ain't cooperative:
> >
> >   [Fri Jan 17 15:23:05 2025] trace_kprobe: Could not probe notrace function
> >   copy_page_to_iter_nofault
> >
> > ... if I cannot insert kprobes to sniff around, I'm a bit stuck :}
> > So I think you'll reach the goal faster than me !
> >
> > PS: For your bisection: the last working kernel I know of is Debian's 6.12 final:
> >
> >   ii  linux-image-6.12.9-amd64         6.12.9-1                         amd64
> >     Linux 6.12 for 64-bit PCs (signed)
> >
> >
>
> Cheers much appreciated, have been able to repro and am bisecting now! Will
> update with results when done.

OK I bisected this to commit 5185e7f9f3bd ("x86/module: enable ROX caches for
module text on 64 bit").

It seems that vmalloc logic is used to handle module memory too in
vread_iter(), and somehow the execmem stuff is breaking this.

So this would explain why this worked previously, and why it was in fact ok
to assume vread_iter() should never return 0 (though I believe we should now
definitely check for this and error out if it happens).
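
To illustrate the kind of check I mean, here is a rough user-space sketch. It
is not the kernel code: read_all(), copy_some() and copy_nothing() are
hypothetical names, and the copy callback only stands in for something like
vread_iter(). The point is just that a retry loop has to treat "0 bytes
copied" as an error, otherwise it can spin forever:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for a copy routine such as vread_iter():
 * returns the number of bytes copied, or 0 if nothing could be copied.
 */
typedef size_t (*copy_fn)(char *dst, const char *src, size_t len);

/* Well-behaved source: copies at most 4 bytes per call. */
size_t copy_some(char *dst, const char *src, size_t len)
{
	size_t n = len < 4 ? len : 4;

	memcpy(dst, src, n);
	return n;
}

/* Models a source whose every access faults: never makes progress. */
size_t copy_nothing(char *dst, const char *src, size_t len)
{
	(void)dst;
	(void)src;
	(void)len;
	return 0;
}

/* Returns 0 on success, -EFAULT if the copy routine stops making progress. */
int read_all(copy_fn copy, char *dst, const char *src, size_t len)
{
	size_t done = 0;

	while (done < len) {
		size_t n = copy(dst + done, src + done, len - done);

		/* Without this check, a source that always returns 0 loops forever. */
		if (n == 0)
			return -EFAULT;

		assert(n <= len - done);
		done += n;
	}

	return 0;
}
```

With copy_some() the loop finishes after a few short copies; with
copy_nothing() it bails out with -EFAULT on the first iteration instead of
hanging.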

I have tracked it down to (forgive me Alexandre, I realise I'm duplicating
some of your analysis, I'm just doing things from the kernel side here :>)

read_kcore_iter()
-> vread_iter()
-> aligned_vread_iter() (returns 0, indicating error on copy)
  -> gets page via vmalloc_to_page()
-> copy_page_to_iter_nofault()
-> copy_to_user_iter_nofault()
-> copy_to_user_nofault()
-> __copy_to_user_inatomic()
-> raw_copy_to_user()
-> copy_user_generic() [page fault]

In discussion with Mike, he pointed me at execmem_cache_populate() marking
the region as not being direct-map valid via execmem_set_direct_map_valid().

However it seems the problem is that the above logic results in the
following calls:

copy_page_to_iter_nofault() -> kmap_local_page() -> page_to_virt()

Which _assume_ the page is mapped in the direct map afaict. It's not, so we
get a page fault, which is fixed up, resulting in the 0 return and the whole
problem.

So I think this code would have to be modified to be aware of such
non-direct map memory for this to work.
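
As a toy model of that assumption (user-space C, not kernel code): every
sketch_* name below is made up, the base address is only an illustrative
x86-64 style value, and the direct_map_valid flag is a simplification, since
in the real kernel that state lives in the page tables rather than in the
page structure. The address computation is pure arithmetic on the page frame
number and never asks whether the direct-map alias still exists, so the
failure only shows up at access time:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT	12
/* Illustrative direct-map base; an assumption, not read from a real kernel. */
#define SKETCH_DIRECT_BASE	UINT64_C(0xffff888000000000)

struct sketch_page {
	uint64_t pfn;		/* page frame number */
	bool direct_map_valid;	/* hypothetical: is the direct-map alias present? */
};

/* Pure address arithmetic: never consults direct_map_valid. */
uint64_t sketch_page_to_virt(const struct sketch_page *page)
{
	return SKETCH_DIRECT_BASE + (page->pfn << SKETCH_PAGE_SHIFT);
}

/*
 * Models a nofault copy through the direct-map address: returns the
 * number of bytes copied, or 0 if the access would have faulted.
 */
size_t sketch_copy_nofault(const struct sketch_page *page, size_t len)
{
	uint64_t addr = sketch_page_to_virt(page);

	assert(addr >= SKETCH_DIRECT_BASE);

	/* The address looks fine either way; only the access itself fails. */
	return page->direct_map_valid ? len : 0;
}
```

Two pages with the same pfn get the same address from
sketch_page_to_virt() whether or not the alias is valid, and only
sketch_copy_nofault() tells them apart, by returning 0.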

In any case it's moot as this feature is now disabled. But hopefully the
analysis helps Mike in the next spin of his ROX series!