[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-Id: <20240828023135.944867-1-sunyiqixm@gmail.com>
Date: Wed, 28 Aug 2024 10:31:35 +0800
From: sunyiqi <sunyiqixm@...il.com>
To: gregkh@...uxfoundation.org
Cc: linux-kernel@...r.kernel.org,
rafael@...nel.org,
sunyiqixm@...il.com
Subject: Re: Re: [PATCH] cpu: add CAP_SYSLOG check for reading crash_notes address
On Tue, 27 Aug 2024 11:15:25 +0200, Greg KH wrote:
> On Tue, Aug 27, 2024 at 04:11:33PM +0800, sunyiqi wrote:
> > CPU crash_notes value can be obtained through /sys/devices/system/cpu/\
> > cpu[NUM]/crash_notes for leaking the phys address(introduced in kernel
> > 5.10-rc1, but some distributions may migrate this feature to 4.x
> > kernel).
> > The relevant function is crash_notes_show() in file drivers/base/cpu.c.
> >
> > Though crash_notes file permission is 400 and owner is root:root,
> > but in container, the root user can also read crash_notes which leads to
> > information leak as most of kernel pointer value can not by read for
> > root user in container without CAP_SYSLOG capability.
>
> "most", but not all?
Kernel usually use printk for output kernel pointer through /proc and
/sys, when investigating the /proc and /sys directory, the kernel
pointer I found is all printed using printk guarded by CAP_SYSLOG
except crash_notes.
And kernel also has a script to check kernel pointer value leak in
/proc and /sys (will be mentioned later again).
> > In current linux kernel implementation, kernel pointer value or address
> > printked by %pK is not directly exposed to root user in container. For
> > kernel interface which includes those values, like /sys/kallsyms,
> > /proc/net/packet, etc., address values are guarded by kernel function
> > restricted_pointer(). Without CAP_SYSLOG capability, value 0 or NULL
> > will be returned for reading those interfaces in container using root
> > user.
>
> I understand the request here, but why is giving the "real" kernel
> pointer value somehow bad here? What can userspace in a container do
> with it?
>
> And why not give root permissions access to it container or not?
Linux Kernel exploits often need kernel address to achieve LPE or
container escape just like lots of exploits in repo of kCTF held by
Google.
When kernel has a memory-based vulnerability, kernel address
is needed for exploit. Attackers in container can leverage this
crash_notes to get kernel physcial address to use. Similar info
leak issue is still popular in exploit for distributions which
has not fixed those issue:
- CVE-2022-4543 kernel: KASLR Prefetch Bypass Breaks KPTI
- CVE-2023-0597 kernel: x86/mm: Randomize per-cpu entry area
- CVE-2024-26816 kernel: x86, relocs: Ignore relocations in .notes section
> > In restricted_pointer() and container, address values only returned by
> > kernel when root user has CAP_SYSLOG capability which is not the default
> > capabilities for Docker container. CAP_SYSLOG prevents root user in
> > container to get kernel pointer from lots of interfaces based on printk,
> > but not for cpu crash_notes.
> >
> > Add CAP_SYSLOG permission check in crash_notes_show() for viewing kernel
> > address.
>
> Is this really the only place where this type of check needs to be
> added?
Early this year, when fixing CVE-2024-26816(kernel ELF note leak kernel
.text address). LWN article say that kernel has a leaking_addresses.pl
script to scan pointer value that expose to userspace in /sys, /proc,
but this tool can only identify the pointer value in human-readable string
type. In CVE-2024-26816 case, it outputs value in binary form, so the
script missed it. Cook <keescook-AT-chromium.org> fixed the script in
18 Feb 2024.
After that, pointer leaking in human-readable and non-hunman-readable
through /proc and /sys seems to be solved.
Why this leak is missed by leaking_addresses.pl script?
The output of crash_note actually is offset to PAGE_OFFSET which is
0xffff880000000000 in x86_64, it's difficult for scanner to distinguish
whether it is an address or normal data.
> > Fixes: aa838896d87a ("drivers core: Use sysfs_emit and sysfs_emit_at for show(device *...) functions")
> > Signed-off-by: sunyiqi <sunyiqixm@...il.com>
>
> No cc: stable?
I used get_maintainer.pl script in kernel repo which is a way for finding
maintainers for specific file maintainers in kernel submitting patches
docs, only linux-kernel mailing list is listed in script result.
Also in submitting patches docs, "a severe bug in a released kernel"
should be directed toward the stable maintainers. This crash_notes issue,
needs high privilege and happens in container scenario, also cannot
be used for LPE or DoS directly. So I think it's not "a severe bug".
I'm very willing to abide by the rules of the kernel community.
This time get_maintainer.pl confused me a little if the result of
get_maintainer.pl script is believeable.
> > ---
> > drivers/base/cpu.c | 3 +++
> > 1 file changed, 3 insertions(+)
> >
> > diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
> > index fdaa24bb641a..a2f27bb0ffe6 100644
> > --- a/drivers/base/cpu.c
> > +++ b/drivers/base/cpu.c
> > @@ -156,6 +156,9 @@ static ssize_t crash_notes_show(struct device *dev,
> > unsigned long long addr;
> > int cpunum;
> >
> > + if (!has_capability_noaudit(current, CAP_SYSLOG))
> > + return sysfs_emit(buf, "%llx\n", 0ull);
>
> Why not return an error? Why is 0 ok?
>
> thanks,
>
> greg k-h
This patch refers to restricted_pointer CAP_SYSLOG solution, output 0
for unauthorized user. Also, I think in this show function, return 0
value maintains consistency of the result for reading crash_notes.
Thanks,
Yiqi Sun
Powered by blists - more mailing lists