Message-ID: <a0372f7f-9a85-4d3e-ba20-b5911a8189e3@lucifer.local>
Date: Mon, 18 Nov 2024 11:17:18 +0000
From: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
To: Pasha Tatashin <pasha.tatashin@...een.com>
Cc: linux-kernel@...r.kernel.org, linux-mm@...ck.org,
linux-doc@...r.kernel.org, linux-fsdevel@...r.kernel.org,
cgroups@...r.kernel.org, linux-kselftest@...r.kernel.org,
akpm@...ux-foundation.org, corbet@....net, derek.kiernan@....com,
dragan.cvetic@....com, arnd@...db.de, gregkh@...uxfoundation.org,
viro@...iv.linux.org.uk, brauner@...nel.org, jack@...e.cz,
tj@...nel.org, hannes@...xchg.org, mhocko@...nel.org,
roman.gushchin@...ux.dev, shakeel.butt@...ux.dev,
muchun.song@...ux.dev, Liam.Howlett@...cle.com, vbabka@...e.cz,
jannh@...gle.com, shuah@...nel.org, vegard.nossum@...cle.com,
vattunuru@...vell.com, schalla@...vell.com, david@...hat.com,
willy@...radead.org, osalvador@...e.de, usama.anjum@...labora.com,
andrii@...nel.org, ryan.roberts@....com, peterx@...hat.com,
oleg@...hat.com, tandersen@...flix.com, rientjes@...gle.com,
gthelen@...gle.com
Subject: Re: [RFCv1 0/6] Page Detective

On Sat, Nov 16, 2024 at 05:59:16PM +0000, Pasha Tatashin wrote:
> Page Detective is a new kernel debugging tool that provides detailed
> information about the usage and mapping of physical memory pages.
>
> It is often known that a particular page is corrupted, but it is hard to
> extract more information about such a page from a live system. Examples
> are:
>
> - Checksum failure during live migration
> - Filesystem journal failure
> - dump_page warnings on the console log
> - Unexpected segfaults
>
> Page Detective helps to extract more information from the kernel, so it
> can be used by developers to root cause the associated problem.

I like the _concept_ of providing more information like this.

But you've bizarrely gone to great lengths to expose mm internal
implementation details to drivers in order to implement this as a driver.

This is _very clearly_ an mm thing, and _very clearly_ subject to change
depending on how mm changes. It should live under mm/ and not be a loadable
driver.

I am also very very much not in favour of re-implementing yet another page
table walker, this time in driver code (!). Please no.

So NACK in its current form. This has to be implemented within mm if we are
to take it.

I'm also concerned about its scalability and impact on the system, as it
takes every single mm lock in the system at once, which seems kinda unwise
or at least problematic, and not something we want happening outside of mm,
at any rate.

>
> It operates through the Linux debugfs interface, with two files: "virt"
> and "phys".
>
> The "virt" file takes a virtual address and PID and outputs information
> about the corresponding page.
>
> The "phys" file takes a physical address and outputs information about
> that page.
>
> The output is presented via kernel log messages (can be accessed with
> dmesg), and includes information such as the page's reference count,
> mapping, flags, and memory cgroup. It also shows whether the page is
> mapped in the kernel page table, and if so, how many times.

I mean, even though I'm not a huge fan of kernel pointer hashing etc. this
is obviously leaking as much information as you might want about kernel
internal state to the point of maybe making the whole kernel pointer
hashing thing moot.

I know this requires CAP_SYS_ADMIN, but there are things that also require
that which _still_ obscure kernel pointers.

And you're outputting it all to dmesg.

So yeah, a security person (Jann?) would be better placed to comment on
this than me, but are we sure we want to do this when not in a
CONFIG_DEBUG_VM* kernel?

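
For reference, a rough userspace sketch of how the quoted "virt" and "phys"
debugfs files might be driven. The cover letter only says what each file
takes, so the request format ("<pid> <hex address>" for virt, "<hex address>"
for phys), the helper names, and the directory passed in are all assumptions
for illustration, not the interface from the patches. Per the description
above, results would show up in the kernel log (dmesg), not in the file.

```python
import os


def format_virt_request(pid: int, vaddr: int) -> str:
    # ASSUMPTION: the "virt" file accepts "<pid> <hex virtual address>".
    # The cover letter only says it takes a virtual address and a PID.
    return f"{pid} {vaddr:#x}\n"


def format_phys_request(paddr: int) -> str:
    # ASSUMPTION: the "phys" file accepts a single hex physical address.
    return f"{paddr:#x}\n"


def submit(debugfs_dir: str, name: str, request: str) -> None:
    # debugfs_dir is supplied by the caller; the real location of the
    # Page Detective directory under debugfs is not stated in this mail.
    if name not in ("virt", "phys"):
        raise ValueError(f"unknown Page Detective file: {name!r}")
    with open(os.path.join(debugfs_dir, name), "w") as f:
        f.write(request)
```

Writing to the real files would need CAP_SYS_ADMIN, as noted above; pointing
`submit()` at an ordinary scratch directory is enough to check the request
strings.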
>
> Pasha Tatashin (6):
> mm: Make get_vma_name() function public
> pagewalk: Add a page table walker for init_mm page table
> mm: Add a dump_page variant that accept log level argument
> misc/page_detective: Introduce Page Detective
> misc/page_detective: enable loadable module
> selftests/page_detective: Introduce self tests for Page Detective
>
> Documentation/misc-devices/index.rst | 1 +
> Documentation/misc-devices/page_detective.rst | 78 ++
> MAINTAINERS | 8 +
> drivers/misc/Kconfig | 11 +
> drivers/misc/Makefile | 1 +
> drivers/misc/page_detective.c | 808 ++++++++++++++++++
> fs/inode.c | 18 +-
> fs/kernfs/dir.c | 1 +
> fs/proc/task_mmu.c | 61 --
> include/linux/fs.h | 5 +-
> include/linux/mmdebug.h | 1 +
> include/linux/pagewalk.h | 2 +
> kernel/pid.c | 1 +
> mm/debug.c | 53 +-
> mm/memcontrol.c | 1 +
> mm/oom_kill.c | 1 +
> mm/pagewalk.c | 32 +
> mm/vma.c | 60 ++
> tools/testing/selftests/Makefile | 1 +
> .../selftests/page_detective/.gitignore | 1 +
> .../testing/selftests/page_detective/Makefile | 7 +
> tools/testing/selftests/page_detective/config | 4 +
> .../page_detective/page_detective_test.c | 727 ++++++++++++++++
> 23 files changed, 1787 insertions(+), 96 deletions(-)
> create mode 100644 Documentation/misc-devices/page_detective.rst
> create mode 100644 drivers/misc/page_detective.c
> create mode 100644 tools/testing/selftests/page_detective/.gitignore
> create mode 100644 tools/testing/selftests/page_detective/Makefile
> create mode 100644 tools/testing/selftests/page_detective/config
> create mode 100644 tools/testing/selftests/page_detective/page_detective_test.c
>
> --
> 2.47.0.338.g60cca15819-goog
>