Message-ID: <Ze4sSR0DJaR2Hy6v@devil>
Date: Sun, 10 Mar 2024 21:55:21 +0000
From: Lorenzo Stoakes <lstoakes@...il.com>
To: Richard Weinberger <richard@....at>
Cc: linux-mm@...ck.org, linux-fsdevel@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-doc@...r.kernel.org,
upstream+pagemap@...ma-star.at, adobriyan@...il.com,
wangkefeng.wang@...wei.com, ryan.roberts@....com, hughd@...gle.com,
peterx@...hat.com, david@...hat.com, avagin@...gle.com,
vbabka@...e.cz, akpm@...ux-foundation.org,
usama.anjum@...labora.com, corbet@....net
Subject: Re: [PATCH 1/2] [RFC] proc: pagemap: Expose whether a PTE is writable
On Thu, Mar 07, 2024 at 12:23:38AM +0100, Richard Weinberger wrote:
> Is a PTE present and writable, bit 58 will be set.
> This allows detecting CoW memory mappings and other mappings
> where a write access will cause a page fault.
I think David has highlighted it elsewhere in the thread, but this
explanation definitely needs bulking up.
Need to emphasise that we detect cases where a fault will occur (_possibly_
CoW, _possibly_ write notify clean file-backed page, _possibly_ other cases
where we need write fault tracking).
It is very important to differentiate between a _page table_ read/write flag
being set and the mapping being read-only. Being loose on this is likely to
confuse people.
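To make the distinction concrete, here is a minimal userspace sketch of how a
pagemap entry might be interpreted if this patch were applied. The bit 58
definition and the helper name present_but_pte_readonly() are assumptions for
illustration only (the macro name follows the rename suggested further down);
bits 62 and 63 are the existing pagemap layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Pagemap entries are 64 bits wide. */
static_assert(sizeof(uint64_t) == 8, "pagemap entries are 64-bit");

/* Existing pagemap flag bits. */
#define PM_SWAP         (1ULL << 62)
#define PM_PRESENT      (1ULL << 63)
/* ASSUMPTION: bit 58 as proposed by this RFC; name is hypothetical. */
#define PM_PTE_WRITABLE (1ULL << 58)

/*
 * Returns true if the page is present but its page table entry is not
 * writable, i.e. a write access would take a page fault.
 *
 * This says nothing about _why_ the write would fault (CoW, write notify
 * on a clean file-backed page, other write fault tracking), and nothing
 * about whether the mapping itself is read-only.
 */
static bool present_but_pte_readonly(uint64_t entry)
{
	if (!(entry & PM_PRESENT))
		return false; /* swapped out or never faulted in */

	return !(entry & PM_PTE_WRITABLE);
}
```

Note that a true result here is not the same as "the VMA is read-only"; that
still has to come from /proc/$pid/maps.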
>
> Signed-off-by: Richard Weinberger <richard@....at>
> ---
> fs/proc/task_mmu.c | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> index 3f78ebbb795f..7c7e0e954c02 100644
> --- a/fs/proc/task_mmu.c
> +++ b/fs/proc/task_mmu.c
> @@ -1341,6 +1341,7 @@ struct pagemapread {
> #define PM_SOFT_DIRTY BIT_ULL(55)
> #define PM_MMAP_EXCLUSIVE BIT_ULL(56)
> #define PM_UFFD_WP BIT_ULL(57)
> +#define PM_WRITE BIT_ULL(58)
As an extension of the above comment re: confusion, I really dislike
PM_WRITE. Something like PM_PTE_WRITABLE might be better?
> #define PM_FILE BIT_ULL(61)
> #define PM_SWAP BIT_ULL(62)
> #define PM_PRESENT BIT_ULL(63)
> @@ -1417,6 +1418,8 @@ static pagemap_entry_t pte_to_pagemap_entry(struct pagemapread *pm,
> flags |= PM_SOFT_DIRTY;
> if (pte_uffd_wp(pte))
> flags |= PM_UFFD_WP;
> + if (pte_write(pte))
> + flags |= PM_WRITE;
> } else if (is_swap_pte(pte)) {
> swp_entry_t entry;
> if (pte_swp_soft_dirty(pte))
> @@ -1483,6 +1486,8 @@ static int pagemap_pmd_range(pmd_t *pmdp, unsigned long addr, unsigned long end,
> flags |= PM_SOFT_DIRTY;
> if (pmd_uffd_wp(pmd))
> flags |= PM_UFFD_WP;
> + if (pmd_write(pmd))
> + flags |= PM_WRITE;
> if (pm->show_pfn)
> frame = pmd_pfn(pmd) +
> ((addr & ~PMD_MASK) >> PAGE_SHIFT);
> @@ -1586,6 +1591,9 @@ static int pagemap_hugetlb_range(pte_t *ptep, unsigned long hmask,
> if (huge_pte_uffd_wp(pte))
> flags |= PM_UFFD_WP;
>
> + if (pte_write(pte))
This should be huge_pte_write(). It amounts to the same thing, but for
consistency :)
> + flags |= PM_WRITE;
> +
> flags |= PM_PRESENT;
> if (pm->show_pfn)
> frame = pte_pfn(pte) +
> --
> 2.35.3
>
Overall I _really_ like the idea of exposing this. Not long ago I wanted to
be able to assess whether private mappings were CoW'd or not 'at a glance'
and couldn't find any means of doing this (of course I might have missed
something but I don't think there is anything).
So I think a single bit in /proc/$pid/pagemap is absolutely worthwhile to
get this information.
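For reference, a rough sketch of how userspace could fetch the entry for one
address of its own address space. read_pagemap_entry() is a hypothetical
helper name; the only interface it relies on is the existing layout of one
64-bit entry per virtual page in /proc/self/pagemap.

```c
#define _XOPEN_SOURCE 700
#define _FILE_OFFSET_BITS 64
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Pagemap entries are 64 bits wide. */
static_assert(sizeof(uint64_t) == 8, "pagemap entries are 64-bit");

/*
 * Read the pagemap entry covering 'addr' in the calling process.
 * Returns 0 on success and -1 on any failure.
 */
static int read_pagemap_entry(const void *addr, uint64_t *entry)
{
	long page_size = sysconf(_SC_PAGESIZE);
	if (page_size <= 0)
		return -1;

	int fd = open("/proc/self/pagemap", O_RDONLY);
	if (fd < 0)
		return -1;

	/* One 8-byte entry per virtual page, indexed by virtual page number. */
	off_t offset = (off_t)((uintptr_t)addr / (uintptr_t)page_size) *
		       (off_t)sizeof(*entry);
	ssize_t n = pread(fd, entry, sizeof(*entry), offset);

	close(fd);
	return n == (ssize_t)sizeof(*entry) ? 0 : -1;
}
```

With the patch applied, testing bit 58 of the returned entry would then give
the 'at a glance' answer; on a kernel without it that bit carries no such
meaning.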
I'd like to see a non-RFC version submitted :) As discussed on IRC, it's
probably best to do so after the merge window!