Message-ID: <d4d32e17-d8e2-4447-bd33-af41e89a528f@redhat.com>
Date: Fri, 20 Dec 2024 20:01:02 +0100
From: David Hildenbrand <david@...hat.com>
To: Alistair Popple <apopple@...dia.com>, akpm@...ux-foundation.org,
	dan.j.williams@...el.com, linux-mm@...ck.org
Cc: lina@...hilina.net, zhang.lyra@...il.com, gerald.schaefer@...ux.ibm.com,
	vishal.l.verma@...el.com, dave.jiang@...el.com, logang@...tatee.com,
	bhelgaas@...gle.com, jack@...e.cz, jgg@...pe.ca, catalin.marinas@....com,
	will@...nel.org, mpe@...erman.id.au, npiggin@...il.com,
	dave.hansen@...ux.intel.com, ira.weiny@...el.com, willy@...radead.org,
	djwong@...nel.org, tytso@....edu, linmiaohe@...wei.com, peterx@...hat.com,
	linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org,
	linux-arm-kernel@...ts.infradead.org, linuxppc-dev@...ts.ozlabs.org,
	nvdimm@...ts.linux.dev, linux-cxl@...r.kernel.org,
	linux-fsdevel@...r.kernel.org, linux-ext4@...r.kernel.org,
	linux-xfs@...r.kernel.org, jhubbard@...dia.com, hch@....de,
	david@...morbit.com
Subject: Re: [PATCH v4 12/25] mm/memory: Enhance insert_page_into_pte_locked()
 to create writable mappings

On 17.12.24 06:12, Alistair Popple wrote:
> In preparation for using insert_page() for DAX, enhance
> insert_page_into_pte_locked() to handle establishing writable
> mappings. Recall that DAX returns VM_FAULT_NOPAGE after installing a
> PTE, which bypasses the typical set_pte_range() in finish_fault().
>
> Signed-off-by: Alistair Popple <apopple@...dia.com>
> Suggested-by: Dan Williams <dan.j.williams@...el.com>
>
> ---
>
> Changes since v2:
>
>  - New patch split out from "mm/memory: Add dax_insert_pfn"
> ---
>  mm/memory.c | 45 +++++++++++++++++++++++++++++++++++++--------
>  1 file changed, 37 insertions(+), 8 deletions(-)
>
> diff --git a/mm/memory.c b/mm/memory.c
> index 06bb29e..cd82952 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -2126,19 +2126,47 @@ static int validate_page_before_insert(struct vm_area_struct *vma,
>  }
>
>  static int insert_page_into_pte_locked(struct vm_area_struct *vma, pte_t *pte,
> -			unsigned long addr, struct page *page, pgprot_t prot)
> +			unsigned long addr, struct page *page,
> +			pgprot_t prot, bool mkwrite)
>  {
>  	struct folio *folio = page_folio(page);
> +	pte_t entry = ptep_get(pte);
>  	pte_t pteval;
>
> -	if (!pte_none(ptep_get(pte)))
> -		return -EBUSY;
> +	if (!pte_none(entry)) {
> +		if (!mkwrite)
> +			return -EBUSY;
> +
> +		/*
> +		 * For read faults on private mappings the PFN passed in may not
> +		 * match the PFN we have mapped if the mapped PFN is a writeable
> +		 * COW page. In the mkwrite case we are creating a writable PTE
> +		 * for a shared mapping and we expect the PFNs to match. If they
> +		 * don't match, we are likely racing with block allocation and
> +		 * mapping invalidation so just skip the update.
> +		 */

Would it make sense to instead have here

	/* See insert_pfn(). */

But ...

> +		if (pte_pfn(entry) != page_to_pfn(page)) {
> +			WARN_ON_ONCE(!is_zero_pfn(pte_pfn(entry)));
> +			return -EFAULT;
> +		}
> +		entry = maybe_mkwrite(entry, vma);
> +		entry = pte_mkyoung(entry);
> +		if (ptep_set_access_flags(vma, addr, pte, entry, 1))
> +			update_mmu_cache(vma, addr, pte);

... I am not sure we want the above at all. Someone inserted a page,
which is refcounted + mapcounted already. Now you ignore that and act
as if the second insertion "worked"?

No, that feels wrong; I suspect you will run into refcount+mapcount
issues. If there is already something there, inserting must fail IMHO.
If you want to change something to upgrade write permissions, then a
different interface should be used.

> +		return 0;
> +	}
> +
>  	/* Ok, finally just insert the thing.. */
>  	pteval = mk_pte(page, prot);
>  	if (unlikely(is_zero_folio(folio))) {
>  		pteval = pte_mkspecial(pteval);
>  	} else {
>  		folio_get(folio);
> +		entry = mk_pte(page, prot);
> +		if (mkwrite) {
> +			entry = pte_mkyoung(entry);
> +			entry = maybe_mkwrite(pte_mkdirty(entry), vma);
> +		}
>  		inc_mm_counter(vma->vm_mm, mm_counter_file(folio));
>  		folio_add_file_rmap_pte(folio, page, vma);
>  	}
> @@ -2147,7 +2175,7 @@ static int insert_page_into_pte_locked(struct vm_area_struct *vma, pte_t *pte,
>  }
>
>  static int insert_page(struct vm_area_struct *vma, unsigned long addr,
> -			struct page *page, pgprot_t prot)
> +			struct page *page, pgprot_t prot, bool mkwrite)
>  {
>  	int retval;
>  	pte_t *pte;
> @@ -2160,7 +2188,8 @@ static int insert_page(struct vm_area_struct *vma, unsigned long addr,
>  	pte = get_locked_pte(vma->vm_mm, addr, &ptl);
>  	if (!pte)
>  		goto out;
> -	retval = insert_page_into_pte_locked(vma, pte, addr, page, prot);
> +	retval = insert_page_into_pte_locked(vma, pte, addr, page, prot,
> +		mkwrite);

The alignment looks odd. Likely you can also just put it all on a single
line.

-- 
Cheers,

David / dhildenb
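
[Archive editor's note: for context on the "/* See insert_pfn(). */"
suggestion above, insert_pfn() is the existing helper in mm/memory.c behind
vmf_insert_pfn() and friends, and it is where the quoted PFN-mismatch comment
and check were originally copied from. A trimmed sketch of its mkwrite
branch, paraphrased from kernels of this era rather than quoted verbatim:

	/* Inside insert_pfn(), with the PTE lock already held: */
	entry = ptep_get(pte);
	if (!pte_none(entry)) {
		if (mkwrite) {
			/*
			 * On a PFN mismatch we are likely racing with block
			 * allocation and mapping invalidation, so just skip
			 * the update rather than fail.
			 */
			if (pte_pfn(entry) != pfn_t_to_pfn(pfn)) {
				WARN_ON_ONCE(!is_zero_pfn(pte_pfn(entry)));
				goto out_unlock;
			}
			/* Upgrade the existing entry to young + writable. */
			entry = pte_mkyoung(entry);
			entry = maybe_mkwrite(pte_mkdirty(entry), vma);
			if (ptep_set_access_flags(vma, addr, pte, entry, 1))
				update_mmu_cache(vma, addr, pte);
		}
		goto out_unlock;
	}

The crucial difference, and the root of the objection above, is that
insert_pfn() installs special PTEs with no struct page behind them, so
silently skipping or upgrading an existing entry leaks nothing. The pages
handled by insert_page(), by contrast, are refcounted and rmap-tracked, so a
colliding insertion cannot simply be treated as having succeeded.]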