Message-ID: <ubf5hakohi4hhmoqdxk255rwarwv3vwt7j3l5aqtznorfcxx6r@37jhsud4gjtw>
Date: Wed, 15 Jan 2025 17:38:06 +1100
From: Alistair Popple <apopple@...dia.com>
To: David Hildenbrand <david@...hat.com>
Cc: akpm@...ux-foundation.org, dan.j.williams@...el.com, 
	linux-mm@...ck.org, alison.schofield@...el.com, lina@...hilina.net, 
	zhang.lyra@...il.com, gerald.schaefer@...ux.ibm.com, vishal.l.verma@...el.com, 
	dave.jiang@...el.com, logang@...tatee.com, bhelgaas@...gle.com, jack@...e.cz, 
	jgg@...pe.ca, catalin.marinas@....com, will@...nel.org, mpe@...erman.id.au, 
	npiggin@...il.com, dave.hansen@...ux.intel.com, ira.weiny@...el.com, 
	willy@...radead.org, djwong@...nel.org, tytso@....edu, linmiaohe@...wei.com, 
	peterx@...hat.com, linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org, 
	linux-arm-kernel@...ts.infradead.org, linuxppc-dev@...ts.ozlabs.org, nvdimm@...ts.linux.dev, 
	linux-cxl@...r.kernel.org, linux-fsdevel@...r.kernel.org, linux-ext4@...r.kernel.org, 
	linux-xfs@...r.kernel.org, jhubbard@...dia.com, hch@....de, david@...morbit.com, 
	chenhuacai@...nel.org, kernel@...0n.name, loongarch@...ts.linux.dev
Subject: Re: [PATCH v6 15/26] huge_memory: Add vmf_insert_folio_pud()

On Tue, Jan 14, 2025 at 05:22:15PM +0100, David Hildenbrand wrote:
> On 10.01.25 07:00, Alistair Popple wrote:
> > Currently DAX folio/page reference counts are managed differently to
> > normal pages. To allow these to be managed the same as normal pages
> > introduce vmf_insert_folio_pud. This will map the entire PUD-sized folio
> > and take references as it would for a normally mapped page.
> > 
> > This is distinct from the current mechanism, vmf_insert_pfn_pud, which
> > simply inserts a special devmap PUD entry into the page table without
> > holding a reference to the page for the mapping.
> > 
> > Signed-off-by: Alistair Popple <apopple@...dia.com>
> 
> [...]
> 
> > +/**
> > + * vmf_insert_folio_pud - insert a pud size folio mapped by a pud entry
> > + * @vmf: Structure describing the fault
> > + * @folio: folio to insert
> > + * @write: whether it's a write fault
> > + *
> > + * Return: vm_fault_t value.
> > + */
> > +vm_fault_t vmf_insert_folio_pud(struct vm_fault *vmf, struct folio *folio, bool write)
> > +{
> > +	struct vm_area_struct *vma = vmf->vma;
> > +	unsigned long addr = vmf->address & PUD_MASK;
> > +	pud_t *pud = vmf->pud;
> > +	struct mm_struct *mm = vma->vm_mm;
> > +	spinlock_t *ptl;
> > +
> > +	if (addr < vma->vm_start || addr >= vma->vm_end)
> > +		return VM_FAULT_SIGBUS;
> > +
> > +	if (WARN_ON_ONCE(folio_order(folio) != PUD_ORDER))
> > +		return VM_FAULT_SIGBUS;
> > +
> > +	ptl = pud_lock(mm, pud);
> > +	if (pud_none(*vmf->pud)) {
> > +		folio_get(folio);
> > +		folio_add_file_rmap_pud(folio, &folio->page, vma);
> > +		add_mm_counter(mm, mm_counter_file(folio), HPAGE_PUD_NR);
> > +	}
> > +	insert_pfn_pud(vma, addr, vmf->pud, pfn_to_pfn_t(folio_pfn(folio)), write);
> 
> This looks scary at first (inserting something when not taking a reference),
> but insert_pfn_pud() seems to handle that. A comment here would have been
> nice.

Indeed, I will add one.
 
> It's weird, though, that if there is already something else, that we only
> WARN but don't actually return an error. So ...

Note that we only WARN when a mapping is already present and we're trying to
upgrade it to writeable. This simply mimics the logic that currently exists in
insert_pfn() and insert_pfn_pmd().

The comment in insert_pfn() sheds more light:

                        /*
                         * For read faults on private mappings the PFN passed
                         * in may not match the PFN we have mapped if the
                         * mapped PFN is a writeable COW page.  In the mkwrite
                         * case we are creating a writable PTE for a shared
                         * mapping and we expect the PFNs to match. If they
                         * don't match, we are likely racing with block
                         * allocation and mapping invalidation so just skip the
                         * update.
                         */

> > +	spin_unlock(ptl);
> > +
> > +	return VM_FAULT_NOPAGE;
> 
> I assume always returning VM_FAULT_NOPAGE, even when something went wrong,
> is the right thing to do?

Yes, I think so. I guess in the WARN case we could return something like
VM_FAULT_SIGBUS to kill the application, but the existing vmf_insert_*()
functions don't currently do that, so I think that would be a separate clean-up.

> Apart from that LGTM.
> 
> 
> -- 
> Cheers,
> 
> David / dhildenb
> 
