lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <fed73be7-6f34-48b9-a9c9-2fe5ad46f5ba@lucifer.local>
Date: Mon, 19 May 2025 20:14:17 +0100
From: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
To: David Hildenbrand <david@...hat.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
        Alexander Viro <viro@...iv.linux.org.uk>,
        Christian Brauner <brauner@...nel.org>, Jan Kara <jack@...e.cz>,
        "Liam R . Howlett" <Liam.Howlett@...cle.com>,
        Vlastimil Babka <vbabka@...e.cz>, Jann Horn <jannh@...gle.com>,
        Pedro Falcato <pfalcato@...e.de>, Xu Xin <xu.xin16@....com.cn>,
        Chengming Zhou <chengming.zhou@...ux.dev>, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org
Subject: Re: [PATCH 3/4] mm: prevent KSM from completely breaking VMA merging

On Mon, May 19, 2025 at 08:59:32PM +0200, David Hildenbrand wrote:
> > >
> > > I am not 100% sure why we bail out on special mappings: all we have to do is
> > > reliably identify anon pages, and we should be able to do that.
> >
> > But they map e.g. kernel memory (at least for VM_PFNMAP, purely and by
> > implication really VM_IO), it wouldn't make sense for KSM to be asked to
> > try to merge these right?
> >
> > And of course no underlying struct page to pin, no reference counting
> > either, so I think you'd end up in trouble potentially here wouldn't you?
> > And how would the CoW work?
>
> KSM only operates on anonymous pages. It cannot de-duplicate anything else.
> (therefore, only MAP_PRIVATE applies)

Yes I had this realisation see my reply to your reply :)

I mean is MAP_PRIVATE of a VM_PFNMAP really that common?...

>
> Anything else (no struct page, not a CoW anon folio in such a mapping) is
> skipped.
>
> Take a look at scan_get_next_rmap_item() where we do
>
> folio = folio_walk_start(&fw, vma, ksm_scan.address, 0);
> if (folio) {
> 	if (!folio_is_zone_device(folio) &&
> 	    folio_test_anon(folio)) {
> 		folio_get(folio);
> 		tmp_page = fw.page;
> 	}
> 	folio_walk_end(&fw, vma)
> }
>
>
> Before I changed that code, we were using GUP. And GUP just always refuses
> VM_IO|VM_PFNMAP because it cannot handle it properly.

OK so it boils down to doing KSM _on the already CoW'd_ MAP_PRIVATE mapping?

But hang on, how do we discover this? vm_normal_page() will screw this up right?
As VM_SPECIAL will be set...

OK now I'm not sure I understand how MAP_PRIVATE-mapped VM_PFNMAP mappings work
:)))

>
> > >
> > > So, assuming we could remove the VM_PFNMAP | VM_IO | VM_DONTEXPAND |
> > > VM_MIXEDMAP constraint from vma_ksm_compatible(), could we simplify?
> >
> > Well I question removing this constraint for above reasons.
> >
> > At any rate, even if we _could_ this feels like a bigger change that we
> > should come later.
>
> "bigger" -- it might just be removing these 4 flags from the check ;)
>
> I'll dig a bit more.

Right, but doing so would be out of scope here don't you think?

I'd rather not delay fixing this bug on this basis ideally, esp. as easy to
adjust later.

I suggest we put this in as-is (or close to it anyway) and then if we remove the
flags we can change this...

As I said in other reply, .mmap() means the driver can do literally anything
they want (which is _hateful_), so we'd really want to have some confidence they
didn't do something so crazy, unless we were happy to just let that possibly
explode.

>
> --
> Cheers,
>
> David / dhildenb
>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ