Message-ID: <aTJlS_oFL_uiEoTw@casper.infradead.org>
Date: Fri, 5 Dec 2025 04:53:31 +0000
From: Matthew Wilcox <willy@...radead.org>
To: Dominique Martinet <asmadeus@...ewreck.org>,
	David Howells <dhowells@...hat.com>,
	Vlastimil Babka <vbabka@...e.cz>
Cc: Chris Arges <carges@...udflare.com>,
	David Howells <dhowells@...hat.com>, ericvh@...nel.org,
	lucho@...kov.net, linux_oss@...debyte.com, v9fs@...ts.linux.dev,
	linux-kernel@...r.kernel.org, kernel-team@...udflare.com
Subject: Re: kernel BUG when mounting large block xfs backed by 9p (folio ref
 count bug)

On Tue, Nov 25, 2025 at 06:03:12PM +0900, Dominique Martinet wrote:
> I'm sorry but I'm not sure I see what I should do from this -- your
> patch looks to me like it should now work with this?
> Oh, it's not merged?... I don't see where the discussion stalled
> either...
> 
> For context, in this case virtio needs the pages to be pinned because
> the host will write directly into it, and the API we're using is
> virtqueue_add_sgs() (drivers/virtio/virtio_ring.c) which expects a
> scatterlist, which I guess must be pages (can't say I'm very familiar
> with this particular API either, but the word `folio` doesn't show up in
> drivers/virtio)

I was hoping Dave Howells would chime in, but since he hasn't ...

The root problem is that iov_iter_get_pages_alloc() takes a reference
on the page.  It thinks this will prevent the memory from being freed
under it.  That's not true with slab allocations; the slab won't get
freed back to the page allocator, but the original memory can be kfreed
and reallocated to another kmalloc.  So at best this is a useless
bumping of the refcount, and at worst it'll corrupt the data of some
unsuspecting user.
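
To make the failure mode concrete, here is a minimal userspace sketch.
It is a toy model, not kernel code: every toy_* name is hypothetical,
and the single "page" carved into fixed-size objects only stands in
for a slab.  It shows that bumping the page refcount keeps the page
around but does nothing to stop the object being freed and handed to
someone else.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy model (hypothetical names): one "page" split into fixed-size objects. */
#define TOY_OBJ_SIZE      64
#define TOY_OBJS_PER_PAGE 4

struct toy_page {
	int  refcount;                    /* stands in for the page refcount */
	bool in_use[TOY_OBJS_PER_PAGE];   /* per-object allocation state */
	char mem[TOY_OBJS_PER_PAGE][TOY_OBJ_SIZE];
};

static void toy_page_init(struct toy_page *pg)
{
	memset(pg, 0, sizeof(*pg));
	pg->refcount = 1;                 /* reference held by the toy slab */
}

/* First-fit allocation of one object from the page. */
static void *toy_kmalloc(struct toy_page *pg)
{
	for (int i = 0; i < TOY_OBJS_PER_PAGE; i++) {
		if (!pg->in_use[i]) {
			pg->in_use[i] = true;
			return pg->mem[i];
		}
	}
	return NULL;
}

/* Frees the object only; the page refcount is never consulted. */
static void toy_kfree(struct toy_page *pg, void *obj)
{
	size_t idx = (size_t)((char *)obj - &pg->mem[0][0]) / TOY_OBJ_SIZE;

	assert(idx < TOY_OBJS_PER_PAGE && pg->in_use[idx]);
	pg->in_use[idx] = false;
}

static void toy_get_page(struct toy_page *pg)
{
	pg->refcount++;
}

/*
 * Returns true if a buffer "pinned" through the page refcount was still
 * reallocated to a second user and then overwritten through the old pointer.
 */
static bool toy_pin_is_defeated(void)
{
	struct toy_page pg;
	char *buf, *other;

	toy_page_init(&pg);
	buf = toy_kmalloc(&pg);
	toy_get_page(&pg);                /* the "pin": page ref only */
	toy_kfree(&pg, buf);              /* object is free despite the ref */

	other = toy_kmalloc(&pg);         /* same slot goes to a new user */
	strcpy(other, "someone else's data");
	memset(buf, 0xff, TOY_OBJ_SIZE);  /* late write via the stale pointer */

	return other == buf && pg.refcount == 2 && other[0] != 's';
}
```

In this model toy_pin_is_defeated() returns true: the refcount is still
elevated, yet the second user's data has been overwritten.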

So we deliberately broke this usage.  You can't pass slab allocated memory
to iov_iter_get_pages_alloc() any more.  And then we decided to break
the "large kmalloc" case too.  It's an implementation detail whether a
kmalloc comes from slab or not, and we might change things in the future
such that allocations which are currently deemed too large to come from
a slab now come from a slab instead.
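
A small sketch of why the rule keys on where the memory came from
rather than on its size.  Again this is a toy model with hypothetical
names (toy_kmalloc_uses_slab, toy_may_take_page_ref, and the cutoff
values are assumptions for illustration, not kernel APIs or real
thresholds).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical origins a caller's buffer might have. */
enum toy_origin {
	TOY_PAGE_ALLOCATOR,   /* pages obtained directly from the page allocator */
	TOY_KMALLOC,          /* anything returned by kmalloc, whatever its size */
};

/*
 * Implementation detail: whether a kmalloc of this size is served from a
 * slab depends on a cutoff that the allocator is free to change.
 */
static bool toy_kmalloc_uses_slab(size_t size, size_t slab_max)
{
	assert(slab_max > 0);
	return size <= slab_max;
}

/*
 * The rule modelled on the text above: taking a page reference is refused
 * for all kmalloc memory, so the answer never depends on the cutoff.
 */
static bool toy_may_take_page_ref(enum toy_origin origin)
{
	return origin == TOY_PAGE_ALLOCATOR;
}
```

With a cutoff of 8192 a 16384-byte allocation counts as "large", and
with a cutoff of 32768 the same allocation comes from a slab, but
toy_may_take_page_ref(TOY_KMALLOC) is false either way.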

I don't really have concrete advice for you on what you should be doing
to fix this.  We should never have allowed this to work, but I'm
insufficiently familiar with the iov_iter APIs to tell you what you
should be doing instead.  Hence my hope that Dave Howells would ride
to the rescue.

But this isn't anything to do with folios, at least not directly.
It's a spinoff of the folio project.

> Since we don't know where the iov comes from, we can't have any
> expectation about it, but we can check things and try to act
> appropriately (or error out and/or somehow fallback to non-zc if there's
> a reason we can't do it).
> 
> What would one need to go from an iov_iter to something this could use?
> 
> out of curiosity I looked at other "big" virtqueue users (e.g. vhost
> scsi must be shuffling similar data around), but I don't quite see how
> the buffers are passed, I'd need to spend more time than I can afford
> immediately...
> 
> 
> Thanks (and sorry for pulling the whole arm when you give a hand),
> -- 
> Dominique
