Message-ID: <5075cd03-0a4a-46d3-abac-3eda27b9ddcc@suse.de>
Date: Thu, 13 Mar 2025 09:52:18 +0100
From: Hannes Reinecke <hare@...e.de>
To: Christoph Hellwig <hch@...radead.org>
Cc: Matthew Wilcox <willy@...radead.org>, Jakub Kicinski <kuba@...nel.org>,
Andrew Morton <akpm@...ux-foundation.org>, netdev@...r.kernel.org,
Vlastimil Babka <vbabka@...e.cz>, linux-mm@...ck.org
Subject: Re: [PATCH] mm: Decline to manipulate the refcount on a slab page
On 3/13/25 09:44, Christoph Hellwig wrote:
> On Thu, Mar 13, 2025 at 09:34:39AM +0100, Hannes Reinecke wrote:
>> nvmf_connect_command_prep() returns a kmalloced buffer.
>
> Yes.
>
>> That is stored in a bvec in _nvme_submit_sync_cmd() via
>> blk_mq_rq_map_kern()->bio_map_kern().
>> And from that point on we are dealing with bvecs (iterators
>> and all), and losing the information that the page referenced
>> is a slab page.
>
> Yes. But so does every other consumer of the block layer that passes
> slab memory, of which there are quite a few. Various internal scsi
> and nvme commands come to mind, as does the XFS buffer cache.
>
>> The argument is that the network layer expected a kvec iterator
>> when slab pages are referred to, not a bvec iterator.
>
> It doesn't. It just doesn't want you to use ->sendpage.
>
But we don't; we call 'sendpage_ok()' and disable the MSG_SPLICE_PAGES
flag if it fails. The actual issue is that tls_sw() is calling
iov_iter_alloc_pages(), which takes a page reference.
It probably should be calling iov_iter_extract_pages() (which does not
take a reference), but then one would need to review the entire network
stack, as the taking and releasing of page references is scattered
throughout it.
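To make the distinction concrete, here is a toy userspace model, not
kernel code. 'struct toy_page' and all toy_*() helpers are hypothetical
names made up for illustration; the check in toy_sendpage_ok() is only
modeled on what sendpage_ok() tests, and the slab page starting at
refcount 0 is an assumption about where the patch under discussion is
heading:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for struct page; NOT the kernel structure. */
struct toy_page {
    int  refcount;  /* models page_count() */
    bool is_slab;   /* models PageSlab() */
};

/* Modeled on sendpage_ok(): refuse slab pages and unreferenced pages. */
static bool toy_sendpage_ok(const struct toy_page *p)
{
    return !p->is_slab && p->refcount >= 1;
}

/* "get"-style helper: takes a reference on every page, slab or not. */
static void toy_get_pages(struct toy_page **pages, int n)
{
    for (int i = 0; i < n; i++)
        pages[i]->refcount++;
}

/* "extract"-style helper: hands the pages over, refcounts untouched. */
static void toy_extract_pages(struct toy_page **pages, int n)
{
    (void)pages;
    (void)n;
}

/* Every reference taken by toy_get_pages() needs a matching put. */
static void toy_put_page(struct toy_page *p)
{
    assert(p->refcount > 0);
    p->refcount--;
}
```

In this model the "get" variant is the only one that modifies the
refcount of a slab page, and every caller of it has to be paired with a
put somewhere further down; that pairing is what would need auditing.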
Cheers,
Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
hare@...e.de +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich