Message-ID: <24870f73-97f9-496d-a1ca-787b54c222e4@suse.de>
Date: Tue, 4 Mar 2025 16:11:01 +0100
From: Hannes Reinecke <hare@...e.de>
To: Vlastimil Babka <vbabka@...e.cz>, Hannes Reinecke <hare@...e.com>,
Matthew Wilcox <willy@...radead.org>, Boris Pismenny <borisp@...dia.com>,
John Fastabend <john.fastabend@...il.com>, Jakub Kicinski <kuba@...nel.org>
Cc: Sagi Grimberg <sagi@...mberg.me>,
"linux-nvme@...ts.infradead.org" <linux-nvme@...ts.infradead.org>,
"linux-block@...r.kernel.org" <linux-block@...r.kernel.org>,
linux-mm@...ck.org, Harry Yoo <harry.yoo@...cle.com>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: Re: Kernel oops with 6.14 when enabling TLS
On 3/4/25 11:26, Vlastimil Babka wrote:
> On 3/4/25 11:20, Hannes Reinecke wrote:
[ .. ]
>> So I'd be happy with an 'easy' fix for now. Obviously :-)
>>
With this patch:
diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index 65f550cb5081..b035a9928cdd 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -1190,8 +1190,14 @@ static ssize_t __iov_iter_get_pages_alloc(struct iov_iter *i,
 		if (!n)
 			return -ENOMEM;
 		p = *pages;
-		for (int k = 0; k < n; k++)
-			get_page(p[k] = page + k);
+		for (int k = 0; k < n; k++) {
+			if (!get_page_unless_zero(p[k] = page + k)) {
+				pr_warn("%s: frozen page %d of %d\n",
+					__func__, k, n);
+				return -ENOMEM;
+			}
+		}
+
 		maxsize = min_t(size_t, maxsize, n * PAGE_SIZE - *start);
 		i->count -= maxsize;
 		i->iov_offset += maxsize;
the system doesn't crash anymore:
[ 51.520949] __iov_iter_get_pages_alloc: frozen page 0 of 1
[ 51.536393] nvme nvme0: creating 4 I/O queues.
[ 51.968897] nvme nvme0: mapped 4/0/0 default/read/poll queues.
[ 51.972207] __iov_iter_get_pages_alloc: frozen page 0 of 1
[ 51.974528] __iov_iter_get_pages_alloc: frozen page 0 of 1
[ 51.976928] __iov_iter_get_pages_alloc: frozen page 0 of 1
[ 51.978980] __iov_iter_get_pages_alloc: frozen page 0 of 1
[ 51.981236] nvme nvme0: new ctrl: NQN "nqn.blktests-subsystem-1", addr 10.161.9.19:4420, hostnqn: nqn.2014-08.org.nvmexpress:uuid:027a49dc-b554-40e5-b0f9-0a9ea03ec30c
and the allocation in question is coming from
drivers/nvme/host/fabrics.c:nvmf_connect_data_prep(), which
coincidentally _is_ a kmalloc()ed buffer.
But TLS doesn't work, either:
[ 58.886754] nvme nvme0: I/O tag 1 (3001) type 4 opcode 0x18 (Keep Alive) QID 0 timeout
[ 58.889112] nvme nvme0: starting error recovery
[ 58.892176] nvme nvme0: failed nvme_keep_alive_end_io error=10
[ 58.892282] nvme nvme0: reading non-mdts-limits failed: -4
[ 58.902490] nvme nvme0: Reconnecting in 10 seconds...
(probably not surprising, seeing that an error is returned ...)
So yeah, it looks like TLS has issues with kmalloc()ed data.
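For readers unfamiliar with the "take a reference unless the count is zero" pattern that the debug patch relies on, here is a minimal userspace sketch in C11. It is an illustration only, not kernel code: struct fake_page and the fake_*() helpers are hypothetical names, and the refcount is a plain atomic_int rather than the kernel's page refcount. The sketch also drops the references already taken when it meets a frozen page. The hunk above does not do that, which makes no difference in the log shown here because n is always 1.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page: only the refcount matters here. */
struct fake_page {
	atomic_int refcount;
};

/*
 * Take a reference only if the count is currently non-zero.
 * Models the semantics of get_page_unless_zero(); not the kernel code.
 */
static bool fake_get_page_unless_zero(struct fake_page *p)
{
	int old = atomic_load(&p->refcount);

	while (old != 0) {
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&p->refcount, &old, old + 1))
			return true;
	}
	return false;	/* count was zero: the page is "frozen" */
}

/* Drop one reference; dropping below zero would be a bug. */
static void fake_put_page(struct fake_page *p)
{
	int old = atomic_fetch_sub(&p->refcount, 1);

	assert(old > 0);
}

/*
 * Pin n pages. On the first frozen page, release the references taken
 * so far, so the caller sees all-or-nothing behaviour.
 * Returns 0 on success, -1 if any page was frozen.
 */
static int fake_pin_pages(struct fake_page *pages, int n)
{
	for (int k = 0; k < n; k++) {
		if (!fake_get_page_unless_zero(&pages[k])) {
			while (k--)
				fake_put_page(&pages[k]);
			return -1;
		}
	}
	return 0;
}
```

With refcounts {1, 2, 0}, fake_pin_pages() on all three pages fails and leaves the first two counts unchanged. On the first two pages alone it succeeds and bumps each count by one.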
Cheers,
Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
hare@...e.de +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich