Date: Mon, 3 Jul 2023 14:35:52 +0200
From: Hannes Reinecke <hare@...e.de>
To: David Howells <dhowells@...hat.com>
Cc: Sagi Grimberg <sagi@...mberg.me>, Keith Busch <kbusch@...nel.org>,
 Christoph Hellwig <hch@....de>, linux-nvme@...ts.infradead.org,
 Jakub Kicinski <kuba@...nel.org>, Eric Dumazet <edumazet@...gle.com>,
 Paolo Abeni <pabeni@...hat.com>, netdev@...r.kernel.org
Subject: Re: [PATCHv6 0/5] net/tls: fixes for NVMe-over-TLS

On 7/3/23 14:26, David Howells wrote:
> Hannes Reinecke <hare@...e.de> wrote:
> 
>>> 'discover' and 'connect' works, but when I'm trying to transfer data
>>> (eg by doing a 'mkfs.xfs') the whole thing crashes horribly in
>>> sock_sendmsg() as it's trying to access invalid pages :-(
> 
> Can you be more specific about the crash?
> 
[ 1804.190109] nvme nvme0: new ctrl: NQN "nqn.test", addr 127.0.0.1:4420
[ 1814.346751] BUG: unable to handle page fault for address: ffffffffc4e166cc
[ 1814.347980] #PF: supervisor read access in kernel mode
[ 1814.348864] #PF: error_code(0x0000) - not-present page
[ 1814.349727] PGD 2b239067 P4D 2b239067 PUD 2b23b067 PMD 0
[ 1814.350630] Oops: 0000 [#1] PREEMPT SMP NOPTI
[ 1814.351369] CPU: 0 PID: 2309 Comm: mkfs.xfs Kdump: loaded Tainted: G            E      6.4.0-54-default+ #296
[ 1814.353021] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
[ 1814.354298] RIP: 0010:skb_splice_from_iter+0xda/0x310
[ 1814.355171] Code: 0f 8e 2f 02 00 00 4c 8d 6c 24 30 48 8b 54 24 28 4c 89 f8 4d 89 f7 4d 89 ee 49 89 c5 49 8b 1e bd 00 10 00 00 48 29 d5 4c 39 fd <48> 8b 43 08 49 0f 47 ef a8 01 0f 85 c9 01 00 00 66 90 48 89 d8 48
[ 1814.358314] RSP: 0018:ffffa90f00db7778 EFLAGS: 00010283
[ 1814.359202] RAX: ffff8941ad156600 RBX: ffffffffc4e166c4 RCX: 0000000000000000
[ 1814.360426] RDX: 0000000000000fff RSI: 0000000000000000 RDI: 0000000000000000
[ 1814.361613] RBP: 0000000000000001 R08: ffffa90f00db7958 R09: 00000000ac6df68e
[ 1814.362797] R10: 0000000000020000 R11: ffffffffac9873c0 R12: 0000000000000000
[ 1814.363998] R13: ffff8941ad156600 R14: ffffa90f00db77a8 R15: 0000000000001000
[ 1814.365175] FS:  00007f5f0ea63780(0000) GS:ffff8941fb400000(0000) knlGS:0000000000000000
[ 1814.366493] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1814.367456] CR2: ffffffffc4e166cc CR3: 0000000005e68000 CR4: 0000000000350ef0
[ 1814.368671] Call Trace:
[ 1814.369092]  <TASK>
[ 1814.369463]  ? __die_body+0x1a/0x60
[ 1814.370060]  ? page_fault_oops+0x151/0x470
[ 1814.370746]  ? search_bpf_extables+0x65/0x70
[ 1814.371471]  ? fixup_exception+0x22/0x310
[ 1814.372162]  ? exc_page_fault+0xb1/0x150
[ 1814.372822]  ? asm_exc_page_fault+0x22/0x30
[ 1814.373521]  ? kmem_cache_alloc_node+0x1c0/0x320
[ 1814.374294]  ? skb_splice_from_iter+0xda/0x310
[ 1814.375036]  ? skb_splice_from_iter+0xa9/0x310
[ 1814.375777]  tcp_sendmsg_locked+0x763/0xce0
[ 1814.376488]  ? tcp_sendmsg_locked+0x936/0xce0
[ 1814.377225]  tcp_sendmsg+0x27/0x40
[ 1814.377797]  sock_sendmsg+0x8b/0xa0
[ 1814.378386]  nvme_tcp_try_send_data+0x131/0x400 [nvme_tcp]
[ 1814.379297]  ? nvme_tcp_try_send_cmd_pdu+0x164/0x290 [nvme_tcp]
[ 1814.380302]  ? __kernel_text_address+0xe/0x40
[ 1814.381037]  ? __pfx_stack_trace_consume_entry+0x10/0x10
[ 1814.381909]  ? arch_stack_walk+0xa1/0xf0
[ 1814.382565]  nvme_tcp_try_send+0x197/0x2c0 [nvme_tcp]
[ 1814.383407]  ? __rq_qos_issue+0x24/0x40
[ 1814.384069]  nvme_tcp_queue_rq+0x38b/0x3c0 [nvme_tcp]
[ 1814.384907]  __blk_mq_issue_directly+0x3e/0xe0
[ 1814.385648]  blk_mq_try_issue_directly+0x6c/0xa0
[ 1814.386411]  blk_mq_submit_bio+0x52b/0x650
[ 1814.387095]  __submit_bio+0xd8/0x150
[ 1814.387697]  submit_bio_noacct_nocheck+0x151/0x360
[ 1814.388503]  ? submit_bio_wait+0x54/0xc0
[ 1814.389157]  submit_bio_wait+0x54/0xc0
[ 1814.389780]  __blkdev_direct_IO_simple+0x124/0x280

(Note the faulting address is RBX + 8, i.e. skb_splice_from_iter() is chasing a bogus pointer.)
My suspicion is here:

         while (true) {
                 struct bio_vec bvec;
                 struct msghdr msg = {
                         .msg_flags = MSG_DONTWAIT | MSG_SPLICE_PAGES,
                 };
                 struct page *page = nvme_tcp_req_cur_page(req);
                 size_t offset = nvme_tcp_req_cur_offset(req);
                 size_t len = nvme_tcp_req_cur_length(req);
                 bool last = nvme_tcp_pdu_last_send(req, len);
                 int req_data_sent = req->data_sent;
                 int ret;

                 if (!last || queue->data_digest || nvme_tcp_queue_more(queue))
                         msg.msg_flags |= MSG_MORE;

                 if (!sendpage_ok(page))
                         msg.msg_flags &= ~MSG_SPLICE_PAGES,

                 bvec_set_page(&bvec, page, len, offset);
                 iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, len);
                 ret = sock_sendmsg(queue->sock, &msg);
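
(For reference, and going from memory so please double-check, sendpage_ok() in include/linux/net.h is just:

         static inline bool sendpage_ok(struct page *page)
         {
                 /* refuse slab pages and pages without a reference */
                 return !PageSlab(page) && page_count(page) >= 1;
         }

so it catches slab and zero-refcount pages, but not a pointer that isn't a valid struct page in the first place.)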

Thing is, 'req' already has an iter (that's what nvme_tcp_req_cur_page() extracts the page from).
Maybe a missing kmap() somewhere?
But the whole construct is somewhat redundant anyway: since 'req' already has an iter, we should be using it directly instead of building another single-page iter that just replicates the existing one.
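
Something along these lines (completely untested sketch, assuming we can hand the bvec iter that nvme_tcp_init_iter() builds in req->iter straight to sock_sendmsg(); the MSG_MORE / partial-send bookkeeping is elided):

         struct msghdr msg = {
                 .msg_flags = MSG_DONTWAIT | MSG_SPLICE_PAGES,
         };
         int ret;

         if (queue->data_digest || nvme_tcp_queue_more(queue))
                 msg.msg_flags |= MSG_MORE;

         /* send straight from the request's existing bvec iter,
          * no per-page bvec_set_page()/iov_iter_bvec() rebuild */
         msg.msg_iter = req->iter;
         ret = sock_sendmsg(queue->sock, &msg);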

Cheers,

Hannes

