Message-ID: <PS1PR03MB4939A124F814F35E69C7D59B88819@PS1PR03MB4939.apcprd03.prod.outlook.com>
Date:   Tue, 21 Mar 2023 13:30:18 +0000
From:   Lei Lei2 Yin <yinlei2@...ovo.com>
To:     Sagi Grimberg <sagi@...mberg.me>,
        "kbusch@...nel.org" <kbusch@...nel.org>,
        "axboe@...com" <axboe@...com>, "hch@....de" <hch@....de>
CC:     "linux-nvme@...ts.infradead.org" <linux-nvme@...ts.infradead.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "cybeyond@...mail.com" <cybeyond@...mail.com>
Subject: Re: [External] Re: [PATCH] nvme: fix heap-use-after-free and oops in bio_endio for nvme multipath


	No, I have not verified this issue on a kernel newer than 5.10.y (such as 5.15.y, 6.0 or later), because some functionality we depend on, like cgroup, has changed too much in newer kernels; we cannot use those newer kernels.
	

	In addition, upstream has changed the bi_disk update to bio_set_dev(bio, ns->disk->part0), and as you said there is no bi_disk in struct bio anymore. So backporting the upstream change is too involved because of code dependencies; what I want to do is what you suggested, and send an alternative surgical fix.
	(I will check upstream for this problem in the near future; if it has the same problem, I will submit a fix for it as well.)
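	For illustration only (a rough, abridged sketch of the direction, not the exact patch I would submit; it assumes 5.10.y's nvme_failover_req() and the bi_disk field, and elides the ANA-error handling the real function has), the idea would be to repoint every bio that is handed back to the ns_head requeue list at the long-lived head disk before it can ever reach bio_endio:

void nvme_failover_req(struct request *req)
{
	struct nvme_ns *ns = req->q->queuedata;
	unsigned long flags;
	struct bio *bio;

	nvme_mpath_clear_current_path(ns);

	spin_lock_irqsave(&ns->head->requeue_lock, flags);
	/*
	 * head->disk outlives the per-path ns gendisk, so a later
	 * bio_endio() on these bios no longer touches freed memory.
	 */
	for (bio = req->bio; bio; bio = bio->bi_next)
		bio->bi_disk = ns->head->disk;
	blk_steal_bios(&ns->head->requeue_list, req);
	spin_unlock_irqrestore(&ns->head->requeue_lock, flags);

	blk_mq_end_request(req, 0);
	kblockd_schedule_work(&ns->head->requeue_work);
}

	This is only meant to show the shape of a surgical fix; the real patch would also have to cover the split/chained parent bio shown in the crash dump below.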

	I'm not sure what evidence is needed to demonstrate this problem and justify the patch. Below are the child bio and parent bio structures at the moment the heap-use-after-free occurred, captured with crash (KASAN and panic_on_warn were enabled).

	Please help me confirm whether this is enough, thanks.

	All bios going from nvme_ns_head_submit_bio to bio_endio belong to the nvme head disk; the failed bio is the origin bio's parent, and its bi_disk (0xffff888153ead000) was kfreed before the KASAN warning (I confirmed this by adding a log).


      KERNEL: /usr/lib/debug/vmlinux  [TAINTED]                        
    DUMPFILE: vmcore  [PARTIAL DUMP]
        CPUS: 8
        DATE: Mon Mar 20 19:43:39 CST 2023
      UPTIME: 00:05:33
LOAD AVERAGE: 73.43, 20.60, 7.11
       TASKS: 526
    NODENAME: C8
     RELEASE: 5.10.167
     VERSION: #1 SMP Fri Feb 17 11:02:17 CST 2023
     MACHINE: x86_64  (2194 Mhz)
      MEMORY: 64 GB
       PANIC: "Kernel panic - not syncing: KASAN: panic_on_warn set ..."
         PID: 417
     COMMAND: "kworker/5:1H"
        TASK: ffff888126972040  [THREAD_INFO: ffff888126972040]
         CPU: 5
       STATE: TASK_RUNNING (PANIC)

crash> bt
PID: 417      TASK: ffff888126972040  CPU: 5    COMMAND: "kworker/5:1H"
 #0 [ffff88810ebcf828] machine_kexec at ffffffff8f701b3e
 #1 [ffff88810ebcf948] __crash_kexec at ffffffff8f9d28eb
 #2 [ffff88810ebcfa60] panic at ffffffff913967e9
 #3 [ffff88810ebcfb30] bio_endio at ffffffff902541f7
 #4 [ffff88810ebcfb78] bio_endio at ffffffff902541f7
 #5 [ffff88810ebcfba8] bio_endio at ffffffff902541f7
 #6 [ffff88810ebcfbd8] nvme_ns_head_submit_bio at ffffffffc13cf960 [nvme_core]
 #7 [ffff88810ebcfcc8] submit_bio_noacct at ffffffff9026b134
 #8 [ffff88810ebcfdb8] nvme_requeue_work at ffffffffc13cdc40 [nvme_core]
 #9 [ffff88810ebcfdf8] process_one_work at ffffffff8f8133c8
#10 [ffff88810ebcfe78] worker_thread at ffffffff8f813fb7
#11 [ffff88810ebcff10] kthread at ffffffff8f825e6f
#12 [ffff88810ebcff50] ret_from_fork at ffffffff8f60619f
crash> p *(struct bio *)0xffff8881890f4900   // child bio
$1 = {
  bi_next = 0x0, 
  bi_disk = 0xdae00188000001a1, 
  bi_opf = 33605633, 
  bi_flags = 1922, 
  bi_ioprio = 0, 
  bi_write_hint = 0, 
  bi_status = 10 '\n', 
  bi_partno = 0 '\000', 
  __bi_remaining = {
    counter = 1
  }, 
  bi_iter = {
    bi_sector = 12287744, 
    bi_size = 65536, 
    bi_idx = 3, 
    bi_bvec_done = 106496
  }, 
  bi_end_io = 0xffffffff90254280 <bio_chain_endio>, 
  bi_private = 0xffff888198b778d0, 
  bi_blkg = 0x0, 
  bi_issue = {
    value = 288230712376101481
  }, 
  {
    bi_integrity = 0x0
  }, 
  bi_vcnt = 0, 
  bi_max_vecs = 0, 
  __bi_cnt = {
    counter = 1
  }, 
  bi_io_vec = 0xffff8881a4530000, 
  bi_pool = 0xffff888141bd7af8, 
  bi_inline_vecs = 0xffff8881890f4978
}

crash> p *(struct bio *)0xffff888198b778d0    // parent bio
$2 = {
  bi_next = 0x0, 
  bi_disk = 0xffff888153ead000, 
  bi_opf = 33589249, 
  bi_flags = 1664, 
  bi_ioprio = 0, 
  bi_write_hint = 0, 
  bi_status = 10 '\n', 
  bi_partno = 0 '\000', 
  __bi_remaining = {
    counter = 0
  }, 
  bi_iter = {
    bi_sector = 12288000, 
    bi_size = 0, 
    bi_idx = 5, 
    bi_bvec_done = 0
  }, 
  bi_end_io = 0xffffffff8ff8df80 <blkdev_bio_end_io_simple>, 
  bi_private = 0xffff8881b0c54080, 
  bi_blkg = 0xffff8881974df400, 
  bi_issue = {
    value = 288230665264113654
  }, 
  {
    bi_integrity = 0x0
  }, 
  bi_vcnt = 5, 
  bi_max_vecs = 256, 
  __bi_cnt = {
    counter = 1
  }, 
  bi_io_vec = 0xffff8881a4530000, 
  bi_pool = 0x0, 
  bi_inline_vecs = 0xffff888198b77948
}
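
	To make it clearer why the stale bi_disk above matters, here is a simplified sketch of the 5.10.y completion path (not verbatim kernel source, just the shape of it): bio_endio() still reads bio->bi_disk before invoking bi_end_io, and bio_chain_endio() then re-enters bio_endio() on the parent, so completing the requeued child ends up dereferencing the parent's already-kfreed gendisk.

/*
 * Simplified sketch (not verbatim 5.10.y source): bio_endio() touches
 * bio->bi_disk for rq_qos/tracing before calling bi_end_io(), and
 * bio_chain_endio() walks up to the parent bio, whose bi_disk points at
 * the ns gendisk that was already kfreed when the path was removed.
 */
void bio_endio_sketch(struct bio *bio)
{
	if (bio->bi_disk)
		rq_qos_done_bio(bio->bi_disk->queue, bio);	/* stale gendisk dereferenced here */

	if (bio->bi_end_io)
		bio->bi_end_io(bio);	/* bio_chain_endio() -> bio_endio(parent) */
}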
	
 

-----Original Message-----
From: Sagi Grimberg <sagi@...mberg.me> 
Sent: March 21, 2023 20:26
To: Lei Lei2 Yin <yinlei2@...ovo.com>; kbusch@...nel.org; axboe@...com; hch@....de
Cc: linux-nvme@...ts.infradead.org; linux-kernel@...r.kernel.org; cybeyond@...mail.com
Subject: Re: [External] Re: [PATCH] nvme: fix heap-use-after-free and oops in bio_endio for nvme multipath


> 	Thank you for your reply.
> 
> 	This problem occurs with nvme over rdma and nvme over tcp when nvme multipath is enabled. The ns gendisk is deleted because the nvmf target subsystem becomes faulty; the host then sees keep-alive timeouts and I/O timeouts on all paths. After ctrl-loss-tmo seconds, the host removes the failed ctrl and its ns gendisk.

That is fine, but it is a problem if it does not correctly drain inflight I/O, whether it was split or not. And this looks like the wrong place to address this.

> 	We have reproduced this problem on Linux-5.10.136, Linux-5.10.167 and 
> the latest commit in linux-5.10.y, and this patch is only applicable 
> to Linux-5.10.y.

So my understanding is that this does not reproduce upstream?

> 
> 	Yes, this is absolutely the wrong place to do this. Can I move this modification to after nvme_trace_bio_complete?
> 
> 	Do I need to resubmit the patch if modifications are needed?

Yes, but a backport fix needs to be sent to the stable mailing list
(stable@...r.kernel.org) and cc'd to the linux-nvme mailing list.

But I don't think this fix is the correct one. What is needed is to identify where this was fixed upstream and backport that fix instead.
If that is too involved because of code dependencies, it may be possible to send an alternative surgical fix, but it needs to be justified.
