linux-kernel - Re: [PATCH] scsi: storvsc: Fix a panic in the hibernation procedure

Message-ID: <ade7f096-4a09-4d4e-753a-f9e4acb7b550@acm.org>
Date:   Thu, 23 Apr 2020 09:37:31 -0700
From:   Bart Van Assche <bvanassche@....org>
To:     Dexuan Cui <decui@...rosoft.com>,
        "jejb@...ux.ibm.com" <jejb@...ux.ibm.com>,
        "martin.petersen@...cle.com" <martin.petersen@...cle.com>,
        "linux-scsi@...r.kernel.org" <linux-scsi@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "hch@....de" <hch@....de>, "hare@...e.de" <hare@...e.de>,
        Michael Kelley <mikelley@...rosoft.com>,
        Long Li <longli@...rosoft.com>,
        "ming.lei@...hat.com" <ming.lei@...hat.com>,
        Balsundar P <Balsundar.P@...rochip.com>
Cc:     "linux-hyperv@...r.kernel.org" <linux-hyperv@...r.kernel.org>,
        "wei.liu@...nel.org" <wei.liu@...nel.org>,
        Stephen Hemminger <sthemmin@...rosoft.com>,
        Haiyang Zhang <haiyangz@...rosoft.com>,
        KY Srinivasan <kys@...rosoft.com>
Subject: Re: [PATCH] scsi: storvsc: Fix a panic in the hibernation procedure

On 4/23/20 12:04 AM, Dexuan Cui wrote:
> It looks like the sd suspend callbacks are only for the I/O from the disk,
> e.g. from the file system that lives in some partition of some disk.
> 
> The panic I'm seeing is not from sd. I think it's from a kernel thread
> that tries to detect the status of the SCSI CDROM. These are the snipped
> messages (the full version is at https://lkml.org/lkml/2020/4/10/47): here
> the suspend callbacks of the sd, sr and scsi_bus_type.pm have been called,
> and later the storvsc LLD's suspend callback is also called, but
> sr_block_check_events() can still try to submit SCSI commands to storvsc:
> 
> [   11.668741] sr 0:0:0:1: bus quiesce
> [   11.668804] sd 0:0:0:0: bus quiesce
> [   11.698082] scsi target0:0:0: bus quiesce
> [   11.703296] scsi host0: bus quiesce
> [   11.781730] hv_storvsc bf78936f-7d8f-45ce-ab03-6c341452e55d: noirq bus quiesce
> [   11.796479] hv_netvsc dda5a2be-b8b8-4237-b330-be8a516a72c0: noirq bus quiesce
> [   11.804042] BUG: kernel NULL pointer dereference, address: 0000000000000090
> [   11.804996] Workqueue: events_freezable_power_ disk_events_workfn
> [   11.804996] RIP: 0010:storvsc_queuecommand+0x261/0x714 [hv_storvsc]
> [   11.804996] Call Trace:
> [   11.804996]  scsi_queue_rq+0x593/0xa10
> [   11.804996]  blk_mq_dispatch_rq_list+0x8d/0x510
> [   11.804996]  blk_mq_sched_dispatch_requests+0xed/0x170
> [   11.804996]  __blk_mq_run_hw_queue+0x55/0x110
> [   11.804996]  __blk_mq_delay_run_hw_queue+0x141/0x160
> [   11.804996]  blk_mq_sched_insert_request+0xc3/0x170
> [   11.804996]  blk_execute_rq+0x4b/0xa0
> [   11.804996]  __scsi_execute+0xeb/0x250
> [   11.804996]  sr_check_events+0x9f/0x270 [sr_mod]
> [   11.804996]  cdrom_check_events+0x1a/0x30 [cdrom]
> [   11.804996]  sr_block_check_events+0xcc/0x110 [sr_mod]
> [   11.804996]  disk_check_events+0x68/0x160
> [   11.804996]  process_one_work+0x20c/0x3d0
> [   11.804996]  worker_thread+0x2d/0x3e0
> [   11.804996]  kthread+0x10c/0x130
> [   11.804996]  ret_from_fork+0x35/0x40
> 
> It looks like the issue is: scsi_bus_freeze() -> ... -> scsi_dev_type_suspend ->
> scsi_device_quiesce() does not guarantee the device is totally quiescent:

During hibernation, processes are frozen before devices are quiesced.
freeze_processes() calls try_to_freeze_tasks() and that function in turn
calls freeze_workqueues_begin() and freeze_workqueues_busy(). Together
these freeze all freezable workqueues, including
system_freezable_power_efficient_wq, the workqueue from which the
check_events functions are called. Some time after the freezable
workqueues are frozen, dpm_suspend(PMSG_FREEZE) is called. That last call
triggers the pm_ops.freeze callbacks, including the pm_ops.freeze
callbacks defined in the SCSI core.

The above trace seems to indicate that the workqueues had not been frozen
before the devices were frozen. How about doing the following to
retrieve more information about what is going on?
* Enable CONFIG_PM_DEBUG in the kernel configuration.
* Run "echo 1 > /sys/power/pm_print_times" and
  "echo 1 > /sys/power/pm_debug_messages" before hibernation starts.
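
The two steps above could be wrapped in a small helper, sketched below.
The function name enable_pm_debug and its optional directory argument are
assumptions made for illustration (the argument only exists so the helper
can be tried against a scratch directory); the two sysfs file names are
the ones mentioned above.

```shell
#!/bin/sh
# Hypothetical helper: turn on the PM debug knobs named in this mail.
# $1 is an optional directory holding the knobs (defaults to /sys/power).
enable_pm_debug() {
    pm_dir="${1:-/sys/power}"
    rc=0
    for knob in pm_print_times pm_debug_messages; do
        if [ -w "$pm_dir/$knob" ]; then
            # Same effect as "echo 1 > /sys/power/<knob>"
            echo 1 > "$pm_dir/$knob"
        else
            # Likely causes: CONFIG_PM_DEBUG is not enabled, or not root
            echo "skip: $pm_dir/$knob is not writable" >&2
            rc=1
        fi
    done
    return "$rc"
}
```

Running "enable_pm_debug" as root before starting hibernation, and then
reading the kernel log after resume, should show the per-device suspend
timings and PM core messages if the knobs were available.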

>> Documentation/driver-api/device_link.rst: "By default, the driver core
>> only enforces dependencies between devices that are borne out of a
>> parent/child relationship within the device hierarchy: When suspending,
>> resuming or shutting down the system, devices are ordered based on this
>> relationship, i.e. children are always suspended before their parent,
>> and the parent is always resumed before its children." Is there a single
>> storvsc_drv instance for all SCSI devices supported by storvsc_drv? Has
>> it been considered to make storvsc_drv the parent device of all SCSI
>> devices created by the storvsc driver?
> 
> Yes, I think so:
> 
> root@...alhost:~# ls -rtl  /sys/bus/vmbus/devices/9be03cb2-d37b-409f-b09b-81059b4f6943/host3/target3:0:0/3:0:0:0/driver
> lrwxrwxrwx 1 root root 0 Apr 22 01:10 /sys/bus/vmbus/devices/9be03cb2-d37b-409f-b09b-81059b4f6943/host3/target3:0:0/3:0:0:0/driver -> ../../../../../../../../../../bus/scsi/drivers/sd
> 
> Here the driver of /sys/bus/vmbus/devices/9be03cb2-d37b-409f-b09b-81059b4f6943
> is storvsc, which creates host3/target3:0:0/3:0:0:0.
> 
> So it looks like there is no ordering issue.

Right, I had overlooked the code in storvsc_probe() that associates SCSI 
devices with storvsc_drv.
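
The parent/child chain shown in the quoted sysfs listing can also be
printed mechanically. The sketch below walks from a device directory up
through its ancestors and prints the driver bound to each one. The
function name print_parent_chain and its optional stop directory are
assumptions for illustration, not an existing tool.

```shell
#!/bin/sh
# Hypothetical helper: list a sysfs device and its ancestors, with the
# driver bound to each ("-" if no driver symlink is present).
# $1: device directory, $2: directory at which to stop (default /sys/devices)
print_parent_chain() {
    # Resolve symlinks so that dirname walks the real device hierarchy
    dev=$(cd "$1" && pwd -P) || return 1
    stop="${2:-/sys/devices}"
    while [ "$dev" != "/" ] && [ "$dev" != "$stop" ]; do
        if [ -L "$dev/driver" ]; then
            drv=$(basename "$(readlink "$dev/driver")")
        else
            drv="-"
        fi
        printf '%s driver=%s\n' "$dev" "$drv"
        dev=$(dirname "$dev")
    done
}
```

For the device quoted above this would be invoked as
print_parent_chain /sys/bus/vmbus/devices/9be03cb2-d37b-409f-b09b-81059b4f6943/host3/target3:0:0/3:0:0:0
and the SCSI device, target and host should appear below the vmbus device
in the resolved path.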

Bart.