Message-ID: <6b4933df-6af2-449c-922b-30ef8fd4c8b8@siemens.com>
Date: Tue, 3 Feb 2026 06:57:46 +0100
From: Jan Kiszka <jan.kiszka@...mens.com>
To: Long Li <longli@...rosoft.com>, KY Srinivasan <kys@...rosoft.com>,
Haiyang Zhang <haiyangz@...rosoft.com>, Wei Liu <wei.liu@...nel.org>,
Dexuan Cui <DECUI@...rosoft.com>,
"James E.J. Bottomley" <James.Bottomley@...senPartnership.com>,
"Martin K. Petersen" <martin.petersen@...cle.com>,
"linux-hyperv@...r.kernel.org" <linux-hyperv@...r.kernel.org>
Cc: "linux-scsi@...r.kernel.org" <linux-scsi@...r.kernel.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Florian Bezdeka <florian.bezdeka@...mens.com>,
RT <linux-rt-users@...r.kernel.org>, Mitchell Levy <levymitchell0@...il.com>
Subject: Re: [EXTERNAL] [PATCH] scsi: storvsc: Fix scheduling while atomic on
PREEMPT_RT
On 03.02.26 00:47, Long Li wrote:
>> From: Jan Kiszka <jan.kiszka@...mens.com>
>>
>> This resolves the following splat and lock-up when running with PREEMPT_RT
>> enabled on Hyper-V:
>
> Hi Jan,
>
> It's interesting to know the use-case of running an RT kernel over Hyper-V.
>
> Can you give an example?
>
- functional testing of an RT base image over Hyper-V
- re-use of a common RT base image, without exploiting RT properties
> As far as I know, Hyper-V makes no RT guarantees of scheduling VPs for a VM.
This is well understood and not our goal. We only need the kernel to run
correctly over Hyper-V with PREEMPT_RT enabled, and that is not the case
right now.
Thanks,
Jan
PS: Who had the idea to drop the virtual UART from Gen 2 VMs? Early boot
guest debugging is true fun now...
>
> Thanks,
> Long
>
>>
>> [ 415.140818] BUG: scheduling while atomic: stress-ng-iomix/1048/0x00000002
>> [ 415.140822] INFO: lockdep is turned off.
>> [ 415.140823] Modules linked in: intel_rapl_msr intel_rapl_common intel_uncore_frequency_common intel_pmc_core pmt_telemetry pmt_discovery pmt_class intel_pmc_ssram_telemetry intel_vsec ghash_clmulni_intel aesni_intel rapl binfmt_misc nls_ascii nls_cp437 vfat fat snd_pcm hyperv_drm snd_timer drm_client_lib drm_shmem_helper snd sg soundcore drm_kms_helper pcspkr hv_balloon hv_utils evdev joydev drm configfs efi_pstore nfnetlink vsock_loopback vmw_vsock_virtio_transport_common hv_sock vmw_vsock_vmci_transport vsock vmw_vmci efivarfs autofs4 ext4 crc16 mbcache jbd2 sr_mod sd_mod cdrom hv_storvsc serio_raw hid_generic scsi_transport_fc hid_hyperv scsi_mod hid hv_netvsc hyperv_keyboard scsi_common
>> [ 415.140846] Preemption disabled at:
>> [ 415.140847] [<ffffffffc0656171>] storvsc_queuecommand+0x2e1/0xbe0 [hv_storvsc]
>> [ 415.140854] CPU: 8 UID: 0 PID: 1048 Comm: stress-ng-iomix Not tainted 6.19.0-rc7 #30 PREEMPT_{RT,(full)}
>> [ 415.140856] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v4.1 09/04/2024
>> [ 415.140857] Call Trace:
>> [ 415.140861]  <TASK>
>> [ 415.140861]  ? storvsc_queuecommand+0x2e1/0xbe0 [hv_storvsc]
>> [ 415.140863]  dump_stack_lvl+0x91/0xb0
>> [ 415.140870]  __schedule_bug+0x9c/0xc0
>> [ 415.140875]  __schedule+0xdf6/0x1300
>> [ 415.140877]  ? rtlock_slowlock_locked+0x56c/0x1980
>> [ 415.140879]  ? rcu_is_watching+0x12/0x60
>> [ 415.140883]  schedule_rtlock+0x21/0x40
>> [ 415.140885]  rtlock_slowlock_locked+0x502/0x1980
>> [ 415.140891]  rt_spin_lock+0x89/0x1e0
>> [ 415.140893]  hv_ringbuffer_write+0x87/0x2a0
>> [ 415.140899]  vmbus_sendpacket_mpb_desc+0xb6/0xe0
>> [ 415.140900]  ? rcu_is_watching+0x12/0x60
>> [ 415.140902]  storvsc_queuecommand+0x669/0xbe0 [hv_storvsc]
>> [ 415.140904]  ? HARDIRQ_verbose+0x10/0x10
>> [ 415.140908]  ? __rq_qos_issue+0x28/0x40
>> [ 415.140911]  scsi_queue_rq+0x760/0xd80 [scsi_mod]
>> [ 415.140926]  __blk_mq_issue_directly+0x4a/0xc0
>> [ 415.140928]  blk_mq_issue_direct+0x87/0x2b0
>> [ 415.140931]  blk_mq_dispatch_queue_requests+0x120/0x440
>> [ 415.140933]  blk_mq_flush_plug_list+0x7a/0x1a0
>> [ 415.140935]  __blk_flush_plug+0xf4/0x150
>> [ 415.140940]  __submit_bio+0x2b2/0x5c0
>> [ 415.140944]  ? submit_bio_noacct_nocheck+0x272/0x360
>> [ 415.140946]  submit_bio_noacct_nocheck+0x272/0x360
>> [ 415.140951]  ext4_read_bh_lock+0x3e/0x60 [ext4]
>> [ 415.140995]  ext4_block_write_begin+0x396/0x650 [ext4]
>> [ 415.141018]  ? __pfx_ext4_da_get_block_prep+0x10/0x10 [ext4]
>> [ 415.141038]  ext4_da_write_begin+0x1c4/0x350 [ext4]
>> [ 415.141060]  generic_perform_write+0x14e/0x2c0
>> [ 415.141065]  ext4_buffered_write_iter+0x6b/0x120 [ext4]
>> [ 415.141083]  vfs_write+0x2ca/0x570
>> [ 415.141087]  ksys_write+0x76/0xf0
>> [ 415.141089]  do_syscall_64+0x99/0x1490
>> [ 415.141093]  ? rcu_is_watching+0x12/0x60
>> [ 415.141095]  ? finish_task_switch.isra.0+0xdf/0x3d0
>> [ 415.141097]  ? rcu_is_watching+0x12/0x60
>> [ 415.141098]  ? lock_release+0x1f0/0x2a0
>> [ 415.141100]  ? rcu_is_watching+0x12/0x60
>> [ 415.141101]  ? finish_task_switch.isra.0+0xe4/0x3d0
>> [ 415.141103]  ? rcu_is_watching+0x12/0x60
>> [ 415.141104]  ? __schedule+0xb34/0x1300
>> [ 415.141106]  ? hrtimer_try_to_cancel+0x1d/0x170
>> [ 415.141109]  ? do_nanosleep+0x8b/0x160
>> [ 415.141111]  ? hrtimer_nanosleep+0x89/0x100
>> [ 415.141114]  ? __pfx_hrtimer_wakeup+0x10/0x10
>> [ 415.141116]  ? xfd_validate_state+0x26/0x90
>> [ 415.141118]  ? rcu_is_watching+0x12/0x60
>> [ 415.141120]  ? do_syscall_64+0x1e0/0x1490
>> [ 415.141121]  ? do_syscall_64+0x1e0/0x1490
>> [ 415.141123]  ? rcu_is_watching+0x12/0x60
>> [ 415.141124]  ? do_syscall_64+0x1e0/0x1490
>> [ 415.141125]  ? do_syscall_64+0x1e0/0x1490
>> [ 415.141127]  ? irqentry_exit+0x140/0x7e0
>> [ 415.141129]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
>>
>> get_cpu() disables preemption, while the spinlock that hv_ringbuffer_write
>> takes is converted to a sleeping rt_mutex under PREEMPT_RT. Acquiring that
>> lock inside the preemption-disabled section therefore triggers the splat above.
>>
>> Signed-off-by: Jan Kiszka <jan.kiszka@...mens.com>
>> ---
>>
>> This is likely just the tip of the iceberg, see specifically [1], but if you never start
>> addressing it, it will continue to crash ships, even if those are only on test
>> cruises (we are fully aware that Hyper-V provides no RT guarantees for
>> guests). A pragmatic alternative to that would be a simple
>>
>> config HYPERV
>> depends on !PREEMPT_RT
>>
>> Please share your thoughts on whether this fix is worth it, or whether we should
>> rather stop chasing the next splats that show up after it. We are currently
>> considering threading some of the hv platform IRQs under PREEMPT_RT as a
>> potential next step.
>>
>> TIA!
>>
>> [1] https://lore.kernel.org/all/20230809-b4-rt_preempt-fix-v1-0-7283bbdc8b14@gmail.com/
>>
>> drivers/scsi/storvsc_drv.c | 5 +++--
>> 1 file changed, 3 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c
>> index b43d876747b7..68c837146b9e 100644
>> --- a/drivers/scsi/storvsc_drv.c
>> +++ b/drivers/scsi/storvsc_drv.c
>> @@ -1855,8 +1855,9 @@ static int storvsc_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *scmnd)
>> cmd_request->payload_sz = payload_sz;
>>
>> /* Invokes the vsc to start an IO */
>> - ret = storvsc_do_io(dev, cmd_request, get_cpu());
>> - put_cpu();
>> + migrate_disable();
>> + ret = storvsc_do_io(dev, cmd_request, smp_processor_id());
>> + migrate_enable();
>>
>> if (ret)
>> scsi_dma_unmap(scmnd);
>> --
>> 2.51.0
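For reference, the hunk above boils down to the pattern change sketched
below. This is a condensed illustration with explanatory comments, not the
driver code verbatim; the path into hv_ringbuffer_write() is the one shown
in the trace above.

/* Before: preemption is disabled across the I/O submission. */
cpu = get_cpu();	/* preempt_disable() + smp_processor_id() */
ret = storvsc_do_io(dev, cmd_request, cpu);
/*
 * storvsc_do_io() ends up in hv_ringbuffer_write(), which takes a
 * spinlock_t. Under PREEMPT_RT that lock is a sleeping rt_mutex, so
 * acquiring it here is "scheduling while atomic".
 */
put_cpu();		/* preempt_enable() */

/*
 * After: only migration is disabled. The CPU number passed to
 * storvsc_do_io() stays stable, while the PREEMPT_RT spinlock in
 * hv_ringbuffer_write() can still be taken without tripping the
 * atomicity check.
 */
migrate_disable();
ret = storvsc_do_io(dev, cmd_request, smp_processor_id());
migrate_enable();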
--
Siemens AG, Foundational Technologies
Linux Expert Center