Message-ID: <CAFcZKTwQgd9hrTaXnThML=+WG82TH3DK90FT1-WWsBSoRj7dRw@mail.gmail.com>
Date: Fri, 14 Nov 2025 11:49:22 +0000
From: Peter Morrow <pdmorrow@...il.com>
To: Naman Jain <namjain@...ux.microsoft.com>
Cc: Salvatore Bonaccorso <carnil@...ian.org>, Long Li <longli@...rosoft.com>, 1120602@...s.debian.org, 
	linux-hyperv@...r.kernel.org, linux-kernel@...r.kernel.org, 
	regressions@...ts.linux.dev, stable@...r.kernel.org, 
	John Starks <jostarks@...rosoft.com>, Michael Kelley <mhklinux@...look.com>, 
	Tianyu Lan <tiala@...rosoft.com>, "K. Y. Srinivasan" <kys@...rosoft.com>, 
	Haiyang Zhang <haiyangz@...rosoft.com>, Wei Liu <wei.liu@...nel.org>, 
	Dexuan Cui <decui@...rosoft.com>, Greg Kroah-Hartman <gregkh@...uxfoundation.org>
Subject: Re: [REGRESSION 6.12.y] hyper-v: BUG: kernel NULL pointer
 dereference, address: 00000000000000a0: RIP: 0010:hv_uio_channel_cb+0xd/0x20 [uio_hv_generic]

Hi Naman,

On Fri, 14 Nov 2025 at 06:03, Naman Jain <namjain@...ux.microsoft.com> wrote:
>
>
>
> On 11/13/2025 11:59 PM, Salvatore Bonaccorso wrote:
> > Peter Morrow reported in Debian a regression, reported in
> > https://bugs.debian.org/1120602 . The regression was seen after
> > updating to 6.12.57-1 in Debian; details on the offending commit
> > follow.
> >
> > His report was as follows:
> >
> >> Dear Maintainer,
> >>
> >> I'm seeing a kernel crash quite soon after boot on a debian trixie based
> >> system running 6.12.57+deb13-amd64, unfortunately the kernel panics before
> >> I can access the system to gather more information. Thus I'll provide details
> >> of the system using a previously known good version. The panic is happening
> >> 100% of the time unfortunately. I have access to the serial console however
> >> so can enable any required verbose logging during boot if necessary.
> >>
> >> Crucially the crash is not seen with kernel version 6.12.41+deb13-amd64 with the
> >> same userspace. We had pinned to that version until very recently in order
> >> to work around https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1109676
> >>
> >> I'm running a dpdk application here (VPP) on Azure, VM form factor is a
> >> "Standard DS3 v2 (4 vcpus, 14 GiB memory)".
> >>
> >> The only relevant upstream commit in this area (as far as I can see) is:
> >>
> >> https://lore.kernel.org/linux-hyperv/1bb599ee-fe28-409d-b430-2fc086268936@linux.microsoft.com/
> >>
> >> The comment regarding avoiding races at start adds a bit more weight behind this
> >> hunch, though it's only a hunch as I am most definitely nowhere near an expert
> >> in this area.
> >>
> >> -- Package-specific info:
> >>
> >> [   19.625535] BUG: kernel NULL pointer dereference, address: 00000000000000a0
> >> [   19.628874] #PF: supervisor read access in kernel mode
> >> [   19.630841] #PF: error_code(0x0000) - not-present page
> >> [   19.632788] PGD 0 P4D 0
> >> [   19.633905] Oops: Oops: 0000 [#1] PREEMPT SMP PTI
> >> [   19.635586] CPU: 3 UID: 0 PID: 0 Comm: swapper/3 Not tainted 6.12.57+deb13-amd64 #1  Debian 6.12.57-1
> >> [   19.640216] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v4.1 09/28/2024
> >> [   19.644514] RIP: 0010:hv_uio_channel_cb+0xd/0x20 [uio_hv_generic]
> >> [   19.646994] Code: 02 00 00 5b 5d e9 53 98 69 e9 0f 1f 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 48 8b 47 10 <48> 8b b8 a0 00 00 00 f0 83 44 24 fc 00 e9 51 6f fa ff 90 90 90 90
> >> [   19.654377] RSP: 0018:ffffb15ac01a4fa8 EFLAGS: 00010046
> >> [   19.656385] RAX: 0000000000000000 RBX: 0000000000000015 RCX: 0000000000000015
> >> [   19.659240] RDX: 0000000000000001 RSI: ffffffffffffffff RDI: ffff8ff69c759400
> >> [   19.662168] RBP: ffff8ff548790200 R08: ffff8ff548790200 R09: 00fca75150b080e9
> >> [   19.665239] R10: 0000000000000000 R11: ffffb15ac01a4ff8 R12: ffff8ff871dc1480
> >> [   19.668193] R13: ffff8ff69c759400 R14: ffff8ff69c7596a0 R15: ffffffffc106e160
> >> [   19.671106] FS:  0000000000000000(0000) GS:ffff8ff871d80000(0000) knlGS:0000000000000000
> >> [   19.674281] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >> [   19.676533] CR2: 00000000000000a0 CR3: 0000000100ba6003 CR4: 00000000003706f0
> >> [   19.679385] Call Trace:
> >> [   19.680361]  <IRQ>
> >> [   19.681181]  vmbus_isr+0x1a5/0x210 [hv_vmbus]
> >> [   19.682916]  __sysvec_hyperv_callback+0x32/0x60
> >> [   19.684991]  sysvec_hyperv_callback+0x6c/0x90
> >> [   19.686665]  </IRQ>
> >> [   19.687509]  <TASK>
> >> [   19.688366]  asm_sysvec_hyperv_callback+0x1a/0x20
> >> [   19.690262] RIP: 0010:pv_native_safe_halt+0xf/0x20
> >> [   19.692067] Code: 09 e9 c5 08 01 00 0f 1f 44 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 66 90 0f 00 2d e5 3b 31 00 fb f4 <c3> cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90
> >> [   19.699119] RSP: 0018:ffffb15ac0103ed8 EFLAGS: 00000246
> >> [   19.701412] RAX: 0000000000000003 RBX: ffff8ff5403b1fc0 RCX: ffff8ff54c64ce30
> >> [   19.704328] RDX: 0000000000000000 RSI: 0000000000000003 RDI: 000000000001f894
> >> [   19.706910] RBP: 0000000000000003 R08: 000000000bb760d9 R09: 00fca75150b080e9
> >> [   19.709762] R10: 0000000000000003 R11: 0000000000000001 R12: 0000000000000000
> >> [   19.712510] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
> >> [   19.715173]  default_idle+0x9/0x20
> >> [   19.716846]  default_idle_call+0x29/0x100
> >> [   19.718623]  do_idle+0x1fe/0x240
> >> [   19.720045]  cpu_startup_entry+0x29/0x30
> >> [   19.721595]  start_secondary+0x11e/0x140
> >> [   19.723080]  common_startup_64+0x13e/0x141
> >> [   19.725222]  </TASK>
> >> [   19.726387] Modules linked in: isofs cdrom uio_hv_generic uio binfmt_misc intel_rapl_msr intel_rapl_common intel_uncore_frequency_common isst_if_mbox_msr isst_if_common rpcrdma skx_edac_common nfit sunrpc libnvdimm crct10dif_pclmul ghash_clmulni_intel sha512_ssse3 sha256_ssse3 rdma_ucm ib_iser sha1_ssse3 rdma_cm aesni_intel iw_cm gf128mul crypto_simd libiscsi cryptd ib_umad ib_ipoib scsi_transport_iscsi ib_cm rapl sg hv_utils hv_balloon evdev pcspkr joydev mpls_router ip_tunnel ramoops configfs pstore_blk efi_pstore pstore_zone nfnetlink vsock_loopback vmw_vsock_virtio_transport_common hv_sock vmw_vsock_vmci_transport vsock vmw_vmci efivarfs ip_tables x_tables autofs4 overlay squashfs dm_verity dm_bufio reed_solomon dm_mod loop ext4 crc16 mbcache jbd2 crc32c_generic mlx5_ib ib_uverbs ib_core mlx5_core mlxfw pci_hyperv pci_hyperv_intf hyperv_drm drm_shmem_helper sd_mod drm_kms_helper hv_storvsc scsi_transport_fc drm scsi_mod hid_generic hid_hyperv hid serio_raw hv_netvsc hyperv_keyboard scsi_common hv_vmbus
> >> [   19.726466]  crc32_pclmul crc32c_intel
> >> [   19.765771] CR2: 00000000000000a0
> >> [   19.767524] ---[ end trace 0000000000000000 ]---
> >> [   19.800433] RIP: 0010:hv_uio_channel_cb+0xd/0x20 [uio_hv_generic]
> >> [   19.803170] Code: 02 00 00 5b 5d e9 53 98 69 e9 0f 1f 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 48 8b 47 10 <48> 8b b8 a0 00 00 00 f0 83 44 24 fc 00 e9 51 6f fa ff 90 90 90 90
> >> [   19.811041] RSP: 0018:ffffb15ac01a4fa8 EFLAGS: 00010046
> >> [   19.813466] RAX: 0000000000000000 RBX: 0000000000000015 RCX: 0000000000000015
> >> [   19.816504] RDX: 0000000000000001 RSI: ffffffffffffffff RDI: ffff8ff69c759400
> >> [   19.819484] RBP: ffff8ff548790200 R08: ffff8ff548790200 R09: 00fca75150b080e9
> >> [   19.822625] R10: 0000000000000000 R11: ffffb15ac01a4ff8 R12: ffff8ff871dc1480
> >> [   19.825569] R13: ffff8ff69c759400 R14: ffff8ff69c7596a0 R15: ffffffffc106e160
> >> [   19.828804] FS:  0000000000000000(0000) GS:ffff8ff871d80000(0000) knlGS:0000000000000000
> >> [   19.832214] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >> [   19.834709] CR2: 00000000000000a0 CR3: 0000000100ba6003 CR4: 00000000003706f0
> >> [   19.837976] Kernel panic - not syncing: Fatal exception in interrupt
> >> [   19.841825] Kernel Offset: 0x28a00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
> >> [   19.896620] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---
> >>
>
> <snip>
>
> > The offending commit appears to be the backport of b15b7d2a1b09
> > ("uio_hv_generic: Let userspace take care of interrupt mask") for
> > 6.12.y.
> >
> > Peter confirmed that reverting this commit on top of 6.12.57-1 as
> > packaged in Debian does indeed resolve the issue. Interestingly the issue
> > is *not* seen with the 6.17.7-based kernel in Debian.
> >
> > #regzbot introduced: 37bd91f22794dc05436130d6983302cb90ecfe7e
> > #regzbot monitor: https://bugs.debian.org/1120602
> >
> > Thank you already!
> >
> > Regards,
> > Salvatore
>
> Hi Peter, Salvatore,
> Thanks for reporting this crash, and sorry for the trouble. Here is my
> analysis.
>
> On 6.17.7, where commit d062463edf17 ("uio_hv_generic: Set event for all
> channels on the device") is present, hv_uio_irqcontrol() supports
> setting of interrupt mask from userspace for sub-channels as well.
>
> This aligns with commit e29587c07537 ("uio_hv_generic: Let userspace
> take care of interrupt mask") which relies on userspace to manage
> interrupt mask, so it safely removes the interrupt mask management logic
> in the driver.
>
> However, in 6.12.57, the first commit is not present, but the second one
> is, so there is no way to set the interrupt mask for sub-channels and
> interrupt_mask stays 0, which means interrupts are not masked. So an
> interrupt callback may be invoked for a sub-channel, where we do not
> expect one. This may be causing this issue.
>
> This would have led to a crash in hv_uio_channel_cb() for sub-channels:
> struct hv_device *hv_dev = chan->device_obj;
>
>
> I have ported commit d062463edf17 ("uio_hv_generic: Set event for all
> channels on the device") on 6.12.57, and resolved some merge conflicts.
> Could you please help test whether this works for you?
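
For anyone following the analysis above, here is a minimal user-space
sketch of the failure mode and of the lookup the patch switches to. It is
not kernel code: the names model_device, model_channel, lookup_device_old
and lookup_device_new are hypothetical stand-ins, and the assumption that a
sub-channel carries a NULL device_obj is inferred from the oops, where the
faulting bytes "48 8b b8 a0 00 00 00" decode to a load from 0xa0(%rax) with
RAX = 0, matching CR2 = 0xa0.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures (assumption: only the
 * fields relevant to the crash are modeled). */
struct model_device {
    void *drvdata;
};

struct model_channel {
    struct model_device *device_obj;       /* assumed NULL on a sub-channel */
    struct model_channel *primary_channel; /* NULL on the primary channel */
};

/* Pre-patch lookup: takes device_obj straight from the channel that
 * raised the callback, so a sub-channel yields NULL here. */
struct model_device *lookup_device_old(const struct model_channel *chan)
{
    assert(chan != NULL);
    return chan->device_obj;
}

/* Post-patch lookup: a sub-channel borrows the device of its primary
 * channel, mirroring the ternary added to hv_uio_channel_cb(). */
struct model_device *lookup_device_new(const struct model_channel *chan)
{
    assert(chan != NULL);
    return chan->primary_channel ? chan->primary_channel->device_obj
                                 : chan->device_obj;
}
```

With a primary channel that owns a device and a sub-channel pointing back
at it, lookup_device_old() on the sub-channel returns NULL (the pointer the
real callback then dereferences), while lookup_device_new() returns the
primary's device for both channels.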

Applying the patch to the Debian 6.12.57 kernel worked; I am no
longer seeing that panic on boot:

gnos@...ge:~$ uname -a
Linux vEdge 6.12+unreleased-amd64 #1 SMP PREEMPT_DYNAMIC Debian
6.12.57-1a~test (2025-11-14) x86_64 GNU/Linux
gnos@...ge:~$ uptime
 11:46:33 up 4 min,  1 user,  load average: 3.31, 2.07, 0.89
gnos@...ge:~$ sudo dmidecode -t system
# dmidecode 3.6
Getting SMBIOS data from sysfs.
SMBIOS 3.1.0 present.

Handle 0x0001, DMI type 1, 27 bytes
System Information
        Manufacturer: Microsoft Corporation
        Product Name: Virtual Machine
        Version: Hyper-V UEFI Release v4.1
        Serial Number: 0000-0002-8036-1108-7588-3134-50
        UUID: 26e86d6e-140c-496a-862c-a3b3bbcd16ad
        Wake-up Type: Power Switch
        SKU Number: None
        Family: Virtual Machine

Handle 0x0010, DMI type 32, 11 bytes
System Boot Information
        Status: No errors detected

gnos@...ge:~$

Thanks a lot for the quick analysis!

Peter.

>
> Hi Long,
> If this works, do you see any concerns if I back-port your patch on
> older kernels (6.12 and prior)?
>
> Regards,
> Naman
>
> --------------
> Patch:
>
>  From 2f14d48d2bde3f86b153b9f756a9cd688cda3453 Mon Sep 17 00:00:00 2001
> From: Long Li <longli@...rosoft.com>
> Date: Mon, 10 Mar 2025 15:12:01 -0700
> Subject: [PATCH] uio_hv_generic: Set event for all channels on the device
>
> Hyper-V may offer a non latency sensitive device with subchannels without
> monitor bit enabled. The decision is entirely on the Hyper-V host not
> configurable within guest.
>
> When a device has subchannels, also signal events for the subchannel
> if its monitor bit is disabled.
>
> This patch also removes the memory barrier when monitor bit is enabled
> as it is not necessary. The memory barrier is only needed between
> setting up interrupt mask and calling vmbus_set_event() when monitor
> bit is disabled.
>
> Signed-off-by: Long Li <longli@...rosoft.com>
> Reviewed-by: Michael Kelley <mhklinux@...look.com>
> Reviewed-by: Saurabh Sengar <ssengar@...ux.microsoft.com>
> Signed-off-by: Naman Jain <namjain@...ux.microsoft.com>
> ---
>   drivers/uio/uio_hv_generic.c | 32 ++++++++++++++++++++++++++------
>   1 file changed, 26 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/uio/uio_hv_generic.c b/drivers/uio/uio_hv_generic.c
> index 0b414d1168dd..9f3b124a5e09 100644
> --- a/drivers/uio/uio_hv_generic.c
> +++ b/drivers/uio/uio_hv_generic.c
> @@ -65,6 +65,16 @@ struct hv_uio_private_data {
>          char    send_name[32];
>   };
>
> +static void set_event(struct vmbus_channel *channel, s32 irq_state)
> +{
> +       channel->inbound.ring_buffer->interrupt_mask = !irq_state;
> +       if (!channel->offermsg.monitor_allocated && irq_state) {
> +               /* MB is needed for host to see the interrupt mask first */
> +               virt_mb();
> +               vmbus_set_event(channel);
> +       }
> +}
> +
>   /*
>    * This is the irqcontrol callback to be registered to uio_info.
>    * It can be used to disable/enable interrupt from user space processes.
> @@ -79,12 +89,15 @@ hv_uio_irqcontrol(struct uio_info *info, s32 irq_state)
>   {
>          struct hv_uio_private_data *pdata = info->priv;
>          struct hv_device *dev = pdata->device;
> +       struct vmbus_channel *primary, *sc;
>
> -       dev->channel->inbound.ring_buffer->interrupt_mask = !irq_state;
> -       virt_mb();
> +       primary = dev->channel;
> +       set_event(primary, irq_state);
>
> -       if (!dev->channel->offermsg.monitor_allocated && irq_state)
> -               vmbus_setevent(dev->channel);
> +       mutex_lock(&vmbus_connection.channel_mutex);
> +       list_for_each_entry(sc, &primary->sc_list, sc_list)
> +               set_event(sc, irq_state);
> +       mutex_unlock(&vmbus_connection.channel_mutex);
>
>          return 0;
>   }
> @@ -95,11 +108,18 @@ hv_uio_irqcontrol(struct uio_info *info, s32 irq_state)
>   static void hv_uio_channel_cb(void *context)
>   {
>          struct vmbus_channel *chan = context;
> -       struct hv_device *hv_dev = chan->device_obj;
> -       struct hv_uio_private_data *pdata = hv_get_drvdata(hv_dev);
> +       struct hv_device *hv_dev;
> +       struct hv_uio_private_data *pdata;
>
>          virt_mb();
>
> +       /*
> +       * The callback may come from a subchannel, in which case look
> +       * for the hv device in the primary channel
> +       */
> +       hv_dev = chan->primary_channel ?
> +       chan->primary_channel->device_obj : chan->device_obj;
> +       pdata = hv_get_drvdata(hv_dev);
>          uio_event_notify(&pdata->info);
>   }
>
> --
> 2.43.0
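
As a reading aid for the patch above, here is a small user-space model of
the masking logic it introduces. This is only a sketch under stated
assumptions: model_chan, model_set_event and model_irqcontrol are
hypothetical names, the sub-channel list is flattened into an array, the
"signaled" counter stands in for the call to vmbus_set_event(), and the
virt_mb() barrier and channel_mutex locking of the real driver are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-channel state the patch touches. */
struct model_chan {
    uint32_t interrupt_mask;  /* 1 = interrupts masked for this channel */
    bool monitor_allocated;   /* monitor bit enabled by the host */
    unsigned int signaled;    /* counts stand-in vmbus_set_event() calls */
};

/* Mirrors set_event() from the patch: the mask is the inverse of
 * irq_state, and the host is signaled only when interrupts are being
 * enabled on a channel without the monitor bit. */
void model_set_event(struct model_chan *ch, int32_t irq_state)
{
    assert(ch != NULL);
    ch->interrupt_mask = !irq_state;
    if (!ch->monitor_allocated && irq_state)
        ch->signaled++;
}

/* Mirrors the patched hv_uio_irqcontrol(): apply the same state to the
 * primary channel and then to every sub-channel. */
void model_irqcontrol(struct model_chan *primary, struct model_chan *subs,
                      size_t nsubs, int32_t irq_state)
{
    assert(primary != NULL);
    model_set_event(primary, irq_state);
    for (size_t i = 0; i < nsubs; i++)
        model_set_event(&subs[i], irq_state);
}
```

Calling model_irqcontrol() with irq_state 0 sets interrupt_mask to 1 on
every channel; before the patch only the primary channel was updated, which
is the gap described in the analysis.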
