linux-kernel - Re: [PATCH 1/1] Drivers: hv: vmbus: Fix rescind handling issues

Message-ID: <87wp4wrwsx.fsf@vitty.brq.redhat.com>
Date:   Mon, 18 Sep 2017 14:55:10 +0200
From:   Vitaly Kuznetsov <vkuznets@...hat.com>
To:     KY Srinivasan <kys@...rosoft.com>
Cc:     Stephen Hemminger <stephen@...workplumber.org>,
        leann.ogasawara@...onical.com,
        Stephen Hemminger <sthemmin@...rosoft.com>,
        "apw\@canonical.com" <apw@...onical.com>,
        "olaf\@aepfle.de" <olaf@...fle.de>,
        "marcelo.cerri\@canonical.com" <marcelo.cerri@...onical.com>,
        "gregkh\@linuxfoundation.org" <gregkh@...uxfoundation.org>,
        Haiyang Zhang <haiyangz@...rosoft.com>,
        "linux-kernel\@vger.kernel.org" <linux-kernel@...r.kernel.org>,
        "jasowang\@redhat.com" <jasowang@...hat.com>,
        "devel\@linuxdriverproject.org" <devel@...uxdriverproject.org>
Subject: Re: [PATCH 1/1] Drivers: hv: vmbus: Fix rescind handling issues

Vitaly Kuznetsov <vkuznets@...hat.com> writes:

>
> Reverting 6f3d791f300618caf82a2be0c27456edd76d5164 still helps.

In addition to the above I got the following crash while playing with
4.14-rc1 (unmodified):

[   55.810080] kernel tried to execute NX-protected page - exploit attempt? (uid: 0)
[   55.814293] BUG: unable to handle kernel paging request at ffff8800059985f0
[   55.818065] IP: 0xffff8800059985f0
[   55.819925] PGD 22eb067 P4D 22eb067 PUD 22ec067 PMD 5f37063 PTE 8000000005998163
[   55.820018] Oops: 0011 [#1] SMP
[   55.820018] Modules linked in: vfat fat bnx2x mdio efi_pstore hv_utils efivars pci_hyperv ptp pps_core pcspkr hv_balloon xfs libcrc32c hv_storvsc hyperv_fb hv_netvsc scsi_transport_fc hid_hyperv hyperv_keyboard hv_vmbus
[   55.834837] CPU: 0 PID: 498 Comm: kworker/0:2 Not tainted 4.14.0-rc1 #63
[   55.834837] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v1.0 11/26/2012
[   55.834837] Workqueue: events vmbus_onmessage_work [hv_vmbus]
[   55.834837] task: ffff88003f448000 task.stack: ffffc90005398000
[   55.834837] RIP: 0010:0xffff8800059985f0
[   55.834837] RSP: 0018:ffffc9000539be00 EFLAGS: 00010286
[   55.834837] RAX: ffff880005998010 RBX: ffff880005998000 RCX: 0000000000000000
[   55.834837] RDX: ffff8800059985f0 RSI: 0000000000000246 RDI: ffff880005998000
[   55.860040] RBP: ffffc9000539be18 R08: 00000000000002e6 R09: 0000000000000000
[   55.865057] R10: ffffc9000539bdf0 R11: 000000000000a000 R12: 0000000000000286
[   55.865057] R13: ffff88007ae1ed00 R14: 0000000000000000 R15: ffff8800065c3200
[   55.865057] FS:  0000000000000000(0000) GS:ffff88007ae00000(0000) knlGS:0000000000000000
[   55.865057] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   55.865057] CR2: ffff8800059985f0 CR3: 00000000075a5000 CR4: 00000000001406f0
[   55.886745] Call Trace:
[   55.886745]  ? vmbus_onoffer_rescind+0xfa/0x160 [hv_vmbus]
[   55.890968]  vmbus_onmessage+0x2a/0x90 [hv_vmbus]
[   55.891934]  vmbus_onmessage_work+0x1d/0x30 [hv_vmbus]
[   55.891934]  process_one_work+0x193/0x390
[   55.891934]  worker_thread+0x48/0x3c0
[   55.891934]  kthread+0x120/0x140
[   55.891934]  ? process_one_work+0x390/0x390
[   55.891934]  ? kthread_create_on_node+0x60/0x60
[   55.891934]  ret_from_fork+0x25/0x30
[   55.891934] Code: 88 ff ff c0 85 99 05 00 88 ff ff d0 85 99 05 00 88 ff ff d0 85 99 05 00 88 ff ff e0 85 99 05 00 88 ff ff e0 85 99 05 00 88 ff ff <f0> 85 99 05 00 88 ff ff f0 85 99 05 00 88 ff ff 00 86 99 05 00 
[   55.922505] RIP: 0xffff8800059985f0 RSP: ffffc9000539be00
[   55.922505] CR2: ffff8800059985f0
[   55.922505] ---[ end trace 25226e00af3f94fb ]---
[   55.933590] Kernel panic - not syncing: Fatal exception
[   55.933590] Kernel Offset: disabled
[   55.933590] ---[ end Kernel panic - not syncing: Fatal exception

So it seems that during 

	while (READ_ONCE(channel->probe_done) == false) {
		/*
		 * We wait here until any channel offer is currently
		 * being processed.
		 */
		msleep(1);
	}

loop the channel disappeared. The issue may not be related to the netvsc
hang I mentioned before. It may make sense to do refcounting for
channels/subchannels (or employ RCU).
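
To illustrate the refcounting idea, here is a minimal userspace sketch, not
kernel code and not a proposed patch. Everything in it is hypothetical:
struct demo_channel stands in for the real channel structure, C11 atomics
stand in for kref/refcount_t, and sched_yield() stands in for msleep(1). It
only shows the waiter pinning the channel for the duration of the wait loop,
so a concurrent release cannot free it underneath us.

```c
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the channel object; not the real layout. */
struct demo_channel {
	atomic_int refcnt;	/* number of live references */
	atomic_bool probe_done;	/* set once the offer has been processed */
};

/* Allocate a channel holding one initial reference (the owner's). */
struct demo_channel *demo_channel_alloc(void)
{
	struct demo_channel *ch = calloc(1, sizeof(*ch));

	if (!ch)
		return NULL;
	atomic_init(&ch->refcnt, 1);
	atomic_init(&ch->probe_done, false);
	return ch;
}

/* Take an extra reference; the caller must already know ch is alive. */
void demo_channel_get(struct demo_channel *ch)
{
	int old = atomic_fetch_add(&ch->refcnt, 1);

	assert(old > 0);
}

/* Drop a reference; frees ch and returns true when it was the last one. */
bool demo_channel_put(struct demo_channel *ch)
{
	if (atomic_fetch_sub(&ch->refcnt, 1) == 1) {
		free(ch);
		return true;
	}
	return false;
}

/*
 * Models the wait loop quoted above, but holding a reference across it.
 * Assumption: in real code the get would have to happen under whatever
 * lock protects the channel lookup, otherwise the race only moves.
 */
bool demo_wait_for_probe(struct demo_channel *ch)
{
	demo_channel_get(ch);
	while (!atomic_load(&ch->probe_done))
		sched_yield();
	/* ch is still valid here because we hold our own reference. */
	return demo_channel_put(ch);
}
```

With this shape the final free happens in whichever path drops the last
reference, so the rescind path can keep dereferencing the channel after the
loop regardless of what the offer path did in the meantime.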

-- 
  Vitaly