Message-ID: <YbFHsYJ5ua3J286o@google.com>
Date:   Thu, 9 Dec 2021 00:02:57 +0000
From:   Sean Christopherson <seanjc@...gle.com>
To:     Maxim Levitsky <mlevitsk@...hat.com>
Cc:     Paolo Bonzini <pbonzini@...hat.com>,
        Joerg Roedel <joro@...tes.org>,
        Vitaly Kuznetsov <vkuznets@...hat.com>,
        Wanpeng Li <wanpengli@...cent.com>,
        Jim Mattson <jmattson@...gle.com>,
        Suravee Suthikulpanit <suravee.suthikulpanit@....com>,
        kvm@...r.kernel.org, iommu@...ts.linux-foundation.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3 00/26] KVM: x86: Halt and APICv overhaul

On Thu, Dec 09, 2021, Maxim Levitsky wrote:
> Also got this while trying a VM with passed through device:
> 
> [mlevitsk@...laptop ~]$[   34.926140] usb 5-3: reset full-speed USB device number 3 using xhci_hcd
> [   42.583661] FAT-fs (mmcblk0p1): Volume was not properly unmounted. Some data may be corrupt. Please run fsck.
> [  363.562173] VFIO - User Level meta-driver version: 0.3
> [  365.160357] vfio-pci 0000:03:00.0: vfio_ecap_init: hiding ecap 0x1e@...54
> [  384.138110] BUG: kernel NULL pointer dereference, address: 0000000000000021
> [  384.154039] #PF: supervisor read access in kernel mode
> [  384.165645] #PF: error_code(0x0000) - not-present page
> [  384.177254] PGD 16da9d067 P4D 16da9d067 PUD 13ad1a067 PMD 0 
> [  384.190036] Oops: 0000 [#1] SMP
> [  384.197117] CPU: 3 PID: 14403 Comm: CPU 3/KVM Tainted: G           O      5.16.0-rc4.unstable #6
> [  384.216978] Hardware name: LENOVO 20UF001CUS/20UF001CUS, BIOS R1CET65W(1.34 ) 06/17/2021
> [  384.235258] RIP: 0010:amd_iommu_update_ga+0x32/0x160
> [  384.246469] Code: <4c> 8b 62 20 48 8b 4a 18 4d 85 e4 0f 84 ca 00 00 00 48 85 c9 0f 84
> [  384.288932] RSP: 0018:ffffc9000036fca0 EFLAGS: 00010046
> [  384.300727] RAX: 0000000000000000 RBX: ffff88810b68ab60 RCX: ffff8881667a6018
> [  384.316850] RDX: 0000000000000001 RSI: ffff888107476b00 RDI: 0000000000000003

RDX, a.k.a. ir_data, is NULL.  This check in svm_ir_list_add()

	if (pi->ir_data && (pi->prev_ga_tag != 0)) {

implies pi->ir_data can be NULL, but neither avic_update_iommu_vcpu_affinity()
nor amd_iommu_update_ga() checks ir->data for NULL.

amd_ir_set_vcpu_affinity() returns "success" without clearing pi.is_guest_mode:

	/* Note:
	 * This device has never been set up for guest mode.
	 * we should not modify the IRTE
	 */
	if (!dev_data || !dev_data->use_vapic)
		return 0;

so it's plausible svm_ir_list_add() could add to the list with a NULL pi->ir_data.
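
To make the failure mode concrete, here is a minimal userspace sketch of the
list walk with a NULL guard added.  Everything in it is a hypothetical stand-in
(the *_model structs and functions are not the kernel's definitions), and
whether a guard like this is the right fix, as opposed to never adding such an
entry to the list, is not something this thread establishes.

```c
#include <stddef.h>

/* Hypothetical stand-in for the per-IRTE data; not the kernel's layout. */
struct ir_data_model {
	int cpu;	/* CPU recorded by the last update */
	int is_run;	/* running state recorded by the last update */
};

/* Hypothetical stand-in for an ir_list entry; data may be NULL. */
struct ir_list_entry_model {
	struct ir_data_model *data;
};

/*
 * Models an update that refuses a NULL data pointer instead of
 * dereferencing it.  Returns 0 on success, -1 if there is nothing to update.
 */
static int update_ga_model(int cpu, int is_run, struct ir_data_model *data)
{
	if (!data)
		return -1;

	data->cpu = cpu;
	data->is_run = is_run;
	return 0;
}

/*
 * Models the affinity update walking every entry.  Entries with NULL data
 * are skipped.  Returns the number of entries actually updated.
 */
static int update_affinity_model(struct ir_list_entry_model *list, size_t n,
				 int cpu, int is_run)
{
	int updated = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (update_ga_model(cpu, is_run, list[i].data) == 0)
			updated++;
	}
	return updated;
}
```

With a two-entry list where the first entry has NULL data, the walk updates
only the second entry instead of faulting on the first.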

But none of the relevant code has seen any meaningful changes since 5.15, so odds
are good I broke something :-/
