Date:	Mon, 06 Jun 2011 16:17:40 -0600
From:	Alex Williamson <alex.williamson@...hat.com>
To:	padmanabh ratnakar <pratnakarlx@...il.com>
Cc:	linux-kernel@...r.kernel.org, kvm@...r.kernel.org,
	iommu <iommu@...ts.linux-foundation.org>, dwmw2@...radead.org
Subject: Re: Seeing DMAR errors after multiple load/unload with SR-IOV

On Mon, 2011-06-06 at 14:39 +0530, padmanabh ratnakar wrote:
> Hi,
>         I am using Linux kernel 2.6.39 on an IBM x3650 M3 system,
> booted with the following options:
> intel_iommu=on iommu=pt
> 
> I was loading/unloading my NIC driver (be2net) with num_vfs=7.
> 
> After some iterations I get the following DMAR errors:
> Jun  4 03:50:20 rhel6 kernel: Uhhuh. NMI received for unknown reason 2d on CPU 0.
> Jun  4 03:50:20 rhel6 kernel: Do you have a strange power saving mode enabled?
> Jun  4 03:50:20 rhel6 kernel: Dazed and confused, but trying to continue
> Jun  4 03:50:20 rhel6 kernel: DRHD: handling fault status reg 2
> Jun  4 03:50:20 rhel6 kernel: DMAR:[DMA Read] Request device [1a:00.2] fault addr 78077000
> Jun  4 03:50:20 rhel6 kernel: DMAR:[fault reason 02] Present bit in context entry is clear
> 
> I was trying to debug this, but I don't understand the IOMMU code well.
> The physical address belongs to the printed PCI function, so there
> should not have been an error.
> 
> I do not see the pci_dev (pdev) of the VFs being removed from the
> si_domain->devices list (intel-iommu.c) when the driver is unloaded
> and calls pci_disable_sriov(), which frees the VF pdevs.
> It looks like the issue happens when a freed pdev is allocated again:
> because it is already in the list, the required initialization does
> not happen.
> 
> I don't know if my understanding is correct. Can anyone point me to
> what the issue may be?

Typically devices are removed from the domain via
drivers/pci/intel-iommu.c:device_notifier(), which is called as the
device is unbound from the driver.  However, this seems to get skipped
when running in passthrough mode, so I'm not sure where the removal is
supposed to occur.  Does it happen without passthrough?  Also note that
some intel-iommu fixes have rolled into 3.0.0-rc2; you might want to
update and see if anything is better there.  Thanks,

Alex
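
To make the suspected failure mode concrete, here is a minimal userspace
sketch. It is not the intel-iommu code: every name in it (toy_domain,
toy_attach, toy_unbind_notifier, and so on) is hypothetical, and it only
models the behaviour described in this thread under the assumption that
the unbind notifier returns early in passthrough mode and that setup is
skipped for a device pointer already on the domain's list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_DEVS 16

/* Hypothetical stand-in for a domain's device list (not the kernel's). */
struct toy_domain {
    const void *devs[TOY_MAX_DEVS]; /* device pointers currently tracked */
    size_t ndevs;
    bool passthrough;               /* models booting with iommu=pt */
};

enum toy_attach_result {
    TOY_SETUP_DONE,    /* device was new: added and "initialized" */
    TOY_SETUP_SKIPPED, /* pointer already listed: setup not repeated */
    TOY_LIST_FULL
};

/* True if this device pointer is on the domain's list. */
static bool toy_domain_has(const struct toy_domain *d, const void *dev)
{
    assert(d != NULL);
    for (size_t i = 0; i < d->ndevs; i++)
        if (d->devs[i] == dev)
            return true;
    return false;
}

/* Attach a device; a pointer that is already listed gets no fresh setup. */
static enum toy_attach_result toy_attach(struct toy_domain *d, const void *dev)
{
    assert(d != NULL);
    if (toy_domain_has(d, dev))
        return TOY_SETUP_SKIPPED;
    if (d->ndevs == TOY_MAX_DEVS)
        return TOY_LIST_FULL;
    d->devs[d->ndevs++] = dev;
    return TOY_SETUP_DONE;
}

/* Models the unbind notifier: assumed to do nothing in passthrough mode. */
static void toy_unbind_notifier(struct toy_domain *d, const void *dev)
{
    assert(d != NULL);
    if (d->passthrough)
        return; /* the skipped removal discussed above */
    for (size_t i = 0; i < d->ndevs; i++) {
        if (d->devs[i] == dev) {
            d->devs[i] = d->devs[--d->ndevs]; /* swap-remove the entry */
            return;
        }
    }
}
```

In this model, attaching a device, running the unbind notifier, and then
attaching the same pointer again reports TOY_SETUP_SKIPPED when
passthrough is set and TOY_SETUP_DONE when it is not.  That matches the
reported symptom (a reused pdev address skipping initialization), but it
is only an illustration of the hypothesis, not a confirmation of it.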
