Message-ID: <20120926010154.49cc2588@brain.lan>
Date:	Wed, 26 Sep 2012 01:01:54 +0200
From:	Florian Dazinger <florian@...inger.net>
To:	Alex Williamson <alex.williamson@...hat.com>
Cc:	linux-kernel@...r.kernel.org,
	"Roedel, Joerg" <Joerg.Roedel@....com>,
	iommu <iommu@...ts.linux-foundation.org>
Subject: Re: 3.6-rc7 boot crash + bisection

On Tue, 25 Sep 2012 13:43:46 -0600,
Alex Williamson <alex.williamson@...hat.com> wrote:

> On Tue, 2012-09-25 at 20:54 +0200, Florian Dazinger wrote:
> > On Tue, 25 Sep 2012 12:32:50 -0600,
> > Alex Williamson <alex.williamson@...hat.com> wrote:
> > 
> > > On Mon, 2012-09-24 at 21:03 +0200, Florian Dazinger wrote:
> > > > Hi,
> > > > I think I've found a regression that causes an early boot crash. I
> > > > attached the kernel output as a JPG file, since I do not have a serial
> > > > console or similar.
> > > > 
> > > > after bisection, it boils down to this commit:
> > > > 
> > > > 9dcd61303af862c279df86aa97fde7ce371be774 is the first bad commit
> > > > commit 9dcd61303af862c279df86aa97fde7ce371be774
> > > > Author: Alex Williamson <alex.williamson@...hat.com>
> > > > Date:   Wed May 30 14:19:07 2012 -0600
> > > > 
> > > >     amd_iommu: Support IOMMU groups
> > > >     
> > > >     Add IOMMU group support to AMD-Vi device init and uninit code.
> > > >     Existing notifiers make sure this gets called for each device.
> > > >     
> > > >     Signed-off-by: Alex Williamson <alex.williamson@...hat.com>
> > > >     Signed-off-by: Joerg Roedel <joerg.roedel@....com>
> > > > 
> > > > :040000 040000 2f6b1b8e104d6dfec0abaa9646750f9b5a4f4060 837ae95e84f6d3553457c4df595a9caa56843c03 M      drivers
> > > 
> > > [switching back to mailing list thread]
> > > 
> > > I asked Florian for dmesg w/ amd_iommu_dump, here's the relevant lines:
> > > 
> > > [    1.485645] AMD-Vi: device: 00:00.2 cap: 0040 seg: 0 flags: 3e info 1300
> > > [    1.485683] AMD-Vi:        mmio-addr: 00000000feb20000
> > > [    1.485901] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 00:00.0 flags: 00
> > > [    1.485935] AMD-Vi:   DEV_RANGE_END           devid: 00:00.2
> > > [    1.485969] AMD-Vi:   DEV_SELECT                      devid: 00:02.0 flags: 00
> > > [    1.486002] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 01:00.0 flags: 00
> > > [    1.486036] AMD-Vi:   DEV_RANGE_END           devid: 01:00.1
> > > [    1.486070] AMD-Vi:   DEV_SELECT                      devid: 00:04.0 flags: 00
> > > [    1.486103] AMD-Vi:   DEV_SELECT                      devid: 02:00.0 flags: 00
> > > [    1.486137] AMD-Vi:   DEV_SELECT                      devid: 00:05.0 flags: 00
> > > [    1.486170] AMD-Vi:   DEV_SELECT                      devid: 03:00.0 flags: 00
> > > [    1.486204] AMD-Vi:   DEV_SELECT                      devid: 00:06.0 flags: 00
> > > [    1.486238] AMD-Vi:   DEV_SELECT                      devid: 04:00.0 flags: 00
> > > [    1.486271] AMD-Vi:   DEV_SELECT                      devid: 00:07.0 flags: 00
> > > [    1.486305] AMD-Vi:   DEV_SELECT                      devid: 05:00.0 flags: 00
> > > [    1.486338] AMD-Vi:   DEV_SELECT                      devid: 00:09.0 flags: 00
> > > [    1.486372] AMD-Vi:   DEV_SELECT                      devid: 06:00.0 flags: 00
> > > [    1.486406] AMD-Vi:   DEV_SELECT                      devid: 00:0b.0 flags: 00
> > > [    1.486439] AMD-Vi:   DEV_SELECT                      devid: 07:00.0 flags: 00
> > > [    1.486473] AMD-Vi:   DEV_ALIAS_RANGE                 devid: 08:01.0 flags: 00 devid_to: 08:00.0
> > > [    1.486510] AMD-Vi:   DEV_RANGE_END           devid: 08:1f.7
> > > [    1.486548] AMD-Vi:   DEV_SELECT                      devid: 00:11.0 flags: 00
> > > [    1.486581] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 00:12.0 flags: 00
> > > [    1.486620] AMD-Vi:   DEV_RANGE_END           devid: 00:12.2
> > > [    1.486654] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 00:13.0 flags: 00
> > > [    1.486688] AMD-Vi:   DEV_RANGE_END           devid: 00:13.2
> > > [    1.486721] AMD-Vi:   DEV_SELECT                      devid: 00:14.0 flags: d7
> > > [    1.486755] AMD-Vi:   DEV_SELECT                      devid: 00:14.3 flags: 00
> > > [    1.486788] AMD-Vi:   DEV_SELECT                      devid: 00:14.4 flags: 00
> > > [    1.486822] AMD-Vi:   DEV_ALIAS_RANGE                 devid: 09:00.0 flags: 00 devid_to: 00:14.4
> > > [    1.486859] AMD-Vi:   DEV_RANGE_END           devid: 09:1f.7
> > > [    1.486897] AMD-Vi:   DEV_SELECT                      devid: 00:14.5 flags: 00
> > > [    1.486931] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 00:16.0 flags: 00
> > > [    1.486965] AMD-Vi:   DEV_RANGE_END           devid: 00:16.2
> > > [    1.487055] AMD-Vi: Enabling IOMMU at 0000:00:00.2 cap 0x40
> > > 
> > > 
> > > > lspci:
> > > > 00:00.0 Host bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI bridge (external gfx0 port B) (rev 02)
> > > > 00:00.2 IOMMU: Advanced Micro Devices [AMD] nee ATI RD990 I/O Memory Management Unit (IOMMU)
> > > > 00:02.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI bridge (PCI express gpp port B)
> > > > 00:04.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI bridge (PCI express gpp port D)
> > > > 00:05.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI bridge (PCI express gpp port E)
> > > > 00:06.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI bridge (PCI express gpp port F)
> > > > 00:07.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI bridge (PCI express gpp port G)
> > > > 00:09.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI bridge (PCI express gpp port H)
> > > > 00:0b.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI bridge (NB-SB link)
> > > > 00:11.0 SATA controller: Advanced Micro Devices [AMD] nee ATI SB7x0/SB8x0/SB9x0 SATA Controller [AHCI mode] (rev 40)
> > > > 00:12.0 USB controller: Advanced Micro Devices [AMD] nee ATI SB7x0/SB8x0/SB9x0 USB OHCI0 Controller
> > > > 00:12.2 USB controller: Advanced Micro Devices [AMD] nee ATI SB7x0/SB8x0/SB9x0 USB EHCI Controller
> > > > 00:13.0 USB controller: Advanced Micro Devices [AMD] nee ATI SB7x0/SB8x0/SB9x0 USB OHCI0 Controller
> > > > 00:13.2 USB controller: Advanced Micro Devices [AMD] nee ATI SB7x0/SB8x0/SB9x0 USB EHCI Controller
> > > > 00:14.0 SMBus: Advanced Micro Devices [AMD] nee ATI SBx00 SMBus Controller (rev 42)
> > > > 00:14.3 ISA bridge: Advanced Micro Devices [AMD] nee ATI SB7x0/SB8x0/SB9x0 LPC host controller (rev 40)
> > > > 00:14.4 PCI bridge: Advanced Micro Devices [AMD] nee ATI SBx00 PCI to PCI Bridge (rev 40)
> > > > 00:14.5 USB controller: Advanced Micro Devices [AMD] nee ATI SB7x0/SB8x0/SB9x0 USB OHCI2 Controller
> > > > 00:16.0 USB controller: Advanced Micro Devices [AMD] nee ATI SB7x0/SB8x0/SB9x0 USB OHCI0 Controller
> > > > 00:16.2 USB controller: Advanced Micro Devices [AMD] nee ATI SB7x0/SB8x0/SB9x0 USB EHCI Controller
> > > > 00:18.0 Host bridge: Advanced Micro Devices [AMD] Family 10h Processor HyperTransport Configuration
> > > > 00:18.1 Host bridge: Advanced Micro Devices [AMD] Family 10h Processor Address Map
> > > > 00:18.2 Host bridge: Advanced Micro Devices [AMD] Family 10h Processor DRAM Controller
> > > > 00:18.3 Host bridge: Advanced Micro Devices [AMD] Family 10h Processor Miscellaneous Control
> > > > 00:18.4 Host bridge: Advanced Micro Devices [AMD] Family 10h Processor Link Control
> > > > 01:00.0 VGA compatible controller: Advanced Micro Devices [AMD] nee ATI RV730XT [Radeon HD 4670]
> > > > 01:00.1 Audio device: Advanced Micro Devices [AMD] nee ATI RV710/730 HDMI Audio [Radeon HD 4000 series]
> > > > 02:00.0 SATA controller: ASMedia Technology Inc. ASM1062 Serial ATA Controller (rev 01)
> > > > 03:00.0 Ethernet controller: Intel Corporation 82583V Gigabit Network Connection
> > > > 04:00.0 USB controller: ASMedia Technology Inc. ASM1042 SuperSpeed USB Host Controller
> > > > 05:00.0 USB controller: ASMedia Technology Inc. ASM1042 SuperSpeed USB Host Controller
> > > > 06:00.0 USB controller: ASMedia Technology Inc. ASM1042 SuperSpeed USB Host Controller
> > > > 07:00.0 PCI bridge: PLX Technology, Inc. PEX8112 x1 Lane PCI Express-to-PCI Bridge (rev aa)
> > > > 08:04.0 Multimedia audio controller: C-Media Electronics Inc CMI8788 [Oxygen HD Audio]
> > > 
> > > We can see this is clearly wrong:
> > > 
> > > [    1.486473] AMD-Vi:   DEV_ALIAS_RANGE                 devid: 08:01.0 flags: 00 devid_to: 08:00.0
> > > [    1.486510] AMD-Vi:   DEV_RANGE_END           devid: 08:1f.7
> > > 
> > > So the BIOS is telling us to alias everything in the range of 08:01.0 to
> > > 08:1f.7 to device id 08:00.0, which doesn't exist :(  Can you send lspci
> > > -vvv?  I suspect we'll find that 07:00.0 sources bus 08 and that alias
> > > should really be to 07:00.0 instead of 08:00.0.  Please also provide
> > > dmidecode for this system, we may need to create a quirk for this box.
> > > Thanks,
> 
> [corrected alias and range in text above, adding iommu list]
> 
> > 00:0b.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI bridge (NB-SB link) (prog-if 00 [Normal decode])
> > 	Bus: primary=00, secondary=07, subordinate=08, sec-latency=0
> > 	Capabilities: [58] Express (v2) Root Port (Slot+), MSI 00
> 
> 
> > 07:00.0 PCI bridge: PLX Technology, Inc. PEX8112 x1 Lane PCI Express-to-PCI Bridge (rev aa) (prog-if 00 [Normal decode])
> > 	Bus: primary=07, secondary=08, subordinate=08, sec-latency=32
> > 	Capabilities: [60] Express (v1) PCI/PCI-X Bridge, MSI 00
> 
> > 08:04.0 Multimedia audio controller: C-Media Electronics Inc CMI8788 [Oxygen HD Audio]
> > 	Subsystem: ASUSTeK Computer Inc. Virtuoso 100 (Xonar Essence STX)
> > 	Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
> > 	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
> > 	Latency: 32 (500ns min, 6000ns max)
> > 	Interrupt: pin A routed to IRQ 32
> > 	Region 0: I/O ports at b000 [size=256]
> > 	Capabilities: [c0] Power Management version 2
> > 		Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
> > 		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
> > 	Kernel driver in use: snd_virtuoso
> > 
> 
> Yep, my guess appears correct, the alias should be to device 07:00.0.
> It looks like this is a x1 PCIe card, so I think that PLX bridge is on
> the card.  The system probably boots fine if you remove the audio card
> (or of course with amd_iommu=off).  It looks like there is one rev newer
> BIOS for this motherboard; we should probably exhaust the possibility
> that this bug has already been fixed in BIOS 1503 before we implement a
> quirk.  Can you test this?
> 
> Joerg, any thoughts on a quirk for this?  Unfortunately we can't just
> skip IOMMU groups when an alias is broken because it puts the other
> IOMMU groups at risk that might not actually be isolated from this
> device.  It looks like we parse the alias info before PCI is probed, so
> maybe we'd need to call the quirk from iommu_init_device itself.
> Thanks,
> 
> Alex
> 
> 

Alex,
you're right: either booting with "amd_iommu=off" or removing the audio card makes the failure disappear. I will test the new BIOS revision tomorrow.
thanks, Florian
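
The alias problem discussed in this thread can be reproduced outside the kernel with a small user-space check. The sketch below is only an illustration, not the kernel's code and not the quirk Alex and Joerg may end up writing. It encodes "bus:dev.fn" strings as 16-bit device IDs (bus in the high byte, then 5 bits of device and 3 bits of function), takes the two DEV_ALIAS_RANGE entries from the dmesg output, and flags any alias target that is not a device present in lspci. The names parse_bdf, format_bdf and find_broken_aliases are hypothetical, the device set is a hand-copied subset of the lspci output, and the "suggest the bridge whose secondary bus matches" step is an assumed heuristic based on Alex's guess that 07:00.0 sources bus 08.

```python
import re

# Hand-copied subset of the lspci output quoted in this thread.
PRESENT = {"00:0b.0", "00:14.4", "07:00.0", "08:04.0"}

# Bridge -> secondary bus number, from the "lspci -vvv" excerpt.
SECONDARY_BUS = {"00:0b.0": 0x07, "07:00.0": 0x08}

# (range start, range end, alias target) from the DEV_ALIAS_RANGE lines.
ALIAS_RANGES = [
    ("08:01.0", "08:1f.7", "08:00.0"),
    ("09:00.0", "09:1f.7", "00:14.4"),
]

_BDF = re.compile(r"([0-9a-f]{2}):([0-9a-f]{2})\.([0-7])")


def parse_bdf(text):
    """Convert 'bus:dev.fn' (hex fields) into a 16-bit device ID."""
    match = _BDF.fullmatch(text.strip().lower())
    if match is None:
        raise ValueError("not a bus:dev.fn string: %r" % text)
    bus = int(match.group(1), 16)
    dev = int(match.group(2), 16)
    fn = int(match.group(3), 16)
    if dev > 0x1F:
        raise ValueError("device number out of range: %r" % text)
    return (bus << 8) | (dev << 3) | fn


def format_bdf(devid):
    """Inverse of parse_bdf."""
    return "%02x:%02x.%x" % (devid >> 8, (devid >> 3) & 0x1F, devid & 0x7)


def find_broken_aliases(alias_ranges, present, secondary_bus):
    """Return (start, end, target, suggestion) for aliases to absent devices.

    The suggestion is the single bridge whose secondary bus equals the bus
    of the bogus target, or None if there is no unique match. This is an
    assumed heuristic for illustration only.
    """
    present_ids = {parse_bdf(dev) for dev in present}
    broken = []
    for start, end, target in alias_ranges:
        target_id = parse_bdf(target)
        if target_id in present_ids:
            continue  # alias points at a real device, nothing to report
        target_bus = target_id >> 8
        candidates = [bridge for bridge, sec in secondary_bus.items()
                      if sec == target_bus]
        suggestion = candidates[0] if len(candidates) == 1 else None
        broken.append((start, end, target, suggestion))
    return broken


if __name__ == "__main__":
    for start, end, target, fix in find_broken_aliases(
            ALIAS_RANGES, PRESENT, SECONDARY_BUS):
        print("alias %s-%s -> %s: target not present, candidate: %s"
              % (start, end, target, fix))
```

With the data from this system, only the 08:01.0 to 08:1f.7 range is flagged, and the only bridge with secondary bus 08 is the PLX bridge at 07:00.0. The 09:00.0 to 09:1f.7 range passes because its target 00:14.4 exists.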
