lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [day] [month] [year] [list]
Message-ID: <6278d2220704200419p6f035769p654caa23e9a724e9@mail.gmail.com>
Date:	Fri, 20 Apr 2007 12:19:24 +0100
From:	"Daniel J Blueman" <daniel.blueman@...il.com>
To:	"Stephen Hemminger" <shemminger@...ux-foundation.org>
Cc:	"Linux Networking" <linux-net@...r.kernel.org>,
	"Linux Kernel" <linux-kernel@...r.kernel.org>
Subject: Re: sky2 X86-64 PCI synchronization problems

Stephen,

>From your description of the problem, it sounds just like writes to
PCI memory-mapped address space are being delayed and/or reordered in
the processor when in x86-64 mode.

I'd suggest to add wmb()/rmb() or (overkill) mb() calls after MMIO
stores to flush writes immediately and force give the intended
ordering; often drivers will issue a dummy read, which flushes writes
(but is far less efficient than a 'wmb', due to the PCI bus
read-latency). Also, if a PCI MMIO page is marked write-combining, the
processor will delay writes and burst them in up to 64-byte blocks,
which can be very good for getting maximal performance, but needs
careful considerations.

On the ia64 platform, the maximum time MMIO writes can be waiting in
the processor's write buffers, is something of the order of 2uS (!),
and MMIO store semantics are out-of-order, so read/write barriers are
critical.

Thanks,
  Dan

Stephen Hemminger wrote:
> I am testing a Gigabyte 965P-S3 motherboard with onboard Marvell
> 88E8056 Ethernet controller (sky2 driver). The CPU is a Core-2 Duo.
> Strange errors occur under moderate load with X86-64 kernel.
> Surprisingly, with i386 kernel the controller runs fine without errors.
>
> These look bus/PCI related because:
> * there is also 88E8052 on an PCI-E X1 card, it has no problem.
>   Similar chip but different bus interface internally (no ram buffer).
>
> * the errors look like the controller didn't access memory correctly:
> 	1. receive status arrives but data buffer not updated
> 	2. transmit descriptor errors, implying that controller read
>            a non-initialized memory.
>   the command blocks look okay, it's like the device didn't read them.
>
> It looks like a PCI bus synchronization issue. Any ideas for driver
> workarounds?
>
> This is the working controller:
>
> 03:00.0 Ethernet controller: Marvell Technology Group Ltd. 88E8052 PCI-E ASF Gigabit Ethernet Controller (rev 22)
> 	Subsystem: Marvell Technology Group Ltd. Marvell RDK-8052
> 	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
> 	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR-
> 	Latency: 0, Cache Line Size: 32 bytes
> 	Interrupt: pin A routed to IRQ 217
> 	Region 0: Memory at f8000000 (64-bit, non-prefetchable) [size=16K]
> 	Region 2: I/O ports at 7000 [size=256]
> 	[virtual] Expansion ROM at 80000000 [disabled] [size=128K]
> 	Capabilities: [48] Power Management version 2
> 		Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
> 		Status: D0 PME-Enable- DSel=0 DScale=1 PME-
> 	Capabilities: [50] Vital Product Data
> 	Capabilities: [5c] Message Signalled Interrupts: Mask- 64bit+ Queue=0/1 Enable+
> 		Address: 00000000fee0300c  Data: 417a
> 	Capabilities: [e0] Express Legacy Endpoint IRQ 0
> 		Device: Supported: MaxPayload 128 bytes, PhantFunc 0, ExtTag-
> 		Device: Latency L0s unlimited, L1 unlimited
> 		Device: AtnBtn- AtnInd- PwrInd-
> 		Device: Errors: Correctable- Non-Fatal- Fatal- Unsupported-
> 		Device: RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
> 		Device: MaxPayload 128 bytes, MaxReadReq 512 bytes
> 		Link: Supported Speed 2.5Gb/s, Width x1, ASPM L0s, Port 0
> 		Link: Latency L0s <256ns, L1 unlimited
> 		Link: ASPM Disabled RCB 128 bytes CommClk- ExtSynch-
> 		Link: Speed 2.5Gb/s, Width x1
> 	Capabilities: [100] Advanced Error Reporting
> 00: ab 11 60 43 07 04 10 00 22 00 00 02 08 00 00 00
> 10: 04 00 00 f8 00 00 00 00 01 70 00 00 00 00 00 00
> 20: 00 00 00 00 00 00 00 00 00 00 00 00 ab 11 21 52
> 30: 00 00 00 00 48 00 00 00 00 00 00 00 0b 01 00 00
>
>
> This is the non-working controller:
>
> 05:00.0 Ethernet controller: Marvell Technology Group Ltd. 88E8056 PCI-E Gigabit Ethernet Controller (rev 12)
> 	Subsystem: Giga-byte Technology Unknown device e000
> 	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
> 	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR-
> 	Latency: 0, Cache Line Size: 32 bytes
> 	Interrupt: pin A routed to IRQ 216
> 	Region 0: Memory at fa000000 (64-bit, non-prefetchable) [size=16K]
> 	Region 2: I/O ports at a000 [size=256]
> 	[virtual] Expansion ROM at 80100000 [disabled] [size=128K]
> 	Capabilities: [48] Power Management version 3
> 		Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
> 		Status: D0 PME-Enable- DSel=0 DScale=1 PME-
> 	Capabilities: [50] Vital Product Data
> 	Capabilities: [5c] Message Signalled Interrupts: Mask- 64bit+ Queue=0/0 Enable+
> 		Address: 00000000fee0300c  Data: 4182
> 	Capabilities: [e0] Express Legacy Endpoint IRQ 0
> 		Device: Supported: MaxPayload 128 bytes, PhantFunc 0, ExtTag-
> 		Device: Latency L0s unlimited, L1 unlimited
> 		Device: AtnBtn- AtnInd- PwrInd-
> 		Device: Errors: Correctable- Non-Fatal- Fatal- Unsupported-
> 		Device: RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
> 		Device: MaxPayload 128 bytes, MaxReadReq 512 bytes
> 		Link: Supported Speed 2.5Gb/s, Width x1, ASPM L0s L1, Port 0
> 		Link: Latency L0s <256ns, L1 unlimited
> 		Link: ASPM Disabled RCB 128 bytes CommClk- ExtSynch-
> 		Link: Speed 2.5Gb/s, Width x1
> 	Capabilities: [100] Advanced Error Reporting
> 00: ab 11 64 43 07 04 10 00 12 00 00 02 08 00 00 00
> 10: 04 00 00 fa 00 00 00 00 01 a0 00 00 00 00 00 00
> 20: 00 00 00 00 00 00 00 00 00 00 00 00 58 14 00 e0
> 30: 00 00 00 00 48 00 00 00 00 00 00 00 0e 01 00 00
-- 
Daniel J Blueman
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ