Message-ID: <CAErSpo7qj2SNR7BfuBsfF-saZXvgcqqLtgoP5Hyb5u99qR4Ncg@mail.gmail.com>
Date:	Tue, 12 Feb 2013 21:26:44 -0700
From:	Bjorn Helgaas <bhelgaas@...gle.com>
To:	"Artem S. Tashkinov" <t.artem@...os.com>
Cc:	torvalds@...ux-foundation.org, linux-kernel@...r.kernel.org,
	linux-pci@...r.kernel.org, "Rafael J. Wysocki" <rjw@...k.pl>,
	Alan Stern <stern@...land.harvard.edu>
Subject: Re: Abysmal HDD/USB write speed after sleep on a UEFI system

[+cc linux-pci, Rafael, Alan]

[https://bugzilla.kernel.org/show_bug.cgi?id=53551]

On Tue, Feb 12, 2013 at 1:13 PM, Artem S. Tashkinov <t.artem@...os.com> wrote:
> Feb 13, 2013 01:32:53 AM, Linus Torvalds wrote:
> On Tue, Feb 12, 2013 at 10:29 AM, Artem S. Tashkinov wrote:
>>> Feb 12, 2013 11:30:20 PM, Linus Torvalds wrote:
>>>>
>>>>A few things to try to pinpoint:
>>>>
>>>> (a) Is it *only* write performance that suffers, or is it other
>>>>performance too? Networking (DMA? Perhaps only writing *to* the
>>>>network?)? CPU?
>>>
>>> I've tested hdparm -tT --direct and the output on boot and after suspend
>>> is quite similar.
>>>
>>> I've also checked my network read/write speed, and it's the same
>>> ~ 100MBit/sec (I have no 1Gbit computers on my network
>>> unfortunately).
>>
>>Ok. So it really sounds like just USB and HD writes. Which is quite
>>odd, since they have basically nothing in common I can think of
>>(except the obvious block layer issues).
>>
>>>> (b) the fact that it apparently happens with both SATA and USB
>>>>implies that it's neither, and is more likely something core like
>>>>memory speed (mtrr, caching) or PCI (DMA, burst sizes, whatever).
>>>
>>> I've no idea, please, check my bug report where I've just added lots of
>>> information including a diff between on boot and after suspend.
>>
>>I'm not seeing anything particularly interesting there.
>>
>>Except why/how did the MSI address/data change for the SATA
>>controller? The irq itself hasn't changed.. There's probably some sane
>>reason for that too (it's an odd encoding, maybe they code for the
>>same thing), and there's nothing like that for USB, so...
>>
>>And if it was irq problems, I'd expect you to see it more for reads
>>than for writes anyway. Along with a few messages about missed irqs
>>and whatever.
>>
>>I'm stumped, and have no ideas. I can't even begin to guess how this
>>would happen. One thing to try is if it happens for all USB ports (you
>>have multiple controllers) and I assume performance doesn't come back
>>if you unplug and replug the USB disk..
>
> I've just plugged and unplugged my USB stick into all available hubs
> (including a USB3 one, that is xhci_hcd) and I've got the same write speed
> on all of them - around 930KB/sec (quite a weird number - as if I'm on USB
> 1.1) - lsusb says I'm happily running ehci_hcd/2p, 480M and xhci_hcd/2p,
> 5000M.
>
> The only pattern that I see here is that write speed to real devices degrades,
> tmpfs write speed stays the same:
>
> $ dd if=/dev/zero of=test bs=32M count=32
> 32+0 records in
> 32+0 records out
> 1073741824 bytes (1.1 GB) copied, 0.296323 s, 3.6 GB/s

I'm sort of stumped here, too.  For the SATA controller, the only
PCI-related difference I see is the change in the MSI address, which
should just change the target CPU, which doesn't seem like it should
make this much difference.  But could you try this after the resume:

    $ sudo setpci -s00:1f.2 0x84.L=0xfee0400c

to set the MSI address back to the original value to see if it makes a
difference?
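
For reference, a minimal sketch of how the value written above splits
into fields, assuming the standard x86 MSI message address layout
(bits 31:20 fixed at 0xFEE, bits 19:12 destination ID, bit 3
redirection hint, bit 2 destination mode).  The function name
decode_msi_address is made up for illustration; it is not part of
setpci or lspci.

```python
def decode_msi_address(addr: int) -> dict:
    """Split a 32-bit x86 MSI message address into its fields.

    Assumes the conventional x86 layout; other architectures differ.
    """
    return {
        "base": (addr >> 20) & 0xFFF,         # 0xFEE selects the local APIC range
        "dest_id": (addr >> 12) & 0xFF,       # APIC ID, or a logical mask in logical mode
        "redirection_hint": (addr >> 3) & 1,  # RH bit
        "dest_mode": "logical" if (addr >> 2) & 1 else "physical",  # DM bit
    }


# The pre-suspend value from the setpci command above.
original = decode_msi_address(0xFEE0400C)
for field, value in original.items():
    print(field, hex(value) if isinstance(value, int) else value)
```

Running the same function on the address that lspci reports after
resume and comparing the two dictionaries shows which fields actually
changed; if only dest_id differs, that is consistent with the
interrupt simply being routed to a different CPU.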

The XHCI controllers both have Unsupported Request errors logged.  I
assume these are related to the suspend/resume, and it seems like we
ought to either avoid them or clean them up somehow, but I don't know
enough about AER, and I don't know whether they would cause the
performance issue you're seeing.

There should be more AER logging than is decoded by lspci, so can you
also collect the output of "lspci -vvv -xxxx"?  That will include the
raw logging registers that lspci doesn't decode.
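
A rough sketch of how those raw registers could be pulled out of such a
dump, assuming the text holds the hex dump of a single device (for
example from "lspci -s <device> -xxxx") and the standard PCIe AER
layout: extended capability ID 0x0001, Uncorrectable Error Status at
offset 0x04 with Unsupported Request in bit 20, and a four-dword Header
Log at offset 0x1C.  The helper names (parse_hexdump, find_ext_cap,
aer_summary) are hypothetical, not part of pciutils.

```python
import re

AER_CAP_ID = 0x0001             # PCIe extended capability ID for AER
UNSUPPORTED_REQUEST = 1 << 20   # bit 20 of Uncorrectable Error Status


def parse_hexdump(text: str) -> bytes:
    """Turn 'offset: xx xx ...' dump lines into a flat config-space image."""
    data = bytearray()
    for line in text.splitlines():
        m = re.match(r"^([0-9a-fA-F]+):((?: [0-9a-fA-F]{2})+)\s*$", line)
        if not m:
            continue  # skip the decoded lspci text and device header lines
        offset = int(m.group(1), 16)
        chunk = bytes.fromhex(m.group(2).replace(" ", ""))
        if len(data) < offset:
            data.extend(b"\x00" * (offset - len(data)))
        data[offset:offset + len(chunk)] = chunk
    return bytes(data)


def dword(cfg: bytes, offset: int) -> int:
    """Read a little-endian 32-bit register."""
    return int.from_bytes(cfg[offset:offset + 4], "little")


def find_ext_cap(cfg: bytes, cap_id: int):
    """Walk the extended capability list that starts at offset 0x100."""
    offset, seen = 0x100, set()
    while offset >= 0x100 and offset + 4 <= len(cfg) and offset not in seen:
        seen.add(offset)
        header = dword(cfg, offset)
        if header in (0, 0xFFFFFFFF):
            return None                    # no (more) extended capabilities
        if header & 0xFFFF == cap_id:
            return offset
        offset = (header >> 20) & 0xFFC    # next-capability pointer, bits 31:20
    return None


def aer_summary(cfg: bytes):
    """Return the raw AER status and header log, or None if AER is absent."""
    aer = find_ext_cap(cfg, AER_CAP_ID)
    if aer is None:
        return None
    uncorrectable = dword(cfg, aer + 0x04)
    return {
        "aer_offset": aer,
        "uncorrectable_status": uncorrectable,
        "unsupported_request": bool(uncorrectable & UNSUPPORTED_REQUEST),
        "header_log": [dword(cfg, aer + 0x1C + 4 * i) for i in range(4)],
    }
```

Calling aer_summary(parse_hexdump(dump_text)) on the saved output for
one XHCI controller gives the status bits plus the Header Log dwords,
which record the header of the request that triggered the error.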

Bjorn