linux-kernel - Re: HDD not suspending properly / dead on resume


Message-Id: <201007101508.11488.rjw@sisk.pl>
Date:	Sat, 10 Jul 2010 15:08:11 +0200
From:	"Rafael J. Wysocki" <rjw@...k.pl>
To:	Stephan Diestelhorst <stephan.diestelhorst@...il.com>
Cc:	Tejun Heo <tj@...nel.org>, linux-kernel@...r.kernel.org,
	linux-ide@...r.kernel.org, linux-pm@...ts.osdl.org,
	stephan.diestelhorst@....com
Subject: Re: HDD not suspending properly / dead on resume

On Saturday, July 10, 2010, Stephan Diestelhorst wrote:
> Rafael J. Wysocki wrote:
> > On Saturday, July 10, 2010, Stephan Diestelhorst wrote:
> > > Rafael J. Wysocki wrote:
> > > > On Friday, July 09, 2010, Stephan Diestelhorst wrote:
> > > > > I wrote:
> > > > > >   I have an issue with suspend to RAM and I/O load on a disk. Symptoms
> > > > > > are that the disk does not respond to requests when woken up, producing
> > > > > > only I/O errors on all tested kernels (newest 2.6.35-rc4 (Ubuntu
> > > > > > mainline PPA build)):
> > > > > > 
> > > > > <snip>
> > > > >  
> > > > > > This can be triggered most reliably with multiple "direct" writes to
> > > > > > disk, I create the load with the attached script. If the issue is
> > > > > > triggered, suspend (through pm-suspend) takes very long.
> > > > > 
> > > > > > IMHO the interesting log output during suspend is:
> > > > > > [ 1674.700125] ata1.00: qc timeout (cmd 0xec)
> > 
> > I have a box where this problem is kind of reproducible, but it happens _very_
> > rarely.  Also I can't reproduce it on demand running suspend-resume in a tight
> > loop.  Are you able to reproduce it more regularly?
> 
> For me it is much more reproducible. If I run multiple direct writing
> dd-s to the disk in question I trigger it rather reliably (~75% or
> higher). See the attached script from an earlier email.
> Maybe that helps triggering your case more reliably, too?
> 
> > Also, what kind of disk do you use?
> 
> It is a Samsung HM321HI in a Samsung Eikee R525 notebook, please also 
> see my smartctl -a log, attached earlier.
> 
> Interesting, I have a similar symptom on one of my home servers,
> which has a *Samsung* SpinPoint F1 and it went away with different
> disks. So maybe these disks are either faulty themselves or they
> trigger the issue more often?

They may be doing something that causes the issue to appear.

That said, on my test box this only happens during suspend and it's an Intel
SSD (INTEL SSDSA2M080G2GC, 2CV102HD to be precise).

> I also have a LVM on top of LUKS on the disk. So the I/O will also
> add some computational overhead for encryption.

There are only ext3/ext4 partitions on the disk in my case.

Rafael
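
[Editorial sketch, not part of the original message.] The load described
in the quoted text (several dd processes writing to the disk with direct
I/O) could be set up roughly as below. This is not the script attached
earlier in the thread, which is not reproduced here. The function name
build_dd_commands, the target directory, the file names, and the writer
count, block size and block count are all assumptions chosen for
illustration. The helper only builds the command lines; it does not run
them.

```python
import shlex
from pathlib import Path


def build_dd_commands(target_dir, writers=4, block_size="1M", count=1024):
    """Return one dd argument list per writer.

    Each command writes zeros to its own file and passes oflag=direct
    (GNU dd) so the writes bypass the page cache. All defaults are
    illustrative assumptions, not values taken from the thread.
    """
    commands = []
    for i in range(writers):
        out = Path(target_dir) / f"ddload-{i}.bin"  # hypothetical file name
        commands.append([
            "dd",
            "if=/dev/zero",
            f"of={out}",
            f"bs={block_size}",
            f"count={count}",
            "oflag=direct",
        ])
    return commands


if __name__ == "__main__":
    # Print the commands for inspection; /mnt/test is a placeholder path.
    for argv in build_dd_commands("/mnt/test"):
        print(shlex.join(argv))
```

To generate load, the printed commands would be started in parallel on
the disk under test before invoking pm-suspend. Whether this triggers the
timeout on a given machine is untested here.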