linux-kernel - Re: 32GB SSD on USB1.1 P3/700 == ___HELL__

Message-ID: <4BBE38B9.6020507@tmr.com>
Date:	Thu, 08 Apr 2010 16:12:41 -0400
From:	Bill Davidsen <davidsen@....com>
To:	Andreas Mohr <andi@...as.de>
CC:	Jens Axboe <axboe@...nel.dk>,
	Wu Fengguang <fengguang.wu@...el.com>,
	linux-kernel@...r.kernel.org
Subject: Re: 32GB SSD on USB1.1 P3/700 == ___HELL___ (2.6.34-rc3)

Andreas Mohr wrote:
> [CC'd some lucky candidates]
> 
> Hello,
> 
> I was just running
> mkfs.ext4 -b 4096 -E stride=128 -E stripe-width=128 -O ^has_journal /dev/sdb2
> on my SSD18M connected via USB1.1, and the result was, well,
> absolutely, positively _DEVASTATING_.
> 
> The entire system became _FULLY_ unresponsive, not even switching back
> down to tty1 via Ctrl-Alt-F1 worked (took 20 seconds for even this key
> to be respected).
> 
> Once back on ttys, invoking any command locked up for minutes
> (note that I'm talking about attempted additional I/O to the _other_,
> _unaffected_ main system HDD - such as loading some shell binaries -,
> NOT the external SSD18M!!).
> 
> Having an attempt at writing a 300M /dev/zero file to the SSD's filesystem
> was even worse (again tons of unresponsiveness), combined with multiple
> OOM conditions flying by (I/O to the main HDD was minimal, its LED was
> almost always _off_, yet everything stuck to an absolute standstill).
> 
> Clearly there's a very, very important limiter somewhere in bio layer
> missing or broken, a 300M dd /dev/zero should never manage to put
> such an onerous penalty on a system, IMHO.
> 
You are using a USB 1.1 connection, which tops out at 12 Mbit/s (roughly 1 MB/s
in practice). If you have not tuned your system to prevent all of the memory
from being used to cache writes, it will be used that way. I don't have my notes
handy, but I believe you need to tune the "dirty" parameters in /proc/sys/vm
(dirty_ratio, dirty_background_ratio and friends) so that dirty pages waiting on
the slow device cannot take over memory.
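
A rough sketch of the arithmetic behind this advice, in Python. It is only an
illustration: the 512 MiB of RAM, the 1 MiB/s USB 1.1 throughput and the
fallback dirty_ratio of 20 are assumptions, not figures from this thread, and
the kernel really applies the ratio to dirtyable memory rather than total RAM.
The helper names are made up for the example.

```python
from pathlib import Path

MiB = 1024 * 1024


def read_vm_tunable(name, default=None):
    """Return the integer stored in /proc/sys/vm/<name>, or default if unreadable."""
    try:
        return int(Path("/proc/sys/vm", name).read_text().split()[0])
    except (OSError, ValueError, IndexError):
        return default


def dirty_limit_bytes(mem_bytes, ratio_pct):
    """Approximate amount of dirty page cache allowed at the given percentage."""
    return mem_bytes * ratio_pct // 100


def drain_seconds(dirty_bytes, device_bytes_per_s):
    """Time needed to write that much dirty data to a device of the given speed."""
    return dirty_bytes / device_bytes_per_s


if __name__ == "__main__":
    mem = 512 * MiB       # assumed RAM size for a P3/700-class machine
    usb_rate = 1 * MiB    # assumed practical USB 1.1 throughput, bytes/s
    # Fall back to an assumed 20% if the tunable cannot be read on this system.
    ratio = read_vm_tunable("dirty_ratio", default=20)
    limit = dirty_limit_bytes(mem, ratio)
    print(f"dirty_ratio={ratio}% -> up to {limit / MiB:.0f} MiB of dirty pages")
    print(f"draining that at {usb_rate / MiB:.0f} MiB/s takes "
          f"about {drain_seconds(limit, usb_rate):.0f} s")
```

Under those assumptions, a 20% limit allows about 102 MiB of dirty data, which
needs well over a minute and a half to drain over the USB 1.1 link. Lowering
dirty_ratio (or setting dirty_bytes) shrinks that backlog proportionally.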

Of course, putting a fast device like an SSD on a super slow connection makes no
sense other than as a test of system behavior on misconfigured machines.
> 
> I've got SysRq-W traces of these lockup conditions if wanted.
> 
> 
> Not sure whether this is a 2.6.34-rc3 thing, might be a general issue.
> 
> Likely the lockup behaviour is a symptom of very high memory pressure.
> But this memory pressure shouldn't even be allowed to happen in the first
> place, since the dd submission rate should immediately get limited by the kernel's
> bio layer / elevators.
> 
> Also, I'm wondering whether perhaps additionally there are some cond_resched()
> to be inserted in some places, to try to improve coping with such a
> broken situation at least.
> 
> Thanks,
> 
> Andreas Mohr


-- 
Bill Davidsen <davidsen@....com>
   "We have more to fear from the bungling of the incompetent than from
the machinations of the wicked."  - from Slashdot