Date:	Wed, 2 Mar 2016 16:43:30 -0500
From:	Theodore Ts'o <tytso@....edu>
To:	Benjamin LaHaise <bcrl@...ck.org>
Cc:	linux-ext4@...r.kernel.org
Subject: Re: ext4 bug: getdents uninterruptible for 117 seconds

On Wed, Mar 02, 2016 at 12:15:11PM -0500, Benjamin LaHaise wrote:
> Hi folks,
> 
> While working on a bug involving write starvation, the test I was running 
> managed to trigger some pretty horrific worst case behaviour in ext4.  The 
> filesystem I'm working on is about 4TB in size, and is used for storing a 
> number of spool files across 100 subdirectories in the filesystem.  One of 
> these subdirectories ended up growing to ~497MB in size.  Once all of the 
> files were removed from these directories, the filesystem was unmounted.  
> On subsequent mounts of the filesystem, it became apparent that whenever 
> a specific directory was accessed using ls or find, the kernel would block 
> in getdents() for north of 117 seconds.  It is clear that ext4 is slowly 
> reading the entire contents of the directory into memory during this time 
> at a rate of ~4MB/s.  This filesystem is being stored on an external 8Gbps 
> FC SAN comprised of about 8 x 10Krpm spindles.
> 
> I've placed a copy of the e2image for the filesystem at 
> http://www.kvack.org/~bcrl/ext4/ext4-readdir.img.xz .  The problematic 
> directory is broken/1.  The relevant snippet of strace output is below.  
> Thoughts?

Yes, this is a known problem.  Right now we don't have a way of
removing empty directory blocks from a directory.  This can be fixed
up by running "e2fsck -fD /dev/sdXX" off-line, but it's not terribly
satisfying.
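
As a rough illustration of how one might spot directories in this state
before scheduling the off-line fsck, here is a minimal Python sketch.  It
is not part of e2fsprogs; the function name and the 64 MiB threshold are
made up for this example.  It relies only on the fact that the directory's
own size does not shrink when entries are removed, and it stats the
directory inode without reading the directory, so it does not itself
block in getdents().

```python
import os
import stat


def oversized_dirs(paths, threshold_bytes=64 * 1024 * 1024):
    """Return (path, size) pairs for directories whose own on-disk size
    exceeds threshold_bytes, largest first.

    The name and the default threshold are hypothetical, chosen only
    for illustration.
    """
    hits = []
    for path in paths:
        # lstat() reads the inode only; the directory blocks are not scanned.
        st = os.lstat(path)
        if stat.S_ISDIR(st.st_mode) and st.st_size > threshold_bytes:
            hits.append((path, st.st_size))
    return sorted(hits, key=lambda item: item[1], reverse=True)
```

Any directory this reports while being nearly empty is a candidate for
the "e2fsck -fD" pass described above.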

There are things we could do in theory to try to make things better, but
they haven't been implemented yet.  In practice these cases tend to happen
with pathological workloads, but they do happen occasionally in real
life.  It's just not something we've had time to address up until now.

                                        - Ted