Date:	Fri, 15 May 2009 14:25:41 -0400
From:	Theodore Tso <tytso@....edu>
To:	Timo Sirainen <tss@....fi>
Cc:	Josef Bacik <josef@...icpanda.com>, linux-kernel@...r.kernel.org
Subject: Re: ext3/ext4 directories don't shrink after deleting lots of files

On Fri, May 15, 2009 at 01:29:04PM -0400, Timo Sirainen wrote:
> On Fri, 2009-05-15 at 06:58 -0400, Theodore Tso wrote:
> > > I was rather thinking something that I could run while the system was  
> > > fully operational. Otherwise just moving the files to a temp directory + 
> > > rmdir() + rename() would have been fine too.
> > >
> > > I just tested that xfs, jfs and reiserfs all shrink the directories  
> > > immediately. Is it more difficult to implement for ext* or has no one  
> > > else found this to be a problem?
> > 
> > It's probably fairest to say no one has thought it worth the effort.
> 
> My problem is with mail servers and the Maildir format, where it's
> possible that a user has tons of emails and wants to delete them. The
> mailbox may slowly grow back to its huge size, but in the meantime
> it's slower than necessary.

The problem is that unless the user is deleting a *huge* number of
files, it's rare that a directory entry block goes completely empty.
If you shrink from 15,000 messages to 12,000 messages, say, the leaf
blocks generally still contain some directory entries, because we use
a hashed b-tree as our data structure and the deletions end up spread
across its leaves.  So to fix this we need to actually coalesce
directory leaf blocks on the fly, on top of everything else that I
had mentioned.  It's certainly doable, but again, someone would have
to submit a patch.  We might get around to it one of these days, but
the plates of those of us who are doing ext4 are pretty full with
higher-priority items at present.
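
To illustrate why so few leaf blocks empty out, here is a toy model in
Python. It is only a sketch under loud assumptions: names are assigned
to a fixed number of leaves by SHA-1 modulo the leaf count (the real
ext3/ext4 directory index uses its own hash and splits leaves by hash
range), about 60 entries fit in a leaf, and the oldest messages are the
ones deleted. The names leaf_of, emptied_leaves and per_leaf are
hypothetical and do not come from the kernel.

```python
import hashlib


def leaf_of(name, n_leaves):
    """Map a file name to a leaf index (stand-in for the real dir hash)."""
    digest = hashlib.sha1(name.encode()).digest()
    return int.from_bytes(digest[:4], "big") % n_leaves


def emptied_leaves(total, deleted, per_leaf=60):
    """Return (leaves in use before deletion, leaves left empty after)."""
    n_leaves = max(1, total // per_leaf)  # assumed average fill per leaf
    names = [f"msg{i}" for i in range(total)]

    counts = [0] * n_leaves
    for name in names:
        counts[leaf_of(name, n_leaves)] += 1
    before = list(counts)

    # Delete the oldest messages; their hashes land all over the tree.
    for name in names[:deleted]:
        counts[leaf_of(name, n_leaves)] -= 1

    in_use = sum(1 for c in before if c > 0)
    emptied = sum(1 for b, a in zip(before, counts) if b > 0 and a == 0)
    return in_use, emptied


if __name__ == "__main__":
    in_use, emptied = emptied_leaves(total=15000, deleted=3000)
    print(f"{emptied} of {in_use} leaves became empty")
```

In this model a leaf only becomes free when every one of its entries
happens to be deleted, which is why freeing empty blocks alone would
reclaim little and leaf coalescing is needed.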

There is an off-line fix that works quite well -- e2fsck -fD, but
obviously that requires scheduling downtime.
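
For comparison, the workaround quoted earlier in this thread (move the
files to a temporary directory, rmdir(), then rename()) could be
sketched in Python as below. This is a minimal sketch, not a
recommendation: it assumes nothing else touches the directory while it
runs (which is exactly why it does not suit a live mail server), that
the directory contains no entries that cannot be renamed, and it does
not restore ownership. The function name rebuild_directory and the
".rebuild" suffix are hypothetical.

```python
import os
import stat


def rebuild_directory(path):
    """Recreate `path` as a fresh directory holding the same entries."""
    path = os.path.abspath(path)
    mode = stat.S_IMODE(os.stat(path).st_mode)

    # Sibling directory on the same filesystem, so rename() stays cheap.
    tmp = path + ".rebuild"
    os.mkdir(tmp, 0o700)

    # Move every entry into the new, compact directory.
    for name in os.listdir(path):
        os.rename(os.path.join(path, name), os.path.join(tmp, name))

    os.rmdir(path)       # old, oversized directory is now empty
    os.rename(tmp, path)  # new directory takes over the old name
    os.chmod(path, mode)  # restore the original permission bits
```

Between the rmdir() and the final rename() the original path briefly
does not exist, so any process that opens it in that window will fail.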

How big of a deal is this for you?  I use a local maildir myself, and
they can get quite large:

% ls /home/tytso/isync/mit
total 2132
1412 cur/   716 new/     4 tmp/

But once they are in cache, it's no longer a major problem.  I suppose
on a mail server where you have a very large number of users, caching
2 megs of directory data per user could get ugly; and it does take
time the first time you pull their directory entry into the cache.
What sort of performance degradation are you measuring, and what is
the operational impact for you at the moment?  Is this just a
theoretical concern, or are you measuring a significant slowdown as a
result?
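
One rough way to put numbers on the questions above is to record the
size of the directory itself, the number of entries it holds, and how
long one pass over it takes. The Python sketch below does that; the
name dir_stats is hypothetical, and the timing only says something
about the cold-cache cost if the directory is not already cached (for
example right after boot, or after root has dropped the caches).

```python
import os
import time


def dir_stats(path):
    """Return directory size in bytes, entry count, and listing time."""
    dir_bytes = os.stat(path).st_size  # size of the directory file itself

    start = time.perf_counter()
    with os.scandir(path) as entries:
        count = sum(1 for _ in entries)
    elapsed = time.perf_counter() - start

    return {"dir_bytes": dir_bytes, "entries": count, "list_seconds": elapsed}


if __name__ == "__main__":
    import sys

    for directory in sys.argv[1:]:
        print(directory, dir_stats(directory))
```

A directory whose dir_bytes stays large while entries is small is one
that has not shrunk after deletions.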

	    	     	    		    - Ted