Message-ID: <20090109082250.GQ9448@disturbed>
Date: Fri, 9 Jan 2009 19:22:50 +1100
From: Dave Chinner <david@...morbit.com>
To: Arjan van de Ven <arjan@...ux.intel.com>
Cc: Dave Kleikamp <shaggy@...ux.vnet.ibm.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Grissiom <chaos.proton@...il.com>, linux-kernel@...r.kernel.org,
linux-fsdevel <linux-fsdevel@...r.kernel.org>
Subject: Re: [PATCH] async: Don't call async_synchronize_full_special()
while holding sb_lock
On Thu, Jan 08, 2009 at 08:45:06PM -0800, Arjan van de Ven wrote:
> Dave Chinner wrote:
>
>>
>> So, given the potential impact of this change, what testing have
>> you done in terms of:
>>
>> - performance impact
>
> I tested this on my machines and it gave a real performance
> improvement (11 to 8 seconds for a full kernel tree unlink, and
> cutting out latency for normal applications)
Not really what I'd call a significant performance test. What
happens under serious sustained I/O load? On multiple different
filesystems?
FWIW, my current kernel tree is:
$ find . -print | wc -l
30082
Which is just under the async operation limit you set so this
probably didn't even stress the mixing of sync and async deletes at
the same time...
>> - sync() safety
> that was exactly the synchronization point that's discussed here.
Right - how many fs developers did you discuss the impact of this
with? Did you crash test any filesystems? Did you run any fs QA
suites to check for regressions? How many different filesystems
did you even test?
We already know that sync is busted and doesn't provide the
guarantees it is supposed to on some filesystems. I'm extremely
concerned that changing unlink behaviour in this manner subtly
breaks sync even more than it already is....
>> - removing a million files and queuing all of the
>> deletes in the async queues....
>
> the async code throttles at 32k outstanding.
> Yes 32K is arbitrary, but if you delete a million files fast, all
> but the first few thousand are synchronous.
No, after the first 32k you get a mix of synchronous and async as
the async queue gets processed in parallel. This means you are
screwing up the temporal locality of the unlink of inodes, i.e. some
unlinks are being done immediately, others are being delayed until
another 32k inodes have been processed off the unlink list.
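For illustration, this is roughly the kind of throttling being
described - a sketch only, not the actual kernel/async.c code, and
the names (ASYNC_MAX_PENDING, queue_to_async_threads) are made up
here. Once the pending count passes the limit, new work runs
synchronously in the caller while the queue keeps draining in
parallel, which is what produces the sync/async mix above:

	#define ASYNC_MAX_PENDING	32768	/* arbitrary throttle */

	static atomic_t pending_count = ATOMIC_INIT(0);

	/*
	 * Sketch: queue the work if we're under the throttle,
	 * otherwise run it inline in the caller right now.
	 */
	static void sketch_async_schedule(void (*func)(void *data),
					  void *data)
	{
		if (atomic_read(&pending_count) > ASYNC_MAX_PENDING) {
			/* over the limit: caller does the work itself */
			func(data);
			return;
		}
		atomic_inc(&pending_count);
		/* hypothetical helper that hands off to worker threads;
		 * the worker decrements pending_count on completion */
		queue_to_async_threads(func, data);
	}

So a long unlink run ends up interleaving inline unlinks with ones
that were queued tens of thousands of operations earlier.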
If we've been careful about allocation locality during untar so
that inodes that are allocated sequentially by untar will be
deleted sequentially by rm -rf, then this new pattern will break
those optimisations, because the unlink locality is destroyed by
the some-sync/some-async behaviour that this queuing mechanism
creates.
Hmmmm - I suspect that rm -rf will also trigger lots of kernel
threads to be created. We do not want 256 kernel threads to
be spawned to bang on the serialised regions of filesystem
journals (generating lock contention) when one thread would
be sufficient....
Cheers,
Dave.
--
Dave Chinner
david@...morbit.com