Message-id: <4D3C76BE.3090908@shiftmail.org>
Date:	Sun, 23 Jan 2011 19:43:10 +0100
From:	torn5 <torn5@...ftmail.org>
To:	Ted Ts'o <tytso@....edu>
Cc:	torn5 <torn5@...ftmail.org>, Josef Bacik <josef@...hat.com>,
	Jon Leighton <j@...athanleighton.com>,
	linux-ext4@...r.kernel.org
Subject: Re: Severe slowdown caused by jbd2 process

On 01/23/2011 06:17 AM, Ted Ts'o wrote:
>
>> that's why a fakefsync mount option would be nice to have.
>>      
> Yes, except the file system developers don't want to take on the moral
> liability of system administrators using such a mount option
> incorrectly.

I understand.

> The fsync waits for all data to be sent to disk.  It has to; since we
> can't easily, given the current disk protocols, distinguish between
> the 5 MB of I/O that pertains to file A which is being fsync'ed, but
> not the 20 MB of I/O pertaining to file B which is going on in the
> background.

So it's a queue drain + cache flush, right?

> There is a way, for some newer disk drives, to do what's
> called a FUA (Force Unit Attention) ...
>    

I thought it was possible via the completion notifications from the disk.
AFAIK, if a disk is in NCQ mode it will return completion for a command
only when the write has really been delivered to the platters, while in
non-NCQ mode the disk returns completion immediately and caches the
write. Is this correct?

Oh, OK, but that's not the problem. I understand now: the problem is that
you want to see all 5 MB of data delivered to the platters, not just one
write command...
So the only way is a queue drain.
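
To see the effect in practice, one could time an fsync of a small file,
first on an idle system and then while a large copy runs in the
background. The following is only a minimal sketch in Python; the helper
name timed_fsync and the 4 KiB payload are my own choices for
illustration, not anything from ext4 or jbd2.

```python
import os
import tempfile
import time


def timed_fsync(path: str, payload: bytes) -> float:
    """Write payload to path and return how long fsync() took, in seconds.

    Hypothetical helper for illustration only.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        os.write(fd, payload)
        start = time.monotonic()
        os.fsync(fd)  # blocks until the kernel reports the data as written out
        return time.monotonic() - start
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        elapsed = timed_fsync(os.path.join(tmp, "small_file"), b"x" * 4096)
        print(f"fsync of 4 KiB took {elapsed * 1000:.2f} ms")
```

Note that the temporary directory may live on tmpfs, where fsync is
nearly free, so for a meaningful number the path should point at the
ext4 file system under test.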

So if we want to see faster fsyncs we have to reduce the nr_requests of 
a disk, so that the request_queue is short, right?
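
For reference, the queue depth is exposed through sysfs at
/sys/block/<device>/queue/nr_requests. A small sketch of reading and
changing it from Python follows; the function names are hypothetical,
the sysfs_root parameter exists only so the code can be tried against a
fake directory tree, and writing the real file needs root. Whether a
shorter queue actually makes fsync faster is exactly the question above,
so this only shows how to run the experiment.

```python
from pathlib import Path


def nr_requests_path(device: str, sysfs_root: str = "/sys/block") -> Path:
    """Return the sysfs file holding the request queue depth of a device."""
    return Path(sysfs_root) / device / "queue" / "nr_requests"


def read_nr_requests(device: str, sysfs_root: str = "/sys/block") -> int:
    """Read the current nr_requests value for a block device such as 'sda'."""
    return int(nr_requests_path(device, sysfs_root).read_text().strip())


def write_nr_requests(device: str, value: int,
                      sysfs_root: str = "/sys/block") -> None:
    """Set nr_requests for a block device (requires root on a real system)."""
    if value < 1:
        raise ValueError("nr_requests must be a positive integer")
    nr_requests_path(device, sysfs_root).write_text(f"{value}\n")
```

Usage would be along the lines of read_nr_requests("sda") before and
after write_nr_requests("sda", 32), timing fsync at each setting.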


There were ideas floating around for an API to express dependencies
among BIOs, e.g. here:
https://lwn.net/Articles/399148/
This would solve the problem of needing a queue drain for an fsync,
right? Ext4 could make the last BIO of the file being synced depend
on all the other BIOs related to the same file, and then wait for the NCQ
completion notification for the last BIO. There wouldn't be a need to
drain the queue any more.
At that point it could even make sense to make all fsync-related I/O
jump to the head of the request_queue, so that fsyncs (hopefully related
to small amounts of data) could return quickly even when there is a
large file streaming or copy in the background filling the whole
request_queue...
Does what I'm saying make sense?
I understand this feature would require major changes in Linux though...
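
To make the comparison concrete, here is a toy model of the three
strategies: a full queue drain, waiting only for the synced file's last
BIO in FIFO order, and the same with fsync-related I/O moved to the head
of the queue. It is a back-of-the-envelope sketch only: the Request
type, the function names, and the assumption of a disk that serves
requests strictly in order at a constant throughput are all mine, and
nothing here reflects how the block layer really schedules I/O.

```python
from collections import namedtuple

# Hypothetical model of a queued write: which file it belongs to and its size.
Request = namedtuple("Request", ["file", "size_mb"])


def drain_wait(queue, mb_per_s):
    """Seconds until the whole queue is written out (queue drain)."""
    return sum(r.size_mb for r in queue) / mb_per_s


def dependency_wait(queue, synced_file, mb_per_s, prioritize=False):
    """Seconds until the last request of synced_file completes.

    With prioritize=True, requests of synced_file are moved to the head of
    the queue first (stable sort, so their relative order is preserved).
    """
    ordered = list(queue)
    if prioritize:
        ordered.sort(key=lambda r: r.file != synced_file)
    elapsed = 0.0
    done_at = 0.0
    for r in ordered:
        elapsed += r.size_mb / mb_per_s
        if r.file == synced_file:
            done_at = elapsed
    return done_at


if __name__ == "__main__":
    # 5 MB for file A (being fsync'ed) in the middle of 20 MB for file B.
    queue = [Request("B", 10), Request("A", 5), Request("B", 10)]
    speed = 50  # MB/s, an arbitrary figure for the example
    print("queue drain:      ", drain_wait(queue, speed))
    print("dependency, FIFO: ", dependency_wait(queue, "A", speed))
    print("dependency, head: ", dependency_wait(queue, "A", speed, True))
```

In this model the drain always pays for file B's 20 MB, the FIFO variant
pays only for whatever sits ahead of A's last request, and the
head-of-queue variant pays for A's 5 MB alone.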


Thank you for all these explanations,
these things really help us ignorant ext4 users understand...
