lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20180906091718.GL24519@BitWizard.nl>
Date:   Thu, 6 Sep 2018 11:17:18 +0200
From:   Rogier Wolff <R.E.Wolff@...Wizard.nl>
To:     Dave Chinner <david@...morbit.com>
Cc:     Jeff Layton <jlayton@...hat.com>,
        焦晓冬 <milestonejxd@...il.com>,
        bfields@...ldses.org, linux-fsdevel@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: POSIX violation by writeback error

On Thu, Sep 06, 2018 at 12:57:09PM +1000, Dave Chinner wrote:
> On Wed, Sep 05, 2018 at 02:07:46PM +0200, Rogier Wolff wrote:

> > And this has worked for years because
> > the kernel caches stuff from inodes and data-blocks. If you suddenly
> > write stuff to harddisk at 10ms for each seek between inode area and
> > data-area..
> 
> You're assuming an awful lot about filesystem implementation here.
> Neither ext4, btrfs or XFS issue physical IO like this when flushing
> data.

My thinking is: When fsync (implicit or explicit)  needs to know 
the result of the underlying IO, it needs to wait for it to have
happened. 

My thinking is: You can either log the data in the logfile or just the
metadata. By default/most people will chose the last. In the "make sure
it hits storage" case, you have three areas. 
* The logfile
* the inode area
* the data area. 

When you allow the application to continue pasta close, you can gather
up say a few megabytes of updates to each area and do say 50 seeks per
second. (achieving maybe about 50% of the throughput performance of
your drive)

If you don't store the /data/, you can stay in the inode or logfile
area and get a high throughput on your drive. But when a crash has the
filesystem in a defined state, what use is that if your application is
in a bad state because it is getting bad data?


Of course the application can be rewritten to have multiple threads so
that while one thread is waiting for a close to finish another one can
open/write/close another file. But there are existing applicaitons run
by users who do not have the knowledge or option to delve into the
source and rewrite the application to be multithreaded.

Your 100k files per second is closely similar to mine. In real life we
are not going to see such extreme numbers, but in some cases the
benchmark does predict a part of the performance of an application.
In practice, an application may spend 50% of the time on thinking
about the next file to make, and then 50k times per second actually
making the file.

	Roger. 

-- 
** R.E.Wolff@...Wizard.nl ** http://www.BitWizard.nl/ ** +31-15-2600998 **
**    Delftechpark 26 2628 XH  Delft, The Netherlands. KVK: 27239233    **
*-- BitWizard writes Linux device drivers for any device you may have! --*
The plan was simple, like my brother-in-law Phil. But unlike
Phil, this plan just might work.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ