Date:   Wed, 5 Sep 2018 09:08:47 +0200
From:   Rogier Wolff <R.E.Wolff@...Wizard.nl>
To:     Jeff Layton <jlayton@...hat.com>
Cc:     焦晓冬 <milestonejxd@...il.com>,
        linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: POSIX violation by writeback error

On Tue, Sep 04, 2018 at 11:44:20AM -0400, Jeff Layton wrote:
> On Tue, 2018-09-04 at 22:56 +0800, 焦晓冬 wrote:
> > On Tue, Sep 4, 2018 at 7:09 PM Jeff Layton <jlayton@...hat.com> wrote:
> > > 
> > > On Tue, 2018-09-04 at 16:58 +0800, Trol wrote:
> > > > That is certainly not possible to do. But at least, shall we report an
> > > > error on read()? Silently returning wrong data may cause further damage,
> > > > such as removing the wrong files because they were marked as garbage in the old file.
> > > > 
> > > 
> > > Is the data wrong though? You tried to write and then that failed.
> > > Eventually we want to be able to get at the data that's actually in the
> > > file -- what is that point?
> > 
> > The point is that silent data corruption is dangerous. I would prefer getting an
> > error back to receiving wrong data.
> > 
> 
> Well, _you_ might like that, but there are whole piles of applications
> that may fall over completely in this situation. Legacy usage matters
> here.

Can I make a suggestion here?

First imagine a spherical cow in a vacuum..... 

What I mean is: In the absence of boundary conditions (the real world)
what would ideally happen?

I'd say: 

* When you've written data to a file, you would want to be able to read
  that written data back, even in the presence of errors on the backing
  media.

But already this is controversial: I've seen time and time again that
people with RAID-5 setups continue to work until the second drive
fails: they ignored the signal the system was giving: "Please replace
a drive".

So when a mail queuer puts mail in the mailq files and the mail
processor can get it back out of there intact, nobody is going to
notice.  (I know mail queuers should call fsync and report errors when
that fails, but there are bound to be applications where calling fsync
is not appropriate (*).)
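
Something like this is what I mean by "call fsync and report errors".
It's my own sketch, not code from any real MTA; the function name
queue_message and the error reporting are made up for illustration:

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int queue_message(const char *path, const char *buf, size_t len)
{
	int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0600);
	if (fd < 0)
		return -1;

	while (len > 0) {
		ssize_t n = write(fd, buf, len);
		if (n < 0) {
			if (errno == EINTR)
				continue;
			close(fd);
			return -1;
		}
		buf += n;
		len -= n;
	}

	/* fsync() is where a writeback error gets reported to us; only
	 * after it succeeds may we tell the sender "message accepted". */
	if (fsync(fd) < 0) {
		fprintf(stderr, "fsync %s: %s\n", path, strerror(errno));
		close(fd);
		return -1;
	}
	return close(fd);
}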

So maybe when the write fails, the reads on that file should fail?

Then the amount of data that has to be kept in memory is much reduced:
you only have to keep the metadata, not the dirty pages themselves.
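
Purely as illustration (this is NOT what current kernels do): under
that proposal the kernel would only have to remember a per-file
"writeback failed" bit, and a later read() would return EIO instead of
whatever stale data made it to disk. Roughly, from the reader's point
of view:

#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

struct sketch_inode {
	bool wb_error;		/* the only state that must be kept: one bit */
};

ssize_t sketch_read(struct sketch_inode *inode, void *buf, size_t len)
{
	if (inode->wb_error) {
		errno = EIO;	/* reader learns the file contents are suspect */
		return -1;
	}
	/* ... otherwise fall through to the normal read path ... */
	(void)buf;
	(void)len;
	return 0;
}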

In both cases, semantics change when a reboot happens before the
read. Should we care? If we can't fix it when a reboot has happened,
does it make sense to do something different when a reboot has NOT
happened?

	Roger. 



(*) I have 800GB of data I need to give to a client. The
truck-of-tapes solution of today is a 1TB USB-3 drive. Writing that
data onto the drive runs at 30MB/sec (USB2 speed: USB3 didn't work for
some reason) for 5-10 seconds and then slows down to 200kB/sec for
minutes at a time. One of the reasons might be that fuse-ntfs is
calling fsync on the MFT and directory files to keep stuff consistent
just in case things crash. Well... in this case that means that
copying the data took 3 full days instead of 3 hours. Calling fsync
too often is not good either.
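
To put a rough number on that last remark, here is a quick throwaway
benchmark of my own (file names and sizes are arbitrary, and of course
the real slowdown above also involves fuse-ntfs and the USB device
itself): write the same amount of data once with a single fsync at the
end, and once with an fsync after every block.

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

#define BLOCK   (64 * 1024)
#define NBLOCKS 1024		/* 64 MB total */

static double write_file(const char *path, int fsync_each_block)
{
	static char buf[BLOCK];
	struct timespec t0, t1;
	int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);

	if (fd < 0) { perror(path); exit(1); }

	clock_gettime(CLOCK_MONOTONIC, &t0);
	for (int i = 0; i < NBLOCKS; i++) {
		if (write(fd, buf, BLOCK) != BLOCK) { perror("write"); exit(1); }
		if (fsync_each_block && fsync(fd) < 0) { perror("fsync"); exit(1); }
	}
	if (fsync(fd) < 0) { perror("fsync"); exit(1); }	/* final flush either way */
	clock_gettime(CLOCK_MONOTONIC, &t1);
	close(fd);

	return (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
}

int main(void)
{
	printf("one fsync at the end : %.2f s\n", write_file("once.dat", 0));
	printf("fsync per 64k block  : %.2f s\n", write_file("each.dat", 1));
	return 0;
}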

-- 
** R.E.Wolff@...Wizard.nl ** http://www.BitWizard.nl/ ** +31-15-2600998 **
**    Delftechpark 26 2628 XH  Delft, The Netherlands. KVK: 27239233    **
*-- BitWizard writes Linux device drivers for any device you may have! --*
The plan was simple, like my brother-in-law Phil. But unlike
Phil, this plan just might work.
