lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Tue, 19 Mar 2024 14:14:47 +0000
From: David Howells <dhowells@...hat.com>
To: Miklos Szeredi <miklos@...redi.hu>
Cc: dhowells@...hat.com, Matthew Wilcox <willy@...radead.org>,
    Trond Myklebust <trond.myklebust@...merspace.com>,
    Christoph Hellwig <hch@....de>,
    Andrew Morton <akpm@...ux-foundation.org>,
    Alexander Viro <viro@...iv.linux.org.uk>,
    Christian Brauner <brauner@...nel.org>,
    Jeff Layton <jlayton@...nel.org>, linux-mm@...ck.org,
    linux-fsdevel@...r.kernel.org, netfs@...ts.linux.dev,
    v9fs@...ts.linux.dev, linux-afs@...ts.infradead.org,
    ceph-devel@...r.kernel.org, linux-cifs@...r.kernel.org,
    linux-nfs@...r.kernel.org, devel@...ts.orangefs.org,
    linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH] mm: Replace ->launder_folio() with flush and wait

Miklos Szeredi <miklos@...redi.hu> wrote:

> >  (2) invalidate_inode_pages2() is used in some places to effect
> >      invalidation of the pagecache in the case where the server tells us
> >      that a third party modified the server copy of a file.  What the
> >      right behaviour should be here, I'm not sure, but at the moment, any
> >      dirty data will get laundered back to the server.  Possibly it should
> >      be simply invalidated locally or the user asked how they want to
> >      handle the divergence.
> 
> Skipping ->launder_page will mean there's a window where the data
> *will* be lost, AFAICS.
> 
> Of course concurrent cached writes on different hosts against the same
> region (the size of which depends on how the caching is done) will
> conflict.

Indeed.  Depending on when you're using invalidate_inode_pages2() and co. and
what circumstances you're using it for, you *are* going to suffer data loss.

For instance, if you have dirty data on the local host and get an invalidation
notification from the server: if you write just your dirty data back, you may
corrupt the file on the server, losing the third party changes; if you write
back your entire copy of the file, you might avoid corrupting the file, but
completely obliterate the third party changes; if you discard your changes,
you lose those instead, but save the third party changes.

I'm working towards supporting disconnected operation where I'll need to add
some sort of user interaction mechanism that will allow the user to say how
they want to handle this.

> But if concurrent writes are to different regions, then they shouldn't
> be lost, no?  Without the current ->launder_page thing I don't see how
> that could be guaranteed.

Define "different regions".  If they're not on the same folios, then why would
they be lost by simply flushing the data before doing the invalidation?  If
they are on different parts of the same folio, all the above still apply when
you flush the whole folio.

Now, you can mitigate the latter case by keeping track of which bytes changed,
but that still allows you to corrupt the file by writing back just your
particular changes.

And then there's the joker in the deck: mmap.  The main advantage of
invalidate_inode_pages2() is that it forcibly unmaps the page before
laundering it.  However, this doesn't prevent you then corrupting the upstream
copy by writing the changes back.

What particular usage case of invalidate_inode_pages2() are you thinking of?

DIO read/write can only be best effort: flush, invalidate then do the DIO
which may bring the buffers back in because they're mmapped.  In which case
doing a flush and a non-laundering invalidate that leaves dirty pages in place
ought to be fine.

David


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ