[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAPcyv4gueN4BsE94CGeO_rDad+MBs==5a+m7SKymJ_RywNCW3w@mail.gmail.com>
Date: Fri, 21 Feb 2020 15:26:45 -0800
From: Dan Williams <dan.j.williams@...el.com>
To: Dave Chinner <david@...morbit.com>
Cc: Christoph Hellwig <hch@....de>, "Weiny, Ira" <ira.weiny@...el.com>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Alexander Viro <viro@...iv.linux.org.uk>,
"Darrick J. Wong" <darrick.wong@...cle.com>,
"Theodore Y. Ts'o" <tytso@....edu>, Jan Kara <jack@...e.cz>,
linux-ext4 <linux-ext4@...r.kernel.org>,
linux-xfs <linux-xfs@...r.kernel.org>,
linux-fsdevel <linux-fsdevel@...r.kernel.org>
Subject: Re: [PATCH V4 07/13] fs: Add locking for a dynamic address space
operations state
On Fri, Feb 21, 2020 at 2:44 PM Dave Chinner <david@...morbit.com> wrote:
>
> On Fri, Feb 21, 2020 at 06:44:49PM +0100, Christoph Hellwig wrote:
> > On Thu, Feb 20, 2020 at 04:41:28PM -0800, ira.weiny@...el.com wrote:
> > > From: Ira Weiny <ira.weiny@...el.com>
> > >
> > > DAX requires special address space operations (aops). Changing DAX
> > > state therefore requires changing those aops.
> > >
> > > However, many functions require aops to remain consistent through a deep
> > > call stack.
> > >
> > > Define a vfs level inode rwsem to protect aops throughout call stacks
> > > which require them.
> > >
> > > Finally, define calls to be used in subsequent patches when aops usage
> > > needs to be quiesced by the file system.
> >
> > I am very much against this. There is a reason why we don't support
> > changes of ops vectors at runtime anywhere else, because it is
> > horribly complicated and impossible to get right. IFF we ever want
> > to change the DAX vs non-DAX mode (which I'm still not sold on) the
> > right way is to just add a few simple conditionals and merge the
> > aops, which is much easier to reason about, and less costly in overall
> > overhead.
>
> *cough*
>
> That's exactly what the original code did. And it was broken
> because page faults call multiple aops that are dependent on the
> result of the previous aop calls setting up the state correctly for
> the latter calls. And when S_DAX changes between those calls, shit
> breaks.
>
> It's exactly the same problem as switching aops between two
> dependent aops method calls - we don't solve anything by merging
> aops and checking IS_DAX in each method because the race condition
> is still there.
>
> /me throws his hands in the air and walks away
Please come back, because I think it's also clear that the "we don't
support changes of ops vectors at runtime" assertion is already being
violated by ext4 [1]. So that leaves "IFF we ever want to change the
dax vs non-dax mode" which I thought was already consensus given the
lingering hopes about having some future where the kernel is able to
dynamically optimize for dax vs non-dax based on memory media
performance characteristics. I thought the only thing missing from the
conclusion of the last conversation [2] was the "physical" vs
"effective" split that we identified at LSF'19 [3]. Christoph, that
split allows for for your concern about application intent to be
considered / overridden by kernel policy, and it allows for
communication of the effective state which applications need for
resource planning [4] independent of MAP_SYNC and other dax semantics.
The status quo of globally enabling dax for all files is empirically
the wrong choice for page-cache friendly workloads running on
slower-than-DRAM pmem media.
I am struggling to see how we address the above items without first
having a dynamic / less than global-filesystem scope facility to
control dax.
[1]: https://lore.kernel.org/linux-fsdevel/20191108131238.GK20863@quack2.suse.cz
[2]: https://lore.kernel.org/linux-fsdevel/20170927064001.GA27601@infradead.org
[3]: https://lwn.net/Articles/787973/
[4]: https://lwn.net/Articles/787233/
Powered by blists - more mailing lists