[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <b177c427-ca94-bbbf-b09d-6f2faef4121c@kernel.dk>
Date: Mon, 13 Jan 2020 10:09:01 -0700
From: Jens Axboe <axboe@...nel.dk>
To: Al Viro <viro@...iv.linux.org.uk>
Cc: Stephen Rothwell <sfr@...b.auug.org.au>,
Linux Next Mailing List <linux-next@...r.kernel.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Aleksa Sarai <cyphar@...har.com>
Subject: Re: linux-next: build failure after merge of the block tree
On 1/12/20 11:32 AM, Al Viro wrote:
> On Mon, Jan 06, 2020 at 09:04:01PM -0700, Jens Axboe wrote:
>> On 1/5/20 6:30 PM, Stephen Rothwell wrote:
>>> Hi all,
>>>
>>> After merging the block tree, today's linux-next build (arm
>>> multi_v7_defconfig) failed like this:
>>>
>>> fs/open.c:977:12: error: conflicting types for 'build_open_flags'
>>> 977 | inline int build_open_flags(const struct open_how *how,
>>> | ^~~~~~~~~~~~~~~~
>>> In file included from /home/sfr/next/next/fs/open.c:36:
>>> fs/internal.h:127:12: note: previous declaration of 'build_open_flags' was here
>>> 127 | extern int build_open_flags(int flags, umode_t mode, struct open_flags *op);
>>> | ^~~~~~~~~~~~~~~~
>>>
>>> Caused by commits
>>>
>>> 4e9e15c9426e ("fs: make build_open_flags() available internally")
>>> 3bba3e571bc8 ("io_uring: add support for IORING_OP_OPENAT")
>>>
>>> interacting with commit
>>>
>>> 0a51692d49ec ("open: introduce openat2(2) syscall")
>>>
>>> from the vfs tree.
>>>
>>> I have applied the following fix up patch for today:
>>
>> Thanks Stephen - I'll pull in the VFS tree and rebase the 5.6 io_uring
>> bits on that. Then I'll send it out for review again, haven't heard from
>> Al on the non-block open change.
>
> FWIW, I don't believe that your approach is workable. First of all,
> *ANY* transition out of RCU mode can lead to blocking. You need to
> acquire several references (mount and dentry, at the very least).
> Suppose the last one fails (->d_seq mismatch). Now you suddenly
> have to drop the one(s) you've acquired. And both dput() and mntput()
> are fundamentally blocking operations.
>
> It simply does not work. You could cobble up something that kinda-sorta
> works, if your added flag had
> * caused hard failure on unlazy_child()
> * caused hard failure on unlazy_walk() with any symlinks in stack
> * caused hard failure on unlazy_walk() if it would've been required
> to grab root
> * made unlazy_walk() go through very careful dance if it's just
> about nd->path; I'm not sure how well that could be done, but theoretically
> it's not impossible.
>
> But for open() it's not going to work at all. Any open for write => you
> will have to wait if you run into fs freeze. O_TRUNC => you've got IO
> to do. Worst of all, once you've dropped out of RCU mode, *YOU* *CAN'T*
> *FAIL*. Because that means blocking operations. So you need to verify
> that you won't run into a blocking ->open(), IMA deciding to play silly
> buggers and read through the entire file, etc., etc. _before_ dropping
> out of RCU mode.
>
> do_last() is messy enough as it is; adding _this_ is completely out of
> question.
Thanks Al, that's useful! Sounds like the lookup is doable, but the open
part is just a wasp nest of "don't even go there". For now, I'll drop
the lookup change and just have the io_uring open punt to async. With
that, I don't need any non-blocking guarantees. That is workable for
now.
> Jens, if you have a workable plan on that non-blocking open of yours,
> post it in full details. Until then - NAK, and that's about as hard one
> as I ever had to give.
It's like the other io_uring opcodes - we prefer to try a non-blocking
attempt first, and if that fails, then we go async. I have no grand
async open design, was just hoping I could make it work with minimal
effort. That's obviously not doable. I would not mind working on
actually making it doable, but that's a bigger project than I originally
wanted to take on.
So the most likely outcome longer term is for io_uring to adopt a syslet
type of approach to this, where we always just just call the open
helper, and if we need to block/reschedule, then we move context to an
appropriate worker thread. Time is better spent there rather than trying
to make every useful system call provide a sane non-blocking path, I
think.
--
Jens Axboe
Powered by blists - more mailing lists