Message-ID: <20190308162017.GA26207@infradead.org>
Date: Fri, 8 Mar 2019 08:20:17 -0800
From: Christoph Hellwig <hch@...radead.org>
To: Al Viro <viro@...iv.linux.org.uk>
Cc: Christoph Hellwig <hch@...radead.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Jann Horn <jannh@...gle.com>,
linux-fsdevel <linux-fsdevel@...r.kernel.org>,
Linux List Kernel Mailing <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] fs: use KERNEL_DS instead of get_ds()
On Fri, Mar 08, 2019 at 02:23:31PM +0000, Al Viro wrote:
> You do realize that nested pairs of that sort are not all there is?
> Even leaving m68k aside (there the same registers that select
> userland or kernel for that kind of access can be used e.g. for
> writeback control, or to switch to accessing sun3 MMU tables, etc.)
Yes. And the whole point is to keep these uses clear and separate.
> there are
> * temporary switches to USER_DS in things like unaligned
> access handlers, etc., where the kernel is doing emulation of possibly
> userland insns; similar for oops code dumping, etc.
> * use_mm()/unuse_mm() should probably switch to USER_DS and
> back, rather than doing that in callers.
> * switch to USER_DS (and no, it's *not* "USER_DS unless we started
> with KERNEL_DS" - nested counter is no-go here) for perf callbacks.
> * regular non-paired switches to USER_DS: do_exit() and
> flush_old_exec().
And that is probably close to the full list of callers that want
to explicitly enable access to the user address space, and thus
mark the thread as a user thread (and occasionally clear that in e.g.
unuse_mm).
Unless I'm completely missing something, our general rule of thumb
should be:
- threads are started with uaccess kernel turned on (count = 1)
- if we execute in userspace we switch to user uaccess (count = 0)
- same for use_mm style threads that want user access
- every current random kernel code override increments the refcount
and drops the reference when done
- force uaccess cases like do_exit or the validation check on
return to userspace force it back to 0.
Initially each 1 -> 0 transition (decrement or force) will do
set_fs(USER_DS), each 0 -> 1 transition will do set_fs(KERNEL_DS).
Then later architectures can kill the set_fs API, and potentially
optimize things by getting rid of the addr_limit field in its current
form.
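
To make the counting rules above concrete, here is a rough userspace model
of them. It is only a sketch, not a proposed patch: struct thread_model,
set_fs_model() and the uaccess_*() helpers are names made up for this
illustration, and the enum stands in for the real mm_segment_t values.

```c
#include <assert.h>

/* Hypothetical stand-ins for the real segment values. */
enum seg { USER_DS, KERNEL_DS };

/* Hypothetical per-thread state: the override count plus the
 * addr_limit that the initial implementation would still maintain. */
struct thread_model {
	int count;		/* 0: user uaccess, > 0: kernel uaccess */
	enum seg addr_limit;	/* last value installed via set_fs */
};

/* Models set_fs(); in this sketch it only records the value. */
static void set_fs_model(struct thread_model *t, enum seg s)
{
	t->addr_limit = s;
}

/* Threads are started with kernel uaccess turned on (count = 1). */
static void thread_start(struct thread_model *t)
{
	t->count = 1;
	set_fs_model(t, KERNEL_DS);
}

/* Kernel code override: take a reference, switch on 0 -> 1. */
static void uaccess_kernel_begin(struct thread_model *t)
{
	if (t->count++ == 0)
		set_fs_model(t, KERNEL_DS);
}

/* Drop the reference when done, switch back on 1 -> 0. */
static void uaccess_kernel_end(struct thread_model *t)
{
	assert(t->count > 0);
	if (--t->count == 0)
		set_fs_model(t, USER_DS);
}

/* Forced cases (do_exit, return to userspace): back to 0. */
static void uaccess_force_user(struct thread_model *t)
{
	if (t->count != 0) {
		t->count = 0;
		set_fs_model(t, USER_DS);
	}
}
```

In this model a thread that goes on to execute in userspace (or a use_mm
style thread) would call uaccess_kernel_end() once after thread_start() to
reach count = 0, and the forced variant ignores whatever nesting is
outstanding. Once nothing reads addr_limit directly, set_fs_model() is the
only piece an architecture would need to replace.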