Message-ID: <20131210232449.GP10323@ZenIV.linux.org.uk>
Date: Tue, 10 Dec 2013 23:24:49 +0000
From: Al Viro <viro@...IV.linux.org.uk>
To: Thomas Gleixner <tglx@...utronix.de>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
Dave Jones <davej@...hat.com>, Oleg Nesterov <oleg@...hat.com>,
Darren Hart <dvhart@...ux.intel.com>,
Andrea Arcangeli <aarcange@...hat.com>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Peter Zijlstra <peterz@...radead.org>,
Mel Gorman <mgorman@...e.de>
Subject: Re: process 'stuck' at exit.
On Tue, Dec 10, 2013 at 11:42:15PM +0100, Thomas Gleixner wrote:
> /*
> * If write access is not required (eg. FUTEX_WAIT), try
> * and get read-only access.
> */
> if (err == -EFAULT && rw == VERIFY_READ) {
> err = get_user_pages_fast(address, 1, 0, &page);
>
> That's a legitimate use case. And futex_requeue only requests
> VERIFY_READ for the !requeue_pi case.
>
> Now, if that map is RO, i.e. we took the fallback path then the THP
> one will fail as it has write=1 unconditionally.
access_ok() has nothing whatsoever to do with RO vs. RW mappings.
It checks whether the address is OK for userland on architectures
with userland and kernel sharing the same address space (e.g. x86).
On something like sparc64 or s390 it's constant 1.
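To make the point concrete, here is a minimal userland sketch of that kind of range check. Everything in it is an assumption for illustration: sketch_access_ok(), the SKETCH_VERIFY_* constants and SKETCH_USER_LIMIT are hypothetical names, and the limit value is made up. It is not the kernel's implementation on any architecture. What it shows is that only the address range matters and the 'type' argument plays no part in the answer.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical boundary between userland and kernel addresses.
 * The value is an assumption chosen for illustration only. */
#define SKETCH_USER_LIMIT 0x00007ffffffff000ULL

/* Hypothetical stand-ins for VERIFY_READ / VERIFY_WRITE. */
enum { SKETCH_VERIFY_READ = 0, SKETCH_VERIFY_WRITE = 1 };

/*
 * Pure range check: is [addr, addr + size) entirely below the limit?
 * Nothing here looks at page tables, so it cannot know whether the
 * range is mapped at all, let alone whether it is RO or RW.
 */
static bool sketch_access_ok(int type, uint64_t addr, uint64_t size)
{
    (void)type;                      /* ignored, as described above */

    if (size > SKETCH_USER_LIMIT)    /* keeps the subtraction below from wrapping */
        return false;
    return addr <= SKETCH_USER_LIMIT - size;
}
```

With these assumed values, sketch_access_ok(SKETCH_VERIFY_WRITE, 0x1000, 8) and sketch_access_ok(SKETCH_VERIFY_READ, 0x1000, 8) give the same answer, while any range reaching past SKETCH_USER_LIMIT is rejected for both.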
Note that there's nothing to stop another thread from remapping an RW
area RO just as you've returned from access_ok(), so checking for
writability in access_ok() would've been racy as hell. Ditto for
address being mapped at all...
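A userland sketch of why any up-front check is useless here, assuming Linux behaviour and using a hypothetical helper name, sketch_ro_copy_faults(). The address stays a perfectly valid user address the whole time; only the protection changes (done here by the same thread for determinism, where a real race would have another thread do it). The failure then shows up at the actual copy, which I'd expect to come back as EFAULT from read().

```c
#define _DEFAULT_SOURCE
#include <errno.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 0 if read() into a page that was
 * remapped read-only fails with EFAULT, -1 on any other outcome.
 */
static int sketch_ro_copy_faults(void)
{
    long pagesz = sysconf(_SC_PAGESIZE);
    int fds[2];
    int result = -1;
    char *p;

    if (pagesz <= 0)
        return -1;

    /* Start with an ordinary RW anonymous page. */
    p = mmap(NULL, (size_t)pagesz, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    if (pipe(fds) != 0) {
        munmap(p, (size_t)pagesz);
        return -1;
    }

    /* Queue one byte, then flip the page to RO before reading it back.
     * Any "is it writable?" answer obtained earlier is now stale. */
    if (write(fds[1], "x", 1) == 1 &&
        mprotect(p, (size_t)pagesz, PROT_READ) == 0) {
        ssize_t n;

        errno = 0;
        n = read(fds[0], p, 1);      /* kernel must copy into an RO page */
        if (n == -1 && errno == EFAULT)
            result = 0;
    }

    close(fds[0]);
    close(fds[1]);
    munmap(p, (size_t)pagesz);
    return result;
}
```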
Moreover, there are exactly two architectures that do not ignore the
first argument of access_ok() - microblaze and um. The former uses
it in a debugging printk in the failure case. The latter... AFAICS, it's
pointless - it's a special dispensation for read access to host
vsyscall page from guest process. The thing is, writes there are
going to fail anyway - the host kernel won't let the guest kernel
modify that page, period. IOW, it looks like um might as well drop
the (type == VERIFY_READ) part in __access_ok_vsyscall().
Why do we have the 'type' argument of access_ok(), anyway?