Message-ID: <20151002062340.GB30051@gmail.com>
Date: Fri, 2 Oct 2015 08:23:40 +0200
From: Ingo Molnar <mingo@...nel.org>
To: Andy Lutomirski <luto@...capital.net>
Cc: Kees Cook <keescook@...gle.com>, Dave Hansen <dave@...1.net>,
"x86@...nel.org" <x86@...nel.org>,
LKML <linux-kernel@...r.kernel.org>,
Linux-MM <linux-mm@...ck.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Andy Lutomirski <luto@...nel.org>,
Borislav Petkov <bp@...en8.de>
Subject: Re: [PATCH 26/26] x86, pkeys: Documentation
* Andy Lutomirski <luto@...capital.net> wrote:
> >> Assuming it boots up fine on a typical distro, i.e. assuming that there are no
> >> surprises where PROT_READ && PROT_EXEC sections are accessed as data.
> >
> > I can't wait to find out what implicitly expects PROT_READ from
> > PROT_EXEC mappings. :)
So what seems to happen is that there are no pure PROT_EXEC mappings in practice -
they are only omnibus PROT_READ|PROT_EXEC mappings, an unknown proportion of which
truly relies on PROT_READ:
$ for C in firefox ls perf libreoffice google-chrome Xorg xterm konsole; do
    echo; echo "# $C:"
    strace -e trace=mmap -f "$C" -h 2>&1 | cut -d, -f3 | grep PROT | sort | uniq -c
  done
# firefox:
13 PROT_READ
82 PROT_READ|PROT_EXEC
184 PROT_READ|PROT_WRITE
2 PROT_READ|PROT_WRITE|PROT_EXEC
# ls:
2 PROT_READ
7 PROT_READ|PROT_EXEC
17 PROT_READ|PROT_WRITE
# perf:
1 PROT_READ
20 PROT_READ|PROT_EXEC
44 PROT_READ|PROT_WRITE
# libreoffice:
2 PROT_NONE
87 PROT_READ
148 PROT_READ|PROT_EXEC
339 PROT_READ|PROT_WRITE
# google-chrome:
39 PROT_READ
121 PROT_READ|PROT_EXEC
345 PROT_READ|PROT_WRITE
# Xorg:
1 PROT_READ
22 PROT_READ|PROT_EXEC
39 PROT_READ|PROT_WRITE
# xterm:
1 PROT_READ
25 PROT_READ|PROT_EXEC
46 PROT_READ|PROT_WRITE
# konsole:
1 PROT_READ
101 PROT_READ|PROT_EXEC
175 PROT_READ|PROT_WRITE
So whatever kernel-side method we come up with, it's not something that I expect
to become production quality. "Proper" conversion to pkeys has to be driven from
the user-space side.
That does not mean we cannot try! :-)
> There's one annoying issue at least:
>
> mprotect_pkey(..., PROT_READ | PROT_EXEC, 0) sets protection key 0.
> mprotect_pkey(..., PROT_EXEC, 0) maybe sets protection key 15 or
> whatever we use for this. What does mprotect_pkey(..., PROT_EXEC, 0)
> do? What if the caller actually wants key 0? What if some CPU vendor
> some day implements --x for real?
That comes from the hardcoded "user-space has 4 bits to itself, not managed by the
kernel" assumption in the whole design. So no layering between different
user-space libraries using pkeys in a different fashion, no transparent kernel use
of pkeys (such as it may be), etc.
I'm not sure it's _worth_ managing these 4 bits, but '16 separate keys' does seem
to me to be above a certain resource threshold, one that should be managed more
explicitly than by telling user-space: "it's all yours!".
> Also, how do we do mprotect_pkey and say "don't change the key"?
So if we start managing keys as a resource (i.e. alloc/free up to 16 of them), and
provide APIs for user-space to do all that, then user-space is not supposed to
touch keys it has not allocated for itself - just like it's not supposed to write
to fds it has not opened.
User-space can still 'mess up' under such an allocation scheme, and if the kernel
allocates a key for its own purposes it should not assume that user-space cannot
change it, but at least for non-buggy code there's no interaction and it would
work out fine.
Thanks,
Ingo