lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <5c93e47e106c42659b2004e8de604d61@AcuMS.aculab.com>
Date:   Wed, 18 Nov 2020 13:22:28 +0000
From:   David Laight <David.Laight@...LAB.COM>
To:     'Alexandre Chartre' <alexandre.chartre@...cle.com>,
        Borislav Petkov <bp@...en8.de>
CC:     "tglx@...utronix.de" <tglx@...utronix.de>,
        "mingo@...hat.com" <mingo@...hat.com>,
        "hpa@...or.com" <hpa@...or.com>, "x86@...nel.org" <x86@...nel.org>,
        "dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>,
        "luto@...nel.org" <luto@...nel.org>,
        "peterz@...radead.org" <peterz@...radead.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "thomas.lendacky@....com" <thomas.lendacky@....com>,
        "jroedel@...e.de" <jroedel@...e.de>,
        "konrad.wilk@...cle.com" <konrad.wilk@...cle.com>,
        "jan.setjeeilers@...cle.com" <jan.setjeeilers@...cle.com>,
        "junaids@...gle.com" <junaids@...gle.com>,
        "oweisse@...gle.com" <oweisse@...gle.com>,
        "rppt@...ux.vnet.ibm.com" <rppt@...ux.vnet.ibm.com>,
        "graf@...zon.de" <graf@...zon.de>,
        "mgross@...ux.intel.com" <mgross@...ux.intel.com>,
        "kuzuno@...il.com" <kuzuno@...il.com>
Subject: RE: [RFC][PATCH v2 00/21] x86/pti: Defer CR3 switch to C code

From: Alexandre Chartre
> Sent: 18 November 2020 10:30
...
> Correct, this RFC is not changing the overhead. However, it is a step forward
> for being able to execute some selected syscalls or interrupt handlers without
> switching to the kernel page-table. The next step would be to identify and add
> the necessary mapping to the user page-table so that specified syscalls can be
> executed without switching the page-table.

Remember that without PTI user space can read all kernel memory.
(I'm not 100% sure you can force a cache-line read.)
It isn't even that slow.
(Even I can understand how it works.)

So if you are worried about user space doing that you can't really
run anything on the user page tables.

System calls like getpid() are irrelevant - they aren't used (much).
Even the time of day ones are implemented in the VDSO without a
context switch.

So the overheads come from other system calls that 'do work'
without actually sleeping.
I'm guessing things like read, write, sendmsg, recvmsg.

The only interesting system call I can think of is futex.
As well as all the calls that return immediately because the
mutex has been released while entering the kernel, I suspect
that being pre-empted by a different thread (of the same process)
doesn't actually need CR3 reloading (without PTI).

I also suspect that it isn't just the CR3 reload that costs.
There could (depending on the cpu) be associated TLB and/or cache
invalidations that have a much larger effect on programs with
large working sets than on simple benchmark programs.

Now bits of data that you are 'more worried about' could be kept
in physical memory that isn't normally mapped (or referenced by
a TLB) and only mapped when needed.
But that doesn't help the general case.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ