Date:   Wed, 14 Apr 2021 07:55:17 +0000
From:   David Laight <David.Laight@...LAB.COM>
To:     'Arjun Roy' <arjunroy@...gle.com>
CC:     Eric Dumazet <edumazet@...gle.com>,
        Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
        Eric Dumazet <eric.dumazet@...il.com>,
        Ingo Molnar <mingo@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        paulmck <paulmck@...nel.org>, Boqun Feng <boqun.feng@...il.com>,
        linux-kernel <linux-kernel@...r.kernel.org>
Subject: RE: [PATCH v2 3/3] rseq: optimise rseq_get_rseq_cs() and
 clear_rseq_cs()

From: Arjun Roy
> Sent: 13 April 2021 23:04
> 
> On Tue, Apr 13, 2021 at 2:19 PM David Laight <David.Laight@...lab.com> wrote:
> >
> > > If we're special-casing 64-bit architectures anyways - unrolling the
> > > 32B copy_from_user() for struct rseq_cs appears to be roughly 5-10%
> > > savings on x86-64 when I measured it (well, in a microbenchmark, not
> > > in rseq_get_rseq_cs() directly). Perhaps that could be an additional
> > > avenue for improvement here.
> >
> > The killer is usually 'user copy hardening'.
> > It significantly slows down sendmsg() and recvmsg().
> > I've got measurable performance improvements by
> > using __copy_from_user() when the buffer size has
> > already been checked - but isn't a compile-time constant.
> >
> > There is also scope for using __get_user() when reading
> > iovec[] (instead of copy_from_user()) and doing all the
> > bounds checks (etc) in the loop.
> > That gives a measurable improvement for writev("/dev/null").
> > I must sort those patches out again.
> >
> >         David
> >
> 
> In this case I mean replacing copy_from_user(rseq_cs, urseq_cs,
> sizeof(*rseq_cs)) with 4 unsafe_get_user() calls (8 bytes each, 32
> bytes total), wrapped in user_read_access_begin/end(), which as far
> as I can tell would bypass user copy hardening.

Yes, that is one advantage over any of the get_user() calls.
You also lose all the 'how shall we optimise this' checks
in copy_from_user().
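
Something like this, presumably (just a sketch, not the actual patch:
the helper name is invented, and the 32-byte struct rseq_cs - layout
per include/uapi/linux/rseq.h - is read as four 64-bit words):

	static int rseq_get_cs_unsafe(struct rseq_cs *rseq_cs,
				      const struct rseq_cs __user *urseq_cs)
	{
		u64 *dst = (u64 *)rseq_cs;
		const u64 __user *src = (const u64 __user *)urseq_cs;

		if (!user_read_access_begin(urseq_cs, sizeof(*urseq_cs)))
			return -EFAULT;
		/* Word 0 is version + flags, then start_ip,
		 * post_commit_offset and abort_ip. */
		unsafe_get_user(dst[0], &src[0], efault);
		unsafe_get_user(dst[1], &src[1], efault);
		unsafe_get_user(dst[2], &src[2], efault);
		unsafe_get_user(dst[3], &src[3], efault);
		user_read_access_end();
		return 0;

	efault:
		user_read_access_end();
		return -EFAULT;
	}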

Repeated unsafe_get_user() calls are crying out for an optimisation.
You get something like:
	failed = 0;
	copy();
	if (failed) goto error;
	copy();
	if (failed) goto error;
Where 'failed' is set by the fault handler.

This could be optimised to:
	failed = 0;
	copy();
	copy();
	if (failed) goto error;
Even if it takes a fault on every access for an invalid address, it
probably doesn't matter - no one cares about that path.

I've not really looked at how it could be achieved though.

It might be that the 'asm goto with outputs' variant
manages to avoid the compare and jump.
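
For reference, the x86 form when the compiler supports output operands
in 'asm goto' (CONFIG_CC_HAS_ASM_GOTO_OUTPUT) looks roughly like the
following - a simplified, 64-bit-only sketch, not the exact macro from
arch/x86/include/asm/uaccess.h:

	/*
	 * The exception-table entry sends a faulting load straight to the
	 * caller-supplied label, so no 'failed' flag is written and no
	 * compare-and-branch is emitted after each copy.
	 */
	#define unsafe_get_user_u64(x, uaddr, label)				\
		asm_volatile_goto("1:	movq %[umem], %[out]\n"			\
				  _ASM_EXTABLE_UA(1b, %l2)			\
				  : [out] "=r" (x)				\
				  : [umem] "m" (*(const u64 __user *)(uaddr))	\
				  : : label)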

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
