lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <alpine.DEB.2.20.1711171922420.2186@nanos>
Date:   Fri, 17 Nov 2017 19:59:33 +0100 (CET)
From:   Thomas Gleixner <tglx@...utronix.de>
To:     Andi Kleen <andi@...stfloor.org>
cc:     Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
        Peter Zijlstra <peterz@...radead.org>,
        "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
        Boqun Feng <boqun.feng@...il.com>,
        Andy Lutomirski <luto@...capital.net>,
        Dave Watson <davejwatson@...com>,
        linux-kernel <linux-kernel@...r.kernel.org>,
        linux-api <linux-api@...r.kernel.org>,
        Paul Turner <pjt@...gle.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Russell King <linux@....linux.org.uk>,
        Ingo Molnar <mingo@...hat.com>,
        "H. Peter Anvin" <hpa@...or.com>, Andrew Hunter <ahh@...gle.com>,
        Chris Lameter <cl@...ux.com>, Ben Maurer <bmaurer@...com>,
        rostedt <rostedt@...dmis.org>,
        Josh Triplett <josh@...htriplett.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Catalin Marinas <catalin.marinas@....com>,
        Will Deacon <will.deacon@....com>,
        Michael Kerrisk <mtk.manpages@...il.com>
Subject: Re: [RFC PATCH v3 for 4.15 08/24] Provide cpu_opv system call

On Fri, 17 Nov 2017, Andi Kleen wrote:
> > 1) Handling single-stepping from tools
> > 
> > Tools like debuggers, and simulators like record-replay ("rr") use
> > single-stepping to run through existing programs. If core libraries start
> 
> No rr doesn't use single stepping. It uses branch stepping based on the
> PMU, and those should only happen on external events or syscalls which would
> abort the rseq anyways.
> 
> Eventually it will suceeed because not every retry there will be a new
> signal. If you put a syscall into your rseq you will just get
> what you deserve.
> 
> If it was single stepping it couldn't execute the vDSO (and it would
> be incredible slow)
>
> Yes debuggers have to skip instead of step, but that's easily done
> (and needed today already for every gettimeofday, which tends to be
> the most common syscall...)

The same problem exists with TSX. Hitting a break point inside a
transaction section triggers abort with the cause bit 'breakpoint' set.

rseq can be looked at as a variant of software transactional memory with a
limited feature set, but the underlying problems are exactly the
same. Breakpoints are not working either.

> > to use restartable sequences for e.g. memory allocation, this means
> > pre-existing programs cannot be single-stepped, simply because the
> > underlying glibc or jemalloc has changed.
> 
> But how likely is it that the fall back path even works?
> 
> It would never be exercised in normal operation, so it would be a prime
> target for bit rot, or not ever being tested and be broken in the first place.

What's worse is that the fall back path cannot be debugged at all. It's a
magic byte code which is interpreted inside the kernel.


I really do not understand why this does not utilize existing design
patterns well known from transactional memory.

The most straight forward is to have a mechanism which forces everything
into the slow path in case of debugging, lack of progress, etc. The slow
path uses traditional locking to resolve the situation. That's well known
to work and if done correctly then the only difference between slow path
and fast path is locking/transaction control, i.e. it's a single
implementation of the actual memory operations which can be single stepped,
debug traced and whatever.

It solves _ALL_ of the problems you describe including support for systems
which do not support rseq at all.

This syscall is horribly overengineered and creates more problems than it
solves.

Thanks,

	tglx

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ