linux-kernel - Re: Test 73 Sig_trap fails on arm64 (was Re: [PATCH] perf test: Test 73 Sig

Message-ID: <CACT4Y+Z8pKXw=8nwVtdo2W=hu_rBk1ws-Q=7-tBkLGTcD85NaA@mail.gmail.com>
Date:   Wed, 16 Feb 2022 12:54:16 +0100
From:   Dmitry Vyukov <dvyukov@...gle.com>
To:     John Garry <john.garry@...wei.com>
Cc:     Will Deacon <will@...nel.org>, Leo Yan <leo.yan@...aro.org>,
        Marco Elver <elver@...gle.com>,
        Thomas Richter <tmricht@...ux.ibm.com>,
        linux-kernel@...r.kernel.org, linux-perf-users@...r.kernel.org,
        acme@...nel.org, svens@...ux.ibm.com, gor@...ux.ibm.com,
        sumanthk@...ux.ibm.com, hca@...ux.ibm.com,
        Mark Rutland <mark.rutland@....com>,
        "linux-arm-kernel@...ts.infradead.org" 
        <linux-arm-kernel@...ts.infradead.org>
Subject: Re: Test 73 Sig_trap fails on arm64 (was Re: [PATCH] perf test: Test
 73 Sig_trap fails on s390)

On Wed, 16 Feb 2022 at 12:47, John Garry <john.garry@...wei.com> wrote:
>
> Hi Will,
>
> > Sorry, I haven't had time to look at this (or the thousands of other mails
> > in my inbox) lately.
> >
>
> Thanks
>
> > I don't recall all of the details, but basically hw_breakpoint really
> > doesn't work well on arm/arm64 -- the sticking points are around handling
> > the stepping and whether to step into or over exceptions. Sadly, our ptrace
> > interface (which is what is used by GDB) is built on top of hw_breakpoint,
> > so we can't just rip it out and any significant changes are pretty risky.
> >
> > What I would like to happen is that we rework our debug exception handling
> > as outlined by [1] so that kernel debug is better defined and the ptrace
> > interface can interact directly with the debug architecture instead of being
> > funnelled through hw_breakpoint. Once we have that, I think we could try to
> > improve hw_breakpoint much more comfortably (or at least defeature it
> > considerably without having to worry about breaking GDB). I started this a
> > couple of years ago, but I haven't found time to get back to it for ages.
> >
> > Anyway, to this specific test...
> >
> > When we hit a break/watchpoint the faulting PC points at the instruction
> > which faulted and the exception is reported before the instruction has had
> > any other side-effects (e.g. if a watchpoint triggers on a store, then
> > memory will not have been updated when the watchpoint handler runs), so if
> > we were to return as usual after reporting the exception to perf then we
> > would just hit the same break/watchpoint again and we'd get stuck. GDB
> > handles stepping over the faulting instruction, but for perf (and assumedly
> > these tests), the kernel is expected to handle the step. This handling
> > amounts to disabling the break/watchpoint which we think we hit and then
> > attempting a hardware single-step. During the step we could run into more
> > break/watchpoints on the same instruction, so we'll keep disabling things
> > until we eventually manage to complete the step, which is signalled by a
> > specific type of debug exception. At this point, we re-enable the
> > break/watchpoints and we're good.
> >
> > Signals make this messy, as the step logic will step _into_ the signal
> > handler -- we have to do this, otherwise we would miss break/watchpoints
> > triggered by the signal handler if we had disabled them for the step.
> > However, it means that when we return back from the signal handler we will
> > run back into the break/watchpoint which we initially stepped over. When
> > perf uses SIGTRAP to notify userspace that we hit a break/watchpoint,
> > then we'll get stuck because we'll step into the handler every time.
> >
> > Hopefully that clears things up a bit. Ideally, the kernel wouldn't
> > pretend to handle this stepping at all for arm64, as it adds a bunch of
> > complexity and overhead to our context-switch, and I don't think the
> > current behaviour is particularly useful.
> >
>
> Right, so what I am hearing altogether is that for now we should just
> skip this test.
>
> And since the kernel does not seem to advertise this capability we need
> to disable it for specific architectures.

It does and fwiw I am just trying to use it. Things work only on x86 so far.
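
For anyone following along, the step-over behaviour Will describes above can
be sketched as a toy model. This is only an illustration of the quoted
description, not kernel code: the function name, the max_hits cut-off and the
single-flag "watchpoint" are all made up for the example.

```python
def watchpoint_hits(deliver_sigtrap, max_hits=5):
    """Toy model of the arm64 step-over logic described in the thread.

    Returns how many times the watchpoint fired before the watched
    instruction finally executed, or None if we gave up after max_hits
    (i.e. the task is stuck). Everything here is hypothetical.
    """
    watchpoint_enabled = True
    hits = 0
    while True:
        # The exception is reported before the faulting instruction has
        # had any side effects, so the PC still points at it.
        assert watchpoint_enabled
        hits += 1
        if hits > max_hits:
            return None  # no forward progress: stuck

        # Kernel-side handling: disable the watchpoint we think we hit
        # and attempt a hardware single-step.
        watchpoint_enabled = False

        if deliver_sigtrap:
            # The step goes *into* the signal handler, so the step
            # completes on the handler's first instruction and the
            # watchpoint is re-enabled there.
            watchpoint_enabled = True
            # The handler then returns to the faulting instruction,
            # which still has not executed, and we hit it again.
            continue

        # No signal: the step executes the faulting instruction itself,
        # the step exception fires, and the watchpoint is re-enabled.
        watchpoint_enabled = True
        return hits


if __name__ == "__main__":
    print("no signal:", watchpoint_hits(deliver_sigtrap=False))
    print("SIGTRAP:  ", watchpoint_hits(deliver_sigtrap=True))
```

Without a signal the model makes progress after one hit; with SIGTRAP
delivery it never gets past the watched instruction, which matches the
"we'll get stuck" outcome described in the quoted text.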