lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CACT4Y+Zwm47qs8yco0nNoD_hFzHccoGyPznLHkBjAeg9REZ3gA@mail.gmail.com>
Date:   Sun, 7 Jun 2020 11:37:20 +0200
From:   Dmitry Vyukov <dvyukov@...gle.com>
To:     Andrey Konovalov <andreyknvl@...gle.com>
Cc:     Peter Zijlstra <peterz@...radead.org>,
        Marco Elver <elver@...gle.com>,
        Mark Rutland <mark.rutland@....com>,
        Borislav Petkov <bp@...en8.de>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...nel.org>,
        clang-built-linux <clang-built-linux@...glegroups.com>,
        "Paul E. McKenney" <paulmck@...nel.org>,
        Alexander Potapenko <glider@...gle.com>,
        kasan-dev <kasan-dev@...glegroups.com>,
        LKML <linux-kernel@...r.kernel.org>,
        "the arch/x86 maintainers" <x86@...nel.org>,
        Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [PATCH -tip v3 1/2] kcov: Make runtime functions noinstr-compatible

On Fri, Jun 5, 2020 at 3:25 PM 'Andrey Konovalov' via kasan-dev
<kasan-dev@...glegroups.com> wrote:
> > On Fri, Jun 05, 2020 at 12:57:15PM +0200, Dmitry Vyukov wrote:
> > > On Fri, Jun 5, 2020 at 10:28 AM Marco Elver <elver@...gle.com> wrote:
> > > >
> > > > While we lack a compiler attribute to add to noinstr that would disable
> > > > KCOV, make the KCOV runtime functions return if the caller is in a
> > > > noinstr section, and mark them noinstr.
> > > >
> > > > Declare write_comp_data() as __always_inline to ensure it is inlined,
> > > > which also reduces stack usage and removes one extra call from the
> > > > fast-path.
> > > >
> > > > In future, our compilers may provide an attribute to implement
> > > > __no_sanitize_coverage, which can then be added to noinstr, and the
> > > > checks added in this patch can be guarded by an #ifdef checking if the
> > > > compiler has such an attribute or not.
> > >
> > > Adding noinstr attribute to instrumentation callbacks looks fine to me.
> > >
> > > But I don't understand the within_noinstr_section part.
> > > As the cover letter mentions, kcov callbacks don't do much and we
> > > already have it inserted and called. What is the benefit of bailing
> > > out a bit earlier rather than letting it run to completion?
> > > Is the only reason for potential faults on access to the vmalloc-ed
> > > region?
> >
> > Vmalloc faults (on x86, the only arch that had them IIRC) are gone, per
> > this merge window.
> >
> > The reason I mentioned them is because it is important that they are
> > gone, and that this hard relies on them being gone, and the patch didn't
> > call that out.
> >
> > There is one additional issue though; you can set hardware breakpoint on
> > vmalloc space, and that would trigger #DB and then we'd be dead when we
> > were already in #DB (IST recursion FTW).
> >
> > And that is not something you can trivially fix, because you can set the
> > breakpoint before the allocation (or perhaps on a previous allocation).
> >
> > That said; we already have this problem with task_struct (and
> > task_stack). IIRC Andy wants to fix the task_stack issue by making all
> > of noinstr run on the entry stack, but we're not there yet.
> >
> > There are no good proposals for random allocations like task_struct or
> > in your case kcov_area.
> >
> > > Andrey, Mark, do you know if it's possible to pre-fault these areas?
> >
> > Under the assumption that vmalloc faults are still a thing:
> >
> > You cannot pre-fault the remote area thing, kernel threads use the mm of
> > the previous user task, and there is no guarantee that mm will have had
> > the vmalloc fault.
>
> To clarify this part AFAIU it, even if we try to prefault the whole
> remote area each time kcov_remote_start() is called, then (let alone
> the performance impact) the kernel thread can be rescheduled between
> kcov_remote_start() and kcov_remote_stop(), and then it might be
> running with a different mm than the one that was used when
> kcov_remote_start() happened.

Ugh, this is nasty. But this has also gone, which I am happy about :)

Why I am looking at this is because with coverage instrumentation
__sanitizer_cov_trace_pc is the hottest function in the kernel and we
are adding additional branches to it.

Can we touch at least some per-cpu data within noinstr code?
If yes, we could try to affect the existing
in_task()/in_serving_softirq() check.
If not, it would be useful to have a comment clarifying that
within_noinstr_section check must happen before we touch anything
else.

I assume objtool can now also detect all violations. How bad is it now
without within_noinstr_section check? I am assuming we marking noinstr
functions to not be instrumented, but we are getting some stray
instrumentation from inlined functions or something, right? How many
are there? Is it fixable/unfixable? Marco, do you know the list, or
could you please collect the list of violations?

Is there any config that disables #DB? We could well disable it on
syzbot, I think we already disable some production hardening/debugging
confings, which are not too useful for testing setup.
E.g. we support RANDOMIZE_BASE, no problem, but if one disables it
(which we do), that becomes no-op:

#ifdef CONFIG_RANDOMIZE_BASE
    ip -= kaslr_offset();
#endif
    return ip;

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ