linux-kernel - Re: [PATCH v2 3/4] arm64: Make debug exception handlers visible from RCU

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-Id: <20190724204748.94abebef1f184032d2e77f73@kernel.org>
Date:   Wed, 24 Jul 2019 20:47:48 +0900
From:   Masami Hiramatsu <mhiramat@...nel.org>
To:     James Morse <james.morse@....com>
Cc:     Catalin Marinas <catalin.marinas@....com>,
        Will Deacon <will.deacon@....com>,
        linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
        Naresh Kamboju <naresh.kamboju@...aro.org>,
        Dan Rue <dan.rue@...aro.org>,
        Matt Hart <matthew.hart@...aro.org>,
        Anders Roxell <anders.roxell@...aro.org>,
        Daniel Diaz <daniel.diaz@...aro.org>,
        "Paul E . McKenney" <paulmck@...ux.ibm.com>
Subject: Re: [PATCH v2 3/4] arm64: Make debug exception handlers visible
 from RCU

On Tue, 23 Jul 2019 18:07:56 +0100
James Morse <james.morse@....com> wrote:

> Hi,
> 
> On 22/07/2019 08:48, Masami Hiramatsu wrote:
> > Make debug exceptions visible from RCU so that synchronize_rcu()
> > correctly track the debug exception handler.
> > 
> > This also introduces sanity checks for user-mode exceptions as same
> > as x86's ist_enter()/ist_exit().
> > 
> > The debug exception can interrupt in idle task. For example, it warns
> > if we put a kprobe on a function called from idle task as below.
> > The warning message showed that the rcu_read_lock() caused this
> > problem. But actually, this means the RCU is lost the context which
> > is already in NMI/IRQ.
> 
> > So make debug exception visible to RCU can fix this warning.
> 
> > diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
> > index 9568c116ac7f..a6b244240db6 100644
> > --- a/arch/arm64/mm/fault.c
> > +++ b/arch/arm64/mm/fault.c
> > @@ -777,6 +777,42 @@ void __init hook_debug_fault_code(int nr,
> >  	debug_fault_info[nr].name	= name;
> >  }
> >  
> > +/*
> > + * In debug exception context, we explicitly disable preemption.
> > + * This serves two purposes: it makes it much less likely that we would
> > + * accidentally schedule in exception context and it will force a warning
> > + * if we somehow manage to schedule by accident.
> > + */
> > +static void debug_exception_enter(struct pt_regs *regs)
> > +{
> > +	if (user_mode(regs)) {
> > +		RCU_LOCKDEP_WARN(!rcu_is_watching(), "entry code didn't wake RCU");
> 
> Would moving entry.S's context_tracking_user_exit() call to be before do_debug_exception()
> also fix this?

It sounds like treating only user context, correct?
This part is just adding assertion, not fixing the problem which Naresh reported.

> 
> I don't know the reason its done 'after' debug exception handling. Its always been like
> this: commit 6c81fe7925cc4c42 ("arm64: enable context tracking").
> 
> 
> > +	} else {
> > +		/*
> > +		 * We might have interrupted pretty much anything.  In
> > +		 * fact, if we're a debug exception, we can even interrupt
> > +		 * NMI processing.
> 
> > +		 * We don't want in_nmi() to return true,
> > +		 * but we need to notify RCU.
> 
> How come? If you interrupted an SError or pseudo-nmi, it already is. Those paths should
> all be painted no-kprobe, but I'm sure there are gaps. The hw-breakpoints can almost
> certainly hook them.

I think that sentense means "we don't want that this code makes in_nmi() to return true"
So, if the breakpoint interrupts pNMI/SError context, it is OK that in_nmi() returns true.

> 
> 
> > +		 */
> > +		rcu_nmi_enter();
> 
> Can we interrupt printk()? Do we need printk_nmi_enter()? ... What about ftrace?

Good point! As far as I know, we don't use it because ftrace doesn't use printk.
But indeed, kprobes user can use printk and they have to call printk_nmi_enter()/exit(),
that must be commented in the documentation. Anyway, basically it is user's choice.

> 
> Because SError and pseudo-nmi can interrupt interrupt-masked code, we describe them as
> NMI. The only difference here is these exceptions are synchronous.
> 
> 
> I suspect we should make these debug exceptions nmi for EL1. We can then use this for the
> kprobe-re-entrance stuff so the pre/post hooks don't get run if they interrupted something
> also described as NMI.

I'm not sure how it can prevent... anyway because we have to run a single-stepping for
recovery, and kprobe already check the reentered kprobes and skip user-handlers in
such case.

Thank you,

> 
> 
> > +	}
> > +
> > +	preempt_disable();
> > +
> > +	/* This code is a bit fragile.  Test it. */
> > +	RCU_LOCKDEP_WARN(!rcu_is_watching(), "exception_enter didn't work");
> > +}
> > +NOKPROBE_SYMBOL(debug_exception_enter);
> 
> 
> Thanks,
> 
> James


-- 
Masami Hiramatsu <mhiramat@...nel.org>