lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20200309022829.GB9615@lenoir>
Date:   Mon, 9 Mar 2020 03:28:30 +0100
From:   Frederic Weisbecker <frederic@...nel.org>
To:     Alex Belits <abelits@...vell.com>
Cc:     "mingo@...nel.org" <mingo@...nel.org>,
        "peterz@...radead.org" <peterz@...radead.org>,
        "rostedt@...dmis.org" <rostedt@...dmis.org>,
        Prasun Kapoor <pkapoor@...vell.com>,
        "tglx@...utronix.de" <tglx@...utronix.de>,
        "linux-api@...r.kernel.org" <linux-api@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "catalin.marinas@....com" <catalin.marinas@....com>,
        "linux-arm-kernel@...ts.infradead.org" 
        <linux-arm-kernel@...ts.infradead.org>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        "davem@...emloft.net" <davem@...emloft.net>,
        "linux-arch@...r.kernel.org" <linux-arch@...r.kernel.org>,
        "will@...nel.org" <will@...nel.org>
Subject: Re: [EXT] Re: [PATCH 11/12] task_isolation: kick_all_cpus_sync:
 don't kick isolated cpus

On Sun, Mar 08, 2020 at 06:48:43AM +0000, Alex Belits wrote:
> On Fri, 2020-03-06 at 16:34 +0100, Frederic Weisbecker wrote:
> > On Wed, Mar 04, 2020 at 04:15:24PM +0000, Alex Belits wrote:
> > > From: Yuri Norov <ynorov@...vell.com>
> > > 
> > > Make sure that kick_all_cpus_sync() does not call CPUs that are
> > > running
> > > isolated tasks.
> > > 
> > > Signed-off-by: Alex Belits <abelits@...vell.com>
> > > ---
> > >  kernel/smp.c | 14 +++++++++++++-
> > >  1 file changed, 13 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/kernel/smp.c b/kernel/smp.c
> > > index 3a8bcbdd4ce6..d9b4b2fedfed 100644
> > > --- a/kernel/smp.c
> > > +++ b/kernel/smp.c
> > > @@ -731,9 +731,21 @@ static void do_nothing(void *unused)
> > >   */
> > >  void kick_all_cpus_sync(void)
> > >  {
> > > +	struct cpumask mask;
> > > +
> > >  	/* Make sure the change is visible before we kick the cpus */
> > >  	smp_mb();
> > > -	smp_call_function(do_nothing, NULL, 1);
> > > +
> > > +	preempt_disable();
> > > +#ifdef CONFIG_TASK_ISOLATION
> > > +	cpumask_clear(&mask);
> > > +	task_isolation_cpumask(&mask);
> > > +	cpumask_complement(&mask, &mask);
> > > +#else
> > > +	cpumask_setall(&mask);
> > > +#endif
> > > +	smp_call_function_many(&mask, do_nothing, NULL, 1);
> > > +	preempt_enable();
> > >  }
> > 
> > That looks very dangerous, the callers of kick_all_cpus_sync() want
> > to
> > sync all CPUs for a reason. You will rather need to fix the callers.
> 
> All callers of this use this function to synchronize IPIs and icache,
> and they have no idea if there is anything special about the state of
> CPUs. If a task is isolated, this call would not be necessary because
> the task is in userspace, and it would have to enter kernel for any of
> that to become relevant but then it will have to switch from userspace
> to kernel. At worst it is returning to userspace after entering
> isolation or back in kernel running cleanup after isolation is broken
> but before tsk_thread_flags_cache is updated. There will be nothing to
> run on the same CPU because we have just left isolation, so task will
> either exit or go back to userspace.
> 
> Is there any reason for a race at that point?


I can imagine several races:

1) The isolated task has set the cpumask but hasn't exited the kernel
yet. If it still runs kernel code while kick_all_cpus_sync() has completed,
we fail.

2) The isolated task is running do_exit() but the caller of kick_all_cpus_sync()
still sees the target as part of the isolated mask.

3) The isolated task has just set the isolated cpumask and entered userspace
but the caller still don't see the new value in the isolated cpumask, so it sends
the IPI to the isolated CPU.

Besides, any caller of kick_all_cpus_sync() is in its right to expect that
everything preceding the call to that function is visible to all CPUs
after that call. If you spare that IPI to an isolated CPU, what ensures
it will see what it is supposed to once it calls do_exit() or prctl()?

Is there a way we could fix the callers instead? For example synchronize_rcu()
could be a replacement (it handles very well nohz_full CPUs), provided the
callsites can sleep. It seems to be the case for __do_tune_cpucache() at least.

flush_icache_range() is scarier I have to admit, doesn't look like it can
sleep.


> > Thanks.
> > 
> > >  EXPORT_SYMBOL_GPL(kick_all_cpus_sync);
> > >  
> > > -- 
> > > 2.20.1
> > > 
> 
> -- 
> Alex

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ