Message-ID: <20210611123740.GA143945@lothringen>
Date: Fri, 11 Jun 2021 14:37:40 +0200
From: Frederic Weisbecker <frederic@...nel.org>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Thomas Gleixner <tglx@...utronix.de>,
LKML <linux-kernel@...r.kernel.org>,
"Eric W . Biederman" <ebiederm@...ssion.com>,
Oleg Nesterov <oleg@...hat.com>, Ingo Molnar <mingo@...nel.org>
Subject: Re: [PATCH 1/6] posix-cpu-timers: Fix rearm racing against process tick

On Fri, Jun 11, 2021 at 01:49:08PM +0200, Peter Zijlstra wrote:
> On Wed, Jun 09, 2021 at 01:54:00PM +0200, Frederic Weisbecker wrote:
> > On Fri, Jun 04, 2021 at 01:31:54PM +0200, Frederic Weisbecker wrote:
> > > Since the process wide cputime counter is started locklessly from
> > > posix_cpu_timer_rearm(), it can be concurrently stopped by operations
> > > on other timers from the same thread group, such as in the following
> > > unlucky scenario:
> > >
> > >     CPU 0                                    CPU 1
> > >     -----                                    -----
> > >                                          timer_settime(TIMER B)
> > >     posix_cpu_timer_rearm(TIMER A)
> > >         cpu_clock_sample_group()
> > >             (pct->timers_active already true)
> > >
> > >                                          handle_posix_cpu_timers()
> > >                                              check_process_timers()
> > >                                                  stop_process_timers()
> > >                                                      pct->timers_active = false
> > >         arm_timer(TIMER A)
> > >
> > >     tick -> run_posix_cpu_timers()
> > >         // sees !pct->timers_active, ignore
> > >         // our TIMER A
> > >
> > > Fix this with simply locking process wide cputime counting start and
> > > timer arm in the same block.
> > >
> > > Signed-off-by: Frederic Weisbecker <frederic@...nel.org>
> > > Cc: Oleg Nesterov <oleg@...hat.com>
> > > Cc: Thomas Gleixner <tglx@...utronix.de>
> > > Cc: Peter Zijlstra (Intel) <peterz@...radead.org>
> > > Cc: Ingo Molnar <mingo@...nel.org>
> > > Cc: Eric W. Biederman <ebiederm@...ssion.com>
> >
> > Fixes: 60f2ceaa8111 ("posix-cpu-timers: Remove unnecessary locking around cpu_clock_sample_group")
> > Cc: stable@...r.kernel.org
>
> Acked-by: Peter Zijlstra (Intel) <peterz@...radead.org>
>
>
> Problem seems to be calling cpu_clock_sample_group(.start = true)
> without sighand locked. Do we want a lockdep assertion for that?

That's part of the problem. The other part is that it must be done under the
same locked sequence as arm_timer(). So yes, a lockdep assertion would already
be a good indicator that something is going wrong.

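To illustrate both points, here is a rough userspace sketch, not the kernel
patch. Every name in it (struct group_timers, group_lock(), rearm(), ...) is
hypothetical: a pthread mutex stands in for the sighand lock, a plain flag
stands in for lockdep's held-lock tracking, and the stop path is simplified to
"stop counting when nothing is armed". The only thing it is meant to show is
that starting the counter and arming the timer happen in one locked block, with
an assertion catching callers that forget the lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the thread group's shared timer state. */
struct group_timers {
	pthread_mutex_t lock;	/* plays the role of the sighand lock */
	bool lock_held;		/* crude substitute for lockdep tracking */
	bool timers_active;	/* models pct->timers_active */
	int armed;		/* number of armed process wide timers */
};

static void group_lock(struct group_timers *g)
{
	pthread_mutex_lock(&g->lock);
	g->lock_held = true;
}

static void group_unlock(struct group_timers *g)
{
	g->lock_held = false;
	pthread_mutex_unlock(&g->lock);
}

/* Models cpu_clock_sample_group(.start = true): must run under the lock. */
static void sample_group_start(struct group_timers *g)
{
	assert(g->lock_held);	/* the "lockdep assertion" of this sketch */
	g->timers_active = true;
}

/* Models arm_timer(): must run under the lock as well. */
static void arm_timer_model(struct group_timers *g)
{
	assert(g->lock_held);
	g->armed++;
}

/* Rearm path: counter start and timer arm share one critical section. */
static void rearm(struct group_timers *g)
{
	group_lock(g);
	sample_group_start(g);
	arm_timer_model(g);
	group_unlock(g);
}

/* Simplified stop path: only stops counting when no timer is armed. */
static void stop_if_idle(struct group_timers *g)
{
	group_lock(g);
	if (g->armed == 0)
		g->timers_active = false;
	group_unlock(g);
}

/* Simplified tick: a timer is only noticed while counting is active. */
static bool tick_sees_timer(struct group_timers *g)
{
	bool seen;

	group_lock(g);
	seen = g->timers_active && g->armed > 0;
	group_unlock(g);
	return seen;
}
```

Because rearm() holds the lock across both steps, stop_if_idle() can only run
entirely before it (and the counter gets restarted) or entirely after it (and
it sees the armed timer), so the tick cannot end up with an armed timer and an
inactive counter in this model.
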
Thanks.