linux-kernel - Re: WARNING in timer_wait_running

Message-ID: <ZDADdMnY0oW2k5BV@lothringen>
Date:   Fri, 7 Apr 2023 13:50:12 +0200
From:   Frederic Weisbecker <frederic@...nel.org>
To:     Thomas Gleixner <tglx@...utronix.de>
Cc:     Marco Elver <elver@...gle.com>,
        syzbot <syzbot+3b14b2ed9b3d06dcaa07@...kaller.appspotmail.com>,
        linux-kernel@...r.kernel.org, syzkaller-bugs@...glegroups.com,
        Anna-Maria Behnsen <anna-maria@...utronix.de>,
        Jacob Keller <jacob.e.keller@...el.com>,
        "Paul E. McKenney" <paulmck@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>
Subject: Re: WARNING in timer_wait_running

On Fri, Apr 07, 2023 at 10:44:22AM +0200, Thomas Gleixner wrote:
> On Thu, Apr 06 2023 at 21:37, Thomas Gleixner wrote:
> > On Thu, Apr 06 2023 at 00:19, Frederic Weisbecker wrote:
> >> We could arrange for doing the same thing as hrtimer_cancel_wait_running()
> >> but for posix cpu timers, with taking a similar lock within
> >> handle_posix_cpu_timers() that timer_wait_running() could sleep on and
> >> inject its PI into.
> >
> > I have a faint memory that we discussed something like that, but there
> > was an issue which completely escaped my memory.
> 
> Now memory came back. The problem with posix CPU timers is that it is
> not really known to the other side which task is actually doing the
> expiry. For process wide timers this could be any task in the process.
> 
> For hrtimers this works because the expiring context is known.

So if posix_cpu_timer_del() were to clear ctmr->pid to NULL and then
delay put_pid() with RCU, we could retrieve that information without
holding the timer lock (with appropriate RCU accesses all around).
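
As a rough userspace illustration of that ordering (unpublish the pointer
first, release it only after readers are done), something like the sketch
below. All names here (pid_sketch, timer_sketch, timer_del,
grace_period_elapsed, timer_target) are hypothetical stand-ins, not kernel
code, and the explicit grace_period_elapsed() call only models what RCU
would guarantee; it is not an RCU implementation.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; none of these are kernel types. */
struct pid_sketch {
    int nr;
};

struct timer_sketch {
    _Atomic(struct pid_sketch *) pid;  /* readers load this without a lock */
    struct pid_sketch *deferred;       /* old value awaiting release */
};

/* Deletion: unpublish the pointer now, keep the object alive for later. */
static void timer_del(struct timer_sketch *t)
{
    t->deferred = atomic_exchange_explicit(&t->pid, NULL,
                                           memory_order_acq_rel);
}

/*
 * Assumption: the caller invokes this only once no reader can still hold
 * the old pointer. In the kernel, RCU would provide that guarantee.
 */
static void grace_period_elapsed(struct timer_sketch *t)
{
    free(t->deferred);
    t->deferred = NULL;
}

/* Lockless reader: returns the target number, or -1 if already deleted. */
static int timer_target(struct timer_sketch *t)
{
    struct pid_sketch *p = atomic_load_explicit(&t->pid,
                                                memory_order_acquire);
    return p ? p->nr : -1;
}
```

The point of the split is that a reader which loaded the pointer just before
timer_del() ran can still dereference it safely until the deferred release.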

> 
> > Though we should quickly shut this warning up for the !RT case by
> > providing a callback which does
> >
> >   WARN_ON_ONCE(IS_ENABLED(CONFIG_PREEMPT_RT));
> >
> > and let the RT folks deal with it.
> 
> OTOH, this is not only a RT issue.
> 
> On preemptible kernels the task which collected the expired timers onto
> a local list and set the firing bit, can be preempted after dropping
> sighand lock. So the other side still can busy wait for quite a while.
> Same is obviously true for guests independent of preemption when the
> vCPU gets scheduled out.

Ok, fortunately task work is a sleepable context so using a mutex would
work for everyone, at the cost of a new mutex in task_struct though.
Lemme try something.
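
For illustration only, the general shape of that idea in userspace pthreads
might look like the sketch below. The struct and function names
(task_sketch, handle_expiry, wait_running) are hypothetical and this is not
the eventual kernel patch: the expiry side holds a mutex while it runs the
expired timers, and the deleting side sleeps on that mutex instead of busy
waiting.

```c
#include <pthread.h>

/* Hypothetical stand-in for the per-task state; not task_struct. */
struct task_sketch {
    pthread_mutex_t expiry_lock;  /* held while expired timers are run */
    int fired;                    /* number of expiry passes completed */
};

/* Expiry side: runs in a sleepable context, so taking a mutex is fine. */
static void handle_expiry(struct task_sketch *t)
{
    pthread_mutex_lock(&t->expiry_lock);
    t->fired++;  /* stand-in for firing the collected timers */
    pthread_mutex_unlock(&t->expiry_lock);
}

/*
 * Deletion side: lock and immediately unlock. If an expiry pass is in
 * progress, this blocks until it has finished; otherwise it returns
 * right away.
 */
static void wait_running(struct task_sketch *t)
{
    pthread_mutex_lock(&t->expiry_lock);
    pthread_mutex_unlock(&t->expiry_lock);
}
```

In the kernel a sleeping lock of this kind would also let the waiter boost
the lock holder through priority inheritance on RT, which plain pthread
mutexes with default attributes do not do.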

> 
> Thanks,
> 
>         tglx
>