Message-ID: <Y+ZsWx4gnx4Cak7D@lothringen>
Date: Fri, 10 Feb 2023 17:10:03 +0100
From: Frederic Weisbecker <frederic@...nel.org>
To: Peter Zijlstra <peterz@...radead.org>
Cc: LKML <linux-kernel@...r.kernel.org>,
Alexey Dobriyan <adobriyan@...il.com>,
Wei Li <liwei391@...wei.com>,
Mirsad Goran Todorovac <mirsad.todorovac@....unizg.hr>,
Thomas Gleixner <tglx@...utronix.de>,
Yu Liao <liaoyu15@...wei.com>, Hillf Danton <hdanton@...a.com>,
Ingo Molnar <mingo@...nel.org>
Subject: Re: [PATCH 4/6] timers/nohz: Add a comment about broken iowait counter update race

On Fri, Feb 10, 2023 at 03:39:43PM +0100, Peter Zijlstra wrote:
> On Fri, Feb 10, 2023 at 03:09:15PM +0100, Frederic Weisbecker wrote:
> > The per-cpu iowait task counter is incremented locally upon sleeping.
> > But since the task can be woken to (and by) another CPU, the counter may
> > then be decremented remotely. This is the source of a race involving
> > readers VS writer of idle/iowait sleeptime.
> >
> > The following scenario shows an example where a /proc/stat reader
> > observes a pending sleep time as IO whereas that pending sleep time
> > later eventually gets accounted as non-IO.
> >
> >    CPU 0                       CPU 1                       CPU 2
> >    -----                       -----                       ------
> >    //io_schedule() TASK A
> >    current->in_iowait = 1
> >    rq(0)->nr_iowait++
> >    //switch to idle
> >                                // READ /proc/stat
> >                                // See nr_iowait_cpu(0) == 1
> >                                return ts->iowait_sleeptime +
> >                                       ktime_sub(ktime_get(), ts->idle_entrytime)
> >
> >                                                            //try_to_wake_up(TASK A)
> >                                                            rq(0)->nr_iowait--
> >    //idle exit
> >    // See nr_iowait_cpu(0) == 0
> >    ts->idle_sleeptime += ktime_sub(ktime_get(), ts->idle_entrytime)
> >
> > As a result subsequent reads on /proc/stat may expose backward progress.
> >
> > This is unfortunately hardly fixable. Just add a comment about that
> > condition.
>
> It is far worse than that, the whole concept of per-cpu iowait is
> absurd. Also see the comment near nr_iowait().

Alas, I know :-(