[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20180112185428.GE1950@lerouge>
Date: Fri, 12 Jan 2018 19:54:29 +0100
From: Frederic Weisbecker <frederic@...nel.org>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Eric Dumazet <edumazet@...gle.com>,
LKML <linux-kernel@...r.kernel.org>,
Levin Alexander <alexander.levin@...izon.com>,
Peter Zijlstra <peterz@...radead.org>,
Hannes Frederic Sowa <hannes@...essinduktion.org>,
"Paul E . McKenney" <paulmck@...ux.vnet.ibm.com>,
Wanpeng Li <wanpeng.li@...mail.com>,
Dmitry Safonov <dima@...sta.com>,
Thomas Gleixner <tglx@...utronix.de>,
Radu Rendec <rrendec@...sta.com>,
Ingo Molnar <mingo@...nel.org>,
Stanislaw Gruszka <sgruszka@...hat.com>,
Paolo Abeni <pabeni@...hat.com>,
Rik van Riel <riel@...hat.com>,
Andrew Morton <akpm@...ux-foundation.org>,
David Miller <davem@...emloft.net>
Subject: Re: [RFC PATCH 1/2] softirq: Account time and iteration stats per
vector
On Fri, Jan 12, 2018 at 10:12:32AM -0800, Linus Torvalds wrote:
> On Fri, Jan 12, 2018 at 6:34 AM, Frederic Weisbecker
> <frederic@...nel.org> wrote:
> >
> > That's right. But I thought it was bit large for the stack:
> >
> > struct {
> > u64 time;
> > u64 count;
> > } [NR_SOFTIRQS]
>
> Note that you definitely don't want "u64" here.
>
> Both of these values had better be very limited. The "count" is on the
> order of 10 - it fits in 4 _bits_ without any overflow.
>
> And 'time' is on the order of 2ms, so even if it's in nanoseconds, we
> already know that we want to limit it to a single ms or so (yes, yes,
> right now our limit is 2ms, but I think that's long). So even that
> doesn't need 64-bit.
Ok.
>
> Finally, I think you can join them. If we do a "time or count" limit,
> let's just make the "count" act as some arbitrary fixed time, so that
> we limit things that way.
>
> Say, if we want to limit it to 2ms, consider one count to be 0.2ms. So
> instead of keeping track of count at all, just say "make each softirq
> call count as at least 200,000ns even if the scheduler clock says it's
> less". End result: we'd loop at most ten times.
>
> So now you only need one value, and you know it can't be bigger than 2
> million, so it can be a 32-bit one. Boom. Done.
Right.
Now I believe that the time was added as a limit because count alone was not
reliable enough to diagnose a softirq overrun. But if everyone is fine with
keeping the count as a single metric, I would be much happier because that
means less overhead, no need to fetch the clock, etc...
>
> Also, don't you want these to be percpu, and keep accumulating them
> until you decide to either age them away (just clear it in timer
> interrupt?) or if the value gets so big that you want o fall back to
> the thread instead (and then the thread can clear it every iteration,
> so you don't need to track whether the thread is active or not).
>
> I don't know. I'm traveling today, so I didn't actually have time to
> really look at the patches, I'm just reacting to Eric's reaction.
Clearing the accumulation on tick and flush, that sounds like a good plan.
Well I'm probably not going to use the tick for that because of nohz (again)
but I can check if jiffies changed since we started the accumulation and
reset it if so.
I'm going to respin, thanks!
Powered by blists - more mailing lists