Message-ID: <SG2P15301MB0015F23FF0E44BE991E8C38EBF930@SG2P15301MB0015.APCP153.PROD.OUTLOOK.COM>
Date: Tue, 15 May 2018 03:02:27 +0000
From: Dexuan Cui <decui@...rosoft.com>
To: Linus Torvalds <torvalds@...ux-foundation.org>
CC: Ingo Molnar <mingo@...nel.org>,
Alexey Dobriyan <adobriyan@...il.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Peter Zijlstra <peterz@...radead.org>,
Thomas Gleixner <tglx@...utronix.de>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Rakib Mullick <rakib.mullick@...il.com>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: RE: for_each_cpu() is buggy for UP kernel?
> From: Linus Torvalds <torvalds@...ux-foundation.org>
> Sent: Sunday, May 13, 2018 11:22
> On Tue, May 8, 2018 at 11:24 PM Dexuan Cui <decui@...rosoft.com> wrote:
>
> > Should we fix the for_each_cpu() in include/linux/cpumask.h for UP?
>
> As Thomas points out, this has come up before.
>
> One of the issues is historical - we tried very hard to make the SMP code
> not cause code generation problems for UP, and part of that was just that
> all these loops were literally designed to entirely go away under UP. It
> still *looks* syntactically like a loop, but an optimizing compiler will
> see that there's nothing there, and "for_each_cpu(...) x" essentially just
> turns into "x" on UP. An empty mask simply generally doesn't make sense,
> since opn UP you also don't have any masking of CPU ops, so the mask is
> ignored, and that helps the code generation immensely.
>
> If you have to load and test the mask, you immediately lose out badly in
> code generation.
Thank you all for the insights and the detailed background!
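For reference, the UP definition in include/linux/cpumask.h that makes the
loop collapse looks like this (quoting from my reading of the header; the
mask is only evaluated for side effects and otherwise ignored):

/* UP (NR_CPUS == 1): the mask is never tested, so the "loop" always
 * runs exactly once, for cpu 0, even if the mask is empty. */
#define for_each_cpu(cpu, mask)			\
	for ((cpu) = 0; (cpu) < 1; (cpu)++, (void)mask)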
> So honestly, I'd really prefer to keep our current behavior. Perhaps with a
> debug option that actually tests (on SMP - because that's what every
> developer is actually _using_ these days) that the mask isn't empty. But
> I'm not sure that would find this case, since presumably on SMP it might
> never be empty.
I agree.
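If such a debug option materializes, I imagine it could look something like
the sketch below; the option name CONFIG_DEBUG_CPUMASK and the exact shape
are only my guesses, with the loop body mirroring the current SMP
for_each_cpu():

/* Hypothetical sketch: on SMP, warn once if a caller hands the
 * iterator an empty mask; otherwise behave like for_each_cpu(). */
#ifdef CONFIG_DEBUG_CPUMASK
#define for_each_cpu_checked(cpu, mask)				\
	for ((cpu) = (WARN_ON_ONCE(cpumask_empty(mask)), -1);	\
	     (cpu) = cpumask_next((cpu), (mask)),		\
	     (cpu) < nr_cpu_ids;)
#else
#define for_each_cpu_checked(cpu, mask)	for_each_cpu((cpu), (mask))
#endif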
> Now, there is likely a fairly good argument that UP is getting _so_
> uninteresting that we shouldn't even worry about code generation. But the
> counter-argument to that is that if people are using UP in this day and
> age, they probably are using some really crappy hardware that needs all the
> help it can get.
FWIW, I happened to find this issue in an SMP virtual machine, but the
customer's kernel was built with CONFIG_SMP disabled. After spending a day
debugging a strange boot-up delay, caused by an unexpected PIT interrupt
storm, I finally tracked it down to the UP version of for_each_cpu().
The function exposing the issue is kernel/time/tick-broadcast.c:
tick_handle_oneshot_broadcast().
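To illustrate the failure mode, here is a minimal userspace reduction (my
own sketch, not kernel code) showing that the UP macro runs the body once
even when no bit is set in the mask:

#include <stdio.h>

/* Reduction of the UP for_each_cpu(): the mask is ignored entirely. */
#define for_each_cpu_up(cpu, mask) \
	for ((cpu) = 0; (cpu) < 1; (cpu)++, (void)(mask))

int main(void)
{
	unsigned long oneshot_mask = 0;		/* empty mask */
	int cpu;

	/* The body executes for cpu 0 anyway -- the same surprise that
	 * made tick_handle_oneshot_broadcast() touch a CPU it should
	 * have skipped. */
	for_each_cpu_up(cpu, oneshot_mask)
		printf("body ran for cpu %d despite an empty mask\n", cpu);

	return 0;
}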
If you're OK with the below fix (not tested yet), I'll submit a patch for it:
--- a/kernel/time/tick-broadcast.c
+++ b/kernel/time/tick-broadcast.c
@@ -616,6 +616,10 @@ static void tick_handle_oneshot_broadcast(struct clock_event_device *dev)
 	now = ktime_get();
 	/* Find all expired events */
 	for_each_cpu(cpu, tick_broadcast_oneshot_mask) {
+#ifndef CONFIG_SMP
+		if (cpumask_empty(tick_broadcast_oneshot_mask))
+			break;
+#endif
 		td = &per_cpu(tick_cpu_device, cpu);
 		if (td->evtdev->next_event <= now) {
 			cpumask_set_cpu(cpu, tmpmask);
> At least for now, I'd rather have this inconsistency, because it really
> makes a surprisingly *big* difference in code generation. From the little
> test I just did, adding that mask testing to a *single* case of
> for_each_cpu() added 20 instructions. I didn't look at exactly why that
> happened (because the code generation was so radically different), but it
> was very noticeable. I used your macro replacement in kernel/taskstats.c in
> case you want to try to dig into what happened, but I'm not surprised. It
> really turns an unconditional trivial loop into a much more complex thing
> that needs to look at and test a value that we didn't care about before.
I agree.
> Maybe we should introduce a "for_each_cpu_maybe_empty()" helper for
> cases like this?
> Linus
Sounds like a good idea.
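Something along these lines, perhaps (a rough sketch only; the name comes
from your mail, the UP body is my guess -- SMP keeps today's behavior and
only UP pays for the extra test):

#ifdef CONFIG_SMP
#define for_each_cpu_maybe_empty(cpu, mask)	\
	for_each_cpu((cpu), (mask))
#else
/* UP: skip the single iteration when the mask is empty. */
#define for_each_cpu_maybe_empty(cpu, mask)	\
	for ((cpu) = 0; !cpumask_empty(mask) && (cpu) < 1; (cpu)++)
#endif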
Thanks,
-- Dexuan