Message-ID: <4F3978C5.7080901@linux.vnet.ibm.com>
Date: Tue, 14 Feb 2012 02:25:33 +0530
From: "Srivatsa S. Bhat" <srivatsa.bhat@...ux.vnet.ibm.com>
To: Venki Pallipadi <venki@...gle.com>
CC: Tony Luck <tony.luck@...il.com>,
Rusty Russell <rusty@...tcorp.com.au>,
Andrew Morton <akpm@...ux-foundation.org>,
KOSAKI Motohiro <kosaki.motohiro@...il.com>,
KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
Mike Travis <travis@....com>,
"Paul E. McKenney" <paul.mckenney@...aro.org>,
"Rafael J. Wysocki" <rjw@...k.pl>,
Paul Gortmaker <paul.gortmaker@...driver.com>,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] Avoid mask based num_possible_cpus and num_online_cpus -v5

On 02/14/2012 02:13 AM, Venki Pallipadi wrote:
> On Mon, Feb 13, 2012 at 12:25 PM, Srivatsa S. Bhat
> <srivatsa.bhat@...ux.vnet.ibm.com> wrote:
>> On 02/14/2012 01:24 AM, Tony Luck wrote:
>>
>>> On Thu, Feb 2, 2012 at 12:03 PM, Rusty Russell <rusty@...tcorp.com.au> wrote:
>>>> IIRC playing with 3 archs boot code seemed like a recipe for disaster.
>>>> Feel free to try to fix this in -next though, and see what breaks...
>>>
>>> ia64 is what breaks ... well not actually broken ... but some very
>>> weird delays that show up in different places depending on whether
>>> this patch is present.
>>>
>>> First linux-next kernel to be blessed with this patch was
>>> next-20120210. Booting it I see:
>>> [ 7.164233] Switching to clocksource itc
>>> [ 146.077315] pnp: PnP ACPI init
>>>
>>> An ugly 138.913 second delay. Digging in the code showed that the bad bits
>>> happened inside stop_machine()
>>>
>>> Reverting just this patch makes this big delay disappear:
>>>
>>> [ 32.780232] Switching to clocksource itc
>>> [ 32.832100] pnp: PnP ACPI init
>>>
>>> but notice that it takes 25 extra seconds to get to this point in the
>>> boot (and while we expect to save some time by not re-computing
>>> num_online_cpus each time we need it ... this looks to be a lot more
>>> than I'd expect!)
>>>
>>
>>
>> Oh no!! ia64 directly uses cpu_set() and cpu_clear() on cpu_online_map!!
>> Grr.. It means num_online_cpus can be different from the actual number of
>> online cpus because it doesn't go through the set_cpu_online() path.. I haven't
>> yet pin-pointed the exact problem, but this definitely doesn't look good...
>>
>
> This feels like a minefield in general. ia64, mips and um seem to
> have cpu_set and cpu_clear of cpu_online_map and/or cpu_possible_map
> in there.
>
Since I had almost never seen code using "cpu_online_map" instead of
"cpu_online_mask", I hadn't checked it while reviewing your patch... :-(
Honestly, it is only now that I realized that there is this other variant too!
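
To make the failure mode concrete, here is a minimal userspace sketch. It is
not kernel code, and it assumes the patch keeps a separate counter that is
updated only in the set_cpu_online() path. All model_* names and the two
globals are hypothetical stand-ins for illustration.

```c
#include <stdbool.h>

/* Userspace model only: one bit per CPU plus a separately cached count. */
static unsigned long long online_map; /* stands in for cpu_online_map */
static int nr_online_cached;          /* stands in for the cached online count */

/* Accessor path: the mask and the cached count are updated together. */
static void model_set_cpu_online(int cpu, bool online)
{
    unsigned long long bit = 1ULL << cpu;

    if (online && !(online_map & bit)) {
        online_map |= bit;
        nr_online_cached++;
    } else if (!online && (online_map & bit)) {
        online_map &= ~bit;
        nr_online_cached--;
    }
}

/* Direct path, like cpu_set() on the map itself: the count is not touched. */
static void model_cpu_set_direct(int cpu)
{
    online_map |= 1ULL << cpu;
}

/* Mask-based count: number of bits set in the map. */
static int model_weight(void)
{
    int n = 0;

    for (unsigned long long m = online_map; m; m &= m - 1)
        n++;
    return n;
}
```

If one CPU is brought up via model_set_cpu_online() and two more via
model_cpu_set_direct(), model_weight() sees three CPUs while nr_online_cached
still says one. Any caller that trusts the cached value then disagrees with
the mask about how many CPUs are online.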

Regards,
Srivatsa S. Bhat