Date:	Sat, 13 Nov 2010 17:15:10 -0800
From:	Yinghai Lu <yinghai@...nel.org>
To:	Myron Stowe <myron.stowe@...com>
Cc:	Eric Dumazet <eric.dumazet@...il.com>,
	Bjorn Helgaas <bjorn.helgaas@...com>,
	Ingo Molnar <mingo@...e.hu>,
	Peter Zijlstra <peterz@...radead.org>,
	Venkatesh Pallipadi <venki@...gle.com>,
	Nikhil Rao <ncrao@...gle.com>,
	Takuya Yoshikawa <yoshikawa.takuya@....ntt.co.jp>,
	linux-kernel@...r.kernel.org, knikanth@...e.de, rjenties@...gle.com
Subject: Re: divide error in select_task_rq_fair()

On Thu, Nov 11, 2010 at 10:28 AM, Myron Stowe <myron.stowe@...com> wrote:
> On Fri, 2010-11-05 at 07:17 +0100, Eric Dumazet wrote:
>> On Thursday, 04 November 2010 at 20:00 -0600, Bjorn Helgaas wrote:
>>
>> > Is that going to help you debug the problem?  The solution is not going
>> > to be something like "set NR_CPUS=x".  If NR_CPUS is too small, the
>> > machine should still *boot*, even if we can't use all the CPUs in the
>> > box.
>> >
>>
>> Yes, it will help to understand the layout of cpu / domains and make
>> appropriate changes.
>>
>> Alternative is you send me such a machine :=)
>
> I opened a BZ on this issue as it seems to be a regression:
> https://bugzilla.kernel.org/show_bug.cgi?id=22662
>
> As indicated in the BZ, I also bisected the kernel, which gave the
> following result.  Reverting 50f2d7f682f9c0ed58191d0982fe77888d59d162
> re-enabled booting on the box in question (an HP dl980g7).  Let me
> know what further info you need or patches to test for debugging this.
>
> Thanks,
>
> commit 50f2d7f682f9c0ed58191d0982fe77888d59d162
> Author: Nikanth Karthikesan <knikanth@...e.de>
> Date:   Thu Sep 30 17:34:10 2010 +0530
>
>    x86, numa: Assign CPUs to nodes in round-robin manner on fake NUMA
>
>    commit d9c2d5ac6af87b4491bff107113aaf16f6c2b2d9 "x86, numa: Use near(er)
>    online node instead of roundrobin for NUMA" changed NUMA initialization on
>    Intel to choose the nearest online node or first node.  Fake NUMA would be
>    better off with round-robin initialization, instead of all CPUs on the
>    first node.  Change the choice of first node back to round-robin.
>
>    For testing NUMA kernel behaviour without cpusets and NUMA aware
>    applications, it would be better to have cpus in different nodes, rather
>    than all in a single node.  With cpusets migration of tasks scenarios
>    cannot not be tested.
>
>    I guess having it round-robin shouldn't affect the use cases for all cpus
>    on the first node.
>
>    The code comments in arch/x86/mm/numa_64.c:759 indicate that this used to
>    be the case, which was changed by commit d9c2d5ac6.  It changed from
>    roundrobin to nearer or first node.  And I couldn't find any reason for
>    this change in its changelog.
>
>    Signed-off-by: Nikanth Karthikesan <knikanth@...e.de>
>    Cc: David Rientjes <rientjes@...gle.com>
>    Signed-off-by: Andrew Morton <akpm@...ux-foundation.org>


Please check:

http://lkml.org/lkml/2010/11/13/176

Yinghai
