Message-ID: <c7b58543-ffd2-44bb-1005-e9733782bad5@redhat.com>
Date: Sat, 10 Nov 2018 08:59:30 -0500
From: Waiman Long <longman@...hat.com>
To: Qian Cai <cai@....us>
Cc: "Joel Fernandes (Google)" <joel@...lfernandes.org>,
Yang Shi <yang.shi@...ux.alibaba.com>,
Zhong Jiang <zhongjiang@...wei.com>,
Arnd Bergmann <arnd@...db.de>,
open list <linux-kernel@...r.kernel.org>,
Thomas Gleixner <tglx@...utronix.de>
Subject: Re: ODEBUG: Out of memory. ODEBUG disabled
On 11/09/2018 08:45 PM, Qian Cai wrote:
>> Sent: Friday, November 09, 2018 at 5:08 PM
>> From: "Waiman Long" <longman@...hat.com>
>> To: "Qian Cai" <cai@....us>, "Yang Shi" <yang.shi@...ux.alibaba.com>
>> Cc: "open list" <linux-kernel@...r.kernel.org>, "Thomas Gleixner" <tglx@...utronix.de>, "Arnd Bergmann" <arnd@...db.de>, "Joel Fernandes (Google)" <joel@...lfernandes.org>, "Zhong Jiang" <zhongjiang@...wei.com>
>> Subject: Re: ODEBUG: Out of memory. ODEBUG disabled
>>
>> On 11/09/2018 04:51 PM, Qian Cai wrote:
>>>> On Nov 9, 2018, at 4:42 PM, Yang Shi <yang.shi@...ux.alibaba.com> wrote:
>>>>
>>>>
>>>>
>>>> On 11/9/18 1:36 PM, Qian Cai wrote:
>>>>> It is a bit annoying that on this aarch64 server with 64 CPUs,
>>>>> booting the latest mainline (3541833fd1f2) always causes object
>>>>> debugging to run out of memory.
>>>> Could you please paste the detailed failure log?
>>> I assume you mean dmesg.
>>>
>>> Here is the dmesg for 64 CPUs,
>>> https://paste.ubuntu.com/p/BnhvXXhn7k/
>>>>> I have to boot the kernel with only 16 CPUs instead (nr_cpus=16)
>>>>> to make it work. Is it expected that object debugging is not going
>>>>> to work with large machines?
>>>> I don't think so. I suppose it works well with a large number of CPUs on x86.
>>> Here is the one with nr_cpus workaround,
>>> https://paste.ubuntu.com/p/qMpd2CCPSV/
>> The debugobjects code has a set of 1024 statically allocated debug
>> objects that can be used in early boot before the slab memory allocator
>> is initialized. Apparently, the system may have used up all the
>> statically allocated objects. Try doubling ODEBUG_POOL_SIZE to see if it
>> helps.
> Great, you are right. Doubling the size makes it work. Does it make sense
> to have a kconfig option instead?
First, I think you need to figure out why your system uses up so many
debug objects in early boot. If there is a legitimate reason for this
behavior, we can talk about having a kconfig option to increase the
pool size.
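
To illustrate the failure mode discussed above, here is a minimal
user-space sketch in C. It is not the kernel's lib/debugobjects.c code:
the model_* names and struct model_obj are hypothetical, and the only
details taken from this thread are the ODEBUG_POOL_SIZE name and the
value 1024. It assumes that early boot can draw only from a fixed static
pool and that the first failed allocation disables tracking, which is
what the "Out of memory. ODEBUG disabled" message suggests.

```c
#include <assert.h>
#include <stddef.h>

/* Pool size quoted in this thread; overridable at build time for the experiment. */
#ifndef ODEBUG_POOL_SIZE
#define ODEBUG_POOL_SIZE 1024
#endif

/* Hypothetical stand-in for a tracked debug object. */
struct model_obj {
	size_t id;
};

/* Fixed pool that exists before any dynamic allocator is available. */
static struct model_obj static_pool[ODEBUG_POOL_SIZE];
static size_t pool_used;
static int model_enabled_flag = 1;

/* Return the model to its boot-time state. */
static void model_reset(void)
{
	pool_used = 0;
	model_enabled_flag = 1;
}

static int model_enabled(void)
{
	return model_enabled_flag;
}

/* Hand out one object from the static pool, or disable tracking when it is empty. */
static struct model_obj *model_alloc(size_t id)
{
	if (!model_enabled_flag)
		return NULL;

	if (pool_used == ODEBUG_POOL_SIZE) {
		/* Pool exhausted and no fallback allocator yet: give up for good. */
		model_enabled_flag = 0;
		return NULL;
	}

	assert(pool_used < ODEBUG_POOL_SIZE);
	static_pool[pool_used].id = id;
	return &static_pool[pool_used++];
}

/* Simulate 'requests' early-boot allocations; return how many were tracked. */
static size_t model_early_boot(size_t requests)
{
	size_t tracked = 0;

	for (size_t i = 0; i < requests; i++) {
		if (model_alloc(i))
			tracked++;
	}
	return tracked;
}
```

In this model, model_early_boot(ODEBUG_POOL_SIZE + 1) tracks exactly
ODEBUG_POOL_SIZE objects and leaves tracking disabled, while building
with -DODEBUG_POOL_SIZE=2048 lets a 1025-request run finish with
tracking still enabled. That mirrors the doubling experiment, but it
says nothing about why a given system needs that many objects.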
>> There are also quite a number of warnings in your console log. So there
>> is certainly something wrong with your kernel or config options.
> Yes, I am working on all those warnings. This one was found by ODEBUG,
> https://lkml.org/lkml/2018/11/10/136
Cheers,
Longman