Message-Id: <FE1B58AA-FC0E-4E0A-8A76-F98B70548F36@gmx.us>
Date: Mon, 12 Nov 2018 23:33:01 -0500
From: Qian Cai <cai@....us>
To: Waiman Long <longman@...hat.com>
Cc: Arnd Bergmann <arnd@...db.de>,
Yang Shi <yang.shi@...ux.alibaba.com>,
Zhong Jiang <zhongjiang@...wei.com>,
linux kernel <linux-kernel@...r.kernel.org>,
Thomas Gleixner <tglx@...utronix.de>,
"Joel Fernandes (Google)" <joel@...lfernandes.org>
Subject: Re: ODEBUG: Out of memory. ODEBUG disabled
> On Nov 10, 2018, at 9:11 AM, Qian Cai <cai@....us> wrote:
>
> On 11/10/18 at 8:59 AM, Waiman Long wrote:
>
>> On 11/09/2018 08:45 PM, Qian Cai wrote:
>>>> Sent: Friday, November 09, 2018 at 5:08 PM
>>>> From: "Waiman Long" <longman@...hat.com>
>>>> To: "Qian Cai" <cai@....us>, "Yang Shi" <yang.shi@...ux.alibaba.com>
>>>> Cc: "open list" <linux-kernel@...r.kernel.org>, "Thomas Gleixner" <tglx@...utronix.de>, "Arnd Bergmann" <arnd@...db.de>, "Joel Fernandes (Google)" <joel@...lfernandes.org>, "Zhong Jiang" <zhongjiang@...wei.com>
>>>> Subject: Re: ODEBUG: Out of memory. ODEBUG disabled
>>>>
>>>> On 11/09/2018 04:51 PM, Qian Cai wrote:
>>>>>> On Nov 9, 2018, at 4:42 PM, Yang Shi <yang.shi@...ux.alibaba.com> wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 11/9/18 1:36 PM, Qian Cai wrote:
>>>>>>> It is a bit annoying that on this aarch64 server with 64 CPUs,
>>>>>>> booting the latest mainline (3541833fd1f2) always causes object
>>>>>>> debugging to run out of memory.
>>>>>> Could you please paste the detailed failure log?
>>>>> I assume you mean dmesg.
>>>>>
>>>>> Here is the dmesg for 64 CPUs,
>>>>> https://paste.ubuntu.com/p/BnhvXXhn7k/
>>>>>>> I have to boot the kernel with only 16 CPUs instead (nr_cpus=16)
>>>>>>> to make it work. Is it expected that object debugging is not going
>>>>>>> to work with large machines?
>>>>>> I don't think so. I suppose it works well with a large number of CPUs on x86.
>>>>> Here is the one with nr_cpus workaround,
>>>>> https://paste.ubuntu.com/p/qMpd2CCPSV/
>>>> The debugobjects code has a set of 1024 statically allocated debug
>>>> objects that can be used in early boot before the slab memory allocator
>>>> is initialized. Apparently, the system may have used up all the
>>>> statically allocated objects. Try doubling ODEBUG_POOL_SIZE to see if
>>>> it helps.
>>> Great, you are right. Doubling the size makes it work. Does it make sense
>>> to have a kconfig option instead?
>>
>> First, I think you need to figure out what your system needed to use up
>> so many debug objects in early boot. If there is a legitimate reason for
>> this behavior, we can talk about having a kconfig option to increase that.
> Is anybody not seeing the ODEBUG OOM with more than 64 CPUs? As
> mentioned, restricting to 16 CPUs works fine. How can I figure out why
> the system uses so many debug objects?
On another aarch64 server with 256 CPUs, even doubling
ODEBUG_POOL_SIZE to 2048 still results in "ODEBUG: Out of memory.
ODEBUG disabled".
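
To illustrate why a fixed pool size may not scale with the CPU count, here
is a minimal toy model in C. It is only a sketch, not the code in
lib/debugobjects.c: the names toy_pool, toy_pool_alloc and toy_boot_survives
are hypothetical, and it assumes that early-boot object usage grows linearly
with the number of CPUs, which has not been verified on these machines.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model of a static object pool; NOT the kernel code. */
struct toy_pool {
	size_t free;    /* objects still available in the static pool */
	bool enabled;   /* cleared for good once an allocation fails */
};

static void toy_pool_init(struct toy_pool *p, size_t size)
{
	p->free = size;
	p->enabled = true;
}

/* Take one object; on exhaustion, disable tracking permanently. */
static bool toy_pool_alloc(struct toy_pool *p)
{
	if (!p->enabled)
		return false;
	if (p->free == 0) {
		p->enabled = false; /* analogous to "ODEBUG disabled" */
		return false;
	}
	p->free--;
	return true;
}

/*
 * Simulate early boot under the ASSUMPTION that every CPU needs
 * objs_per_cpu tracked objects before the pool can be refilled.
 * Returns true if tracking is still enabled afterwards.
 */
static bool toy_boot_survives(size_t pool_size, size_t nr_cpus,
			      size_t objs_per_cpu)
{
	struct toy_pool p;

	toy_pool_init(&p, pool_size);
	for (size_t cpu = 0; cpu < nr_cpus; cpu++)
		for (size_t i = 0; i < objs_per_cpu; i++)
			if (!toy_pool_alloc(&p))
				return false;
	return true;
}
```

With an arbitrary, made-up value of 20 objects per CPU (not a measurement),
toy_boot_survives(1024, 16, 20) and toy_boot_survives(2048, 64, 20) succeed,
while toy_boot_survives(1024, 64, 20) and toy_boot_survives(2048, 256, 20)
fail. That matches the pattern reported in this thread, but the real
per-CPU usage still needs to be found.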