Message-Id: <A7D86B8F-678F-4159-8796-D358B70BCC79@lca.pw>
Date: Thu, 16 Jan 2020 11:27:28 -0500
From: Qian Cai <cai@....pw>
To: David Hildenbrand <david@...hat.com>
Cc: Michal Hocko <mhocko@...nel.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>,
pmladek@...e.com, rostedt@...dmis.org, peterz@...radead.org,
linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH -next v3] mm/hotplug: silence a lockdep splat with
printk()
> On Jan 16, 2020, at 11:04 AM, David Hildenbrand <david@...hat.com> wrote:
>
> On 16.01.20 16:54, Michal Hocko wrote:
>> On Thu 16-01-20 09:53:13, Qian Cai wrote:
>>>
>>>
>>>> On Jan 16, 2020, at 9:28 AM, Michal Hocko <mhocko@...nel.org> wrote:
>>>>
>>>> On Wed 15-01-20 12:29:16, Qian Cai wrote:
>>>>> It is guaranteed to trigger a lockdep splat if calling printk() with
>>>>> zone->lock held because there are many places (tty, console drivers,
>>>>> debugobjects etc) would allocate some memory with another lock
>>>>> held which is proved to be difficult to fix them all.
>>>>
>>>> I am still not happy with the above much. What would say about something
>>>> like below instead?
>>>> "
>>>> It is not that hard to trigger lockdep splats by calling printk from
>>>> under zone->lock. Most of them are false positives caused by lock chains
>>>> introduced early in the boot process and they do not cause any real
>>>> problems. There are some console drivers which do allocate from the
>>>> printk context as well and those should be fixed. In any case false
>>>> positives are not that trivial to workaround and it is far from optimal
>>>> to lose lockdep functionality for something that is a non-issue.
>>>> <An example of such a false positive goes here>
>>>> "
>>>
>>> I feel like I repeated myself too many times. A call trace for one lock dependency
>>> is sometimes from early boot process because lockdep will save the first one it
>>> encountered, but it does not mean the lock dependency will only not happen in
>>> early boot. I spent some time to study those early boot call traces in the given
>>> lockdep splats, and it looks to me the lock dependency is also possible after
>>> the boot.
>>
>> Then state it explicitly with an example of the trace and explanation
>> that the deadlock is real. If the deadlock is real then it shouldn't be
>> really terribly hard to notice even without lockdep splats which get
>> disabled after the first false positive, right?
>
> I was asking myself for a long time: did anybody actually see this
> deadlock in real life?
Nobody knows for sure. I think one reason is that not many people use
memory offline, and even if they do, it will mostly not be a continuous activity
in the system. debugobjects makes it way easier to reproduce because it allocates
memory in random places, but then it is not all that popular.