linux-kernel - Re: [PATCH -next v3] mm/hotplug: silence a lockdep splat with printk()

Message-ID: <20200116174331.GC19428@dhcp22.suse.cz>
Date:   Thu, 16 Jan 2020 18:43:31 +0100
From:   Michal Hocko <mhocko@...nel.org>
To:     Qian Cai <cai@....pw>
Cc:     Andrew Morton <akpm@...ux-foundation.org>,
        Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>,
        pmladek@...e.com, rostedt@...dmis.org, peterz@...radead.org,
        david@...hat.com, linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH -next v3] mm/hotplug: silence a lockdep splat with
 printk()

On Thu 16-01-20 11:05:07, Qian Cai wrote:
> 
> 
> > On Jan 16, 2020, at 10:54 AM, Michal Hocko <mhocko@...nel.org> wrote:
> > 
> > On Thu 16-01-20 09:53:13, Qian Cai wrote:
> >> 
> >> 
> >>> On Jan 16, 2020, at 9:28 AM, Michal Hocko <mhocko@...nel.org> wrote:
> >>> 
> >>> On Wed 15-01-20 12:29:16, Qian Cai wrote:
> >>>> Calling printk() with zone->lock held is guaranteed to trigger a
> >>>> lockdep splat, because many places (tty, console drivers,
> >>>> debugobjects, etc.) allocate memory with another lock held, and
> >>>> fixing them all has proved difficult.
> >>> 
> >>> I am still not very happy with the above. What would you say about
> >>> something like the below instead?
> >>> "
> >>> It is not that hard to trigger lockdep splats by calling printk from
> >>> under zone->lock. Most of them are false positives caused by lock chains
> >>> introduced early in the boot process and they do not cause any real
> >>> problems. There are some console drivers which do allocate from the
> >>> printk context as well and those should be fixed. In any case false
> >>> positives are not that trivial to work around and it is far from optimal
> >>> to lose lockdep functionality for something that is a non-issue.
> >>> <An example of such a false positive goes here>
> >>> "
> >> 
> >> I feel like I have repeated myself too many times. A call trace for one lock
> >> dependency is sometimes from the early boot process because lockdep saves the
> >> first one it encounters, but that does not mean the lock dependency can only
> >> happen in early boot. I spent some time studying those early boot call traces
> >> in the given lockdep splats, and it looks to me like the lock dependency is
> >> also possible after boot.
> > 
> > Then state it explicitly with an example of the trace and explanation
> > that the deadlock is real. If the deadlock is real then it shouldn't be
> > really terribly hard to notice even without lockdep splats which get
> > disabled after the first false positive, right?
> 
> A deadlock could be really hard to trigger, though, since it needs perfect
> timing between multiple threads.

All I am saying is: Do not speculate in the changelog. Make clear arguments.
So far we have seen many false positives and that is stated in the
wording I have suggested. It is also explained why those suck. There is
also a note that _some_ consoles might indeed deadlock. Compare that to
the original changelog which doesn't really say anything useful about
those lockdep splats.

I obviously do not insist on my wording but please make the changelog
clear on the actual problem and stick to facts.

Thanks!
-- 
Michal Hocko
SUSE Labs