Date:   Fri, 23 Nov 2018 12:36:48 +0100
From:   Petr Mladek <pmladek@...e.com>
To:     Waiman Long <longman@...hat.com>
Cc:     Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...hat.com>,
        Will Deacon <will.deacon@....com>,
        Thomas Gleixner <tglx@...utronix.de>,
        linux-kernel@...r.kernel.org, kasan-dev@...glegroups.com,
        linux-mm@...ck.org, iommu@...ts.linux-foundation.org,
        Sergey Senozhatsky <sergey.senozhatsky@...il.com>,
        Andrey Ryabinin <aryabinin@...tuozzo.com>,
        Tejun Heo <tj@...nel.org>,
        Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [PATCH v2 07/17] debugobjects: Move printk out of db lock
 critical sections

On Thu 2018-11-22 15:29:35, Waiman Long wrote:
> On 11/22/2018 11:02 AM, Petr Mladek wrote:
> > Anyway, I wonder what was the primary motivation for this patch.
> > Was it the system hang? Or was it lockdep report about nesting
> > two terminal locks: db->lock, pool_lock with logbuf_lock?
> 
> The primary motivation was to make sure that printk() won't be called
> while holding either db->lock or pool_lock in the debugobjects code. In
> determining which locks can be made terminal, I focused on local
> spinlocks that won't cross the boundary into an unrelated subsystem,
> as that would greatly complicate the analysis.

Then please mention this as the real reason in the commit message.

The reason that printk might take too long in IRQ context does
not explain why we need the patch. printk() is called in IRQ
context in many other locations. I do not see why this place
should be special. The delay is the price for the chance to
see the problem.
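
For reference, the pattern under discussion (report outside the lock
instead of calling printk() inside the critical section) could look
roughly like the userspace sketch below. This is only an illustration
and not the debugobjects code: struct bucket, struct report,
check_object() and emit_report() are made-up names, a C11 atomic_flag
stands in for the db->lock spinlock, and fprintf() stands in for
printk().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for a hash bucket; NOT the kernel's struct. */
struct bucket {
	atomic_flag lock;	/* plays the role of db->lock */
	int state;		/* state of the single tracked object */
};

/* What the slow reporting path needs, copied while the lock is held. */
struct report {
	bool needed;
	int state;
};

static void bucket_lock(struct bucket *db)
{
	/* Spin until the flag was previously clear, i.e. we own the lock. */
	while (atomic_flag_test_and_set_explicit(&db->lock, memory_order_acquire))
		;
}

static void bucket_unlock(struct bucket *db)
{
	atomic_flag_clear_explicit(&db->lock, memory_order_release);
}

/* Inspect the object under the lock, but only record the problem there. */
static struct report check_object(struct bucket *db, int expected)
{
	struct report r = { .needed = false, .state = 0 };

	assert(db != NULL);
	bucket_lock(db);
	if (db->state != expected) {
		r.needed = true;
		r.state = db->state;	/* snapshot, no printing here */
	}
	bucket_unlock(db);
	return r;
}

/* Print after the lock has been dropped. Returns 1 if something was printed. */
static int emit_report(const struct report *r, int expected)
{
	if (!r->needed)
		return 0;
	fprintf(stderr, "unexpected object state %d (expected %d)\n",
		r->state, expected);
	return 1;
}
```

A caller would do "r = check_object(&db, expected);" followed by
"emit_report(&r, expected);", so the message is produced only after
the lock is released. The cost is that the printed state is a snapshot
and may already be stale by the time it is printed.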


> I didn't realize that it fixed a hang problem that I was seeing until I
> did a bisection and found that the hang was caused by the patch that
> caused the debugobjects splat in the first place a few days ago. I was
> comparing the performance of the pre- and post-patch kernels. The
> pre-patch kernel failed to boot on one of my test systems, but the
> post-patch kernel could. I narrowed it down to this patch as the fix
> for the hang problem.

The hang is a mystery. My guess is that there is a race somewhere.
The patch might have reduced the race window but it probably did
not fix the race itself.

Best Regards,
Petr
