lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20200929142750.GT6442@alley>
Date:   Tue, 29 Sep 2020 16:27:51 +0200
From:   Petr Mladek <pmladek@...e.com>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     Chengming Zhou <zhouchengming@...edance.com>,
        maarten.lankhorst@...ux.intel.com, mripard@...nel.org,
        tzimmermann@...e.de, airlied@...ux.ie, daniel@...ll.ch,
        dri-devel@...ts.freedesktop.org, linux-kernel@...r.kernel.org,
        sergey.senozhatsky@...il.com, rostedt@...dmis.org,
        mingo@...hat.com, juri.lelli@...hat.com,
        vincent.guittot@...aro.org, dietmar.eggemann@....com,
        bsegall@...gle.com, mgorman@...e.de, songmuchun@...edance.com,
        john.ogness@...utronix.de
Subject: Re: [External] Re: [PATCH 2/2] sched: mark
 PRINTK_DEFERRED_CONTEXT_MASK in __schedule()

On Mon 2020-09-28 12:25:59, Peter Zijlstra wrote:
> On Mon, Sep 28, 2020 at 06:04:23PM +0800, Chengming Zhou wrote:
> 
> > Well, you are lucky. So it's a problem in our printk implementation.
> 
> Not lucky; I just kicked it in the groin really hard:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git debug/experimental
> 
> > The deadlock path is:
> > 
> > printk
> >   vprintk_emit
> >     console_unlock
> >       vt_console_print
> >         hide_cursor
> >           bit_cursor
> >             soft_cursor
> >               queue_work_on
> >                 __queue_work
> >                   try_to_wake_up
> >                     _raw_spin_lock
> >                       native_queued_spin_lock_slowpath
> > 
> > Looks like it's introduced by this commit:
> > 
> > eaa434defaca1781fb2932c685289b610aeb8b4b
> > 
> > "drm/fb-helper: Add fb_deferred_io support"
> 
> Oh gawd, yeah, all the !serial consoles are utter batshit.
> 
> Please look at John's last printk rewrite, IIRC it farms all that off to
> a kernel thread instead of doing it from the printk() caller's context.
> 
> I'm not sure where he hides his latests patches, but I'm sure he'll be
> more than happy to tell you.

AFAIK, John is just working on updating the patchset so that it will
be based on the lockless ringbuffer that is finally in the queue
for-5.10.

Upstreaming the console handling will be the next big step. I am sure
that there will be long discussion about it. But there might be
few things that would help removing printk_deferred().

1. Messages will be printed on consoles by dedicated kthreads. It will
   be safe context. No deadlocks.

2. The registration and unregistration of consoles should not longer
   be handled by console_lock (semaphore). It should be possible to
   call most consoles without a sleeping lock. It should remove all
   these deadlocks between printk() and scheduler().

   There might be problems with some consoles. For example, tty would
   most likely still need a sleeping lock because it is using the console
   semaphore also internally.


3. We will try harder to get the messages out immediately during
   panic().


It would take some time until the above reaches upstream. But it seems
to be the right way to go.


About printk_deferred():

It is a whack a mole game. It is easy to miss printk() that might
eventually cause the deadlock.

printk deferred context is more safe. But it is still a what a mole
game. The kthreads will do the same job for sure.

Finally, the deadlock happens "only" when someone is waiting on
console_lock() in parallel. Otherwise, the waitqueue for the semaphore
is empty and scheduler is not called.

It means that there is quite a big change to see the WARN(). It might
be even bigger than with printk_deferred() because WARN() in scheduler
means that the scheduler is big troubles. Nobody guarantees that
the deferred messages will get handled later.

Best Regards,
Petr

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ