Message-ID: <CAAq0SUm8TTaSWGmkmC90T3H0ePwv_td6Qn4t+__8k2C6QGEJMQ@mail.gmail.com>
Date:   Fri, 7 Jan 2022 17:16:23 -0300
From:   Wander Costa <wcosta@...hat.com>
To:     "Paul E . McKenney" <paulmck@...nel.org>
Cc:     Sergey Senozhatsky <senozhatsky@...omium.org>,
        Wander Lairson Costa <wander@...hat.com>,
        Petr Mladek <pmladek@...e.com>,
        Steven Rostedt <rostedt@...dmis.org>,
        John Ogness <john.ogness@...utronix.de>,
        open list <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v2 1/1] printk: suppress rcu stall warnings caused by slow
 console devices

On Fri, Jan 7, 2022 at 4:03 PM Paul E. McKenney <paulmck@...nel.org> wrote:
>
> On Fri, Nov 12, 2021 at 06:57:55AM -0800, Paul E. McKenney wrote:
> > On Fri, Nov 12, 2021 at 11:42:39AM -0300, Wander Costa wrote:
> > > On Thu, Nov 11, 2021 at 10:42 PM Sergey Senozhatsky
> > > <senozhatsky@...omium.org> wrote:
> > > >
> > > > On (21/11/11 16:59), Wander Lairson Costa wrote:
> > > > >
> > > > > > If we have a reasonably large dataset to flush in the printk ring
> > > > > buffer in the presence of a slow console device (like a serial port
> > > > > with a low baud rate configured), the RCU stall detector may report
> > > > > warnings.
> > > > >
> > > > > This patch suppresses RCU stall warnings while flushing the ring buffer
> > > > > to the console.
> > > > >
> > > > [..]
> > > > > +extern int rcu_cpu_stall_suppress;
> > > > > +
> > > > > +static void rcu_console_stall_suppress(void)
> > > > > +{
> > > > > +     if (!rcu_cpu_stall_suppress)
> > > > > +             rcu_cpu_stall_suppress = 4;
> > > > > +}
> > > > > +
> > > > > +static void rcu_console_stall_unsuppress(void)
> > > > > +{
> > > > > +     if (rcu_cpu_stall_suppress == 4)
> > > > > +             rcu_cpu_stall_suppress = 0;
> > > > > +}
> > > > > +
> > > > >  /**
> > > > >   * console_unlock - unlock the console system
> > > > >   *
> > > > > @@ -2634,6 +2648,9 @@ void console_unlock(void)
> > > > >        * and cleared after the "again" goto label.
> > > > >        */
> > > > >       do_cond_resched = console_may_schedule;
> > > > > +
> > > > > +     rcu_console_stall_suppress();
> > > > > +
> > > > >  again:
> > > > >       console_may_schedule = 0;
> > > > >
> > > > > @@ -2645,6 +2662,7 @@ void console_unlock(void)
> > > > >       if (!can_use_console()) {
> > > > >               console_locked = 0;
> > > > >               up_console_sem();
> > > > > +             rcu_console_stall_unsuppress();
> > > > >               return;
> > > > >       }
> > > > >
> > > > > @@ -2716,8 +2734,10 @@ void console_unlock(void)
> > > > >
> > > > >               handover = console_lock_spinning_disable_and_check();
> > > > >               printk_safe_exit_irqrestore(flags);
> > > > > -             if (handover)
> > > > > +             if (handover) {
> > > > > +                     rcu_console_stall_unsuppress();
> > > > >                       return;
> > > > > +             }
> > > > >
> > > > >               if (do_cond_resched)
> > > > >                       cond_resched();
> > > > > @@ -2738,6 +2758,8 @@ void console_unlock(void)
> > > > >       retry = prb_read_valid(prb, next_seq, NULL);
> > > > >       if (retry && console_trylock())
> > > > >               goto again;
> > > > > +
> > > > > +     rcu_console_stall_unsuppress();
> > > > >  }
> > > >
> > > > Maybe we can just start touching watchdogs from the printing routine?
> > > >
> > > Hrm, console_unlock() is called from vprintk_emit() [0] with preemption
> > > disabled, and it already has logic to call cond_resched() when
> > > possible [1].
> > >
> > > [0] https://elixir.bootlin.com/linux/latest/source/kernel/printk/printk.c#L2244
> > > [1] https://elixir.bootlin.com/linux/latest/source/kernel/printk/printk.c#L2719
> >
> > So the problems arise when console_may_schedule == 0?
>
> Just following up...  Any progress on this?  The ability to suppress RCU
> CPU stall warnings due to console slowness would likely be valuable to
> quite a few people.
>

My understanding is that the consensus is that the proper fix is the
printk threads work currently in progress, and that it should be ready
for review before long.
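
For reference, the suppress/unsuppress logic from the quoted patch can be
modeled in userspace roughly as below. This is only a sketch, not kernel
code: the variable is a plain int standing in for the kernel's
rcu_cpu_stall_suppress, and the CONSOLE_SUPPRESS name for the value 4 is
hypothetical (the patch uses the literal). The point is that the console
path only sets the flag when nobody else has, and only clears it when the
value is its own.

```c
/* Userspace stand-in for the kernel's rcu_cpu_stall_suppress (assumption:
 * a plain int is enough to illustrate the logic). */
static int rcu_cpu_stall_suppress;

/* Hypothetical name for the literal 4 used in the quoted patch. */
#define CONSOLE_SUPPRESS 4

/* Suppress stall warnings only if no one else has already set the flag. */
static void rcu_console_stall_suppress(void)
{
	if (!rcu_cpu_stall_suppress)
		rcu_cpu_stall_suppress = CONSOLE_SUPPRESS;
}

/* Clear the flag only if it still holds the console's own value, so a
 * setting made elsewhere is left untouched. */
static void rcu_console_stall_unsuppress(void)
{
	if (rcu_cpu_stall_suppress == CONSOLE_SUPPRESS)
		rcu_cpu_stall_suppress = 0;
}
```

With the flag initially 0, a suppress/unsuppress pair takes it to 4 and
back to 0. If the flag already holds some other nonzero value, both calls
leave it alone.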
