linux-kernel - Re: [PATCH v2 1/2] seqlock: Do the lockdep annotation before locking in do_write_seqcount_begin


Message-ID: <20230727151029.e_M9bi8N@linutronix.de>
Date:   Thu, 27 Jul 2023 17:10:29 +0200
From:   Sebastian Andrzej Siewior <bigeasy@...utronix.de>
To:     Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>
Cc:     Petr Mladek <pmladek@...e.com>, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org,
        "Luis Claudio R. Goncalves" <lgoncalv@...hat.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Boqun Feng <boqun.feng@...il.com>,
        Ingo Molnar <mingo@...hat.com>,
        John Ogness <john.ogness@...utronix.de>,
        Mel Gorman <mgorman@...hsingularity.net>,
        Michal Hocko <mhocko@...e.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Waiman Long <longman@...hat.com>, Will Deacon <will@...nel.org>
Subject: Re: [PATCH v2 1/2] seqlock: Do the lockdep annotation before locking
 in do_write_seqcount_begin_nested()

On 2023-06-28 21:14:16 [+0900], Tetsuo Handa wrote:
> > Anyway, please do not do this change only because of printk().
> > IMHO, the current ordering is more logical and the printk() problem
> > should be solved another way.
> 
> Then, since [PATCH 1/2] cannot be applied, [PATCH 2/2] is automatically
> rejected.

My understanding is that this patch gets applied and your objection will
be noted.

> I found
> 
>   /*
>    * Locking a pcp requires a PCP lookup followed by a spinlock. To avoid
>    * a migration causing the wrong PCP to be locked and remote memory being
>    * potentially allocated, pin the task to the CPU for the lookup+lock.
>    * preempt_disable is used on !RT because it is faster than migrate_disable.
>    * migrate_disable is used on RT because otherwise RT spinlock usage is
>    * interfered with and a high priority task cannot preempt the allocator.
>    */
>   #ifndef CONFIG_PREEMPT_RT
>   #define pcpu_task_pin()         preempt_disable()
>   #define pcpu_task_unpin()       preempt_enable()
>   #else
>   #define pcpu_task_pin()         migrate_disable()
>   #define pcpu_task_unpin()       migrate_enable()
>   #endif
> 
> in mm/page_alloc.c . Thus, I think that calling migrate_disable() if CONFIG_PREEMPT_RT=y
> and calling local_irq_save() if CONFIG_PREEMPT_RT=n (i.e. Alternative 3) will work.
> 
> But thinking again, since CONFIG_PREEMPT_RT=y uses special printk() approach where messages
> are printed from a dedicated kernel thread, do we need to call printk_deferred_enter() if
> CONFIG_PREEMPT_RT=y ? That is, isn't the fix as straightforward as below?

The proposed change will cause a splat with CONFIG_PROVE_RAW_LOCK_NESTING,
because seqlock_t::lock is acquired without disabling interrupts.
It is also a bad example: the seqcount API is bypassed because of
printk's limitations, and the problems this causes on PREEMPT_RT are
"ifdefed away". None of this is documented or explained.

Let me summarize your remaining problem:
- With (and only with) CONFIG_PROVE_LOCKING there can be a printk splat
  caused by a lock validation error noticed by lockdep during
  write_sequnlock_irqrestore().

- This can deadlock if output is being written to a tty which uses the
  same console as printk while memory hotplug is active at the same
  time.
  That is because the tty layer acquires the same lock as printk's
  console around its own memory allocation.

Now:
- Before this deadlocks (with CONFIG_PROVE_LOCKING), chances are high
  that a splat is seen first.

- printk is being reworked so that printk output happens either from a
  dedicated thread or directly via a different console driver which
  does not use uart_port::lock, thus avoiding the deadlock.
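
To illustrate the two points above, here is a minimal userspace sketch,
not kernel code. Everything in it is a hypothetical stand-in: toy_lock
models a non-recursive lock such as uart_port::lock, print_direct()
models a console driver that needs that lock, and print_deferred() plus
flush_deferred() model handing the message to a dedicated printing
thread. A failed trylock marks the spot where a real spinlock would
spin forever.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for a non-recursive lock like uart_port::lock. */
struct toy_lock { bool held; };

static bool toy_trylock(struct toy_lock *l)
{
	if (l->held)
		return false;	/* a real spinlock would spin here */
	l->held = true;
	return true;
}

static void toy_unlock(struct toy_lock *l)
{
	l->held = false;
}

static struct toy_lock port_lock;
static char deferred[128];	/* hypothetical one-slot message buffer */
static int printed;		/* messages that reached the console */

/* Direct printing: the console driver needs port_lock. */
static bool print_direct(const char *msg)
{
	if (!toy_trylock(&port_lock))
		return false;	/* caller already holds it: deadlock */
	printf("%s\n", msg);
	printed++;
	toy_unlock(&port_lock);
	return true;
}

/* Deferred printing: only store the message, take no console lock. */
static void print_deferred(const char *msg)
{
	snprintf(deferred, sizeof(deferred), "%s", msg);
}

/* Stand-in for the dedicated printing thread, run outside the lock. */
static bool flush_deferred(void)
{
	if (deferred[0] == '\0')
		return true;
	if (!print_direct(deferred))
		return false;
	deferred[0] = '\0';
	return true;
}

/*
 * Models the tty path: port_lock is held while a splat is emitted.
 * Returns false where the real code would deadlock.
 */
static bool tty_write(bool defer)
{
	bool ok = true;
	bool locked = toy_trylock(&port_lock);

	assert(locked);
	if (defer)
		print_deferred("lockdep splat");
	else
		ok = print_direct("lockdep splat");
	toy_unlock(&port_lock);
	return ok;
}
```

Calling tty_write(false) returns false because the message is printed
while port_lock is already held. Calling tty_write(true) followed by
flush_deferred() prints the message once the lock has been dropped.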

Sebastian