lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <b0a50290-9387-8398-e241-f76f4719f963@redhat.com>
Date:   Mon, 26 Jun 2023 09:56:52 +0200
From:   David Hildenbrand <david@...hat.com>
To:     Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
        Michal Hocko <mhocko@...e.com>
Cc:     linux-mm@...ck.org, linux-kernel@...r.kernel.org,
        "Luis Claudio R. Goncalves" <lgoncalv@...hat.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Boqun Feng <boqun.feng@...il.com>,
        Ingo Molnar <mingo@...hat.com>,
        John Ogness <john.ogness@...utronix.de>,
        Mel Gorman <mgorman@...hsingularity.net>,
        Peter Zijlstra <peterz@...radead.org>,
        Petr Mladek <pmladek@...e.com>,
        Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>,
        Thomas Gleixner <tglx@...utronix.de>,
        Waiman Long <longman@...hat.com>, Will Deacon <will@...nel.org>
Subject: Re: [PATCH v3 2/2] mm/page_alloc: Use write_seqlock_irqsave() instead
 write_seqlock() + local_irq_save().

On 23.06.23 22:15, Sebastian Andrzej Siewior wrote:
> __build_all_zonelists() acquires zonelist_update_seq by first disabling
> interrupts via local_irq_save() and then acquiring the seqlock with
> write_seqlock(). This is troublesome and leads to problems on
> PREEMPT_RT. The problem is that the inner spinlock_t becomes a sleeping
> lock on PREEMPT_RT and must not be acquired with disabled interrupts.
> 
> The API provides write_seqlock_irqsave() which does the right thing in
> one step.
> printk_deferred_enter() has to be invoked in non-migrate-able context to
> ensure that deferred printing is enabled and disabled on the same CPU.
> This is the case after zonelist_update_seq has been acquired.
> 
> There was discussion on the first submission that the order should be:
> 	local_irq_disable();
> 	printk_deferred_enter();
> 	write_seqlock();
> 
> to avoid pitfalls like having an unaccounted printk() coming from
> write_seqlock_irqsave() before printk_deferred_enter() is invoked. The
> only origin of such a printk() can be a lockdep splat because the
> lockdep annotation happens after the sequence count is incremented.
> This is exceptional and subject to change.
> 
> It was also pointed that PREEMPT_RT can be affected by the printk
> problem since its write_seqlock_irqsave() does not really disable
> interrupts. This isn't the case because PREEMPT_RT's printk
> implementation differs from the mainline implementation in two important
> aspects:
> - Printing happens in a dedicated threads and not at during the
>    invocation of printk().
> - In emergency cases where synchronous printing is used, a different
>    driver is used which does not use tty_port::lock.
> 
> Acquire zonelist_update_seq with write_seqlock_irqsave() and then defer
> printk output.
> 
> Fixes: 1007843a91909 ("mm/page_alloc: fix potential deadlock on zonelist_update_seq seqlock")
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
> Acked-by: Michal Hocko <mhocko@...e.com>
> ---
> v2…v3
>    - Update comment as per Michal's suggestion.
> 
> v1…v2:
>    - Improve commit description
> 
>   mm/page_alloc.c | 15 ++++++---------
>   1 file changed, 6 insertions(+), 9 deletions(-)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 47421bedc12b7..440e9af67b48d 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -5808,19 +5808,17 @@ static void __build_all_zonelists(void *data)
>   	unsigned long flags;
>   
>   	/*
> -	 * Explicitly disable this CPU's interrupts before taking seqlock
> -	 * to prevent any IRQ handler from calling into the page allocator
> -	 * (e.g. GFP_ATOMIC) that could hit zonelist_iter_begin and livelock.
> +	 * The zonelist_update_seq must be acquired with irqsave because the
> +	 * reader can be invoked from IRQ with GFP_ATOMIC.
>   	 */
> -	local_irq_save(flags);
> +	write_seqlock_irqsave(&zonelist_update_seq, flags);
>   	/*
> -	 * Explicitly disable this CPU's synchronous printk() before taking
> -	 * seqlock to prevent any printk() from trying to hold port->lock, for
> +	 * Also disable synchronous printk() to prevent any printk() from
> +	 * trying to hold port->lock, for
>   	 * tty_insert_flip_string_and_push_buffer() on other CPU might be
>   	 * calling kmalloc(GFP_ATOMIC | __GFP_NOWARN) with port->lock held.
>   	 */
>   	printk_deferred_enter();
> -	write_seqlock(&zonelist_update_seq);
>   
>   #ifdef CONFIG_NUMA
>   	memset(node_load, 0, sizeof(node_load));
> @@ -5857,9 +5855,8 @@ static void __build_all_zonelists(void *data)
>   #endif
>   	}
>   
> -	write_sequnlock(&zonelist_update_seq);
>   	printk_deferred_exit();
> -	local_irq_restore(flags);
> +	write_sequnlock_irqrestore(&zonelist_update_seq, flags);
>   }
>   
>   static noinline void __init

Reviewed-by: David Hildenbrand <david@...hat.com>

-- 
Cheers,

David / dhildenb

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ