Message-ID: <87d0eae3-e16e-4820-adde-afb519c5dcfc@redhat.com>
Date: Thu, 22 Jan 2026 15:56:39 -0500
From: Waiman Long <llong@...hat.com>
To: Andrew Morton <akpm@...ux-foundation.org>
Cc: Mike Rapoport <rppt@...nel.org>,
 Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
 Clark Williams <clrkwllms@...nel.org>, Steven Rostedt <rostedt@...dmis.org>,
 linux-mm@...ck.org, linux-kernel@...r.kernel.org,
 linux-rt-devel@...ts.linux.dev, Wei Yang <richard.weiyang@...il.com>,
 David Hildenbrand <david@...nel.org>, "Paul E . McKenney"
 <paulmck@...nel.org>
Subject: Re: [PATCH v3] mm/mm_init: Don't cond_resched() in
 deferred_init_memmap_chunk() if called from deferred_grow_zone()

On 1/22/26 2:29 PM, Andrew Morton wrote:
> On Thu, 22 Jan 2026 13:43:43 -0500 Waiman Long <longman@...hat.com> wrote:
>
>> Commit 3acb913c9d5b ("mm/mm_init: use deferred_init_memmap_chunk()
>> in deferred_grow_zone()") made deferred_grow_zone() call
>> deferred_init_memmap_chunk() within a pgdat_resize_lock() critical
>> section with irqs disabled.
>>
>> It did check for irqs_disabled() in
>> deferred_init_memmap_chunk() to avoid calling cond_resched(). For a
>> PREEMPT_RT kernel build, however, spin_lock_irqsave() does not disable
>> interrupt but rcu_read_lock() is called. This leads to the following
>> bug report.
>>
>>    BUG: sleeping function called from invalid context at mm/mm_init.c:2091
>>    in_atomic(): 0, irqs_disabled(): 0, non_block: 0, pid: 1, name: swapper/0
>>    preempt_count: 0, expected: 0
>>
>> @@ -2085,10 +2085,10 @@ deferred_init_memmap_chunk(unsigned long start_pfn, unsigned long end_pfn,
>>   
>>   			spfn = chunk_end;
>>   
>> -			if (irqs_disabled())
>> -				touch_nmi_watchdog();
>> -			else
>> +			if (can_resched)
>>   				cond_resched();
>> +			else
>> +				touch_nmi_watchdog();
>>   		}
>>   	}
> Disables the cond_resched() in some situations.  Can this reintroduce
> the watchdog warnings which that cond_resched() was intended to
> prevent?
cond_resched() is disabled only when deferred_init_memmap_chunk() is
called from deferred_grow_zone(), which holds a spinlock with irqs
disabled on a non-RT kernel and is inside an rcu_read_lock() section on
an RT kernel. In either case, scheduling out must not be allowed or
something bad may happen. I suppose that iterating over the pfns in
deferred_grow_zone() requires pgdat_resize_lock() protection.
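
To make the distinction concrete, here is a minimal userspace sketch of
the chunked loop with the can_resched flag from the patch. It is a model
under stated assumptions, not the kernel code: the model_* functions,
struct chunk_stats and MODEL_CHUNK_PFNS are hypothetical names, and the
two helpers only count calls instead of invoking cond_resched() or
touch_nmi_watchdog().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical counters standing in for the real kernel side effects. */
struct chunk_stats {
	unsigned long resched_calls;	/* models cond_resched() */
	unsigned long watchdog_touches;	/* models touch_nmi_watchdog() */
	unsigned long pfns_initialized;
};

/* Assumed chunk size for this model; the kernel picks its own value. */
#define MODEL_CHUNK_PFNS 1024UL

static void model_cond_resched(struct chunk_stats *s)
{
	s->resched_calls++;
}

static void model_touch_nmi_watchdog(struct chunk_stats *s)
{
	s->watchdog_touches++;
}

/*
 * Walk [start_pfn, end_pfn) in chunks. After each chunk, either yield
 * (caller may sleep) or only pet the watchdog (caller is atomic).
 */
static void model_init_memmap_chunk(unsigned long start_pfn,
				    unsigned long end_pfn,
				    bool can_resched,
				    struct chunk_stats *s)
{
	unsigned long spfn = start_pfn;

	while (spfn < end_pfn) {
		unsigned long remaining = end_pfn - spfn;
		unsigned long step = remaining < MODEL_CHUNK_PFNS ?
				     remaining : MODEL_CHUNK_PFNS;

		s->pfns_initialized += step;	/* stand-in for page init */
		spfn += step;

		if (can_resched)
			model_cond_resched(s);
		else
			model_touch_nmi_watchdog(s);
	}
}
```

In this model, the deferred_init_memmap() path would pass
can_resched = true and the deferred_grow_zone() path would pass false,
so the decision is made by the caller rather than inferred from
irqs_disabled().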

>
> The cond_resched() was added by <dig, dig> da97f2d56bbd ("mm: call
> cond_resched() from deferred_init_memmap()").
>
> Pasha's 2020 patch replaced touch_nmi_watchdog() with cond_resched() to
> prevent RCU stall warnings.  So I think the answer to my question is
> yes, going back to touch_nmi_watchdog() could reintroduce those RCU
> warnings.

deferred_init_memmap() will still call cond_resched() in its iteration
loop. It had the RCU stall problem before, without cond_resched(),
because it needs to iterate over all the available memory, which can
take a long time if we are talking about TBs of memory.

For deferred_grow_zone(), as long as the number of pfns iterated is not
huge, an RCU stall warning shouldn't happen.
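
For a rough sense of scale, the following sketch converts a memory size
into a pfn count. The helper name pfns_for_bytes() is hypothetical, and
the 4 KiB page size (page_shift = 12) and the 128 MiB example span are
assumptions chosen for illustration, not values taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Number of page frames covering 'bytes' for a given page shift. */
static uint64_t pfns_for_bytes(uint64_t bytes, unsigned int page_shift)
{
	return bytes >> page_shift;
}
```

With 4 KiB pages, 1 TiB corresponds to 268435456 pfns, while a 128 MiB
span corresponds to 32768 pfns, which is several orders of magnitude
less work per call.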

Cheers,
Longman

