Message-ID: <0c1b7ff7-b053-4868-a550-e2044aba300f@linux.alibaba.com>
Date: Tue, 5 Aug 2025 09:43:15 +0800
From: Gao Xiang <hsiangkao@...ux.alibaba.com>
To: Junli Liu <liujunli@...iang.com>, linux-erofs@...ts.ozlabs.org,
linux-kernel@...r.kernel.org
Cc: xiang@...nel.org, chao@...nel.org, yangsonghua@...iang.com
Subject: Re: [PATCH v3] erofs: fix atomic context detection when
!CONFIG_DEBUG_LOCK_ALLOC
On 2025/8/5 09:19, Junli Liu wrote:
> Since EROFS handles decompression in non-atomic contexts due to
> uncontrollable decompression latencies and vmap() usage, it tries
> to detect atomic contexts and only kicks off a kworker on demand
> in order to reduce unnecessary scheduling overhead.
>
> However, the current approach is insufficient and can lead to
> sleeping function calls in invalid contexts, causing kernel
> warnings and potential system instability. See the stacktrace [1]
> and previous discussion [2].
>
> The current implementation only checks rcu_read_lock_any_held(),
> which behaves inconsistently across different kernel configurations:
>
> - When CONFIG_DEBUG_LOCK_ALLOC is enabled: correctly detects
> RCU critical sections by checking rcu_lock_map
> - When CONFIG_DEBUG_LOCK_ALLOC is disabled: compiles to
> "!preemptible()", which only checks preempt_count and misses
> RCU critical sections
>
> This patch introduces z_erofs_in_atomic() to provide comprehensive
> atomic context detection:
>
> 1. Check RCU preemption depth when CONFIG_PREEMPTION is enabled,
> as RCU critical sections may not affect preempt_count but still
> require atomic handling
>
> 2. Always use async processing when CONFIG_PREEMPT_COUNT is disabled,
> as preemption state cannot be reliably determined
>
> 3. Fall back to standard preemptible() check for remaining cases
>
> The function replaces the previous complex condition check and ensures
> that z_erofs always defers to a (kthread_)work when in an atomic context,
> which prevents sleeping in invalid contexts, while non-atomic contexts
> still avoid the extra scheduling overhead.
>
> [1] Problem stacktrace
> BUG: sleeping function called from invalid context at
> kernel/locking/rtmutex_api.c:510
> in_atomic(): 0, irqs_disabled(): 0, non_block: 0, pid: 107,
> name: irq/54-ufshcd
> preempt_count: 0, expected: 0
> RCU nest depth: 2, expected: 0
>
> [2] https://lore.kernel.org/r/58b661d0-0ebb-4b45-a10d-c5927fb791cd@paulmck-laptop
>
> Signed-off-by: Junli Liu <liujunli@...iang.com>
This version seems applicable to me:
Reviewed-by: Gao Xiang <hsiangkao@...ux.alibaba.com>
Thanks,
Gao Xiang
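
For readers following along without the diff: the three-step check described
in the quoted commit message can be sketched as a small userspace model. This
is only an illustration under stated assumptions, not the patch itself. The
struct ctx_model and the model_*() helpers are hypothetical names; the fields
stand in for what the kernel would read via IS_ENABLED(), rcu_preempt_depth(),
preempt_count() and irqs_disabled(), and model_preemptible() assumes the usual
"preempt_count is zero and IRQs are enabled" meaning of preemptible().

```c
#include <stdbool.h>

/* Hypothetical userspace stand-in for the state the kernel would consult. */
struct ctx_model {
	bool config_preemption;    /* models IS_ENABLED(CONFIG_PREEMPTION) */
	bool config_preempt_count; /* models IS_ENABLED(CONFIG_PREEMPT_COUNT) */
	int rcu_depth;             /* models rcu_preempt_depth() */
	int preempt_count;         /* models preempt_count() */
	bool irqs_disabled;        /* models irqs_disabled() */
};

/* Assumed meaning of preemptible() when preempt_count is tracked. */
static bool model_preemptible(const struct ctx_model *c)
{
	return c->preempt_count == 0 && !c->irqs_disabled;
}

/*
 * Models the old detection with CONFIG_DEBUG_LOCK_ALLOC disabled, which
 * the commit message says reduces to "!preemptible()".
 */
static bool model_old_check(const struct ctx_model *c)
{
	return !model_preemptible(c);
}

/* Models the three steps listed in the commit message, in order. */
static bool model_in_atomic(const struct ctx_model *c)
{
	/* 1. An RCU read-side section may not show up in preempt_count. */
	if (c->config_preemption && c->rcu_depth > 0)
		return true;
	/* 2. Without a preempt count the state is unknowable: go async. */
	if (!c->config_preempt_count)
		return true;
	/* 3. Otherwise fall back to the standard preemptible() check. */
	return !model_preemptible(c);
}
```

Feeding in the values from the stacktrace [1] (preempt_count 0, IRQs enabled,
RCU nest depth 2, both config options on), model_old_check() reports a
non-atomic context while model_in_atomic() reports an atomic one, which is
the gap the commit message describes.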