Message-ID: <a1c559b7-335e-5401-d167-301c5b1cd312@I-love.SAKURA.ne.jp>
Date: Wed, 28 Jun 2023 21:14:16 +0900
From: Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>
To: Petr Mladek <pmladek@...e.com>,
Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Cc: linux-mm@...ck.org, linux-kernel@...r.kernel.org,
"Luis Claudio R. Goncalves" <lgoncalv@...hat.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Boqun Feng <boqun.feng@...il.com>,
Ingo Molnar <mingo@...hat.com>,
John Ogness <john.ogness@...utronix.de>,
Mel Gorman <mgorman@...hsingularity.net>,
Michal Hocko <mhocko@...e.com>,
Peter Zijlstra <peterz@...radead.org>,
Thomas Gleixner <tglx@...utronix.de>,
Waiman Long <longman@...hat.com>, Will Deacon <will@...nel.org>
Subject: Re: [PATCH v2 1/2] seqlock: Do the lockdep annotation before locking
in do_write_seqcount_begin_nested()
On 2023/06/26 23:44, Petr Mladek wrote:
> On Mon 2023-06-26 10:12:54, Sebastian Andrzej Siewior wrote:
>> On 2023-06-24 15:54:12 [+0900], Tetsuo Handa wrote:
>>> Why not to do the same on the end side?
>>>
>>> static inline void do_write_seqcount_end(seqcount_t *s)
>>> {
>>> -	seqcount_release(&s->dep_map, _RET_IP_);
>>> 	do_raw_write_seqcount_end(s);
>>> +	seqcount_release(&s->dep_map, _RET_IP_);
>>> }
>>
>> I don't have a compelling argument for doing it. It is probably better
>> to release the lock from lockdep's point of view and then really release
>> it (so it can't be acquired before it is released).
>
> If this is true then we should not change the ordering on the _begin
> side either. I mean that we should call the lockdep code only
> after the lock is taken. Anyway, both sides should be symmetric.
>
> That said, lockdep is about chains of locks and not about timing.
> We must not call lockdep annotation when the lock is still available
> for a nested context. So the ordering is probably important only when
> the lock might be taken from both normal and interrupt context.
>
> Anyway, please do not do this change only because of printk().
> IMHO, the current ordering is more logical and the printk() problem
> should be solved another way.
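For reference, if I am reading include/linux/seqlock.h correctly, the current
(pre-patch) ordering on the two sides is (a simplified excerpt, quoted here
only for context):

static inline void do_write_seqcount_begin_nested(seqcount_t *s, int subclass)
{
	do_raw_write_seqcount_begin(s);
	seqcount_acquire(&s->dep_map, subclass, 0, _RET_IP_);
}

static inline void do_write_seqcount_end(seqcount_t *s)
{
	seqcount_release(&s->dep_map, _RET_IP_);
	do_raw_write_seqcount_end(s);
}

That is, the begin side annotates after taking the lock while the end side
annotates before releasing it, which is the asymmetry discussed above.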
Then, since [PATCH 1/2] cannot be applied, [PATCH 2/2] is rejected as well.
I found
/*
 * Locking a pcp requires a PCP lookup followed by a spinlock. To avoid
 * a migration causing the wrong PCP to be locked and remote memory being
 * potentially allocated, pin the task to the CPU for the lookup+lock.
 * preempt_disable is used on !RT because it is faster than migrate_disable.
 * migrate_disable is used on RT because otherwise RT spinlock usage is
 * interfered with and a high priority task cannot preempt the allocator.
 */
#ifndef CONFIG_PREEMPT_RT
#define pcpu_task_pin()		preempt_disable()
#define pcpu_task_unpin()	preempt_enable()
#else
#define pcpu_task_pin()		migrate_disable()
#define pcpu_task_unpin()	migrate_enable()
#endif
in mm/page_alloc.c. Thus, I think that calling migrate_disable() when CONFIG_PREEMPT_RT=y
and local_irq_save() when CONFIG_PREEMPT_RT=n (i.e. Alternative 3) will work; a sketch follows below.
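For illustration only, Alternative 3 might be wrapped behind helpers in the
style of the pcpu_task_pin() pair above. The names below
(zonelist_update_pin() / zonelist_update_unpin()) are invented for this mail,
not existing helpers:

/* Hypothetical helpers for Alternative 3 (sketch, untested). */
#ifndef CONFIG_PREEMPT_RT
#define zonelist_update_pin(flags)	local_irq_save(flags)
#define zonelist_update_unpin(flags)	local_irq_restore(flags)
#else
/* (void)(flags) only silences the unused-variable warning on RT. */
#define zonelist_update_pin(flags)	do { (void)(flags); migrate_disable(); } while (0)
#define zonelist_update_unpin(flags)	do { (void)(flags); migrate_enable(); } while (0)
#endif

__build_all_zonelists() would then call zonelist_update_pin(flags) before
write_seqlock(&zonelist_update_seq) and zonelist_update_unpin(flags) after
write_sequnlock(&zonelist_update_seq).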
But thinking again, since CONFIG_PREEMPT_RT=y uses a special printk() approach where
messages are printed from a dedicated kernel thread, do we even need to call
printk_deferred_enter() when CONFIG_PREEMPT_RT=y? That is, isn't the fix as
straightforward as the patch below?
----------------------------------------
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 47421bedc12b..a2a3bfa69a12 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5805,6 +5805,7 @@ static void __build_all_zonelists(void *data)
 	int nid;
 	int __maybe_unused cpu;
 	pg_data_t *self = data;
+#ifndef CONFIG_PREEMPT_RT
 	unsigned long flags;
 
 	/*
@@ -5820,6 +5821,7 @@ static void __build_all_zonelists(void *data)
 	 * calling kmalloc(GFP_ATOMIC | __GFP_NOWARN) with port->lock held.
 	 */
 	printk_deferred_enter();
+#endif
 	write_seqlock(&zonelist_update_seq);
 
 #ifdef CONFIG_NUMA
@@ -5858,8 +5860,10 @@ static void __build_all_zonelists(void *data)
 	}
 
 	write_sequnlock(&zonelist_update_seq);
+#ifndef CONFIG_PREEMPT_RT
 	printk_deferred_exit();
 	local_irq_restore(flags);
+#endif
 }
 
 static noinline void __init
----------------------------------------