linux-kernel - Re: [PATCH] tick/nohz: Reduce the critical region for jiffies

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <b08bc867-91db-3fcc-8927-ac94db2327cd@huawei.com>
Date:   Mon, 16 Nov 2020 21:24:23 +0800
From:   Yunfeng Ye <yeyunfeng@...wei.com>
To:     Thomas Gleixner <tglx@...utronix.de>, <fweisbec@...il.com>,
        <mingo@...nel.org>, <linux-kernel@...r.kernel.org>,
        Shiyuan Hu <hushiyuan@...wei.com>,
        Hewenliang <hewenliang4@...wei.com>
Subject: Re: [PATCH] tick/nohz: Reduce the critical region for jiffies_seq



On 2020/11/16 19:29, Thomas Gleixner wrote:
> On Mon, Nov 16 2020 at 14:07, Yunfeng Ye wrote:
>> On 2020/11/16 3:43, Thomas Gleixner wrote:
>>>> and the conflict between jiffies_lock and jiffies_seq increases,
>>>> especially in multi-core scenarios.
>>>
>>> This does not make sense. The sequence counter is updated when holding
>>> the lock, so there is no conflict between the lock and the sequence
>>> count.
>>>
>> Yes, there is no conflict between the lock and the sequence count, but
>> when tick_do_update_jiffies64() is called one by one, the sequence count
>> will be updated, it will affect the latency of tick_nohz_next_event(),
>> because the priority of read seqcount is less than writer.
> 
> It's clear to me because I know how that code works, but for someone who
> is not familiar with it your description of the problem is confusing at
> best.
> 
>> During a tick period,
> 
> 'During a tick period' is misleading. The tick period is the reciprocal
> value ot the tick frequency.
> 
> What you want to explain is that if jiffies require an update, i.e.
> 
>      now > last_update + period
> 
> then multiple CPUs might contend on it until last_update is forwarded
> and the quick check is preventing it again.
> 
Yes, your are right, thanks.

>> the tick_do_update_jiffies64() is called concurrency, and the
>> time is up to 30+us. so the lockless quick check in tick_do_update_jiffies64()
>> cannot intercept all concurrency.
> 
> It cannot catch all of it, right. 
> 
> There are quite some other inefficiencies in that code and the seqcount
> held time can be reduced way further. Let me stare at it.
> 
I think the write seqcount only protecting the last_jiffies_update/jiffies_64/
tick_next_period is enough. The modification which has not been tested,
look like this:

diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index f0199a4ba1ad..d5f9930e6bc7 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -66,15 +66,13 @@ static void tick_do_update_jiffies64(ktime_t now)

 	/* Reevaluate with jiffies_lock held */
 	raw_spin_lock(&jiffies_lock);
-	write_seqcount_begin(&jiffies_seq);

 	delta = ktime_sub(now, last_jiffies_update);
 	if (delta >= tick_period) {
+		ktime_t tmp_jiffies_update =
+			   ktime_add(last_jiffies_update, tick_period);

 		delta = ktime_sub(delta, tick_period);
-		/* Pairs with the lockless read in this function. */
-		WRITE_ONCE(last_jiffies_update,
-			   ktime_add(last_jiffies_update, tick_period));

 		/* Slow path for long timeouts */
 		if (unlikely(delta >= tick_period)) {
@@ -82,21 +80,25 @@ static void tick_do_update_jiffies64(ktime_t now)

 			ticks = ktime_divns(delta, incr);

-			/* Pairs with the lockless read in this function. */
-			WRITE_ONCE(last_jiffies_update,
-				   ktime_add_ns(last_jiffies_update,
-						incr * ticks));
+			tmp_jiffies_update =
+				   ktime_add_ns(tmp_jiffies_update,
+						incr * ticks);
 		}
-		do_timer(++ticks);
+		ticks++;
+
+		write_seqcount_begin(&jiffies_seq);
+		/* Pairs with the lockless read in this function. */
+		WRITE_ONCE(last_jiffies_update, tmp_jiffies_update);
+		jiffies_64 += ticks;

 		/* Keep the tick_next_period variable up to date */
 		tick_next_period = ktime_add(last_jiffies_update, tick_period);
-	} else {
 		write_seqcount_end(&jiffies_seq);
+		calc_global_load();
+	} else {
 		raw_spin_unlock(&jiffies_lock);
 		return;
 	}
-	write_seqcount_end(&jiffies_seq);
 	raw_spin_unlock(&jiffies_lock);
 	update_wall_time();
 }

> Thanks,
> 
>         tglx
> .
>