lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1583506974.7365.160.camel@lca.pw>
Date:   Fri, 06 Mar 2020 10:02:54 -0500
From:   Qian Cai <cai@....pw>
To:     Thomas Gleixner <tglx@...utronix.de>
Cc:     fweisbec@...il.com, mingo@...nel.org, elver@...gle.com,
        linux-kernel@...r.kernel.org, Peter Zijlstra <peterz@...radead.org>
Subject: Re: [PATCH] tick/sched: fix data races at tick_do_timer_cpu

On Wed, 2020-03-04 at 06:20 -0500, Qian Cai wrote:
> > On Mar 4, 2020, at 4:39 AM, Thomas Gleixner <tglx@...utronix.de> wrote:
> > 
> > They are reported, but are they actually a real problem?
> > 
> > This completely lacks analysis why these 8 places need the
> > READ/WRITE_ONCE() treatment at all and if so why the other 14 places
> > accessing tick_do_timer_cpu are safe without it.
> > 
> > I definitely appreciate the work done with KCSAN, but just making the
> > tool shut up does not cut it.
> 
> Looks at tick_sched_do_timer(), for example,
> 
> if (unlikely(tick_do_timer_cpu == TICK_DO_TIMER_NONE)) {
> #ifdef CONFIG_NO_HZ_FULL
> 		WARN_ON(tick_nohz_full_running);
> #endif
> 		tick_do_timer_cpu = cpu;
> 	}
> #endif
> 
> /* Check, if the jiffies need an update */
> if (tick_do_timer_cpu == cpu)
> 	tick_do_update_jiffies64(now);
> 
> Could we rule out all compilers and archs will not optimize it if !CONFIG_NO_HZ_FULL to,
> 
> if (unlikely(tick_do_timer_cpu == TICK_DO_TIMER_NONE) || tick_do_timer_cpu == cpu)
>          tick_do_update_jiffies64(now);
> 
> So it could save a branch or/and realized that tick_do_timer_cpu is not used later in this cpu, so it could discard the store?
> 
> I am not all that familiar with all other 14 places if it is possible to happen concurrently as well, so I took a pragmatic approach that for now only deal with the places that KCSAN confirmed, and then look forward for an incremental approach if there are more places needs treatments later once confirmed.

If you don't think that will be happening and dislike using READ/WRITE_ONCE()
for documentation purpose, we could annotate those using the data_race() macro.
Is that more acceptable?

diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index a792d21cac64..08ce4088da87 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -129,7 +129,7 @@ static void tick_sched_do_timer(struct tick_sched *ts,
ktime_t now)
 	 * If nohz_full is enabled, this should not happen because the
 	 * tick_do_timer_cpu never relinquishes.
 	 */
-	if (unlikely(tick_do_timer_cpu == TICK_DO_TIMER_NONE)) {
+	if (unlikely(data_race(tick_do_timer_cpu == TICK_DO_TIMER_NONE))) {
 #ifdef CONFIG_NO_HZ_FULL
 		WARN_ON(tick_nohz_full_running);
 #endif
@@ -138,7 +138,7 @@ static void tick_sched_do_timer(struct tick_sched *ts,
ktime_t now)
 #endif
 
 	/* Check, if the jiffies need an update */
-	if (tick_do_timer_cpu == cpu)
+	if (data_race(tick_do_timer_cpu == cpu))
 		tick_do_update_jiffies64(now);
 
 	if (ts->inidle)
@@ -737,8 +737,9 @@ static ktime_t tick_nohz_next_event(struct tick_sched *ts,
int cpu)
 	 * Otherwise we can sleep as long as we want.
 	 */
 	delta = timekeeping_max_deferment();
-	if (cpu != tick_do_timer_cpu &&
-	    (tick_do_timer_cpu != TICK_DO_TIMER_NONE || !ts->do_timer_last))
+	if (data_race(cpu != tick_do_timer_cpu) &&
+	    (data_race(tick_do_timer_cpu != TICK_DO_TIMER_NONE) ||
+	    !ts->do_timer_last))
 		delta = KTIME_MAX;
 
 	/* Calculate the next expiry time */
@@ -771,10 +772,10 @@ static void tick_nohz_stop_tick(struct tick_sched *ts, int
cpu)
 	 * do_timer() never invoked. Keep track of the fact that it
 	 * was the one which had the do_timer() duty last.
 	 */
-	if (cpu == tick_do_timer_cpu) {
+	if (data_race(cpu == tick_do_timer_cpu)) {
 		tick_do_timer_cpu = TICK_DO_TIMER_NONE;
 		ts->do_timer_last = 1;
-	} else if (tick_do_timer_cpu != TICK_DO_TIMER_NONE) {
+	} else if (data_race(tick_do_timer_cpu != TICK_DO_TIMER_NONE)) {
 		ts->do_timer_last = 0;
 	}
 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ