Date:   Mon, 9 Jan 2023 07:25:05 -0800
From:   "Paul E. McKenney" <paulmck@...nel.org>
To:     Frederic Weisbecker <frederic@...nel.org>
Cc:     Zhouyi Zhou <zhouzhouyi@...il.com>, fweisbec@...il.com,
        tglx@...utronix.de, mingo@...nel.org, rcu@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH linux-next] mark access to tick_do_timer_cpu with
 READ_ONCE/WRITE_ONCE

On Mon, Jan 09, 2023 at 01:51:29PM +0100, Frederic Weisbecker wrote:
> On Mon, Dec 19, 2022 at 01:21:28PM +0800, Zhouyi Zhou wrote:
> > Mark accesses to tick_do_timer_cpu with READ_ONCE/WRITE_ONCE to fix a data race
> > reported by KCSAN.
> > 
> > Signed-off-by: Zhouyi Zhou <zhouzhouyi@...il.com>
> > ---
> > During the rcutorture test on linux-next,
> > ./tools/testing/selftests/rcutorture/bin/torture.sh --do-kcsan  --kcsan-kmake-arg "CC=clang-12"
> > the following KCSAN BUG is reported:
> > [   35.397089] BUG: KCSAN: data-race in tick_nohz_idle_stop_tick / tick_nohz_next_event
> > [   35.400593]
> > [   35.401377] write to 0xffffffffb64b1270 of 4 bytes by task 0 on cpu 3:
> > [   35.405325]  tick_nohz_idle_stop_tick+0x14c/0x3e0
> > [   35.407162]  do_idle+0xf3/0x2a0
> > [   35.408016]  cpu_startup_entry+0x15/0x20
> > [   35.409084]  start_secondary+0x8f/0x90
> > [   35.410207]  secondary_startup_64_no_verify+0xe1/0xeb
> > [   35.411607]
> > [   35.412042] no locks held by swapper/3/0.
> > [   35.413172] irq event stamp: 53048
> > [   35.414175] hardirqs last  enabled at (53047): [<ffffffffb41f8404>] tick_nohz_idle_enter+0x104/0x140
> > [   35.416681] hardirqs last disabled at (53048): [<ffffffffb41229f1>] do_idle+0x91/0x2a0
> > [   35.418988] softirqs last  enabled at (53038): [<ffffffffb40bf21e>] __irq_exit_rcu+0x6e/0xc0
> > [   35.421347] softirqs last disabled at (53029): [<ffffffffb40bf21e>] __irq_exit_rcu+0x6e/0xc0
> > [   35.423685]
> > [   35.424119] read to 0xffffffffb64b1270 of 4 bytes by task 0 on cpu 0:
> > [   35.425870]  tick_nohz_next_event+0x233/0x2b0
> > [   35.427119]  tick_nohz_idle_stop_tick+0x8f/0x3e0
> > [   35.428386]  do_idle+0xf3/0x2a0
> > [   35.429265]  cpu_startup_entry+0x15/0x20
> > [   35.430429]  rest_init+0x20c/0x210
> > [   35.431382]  arch_call_rest_init+0xe/0x10
> > [   35.432508]  start_kernel+0x544/0x600
> > [   35.433519]  secondary_startup_64_no_verify+0xe1/0xeb
> > 
> > Fix the above bug by marking accesses to tick_do_timer_cpu with READ_ONCE/WRITE_ONCE.
> 
> This has been discussed before with passion:
> 
> http://archive.lwn.net:8080/linux-kernel/1C65422C-FFA4-4651-893B-300FAF9C49DE@....pw/T/
> 
> To me data_race() would be more appropriate but that would need a changelog with
> proper analysis of the tick_do_timer_cpu state machine.

Please also include an analysis of why the compiler cannot do any destructive
optimizations in this case.  Maybe also add comments.
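
For readers following along, here is a rough userspace sketch of what
marking an access buys you.  It is not the kernel code: the macros are
simplified stand-ins for READ_ONCE()/WRITE_ONCE() (the kernel's real
definitions are more elaborate), they rely on the GCC/clang __typeof__
extension, and demo_timer_cpu, DEMO_CPU_NONE, demo_claim() and
demo_release() are hypothetical names made up for illustration.  The point
is only that a volatile access forces a single load or store that the
compiler may not tear, fuse, or repeat.  data_race(), by contrast, only
tells KCSAN that a race is intentional and places no such constraint on the
compiler, which is why it needs the analysis requested above.

```c
/* Simplified stand-ins for the kernel macros (assumption: illustration only). */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

#define DEMO_CPU_NONE (-1)	/* hypothetical "no owner" marker */

/* Hypothetical shared variable, loosely modelled on tick_do_timer_cpu. */
static int demo_timer_cpu = DEMO_CPU_NONE;

/*
 * Take the duty if nobody owns it, then report whether `cpu` owns it.
 * The value is loaded once into a local, so both tests below see the
 * same snapshot even if another CPU stores to the variable meanwhile.
 * Note: the check-then-store is NOT atomic; this sketch shows marked
 * accesses only, not a complete ownership protocol.
 */
static int demo_claim(int cpu)
{
	int owner = READ_ONCE(demo_timer_cpu);

	if (owner == DEMO_CPU_NONE) {
		WRITE_ONCE(demo_timer_cpu, cpu);
		owner = cpu;
	}
	return owner == cpu;
}

/* Give the duty up, but only if `cpu` currently owns it. */
static void demo_release(int cpu)
{
	if (READ_ONCE(demo_timer_cpu) == cpu)
		WRITE_ONCE(demo_timer_cpu, DEMO_CPU_NONE);
}
```

With a plain (unmarked) read in demo_claim(), the compiler would be free to
reload demo_timer_cpu between the two comparisons, so the function could act
on two different values.  Whether that kind of transformation can actually
hurt the real tick_do_timer_cpu code is exactly what the requested analysis
would have to establish.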

> One more thing on my TODO list, but feel free to beat me to it :-)

I know that feeling!  ;-)

							Thanx, Paul
