lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20220330190556.GL4285@paulmck-ThinkPad-P17-Gen-1>
Date:   Wed, 30 Mar 2022 12:05:56 -0700
From:   "Paul E. McKenney" <paulmck@...nel.org>
To:     Zqiang <qiang1.zhang@...el.com>
Cc:     frederic@...nel.org, rcu@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH] rcu-tasks: Check the atomic variable
 trc_n_readers_need_end again when wait timeout

On Wed, Mar 30, 2022 at 07:20:14PM +0800, Zqiang wrote:
> When the trc_wait waitqueue timeout, the atomic variable
> trc_n_readers_need_end need to be checked again, perhaps the
> conditions have been established at this time, avoid invalid
> stall information output.

But are you actually seeing this invalid stall information?  Either way,
please seem my comments and question below.

Please don't get me wrong, we do have similar checks for normal vanilla
RCU stall warnings, for example, this statement in print_other_cpu_stall()
in kernel/rcu/tree_stall.h:

	pr_err("INFO: Stall ended before state dump start\n");

However, the approach used there did benefit from significant real-world
experience with its absence.  ;-)

							Thanx, Paul

> Signed-off-by: Zqiang <qiang1.zhang@...el.com>
> ---
>  kernel/rcu/tasks.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
> index 65d6e21a607a..b73a2b362d6b 100644
> --- a/kernel/rcu/tasks.h
> +++ b/kernel/rcu/tasks.h
> @@ -1544,7 +1544,7 @@ static void rcu_tasks_trace_postgp(struct rcu_tasks *rtp)
>  				trc_wait,
>  				atomic_read(&trc_n_readers_need_end) == 0,
>  				READ_ONCE(rcu_task_stall_timeout));

If I understand correctly, this patch is intended to handle a situation
where the wait_event_idle_exclusive_timeout() timed out, but where the
value of trc_n_readers_need_end decreased to zero just at this point.
If so, please see my question below.  If not, please show me the exact
sequence of events leading up to the problem.

> -		if (ret)
> +		if (ret || !atomic_read(&trc_n_readers_need_end))
>  			break;  // Count reached zero.

Couldn't the value of trc_n_readers_need_end decrease to zero right
here, still resulting in invalid stall information?

>  		// Stall warning time, so make a list of the offenders.
>  		rcu_read_lock();
> -- 
> 2.25.1
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ