lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20230603012338.GA2795276@google.com>
Date:   Sat, 3 Jun 2023 01:23:38 +0000
From:   Joel Fernandes <joel@...lfernandes.org>
To:     Frederic Weisbecker <frederic@...nel.org>
Cc:     "Paul E . McKenney" <paulmck@...nel.org>,
        LKML <linux-kernel@...r.kernel.org>, rcu <rcu@...r.kernel.org>,
        Uladzislau Rezki <urezki@...il.com>,
        Neeraj Upadhyay <quic_neeraju@...cinc.com>,
        Giovanni Gherdovich <ggherdovich@...e.cz>
Subject: Re: [PATCH 4/9] rcu: Introduce lazy queue's own qhimark

On Wed, May 31, 2023 at 12:17:31PM +0200, Frederic Weisbecker wrote:
> The lazy and the regular bypass queues share the same thresholds in
> terms of number of callbacks after which a flush to the main list is
> performed: 10 000 callbacks.
> 
> However lazy and regular bypass don't have the same purposes and neither
> should their respective thresholds:
> 
> * The bypass queue stands for relieving the main queue in case of a
>   callback storm. It makes sense to allow a high number of callbacks to
>   pile up before flushing to the main queue, especially as the life
>   cycle for this queue is very short (1 jiffy).

Sure.

> 
> * The lazy queue aims to spare wake ups and reduce the number of grace
>   periods. There it doesn't make sense to allow a huge number of
>   callbacks before flushing so as not to introduce memory pressure,
>   especially as the life cycle for this queue is very long (10
>   seconds).

It does make sense as we have a shrinker, and it is better to avoid RCU
disturbing the system unwantedly when there's lots of memory lying around.

> 
> For those reasons, set the default threshold for the lazy queue to
> 100.

I am OK with splitting the qhimark, but this lazy default is too low. In
typical workloads on ChromeOS, we see 1000s of callback even when CPU
utilization is low. So considering that, we would be flushing all the time.

Eventually I want the mm subsystem to tell us when flushing is needed so we
could flush sooner at that time if really needed, but for now we have a
shrinker so it should be OK. Is there a reason the shrinker does not work for
you?

thanks,

 - Joel


> 
> Signed-off-by: Frederic Weisbecker <frederic@...nel.org>
> ---
>  kernel/rcu/tree.c      | 2 ++
>  kernel/rcu/tree_nocb.h | 9 ++++-----
>  2 files changed, 6 insertions(+), 5 deletions(-)
> 
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index bc4e7c9b51cb..9b98d87fa22e 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -379,6 +379,8 @@ static int rcu_is_cpu_rrupt_from_idle(void)
>  static long blimit = DEFAULT_RCU_BLIMIT;
>  #define DEFAULT_RCU_QHIMARK 10000 // If this many pending, ignore blimit.
>  static long qhimark = DEFAULT_RCU_QHIMARK;
> +#define DEFAULT_RCU_QHIMARK_LAZY 100 // If this many pending, flush.
> +static long qhimark_lazy = DEFAULT_RCU_QHIMARK_LAZY;
>  #define DEFAULT_RCU_QLOMARK 100   // Once only this many pending, use blimit.
>  static long qlowmark = DEFAULT_RCU_QLOMARK;
>  #define DEFAULT_RCU_QOVLD_MULT 2
> diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
> index 8320eb77b58b..c08447db5a2e 100644
> --- a/kernel/rcu/tree_nocb.h
> +++ b/kernel/rcu/tree_nocb.h
> @@ -480,10 +480,9 @@ static bool rcu_nocb_try_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
>  
>  	// If ->nocb_bypass has been used too long or is too full,
>  	// flush ->nocb_bypass to ->cblist.
> -	if ((ncbs && !bypass_is_lazy && j != READ_ONCE(rdp->nocb_bypass_first)) ||
> -	    (ncbs &&  bypass_is_lazy &&
> -	     (time_after(j, READ_ONCE(rdp->nocb_bypass_first) + jiffies_lazy_flush))) ||
> -	    ncbs >= qhimark) {
> +	if (ncbs &&
> +	    ((!bypass_is_lazy && ((j != READ_ONCE(rdp->nocb_bypass_first)) || ncbs >= qhimark)) ||
> +	     (bypass_is_lazy && (time_after(j, READ_ONCE(rdp->nocb_bypass_first) + jiffies_lazy_flush) || ncbs >= qhimark_lazy)))) {
>  		rcu_nocb_lock(rdp);
>  		*was_alldone = !rcu_segcblist_pend_cbs(&rdp->cblist);
>  
> @@ -724,7 +723,7 @@ static void nocb_gp_wait(struct rcu_data *my_rdp)
>  
>  		if (bypass_ncbs && (lazy_ncbs == bypass_ncbs) &&
>  		    (time_after(j, READ_ONCE(rdp->nocb_bypass_first) + jiffies_lazy_flush) ||
> -		     bypass_ncbs > 2 * qhimark)) {
> +		     bypass_ncbs > 2 * qhimark_lazy)) {
>  			flush_bypass = true;
>  		} else if (bypass_ncbs && (lazy_ncbs != bypass_ncbs) &&
>  		    (time_after(j, READ_ONCE(rdp->nocb_bypass_first) + 1) ||
> -- 
> 2.40.1
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ