Date:   Mon, 20 Nov 2023 12:31:05 +0100
From:   Peter Zijlstra <peterz@...radead.org>
To:     Yury Norov <yury.norov@...il.com>, mathieu.desnoyers@...icios.com
Cc:     linux-kernel@...r.kernel.org,
        Andy Shevchenko <andriy.shevchenko@...ux.intel.com>,
        Rasmus Villemoes <linux@...musvillemoes.dk>,
        Ingo Molnar <mingo@...hat.com>,
        Juri Lelli <juri.lelli@...hat.com>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
        Daniel Bristot de Oliveira <bristot@...hat.com>,
        Valentin Schneider <vschneid@...hat.com>,
        Jan Kara <jack@...e.cz>,
        Mirsad Todorovac <mirsad.todorovac@....unizg.hr>,
        Matthew Wilcox <willy@...radead.org>,
        Maxim Kuvyrkov <maxim.kuvyrkov@...aro.org>,
        Alexey Klimov <klimov.linux@...il.com>
Subject: Re: [PATCH 04/34] sched: add cpumask_find_and_set() and use it in
 __mm_cid_get()

On Sat, Nov 18, 2023 at 07:50:35AM -0800, Yury Norov wrote:
> __mm_cid_get() uses a __mm_cid_try_get() helper to atomically acquire a
> bit in mm cid mask. Now that we have atomic find_and_set_bit(), we can
> easily extend it to cpumasks and use in the scheduler code.
> 
> __mm_cid_try_get() has an infinite loop, which may delay forward
> progress of __mm_cid_get() when the mask is dense. The
> cpumask_find_and_set() doesn't poll the mask infinitely, and returns as
> soon as nothing is found after the first iteration, allowing the caller
> to acquire the lock and set use_cid_lock faster, if needed.

Mathieu, I forgot again, but the deleted comment seems to suggest you did
this on purpose...
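
For readers following along, the single-pass behaviour described in the
quoted commit message can be sketched in userspace C11. This is only an
illustration under stated assumptions, not the kernel's find_and_set_bit():
sketch_find_and_set_bit() and BITS_PER_WORD are hypothetical names, and
C11 atomics stand in for the kernel's atomic bitops.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/*
 * Hypothetical userspace stand-in for find_and_set_bit(): scan the map
 * once for a zero bit and claim it atomically. Returns nbits if the
 * single pass finds nothing, so the caller decides whether to retry or
 * to fall back to a lock. It never loops waiting for a bit to clear.
 */
static unsigned int sketch_find_and_set_bit(_Atomic unsigned long *map,
					    unsigned int nbits)
{
	for (unsigned int bit = 0; bit < nbits; bit++) {
		_Atomic unsigned long *word = &map[bit / BITS_PER_WORD];
		unsigned long mask = 1UL << (bit % BITS_PER_WORD);

		/* Cheap check first: skip bits that are visibly taken. */
		if (atomic_load(word) & mask)
			continue;

		/*
		 * fetch_or returns the old word. The bit is ours only if
		 * it was still clear; otherwise someone raced us and we
		 * keep scanning.
		 */
		if (!(atomic_fetch_or(word, mask) & mask))
			return bit;
	}
	return nbits;
}
```

The removed __mm_cid_try_get() instead spun on cpumask_first_zero() until
a zero bit showed up, and only then attempted the test-and-set.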

> cpumask_find_and_set() treats the cid mask as a volatile region of
> memory, as it actually is in this case. So, if it's changed while a
> search is in progress, KCSAN won't fire a warning on it.
> 
> Signed-off-by: Yury Norov <yury.norov@...il.com>
> ---
>  include/linux/cpumask.h | 12 ++++++++++
>  kernel/sched/sched.h    | 52 ++++++++++++-----------------------------
>  2 files changed, 27 insertions(+), 37 deletions(-)
> 
> diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
> index cfb545841a2c..c2acced8be4e 100644
> --- a/include/linux/cpumask.h
> +++ b/include/linux/cpumask.h
> @@ -271,6 +271,18 @@ unsigned int cpumask_next_and(int n, const struct cpumask *src1p,
>  		small_cpumask_bits, n + 1);
>  }
>  
> +/**
> + * cpumask_find_and_set - find the first unset cpu in a cpumask and
> + *			  set it atomically
> + * @srcp: the cpumask pointer
> + *
> + * Return: >= nr_cpu_ids if nothing is found.
> + */
> +static inline unsigned int cpumask_find_and_set(volatile struct cpumask *srcp)
> +{
> +	return find_and_set_bit(cpumask_bits(srcp), small_cpumask_bits);
> +}
> +
>  /**
>   * for_each_cpu - iterate over every cpu in a mask
>   * @cpu: the (optionally unsigned) integer iterator
> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> index 2e5a95486a42..b2f095a9fc40 100644
> --- a/kernel/sched/sched.h
> +++ b/kernel/sched/sched.h
> @@ -3345,28 +3345,6 @@ static inline void mm_cid_put(struct mm_struct *mm)
>  	__mm_cid_put(mm, mm_cid_clear_lazy_put(cid));
>  }
>  
> -static inline int __mm_cid_try_get(struct mm_struct *mm)
> -{
> -	struct cpumask *cpumask;
> -	int cid;
> -
> -	cpumask = mm_cidmask(mm);
> -	/*
> -	 * Retry finding first zero bit if the mask is temporarily
> -	 * filled. This only happens during concurrent remote-clear
> -	 * which owns a cid without holding a rq lock.
> -	 */
> -	for (;;) {
> -		cid = cpumask_first_zero(cpumask);
> -		if (cid < nr_cpu_ids)
> -			break;
> -		cpu_relax();
> -	}
> -	if (cpumask_test_and_set_cpu(cid, cpumask))
> -		return -1;
> -	return cid;
> -}
> -
>  /*
>   * Save a snapshot of the current runqueue time of this cpu
>   * with the per-cpu cid value, allowing to estimate how recently it was used.
> @@ -3381,25 +3359,25 @@ static inline void mm_cid_snapshot_time(struct rq *rq, struct mm_struct *mm)
>  
>  static inline int __mm_cid_get(struct rq *rq, struct mm_struct *mm)
>  {
> +	struct cpumask *cpumask = mm_cidmask(mm);
>  	int cid;
>  
> -	/*
> -	 * All allocations (even those using the cid_lock) are lock-free. If
> -	 * use_cid_lock is set, hold the cid_lock to perform cid allocation to
> -	 * guarantee forward progress.
> -	 */
> +	/* All allocations (even those using the cid_lock) are lock-free. */
>  	if (!READ_ONCE(use_cid_lock)) {
> -		cid = __mm_cid_try_get(mm);
> -		if (cid >= 0)
> +		cid = cpumask_find_and_set(cpumask);
> +		if (cid < nr_cpu_ids)
>  			goto end;
> -		raw_spin_lock(&cid_lock);
> -	} else {
> -		raw_spin_lock(&cid_lock);
> -		cid = __mm_cid_try_get(mm);
> -		if (cid >= 0)
> -			goto unlock;
>  	}
>  
> +	/*
> +	 * If use_cid_lock is set, hold the cid_lock to perform cid
> +	 * allocation to guarantee forward progress.
> +	 */
> +	raw_spin_lock(&cid_lock);
> +	cid = cpumask_find_and_set(cpumask);
> +	if (cid < nr_cpu_ids)
> +		goto unlock;
> +
>  	/*
>  	 * cid concurrently allocated. Retry while forcing following
>  	 * allocations to use the cid_lock to ensure forward progress.
> @@ -3415,9 +3393,9 @@ static inline int __mm_cid_get(struct rq *rq, struct mm_struct *mm)
>  	 * all newcoming allocations observe the use_cid_lock flag set.
>  	 */
>  	do {
> -		cid = __mm_cid_try_get(mm);
> +		cid = cpumask_find_and_set(cpumask);
>  		cpu_relax();
> -	} while (cid < 0);
> +	} while (cid >= nr_cpu_ids);
>  	/*
>  	 * Allocate before clearing use_cid_lock. Only care about
>  	 * program order because this is for forward progress.
> -- 
> 2.39.2
> 
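
The control flow of the patched __mm_cid_get() (lock-free attempt first,
then the lock, then forcing everyone onto the lock) can be sketched in
userspace as follows. This is a rough model under assumptions, not the
scheduler code: every sketch_* name is hypothetical, an atomic_flag spin
stands in for cid_lock, and the memory-barrier and lazy-put details of the
real function are left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define SKETCH_NR_IDS 4u	/* stand-in for nr_cpu_ids */

static _Atomic unsigned long sketch_mask;	/* stand-in for the mm cid mask */
static atomic_bool sketch_use_lock;		/* stand-in for use_cid_lock */
static atomic_flag sketch_lock = ATOMIC_FLAG_INIT; /* stand-in for cid_lock */

/* One pass over the mask; returns SKETCH_NR_IDS if no bit could be claimed. */
static unsigned int sketch_find_and_set(void)
{
	for (unsigned int bit = 0; bit < SKETCH_NR_IDS; bit++) {
		unsigned long mask = 1UL << bit;

		if (!(atomic_fetch_or(&sketch_mask, mask) & mask))
			return bit;
	}
	return SKETCH_NR_IDS;
}

/* Release a previously claimed id. */
static void sketch_put(unsigned int id)
{
	atomic_fetch_and(&sketch_mask, ~(1UL << id));
}

static unsigned int sketch_get(void)
{
	unsigned int id;

	/* Fast path: a single lock-free pass unless the lock is forced. */
	if (!atomic_load(&sketch_use_lock)) {
		id = sketch_find_and_set();
		if (id < SKETCH_NR_IDS)
			return id;
	}

	/* Slow path: serialize allocators and try once more. */
	while (atomic_flag_test_and_set(&sketch_lock))
		;
	id = sketch_find_and_set();
	if (id >= SKETCH_NR_IDS) {
		/*
		 * Still nothing: push new allocators onto the lock and
		 * retry. This spins until another thread calls
		 * sketch_put(), so it only terminates if ids are
		 * eventually released.
		 */
		atomic_store(&sketch_use_lock, true);
		do {
			id = sketch_find_and_set();
		} while (id >= SKETCH_NR_IDS);
		atomic_store(&sketch_use_lock, false);
	}
	atomic_flag_clear(&sketch_lock);
	return id;
}
```

In this model the only unbounded loop left is the one under the lock,
which matches the do/while that the patch keeps at the end of
__mm_cid_get().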
