linux-kernel - Re: [PATCH v2 5/6] mm, cpuset: always use seqlock when changing task's nodemask


Message-ID: <20170523071521.GH12813@dhcp22.suse.cz>
Date:   Tue, 23 May 2017 09:15:21 +0200
From:   Michal Hocko <mhocko@...nel.org>
To:     Vlastimil Babka <vbabka@...e.cz>
Cc:     Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
        linux-api@...r.kernel.org, linux-kernel@...r.kernel.org,
        cgroups@...r.kernel.org, Li Zefan <lizefan@...wei.com>,
        Mel Gorman <mgorman@...hsingularity.net>,
        David Rientjes <rientjes@...gle.com>,
        Christoph Lameter <cl@...ux.com>,
        Hugh Dickins <hughd@...gle.com>,
        Andrea Arcangeli <aarcange@...hat.com>,
        Anshuman Khandual <khandual@...ux.vnet.ibm.com>,
        "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>
Subject: Re: [PATCH v2 5/6] mm, cpuset: always use seqlock when changing
 task's nodemask

On Wed 17-05-17 10:11:39, Vlastimil Babka wrote:
> When updating a task's mems_allowed and rebinding its mempolicy due to cpuset's
> mems being changed, we currently only take the seqlock for writing when either
> the task has a mempolicy, or the new mems have no intersection with the old
> mems. This should be enough to prevent a parallel allocation seeing no
> available nodes, but the optimization is IMHO unnecessary (cpuset updates
> should not be frequent), and we still potentially risk issues if the
> intersection of new and old nodes has a limited amount of free/reclaimable
> memory. Let's just use the seqlock for all tasks.

Agreed

> Signed-off-by: Vlastimil Babka <vbabka@...e.cz>

Acked-by: Michal Hocko <mhocko@...e.com>

> ---
>  kernel/cgroup/cpuset.c | 29 ++++++++---------------------
>  1 file changed, 8 insertions(+), 21 deletions(-)
> 
> diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
> index dfd5b420452d..26a1c360a481 100644
> --- a/kernel/cgroup/cpuset.c
> +++ b/kernel/cgroup/cpuset.c
> @@ -1038,38 +1038,25 @@ static void cpuset_post_attach(void)
>   * @tsk: the task to change
>   * @newmems: new nodes that the task will be set
>   *
> - * In order to avoid seeing no nodes if the old and new nodes are disjoint,
> - * we structure updates as setting all new allowed nodes, then clearing newly
> - * disallowed ones.
> + * We use the mems_allowed_seq seqlock to safely update both tsk->mems_allowed
> + * and rebind an eventual tasks' mempolicy. If the task is allocating in
> + * parallel, it might temporarily see an empty intersection, which results in
> + * a seqlock check and retry before OOM or allocation failure.
>   */
>  static void cpuset_change_task_nodemask(struct task_struct *tsk,
>  					nodemask_t *newmems)
>  {
> -	bool need_loop;
> -
>  	task_lock(tsk);
> -	/*
> -	 * Determine if a loop is necessary if another thread is doing
> -	 * read_mems_allowed_begin().  If at least one node remains unchanged and
> -	 * tsk does not have a mempolicy, then an empty nodemask will not be
> -	 * possible when mems_allowed is larger than a word.
> -	 */
> -	need_loop = task_has_mempolicy(tsk) ||
> -			!nodes_intersects(*newmems, tsk->mems_allowed);
>  
> -	if (need_loop) {
> -		local_irq_disable();
> -		write_seqcount_begin(&tsk->mems_allowed_seq);
> -	}
> +	local_irq_disable();
> +	write_seqcount_begin(&tsk->mems_allowed_seq);
>  
>  	nodes_or(tsk->mems_allowed, tsk->mems_allowed, *newmems);
>  	mpol_rebind_task(tsk, newmems);
>  	tsk->mems_allowed = *newmems;
>  
> -	if (need_loop) {
> -		write_seqcount_end(&tsk->mems_allowed_seq);
> -		local_irq_enable();
> -	}
> +	write_seqcount_end(&tsk->mems_allowed_seq);
> +	local_irq_enable();
>  
>  	task_unlock(tsk);
>  }
> -- 
> 2.12.2

-- 
Michal Hocko
SUSE Labs