Message-ID: <20170523071521.GH12813@dhcp22.suse.cz>
Date: Tue, 23 May 2017 09:15:21 +0200
From: Michal Hocko <mhocko@...nel.org>
To: Vlastimil Babka <vbabka@...e.cz>
Cc: Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
linux-api@...r.kernel.org, linux-kernel@...r.kernel.org,
cgroups@...r.kernel.org, Li Zefan <lizefan@...wei.com>,
Mel Gorman <mgorman@...hsingularity.net>,
David Rientjes <rientjes@...gle.com>,
Christoph Lameter <cl@...ux.com>,
Hugh Dickins <hughd@...gle.com>,
Andrea Arcangeli <aarcange@...hat.com>,
Anshuman Khandual <khandual@...ux.vnet.ibm.com>,
"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>
Subject: Re: [PATCH v2 5/6] mm, cpuset: always use seqlock when changing task's nodemask
On Wed 17-05-17 10:11:39, Vlastimil Babka wrote:
> When updating task's mems_allowed and rebinding its mempolicy due to cpuset's
> mems being changed, we currently only take the seqlock for writing when either
> the task has a mempolicy, or the new mems has no intersection with the old
> mems. This should be enough to prevent a parallel allocation seeing no
> available nodes, but the optimization is IMHO unnecessary (cpuset updates
> should not be frequent), and we still potentially risk issues if the
> intersection of new and old nodes has limited amount of free/reclaimable
> memory. Let's just use the seqlock for all tasks.
Agreed
> Signed-off-by: Vlastimil Babka <vbabka@...e.cz>
Acked-by: Michal Hocko <mhocko@...e.com>
> ---
> kernel/cgroup/cpuset.c | 29 ++++++++---------------------
> 1 file changed, 8 insertions(+), 21 deletions(-)
>
> diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
> index dfd5b420452d..26a1c360a481 100644
> --- a/kernel/cgroup/cpuset.c
> +++ b/kernel/cgroup/cpuset.c
> @@ -1038,38 +1038,25 @@ static void cpuset_post_attach(void)
> * @tsk: the task to change
> * @newmems: new nodes that the task will be set
> *
> - * In order to avoid seeing no nodes if the old and new nodes are disjoint,
> - * we structure updates as setting all new allowed nodes, then clearing newly
> - * disallowed ones.
> + * We use the mems_allowed_seq seqlock to safely update both tsk->mems_allowed
> + * and rebind an eventual tasks' mempolicy. If the task is allocating in
> + * parallel, it might temporarily see an empty intersection, which results in
> + * a seqlock check and retry before OOM or allocation failure.
> */
> static void cpuset_change_task_nodemask(struct task_struct *tsk,
> nodemask_t *newmems)
> {
> - bool need_loop;
> -
> task_lock(tsk);
> - /*
> - * Determine if a loop is necessary if another thread is doing
> - * read_mems_allowed_begin(). If at least one node remains unchanged and
> - * tsk does not have a mempolicy, then an empty nodemask will not be
> - * possible when mems_allowed is larger than a word.
> - */
> - need_loop = task_has_mempolicy(tsk) ||
> - !nodes_intersects(*newmems, tsk->mems_allowed);
>
> - if (need_loop) {
> - local_irq_disable();
> - write_seqcount_begin(&tsk->mems_allowed_seq);
> - }
> + local_irq_disable();
> + write_seqcount_begin(&tsk->mems_allowed_seq);
>
> nodes_or(tsk->mems_allowed, tsk->mems_allowed, *newmems);
> mpol_rebind_task(tsk, newmems);
> tsk->mems_allowed = *newmems;
>
> - if (need_loop) {
> - write_seqcount_end(&tsk->mems_allowed_seq);
> - local_irq_enable();
> - }
> + write_seqcount_end(&tsk->mems_allowed_seq);
> + local_irq_enable();
>
> task_unlock(tsk);
> }
> --
> 2.12.2
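For readers following along, the write/retry pattern the new comment describes can be sketched in userspace roughly as below. This is only an illustration under stated assumptions, not the kernel code: struct toy_task and the toy_* helpers are hypothetical names, the nodemask is shrunk to a single 64-bit word, C11 atomics stand in for the kernel's seqcount primitives, and a single writer is assumed (the kernel serializes writers with task_lock() and disables IRQs).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for tsk->mems_allowed and tsk->mems_allowed_seq. */
struct toy_task {
	atomic_uint seq;        /* even: stable, odd: update in progress */
	_Atomic uint64_t mems;  /* toy nodemask, one bit per node */
};

/* Writer: unconditionally bump the sequence around the update. */
static void toy_change_nodemask(struct toy_task *t, uint64_t newmems)
{
	atomic_fetch_add(&t->seq, 1);         /* begin: sequence becomes odd */
	atomic_fetch_or(&t->mems, newmems);   /* first widen to old | new ... */
	atomic_store(&t->mems, newmems);      /* ... then narrow to new */
	atomic_fetch_add(&t->seq, 1);         /* end: sequence becomes even */
}

/* Reader: wait out an in-progress update, then remember the sequence. */
static unsigned toy_read_begin(struct toy_task *t)
{
	unsigned s;

	while ((s = atomic_load(&t->seq)) & 1u)
		; /* writer active, spin */
	return s;
}

/* Reader: true if the mask may have changed since toy_read_begin(). */
static bool toy_read_retry(struct toy_task *t, unsigned start)
{
	return atomic_load(&t->seq) != start;
}

/*
 * Pick the lowest allowed node that is also in 'wanted'. An empty
 * intersection is only reported as failure (-1) if no update raced
 * with the read; otherwise the lookup is retried.
 */
static int toy_pick_node(struct toy_task *t, uint64_t wanted)
{
	for (;;) {
		unsigned s = toy_read_begin(t);
		uint64_t usable = atomic_load(&t->mems) & wanted;
		int node = 0;

		if (!usable) {
			if (toy_read_retry(t, s))
				continue;   /* raced with an update, retry */
			return -1;          /* stable and empty: real failure */
		}
		while (!(usable & 1)) {     /* index of lowest set bit */
			usable >>= 1;
			node++;
		}
		return node;
	}
}
```

The point of the sketch is that the writer takes the sequence unconditionally, so a reader that hits an empty intersection can always tell whether it raced with an update before giving up.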
--
Michal Hocko
SUSE Labs