lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20170517081140.30654-6-vbabka@suse.cz>
Date:   Wed, 17 May 2017 10:11:39 +0200
From:   Vlastimil Babka <vbabka@...e.cz>
To:     Andrew Morton <akpm@...ux-foundation.org>
Cc:     linux-mm@...ck.org, linux-api@...r.kernel.org,
        linux-kernel@...r.kernel.org, cgroups@...r.kernel.org,
        Li Zefan <lizefan@...wei.com>,
        Michal Hocko <mhocko@...nel.org>,
        Mel Gorman <mgorman@...hsingularity.net>,
        David Rientjes <rientjes@...gle.com>,
        Christoph Lameter <cl@...ux.com>,
        Hugh Dickins <hughd@...gle.com>,
        Andrea Arcangeli <aarcange@...hat.com>,
        Anshuman Khandual <khandual@...ux.vnet.ibm.com>,
        "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
        Vlastimil Babka <vbabka@...e.cz>
Subject: [PATCH v2 5/6] mm, cpuset: always use seqlock when changing task's nodemask

When updating task's mems_allowed and rebinding its mempolicy due to cpuset's
mems being changed, we currently only take the seqlock for writing when either
the task has a mempolicy, or the new mems has no intersection with the old
mems. This should be enough to prevent a parallel allocation seeing no
available nodes, but the optimization is IMHO unnecessary (cpuset updates
should not be frequent), and we still potentially risk issues if the
intersection of new and old nodes has limited amount of free/reclaimable
memory. Let's just use the seqlock for all tasks.

Signed-off-by: Vlastimil Babka <vbabka@...e.cz>
---
 kernel/cgroup/cpuset.c | 29 ++++++++---------------------
 1 file changed, 8 insertions(+), 21 deletions(-)

diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index dfd5b420452d..26a1c360a481 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -1038,38 +1038,25 @@ static void cpuset_post_attach(void)
  * @tsk: the task to change
  * @newmems: new nodes that the task will be set
  *
- * In order to avoid seeing no nodes if the old and new nodes are disjoint,
- * we structure updates as setting all new allowed nodes, then clearing newly
- * disallowed ones.
+ * We use the mems_allowed_seq seqlock to safely update both tsk->mems_allowed
+ * and rebind an eventual tasks' mempolicy. If the task is allocating in
+ * parallel, it might temporarily see an empty intersection, which results in
+ * a seqlock check and retry before OOM or allocation failure.
  */
 static void cpuset_change_task_nodemask(struct task_struct *tsk,
 					nodemask_t *newmems)
 {
-	bool need_loop;
-
 	task_lock(tsk);
-	/*
-	 * Determine if a loop is necessary if another thread is doing
-	 * read_mems_allowed_begin().  If at least one node remains unchanged and
-	 * tsk does not have a mempolicy, then an empty nodemask will not be
-	 * possible when mems_allowed is larger than a word.
-	 */
-	need_loop = task_has_mempolicy(tsk) ||
-			!nodes_intersects(*newmems, tsk->mems_allowed);
 
-	if (need_loop) {
-		local_irq_disable();
-		write_seqcount_begin(&tsk->mems_allowed_seq);
-	}
+	local_irq_disable();
+	write_seqcount_begin(&tsk->mems_allowed_seq);
 
 	nodes_or(tsk->mems_allowed, tsk->mems_allowed, *newmems);
 	mpol_rebind_task(tsk, newmems);
 	tsk->mems_allowed = *newmems;
 
-	if (need_loop) {
-		write_seqcount_end(&tsk->mems_allowed_seq);
-		local_irq_enable();
-	}
+	write_seqcount_end(&tsk->mems_allowed_seq);
+	local_irq_enable();
 
 	task_unlock(tsk);
 }
-- 
2.12.2

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ