Date:   Wed, 8 Jan 2020 12:23:11 -0800
From:   Fenghua Yu <fenghua.yu@...el.com>
To:     Shakeel Butt <shakeelb@...gle.com>
Cc:     Reinette Chatre <reinette.chatre@...el.com>,
        Borislav Petkov <bp@...en8.de>,
        LKML <linux-kernel@...r.kernel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, x86@...nel.org
Subject: Re: [bug report] resctrl high memory consumption

On Wed, Jan 08, 2020 at 09:07:41AM -0800, Shakeel Butt wrote:
> Hi,
> 
> Recently we had a bug in our system software that wrote the same pids
> to the tasks file of a resctrl group multiple times. The resctrl code
> allocates a "struct task_move_callback" for each such write and calls
> task_work_add() for that task to handle the move on return to user
> space, without checking whether such a request already exists for that
> particular task. The issue arises for long-sleeping tasks, which can
> have thousands of such requests queued up to be handled. In our
> production environment, we noticed thousands of tasks each with
> thousands of such requests, consuming GiBs of memory for
> "struct task_move_callback". I am not familiar enough with the code to
> judge whether task_work_cancel() is the right approach or just
> checking closid/rmid before doing task_work_add().
> 
> 

Thank you for reporting the issue, Shakeel!

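For reference, the repeated writes you describe could be reproduced with
a small userspace program along the lines below (a hypothetical sketch,
not taken from your report; it assumes resctrl is mounted at
/sys/fs/resctrl, a group named "g1" already exists, and the target is
the PID of a long-sleeping task):

/* Hypothetical reproducer sketch (assumed mount point and group name). */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(int argc, char **argv)
{
	char buf[32];
	int i, fd, len;

	if (argc != 2) {
		fprintf(stderr, "usage: %s <pid-of-a-long-sleeping-task>\n",
			argv[0]);
		return 1;
	}
	len = snprintf(buf, sizeof(buf), "%s\n", argv[1]);

	for (i = 0; i < 100000; i++) {
		fd = open("/sys/fs/resctrl/g1/tasks", O_WRONLY);
		if (fd < 0) {
			perror("open");
			return 1;
		}
		/*
		 * Each write allocates another "struct task_move_callback"
		 * and queues another task_work entry for the target task;
		 * while that task keeps sleeping the entries pile up.
		 */
		if (write(fd, buf, len) < 0)
			perror("write");
		close(fd);
	}
	return 0;
}

The callbacks are only freed once the target task runs its queued task
works on return to user space, so for a sleeping task the memory use
grows with every duplicate write, which matches what you observed.
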
Could you please check if the following patch fixes the issue?
From 3c23c39b6a44fdfbbbe0083d074dcc114d7d7f1c Mon Sep 17 00:00:00 2001
From: Fenghua Yu <fenghua.yu@...el.com>
Date: Wed, 8 Jan 2020 19:53:33 +0000
Subject: [RFC PATCH] x86/resctrl: Fix redundant task movements

Currently a task can be moved to the same rdtgroup multiple times.
Each redundant move queues another task work, which wastes memory
and degrades performance.

To fix the issue, only move a task to a rdtgroup when the task is
not already in that rdtgroup, and return an error instead of
queuing another move when it is.

Reported-by: Shakeel Butt <shakeelb@...gle.com>
Signed-off-by: Fenghua Yu <fenghua.yu@...el.com>
---
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 2e3b06d6bbc6..75300c4a5969 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -546,6 +546,17 @@ static int __rdtgroup_move_task(struct task_struct *tsk,
 	struct task_move_callback *callback;
 	int ret;
 
+	/* If the task is already in rdtgrp, don't move the task. */
+	if ((rdtgrp->type == RDTCTRL_GROUP && tsk->closid == rdtgrp->closid &&
+	    tsk->rmid == rdtgrp->mon.rmid) ||
+	    (rdtgrp->type == RDTMON_GROUP &&
+	     rdtgrp->mon.parent->closid == tsk->closid &&
+	     tsk->rmid == rdtgrp->mon.rmid)) {
+		rdt_last_cmd_puts("Task is already in the rdtgroup\n");
+
+		return -EINVAL;
+	}
+
 	callback = kzalloc(sizeof(*callback), GFP_KERNEL);
 	if (!callback)
 		return -ENOMEM;
-- 
2.19.1
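
With the patch applied, the second identical write should fail with
-EINVAL and the message above should show up in
/sys/fs/resctrl/info/last_cmd_status. A quick check could look like
this (again a hypothetical sketch, assuming the same /sys/fs/resctrl
mount and a group named "g1"):

/* Hypothetical check: the first write moves the task into "g1", the
 * second identical write is expected to be rejected with -EINVAL. */
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

static int move_to_g1(const char *pid)
{
	int fd = open("/sys/fs/resctrl/g1/tasks", O_WRONLY);
	int ret = 0;

	if (fd < 0)
		return -errno;
	if (write(fd, pid, strlen(pid)) < 0)
		ret = -errno;
	close(fd);
	return ret;
}

int main(int argc, char **argv)
{
	if (argc != 2) {
		fprintf(stderr, "usage: %s <pid>\n", argv[0]);
		return 1;
	}
	printf("first write:  %d\n", move_to_g1(argv[1]));	/* expect 0 */
	printf("second write: %d\n", move_to_g1(argv[1]));	/* expect -EINVAL */
	return 0;
}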
