linux-kernel - [bug report] resctrl high memory comsumption

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [thread-next>] [day] [month] [year] [list]

Message-ID: <CALvZod7E9zzHwenzf7objzGKsdBmVwTgEJ0nPgs0LUFU3SN5Pw@mail.gmail.com>
Date:   Wed, 8 Jan 2020 09:07:41 -0800
From:   Shakeel Butt <shakeelb@...gle.com>
To:     Reinette Chatre <reinette.chatre@...el.com>,
        Fenghua Yu <fenghua.yu@...el.com>,
        Borislav Petkov <bp@...en8.de>,
        LKML <linux-kernel@...r.kernel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, x86@...nel.org
Subject: [bug report] resctrl high memory comsumption

Hi,

Recently we had a bug in the system software writing the same pids to
the tasks file of resctrl group multiple times. The resctrl code
allocates "struct task_move_callback" for each such write and call
task_work_add() for that task to handle it on return to user-space
without checking if such request already exist for that particular
task. The issue arises for long sleeping tasks which has thousands for
such request queued to be handled. On our production, we notice
thousands of tasks having thousands of such requests and taking GiBs
of memory for "struct task_move_callback". I am not very familiar with
the code to judge if task_work_cancel() is the right approach or just
checking closid/rmid before doing task_work_add().

==repro==
# mkdir /sys/fs/resctrl/test
# cat /proc/slabinfo | grep kmalloc-32
kmalloc-32         57219  57288     32  124    1 : tunables  120   60
  8 : slabdata    462    462      0
# sleep 600&
[1] 17611
# for i in {1..200000}; do echo 17611 > /sys/fs/resctrl/test/tasks ; done
# cat /proc/slabinfo | grep kmalloc-32
kmalloc-32        257466 257548     32  124    1 : tunables  120   60
  8 : slabdata   2077   2077      5
# kill 17611
[1]+  Terminated              sleep 600
# cat /proc/slabinfo | grep kmalloc-32
kmalloc-32         57924  60636     32  124    1 : tunables  120   60
  8 : slabdata    470    489    385

thanks,
Shakeel