linux-cve-announce - CVE-2021-47011: mm: memcontrol: slab: fix obtain a reference to a freeing memcg

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [day] [month] [year] [list]

Message-ID: <2024022831-CVE-2021-47011-5b75@gregkh>
Date: Wed, 28 Feb 2024 09:15:01 +0100
From: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
To: linux-cve-announce@...r.kernel.org
Cc: gregkh@...nel.org
Subject: CVE-2021-47011: mm: memcontrol: slab: fix obtain a reference to a freeing memcg

From: gregkh@...nel.org

Description
===========

In the Linux kernel, the following vulnerability has been resolved:

mm: memcontrol: slab: fix obtain a reference to a freeing memcg

Patch series "Use obj_cgroup APIs to charge kmem pages", v5.

Since Roman's series "The new cgroup slab memory controller" applied.
All slab objects are charged with the new APIs of obj_cgroup.  The new
APIs introduce a struct obj_cgroup to charge slab objects.  It prevents
long-living objects from pinning the original memory cgroup in the
memory.  But there are still some corner objects (e.g.  allocations
larger than order-1 page on SLUB) which are not charged with the new
APIs.  Those objects (include the pages which are allocated from buddy
allocator directly) are charged as kmem pages which still hold a
reference to the memory cgroup.

E.g.  We know that the kernel stack is charged as kmem pages because the
size of the kernel stack can be greater than 2 pages (e.g.  16KB on
x86_64 or arm64).  If we create a thread (suppose the thread stack is
charged to memory cgroup A) and then move it from memory cgroup A to
memory cgroup B.  Because the kernel stack of the thread hold a
reference to the memory cgroup A.  The thread can pin the memory cgroup
A in the memory even if we remove the cgroup A.  If we want to see this
scenario by using the following script.  We can see that the system has
added 500 dying cgroups (This is not a real world issue, just a script
to show that the large kmallocs are charged as kmem pages which can pin
the memory cgroup in the memory).

	#!/bin/bash

	cat /proc/cgroups | grep memory

	cd /sys/fs/cgroup/memory
	echo 1 > memory.move_charge_at_immigrate

	for i in range{1..500}
	do
		mkdir kmem_test
		echo $$ > kmem_test/cgroup.procs
		sleep 3600 &
		echo $$ > cgroup.procs
		echo `cat kmem_test/cgroup.procs` > cgroup.procs
		rmdir kmem_test
	done

	cat /proc/cgroups | grep memory

This patchset aims to make those kmem pages to drop the reference to
memory cgroup by using the APIs of obj_cgroup.  Finally, we can see that
the number of the dying cgroups will not increase if we run the above test
script.

This patch (of 7):

The rcu_read_lock/unlock only can guarantee that the memcg will not be
freed, but it cannot guarantee the success of css_get (which is in the
refill_stock when cached memcg changed) to memcg.

  rcu_read_lock()
  memcg = obj_cgroup_memcg(old)
  __memcg_kmem_uncharge(memcg)
      refill_stock(memcg)
          if (stock->cached != memcg)
              // css_get can change the ref counter from 0 back to 1.
              css_get(&memcg->css)
  rcu_read_unlock()

This fix is very like the commit:

  eefbfa7fd678 ("mm: memcg/slab: fix use after free in obj_cgroup_charge")

Fix this by holding a reference to the memcg which is passed to the
__memcg_kmem_uncharge() before calling __memcg_kmem_uncharge().

The Linux kernel CVE team has assigned CVE-2021-47011 to this issue.

Affected and fixed versions
===========================

	Issue introduced in 5.10.11 with commit 26f54dac1564 and fixed in 5.10.37 with commit 31df8bc4d3fe
	Issue introduced in 5.11 with commit 3de7d4f25a74 and fixed in 5.11.21 with commit 89b1ed358e01
	Issue introduced in 5.11 with commit 3de7d4f25a74 and fixed in 5.12.4 with commit c3ae6a3f3ca4
	Issue introduced in 5.11 with commit 3de7d4f25a74 and fixed in 5.13 with commit 9f38f03ae8d5

Please see https://www.kernel.org or a full list of currently supported
kernel versions by the kernel community.

Unaffected versions might change over time as fixes are backported to
older supported kernel versions.  The official CVE entry at
	https://cve.org/CVERecord/?id=CVE-2021-47011
will be updated if fixes are backported, please check that for the most
up to date information about this issue.

Affected files
==============

The file(s) affected by this issue are:
	mm/memcontrol.c

Mitigation
==========

The Linux kernel CVE team recommends that you update to the latest
stable kernel version for this, and many other bugfixes.  Individual
changes are never tested alone, but rather are part of a larger kernel
release.  Cherry-picking individual commits is not recommended or
supported by the Linux kernel community at all.  If however, updating to
the latest release is impossible, the individual changes to resolve this
issue can be found at these commits:
	https://git.kernel.org/stable/c/31df8bc4d3feca9f9c6b2cd06fd64a111ae1a0e6
	https://git.kernel.org/stable/c/89b1ed358e01e1b0417f5d3b0082359a23355552
	https://git.kernel.org/stable/c/c3ae6a3f3ca4f02f6ccddf213c027302586580d0
	https://git.kernel.org/stable/c/9f38f03ae8d5f57371b71aa6b4275765b65454fd