lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Date:	Thu, 12 Nov 2015 18:41:19 -0500
From:	Johannes Weiner <hannes@...xchg.org>
To:	David Miller <davem@...emloft.net>,
	Andrew Morton <akpm@...ux-foundation.org>
Cc:	Vladimir Davydov <vdavydov@...tuozzo.com>,
	Tejun Heo <tj@...nel.org>, Michal Hocko <mhocko@...e.cz>,
	netdev@...r.kernel.org, linux-mm@...ck.org,
	cgroups@...r.kernel.org, linux-kernel@...r.kernel.org,
	kernel-team@...com
Subject: [PATCH 00/14] mm: memcontrol: account socket memory in unified hierarchy

Hi,

this is version 3 of the patches to add socket memory accounting to
the unified hierarchy memory controller. Changes since v2 include:

- Fixed an underflow bug in the mem+swap counter that came through the
  design of the per-cpu charge cache. To fix that, the unused mem+swap
  counter is now fully patched out on unified hierarchy. Double whammy.

- Restored the counting jump label such that the networking callbacks
  get patched out again when the last memory-controlled cgroup goes
  away. The code was already there, so we might as well keep it.

- Broke down the massive tcp_memcontrol rewrite patch into smaller
  logical pieces to (hopefully) make it easier to review and verify.

---

Socket buffer memory can make up a significant share of a workload's
memory footprint that can be directly linked to userspace activity,
and so it needs to be part of the memory controller to provide proper
resource isolation/containment.

Historically, socket buffers were accounted in a separate counter,
without any pressure equalization between anonymous memory, page
cache, and the socket buffers. When the socket buffer pool was
exhausted, buffer allocations would fail hard and cause network
performance to tank, regardless of whether there was still memory
available to the group or not. Likewise, struggling anonymous or cache
workingsets could not dip into an idle socket memory pool. Because of
this, the feature was not usable for many real life applications.

To not repeat this mistake, the new memory controller will account all
types of memory pages it is tracking on behalf of a cgroup in a single
pool. Upon pressure, the VM reclaims and shrinks and puts pressure on
whatever memory consumer in that pool is within its reach.

For socket memory, pressure feedback is provided through vmpressure
events. When the VM has trouble freeing memory, the network code is
instructed to stop growing the cgroup's transmit windows.

This series begins with a rework of the existing tcp memory controller
that simplifies and cleans up the code while allowing us to have only
one set of networking hooks for both memory controller versions. The
original behavior of the existing tcp controller should be preserved.

It then adds socket accounting to the v2 memory controller, including
the use of the per-cpu charge cache and async memory.high enforcement
from socket memory charges.

Lastly, vmpressure is hooked up to the socket code so that it stops
growing transmit windows when the VM has trouble reclaiming memory.

 include/linux/memcontrol.h   |  71 ++++++----
 include/net/sock.h           | 149 ++------------------
 include/net/tcp.h            |   5 +-
 include/net/tcp_memcontrol.h |   1 -
 mm/backing-dev.c             |   2 +-
 mm/memcontrol.c              | 303 +++++++++++++++++++++++++++--------------
 mm/vmpressure.c              |  25 +++-
 mm/vmscan.c                  |  31 +++--
 net/core/sock.c              |  78 +++--------
 net/ipv4/tcp.c               |   3 +-
 net/ipv4/tcp_ipv4.c          |   9 +-
 net/ipv4/tcp_memcontrol.c    |  85 ++++--------
 net/ipv4/tcp_output.c        |   7 +-
 net/ipv6/tcp_ipv6.c          |   3 -
 14 files changed, 353 insertions(+), 419 deletions(-)

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ