lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Mon,  2 May 2022 23:19:35 +0000
From:   "T.J. Mercier" <tjmercier@...gle.com>
To:     tjmercier@...gle.com, Tejun Heo <tj@...nel.org>,
        Zefan Li <lizefan.x@...edance.com>,
        Johannes Weiner <hannes@...xchg.org>,
        Jonathan Corbet <corbet@....net>
Cc:     daniel@...ll.ch, hridya@...gle.com, christian.koenig@....com,
        jstultz@...gle.com, tkjos@...roid.com, cmllamas@...gle.com,
        surenb@...gle.com, kaleshsingh@...gle.com, Kenny.Ho@....com,
        mkoutny@...e.com, skhan@...uxfoundation.org,
        kernel-team@...roid.com, cgroups@...r.kernel.org,
        linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: [PATCH v6 1/6] gpu: rfc: Proposal for a GPU cgroup controller

From: Hridya Valsaraju <hridya@...gle.com>

This patch adds a proposal for a new GPU cgroup controller for
accounting/limiting GPU and GPU-related memory allocations.
The proposed controller is based on the DRM cgroup controller[1] and
follows the design of the RDMA cgroup controller.

The new cgroup controller would:
* Allow setting per-device limits on the total size of buffers
  allocated by device within a cgroup.
* Expose a per-device/allocator breakdown of the buffers charged to a
  cgroup.

The prototype in the following patches is only for memory accounting
using the GPU cgroup controller and does not implement limit setting.

[1]: https://lore.kernel.org/amd-gfx/20210126214626.16260-1-brian.welty@intel.com/

Signed-off-by: Hridya Valsaraju <hridya@...gle.com>
Signed-off-by: T.J. Mercier <tjmercier@...gle.com>

---
v6 changes
Move documentation into cgroup-v2.rst per Tejun Heo.

v5 changes
Drop the global GPU cgroup "total" (sum of all device totals) portion
of the design since there is no currently known use for this per
Tejun Heo.

Update for renamed functions/variables.

v3 changes
Remove Upstreaming Plan from gpu-cgroup.rst per John Stultz.

Use more common dual author commit message format per John Stultz.
---
 Documentation/admin-guide/cgroup-v2.rst | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
index 69d7a6983f78..baeec096f1d8 100644
--- a/Documentation/admin-guide/cgroup-v2.rst
+++ b/Documentation/admin-guide/cgroup-v2.rst
@@ -2352,6 +2352,30 @@ first, and stays charged to that cgroup until that resource is freed. Migrating
 a process to a different cgroup does not move the charge to the destination
 cgroup where the process has moved.
 
+
+GPU
+---
+
+The GPU controller accounts for device and system memory allocated by the GPU
+and related subsystems for graphics use. Resource limits are not currently
+supported.
+
+GPU Interface Files
+~~~~~~~~~~~~~~~~~~~~
+
+  gpu.memory.current
+	A read-only file containing memory allocations in flat-keyed format. The key
+	is a string representing the device name. The value is the size of the memory
+	charged to the device in bytes. The device names are globally unique.::
+
+	  $ cat /sys/kernel/fs/cgroup1/gpu.memory.current
+	  dev1 4194304
+	  dev2 104857600
+
+	The device name string is set by a device driver when it registers with the
+	GPU cgroup controller to participate in resource accounting. Non-unique names
+	will be rejected at the point of registration.
+
 Others
 ------
 
-- 
2.36.0.464.gb9c8b46e94-goog

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ