lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CADnq5_Ng7oe_NMSb6GdL=_T_zw22Gk0B6ePDXRiU7Ljind6Gww@mail.gmail.com>
Date:   Tue, 31 May 2022 18:00:51 -0400
From:   Alex Deucher <alexdeucher@...il.com>
To:     Christian König <ckoenig.leichtzumerken@...il.com>,
        Maling list - DRI developers 
        <dri-devel@...ts.freedesktop.org>
Cc:     linux-media <linux-media@...r.kernel.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Intel Graphics Development <intel-gfx@...ts.freedesktop.org>,
        amd-gfx list <amd-gfx@...ts.freedesktop.org>,
        nouveau <nouveau@...ts.freedesktop.org>,
        linux-tegra@...r.kernel.org,
        Linux-Fsdevel <linux-fsdevel@...r.kernel.org>,
        linux-mm <linux-mm@...ck.org>,
        Andrey Grodzovsky <andrey.grodzovsky@....com>,
        Hugh Dickens <hughd@...gle.com>,
        Alexander Viro <viro@...iv.linux.org.uk>,
        Daniel Vetter <daniel@...ll.ch>,
        "Deucher, Alexander" <alexander.deucher@....com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Christian Koenig <christian.koenig@....com>
Subject: Re: Per file OOM badness

+ dri-devel

On Tue, May 31, 2022 at 6:00 AM Christian König
<ckoenig.leichtzumerken@...il.com> wrote:
>
> Hello everyone,
>
> To summarize the issue I'm trying to address here: Processes can allocate
> resources through a file descriptor without being held responsible for it.
>
> Especially for the DRM graphics driver subsystem this is rather
> problematic. Modern games tend to allocate huge amounts of system memory
> through the DRM drivers to make it accessible to GPU rendering.
>
> But even outside of the DRM subsystem this problem exists and it is
> trivial to exploit. See the following simple example of
> using memfd_create():
>
>          fd = memfd_create("test", 0);
>          while (1)
>                  write(fd, page, 4096);
>
> Compile this and you can bring down any standard desktop system within
> seconds.
>
> The background is that the OOM killer will kill every processes in the
> system, but just not the one which holds the only reference to the memory
> allocated by the memfd.
>
> Those problems where brought up on the mailing list multiple times now
> [1][2][3], but without any final conclusion how to address them. Since
> file descriptors are considered shared the process can not directly held
> accountable for allocations made through them. Additional to that file
> descriptors can also easily move between processes as well.
>
> So what this patch set does is to instead of trying to account the
> allocated memory to a specific process it adds a callback to struct
> file_operations which the OOM killer can use to query the specific OOM
> badness of this file reference. This badness is then divided by the
> file_count, so that every process using a shmem file, DMA-buf or DRM
> driver will get it's equal amount of OOM badness.
>
> Callbacks are then implemented for the two core users (memfd and DMA-buf)
> as well as 72 DRM based graphics drivers.
>
> The result is that the OOM killer can now much better judge if a process
> is worth killing to free up memory. Resulting a quite a bit better system
> stability in OOM situations, especially while running games.
>
> The only other possibility I can see would be to change the accounting of
> resources whenever references to the file structure change, but this would
> mean quite some additional overhead for a rather common operation.
>
> Additionally I think trying to limit device driver allocations using
> cgroups is orthogonal to this effort. While cgroups is very useful, it
> works on per process limits and tries to enforce a collaborative model on
> memory management while the OOM killer enforces a competitive model.
>
> Please comment and/or review, we have that problem flying around for years
> now and are not at a point where we finally need to find a solution for
> this.
>
> Regards,
> Christian.
>
> [1] https://lists.freedesktop.org/archives/dri-devel/2015-September/089778.html
> [2] https://lkml.org/lkml/2018/1/18/543
> [3] https://lkml.org/lkml/2021/2/4/799
>
>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ