Message-Id: <20220624080444.7619-1-christian.koenig@amd.com>
Date:   Fri, 24 Jun 2022 10:04:30 +0200
From:   "Christian König" 
        <ckoenig.leichtzumerken@...il.com>
To:     linux-media@...r.kernel.org, linux-kernel@...r.kernel.org,
        intel-gfx@...ts.freedesktop.org, amd-gfx@...ts.freedesktop.org,
        nouveau@...ts.freedesktop.org, linux-tegra@...r.kernel.org,
        linux-fsdevel@...r.kernel.org, linux-mm@...ck.org,
        dri-devel@...ts.freedesktop.org
Cc:     mhocko@...e.com
Subject: [RFC] Per file OOM-badness / RSS once more

Hello everyone,

To summarize the issue I'm trying to address here: Processes can allocate
resources through a file descriptor without being held responsible for it.

I'm not going to explain all the details again. See here for a more detailed
description of the problem: https://lwn.net/ml/linux-kernel/20220531100007.174649-1-christian.koenig@amd.com/

With this iteration I'm trying to address a bunch of the comments from Michal
Hocko (thanks a lot for that) and to add some new ideas.

Changes made so far:
1. Renamed the callback to file_rss(). This is at least a start at better
   describing what this is all about. I've been going back and forth over the
   naming here; if you have a better idea, please speak up.

2. Cleanups, e.g. now providing a helper function in the fs layer to sum up
   all the pages allocated by the files in a file descriptor table.

3. Using the actual number of allocated pages for the shmem implementation
   instead of just the size. I also tried to ignore shmem files which are part
   of tmpfs, because that has a separate accounting/limitation approach.

4. The OOM killer now prints the memory of the killed process including the per
   file pages, which makes the whole thing much more comprehensible.

5. I've added the per file pages to the different RSS reports in procfs.
   This has the interesting effect that tools like top suddenly give a much
   more accurate overview of the memory use as well. This of course increases
   the overhead of gathering that information quite a bit, and I'm not sure how
   feasible that is for upstreaming. On the other hand, this once more clearly
   shows that we need to do something about this issue.
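
As a rough user-space illustration of items 1 and 2 (this is not the code from
the patch set): each file may provide a callback reporting how many pages it
keeps allocated, and a helper walks a descriptor table and sums the results.
Only the name file_rss() comes from the list above; every struct and function
prefixed with model_ is hypothetical and exists only for this sketch.

```c
#include <stddef.h>

/* Hypothetical user-space model; none of these types are the kernel's. */
struct model_file;

struct model_file_ops {
	/* Number of pages this file keeps allocated; may be NULL. */
	long (*file_rss)(const struct model_file *file);
};

struct model_file {
	const struct model_file_ops *ops;
	long pages;	/* pages allocated through this file */
};

struct model_fdtable {
	struct model_file **fd;	/* NULL entries are unused descriptors */
	size_t max_fds;
};

/* Example callback: a buffer-like file reports its allocated pages. */
static long model_buffer_rss(const struct model_file *file)
{
	return file->pages;
}

static const struct model_file_ops model_buffer_ops = {
	.file_rss = model_buffer_rss,
};

/* A file type which does not implement the callback at all. */
static const struct model_file_ops model_plain_ops = {
	.file_rss = NULL,
};

/* Sum up the pages reported by all open files in the table. */
static long model_files_rss(const struct model_fdtable *fdt)
{
	long sum = 0;

	for (size_t i = 0; i < fdt->max_fds; i++) {
		const struct model_file *f = fdt->fd[i];

		/* Skip unused slots and files without the callback. */
		if (f && f->ops && f->ops->file_rss)
			sum += f->ops->file_rss(f);
	}
	return sum;
}
```

The sketch deliberately ignores files shared between several descriptors or
processes; such a file would simply be counted once per table slot here.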

Another rather interesting observation is that multiple subsystems (shmem,
tmpfs, ttm) came up with the same workaround of limiting the memory which can
be allocated through them to 50% of the whole system memory. Unfortunately
that isn't the same 50% and it doesn't apply everywhere, so you can still
easily crash the box.
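
To make the arithmetic behind that concrete, here is a small sketch. It
assumes, purely for illustration, that each subsystem enforces its own limit
without looking at the others; the function name and all numbers are made up.

```c
#include <stddef.h>

/*
 * Worst case when every subsystem fills its own cap independently.
 * cap_percent[i] is the limit of subsystem i as a percentage of
 * total_pages. The integer division rounds down, which is good
 * enough for an illustration.
 */
static unsigned long worst_case_pages(unsigned long total_pages,
				      const unsigned int *cap_percent,
				      size_t nr_subsystems)
{
	unsigned long sum = 0;

	for (size_t i = 0; i < nr_subsystems; i++)
		sum += total_pages / 100 * cap_percent[i];
	return sum;
}
```

With three separately enforced 50% caps on a machine with 1000000 pages, the
worst case is 1500000 pages, i.e. more than the machine has.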

Ideas and/or comments are really welcome.

Thanks,
Christian.

