Message-ID: <7d5ce7ab-d16d-36bc-7953-e1da2db350bf@amd.com>
Date: Tue, 30 Jan 2018 11:32:49 +0100
From: Christian König <christian.koenig@....com>
To: Michal Hocko <mhocko@...nel.org>
Cc: "He, Roger" <Hongbo.He@....com>,
"linux-mm@...ck.org" <linux-mm@...ck.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"dri-devel@...ts.freedesktop.org" <dri-devel@...ts.freedesktop.org>
Subject: Re: [PATCH] mm/swap: add function get_total_swap_pages to expose
total_swap_pages
Am 30.01.2018 um 11:18 schrieb Michal Hocko:
> On Tue 30-01-18 10:00:07, Christian König wrote:
>> Am 30.01.2018 um 08:55 schrieb Michal Hocko:
>>> On Tue 30-01-18 02:56:51, He, Roger wrote:
>>>> Hi Michal:
>>>>
>>>> We need an API to tell the TTM module how much swap space the
>>>> system has in total. TTM can then use it to restrict how much swap
>>>> it uses, to avoid triggering the OOM killer. For now we set the
>>>> threshold of swap size TTM may use to 1/2 of the total size and
>>>> leave the rest for other users.
>>> Why do you need so much memory? Are you going to use TBs of memory
>>> on large systems? What about memory hotplug, when memory is
>>> added/released?
>> For graphics and compute applications on GPUs it isn't unusual to use large
>> amounts of system memory.
>>
>> Our standard policy in TTM is to allow 50% of system memory to be pinned for
>> use with GPUs (the hardware can't do page faults).
>>
>> When that limit is exceeded (or the shrinker callbacks tell us to make room)
>> we wait for any GPU work to finish and copy buffer content into a shmem
>> file.
>>
>> This copy into a shmem file can easily trigger the OOM killer if there isn't
>> any swap space left and that is something we want to avoid.
>>
>> So what we want to do is to apply this 50% rule to swap space as well and
>> deny allocation of buffer objects when it is exceeded.
> How does that help when the rest of the system might eat swap?
Well it doesn't, but that is not the problem here.
When an application keeps calling malloc() it is sooner or later
confronted with the OOM killer.
But when it keeps allocating OpenGL textures, for example, the
expectation is that allocations sooner or later start to fail because
we run out of memory, rather than triggering the OOM killer.
So what we do is to allow the application to use all of video memory + a
certain amount of system memory + swap space as last resort fallback
(e.g. when you Alt+Tab from your full screen game back to your browser).
The problem we are trying to solve is that so far we haven't limited
the use of swap space at all.
>>>> But get_nr_swap_pages() is the only API we can access from other
>>>> modules right now, and it doesn't cover dynamic changes in swap
>>>> size: a user can run "swapon" to enable a new swap file or swap
>>>> disk at any time, or "swapoff" to disable swap space.
>>> Exactly. Your scaling configuration based on get_nr_swap_pages or the
>>> available memory simply sounds wrong.
>> Why? That is pretty much exactly what we have been doing with buffer
>> objects and system memory for years.
> Could you be more specific? What kind of buffer objects you have in
> mind?
Those are GEM buffer objects, which user space uses for things like
OpenGL textures, OpenCL matrices, Vulkan surfaces, video codec
surfaces and so on.
Regards,
Christian.