Message-ID: <3597e38e-ace7-104c-dcc8-59471e11dcfe@amd.com>
Date: Fri, 5 Feb 2021 11:57:09 +0100
From: Christian König <christian.koenig@....com>
To: Michal Hocko <mhocko@...e.com>
Cc: Hugh Dickins <hughd@...gle.com>,
LKML <linux-kernel@...r.kernel.org>
Subject: Re: Possible deny of service with memfd_create()
Am 05.02.21 um 11:50 schrieb Michal Hocko:
> On Fri 05-02-21 08:54:31, Christian König wrote:
>> Am 05.02.21 um 01:32 schrieb Hugh Dickins:
>>> On Thu, 4 Feb 2021, Michal Hocko wrote:
>>>> On Thu 04-02-21 17:32:20, Christian Koenig wrote:
>>>>> Hi Michal,
>>>>>
>>>>> as requested in the other mail thread the following sample code gets my test
>>>>> system down within seconds.
>>>>>
>>>>> The issue is that the memory allocated for the file descriptor is not
>>>>> accounted to the process allocating it, so the OOM killer picks whatever
>>>>> process it thinks is good but never my small test program.
>>>>>
>>>>> Since memfd_create() doesn't need any special permission this is a rather
>>>>> nice denial of service and as far as I can see also works with a standard
>>>>> Ubuntu 5.4.0-65-generic kernel.
>>>> Thanks for following up. This is really nasty but now that I am looking
>>>> at it more closely, this is not really different from tmpfs in general.
>>>> You are free to create files and eat the memory without being accounted
>>>> for that memory because that is not seen as your memory from the system
>>>> POV. You would have to map that memory to be part of your rss.
>> I mostly agree. The big difference is that tmpfs is only available when
>> mounted.
>>
>> And tmpfs can be restricted in size per mount point as well as per user
>> quotas IIRC. Looking at my desktop system those restrictions are actually
>> exactly what I see there.
> I cannot find anything about per user quotas for tmpfs in the tmpfs man
> page. Or maybe I am looking at a wrong layer and there is a generic
> handling somewhere in the vfs core?
I think so, yes. I vaguely remember a discussion about how to implement
quotas for tmpfs, but that was a really long time ago and I didn't
follow it to the end.
>> But memfd_create() is a free-for-all: there is neither a size limit nor
>> an access restriction as far as I can see.
> Yes, this is unfortunate and a design decision that should have been
> considered when the syscall was introduced. But that boat sailed
> looong ago, and changing it now would risk userspace breakage.
>
>>>> The only existing protection right now is to use the memory cgroup
>>>> controller because the tmpfs memory is accounted to the process which
>>>> faults the memory in (or writes to the file).
>> Agreed, but having to rely on cgroup is not really satisfying when you have
>> to maintain a hardened server.
> Yes I do recognize the pain. The only other way to mitigate the risk is
> to disallow the syscall to untrusted users in a hardened environment.
> You should be very strict in tmpfs usage there already.
>
Well it is perfectly valid for a process to use as much memory as it
wants, the problem is that we are not holding the process accountable
for it.
As I said we have similar problems with GPU drivers and I think we just
need a way to do this.
Let me think about it a bit, maybe we can somehow use the file owner for
this.
Thanks,
Christian.