Message-ID: <36274836-1968-e712-fb15-f3e15eeb7741@daenzer.net>
Date: Mon, 8 Feb 2021 14:49:51 +0100
From: Michel Dänzer <michel@...nzer.net>
To: Daniel Vetter <daniel@...ll.ch>
Cc: Will Drewry <wad@...omium.org>, Kees Cook <keescook@...omium.org>,
Jann Horn <jannh@...gle.com>,
intel-gfx <intel-gfx@...ts.freedesktop.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
dri-devel <dri-devel@...ts.freedesktop.org>,
Andy Lutomirski <luto@...capital.net>,
Andrew Morton <akpm@...ux-foundation.org>,
Chris Wilson <chris@...is-wilson.co.uk>
Subject: Re: [PATCH] kernel: Expose SYS_kcmp by default

On 2021-02-08 2:34 p.m., Daniel Vetter wrote:
> On Mon, Feb 8, 2021 at 12:49 PM Michel Dänzer <michel@...nzer.net> wrote:
>>
>> On 2021-02-05 9:53 p.m., Daniel Vetter wrote:
>>> On Fri, Feb 5, 2021 at 7:37 PM Kees Cook <keescook@...omium.org> wrote:
>>>>
>>>> On Fri, Feb 05, 2021 at 04:37:52PM +0000, Chris Wilson wrote:
>>>>> Userspace has discovered the functionality offered by SYS_kcmp and has
>>>>> started to depend upon it. In particular, Mesa uses SYS_kcmp for
>>>>> os_same_file_description() in order to identify when two fd (e.g. device
>>>>> or dmabuf) point to the same struct file. Since they depend on it for
>>>>> core functionality, lift SYS_kcmp out of the non-default
>>>>> CONFIG_CHECKPOINT_RESTORE into the selectable syscall category.
>>>>>
>>>>> Signed-off-by: Chris Wilson <chris@...is-wilson.co.uk>
>>>>> Cc: Kees Cook <keescook@...omium.org>
>>>>> Cc: Andy Lutomirski <luto@...capital.net>
>>>>> Cc: Will Drewry <wad@...omium.org>
>>>>> Cc: Andrew Morton <akpm@...ux-foundation.org>
>>>>> Cc: Dave Airlie <airlied@...il.com>
>>>>> Cc: Daniel Vetter <daniel@...ll.ch>
>>>>> Cc: Lucas Stach <l.stach@...gutronix.de>
>>>>> ---
>>>>> init/Kconfig | 11 +++++++++++
>>>>> kernel/Makefile | 2 +-
>>>>> tools/testing/selftests/seccomp/seccomp_bpf.c | 2 +-
>>>>> 3 files changed, 13 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/init/Kconfig b/init/Kconfig
>>>>> index b77c60f8b963..f62fca13ac5b 100644
>>>>> --- a/init/Kconfig
>>>>> +++ b/init/Kconfig
>>>>> @@ -1194,6 +1194,7 @@ endif # NAMESPACES
>>>>>  config CHECKPOINT_RESTORE
>>>>>       bool "Checkpoint/restore support"
>>>>>       select PROC_CHILDREN
>>>>> +     select KCMP
>>>>>       default n
>>>>>       help
>>>>>         Enables additional kernel features in a sake of checkpoint/restore.
>>>>> @@ -1737,6 +1738,16 @@ config ARCH_HAS_MEMBARRIER_CALLBACKS
>>>>>  config ARCH_HAS_MEMBARRIER_SYNC_CORE
>>>>>       bool
>>>>>
>>>>> +config KCMP
>>>>> +     bool "Enable kcmp() system call" if EXPERT
>>>>> +     default y
>>>>
>>>> I would expect this to be not default-y, especially if
>>>> CHECKPOINT_RESTORE does a "select" on it.
>>>>
>>>> This is a really powerful syscall, but it is bounded by ptrace access
>>>> controls, and uses pointer address obfuscation, so it may be okay to
>>>> expose this. As it is, at least Ubuntu already has
>>>> CONFIG_CHECKPOINT_RESTORE, so really, there's probably not much
>>>> difference on exposure.
>>>>
>>>> So, if you drop the "default y", I'm fine with this.
>>>
>>> It was maybe stupid, but our userspace started relying on fd
>>> comparison through sys_kcmp. So for better or worse, if you want to
>>> run the mesa3d gl/vk stacks, you need this.
>>
>> That's overstating things somewhat. The vast majority of applications
>> will work fine regardless (as they did before Mesa started using this
>> functionality). Only some special ones will run into issues, because the
>> user-space drivers incorrectly assume two file descriptors reference
>> different descriptions.
>>
>>
>>> Was maybe not the brightest idea, but since enough distros had this
>>> enabled by default,
>>
>> Right, that (and the above) is why I considered it fair game to use.
>> What should I have done instead? (TBH I was surprised that this
>> functionality isn't generally available)
>
> Yeah that one is fine, but I thought we've discussed (irc or
> something) more uses for de-duping dma-buf and stuff like that. But
> quick grep says that hasn't landed yet, so I got a bit confused (or
> just dreamt). Looking at this again I'm kinda surprised the drmfd
> de-duping blows up on normal linux distros, but I guess it can all
> happen.

One example: GEM handle name-spaces are per file description. If
user-space incorrectly assumes two DRM fds are independent, when they
actually reference the same file description, closing a GEM handle with
one file descriptor will make it unusable with the other file descriptor
as well.
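
To make the check concrete, here is a minimal sketch of how user-space
might compare two fds with kcmp(KCMP_FILE). This is an illustration
only, not Mesa's actual os_same_file_description(); the helper name
same_file_description() and its return convention are assumptions made
for this example.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <linux/kcmp.h>   /* KCMP_FILE */
#include <sys/syscall.h>  /* SYS_kcmp */
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (name and return convention are assumptions):
 *    0  both fds reference the same open file description
 *    1  they reference different descriptions
 *   -1  could not tell (e.g. ENOSYS when the kernel was built without
 *       kcmp support); errno is left as set by the syscall
 */
static int same_file_description(int fd1, int fd2)
{
    pid_t pid = getpid();
    long ret;

    /* The same descriptor trivially means the same description. */
    if (fd1 == fd2)
        return 0;

    /* glibc has no kcmp() wrapper, so go through syscall(2). */
    ret = syscall(SYS_kcmp, pid, pid, KCMP_FILE,
                  (unsigned long)fd1, (unsigned long)fd2);
    if (ret < 0)
        return -1;

    /* 0 means equal; 1, 2 and 3 all mean "not equal". */
    return ret == 0 ? 0 : 1;
}
```

With such a helper, a driver could treat an fd and its dup() as one GEM
handle name-space, and two separate open() calls on the same device
node as two. When the helper returns -1, the caller has to pick a
fallback, and wrongly assuming "different" is what leads to the
problem described above.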

>>> Ofc we can leave the default n, but the select if CONFIG_DRM is
>>> unfortunately needed I think.
>>
>> Per above, not sure this is really true.
>
> We seem to be going boom on linux distros now, maybe userspace got
> more creative in abusing stuff?

I don't know what you're referring to. I've only seen maybe two or three
reports from people who didn't enable CHECKPOINT_RESTORE in their
self-built kernels.

> The entire thing is small enough that imo we don't really have to care,
> e.g. we also unconditionally select dma-buf, despite that on most
> systems there's only 1 gpu, and you're never going to end up with a
> buffer sharing case that needs any of that code (aside from the
> "here's an fd" part).
>
> But I guess we can limit to just KCMP_FILE like you suggest in another
> reply. Just feels a bit like overkill.

Making KCMP_FILE gated by DRM makes as little sense to me as by
CHECKPOINT_RESTORE.
--
Earthling Michel Dänzer | https://redhat.com
Libre software enthusiast | Mesa and X developer