lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <08a97aac-6c77-451a-a2a5-20606bdde51f@schaufler-ca.com>
Date: Fri, 30 May 2025 09:52:17 -0700
From: Casey Schaufler <casey@...aufler-ca.com>
To: Günther Noack <gnoack@...gle.com>,
 Mickaël Salaün <mic@...ikod.net>
Cc: Günther Noack <gnoack3000@...il.com>,
 Paul Moore <paul@...l-moore.com>, sergeh@...nel.org,
 David Howells <dhowells@...hat.com>, Kees Cook <keescook@...omium.org>,
 linux-security-module@...r.kernel.org,
 Konstantin Meskhidze <konstantin.meskhidze@...wei.com>,
 Jann Horn <jannh@...gle.com>, linux-kernel@...r.kernel.org,
 Peter Newman <peternewman@...gle.com>, Andy Lutomirski
 <luto@...capital.net>, Will Drewry <wad@...omium.org>,
 Jarkko Sakkinen <jarkko@...nel.org>, Casey Schaufler <casey@...aufler-ca.com>
Subject: Re: [RFC 1/2] landlock: Multithreading support for
 landlock_restrict_self()

On 5/30/2025 9:26 AM, Günther Noack wrote:
> On Fri, May 30, 2025 at 05:11:34PM +0200, Mickaël Salaün wrote:
>> On Fri, May 30, 2025 at 03:16:20PM +0200, Günther Noack wrote:
>>> On Sun, May 18, 2025 at 09:57:32PM +0200, Mickaël Salaün wrote:
>>>> We should rename the flag to LANDLOCK_RESTRICT_SELF_PROCESS to make it
>>>> clear what it does.
>>>>
>>>> The remaining issues are still the potential memory allocation failures.
>>>> There are two things:
>>>>
>>>> 1. We should try as much as possible to limit useless credential
>>>>    duplications by not creating a new struct cred if parent credentials
>>>>    are the same.
>>>>
>>>> 2. To avoid the libpsx inconsistency (because of ENOMEM or EPERM),
>>>>    landlock_restrict_self(2) should handle memory allocation and
>>>>    transition the process from a known state to another known state.
>>>>
>>>> What about this approach:
>>>> - "Freeze" all threads of the current process (not ideal but simple) to
>>>>   make sure their credentials don't get updated.
>>>> - Create a new blank credential for the calling thread.
>>>> - Walk through all threads and create a new blank credential for all
>>>>   threads with a different cred than the caller.
>>>> - Inject a task work that will call cred_transfer() for all threads with
>>>>   either the same new credential used by the caller (incrementing the
>>>>   refcount), or it will populate and use a blank one if it has different
>>>>   credentials than the caller.
>>>>
>>>> This may not efficiently deduplicate credentials for all threads but it
>>>> is a simple deduplication approach that should be useful in most cases.
>>>>
>>>> The difficult part is mainly in the "fleezing". It would be nice to
>>>> change the cred API to avoid that but I'm not sure how.
>>> I don't see an option how we could freeze the credentials of other threads:
>>>
>>> To freeze a task's credentials, we would have to inhibit that commit_creds()
>>> succeeds on that task, and I don't see how that would be done - we can not
>>> prevent these tasks from calling commit_creds() [1], and when commit_creds()
>>> gets called, it is guaranteed to work.
>>>
>>> So in my mind, we have to somehow deal with the possibility that a task has a
>>> new and not-previously-seen struct creds, by the time that its task_work gets
>>> called.  As a consequence, I think a call to prepare_creds() would then be
>>> unavoidable in the task_work?
>> OK, we don't need to freeze all threads, just to block thread creation.
>>
>> What about that:
>> 1. lock thread creation for this process
>> 2. call prepare_creds() for the calling thread (called new_cred)
>> 3. call cred_alloc_blank() for all other threads, store them in a list,
>>    and exit if ENOMEM
>> 4. asynchronously walk through all threads, and for each:
>>   a. if its creds are the same (i.e. same pointer) as the calling
>>      thread's ones, then call get_cred(new_cred) and
>>      commit_creds(new_cred).
>>   b. otherwise, take a blank cred, call cred_transfer(), add the
>>      Landlock domain, and commit_creds() with it.
>> 5. free all unused blank creds (most)
>> 6. call commit_creds(new_creds) and return
>>
>> Pros:
>> - do not block threads
>> - minimize cred duplication
>> - atomic operation (from the point of view of the caller): all or
>>   nothing (with an error)
>> - almost no change to existing cred API
>>
>> Cons:
>> - block thread creation
>> - initially allocate one cred per thread (but free most of them after)
> The proposal is growing on me.
>
> One way to view transfer_creds() and have it nicely fit into the credentials API
> would be to view prepare_creds() as a convenience wrapper around
> transfer_creds(), so that prepare_creds() is implemented as a function which:
>
>   1) allocates a new struct cred (this may fail)
>   2) calls transfer_creds() with the new struct cred to populate
>
> We could then move the bulk of its existing prepare_creds() implementation into
> the new transfer_creds(), and could also move the keyctl implementation to use
> that.
>
> A remaining problem is: The caveat and the underlying assumption is that
> transfer_creds() must never fail when it runs in the task work, if we want to
> avoid the additional synchronization.  The existing cases in which the
> credentials preparation logic can return an error are:
>
> * Allocation failure for struct cred  (we would get rid of this)
> * get_ucounts(new->ucounts) returns NULL  (not supposed to happen, AFAICT)
> * security_prepare_creds() fails.  Existing LSM implementations are:
>   * Tomoyo, SELinux, AppArmor, Landlock: always return 0
>   * SMACK: May return -ENOMEM on allocation failure
>
>
> So summing up, in my understanding, the prerequisites for this approach are that:
>
>   1) get_ucounts() never returns NULL, and
>
>   2) SMACK does not bubble up allocation errors from the cred_prepare hook
>
>      Casey, Jarkko: do you think this is realistic for SMACK to change?

It's possible, but not trivial. The task based access rule list and relabel
list would need to move from the Smack cred blob to a Smack task blob. There
wasn't a task blob when these lists were introduced, so the cred was the only
option. I would entertain patches.

>
> and that
>
>   3) We can block new thread creation
>
>      Mickaël, do you have a specific approach in mind for that?
>
>      As Jann pointed out in [1], the tasklist_lock and siglock are not sleepable
>      and can't be used while waiting, which is why he proposed an approach where
>      we retry in a loop until no new threads show up any more, while getting the
>      existing threads stuck in the task_work as well (where they can't spawn new
>      threads).
>
>
> Thanks,
>
> —Günther
>
> [1] https://lore.kernel.org/all/CAG48ez0pWg3OTABfCKRk5sWrURM-HdJhQMcWedEppc_z1rrVJw@mail.gmail.com/
>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ