lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <a3c71579-4238-0967-b61d-522859f740aa@digikod.net>
Date:   Wed, 19 Apr 2017 01:24:16 +0200
From:   Mickaël Salaün <mic@...ikod.net>
To:     Kees Cook <keescook@...omium.org>
Cc:     LKML <linux-kernel@...r.kernel.org>,
        Alexei Starovoitov <ast@...nel.org>,
        Andy Lutomirski <luto@...capital.net>,
        Arnaldo Carvalho de Melo <acme@...nel.org>,
        Casey Schaufler <casey@...aufler-ca.com>,
        Daniel Borkmann <daniel@...earbox.net>,
        David Drysdale <drysdale@...gle.com>,
        "David S . Miller" <davem@...emloft.net>,
        "Eric W . Biederman" <ebiederm@...ssion.com>,
        James Morris <james.l.morris@...cle.com>,
        Jann Horn <jann@...jh.net>, Jonathan Corbet <corbet@....net>,
        Matthew Garrett <mjg59@...f.ucam.org>,
        Michael Kerrisk <mtk.manpages@...il.com>,
        Paul Moore <paul@...l-moore.com>,
        Sargun Dhillon <sargun@...gun.me>,
        "Serge E . Hallyn" <serge@...lyn.com>,
        Shuah Khan <shuah@...nel.org>, Tejun Heo <tj@...nel.org>,
        Thomas Graf <tgraf@...g.ch>, Will Drewry <wad@...omium.org>,
        "kernel-hardening@...ts.openwall.com" 
        <kernel-hardening@...ts.openwall.com>,
        Linux API <linux-api@...r.kernel.org>,
        linux-security-module <linux-security-module@...r.kernel.org>,
        Network Development <netdev@...r.kernel.org>,
        Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [PATCH net-next v6 06/11] seccomp,landlock: Handle Landlock
 events per process hierarchy



On 19/04/2017 00:53, Kees Cook wrote:
> On Tue, Mar 28, 2017 at 4:46 PM, Mickaël Salaün <mic@...ikod.net> wrote:
>> The seccomp(2) syscall can be used by a task to apply a Landlock rule to
>> itself. As a seccomp filter, a Landlock rule is enforced for the current
>> task and all its future children. A rule is immutable and a task can
>> only add new restricting rules to itself, forming a chain of rules.
>>
>> A Landlock rule is tied to a Landlock event. If the use of a kernel
>> object is allowed by the other Linux security mechanisms (e.g. DAC,
>> capabilities, other LSM), then a Landlock event related to this kind of
>> object is triggered. The chain of rules for this event is then
>> evaluated. Each rule return a 32-bit value which can deny the use of a
>> kernel object with a non-zero value. If every rules of the chain return
>> zero, then the use of the object is allowed.
>>
>> Changes since v5:
>> * remove struct landlock_node and use a similar inheritance mechanisme
>>   as seccomp-bpf (requested by Andy Lutomirski)
>> * rename SECCOMP_ADD_LANDLOCK_RULE to SECCOMP_APPEND_LANDLOCK_RULE
>> * rename file manager.c to providers.c
>> * add comments
>> * typo and cosmetic fixes
>>
>> Changes since v4:
>> * merge manager and seccomp patches
>> * return -EFAULT in seccomp(2) when user_bpf_fd is null to easely check
>>   if Landlock is supported
>> * only allow a process with the global CAP_SYS_ADMIN to use Landlock
>>   (will be lifted in the future)
>> * add an early check to exit as soon as possible if the current process
>>   does not have Landlock rules
>>
>> Changes since v3:
>> * remove the hard link with seccomp (suggested by Andy Lutomirski and
>>   Kees Cook):
>>   * remove the cookie which could imply multiple evaluation of Landlock
>>     rules
>>   * remove the origin field in struct landlock_data
>> * remove documentation fix (merged upstream)
>> * rename the new seccomp command to SECCOMP_ADD_LANDLOCK_RULE
>> * internal renaming
>> * split commit
>> * new design to be able to inherit on the fly the parent rules
>>
>> Changes since v2:
>> * Landlock programs can now be run without seccomp filter but for any
>>   syscall (from the process) or interruption
>> * move Landlock related functions and structs into security/landlock/*
>>   (to manage cgroups as well)
>> * fix seccomp filter handling: run Landlock programs for each of their
>>   legitimate seccomp filter
>> * properly clean up all seccomp results
>> * cosmetic changes to ease the understanding
>> * fix some ifdef
>>
>> Signed-off-by: Mickaël Salaün <mic@...ikod.net>
>> Cc: Alexei Starovoitov <ast@...nel.org>
>> Cc: Andrew Morton <akpm@...ux-foundation.org>
>> Cc: Andy Lutomirski <luto@...capital.net>
>> Cc: James Morris <james.l.morris@...cle.com>
>> Cc: Kees Cook <keescook@...omium.org>
>> Cc: Serge E. Hallyn <serge@...lyn.com>
>> Cc: Will Drewry <wad@...omium.org>
>> Link: https://lkml.kernel.org/r/c10a503d-5e35-7785-2f3d-25ed8dd63fab@digikod.net
>> ---
>>  include/linux/landlock.h      |  36 +++++++
>>  include/linux/seccomp.h       |   8 ++
>>  include/uapi/linux/seccomp.h  |   1 +
>>  kernel/fork.c                 |  14 ++-
>>  kernel/seccomp.c              |   8 ++
>>  security/landlock/Makefile    |   2 +-
>>  security/landlock/hooks.c     |  37 +++++++
>>  security/landlock/hooks.h     |   5 +
>>  security/landlock/init.c      |   3 +-
>>  security/landlock/providers.c | 232 ++++++++++++++++++++++++++++++++++++++++++
>>  10 files changed, 342 insertions(+), 4 deletions(-)
>>  create mode 100644 security/landlock/providers.c
>>
>> diff --git a/include/linux/landlock.h b/include/linux/landlock.h
>> index 53013dc374fe..c40ee78e86e0 100644
>> --- a/include/linux/landlock.h
>> +++ b/include/linux/landlock.h
>> @@ -12,6 +12,9 @@
>>  #define _LINUX_LANDLOCK_H
>>  #ifdef CONFIG_SECURITY_LANDLOCK
>>
>> +#include <linux/bpf.h> /* _LANDLOCK_SUBTYPE_EVENT_LAST */
>> +#include <linux/types.h> /* atomic_t */
>> +
>>  /*
>>   * This is not intended for the UAPI headers. Each userland software should use
>>   * a static minimal version for the required features as explained in the
>> @@ -19,5 +22,38 @@
>>   */
>>  #define LANDLOCK_VERSION 1
>>
>> +struct landlock_rule {
>> +       atomic_t usage;
> 
> This should be refcount_t. (And I should convert seccomp to use
> refcount_t too!) :)

OK

> 
>> +       struct landlock_rule *prev;
>> +       struct bpf_prog *prog;
>> +};
>> +
>> +/**
>> + * struct landlock_events - Landlock event rules enforced on a thread
>> + *
>> + * This is used for low performance impact when forking a process. Instead of
>> + * copying the full array and incrementing the usage of each entries, only
>> + * create a pointer to &struct landlock_events and increments its usage. When
>> + * appending a new rule, if &struct landlock_events is shared with other tasks,
>> + * then duplicate it and append the rule to this new &struct landlock_events.
>> + *
>> + * @usage: reference count to manage the object lifetime. When a thread need to
>> + *         add Landlock rules and if @usage is greater than 1, then the thread
>> + *         must duplicate &struct landlock_events to not change the children's
>> + *         rules as well.
>> + * @rules: array of non-NULL &struct landlock_rule pointers
>> + */
>> +struct landlock_events {
>> +       atomic_t usage;
>> +       struct landlock_rule *rules[_LANDLOCK_SUBTYPE_EVENT_LAST];
>> +};
>> +
>> +void put_landlock_events(struct landlock_events *events);
>> +
>> +#ifdef CONFIG_SECCOMP_FILTER
> 
> Isn't CONFIG_SECCOMP_FILTER already required for landlock?

Yes it is, but Landlock could only/also be used through cgroups in the
future. :)

> 
>> +int landlock_seccomp_append_prog(unsigned int flags,
>> +               const char __user *user_bpf_fd);
>> +#endif /* CONFIG_SECCOMP_FILTER */
>> +
>>  #endif /* CONFIG_SECURITY_LANDLOCK */
>>  #endif /* _LINUX_LANDLOCK_H */
>> diff --git a/include/linux/seccomp.h b/include/linux/seccomp.h
>> index e25aee2cdfc0..9a38de3c0e72 100644
>> --- a/include/linux/seccomp.h
>> +++ b/include/linux/seccomp.h
>> @@ -10,6 +10,10 @@
>>  #include <linux/thread_info.h>
>>  #include <asm/seccomp.h>
>>
>> +#if defined(CONFIG_SECCOMP_FILTER) && defined(CONFIG_SECURITY_LANDLOCK)
>> +struct landlock_events;
>> +#endif /* CONFIG_SECCOMP_FILTER && CONFIG_SECURITY_LANDLOCK */
> 
> Testing LANDLOCK should be sufficient, since it requires ..._FILTER.
> 
>> +
>>  struct seccomp_filter;
>>  /**
>>   * struct seccomp - the state of a seccomp'ed process
>> @@ -18,6 +22,7 @@ struct seccomp_filter;
>>   *         system calls available to a process.
>>   * @filter: must always point to a valid seccomp-filter or NULL as it is
>>   *          accessed without locking during system call entry.
>> + * @landlock_events: contains an array of Landlock rules.
>>   *
>>   *          @filter must only be accessed from the context of current as there
>>   *          is no read locking.
>> @@ -25,6 +30,9 @@ struct seccomp_filter;
>>  struct seccomp {
>>         int mode;
>>         struct seccomp_filter *filter;
>> +#if defined(CONFIG_SECCOMP_FILTER) && defined(CONFIG_SECURITY_LANDLOCK)
>> +       struct landlock_events *landlock_events;
>> +#endif /* CONFIG_SECCOMP_FILTER && CONFIG_SECURITY_LANDLOCK */
> 
> Same.
> 
>>  };
>>
>>  #ifdef CONFIG_HAVE_ARCH_SECCOMP_FILTER
>> diff --git a/include/uapi/linux/seccomp.h b/include/uapi/linux/seccomp.h
>> index 0f238a43ff1e..74891cf60ca6 100644
>> --- a/include/uapi/linux/seccomp.h
>> +++ b/include/uapi/linux/seccomp.h
>> @@ -13,6 +13,7 @@
>>  /* Valid operations for seccomp syscall. */
>>  #define SECCOMP_SET_MODE_STRICT        0
>>  #define SECCOMP_SET_MODE_FILTER        1
>> +#define SECCOMP_APPEND_LANDLOCK_RULE   2
>>
>>  /* Valid flags for SECCOMP_SET_MODE_FILTER */
>>  #define SECCOMP_FILTER_FLAG_TSYNC      1
>> diff --git a/kernel/fork.c b/kernel/fork.c
>> index a27d8e67ce33..14c09486c565 100644
>> --- a/kernel/fork.c
>> +++ b/kernel/fork.c
>> @@ -47,6 +47,7 @@
>>  #include <linux/security.h>
>>  #include <linux/hugetlb.h>
>>  #include <linux/seccomp.h>
>> +#include <linux/landlock.h>
>>  #include <linux/swap.h>
>>  #include <linux/syscalls.h>
>>  #include <linux/jiffies.h>
>> @@ -528,7 +529,10 @@ static struct task_struct *dup_task_struct(struct task_struct *orig, int node)
>>          * the usage counts on the error path calling free_task.
>>          */
>>         tsk->seccomp.filter = NULL;
>> -#endif
>> +#ifdef CONFIG_SECURITY_LANDLOCK
>> +       tsk->seccomp.landlock_events = NULL;
>> +#endif /* CONFIG_SECURITY_LANDLOCK */
>> +#endif /* CONFIG_SECCOMP */
>>
>>         setup_thread_stack(tsk, orig);
>>         clear_user_return_notifier(tsk);
>> @@ -1405,7 +1409,13 @@ static void copy_seccomp(struct task_struct *p)
>>
>>         /* Ref-count the new filter user, and assign it. */
>>         get_seccomp_filter(current);
>> -       p->seccomp = current->seccomp;
>> +       p->seccomp.mode = current->seccomp.mode;
>> +       p->seccomp.filter = current->seccomp.filter;
>> +#if defined(CONFIG_SECCOMP_FILTER) && defined(CONFIG_SECURITY_LANDLOCK)
>> +       p->seccomp.landlock_events = current->seccomp.landlock_events;
>> +       if (p->seccomp.landlock_events)
>> +               atomic_inc(&p->seccomp.landlock_events->usage);
>> +#endif /* CONFIG_SECCOMP_FILTER && CONFIG_SECURITY_LANDLOCK */
> 
> Hrm. So, this needs config cleanup as above. Also, I'm going to need
> some help understanding the usage tracking on landlock_events (which
> should use a get/put rather than open-coding the _inc). I don't see
> why individual assignments are needed here. The only thing that
> matters is the usage bump. I would have expected no changes at all in
> this code, actually. The filter and the events share the same usage
> don't they?

Right, I can move the struct landlock_event into the struct
seccomp_filter. This should make the code cleaner.


> 
>>         /*
>>          * Explicitly enable no_new_privs here in case it got set
>> diff --git a/kernel/seccomp.c b/kernel/seccomp.c
>> index 326f79e32127..d122829e6da1 100644
>> --- a/kernel/seccomp.c
>> +++ b/kernel/seccomp.c
>> @@ -34,6 +34,7 @@
>>  #include <linux/security.h>
>>  #include <linux/tracehook.h>
>>  #include <linux/uaccess.h>
>> +#include <linux/landlock.h>
>>
>>  /**
>>   * struct seccomp_filter - container for seccomp BPF programs
>> @@ -494,6 +495,9 @@ static void put_seccomp_filter(struct seccomp_filter *filter)
>>  void put_seccomp(struct task_struct *tsk)
>>  {
>>         put_seccomp_filter(tsk->seccomp.filter);
>> +#ifdef CONFIG_SECURITY_LANDLOCK
>> +       put_landlock_events(tsk->seccomp.landlock_events);
>> +#endif /* CONFIG_SECURITY_LANDLOCK */
> 
> put_landlock_events() should be defined in a header file to be a
> static inline no-op if ..._LANDLOCK isn't defined. That way we can
> avoid all the #ifdef stuff here. Similarly, I think all of these
> should take just task_struct so that there isn't an exposed structure
> name which lets you avoid the #ifdef too.

Right

> 
>>  }
>>
>>  static void seccomp_init_siginfo(siginfo_t *info, int syscall, int reason)
>> @@ -813,6 +817,10 @@ static long do_seccomp(unsigned int op, unsigned int flags,
>>                 return seccomp_set_mode_strict();
>>         case SECCOMP_SET_MODE_FILTER:
>>                 return seccomp_set_mode_filter(flags, uargs);
>> +#if defined(CONFIG_SECCOMP_FILTER) && defined(CONFIG_SECURITY_LANDLOCK)
>> +       case SECCOMP_APPEND_LANDLOCK_RULE:
>> +               return landlock_seccomp_append_prog(flags, uargs);
>> +#endif /* CONFIG_SECCOMP_FILTER && CONFIG_SECURITY_LANDLOCK */
>>         default:
>>                 return -EINVAL;
>>         }
>> diff --git a/security/landlock/Makefile b/security/landlock/Makefile
>> index c0db504a6335..da8ba8b5183e 100644
>> --- a/security/landlock/Makefile
>> +++ b/security/landlock/Makefile
>> @@ -2,4 +2,4 @@ ccflags-$(CONFIG_SECURITY_LANDLOCK) += -Werror=unused-function
>>
>>  obj-$(CONFIG_SECURITY_LANDLOCK) := landlock.o
>>
>> -landlock-y := init.o hooks.o hooks_fs.o
>> +landlock-y := init.o providers.o hooks.o hooks_fs.o
>> diff --git a/security/landlock/hooks.c b/security/landlock/hooks.c
>> index eaee8162ff70..4fa7d0b38d41 100644
>> --- a/security/landlock/hooks.c
>> +++ b/security/landlock/hooks.c
>> @@ -95,6 +95,38 @@ bool landlock_is_valid_access(int off, int size, enum bpf_access_type type,
>>         return true;
>>  }
>>
>> +/**
>> + * landlock_event_deny - run Landlock rules tied to an event
>> + *
>> + * @event_idx: event index in the rules array
>> + * @ctx: non-NULL eBPF context
>> + * @events: Landlock events pointer
>> + *
>> + * Return true if at least one rule deny the event.
>> + */
>> +static bool landlock_event_deny(u32 event_idx, const struct landlock_context *ctx,
>> +               struct landlock_events *events)
>> +{
>> +       struct landlock_rule *rule;
>> +
>> +       if (!events)
>> +               return false;
>> +
>> +       for (rule = events->rules[event_idx]; rule; rule = rule->prev) {
>> +               u32 ret;
>> +
>> +               if (WARN_ON(!rule->prog))
>> +                       continue;
>> +               rcu_read_lock();
>> +               ret = BPF_PROG_RUN(rule->prog, (void *)ctx);
>> +               rcu_read_unlock();
>> +               /* deny access if a program returns a value different than 0 */
>> +               if (ret)
>> +                       return true;
>> +       }
>> +       return false;
>> +}
>> +
>>  int landlock_decide(enum landlock_subtype_event event,
>>                 __u64 ctx_values[CTX_ARG_NB], u32 cmd, const char *hook)
>>  {
>> @@ -111,5 +143,10 @@ int landlock_decide(enum landlock_subtype_event event,
>>                 .arg2 = ctx_values[1],
>>         };
>>
>> +#ifdef CONFIG_SECCOMP_FILTER
>> +       deny = landlock_event_deny(event_idx, &ctx,
>> +                       current->seccomp.landlock_events);
>> +#endif /* CONFIG_SECCOMP_FILTER */
> 
> Isn't ..._FILTER required?

Same as above, this is required for now but could change in the near
future. :)

> 
>> +
>>         return deny ? -EPERM : 0;
>>  }
>> diff --git a/security/landlock/hooks.h b/security/landlock/hooks.h
>> index 2e180f6ed86b..dd0486a4c284 100644
>> --- a/security/landlock/hooks.h
>> +++ b/security/landlock/hooks.h
>> @@ -12,6 +12,7 @@
>>  #include <linux/bpf.h> /* enum bpf_access_type */
>>  #include <linux/lsm_hooks.h>
>>  #include <linux/sched.h> /* struct task_struct */
>> +#include <linux/seccomp.h>
>>
>>  /* separators */
>>  #define SEP_COMMA() ,
>> @@ -163,7 +164,11 @@ WRAP_TYPE_RAW_C;
>>
>>  static inline bool landlocked(const struct task_struct *task)
>>  {
>> +#ifdef CONFIG_SECCOMP_FILTER
>> +       return !!(task->seccomp.landlock_events);
>> +#else
>>         return false;
>> +#endif /* CONFIG_SECCOMP_FILTER */
>>  }
>>
>>  __init void landlock_register_hooks(struct security_hook_list *hooks, int count);
>> diff --git a/security/landlock/init.c b/security/landlock/init.c
>> index 1c2750e12dfa..ef8a3da69860 100644
>> --- a/security/landlock/init.c
>> +++ b/security/landlock/init.c
>> @@ -135,7 +135,8 @@ static struct bpf_prog_type_list bpf_landlock_type __ro_after_init = {
>>
>>  void __init landlock_add_hooks(void)
>>  {
>> -       pr_info("landlock: Version %u", LANDLOCK_VERSION);
>> +       pr_info("landlock: Version %u, ready to sandbox with %s\n",
>> +                       LANDLOCK_VERSION, "seccomp");
>>         landlock_add_hooks_fs();
>>         security_add_hooks(NULL, 0, "landlock");
>>         bpf_register_prog_type(&bpf_landlock_type);
>> diff --git a/security/landlock/providers.c b/security/landlock/providers.c
>> new file mode 100644
>> index 000000000000..6d867a39c947
>> --- /dev/null
>> +++ b/security/landlock/providers.c
>> @@ -0,0 +1,232 @@
>> +/*
>> + * Landlock LSM - seccomp provider
>> + *
>> + * Copyright © 2017 Mickaël Salaün <mic@...ikod.net>
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2, as
>> + * published by the Free Software Foundation.
>> + */
>> +
>> +#include <asm/page.h> /* PAGE_SIZE */
>> +#include <linux/atomic.h> /* atomic_*(), smp_store_release() */
>> +#include <linux/bpf.h> /* bpf_prog_put() */
>> +#include <linux/filter.h> /* struct bpf_prog */
>> +#include <linux/kernel.h> /* round_up() */
>> +#include <linux/landlock.h>
>> +#include <linux/sched.h> /* current_cred(), task_no_new_privs() */
>> +#include <linux/security.h> /* security_capable_noaudit() */
>> +#include <linux/slab.h> /* alloc(), kfree() */
>> +#include <linux/types.h> /* atomic_t */
>> +#include <linux/uaccess.h> /* copy_from_user() */
>> +
>> +#include "common.h"
>> +
>> +static void put_landlock_rule(struct landlock_rule *rule)
>> +{
>> +       struct landlock_rule *orig = rule;
>> +
>> +       /* clean up single-reference branches iteratively */
>> +       while (orig && atomic_dec_and_test(&orig->usage)) {
>> +               struct landlock_rule *freeme = orig;
>> +
>> +               bpf_prog_put(orig->prog);
>> +               orig = orig->prev;
>> +               kfree(freeme);
>> +       }
>> +}
>> +
>> +void put_landlock_events(struct landlock_events *events)
>> +{
>> +       if (events && atomic_dec_and_test(&events->usage)) {
>> +               size_t i;
>> +
>> +               for (i = 0; i < ARRAY_SIZE(events->rules); i++)
>> +                       /* XXX: Do we need to use lockless_dereference() here? */
>> +                       put_landlock_rule(events->rules[i]);
>> +               kfree(events);
>> +       }
>> +}
>> +
>> +static struct landlock_events *new_landlock_events(void)
>> +{
>> +       struct landlock_events *ret;
>> +
>> +       /* array filled with NULL values */
>> +       ret = kzalloc(sizeof(*ret), GFP_KERNEL);
>> +       if (!ret)
>> +               return ERR_PTR(-ENOMEM);
>> +       atomic_set(&ret->usage, 1);
>> +       return ret;
>> +}
>> +
>> +static void add_landlock_rule(struct landlock_events *events,
>> +               struct landlock_rule *rule)
>> +{
>> +       /* subtype.landlock_rule.event > 0 for loaded programs */
>> +       u32 event_idx = get_index(rule->prog->subtype.landlock_rule.event);
>> +
>> +       rule->prev = events->rules[event_idx];
>> +       WARN_ON(atomic_read(&rule->usage));
>> +       atomic_set(&rule->usage, 1);
>> +       /* do not increment the previous rule usage */
>> +       smp_store_release(&events->rules[event_idx], rule);
>> +}
>> +
>> +/* limit Landlock events to 256KB */
>> +#define LANDLOCK_EVENTS_MAX_PAGES (1 << 6)
>> +
>> +/**
>> + * landlock_append_prog - attach a Landlock rule to @current_events
>> + *
>> + * @current_events: landlock_events pointer, must be locked (if needed) to
>> + *                  prevent a concurrent put/free. This pointer must not be
>> + *                  freed after the call.
>> + * @prog: non-NULL Landlock rule to append to @current_events. @prog will be
>> + *        owned by landlock_append_prog() and freed if an error happened.
>> + *
>> + * Return @current_events or a new pointer when OK. Return a pointer error
>> + * otherwise.
>> + */
>> +static struct landlock_events *landlock_append_prog(
>> +               struct landlock_events *current_events, struct bpf_prog *prog)
>> +{
>> +       struct landlock_events *new_events = current_events;
>> +       unsigned long pages;
>> +       struct landlock_rule *rule;
>> +       u32 event_idx;
>> +
>> +       if (prog->type != BPF_PROG_TYPE_LANDLOCK) {
>> +               new_events = ERR_PTR(-EINVAL);
>> +               goto put_prog;
>> +       }
>> +
>> +       /* validate memory size allocation */
>> +       pages = prog->pages;
>> +       if (current_events) {
>> +               size_t i;
>> +
>> +               for (i = 0; i < ARRAY_SIZE(current_events->rules); i++) {
>> +                       struct landlock_rule *walker_r;
>> +
>> +                       for (walker_r = current_events->rules[i]; walker_r;
>> +                                       walker_r = walker_r->prev)
>> +                               pages += walker_r->prog->pages;
>> +               }
>> +               /* count a struct landlock_events if we need to allocate one */
>> +               if (atomic_read(&current_events->usage) != 1)
>> +                       pages += round_up(sizeof(*current_events), PAGE_SIZE) /
>> +                               PAGE_SIZE;
>> +       }
>> +       if (pages > LANDLOCK_EVENTS_MAX_PAGES) {
>> +               new_events = ERR_PTR(-E2BIG);
>> +               goto put_prog;
>> +       }
>> +
>> +       rule = kzalloc(sizeof(*rule), GFP_KERNEL);
>> +       if (!rule) {
>> +               new_events = ERR_PTR(-ENOMEM);
>> +               goto put_prog;
>> +       }
>> +       rule->prog = prog;
>> +
>> +       /* subtype.landlock_rule.event > 0 for loaded programs */
>> +       event_idx = get_index(rule->prog->subtype.landlock_rule.event);
>> +
>> +       if (!new_events) {
>> +               /*
>> +                * If there is no Landlock events used by the current task,
>> +                * then create a new one.
>> +                */
>> +               new_events = new_landlock_events();
>> +               if (IS_ERR(new_events))
>> +                       goto put_rule;
> 
> Shouldn't bpf_prog_put() get called in the face of a rule failure too?
> Why separate exit paths?

You're right but put_landlock_rule() call bpf_prog_put() by itself.


> 
>> +       } else if (atomic_read(&current_events->usage) > 1) {
>> +               /*
>> +                * If the current task is not the sole user of its Landlock
>> +                * events, then duplicate them.
>> +                */
>> +               size_t i;
>> +
>> +               new_events = new_landlock_events();
>> +               if (IS_ERR(new_events))
>> +                       goto put_rule;
>> +               for (i = 0; i < ARRAY_SIZE(new_events->rules); i++) {
>> +                       new_events->rules[i] =
>> +                               lockless_dereference(current_events->rules[i]);
>> +                       if (new_events->rules[i])
>> +                               atomic_inc(&new_events->rules[i]->usage);
> 
> I was going to ask: isn't the top-level usage counter sufficient to
> track rule lifetime? But I think I see how things are tracked now:
> each task_struct points to an array of rule list pointers. These
> tables are duplicated when additions are made (which means each table
> needs to be refcounted for the processes using it), and when a new
> table is created, all the refcounters on the rules are bumped (to
> track each table that references the rule), and when a new rule is
> added, it's just prepended to the list for the new table to point at.

That's right.

> 
> Does this mean that rules are processed in reverse?

Yes, the rules are processed from the newest to the oldest, as
seccomp-bpf does with filters.

> 
>> +               }
>> +
>> +               /*
>> +                * Landlock events from the current task will not be freed here
>> +                * because the usage is strictly greater than 1. It is only
>> +                * prevented to be freed by another subject thanks to the
>> +                * caller of landlock_append_prog() which should be locked if
>> +                * needed.
>> +                */
>> +               put_landlock_events(current_events);
>> +       }
>> +       add_landlock_rule(new_events, rule);
>> +       return new_events;
>> +
>> +put_prog:
>> +       bpf_prog_put(prog);
>> +       return new_events;
>> +
>> +put_rule:
>> +       put_landlock_rule(rule);
>> +       return new_events;
>> +}
>> +
>> +/**
>> + * landlock_seccomp_append_prog - attach a Landlock rule to the current process
>> + *
>> + * current->seccomp.landlock_events is lazily allocated. When a process fork,
>> + * only a pointer is copied. When a new event is added by a process, if there
>> + * is other references to this process' landlock_events, then a new allocation
>> + * is made to contain an array pointing to Landlock rule lists. This design
>> + * enable low-performance impact and is memory efficient while keeping the
>> + * property of append-only rules.
>> + *
>> + * @flags: not used for now, but could be used for TSYNC
>> + * @user_bpf_fd: file descriptor pointing to a loaded Landlock rule
>> + */
>> +#ifdef CONFIG_SECCOMP_FILTER
>> +int landlock_seccomp_append_prog(unsigned int flags,
>> +               const char __user *user_bpf_fd)
>> +{
>> +       struct landlock_events *new_events;
>> +       struct bpf_prog *prog;
>> +       int bpf_fd;
>> +
>> +       /* force no_new_privs to limit privilege escalation */
>> +       if (!task_no_new_privs(current))
>> +               return -EPERM;
>> +       /* will be removed in the future to allow unprivileged tasks */
>> +       if (!capable(CAP_SYS_ADMIN))
>> +               return -EPERM;
>> +       /* enable to check if Landlock is supported with early EFAULT */
>> +       if (!user_bpf_fd)
>> +               return -EFAULT;
>> +       if (flags)
>> +               return -EINVAL;
>> +       if (copy_from_user(&bpf_fd, user_bpf_fd, sizeof(bpf_fd)))
>> +               return -EFAULT;
> 
> I think this can just be get_user()?

Yes, I didn't know about that.

> 
>> +       prog = bpf_prog_get(bpf_fd);
>> +       if (IS_ERR(prog))
>> +               return PTR_ERR(prog);
>> +
>> +       /*
>> +        * We don't need to lock anything for the current process hierarchy,
>> +        * everything is guarded by the atomic counters.
>> +        */
>> +       new_events = landlock_append_prog(current->seccomp.landlock_events,
>> +                       prog);
>> +       /* @prog is managed/freed by landlock_append_prog() */
> 
> Does kmemcheck notice this "leak"? (i.e. is further annotation needed?)

I didn't enable kmemcheck, I will take a look at it.

> 
>> +       if (IS_ERR(new_events))
>> +               return PTR_ERR(new_events);
>> +       current->seccomp.landlock_events = new_events;
>> +       return 0;
>> +}
>> +#endif /* CONFIG_SECCOMP_FILTER */
>> --
>> 2.11.0
>>
> 
> 
> 



Download attachment "signature.asc" of type "application/pgp-signature" (489 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ