linux-kernel - Re: [PATCH v6] seccomp, ptrace: add support for dumping seccomp filters

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <5614F532.2070907@iogearbox.net>
Date:	Wed, 07 Oct 2015 12:34:26 +0200
From:	Daniel Borkmann <daniel@...earbox.net>
To:	Tycho Andersen <tycho.andersen@...onical.com>,
	Kees Cook <keescook@...omium.org>
CC:	Alexei Starovoitov <ast@...nel.org>,
	Will Drewry <wad@...omium.org>,
	Oleg Nesterov <oleg@...hat.com>,
	Andy Lutomirski <luto@...capital.net>,
	Pavel Emelyanov <xemul@...allels.com>,
	"Serge E. Hallyn" <serge.hallyn@...ntu.com>,
	linux-kernel@...r.kernel.org, linux-api@...r.kernel.org
Subject: Re: [PATCH v6] seccomp, ptrace: add support for dumping seccomp filters

On 10/07/2015 12:25 PM, Daniel Borkmann wrote:
> On 10/07/2015 11:46 AM, Tycho Andersen wrote:
>> This patch adds support for dumping a process' (classic BPF) seccomp
>> filters via ptrace.
>>
>> PTRACE_SECCOMP_GET_FILTER allows the tracer to dump the user's classic BPF
>> seccomp filters. addr should be an integer which represents the ith seccomp
>> filter (0 is the most recently installed filter). data should be a struct
>> sock_filter * with enough room for the ith filter, or NULL, in which case
>> the filter is not saved. The return value for this command is the number of
>> BPF instructions the program represents, or negative in the case of errors.
>> A command specific error is ENOENT, which indicates that there is no ith
>> filter in this seccomp tree.
>>
>> A caveat with this approach is that there is no way to get explicitly at
>> the heirarchy of seccomp filters, and users need to memcmp() filters to
>> decide which are inherited. This means that a task which installs two of
>> the same filter can potentially confuse users of this interface.
>>
>> Signed-off-by: Tycho Andersen <tycho.andersen@...onical.com>
>> CC: Kees Cook <keescook@...omium.org>
>> CC: Will Drewry <wad@...omium.org>
>> CC: Oleg Nesterov <oleg@...hat.com>
>> CC: Andy Lutomirski <luto@...capital.net>
>> CC: Pavel Emelyanov <xemul@...allels.com>
>> CC: Serge E. Hallyn <serge.hallyn@...ntu.com>
>> CC: Alexei Starovoitov <ast@...nel.org>
>> CC: Daniel Borkmann <daniel@...earbox.net>
>> ---
>>   include/linux/seccomp.h     | 11 +++++++++
>>   include/uapi/linux/ptrace.h |  2 ++
>>   kernel/ptrace.c             |  5 ++++
>>   kernel/seccomp.c            | 57 ++++++++++++++++++++++++++++++++++++++++++++-
>>   4 files changed, 74 insertions(+), 1 deletion(-)
>>
>> diff --git a/include/linux/seccomp.h b/include/linux/seccomp.h
>> index f426503..8861b5b 100644
>> --- a/include/linux/seccomp.h
>> +++ b/include/linux/seccomp.h
>> @@ -95,4 +95,15 @@ static inline void get_seccomp_filter(struct task_struct *tsk)
>>       return;
>>   }
>>   #endif /* CONFIG_SECCOMP_FILTER */
>> +
>> +#if defined(CONFIG_SECCOMP_FILTER) && defined(CONFIG_CHECKPOINT_RESTORE)
>> +extern long seccomp_get_filter(struct task_struct *task, long n,
>> +                   void __user *data);
>> +#else
>> +static inline long seccomp_get_filter(struct task_struct *task,
>> +                      long n, void __user *data)
>> +{
>> +    return -EINVAL;
>
> Nit: -ENOTSUP would probably be the better choice? -EINVAL might just
> be confusing to users? (Would be unclear to them whether there's actual
> support of dumping or whether it's just an invalid argument.)
>
>> +}
>> +#endif /* CONFIG_SECCOMP_FILTER && CONFIG_CHECKPOINT_RESTORE */
>>   #endif /* _LINUX_SECCOMP_H */
> ...
>> diff --git a/kernel/ptrace.c b/kernel/ptrace.c
>> index 787320d..b760bae 100644
>> --- a/kernel/ptrace.c
>> +++ b/kernel/ptrace.c
>> @@ -1016,6 +1016,11 @@ int ptrace_request(struct task_struct *child, long request,
>>           break;
>>       }
>>   #endif
>> +
>> +    case PTRACE_SECCOMP_GET_FILTER:
>> +        ret = seccomp_get_filter(child, addr, datavp);
>> +        break;
>> +
>>       default:
>>           break;
>>       }
>> diff --git a/kernel/seccomp.c b/kernel/seccomp.c
>> index 06858a7..c8a4564 100644
>> --- a/kernel/seccomp.c
>> +++ b/kernel/seccomp.c
>> @@ -347,6 +347,7 @@ static struct seccomp_filter *seccomp_prepare_filter(struct sock_fprog *fprog)
>>   {
>>       struct seccomp_filter *sfilter;
>>       int ret;
>> +    bool save_orig = config_enabled(CONFIG_CHECKPOINT_RESTORE);
>>
>>       if (fprog->len == 0 || fprog->len > BPF_MAXINSNS)
>>           return ERR_PTR(-EINVAL);
>> @@ -370,7 +371,7 @@ static struct seccomp_filter *seccomp_prepare_filter(struct sock_fprog *fprog)
>>           return ERR_PTR(-ENOMEM);
>>
>>       ret = bpf_prog_create_from_user(&sfilter->prog, fprog,
>> -                    seccomp_check_filter, false);
>> +                    seccomp_check_filter, save_orig);
>>       if (ret < 0) {
>>           kfree(sfilter);
>>           return ERR_PTR(ret);
>> @@ -867,3 +868,57 @@ long prctl_set_seccomp(unsigned long seccomp_mode, char __user *filter)
>>       /* prctl interface doesn't have flags, so they are always zero. */
>>       return do_seccomp(op, 0, uargs);
>>   }
>> +
>> +#if defined(CONFIG_SECCOMP_FILTER) && defined(CONFIG_CHECKPOINT_RESTORE)
>> +long seccomp_get_filter(struct task_struct *task, long n, void __user *data)
>> +{
>> +    struct seccomp_filter *filter;
>> +    struct sock_fprog_kern *fprog;
>> +    long ret;
>> +
>> +    if (n < 0)
>> +        return -EINVAL;
>
> I would probably give 'n' a better name, maybe 'filter_off' to denote an
> offset in the task's filter list?
>
> So, it's called as seccomp_get_filter(child, addr, datavp), and addr is
> an unsigned long in ptrace_request(). Any reasons why making this 'long n'
> with adding this above check?
>
>> +    spin_lock_irq(&current->sighand->siglock);
>> +    if (!capable(CAP_SYS_ADMIN) ||
>
> The capability check should probably happen before taking the task's spinlock.
>
>> +        current->seccomp.mode != SECCOMP_MODE_DISABLED) {

Should this rather be: current->seccomp.mode == SECCOMP_MODE_DISABLED ?
So that you bail out when seccomp is not in use?

>> +        ret = -EACCES;
>> +        goto out_self;
>> +    }
>> +
>> +    spin_lock_irq(&task->sighand->siglock);
>> +    if (task->seccomp.mode != SECCOMP_MODE_FILTER) {
>> +        ret = -EINVAL;
>> +        goto out_task;
>> +    }
>> +
>> +    filter = task->seccomp.filter;
>> +    while (n > 0 && filter) {
>> +        filter = filter->prev;
>> +        n--;
>> +    }
>> +
>> +    if (!filter) {
>> +        ret = -ENOENT;
>> +        goto out_task;
>> +    }
>> +
>> +    fprog = filter->prog->orig_prog;
>
> You could add this check ...
>
>    https://git.kernel.org/cgit/linux/kernel/git/davem/net.git/commit/?id=93d08b6966cf730ea669d4d98f43627597077153
>
> ... here as well, so we don't get surprises in future. ;)
>
>> +    ret = fprog->len;
>> +    if (!data)
>> +        goto out_task;
>> +
>> +    if (copy_to_user(data, fprog->filter, bpf_classic_proglen(fprog))) {
>> +        ret = -EFAULT;
>> +        goto out_task;
>> +    }
>> +
>> +out_task:
>> +    spin_unlock_irq(&task->sighand->siglock);
>> +
>> +out_self:
>> +    spin_unlock_irq(&current->sighand->siglock);
>> +    return ret;
>> +}
>> +#endif
>>
>

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/