Date:   Tue, 15 Mar 2022 10:27:39 -0700
From:   Hao Luo <haoluo@...gle.com>
To:     Al Viro <viro@...iv.linux.org.uk>
Cc:     Alexei Starovoitov <ast@...nel.org>,
        Andrii Nakryiko <andrii@...nel.org>,
        Daniel Borkmann <daniel@...earbox.net>,
        Martin KaFai Lau <kafai@...com>,
        Song Liu <songliubraving@...com>, Yonghong Song <yhs@...com>,
        KP Singh <kpsingh@...nel.org>,
        Shakeel Butt <shakeelb@...gle.com>,
        Joe Burton <jevburton.kernel@...il.com>,
        Tejun Heo <tj@...nel.org>, joshdon@...gle.com, sdf@...gle.com,
        bpf@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH bpf-next v1 1/9] bpf: Add mkdir, rmdir, unlink syscalls
 for prog_bpf_syscall

On Mon, Mar 14, 2022 at 4:12 PM Al Viro <viro@...iv.linux.org.uk> wrote:
>
> On Mon, Mar 14, 2022 at 10:07:31AM -0700, Hao Luo wrote:
> > Hello Al,
>
> > > In which contexts can those be called?
> > >
> >
> > In a sleepable context. The plan is to introduce certain tracepoints
> > as sleepable; a program that attaches to a sleepable tracepoint is
> > allowed to call these functions. In particular, the first sleepable
> > tracepoint introduced in this patchset is one at the end of
> > cgroup_mkdir(). Do you have any advice?
>
> Yes - don't do it, unless you really want a lot of user-triggerable
> deadlocks.
>
> Pathname resolution is not locking-agnostic.  In particular, you can't
> do it if you are under any ->i_rwsem, whether it's shared or exclusive.
> That includes cgroup_mkdir() callchains.  And if the pathname passed
> to these functions will have you walk through the parent directory,
> you would get screwed (e.g. if the next component happens to be
> inexistent, triggering a lookup, which takes ->i_rwsem shared).

I'm thinking of two options; let's see whether either can work:

Option 1: We can restrict the pathname passed into this helper. We can
explicitly require the dirfd parameter to be in bpffs (which we can
verify). In addition, we can check that pathname contains no "." or
".." components, so the resolved path ends up inside bpffs and
therefore won't take any ->i_rwsem that is held in the callchain of
cgroup_mkdir().
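
To make the pathname check in Option 1 concrete, here is a minimal
user-space sketch, not kernel code and not taken from the patchset.
The function name path_has_no_dot_components() is hypothetical. It
only inspects the string and rejects any '/'-separated component that
is exactly "." or "..":

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (illustration only): returns true if no
 * '/'-separated component of path is exactly "." or "..".
 * Names that merely contain dots, such as "a..b" or ".hidden",
 * are accepted.
 */
static bool path_has_no_dot_components(const char *path)
{
	const char *p = path;

	while (*p) {
		/* Length of the current component; 0 for repeated slashes. */
		size_t len = strcspn(p, "/");

		if ((len == 1 && p[0] == '.') ||
		    (len == 2 && p[0] == '.' && p[1] == '.'))
			return false;

		/* Skip the component and, if present, the separator. */
		p += len;
		if (*p == '/')
			p++;
	}
	return true;
}
```

With this sketch, "a/b/c" is accepted while "a/../b" and "./a" are
rejected.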

Option 2: We can avoid pathname resolution entirely. As above, we can
change the semantics of this helper to: create a directory immediately
under the dirfd passed in. Again, we can require the dirfd to be in
bpffs and the pathname to consist only of letters and digits. With
these restrictions, we call vfs_mkdir() to create the directory.
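
The name restriction in Option 2 could look roughly like the
following user-space sketch. The function name name_is_alnum() is
hypothetical and not part of the patchset; it assumes the name is a
NUL-terminated string and treats "letters and digits" as ASCII
alphanumerics in the default C locale:

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper (illustration only): returns true if name is a
 * non-empty string made up solely of alphanumeric characters. This
 * rejects '/', '.', and therefore "." and "..", so the name can only
 * denote a single new entry directly under the parent directory.
 */
static bool name_is_alnum(const char *name)
{
	if (name == NULL || *name == '\0')
		return false;

	for (; *name; name++) {
		/* Cast avoids undefined behavior for negative char values. */
		if (!isalnum((unsigned char)*name))
			return false;
	}
	return true;
}
```

Only after such a check passes would the helper go on to create the
directory under the verified bpffs dirfd.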

Being able to mkdir from bpf has useful use cases, so let's try to make
it happen, even if it comes with many limitations.

Thanks!
