[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CA+khW7jhD0+s9kivrd6PsNEaxmDCewhk_egrsxwdHPZNkubJYA@mail.gmail.com>
Date: Tue, 15 Mar 2022 10:27:39 -0700
From: Hao Luo <haoluo@...gle.com>
To: Al Viro <viro@...iv.linux.org.uk>
Cc: Alexei Starovoitov <ast@...nel.org>,
Andrii Nakryiko <andrii@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>,
Martin KaFai Lau <kafai@...com>,
Song Liu <songliubraving@...com>, Yonghong Song <yhs@...com>,
KP Singh <kpsingh@...nel.org>,
Shakeel Butt <shakeelb@...gle.com>,
Joe Burton <jevburton.kernel@...il.com>,
Tejun Heo <tj@...nel.org>, joshdon@...gle.com, sdf@...gle.com,
bpf@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH bpf-next v1 1/9] bpf: Add mkdir, rmdir, unlink syscalls
for prog_bpf_syscall
On Mon, Mar 14, 2022 at 4:12 PM Al Viro <viro@...iv.linux.org.uk> wrote:
>
> On Mon, Mar 14, 2022 at 10:07:31AM -0700, Hao Luo wrote:
> > Hello Al,
>
> > > In which contexts can those be called?
> > >
> >
> > In a sleepable context. The plan is to introduce a certain tracepoints
> > as sleepable, a program that attaches to sleepable tracepoints is
> > allowed to call these functions. In particular, the first sleepable
> > tracepoint introduced in this patchset is one at the end of
> > cgroup_mkdir(). Do you have any advices?
>
> Yes - don't do it, unless you really want a lot of user-triggerable
> deadlocks.
>
> Pathname resolution is not locking-agnostic. In particular, you can't
> do it if you are under any ->i_rwsem, whether it's shared or exclusive.
> That includes cgroup_mkdir() callchains. And if the pathname passed
> to these functions will have you walk through the parent directory,
> you would get screwed (e.g. if the next component happens to be
> inexistent, triggering a lookup, which takes ->i_rwsem shared).
I'm thinking of two options, let's see if either can work out:
Option 1: We can put restrictions on the pathname passed into this
helper. We can explicitly require the parameter dirfd to be in bpffs
(we can verify). In addition, we check pathname to be not containing
any dot or dotdot, so the resolved path will end up inside bpffs,
therefore won't take ->i_rwsem that is in the callchain of
cgroup_mkdir().
Option 2: We can avoid pathname resolution entirely. Like above, we
can adjust the semantics of this helper to be: making an immediate
directory under the dirfd passed in. In particular, like above, we can
enforce the dirfd to be in bpffs and pathname to consist of only
alphabet and numbers. With these restrictions, we call vfs_mkdir() to
create directories.
Being able to mkdir from bpf has useful use cases, let's try to make
it happen even with many limitations.
Thanks!
Powered by blists - more mailing lists