linux-kernel - Re: BPF: bpf_d_path() can be invoked on "struct path" not holding proper references, resulting in kernel memory corruption

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <CAADnVQKm3u+XbMoRxXSwg_d+Q80jPdqgzO6Yz+6JXvYATyMEZw@mail.gmail.com>
Date:   Mon, 30 Oct 2023 10:11:00 -0700
From:   Alexei Starovoitov <alexei.starovoitov@...il.com>
To:     Jann Horn <jannh@...gle.com>
Cc:     Alexei Starovoitov <ast@...nel.org>,
        Daniel Borkmann <daniel@...earbox.net>,
        Andrii Nakryiko <andrii@...nel.org>,
        Jiri Olsa <jolsa@...nel.org>,
        Martin KaFai Lau <martin.lau@...ux.dev>,
        Song Liu <song@...nel.org>, Yonghong Song <yhs@...com>,
        John Fastabend <john.fastabend@...il.com>,
        KP Singh <kpsingh@...nel.org>,
        Stanislav Fomichev <sdf@...gle.com>,
        Hao Luo <haoluo@...gle.com>, KP Singh <kpsingh@...gle.com>,
        Matt Bobrowski <mattbobrowski@...gle.com>,
        bpf <bpf@...r.kernel.org>,
        kernel list <linux-kernel@...r.kernel.org>,
        Al Viro <viro@...iv.linux.org.uk>
Subject: Re: BPF: bpf_d_path() can be invoked on "struct path" not holding
 proper references, resulting in kernel memory corruption

On Fri, Oct 27, 2023 at 10:13 AM Jann Horn <jannh@...gle.com> wrote:
>
> Hi!
>
> bpf_d_path() can be invoked on a "struct path" that results from
> following a pointer chain involving pointers that can concurrently
> change; this can lead to stuff like use-after-free in d_path().
>
> For example, the BPF verifier permits stuff like
> bpf_d_path(&current->mm->exe_file->f_path, ...), which is not actually
> safe in many contexts:
>
> current->mm->exe_file can concurrently change; so by the time
> bpf_d_path() is called, the file's refcount might have gone to zero,
> and __fput() may have already mostly torn down the file. "struct file"
> currently has some limited RCU lifetime, but that is supposed to be an
> implementation detail that BPF shouldn't be relying on, and "struct
> file" will soon have even less RCU lifetime than before (see
> <https://lore.kernel.org/all/20230930-glitzer-errungenschaft-b86880c177c4@brauner/>).
>
> When __fput() tears down a file, it drops the references held by
> file->f_path.mnt and file->f_path.dentry. "struct vfsmount" has some
> kind of RCU lifetime, but "struct dentry" will be freed directly in
> dentry_free() if it has DCACHE_NORCU set, which is the case if it was
> allocated via d_alloc_pseudo(), which is how memfd files are
> allocated.
>
> So the following race is possible, if we start in a situation where
> current->mm->exe_file points to a memfd:
>
> thread A            thread B
> ========            ========
> begin RCU section
> begin BPF program
> compute path = &current->mm->exe_file->f_path
>
>                     prctl(PR_SET_MM, PR_SET_MM_MAP, ...)
>                       updates current->mm->exe_file
>                       calls fput() on old ->exe_file
>                     __fput() runs
>                       dput(dentry);
>                       mntput(mnt)
>
> invoke helper bpf_d_path(path, ...)
>   d_path()
>     reads path->dentry->d_op  *** UAF read ***
>     reads path->dentry->d_op->d_dname  *** read through wild pointer ***
>     path->dentry->d_op->d_dname(...) *** wild pointer call ***
>
> So if an attacker managed to reallocate the old "struct dentry" with
> attacker-controlled data, they could probably get the kernel to call
> an attacker-provided function pointer, eventually letting an attacker
> gain kernel privileges.
>
> Obviously this is not a bug an unprivileged attacker can just hit
> directly on a system where no legitimate BPF programs are already
> running, because loading tracing BPF programs requires privileges; but
> if a privileged process loads a tracing BPF program that does
> something unsafe like "bpf_d_path(&current->mm->exe_file->f_path,
> ...)", an attacker might be able to leverage that.

Thanks for the report. That's a verifier bug indeed.
Curious, did you actually see such broken bpf program or this is
theoretical issue in case somebody will write such thing ?

>
> If BPF wants to be able to promise that buggy BPF code can't crash the
> kernel (or, worse, introduce privilege escalation vulnerabilities in
> the kernel),

Only the former. The verifier cannot possibly guarantee that the bpf-lsm
program or tracing bpf prog is not leaking addresses or acting maliciously.
Same in networking. XDP prog might be doing firewalling incorrectly,
dropping wrong packets, disabling ssh when it shouldn't, etc.
We cannot validate semantics. The verifier tries to guarantee non-crash only.
Hence loading bpf prog is a privileged operation.

But back to the verifier bug... I suspect it will be very hard to
craft a test that does prctl(PR_SET_MM) and goes all the way through
the delayed fput logic on one cpu while bpf prog under rcu_read_lock
calls bpf_d_path on the other cpu. I can see this happening in theory
and we need to close this verification gap, but we need to be realistic
in assessing the severity of it.
To fix it we need to make bpf_d_path KF_TRUSTED_ARGS. All new kfuncs
are done this way already. They don't allow unrestricted pointer walks.