netdev - [PATCH bpf] bpf: fix potential race in tail call compatibility check

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-Id: <20211021183951.169905-1-toke@redhat.com>
Date:   Thu, 21 Oct 2021 20:39:51 +0200
From:   Toke Høiland-Jørgensen <toke@...hat.com>
To:     Alexei Starovoitov <ast@...nel.org>,
        Daniel Borkmann <daniel@...earbox.net>
Cc:     netdev@...r.kernel.org, bpf@...r.kernel.org,
        Toke Høiland-Jørgensen <toke@...hat.com>,
        Lorenzo Bianconi <lorenzo.bianconi@...hat.com>
Subject: [PATCH bpf] bpf: fix potential race in tail call compatibility check

Lorenzo noticed that the code testing for program type compatibility of
tail call maps is potentially racy in that two threads could encounter a
map with an unset type simultaneously and both return true even though they
are inserting incompatible programs.

The race window is quite small, but artificially enlarging it by adding a
usleep_range() inside the check in bpf_prog_array_compatible() makes it
trivial to trigger from userspace with a program that does, essentially:

        map_fd = bpf_create_map(BPF_MAP_TYPE_PROG_ARRAY, 4, 4, 2, 0);
        pid = fork();
        if (pid) {
                key = 0;
                value = xdp_fd;
        } else {
                key = 1;
                value = tc_fd;
        }
        err = bpf_map_update_elem(map_fd, &key, &value, 0);

While the race window is small, it has potentially serious ramifications in
that triggering it would allow a BPF program to tail call to a program of a
different type. So let's get rid of it by changing to an atomic update of
the array map aux->type. To do this, move the aux->jited boolean to be
encoded in the top-most bit of the aux->type field, so we can cmpxchg() the
whole thing in one go. The commit in the Fixes tag is the last commit that
touches the code in question.

Fixes: 3324b584b6f6 ("ebpf: misc core cleanup")
Reported-by: Lorenzo Bianconi <lorenzo.bianconi@...hat.com>
Signed-off-by: Toke Høiland-Jørgensen <toke@...hat.com>
---
We noticed this while reworking the code in question to also deal with XDP
multi-buffer compatibility. We figured we'd send this fix separately, but
we'd like to base some code on it in bpf-next. What's the right procedure
here, should we wait until bpf gets merged into bpf-next, or can we include
this patch in the multibuf series as well?

-Toke

 include/linux/bpf.h  |  9 +++++++--
 kernel/bpf/core.c    | 23 ++++++++++++++++-------
 kernel/bpf/syscall.c |  4 ++--
 3 files changed, 25 insertions(+), 11 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 020a7d5bf470..48cc42063a86 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -923,14 +923,19 @@ struct bpf_prog_aux {
 	};
 };
 
+#define BPF_MAP_JITED_FLAG (1 << 31)
+
 struct bpf_array_aux {
 	/* 'Ownership' of prog array is claimed by the first program that
 	 * is going to use this map or by the first program which FD is
 	 * stored in the map to make sure that all callers and callees have
 	 * the same prog type and JITed flag.
+	 *
+	 * We store the type as a u32 and encode the jited state in the
+	 * most-significant bit (BPF_MAP_JITED_FLAG). This allows setting the
+	 * type atomically without locking.
 	 */
-	enum bpf_prog_type type;
-	bool jited;
+	u32 type;
 	/* Programs with direct jumps into programs part of this array. */
 	struct list_head poke_progs;
 	struct bpf_map *map;
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index c1e7eb3f1876..2811e7723886 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -1823,20 +1823,29 @@ static unsigned int __bpf_prog_ret0_warn(const void *ctx,
 bool bpf_prog_array_compatible(struct bpf_array *array,
 			       const struct bpf_prog *fp)
 {
+	u32 map_type, prog_type = fp->type;
+
 	if (fp->kprobe_override)
 		return false;
 
-	if (!array->aux->type) {
-		/* There's no owner yet where we could check for
-		 * compatibility.
+	/* Encode the jited status in the top-most bit of the aux->type field so
+	 * we have a single value we can atomically swap in below
+	 */
+	if (fp->jited)
+		prog_type |= BPF_MAP_JITED_FLAG;
+
+	map_type = READ_ONCE(array->aux->type);
+	if (!map_type) {
+		/* There's no owner yet where we could check for compatibility.
+		 * Do an atomic swap to prevent racing with another invocation
+		 * of this branch (via simultaneous map_update syscalls).
 		 */
-		array->aux->type  = fp->type;
-		array->aux->jited = fp->jited;
+		if (cmpxchg(&array->aux->type, 0, prog_type))
+			return false;
 		return true;
 	}
 
-	return array->aux->type  == fp->type &&
-	       array->aux->jited == fp->jited;
+	return map_type == prog_type;
 }
 
 static int bpf_check_tail_call(const struct bpf_prog *fp)
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 4e50c0bfdb7d..45485ebdfb2b 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -543,8 +543,8 @@ static void bpf_map_show_fdinfo(struct seq_file *m, struct file *filp)
 
 	if (map->map_type == BPF_MAP_TYPE_PROG_ARRAY) {
 		array = container_of(map, struct bpf_array, map);
-		type  = array->aux->type;
-		jited = array->aux->jited;
+		type  = array->aux->type & ~BPF_MAP_JITED_FLAG;
+		jited = !!(array->aux->type & BPF_MAP_JITED_FLAG);
 	}
 
 	seq_printf(m,
-- 
2.33.0