netdev - Re: [PATCH bpf v2] bpf: fix potential race in tail call compatibility check

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <878rygbspu.fsf@toke.dk>
Date:   Tue, 26 Oct 2021 01:16:45 +0200
From:   Toke Høiland-Jørgensen <toke@...hat.com>
To:     Daniel Borkmann <daniel@...earbox.net>,
        Lorenzo Bianconi <lorenzo.bianconi@...hat.com>
Cc:     Alexei Starovoitov <ast@...nel.org>, netdev@...r.kernel.org,
        bpf@...r.kernel.org
Subject: Re: [PATCH bpf v2] bpf: fix potential race in tail call
 compatibility check

Daniel Borkmann <daniel@...earbox.net> writes:

> On 10/25/21 4:28 PM, Lorenzo Bianconi wrote:
>>> Lorenzo noticed that the code testing for program type compatibility of
>>> tail call maps is potentially racy in that two threads could encounter a
>>> map with an unset type simultaneously and both return true even though they
>>> are inserting incompatible programs.
>>>
>>> The race window is quite small, but artificially enlarging it by adding a
>>> usleep_range() inside the check in bpf_prog_array_compatible() makes it
>>> trivial to trigger from userspace with a program that does, essentially:
>>>
>>>          map_fd = bpf_create_map(BPF_MAP_TYPE_PROG_ARRAY, 4, 4, 2, 0);
>>>          pid = fork();
>>>          if (pid) {
>>>                  key = 0;
>>>                  value = xdp_fd;
>>>          } else {
>>>                  key = 1;
>>>                  value = tc_fd;
>>>          }
>>>          err = bpf_map_update_elem(map_fd, &key, &value, 0);
>>>
>>> While the race window is small, it has potentially serious ramifications in
>>> that triggering it would allow a BPF program to tail call to a program of a
>>> different type. So let's get rid of it by protecting the update with a
>>> spinlock. The commit in the Fixes tag is the last commit that touches the
>>> code in question.
>>>
>>> v2:
>>> - Use a spinlock instead of an atomic variable and cmpxchg() (Alexei)
>>>
>>> Fixes: 3324b584b6f6 ("ebpf: misc core cleanup")
>>> Reported-by: Lorenzo Bianconi <lorenzo.bianconi@...hat.com>
>>> Signed-off-by: Toke Høiland-Jørgensen <toke@...hat.com>
>>> ---
>>>   include/linux/bpf.h   |  1 +
>>>   kernel/bpf/arraymap.c |  1 +
>>>   kernel/bpf/core.c     | 14 ++++++++++----
>>>   kernel/bpf/syscall.c  |  2 ++
>>>   4 files changed, 14 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
>>> index 020a7d5bf470..98d906176d89 100644
>>> --- a/include/linux/bpf.h
>>> +++ b/include/linux/bpf.h
>>> @@ -929,6 +929,7 @@ struct bpf_array_aux {
>>>   	 * stored in the map to make sure that all callers and callees have
>>>   	 * the same prog type and JITed flag.
>>>   	 */
>>> +	spinlock_t type_check_lock;
>> 
>> I was wondering if we can use a mutex instead of a spinlock here since it is
>> run from a syscall AFAIU. The only downside is mutex_lock is run inside
>> aux->used_maps_mutex critical section. Am I missing something?
>
> Hm, potentially it could work, but then it's also 32 vs 4 extra bytes. There's
> also poke_mutex or freeze_mutex, but feels to hacky to 'generalize for reuse',
> so I think the spinlock in bpf_array_aux is fine.
>
>>>   	enum bpf_prog_type type;
>>>   	bool jited;
>>>   	/* Programs with direct jumps into programs part of this array. */
>>> diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c
>>> index cebd4fb06d19..da9b1e96cadc 100644
>>> --- a/kernel/bpf/arraymap.c
>>> +++ b/kernel/bpf/arraymap.c
>>> @@ -1072,6 +1072,7 @@ static struct bpf_map *prog_array_map_alloc(union bpf_attr *attr)
>>>   	INIT_WORK(&aux->work, prog_array_map_clear_deferred);
>>>   	INIT_LIST_HEAD(&aux->poke_progs);
>>>   	mutex_init(&aux->poke_mutex);
>>> +	spin_lock_init(&aux->type_check_lock);
>
> Just as a tiny nit, I would probably name it slightly different, since type_check_lock
> mainly refers to the type property but there's also jit vs non-jit and as pointed out
> there could be other extensions that need checking in future as well. Maybe 'compat_lock'
> would be a more generic one or just:
>
>          struct {
>                  enum bpf_prog_type type;
>                  bool jited;
>                  spinlock_t lock;
>          } owner;

Uh, I like that! Makes it easier to move as well (which we're doing as
part of the xdp_mb series). Will send a v3 with this :)

-Toke