netdev - Re: [PATCH v3 bpf-next 1/8] bpf: Introduce bpf timers.

Message-ID: <de1204cc-8c20-0e09-8880-e39c9ee6d889@fb.com>
Date:   Fri, 25 Jun 2021 07:57:37 -0700
From:   Alexei Starovoitov <ast@...com>
To:     Yonghong Song <yhs@...com>,
        Alexei Starovoitov <alexei.starovoitov@...il.com>,
        <davem@...emloft.net>
CC:     <daniel@...earbox.net>, <andrii@...nel.org>,
        <netdev@...r.kernel.org>, <bpf@...r.kernel.org>,
        <kernel-team@...com>
Subject: Re: [PATCH v3 bpf-next 1/8] bpf: Introduce bpf timers.

On 6/24/21 11:25 PM, Yonghong Song wrote:
> 
>> +
>> +    ____bpf_spin_lock(&timer->lock);
> 
> I think we may still have some issues.
> Case 1:
>    1. one bpf program is running in process context,
>       bpf_timer_start() is called and timer->lock is taken
>    2. timer softirq is triggered and this callback is called

____bpf_spin_lock is actually the irqsave version of spin_lock,
so this race is not possible.
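
A userspace analogy of the irqsave argument above, offered only as a sketch: blocking a signal stands in for disabling interrupts, and a signal handler stands in for the timer softirq. The names fake_lock_irqsave(), fake_unlock_irqrestore() and fake_irq_handler() are hypothetical and are not part of the patch. The "interrupt" raised while the lock is held stays pending until the unlock restores the old mask.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Counts how many times the stand-in "interrupt handler" has run. */
static volatile sig_atomic_t handler_runs;

static void fake_irq_handler(int signo)
{
	(void)signo;
	handler_runs++;
}

/* Hypothetical stand-in for spin_lock_irqsave(): block SIGUSR1 and
 * return the previous mask so that it can be restored on unlock.
 */
static sigset_t fake_lock_irqsave(void)
{
	sigset_t block, old;

	sigemptyset(&block);
	sigaddset(&block, SIGUSR1);
	sigprocmask(SIG_BLOCK, &block, &old);
	return old;
}

/* Hypothetical stand-in for spin_unlock_irqrestore(). */
static void fake_unlock_irqrestore(sigset_t old)
{
	sigprocmask(SIG_SETMASK, &old, NULL);
}

/* Returns the handler run count observed inside the critical section. */
static int run_critical_section(void)
{
	struct sigaction sa = { 0 };
	sigset_t old;
	int seen_inside;

	sa.sa_handler = fake_irq_handler;
	sigemptyset(&sa.sa_mask);
	sigaction(SIGUSR1, &sa, NULL);

	handler_runs = 0;
	old = fake_lock_irqsave();
	raise(SIGUSR1);              /* "interrupt" fires while the lock is held */
	seen_inside = handler_runs;  /* handler has not run: delivery is deferred */
	fake_unlock_irqrestore(old); /* the pending signal is delivered here */
	assert(handler_runs == 1);
	return seen_inside;
}
```

Calling run_critical_section() should report that the handler did not run inside the critical section; it only runs once the mask is restored, which mirrors why the callback cannot interrupt a lock holder on the same cpu.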

> Case 2:
>    1. this callback is called, timer->lock is taken
>    2. a nmi happens and some bpf program is called (kprobe, tracepoint,
>       fentry/fexit or perf_event, etc.) and that program calls
>       bpf_timer_start()
> 
> So we could have deadlock in both above cases?

Shouldn't be possible either, because bpf timers are not allowed
in nmi-bpf-progs. I'll double check that that's the case.
bpf_spin_lock has pretty much the same restrictions.

> 
>> +    /* callback_fn and prog need to match. They're updated together
>> +     * and have to be read under lock.
>> +     */
>> +    prog = t->prog;
>> +    callback_fn = t->callback_fn;
>> +
>> +    /* wrap bpf subprog invocation with prog->refcnt++ and -- to make
>> +     * sure that refcnt doesn't become zero when subprog is executing.
>> +     * Do it under lock to make sure that bpf_timer_start doesn't drop
>> +     * prev prog refcnt to zero before timer_cb has a chance to bump it.
>> +     */
>> +    bpf_prog_inc(prog);
>> +    ____bpf_spin_unlock(&timer->lock);
>> +
>> +    /* bpf_timer_cb() runs in hrtimer_run_softirq. It doesn't migrate and
>> +     * cannot be preempted by another bpf_timer_cb() on the same cpu.
>> +     * Remember the timer this callback is servicing to prevent
>> +     * deadlock if callback_fn() calls bpf_timer_cancel() on the same timer.
>> +     */
>> +    this_cpu_write(hrtimer_running, t);
> 
> This is not protected by spinlock, in bpf_timer_cancel() and
> bpf_timer_cancel_and_free(), we have spinlock protected read, so
> there is potential race conditions if callback function and 
> helper/bpf_timer_cancel_and_free run in different context?

What kind of race do you see?
This is a per-cpu var and bpf_timer_cb runs in softirq,
while timer_cancel/cancel_and_free read it under
spin_lock_irqsave... so they cannot race, because the softirq
and bpf_timer_cb will run only after start/cancel/cancel_and_free
do unlock_irqrestore.
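
A minimal userspace model of the hrtimer_running guard discussed above, as a sketch only. A thread-local pointer plays the role of the per-cpu variable, and struct fake_timer, fake_timer_cb(), fake_timer_cancel() and cancel_self_cb() are hypothetical names, not the patch's code. Returning -EDEADLK from the cancel path is a choice made for this sketch; the quoted hunk only shows that hrtimer_cancel() is skipped when the timer is the one being serviced.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct fake_timer {
	void (*callback_fn)(struct fake_timer *t);
	int cancel_result; /* what the callback's cancel attempt returned */
};

/* Stand-in for the per-cpu hrtimer_running variable: one slot per
 * thread here, where the kernel has one slot per cpu.
 */
static _Thread_local struct fake_timer *timer_running;

/* Model of the cancel path: never wait for a callback that the
 * caller itself is currently executing, since that wait cannot end.
 */
static int fake_timer_cancel(struct fake_timer *t)
{
	if (timer_running == t)
		return -EDEADLK; /* sketch's choice of error code */
	/* A real implementation would wait for the callback here. */
	return 0;
}

/* Model of bpf_timer_cb(): remember which timer is being serviced
 * for the duration of the callback, then clear the marker.
 */
static void fake_timer_cb(struct fake_timer *t)
{
	assert(timer_running == NULL);
	timer_running = t;
	t->callback_fn(t);
	timer_running = NULL;
}

/* A callback that tries to cancel its own timer. */
static void cancel_self_cb(struct fake_timer *t)
{
	t->cancel_result = fake_timer_cancel(t);
}
```

In this model, a timer whose callback_fn is cancel_self_cb records -EDEADLK when run through fake_timer_cb(), while calling fake_timer_cancel() on the same timer outside the callback returns 0.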

>> +    prev = t->prog;
>> +    if (prev != prog) {
>> +        if (prev)
>> +            /* Drop pref prog refcnt when swapping with new prog */
> 
> pref -> prev
> 
>> +            bpf_prog_put(prev);
> 
> Maybe we want to put the above two lines with {}?

You mean add {} because there is a comment?
I don't think the kernel coding style considers a comment to be a statement.

>> +    if (this_cpu_read(hrtimer_running) != t)
>> +        hrtimer_cancel(&t->timer);
> 
> We could still have race conditions here when 
> bpf_timer_cancel_and_free() runs in process context and callback in
> softirq context. I guess we might be okay.

No, since this check is under spin_lock_irqsave.

> But if bpf_timer_cancel_and_free() in nmi context, not 100% sure
> whether we have issues or not.

Timers shouldn't be available to nmi-bpf progs.
There would be all sorts of issues.
The underlying hrtimer implementation cannot deal with nmi either.