Date:   Mon, 16 Nov 2020 15:37:27 -0500 (EST)
From:   Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To:     rostedt <rostedt@...dmis.org>, paulmck <paulmck@...nel.org>
Cc:     Matt Mullins <mmullins@...x.us>, Ingo Molnar <mingo@...hat.com>,
        Alexei Starovoitov <ast@...nel.org>,
        Daniel Borkmann <daniel@...earbox.net>,
        Dmitry Vyukov <dvyukov@...gle.com>,
        Martin KaFai Lau <kafai@...com>,
        Song Liu <songliubraving@...com>, Yonghong Song <yhs@...com>,
        Andrii Nakryiko <andriin@...com>,
        John Fastabend <john.fastabend@...il.com>,
        KP Singh <kpsingh@...omium.org>,
        linux-kernel <linux-kernel@...r.kernel.org>,
        netdev <netdev@...r.kernel.org>, bpf <bpf@...r.kernel.org>
Subject: Re: [PATCH] bpf: don't fail kmalloc while releasing raw_tp

----- On Nov 16, 2020, at 12:19 PM, rostedt rostedt@...dmis.org wrote:

> On Sat, 14 Nov 2020 21:52:55 -0800
> Matt Mullins <mmullins@...x.us> wrote:
> 
>> bpf_link_free is always called in process context, including from a
>> workqueue and from __fput.  Neither of these have the ability to
>> propagate an -ENOMEM to the caller.
>> 
> 
> Hmm, I think the real fix is to not have unregistering a tracepoint probe
> fail because of allocation. We are removing a probe, perhaps we could just
> inject NULL pointer that gets checked via the DO_TRACE loop?
> 
> I bet failing an unregister because of an allocation failure causes
> problems elsewhere than just BPF.
> 
> Mathieu,
> 
> Can't we do something that would still allow to unregister a probe even if
> a new probe array fails to allocate? We could kick off a irq work to try to
> clean up the probe at a later time, but still, the unregister itself should
> not fail due to memory failure.

Currently, the fast path iteration looks like:

                struct tracepoint_func *it_func_ptr;                    \
                void *it_func;                                          \
                                                                        \
                it_func_ptr =                                           \
                        rcu_dereference_raw((&__tracepoint_##_name)->funcs); \
                do {                                                    \
                        it_func = (it_func_ptr)->func;                  \
                        __data = (it_func_ptr)->data;                   \
                        ((void(*)(void *, proto))(it_func))(__data, args); \
                } while ((++it_func_ptr)->func);

So we RCU-dereference the array and iterate over it until we find a NULL
func. NULL already marks the end of the array, so you could not use it to
skip items, but you could perhaps reserve a (void *)0x1UL tombstone for this.
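
To make the tombstone idea concrete, here is a minimal userspace sketch of
what the iteration could look like. It is a model, not kernel code: the
TP_FUNC_TOMBSTONE name, the probe_fn signature, call_probes() and add_probe()
are all made up for illustration, and the RCU dereference is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's struct tracepoint_func. */
struct tracepoint_func {
	void *func;
	void *data;
};

/* Hypothetical marker for a removed probe; not an existing kernel macro. */
#define TP_FUNC_TOMBSTONE ((void *)0x1UL)

/* Simplified probe signature, assumed for this sketch only. */
typedef void (*probe_fn)(void *data, int arg);

/* Example probe: adds arg to the int that data points to. */
static void add_probe(void *data, int arg)
{
	*(int *)data += arg;
}

/*
 * Walk a NULL-terminated probe array and call every live probe.
 * Tombstoned slots are skipped; in the kernel this test would
 * presumably be wrapped in unlikely(). Returns the number of
 * probes that were actually called.
 */
static int call_probes(struct tracepoint_func *it_func_ptr, int arg)
{
	int called = 0;

	assert(it_func_ptr != NULL);
	for (; it_func_ptr->func; it_func_ptr++) {
		void *it_func = it_func_ptr->func;

		if (it_func == TP_FUNC_TOMBSTONE)
			continue;
		((probe_fn)it_func)(it_func_ptr->data, arg);
		called++;
	}
	return called;
}
```

Note that this sketch uses a for loop rather than the do/while of the current
fast path, so it also copes with an array whose first entry is the NULL
terminator. The extra compare per entry is the cost that would need to be
benchmarked.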

It should ideally be an unlikely branch, and it would be good to benchmark the
change when multiple tracing probes are attached to figure out whether the
overhead is significant when tracing is enabled.

I wonder whether we really mind that much about using slightly more memory
than required after a failed reallocation due to ENOMEM. Perhaps the irq work
is not even needed. Chances are that the irq work would fail again and again
while the system is low on memory. So maybe it's better to just keep the
tombstone in place until the next successful callback array reallocation.
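
As a rough illustration of that policy, the removal path could be modeled in
userspace as below. This is only a sketch under assumptions: remove_probe(),
the alloc_fn hook, failing_alloc() and TP_FUNC_TOMBSTONE are hypothetical
names, and the kernel side would additionally need RCU publication of the new
array, a grace period before freeing the old one, and handling of the case
where no live probe remains.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-in for the kernel's struct tracepoint_func. */
struct tracepoint_func {
	void *func;
	void *data;
};

/* Hypothetical marker for a removed probe; not an existing kernel macro. */
#define TP_FUNC_TOMBSTONE ((void *)0x1UL)

/* Allocator hook so that an allocation failure can be simulated. */
typedef void *(*alloc_fn)(size_t size);

/* Allocator that always fails, standing in for an ENOMEM condition. */
static void *failing_alloc(size_t size)
{
	(void)size;
	return NULL;
}

/*
 * Remove the (func, data) probe from the NULL-terminated array old.
 * If a new array can be allocated, return a compacted copy that also
 * drops tombstones left by earlier failed removals; the caller then
 * owns both arrays. If allocation fails, tombstone the entry in place
 * and return old, so the removal itself never fails.
 */
static struct tracepoint_func *
remove_probe(struct tracepoint_func *old, void *func, void *data,
	     alloc_fn alloc)
{
	struct tracepoint_func *new_funcs;
	size_t n, i, j = 0, live = 0;

	assert(old != NULL);
	for (n = 0; old[n].func; n++)
		if (old[n].func != TP_FUNC_TOMBSTONE &&
		    !(old[n].func == func && old[n].data == data))
			live++;

	new_funcs = alloc((live + 1) * sizeof(*new_funcs));
	if (!new_funcs) {
		/* Low memory: leave a tombstone until a later reallocation. */
		for (i = 0; i < n; i++)
			if (old[i].func == func && old[i].data == data)
				old[i].func = TP_FUNC_TOMBSTONE;
		return old;
	}

	for (i = 0; i < n; i++) {
		if (old[i].func == TP_FUNC_TOMBSTONE ||
		    (old[i].func == func && old[i].data == data))
			continue;
		new_funcs[j++] = old[i];
	}
	new_funcs[j].func = NULL;
	new_funcs[j].data = NULL;
	return new_funcs;
}
```

The point of the design is that the tombstones are garbage-collected for free
the next time any register or unregister manages to allocate a new array, so
no retry mechanism is needed.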

Thoughts?

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
