Message-ID: <20160114011953.GA43324@ast-mbp.thefacebook.com>
Date: Wed, 13 Jan 2016 17:19:54 -0800
From: Alexei Starovoitov <alexei.starovoitov@...il.com>
To: Ming Lei <tom.leiming@...il.com>
Cc: Martin KaFai Lau <kafai@...com>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Alexei Starovoitov <ast@...nel.org>,
"David S. Miller" <davem@...emloft.net>,
Network Development <netdev@...r.kernel.org>,
Daniel Borkmann <daniel@...earbox.net>
Subject: Re: [PATCH 5/9] bpf: syscall: add percpu version of lookup/update elem

On Wed, Jan 13, 2016 at 10:56:38PM +0800, Ming Lei wrote:
> On Wed, Jan 13, 2016 at 1:30 PM, Alexei Starovoitov
> <alexei.starovoitov@...il.com> wrote:
> > On Wed, Jan 13, 2016 at 11:17:23AM +0800, Ming Lei wrote:
> >> On Wed, Jan 13, 2016 at 10:22 AM, Martin KaFai Lau <kafai@...com> wrote:
> >> > On Wed, Jan 13, 2016 at 08:38:18AM +0800, Ming Lei wrote:
> >> >> > The userspace usually only aggregates the value across all cpus every X seconds.
> >> >>
> >> >> That is just in your case, and Alexei was worried about the issue of stale data.
> >> > I believe we are talking about the validity of a value. How can we
> >> > make use of less-stale but invalid data?
> >>
> >> About the 'invalidity' thing, it should be the same whether we use
> >> smp_call() (which runs in the IPI irq handler) or a simple memcpy().
> >>
> >> When smp_call_function_single() is used to look up an element on a
> >> specific CPU, the eBPF prog on that CPU may be in the middle of updating
> >> the element's value; the IPI can arrive before the update completes, and
> >> half-updated data is still returned to the syscall.
> >
> > hmm. I'm not following. bpf programs execute with preempt disabled,
> > so smp_call_function_single() is supposed to run when bpf is not running.
>
> Preempt disabled doesn't mean irq disabled, does it? So while the bpf prog
> is running, the IPI irq for smp_call may still arrive on that CPU.
In case of kprobes irqs are disabled, but yeah, for sockets smp_call won't help.
We could probably use schedule_work_on(), but that's too heavy.
I guess we need a bpf_map_lookup_and_delete_elem() syscall command, so we can
delete a single pointer out of the per-cpu hash map and copy precise counters
in the call_rcu() callback.
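
Roughly what that could look like from the user side. This is only a sketch:
the command does not exist yet, so the command number and the exact attr
usage below are assumptions for illustration.

/* Hypothetical userspace helper for the proposed command. The command
 * number, and when exactly the precise counters get copied (call_rcu()
 * callback vs. after a grace period), are assumptions for illustration.
 */
#include <linux/bpf.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* the command does not exist yet; placeholder number so the sketch builds */
#define BPF_MAP_LOOKUP_AND_DELETE_ELEM_SKETCH 100

static uint64_t ptr_to_u64(const void *ptr)
{
        return (uint64_t)(unsigned long)ptr;
}

static int bpf_lookup_and_delete_elem(int map_fd, const void *key, void *value)
{
        union bpf_attr attr;

        memset(&attr, 0, sizeof(attr));
        attr.map_fd = map_fd;
        attr.key = ptr_to_u64(key);
        attr.value = ptr_to_u64(value);

        /* intended semantics: unlink the element from the per-cpu hash so
         * no prog can keep updating it, then copy out precise counters */
        return syscall(__NR_bpf, BPF_MAP_LOOKUP_AND_DELETE_ELEM_SKETCH,
                       &attr, sizeof(attr));
}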
> Also, in the current non-percpu hash the same situation exists between the
> lookup-elem syscall and a bpf prog updating the element's value on SMP.
looks like the regular bpf_map_lookup_elem() syscall will return inaccurate
data even for the per-cpu hash. hmm. we need to brainstorm more on it.
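
To make the inaccuracy concrete, here is a tiny userspace-only sketch;
pthreads stand in for the bpf prog and the lookup-side memcpy, and the
struct, names and numbers are all made up.

/* Two plain stores on the "prog" side vs. a field-by-field copy on the
 * "lookup" side: the copied pair can mix fields from two different
 * updates. Build with: cc -O2 -pthread torn.c
 */
#include <pthread.h>
#include <stdio.h>

struct counters {                       /* made-up map value layout */
        unsigned long pkts;
        unsigned long bytes;
};

static volatile struct counters elem;   /* stands in for one per-cpu slot */

static void *prog(void *arg)            /* models the bpf prog: preempt off, irqs on */
{
        for (unsigned long i = 1; i <= 100000000UL; i++) {
                elem.pkts = i;          /* two stores, not atomic as a pair */
                elem.bytes = i * 1500;
        }
        return NULL;
}

static void *lookup(void *arg)          /* models the lookup-side copy */
{
        unsigned long torn = 0;

        for (int n = 0; n < 10000000; n++) {
                unsigned long p = elem.pkts;   /* copy field by field, like memcpy */
                unsigned long b = elem.bytes;
                if (b != p * 1500)             /* pair spans two updates */
                        torn++;
        }
        printf("%lu of 10000000 copies were inconsistent\n", torn);
        return NULL;
}

int main(void)
{
        pthread_t t1, t2;

        pthread_create(&t1, NULL, prog, NULL);
        pthread_create(&t2, NULL, lookup, NULL);
        pthread_join(t2, NULL);
        pthread_join(t1, NULL);
        return 0;
}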