[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <98225fe7-c58c-cfbd-513d-de4a20c02e26@iogearbox.net>
Date: Thu, 11 Apr 2019 22:43:09 +0200
From: Daniel Borkmann <daniel@...earbox.net>
To: Magnus Karlsson <magnus.karlsson@...il.com>,
Y Song <ys114321@...il.com>
Cc: Magnus Karlsson <magnus.karlsson@...el.com>,
Björn Töpel <bjorn.topel@...el.com>,
Alexei Starovoitov <ast@...nel.org>,
netdev <netdev@...r.kernel.org>, bpf@...r.kernel.org,
bruce.richardson@...el.com, ciara.loftus@...el.com,
ilias.apalodimas@...aro.org, Ye Xiaolong <xiaolong.ye@...el.com>,
ferruh.yigit@...el.com, "Zhang, Qi Z" <qi.z.zhang@...el.com>,
georgmueller@....net
Subject: Re: [PATCH bpf v2 2/2] libbpf: remove dependency on barrier.h in
xsk.h
On 04/11/2019 10:23 PM, Daniel Borkmann wrote:
> On 04/11/2019 09:54 AM, Magnus Karlsson wrote:
>> On Wed, Apr 10, 2019 at 9:08 PM Y Song <ys114321@...il.com> wrote:
>>> On Wed, Apr 10, 2019 at 12:21 AM Magnus Karlsson
>>> <magnus.karlsson@...el.com> wrote:
>>>>
>>>> The use of smp_rmb() and smp_wmb() creates a Linux header dependency
>>>> on barrier.h that is uneccessary in most parts. This patch implements
>>>> the two small defines that are needed from barrier.h. As a bonus, the
>>>> new implementations are faster than the default ones as they default
>>>> to sfence and lfence for x86, while we only need a compiler barrier in
>>>> our case. Just as it is when the same ring access code is compiled in
>>>> the kernel.
>>>>
>>>> Fixes: 1cad07884239 ("libbpf: add support for using AF_XDP sockets")
>>>> Signed-off-by: Magnus Karlsson <magnus.karlsson@...el.com>
>>>> ---
>>>> tools/lib/bpf/xsk.h | 20 ++++++++++++++++++--
>>>> 1 file changed, 18 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/tools/lib/bpf/xsk.h b/tools/lib/bpf/xsk.h
>>>> index 3638147..69136d9 100644
>>>> --- a/tools/lib/bpf/xsk.h
>>>> +++ b/tools/lib/bpf/xsk.h
>>>> @@ -39,6 +39,22 @@ DEFINE_XSK_RING(xsk_ring_cons);
>>>> struct xsk_umem;
>>>> struct xsk_socket;
>>>>
>>>> +#if !defined bpf_smp_rmb && !defined bpf_smp_wmb
>>>
>>> Maybe add some comments to explain the different between bpf_smp_{r,w}mb
>>> and smp_{r,w}mb so later users will have a better idea which to pick?
>>
>> Ouch, that is a hard one. I would just recommend people to read
>> Documentation/memory-barriers.txt. My attempt at explaining all this
>> would not be pretty and likely sprinkled with errors ;-).
>
> I think Yonghong meant here place a comment wrt when to use the below versus
> when to use smp_{r,w}mb(). Both are essentially the same just that the main
> difference here would be that this header needs to be installed in the system
> so users need to have it. I think it indeed makes sense to add a comment about
> this specific fact otherwise we might forget about it in few months.
>
>>>> +# if defined(__i386__) || defined(__x86_64__)
>>>> +# define bpf_smp_rmb() asm volatile("" : : : "memory")
>>>> +# define bpf_smp_wmb() asm volatile("" : : : "memory")
>>>> +# elif defined(__aarch64__)
>>>> +# define bpf_smp_rmb() asm volatile("dmb ishld" : : : "memory")
>>>> +# define bpf_smp_wmb() asm volatile("dmb ishst" : : : "memory")
>>>> +# elif defined(__arm__)
>>>> +/* These are only valid for armv7 and above */
>>>> +# define bpf_smp_rmb() asm volatile("dmb ish" : : : "memory")
>>>> +# define bpf_smp_wmb() asm volatile("dmb ishst" : : : "memory")
>>>> +# else
>>>> +# error Architecture not supported by the XDP socket code in libbpf.
>>>> +# endif
>>>> +#endif
>>>
>>> Since this is generic enough and could be used by other files as well,
>>> maybe put it into libbpf_util.h?
>
> Hmm, maybe a good point. We could place it into libbpf.h as there is already
> various misc helpers and xsk.h includes it anyway. But: if we do that, then
> the above 'else' part would need some generic fallback (__sync_synchronize()
> plus a warning?) as otherwise compilation would break for everyone with 'error'.
> Ideally this should then cover as much as possible from mainstream archs though.
(And if so then prefixed with libbpf_smp_{r,w}mb() to denote it's misc libbpf
internal function.)
>> Good question. Do not know. Daniel suggested introducing [0] and
>> perhaps that can be used by the broader libbpf code base? The
>> important part for this patch set is that these operations match the
>> ones in the kernel on the other end of the ring.
>
> Yeah, it can be used generally except for headers that are going to be
> installed where these are present in inline helper functions.
>
>> [0] https://lore.kernel.org/netdev/20181017144156.16639-2-daniel@iogearbox.net/
>>
>>>> +
>>>> static inline __u64 *xsk_ring_prod__fill_addr(struct xsk_ring_prod *fill,
>>>> __u32 idx)
>>>> {
>>>> @@ -119,7 +135,7 @@ static inline void xsk_ring_prod__submit(struct xsk_ring_prod *prod, size_t nb)
>>>> /* Make sure everything has been written to the ring before signalling
>>>> * this to the kernel.
>>>> */
>>>> - smp_wmb();
>>>> + bpf_smp_wmb();
>>>>
>>>> *prod->producer += nb;
>>>> }
>>>> @@ -133,7 +149,7 @@ static inline size_t xsk_ring_cons__peek(struct xsk_ring_cons *cons,
>>>> /* Make sure we do not speculatively read the data before
>>>> * we have received the packet buffers from the ring.
>>>> */
>>>> - smp_rmb();
>>>> + bpf_smp_rmb();
>>>
>>> Could you explain why a compiler barrier is good enough here on x86? Note that
>>> the load cons->cached_cons could be reordered with earlier
>>> non-overlapping stores
>>> at runtime.
>>
>> The bpf_smp_rmb() is there to protect the data in the ring itself to
>> be read by the consumer before the producer has signaled that it has
>> finished “producing” them by updating the producer (head) pointer. As
>> stores are not reordered with other stores on x86 (nor loads with
>> other loads), the update of the producer pointer will always be
>> observed after the writing of the data in the ring, as that is done
>> before the update of the producer pointer in xsk_ring_prod__submit().
>> One side only updates and the other side only reads. cached_cons is a
>> local variable and only for operations done by another core can we
>> observe loads being reordered with older stores to different
>> locations. Since no one else is touching cached_cons, this will not
>> happen.
>
> From perf RB side, I found this one kernel/events/ring_buffer.c +72 to
> be very helpful. It's independent of this series, but I would appreciate
> if you could make similar scheme / comment somewhere in the AF_XDP code
> such that all barriers in there can be more easily followed wrt how they
> pair to user space.
>
> Thanks,
> Daniel
>
>> /Magnus
>>
>>>>
>>>> *idx = cons->cached_cons;
>>>> cons->cached_cons += entries;
>>>> --
>>>> 2.7.4
>>>>
>
Powered by blists - more mailing lists