[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <3cc841af-d950-c961-7865-a5694eb5c618@iogearbox.net>
Date: Fri, 18 May 2018 15:43:53 +0200
From: Daniel Borkmann <daniel@...earbox.net>
To: Alexei Starovoitov <ast@...com>,
Björn Töpel <bjorn.topel@...il.com>,
Alexei Starovoitov <alexei.starovoitov@...il.com>
Cc: "Karlsson, Magnus" <magnus.karlsson@...el.com>,
"Duyck, Alexander H" <alexander.h.duyck@...el.com>,
Alexander Duyck <alexander.duyck@...il.com>,
John Fastabend <john.fastabend@...il.com>,
Jesper Dangaard Brouer <brouer@...hat.com>,
Willem de Bruijn <willemdebruijn.kernel@...il.com>,
"Michael S. Tsirkin" <mst@...hat.com>,
Netdev <netdev@...r.kernel.org>,
Björn Töpel <bjorn.topel@...el.com>,
michael.lundkvist@...csson.com,
"Brandeburg, Jesse" <jesse.brandeburg@...el.com>,
"Singhai, Anjali" <anjali.singhai@...el.com>,
"Zhang, Qi Z" <qi.z.zhang@...el.com>
Subject: Re: [PATCH bpf-next v3 00/15] Introducing AF_XDP support
On 05/18/2018 05:38 AM, Alexei Starovoitov wrote:
> On 5/16/18 11:46 PM, Björn Töpel wrote:
>> 2018-05-04 1:38 GMT+02:00 Alexei Starovoitov <alexei.starovoitov@...il.com>:
>>> On Fri, May 04, 2018 at 12:49:09AM +0200, Daniel Borkmann wrote:
>>>> On 05/02/2018 01:01 PM, Björn Töpel wrote:
>>>>> From: Björn Töpel <bjorn.topel@...el.com>
>>>>>
>>>>> This patch set introduces a new address family called AF_XDP that is
>>>>> optimized for high performance packet processing and, in upcoming
>>>>> patch sets, zero-copy semantics. In this patch set, we have removed
>>>>> all zero-copy related code in order to make it smaller, simpler and
>>>>> hopefully more review friendly. This patch set only supports copy-mode
>>>>> for the generic XDP path (XDP_SKB) for both RX and TX and copy-mode
>>>>> for RX using the XDP_DRV path. Zero-copy support requires XDP and
>>>>> driver changes that Jesper Dangaard Brouer is working on. Some of his
>>>>> work has already been accepted. We will publish our zero-copy support
>>>>> for RX and TX on top of his patch sets at a later point in time.
>>>>
>>>> +1, would be great to see it land this cycle. Saw few minor nits here
>>>> and there but nothing to hold it up, for the series:
>>>>
>>>> Acked-by: Daniel Borkmann <daniel@...earbox.net>
>>>>
>>>> Thanks everyone!
>>>
>>> Great stuff!
>>>
>>> Applied to bpf-next, with one condition.
>>> Upcoming zero-copy patches for both RX and TX need to be posted
>>> and reviewed within this release window.
>>> If netdev community as a whole won't be able to agree on the zero-copy
>>> bits we'd need to revert this feature before the next merge window.
>>>
>>> Few other minor nits:
>>> patch 3:
>>> +struct xdp_ring {
>>> + __u32 producer __attribute__((aligned(64)));
>>> + __u32 consumer __attribute__((aligned(64)));
>>> +};
>>> It kinda begs for ____cacheline_aligned_in_smp to be introduced for uapi headers.
>>
>> Hmm, I need some guidance on what a sane uapi variant would be. We
>> can't have the uapi depend on the kernel build. ARM64, e.g., can have
>> both 64B and 128B according to the specs. Contemporary IA processors
>> have 64B.
>>
>> The simplest, and maybe most future-proof, would be 128B aligned for
>> all. Another is having 128B for ARM and 64B for all IA. A third option
>> is having a hand-shaking API (I think virtio has that) for determine
>> the cache line size, but I'd rather not go down that route.
>>
>> Thoughts/ideas on how a uapi ____cacheline_aligned_in_smp version
>> would look like?
>
> I suspect i40e+arm combination wasn't tested anyway.
> The api may have endianness issues too on something like sparc.
> I think the way to be backwards compatible in this area
> is to make the api usable on x86 only by adding
> to include/uapi/linux/if_xdp.h
> #if defined(__x86_64__)
> #define AF_XDP_CACHE_BYTES 64
> #else
> #error "AF_XDP support is not yet available for this architecture"
> #endif
> and doing:
> __u32 producer __attribute__((aligned(AF_XDP_CACHE_BYTES)));
> __u32 consumer __attribute__((aligned(AF_XDP_CACHE_BYTES)));
>
> And progressively add to this for arm64 and few other archs.
> Eventually removing #error and adding some generic define
> that's good enough for long tail of architectures that
> we really cannot test.
Been looking into this yesterday as well a bit, and it's a bit of a mess what
uapi headers do on this regard (though there are just a handful of such headers).
Some of the kernel uapi headers hard-code generally 64 bytes regardless of the
underlying arch. In general, the kernel does expose it to user space via sysfs
(coherency_line_size). Here's what perf does to retrieve it:
#ifdef _SC_LEVEL1_DCACHE_LINESIZE
#define cache_line_size(cacheline_sizep) *cacheline_sizep = sysconf(_SC_LEVEL1_DCACHE_LINESIZE)
#else
static void cache_line_size(int *cacheline_sizep)
{
if (sysfs__read_int("devices/system/cpu/cpu0/cache/index0/coherency_line_size", cacheline_sizep))
pr_debug("cannot determine cache line size");
}
#endif
The sysconf() implementation for _SC_LEVEL1_DCACHE_LINESIZE seems also only
available for x86, arm64, s390 and ppc on a cursory glance in the glibc code.
In the x86 case it retrieves the info from cpuid insn. In order to generically
use it in combination with the header you'd have some probe which would then
set this as a define before including the header.
Then projects like urcu, they do ...
#define ____cacheline_internodealigned_in_smp \
__attribute__((__aligned__(CAA_CACHE_LINE_SIZE)))
... and then hard code CAA_CACHE_LINE_SIZE for x86 (== 128), s390 (== 128),
ppc (== 256) and sparc64 (== 256) with a generic fallback to 64.
Hmm, perhaps a combination of the two would make sense where in case of known
cacheline size it can still be used and we only have the fallback in such way.
Like:
#ifndef XDP_CACHE_BYTES
# if defined(__x86_64__)
# define XDP_CACHE_BYTES 64
# else
# error "Please define XDP_CACHE_BYTES for this architecture!"
# endif
#endif
Too bad there's no asm uapi header at least for the archs where it's fixed
anyway such that not every project out there has to redefine all of it from
scratch and we could just include it (and the generic-asm one would throw
a compile error if it's not externally defined or such).
Cheers,
Daniel
Powered by blists - more mailing lists