[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <19c7c7f3-2db5-6f90-648f-78e8da92d862@mellanox.com>
Date: Wed, 13 Dec 2017 15:03:33 -0800
From: Saeed Mahameed <saeedm@...lanox.com>
To: Jesper Dangaard Brouer <brouer@...hat.com>,
Tariq Toukan <tariqt@...lanox.com>
Cc: Daniel Borkmann <borkmann@...earbox.net>,
Alexei Starovoitov <alexei.starovoitov@...il.com>,
netdev@...r.kernel.org, dsahern@...il.com,
Matan Barak <matanb@...lanox.com>, gospo@...adcom.com,
bjorn.topel@...el.com, michael.chan@...adcom.com
Subject: Re: [bpf-next V1-RFC PATCH 02/14] xdp/mlx5: setup xdp_rxq_info and
extend with qtype
On 12/13/2017 5:44 AM, Jesper Dangaard Brouer wrote:
> On Wed, 13 Dec 2017 14:27:08 +0200
> Tariq Toukan <tariqt@...lanox.com> wrote:
>
>> Hi Jesper,
>> Thanks for taking care of the drop RQ.
>>
>> In general, mlx5 part looks ok to me.
>> Find a few comments below. Mostly pointing out some typos.
>>
>> On 13/12/2017 1:19 PM, Jesper Dangaard Brouer wrote:
>>> The mlx5 driver have a special drop-RQ queue (one per interface) that
>>> simply drops all incoming traffic. It helps driver keep other HW
>>> objects (flow steering) alive upon down/up operations. It is
>>> temporarily pointed by flow steering objects during the interface
>>> setup, and when interface is down. It lacks many fields that are set
>>> in a regular RQ (for example its state is never switched to
>>> MLX5_RQC_STATE_RDY). (Thanks to Tariq Toukan for explaination).
>> typo: explanation
>
> Fixed
>
>>>
>>> The XDP RX-queue info API is extended with a queue-type, and mlx5 uses
>>> this kind of drop/sink-type (RXQ_TYPE_SINK) for this kind of sink queue.
>>>
>>> Driver hook points for xdp_rxq_info:
>>> * init+reg: mlx5e_alloc_rq()
>>> * init+reg: mlx5e_alloc_drop_rq()
>>> * unreg : mlx5e_free_rq()
>>>
>>> Tested on actual hardware with samples/bpf program
>>>
>>> Cc: Saeed Mahameed <saeedm@...lanox.com>
>>> Cc: Matan Barak <matanb@...lanox.com>
>>> Cc: Tariq Toukan <tariqt@...lanox.com>
>>> Signed-off-by: Jesper Dangaard Brouer <brouer@...hat.com>
>>> ---
>>> drivers/net/ethernet/mellanox/mlx5/core/en.h | 4 ++++
>>> drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 14 +++++++++++++
>>> drivers/net/ethernet/mellanox/mlx5/core/en_rx.c | 1 +
>>> include/net/xdp.h | 23 +++++++++++++++++++++
>>> net/core/xdp.c | 6 +++++
>>> 5 files changed, 48 insertions(+)
>>>
>>> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
>>> index c0872b3284cb..fe10a042783b 100644
>>> --- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
>>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
>>> @@ -46,6 +46,7 @@
>>> #include <linux/mlx5/transobj.h>
>>> #include <linux/rhashtable.h>
>>> #include <net/switchdev.h>
>>> +#include <net/xdp.h>
>>> #include "wq.h"
>>> #include "mlx5_core.h"
>>> #include "en_stats.h"
>>> @@ -568,6 +569,9 @@ struct mlx5e_rq {
>>> u32 rqn;
>>> struct mlx5_core_dev *mdev;
>>> struct mlx5_core_mkey umr_mkey;
>>> +
>>> + /* XDP read-mostly */
>>> + struct xdp_rxq_info xdp_rxq;
>>> } ____cacheline_aligned_in_smp;
>>>
>>> struct mlx5e_channel {
>>> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
>>> index 0f5c012de52e..ea44b5f25e11 100644
>>> --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
>>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
>>> @@ -582,6 +582,12 @@ static int mlx5e_alloc_rq(struct mlx5e_channel *c,
>>> rq->ix = c->ix;
>>> rq->mdev = mdev;
>>>
>>> + /* XDP RX-queue info */
>>> + xdp_rxq_info_init(&rq->xdp_rxq);
>>> + rq->xdp_rxq.dev = rq->netdev;
>>> + rq->xdp_rxq.queue_index = rq->ix;
>>> + xdp_rxq_info_reg(&rq->xdp_rxq);
>>> +
See my comment below and my comment on patch #12 I believe we can reduce
the amount of code duplication, and have a more generic way to register
XDP RXQs, without the need for drivers to take care of xdp_rxq_info
declaration and handling.
>> You don't set type here. This is ok as long as the following hold:
>> 1) RXQ_TYPE_DEFAULT is zero
>
> True
>
>> 2) xdp_rxq is zalloc'ed.
>
> xdp_rxq memory area is part of rq allocation, but in
> xdp_rxq_info_init() I memset/zero the area explicit.
>
>
>>> rq->xdp_prog = params->xdp_prog ?
>>> bpf_prog_inc(params->xdp_prog) : NULL; if (IS_ERR(rq->xdp_prog)) {
>>> err = PTR_ERR(rq->xdp_prog);
>>> @@ -695,6 +701,7 @@ static int mlx5e_alloc_rq(struct mlx5e_channel
>>> *c, err_rq_wq_destroy:
>>> if (rq->xdp_prog)
>>> bpf_prog_put(rq->xdp_prog);
>>> + xdp_rxq_info_unreg(&rq->xdp_rxq);
>>> mlx5_wq_destroy(&rq->wq_ctrl);
>>>
>>> return err;
>>> @@ -707,6 +714,8 @@ static void mlx5e_free_rq(struct mlx5e_rq *rq)
>>> if (rq->xdp_prog)
>>> bpf_prog_put(rq->xdp_prog);
>>>
>>> + xdp_rxq_info_unreg(&rq->xdp_rxq);
>>> +
>>> switch (rq->wq_type) {
>>> case MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ:
>>> mlx5e_rq_free_mpwqe_info(rq);
>>> @@ -2768,6 +2777,11 @@ static int mlx5e_alloc_drop_rq(struct
>>> mlx5_core_dev *mdev, if (err)
>>> return err;
>>>
>>> + /* XDP RX-queue info for "Drop-RQ", packets never reach
>>> XDP */
>>> + xdp_rxq_info_init(&rq->xdp_rxq);
>>> + xdp_rxq_info_type(&rq->xdp_rxq, RXQ_TYPE_SINK);
>>> + xdp_rxq_info_reg(&rq->xdp_rxq);
>>> +
I don't see why you need this, This RQ is not even assigned to any
netdev_rxq! it is a pure HW object that drops traffic in HW when netdev
is down, it even has no buffers or napi handling, just ignore it's
existence for the sake of mlx5 xdp_rxq_info reg/unreg stuff and remove
RXQ_TYPE_SINK, bottom line it is not a real RQ and for sure XDP has
nothing to do with it.
>>> rq->mdev = mdev;
>>>
>>> return 0;
>>> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
>>> b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c index
>>> 5b499c7a698f..7b38480811d4 100644 ---
>>> a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c +++
>>> b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c @@ -812,6 +812,7
>>> @@ static inline int mlx5e_xdp_handle(struct mlx5e_rq *rq,
>>> xdp_set_data_meta_invalid(&xdp); xdp.data_end = xdp.data + *len;
>>> xdp.data_hard_start = va;
>>> + xdp.rxq = &rq->xdp_rxq;
>>>
>>> act = bpf_prog_run_xdp(prog, &xdp);
>>> switch (act) {
>>> diff --git a/include/net/xdp.h b/include/net/xdp.h
>>> index e4acd198fd60..5be560d943e1 100644
>>> --- a/include/net/xdp.h
>>> +++ b/include/net/xdp.h
>>> @@ -36,10 +36,33 @@ struct xdp_rxq_info {
>>> struct net_device *dev;
>>> u32 queue_index;
>>> u32 reg_state;
>>> + u32 qtype;
>>> } ____cacheline_aligned; /* perf critical, avoid false-sharing */
>>>
>>> void xdp_rxq_info_init(struct xdp_rxq_info *xdp_rxq);
>>> void xdp_rxq_info_reg(struct xdp_rxq_info *xdp_rxq);
>>> void xdp_rxq_info_unreg(struct xdp_rxq_info *xdp_rxq);
>>>
>>> +/**
>>> + * DOC: XDP RX-queue type
>>> + *
>>> + * The XDP RX-queue info can have associated a type.
>>> + *
>>> + * @RXQ_TYPE_DEFAULT: default no specifik queue type need to be
>>> specified
>>
>> typo: specific
>
> Thanks, this is a Danish typo (it's spelled that way in Danish).
>
>>> + *
>>> + * @RXQ_TYPE_SINK: indicate a fake queue that never reach XDP RX
>>> + * code. Some drivers have a need to maintain a lower layer
>>> + * RX-queue as a sink queue, while reconfiguring other
>>> RX-queues.
>>> + */
>>> +#define RXQ_TYPE_DEFAULT 0
>>> +#define RXQ_TYPE_SINK 1
>>> +#define RXQ_TYPE_MAX RXQ_TYPE_SINK
>>
>> Definitions of incremental numbers, enum might be best here, you can
>> give them some enum type and use it in xdp_rxq_info->qtype.
>
> I use defines to make the below BUILD_BUG_ON work, as enums does not
> get expanded to their values in the C-preprocessor stage.
>
>>> +
>>> +static inline
>>> +void xdp_rxq_info_type(struct xdp_rxq_info *xdp_rxq, u32 qtype)
>>> +{
>>> + BUILD_BUG_ON(qtype > RXQ_TYPE_MAX);
>>> + xdp_rxq->qtype = qtype;
>>> +}
>>> +
>>> #endif /* __LINUX_NET_XDP_H__ */
>>> diff --git a/net/core/xdp.c b/net/core/xdp.c
>>> index a9d2dd7b1ede..2a111f5987f6 100644
>>> --- a/net/core/xdp.c
>>> +++ b/net/core/xdp.c
>>> @@ -32,8 +32,14 @@ EXPORT_SYMBOL_GPL(xdp_rxq_info_init);
>>>
>>> void xdp_rxq_info_reg(struct xdp_rxq_info *xdp_rxq)
>>> {
>>> + if (xdp_rxq->qtype == RXQ_TYPE_SINK)
>>> + goto skip_content_check;
>>> +
>>> + /* Check information setup by driver code */
>>> WARN(!xdp_rxq->dev, "Missing net_device from driver");
>>> WARN(xdp_rxq->queue_index == U32_MAX, "Miss queue_index from driver"); +
>>> +skip_content_check:
>>> WARN(!(xdp_rxq->reg_state == REG_STATE_NEW),"API violation, miss init");
>>> xdp_rxq->reg_state = REG_STATE_REGISTRED;
>> typo: REGISTERED (introduced in a previous patch)
>
> Thanks for catching that! :-)
>
Powered by blists - more mailing lists