Message-ID: <19c7c7f3-2db5-6f90-648f-78e8da92d862@mellanox.com>
Date:   Wed, 13 Dec 2017 15:03:33 -0800
From:   Saeed Mahameed <saeedm@...lanox.com>
To:     Jesper Dangaard Brouer <brouer@...hat.com>,
        Tariq Toukan <tariqt@...lanox.com>
Cc:     Daniel Borkmann <borkmann@...earbox.net>,
        Alexei Starovoitov <alexei.starovoitov@...il.com>,
        netdev@...r.kernel.org, dsahern@...il.com,
        Matan Barak <matanb@...lanox.com>, gospo@...adcom.com,
        bjorn.topel@...el.com, michael.chan@...adcom.com
Subject: Re: [bpf-next V1-RFC PATCH 02/14] xdp/mlx5: setup xdp_rxq_info and
 extend with qtype



On 12/13/2017 5:44 AM, Jesper Dangaard Brouer wrote:
> On Wed, 13 Dec 2017 14:27:08 +0200
> Tariq Toukan <tariqt@...lanox.com> wrote:
> 
>> Hi Jesper,
>> Thanks for taking care of the drop RQ.
>>
>> In general, mlx5 part looks ok to me.
>> Find a few comments below. Mostly pointing out some typos.
>>
>> On 13/12/2017 1:19 PM, Jesper Dangaard Brouer wrote:
>>> The mlx5 driver have a special drop-RQ queue (one per interface) that
>>> simply drops all incoming traffic. It helps driver keep other HW
>>> objects (flow steering) alive upon down/up operations.  It is
>>> temporarily pointed by flow steering objects during the interface
>>> setup, and when interface is down. It lacks many fields that are set
>>> in a regular RQ (for example its state is never switched to
>>> MLX5_RQC_STATE_RDY). (Thanks to Tariq Toukan for explaination).
>> typo: explanation
> 
> Fixed
> 
>>>
>>> The XDP RX-queue info API is extended with a queue-type, and mlx5 uses
>>> this kind of drop/sink-type (RXQ_TYPE_SINK) for this kind of sink queue.
>>>
>>> Driver hook points for xdp_rxq_info:
>>>    * init+reg: mlx5e_alloc_rq()
>>>    * init+reg: mlx5e_alloc_drop_rq()
>>>    * unreg   : mlx5e_free_rq()
>>>
>>> Tested on actual hardware with samples/bpf program
>>>
>>> Cc: Saeed Mahameed <saeedm@...lanox.com>
>>> Cc: Matan Barak <matanb@...lanox.com>
>>> Cc: Tariq Toukan <tariqt@...lanox.com>
>>> Signed-off-by: Jesper Dangaard Brouer <brouer@...hat.com>
>>> ---
>>>    drivers/net/ethernet/mellanox/mlx5/core/en.h      |    4 ++++
>>>    drivers/net/ethernet/mellanox/mlx5/core/en_main.c |   14 +++++++++++++
>>>    drivers/net/ethernet/mellanox/mlx5/core/en_rx.c   |    1 +
>>>    include/net/xdp.h                                 |   23 +++++++++++++++++++++
>>>    net/core/xdp.c                                    |    6 +++++
>>>    5 files changed, 48 insertions(+)
>>>
>>> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
>>> index c0872b3284cb..fe10a042783b 100644
>>> --- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
>>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
>>> @@ -46,6 +46,7 @@
>>>    #include <linux/mlx5/transobj.h>
>>>    #include <linux/rhashtable.h>
>>>    #include <net/switchdev.h>
>>> +#include <net/xdp.h>
>>>    #include "wq.h"
>>>    #include "mlx5_core.h"
>>>    #include "en_stats.h"
>>> @@ -568,6 +569,9 @@ struct mlx5e_rq {
>>>    	u32                    rqn;
>>>    	struct mlx5_core_dev  *mdev;
>>>    	struct mlx5_core_mkey  umr_mkey;
>>> +
>>> +	/* XDP read-mostly */
>>> +	struct xdp_rxq_info xdp_rxq;
>>>    } ____cacheline_aligned_in_smp;
>>>    
>>>    struct mlx5e_channel {
>>> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
>>> index 0f5c012de52e..ea44b5f25e11 100644
>>> --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
>>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
>>> @@ -582,6 +582,12 @@ static int mlx5e_alloc_rq(struct mlx5e_channel *c,
>>>    	rq->ix      = c->ix;
>>>    	rq->mdev    = mdev;
>>>    
>>> +	/* XDP RX-queue info */
>>> +	xdp_rxq_info_init(&rq->xdp_rxq);
>>> +	rq->xdp_rxq.dev		= rq->netdev;
>>> +	rq->xdp_rxq.queue_index = rq->ix;
>>> +	xdp_rxq_info_reg(&rq->xdp_rxq);
>>> +

See my comment below and my comment on patch #12. I believe we can
reduce the amount of code duplication and have a more generic way to
register XDP RXQs, without the need for drivers to take care of
xdp_rxq_info declaration and handling.
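
To illustrate the idea of a more generic registration, here is a minimal
userspace sketch, not kernel code. It assumes the core embeds one
xdp_rxq_info per RX queue and exposes a single helper, so the driver
declares nothing itself. The names netdev_xdp_rxq_reg(),
netdev_xdp_rxq_unreg(), struct netdev_rx_queue as laid out here, and
MAX_RXQ are hypothetical and do not come from this patch set.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Userspace stand-ins for kernel types; layouts are illustrative only. */
enum { REG_STATE_NEW = 0, REG_STATE_REGISTERED, REG_STATE_UNREGISTERED };

struct net_device;

struct xdp_rxq_info {
	struct net_device *dev;
	uint32_t queue_index;
	uint32_t reg_state;
};

/* Hypothetical: the core, not the driver, owns one info per RX queue. */
struct netdev_rx_queue {
	struct xdp_rxq_info xdp_rxq;
};

#define MAX_RXQ 8 /* arbitrary bound for this sketch */

struct net_device {
	unsigned int num_rx_queues;
	struct netdev_rx_queue rx[MAX_RXQ];
};

/*
 * Hypothetical helper: init, fill and register in one call.
 * Returns the registered info, or NULL on a bad index or on
 * double registration.
 */
static struct xdp_rxq_info *netdev_xdp_rxq_reg(struct net_device *dev,
					       uint32_t index)
{
	struct xdp_rxq_info *rxq;

	if (!dev || index >= dev->num_rx_queues)
		return NULL;

	rxq = &dev->rx[index].xdp_rxq;
	if (rxq->reg_state == REG_STATE_REGISTERED)
		return NULL;

	memset(rxq, 0, sizeof(*rxq));
	rxq->dev = dev;
	rxq->queue_index = index;
	rxq->reg_state = REG_STATE_REGISTERED;
	return rxq;
}

/* Hypothetical counterpart: mark the queue info as unregistered. */
static void netdev_xdp_rxq_unreg(struct net_device *dev, uint32_t index)
{
	if (dev && index < dev->num_rx_queues)
		dev->rx[index].xdp_rxq.reg_state = REG_STATE_UNREGISTERED;
}
```

With something along these lines, a driver would only pass its netdev
and queue index, and the dev/queue_index assignments would no longer be
repeated in every driver.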

>> You don't set type here. This is ok as long as the following hold:
>> 1) RXQ_TYPE_DEFAULT is zero
> 
> True
> 
>> 2) xdp_rxq is zalloc'ed.
> 
> xdp_rxq memory area is part of rq allocation, but in
> xdp_rxq_info_init() I memset/zero the area explicit.
> 
>   
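
The two conditions above can be made explicit in code. The following is
a small userspace sketch, assuming a simplified copy of struct
xdp_rxq_info; the UINT32_MAX sentinel for queue_index is an assumption
inferred from the U32_MAX check in xdp_rxq_info_reg(), not something
shown in this patch.

```c
#include <assert.h> /* static_assert (C11) */
#include <stdint.h>
#include <string.h>

#define RXQ_TYPE_DEFAULT	0
#define RXQ_TYPE_SINK		1

/* Simplified userspace copy of the struct from the patch. */
struct xdp_rxq_info {
	void *dev;
	uint32_t queue_index;
	uint32_t reg_state;
	uint32_t qtype;
};

/* Condition 1: leaving qtype unset only works if the default is zero. */
static_assert(RXQ_TYPE_DEFAULT == 0, "implicit default type must be zero");

/*
 * Condition 2: the init helper zeroes the whole area, so it does not
 * matter whether the enclosing rq was zalloc'ed.
 */
static void xdp_rxq_info_init(struct xdp_rxq_info *xdp_rxq)
{
	memset(xdp_rxq, 0, sizeof(*xdp_rxq));
	/* Assumed sentinel meaning "driver did not set queue_index". */
	xdp_rxq->queue_index = UINT32_MAX;
}
```

The static_assert turns the first assumption into a build failure if
someone later renumbers the types.
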
>>>    	rq->xdp_prog = params->xdp_prog ? bpf_prog_inc(params->xdp_prog) : NULL;
>>>    	if (IS_ERR(rq->xdp_prog)) {
>>>    		err = PTR_ERR(rq->xdp_prog);
>>> @@ -695,6 +701,7 @@ static int mlx5e_alloc_rq(struct mlx5e_channel *c,
>>>    err_rq_wq_destroy:
>>>    	if (rq->xdp_prog)
>>>    		bpf_prog_put(rq->xdp_prog);
>>> +	xdp_rxq_info_unreg(&rq->xdp_rxq);
>>>    	mlx5_wq_destroy(&rq->wq_ctrl);
>>>    
>>>    	return err;
>>> @@ -707,6 +714,8 @@ static void mlx5e_free_rq(struct mlx5e_rq *rq)
>>>    	if (rq->xdp_prog)
>>>    		bpf_prog_put(rq->xdp_prog);
>>>    
>>> +	xdp_rxq_info_unreg(&rq->xdp_rxq);
>>> +
>>>    	switch (rq->wq_type) {
>>>    	case MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ:
>>>    		mlx5e_rq_free_mpwqe_info(rq);
>>> @@ -2768,6 +2777,11 @@ static int mlx5e_alloc_drop_rq(struct mlx5_core_dev *mdev,
>>>    	if (err)
>>>    		return err;
>>>    
>>> +	/* XDP RX-queue info for "Drop-RQ", packets never reach XDP */
>>> +	xdp_rxq_info_init(&rq->xdp_rxq);
>>> +	xdp_rxq_info_type(&rq->xdp_rxq, RXQ_TYPE_SINK);
>>> +	xdp_rxq_info_reg(&rq->xdp_rxq);
>>> +

I don't see why you need this. This RQ is not even assigned to any
netdev_rxq! It is a pure HW object that drops traffic in HW when the
netdev is down; it has no buffers and no napi handling. Just ignore its
existence for the sake of the mlx5 xdp_rxq_info reg/unreg stuff and
remove RXQ_TYPE_SINK. Bottom line: it is not a real RQ, and for sure XDP
has nothing to do with it.

>>>    	rq->mdev = mdev;
>>>    
>>>    	return 0;
>>> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
>>> index 5b499c7a698f..7b38480811d4 100644
>>> --- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
>>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
>>> @@ -812,6 +812,7 @@ static inline int mlx5e_xdp_handle(struct mlx5e_rq *rq,
>>>    	xdp_set_data_meta_invalid(&xdp);
>>>    	xdp.data_end = xdp.data + *len;
>>>    	xdp.data_hard_start = va;
>>> +	xdp.rxq = &rq->xdp_rxq;
>>>    
>>>    	act = bpf_prog_run_xdp(prog, &xdp);
>>>    	switch (act) {
>>> diff --git a/include/net/xdp.h b/include/net/xdp.h
>>> index e4acd198fd60..5be560d943e1 100644
>>> --- a/include/net/xdp.h
>>> +++ b/include/net/xdp.h
>>> @@ -36,10 +36,33 @@ struct xdp_rxq_info {
>>>    	struct net_device *dev;
>>>    	u32 queue_index;
>>>    	u32 reg_state;
>>> +	u32 qtype;
>>>    } ____cacheline_aligned; /* perf critical, avoid false-sharing */
>>>    
>>>    void xdp_rxq_info_init(struct xdp_rxq_info *xdp_rxq);
>>>    void xdp_rxq_info_reg(struct xdp_rxq_info *xdp_rxq);
>>>    void xdp_rxq_info_unreg(struct xdp_rxq_info *xdp_rxq);
>>>    
>>> +/**
>>> + * DOC: XDP RX-queue type
>>> + *
>>> + * The XDP RX-queue info can have associated a type.
>>> + *
>>> + * @RXQ_TYPE_DEFAULT: default no specifik queue type need to be specified
>>
>> typo: specific
> 
> Thanks, this is a Danish typo (it's spelled that way in Danish).
>   
>>> + *
>>> + * @RXQ_TYPE_SINK: indicate a fake queue that never reach XDP RX
>>> + *	code.  Some drivers have a need to maintain a lower layer
>>> + *	RX-queue as a sink queue, while reconfiguring other RX-queues.
>>> + */
>>> +#define RXQ_TYPE_DEFAULT	0
>>> +#define RXQ_TYPE_SINK		1
>>> +#define RXQ_TYPE_MAX		RXQ_TYPE_SINK
>>
>> Definitions of incremental numbers, enum might be best here, you can
>> give them some enum type and use it in xdp_rxq_info->qtype.
> 
> I use defines to make the below BUILD_BUG_ON work, as enums does not
> get expanded to their values in the C-preprocessor stage.
> 
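
For comparison, here is a minimal userspace sketch of the enum variant
suggested above. It is an assumption-laden illustration, not part of the
patch: enum constants are invisible to preprocessor #if, but in C they
are integer constant expressions, so a compiler-evaluated check such as
C11 static_assert can still use them. The macro name XDP_RXQ_INFO_TYPE
and the enum tag xdp_rxq_type are hypothetical.

```c
#include <assert.h> /* static_assert (C11) */
#include <stdint.h>

/* Hypothetical enum replacing the RXQ_TYPE_* defines. */
enum xdp_rxq_type {
	RXQ_TYPE_DEFAULT = 0,
	RXQ_TYPE_SINK,
	RXQ_TYPE_MAX = RXQ_TYPE_SINK,
};

/* Reduced to the one field this sketch needs. */
struct xdp_rxq_info {
	uint32_t qtype;
};

/* Checked by the compiler, not by the preprocessor. */
static_assert(RXQ_TYPE_DEFAULT == 0, "default type must be zero");

/*
 * Hypothetical macro form of xdp_rxq_info_type(): the type argument
 * must be a compile-time constant, which is what lets the range check
 * fail the build instead of warning at run time.
 */
#define XDP_RXQ_INFO_TYPE(rxq, t)					\
	do {								\
		static_assert((t) <= RXQ_TYPE_MAX, "unknown rxq type");	\
		(rxq)->qtype = (t);					\
	} while (0)
```

A call such as XDP_RXQ_INFO_TYPE(&rq->xdp_rxq, RXQ_TYPE_SINK) compiles,
while passing a constant above RXQ_TYPE_MAX is rejected at build time.
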
>>> +
>>> +static inline
>>> +void xdp_rxq_info_type(struct xdp_rxq_info *xdp_rxq, u32 qtype)
>>> +{
>>> +	BUILD_BUG_ON(qtype > RXQ_TYPE_MAX);
>>> +	xdp_rxq->qtype = qtype;
>>> +}
>>> +
>>>    #endif /* __LINUX_NET_XDP_H__ */
>>> diff --git a/net/core/xdp.c b/net/core/xdp.c
>>> index a9d2dd7b1ede..2a111f5987f6 100644
>>> --- a/net/core/xdp.c
>>> +++ b/net/core/xdp.c
>>> @@ -32,8 +32,14 @@ EXPORT_SYMBOL_GPL(xdp_rxq_info_init);
>>>    
>>>    void xdp_rxq_info_reg(struct xdp_rxq_info *xdp_rxq)
>>>    {
>>> +	if (xdp_rxq->qtype == RXQ_TYPE_SINK)
>>> +		goto skip_content_check;
>>> +
>>> +	/* Check information setup by driver code */
>>>    	WARN(!xdp_rxq->dev, "Missing net_device from driver");
>>>    	WARN(xdp_rxq->queue_index == U32_MAX, "Miss queue_index from driver");
>>> +
>>> +skip_content_check:
>>>    	WARN(!(xdp_rxq->reg_state == REG_STATE_NEW),"API violation, miss init");
>>>      xdp_rxq->reg_state = REG_STATE_REGISTRED;
>> typo: REGISTERED (introduced in a previous patch)
> 
> Thanks for catching that! :-)
> 
