Message-ID: <20171218165227.4ec6f6f2@redhat.com>
Date: Mon, 18 Dec 2017 16:52:27 +0100
From: Jesper Dangaard Brouer <brouer@...hat.com>
To: David Ahern <dsahern@...il.com>
Cc: Daniel Borkmann <borkmann@...earbox.net>,
Alexei Starovoitov <alexei.starovoitov@...il.com>,
netdev@...r.kernel.org, gospo@...adcom.com, bjorn.topel@...el.com,
michael.chan@...adcom.com, brouer@...hat.com
Subject: Re: [bpf-next V1-RFC PATCH 01/14] xdp: base API for new XDP
rx-queue info concept
On Mon, 18 Dec 2017 06:23:40 -0700
David Ahern <dsahern@...il.com> wrote:
> On 12/18/17 3:55 AM, Jesper Dangaard Brouer wrote:
> >
> > Handling return-errors in the drivers complicated the driver code, as it
> > involves unraveling and deallocating other RX-rings etc (that were
> > already allocated) if the reg fails. (Also notice next patch will allow
> > dev == NULL, if right ptype is set).
> >
> > I'm not completely rejecting your idea, as this is a good optimization
> > trick, which is to move validation checks to setup-time, thus allowing
> > fewer validation checks at runtime. I sort-of actually already did
> > this, as I allow bpf to deref dev without a NULL check. I would argue
> > this is good enough, as we will crash in a predictable way, and the above
> > WARN will point to which driver violated the API.
> >
> > If people think it is valuable I can change this API to return an err?
>
> Saeed's suggested API in a comment on patch 12 also removes most of the
> WARN_ONs as it sets the device and index:
>
> xdp_rxq_info_reg(netdev, rxq_index)
> {
> 	rxqueue = netdev->_rx + rxq_index;
> 	xdp_rxq = &rxqueue->xdp_rxq;
> 	xdp_rxq_info_init(xdp_rxq);
> 	xdp_rxq->dev = netdev;
> 	xdp_rxq->queue_index = rxq_index;
> }
>
> xdp_rxq_info_unreg(netdev, rxq_index)
> {
> ...
> }

No, we still need the other WARN_ON's.

I don't understand why you think the above API is better. If
netdev == NULL, the system will simply crash on the deref of netdev. That
case happened in both the i40e and mlx5 drivers when I was adding this.
The WARN_ON helped me quickly identify the issue, and in both drivers it
was a non-critical error, as these queues are not used by XDP. IMHO that
is a better experience for the driver developer.
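
To illustrate the difference, here is a minimal userspace sketch (not the
patch code). It assumes a stand-in SKETCH_WARN() macro in place of the
kernel's WARN(), and hypothetical struct layouts that only mirror the names
used in this thread. The point is that a missing dev is reported and the
system keeps running, instead of crashing on the first deref.

```c
#include <stdio.h>
#include <stddef.h>

/* Userspace stand-ins; the layouts are hypothetical, not the kernel's. */
struct net_device { const char *name; };

enum { REG_STATE_NEW, REG_STATE_REGISTERED };

struct xdp_rxq_info {
	struct net_device *dev;
	unsigned int queue_index;
	int reg_state;
};

/* Counts warnings so the behaviour can be checked from a caller. */
static int warn_count;

/* Stand-in for the kernel's WARN(): report the problem and keep running. */
#define SKETCH_WARN(msg) \
	do { warn_count++; fprintf(stderr, "WARN: %s\n", (msg)); } while (0)

/* Hypothetical register helper: validates the driver's setup-time input. */
static void sketch_rxq_info_reg(struct xdp_rxq_info *rxq)
{
	if (rxq->dev == NULL) {
		/* The driver forgot to set dev: point at the API violation. */
		SKETCH_WARN("Missing net_device from driver");
		return; /* leave the queue unregistered instead of crashing */
	}
	rxq->reg_state = REG_STATE_REGISTERED;
}
```

Calling sketch_rxq_info_reg() on a struct whose dev is NULL bumps
warn_count and leaves reg_state untouched; once dev is set, the same call
registers the queue without any warning.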

IMHO WARN_ON's are a good thing. For example, this check:

	if (xdp_rxq->reg_state == REG_STATE_REGISTERED)
		WARN(1, "Missing unregister, handled but fix driver\n");

just helped me identify a bug in the i40e driver. It turns out that when
changing the RX-ring queue size via ethtool <-G|--set-ring> (_not_ the
number of RX-rings, but frames per RX-ring), i40e_set_ringparam()
allocates some temp RX-rings and copies struct contents around, causing
this strange issue. It will not crash with our currently simple content,
but later this would cause a hard-to-debug issue. I'm happy I could
catch this now, instead of later as a strange crash.
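
The mechanism can be sketched in userspace as follows. This is an
illustration under assumptions, not the i40e code: struct rx_ring,
resize_by_copy() and the warning counter are hypothetical names. Copying a
whole ring struct also copies an already-REGISTERED reg_state, so the next
register call on the copy trips the "missing unregister" check.

```c
#include <stdio.h>

enum reg_state { REG_STATE_NEW, REG_STATE_REGISTERED };

/* Hypothetical, simplified stand-ins for the structs discussed above. */
struct rxq_info { enum reg_state reg_state; };
struct rx_ring {
	unsigned int count;        /* frames per RX-ring */
	struct rxq_info xdp_rxq;   /* embedded rx-queue info */
};

/* Counts how often the "missing unregister" check fired. */
static int missing_unreg_warnings;

static void rxq_info_reg(struct rxq_info *q)
{
	if (q->reg_state == REG_STATE_REGISTERED) {
		/* Stand-in for WARN(1, ...): report, but handle it. */
		missing_unreg_warnings++;
		fprintf(stderr, "WARN: Missing unregister, handled but fix driver\n");
	}
	q->reg_state = REG_STATE_REGISTERED;
}

/*
 * Resize by cloning the old ring: the struct copy drags the stale
 * REGISTERED state into the temp ring, and the normal setup path then
 * registers it a second time.
 */
static struct rx_ring resize_by_copy(const struct rx_ring *old,
				     unsigned int new_count)
{
	struct rx_ring tmp = *old;   /* copies xdp_rxq.reg_state as well */

	tmp.count = new_count;
	rxq_info_reg(&tmp.xdp_rxq);  /* the warning fires here */
	return tmp;
}
```

Registering a fresh ring produces no warning, while calling
resize_by_copy() on that registered ring increments
missing_unreg_warnings once.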

The WARN's are there to assist driver developers when using this API
in their drivers (better than a crash/BUG_ON, as they don't have to dig up
their serial console cable). For me it is also part of the
documentation, as it documents the API assumptions/assertions together
with a small text field.

--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
LinkedIn: http://www.linkedin.com/in/brouer