Message-ID: <20220329183506.513b93cb@xps13>
Date:   Tue, 29 Mar 2022 18:35:06 +0200
From:   Miquel Raynal <miquel.raynal@...tlin.com>
To:     Alexander Aring <alex.aring@...il.com>
Cc:     Stefan Schmidt <stefan@...enfreihafen.org>,
        linux-wpan - ML <linux-wpan@...r.kernel.org>,
        "David S. Miller" <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>,
        "open list:NETWORKING [GENERAL]" <netdev@...r.kernel.org>,
        David Girault <david.girault@...vo.com>,
        Romuald Despres <romuald.despres@...vo.com>,
        Frederic Blain <frederic.blain@...vo.com>,
        Nicolas Schodet <nico@...fr.eu.org>,
        Thomas Petazzoni <thomas.petazzoni@...tlin.com>
Subject: Re: [PATCH wpan-next v4 07/11] net: ieee802154: at86rf230: Provide
 meaningful error codes when possible

Hi Alexander,

alex.aring@...il.com wrote on Sun, 27 Mar 2022 11:46:12 -0400:

> Hi,
> 
> On Fri, Mar 18, 2022 at 2:56 PM Miquel Raynal <miquel.raynal@...tlin.com> wrote:
> >
> > Either the spi operation failed, or the offloaded transmit operation
> > failed and returned a TRAC value. Use this value when available or use
> > the default "SYSTEM_ERROR" otherwise, in order to propagate one step
> > above the error.
> >
> > Signed-off-by: Miquel Raynal <miquel.raynal@...tlin.com>
> > ---
> >  drivers/net/ieee802154/at86rf230.c | 25 +++++++++++++++++++++++--
> >  1 file changed, 23 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/net/ieee802154/at86rf230.c b/drivers/net/ieee802154/at86rf230.c
> > index d3cf6d23b57e..34d199f597c9 100644
> > --- a/drivers/net/ieee802154/at86rf230.c
> > +++ b/drivers/net/ieee802154/at86rf230.c
> > @@ -358,7 +358,23 @@ static inline void
> >  at86rf230_async_error(struct at86rf230_local *lp,
> >                       struct at86rf230_state_change *ctx, int rc)
> >  {
> > -       dev_err(&lp->spi->dev, "spi_async error %d\n", rc);
> > +       int reason;
> > +
> > +       switch (rc) {  
> 
> I think there was a miscommunication last time, this rc variable is
> not a trac register value, it is a linux errno. Also the error here
> has nothing to do with a trac error. A trac error is the result of the
> offloaded transmit functionality on the transceiver, here we dealing
> about bus communication errors produced by the spi subsystem. What we
> need is to report it to the softmac layer as "IEEE802154_SYSTEM_ERROR"
> (as we decided that this is a user specific error and can be returned
> by the transceiver for non 802.15.4 "error" return code.

I think we definitely need to handle both, see below.

> 
> > +       case TRAC_CHANNEL_ACCESS_FAILURE:
> > +               reason = IEEE802154_CHANNEL_ACCESS_FAILURE;
> > +               break;
> > +       case TRAC_NO_ACK:
> > +               reason = IEEE802154_NO_ACK;
> > +               break;
> > +       default:
> > +               reason = IEEE802154_SYSTEM_ERROR;
> > +       }
> > +
> > +       if (rc < 0)
> > +               dev_err(&lp->spi->dev, "spi_async error %d\n", rc);
> > +       else
> > +               dev_err(&lp->spi->dev, "xceiver error %d\n", reason);
> >
> >         at86rf230_async_state_change(lp, ctx, STATE_FORCE_TRX_OFF,
> >                                      at86rf230_async_error_recover);
> > @@ -666,10 +682,15 @@ at86rf230_tx_trac_check(void *context)
> >         case TRAC_SUCCESS:
> >         case TRAC_SUCCESS_DATA_PENDING:
> >                 at86rf230_async_state_change(lp, ctx, STATE_TX_ON, at86rf230_tx_on);
> > +               return;
> > +       case TRAC_CHANNEL_ACCESS_FAILURE:
> > +       case TRAC_NO_ACK:
> >                 break;
> >         default:
> > -               at86rf230_async_error(lp, ctx, -EIO);
> > +               trac = TRAC_INVALID;
> >         }
> > +
> > +       at86rf230_async_error(lp, ctx, trac);  
> 
> That makes no sense, at86rf230_async_error() is not a trac error
> handling, it is a bus error handling.

Both will have to be handled asynchronously, which means we have to
tell the soft mac layer that something bad happened in each case.

> As noted above. With this change
> you mix bus errors and trac errors (which are not bus errors).

An SPI error is reported asynchronously, which means the tx call has
already returned by the time we learn that there was a bus error. At
that point we need to:
- Free the skb
- Restart the internal machinery
- Somehow tell the soft mac layer something bad happened and the packet
  will not be transmitted as expected (IOW, balance the "end" calls
  with the "start" calls, just because we did not return immediately
  when we got the transmit request).

In the case of a transmission error, this is a trac condition that is
reported to us by an IRQ. We know it is a trac error, we can look at a
buffer to find which trac error exactly happened. In this case we need
to go through exactly the same steps as above.
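
A minimal sketch of that shared recovery path, not the driver code: the
fake_* types and functions below are hypothetical stand-ins, and the only
point is that every "start" is matched by exactly one "end", whatever the
failure source was.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the soft MAC layer bookkeeping. */
struct fake_mac {
	int started;     /* transmissions handed to the driver */
	int ended;       /* transmissions reported back as finished */
	int last_reason; /* reason attached to the last failure */
};

/* Hypothetical stand-in for the driver's private state. */
struct fake_dev {
	struct fake_mac *mac;
	void *tx_buf;    /* plays the role of the skb */
	bool trx_off;    /* true once the state machine was reset */
};

/* The transmit request returns immediately; completion comes later. */
static void fake_xmit_start(struct fake_dev *dev, size_t len)
{
	dev->tx_buf = malloc(len);
	dev->trx_off = false;
	dev->mac->started++;
}

/* Common asynchronous error path for bus errors and TRAC errors. */
static void fake_xmit_error(struct fake_dev *dev, int reason)
{
	free(dev->tx_buf);              /* 1. free the buffer */
	dev->tx_buf = NULL;
	dev->trx_off = true;            /* 2. restart the internal machinery */
	dev->mac->last_reason = reason; /* 3. tell the MAC layer, which */
	dev->mac->ended++;              /*    balances the earlier start */
}
```

Calling fake_xmit_start() followed by fake_xmit_error() leaves started
equal to ended, regardless of which kind of failure triggered the error
path.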

But you are right that a spi_async() error is not a trac error, hence
my choice in the switch statement to default to the
IEEE802154_SYSTEM_ERROR flag in this case.
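
A standalone sketch of that mapping, assuming rc carries either a negative
errno (bus error) or a non-negative TRAC value (transceiver error). The
helper name xmit_error_reason() is hypothetical and the numeric values are
placeholders for illustration; the real definitions live in the driver and
in the ieee802154 headers.

```c
/* Placeholder values, for illustration only. */
enum trac_status {
	TRAC_SUCCESS = 0,
	TRAC_CHANNEL_ACCESS_FAILURE = 3,
	TRAC_NO_ACK = 5,
	TRAC_INVALID = 7,
};

enum mac_status {
	IEEE802154_CHANNEL_ACCESS_FAILURE = 0xe1,
	IEEE802154_NO_ACK = 0xe9,
	IEEE802154_SYSTEM_ERROR = 0xff,
};

/*
 * Hypothetical helper: translate the value passed to the error handler
 * into the reason reported to the MAC layer.
 */
static int xmit_error_reason(int rc)
{
	switch (rc) {
	case TRAC_CHANNEL_ACCESS_FAILURE:
		return IEEE802154_CHANNEL_ACCESS_FAILURE;
	case TRAC_NO_ACK:
		return IEEE802154_NO_ACK;
	default:
		/* Negative errnos and unknown TRAC values both land here. */
		return IEEE802154_SYSTEM_ERROR;
	}
}
```

With this shape, a bus error such as -EIO never matches a TRAC case
because it is negative, so it always falls through to the system error.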

Should I ignore spi bus errors? I don't think I can, so I don't really
see how to handle it differently.

Thanks,
Miquèl
