Message-ID: <CAK-6q+g07ficTc-h_ks8GPpv880goHuGNXTD2fqbfbR7LDPZWQ@mail.gmail.com>
Date: Wed, 18 May 2022 09:08:23 -0400
From: Alexander Aring <aahringo@...hat.com>
To: Miquel Raynal <miquel.raynal@...tlin.com>
Cc: Alexander Aring <alex.aring@...il.com>,
Stefan Schmidt <stefan@...enfreihafen.org>,
linux-wpan - ML <linux-wpan@...r.kernel.org>,
"David S. Miller" <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>,
Network Development <netdev@...r.kernel.org>,
David Girault <david.girault@...vo.com>,
Romuald Despres <romuald.despres@...vo.com>,
Frederic Blain <frederic.blain@...vo.com>,
Nicolas Schodet <nico@...fr.eu.org>,
Thomas Petazzoni <thomas.petazzoni@...tlin.com>
Subject: Re: [PATCH wpan-next v2 09/11] net: mac802154: Introduce a
synchronous API for MLME commands
Hi,
On Wed, May 18, 2022 at 8:37 AM Miquel Raynal <miquel.raynal@...tlin.com> wrote:
>
>
> alex.aring@...il.com wrote on Wed, 18 May 2022 08:05:46 -0400:
>
> > Hi,
> >
> > On Wed, May 18, 2022 at 6:12 AM Miquel Raynal <miquel.raynal@...tlin.com> wrote:
> > >
> > >
> > > aahringo@...hat.com wrote on Tue, 17 May 2022 21:14:03 -0400:
> > >
> > > > Hi,
> > > >
> > > > On Tue, May 17, 2022 at 9:30 AM Miquel Raynal <miquel.raynal@...tlin.com> wrote:
> > > > >
> > > > >
> > > > > aahringo@...hat.com wrote on Sun, 15 May 2022 19:03:53 -0400:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > > On Sun, May 15, 2022 at 6:28 PM Alexander Aring <aahringo@...hat.com> wrote:
> > > > > > >
> > > > > > > Hi,
> > > > > > >
> > > > > > > On Thu, May 12, 2022 at 10:34 AM Miquel Raynal
> > > > > > > <miquel.raynal@...tlin.com> wrote:
> > > > > > > >
> > > > > > > > > This is the slow path: we need to wait for each command to be processed
> > > > > > > > > before continuing, so let's introduce a helper which does the
> > > > > > > > > transmission and blocks until it gets notified of its asynchronous
> > > > > > > > > completion. This helper is going to be used when introducing scan
> > > > > > > > > support.
> > > > > > > >
> > > > > > > > Signed-off-by: Miquel Raynal <miquel.raynal@...tlin.com>
> > > > > > > > ---
> > > > > > > > net/mac802154/ieee802154_i.h | 1 +
> > > > > > > > net/mac802154/tx.c | 25 +++++++++++++++++++++++++
> > > > > > > > 2 files changed, 26 insertions(+)
> > > > > > > >
> > > > > > > > diff --git a/net/mac802154/ieee802154_i.h b/net/mac802154/ieee802154_i.h
> > > > > > > > index a057827fc48a..f8b374810a11 100644
> > > > > > > > --- a/net/mac802154/ieee802154_i.h
> > > > > > > > +++ b/net/mac802154/ieee802154_i.h
> > > > > > > > @@ -125,6 +125,7 @@ extern struct ieee802154_mlme_ops mac802154_mlme_wpan;
> > > > > > > > void ieee802154_rx(struct ieee802154_local *local, struct sk_buff *skb);
> > > > > > > > void ieee802154_xmit_sync_worker(struct work_struct *work);
> > > > > > > > int ieee802154_sync_and_hold_queue(struct ieee802154_local *local);
> > > > > > > > +int ieee802154_mlme_tx(struct ieee802154_local *local, struct sk_buff *skb);
> > > > > > > > netdev_tx_t
> > > > > > > > ieee802154_monitor_start_xmit(struct sk_buff *skb, struct net_device *dev);
> > > > > > > > netdev_tx_t
> > > > > > > > diff --git a/net/mac802154/tx.c b/net/mac802154/tx.c
> > > > > > > > index 38f74b8b6740..ec8d872143ee 100644
> > > > > > > > --- a/net/mac802154/tx.c
> > > > > > > > +++ b/net/mac802154/tx.c
> > > > > > > > @@ -128,6 +128,31 @@ int ieee802154_sync_and_hold_queue(struct ieee802154_local *local)
> > > > > > > > return ieee802154_sync_queue(local);
> > > > > > > > }
> > > > > > > >
> > > > > > > > +int ieee802154_mlme_tx(struct ieee802154_local *local, struct sk_buff *skb)
> > > > > > > > +{
> > > > > > > > + int ret;
> > > > > > > > +
> > > > > > > > + /* Avoid possible calls to ->ndo_stop() when we asynchronously perform
> > > > > > > > + * MLME transmissions.
> > > > > > > > + */
> > > > > > > > + rtnl_lock();
> > > > > > >
> > > > > > > I think we should make an ASSERT_RTNL() here, the lock needs to be
> > > > > > > earlier than that over the whole MLME op. MLME can trigger more than
> > > > > >
> > > > > > not over the whole MLME_op, that's terrible to hold the rtnl lock so
> > > > > > long... so I think this is fine that some netdev call will interfere
> > > > > > with this transmission.
> > > > > > So forget about the ASSERT_RTNL() here, it's fine (I hope).
> > > > > >
> > > > > > > one message, the whole sync_hold/release queue should be earlier than
> > > > > > > that... in my opinion is it not right to allow other messages so far
> > > > > > > an MLME op is going on? I am not sure what the standard says to this,
> > > > > > > but I think it should be stopped the whole time? All those sequence
> > > > > >
> > > > > > Whereas the stop of the netdev queue makes sense for the whole mlme-op
> > > > > > (in my opinion).
> > > > >
> > > > > I might still implement an MLME pre/post helper and do the queue
> > > > > hold/release calls there, while only taking the rtnl from the _tx.
> > > > >
> > > > > And I might create an mlme_tx_one() which does the pre/post calls as
> > > > > well.
> > > > >
> > > > > Would something like this fit?
> > > >
> > > > I think so, I've heard for some transceiver types a scan operation can
> > > > take hours... but I guess whoever triggers that scan in such an
> > > > environment knows that it has some "side-effects"...
> > >
> > > Yeah, a scan requires the data queue to be stopped and all incoming
> > > packets to be dropped (other than beacons, ofc), so users must be
> > > aware of this limitation.
> >
> > I think there is a real problem with how the user can synchronize the
> > start of a scan and be sure that at this point everything was
> > transmitted; we might need to really "flush" the queue. Your naming
> > "flush" is also wrong: it will flush the framebuffer(s) of the
> > transceivers but not the netdev queue... and we probably should flush
> > the netdev queue before starting an mlme-op... this is something to add
> > in the mlme_op_pre() function.
>
> Is it even possible? This requires waiting for the netdev queue to be
> empty before stopping it, but if users constantly flood the transceiver
> with data packets this might "never" happen.
>
Nothing is impossible, maybe nobody has thought about it yet. Sure,
putting more into the queue should be forbidden, but what is already
inside should be "flushed". Currently we make a hard cut: there is no
way for the user to know what was sent and what was not. BUT that is
the case for xmit_do() anyway, it is not reliable... people need to
have the right upper layer protocol. However, I think we could run into
problems, especially once we have features like waiting on the socket
error queue to know whether e.g. an ack was received or not.
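
To make the ordering concrete, here is a rough userspace sketch of
"forbid new frames first, then flush what is queued". It is only a toy
model and not kernel code: struct txq, txq_enqueue() and txq_xmit_one()
are hypothetical names invented for this illustration, and mlme_op_pre()
and mlme_op_post() only borrow the helper names discussed in this
thread. Because the queue is closed before it is drained, the drain is
bounded by the queue length, even if the user keeps sending.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define QLEN 8

/* Hypothetical stand-in for a netdev TX queue; not a kernel structure. */
struct txq {
	int frames[QLEN];
	size_t head;	/* index of the oldest queued frame */
	size_t count;	/* number of queued frames */
	bool stopped;	/* true while an MLME op owns the transceiver */
	int sent;	/* frames handed over to the "transceiver" */
};

/* Queue one frame; refused when the queue is stopped or full. */
static bool txq_enqueue(struct txq *q, int frame)
{
	if (q->stopped || q->count == QLEN)
		return false;
	q->frames[(q->head + q->count) % QLEN] = frame;
	q->count++;
	return true;
}

/* Hand the oldest queued frame to the "transceiver", if there is one. */
static bool txq_xmit_one(struct txq *q)
{
	if (q->count == 0)
		return false;
	q->head = (q->head + 1) % QLEN;
	q->count--;
	q->sent++;
	return true;
}

/* Assumed pre-step: close the queue, then flush what is already inside. */
static void mlme_op_pre(struct txq *q)
{
	q->stopped = true;
	while (txq_xmit_one(q))
		;
	assert(q->count == 0);
}

/* Assumed post-step: let data frames in again. */
static void mlme_op_post(struct txq *q)
{
	q->stopped = false;
}
```

With this ordering, frames queued before mlme_op_pre() are all sent,
frames offered between mlme_op_pre() and mlme_op_post() are refused
right away, and the user at least gets a defined cut instead of a
silent drop.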
> And even though we might accept this situation, I don't know how to
> check the emptiness of the netif queue. Any inputs?
Don't worry about it for now; I see a practical issue here, and I will keep it in mind.
- Alex