Message-ID: <20210130111618.335b6945@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com>
Date: Sat, 30 Jan 2021 11:16:18 -0800
From: Jakub Kicinski <kuba@...nel.org>
To: Xie He <xie.he.0141@...il.com>
Cc: Martin Schiller <ms@....tdt.de>,
"David S. Miller" <davem@...emloft.net>,
Linux X25 <linux-x25@...r.kernel.org>,
Linux Kernel Network Developers <netdev@...r.kernel.org>,
LKML <linux-kernel@...r.kernel.org>,
Krzysztof Halasa <khc@...waw.pl>
Subject: Re: [PATCH net] net: hdlc_x25: Use qdisc to queue outgoing LAPB
frames
On Sat, 30 Jan 2021 06:29:20 -0800 Xie He wrote:
> On Fri, Jan 29, 2021 at 5:36 PM Jakub Kicinski <kuba@...nel.org> wrote:
> > I'm still struggling to wrap my head around this.
> >
> > Did you test your code with lockdep enabled? Which Qdisc are you using?
> > You're queuing the frames back to the interface they came from - won't
> > that cause locking issues?
>
> Hmm... Thanks for bringing this to my attention. I indeed find issues
> when the "noqueue" qdisc is used.
>
> When using a qdisc other than "noqueue", when sending an skb:
> "__dev_queue_xmit" will call "__dev_xmit_skb";
> "__dev_xmit_skb" will call "qdisc_run_begin" to mark the beginning of
> a qdisc run, and if the qdisc is already running, "qdisc_run_begin"
> will fail, then "__dev_xmit_skb" will just enqueue this skb without
> starting qdisc. There is no problem.
>
> When using "noqueue" as the qdisc, when sending an skb:
> "__dev_queue_xmit" will try to send this skb directly. Before it does
> that, it will first check "txq->xmit_lock_owner" and will find that
> the current cpu already owns the xmit lock, it will then print a
> warning message "Dead loop on virtual device ..." and drop the skb.
>
> A solution can be queuing the outgoing L2 frames in this driver first,
> and then using a tasklet to send them to the qdisc TX queue.
>
> Thanks! I'll make changes to fix this.
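For anyone following along, the two cases quoted above can be modelled in
userspace roughly as below. This is only a toy sketch, not kernel code:
"struct tx_model", "model_queue_xmit" and the result enum are hypothetical
names, and the model is single-threaded. Only the control flow (enqueue
when the qdisc is already running, drop when this CPU already owns the
xmit lock) follows the description.

```c
#include <assert.h>
#include <stdbool.h>

enum tx_result { TX_SENT, TX_QUEUED, TX_DROPPED_DEAD_LOOP };

/* Hypothetical stand-in for the per-device TX state. */
struct tx_model {
	bool has_qdisc;       /* false models the "noqueue" qdisc */
	bool qdisc_running;   /* state that qdisc_run_begin() would test */
	int  xmit_lock_owner; /* CPU holding the xmit lock, -1 if none */
	int  backlog;         /* frames waiting in the qdisc */
};

static enum tx_result model_queue_xmit(struct tx_model *m, int cpu)
{
	if (m->has_qdisc) {
		/* Real qdisc: always enqueue first. */
		m->backlog++;
		if (m->qdisc_running)
			return TX_QUEUED; /* run already in progress: no recursion */
		/* Otherwise start a run and drain the backlog. */
		m->qdisc_running = true;
		m->backlog = 0;
		m->qdisc_running = false;
		return TX_SENT;
	}

	/* "noqueue": transmit directly, unless we would re-enter the lock. */
	if (m->xmit_lock_owner == cpu)
		return TX_DROPPED_DEAD_LOOP; /* "Dead loop on virtual device" */

	assert(m->xmit_lock_owner == -1); /* single-threaded model: lock is free */
	m->xmit_lock_owner = cpu;
	/* ... the driver's xmit routine would run here ... */
	m->xmit_lock_owner = -1;
	return TX_SENT;
}
```

Calling model_queue_xmit() with qdisc_running set returns TX_QUEUED, and
calling it with has_qdisc cleared and xmit_lock_owner equal to the
caller's CPU returns TX_DROPPED_DEAD_LOOP, which is the "noqueue" failure.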
Sounds like too much effort for a sub-optimal workaround.
The qdisc semantics are broken in the proposed scheme (double
counting packets) - both in terms of statistics and if the user decides
to add a policer, filter etc.
Another worry is that something may just inject a packet with
skb->protocol == ETH_P_HDLC but unexpected structure (IDK if
that's a real concern).
It may be better to teach LAPB to stop / start the internal queue.
The lower level drivers just need to call LAPB instead of making
the start/wake calls directly to the stack, and LAPB can call the
stack. Would that not work?
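To make the idea concrete, here is a rough userspace sketch of that
layering. It is only a model under stated assumptions: "struct model",
"lapb_hw_stop", "lapb_hw_wake", "lapb_enqueue" and "LAPB_QLEN" are
hypothetical names, not existing LAPB or netdev API, and "stack_stopped"
merely stands in for the stopped/woken state of the upper device's queue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define LAPB_QLEN 4 /* assumed depth of LAPB's internal queue */
#define MAX_SENT  16

struct model {
	bool   hw_ready;      /* lower driver can accept frames */
	bool   stack_stopped; /* upper queue state as seen by the stack */
	int    queue[LAPB_QLEN];
	size_t head, len;
	int    sent[MAX_SENT]; /* frames handed to the hardware, in order */
	size_t nsent;
};

static void hw_xmit(struct model *m, int frame)
{
	assert(m->nsent < MAX_SENT);
	m->sent[m->nsent++] = frame;
}

/* Drain the internal queue while the hardware accepts frames, then
 * tell the stack whether LAPB itself still has room. */
static void lapb_kick(struct model *m)
{
	while (m->hw_ready && m->len > 0) {
		hw_xmit(m, m->queue[m->head]);
		m->head = (m->head + 1) % LAPB_QLEN;
		m->len--;
	}
	m->stack_stopped = (m->len == LAPB_QLEN);
}

/* Called by the stack; returns false if LAPB's queue is full. */
static bool lapb_enqueue(struct model *m, int frame)
{
	if (m->len == LAPB_QLEN)
		return false;
	m->queue[(m->head + m->len) % LAPB_QLEN] = frame;
	m->len++;
	lapb_kick(m);
	return true;
}

/* Called by the lower driver instead of stopping/waking the stack. */
static void lapb_hw_stop(struct model *m) { m->hw_ready = false; }
static void lapb_hw_wake(struct model *m) { m->hw_ready = true; lapb_kick(m); }
```

In this model the lower driver only ever talks to LAPB. The stack sees
the queue as stopped only once LAPB's own queue is full, and frames reach
the hardware exactly once, so nothing passes through the qdisc twice.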