Message-ID: <CAKfDRXge7RmsTx2Xh6BFwX+qObZXtukkm0LKkZQo+z9vubpybw@mail.gmail.com>
Date: Thu, 10 Jun 2021 18:04:30 +0200
From: Kristian Evensen <kristian.evensen@...il.com>
To: Network Development <netdev@...r.kernel.org>,
subashab@...eaurora.org, stranche@...eaurora.org,
sharathv@...eaurora.org
Subject: Re: Kernel oops when using rmnet together with qmi_wwan on kernel 5.4
Hi,
On Wed, Jun 9, 2021 at 8:39 PM Kristian Evensen
<kristian.evensen@...il.com> wrote:
> Does anyone have any idea about what could be wrong, where to look or
> how to fix the problem? Are there for example any related commits that
> I should backport to my 5.4 kernel?
I spent some more time looking into this problem today and I believe I
have figured out what goes wrong. Please note that my knowledge of the
network driver infrastructure is a bit limited, so please excuse any
mistakes :)
I started out by taking a closer look at the rmnet code, and I noticed
that rmnet calls consume_skb() when the processing of the aggregated
skb is done. Instrumenting the kernel revealed that the reference
count of the aggregated skb is one, so the skb will be freed inside
consume_skb(). I believe the call to consume_skb() can cause problems
with usbnet, depending on the order of operations. The skb that is
passed to rmnet is owned by usbnet and is referenced after the
qmi_wwan rx_fixup-callback has been called.
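The ownership problem described above can be modelled in user space. The
sketch below is not kernel code: struct toy_skb, toy_consume() and
toy_rx_path() are hypothetical stand-ins, and a "freed" flag replaces the
real free so the state can be inspected afterwards. It only shows that when
the reference count is one, the consumer's release leaves the original
owner holding a pointer to a freed buffer.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct sk_buff; only what the model needs. */
struct toy_skb {
    int users;   /* models the skb reference count */
    bool freed;  /* set instead of really freeing, so it can be checked */
};

/* Models consume_skb(): drop one reference, free on the last one. */
static void toy_consume(struct toy_skb *skb)
{
    assert(skb->users > 0);
    if (--skb->users == 0)
        skb->freed = true;
}

/*
 * Models the receive path: the driver hands the buffer to an upper
 * layer, which consumes it. The return value says whether the driver
 * would now be touching a freed buffer if it kept using its pointer.
 */
static bool toy_rx_path(struct toy_skb *skb)
{
    toy_consume(skb);   /* upper layer is done with the buffer */
    return skb->freed;  /* true: the owner's pointer is now dangling */
}
```

With users == 1 on entry, toy_rx_path() returns true, which corresponds to
the situation reported above. With users == 2 it returns false, because one
reference is still held after the consumer has finished.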
In order to try to prove my theory, I modified qmi_wwan to clone the
skb inside qmi_wwan_rx_fixup (before the call to netif_rx()). When
cloning the skb, I am no longer able to trigger the crash. Without
cloning, the crash happens more or less instantaneously. I don't know
if my reasoning and fix make sense, or whether I have misunderstood
something or there is a better way to fix the problem. I also tried
calling only skb_get(), but this did not help.
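For reference, the general difference between the two calls can be sketched
in user space. skb_get() takes another reference on the same struct sk_buff,
while skb_clone() allocates a new struct sk_buff that shares the data
buffer. The types and helpers below (toy_data, toy_skb, toy_get, toy_clone)
are hypothetical and heavily simplified. Whether this difference is what
makes skb_get() insufficient here is an assumption, not something the
testing above establishes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical model of the shared payload and its reference count. */
struct toy_data {
    int dataref;
};

/* Hypothetical model of the skb header with some per-header state. */
struct toy_skb {
    int users;              /* references to this header */
    bool on_queue;          /* example of state private to a header */
    struct toy_data *data;  /* payload, possibly shared between headers */
};

/* Like skb_get(): same header, one more reference. */
static struct toy_skb *toy_get(struct toy_skb *skb)
{
    assert(skb->users > 0);
    skb->users++;
    return skb;
}

/* Like skb_clone(): new header, shared payload. Caller frees the clone. */
static struct toy_skb *toy_clone(struct toy_skb *skb)
{
    struct toy_skb *c = malloc(sizeof(*c));

    if (!c)
        return NULL;
    c->users = 1;
    c->on_queue = false;
    c->data = skb->data;
    c->data->dataref++;  /* payload now has one more user */
    return c;
}
```

In this model, two holders obtained through toy_get() still see each other's
changes to on_queue, because they share one header. A holder of a clone
changes only its own header, and only the payload is shared.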
BR,
Kristian