[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20171130035153.GA28231@axis.com>
Date: Thu, 30 Nov 2017 04:51:54 +0100
From: Niklas Cassel <niklas.cassel@...s.com>
To: Jose Abreu <Jose.Abreu@...opsys.com>
Cc: Joao Pinto <Joao.Pinto@...opsys.com>,
linux-netdev <netdev@...r.kernel.org>,
Giuseppe CAVALLARO <peppe.cavallaro@...com>
Subject: Re: Commit 05cf0d1bf4 ("net: stmmac: free an skb first when there
are no longer any descriptors using it") breaks stmmac?
On Mon, Nov 27, 2017 at 02:41:00PM +0000, Jose Abreu wrote:
> Hi Niklas,
Hello Jose,
>
> I think your commit 05cf0d1bf4 ("net: stmmac: free an skb first
> when there are no longer any descriptors using it") is breaking
> stmmac driver in multi-queue configuration (this stacktrace may
> contain some extra characters as I was using serial port):
>
> ------------->8-------------
> general protection fault: 0000 [#1] SMP
> Modules linked in: stmmac_pci stmmac libphy igb ptp pps_core
> x86_pkg_temp_thermal
> CPU: 5 PID: 0 Comm: swapper/5 Tainted: G W 4.14.0-rc5 #2
> Hardware name: Default string Default string/SKYBAY, BIOS 5.0.1.1
> 10/06/2016
> task: ffffa2fe14d8b100 task.stack: ffffb8c6000b8000
> RIP: 0010:skb_release_data+0x66/0x110
> RSP: 0018:ffffa2fe2dd43d98 EFLAGS: 00010206
> RAX: 0000000000000030 RBX: ffffa2fe13fab100 RCX: 00000000000005aa
> RDX: ffffa2fe12a50000 RSI: 0000000000000000 RDI: fffcfffdfffbfffc
> RBP: ffffa2fe2dd43db0 R08: ffffa2fe2dfcd000 R09: 0000000000000001
> R10: ffffffffa06245d0 R11: ffffa2fe14c03700 R12: 0000000000000000
> R13: ffffa2fe11e686c0 R14: ffffa2fe13fab100 R15: ffffa2fe129b8940
> FS: 0000000000000000(0000) GS:ffffa2fe2dd40000(0000)
> knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007fe26c457000 CR3: 000000002b609003 CR4: 00000000003606e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
> <IRQ>
> skb_release_all+0x1f/0x30
> consume_skb+0x1d/0x40
> __dev_kfree_skb_any+0x2a/0x30
> stmmac_tx_clean+0x230/0x4d0 [stmmac]
> stmmac_poll+0x4b/0x980 [stmmac]
> net_rx_action+0x1ad/0x290
> __do_softirq+0xdd/0x1d6
> irq_exit+0x77/0x80
> do_IRQ+0x4a/0xc0
> common_interrupt+0x93/0x93
> </IRQ>
> RIP: 0010:cpuidle_enter_state+0x16a/0x210
> RSP: 0018:ffffb8c6000bbe90 EFLAGS: 00000286 ORIG_RAX:
> ffffffffffffffae
> RAX: ffffa2fe2dd575c0 RBX: ffffa2fe14560200 RCX: 000000000000001f
> RDX: 0000000000000000 RSI: 00000122dbd7ae06 RDI: 0000000000000000
> RBP: ffffb8c6000bbec0 R08: 0000000000000020 R09: 0000000000000002
> R10: ffffb8c6000bbe60 R11: 0000000000123400 R12: 0000004f3c11b47a
> R13: 0000000000000003 R14: ffffffffa0e3aa58 R15: 0000000000000003
> cpuidle_enter+0x12/0x20
> call_cpuidle+0x1e/0x40
> do_idle+0x16a/0x1c0
> cpu_startup_entry+0x18/0x20
> start_secondary+0x10d/0x110
> secondary_startup_64+0xa5/0xa5
> ------------->8-------------
>
> Using tree with your commit I get this stacktrace upon streaming
> data at random time (stacktrace does not appear everytime),
> without the commit I get no errors.
>
> I understand the reason for your commit but we already have
> dirty_tx in stmmac_tx_clean(), shouldn't it be enough to prevent
> skb use-after-free? Can you please review your patch to make sure
> nothing else is missing?
Dirty is not enough.
The problem that I thought I solved,
was if an single skb was spread out over
3 tx descs like this:
TX DESC #1: first descriptor bit set
TX DESC #2:
TX DESC #3: last descriptor bit set
before my patch the
tx_q->tx_skbuff[first_entry] would have the skb pointer.
let's say that curr_tx is pointing to TX DESC4,
since curr_tx is supposed to point to the place where
we are going to start filling next time.
The problem I thought I solved was if stmmac_tx_clean
came and started cleaning, it could clean TX DESC #1,
and thus freeing the skb (which TX DESC #2 and TX DESC #3 uses.)
However, by reading the databook more carefully:
If an Ethernet packet is stored over data buffers in multiple
descriptors, the DMA will fetch all descriptors, and
will only update the own bit in all descriptors,
after the complete packet has been sent.
Because of this, I think that the case where
TX DESC #1 gets its own bit cleared, while the hardware
hasn't yet fetched the TX DESC #2 and TX DESC #3 cannot
happen, so my patch 05cf0d1bf4 ("net: stmmac: free an skb first
when there are no longer any descriptors using it")
has no real purpose, and could be reverted.
However, I don't see why is should cause your
kernel to crash.
It looks like skb_release_data is trying to access a member in a pointer
that is null, i.e. a double free of an skb.
I assume this is because the same skb address exists in several queues.
My guess is that the problem is that tx_q->tx_skbuff[] does not get set to
NULL properly when having multiple queues. (It obviously works for a single
tx queue).
stmmac_tx_clean() will do:
if (skb)
tx_q->tx_skbuff[entry] = NULL;
so skbs should always be set to NULL.
However, one difference seems to be that stmmac_xmit and stmmac_tso_xmit
inside the nfrags for-loop, seems to set the skb pointer to NULL.
for (i = 0; i < nfrags; i++) {
tx_q->tx_skbuff[entry] = NULL;
Considering this is already done in stmmac_tx_clean, this seems redundant.
But one difference is that before my patch,
all frags, except first TX DESC, would get NULL:ed
(first desc would be assigned the skb).
After my patch, all descs, except first TX DESC,
would get NULL:ed, last desc would get assigned
the skb, but first desc is not explicitly cleared.
Could you try to see if the following patch
solves your problem:
(still don't see why it should be needed,
considering stmmac_tx_clean() should set all non-NULL
entries to NULL)
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 1763e48c84e2..1d30b20b3740 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -2830,6 +2830,7 @@ static netdev_tx_t stmmac_tso_xmit(struct sk_buff *skb, struct net_device *dev)
tx_q->tx_skbuff_dma[first_entry].buf = des;
tx_q->tx_skbuff_dma[first_entry].len = skb_headlen(skb);
+ tx_q->tx_skbuff[first_entry] = NULL;
first->des0 = cpu_to_le32(des);
@@ -3003,6 +3004,8 @@ static netdev_tx_t stmmac_xmit(struct sk_buff *skb, struct net_device *dev)
first = desc;
+ tx_q->tx_skbuff[first_entry] = NULL;
+
enh_desc = priv->plat->enh_desc;
/* To program the descriptors according to the size of the frame */
if (enh_desc)
Powered by blists - more mailing lists