[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <DM2PR07MB685EABE2F7D9DF4B6B1913589D30@DM2PR07MB685.namprd07.prod.outlook.com>
Date: Mon, 12 Mar 2018 19:25:52 +0000
From: "Chopra, Manish" <Manish.Chopra@...ium.com>
To: "netdev@...r.kernel.org" <netdev@...r.kernel.org>
CC: "john.fastabend@...il.com" <john.fastabend@...il.com>,
"davem@...emloft.net" <davem@...emloft.net>,
Eric Dumazet <eric.dumazet@...il.com>,
"Kalderon, Michal" <Michal.Kalderon@...ium.com>,
"Elior, Ariel" <Ariel.Elior@...ium.com>,
"yuvalm@...lanox.com" <yuvalm@...lanox.com>,
Michal Schmidt <mschmidt@...hat.com>
Subject: mmiowb() vs wmb() with write combined MMIO
Hello Folks,
A series from John fastabend [for lockless qdisc support, specifically commit c5ad119fb6c09b0297446be05bd66602fa564758]
has uncovered an issue in qede NIC driver [/drivers/net/ethernet/qlogic/qede] and this driver is seriously broken for basic L2.
Before the series from John, driver was running fine, probably with hiding a serious bug underneath it which got exposed with his series.
Driver uses mmiowb() after writing TX doorbell in function qede_update_tx_producer() to synchronize the write between CPUs and
that TX doorbell area, driver maps with write combined memory using ioremap_wc().
This is causing an issue where HW seems to be reading an unexpected producer value [not as was written on doorbell, probably an older producer]
and causing various SKB null condition hit while handling TX completion. This seems to be indicating that mmiowb() doesn't seems to be synchronizing/flushing
writes here, at least on x86_64 SMP, it's just like a compiler barrier [Not really a hardware fence].
If we replace mmiowb() with wmb() we never hit the problem. wmb() seems to be adding some synchronization/flush mechanism here.
Moreover, if we map doorbell with ioremap_nocache() we never hit problem with mmiowb() being there even.
I am quite sure this has to do with write combined mapped memory, where mmiowb() doesn't seems to be effective, it seems like we might want to replace
mmiowb() with wmb() here. Few vendors uses wmb() after doorbell write instead of mmiowb(), some uses flush_wc() kind of helpers.
Any suggestions or help about this barrier differences will be appreciated here in order to prepare a proper fix.
Thanks in Advance !!
Manish Chopra
Powered by blists - more mailing lists