Message-ID: <YKIg3VDupFYPLB1o@orome.fritz.box>
Date:   Mon, 17 May 2021 09:53:01 +0200
From:   Thierry Reding <treding@...dia.com>
To:     Joakim Zhang <qiangqing.zhang@....com>
CC:     Florian Fainelli <f.fainelli@...il.com>,
        Jon Hunter <jonathanh@...dia.com>,
        Jakub Kicinski <kuba@...nel.org>,
        "peppe.cavallaro@...com" <peppe.cavallaro@...com>,
        "alexandre.torgue@...s.st.com" <alexandre.torgue@...s.st.com>,
        "joabreu@...opsys.com" <joabreu@...opsys.com>,
        "davem@...emloft.net" <davem@...emloft.net>,
        "mcoquelin.stm32@...il.com" <mcoquelin.stm32@...il.com>,
        "andrew@...n.ch" <andrew@...n.ch>,
        dl-linux-imx <linux-imx@....com>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: Re: [RFC net-next] net: stmmac: should not modify RX descriptor when
 STMMAC resume

On Mon, May 10, 2021 at 02:10:21AM +0000, Joakim Zhang wrote:
> 
> Hi Florian,
> 
> > -----Original Message-----
> > From: Florian Fainelli <f.fainelli@...il.com>
> > Sent: May 8, 2021 23:42
> > To: Joakim Zhang <qiangqing.zhang@....com>; Jon Hunter
> > <jonathanh@...dia.com>; Jakub Kicinski <kuba@...nel.org>
> > Cc: peppe.cavallaro@...com; alexandre.torgue@...s.st.com;
> > joabreu@...opsys.com; davem@...emloft.net;
> > mcoquelin.stm32@...il.com; andrew@...n.ch; dl-linux-imx
> > <linux-imx@....com>; treding@...dia.com; netdev@...r.kernel.org
> > Subject: Re: [RFC net-next] net: stmmac: should not modify RX descriptor when
> > STMMAC resume
> > 
> > 
> > 
> > On 5/8/2021 4:20 AM, Joakim Zhang wrote:
> > >
> > > Hi Jakub,
> > >
> > >> -----Original Message-----
> > >> From: Jon Hunter <jonathanh@...dia.com>
> > >> Sent: May 7, 2021 22:22
> > >> To: Joakim Zhang <qiangqing.zhang@....com>; Jakub Kicinski
> > >> <kuba@...nel.org>
> > >> Cc: peppe.cavallaro@...com; alexandre.torgue@...s.st.com;
> > >> joabreu@...opsys.com; davem@...emloft.net; mcoquelin.stm32@...il.com;
> > >> andrew@...n.ch; f.fainelli@...il.com; dl-linux-imx
> > >> <linux-imx@....com>; treding@...dia.com; netdev@...r.kernel.org
> > >> Subject: Re: [RFC net-next] net: stmmac: should not modify RX
> > >> descriptor when STMMAC resume
> > >>
> > >> Hi Joakim,
> > >>
> > >> On 06/05/2021 07:33, Joakim Zhang wrote:
> > >>>
> > >>>> -----Original Message-----
> > >>>> From: Jon Hunter <jonathanh@...dia.com>
> > >>>> Sent: April 23, 2021 21:48
> > >>>> To: Jakub Kicinski <kuba@...nel.org>; Joakim Zhang
> > >>>> <qiangqing.zhang@....com>
> > >>>> Cc: peppe.cavallaro@...com; alexandre.torgue@...s.st.com;
> > >>>> joabreu@...opsys.com; davem@...emloft.net; mcoquelin.stm32@...il.com;
> > >>>> andrew@...n.ch; f.fainelli@...il.com; dl-linux-imx
> > >>>> <linux-imx@....com>; treding@...dia.com; netdev@...r.kernel.org
> > >>>> Subject: Re: [RFC net-next] net: stmmac: should not modify RX
> > >>>> descriptor when STMMAC resume
> > >>>>
> > >>>>
> > >>>> On 22/04/2021 16:56, Jakub Kicinski wrote:
> > >>>>> On Thu, 22 Apr 2021 04:53:08 +0000 Joakim Zhang wrote:
> > >>>>>> Could you please help review this patch? It's really beyond my
> > >>>>>> comprehension, why this patch would affect Tegra186 Jetson TX2
> > >>>>>> board?
> > >>>>>
> > >>>>> Looks okay, please repost as non-RFC.
> > >>>>
> > >>>>
> > >>>> I still have an issue with a board not being able to resume from
> > >>>> suspend with this patch. Shouldn't we try to resolve that first?
> > >>>
> > >>> Hi Jon,
> > >>>
> > >>> Any updates about this? Could I repost as non-RFC?
> > >>
> > >>
> > >> Sorry no updates from my end. Again, I don't see how we can post this
> > >> as it introduces a regression for us. I am sorry that I am not able
> > >> to help more here, but we have done some extensive testing on the
> > >> current mainline without your change and I don't see any issues with
> > >> regard to suspend/resume. Hence, this does not appear to fix any
> > >> pre-existing issues. It is possible that we are not seeing them.
> > >>
> > >> At this point I think that we really need someone from Synopsys to
> > >> help us understand the exact problem that you are experiencing so
> > >> that we can ensure we have the necessary fix in place and if this is
> > >> something that is applicable to all devices or not.
> > >
> > > This patch only removes the modification of Rx descriptors when STMMAC
> > > resumes; IMHO, it should not affect the system suspend/resume function.
> > > Do you have any idea about Jon's issue, or any acceptable solution to fix
> > > the issue I met? Thanks a lot!
> > 
> > Joakim, don't you have a support contact at Synopsys who would be able to
> > help or someone at NXP who was responsible for the MAC integration?
> > We also have Synopsys engineers copied so presumably they could shed some
> > light.
> 
> I contacted Synopsys but received no substantive help, and the integration engineers at NXP are unavailable now.
> 
> However, some hints have come out that may help a bit. I found that the DMA width is 34 bits on i.MX8MP, which may differ from many existing SoCs that integrate STMMAC.
> 
> As I described in the commit message:
> When the system suspends, the rx descriptor is 008 [0x00000000c4310080]: 0x0 0x40 0x0 0x34010040
> When the system resumes, the rx descriptor is modified to 008 [0x00000000c4310080]: 0x0 0x40 0x0 0xb5010040
> Since the DMA is 34 bits wide, desc0/desc1 together indicate the buffer address, so after system resume the buffer address has changed to 0x4000000000.
> And the correct rx descriptor is 008 [0x00000000c4310080]: 0x6511000 0x1 0x0 0x81000000, the valid buffer address is 0x16511000.
> So when the DMA tries to access 0x4000000000, this invalid address, it generates a fatal bus error.

Okay, that's interesting. If i.MX8MP supports only 34 address bits but
the driver tries to set a DMA address of 0x4000000000, that's way out of
the valid range.
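
To make the range problem concrete, here is a minimal Python sketch of the
arithmetic. It assumes, as the description above suggests, that desc0 holds
the low 32 bits of the buffer address and desc1 the high bits. dma_bit_mask()
mirrors the kernel's DMA_BIT_MASK() macro; buffer_address() and
fits_dma_width() are hypothetical helper names used only for illustration.

```python
def dma_bit_mask(width: int) -> int:
    """Userspace equivalent of the kernel's DMA_BIT_MASK(width)."""
    return (1 << width) - 1


def buffer_address(desc0: int, desc1: int) -> int:
    """Combine two 32-bit descriptor words into one address.

    Assumption: desc0 is the low half and desc1 the high half.
    """
    return ((desc1 & 0xFFFFFFFF) << 32) | (desc0 & 0xFFFFFFFF)


def fits_dma_width(addr: int, width: int) -> bool:
    """True if addr has no bits set above the given DMA width."""
    return (addr & ~dma_bit_mask(width)) == 0


# Descriptor words from the post-resume dump: 0x0 0x40 0x0 0xb5010040
bad = buffer_address(0x0, 0x40)
print(hex(bad))                 # 0x4000000000
print(fits_dma_width(bad, 34))  # False: bit 38 is set, a 34-bit mask ends at bit 33
```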

I suspect what might be happening is that the DMA mask isn't properly
set for your device. There's in fact some code in the driver that deals
with this. If you look at the implementation of stmmac_dvr_probe() in
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c around line 4980,
there's a comment that actually mentions i.MX8MP and the 34 address bit
limitation. Can you find out what priv->plat->addr64 is set to on
your system?

Or alternatively, find out what priv->dma_cap.addr64 ends up being set to
a few lines further down? That value is effectively used to set the DMA
mask and if that's wrong it might explain why the driver is setting a
bad DMA address.
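
As a rough model of that logic (a simplified Python sketch, not the driver
code itself): my understanding is that a non-zero platform value overrides
the width reported by the hardware, and the result determines the DMA mask.
The function name effective_dma_mask() is hypothetical, and the real probe
path has further cases that are not modelled here.

```python
from typing import Optional


def effective_dma_mask(plat_addr64: int, hw_addr64: int) -> Optional[int]:
    """Return the DMA mask a driver would request, or None if no width is known.

    Assumption: a non-zero platform value overrides the hardware-reported one.
    """
    width = plat_addr64 if plat_addr64 else hw_addr64
    if not width:
        return None
    return (1 << width) - 1  # same value as DMA_BIT_MASK(width)


# Hypothetical values: hardware reports 40 bits, platform data says 34.
print(hex(effective_dma_mask(34, 40)))  # 0x3ffffffff
print(hex(effective_dma_mask(0, 40)))   # 0xffffffffff
```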

In fact, maybe that information is already in the kernel log. There's a
dev_info() there that should print out something like:

	Using 34 bits DMA width

in your case. If that says something other than 34 in there, it's very
likely that this needs to be correctly set somewhere. Looking at the
code in dwmac-imx.c, I see that that's already set to 34, so this looks
like it should be setting things correctly, but better make sure.
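
One way to check this is to search the kernel log for that message. The
Python sketch below extracts the width from log text. The sample line and
its device prefix are made up for illustration; in practice the text would
come from dmesg output.

```python
import re
from typing import Optional

DMA_WIDTH_RE = re.compile(r"Using (\d+) bits DMA width")


def dma_width_from_log(log_text: str) -> Optional[int]:
    """Return the first DMA width reported in log_text, or None if absent."""
    match = DMA_WIDTH_RE.search(log_text)
    return int(match.group(1)) if match else None


# Hypothetical log excerpt; the device name is an assumption.
sample = "[    1.234567] imx-dwmac 30bf0000.ethernet: Using 34 bits DMA width\n"
width = dma_width_from_log(sample)
print(width)  # 34
if width != 34:
    print("unexpected DMA width, check the platform's addr64 setting")
```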

> But on other SoCs with 32-bit DMA, the DMA still seems to work when this issue happens: only desc0 indicates the buffer address, so the buffer address is 0x0 after system resume.
> And there is a NOTE in the guide:
> In the Receive Descriptor (Read Format), if the Buffer Address
> field is all 0s, the module does not transfer data to that buffer
> and skips to the next buffer or next descriptor.
> Regarding this note, I don't know what the IP actually does when it detects an all-zeros buffer address: does it change the descriptor to application-owned? If not, the STMMAC driver does not seem able to handle this case.
> I will contact the Synopsys guys for more details.
> 
> It now appears that this issue can only be reproduced with a DMA width of more than 32 bits, which may be why other SoCs (e.g. i.MX8DXL) that integrate the same STMMAC IP can't reproduce it.

On Tegra186 and later we support up to 40 address bits. The newer
Tegra194 has a special quirk where bit 39 has special meaning, so we
have to override the DMA mask as well. I recall that this was causing
issues at some point, which is why I suspect something like this could
be happening in your case as well.

Thierry
