linux-kernel - RE: Regression v5.12-rc3: net: stmmac: re-init rx buffers when mac resume back


Message-ID: <DB8PR04MB6795863753DAD71F1F64F81DE6629@DB8PR04MB6795.eurprd04.prod.outlook.com>
Date:   Thu, 25 Mar 2021 07:53:26 +0000
From:   Joakim Zhang <qiangqing.zhang@....com>
To:     Jon Hunter <jonathanh@...dia.com>
CC:     "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        linux-tegra <linux-tegra@...r.kernel.org>,
        Jakub Kicinski <kuba@...nel.org>
Subject: RE: Regression v5.12-rc3: net: stmmac: re-init rx buffers when mac
 resume back


> -----Original Message-----
> From: Jon Hunter <jonathanh@...dia.com>
> Sent: March 24, 2021 20:39
> To: Joakim Zhang <qiangqing.zhang@....com>
> Cc: netdev@...r.kernel.org; Linux Kernel Mailing List
> <linux-kernel@...r.kernel.org>; linux-tegra <linux-tegra@...r.kernel.org>;
> Jakub Kicinski <kuba@...nel.org>
> Subject: Re: Regression v5.12-rc3: net: stmmac: re-init rx buffers when mac
> resume back
> 
> 
> 
> On 24/03/2021 12:20, Joakim Zhang wrote:
> 
> ...
> 
> > Sorry for this breakage on your side.
> >
> > You mean one of your boards? Do other boards with STMMAC work fine?
> 
> We have two devices with the STMMAC; one works OK and the other fails.
> They are different generations of device, so there could be some
> architectural differences that cause this to be seen on only one device.
It's really strange, but I also don't know what architectural differences could affect this. Sorry.

> > We test daily with an NFS-mounted rootfs, and no issue was found. I added
> > this patch on the resume path, and on error check, this should not break
> > suspend. I even ran an overnight stress test, and no issue was found.
> >
> > Could you please do more testing to see where the issue happens?
> 
> The issue occurs 100% of the time on the failing board and always on the first
> resume from suspend. Is there any more debug I can enable to track down
> what the problem is?
> 

As the commit message describes, the patch aims to re-initialize the rx buffer addresses. Since the addresses are not fixed, I can only
recycle and then re-allocate all of them. The page pool is allocated once, when the net device is opened.

Could you please check whether it fails in some function, such as page_pool_dev_alloc_pages()?
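
To illustrate the recycle-then-re-allocate idea, here is a minimal userspace sketch. It is not the stmmac driver code: every name in it (toy_pool, toy_rx_ring, toy_reinit_rx_buffers, the ring and pool sizes) is hypothetical, and the pool is a toy stand-in for the kernel page pool. It only shows why a re-allocation could fail on resume if the pool no longer holds enough pages, and how logging the failing index narrows that down.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define RX_RING_SIZE 4 /* hypothetical number of rx descriptors */
#define POOL_PAGES   4 /* hypothetical pool capacity */

/* Toy stand-in for a page pool: fixed storage plus a free list. */
struct toy_pool {
    unsigned char pages[POOL_PAGES][64];
    void *free_list[POOL_PAGES];
    size_t free_count;
};

/* Toy rx ring: one buffer pointer per descriptor. */
struct toy_rx_ring {
    void *buf[RX_RING_SIZE];
};

/* Created once, analogous to allocating the pool when the device is opened. */
static void toy_pool_init(struct toy_pool *pool)
{
    pool->free_count = 0;
    for (size_t i = 0; i < POOL_PAGES; i++)
        pool->free_list[pool->free_count++] = pool->pages[i];
}

/* Returns a free page, or NULL when the pool is exhausted. */
static void *toy_pool_alloc(struct toy_pool *pool)
{
    if (pool->free_count == 0)
        return NULL;
    return pool->free_list[--pool->free_count];
}

/* Hands a page back to the pool. */
static void toy_pool_recycle(struct toy_pool *pool, void *page)
{
    assert(pool->free_count < POOL_PAGES);
    pool->free_list[pool->free_count++] = page;
}

/*
 * Recycle every buffer the ring still holds, then re-allocate all of
 * them. Returns 0 on success, -1 if any allocation fails.
 */
static int toy_reinit_rx_buffers(struct toy_pool *pool,
                                 struct toy_rx_ring *ring)
{
    for (size_t i = 0; i < RX_RING_SIZE; i++) {
        if (ring->buf[i]) {
            toy_pool_recycle(pool, ring->buf[i]);
            ring->buf[i] = NULL;
        }
    }

    for (size_t i = 0; i < RX_RING_SIZE; i++) {
        ring->buf[i] = toy_pool_alloc(pool);
        if (!ring->buf[i]) {
            /* Debug aid: report which descriptor could not be refilled. */
            fprintf(stderr, "rx buffer %zu: allocation failed\n", i);
            return -1;
        }
    }
    return 0;
}
```

Calling toy_reinit_rx_buffers() once after toy_pool_init() models the initial fill, and calling it again models a resume. If a page has gone missing from the ring in between, the second pass runs out of pages and the message names the descriptor that could not be refilled.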

Best Regards,
Joakim Zhang
> Jon
> 
> --
> nvpublic