netdev - RE: [PATCH v2] net: macb: Restart tx only if queue pointer is lagging

Message-ID: <BL3PR02MB818723E41D3939482DBAC23AC9E99@BL3PR02MB8187.namprd02.prod.outlook.com>
Date:   Fri, 8 Apr 2022 12:02:23 +0000
From:   Harini Katakam <harinik@...inx.com>
To:     Tomas Melin <tomas.melin@...sala.com>,
        "Claudiu.Beznea@...rochip.com" <Claudiu.Beznea@...rochip.com>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>
CC:     "Nicolas.Ferre@...rochip.com" <Nicolas.Ferre@...rochip.com>,
        "davem@...emloft.net" <davem@...emloft.net>,
        "kuba@...nel.org" <kuba@...nel.org>,
        "pabeni@...hat.com" <pabeni@...hat.com>,
        Shubhrajyoti Datta <shubhraj@...inx.com>,
        Michal Simek <michals@...inx.com>,
        "pthombar@...ence.com" <pthombar@...ence.com>,
        "mparab@...ence.com" <mparab@...ence.com>,
        "rafalo@...ence.com" <rafalo@...ence.com>
Subject: RE: [PATCH v2] net: macb: Restart tx only if queue pointer is lagging

Hi Tomas,

> -----Original Message-----
> From: Tomas Melin <tomas.melin@...sala.com>
> Sent: Friday, April 8, 2022 3:27 PM
> To: Harini Katakam <harinik@...inx.com>; Claudiu.Beznea@...rochip.com;
> netdev@...r.kernel.org
> Cc: Nicolas.Ferre@...rochip.com; davem@...emloft.net; kuba@...nel.org;
> pabeni@...hat.com; Shubhrajyoti Datta <shubhraj@...inx.com>; Michal
> Simek <michals@...inx.com>; pthombar@...ence.com;
> mparab@...ence.com; rafalo@...ence.com
> Subject: Re: [PATCH v2] net: macb: Restart tx only if queue pointer is lagging
> 
> Hi Claudiu, Harini,
> 
> On 08/04/2022 11:47, Harini Katakam wrote:
> > Hi Claudiu, Tomas,
> >
> >> -----Original Message-----
> >> From: Claudiu.Beznea@...rochip.com <Claudiu.Beznea@...rochip.com>
> >> Sent: Friday, April 8, 2022 1:13 PM
> >> To: tomas.melin@...sala.com; netdev@...r.kernel.org
> >> Cc: Nicolas.Ferre@...rochip.com; davem@...emloft.net; kuba@...nel.org;
> >> pabeni@...hat.com; Harini Katakam <harinik@...inx.com>; Shubhrajyoti
> >> Datta <shubhraj@...inx.com>; Michal Simek <michals@...inx.com>;
> >> pthombar@...ence.com; mparab@...ence.com; rafalo@...ence.com
> >> Subject: Re: [PATCH v2] net: macb: Restart tx only if queue pointer is
> >> lagging
> >>
> >> Hi, Tomas,
> >>
> >> I'm returning to this new thread.
> >>
> >> Sorry for the long delay. I looked through my emails for the steps to
> >> reproduce the bug that led to the introduction of macb_tx_restart() but
> >> haven't found them.
> >> That said, the code in this patch should not affect SAMA5D4 at all.
> >>
> >> I tested SAMA5D4 anyway, with and without your code, and saw no issues.
> >> In case Dave or Jakub want to merge it, you can add my
> >> Tested-by: Claudiu Beznea <claudiu.beznea@...rochip.com>
> >> Reviewed-by: Claudiu Beznea <claudiu.beznea@...rochip.com>
> 
> Thank you for the effort to review and test this! Also thanks for the
> discussions around this issue to provide further insights.
> 
> 
> >>
> >> The only concern with this patch, as mentioned earlier, is that freeing of
> >> packet N may depend on sending packet N+1, and if packet N+1 blocks the HW
> >> again, then the freeing of packets N and N+1 may depend on packet N+2, etc.
> >> But from your investigation it seems the hardware has some bugs.
> 
> Indeed, this is not behaviour I have encountered in any testing. If we
> were ever to encounter such an issue, then it would need to be handled in
> a separate manner. Perhaps call tx_interrupt() to progress the queue. But
> then again, this does not seem to happen.
> 
> >>
> >> FYI, I looked through the Xilinx GitHub repository and saw no patches on
> >> macb that may be related to this issue.
> >>
> >> Anyway, it would be good to have some replies from Xilinx, or at least
> >> Cadence people, on this (previous thread at [1]).
> >
> > Sorry for the delayed response.
> > I saw the condition you described and I'm not able to reproduce it.
> > But I agree with your assessment that restarting TX will not help in this
> > case.
> > Also, the issue behind the original patch restarting TX was not easily
> > reproduced on the Zynq board either. We've had some users report the issue
> > after > 1hr of traffic, but that was on a 4.xx kernel, and I'm afraid I
> > don't have a case where I can reproduce the original issue Claudiu
> > described on any 5.xx kernel.
> >
> > Based on the thread, there is one possibility of a HW bug: the controller
> > fails to generate TCOMP when TXUBR and restart conditions occur, because
> > these interrupts are edge triggered on Zynq.
> 
> This is an interesting hypothesis, and that would indeed lead to this situation.
> 
> 
> >
> > I'm going to check the errata and let you know if I find anything relevant,
> > and also request Cadence folks to comment.
> > I'm sorry to ask, but is this condition reproducible on any later variants
> > of the IP in Xilinx or non-Xilinx devices?
> 
> I have not seen this issue on MPSoC (at least not yet). Indeed, this issue
> seems to require the correct timing conditions to trigger it.
> 
> So any additional information that we might get about possible issues in the
> IP is welcome. However, the hardware on the boards we have at hand will
> still be the same, so the patch as such is relevant.

Yes, agreed. The patch is still required.
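
As a rough illustration of the check named in the subject line, here is a minimal C sketch. It is not the driver code: the function name, its parameters, and the plain index arithmetic are assumptions made for this example, whereas the real driver works with hardware registers and DMA descriptors. The idea is that a restart is only worthwhile when frames are queued and the hardware queue pointer has not yet reached the software head.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical model of "restart TX only if the queue pointer is lagging".
 * head/tail are free-running software indices, hw_idx is the ring slot the
 * hardware queue pointer currently refers to. All names are illustrative.
 */
static bool tx_restart_needed(unsigned int head, unsigned int tail,
                              unsigned int hw_idx, unsigned int ring_size)
{
    assert(ring_size > 0);

    /* Nothing is queued, so there is nothing to restart. */
    if (head == tail)
        return false;

    /*
     * The hardware pointer has already caught up with the software head:
     * every queued descriptor has been fetched, so a restart cannot help.
     */
    if (hw_idx % ring_size == head % ring_size)
        return false;

    /* Frames are queued and the hardware pointer is behind the head. */
    return true;
}
```

With a 16-entry ring, head 7 and tail 5, the function asks for a restart when the hardware pointer is at slot 6 and declines when it is already at slot 7.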

Regards,
Harini

> 
> BR,
> Tomas
> 
> 
> 
> > Zynq US+ MPSoC has the r1p07, while Zynq has the older IP version r1p23
> > (old versioning).
> >
> > Regards,
> > Harini
> >
> >>
> >> Thank you,
> >> Claudiu Beznea
> >>
> >> [1]
> >> https://lore.kernel.org/netdev/82276bf7-72a5-6a2e-ff33-f8fe0c5e4a90@...rochip.com/T/#m644c84a8709a65c40b8fc15a589e83b24e48ccfd
> >>
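
The concern quoted above, that freeing packet N may depend on sending packet N+1, can be pictured with a toy ring model. This is only a sketch under assumptions: the structure, the function names, and the 8-slot ring are invented for illustration and do not mirror the macb driver. A frame that the hardware has finished is only freed when the completion handler runs, so if the completion event for N is missed, N stays in the ring until a later completion triggers the handler.

```c
#include <assert.h>
#include <stdbool.h>

#define RING_SIZE 8u

/* Hypothetical TX ring: names and layout are illustrative only. */
struct tx_ring {
    bool done[RING_SIZE]; /* set by the "hardware" once a frame is sent */
    unsigned int head;    /* next slot software will fill (free-running) */
    unsigned int tail;    /* oldest slot not yet freed (free-running) */
};

/* Queue one frame and return the slot it occupies. */
static unsigned int tx_queue_frame(struct tx_ring *r)
{
    unsigned int slot = r->head % RING_SIZE;

    assert(r->head - r->tail < RING_SIZE); /* ring must not be full */
    r->done[slot] = false;
    r->head++;
    return slot;
}

/* The "hardware" marks a slot as transmitted. */
static void hw_complete(struct tx_ring *r, unsigned int slot)
{
    r->done[slot] = true;
}

/*
 * Completion handler: free every finished frame starting at the tail.
 * Returns how many frames were freed in this call.
 */
static unsigned int tx_reap(struct tx_ring *r)
{
    unsigned int freed = 0;

    while (r->tail != r->head && r->done[r->tail % RING_SIZE]) {
        r->done[r->tail % RING_SIZE] = false;
        r->tail++;
        freed++;
    }
    return freed;
}
```

In this model, if frame N completes but tx_reap() is never called for it, the tail does not move. Once frame N+1 completes and tx_reap() does run, both frames are freed in the same pass.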
<snip>