lists.openwall.net | lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC | |
Open Source and information security mailing list archives
| ||
|
Message-ID: <0ceb7ed6-2535-aa40-52a4-c6a8e99dafdc@microchip.com> Date: Mon, 24 Apr 2023 08:54:02 +0000 From: <Claudiu.Beznea@...rochip.com> To: <ingo.rohloff@...terbach.com>, <robert.hancock@...ian.com> CC: <Nicolas.Ferre@...rochip.com>, <davem@...emloft.net>, <kuba@...nel.org>, <netdev@...r.kernel.org>, <tomas.melin@...sala.com> Subject: Re: [PATCH 0/1] Alternative, restart tx after tx used bit read On 08.04.2023 00:33, Ingo Rohloff wrote: > [You don't often get email from ingo.rohloff@...terbach.com. Learn why this is important at https://aka.ms/LearnAboutSenderIdentification ] > > EXTERNAL EMAIL: Do not click links or open attachments unless you know the content is safe > > I am sorry; this is a long E-Mail. > > I am referring to this problem: > > Robert Hancock wrote: >> On Wed, 2022-03-23 at 08:43 -0700, Jakub Kicinski wrote: >>> On Wed, 23 Mar 2022 10:08:20 +0200 Tomas Melin wrote: >>>>> From: Claudiu Beznea <claudiu.beznea@...rochip.com> >>>>> >>>>> On some platforms (currently detected only on SAMA5D4) TX might stuck >>>>> even the pachets are still present in DMA memories and TX start was >>>>> issued for them. >>>>> ... >>>> On Xilinx Zynq the above change can cause infinite interrupt loop >>>> leading to CPU stall. Seems timing/load needs to be appropriate for >>>> this to happen, and currently with 1G ethernet this can be triggered >>>> normally within minutes when running stress tests on the network >>>> interface. >>>> ... >>> Which kernel version are you using? Robert has been working on macb + >>> Zynq recently, adding him to CC. >> ... >> I haven't looked at the TX ring descriptor and register setup on this core >> in that much detail, but the fact the controller gets into this "TX used >> bit read" state in the first place seems unusual. I'm wondering if >> something is being done in the wrong order or if we are missing a memory >> barrier etc? > > I am developing on a ZynqMP (Ultrascale+) SoC from AMD/Xilinx. > I have seen the same issue before commit 4298388574dae6168 ("net: macb: > restart tx after tx used bit read") > > The scenario which sometimes triggers it for me: > > I have an application running on the PC. > The application sends a short command (via TCP) to the ZynqMP. > The ZynqMP answers with a long stream of bytes via TCP > (around 230KiB). > The PC knows the amount of data and waits to receive the data completely. > The PC gets stuck, because the last TCP segment of the transfer gets > stuck in the ZynqMP and is not transmitted. > You can re-trigger the TX Ring by pinging the ZynqMP: > The Ping answer will re-trigger the TX ring, which in turn will also > then send the stuck IP/TCP packet. From the moment I've been working on this I remember ping was also restarting the TX for me (which is expected as it enqueues more descriptors and issue TSTART). At the time I worked on this the hardware prefetch was not implemented in software (and I think the hardware I had the issue with didn't support it). > > Unfortunately triggering this problem seems to be hard; at least I am > not able to reproduce it easily. It was the same on my side. > > So: If anyone has a more reliable way to trigger the problem, > please tell me. > This is to check if my proposed alternative works under all circumstances. > > I have an alternate implementation, which does not require to turn on > the "TX USED BIT READ" (TUBR) interrupt. > The reason why I think this alternative might be better is, because I > believe the TUBR interrupt happens at the wrong time; so I am not sure > that the current implementation works reliably. > > Analysis: > Commit 404cd086f29e867f ("net: macb: Allocate valid memory for TX and RX BD > prefetch") mentions: > > GEM version in ZynqMP and most versions greater than r1p07 supports > TX and RX BD prefetch. The number of BDs that can be prefetched is a > HW configurable parameter. For ZynqMP, this parameter is 4. > > I think what happens is this: > Example Scenario (SW == linux kernel, HW == cadence ethernet IP). > 1) SW has written TX descriptors 0..7 > 2) HW is currently transmitting TX descriptor 6. > HW has already prefetched TX descriptors 6,7,8,9. > 3) SW writes TX descriptor 8 (clearing TX_USED) > 4) SW writes the TSTART bit. > HW ignores this, because it is still transmitting. > 5) HW transmits TX descriptor 7. > 6) HW reaches descriptor 8; because this descriptor > has already been prefetched, HW sees a non-active > descriptor (TX_USED set) and stops transmitting. I think you can check TBQP register to see if hardware position in TX queue is ahead of software to validate against hardware this prefetch scenario. > >>From debugging the code it seems that the TUBR interrupt happens, when > a descriptor is prefetched, which has a TX_USED bit set, which is before From what I know, yes, TUBR is raised when HW TX logic reaches a descriptor with TX_USED set. > it is processed by the rest of the hardware: > When looking at the end of a transfer it seems I get a TUBR interrupt, > followed by some more TX COMPLETE interrupts. > > Additionally that means at the time the TUBR interrupt happens, it > is too early to write the TSTART bit again, because the hardware is > still actively transmitting. > > The alternative I implemented is to check in macb_tx_complete() if > > 1) The TX Queue is non-empty (there are pending TX descriptors) > 2) The hardware indicates that it is not transmitting any more > > If this situation is detected, the TSTART bit will be written to > restart the TX ring. > > I know for sure, that I hit the code path, which restarts the > transmission in macb_tx_complete(); that's why I believe the > "Example Scenario" I described above is correct. > > I am still not sure if what I implemented is enough: > macb_tx_complete() should at least see all completed TX descriptors. > I still believe there is a (very short) time window in which there > might be a race: > 1) HW completes TX descriptor 7 and sets the TX_USED bit > in TX descriptor 7. > TX descriptor 8 was prefetched with a set TX_USED bit. > 2) SW sees that TX descriptor 7 is completed > (TX_USED bit now is set). > 3) SW sees that there still is a pending TX descriptor 8. > 4) SW checks if the TGO bit is still set, which it is. > So the SW does nothing at this point. > 5) HW processes the prefetched,set TX_USED bit in > TX descriptor 8 and stops transmission (clearing the TGO bit). > > I am not sure if it is guaranteed that 5) cannot happen after 4). If 5) > happens after 4) as described above, then the controller still gets stuck. > The only idea I can come up with, is to re-check the TGO bit > a second time a little bit later, but I am not sure how to > implement this. > > Is there anyone who has access to hardware documentation, which > sheds some light onto the way the descriptor prefetching works? Some part of it could be found here: https://ww1.microchip.com/downloads/aemDocuments/documents/MPU32/ProductDocuments/DataSheets/SAMA7G5-Series-Data-Sheet-DS60001765A.pdf#_OPENTOPIC_TOC_PROCESSING_d17023e778582 Thank you, Claudiu > > so long > Ingo > > > Ingo Rohloff (1): > net: macb: A different way to restart a stuck TX descriptor ring. > > drivers/net/ethernet/cadence/macb.h | 1 - > drivers/net/ethernet/cadence/macb_main.c | 67 +++++++++--------------- > 2 files changed, 24 insertions(+), 44 deletions(-) > > -- > 2.17.1 >
Powered by blists - more mailing lists