lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Zi10QS6UGGaNVRaB@builder>
Date: Sat, 27 Apr 2024 23:55:13 +0200
From: Ramón Nordin Rodriguez <ramon.nordin.rodriguez@...roamp.se>
To: Andrew Lunn <andrew@...n.ch>
Cc: Parthiban Veerasooran <Parthiban.Veerasooran@...rochip.com>,
	davem@...emloft.net, edumazet@...gle.com, kuba@...nel.org,
	pabeni@...hat.com, horms@...nel.org, saeedm@...dia.com,
	anthony.l.nguyen@...el.com, netdev@...r.kernel.org,
	linux-kernel@...r.kernel.org, corbet@....net,
	linux-doc@...r.kernel.org, robh+dt@...nel.org,
	krzysztof.kozlowski+dt@...aro.org, conor+dt@...nel.org,
	devicetree@...r.kernel.org, horatiu.vultur@...rochip.com,
	ruanjinjie@...wei.com, steen.hegelund@...rochip.com,
	vladimir.oltean@....com, UNGLinuxDriver@...rochip.com,
	Thorsten.Kummermehr@...rochip.com, Pier.Beruto@...emi.com,
	Selvamani.Rajagopal@...emi.com, Nicolas.Ferre@...rochip.com,
	benjamin.bigler@...nformulastudent.ch
Subject: Re: [PATCH net-next v4 05/12] net: ethernet: oa_tc6: implement error
 interrupts unmasking

> How fast is your SPI bus? Faster than the link speed? Or slower?
> 
> It could be different behaviour is needed depending on the SPI bus
> speed. If the SPI bus is faster than the link speed, by some margin,
> the receiver buffer should not overflow, since the CPU can empty the
> buffer faster than it fills.

I'm running at 25MHz, I'm guessing that should translate to fast enough
for the 10MBit half duplex link.
But I'm not sure how the spi clock translates to bps here.

> 
> If however, the SPI bus is slower than the link speed, there will be
> buffer overflows, and a reliance on TCP backing off and slowing down.
> The driver should not be spamming the log, since it is going to happen
> and there is nothing that can be done about it.
> 

I agree, I think the print could be a DBG if deemed necessary, but there
is also the dropped counter to look at.

> > I tried this patch
> > 
> > diff --git a/drivers/net/ethernet/oa_tc6.c b/drivers/net/ethernet/oa_tc6.c
> > index 9f17f3712137..bd7bd3ef6897 100644
> > --- a/drivers/net/ethernet/oa_tc6.c
> > +++ b/drivers/net/ethernet/oa_tc6.c
> > @@ -615,21 +615,9 @@ static int oa_tc6_sw_reset_macphy(struct oa_tc6 *tc6)
> >         return oa_tc6_write_register(tc6, OA_TC6_REG_STATUS0, regval);
> >  }
> > 
> > -static int oa_tc6_unmask_macphy_error_interrupts(struct oa_tc6 *tc6)
> > +static int oa_tc6_disable_imask0_interrupts(struct oa_tc6 *tc6)
> >  {
> > -       u32 regval;
> > -       int ret;
> > -
> > -       ret = oa_tc6_read_register(tc6, OA_TC6_REG_INT_MASK0, &regval);
> > -       if (ret)
> > -               return ret;
> > -
> > -       regval &= ~(INT_MASK0_TX_PROTOCOL_ERR_MASK |
> > -                   INT_MASK0_RX_BUFFER_OVERFLOW_ERR_MASK |
> > -                   INT_MASK0_LOSS_OF_FRAME_ERR_MASK |
> > -                   INT_MASK0_HEADER_ERR_MASK);
> > -
> > -       return oa_tc6_write_register(tc6, OA_TC6_REG_INT_MASK0, regval);
> > +       return oa_tc6_write_register(tc6, OA_TC6_REG_INT_MASK0, (u32)-1);
> 
> So this appears to be disabling all error interrupts?

Yes, and I think you are right in that it's an overcorrection. There is
a secondary interrupt mask register as well that is not touched by the
driver, so that's left at whatever the chip defaults to on reset.

> 
> This is maybe going too far. Overflow errors are going to happen if
> you have a slow SPI bus. So i probably would not enable that. However,
> are the other errors actually expected in normal usage? If not, leave
> them enabled, because they might indicate real problems.

I'm guessing you are right and that the others actually would be
meningful to log.
There is a nested question here as well, and that is wheter to keep or
drop the code that drops the rx buffer on overflow interrupt.
I think not dropping the full buffer could be one of the reasons for the
perf change.

> 
> > Which results in no log spam, ~5-10% higher throughput and no dropped
> > packets when I look at /sys/class/net/eth0/statistics/rx_dropped
> 
> You cannot trust rx_dropped because you just disabled the code which
> increments it! The device is probably still dropping packets, and they
> are no longer counted.

I'll curb my enthusiasm a bit :)

> 
> It could be the performance increase comes from two places:
> 
> 1) Spending time and bus bandwidth dealing with the buffer overflow
> interrupt
> 
> 2) Printing out the serial port.
> 

I think it's possible that the buffer cleanup triggered after the
overflow interrupt hits could be cause 3) here, but just a guess.

> Please could you benchmark a few things:
> 
> 1) Remove the printk("Receive buffer overflow error"), but otherwise
> keep the code the same. That will give us an idea how much the serial
> port matters.
> 
> 2) Disable only the RX buffer overflow interrupt
> 
> 3) Disable all error interrupts.
> 

Thanks for layout it out so clearly. I'll run through the scenarios!

R

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ