lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <77d7d190-0847-4dc9-8fc5-4e33308ce7c8@lunn.ch>
Date: Sat, 27 Apr 2024 23:17:37 +0200
From: Andrew Lunn <andrew@...n.ch>
To: Ramón Nordin Rodriguez <ramon.nordin.rodriguez@...roamp.se>
Cc: Parthiban Veerasooran <Parthiban.Veerasooran@...rochip.com>,
	davem@...emloft.net, edumazet@...gle.com, kuba@...nel.org,
	pabeni@...hat.com, horms@...nel.org, saeedm@...dia.com,
	anthony.l.nguyen@...el.com, netdev@...r.kernel.org,
	linux-kernel@...r.kernel.org, corbet@....net,
	linux-doc@...r.kernel.org, robh+dt@...nel.org,
	krzysztof.kozlowski+dt@...aro.org, conor+dt@...nel.org,
	devicetree@...r.kernel.org, horatiu.vultur@...rochip.com,
	ruanjinjie@...wei.com, steen.hegelund@...rochip.com,
	vladimir.oltean@....com, UNGLinuxDriver@...rochip.com,
	Thorsten.Kummermehr@...rochip.com, Pier.Beruto@...emi.com,
	Selvamani.Rajagopal@...emi.com, Nicolas.Ferre@...rochip.com,
	benjamin.bigler@...nformulastudent.ch
Subject: Re: [PATCH net-next v4 05/12] net: ethernet: oa_tc6: implement error
 interrupts unmasking

On Sat, Apr 27, 2024 at 09:52:15PM +0200, Ramón Nordin Rodriguez wrote:
> > +static int oa_tc6_unmask_macphy_error_interrupts(struct oa_tc6 *tc6)
> > +{
> > +	u32 regval;
> > +	int ret;
> > +
> > +	ret = oa_tc6_read_register(tc6, OA_TC6_REG_INT_MASK0, &regval);
> > +	if (ret)
> > +		return ret;
> > +
> > +	regval &= ~(INT_MASK0_TX_PROTOCOL_ERR_MASK |
> > +		    INT_MASK0_RX_BUFFER_OVERFLOW_ERR_MASK |
> > +		    INT_MASK0_LOSS_OF_FRAME_ERR_MASK |
> > +		    INT_MASK0_HEADER_ERR_MASK);
> > +
> > +	return oa_tc6_write_register(tc6, OA_TC6_REG_INT_MASK0, regval);
> > +}
> > +
> 
> This togheter with patch 11 works poorly for me. I get alot of kernel
> output, dropped packets and lower performance.
> Below is an example for a run when I curl a 10MB blob
> 
> time curl 20.0.0.55:8000/rdump -o dump -w '{%speed_download}'
>   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
>                                  Dload  U[  387.944737] net_ratelimit: 38 callbacks suppressed
> pload   Total   Spent    Left  Sp[  387.944755] eth0: Receive buffer overflow error
> eed
>   0     0    0     0    0     0      0      0 --:--:-- --:-[  387.961424] eth0: Receive buffer overflow error
>   0 10.0M    0  2896    0     0  13031      0  0:13:24 --:--:--  0:13:24 12986[  388.204257] eth0: Receive buffer overflow error
> [  388.209848] eth0: Receive buffer overflow error

How fast is your SPI bus? Faster than the link speed? Or slower?

It could be different behaviour is needed depending on the SPI bus
speed. If the SPI bus is faster than the link speed, by some margin,
the receiver buffer should not overflow, since the CPU can empty the
buffer faster than it fills.

If however, the SPI bus is slower than the link speed, there will be
buffer overflows, and a reliance on TCP backing off and slowing down.
The driver should not be spamming the log, since it is going to happen
and there is nothing that can be done about it.

> I tried this patch
> 
> diff --git a/drivers/net/ethernet/oa_tc6.c b/drivers/net/ethernet/oa_tc6.c
> index 9f17f3712137..bd7bd3ef6897 100644
> --- a/drivers/net/ethernet/oa_tc6.c
> +++ b/drivers/net/ethernet/oa_tc6.c
> @@ -615,21 +615,9 @@ static int oa_tc6_sw_reset_macphy(struct oa_tc6 *tc6)
>         return oa_tc6_write_register(tc6, OA_TC6_REG_STATUS0, regval);
>  }
> 
> -static int oa_tc6_unmask_macphy_error_interrupts(struct oa_tc6 *tc6)
> +static int oa_tc6_disable_imask0_interrupts(struct oa_tc6 *tc6)
>  {
> -       u32 regval;
> -       int ret;
> -
> -       ret = oa_tc6_read_register(tc6, OA_TC6_REG_INT_MASK0, &regval);
> -       if (ret)
> -               return ret;
> -
> -       regval &= ~(INT_MASK0_TX_PROTOCOL_ERR_MASK |
> -                   INT_MASK0_RX_BUFFER_OVERFLOW_ERR_MASK |
> -                   INT_MASK0_LOSS_OF_FRAME_ERR_MASK |
> -                   INT_MASK0_HEADER_ERR_MASK);
> -
> -       return oa_tc6_write_register(tc6, OA_TC6_REG_INT_MASK0, regval);
> +       return oa_tc6_write_register(tc6, OA_TC6_REG_INT_MASK0, (u32)-1);

So this appears to be disabling all error interrupts?

This is maybe going too far. Overflow errors are going to happen if
you have a slow SPI bus. So i probably would not enable that. However,
are the other errors actually expected in normal usage? If not, leave
them enabled, because they might indicate real problems.

> Which results in no log spam, ~5-10% higher throughput and no dropped
> packets when I look at /sys/class/net/eth0/statistics/rx_dropped

You cannot trust rx_dropped because you just disabled the code which
increments it! The device is probably still dropping packets, and they
are no longer counted.

It could be the performance increase comes from two places:

1) Spending time and bus bandwidth dealing with the buffer overflow
interrupt

2) Printing out the serial port.

Please could you benchmark a few things:

1) Remove the printk("Receive buffer overflow error"), but otherwise
keep the code the same. That will give us an idea how much the serial
port matters.

2) Disable only the RX buffer overflow interrupt

3) Disable all error interrupts.

   Andrew

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ