Message-ID: <77d7d190-0847-4dc9-8fc5-4e33308ce7c8@lunn.ch>
Date: Sat, 27 Apr 2024 23:17:37 +0200
From: Andrew Lunn <andrew@...n.ch>
To: Ramón Nordin Rodriguez <ramon.nordin.rodriguez@...roamp.se>
Cc: Parthiban Veerasooran <Parthiban.Veerasooran@...rochip.com>,
davem@...emloft.net, edumazet@...gle.com, kuba@...nel.org,
pabeni@...hat.com, horms@...nel.org, saeedm@...dia.com,
anthony.l.nguyen@...el.com, netdev@...r.kernel.org,
linux-kernel@...r.kernel.org, corbet@....net,
linux-doc@...r.kernel.org, robh+dt@...nel.org,
krzysztof.kozlowski+dt@...aro.org, conor+dt@...nel.org,
devicetree@...r.kernel.org, horatiu.vultur@...rochip.com,
ruanjinjie@...wei.com, steen.hegelund@...rochip.com,
vladimir.oltean@....com, UNGLinuxDriver@...rochip.com,
Thorsten.Kummermehr@...rochip.com, Pier.Beruto@...emi.com,
Selvamani.Rajagopal@...emi.com, Nicolas.Ferre@...rochip.com,
benjamin.bigler@...nformulastudent.ch
Subject: Re: [PATCH net-next v4 05/12] net: ethernet: oa_tc6: implement error
interrupts unmasking
On Sat, Apr 27, 2024 at 09:52:15PM +0200, Ramón Nordin Rodriguez wrote:
> > +static int oa_tc6_unmask_macphy_error_interrupts(struct oa_tc6 *tc6)
> > +{
> > + u32 regval;
> > + int ret;
> > +
> > + ret = oa_tc6_read_register(tc6, OA_TC6_REG_INT_MASK0, &regval);
> > + if (ret)
> > + return ret;
> > +
> > + regval &= ~(INT_MASK0_TX_PROTOCOL_ERR_MASK |
> > + INT_MASK0_RX_BUFFER_OVERFLOW_ERR_MASK |
> > + INT_MASK0_LOSS_OF_FRAME_ERR_MASK |
> > + INT_MASK0_HEADER_ERR_MASK);
> > +
> > + return oa_tc6_write_register(tc6, OA_TC6_REG_INT_MASK0, regval);
> > +}
> > +
>
> This together with patch 11 works poorly for me. I get a lot of kernel
> output, dropped packets and lower performance.
> Below is an example from a run where I curl a 10MB blob
>
> time curl 20.0.0.55:8000/rdump -o dump -w '{%speed_download}'
> % Total % Received % Xferd Average Speed Time Time Time Current
> Dload U[ 387.944737] net_ratelimit: 38 callbacks suppressed
> pload Total Spent Left Sp[ 387.944755] eth0: Receive buffer overflow error
> eed
> 0 0 0 0 0 0 0 0 --:--:-- --:-[ 387.961424] eth0: Receive buffer overflow error
> 0 10.0M 0 2896 0 0 13031 0 0:13:24 --:--:-- 0:13:24 12986[ 388.204257] eth0: Receive buffer overflow error
> [ 388.209848] eth0: Receive buffer overflow error
How fast is your SPI bus? Faster than the link speed? Or slower?

It could be that different behaviour is needed depending on the SPI bus
speed. If the SPI bus is faster than the link speed, by some margin,
the receive buffer should not overflow, since the CPU can empty the
buffer faster than it fills.

If however, the SPI bus is slower than the link speed, there will be
buffer overflows, and a reliance on TCP backing off and slowing down.
The driver should not be spamming the log in that case, since the
overflows are going to happen and there is nothing that can be done
about them.
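As a rough way to answer the question above, here is a minimal sketch
that compares best-case SPI payload throughput with the link speed. It
assumes each data chunk carries a 4-byte header and a 64-byte payload,
and that SPI moves one bit per clock in each direction; the function
names are made up for illustration and are not part of the driver.
Chip-select gaps, interrupt latency and control transactions are
ignored, so real throughput will be lower.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed chunk framing: 4-byte header plus 64-byte payload per chunk.
 * Adjust these if the MAC-PHY is configured differently. */
#define CHUNK_HEADER_BYTES  4u
#define CHUNK_PAYLOAD_BYTES 64u

/* Upper bound on payload bits per second the SPI bus can move, assuming
 * one bit per SPI clock and no gaps between chunks. */
static uint64_t spi_payload_bps(uint64_t spi_clock_hz)
{
	return spi_clock_hz * CHUNK_PAYLOAD_BYTES /
	       (CHUNK_PAYLOAD_BYTES + CHUNK_HEADER_BYTES);
}

/* True if, in the best case, the SPI bus can drain the receive buffer at
 * least as fast as the link fills it. */
static bool spi_outpaces_link(uint64_t spi_clock_hz, uint64_t link_bps)
{
	return spi_payload_bps(spi_clock_hz) >= link_bps;
}
```

With these assumptions, a 17 MHz SPI clock gives at most 16 Mbit/s of
payload, which leaves some margin over a 10 Mbit/s link, while an
8.5 MHz clock gives at most 8 Mbit/s and will overflow under sustained
load.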
> I tried this patch
>
> diff --git a/drivers/net/ethernet/oa_tc6.c b/drivers/net/ethernet/oa_tc6.c
> index 9f17f3712137..bd7bd3ef6897 100644
> --- a/drivers/net/ethernet/oa_tc6.c
> +++ b/drivers/net/ethernet/oa_tc6.c
> @@ -615,21 +615,9 @@ static int oa_tc6_sw_reset_macphy(struct oa_tc6 *tc6)
> return oa_tc6_write_register(tc6, OA_TC6_REG_STATUS0, regval);
> }
>
> -static int oa_tc6_unmask_macphy_error_interrupts(struct oa_tc6 *tc6)
> +static int oa_tc6_disable_imask0_interrupts(struct oa_tc6 *tc6)
> {
> - u32 regval;
> - int ret;
> -
> - ret = oa_tc6_read_register(tc6, OA_TC6_REG_INT_MASK0, &regval);
> - if (ret)
> - return ret;
> -
> - regval &= ~(INT_MASK0_TX_PROTOCOL_ERR_MASK |
> - INT_MASK0_RX_BUFFER_OVERFLOW_ERR_MASK |
> - INT_MASK0_LOSS_OF_FRAME_ERR_MASK |
> - INT_MASK0_HEADER_ERR_MASK);
> -
> - return oa_tc6_write_register(tc6, OA_TC6_REG_INT_MASK0, regval);
> + return oa_tc6_write_register(tc6, OA_TC6_REG_INT_MASK0, (u32)-1);
So this appears to be disabling all error interrupts?

This is maybe going too far. Overflow errors are going to happen if
you have a slow SPI bus, so I probably would not enable that one.
However, are the other errors actually expected in normal usage? If
not, leave them enabled, because they might indicate real problems.
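A minimal sketch of that middle ground, computing an INT_MASK0 value
which keeps the RX buffer overflow interrupt masked and unmasks the
other three errors. The bit positions below are placeholders for
illustration and should be taken from the driver's own definitions; the
helper name is hypothetical. As in the quoted code, a set bit masks
(disables) the interrupt.

```c
#include <stdint.h>

/* Placeholder bit positions, for illustration only. Use the real
 * definitions from the driver. */
#define INT_MASK0_TX_PROTOCOL_ERR_MASK        (1u << 0)
#define INT_MASK0_RX_BUFFER_OVERFLOW_ERR_MASK (1u << 3)
#define INT_MASK0_LOSS_OF_FRAME_ERR_MASK      (1u << 4)
#define INT_MASK0_HEADER_ERR_MASK             (1u << 5)

/* Hypothetical helper: given the current INT_MASK0 value, return the
 * value to write back. Bits outside the four error bits are preserved. */
static uint32_t int_mask0_without_overflow(uint32_t regval)
{
	/* Unmask the errors that are not expected in normal usage. */
	regval &= ~(INT_MASK0_TX_PROTOCOL_ERR_MASK |
		    INT_MASK0_LOSS_OF_FRAME_ERR_MASK |
		    INT_MASK0_HEADER_ERR_MASK);

	/* Keep the overflow interrupt masked: it is expected on a slow bus. */
	regval |= INT_MASK0_RX_BUFFER_OVERFLOW_ERR_MASK;

	return regval;
}
```

In the driver this would sit between the existing read and write of
OA_TC6_REG_INT_MASK0, in place of the single mask expression.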
> Which results in no log spam, ~5-10% higher throughput and no dropped
> packets when I look at /sys/class/net/eth0/statistics/rx_dropped
You cannot trust rx_dropped because you just disabled the code which
increments it! The device is probably still dropping packets, and they
are no longer counted.
It could be that the performance increase comes from two places:

1) No longer spending time and bus bandwidth dealing with the buffer
overflow interrupt.
2) No longer printing to the serial port.

Please could you benchmark a few things:

1) Remove the printk("Receive buffer overflow error"), but otherwise
keep the code the same. That will give us an idea how much the serial
port matters.
2) Disable only the RX buffer overflow interrupt.
3) Disable all error interrupts.
Andrew