[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250212151108.jI8qODdD@linutronix.de>
Date: Wed, 12 Feb 2025 16:11:08 +0100
From: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
To: Wander Lairson Costa <wander@...hat.com>
Cc: Tony Nguyen <anthony.l.nguyen@...el.com>, davem@...emloft.net,
kuba@...nel.org, pabeni@...hat.com, edumazet@...gle.com,
andrew+netdev@...n.ch, netdev@...r.kernel.org, rostedt@...dmis.org,
clrkwllms@...nel.org, jgarzik@...hat.com, yuma@...hat.com,
linux-rt-devel@...ts.linux.dev
Subject: Re: [PATCH net 0/4][pull request] igb: fix igb_msix_other() handling
for PREEMPT_RT
On 2025-02-12 08:56:47 [-0300], Wander Lairson Costa wrote:
> > What disables preemption? On PREEMPT_RT the spin_lock() does not disable
> > preemption. You shouldn't spin that long. When was interrupt scheduled.
> > _Why_ is the interrupt delayed that long.
> >
> When I was using trace-cmd report -l, it omitted some fields, one of
> them is preempt-lazy-depth (which was something new to me), and it seems
> this is what affects interrupts. It comes from here [1]. I had the logs,
> but the machine went under maintenance before I could save them. Once
> it comes back, I can grab them and post here.
>
> [1] https://elixir.bootlin.com/linux/v6.13.2/source/drivers/net/ethernet/intel/igbvf/netdev.c#L1522
If you do send patches against mainline please test against mainline. As
of today we have preempt-disable and migrate-disable depth. We don't
do lazy-depth anymore, we just have a bit now (which is [lLbB]).
The referenced line will only disable migration, not preemption.
It is important to understand what exactly is going on.
> > diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
> > index d368b753a4675..6fe37b8001c36 100644
> > --- a/drivers/net/ethernet/intel/igb/igb_main.c
> > +++ b/drivers/net/ethernet/intel/igb/igb_main.c
> > @@ -912,7 +912,7 @@ static int igb_request_msix(struct igb_adapter *adapter)
> > struct net_device *netdev = adapter->netdev;
> > int i, err = 0, vector = 0, free_vector = 0;
> >
> > - err = request_irq(adapter->msix_entries[vector].vector,
> > + err = request_threaded_irq(adapter->msix_entries[vector].vector, NULL,
> > igb_msix_other, 0, netdev->name, adapter);
> > if (err)
> > goto err_out;
> >
> > just to see if it solves the problem?
> >
> I have two test cases:
>
> 1) Boot the machine with nr_cpus=1. The driver reports "PF still
> resetting" message continuously. This issue is gone.
good.
> 2) Run the following script:
>
> ipaddr_vlan=3
> nic_test=ens14f0
> vf=${nic_test}v0 # The main testing steps:
> while true; do
> ip link set ${nic_test} mtu 1500
> ip link set ${vf} mtu 1500
> ip link set $vf up
> # 3. set vlan and ip for VF
> ip link set ${nic_test} vf 0 vlan ${ipaddr_vlan}
> ip addr add 172.30.${ipaddr_vlan}.1/24 dev ${vf}
> ip addr add 2021:db8:${ipaddr_vlan}::1/64 dev ${vf}
> # 4. check the link state for VF and PF
> ip link show ${nic_test}
> if ! ip link show $vf | grep 'state UP'; then
> echo 'Error found'
> break
> fi
> ip link set $vf down
> done
>
> This one eventually fails. It is the first time that one works and the
> other fails. So far, it has been all or nothing. I didn't have time yet to
> investigate why this happens.
"eventually fails". Does this mean it passes the first few iterations
but then it times out? In that case it might be something else
I managed to find a "Intel Corporation I350 Gigabit Network Connection
(rev 01)" and I end up in a warning if I start the script (without the
while true):
|8021q: adding VLAN 0 to HW filter on device eno0v0
|igbvf 0000:08:10.0: Vlan id 0 is not added
|igb 0000:07:00.0: Setting VLAN 3, QOS 0x0 on VF 0
|igb 0000:07:00.0: VF 0 attempted to set invalid MAC filter
|------------[ cut here ]------------
|WARNING: CPU: 25 PID: 3013 at drivers/net/ethernet/intel/igbvf/netdev.c:1777 igbvf_close+0x111/0x120
…
|CPU: 25 UID: 0 PID: 3013 Comm: ip Not tainted 6.14.0-rc1-rt1+ #186 PREEMPT_RT+LAZY 39474a76e7562bb76173f4b98cf194301d39bf7f
|igbvf 0000:08:10.0: Link is Up 1000 Mbps Full Duplex
|---[ end trace 0000000000000000 ]---
|igb 0000:07:00.0: VF 0 attempted to set invalid MAC filter
|igb 0000:07:00.0: VF 0 attempted to set invalid MAC filter
|igb 0000:07:00.0: VF 0 attempted to set invalid MAC filter
|8021q: adding VLAN 0 to HW filter on device eno0v0
|igb 0000:07:00.0: VF 0 attempted to override administratively set VLAN tag
|Reload the VF driver to resume operations
|igbvf 0000:08:10.0: Link is Up 1000 Mbps Full Duplex
|igb 0000:07:00.0: VF 0 attempted to set invalid MAC filter
and the state is down ('Error found' is printed). But if I do it
manually, line by line, then it all passes without the warning and the
state of the VF device is up.
Sebastian
Powered by blists - more mailing lists