Message-ID: <20250910091532.27951-1-enjuk@amazon.com>
Date: Wed, 10 Sep 2025 18:15:21 +0900
From: Kohei Enju <enjuk@...zon.com>
To: <kurt@...utronix.de>
CC: <aleksandr.loktionov@...el.com>, <andrew+netdev@...n.ch>,
<anthony.l.nguyen@...el.com>, <davem@...emloft.net>, <edumazet@...gle.com>,
<enjuk@...zon.com>, <intel-wired-lan@...ts.osuosl.org>,
<kohei.enju@...il.com>, <kuba@...nel.org>, <netdev@...r.kernel.org>,
<pabeni@...hat.com>, <przemyslaw.kitszel@...el.com>,
<vitaly.lifshits@...el.com>
Subject: Re: [Intel-wired-lan] [PATCH v1 iwl-net] igc: unregister netdev when igc_led_setup() fails in igc_probe()
On Wed, 10 Sep 2025 10:57:17 +0200, Kurt Kanzenbach wrote:
>On Wed Sep 10 2025, Kohei Enju wrote:
>> + Aleksandr
>>
>> On Wed, 10 Sep 2025 10:28:17 +0300, Lifshits, Vitaly wrote:
>>
>>>On 9/8/2025 9:26 AM, Kurt Kanzenbach wrote:
>>>> On Sat Sep 06 2025, Kohei Enju wrote:
>>>>> Currently igc_probe() doesn't unregister netdev when igc_led_setup()
>>>>> fails, causing BUG_ON() in free_netdev() and then kernel panics. [1]
>>>>>
>>>>> This behavior can be tested using fault-injection framework. I used the
>>>>> failslab feature to test the issue. [2]
>>>>>
>>>>> Call unregister_netdev() when igc_led_setup() fails to avoid the kernel
>>>>> panic.
>>>>>
>>>>> [1]
>>>>> kernel BUG at net/core/dev.c:12047!
>>>>> Oops: invalid opcode: 0000 [#1] SMP NOPTI
>>>>> CPU: 0 UID: 0 PID: 937 Comm: repro-igc-led-e Not tainted 6.17.0-rc4-enjuk-tnguy-00865-gc4940196ab02 #64 PREEMPT(voluntary)
>>>>> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
>>>>> RIP: 0010:free_netdev+0x278/0x2b0
>>>>> [...]
>>>>> Call Trace:
>>>>> <TASK>
>>>>> igc_probe+0x370/0x910
>>>>> local_pci_probe+0x3a/0x80
>>>>> pci_device_probe+0xd1/0x200
>>>>> [...]
>>>>>
>>>>> [2]
>>>>> #!/bin/bash -ex
>>>>>
>>>>> FAILSLAB_PATH=/sys/kernel/debug/failslab/
>>>>> DEVICE=0000:00:05.0
>>>>> START_ADDR=$(grep " igc_led_setup" /proc/kallsyms \
>>>>> | awk '{printf("0x%s", $1)}')
>>>>> END_ADDR=$(printf "0x%x" $((START_ADDR + 0x100)))
>>>>>
>>>>> echo $START_ADDR > $FAILSLAB_PATH/require-start
>>>>> echo $END_ADDR > $FAILSLAB_PATH/require-end
>>>>> echo 1 > $FAILSLAB_PATH/times
>>>>> echo 100 > $FAILSLAB_PATH/probability
>>>>> echo N > $FAILSLAB_PATH/ignore-gfp-wait
>>>>>
>>>>> echo $DEVICE > /sys/bus/pci/drivers/igc/bind
>>>>>
>>>>> Fixes: ea578703b03d ("igc: Add support for LEDs on i225/i226")
>>>>> Signed-off-by: Kohei Enju <enjuk@...zon.com>
>>>>
>>>> Reviewed-by: Kurt Kanzenbach <kurt@...utronix.de>
>>>
>>>Thank you for the patch and for identifying this issue!
>>>
>>>I was wondering whether we could avoid failing the probe in cases where
>>>igc_led_setup fails. It seems to me that a failure in the LED class
>>>functionality shouldn't prevent the device's core functionality from
>>>working properly.
>>
>> Indeed, that also makes sense.
>>
>> The behavior that igc_probe() succeeds even if igc_led_setup() fails
>> also seems good to me, as long as notifying users that igc's led
>> functionality is not available.
>
>SGTM. The LED code is nice to have, but not mandatory at all. The device
>has sane LED defaults.
Thank you for the clarification.
I'll do that in v2.
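
For reference, a minimal userspace sketch of that approach. This is not
the actual v2 diff: struct fake_adapter, fake_led_setup() and fake_probe()
are hypothetical stand-ins for the igc structures and functions, and the
inject_failure flag only mimics what failslab does in the reproducer. The
point is that an LED setup error is reported to the user but not
propagated, so probe still succeeds and the netdev stays registered.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for driver private data; not the real igc struct. */
struct fake_adapter {
	bool netdev_registered;
	bool leds_available;
};

/* Stand-in for igc_led_setup(); inject_failure mimics a failslab hit. */
static int fake_led_setup(struct fake_adapter *adapter, bool inject_failure)
{
	if (inject_failure)
		return -ENOMEM;

	adapter->leds_available = true;
	return 0;
}

/* Probe flow in which an LED setup failure is logged but not fatal. */
static int fake_probe(struct fake_adapter *adapter, bool inject_led_failure)
{
	int err;

	assert(adapter != NULL);

	/* Stands in for a successful register_netdev(). */
	adapter->netdev_registered = true;
	adapter->leds_available = false;

	err = fake_led_setup(adapter, inject_led_failure);
	if (err)
		fprintf(stderr,
			"LED setup failed (%d), continuing without LED support\n",
			err);

	/* The LED error is deliberately not returned to the caller. */
	return 0;
}
```

Calling fake_probe() with inject_led_failure set to true returns 0 and
leaves netdev_registered true and leds_available false, which is the
"probe succeeds, LEDs unavailable" outcome discussed above.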
>
>>
>>>
>>>From what I understand, errors in this function are not due to hardware
>>>malfunctions. Therefore, I suggest we remove the error propagation.
>>>
>>>Alternatively, if feasible, we could consider reordering the function
>>>calls so that the LED class setup occurs before the netdev registration.
>>>
>>
>> I don't disagree with you, but I would like to hear Kurt and Aleksandr's
>> opinion. Do you have any preference or suggestions?
>
>See above.
Got it :)
>
>Thanks,
>Kurt