[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <95752e6d-82da-4cd3-b162-4fb88d7ffd13@gmail.com>
Date: Mon, 29 Jan 2024 08:22:47 -0800
From: Florian Fainelli <f.fainelli@...il.com>
To: Vladimir Oltean <olteanv@...il.com>,
Luiz Angelo Daros de Luca <luizluca@...il.com>
Cc: netdev@...r.kernel.org, linus.walleij@...aro.org, alsi@...g-olufsen.dk,
andrew@...n.ch, davem@...emloft.net, edumazet@...gle.com, kuba@...nel.org,
pabeni@...hat.com, arinc.unal@...nc9.com, ansuelsmth@...il.com
Subject: Re: [PATCH net-next v4 08/11] net: dsa: realtek: clean user_mii_bus
setup
On 1/29/2024 8:15 AM, Vladimir Oltean wrote:
> On Sun, Jan 28, 2024 at 11:12:25PM -0300, Luiz Angelo Daros de Luca wrote:
>> It looks like it is now, although "remove" mostly leaves the job for devm.
>>
>> I'm still not sure if we have the correct shutdown/remove code. From
>> what I could understand, driver shutdown is called during system
>> shutdown while remove is called when the driver is removed.
>
> Yeah, poweroff or reboot or kexec.
>
>> However, it looks like that both might be called in sequence. Would it
>> be shutdown,remove? (it's probably that because there is the
>> dev_set_drvdata(priv->dev, NULL) in shutdown).
>
> Yeah, while shutting down, the (SPI, I2C, MDIO, ...) bus driver might
> call spi_unregister_controller() on shutdown(), and this will also call
> remove() on its child devices. Even the Raspberry Pi SPI controller does
> this, AFAIR. The idea of implementing .shutdown() as .remove() is to
> gain more code coverage by sharing code, which should reduce chances of
> bugs in less-tested code (remove). Or at least that's how the saying goes...
>
>> However, if shutdown should prepare the system for another OS, I
>> believe it should be asserting the hw reset as well or remove should
>> stop doing it. Are the dsa_switch_shutdown and dsa_switch_unregister
>> enough to prevent leaking traffic after the driver is gone? It does
>> disable all ports. Or should we have a fallback "isolate all ports"
>> when a hw reset is missing? I guess the u-boot driver does something
>> like that.
>>
>> I don't think it is mandatory for this series but if we got something
>> wrong, it would be nice to fix it.
>
> I don't really know anything at all about kexec. You might want to get
> input from someone who uses it. All that I know is that this should do
> something meaningful (not crash, and still work in the second kernel):
>
> root@...ian:~# kexec -l /boot/Image.gz --reuse-cmdline && kexec -e
> [ 46.335430] mscc_felix 0000:00:00.5 swp3: Link is Down
> [ 46.345747] fsl_enetc 0000:00:00.2 eno2: Link is Down
> [ 46.419201] kvm: exiting hardware virtualization
> [ 46.424036] kexec_core: Starting new kernel
> [ 46.471657] psci: CPU1 killed (polled 0 ms)
> [ 46.486060] Bye!
> [ 0.000000] Booting Linux on physical CPU 0x0000000000 [0x410fd083]
> [ 0.000000] Linux version 5.16.0-rc2-07010-ga9b9500ffaac-dirty (tigrisor@...uf) (aarch64-none-linux-gnu-gcc (GNU Toolchain for the A-profile Architecture 10.2-2020.11 (arm-10.16)) 10.2.1 20201103, GNU ld (GNU Toolchain for the A-profile Architecture 10.2-2020.11 (arm-10.16)) 2.35.1.20201028) #1519 SMP PREEMPT Wed Dec 1 08:59:13 EET 2021
> [ 0.000000] Machine model: LS1028A RDB Board
> [ 0.000000] earlycon: uart8250 at MMIO 0x00000000021c0500 (options '')
> [ 0.000000] printk: bootconsole [uart8250] enabled
> [ 0.000000] efi: UEFI not found.
> [ 0.000000] NUMA: No NUMA configuration found
> [ 0.000000] NUMA: Faking a node at [mem 0x0000000080000000-0x00000020ffffffff]
> [ 0.000000] NUMA: NODE_DATA [mem 0x20ff6fab80-0x20ff6fcfff]
> (...)
>
> which in this case it does.
>
> From other discussions I've had, there seems to be interest in quite the
> opposite thing, in fact. Reboot the SoC running Linux, but do not
> disturb traffic flowing through the switch, and somehow pick up the
> state from where the previous kernel left it.
Yes this is actually an use case that is very dear to the users of DSA
in an airplane. The entertainment system in the seat in front of you
typically has a left, CPU/display and right set of switch ports. Across
the 300+ units in the plane each entertainment systems runs STP to avoid
loops being created when one of the display units goes bad. Occasionally
cabin crew members will have to swap those units out since they tend to
wear out. When they do, the switch operates in a headless mode and it
would be unfortunate that plugging in a display unit into the network
again would be disrupting existing traffic. I have seen out of tree
patches doing that, but there was not a good way to make them upstream
quality.
>
> Now, obviously that doesn't currently work, but it does raise the
> question about the usefulness of resetting the switch on shutdown.
The users would really care would likely introduce sufficient amounts of
control knobs (module parameters, ethtool private flags, devlink?) to
control that behavior. It does seem however universally acceptable to
stop any DMAs and packets from flowing as a default and safe
implementation to the upstream kernel.
--
Florian
Powered by blists - more mailing lists