[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <8658a9a4-d900-4e87-86a0-78478fa08271@intel.com>
Date: Wed, 22 Nov 2023 13:28:19 -0800
From: "Nambiar, Amritha" <amritha.nambiar@...el.com>
To: Jakub Kicinski <kuba@...nel.org>, Willem de Bruijn <willemb@...gle.com>
CC: <netdev@...r.kernel.org>, <pabeni@...hat.com>,
<sridhar.samudrala@...el.com>
Subject: Re: [net-next PATCH v8 02/10] net: Add queue and napi association
On 11/21/2023 5:15 PM, Jakub Kicinski wrote:
> On Tue, 21 Nov 2023 16:08:07 -0800 Nambiar, Amritha wrote:
>>> To reiterate - the thing I find odd about the current situation is that
>>> we hide the queues if they get disabled by lowering ethtool -L, but we
>>> don't hide them when the entire interface is down. When the entire
>>> interface is down there should be no queues, right?
>>
>> "When the entire interface is down there should be no queues" -
>> currently, 'ethtool --show-channels' reports all the available queues
>> when interface is DOWN
>
> That's not the same. ethtool -l shows the configuration not
> the instantiated objects. ethtool -a will also show you the
> pause settings even when cable is not plugged in.
> sysfs objects of the queues are still exposed for devices which
> are down, that's true. But again, that's to expose the config.
>
>>> Differently put - what logic that'd make sense to the user do we apply
>>> when trying to decide if the queue is visible? < real_num_queues is
>>> an implementation detail.
>>>
>>> We can list all the queues, always, too. No preference. I just want to
>>> make sure that the rules are clear and not very dependent on current
>>> implementation and not different driver to driver.
>>
>> I think currently, the queue dump results when the device is down aligns
>> for both APIs (netdev-genl queue-get and ethtool show-channels) for all
>> the drivers. If we decide to NOT show queues/NAPIs (with netdev-genl)
>> when the device is down, the user would see conflicting results, the
>> dump results with netdev-genl APIs would be different from what 'ethtool
>> --show-channels' and 'ps -aef | grep napi' reports.
>
> We should make the distinction between configuration and state of
> instantiated objects clear before we get too far. Say we support
> setting ring length for a specific queue. Global setting is 512,
> queue X wants 256. How do we remove the override for queue X?
> By setting it to 512? What if we want 512, and the default shifts
> to something else? We'll need an explicit "reset" command.
>
> I think it may be cleaner to keep queue-get as state of queues,
> and configuration / settings / rules completely separate.
>
> Am I wrong? Willem?
>
So, the instantiated netdev objects and their states are more aligned to
the netdev-level than the actual configuration in the hardware.
WRT to showing queues when the device is down, I see your point, the
hardware has those queues available, but when the interface is down,
those queues are not valid at the netdev-level and need not be reported
to the user. So, it is worth showing no results.
Also, my concern about showing the queues when the device is down, to do
the user-configuration and then bring up the device, does not hold
strong, as configuring when the device is up would also need a reset to
make the updates in the hardware.
Trying to understand this distinction bit more:
So, netdev-genl queue-get shows state of queue objects as reported from
the netdev level. Now, unless there's the queue-set command changing the
configuration, the state of queue-objects would match the hardware
configurations.
When the user changes the configuration with a queue-set command:
- queue-get would report the new updates (as obtained from the netdev).
- The updates would not be reflected in the hardware till a reset is
issued. At this point, ethtool or others would report the older
configuration (before reset).
- After reset, the state of queue objects from queue-get would match the
actual hardware configuration.
I agree, an explicit "reset" user-command would be great. This way all
the set operations for the netdev objects (queue, NAPI, page pool etc.)
would stay at the netdev level without needing ndo_op for each, and then
the "reset" command can trigger the ndo callback and actuate the
hardware changes.
Powered by blists - more mailing lists