Message-ID: <103033d0-f6e2-49ee-a8e2-ba23c6e9a6a1@nvidia.com>
Date: Tue, 7 May 2024 11:55:44 -0700
From: William Tu <witu@...dia.com>
To: Paolo Abeni <pabeni@...hat.com>, netdev@...r.kernel.org
Cc: jiri@...dia.com, bodong@...dia.com, kuba@...nel.org
Subject: Re: [PATCH RFC net-next] net: cache the __dev_alloc_name()



On 5/7/24 12:26 AM, Paolo Abeni wrote:
>
> On Mon, 2024-05-06 at 20:32 +0000, William Tu wrote:
>> When a system has around 1000 netdevs, adding the 1001st device becomes
>> very slow. The devlink command to create an SF
>>    $ devlink port add pci/0000:03:00.0 flavour pcisf \
>>      pfnum 0 sfnum 1001
>> takes around 5 seconds, and Linux perf and flamegraph show 19% of time
>> spent on __dev_alloc_name() [1].
>>
>> The reason is that devlink first requests the next available "eth%d".
>> __dev_alloc_name() then scans all existing netdevs to match on "ethN",
>> sets bit N in an 'inuse' bitmap, and finds/returns the next available
>> number, in our case eth0.
>>
>> Later, based on a udev rule, we rename it from eth0 to
>> "en3f0pf0sf1001", with the altname below
>>    14: en3f0pf0sf1001: <BROADCAST,MULTICAST,UP,LOWER_UP> ...
>>        altname enp3s0f0npf0sf1001
>>
>> So eth0 is never actually used, but since we have 1k "en3f0pf0sfN"
>> devices + 1k altnames, __dev_alloc_name() spends a lot of time going
>> through all existing netdevs trying to build the 'inuse' bitmap for
>> the pattern 'eth%d'. The bitmap barely has any bits set, and it is
>> rebuilt from scratch every time.
>>
>> I want to see if it makes sense to save/cache the result, or if there
>> is any way to avoid the 'eth%d' pattern search. The RFC patch adds a
>> name_pat (name pattern) hlist and saves the 'inuse' bitmap. It stores
>> patterns, e.g. "eth%d", "veth%d", together with their bitmaps, and
>> looks them up before scanning all existing netdevs.
> An alternative heuristic that should be cheap and possibly reasonable
> would be to optimistically check <name>0..<name><very small int> for
> availability, possibly restricting such attempts to scenarios where the
> total number of hashed netdevice names is somewhat high.
>
> WDYT?
>
> Cheers,
>
> Paolo
Hi Paolo,

Thanks for your suggestion!
I'm not sure I follow that idea.

The current code has to do a full scan of all netdevs in a list, and the
name list is not sorted/ordered. So to learn that, e.g., eth0..eth10 are
taken, we still need to do a full scan: find every netdev with prefix
"eth", and get the next available bit, 11 (10+1).
And in another use case, where users don't install a udev rule to
rename, the system can actually create eth998, eth999, eth1000, ...
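To make the cost concrete, here is a hypothetical user-space sketch of the scan-and-bitmap approach described above: walk every existing name, mark the suffix of each one matching the prefix in an 'inuse' bitmap, then return the first clear bit. The function name, the MAX_IDS limit, and the flat array in place of the kernel's data structures are all illustrative, not the actual __dev_alloc_name() code.

```c
#include <stdlib.h>
#include <string.h>

#define MAX_IDS 1024

/* Sketch: find the lowest free suffix for @prefix by scanning all
 * existing @names, the way __dev_alloc_name() scans all netdevs. */
static int alloc_name_index(const char *prefix,
                            const char **names, size_t n_names)
{
    unsigned char inuse[MAX_IDS] = { 0 };
    size_t plen = strlen(prefix);

    for (size_t i = 0; i < n_names; i++) {
        const char *name = names[i];
        char *end;
        long id;

        if (strncmp(name, prefix, plen) != 0 || name[plen] == '\0')
            continue;                  /* prefix must match, digits must follow */
        id = strtol(name + plen, &end, 10);
        if (*end == '\0' && id >= 0 && id < MAX_IDS)
            inuse[id] = 1;             /* this suffix is taken */
    }
    for (int id = 0; id < MAX_IDS; id++)
        if (!inuse[id])
            return id;                 /* first free suffix */
    return -1;
}
```

The point of the sketch is that the whole O(n) scan runs on every allocation, even when, as in the SF case, almost no name matches the pattern.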

What if we create a prefix map (maybe using an xarray)?

idx    entry = (prefix, bitmap)
---------------------------------
0      eth,      1111000000...
1      veth,     1000000...
2      can,      11100000...
3      firewire, 00000...

But then we need to unset the bit when a device is removed.
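A minimal sketch of that prefix-to-bitmap cache, using a plain array of entries instead of an xarray and byte flags instead of a real bitmap (all names and sizes here are assumptions for illustration): allocation sets the lowest free suffix bit, and removal clears it so the cache stays consistent.

```c
#include <string.h>

#define MAX_PREFIXES 16
#define MAX_IDS      1024

struct prefix_entry {
    char prefix[32];
    unsigned char inuse[MAX_IDS];   /* stand-in for a real bitmap */
};

static struct prefix_entry cache[MAX_PREFIXES];
static int n_entries;

/* Look up an entry for @prefix, creating one if needed. */
static struct prefix_entry *get_entry(const char *prefix)
{
    struct prefix_entry *e;

    for (int i = 0; i < n_entries; i++)
        if (strcmp(cache[i].prefix, prefix) == 0)
            return &cache[i];
    if (n_entries == MAX_PREFIXES)
        return NULL;
    e = &cache[n_entries++];
    strncpy(e->prefix, prefix, sizeof(e->prefix) - 1);
    return e;
}

/* Reserve and return the lowest free suffix for @prefix, or -1. */
static int prefix_alloc(const char *prefix)
{
    struct prefix_entry *e = get_entry(prefix);

    if (!e)
        return -1;
    for (int id = 0; id < MAX_IDS; id++)
        if (!e->inuse[id]) {
            e->inuse[id] = 1;
            return id;
        }
    return -1;
}

/* Clear the suffix bit when the device named <prefix><id> goes away. */
static void prefix_free(const char *prefix, int id)
{
    struct prefix_entry *e = get_entry(prefix);

    if (e && id >= 0 && id < MAX_IDS)
        e->inuse[id] = 0;
}
```

With the cache, allocation becomes a find-first-zero over one per-prefix bitmap instead of a scan over every netdev; the cost moves to keeping the bitmap in sync on rename and removal.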
William


