lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <9598b359-ab96-7d61-687a-917bee7a5cd9@huawei.com>
Date:   Tue, 10 Sep 2019 14:43:32 +0800
From:   Yunsheng Lin <linyunsheng@...wei.com>
To:     Greg KH <gregkh@...uxfoundation.org>
CC:     <rafael@...nel.org>, <linux-kernel@...r.kernel.org>,
        <peterz@...radead.org>, <mingo@...nel.org>, <mhocko@...nel.org>,
        <linuxarm@...wei.com>
Subject: Re: [PATCH] driver core: ensure a device has valid node id in
 device_add()

On 2019/9/9 17:53, Greg KH wrote:
> On Mon, Sep 09, 2019 at 02:04:23PM +0800, Yunsheng Lin wrote:
>> Currently a device does not belong to any of the numa nodes
>> (dev->numa_node is NUMA_NO_NODE) when the node id is neither
>> specified by fw nor by virtual device layer and the device has
>> no parent device.
> 
> Is this really a problem?

Not really.
Someone need to guess the node id when it is not specified, right?
This patch chooses to guess the node id in the driver core.

> 
>> According to discussion in [1]:
>> Even if a device's numa node is not specified, the device really
>> does belong to a node.
> 
> But as we do not know the node, can we cause more harm by randomly
> picking one (i.e. putting it all in node 0)?
If we do not pick node 0 for device with invalid node, then caller need
to check the node id and pick one, and currently different callers
does a different checking:

1) some does " < 0" check;
2) some does "== NUMA_NO_NODE" check;
3) some does ">= MAX_NUMNODES" check;
4) some does "< 0 || >= MAX_NUMNODES || !node_online(node)" check.

and caller of dev_to_node() may pick one node based on below if the
dev_to_node() return a invalid node based on above checking:
1) based on numa_mem_id().
2) pick a random one like in workqueue_select_cpu_near().

If we pick node 0 for device with invalid node in device_add(), we
may avoid the above different checking and picking for caller, but we
may lose some caller context info, for example, user may use node of the
cpu on which the process is using the device to allocate the resource
close to the process, or user may pick a random one if they know what
they are doing.

It seems there is trade off here, as I can see, we can guess and pick the
node at different stage when it is not specified.
1. guess and pick node 0 at device_add(), it has the advantage of ensure
   all devices will have a valid node at very begin of device creation,
   so the user does not have to check and guess one, but user might lose
   the opportunity to do their own guessing and picking.

2. Maybe provide a dev_to_valid_node() to always return a valid node id,
   for example return numa_mem_id() if dev->numa_node is not valid.
   User know what they are doing can still use dev_to_node().

3. Caller of dev_to_node() do their own checking and picking, which
   might lead to adding more different and reduplicate checking as above.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ