lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <4bf6870f7f3942398e4d1fdaa42184c7@hisilicon.com>
Date:   Thu, 29 Apr 2021 22:32:30 +0000
From:   "Song Bao Hua (Barry Song)" <song.bao.hua@...ilicon.com>
To:     Dave Hansen <dave.hansen@...el.com>,
        "tiantao (H)" <tiantao6@...ilicon.com>,
        "corbet@....net" <corbet@....net>,
        "gregkh@...uxfoundation.org" <gregkh@...uxfoundation.org>
CC:     "linux-doc@...r.kernel.org" <linux-doc@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "Rafael J. Wysocki" <rafael@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        "Valentin Schneider" <valentin.schneider@....com>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Daniel Bristot de Oliveira <bristot@...hat.com>
Subject: RE: [PATCH 1/2] CPU, NUMA topology ABIs: clarify the overflow issue
 of sysfs pagebuf



> -----Original Message-----
> From: Dave Hansen [mailto:dave.hansen@...el.com]
> Sent: Friday, April 30, 2021 9:39 AM
> To: Song Bao Hua (Barry Song) <song.bao.hua@...ilicon.com>; tiantao (H)
> <tiantao6@...ilicon.com>; corbet@....net; gregkh@...uxfoundation.org
> Cc: linux-doc@...r.kernel.org; linux-kernel@...r.kernel.org; Rafael J.
> Wysocki <rafael@...nel.org>; Peter Zijlstra <peterz@...radead.org>; Valentin
> Schneider <valentin.schneider@....com>; Dave Hansen
> <dave.hansen@...ux.intel.com>; Daniel Bristot de Oliveira <bristot@...hat.com>
> Subject: Re: [PATCH 1/2] CPU, NUMA topology ABIs: clarify the overflow issue
> of sysfs pagebuf
> 
> On 4/29/21 2:08 PM, Song Bao Hua (Barry Song) wrote:
> >> Do we think >PAGE_SIZE data out of a sysfs file is a worse ABI break or
> >> something?
> > This kind of cpu list ABIs have been there for many years but have
> > never been documented well.
> >
> > We have two ABIs:
> > xxx_cpus - in format like 3333333333
> > xxx_cpus_list - in format like 0,3,5,7,9,11,13....
> >
> > xxx_cpus_list is another human-readable version of xxx_cpus. It doesn't
> > include any more useful information than xxx_cpus.
> >
> > xxx_cpus won't overflow based on BUILD_BUG_ON and maximum NR_CPUS
> > in kconfig nowadays.
> >
> > if people all agree the trimmed list is a break of ABI, I think we may
> > totally remove this list. For these days, this list probably has never
> > overflowed but literally this could happen.
> >
> > thoughts?
> 
> From what Greg said, it sounds like removing the BUILD_BUG_ON(), making
> it a binary sysfs file, and making it support arbitrary lengths is the
> way to go.

I am actually more concerned on xxx_cpus_list than xxx_cpus.

xxx_cpus has never overflowed. Though there is a BUILD_BUG_ON(), but the
maximum NR_CPUS is 8096, it is still far from overflow.
8096 /32 * 9 = 2277
as 2277 < 4096

While NR_CPUS gets to ~14500, for a 4KB page, xxx_cpus will overflow.
But I don't know when hardware will reach there. If we reach there,
the existing code to describe topology and schedule tasks might also
need rework.

On the other hand, lscpu, hwloc, numactl etc are using the existing
hex bitmap ABI:

$ strace lscpu  2>&1 | grep topology
faccessat(AT_FDCWD,
"/sys/devices/system/cpu/cpu0/topology/thread_siblings", F_OK) = 0
openat(AT_FDCWD,
"/sys/devices/system/cpu/cpu0/topology/thread_siblings",
O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD,
"/sys/devices/system/cpu/cpu0/topology/core_siblings",
O_RDONLY|O_CLOEXEC) = 3
faccessat(AT_FDCWD,
"/sys/devices/system/cpu/cpu0/topology/book_siblings", F_OK) = -1
ENOENT (No such file or directory)
faccessat(AT_FDCWD,
"/sys/devices/system/cpu/cpu1/topology/thread_siblings", F_OK) = 0
openat(AT_FDCWD,
...

$ strace numactl --hardware  2>&1 | grep cpu
openat(AT_FDCWD, "/sys/devices/system/cpu",
O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/sys/devices/system/node/node0/cpumap", O_RDONLY) = 3
openat(AT_FDCWD, "/sys/devices/system/node/node1/cpumap", O_RDONLY) = 3
openat(AT_FDCWD, "/sys/devices/system/node/node2/cpumap", O_RDONLY) = 3
openat(AT_FDCWD, "/sys/devices/system/node/node3/cpumap", O_RDONLY) = 3

If we move to binary, it means we have to change those applications.

For this moment, I'd argue we keep cpu bitmap ABI as is and defer this
issue to when we really get so many cores.

Thanks
Barry

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ