Message-ID: <Ys7FSCAYOQH+YLbS@lorien.usersys.redhat.com>
Date: Wed, 13 Jul 2022 09:14:48 -0400
From: Phil Auld <pauld@...hat.com>
To: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
Cc: linux-kernel@...r.kernel.org,
"Rafael J . Wysocki" <rafael@...nel.org>,
Tian Tao <tiantao6@...ilicon.com>
Subject: Re: [PATCH] drivers/base/node.c: fix userspace break from using
bin_attributes for cpumap and cpulist

On Wed, Jul 13, 2022 at 03:05:52PM +0200 Greg Kroah-Hartman wrote:
> On Wed, Jul 13, 2022 at 07:47:58AM -0400, Phil Auld wrote:
> > Hi Greg,
> >
> > On Wed, Jul 13, 2022 at 08:06:02AM +0200 Greg Kroah-Hartman wrote:
> > > On Tue, Jul 12, 2022 at 05:43:01PM -0400, Phil Auld wrote:
> > > > Using bin_attributes with a 0 size causes fstat and friends to return that 0 size.
> > > > This breaks userspace code that retrieves the size before reading the file. Rather
> > > > than reverting 75bd50fa841 ("drivers/base/node.c: use bin_attribute to break the size
> > > > limitation of cpumap ABI") let's put in a size value at compile time. Use direct
> > > > comparison and a worst-case maximum to ensure compile time constants. For cpulist the
> > > > max is on the order of NR_CPUS * (ceil(log10(NR_CPUS)) + 1) which for 8192 is 40960.
> > > > In order to get near that you'd need a system with every other CPU on one node or
> > > > something similar. e.g. (0,2,4,... 1024,1026...). We set it to a min of PAGE_SIZE
> > > > to retain the older behavior. For cpumap, PAGE_SIZE is plenty big.
> > >
> > > Does userspace care about that size, or can we just put any value in
> > > there and it will be ok? How about just returning to the original
> > > PAGE_SIZE value to keep things looking identical, will userspace not
> > > read more than that size from the file then?
> > >
> >
> > I'll go look. But I think the point of pre-reading the size with fstat is to allocate
> > a buffer to read into. So that may be a problem.
> >
> > That said, I believe in this case it's the cpulist file which given the use of ranges
> > is very unlikely to actually get that big.
>
> That is why we had to change this to a binary file. Think about
> every-other CPU being there, that's a huge list. This already was
> broken on some systems which is why it had to be changed (i.e. we didn't
> change it for no reason at all.)
>

I didn't think you did and the change made sense. I did not expect this to
cause problems either when I backported it... :)

> > > > On an 80 cpu 4-node system (NR_CPUS == 8192)
> > >
> > > We have systems running Linux with many more cpus than that, and your
> > > company knows this :)
> >
> > The 80 cpus here don't matter and we only build with NR_CPUS = 8192 :)
> >
> > But yes, I realize now that the cpumap part I posted is broken for larger
> > NR_CPUS. I originally had it as NR_CPUS, but as I said in my reply to Barry,
> > it wants to be ~= NR_CPUS/4 + NR_CPUS/32. I'll change that.
> >
> > I think we should decide on a max for each and use that.
>
> Sure, pick a max size please, that's fine with me.

Right. I had another reply that crossed in the ether.

I can repost with the new version shortly.

It's using cpumap at NR_CPUS/2 and cpulist at NR_CPUS*6.

Cheers,
Phil
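
For reference, a rough sketch of the size arithmetic discussed in this thread, written in Python only so the numbers are easy to check. The function names are made up for illustration and nothing here is kernel code. It assumes the cpulist bound from the patch description, NR_CPUS * (ceil(log10(NR_CPUS)) + 1), the cpumap estimate of roughly NR_CPUS/4 + NR_CPUS/32, and the every-other-CPU layout given as the worst case.

```python
import math

NR_CPUS = 8192  # build-time value quoted in the thread


def cpulist_bound(nr_cpus):
    """Bound from the patch description: NR_CPUS * (ceil(log10(NR_CPUS)) + 1)."""
    return nr_cpus * (math.ceil(math.log10(nr_cpus)) + 1)


def cpumap_estimate(nr_cpus):
    """Estimate from the thread: NR_CPUS/4 hex digits plus NR_CPUS/32 separators."""
    return nr_cpus // 4 + nr_cpus // 32


def every_other_cpulist(nr_cpus):
    """Text of a list that cannot collapse into ranges: "0,2,4,..." plus newline."""
    return ",".join(str(cpu) for cpu in range(0, nr_cpus, 2)) + "\n"


if __name__ == "__main__":
    worst = every_other_cpulist(NR_CPUS)
    print("every-other-CPU cpulist length:", len(worst))
    print("cpulist bound from the patch text:", cpulist_bound(NR_CPUS))
    print("proposed cpulist size (NR_CPUS*6):", NR_CPUS * 6)
    print("cpumap estimate:", cpumap_estimate(NR_CPUS))
    print("proposed cpumap size (NR_CPUS/2):", NR_CPUS // 2)
```

With NR_CPUS = 8192 the bound from the patch text comes to 40960 and the cpumap estimate to 2304, so the proposed NR_CPUS*6 and NR_CPUS/2 both leave headroom under these assumptions.
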
>
> greg k-h
>
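
The userspace breakage described in the patch text can be illustrated with a minimal sketch, assuming a caller that sizes its read from fstat() as suggested in the thread. The helper names and the `size_hint` parameter are hypothetical. A temporary file stands in for the sysfs attribute, and `size_hint=0` mimics a file that reports a size of 0.

```python
import os
import tempfile


def read_with_size_hint(path, size_hint=None):
    """Size-first pattern: take the size from fstat(), then issue one read()."""
    fd = os.open(path, os.O_RDONLY)
    try:
        # size_hint is a stand-in for what fstat() would report on sysfs.
        size = os.fstat(fd).st_size if size_hint is None else size_hint
        return os.read(fd, size)  # a reported size of 0 yields b""
    finally:
        os.close(fd)


def read_until_eof(path, chunk=4096):
    """Size-agnostic alternative: keep reading until read() returns nothing."""
    fd = os.open(path, os.O_RDONLY)
    try:
        parts = []
        while True:
            data = os.read(fd, chunk)
            if not data:
                break
            parts.append(data)
        return b"".join(parts)
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"0-79\n")  # sample cpulist-style content
    try:
        print(read_with_size_hint(tmp.name))     # uses the real file size
        print(read_with_size_hint(tmp.name, 0))  # mimics a reported size of 0
        print(read_until_eof(tmp.name))          # does not depend on the size
    finally:
        os.unlink(tmp.name)
```

A caller written like `read_with_size_hint` gets no data once the reported size is 0, which is why a nonzero compile-time size is being proposed here instead of a revert.
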
--