Message-ID: <alpine.DEB.2.11.1512312251430.28591@nanos>
Date: Thu, 31 Dec 2015 23:30:57 +0100 (CET)
From: Thomas Gleixner <tglx@...utronix.de>
To: Marcelo Tosatti <mtosatti@...hat.com>
cc: "Yu, Fenghua" <fenghua.yu@...el.com>,
LKML <linux-kernel@...r.kernel.org>,
Peter Zijlstra <peterz@...radead.org>,
"x86@...nel.org" <x86@...nel.org>,
Luiz Capitulino <lcapitulino@...hat.com>,
"Shivappa, Vikas" <vikas.shivappa@...el.com>,
Tejun Heo <tj@...nel.org>,
"Shankar, Ravi V" <ravi.v.shankar@...el.com>,
"Luck, Tony" <tony.luck@...el.com>
Subject: Re: [RFD] CAT user space interface revisited
Marcelo,
On Thu, 31 Dec 2015, Marcelo Tosatti wrote:
First of all, thanks for the explanation.
> There is one directory structure in this topic, CAT. That is the
> directory structure which is exposed to userspace to control the
> CAT HW.
>
> With the current patchset posted by Intel ("Subject: [PATCH V16 00/11]
> x86: Intel Cache Allocation Technology Support"), the directory
> structure there (the files and directories exposed by that patchset)
> (*1) does not allow one to configure different CBM masks on each socket
> (that is, it forces the user to configure the same CBM mask on every
> socket). This is a blocker for us, and it is one of the points in your
> proposal.
>
> There was a call between Red Hat and Intel where it was communicated
> to Intel, and Intel agreed, that it was necessary to fix this (fix this
> == allow different CBM masks on different sockets).
>
> Now, that is one change to the current directory structure (*1).
I have no idea what that would look like. The current structure is a
cgroups-based, hierarchy-oriented approach, which does not allow simple
things like
T1 00001111
T2 00111100
at least not in a way which is natural to the problem at hand.
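To spell that out with plain arithmetic (my own illustration, nothing from
any patchset): if you treat a CBM as a set of cache ways, neither of the two
masks above contains the other, so there is no meaningful parent/child
nesting between T1 and T2; the only way to hang both off a tree is an
artificial common ancestor carrying the union of both masks.

#include <stdio.h>
#include <stdbool.h>

static bool subset(unsigned int a, unsigned int b)
{
	return (a & ~b) == 0;	/* every way of a also present in b? */
}

int main(void)
{
	unsigned int t1 = 0x0f;	/* 00001111 */
	unsigned int t2 = 0x3c;	/* 00111100 */

	printf("T1 subset of T2: %d\n", subset(t1, t2));	/* prints 0 */
	printf("T2 subset of T1: %d\n", subset(t2, t1));	/* prints 0 */
	printf("union a common ancestor would need: 0x%02x\n", t1 | t2);
	return 0;
}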
> (*1) modified to allow for different CBM masks on different sockets,
> let's say (*2), is what we have been waiting for Intel to post.
> It would handle our use case, and all use cases which the current
> patchset from Intel already handles (Vikas posted emails mentioning
> there are happy users of the current interface; feel free to ask
> him for more details).
I cannot imagine how that modification to the current interface would solve
that, not to mention per-CPU associations, which are not related to tasks at
all.
> What I have asked you, and you replied "to go Google read my previous
> post", is this:
> What are the advantages of your proposal (which is a completely
> different directory structure, requiring a complete rewrite)
> over (*2)?
>
> (My reason behind this: if you, with maintainer veto power, force your
> proposal to be accepted, it will be necessary to wait for another rewrite
> (a new set of problems, fully thinking through your proposal, testing it,
> ...) rather than simply modifying an already known, reviewed, already
> used directory structure.)
>
> And functionally, your proposal adds nothing to (*2) (other than, well,
> being a different directory structure).
Sorry. I cannot see at all how a modification to the existing interface would
cover all the sensible use cases I described in a coherent way. I really want
to see a proper description of the interface before people start hacking on it
in a frenzy. What you described is a "let's say (*2)" modification. That's
pretty meager.
> If Fenghua or you post a patchset with your proposal, say in 2 weeks,
> I am fine with that. But since I doubt that will be the case, I am
> pushing for the interface which requires the least amount of changes
> (and therefore the least amount of time) to be integrated.
>
> From your email:
>
> "It would even be sufficient for particular use cases to just associate
> a piece of cache to a given CPU and do not bother with tasks at all.
>
> We really need to make this as configurable as possible from userspace
> without imposing random restrictions to it. I played around with it on
> my new intel toy and the restriction to 16 COS ids (that's 8 with CDP
> enabled) makes it really useless if we force the ids to have the same
> meaning on all sockets and restrict it to per task partitioning."
>
> Yes, that's the issue we hit, that is the modification that was agreed
> on with Intel, and that's what we are waiting for them to post.
How do you implement the above - especially that part:
"It would even be sufficient for particular use cases to just associate a
piece of cache to a given CPU and do not bother with tasks at all."
as a "simple" modification to (*1) ?
> > I described a directory structure for that qos/cat stuff in my proposal and
> > that's complete AFAICT.
>
> Ok, let's make the job for the submitter easier. You are the maintainer,
> so you decide.
>
> Is it enough for you to have (*2) (which was agreed on with Intel), or
> would you prefer to integrate the directory structure from
> "[RFD] CAT user space interface revisited"?
The only thing I care about as a maintainer is that we merge something which
actually reflects the properties of the hardware and gives the admin the
required flexibility to utilize it fully. I don't care at all whether it's my
proposal or something else which allows the same.
Let me copy the relevant bits from my proposal here once more, and let me ask
questions about the various points so you can tell me how that modification
to (*1) is going to deal with them.
>> At top level:
>>
>> xxxxxxx/cat/max_cosids <- Assume that all CPUs are the same
>> xxxxxxx/cat/max_maskbits <- Assume that all CPUs are the same
>> xxxxxxx/cat/cdp_enable <- Depends on CDP availability
Where is that information in (*2) and how is that related to (*1)? If you
think it's not required, please explain why.
>> Per socket data:
>>
>> xxxxxxx/cat/socket-0/
>> ...
>> xxxxxxx/cat/socket-N/l3_size
>> xxxxxxx/cat/socket-N/hwsharedbits
Where is that information in (*2) and how is that related to (*1)? If you
think it's not required, please explain why.
>> Per socket mask data:
>>
>> xxxxxxx/cat/socket-N/cos-id-0/
>> ...
>> xxxxxxx/cat/socket-N/cos-id-N/inuse
>> /cat_mask
>> /cdp_mask <- Data mask if CDP enabled
Where is that information in (*2) and how is that related to (*1)? If you
think it's not required, please explain why.
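To make that concrete, here is a rough userspace sketch. It is purely
illustrative: the xxxxxxx prefix is the placeholder from the layout above,
none of these files exist anywhere, the set() helper and the plain bit-string
mask format are made up. It shows the one thing (*1) cannot express today,
the same cos-id carrying a different CBM on each socket:

#include <stdio.h>

static void set(const char *path, const char *val)
{
	FILE *f = fopen(path, "w");	/* trivial "echo val > path" helper */

	if (!f) {
		perror(path);
		return;
	}
	fputs(val, f);
	fclose(f);
}

int main(void)
{
	/* Claim cos-id 1 and give it a different mask per socket */
	set("xxxxxxx/cat/socket-0/cos-id-1/inuse", "1");
	set("xxxxxxx/cat/socket-0/cos-id-1/cat_mask", "00001111");
	set("xxxxxxx/cat/socket-1/cos-id-1/inuse", "1");
	set("xxxxxxx/cat/socket-1/cos-id-1/cat_mask", "00111100");
	return 0;
}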
>> Per cpu default cos id for the cpus on that socket:
>>
>> xxxxxxx/cat/socket-N/cpu-x/default_cosid
>> ...
>> xxxxxxx/cat/socket-N/cpu-N/default_cosid
>>
>> The above allows a simple cpu based partitioning. All tasks which do
>> not have a cache partition assigned on a particular socket use the
>> default one of the cpu they are running on.
Where is that information in (*2) and how is that related to (*1)? If you
think it's not required, please explain why.
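Again purely illustrative, with the same placeholder paths and the same
made-up helper as in the previous sketch: this is all it would take to hand
a cache slice to a couple of CPUs without touching a single task, i.e. the
use case I keep asking about.

#include <stdio.h>

static void set(const char *path, const char *val)	/* as in the sketch above */
{
	FILE *f = fopen(path, "w");

	if (!f) {
		perror(path);
		return;
	}
	fputs(val, f);
	fclose(f);
}

int main(void)
{
	/* Every task running on CPUs 2 and 3 of socket 0 which has no
	 * explicit partition falls back to cos-id 1 on that socket. */
	set("xxxxxxx/cat/socket-0/cpu-2/default_cosid", "1");
	set("xxxxxxx/cat/socket-0/cpu-3/default_cosid", "1");
	return 0;
}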
>> Now for the task(s) partitioning:
>>
>> xxxxxxx/cat/partitions/
>>
>> Under that directory one can create partitions
>>
>> xxxxxxx/cat/partitions/p1/tasks
>> /socket-0/cosid
>> ...
>> /socket-n/cosid
>>
>> The default value for the per socket cosid is COSID_DEFAULT, which
>> causes the task(s) to use the per cpu default id.
Where is that information in (*2) and how is that related to (*1)? If you
think it's not required, please explain why.
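And the task side, with the same caveats (the partition name, the PID and the
per-socket cos-ids are made up): create a partition, point it at possibly
different cos-ids per socket and attach a task; anything not attached keeps
using the per-cpu defaults from the previous sketch.

#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>

static void set(const char *path, const char *val)	/* as above */
{
	FILE *f = fopen(path, "w");

	if (!f) {
		perror(path);
		return;
	}
	fputs(val, f);
	fclose(f);
}

int main(void)
{
	/* New partition with its own cos-id on each socket */
	mkdir("xxxxxxx/cat/partitions/p1", 0755);
	set("xxxxxxx/cat/partitions/p1/socket-0/cosid", "1");
	set("xxxxxxx/cat/partitions/p1/socket-1/cosid", "3");

	/* Attach a task; 1234 is a made-up PID */
	set("xxxxxxx/cat/partitions/p1/tasks", "1234");
	return 0;
}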
Yes, I ask the same question several times, and I really want to see the
directory/interface structure which solves all of the above before anyone
starts to implement it. We already have a completely useless interface (*1),
and there is no point in implementing another one based on it (*2) just
because it solves your particular issue and is the fastest way forward. User
space interfaces are hard, and we really do not need some half-baked solution
which we have to support forever.
Let me enumerate the required points again:
1) Information about the hardware properties
2) Integration of CAT and CDP
3) Per socket cos-id partitioning
4) Per cpu default cos-id association
5) Task association to cos-id
Can you please explain, in a simple directory-based scheme like the one I gave
you, how all of these points are going to be solved with a modification to (*1)?
Thanks,
tglx