Message-ID: <op.2motdpw6wjvjmi@hhuan26-mobl.amr.corp.intel.com>
Date: Tue, 23 Apr 2024 10:30:51 -0500
From: "Haitao Huang" <haitao.huang@...ux.intel.com>
To: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"jarkko@...nel.org" <jarkko@...nel.org>, "x86@...nel.org" <x86@...nel.org>,
"dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>,
"cgroups@...r.kernel.org" <cgroups@...r.kernel.org>, "hpa@...or.com"
<hpa@...or.com>, "mingo@...hat.com" <mingo@...hat.com>, "tj@...nel.org"
<tj@...nel.org>, "mkoutny@...e.com" <mkoutny@...e.com>, "Mehta, Sohil"
<sohil.mehta@...el.com>, "linux-sgx@...r.kernel.org"
<linux-sgx@...r.kernel.org>, "tim.c.chen@...ux.intel.com"
<tim.c.chen@...ux.intel.com>, "tglx@...utronix.de" <tglx@...utronix.de>,
"bp@...en8.de" <bp@...en8.de>, "Huang, Kai" <kai.huang@...el.com>
Cc: "mikko.ylinen@...ux.intel.com" <mikko.ylinen@...ux.intel.com>,
"seanjc@...gle.com" <seanjc@...gle.com>, "anakrish@...rosoft.com"
<anakrish@...rosoft.com>, "Zhang, Bo" <zhanb@...rosoft.com>,
"kristen@...ux.intel.com" <kristen@...ux.intel.com>, "yangjie@...rosoft.com"
<yangjie@...rosoft.com>, "Li, Zhiquan1" <zhiquan1.li@...el.com>,
"chrisyan@...rosoft.com" <chrisyan@...rosoft.com>
Subject: Re: [PATCH v12 09/14] x86/sgx: Implement async reclamation for cgroup
On Tue, 23 Apr 2024 09:19:53 -0500, Huang, Kai <kai.huang@...el.com> wrote:
> On Tue, 2024-04-23 at 08:08 -0500, Haitao Huang wrote:
>> On Mon, 22 Apr 2024 17:16:34 -0500, Huang, Kai <kai.huang@...el.com>
>> wrote:
>>
>> > On Mon, 2024-04-22 at 11:17 -0500, Haitao Huang wrote:
>> > > On Sun, 21 Apr 2024 19:22:27 -0500, Huang, Kai <kai.huang@...el.com>
>> > > wrote:
>> > >
>> > > > On Fri, 2024-04-19 at 20:14 -0500, Haitao Huang wrote:
>> > > > > > > I think we can add support for "sgx_cgroup=disabled" in
>> future
>> > > if
>> > > > > indeed
>> > > > > > > needed. But just for init failure, no?
>> > > > > > >
>> > > > > >
>> > > > > > It's not about the commandline, which we can add in the future
>> > > when
>> > > > > > needed. It's about we need to have a way to handle SGX cgroup
>> > > being
>> > > > > > disabled at boot time nicely, because we already have a case
>> > > where we
>> > > > > > need
>> > > > > > to do so.
>> > > > > >
>> > > > > > Your approach looks half-way to me, and is not future
>> > > > > > extendible. If we choose to do it, do it right -- that is, we
>> > > > > > need a way to disable it completely in both kernel and
>> > > > > > userspace so that userspace won't be able to see it.
>> > > > >
>> > > > > That would need more changes in misc cgroup implementation to
>> > > > > support sgx-disable. Right now misc does not have separate files
>> > > > > for different resource types. So we can only block echo
>> > > > > "sgx_epc..." to those interface files, can't really make files
>> > > > > not visible.
>> > > >
>> > > > "won't be able to see" I mean "only for SGX EPC resource", but
>> not the
>> > > > control files for the entire MISC cgroup.
>> > > >
>> > > > I replied at the beginning of the previous reply:
>> > > >
>> > > > "
>> > > > Given SGX EPC is just one type of MISC cgroup resources, we cannot
>> > > just
>> > > > disable MISC cgroup as a whole.
>> > > > "
>> > > >
>> > > Sorry, I missed this point. See below.
>> > >
>> > > > You just need to set the SGX EPC "capacity" to 0 to disable SGX
>> EPC.
>> > > See
>> > > > the comment of @misc_res_capacity:
>> > > >
>> > > > * Miscellaneous resources capacity for the entire machine. 0
>> capacity
>> > > > * means resource is not initialized or not present in the host.
>> > > >
>> > >
>> > > IIUC I don't think the situation we have is either of those cases.
>> For
>> > > our
>> > > case, resource is inited and present on the host but we have
>> allocation
>> > > error for sgx cgroup infra.
>> >
>> > You have calculated the "capacity", but later you failed something and
>> > then reset the "capacity" to 0, i.e., cleanup. What's wrong with
>> that?
>> >
>> > >
>> > > > And "blocking echo sgx_epc ... to those control files" is already
>> > > > sufficient for the purpose of not exposing SGX EPC to userspace,
>> > > correct?
>> > > >
>> > > > E.g., if SGX cgroup is enabled, you can see below when you read
>> "max":
>> > > >
>> > > > # cat /sys/fs/cgroup/my_group/misc.max
>> > > > # <resource1> <max1>
>> > > > sgx_epc ...
>> > > > ...
>> > > >
>> > > > Otherwise you won't be able to see "sgx_epc":
>> > > >
>> > > > # cat /sys/fs/cgroup/my_group/misc.max
>> > > > # <resource1> <max1>
>> > > > ...
>> > > >
>> > > > And when you try to write the "max" for "sgx_epc", you will hit
>> error:
>> > > >
>> > > > # echo "sgx_epc 100" > /sys/fs/cgroup/my_group/misc.max
>> > > > # ... echo: write error: Invalid argument
>> > > >
>> > > > The above applies to all the control files. To me this is pretty
>> much
>> > > > means "SGX EPC is disabled" or "not supported" for userspace.
>> > > >
>> > > You are right, capacity == 0 does block echoing max and users see an
>> > > error if they do that. But 1) I doubt you literally wanted "SGX EPC
>> > > is disabled" and to make it unsupported in this case,
>> >
>> > I don't understand. Something failed during SGX cgroup
>> initialization,
>> > you _literally_ cannot continue to support it.
>> >
>> >
>>
>> Then we should just return -ENOMEM from sgx_init() when sgx cgroup
>> initialization fails?
>> I thought we only disable SGX cgroup support. SGX can still run.
>
> I am not sure how you got this conclusion. I specifically said something
> failed during SGX "cgroup" initialization, so only SGX "cgroup" needs to
> be disabled, not SGX as a whole.
>
>>
>> > > 2) even if we accept this is "sgx
>> > > cgroup disabled" I don't see how it is much better user experience
>> than
>> > > current solution or really helps user better.
>> >
>> > In your way, the userspace is still able to see "sgx_epc" in control
>> > files
>> > and is able to update them. So from userspace's perspective SGX
>> cgroup
>> > is
>> > enabled, but obviously updating to "max" doesn't have any impact.
>> This
>> > will confuse userspace.
>> >
>> > >
>>
>> Setting capacity to zero also confuses user space. Some application may
>> rely on this file to know the capacity.
>
>
> Why??
>
> Are you saying before this SGX cgroup patchset those applications cannot
> run?
>
>>
>> > > Also to implement this approach, as you mentioned, we need
>> workaround
>> > > the
>> > > fact that misc_try_charge() fails when capacity set to zero, and
>> adding
>> > > code to return root always?
>> >
>> > Why this is a problem?
>> >
>>
>> It changes/overrides the original meaning of capacity==0: no one can
>> allocate if capacity is zero.
>
> Why??
>
> Are you saying before this series, no one can allocate EPC page?
>
>>
>> > > So it seems like more workaround code to just
>> > > make it work for a failing case no one really care much and end
>> result
>> > > is
>> > > not really much better IMHO.
>> >
>> > It's not workaround, it's the right thing to do.
>> >
>> > The result is userspace will see it being disabled when kernel
>> disables
>> > it.
>> >
>> >
>> It's a workaround because you use the capacity==0 but it does not really
>> mean to disable the misc cgroup for specific resource IIUC.
>
> Please read the comment around @misc_res_capacity again:
>
> * Miscellaneous resources capacity for the entire machine. 0 capacity
> * means resource is not initialized or not present in the host.
>
I mentioned this in an earlier email. I read this as meaning there is no
SGX EPC at all. It does not mean the SGX EPC cgroup is merely not enabled.
That reading is also consistent with the behavior that try_charge() fails
if capacity is zero.
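
For readers following the capacity == 0 argument, here is a minimal
userspace sketch of the semantics being debated. It is a toy model, not
the kernel's misc cgroup code: every toy_* name is hypothetical, and it
assumes only what this thread states (a resource with zero capacity is
hidden from misc.max, writes to its max fail with "Invalid argument", and
charging it fails), plus an assumed -EBUSY for going over the limit.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Toy userspace model; all names are hypothetical, not the kernel's. */
enum toy_res_type { TOY_RES_OTHER, TOY_RES_SGX_EPC, TOY_RES_TYPES };

static const char *const toy_res_name[TOY_RES_TYPES] = { "other", "sgx_epc" };

struct toy_misc {
	uint64_t capacity[TOY_RES_TYPES]; /* 0: not initialized or not present */
	uint64_t max[TOY_RES_TYPES];      /* per-group limit */
	uint64_t usage[TOY_RES_TYPES];    /* currently charged amount */
};

/* Charging a resource whose capacity is zero fails outright. */
static int toy_try_charge(struct toy_misc *m, enum toy_res_type t,
			  uint64_t amount)
{
	if (t >= TOY_RES_TYPES || !m->capacity[t])
		return -EINVAL;
	/* Assumed behavior: exceeding the limit or capacity is -EBUSY. */
	if (m->usage[t] + amount > m->max[t] ||
	    m->usage[t] + amount > m->capacity[t])
		return -EBUSY;
	m->usage[t] += amount;
	return 0;
}

/* Writing "max" for a zero-capacity resource is rejected. */
static int toy_max_write(struct toy_misc *m, enum toy_res_type t,
			 uint64_t max)
{
	if (t >= TOY_RES_TYPES || !m->capacity[t])
		return -EINVAL;
	m->max[t] = max;
	return 0;
}

/*
 * Format "<name> <max>" lines into buf, skipping zero-capacity resources.
 * Returns the number of resources shown.
 */
static int toy_max_show(const struct toy_misc *m, char *buf, size_t len)
{
	size_t off = 0;
	int shown = 0;

	if (len)
		buf[0] = '\0';
	for (int t = 0; t < TOY_RES_TYPES; t++) {
		int n;

		if (!m->capacity[t])
			continue;
		n = snprintf(buf + off, len - off, "%s %llu\n",
			     toy_res_name[t], (unsigned long long)m->max[t]);
		if (n < 0 || (size_t)n >= len - off)
			break; /* out of buffer space, stop formatting */
		off += (size_t)n;
		shown++;
	}
	return shown;
}
```

In this model, resetting capacity[TOY_RES_SGX_EPC] to 0 makes sgx_epc
vanish from the "max" listing and makes both writes and charges fail.
That is the crux of the disagreement: zero capacity hides the resource
from userspace, but it also blocks charging unless the caller adds a
separate fallback path.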
>>
>> There is explicit way for user to disable misc without setting capacity
>> to
>> zero.
>
> Which way are you talking about?
For example, echo "-misc" to cgroup.subtree_control at the root level.
After that, the sgx_epc capacity still shows as non-zero.
>
>> So in future if we want really disable sgx_epc cgroup specifically
>> we should not use capacity. Therefore your approach is not
>> extensible/reusable.
>>
>> Given this is a rare corner case caused by configuration, we can only do
>> as much as possible IMHO, not trying to implement a perfect solution at
>> the moment. Maybe BUG_ON() is more appropriate?
>>
>
> I think I will reply to this thread for the last time:
>
> I don't have a strong opinion against using BUG_ON() when you fail to
> allocate workqueue. If you choose to do this, I'll leave it to others.
>
> If you want to "disable SGX cgroup" when you fail to allocate workqueue,
> reset the "capacity" to 0 to disable it.
Unless I hear otherwise, I'll revert to BUG_ON().
Thanks
Haitao