Message-ID: <96018978-6b7f-1e7f-1012-9df7f7996ec5@redhat.com>
Date:   Wed, 15 Dec 2021 13:16:43 -0500
From:   Waiman Long <longman@...hat.com>
To:     Michal Koutný <mkoutny@...e.com>,
        Tejun Heo <tj@...nel.org>
Cc:     Zefan Li <lizefan.x@...edance.com>,
        Johannes Weiner <hannes@...xchg.org>,
        Jonathan Corbet <corbet@....net>,
        Shuah Khan <shuah@...nel.org>, cgroups@...r.kernel.org,
        linux-kernel@...r.kernel.org, linux-doc@...r.kernel.org,
        linux-kselftest@...r.kernel.org,
        Andrew Morton <akpm@...ux-foundation.org>,
        Roman Gushchin <guro@...com>, Phil Auld <pauld@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Juri Lelli <juri.lelli@...hat.com>,
        Frederic Weisbecker <frederic@...nel.org>,
        Marcelo Tosatti <mtosatti@...hat.com>
Subject: Re: [PATCH v9 6/7] cgroup/cpuset: Update description of
 cpuset.cpus.partition in cgroup-v2.rst

On 12/15/21 09:44, Michal Koutný wrote:
> On Mon, Dec 13, 2021 at 11:00:17AM -1000, Tejun Heo <tj@...nel.org> wrote:
>> * When a valid partition turns invalid, now we have a reliable way of
>>    discovering what exactly caused the transition. However, when a user now
>>    fails to turn a member into partition, all they get is -EINVAL and there's
>>    no way to discover why it failed and the failure conditions that -EINVAL
>>    represents aren't simple.
>>
>> * In automated configuration scenarios, this operation mode may be
>>    difficult to make reliable and lead to sporadic failures which can be
>>    tricky to track down. The core problem is that whether a given operation
>>    succeeds or not may depend on external states (CPU on/offline) which may
>>    change asynchronously in a way that the configuring entity doesn't have
>>    any control over.
>>
>> It's true that both are existing problems with the current partition
>> interface and given that this is a pretty specialized feature, this can be
>> okay. Michal, what are your thoughts?
> Because of asynchronous changes, the return value should not be that
> important and the user should watch cpuset.cpus.partition for the result
> (end state) anyway.
> Furthermore, the reasons should be IMO just informative (i.e. I like
> that they're not explicitly documented) and not API.
>
> But I see there could be a distinction between -EINVAL (the supplied
> input makes no sense) and -EAGAIN(?) denoting that the switch to
> partition root could not happen (due to outer constraints).
>
> You seem to propose to replace the -EAGAIN above with a success code and
> allow the switch to an invalid root.
> The action of the configuring entity would be different: retry (when?)
> vs wait till transition happens (notification) (although the immediate
> effect (the change did not happen) is same).
> I considered the two variants equal, but given the clear information
> about when the change can happen, I'd favor the variant allowing the
> switch to invalid root now.

Allowing a direct transition from member to invalid partition doesn't
feel right to me. A casual user may assume a partition is correctly
formed without double-checking the "cpuset.cpus.partition" value.
Returning an error will prevent this kind of issue. If returning more
information about the failure is the main reason for allowing the
invalid partition transition, we can extend the "cpuset.cpus.partition"
read syntax to also show the reason for the previous failure.
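
For illustration only, here is a minimal userspace sketch of the check
described above: write "root" to cpuset.cpus.partition (the file named
in the subject line), treat -EINVAL as a rejected request, and read the
file back instead of assuming the partition is valid. The helper names
are hypothetical, and the "root invalid (<reason>)" read format is an
assumption standing in for the extended read syntax suggested here, not
a confirmed kernel interface.

```python
import errno
import os


def parse_partition_state(text):
    """Split a cpuset.cpus.partition read into (state, is_valid, reason).

    Assumption: an invalid partition may read as "root invalid" or, with
    the extended syntax discussed in this thread, "root invalid (<reason>)".
    """
    text = text.strip()
    reason = None
    if text.endswith(")") and "(" in text:
        text, _, tail = text.partition("(")
        reason = tail[:-1].strip()  # drop the closing parenthesis
        text = text.strip()
    words = text.split()
    state = words[0] if words else ""
    is_valid = "invalid" not in words[1:]
    return state, is_valid, reason


def request_partition_root(cgroup_dir):
    """Try to make cgroup_dir a partition root; return (ok, why_not)."""
    path = os.path.join(cgroup_dir, "cpuset.cpus.partition")
    try:
        # The buffered write is flushed when the file is closed, so an
        # error from the kernel surfaces inside this try block.
        with open(path, "w") as f:
            f.write("root")
    except OSError as exc:
        if exc.errno != errno.EINVAL:
            raise
        return False, "write rejected with EINVAL"
    # Do not trust the successful write alone: read back the end state.
    with open(path) as f:
        state, is_valid, reason = parse_partition_state(f.read())
    if state == "root" and is_valid:
        return True, None
    return False, reason or "partition is not a valid root"
```

A caller would pass the cgroup directory, for example
request_partition_root("/sys/fs/cgroup/mygroup"), where the path is a
placeholder. Because CPU hotplug can invalidate a partition later, the
read-back only reflects the state at the time of the check.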

Cheers,
Longman
