lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAAYs2=gaTkA2f65SXkexxAUkSPxgaPNQGdkSKS4pYmJ3hO7z-Q@mail.gmail.com>
Date:   Thu, 16 Mar 2023 10:30:52 +0000
From:   Song Shuai <suagrfillet@...il.com>
To:     Sudeep Holla <sudeep.holla@....com>
Cc:     gregkh@...uxfoundation.org, rafael@...nel.org,
        conor.dooley@...rochip.com, ionela.voinescu@....com,
        Pierre.Gondois@....com, linux-kernel@...r.kernel.org
Subject: Re: [PATCH V2] arch_topology: Clear LLC sibling when cacheinfo teardown

Sudeep Holla <sudeep.holla@....com> 于2023年3月16日周四 09:29写道:
>
> On Tue, Mar 14, 2023 at 03:53:45PM +0800, Song Shuai wrote:
> > The teardown of CPUHP_AP_BASE_CACHEINFO_ONLINE now only invokes
> > free_cache_attributes() to clear share_cpu_map of cacheinfo list.
> > At the same time, clearing cpu_topology[].llc_sibling is
> > called quite late at the teardown code in hotplug STARTING section.
> >
> > To avoid the incorrect LLC sibling masks generated, move its clearing
> > right after free_cache_attributes().
> >
>
> Technically in terms of flow/timing this is correct. However I would like
> to know if you are seeing any issues without this change ?
>
> Technically, if a cpu is hotplugged out, the cacheinfo is reset first
> and then the topology. Until the cpu is removes, the LLC info in the
> topology is still valid. Also I am not sure if anything gets scheduled
> and this LLC info is utilised once the teardown of CPUHP_AP_BASE_CACHEINFO_ONLINE
> has started.

There is no visible issue in the entire offline process(eg: echo 0 > online).

However, when I hotplugged out the CPU into the state before CACHEINFO_ONLINE on
my kernel with the CONFIG_CPU_HOTPLUG_STATE_CONTROL configured,
the share_cpu_map had been updated but llc_sibling had not, which
would result in a trivial issue:

At the end of stepped hotplugging out, the cpuset_hotplug_work would
be flushed and then sched domain would be rebuilt
where the **cpu_coregroup_mask** in sched_domain_topology got
incorrect llc_sibling,
but the result of rebuilding was correct due to the protection of
cpu_active_mask.

The stepped hotplugging may not be used in the production environment,
but the issue existed.
Even in the entire offline process, it's possible that a future user
gets wrong the llc_sibling when accessing it concurrently or right
after the teardown of CACHEINFO_ONLINE.

>
> So I am trying to understand if we really need this change. Please let me
> know if I am missing anything here.
>
> --
> Regards,
> Sudeep



-- 
Thanks,
Song

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ