linux-kernel - Re: [PATCH v2 0/3] x86/intel_rdt: task_work vs task

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <22537adf-9280-ea1f-bac5-6c9a7a589ae9@intel.com>
Date:   Wed, 25 Nov 2020 11:06:35 -0800
From:   Reinette Chatre <reinette.chatre@...el.com>
To:     Valentin Schneider <valentin.schneider@....com>
Cc:     linux-kernel@...r.kernel.org, x86@...nel.org,
        Fenghua Yu <fenghua.yu@...el.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        "H. Peter Anvin" <hpa@...or.com>, James Morse <James.Morse@....com>
Subject: Re: [PATCH v2 0/3] x86/intel_rdt: task_work vs task_struct
 rmid/closid write race

Hi Valentin,

On 11/25/2020 10:39 AM, Valentin Schneider wrote:
> On 25/11/20 17:20, Reinette Chatre wrote:
>>>> Until the queued work is run, the moved task runs with old (and even
>>>> invalid in the case when its original resource group has been removed)
>>>> closid and rmid.
>>>>
>>>
>>> For a userspace task, that queued work should be run as soon as possible
>>> (& relevant). If said task is currently running, then task_work_add() will
>>> lead to an IPI;
>>> the other cases (task moving itself or not currently
>>> running) are covered by the return to userspace path.
>>
>> At this time the work is added with the TWA_RESUME flag so the running
>> task does not get a signal. I tried to follow the task_work_add() path
>> if there is a change to use TWA_SIGNAL instead and (I may have
>> misunderstanding) it seems to me that a sleeping task will be woken (if
>> it is TASK_INTERRUPTIBLE)? That is unnecessary. The goal of this work is
>> only to change the CPU register to indicate the active closid/rmid so it
>> is unnecessary to wake a process to do that, it only needs to be done
>> next time the task is scheduled in (which is already done with the
>> resctrl_sched_in() call in __switch_to()). If a task is not running all
>> that is needed is to change the closid/rmid in its task_struct to be
>> used next time it is scheduled in.
>>
> 
> The (default) TWA_RESUME ensures the targeted (userspace) task gets kicked
> if it is currently running, and doesn't perturb any CPU otherwise;
> see set_notify_resume() + exit_to_user_mode_loop() (or do_notify_resume()
> on arm64)

I missed that, thanks. The first issue is thus not a problem. Thank you 
very much for clearing this up. Queueing work for tasks that are not 
running remains unnecessary and simplifying this with a targeted 
smp_call_function addresses that (while also taking care of the other 
issues with using the queued work).

>> In the new solution, after updating closid/rmid in the task_struct, the
>> CPU register is updated via smp_call_function_single() on a CPU the task
>> is running. Nothing is done for tasks not running, next time they are
>> scheduled in the CPU's register will be updated to reflect the task's
>> closid/rmid. Moving to the smp_call_function_xxx() API would also bring
>> this update in line with how other register updates are already done in
>> resctrl.
>>
>>> Kernel threads however are a prickly matter because they quite explicitly
>>> don't have this return to userspace - they only run their task_work
>>> callbacks on exit. So we currently have to wait for those kthreads to go
>>> through a context switch to update the relevant register, but I don't
>>> see any other alternative that wouldn't involve interrupting every other
>>> CPU (the kthread could move between us triggering some remote work and its
>>> previous CPU receiving the IPI).
>>
>> This seems ok? In the new solution the closid/rmid would be updated in
>> task_struct and a smp_call_function_single() attempted on the CPU where
>> the kthread is running. If the kthread is no longer running at the time
>> the function is called the CPU register will not be changed.
> 
> Right, if the update happens before triggering the remote work then that
> should all work. I was stuck thinking about keeping the update contained
> within the remote work itself to prevent any other races (i.e. patch 3).

Are you saying that the task_struct update as well as register update 
should both be done in the remote work? I think I may be 
misunderstanding though. Currently, with your entire series applied, the 
update to task_struct is done before the remote work is queued that only 
changes the register. The new solution would also first update the 
task_struct and then the remote work (this time with smp_call_function) 
will just update the register.

 From what I understand your work in patch 3 would continue to be 
welcome with the new solution that will also update the task_struct and 
then trigger the remote work to just update the register.

> Anywho, that's enough speculation from me, I'll just sit tight and see what
> comes next!
> 

Reinette

>> I assume
>> the kthread move would include a context switch that would result in the
>> register change (__switch_to()->resctrl_sched_in()) for the kthread to
>> run with its new closid/rmid after the move.
>>


Reinette