Message-ID: <f96978d0-ae96-0b4e-042f-531d17cb217e@efficios.com>
Date:   Tue, 11 Apr 2023 08:57:20 -0400
From:   Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     linux-kernel@...r.kernel.org, Aaron Lu <aaron.lu@...el.com>,
        Olivier Dion <odion@...icios.com>, michael.christie@...cle.com
Subject: Re: [RFC PATCH v3] sched: Fix performance regression introduced by
 mm_cid

On 2023-04-11 05:37, Peter Zijlstra wrote:
> On Fri, Apr 07, 2023 at 09:14:36PM -0400, Mathieu Desnoyers wrote:
> 
>> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
>> index bc0e1cd0d6ac..f3e7dc2cd1cc 100644
>> --- a/kernel/sched/sched.h
>> +++ b/kernel/sched/sched.h
>> @@ -3354,6 +3354,37 @@ static inline int mm_cid_get(struct mm_struct *mm)
>>   static inline void switch_mm_cid(struct task_struct *prev, struct task_struct *next)
>>   {
>> +	/*
>> +	 * Provide a memory barrier between rq->curr store and load of
>> +	 * {prev,next}->mm->pcpu_cid[cpu] on rq->curr->mm transition.
>> +	 *
>> +	 * Should be adapted if context_switch() is modified.
>> +	 */
>> +	if (!next->mm) {                                // to kernel
>> +		/*
>> +		 * user -> kernel transition does not guarantee a barrier, but
>> +		 * we can use the fact that it performs an atomic operation in
>> +		 * mmgrab().
>> +		 */
>> +		if (prev->mm)                           // from user
>> +			smp_mb__after_mmgrab();
>> +		/*
>> +		 * kernel -> kernel transition does not change rq->curr->mm
>> +		 * state. It stays NULL.
>> +		 */
>> +	} else {                                        // to user
>> +		/*
>> +		 * kernel -> user transition does not provide a barrier
>> +		 * between rq->curr store and load of {prev,next}->mm->pcpu_cid[cpu].
>> +		 * Provide it here.
>> +		 */
>> +		if (!prev->mm)                          // from kernel
>> +			smp_mb();
>> +		/*
>> +		 * user -> user transition guarantees a memory barrier through
>> +		 * switch_mm().
>> +		 */
> 
> What about the user->user case where next->mm == prev->mm ? There
> sys_membarrier() relies on finish_task_switch()'s mmdrop(), but we
> can't.

AFAIU, finish_task_switch()'s mmdrop() is for the case where:

                  * [...] or in
                  * case 'prev->active_mm == next->mm' through
                  * finish_task_switch()'s mmdrop().

which applies to the case where we schedule from a kernel thread (which
kept the prior user task's mm as its active_mm) to a user task with the
same mm.

But this is really a kernel -> user transition, not user -> user?

Why should either membarrier or mm_cid care about a transition from
prev->mm to next->mm where mm is unchanged? Comparing prev->mm with
next->mm, it does not register as a transition at all.

I'll update my comment in switch_mm_cid() to:

      /*
       * user -> user transition guarantees a memory barrier through
       * switch_mm() when current->mm changes. If current->mm is
       * unchanged, no barrier is needed.
       */
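
As a rough illustration of the case analysis above, here is a small
userspace sketch (not kernel code) that maps each prev->mm / next->mm
combination to where the barrier is expected to come from. The enum,
struct mm_model and classify_transition() are hypothetical names made up
for this sketch; only the barrier names in the comments come from the
patch.

```c
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
enum barrier_source {
	BARRIER_NOT_NEEDED,      /* rq->curr->mm does not change */
	BARRIER_AFTER_MMGRAB,    /* user -> kernel: smp_mb__after_mmgrab() */
	BARRIER_EXPLICIT_SMP_MB, /* kernel -> user: smp_mb() */
	BARRIER_FROM_SWITCH_MM,  /* user -> user, mm changes: switch_mm() */
};

/* Stand-in for struct mm_struct; only pointer identity matters here. */
struct mm_model {
	int id;
};

/* A NULL mm models a kernel thread, as task->mm does in the kernel. */
static enum barrier_source
classify_transition(const struct mm_model *prev_mm,
		    const struct mm_model *next_mm)
{
	if (!next_mm)	/* to kernel */
		return prev_mm ? BARRIER_AFTER_MMGRAB : BARRIER_NOT_NEEDED;
	if (!prev_mm)	/* kernel -> user */
		return BARRIER_EXPLICIT_SMP_MB;
	/* user -> user: a barrier is only relevant when the mm changes */
	return prev_mm == next_mm ? BARRIER_NOT_NEEDED : BARRIER_FROM_SWITCH_MM;
}
```

In this model, a kernel thread that borrowed a user task's mm as its
active_mm still has a NULL mm, so switching from it back to a user task
with that same mm is classified as kernel -> user rather than
user -> user.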

Thanks,

Mathieu


> 
>> +	}
>>   	if (prev->mm_cid_active) {
>>   		mm_cid_put_lazy(prev);
>>   		prev->mm_cid = -1;
>>

-- 
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com
