Message-ID: <491D6B4EAD0A714894D8AD22F4BDE0439F98F4@SCYBEXDAG02.amd.com>
Date: Thu, 5 Apr 2012 08:37:46 +0000
From: "Chen, Dennis (SRDC SW)" <Dennis1.Chen@....com>
To: Ingo Molnar <mingo@...nel.org>
CC: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"mingo@...hat.com" <mingo@...hat.com>
Subject: RE: semaphore and mutex in current Linux kernel (3.2.2)
On Tue, Apr 3, 2012 at 3:52 PM, Ingo Molnar <mingo@...nel.org> wrote:
> I'm not sure what the point of comparative measurements with
> semaphores would be: for example we don't have per architecture
> optimized semaphores anymore, we switched the legacy semaphores
> to a generic version and are phasing them out.
>
About the point: it's very simple, I am curious about the mutex performance
optimization (actually I am curious about almost everything in the kernel :)
I know that the rationale of the mutex optimization is that if the lock owner is
running, it is likely to release the lock soon, so letting the waiter spin for a
short time waiting for the lock to be released is reasonable given the cost of a
process switch.
But what if the running lock owner does not release the lock soon? Does the mutex
then perform worse than a semaphore?
Look at the code below from the mutex slow path:
int mutex_spin_on_owner(struct mutex *lock, struct task_struct *owner)
{
        if (!sched_feat(OWNER_SPIN))
                return 0;

        rcu_read_lock();
        while (owner_running(lock, owner)) {
                if (need_resched())
                        break;

                arch_mutex_cpu_relax();
        }
        rcu_read_unlock();

        /*
         * We break out the loop above on need_resched() and when the
         * owner changed, which is a sign for heavy contention. Return
         * success only when lock->owner is NULL.
         */
        return lock->owner == NULL;
}
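For context, here is how the optimistic-spin loop in __mutex_lock_common()
(kernel/mutex.c) uses that return value. This is abridged and from memory, so
treat it as a sketch rather than the exact 3.2 source: when
mutex_spin_on_owner() returns 0, the spinner breaks out and falls through to
the normal sleeping slow path.

        for (;;) {
                struct task_struct *owner;

                /*
                 * If there is an owner, spin until it either releases
                 * the lock or is scheduled out.
                 */
                owner = ACCESS_ONCE(lock->owner);
                if (owner && !mutex_spin_on_owner(lock, owner))
                        break;

                /* the lock looks free: try to take it and return early */
                if (atomic_cmpxchg(&lock->count, 1, 0) == 1) {
                        lock_acquired(&lock->dep_map, ip);
                        mutex_set_owner(lock);
                        preempt_enable();
                        return 0;
                }

                /*
                 * No visible owner and we need to reschedule (or we are
                 * an RT task): stop spinning and fall back to blocking.
                 */
                if (!owner && (need_resched() || rt_task(task)))
                        break;

                arch_mutex_cpu_relax();
        }

So the waiter keeps spinning only while the owner stays on a CPU and nothing
wants to reschedule; otherwise it is queued on the wait list and sleeps.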
According to this code, I guessed the waiter would busy-wait in the while loop,
so I ran an experiment:
1. Write a simple character device kernel module whose read function busy-waits
for 2 minutes between mutex_lock()/mutex_unlock(), like this:
static ssize_t xxx_read(struct file *file, char __user *buf,
                        size_t count, loff_t *ppos)
{
        unsigned long j;

        mutex_lock(&c_mutex);
        j = jiffies + 120 * HZ;         /* hold the lock for 2 minutes */
        while (time_before(jiffies, j))
                cpu_relax();
        mutex_unlock(&c_mutex);

        return 0;
}
2. Write an application that opens and reads this device, and start it on 2
different CPUs at almost the same time (a fuller sketch of both pieces follows
after the commands below):
# taskset 0x00000001 ./main
# taskset 0x00000002 ./main
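For completeness, here is roughly what the two pieces look like. This is a
simplified sketch: the misc-device plumbing is just one convenient way to wire
up the xxx_read() handler above, and "xxx"/c_mutex are placeholder names.

/* test module: registers /dev/xxx with the read handler shown above */
#include <linux/module.h>
#include <linux/fs.h>
#include <linux/miscdevice.h>
#include <linux/mutex.h>
#include <linux/jiffies.h>
#include <linux/sched.h>

static DEFINE_MUTEX(c_mutex);

/* ... xxx_read() as shown above ... */

static const struct file_operations xxx_fops = {
        .owner = THIS_MODULE,
        .read  = xxx_read,
};

static struct miscdevice xxx_dev = {
        .minor = MISC_DYNAMIC_MINOR,
        .name  = "xxx",
        .fops  = &xxx_fops,
};

static int __init xxx_init(void)
{
        return misc_register(&xxx_dev);
}

static void __exit xxx_exit(void)
{
        misc_deregister(&xxx_dev);
}

module_init(xxx_init);
module_exit(xxx_exit);
MODULE_LICENSE("GPL");

/* main.c: open the device and issue one read (it blocks in xxx_read()) */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
        char buf[1];
        int fd = open("/dev/xxx", O_RDONLY);

        if (fd < 0) {
                perror("open");
                return 1;
        }
        read(fd, buf, sizeof(buf));
        close(fd);
        return 0;
}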
The app on CPU0 gets the mutex and runs for about 2 minutes before releasing it.
Given mutex_spin_on_owner(), I expected the app on CPU1 to spin in the while
loop, but the ps command output:
root 30197 0.0 0.0 4024 324 pts/2 R+ 11:30 0:00 ./main
root 30198 0.0 0.0 4024 324 pts/0 D+ 11:30 0:00 ./main
D+ means the app on CPU1 is sleeping in an UNINTERRUPTIBLE state. This is very
interesting: how does this happen? I checked my kernel config, and
CONFIG_MUTEX_SPIN_ON_OWNER is set; '/sys/kernel/debug# cat sched_features' outputs:
... ICK LB_BIAS OWNER_SPIN NONTASK_POWER TTWU_QUEUE NO_FORCE_SD_OVERLAP
I know I must have some misunderstanding of the code, but I don't know where it is...
> Mutexes have various advantages (such as lockdep coverage and in
> general tighter semantics that makes their usage more robust)
Yes, I agree.
> and we aren't going back to semaphores.
>
> What would make a ton of sense would be to create a 'perf bench'
> module that would use the kernel's mutex code and would measure
> it in user-space. 'perf bench mem' already does a simplified
>
> So if you'd be interested in writing that brand new benchmarking
> feature and need help then let the perf people know.
Thanks Ingo for the info; such a benchmarking feature for some kernel primitives
is already on my TODO list (a very rough sketch of the kind of harness I have in
mind is below)...
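Just to sketch the shape of it: this is only a toy harness that times N threads
hammering a pthread mutex, and a real 'perf bench' module would build the
kernel's mutex code for user-space instead, but the measurement loop would look
similar.

/* bench.c -- build: gcc -O2 -o bench bench.c -lpthread -lrt */
#include <pthread.h>
#include <stdio.h>
#include <time.h>

#define NTHREADS        4
#define ITERATIONS      1000000

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static volatile unsigned long counter;

static void *worker(void *arg)
{
        int i;

        for (i = 0; i < ITERATIONS; i++) {
                pthread_mutex_lock(&lock);
                counter++;              /* tiny critical section */
                pthread_mutex_unlock(&lock);
        }
        return NULL;
}

int main(void)
{
        pthread_t tid[NTHREADS];
        struct timespec t0, t1;
        int i;

        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (i = 0; i < NTHREADS; i++)
                pthread_create(&tid[i], NULL, worker, NULL);
        for (i = 0; i < NTHREADS; i++)
                pthread_join(tid[i], NULL);
        clock_gettime(CLOCK_MONOTONIC, &t1);

        printf("%d threads x %d lock/unlock pairs: %.3f s\n",
               NTHREADS, ITERATIONS,
               (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9);
        return 0;
}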