Message-ID: <alpine.DEB.2.11.1509161201210.3951@nanos>
Date: Wed, 16 Sep 2015 12:22:09 +0200 (CEST)
From: Thomas Gleixner <tglx@...utronix.de>
To: Zhu Jefferry <Jefferry.Zhu@...escale.com>
cc: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"bigeasy@...utronix.de" <bigeasy@...utronix.de>
Subject: RE: [PATCH v2] futex: lower the lock contention on the HB lock during
wake up
On Wed, 16 Sep 2015, Zhu Jefferry wrote:
> The application is a multi-threaded program that uses pairs of mutex_lock and
> mutex_unlock calls to protect a shared data structure. The type of this mutex
> is PTHREAD_MUTEX_PI_RECURSIVE_NP. After running for a long time, say several
> days, the mutex_lock data structure in user space looks corrupted.
>
> thread 0 can do mutex_lock/unlock
> __lock = this thread's TID | FUTEX_WAITERS
> __owner = 0, but it should be this thread's TID
The kernel does not know about __owner.
> __counter keeps increasing, although there is no recursive mutex_lock call.
>
> thread 1 will be stuck
>
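A minimal sketch of the split Thomas points out above: the kernel only ever
sees and updates the 32-bit futex word (__lock); __owner and __count are
glibc-side bookkeeping layered on top. The bit constants below are the real
ones from <linux/futex.h>, but the struct is simplified from glibc's
__pthread_mutex_s, and decode() with its sample values is illustrative only:

#include <stdint.h>
#include <stdio.h>

#define FUTEX_WAITERS    0x80000000u  /* a waiter is queued in the kernel */
#define FUTEX_OWNER_DIED 0x40000000u  /* previous owner died holding the lock */
#define FUTEX_TID_MASK   0x3fffffffu  /* low bits hold the owner's TID */

struct pi_mutex_state {     /* simplified from glibc's __pthread_mutex_s */
    uint32_t __lock;        /* the futex word: all the kernel ever sees */
    unsigned __count;       /* recursion depth: glibc-only */
    int      __owner;       /* cached owner TID: glibc-only */
};

static void decode(const struct pi_mutex_state *m)
{
    printf("kernel view: owner tid %u, waiters %s\n",
           (unsigned)(m->__lock & FUTEX_TID_MASK),
           (m->__lock & FUTEX_WAITERS) ? "yes" : "no");
    printf("glibc view : __owner %d, __count %u\n",
           m->__owner, m->__count);
}

int main(void)
{
    /* The state reported above: __lock = TID | FUTEX_WAITERS, but
       __owner = 0 -- the two views disagree, which is the corruption. */
    struct pi_mutex_state m = {
        .__lock = 4242u | FUTEX_WAITERS, .__count = 7, .__owner = 0
    };
    decode(&m);
    return 0;
}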
> The initial debugging shows that the content of __lock goes wrong first. After
> a call to mutex_unlock, the value of __lock should no longer be this thread's
> TID. But we observed that the value of __lock is still this thread's TID after
> the unlock, so other threads will be stuck.
How did you observe that?
> This thread can still lock again because of the recursive type, and __counter
> keeps increasing, although mutex_unlock keeps failing due to the wrong value
> of __owner; the application did not check the return value. So thread 0 looks
> fine, but thread 1 will be stuck forever.
Oh well. So thread 0 looks all fine, despite not checking return
values.
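To make that concrete: for a recursive (ownership-checked) mutex, an unlock by
a thread that glibc does not consider the owner fails with EPERM, so a caller
that checks the return value notices the corruption on the very first bad
unlock instead of limping along. checked_unlock() and the scenario in main()
below are a hypothetical sketch, not code from the reporter's application
(build with -pthread):

#include <pthread.h>
#include <stdio.h>
#include <string.h>

/* Wrapper that refuses to ignore unlock failures. With a corrupted
   __owner, unlock of a recursive mutex fails with EPERM; silently
   dropping that error leaves every other thread stuck forever on a
   lock that is never released -- the reported symptom. */
static void checked_unlock(pthread_mutex_t *m)
{
    int err = pthread_mutex_unlock(m);
    if (err != 0)
        fprintf(stderr, "mutex_unlock failed: %s\n", strerror(err));
}

int main(void)
{
    pthread_mutexattr_t a;
    pthread_mutex_t m;

    pthread_mutexattr_init(&a);
    pthread_mutexattr_settype(&a, PTHREAD_MUTEX_RECURSIVE);
    pthread_mutex_init(&m, &a);

    /* Unlocking a mutex this thread does not own: glibc reports EPERM,
       the same error an unlock sees when __owner no longer matches. */
    checked_unlock(&m);  /* prints: mutex_unlock failed: Operation not permitted */

    pthread_mutex_destroy(&m);
    return 0;
}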
Thanks,
tglx