Date:	Wed, 18 Feb 2015 15:13:52 -0500
From:	Steven Rostedt <rostedt@...dmis.org>
To:	Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Cc:	linux-kernel@...r.kernel.org,
	linux-rt-users <linux-rt-users@...r.kernel.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Carsten Emde <C.Emde@...dl.org>,
	John Kacur <jkacur@...hat.com>,
	Paul Gortmaker <paul.gortmaker@...driver.com>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	"H. Peter Anvin" <hpa@...or.com>
Subject: Re: [PATCH RT 1/2] rwsem-rt: Do not allow readers to nest

On Wed, 18 Feb 2015 20:57:10 +0100
Sebastian Andrzej Siewior <bigeasy@...utronix.de> wrote:

> * Steven Rostedt | 2014-04-08 22:47:01 [-0400]:
> 
> >From: "Steven Rostedt (Red Hat)" <rostedt@...dmis.org>
> >
> >The readers of mainline rwsems are not allowed to nest, so the rwsems in
> >the PREEMPT_RT kernel should not allow nesting either.
> 
> I applied this, and it is the reason why cpufreq isn't working. What I
> see in cpufreq is:
> |         test.sh-788   [004] .......    61.416288: store: down_read_try
> |         test.sh-788   [004] .......    61.416296: cpufreq_cpu_get: down_read_try
> |         test.sh-788   [004] .......    61.416301: cpufreq_cpu_put.part.6: up_read
> |         test.sh-788   [004] .......    61.416332: store: up_read
> 
> As you can see, one code path takes the read side of the rw_sema twice.
> 
> Looking at the generic implementation, we have:
> |#define RWSEM_UNLOCKED_VALUE            0x00000000L
> |#define RWSEM_ACTIVE_BIAS               0x00000001L
> |#define RWSEM_ACTIVE_READ_BIAS          RWSEM_ACTIVE_BIAS
> |#define RWSEM_WAITING_BIAS              (-RWSEM_ACTIVE_MASK-1)
> 
> | static inline int __down_read_trylock(struct rw_semaphore *sem)
> | {
> |         long tmp;
> | 
> |         while ((tmp = sem->count) >= 0) {
> |                 if (tmp == cmpxchg(&sem->count, tmp,
> |                                    tmp + RWSEM_ACTIVE_READ_BIAS)) {
> |                         return 1;
> |                 }
> |         }
> |         return 0;
> | }
> 
> While sem->count is >= 0 we loop and try to take the semaphore, so any
> number of readers can hold it at once. The first writer sets count to a
> negative value, which makes the trylock fail.
> 
> |static inline void __down_read(struct rw_semaphore *sem)
> |{
> |        if (unlikely(atomic_long_inc_return((atomic_long_t*)&sem->count) <= 0))
> |                rwsem_down_read_failed(sem);
> |}
> 
> Here it is the same thing but without cmpxchg(): _if_ the count is zero
> or negative after the increment, we take the slowpath. Otherwise we
> have the lock.
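
To make the count arithmetic concrete, here is a minimal userspace
sketch (my illustration with C11 atomics, not kernel code) of the same
reader fastpath: readers succeed while the count is non-negative, and a
writer pushing it negative shuts the fastpath off:

  #include <stdatomic.h>

  #define RWSEM_UNLOCKED_VALUE    0x00000000L
  #define RWSEM_ACTIVE_READ_BIAS  0x00000001L

  static atomic_long count = RWSEM_UNLOCKED_VALUE;

  /* Mirrors __down_read_trylock(): each reader adds the read bias;
   * the cmpxchg loop retries only while no writer has made the
   * count negative. */
  static int toy_down_read_trylock(void)
  {
          long tmp;

          while ((tmp = atomic_load(&count)) >= 0) {
                  if (atomic_compare_exchange_weak(&count, &tmp,
                                  tmp + RWSEM_ACTIVE_READ_BIAS))
                          return 1;
          }
          return 0;
  }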

OK, so I need to make it so that trylock can nest. I'll have to look at
the patch again because it has been a while.
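
For reference, the shape I have in mind is roughly the old rwsem_rt
read_depth scheme (a sketch from memory, not the actual patch; the
field and helper names are assumed): the trylock checks whether current
already owns the underlying rt_mutex and, if so, only bumps a recursion
counter instead of blocking:

  /* Hypothetical sketch only, in kernel context; assumes the -rt
   * rw_semaphore carries an rt_mutex plus a read_depth counter. */
  static int rt__down_read_trylock(struct rw_semaphore *rwsem)
  {
          if (rt_mutex_owner(&rwsem->lock) == current) {
                  rwsem->read_depth++;    /* recursive read acquire */
                  return 1;
          }
          if (rt_mutex_trylock(&rwsem->lock)) {
                  rwsem->read_depth = 1;
                  return 1;
          }
          return 0;
  }

up_read() would then drop read_depth and release the rt_mutex only when
the count reaches zero.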

> 
> I think I'm going to revert this patch. Where is it written that
> multiple readers of a RW-semaphore cannot nest? According to the code
> we can even have multiple readers without nesting (two or more
> processes may each take the reader lock).

A task must not do two down_read()s on the same rwsem (a trylock is
fine if it has a failure path). The reason is that the second
down_read() will block if a writer is waiting. That guarantees a
deadlock: you hold the lock for read, a writer comes in and waits, and
when you take the rwsem for read again you block behind the writer,
while the writer is waiting for you, the reader, to release.
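
The interleaving looks like this (illustrative comments only, assuming
a writer-fair rwsem that blocks new readers once a writer is queued):

  /* Task A */                  /* Task B */
  down_read(&sem);              /* A now holds sem for read */
                                down_write(&sem); /* blocks: reader active */
  down_read(&sem);              /* blocks behind the queued writer:
                                 * B waits for A, A waits for B. */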

I'll have to revisit this. I also need to revisit the multi-reader
support. Thomas hates it, but he even admitted there's a better way to
do it. Now if only I could remember what that was ;-)

Thanks,

-- Steve
