Date:	Wed, 11 Aug 2010 22:02:13 -0700
From:	Michel Lespinasse <walken@...gle.com>
To:	Tony Luck <tony.luck@...el.com>
Cc:	Linus Torvalds <torvalds@...ux-foundation.org>,
	David Howells <dhowells@...hat.com>,
	Ingo Molnar <mingo@...e.hu>,
	Thomas Gleixner <tglx@...utronix.de>,
	LKML <linux-kernel@...r.kernel.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Mike Waychison <mikew@...gle.com>,
	Suleiman Souhlal <suleiman@...gle.com>,
	Ying Han <yinghan@...gle.com>
Subject: Re: [PATCH 06/11] rwsem: wake queued readers when writer blocks on 
	active read lock

On Wed, Aug 11, 2010 at 6:24 PM, Tony Luck <tony.luck@...el.com> wrote:
> Linus' tree this morning[1] was behaving badly on ia64 ... processes would
> wander off into some unkillable state ... and since this happened to
> processes starting from rc*.d I couldn't get the system up to a login
> prompt. System is a 32-way (4 sockets * quad-core * hyperthread).
>
> git bisect pins the blame on this change (commit 424acaaeb...).
> Reverting it (and its successor a8618a0e - because I assumed that it
> depended on 424...) gives me a kernel that works fine.

Thanks for the report. FYI, a8618a0e does not depend on 424acaae, so it
should be fine if you revert only 424acaae.

> Not sure what is wrong with this change. Maybe ia64 needs some more memory
> ordering bits than the changed code provides? I can dig into it a bit
> harder tomorrow, but I thought you'd like an early heads-up in case anyone
> else is seeing similar problems.

In arch/ia64/include/asm/rwsem.h I see RWSEM_WAITING_BIAS defined as
-__IA64_UL_CONST(0x0000000100000000)

Because the negation is applied to an unsigned long constant, the result
is a large, positive unsigned value rather than a negative one. This is
probably throwing off the rwsem_atomic_update(0, sem) < RWSEM_WAITING_BIAS
comparison in my patch (supposed to be long versus long, but actually
long versus unsigned long on ia64).
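
To illustrate the suspected problem, here is a minimal user-space sketch
(not kernel code). It assumes a 64-bit long, and it assumes that
__IA64_UL_CONST() amounts to appending a UL suffix in C code; the macro
and function names below are hypothetical and exist only for this
example. When a long is compared against an unsigned long, the long
operand is converted to unsigned, so a small non-negative count compares
as "less than" the wrapped-around bias.

```c
#include <stdbool.h>

/* Assumption: LP64, i.e. long is 64 bits wide. */
_Static_assert(sizeof(long) == 8, "this sketch assumes a 64-bit long");

/* Hypothetical stand-ins for the two styles of definition. */
#define WAITING_BIAS_UL (-0x0000000100000000UL) /* wraps to 0xffffffff00000000 */
#define WAITING_BIAS_L  (-0x100000000L)         /* genuinely negative */

/* long versus unsigned long: count is converted to unsigned first. */
static bool below_waiting_bias_ul(long count)
{
	return count < WAITING_BIAS_UL;
}

/* long versus long: the intended signed comparison. */
static bool below_waiting_bias_l(long count)
{
	return count < WAITING_BIAS_L;
}
```

With count == 1 (a single active reader, nobody queued), the unsigned
variant returns true while the signed variant returns false. Both agree
for a count such as -0x200000000L, so the mismatch only shows up for
non-negative counts.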

Also, it looks like ia64 uses intrinsics for the atomic accesses, not
asm (as far as I could see in a five-minute look around), so maybe one
could just get rid of the __IA64_UL_CONST macros?

I cannot compile or test on ia64, but could you report what happens
if you replace the #defines in arch/ia64/include/asm/rwsem.h
with:
#define RWSEM_UNLOCKED_VALUE __IA64_UL_CONST(0x0000000000000000)
#define RWSEM_ACTIVE_BIAS (1L)
#define RWSEM_ACTIVE_MASK (0xffffffffL)
#define RWSEM_WAITING_BIAS (-0x100000000L)
#define RWSEM_ACTIVE_READ_BIAS RWSEM_ACTIVE_BIAS
#define RWSEM_ACTIVE_WRITE_BIAS (RWSEM_WAITING_BIAS + RWSEM_ACTIVE_BIAS)
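
As a sanity check on the arithmetic, the following user-space sketch
(again assuming a 64-bit long) copies the signed definitions above and
works out a few values. RWSEM_UNLOCKED_VALUE is simplified to 0L here
because __IA64_UL_CONST is not available outside the kernel tree, and
the two helper functions are hypothetical, added only for illustration.

```c
#include <stdbool.h>

/* Assumption: LP64, i.e. long is 64 bits wide. */
_Static_assert(sizeof(long) == 8, "this sketch assumes a 64-bit long");

#define RWSEM_UNLOCKED_VALUE    0L /* simplified for user space */
#define RWSEM_ACTIVE_BIAS       (1L)
#define RWSEM_ACTIVE_MASK       (0xffffffffL)
#define RWSEM_WAITING_BIAS      (-0x100000000L)
#define RWSEM_ACTIVE_READ_BIAS  RWSEM_ACTIVE_BIAS
#define RWSEM_ACTIVE_WRITE_BIAS (RWSEM_WAITING_BIAS + RWSEM_ACTIVE_BIAS)

/* Hypothetical helper: low 32 bits of the count (the active part). */
static long active_part(long count)
{
	return count & RWSEM_ACTIVE_MASK;
}

/* Hypothetical helper: true if the waiting bias is a negative long. */
static bool waiting_bias_is_negative(void)
{
	return RWSEM_WAITING_BIAS < 0;
}
```

With these definitions RWSEM_ACTIVE_WRITE_BIAS evaluates to
-0xffffffffL, active_part(RWSEM_ACTIVE_WRITE_BIAS) is 1, and
active_part(RWSEM_WAITING_BIAS) is 0, so the low 32 bits still carry
the active count while the bias itself stays negative.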


Cheers,

-- 
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.