linux-kernel - Re: [PATCH] synchronize

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20071019042025.GA9617@gondor.apana.org.au>
Date:	Fri, 19 Oct 2007 12:20:25 +0800
From:	Herbert Xu <herbert@...dor.apana.org.au>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	Benjamin Herrenschmidt <benh@...nel.crashing.org>,
	akpm@...ux-foundation.org,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	linuxppc-dev@...abs.org, Ingo Molnar <mingo@...e.hu>,
	Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [PATCH] synchronize_irq needs a barrier

On Thu, Oct 18, 2007 at 08:26:45PM -0700, Linus Torvalds wrote:
> 
> 
> On Thu, 18 Oct 2007, Linus Torvalds wrote:
> >
> > I *think* it should work with something like
> > 
> > 	for (;;) {
> > 		smp_rmb();
> > 		if (!spin_is_locked(&desc->lock)) {
> > 			smp_rmb();
> > 			if (!(desc->status & IRQ_INPROGRESS)
> > 				break;
> > 		}
> > 		cpu_relax();
> > 	}
> 
> I'm starting to doubt this. 
> 
> One of the issues is that we still need the smp_mb() in front of the loop 
> (because we want to serialize the loop with any writes in the caller).

Right.  I think if we accept the definition of a spin lock/unlock
that Nick outlined earlier, then we can see that on the IRQ path
there simply isn't a memory barrier between the changing of the
status field and the execution of the action:

	spin_lock
	set IRQ_INPROGRESS
	spin_unlock

	action

	spin_lock
	clear IRQ_INPROGRESS
	spin_unlock

What may happen is that action can either float upwards to give

	spin_lock
	action
	set IRQ_INPROGRESS
	spin_unlock

	spin_lock
	clear IRQ_INPROGRESS
	spin_unlock

or it can float downwards to give

	spin_lock
	set IRQ_INPROGRESS
	spin_unlock

	spin_lock
	clear IRQ_INPROGRESS
	action
	spin_unlock

As a result we can add as many barriers as we want on the slow
(synchronize_irq) path and it just won't do any good (not unless
we add some barriers on the IRQ path == fast path).

What we do have on the right though is the fact in some ways
action behaves as if it's inside the spin lock even though it's
not.  In particular, action cannot float up past the first spin
lock nor can it float down past the last spin unlock.

> See commit fa490cfd15d7ce0900097cc4e60cfd7a76381138 and ponder. Maybe we 
> should take the same approach here, and do something like
> 
> 	repeat:
> 		/* Optimistic, no-locking loop */
> 		while (desc->status & IRQ_INPROGRESS)
> 			cpu_relax();
> 
> 		/* Ok, that indicated we're done: double-check carefully */
> 		spin_lock_irqsave(&desc->lock, flags);
> 		status = desc->status;
> 		spin_unlock_irqrestore(&desc->lock, flags);
> 
> 		/* Oops, that failed? */
> 		if (status & IRQ_INPROGRESS)
> 			goto repeat;

That's why I think this patch is in fact the only one that
solves all the races in this thread.  The case that it solves
which the lock/unlock patch does not is the one where action
flows downwards past the clearing of IRQ_INPROGRESS.  I missed
this case earlier.

In fact this bug exists elsewhere too.  For example, the network
stack does this in net/sched/sch_generic.c:

        /* Wait for outstanding qdisc_run calls. */
	while (test_bit(__LINK_STATE_QDISC_RUNNING, &dev->state))
		yield();

This has the same problem as the current synchronize_irq code.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@...dor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/