linux-kernel - spin_unlock_wait() in ata_scsi_cmd_error

Message-Id: <20170629181057.GA5228@linux.vnet.ibm.com>
Date:   Thu, 29 Jun 2017 11:10:57 -0700
From:   "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To:     tj@...nel.org
Cc:     linux-ide@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: spin_unlock_wait() in ata_scsi_cmd_error_handler()?

Hello, Tejun!

We are having some discussion about the semantics of spin_unlock_wait()
(https://marc.info/?l=linux-kernel&m=149730349001044), and your code
contains one of its uses.

We seem to agree that spin_unlock_wait() should provide acquire semantics.
Consider the following admittedly bizarre code fragment:

	CPU 0			CPU 1
	-----			-----
	spin_unlock_wait(&ml);	/* Lock held initially. */
	WRITE_ONCE(x, 1);	r2 = READ_ONCE(x);
	r1 = READ_ONCE(y);	WRITE_ONCE(y, 1);
				spin_unlock(&ml);

	r1 == 0 || r2 == 1 /* evaluated "at the end of time" */

CPU 0's spin_unlock_wait() must wait for CPU 1 to release the lock,
which means that CPU 0's memory references must see the result of
CPU 1's memory references and not vice versa.  In other words, the
expression beneath the code fragment cannot hold.
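
For readers who want to experiment, here is a minimal userspace sketch of
the fragment above.  It is an assumption-laden model, not kernel code: the
lock is a C11 atomic_int, spin_unlock_wait() is modeled as an acquire-load
polling loop, spin_unlock() as a release store, and READ_ONCE()/WRITE_ONCE()
as relaxed atomics.  The names cpu0(), cpu1(), and count_forbidden_acquire()
are hypothetical.

```c
#include <pthread.h>
#include <stdatomic.h>

/* Userspace stand-ins, assumed for illustration only (not kernel code). */
static atomic_int ml;	/* 1 = lock held, 0 = lock free */
static atomic_int x, y;
static int r1, r2;

static void *cpu0(void *unused)
{
	(void)unused;
	/* spin_unlock_wait(&ml), modeled as an acquire-load polling loop. */
	while (atomic_load_explicit(&ml, memory_order_acquire))
		;
	atomic_store_explicit(&x, 1, memory_order_relaxed);	/* WRITE_ONCE(x, 1) */
	r1 = atomic_load_explicit(&y, memory_order_relaxed);	/* r1 = READ_ONCE(y) */
	return NULL;
}

static void *cpu1(void *unused)
{
	(void)unused;
	r2 = atomic_load_explicit(&x, memory_order_relaxed);	/* r2 = READ_ONCE(x) */
	atomic_store_explicit(&y, 1, memory_order_relaxed);	/* WRITE_ONCE(y, 1) */
	atomic_store_explicit(&ml, 0, memory_order_release);	/* spin_unlock(&ml) */
	return NULL;
}

/* Returns how many runs hit r1 == 0 || r2 == 1, or -1 on thread errors. */
int count_forbidden_acquire(int runs)
{
	int bad = 0;

	for (int i = 0; i < runs; i++) {
		pthread_t t0, t1;

		atomic_store(&ml, 1);	/* Lock held initially. */
		atomic_store(&x, 0);
		atomic_store(&y, 0);
		if (pthread_create(&t0, NULL, cpu0, NULL))
			return -1;
		if (pthread_create(&t1, NULL, cpu1, NULL)) {
			atomic_store(&ml, 0);	/* Let cpu0 finish before bailing out. */
			pthread_join(t0, NULL);
			return -1;
		}
		pthread_join(t0, NULL);
		pthread_join(t1, NULL);
		if (r1 == 0 || r2 == 1)
			bad++;
	}
	return bad;
}
```

In this model the release store pairs with the acquire load, so
count_forbidden_acquire() should return zero for any number of runs.
Older toolchains may need -pthread when building.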

The current sense is that spin_unlock_wait() will -not- provide
release semantics.  This calls for an even more bizarre code fragment:

	CPU 0			CPU 1
	-----			-----
	WRITE_ONCE(x, 1);	spin_lock(&ml);
	r1 = READ_ONCE(y);	r2 = READ_ONCE(x);
	spin_unlock_wait(&ml);	WRITE_ONCE(y, 1);
	WRITE_ONCE(z, 1);	/* Intentionally not releasing lock! */

	z == 1 && (r1 == 1 || r2 == 0) /* evaluated "at the end of time" */

If this code fragment doesn't deadlock, then CPU 0's spin_unlock_wait()
must have executed before CPU 1's spin_lock().  However, even on x86,
CPU 0's prior writes can be reordered with its subsequent reads, so
CPU 0's write to x might not yet be visible when CPU 1 reads it.  This
means that r2 == 0 is possible, and thus that the above condition
could hold.
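
One way to see the reordering is to walk through a single hand-picked
schedule in a toy store-buffer model.  The sketch below is deterministic
and single-threaded, and it is only an illustration under stated
assumptions: CPU 0 has a FIFO store buffer, spin_unlock_wait() is a plain
read of the lock word, and spin_lock() writes straight to memory.  The
names struct memory, struct outcome, run_store_buffer_schedule(), and
condition_holds() are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Globally visible memory, i.e. stores that have left any store buffer. */
struct memory { int x, y, z, ml; };

/* Final values examined "at the end of time". */
struct outcome { int r1, r2, z; };

struct outcome run_store_buffer_schedule(void)
{
	struct memory mem = { 0, 0, 0, 0 };
	struct outcome out;
	bool x_buffered;

	/* CPU 0: WRITE_ONCE(x, 1) enters CPU 0's store buffer only. */
	x_buffered = true;
	/* CPU 0: r1 = READ_ONCE(y) reads memory; CPU 1 has not run yet. */
	out.r1 = mem.y;
	/* CPU 0: spin_unlock_wait(&ml) sees the lock free and returns. */
	assert(mem.ml == 0);

	/* CPU 1: spin_lock(&ml) succeeds after CPU 0 stopped waiting. */
	mem.ml = 1;
	/* CPU 1: r2 = READ_ONCE(x) misses the store still buffered on CPU 0. */
	out.r2 = mem.x;
	/* CPU 1: WRITE_ONCE(y, 1); the lock is intentionally never released. */
	mem.y = 1;

	/* CPU 0: store buffer drains in order, x first and then z. */
	if (x_buffered)
		mem.x = 1;
	mem.z = 1;	/* WRITE_ONCE(z, 1) */

	out.z = mem.z;
	return out;
}

/* The condition beneath the code fragment. */
bool condition_holds(struct outcome o)
{
	return o.z == 1 && (o.r1 == 1 || o.r2 == 0);
}
```

In this schedule the outcome is r1 == 0, r2 == 0, and z == 1, so
condition_holds() returns true.  The model says nothing about how likely
such a schedule is on real hardware, only that nothing in it rules the
schedule out.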

One of the uses of spin_unlock_wait() is in ata_scsi_cmd_error_handler()
in the file drivers/ata/libata-eh.c.  Your commit ad9e27624479b
("libata-eh-fw: update ata_scsi_error() for new EH") last touched that
call, though the call itself predates that commit.

My question to you is whether the code in ata_scsi_cmd_error_handler()
needs release semantics.  If it does, my recommendation is to replace
the spin_unlock_wait(ap->lock) with this (adding the needed curly braces,
of course):

	spin_lock(ap->lock);
	spin_unlock(ap->lock);

If the code only needs acquire semantics, no change is required.
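
For illustration, the lock/unlock replacement suggested above can be
modeled in userspace C11 as follows.  This is a sketch only: struct
toy_spinlock and the toy_*() helpers are hypothetical stand-ins built on
atomic_flag, not the kernel's spinlock implementation.

```c
#include <stdatomic.h>

/* Hypothetical userspace stand-in for a spinlock. */
struct toy_spinlock {
	atomic_flag held;
};

static void toy_spin_lock(struct toy_spinlock *l)
{
	/* Acquire: later accesses cannot move before taking the lock. */
	while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
		;
}

static void toy_spin_unlock(struct toy_spinlock *l)
{
	/* Release: earlier accesses are visible to the next lock holder. */
	atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* Stand-in for spin_unlock_wait() built from a full lock/unlock pair. */
void toy_wait_for_holder(struct toy_spinlock *l)
{
	toy_spin_lock(l);
	toy_spin_unlock(l);
}
```

Unlike a pure polling loop, the pair actually takes the lock, so in this
model the caller's earlier writes are visible to whoever acquires the
lock next, at the cost of briefly contending for it.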

If your code requires release semantics, and there is some reason why
my suggested replacement above is a bad idea, please let me know!

							Thanx, Paul