Message-ID: <20151216063206.GA9866@xzibit.linux.bs1.fc.nec.co.jp>
Date:	Wed, 16 Dec 2015 06:32:08 +0000
From:	Junichi Nomura <j-nomura@...jp.nec.com>
To:	"peter@...leysoftware.com" <peter@...leysoftware.com>
CC:	"bhe@...hat.com" <bhe@...hat.com>,
	"gregkh@...uxfoundation.org" <gregkh@...uxfoundation.org>,
	"jslaby@...e.com" <jslaby@...e.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: v4.4-rc1: /dev/console open fails with -EIO

Since kernel v4.4-rc1, the kdump capture service on Fedora 23 / RHEL 7.2
almost always fails on my test system, which uses a serial console. It
worked fine up to kernel v4.3.

Kdump fails with an error like this:
  kdump.sh[1040]: /bin/kdump.sh: line 8: /dev/console: Input/output error

Line 8 of kdump.sh does this:
  exec &> /dev/console
(http://pkgs.fedoraproject.org/cgit/kexec-tools.git/tree/dracut-kdump.sh)

and the EIO is returned by this code in tty_reopen():
        if (!tty->count)
                return -EIO;

Bisection shows that commit 79c1faa4511e ("tty: Remove
tty_wait_until_sent_from_close()") is the first bad commit.
Indeed, after reverting that commit, kdump capture works again.

Open of /dev/console could already return -EIO when it raced with close.
(https://bugs.launchpad.net/ubuntu/+source/linux/+bug/554172/comments/245)
But the commit seems to widen the race window.

  Before the commit:
    tty_release()
      tty_lock(tty)
      tty->ops->close(tty, filp)
        tty_unlock(tty)
        tty_wait_until_sent()
        // the window starts from here
        tty_lock(tty)
      decrement tty->count
      tty_unlock(tty)
      (releasing tty if count became zero)

  After the commit:
    tty_release()
      // the window starts from here
      tty_lock(tty)
      tty->ops->close(tty, filp)
        tty_wait_until_sent()
      decrement tty->count
      tty_unlock(tty)
      (releasing tty if count became zero)

While user space could cope with the problem by retrying open(),
nothing tells it whether to retry or for how long.
Also, the current situation makes shell scripts like the above kdump.sh
fragile against this sort of timing change.
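
For illustration, a user-space workaround of that kind might look like the
sketch below. The helper name open_retry_eio() and its retry count and delay
parameters are hypothetical; they are not taken from kdump or any existing
tool, and the values a caller picks are guesses because the length of the
race window is unknown.

```c
#include <errno.h>
#include <fcntl.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper: open `path`, retrying only while open() fails
 * with EIO. max_tries and delay_ms are arbitrary caller-chosen values;
 * user space has no way to know how long the race window lasts.
 */
static int open_retry_eio(const char *path, int flags, int max_tries,
                          long delay_ms)
{
    struct timespec delay = {
        .tv_sec = delay_ms / 1000,
        .tv_nsec = (delay_ms % 1000) * 1000000L,
    };

    for (int i = 0; i < max_tries; i++) {
        int fd = open(path, flags);

        /* Success, or an error that retrying will not fix. */
        if (fd >= 0 || errno != EIO)
            return fd;

        /* Give a racing close a chance to finish before the next try. */
        if (i + 1 < max_tries)
            nanosleep(&delay, NULL);
    }

    /* Every attempt failed with EIO (or max_tries was not positive). */
    errno = EIO;
    return -1;
}
```

A caller could try something like open_retry_eio("/dev/console", O_WRONLY,
50, 100) and dup2() the result onto stdout and stderr, but any such limit is
still a guess, which is the weakness described above.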

How about retrying tty_open in the kernel instead, as in the attached patch?
If !tty->count in tty_reopen() means the race has happened, that
seems reasonable.

---
Jun'ichi Nomura, NEC Corporation

diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c
index bcc8e1e..070ea66 100644
--- a/drivers/tty/tty_io.c
+++ b/drivers/tty/tty_io.c
@@ -1462,8 +1462,9 @@ static int tty_reopen(struct tty_struct *tty)
 {
 	struct tty_driver *driver = tty->driver;
 
+	/* We cannot re-open tty which is being released. */
 	if (!tty->count)
-		return -EIO;
+		return -ERESTARTSYS;
 
 	if (driver->type == TTY_DRIVER_TYPE_PTY &&
 	    driver->subtype == PTY_TYPE_MASTER)
@@ -2087,6 +2088,11 @@ retry_open:
 
 	if (IS_ERR(tty)) {
 		retval = PTR_ERR(tty);
+		if (retval == -ERESTARTSYS && !signal_pending(current)) {
+			tty_free_file(filp);
+			schedule();
+			goto retry_open;
+		}
 		goto err_file;
 	}