linux-kernel - Re: n_tty: Check the other end of pty pair before returning EAGAIN on a read()

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <566A13C2.7040803@hurleysoftware.com>
Date:	Thu, 10 Dec 2015 16:07:30 -0800
From:	Peter Hurley <peter@...leysoftware.com>
To:	Marc Aurele La France <tsi@...oix.net>
Cc:	Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
	Jiri Slaby <jslaby@...e.com>, linux-kernel@...r.kernel.org,
	Volth <openssh@...th.com>, Damien Miller <djm@...drot.org>
Subject: Re: n_tty: Check the other end of pty pair before returning EAGAIN on
 a read()

On 12/10/2015 02:48 PM, Marc Aurele La France wrote:
> On Thu, 10 Dec 2015, Peter Hurley wrote:
>> On 12/09/2015 01:06 PM, Marc Aurele La France wrote:
>>> The following four commits are some of the changes that have been made
>>> to the tty layer since kernel version 3.11:
> 
>>> 1) f95499c3030fe1bfad57745f2db1959c5b43dca8
>>>     n_tty: Don't wait for buffer work in read() loop
> 
>>> 2) f8747d4a466ab2cafe56112c51b3379f9fdb7a12
>>>     tty: Fix pty master read() after slave closes
> 
>>> 3) 52bce7f8d4fc633c9a9d0646eef58ba6ae9a3b73
>>>     pty, n_tty: Simplify input processing on final close
> 
>>> 4) 1a48632ffed61352a7810ce089dc5a8bcd505a60
>>>     pty: Fix input race when closing
> 
>>> Indeed, the previous code (before commit "1)") checked the other end
>>> of the pty pair for any data before returning EAGAIN.  This mimics the
>>> behaviour of other System-V variants (Solaris, AIX, etc.)
> 
>> What other SysV systems were tested?
> 
> The fix for mindrot bug 52 was verified to correct the issue reported
> there on various versions of SunOS, OSF1, HP-UX, IRIX, Tru64 UNIX, AIX,
> OpenSolaris, and a number of Linux distributions.  There was a no-go
> report regarding HP-UX 11.11 that was never followed up on, but I
> believe it to have been resolved by the fix for mindrot bug 1467 (for
> systems where EAGAIN != EWOULDBLOCK).

Ok.

>>> in this
>>> regard and ensured that EAGAIN really did mean no data was available
>>> at the time of the call.
> 
>>> After sshd has been SIGCHLD'ed about the shell's termination, it
>>> continues to read the master pty until an error occurs.  This error
>>> will be EIO if no process has the slave pty open.  Otherwise (for
>>> example when the shell spawned long-running processes in the
>>> background before terminating), that error is expected to be EAGAIN.
>>> sshd cannot continue to read until an EIO in all cases, because doing
>>> so causes the session to hang until all processes have closed the
>>> slave pty, which is not the desired behaviour.  Thus a spurious EAGAIN
>>> return causes sshd to lose data, whether or not the slave pty is
>>> completely closed.
> 
>> Ah, the games userspace will be up to :)
> 
> Not really.

Definitely.

The idea that a read with O_NONBLOCK set should have synchronous behavior
is ridiculous.

> The fact different OSes behave differently in this regard can
> hardly be said to be userland's fault.  The lower the number of distinct
> behaviours userland needs to deal with, the better.  Furthermore, sshd
> "knows" there should be data there, so it makes no sense to befuddle it
> with false EAGAIN returns.

But sshd doesn't "know". sshd "knows" the data has been sent and that's all.
sshd is extrapolating from one known condition to another unknown condition,
and assuming it "should" be that way because it has been.

For example, try the same idea with real ttys on loopback. Wouldn't work,
because it's asynchronous.

The only reason this needs fixing is because it's a userspace regression.


>>> --- a/drivers/tty/n_tty.c
>>> +++ a/drivers/tty/n_tty.c
>>> @@ -2281,20 +2281,26 @@ static ssize_t n_tty_read(struct tty_struct *tty, struct file *file,
>>>  			if (!timeout)
>>>  				break;
>>>  			if (file->f_flags & O_NONBLOCK) {
>>> -				retval = -EAGAIN;
>>> -				break;
>>> -			}
>>> -			if (signal_pending(current)) {
>>> -				retval = -ERESTARTSYS;
>>> -				break;
>>> -			}
>>> -			up_read(&tty->termios_rwsem);
>>> +				up_read(&tty->termios_rwsem);
>>> +				flush_work(&tty->port->buf.work);
>>> +				down_read(&tty->termios_rwsem);
>>> +				if (!input_available_p(tty, 0)) {
>>> +					retval = -EAGAIN;
>>> +					break;
>>> +				}
>>> +			} else {
>>> +				if (signal_pending(current)) {
>>> +					retval = -ERESTARTSYS;
>>> +					break;
>>> +				}
>>> +				up_read(&tty->termios_rwsem);
> 
>> No sense in doing this just for O_NONBLOCK; might as well do it before
>> all the condition tests.
> 
>> Which renders the earlier fixes for the slave end closing superfluous,
>> so might as well rip those out.
> 
> This strikes me as somewhat draconian.  It seems to me the intent behind
> commit "1)" was in part to speed things up a bit.  I beleive that goal to
> have been largely attained.  It's just a matter of ensuring this speedup
> doesn't change kernel<->userland semantics.  I see your EIO fix and the
> eventual solution to the problem here in that light.

When I first saw this report, I considered pipelining the status change
using new tty flags, but then reconsidered. It's one thing to operate within
the confines of VFS but a whole different situation to workaround this
problem with some kind of process exit hack.

Unfortunately, this condition lands squarely in the path I wanted to
avoid; O_NONBLOCK and poll(). I see no clear solution other than that
presented to prevent the regression. And the existing work since 3.12
is really pointless if we have to wait in O_NONBLOCK and poll anyway.

This is just one of those unfortunate situations where userspace has come
to rely on an unspecified behavior because it worked.

>> n_tty_poll() will need to be fixed as well, because if one application
>> used read() with O_NONBLOCK to expect to block until i/o became available,
>> then I guarantee some other application uses poll() with no timeout
>> for the same purpose.
> 
> Agreed. which is another reason why I don't believe my suggestion to be
> the correct fix.  Also, I took somewhat of a hammer approach, meaning
> that, for OpenSSH's purposes, it would be sufficient to remove spurious
> EAGAIN only after disassociation of the slave pty.

I don't see another solution.

Regards,
Peter Hurley

>>>
>>> -			timeout = wait_woken(&wait, TASK_INTERRUPTIBLE,
>>> -					     timeout);
>>> +				timeout = wait_woken(&wait, TASK_INTERRUPTIBLE,
>>> +						     timeout);
>>>
>>> -			down_read(&tty->termios_rwsem);
>>> -			continue;
>>> +				down_read(&tty->termios_rwsem);
>>> +				continue;
>>> +			}
>>>  		}
>>>
>>>  		if (ldata->icanon && !L_EXTPROC(tty)) {
>>>
> 
> Thanks.
> 
> Marc.
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/