linux-kernel - Re: [PATCH resend v2] tty: n_tty -- Add new TIOCPEEKRAW ioctl to peek unread data

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20160411214211.GT2000@uranus.lan>
Date:	Tue, 12 Apr 2016 00:42:11 +0300
From:	Cyrill Gorcunov <gorcunov@...il.com>
To:	Peter Hurley <peter@...leysoftware.com>
Cc:	LKML <linux-kernel@...r.kernel.org>, Jiri Slaby <jslaby@...e.com>,
	Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
	Andrey Vagin <avagin@...tuozzo.com>,
	Pavel Emelianov <xemul@...tuozzo.com>,
	Vladimir Davydov <vdavydov@...tuozzo.com>,
	Konstantin Khorenko <khorenko@...tuozzo.com>
Subject: Re: [PATCH resend v2] tty: n_tty -- Add new TIOCPEEKRAW ioctl to
 peek unread data

On Mon, Apr 11, 2016 at 10:20:08AM -0700, Peter Hurley wrote:
...
> 
> My thoughts here are two-fold.
> 
> 1. I'd like whatever functionality is to be added to support checkpoint/
> restore to be a restrictive as possible while still accomplishing your
> objectives _exactly because I don't want it to grow other uses_.
> That's why I suggested asserting the task state (and maybe adding
> ptrace state checks) so that the interface can only be used for this
> purpose.
> 
> 2. If we want to make changes because the existing read()/write()
> interface is inadequate, let's add a complete interface.

I completely agree!

> > Strictly speaking saving complete state would be ideal but this require
> > exporting internal n_tty structure into user space as api (sure we
> > can name the members in any way we like, but this would force then
> > to keep backward compatibility when n_tty code get changed in future).
> > I think loosing some information like column and marker is acceptable
> > and fetching unread buffers is more general. I can try to implement
> > c/r'ing ioctl though which would not only fetch buffers but column
> > and such, and see how it goes, sounds ok?
> 
> My thinking here was to use an opaque buffer; it could be sized in the
> same manner as the peek interface; ie., provide a buffer address with
> size, and the call fails with an updated size required if inadequate
> (or whatever, but I think you get the idea).

Yeah, got it.

> >> A generic peek() may find other uses which could make future improvements
> >> difficult.
> >>
> >> Plus this ioctl() is only reading the 4k line discipline read buffer, and
> >> not the tty buffers which could contain up to another 8k of unprocessed
> >> input.
> > 
> > Which is scheduled for flush into port and then landed into ld when
> > write() called. True, I managed to miss this moment, thanks! So the
> > proper general peek() handling should be implemented on tty code level
> > rather than ld, and fetch both -- start with tty buffer and proceed with
> > ld buffer then, right?
> 
> To safely copy the tty buffers, it needs to be implemented like
> tty_buffer_flush():

...

> 6. Release ldisc reference
> 
> 	tty_ldisc_deref(ld);
>

Thanks a huge, got it!

...
> >>> +{
> >>> +	struct n_tty_data *ldata = tty->disc_data;
> >>> +	ssize_t retval;
> >>> +
> >>> +	if (!mutex_trylock(&ldata->atomic_read_lock))
> >>> +		return -EAGAIN;
> >>> +
> >>> +	down_read(&tty->termios_rwsem);
> >>
> >> Why take these two locks if parallel operations are not expected?
> >> Conversely, I would expect this to assert current state is TASK_TRACED.
> > 
> > If someone is already reading the data it will either exit soon with
> > data read or will sit here waiting for data to appear. So I thought
> > if we're fetching unread data we should not interfere with any other
> > readers? If someone is sitting in read procedure it implies that there
> > either no data on buffer and reader is waiting for it, or the read
> > procedure will complete soon, but indeed I rather should be checking
> > for read_cnt locklessly and return -EAGAIN if > 0 and can't lock
> > the atomic_read_lock. I implied that task can be in non-traced state.
> > If require traced indeed we won't need the lock. For criu lockless
> > + tracing state is fine but i thought someone else might be needing
> > to peek data without tracing the task. Hm?
> 
> Well, I'd rather not have other uses for peeking data. In fact, I need
> to check if this interface needs to be CAP_SYSADMIN; I'm pretty sure
> that ioctl() ignores file permissions and write-only file permissions
> (used by /usr/bin/write to send messages to other users) should not
> allow tty reads.
> 
> Ok, so we want to defend against parallel operations in case not
> all tty users are currently stopped by ptrace (such as when unrelated
> tasks are potentially reading or writing to either end and were not
> specified for dumping)?

I think so. At least this gives some kind of consistensy while we're
fetching data. Something close to peeking data from pipes/sockets.

> In that case, I think just write trylocking the termios rwsem should
> prevent any parallel i/o, at least for the N_TTY line discipline.
> 
> This should only interfere with processes not being dumped because
> ptrace signalling should have ejected any process to be dumped out
> of their i/o loops (readers or writers).
> 
> Also, I think we should further limit the interface based on what is
> supported currently; IOW, check that the driver is either pty or vt,
> assert that the line discipline is N_TTY at both ends, etc.

Thats a good idea, thanks, will do!