Message-ID: <20120719150316.GN10382@moon>
Date: Thu, 19 Jul 2012 19:03:16 +0400
From: Cyrill Gorcunov <gorcunov@...nvz.org>
To: Matthew Helsley <matt.helsley@...il.com>
Cc: linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
Al Viro <viro@...iv.linux.org.uk>,
Alexey Dobriyan <adobriyan@...il.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Pavel Emelyanov <xemul@...allels.com>,
James Bottomley <jbottomley@...allels.com>
Subject: Re: [rfc 5/7] fs, epoll: Add procfs fdinfo helper
On Thu, Jul 19, 2012 at 07:52:41AM -0700, Matthew Helsley wrote:
> On Wed, Jun 27, 2012 at 4:01 AM, Cyrill Gorcunov <gorcunov@...nvz.org> wrote:
> > This allows us to print out the eventpoll target file descriptor,
> > events and data. The /proc/pid/fdinfo/fd output consists of
> >
> > | pos: 0
> > | flags: 02
> > | tfd: 5 events: 1d data: ffffffffffffffff
> >
> > +#if defined(CONFIG_PROC_FS) && defined(CONFIG_CHECKPOINT_RESTORE)
> > +
> > +struct epitem_fdinfo {
> > + struct epoll_event ev;
> > + int fd;
> > +};
> > +
> > +static struct epitem_fdinfo *
> > +seq_lookup_fdinfo(struct proc_fdinfo_extra *extra, struct eventpoll *ep, loff_t num)
> > +{
> > + struct epitem_fdinfo *fdinfo = extra->priv;
> > + struct epitem *epi = NULL;
> > + struct rb_node *rbp;
> > +
> > + mutex_lock(&ep->mtx);
> > + for (rbp = rb_first(&ep->rbr); rbp; rbp = rb_next(rbp)) {
> > + if (num-- == 0) {
> > + epi = rb_entry(rbp, struct epitem, rbn);
> > + fdinfo->fd = epi->ffd.fd;
> > + fdinfo->ev = epi->event;
> > + break;
>
> This will be incredibly slow. epoll was designed to scale to tens of
> thousands of file descriptors. This algorithm is O(N^2) because each
> time we show a new epoll item we walk through the whole rb tree again
> (we're not doing a search so it isn't O(NlogN)).
Yeah, I know, it's quadratic. I'll rework this series to call seq_printf
directly and print out the whole tree in one pass when the corresponding
fdinfo file is read.
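
To make the cost difference concrete, here is a rough user-space sketch,
not the kernel code. It assumes a plain linked list as a stand-in for the
in-order rb-tree walk; struct item, lookup_nth(), count_visits_per_record()
and dump_single_pass() are hypothetical names, and the output format only
imitates the fdinfo line quoted above.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical user-space stand-in for an epitem; not the kernel structure. */
struct item {
	int fd;
	unsigned int events;
	unsigned long long data;
	struct item *next;	/* stands in for the rb_next() in-order step */
};

/* Mirrors the per-record lookup: restart from the first node every time. */
static const struct item *lookup_nth(const struct item *first, size_t num,
				     size_t *visits)
{
	const struct item *it;

	for (it = first; it; it = it->next) {
		(*visits)++;
		if (num-- == 0)
			return it;
	}
	return NULL;
}

/* One lookup per record: n * (n + 1) / 2 node visits for n items. */
static size_t count_visits_per_record(const struct item *first, size_t n)
{
	size_t visits = 0, num;

	for (num = 0; num < n; num++)
		lookup_nth(first, num, &visits);
	return visits;
}

/* Single pass: format every record while walking the list once. */
static size_t dump_single_pass(const struct item *first, char *buf,
			       size_t len, size_t *visits)
{
	const struct item *it;
	size_t off = 0;

	for (it = first; it && off < len; it = it->next) {
		int n;

		(*visits)++;
		n = snprintf(buf + off, len - off,
			     "tfd: %d events: %x data: %llx\n",
			     it->fd, it->events, it->data);
		if (n < 0 || (size_t)n >= len - off) {
			buf[off] = '\0';	/* drop the truncated record */
			break;
		}
		off += (size_t)n;
	}
	return off;
}
```

With four items the per-record variant touches 10 nodes and the single
pass touches 4. At tens of thousands of descriptors the gap is what makes
the first variant impractical.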
> Also, we could miss one or more later items if one of the earlier
> items is removed from the epoll set in between "seq_lookup_fdinfo"
> calls. This isn't a problem for checkpoint because we assume the task
> (and everything with this eventpoll file in its fd table) is frozen.
> However it means the file will be worse than useless for almost any
> other purpose because they are unlikely to realize they need to freeze
> all the task(s) to get consistent data.
Well, any data read from proc is consistent only at the moment of
reading.
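
For reference, the skipped-item case Matthew describes can be reproduced
with a small user-space sketch. It assumes an array kept in traversal
order as a stand-in for the epoll set; struct fdset, fdset_remove() and
fdset_nth() are hypothetical names, not kernel interfaces.

```c
#include <stddef.h>
#include <string.h>

#define MAX_ITEMS 8

/* Hypothetical stand-in for an epoll set: fds kept in traversal order. */
struct fdset {
	int fds[MAX_ITEMS];
	size_t count;
};

/* Remove one fd; later entries shift down by one position. */
static void fdset_remove(struct fdset *s, int fd)
{
	size_t i;

	for (i = 0; i < s->count; i++) {
		if (s->fds[i] == fd) {
			memmove(&s->fds[i], &s->fds[i + 1],
				(s->count - i - 1) * sizeof(s->fds[0]));
			s->count--;
			return;
		}
	}
}

/* Positional lookup, as a reader resuming by record number would do. */
static int fdset_nth(const struct fdset *s, size_t num, int *fd)
{
	if (num >= s->count)
		return 0;
	*fd = s->fds[num];
	return 1;
}
```

Starting from fds 3, 4, 5, 6: reading positions 0 and 1 yields 3 and 4.
If fd 3 is then removed, position 2 now holds 6, so fd 5 is never
reported even though it stayed in the set the whole time.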
Cyrill