Message-ID: <1338553486.2708.25.camel@menhir>
Date:	Fri, 01 Jun 2012 13:24:46 +0100
From:	Steven Whitehouse <swhiteho@...hat.com>
To:	Eric Dumazet <eric.dumazet@...il.com>
Cc:	linux-kernel@...r.kernel.org, cluster-devel@...hat.com,
	Al Viro <viro@...iv.linux.org.uk>, nstraz@...hat.com
Subject: Re: seq_file: Use larger buffer to reduce time traversing lists

Hi,

On Fri, 2012-06-01 at 14:10 +0200, Eric Dumazet wrote:
> On Fri, 2012-06-01 at 11:39 +0100, Steven Whitehouse wrote:
> > I've just been taking a look at the seq_read() code, since we've noticed
> > that dumping files with large numbers of records can take some
> > considerable time. This is due to seq_read() using a buffer which is,
> > at most, a single page in size, and to it having to find its place
> > again on every call to seq_read(). That makes it rather inefficient.
> > 
> > As an example, I created a GFS2 filesystem with 100k inodes in it, and
> > then ran ls -l to get a decent number of cached inodes. This results in
> > there being approx 400k lines in the debugfs file containing GFS2's
> > glocks. I then timed how long it takes to read this file:
> > 
> > [root@...woon mnt]# time dd if=/sys/kernel/debug/gfs2/unity\:myfs/glocks
> > of=/dev/null bs=1M
> > 0+5769 records in
> > 0+5769 records out
> > 23273958 bytes (23 MB) copied, 63.3681 s, 367 kB/s
> 
> What time do you get if you do
> 
> time dd if=/sys/kernel/debug/gfs2/unity\:myfs/glocks of=/dev/null bs=4k
> 
Yes, you are right that it will still be slow. But I'm not sure what we
can do about that; we have to find our place again on each read call in
any case, I think.
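
To illustrate what I mean (a hand-rolled sketch, not the actual GFS2
code; example_seq_start and example_list are made-up names): a typical
list-backed ->start() looks much like seq_list_start() in
fs/seq_file.c, and has to walk from the head of the list to the saved
position on every single read() call:

	/* Illustrative only: walk from the head until we reach the
	 * record index saved from the previous read() call. */
	static void *example_seq_start(struct seq_file *m, loff_t *pos)
	{
		struct list_head *lh;
		loff_t n = *pos;

		list_for_each(lh, &example_list)
			if (n-- == 0)
				return lh;

		return NULL;	/* ran off the end of the list */
	}

With ~400k records and at most a page of output per cycle, that walk
gets repeated thousands of times.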

> This patch seems the wrong way to me.
> 
> seq_read(size = 1MB) should perform many copy_to_user() calls instead of a single one.
> 
> Instead of doing kmalloc(m->size <<= 1, GFP_KERNEL) each time we overflow the buffer,
> we should flush its content to user space.
> 
> 
> 
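
If I understand the suggestion, the read loop would become something
along these lines (a rough, untested sketch with the error handling
omitted; "copied", "size" and "buf" are as in the real seq_read()):

	while (p && copied < size) {
		size_t before = m->count;

		m->op->show(m, p);
		if (m->count == m->size) {	/* record didn't fit */
			m->count = before;	/* drop the partial record */
			if (copy_to_user(buf + copied, m->buf, m->count))
				return -EFAULT;
			copied += m->count;
			m->count = 0;
			continue;		/* retry this record */
		}
		p = m->op->next(m, p, &m->index);
	}

(A single record larger than the whole buffer would still need the
buffer to grow, of course.)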

Yes, I thought of that... there is a potential problem though. Is it
valid, in general, to copy to user space between the iterator's
->start() and ->stop() calls? Those callbacks might be holding locks
which could be required during a page fault caused by the
copy_to_user(). That's not an issue in the GFS2 case, but I don't know
whether it is generally ok to do that. I assume that is why the results
are buffered rather than being written directly to userspace anyway.
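
For example, plenty of iterators follow this kind of pattern (made-up
names again, not any particular filesystem's code):

	static void *locked_seq_start(struct seq_file *m, loff_t *pos)
	{
		spin_lock(&example_lock);	/* held across the whole walk */
		return seq_list_start(&example_list, *pos);
	}

	static void locked_seq_stop(struct seq_file *m, void *v)
	{
		spin_unlock(&example_lock);
	}

A copy_to_user() issued between those two callbacks can fault and
sleep, which is simply not allowed under a spinlock; and even with a
sleepable lock, the fault path might recurse back into the code that
is holding it.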

As soon as we do a ->stop(), we have to go through the whole list
(assuming a list-based data source) again to find our place :(
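
Back-of-the-envelope from the numbers above: the 23MB came out in 5769
reads of roughly 4k each, and if each read has to re-walk to an average
depth of half of ~400k records, that is on the order of 5769 * 200k,
i.e. around 10^9 list steps just to read the file once.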

Steve.

