Date:	Fri, 16 Sep 2011 23:02:38 +0200
From:	Andres Freund <andres@...razel.de>
To:	Benjamin LaHaise <bcrl@...ck.org>
Cc:	Matthew Wilcox <matthew@....cx>, Andi Kleen <andi@...stfloor.org>,
	viro@...iv.linux.org.uk, linux-fsdevel@...r.kernel.org,
	linux-kernel@...r.kernel.org, robertmhaas@...il.com,
	pgsql-hackers@...tgresql.org
Subject: Re: Improve lseek scalability v3

On Friday, September 16, 2011 10:08:17 PM Benjamin LaHaise wrote:
> On Fri, Sep 16, 2011 at 07:27:33PM +0200, Andres Freund wrote:
> > many tuples does the table have. Those statistics are only updated every
> > now and then though.
> > So it uses those old stats to check how many tuples are normally stored
> > on a page and then uses that to extrapolate the number of tuples from
> > the current nr of pages (which is computed by lseek(SEEK_END) over the
> > 1GB segments of a table).
> > 
> > I am not sure how interested you are in the relevant postgres internals?
> 
> For such tables, can't Postgres track the size of the file internally?  I'm
> assuming it's keeping file descriptors open on the tables it manages, in
> which case when it writes to a file to extend it, the internally stored
> size could be updated.  Not making a syscall at all would scale far better
> than even a modified lseek() will perform.
Yes, it tracks the fds internally. The problem is that postgres is
process-based, so those fd tables are not reachable by all processes. It could
start tracking those in shared memory, but the synchronization overhead for
that would likely be higher than the syscall overhead. (Given that the fd sets
are possibly, and realistically, disjoint between the individual backends, you
would have to reserve enough shared memory for a fully separate set of fds for
each process, which would complicate efficient lookup.)

Also with fstat() instead of lseek() there was no bottleneck anymore, so I 
don't think the benefits would warrant that.
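
For illustration, a minimal sketch in C of the two ways of asking the kernel
for a file's size that are being compared here. This is not PostgreSQL code:
the function names and the BLOCK_SIZE constant are made up for this example,
and 8192 is only the default PostgreSQL block size.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/* Assumed page size; 8192 bytes is the PostgreSQL default (BLCKSZ). */
#define BLOCK_SIZE 8192

/* File size via lseek(SEEK_END). Note: this moves the file offset of fd. */
off_t size_via_lseek(int fd)
{
    return lseek(fd, 0, SEEK_END);      /* (off_t)-1 on error */
}

/* File size via fstat(). Leaves the file offset of fd untouched. */
off_t size_via_fstat(int fd)
{
    struct stat st;

    if (fstat(fd, &st) != 0)
        return (off_t)-1;
    return st.st_size;
}

/* Number of complete pages in one segment file, or -1 on error. */
long blocks_in_file(int fd)
{
    off_t size = size_via_fstat(fd);

    return size < 0 ? -1 : (long)(size / BLOCK_SIZE);
}
```

Both calls return the same size for a regular file; the difference that
matters in this thread is only how they behave under contention.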

Greetings,

Andres