Date:	Sun, 3 Jun 2007 21:05:51 -0400
From:	"Aaron Wiebe" <epiphani@...il.com>
To:	linux-kernel@...r.kernel.org
Cc:	"John Stoffel" <john@...ffel.org>
Subject: Re: slow open() calls and o_nonblock

Hi John, thanks for responding.  I'm using kernel 2.6.20 on a
home-grown distro.

I've responded to a few specific points inline - but as a whole,
Davide directed me to work that is being done specifically to address
these issues in the kernel, as well as a userspace implementation that
would allow me to sidestep this failing for the time being.


On 6/3/07, John Stoffel <john@...ffel.org> wrote:
>
> How large are these files?  Are they all in a single directory?  How
> many files are in the directory?
>
> Ugh. Why don't you just write to a DB instead?  It sounds like you're
> writing small records, with one record to a file.  It can work, but
> when you're doing thousands per-minute, the open/close overhead is
> starting to dominate.  Can you just amortize that overhead across a
> bunch of writes instead by writing to a single file which is more
> structured for your needs?

In short, I'm distributing logs in realtime for about 600,000
websites.  The sources of the logs (http, ftp, realmedia, etc) are
flexible; however, the base framework was built around a large cluster
of webservers.  The output can be to several hundred thousand files
across about two dozen filers for user consumption - some can be very
active, some can be completely inactive.

> Netapps usually scream for NFS writes and such, so it sounds to me
> that you've blown out the NVRAM cache on the box.  Can you elaborate
> more on your hardware & Network & Netapp setup?

You're totally correct here - Netapp has told us as much about our
filesystem design: we use too much RAM on the filer itself.  It's true
that the application would cope just fine if our filesystem
structure were redesigned - I am approaching this from an application
perspective though.  These units are capable of the raw IO; it's the
simple fact that open calls are taking a while.  If I were to thread
off the application (Davide has been kind enough to provide some
libraries that will make that substantially easier), the problem
wouldn't exist.
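
As a rough illustration of threading off the open calls (this is not the
library Davide provided, which is not shown in this thread), the blocking
open() can be handed to a small pool of worker threads so the main loop
never waits on the filer.  The sketch uses Python's standard thread pool;
the helper names open_for_append and submit_open are hypothetical.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def open_for_append(path):
    # os.open() may stall for a long time on a slow NFS server.
    # It runs in a worker thread, so only that worker is suspended.
    return os.open(path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o644)


def submit_open(pool, path):
    # Returns a Future immediately; the descriptor is available later
    # through future.result() or a done-callback.
    return pool.submit(open_for_append, path)
```

The caller keeps processing other log lines and checks future.done()
(or attaches a callback) to pick up the descriptor once the open has
finished.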

> The problem is that O_NONBLOCK on files open doesn't make sense.  You
> either open it, or you don't.  How long it takes to complete isn't part
> of the spec.

You can certainly open the file, but not block on the call to do it.
What confuses me is why the kernel would "block" for 415ms on an open
call.  That's an eternity to suspend a process that has to distribute
data such as this.
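
One way to put a number on a stall like this is to time the call
directly.  The sketch below is an assumption about how such a
measurement might be taken, not the method behind the 415ms figure;
timed_open is a hypothetical helper.  Per the Linux open(2) man page,
O_NONBLOCK has no effect for regular files, so passing the flag is not
expected to shorten the call.

```python
import os
import time


def timed_open(path, flags):
    # Measure wall-clock time spent inside the open() system call.
    start = time.monotonic()
    fd = os.open(path, flags)
    elapsed_ms = (time.monotonic() - start) * 1000.0
    return fd, elapsed_ms
```

Calling it once with os.O_WRONLY | os.O_APPEND and once with
os.O_NONBLOCK added, against the same NFS-backed path, would show
whether the flag makes any difference on a given setup.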

> But in this case, I think you're doing something hokey with your data
> design.  You should be opening just a handful of files and then
> streaming your writes to those files.   You'll get much more
> performance.

Except I can't very well keep 600,000 files open over NFS.  :)  Pool
and queue, and cycle through the pool.  I've managed to achieve a
balance in my production deployment with this method - my email was
more of a rant after months of trying to work around a problem (caused
by a limitation in system calls), only to have it turn out an order of
magnitude worse than I expected.  Sorry for not giving more
information up front - and thanks for your time.
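
A minimal sketch of the pool-and-cycle idea, assuming a fixed cap on
open descriptors with the least recently used file closed to make room.
The eviction policy, the class name FdPool and its methods are
illustrative assumptions; the production design is not described in
this message.

```python
import os
from collections import OrderedDict


class FdPool:
    def __init__(self, max_open):
        if max_open < 1:
            raise ValueError("max_open must be at least 1")
        self.max_open = max_open
        self._fds = OrderedDict()  # path -> fd, oldest entry first

    def get(self, path):
        fd = self._fds.get(path)
        if fd is not None:
            # Recently used files move to the back of the eviction order.
            self._fds.move_to_end(path)
            return fd
        if len(self._fds) >= self.max_open:
            # Pool is full: close the least recently used descriptor.
            _, old_fd = self._fds.popitem(last=False)
            os.close(old_fd)
        fd = os.open(path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o644)
        self._fds[path] = fd
        return fd

    def append(self, path, data):
        os.write(self.get(path), data)

    def close_all(self):
        while self._fds:
            _, fd = self._fds.popitem()
            os.close(fd)
```

Active files stay open and pay the open() cost once, while inactive
ones drop out of the pool as busier files cycle through it.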

-Aaron
