Message-ID: <1545375580.3424480.1368968403367.JavaMail.ngmail@webmail10.arcor-online.net>
Date:	Sun, 19 May 2013 15:00:03 +0200 (CEST)
From:	frankcmoeller@...or.de
To:	linux-ext4@...r.kernel.org
Subject: Aw: Re: Aw: Re: Ext4: Slow performance on first write after mount

Hi,

> One question regarding fallocate: I create a new file and do a 100 MB
> fallocate with FALLOC_FL_KEEP_SIZE. Then I write only 70 MB to that file
> and close it. Is the 30 MB unused preallocated space still preallocated
> for that file after closing it? Or does a close release the preallocated
> space?

I did some tests and can now answer this myself ;-)
The space stays preallocated after the file is closed. Unmounting does not
release the space either. Interesting!
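
A minimal sketch of how such a test could be repeated. It is not the program
used for the test above. It assumes 64-bit Linux and calls fallocate(2)
through ctypes, because Python's os.posix_fallocate() cannot pass
FALLOC_FL_KEEP_SIZE. The helper names are made up for illustration.

```python
import ctypes
import os

FALLOC_FL_KEEP_SIZE = 0x01  # value from <linux/falloc.h>


def fallocate_keep_size(fd, offset, length):
    """Preallocate [offset, offset + length) without changing the file size.

    Assumption: 64-bit Linux, where off_t is 64 bits wide.
    """
    libc = ctypes.CDLL(None, use_errno=True)
    libc.fallocate.argtypes = [ctypes.c_int, ctypes.c_int,
                               ctypes.c_int64, ctypes.c_int64]
    libc.fallocate.restype = ctypes.c_int
    if libc.fallocate(fd, FALLOC_FL_KEEP_SIZE, offset, length) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))


def preallocated_after_close(path, prealloc_bytes, written_bytes):
    """Preallocate, write less than that, close, then stat the file.

    Returns (st_size, allocated_bytes). On Linux st_blocks counts
    512-byte units, so st_blocks * 512 is the space actually reserved.
    """
    with open(path, "wb") as f:
        fallocate_keep_size(f.fileno(), 0, prealloc_bytes)
        f.write(b"x" * written_bytes)
    # The file is closed here; stat shows what survived the close.
    st = os.stat(path)
    return st.st_size, st.st_blocks * 512
```

Calling preallocated_after_close(path, 100 * 2**20, 70 * 2**20) on an ext4
mount would be expected, if the result above holds, to report a size of
70 MiB and roughly 100 MiB allocated. Running os.stat() again after a
umount/mount cycle checks the second observation.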

I also tested concurrent fallocate calls and writes on the same file
descriptor. That seems to work. Whether it is fast enough I cannot say yet.
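
A rough sketch of that pattern, assuming 64-bit Linux: one helper thread
preallocates the file chunk by chunk while the main thread writes to the same
file descriptor. The function names and the chunk size are hypothetical, and
the sketch says nothing about whether this is fast enough.

```python
import ctypes
import os
import threading

FALLOC_FL_KEEP_SIZE = 0x01  # value from <linux/falloc.h>


def fallocate_keep_size(fd, offset, length):
    """Call fallocate(2) with FALLOC_FL_KEEP_SIZE (assumes 64-bit off_t)."""
    libc = ctypes.CDLL(None, use_errno=True)
    libc.fallocate.argtypes = [ctypes.c_int, ctypes.c_int,
                               ctypes.c_int64, ctypes.c_int64]
    libc.fallocate.restype = ctypes.c_int
    if libc.fallocate(fd, FALLOC_FL_KEEP_SIZE, offset, length) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))


def write_while_preallocating(path, data, total_prealloc, chunk=1 << 20):
    """Write `data` while a helper thread preallocates `total_prealloc` bytes."""
    fd = os.open(path, os.O_CREAT | os.O_WRONLY | os.O_TRUNC, 0o600)
    errors = []

    def preallocate():
        # ctypes releases the GIL during the call, so this really runs
        # concurrently with the writes below.
        try:
            for off in range(0, total_prealloc, chunk):
                fallocate_keep_size(fd, off, min(chunk, total_prealloc - off))
        except (OSError, AttributeError) as exc:
            errors.append(exc)

    helper = threading.Thread(target=preallocate)
    helper.start()
    try:
        view = memoryview(data)
        pos = 0
        while pos < len(view):
            # pwrite may write less than requested, so advance by its result.
            pos += os.pwrite(fd, view[pos:pos + chunk], pos)
    finally:
        helper.join()
        os.close(fd)
    if errors:
        raise errors[0]
    return os.stat(path)
```

Timing the write loop with and without the helper thread would be one way to
measure the extra latency.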

Regards,
Frank

----- Original Message -----
From:    frankcmoeller@...or.de
To:      linux-ext4@...r.kernel.org
Date:    19.05.2013 12:01
Subject: Re: Aw: Re: Ext4: Slow performance on first write after mount

> Hi Andreas,
> 
> > Part of the problem is that filesystems are rarely unmounted cleanly, so
> > it means that this information would need to be updated periodically to
> > disk so that it is available after a crash.
> > I wouldn't object to some kind of "lazy" updating of group information on
> > disk that at least gives the newly-mounted filesystem a rough idea of
> > what each group's usage is. It wouldn't have to be totally accurate (it
> > wouldn't replace the bitmaps), but maybe 2 bits per group would be enough
> > as a starting point?
> > For a 32 TB filesystem that would be about 16 4kB blocks of bits that
> > would be updated periodically (e.g. every five minutes or so). Since the
> > allocator will typically work in successive groups that might not cause
> > too much churn.
> 
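
The "16 4kB blocks" figure can be checked with a short calculation. The
sketch below assumes the default ext4 geometry of 8 * block_size blocks per
group (32768 blocks, or 128 MiB, with 4 kB blocks) and treats 32 TB as
32 TiB. The function name is made up for illustration.

```python
TIB = 1 << 40


def hint_blocks(fs_bytes, block_size=4096, bits_per_group=2):
    """Return (number of groups, blocks needed for the per-group hint bits)."""
    # Assumption: one block bitmap block covers 8 * block_size blocks.
    blocks_per_group = 8 * block_size
    group_bytes = blocks_per_group * block_size
    groups = -(-fs_bytes // group_bytes)            # ceiling division
    hint_bytes = -(-groups * bits_per_group // 8)   # bits -> bytes, rounded up
    return groups, -(-hint_bytes // block_size)     # bytes -> blocks, rounded up


print(hint_blocks(32 * TIB))  # (262144, 16)
```

So 262144 groups at 2 bits each come to 64 KiB, which is 16 blocks of 4 kB.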
> Yes, you're right. The stored data wouldn't be 100% reliable. And yes, it
> would be really good if the filesystem knew something more right after
> mount so that it could find a good group more quickly.
> What do you think of this:
> 1. I have already read this in some discussions: you already store the
>   amount of free space for every group. Why not also store the size of the
>   largest contiguous free region in each group? Then you don't have to
>   read the whole group.
> 2. What about a list (in memory and also stored on disk) of all unused
>   groups (1 bit for every group)? If the allocator cannot find a good
>   group within, let's say, half a second, a group from this list is used.
>   The list would also not be 100% reliable (because of the unclean
>   unmounts mentioned above), so you need to search the list for a good
>   group. If no good group is found in the list, the allocator can continue
>   searching. This doesn't help in all situations (e.g. an almost full
>   disk, or every group containing a small amount of data), but in many
>   cases it should be much faster, as long as the list is not totally
>   outdated.
> 
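
The two ideas above can be sketched as a toy model. This is not ext4 code:
the Group class, pick_group() and the scan budget (which stands in for the
half-second timeout so the example stays deterministic) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Group:
    free_blocks: int     # per-group free-block count
    largest_free: int    # idea 1: largest contiguous free region, in blocks


def pick_group(groups, want, unused_hint, scan_budget):
    """Return the index of a group that can hold `want` contiguous blocks.

    unused_hint[i] is True if group i was believed unused when the hint was
    last written to disk. It may be stale, so every candidate is re-checked.
    """
    # Phase 1: the normal search, cut off after scan_budget groups.
    for idx in range(min(scan_budget, len(groups))):
        if groups[idx].largest_free >= want:
            return idx
    # Phase 2 (idea 2): fall back to the hint list, verifying each entry.
    for idx, maybe_unused in enumerate(unused_hint):
        if maybe_unused and groups[idx].largest_free >= want:
            return idx
    # Phase 3: nothing usable in the hint list, continue the normal search.
    for idx in range(scan_budget, len(groups)):
        if groups[idx].largest_free >= want:
            return idx
    return None


groups = [Group(100, 10), Group(200, 20), Group(32768, 32768), Group(5000, 4000)]
stale_hint = [False, True, True, False]   # the bit for group 1 is outdated
print(pick_group(groups, 1000, stale_hint, scan_budget=2))  # 2
```

In this run the budgeted scan of groups 0 and 1 fails, the stale hint for
group 1 is rejected on verification, and group 2 is taken from the hint list.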
> > It would be possible to fallocate() at some expected size (e.g. average
> > file size) and then either truncate off the unused space, or fallocate()
> > some more in another thread when you are close to running out.
> > If the fallocate() is done in a separate thread the latency can be hidden
> > from the main application?
> Adding a new thread for fallocate shouldn't be a big problem. But fallocate
> might generate high disk usage (while searching for a good group). I don't
> know whether parallel writing from the other thread would be quick enough.
> 
> One question regarding fallocate: I create a new file and do a 100 MB
> fallocate with FALLOC_FL_KEEP_SIZE. Then I write only 70 MB to that file
> and close it. Is the 30 MB unused preallocated space still preallocated
> for that file after closing it? Or does a close release the preallocated
> space?
> 
> Regards,
> Frank
> 
> > 
> > Cheers, Andreas 
> > 
> > > You also have to take care of alignment, and there are several threads
> > > on the internet that explain why you shouldn't use it (or only in very
> > > special situations, and I don't think mine is one of them). ext4 group
> > > initialization also takes place when using O_DIRECT (as said before,
> > > perhaps I did something wrong).
> > > 
> > > Regards,
> > > Frank
> > > 
> > > > ----- Original Message -----
> > > > From:    "Sidorov, Andrei" <Andrei.Sidorov@...isi.com>
> > > > To:      "frankcmoeller@...or.de" <frankcmoeller@...or.de>,
> > > >          ext4 development <linux-ext4@...r.kernel.org>
> > > > Date:    17.05.2013 23:18
> > > > Subject: Re: Ext4: Slow performance on first write after mount
> > > 
> > >> Hi Frank,
> > >> 
> > >> Consider using bigalloc feature (requires reformat), preallocate space
> > >> with fallocate and use O_DIRECT for reads/writes. However, 188k writes
> > >> are too small for good throughput with O_DIRECT. You might also want to
> > >> adjust max_sectors_kb to something larger than 512k.
> > >> 
> > >> We're doing 6in+6out 20Mbps streams just fine.
> > >> 
> > >> Regards,
> > >> Andrei.
> > >> 
> > > --
> > > To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> > > the body of a message to majordomo@...r.kernel.org
> > > More majordomo info at  http://vger.kernel.org/majordomo-info.html
