Message-ID: <20191120181353.GG4262@mit.edu>
Date:   Wed, 20 Nov 2019 13:13:53 -0500
From:   "Theodore Y. Ts'o" <tytso@....edu>
To:     Alex Zhuravlev <azhuravlev@...mcloud.com>
Cc:     "linux-ext4@...r.kernel.org" <linux-ext4@...r.kernel.org>
Subject: Re: [RFC] improve malloc for large filesystems

Hi Alex,

A couple of comments.  First, please split this patch so that these
two pieces of functionality can be reviewed and tested separately:

> 1) mballoc tries too hard to find the best chunk which is
>  counterproductive - it makes sense to limit this process

> 2) during scanning the bitmaps are loaded one by one, synchronously
>  - it makes sense to prefetch few groups at once

As far as the prefetch is concerned, please note that the bitmap is first
read into the buffer cache via read_block_bitmap_nowait(), but then it
needs to be copied into buddy bitmap pages where it is cached
alongside the buddy bitmap.  (The copy in the buddy bitmap is a combination
of the on-disk block allocation bitmap plus any outstanding
preallocations.)  From that copy of block bitmap, we then generate the
buddy bitmap and as a side effect, initialize the statistics
(grp->bb_first_free, grp->bb_largest_free_order, grp->bb_counters[]).
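To make the derivation above concrete, here is a toy sketch of how such
statistics can fall out of a single pass over a block bitmap.  It is not
the mballoc code: the toy_* names, the 64-block group size, and the
struct layout are all made up for illustration, and preallocations are
ignored.  A set bit means "block in use".

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_MAX_ORDER 6                 /* toy group: at most 2^6 = 64 blocks */

/* Hypothetical stand-in for the statistics kept per block group. */
struct toy_group_info {
    int first_free;                     /* cf. grp->bb_first_free; -1 if none */
    int largest_free_order;             /* cf. grp->bb_largest_free_order; -1 if none */
    int free_blocks;                    /* total number of free blocks */
    int counters[TOY_MAX_ORDER + 1];    /* cf. grp->bb_counters[]: free chunks per order */
};

/* Return 1 if block `bit` is in use, 0 if it is free. */
int toy_test_bit(const uint8_t *bitmap, int bit)
{
    return (bitmap[bit / 8] >> (bit % 8)) & 1;
}

/* Split the free run [start, start + len) into naturally aligned
 * power-of-two chunks and count one chunk per order. */
void toy_mark_free_run(struct toy_group_info *grp, int start, int len)
{
    while (len > 0) {
        int order = 0;

        /* Grow the chunk while it stays aligned and still fits in the run. */
        while (order < TOY_MAX_ORDER &&
               (start & ((1 << (order + 1)) - 1)) == 0 &&
               (1 << (order + 1)) <= len)
            order++;

        grp->counters[order]++;
        start += 1 << order;
        len -= 1 << order;
    }
}

/* Scan the bitmap once and fill in the derived statistics. */
void toy_generate_stats(const uint8_t *bitmap, int nblocks,
                        struct toy_group_info *grp)
{
    int i = 0;

    assert(nblocks > 0 && nblocks <= (1 << TOY_MAX_ORDER));
    memset(grp, 0, sizeof(*grp));
    grp->first_free = -1;
    grp->largest_free_order = -1;

    while (i < nblocks) {
        int start;

        if (toy_test_bit(bitmap, i)) {  /* block in use, skip it */
            i++;
            continue;
        }
        start = i;                      /* found the start of a free run */
        while (i < nblocks && !toy_test_bit(bitmap, i))
            i++;
        if (grp->first_free < 0)
            grp->first_free = start;
        grp->free_blocks += i - start;
        toy_mark_free_run(grp, start, i - start);
    }

    /* The largest order that has at least one free chunk. */
    for (int order = TOY_MAX_ORDER; order >= 0; order--) {
        if (grp->counters[order] > 0) {
            grp->largest_free_order = order;
            break;
        }
    }
}
```

The point of the sketch is only that the statistics are a by-product of
walking the bitmap, so having the raw bitmap in memory is not the same
as having the statistics available.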

It is these statistics that we need in order to make allocation
decisions for a particular block group.  So perhaps we should drive
the readahead of the bitmaps from ext4_mb_init_group() /
ext4_mb_init_cache(), and make sure that we actually initialize the
ext4_group_info structure, and not just read the bitmap into buffer
cache and hope it gets used before memory pressure pushes it out of
the buddy cache.
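As a rough illustration of the difference, the toy model below has the
prefetch step initialize the per-group information instead of only
caching the raw bitmap.  Everything here is hypothetical (the toy_*
names, the two flags, the wrap-around policy); it assumes, for the sake
of the model, that cached bitmaps can be evicted under memory pressure
while the initialized group information stays.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NGROUPS 8

/* Hypothetical per-group state for the toy model. */
struct toy_group {
    bool bitmap_cached;     /* raw bitmap currently sits in the (toy) cache */
    bool info_valid;        /* derived statistics have been initialized */
};

/* Stand-in for "read the bitmap and build the derived statistics". */
void toy_init_group(struct toy_group *g)
{
    g->bitmap_cached = true;
    g->info_valid = true;
}

/* Prefetch up to `count` groups starting at `start`, wrapping around.
 * Returns how many groups were newly initialized. */
int toy_prefetch_groups(struct toy_group *groups, int ngroups,
                        int start, int count)
{
    int done = 0;

    assert(ngroups > 0 && start >= 0 && start < ngroups);
    if (count > ngroups)
        count = ngroups;

    for (int i = 0; i < count; i++) {
        struct toy_group *g = &groups[(start + i) % ngroups];

        if (g->info_valid)          /* statistics already there, nothing to do */
            continue;
        toy_init_group(g);
        done++;
    }
    return done;
}

/* Model memory pressure: cached bitmaps go away, statistics remain. */
void toy_memory_pressure(struct toy_group *groups, int ngroups)
{
    for (int i = 0; i < ngroups; i++)
        groups[i].bitmap_cached = false;
}
```

In this model a group that was prefetched still answers "is it worth
looking here?" after toy_memory_pressure() has run, which is the
property a bitmap-only readahead would not give us.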

Andreas has suggested going even farther, and perhaps storing this
derived information from the allocation bitmaps someplace convenient
on disk.  This is an on-disk format change, so we would want to think
very carefully before going down that path.  Especially since if we're
going to go this far, perhaps we should consider using an on-disk
b-tree to store the allocation information, which could be more
efficient than using allocation bitmaps plus buddy bitmaps.

Cheers,

							- Ted
