Date:   Wed, 20 May 2020 13:34:03 -0600
From:   Andreas Dilger <adilger@...ger.ca>
To:     Alex Zhuravlev <azhuravlev@...mcloud.com>
Cc:     Ritesh Harjani <riteshh@...ux.ibm.com>,
        "linux-ext4@...r.kernel.org" <linux-ext4@...r.kernel.org>
Subject: Re: [PATCH 2/2] ext4: skip non-loaded groups at cr=0/1

On May 20, 2020, at 2:40 AM, Alex Zhuravlev <azhuravlev@...mcloud.com> wrote:
> 
> 
> 
>> On 17 May 2020, at 10:55, Andreas Dilger <adilger@...ger.ca> wrote:
>> 
>> The question is whether this situation is affecting only a few inode
>> allocations for a short time after mount, or does this persist for a long
>> time?  I think that it _should_ be only a short time, because these other
>> threads should all start prefetch on their preferred groups, so even if a
>> few inodes have their blocks allocated in the "wrong" group, it shouldn't
>> be a long-term problem since the prefetched bitmaps will finish loading
>> and allow the blocks to be allocated, or skipped if the group is fragmented.
> 
> Yes, that’s the idea - there is a short window when buddy data is being
> populated. And for each “cluster” (not just a single group) prefetching
> will be initiated by allocation.
> It’s possible that some number of inodes will get “bad” blocks right
> after mount.
> If you think this is a bad scenario I can introduce a couple more things:
> 1) the prefetching thread discussed a few times already
> 2) let mballoc wait for the goal group to get ready - this is essentially
>    one more check in ext4_mb_good_group()

IMHO, this is an acceptable "cache warmup" behavior, not really different
than mballoc doing limited scanning when looking for any other allocation.
Since we already separate inode table blocks and data blocks into separate
groups due to flex_bg, I don't think any group is "better" than another,
so long as the allocations are avoiding worst-case fragmentation (i.e. a
series of one-block allocations).

Cheers, Andreas