linux-kernel - Re: [RFC PATCH 0/2] sbitmap: NUMA node spreading

Message-ID: <40fd1cc9-15b9-719c-8b8d-118cb156729f@huawei.com>
Date:   Wed, 11 May 2022 10:57:45 +0100
From:   John Garry <john.garry@...wei.com>
To:     Ming Lei <ming.lei@...hat.com>
CC:     Jens Axboe <axboe@...nel.dk>, <linux-block@...r.kernel.org>,
        <linux-kernel@...r.kernel.org>, <linux-scsi@...r.kernel.org>
Subject: Re: [RFC PATCH 0/2] sbitmap: NUMA node spreading

On 11/05/2022 03:07, Ming Lei wrote:

Hi Ming,

>>> Spreading the memory out does probably make sense, but we need to retain
>>> the fast normal case. Making sbitmap support both, selected at init
>>> time, would be far more likely to be acceptable imho.
>> I wanted to keep the code changes minimal for an initial RFC to test the
>> water.
>>
>> My original approach did not introduce the extra load for normal path and
>> had some init time selection for a normal word map vs numa word map, but the
>> code grew and became somewhat unmanageable. I'll revisit it to see how to
>> improve that.
> I understand this approach just splits shared sbitmap into per-numa-node
> part, but what if all IOs are just from CPUs in one same numa node? Doesn't
> this way cause tag starvation and waste?
> 

We would not allow that. If we can't find a free bit in the local node
then we need to check the other nodes before giving up. This is some of
the added complexity which I hinted at. Things like batch get or
round-robin (RR) support also become more complex.
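
To illustrate the fallback described above, here is a minimal userspace sketch, not the actual sbitmap code. The names (struct node_map, node_get(), map_get()) and the sizes are hypothetical, and it ignores atomics, wake-ups, batching and RR entirely. It only shows the "try the local node first, then the others, then give up" ordering:

```c
#include <assert.h>
#include <stdint.h>

#define NR_NODES      4   /* assumed number of NUMA nodes */
#define BITS_PER_NODE 8   /* assumed tags per node, must be <= 64 */

/* Hypothetical layout: one word of tag bits per NUMA node. */
struct node_map {
	uint64_t word[NR_NODES];
};

/* Claim the first free bit in one node's word; return its index or -1. */
static int node_get(struct node_map *m, int node)
{
	for (int bit = 0; bit < BITS_PER_NODE; bit++) {
		uint64_t mask = UINT64_C(1) << bit;

		if (!(m->word[node] & mask)) {
			m->word[node] |= mask;
			return bit;
		}
	}
	return -1;
}

/*
 * Start at the caller's node and fall back to the remaining nodes
 * before giving up. Returns a global tag number or -1 if every node
 * is full.
 */
static int map_get(struct node_map *m, int local_node)
{
	assert(local_node >= 0 && local_node < NR_NODES);

	for (int i = 0; i < NR_NODES; i++) {
		int node = (local_node + i) % NR_NODES;
		int bit = node_get(m, node);

		if (bit >= 0)
			return node * BITS_PER_NODE + bit;
	}
	return -1;
}
```

With this ordering, a workload issuing everything from one node can still consume all NR_NODES * BITS_PER_NODE tags, at the cost of extra scanning once the local word is full.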

Alternatively we could have the double pointer for numa spreading only, 
which would make things simpler. I need to check which is overall 
better. Adding the complexity for dealing with numa node sub-arrays may 
affect performance also.

Thanks,
John