linux-kernel - Re: [PATCH blktests v3 09/12] common/fio: Limit number of random jobs

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <ec72542d-a554-59ac-8f67-26eeb7f2a5b0@nvidia.com>
Date:   Thu, 4 May 2023 05:16:34 +0000
From:   Chaitanya Kulkarni <chaitanyak@...dia.com>
To:     Daniel Wagner <dwagner@...e.de>
CC:     "linux-nvme@...ts.infradead.org" <linux-nvme@...ts.infradead.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "linux-block@...r.kernel.org" <linux-block@...r.kernel.org>,
        Shin'ichiro Kawasaki <shinichiro@...tmail.com>,
        Hannes Reinecke <hare@...e.de>
Subject: Re: [PATCH blktests v3 09/12] common/fio: Limit number of random jobs

On 5/3/23 04:01, Daniel Wagner wrote:
> On Wed, May 03, 2023 at 09:41:37AM +0000, Chaitanya Kulkarni wrote:
>> On 5/3/23 01:02, Daniel Wagner wrote:
>>> Limit the number of random threads to 32 for big machines. This still
>>> gives enough randomness but limits the resource usage.
>>>
>>> Signed-off-by: Daniel Wagner <dwagner@...e.de>
>>> ---
>> I don't think we should change this, the point of all the tests is
>> to not limit the resources but use threads at least equal to
>> $(nproc), see recent patches from lenovo they have 448 cores,
>> limiting 32 is < 10% CPUs and that is really small number for
>> a large machine if we decide to run tests on that machine ...
> I just wonder how handle the limits for the job size. Hannes asked to limit it
> to 32 CPUs so that the job size doesn't get small, e.g. nvme_img_size=16M job
> size per job with 448 CPUs is roughly 36kB. Is this good, bad or does it even
> make sense? I don't know.

16M is very small number ..

from my experience with smaller I/O sizes we don't see the lokdeps
that we see with the large I/O sizes hence it is a bad idea to use small
I/O sizes and limiting the jobs to hard coded 32 number ...

> My question is what should the policy be? Should we reject configuration which
> try to run too small jobs sizes? Reject anything below 1M for example? Or is
> there a metric which we could as base for a limit calculation (disk geometry)?

the basic requirement here is we need to run the I/O from every processor,
so let's keep --numjobs=($nproc) constant now and let the user set job 
size..
in this particular case for NVMe we set the size 1G and that is 
sufficient since
numbjobs are set to nproc and with this series user can set the size 
based on
a particular arch ...

See [1] if you are interested in how to quantify small or large job size.

For this series to merge let's keep is simple and not worry about erroring
out on a particular job size but just keeping the nproc as it is ...

-ck

Ideally in past what I've done is  :-
1. Accept the % of the CPU cores that we want to keep it busy.
2. Accept the % of the disk space we want to exercise test.
3. Use the combination of the #1 and #2 to spread out the
    job size across the number of jobs.

with above design one doesn't have to assume what is small or what it large
job size and system gets tested according to user's expectations such as
50% CPUs are busy on 80% disk size or 100% CPUs are busy with 50% of
disk size.