Message-ID: <e01e469f-488a-8f2d-008f-8427289d2ff3@kernel.dk>
Date:   Mon, 16 Jan 2023 06:55:20 -0700
From:   Jens Axboe <axboe@...nel.dk>
To:     Breno Leitao <leitao@...ian.org>,
        Gabriel Krisman Bertazi <krisman@...e.de>
Cc:     asml.silence@...il.com, dylany@...a.com, io-uring@...r.kernel.org,
        leit@...com, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 2/2] io_uring: Split io_issue_def struct

On 1/16/23 3:52 AM, Breno Leitao wrote:
> On Thu, Jan 12, 2023 at 05:35:22PM -0300, Gabriel Krisman Bertazi wrote:
>> Breno Leitao <leitao@...ian.org> writes:
>>
>>> This patch removes some "cold" fields from `struct io_issue_def`.
>>>
>>> The plan is to keep only the heavily used fields in `struct io_issue_def`,
>>> so that it is more likely to stay hot in the cache. The hot fields are
>>> basically all the bitfields and the callback functions for .issue and .prep.
>>>
>>> The other less frequently used fields are now located in a secondary and
>>> cold struct, called `io_cold_def`.
>>>
>>> This is the size for the structs:
>>>
>>> Before: io_issue_def = 56 bytes
>>> After: io_issue_def = 24 bytes; io_cold_def = 40 bytes
>>
>> Does this change have an observable impact in run time? Did it show
>> a significant decrease of dcache misses?
> 
> I haven't tested it. I expect it might be hard to come up with such a test.
> 
> A possible test might be running io_uring-heavy tests while adding
> enough memory pressure. Doing this as two separate runs (an A/B test)
> might be unpredictable, and the measurement error might hide the benefit.

I think what you'd want is two (or more) io_uring ops being really
busy and measuring dcache pressure while running that test. I don't
think this is very feasible to accurately measure, and I also don't
think that is an issue. The split into hot/cold parts of the op
definitions is obviously a good idea. For ideal setups, we'll never
be using the cold part at all, and having a smaller op definition
for the fast path is always going to be helpful.

-- 
Jens Axboe
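
The hot/cold split discussed in this thread can be illustrated with a small
userspace sketch. This is not the kernel code: the struct names (`hot_def`,
`cold_def`), the fields, the opcodes, and the callbacks below are hypothetical
stand-ins chosen for illustration, and the sizes will not necessarily match
the 24/40 bytes quoted above. It only shows the general pattern: two parallel
tables indexed by opcode, where the fast path reads the small table and the
larger one is touched only on error or for bookkeeping.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical request type, for illustration only. */
struct req {
    int opcode;
    int fd;
    int result;
};

/* Hot part: read on every submission, so it is kept small. */
struct hot_def {
    unsigned needs_file : 1;
    unsigned pollin : 1;
    int (*prep)(struct req *);
    int (*issue)(struct req *);
};

/* Cold part: only needed for error handling, naming, and similar. */
struct cold_def {
    const char *name;
    size_t async_size;
    void (*cleanup)(struct req *);
    void (*fail)(struct req *);
};

enum { OP_NOP, OP_READ, OP_LAST };

static int generic_prep(struct req *r) { r->result = 0; return 0; }
static int nop_issue(struct req *r) { (void)r; return 0; }

static int read_issue(struct req *r)
{
    if (r->fd < 0)
        return -9;      /* stand-in error code for a bad descriptor */
    r->result = 42;     /* pretend 42 bytes were read */
    return 0;
}

static void read_fail(struct req *r) { r->result = -1; }

/* Two parallel tables, both indexed by opcode. */
static const struct hot_def hot_defs[OP_LAST] = {
    [OP_NOP]  = { .prep = generic_prep, .issue = nop_issue },
    [OP_READ] = { .needs_file = 1, .pollin = 1,
                  .prep = generic_prep, .issue = read_issue },
};

static const struct cold_def cold_defs[OP_LAST] = {
    [OP_NOP]  = { .name = "NOP" },
    [OP_READ] = { .name = "READ", .async_size = 64, .fail = read_fail },
};

/* Fast path touches only hot_defs; cold_defs is read only on failure. */
static int submit(struct req *r)
{
    assert(r->opcode >= 0 && r->opcode < OP_LAST);
    const struct hot_def *def = &hot_defs[r->opcode];
    int ret = def->prep(r);

    if (!ret)
        ret = def->issue(r);
    if (ret) {
        const struct cold_def *cold = &cold_defs[r->opcode];
        if (cold->fail)
            cold->fail(r);
    }
    return ret;
}

/* Naming is a cold-path query, so it reads the cold table. */
static const char *op_name(int opcode)
{
    assert(opcode >= 0 && opcode < OP_LAST);
    return cold_defs[opcode].name;
}
```

In this sketch a successful `submit()` never dereferences `cold_defs`, so
consecutive entries of the smaller `hot_defs` table pack more densely into
cache lines than a single combined table would. Whether that is measurable in
practice is exactly the open question raised in the thread.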

