Message-ID: <2cab01ce-7c5f-46d6-b8a4-c2a24c3f9a32@suse.cz>
Date: Wed, 3 Apr 2024 09:25:33 +0200
From: Vlastimil Babka <vbabka@...e.cz>
To: "Song, Xiongwei" <Xiongwei.Song@...driver.com>,
"rientjes@...gle.com" <rientjes@...gle.com>, "cl@...ux.com" <cl@...ux.com>,
"penberg@...nel.org" <penberg@...nel.org>,
"iamjoonsoo.kim@....com" <iamjoonsoo.kim@....com>,
"akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
"roman.gushchin@...ux.dev" <roman.gushchin@...ux.dev>,
"42.hyeyoo@...il.com" <42.hyeyoo@...il.com>
Cc: "linux-mm@...ck.org" <linux-mm@...ck.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"chengming.zhou@...ux.dev" <chengming.zhou@...ux.dev>
Subject: Re: [PATCH 3/4] mm/slub: simplify get_partial_node()
On 4/3/24 2:37 AM, Song, Xiongwei wrote:
>>
>>
>> It could be tempting to use >= instead of > to achieve the same effect but
>> that would have unintended performance effects that would best be evaluated
>> separately.
>
> I can run a test to measure Amean changes. But in terms of x86 assembly, there
> should not be extra instructions with ">=".
>
> Did a simple test, for ">=" it uses "jle" instruction, while "jl" instruction is used for ">".
> No more instructions involved. So there should not be performance effects on x86.

Right, I didn't mean the instructions generated for the comparison itself,
but how the different comparison changes how many slabs would be put on the
cpu partial list here.
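
To illustrate, here is a rough userspace sketch, not kernel code. It models
only the counting in the quoted hunk: one slab is put on the cpu partial list
per iteration, then the counter is checked against half of the limit. The
names slabs_taken(), cpu_partial_slabs, available and use_ge are made up for
this example; cpu_partial_slabs stands in for whatever
slub_get_cpu_partial(s) returns.

```c
/*
 * Userspace model of the loop in the quoted hunk (assumption: every
 * iteration corresponds to one put_cpu_partial() call).
 *
 * cpu_partial_slabs: hypothetical stand-in for slub_get_cpu_partial(s)
 * available:         slabs that could be taken from the node partial list
 * use_ge:            non-zero to test with ">=", zero to test with ">"
 *
 * Returns how many slabs end up on the cpu partial list.
 */
static unsigned int slabs_taken(unsigned int cpu_partial_slabs,
				unsigned int available, int use_ge)
{
	unsigned int partial_slabs = 0;
	unsigned int i;

	for (i = 0; i < available; i++) {
		/* models put_cpu_partial(s, slab, 0) */
		partial_slabs++;

		if (use_ge) {
			if (partial_slabs >= cpu_partial_slabs / 2)
				break;
		} else {
			if (partial_slabs > cpu_partial_slabs / 2)
				break;
		}
	}

	return partial_slabs;
}
```

With a limit of 6 and plenty of slabs available, ">" stops after 4 slabs
while ">=" stops after 3. The generated code is equally cheap in both cases,
but the cpu partial list ends up with a different number of slabs, and that
is the effect that would need separate evaluation.
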
> Thanks,
> Xiongwei
>
>>
>> >
>> > + put_cpu_partial(s, slab, 0);
>> > + stat(s, CPU_PARTIAL_NODE);
>> > + partial_slabs++;
>> > +
>> > + if (partial_slabs > slub_get_cpu_partial(s) / 2)
>> > + break;
>> > }
>> > spin_unlock_irqrestore(&n->list_lock, flags);
>> > return partial;
>