Message-ID: <YdvjuZATR4727gaT@linutronix.de>
Date: Mon, 10 Jan 2022 08:43:53 +0100
From: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
To: Waiman Long <longman@...hat.com>
Cc: linux-kernel@...r.kernel.org,
Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...hat.com>, Will Deacon <will@...nel.org>,
Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [PATCH] locking/local_lock: Make the empty local_lock_*() function a macro.

On 2022-01-05 22:34:31 [-0500], Waiman Long wrote:
>
> I try out this patch and it indeed helps to reduce the object size of
> functions that use local_lock(). However, the extra code isn't an additional
> mov+add.
>
> Using folio_add_lru() as an example,
>
> Without the patch:
>
> 466 local_lock(&lru_pvecs.lock);
> 0x00000000000032ee <+14>: mov $0x1,%edi
> 0x00000000000032f3 <+19>: callq 0x32f8 <folio_add_lru+24>
> 0x00000000000032f8 <+24>: callq 0x32fd <folio_add_lru+29>
The calls here might be due to some debugging switches or compiler
optimisation. With no debug options and gcc-11 I get:
| # mm/swap.c:466: local_lock(&lru_pvecs.lock);
| movq $lru_pvecs, %rbx #, tmp135
| movq %rbx, %rax # tmp135, tcp_ptr__
| #APP
| # 466 "mm/swap.c" 1
| add %gs:this_cpu_off(%rip), %rax # this_cpu_off, tcp_ptr__
So it is a mov of the per-CPU variable's address followed by an add of
the per-CPU offset.
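
For illustration only, the arithmetic behind those two instructions can be
modelled in plain user-space C. This is a rough sketch, not kernel code:
cpu_off_model, NR_CPUS_MODEL and this_cpu_addr_model() are made-up names,
the offsets are arbitrary, and the real code reads this_cpu_off through
%gs instead of indexing a table.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-CPU offsets (values are arbitrary). */
#define NR_CPUS_MODEL 4
static const uintptr_t cpu_off_model[NR_CPUS_MODEL] = {
	0x0000, 0x1000, 0x2000, 0x3000
};

/*
 * Model of the mov+add pair: start from the address of the per-CPU
 * variable and add the offset that belongs to the given CPU.
 */
static uintptr_t this_cpu_addr_model(uintptr_t var_base, int cpu)
{
	uintptr_t addr;

	assert(cpu >= 0 && cpu < NR_CPUS_MODEL);

	addr = var_base;		/* models: movq $lru_pvecs, %rax */
	addr += cpu_off_model[cpu];	/* models: add %gs:this_cpu_off(%rip), %rax */
	return addr;
}
```

With the !LOCKDEP lock being empty, the result of this computation is
never used, which is why the extra mov+add is pure overhead.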
Sebastian