[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <85bd3208-d65e-ce55-d488-9115f1f57858@intel.com>
Date: Fri, 12 Aug 2016 10:16:18 +0800
From: Aaron Lu <aaron.lu@...el.com>
To: "Huang, Ying" <ying.huang@...el.com>
Cc: x86@...nel.org, linux-kernel@...r.kernel.org,
Alex Shi <alex.shi@...aro.org>,
Tomoki Sekiyama <tomoki.sekiyama.qu@...achi.com>,
Davidlohr Bueso <dave@...olabs.net>,
Ingo Molnar <mingo@...hat.com>,
Thomas Gleixner <tglx@...utronix.de>,
"H. Peter Anvin" <hpa@...or.com>
Subject: Re: [PATCH] x86/irq: do not substract irq_tlb_count from
irq_call_count
On 08/11/2016 11:13 PM, Huang, Ying wrote:
> Aaron Lu <aaron.lu@...el.com> writes:
>
>> Since commit 52aec3308db8 ("x86/tlb: replace INVALIDATE_TLB_VECTOR by
>> CALL_FUNCTION_VECTOR"), the tlb remote shootdown is done through call
>> function vector. That commit didn't take care of irq_tlb_count so later
>> commit fd0f5869724f ("x86: Distinguish TLB shootdown interrupts from
>> other functions call interrupts") tried to fix it.
>>
>> The fix assumes every increase of irq_tlb_count has a corresponding
>> increase of irq_call_count. So the irq_call_count is always bigger than
>> irq_tlb_count and we could substract irq_tlb_count from irq_call_count.
>>
>> Unfortunately this is not true for the smp_call_function_single case.
>> The IPI is only sent if the target CPU's call_single_queue is empty when
>> adding a csd into it in generic_exec_single. That means if two threads
>> are both adding flush tlb csds to the same CPU's call_single_queue, only
>> one IPI is sent. In other words, the irq_call_count is incremented by 1
>> but irq_tlb_count is incremented by 2. Over time, irq_tlb_count will be
>> bigger than irq_call_count and the substract will produce a very large
>> irq_call_count value due to overflow.
>>
>> Considering that:
>> 1 it's not worth to send more IPIs for the sake of accurate counting of
>> irq_call_count in generic_exec_single;
>> 2 it's not easy to tell if the call function interrupt is for TLB
>> shootdown in __smp_call_function_single_interrupt.
>> Not to exclude TLB shootdown from call function count seems to be the
>> simplest fix and this patch just did that.
>>
>> This is found by LKP's cyclic performance regression tracking recently
>> with the vm-scalability test suite. I have bisected to commit
>> 0a7ce4b5a632 ("mm/rmap: share the i_mmap_rwsem"). This commit didn't do
>> anything wrong but revealed the irq_call_count problem. IIUC, the commit
>> makes rwc->remap_one in rmap_walk_file concurrent with multiple threads.
>> When remap_one is try_to_unmap_one, then multiple threads could queue
>> flush tlb to the same CPU but only one IPI will be sent.
>>
>> Since the commit enter Linux v3.19, the counting problem only shows up
>> from v3.19. Considering this is a behaviour change, I'm not sure if I
>> should add the stable tag here.
>>
>> Signed-off-by: Aaron Lu <aaron.lu@...el.com>
>
> Thanks for fix. You forget to add :)
>
> Reported-by: "Huang, Ying" <ying.huang@...el.com>
Oh right, sorry about that.
Regards,
Aaron
Powered by blists - more mailing lists