Date:	Fri, 12 Aug 2016 10:16:18 +0800
From:	Aaron Lu <aaron.lu@...el.com>
To:	"Huang, Ying" <ying.huang@...el.com>
Cc:	x86@...nel.org, linux-kernel@...r.kernel.org,
	Alex Shi <alex.shi@...aro.org>,
	Tomoki Sekiyama <tomoki.sekiyama.qu@...achi.com>,
	Davidlohr Bueso <dave@...olabs.net>,
	Ingo Molnar <mingo@...hat.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	"H. Peter Anvin" <hpa@...or.com>
Subject: Re: [PATCH] x86/irq: do not subtract irq_tlb_count from
 irq_call_count

On 08/11/2016 11:13 PM, Huang, Ying wrote:
> Aaron Lu <aaron.lu@...el.com> writes:
> 
>> Since commit 52aec3308db8 ("x86/tlb: replace INVALIDATE_TLB_VECTOR by
>> CALL_FUNCTION_VECTOR"), remote TLB shootdown has been done through the
>> call function vector. That commit didn't take care of irq_tlb_count,
>> so the later commit fd0f5869724f ("x86: Distinguish TLB shootdown
>> interrupts from other functions call interrupts") tried to fix it.
>>
>> The fix assumes that every increase of irq_tlb_count has a
>> corresponding increase of irq_call_count, so irq_call_count is always
>> bigger than irq_tlb_count and irq_tlb_count can be subtracted from
>> irq_call_count.
>>
>> Unfortunately, this is not true for the smp_call_function_single case.
>> The IPI is only sent if the target CPU's call_single_queue is empty
>> when a csd is added to it in generic_exec_single. That means that if
>> two threads both add flush-TLB csds to the same CPU's
>> call_single_queue, only one IPI is sent. In other words, irq_call_count
>> is incremented by 1 while irq_tlb_count is incremented by 2. Over time,
>> irq_tlb_count grows bigger than irq_call_count and the unsigned
>> subtraction wraps around, producing a very large irq_call_count value.
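>>
>> For illustration, the coalescing lives in generic_exec_single() in
>> kernel/smp.c; a simplified sketch of the relevant logic, not the exact
>> source:
>>
>> 	/*
>> 	 * llist_add() returns true only when the list was empty
>> 	 * beforehand, so the IPI is sent once per batch of queued
>> 	 * csds rather than once per csd.
>> 	 */
>> 	if (llist_add(&csd->llist, &per_cpu(call_single_queue, cpu)))
>> 		arch_send_call_function_single_ipi(cpu);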
>>
>> Considering that:
>>
>> 1. it's not worth sending more IPIs in generic_exec_single just for
>>    the sake of accurately counting irq_call_count;
>> 2. it's not easy to tell in __smp_call_function_single_interrupt
>>    whether a call function interrupt is for a TLB shootdown;
>>
>> not excluding TLB shootdowns from the call function count seems to be
>> the simplest fix, and this patch does just that, as sketched below.
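>>
>> A minimal sketch of the intended change to the /proc/interrupts
>> reporting in arch/x86/kernel/irq.c (illustrative of the idea, not the
>> verbatim diff):
>>
>> -	/* CAL shown with the TLB portion subtracted; can wrap around: */
>> -	seq_printf(p, "%10u ", irq_stats(j)->irq_call_count -
>> -				irq_stats(j)->irq_tlb_count);
>> +	/* Report the raw CAL count, TLB shootdowns included: */
>> +	seq_printf(p, "%10u ", irq_stats(j)->irq_call_count);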
>>
>> This was found recently by LKP's periodic performance regression
>> tracking with the vm-scalability test suite. I bisected it to commit
>> 0a7ce4b5a632 ("mm/rmap: share the i_mmap_rwsem"). That commit didn't
>> do anything wrong, but it revealed the irq_call_count problem. IIUC,
>> the commit lets rwc->rmap_one in rmap_walk_file run concurrently on
>> multiple threads. When rmap_one is try_to_unmap_one, multiple threads
>> can then queue TLB flushes to the same CPU, but only one IPI will be
>> sent.
>>
>> Since that commit entered Linux in v3.19, the counting problem only
>> shows up from v3.19 onward. Considering this is a behaviour change,
>> I'm not sure whether I should add a stable tag here.
>>
>> Signed-off-by: Aaron Lu <aaron.lu@...el.com>
> 
> Thanks for the fix.  You forgot to add :)
> 
> Reported-by: "Huang, Ying" <ying.huang@...el.com>

Oh right, sorry about that.

Regards,
Aaron
