Message-ID: <0a5255db-5455-4317-979c-191cae3ff42b@efficios.com>
Date: Thu, 5 Dec 2024 09:59:26 -0500
From: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To: Dave Hansen <dave.hansen@...el.com>, Rik van Riel <riel@...riel.com>
Cc: linux-kernel@...r.kernel.org, Peter Zijlstra <peterz@...radead.org>,
 Ingo Molnar <mingo@...nel.org>,
 Linus Torvalds <torvalds@...ux-foundation.org>, Mel Gorman
 <mgorman@...e.de>, x86@...nel.org
Subject: Re: [PATCH] smp: Evaluate local cond_func() before IPI side-effects

On 2024-12-03 20:38, Dave Hansen wrote:
> On 12/3/24 10:39, Mathieu Desnoyers wrote:
>> If cond_func() depends on loading shared state updated by other CPU's
>> IPI handlers func(), then triggering execution of remote CPUs IPI before
>> evaluating cond_func() may have unexpected consequences.
> 
> I always thought this was on purpose so cond_func() can be executed in
> parallel with the remote work.
> 
> Could we double-check that this doesn't meaningfully slow down IPIs that
> have longer work to do?

I notice that this question was not answered. I did not do an extensive
benchmark of this effect, but I would not expect a significant impact,
because the cond_func() implementations I've seen (there are very few
users) are all very short, and should take much less time than issuing
the IPI itself, so I expect negligible performance overhead.

But we'll see if any bot observes something unexpected.

Caller code:

fs/buffer.c
1530:	on_each_cpu_cond(has_bh_in_lru, invalidate_bh_lru, NULL, 1);

#define BH_LRU_SIZE     16

bool has_bh_in_lru(int cpu, void *dummy)
{
         struct bh_lru *b = per_cpu_ptr(&bh_lrus, cpu);
         int i;
         for (i = 0; i < BH_LRU_SIZE; i++) {
                 if (b->bhs[i])
                         return true;
         }

         return false;
}

arch/x86/mm/tlb.c
932:		on_each_cpu_cond_mask(tlb_is_not_lazy, flush_tlb_func,

^ this is the small function introduced by Rik's patches.

Thanks,

Mathieu


-- 
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com

