Message-ID: <87czqu2iew.ffs@tglx>
Date:   Wed, 04 Aug 2021 01:54:47 +0200
From:   Thomas Gleixner <tglx@...utronix.de>
To:     Mel Gorman <mgorman@...hsingularity.net>,
        Andrew Morton <akpm@...ux-foundation.org>
Cc:     Ingo Molnar <mingo@...nel.org>, Vlastimil Babka <vbabka@...e.cz>,
        Hugh Dickins <hughd@...gle.com>, Linux-MM <linux-mm@...ck.org>,
        Linux-RT-Users <linux-rt-users@...r.kernel.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Mel Gorman <mgorman@...hsingularity.net>,
        Peter Zijlstra <peterz@...radead.org>
Subject: Re: [PATCH 2/2] mm/vmstat: Protect per cpu variables with preempt
 disable on RT

Mel!

On Fri, Jul 23 2021 at 11:00, Mel Gorman wrote:
> From: Ingo Molnar <mingo@...e.hu>
>
> Disable preemption on -RT for the vmstat code. On vanilla the code runs
> in IRQ-off regions while on -RT it may not when stats are updated under
> a local_lock. "preempt_disable" ensures that the same resource is not
> updated in parallel due to preemption.
>
> This patch differs from the preempt-rt version where __count_vm_event and
> __count_vm_events are also protected. The counters are explicitly "allowed
> to be racy" so there is no need to protect them from preemption. Only
> the accurate page stats that are updated by a read-modify-write need
> protection.
>
> Signed-off-by: Ingo Molnar <mingo@...e.hu>
> Signed-off-by: Thomas Gleixner <tglx@...utronix.de>
> Signed-off-by: Mel Gorman <mgorman@...hsingularity.net>
> ---
>  mm/vmstat.c | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
>
> diff --git a/mm/vmstat.c b/mm/vmstat.c
> index b0534e068166..d06332c221b1 100644
> --- a/mm/vmstat.c
> +++ b/mm/vmstat.c
> @@ -319,6 +319,7 @@ void __mod_zone_page_state(struct zone *zone, enum zone_stat_item item,
>  	long x;
>  	long t;
>  
> +	preempt_disable_rt();

Yes, this is smart to some extent. But in reality it's a bandaid simply
because nobody can tell which item of vmstat requires which protection.
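
For readers who have not met the helper: as far as I recall from the RT
patch queue, preempt_disable_rt() maps to preempt_disable() on PREEMPT_RT
kernels and compiles to little more than a compiler barrier elsewhere.
The following is a userspace model of that idea only, not kernel code.
MODEL_PREEMPT_RT and every model_* name are hypothetical stand-ins
introduced for illustration.

```c
#include <assert.h>

/* Hypothetical switch standing in for the kernel's RT config option. */
#ifndef MODEL_PREEMPT_RT
#define MODEL_PREEMPT_RT 1
#endif

/* Stand-in for the kernel's preempt count; nonzero means "not preemptible". */
static int model_preempt_count;

static void model_preempt_disable(void) { model_preempt_count++; }
static void model_preempt_enable(void)  { model_preempt_count--; }

#if MODEL_PREEMPT_RT
/* RT model: the _rt variants really disable preemption. */
#define model_preempt_disable_rt() model_preempt_disable()
#define model_preempt_enable_rt()  model_preempt_enable()
#else
/* Non-RT model: the caller is assumed to be protected already. */
#define model_preempt_disable_rt() ((void)0)
#define model_preempt_enable_rt()  ((void)0)
#endif

static long model_counter;

/*
 * Read-modify-write of a counter, wrapped the way the patch wraps
 * __mod_zone_page_state(). Returns the preempt depth observed during
 * the update so the effect of the wrapper can be inspected.
 */
static int model_mod_state(long delta)
{
	int depth;

	model_preempt_disable_rt();
	depth = model_preempt_count;
	model_counter += delta;
	model_preempt_enable_rt();

	return depth;
}
```

In this model the wrapper changes behaviour with the build configuration,
while nothing at the call site says which protection the counter needs.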

If you go back in RT history then you will figure out that we were able
to eliminate _all_ occurrences of preempt_disable_rt() except for this
one.

Even mm developers are wary about this:

 <tglx> so in vmstat.c there is this magic comment:
 <tglx>  * For use when we know that interrupts are disabled
 <tglx>  * or when we know that preemption is disabled and that
 <tglx>  * particular counter cannot be updated from interrupt context.
 <tglx> how can I know which counters need what?
 <mm_expert> I don't think there's a list, one would have to check on counter to counter basis :/ 
 <tglx> and of course there is nothing which validates that, right?
 <mm_expert> exactly

Brilliant stuff which prevents you from doing any validation on this.
Over the years there have been several issues where callers had to be
fixed by analysing bug reports instead of having proper instrumentation
in that code which would have told the developer that he got it wrong.
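
To make the missing instrumentation concrete, here is a minimal userspace
sketch of one possible shape for it: each counter declares the protection
it requires, and the update helper checks the calling context against
that declaration. All names (stat_protection, stat_required,
model_context, stat_mod, and so on) are hypothetical; nothing like this
table exists in mm/vmstat.c as far as this thread shows, and a real
kernel version would warn through its own assertion machinery rather
than count violations.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical protection classes a counter could declare. */
enum stat_protection {
	STAT_NEEDS_IRQS_OFF,	/* also updated from interrupt context */
	STAT_NEEDS_PREEMPT_OFF,	/* only ever updated from task context */
};

/* Hypothetical counters, for illustration only. */
enum stat_item { STAT_A, STAT_B, NR_STAT_ITEMS };

/* The per-counter list that the IRC exchange says does not exist. */
static const enum stat_protection stat_required[NR_STAT_ITEMS] = {
	[STAT_A] = STAT_NEEDS_IRQS_OFF,
	[STAT_B] = STAT_NEEDS_PREEMPT_OFF,
};

/* Model of the caller's execution context. */
struct model_context {
	bool irqs_disabled;
	int preempt_count;
};

static long stat_value[NR_STAT_ITEMS];
static unsigned int stat_violations;

/* Does the context satisfy what this counter declared? */
static bool stat_context_ok(enum stat_item item,
			    const struct model_context *ctx)
{
	switch (stat_required[item]) {
	case STAT_NEEDS_IRQS_OFF:
		return ctx->irqs_disabled;
	case STAT_NEEDS_PREEMPT_OFF:
		/* Disabled interrupts also rule out preemption. */
		return ctx->irqs_disabled || ctx->preempt_count > 0;
	}
	return false;
}

/* Update a counter and record when the caller got the context wrong. */
static void stat_mod(enum stat_item item, long delta,
		     const struct model_context *ctx)
{
	if (!stat_context_ok(item, ctx))
		stat_violations++;	/* a kernel would emit a warning here */
	stat_value[item] += delta;
}
```

With a table like this, a caller that updates STAT_A with only preemption
disabled is flagged at the first offending call instead of being found
later through a bug report.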

Of course on RT kernels the preempt_disable_rt() will serialize
everything correctly, but as we have learned over the years just
slapping _if_rt() or if_not_rt() variants of things around is most of
the time papering over the underlying problem of badly defined
protection scopes. Let's not proliferate that. As I said in the above
IRC conversation:

 <tglx> I fundamentally hate this preempt_disable_rt() muck

Thanks,

        tglx
