Message-ID: <Y73F4tbfxT6Kb9kZ@tpad>
Date:   Tue, 10 Jan 2023 17:09:06 -0300
From:   Marcelo Tosatti <mtosatti@...hat.com>
To:     Christoph Lameter <cl@...two.de>
Cc:     Frederic Weisbecker <frederic@...nel.org>, atomlin@...mlin.com,
        tglx@...utronix.de, mingo@...nel.org, peterz@...radead.org,
        pauld@...hat.com, neelx@...hat.com, oleksandr@...alenko.name,
        linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCH v13 2/6] mm/vmstat: Use vmstat_dirty to track
 CPU-specific vmstat discrepancies

On Tue, Jan 10, 2023 at 02:39:08PM +0100, Christoph Lameter wrote:
> On Tue, 10 Jan 2023, Frederic Weisbecker wrote:
> 
> > Note I'm absolutely clueless with vmstat. But I was wondering about it as well
> > while reviewing Marcelo's series, so git blame pointed me to:
> >
> > 7c83912062c801738d7d19acaf8f7fec25ea663c ("vmstat: User per cpu atomics to avoid
> > interrupt disable / enable")
> >
> > And this seem to mention that this can race with IRQs as well, hence the local
> > cmpxchg operation.
> 
> The race with irq could be an issue but I thought we avoided that and were
> content with disabling preemption.
> 
> But this issue illustrates the central problem of the patchset: It makes
> the lightweight counters not so lightweight anymore. 

https://lkml.iu.edu/hypermail/linux/kernel/0903.2/00569.html

With the following test added to the benchmark posted there:

static void do_test_preempt(void)
{
        unsigned long flags;
        unsigned int i;
        cycles_t time1, time2, time;
        u32 rem;

        local_irq_save(flags);
        preempt_disable();
        time1 = get_cycles();
        for (i = 0; i < NR_LOOPS; i++) {
                preempt_disable();
                preempt_enable();
        }
        time2 = get_cycles();
        local_irq_restore(flags);
        preempt_enable();
        time = time2 - time1;

        printk(KERN_ALERT "test results: time for disabling/enabling preemption\n");
        printk(KERN_ALERT "number of loops: %d\n", NR_LOOPS);
        printk(KERN_ALERT "total time: %llu\n", time);
        time = div_u64_rem(time, NR_LOOPS, &rem);
        printk(KERN_ALERT "-> enabling/disabling preemption takes %llu cycles\n",
               time);
        printk(KERN_ALERT "test end\n");
}


model name	: 11th Gen Intel(R) Core(TM) i7-11850H @ 2.50GHz

[  423.676079] test init
[  423.676249] test results: time for baseline
[  423.676405] number of loops: 200000
[  423.676676] total time: 104274
[  423.676910] -> baseline takes 0 cycles
[  423.677051] test end
[  423.678150] test results: time for locked cmpxchg
[  423.678353] number of loops: 200000
[  423.678498] total time: 2473839
[  423.678630] -> locked cmpxchg takes 12 cycles
[  423.678810] test end
[  423.679204] test results: time for non locked cmpxchg
[  423.679394] number of loops: 200000
[  423.679527] total time: 740298
[  423.679644] -> non locked cmpxchg takes 3 cycles
[  423.679817] test end
[  423.680755] test results: time for locked add return
[  423.680951] number of loops: 200000
[  423.681089] total time: 2118185
[  423.681229] -> locked add return takes 10 cycles
[  423.681411] test end
[  423.681846] test results: time for enabling interrupts (STI)
[  423.682063] number of loops: 200000
[  423.682209] total time: 861591
[  423.682335] -> enabling interrupts (STI) takes 4 cycles
[  423.682532] test end
[  423.683606] test results: time for disabling interrupts (CLI)
[  423.683852] number of loops: 200000
[  423.684006] total time: 2440756
[  423.684141] -> disabling interrupts (CLI) takes 12 cycles
[  423.684588] test end
[  423.686626] test results: time for disabling/enabling interrupts (STI/CLI)
[  423.686879] number of loops: 200000
[  423.687015] total time: 4802297
[  423.687139] -> enabling/disabling interrupts (STI/CLI) takes 24 cycles
[  423.687389] test end
[  423.688025] test results: time for disabling/enabling preemption
[  423.688258] number of loops: 200000
[  423.688396] total time: 1341001
[  423.688526] -> enabling/disabling preemption takes 6 cycles
[  423.689276] test end
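
Each per-operation figure in the log is the total cycle count divided by the number of loops, truncated to an integer by div_u64_rem(). That is why the baseline reports 0 cycles: 104274 / 200000 is roughly half a cycle per iteration. A minimal userspace sketch of that arithmetic follows; the helper names are hypothetical and are not part of the test module.

```c
#include <assert.h>
#include <stdint.h>

/* Integer cycles per iteration; truncates like the quotient of div_u64_rem(). */
static uint64_t cycles_per_op(uint64_t total_cycles, uint32_t loops)
{
        assert(loops != 0);
        return total_cycles / loops;
}

/* Same figure in hundredths of a cycle, to recover the truncated fraction. */
static uint64_t centicycles_per_op(uint64_t total_cycles, uint32_t loops)
{
        assert(loops != 0);
        return (total_cycles * 100) / loops;
}
```

For example, centicycles_per_op(1341001, 200000) gives 670, i.e. about 6.7 cycles for a preempt_disable()/preempt_enable() pair, where the log prints 6.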

> The basic primitives add a lot of weight.

I can't see any alternative, given the need to avoid interruption by
the work that syncs per-CPU vmstats to the global vmstats.

> And the per cpu atomic update operations require the modification of
> multiple values. The operation cannot be "atomic" in that sense anymore
> and we need some other form of synchronization that can span multiple
> instructions.

    So use this_cpu_cmpxchg() to avoid the overhead. Since we can no longer
    count on preemption being disabled we still have some minor issues.
    The fetching of the counter thresholds is racy.
    A threshold from another cpu may be applied if we happen to be
    rescheduled on another cpu.  However, the following vmstat operation
    will then bring the counter again under the threshold limit.

On the other hand, those minor issues are gone with this patchset.
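
For readers unfamiliar with the pattern in the quoted changelog, here is a rough userspace analogy of a cmpxchg retry loop that folds a small delta into a global counter once it crosses a threshold. It is only a sketch under assumptions: it uses C11 atomics instead of this_cpu_cmpxchg(), the struct and function names are hypothetical, and it folds the whole value on overflow rather than reproducing the kernel's exact logic.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-ins for one per-CPU delta and the global counter. */
struct counter {
        _Atomic signed char diff;   /* small local delta */
        atomic_long global;         /* folded global value */
        signed char threshold;      /* fold when |diff| would exceed this */
};

static void counter_mod(struct counter *c, long delta)
{
        signed char old, next;
        long overflow;

        assert(c->threshold > 0);
        old = atomic_load(&c->diff);
        do {
                long n = delta + old;

                overflow = 0;
                if (n > c->threshold || n < -c->threshold) {
                        /* Over the limit: move everything to the global counter. */
                        overflow = n;
                        n = 0;
                }
                next = (signed char)n;
                /* On failure, old is reloaded and the update is recomputed. */
        } while (!atomic_compare_exchange_weak(&c->diff, &old, next));

        if (overflow)
                atomic_fetch_add(&c->global, overflow);
}
```

The point of the loop is that the local delta is updated with a single compare-and-exchange, so no interrupt or preemption disabling is needed around it; the fold into the global counter happens outside the loop.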




