Message-ID: <20200817075533.GI2674@hirez.programming.kicks-ass.net>
Date: Mon, 17 Aug 2020 09:55:33 +0200
From: peterz@...radead.org
To: Jiaxun Yang <jiaxun.yang@...goat.com>
Cc: linux-mips@...r.kernel.org,
Thomas Bogendoerfer <tsbogend@...ha.franken.de>,
Huacai Chen <chenhc@...ote.com>,
Aleksandar Markovic <aleksandar.qemu.devel@...il.com>,
Paul Burton <paulburton@...nel.org>,
Serge Semin <Sergey.Semin@...kalelectronics.ru>,
WANG Xuerui <git@...0n.name>,
周琰杰 <zhouyanjie@...yeetech.com>,
Liangliang Huang <huanglllzu@...il.com>,
afzal mohammed <afzal.mohd.ma@...il.com>,
Ingo Molnar <mingo@...nel.org>, Peter Xu <peterx@...hat.com>,
Sergey Korolev <s.korolev@...systems.com>,
Alexey Malahov <Alexey.Malahov@...kalelectronics.ru>,
Marc Zyngier <maz@...nel.org>, Anup Patel <anup.patel@....com>,
Palmer Dabbelt <palmer@...belt.com>,
Atish Patra <atish.patra@....com>,
Michael Kelley <mikelley@...rosoft.com>,
Steven Price <steven.price@....com>,
Daniel Jordan <daniel.m.jordan@...cle.com>,
Ming Lei <ming.lei@...hat.com>,
Ulf Hansson <ulf.hansson@...aro.org>,
Mike Leach <mike.leach@...aro.org>,
linux-kernel@...r.kernel.org, kvm@...r.kernel.org
Subject: Re: [PATCH 1/7] MIPS: sync-r4k: Rework to be many cores firendly
On Mon, Aug 17, 2020 at 11:46:40AM +0800, Jiaxun Yang wrote:
> Here we reworked the whole procdure. Now the synchronise event on CPU0
> is triggered by smp call function, and we won't touch the count on CPU0
> at all.
Are you telling me, that in 2020 you're building chips that need
horrible crap like this ?!?
> +#define MAX_LOOPS 1000
> +
> +void synchronise_count_master(void *unused)
> {
> unsigned long flags;
> + long delta;
> + int i;
>
> + if (atomic_read(&sync_stage) != STAGE_START)
> + BUG();
BUG_ON(atomic_read(&sync_stage) != STAGE_START);
>
> local_irq_save(flags);
That's silly, replace with: lockdep_assert_hardirqs_disabled().
>
> + cur_count = read_c0_count();
> + smp_wmb();
> + atomic_inc(&sync_stage); /* inc to STAGE_MASTER_READY */
Memory barriers require a comment that describes the ordering. This
includes at least 2 variables and at least 2 code paths (*) -- afaict
your code does NOT have a matching barrier, see below.
>
> + for (i = 0; i < MAX_LOOPS; i++) {
> + cur_count = read_c0_count();
> smp_wmb();
> - atomic_inc(&count_count_stop);
> + if (atomic_read(&sync_stage) == STAGE_SLAVE_SYNCED)
> + break;
> }
> +
> + delta = read_c0_count() - fini_count;
>
> local_irq_restore(flags);
>
> + if (i == MAX_LOOPS)
> + pr_err("sync-r4k: Master: synchronise timeout\n");
> + else
> + pr_info("sync-r4k: Master: synchronise succeed, maximum delta: %ld\n", delta);
> +
> + return;
> }
>
> void synchronise_count_slave(int cpu)
> {
> int i;
> unsigned long flags;
> + call_single_data_t csd;
>
> + raw_spin_lock(&sync_r4k_lock);
Why should this be a raw_spinlock_t?
>
> + /* Let variables get attention from cache */
> + for (i = 0; i < MAX_LOOPS; i++) {
> + cur_count++;
> + fini_count += cur_count;
> + cur_count += fini_count;
> }
What does this actually do? You're going to bounce those variables
between this CPU and CPU-0.
> +
> + atomic_set(&sync_stage, STAGE_START);
> + csd.func = synchronise_count_master;
> +
> + /* Master count is always CPU0 */
> + if (smp_call_function_single_async(0, &csd)) {
This is disgusting.
It also requires a comment on how the on-stack csd is correct (it is,
but it really needs a comment).
> + pr_err("sync-r4k: Salve: Failed to call master\n");
> + raw_spin_unlock(&sync_r4k_lock);
> + return;
> + }
> +
> + local_irq_save(flags);
> +
> + /* Wait until master ready */
> + while (atomic_read(&sync_stage) != STAGE_MASTER_READY)
> + cpu_relax();
This really wants to be:
atomic_cond_read_acquire(&sync_stage, VAL == STAGE_MASTER_READY);
Because, afaict, the smp_wmb() (*) in synchronise_count_master() orders
against this here, and we need to guarantee we read @sync_stage _before_
@cur_count.
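To make the pairing concrete, here is a rough userspace sketch, with C11
atomics standing in for the kernel primitives (a release store in place of
smp_wmb() + atomic_inc(), an acquire spin in place of
atomic_cond_read_acquire()). The helper names master_publish() and
slave_wait_and_read() are made up for illustration and are not part of the
patch; the point is only which two variables and which two code paths the
ordering comment has to name.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

enum { STAGE_START, STAGE_MASTER_READY, STAGE_SLAVE_SYNCED };

static uint32_t cur_count;			/* payload, plain variable */
static _Atomic int sync_stage = STAGE_START;	/* flag that publishes it */

/* Master path (hypothetical helper): write the payload, then the flag. */
static void master_publish(uint32_t count)
{
	/* Userspace stand-in for the BUG_ON() on the starting stage. */
	assert(atomic_load_explicit(&sync_stage, memory_order_relaxed) == STAGE_START);

	cur_count = count;
	/*
	 * RELEASE: orders the store to cur_count before the store to
	 * sync_stage. Pairs with the ACQUIRE in slave_wait_and_read().
	 */
	atomic_store_explicit(&sync_stage, STAGE_MASTER_READY, memory_order_release);
}

/* Slave path (hypothetical helper): read the flag, then the payload. */
static uint32_t slave_wait_and_read(void)
{
	/*
	 * ACQUIRE: orders the load of sync_stage before the load of
	 * cur_count. Pairs with the RELEASE in master_publish().
	 */
	while (atomic_load_explicit(&sync_stage, memory_order_acquire) != STAGE_MASTER_READY)
		;	/* spin; kernel code would use cpu_relax() here */

	return cur_count;
}
```

Without the acquire on the reader side, the writer-side barrier has nothing
to pair with, and the read of cur_count is not ordered after the read of
sync_stage.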
> +
> + write_c0_count(cur_count);
> + fini_count = read_c0_count();
> + smp_wmb();
> + atomic_inc(&sync_stage); /* inc to STAGE_SLAVE_SYNCED */
>
> local_irq_restore(flags);
> +
> + raw_spin_unlock(&sync_r4k_lock);
> }
Furthermore, afaict there isn't actually any concurrency on @sync_stage,
so atomic_t isn't required. Using smp_store_release() to change state
might be far more natural.
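Something along these lines, as a userspace sketch only: each stage
transition has exactly one writer, so a release store is enough and no
atomic read-modify-write is needed. C11 needs the _Atomic qualifier to keep
the cross-thread accesses well defined; in the kernel this would presumably
be a plain int with smp_store_release() / smp_load_acquire(). The helpers
set_stage(), wait_stage(), master_step(), slave_step() and master_finish()
are hypothetical names, and the counter writes are simplified stand-ins for
the c0_count accesses in the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

enum { STAGE_START, STAGE_MASTER_READY, STAGE_SLAVE_SYNCED };

static _Atomic int sync_stage = STAGE_START;
static uint32_t cur_count;	/* written by master, read by slave */
static uint32_t fini_count;	/* written by slave, read by master */

/* Single writer per transition: a release store replaces atomic_inc(). */
static void set_stage(int expected_prev, int next)
{
	assert(atomic_load_explicit(&sync_stage, memory_order_relaxed) == expected_prev);
	/* RELEASE: prior stores are visible to whoever acquires @next. */
	atomic_store_explicit(&sync_stage, next, memory_order_release);
}

static void wait_stage(int want)
{
	/* ACQUIRE: pairs with the RELEASE in set_stage(). */
	while (atomic_load_explicit(&sync_stage, memory_order_acquire) != want)
		;	/* spin; kernel code would use cpu_relax() here */
}

static void master_step(uint32_t count)
{
	cur_count = count;
	set_stage(STAGE_START, STAGE_MASTER_READY);
}

static uint32_t slave_step(void)
{
	wait_stage(STAGE_MASTER_READY);
	fini_count = cur_count;	/* stand-in for write/read of c0_count */
	set_stage(STAGE_MASTER_READY, STAGE_SLAVE_SYNCED);
	return fini_count;
}

static uint32_t master_finish(void)
{
	wait_stage(STAGE_SLAVE_SYNCED);
	return fini_count;
}
```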