linux-kernel - Re: [PATCH v4 14/16] timer: Implement the hierarchical pull model

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20221115113152.GH590078@lothringen>
Date:   Tue, 15 Nov 2022 12:31:52 +0100
From:   Frederic Weisbecker <frederic@...nel.org>
To:     Anna-Maria Behnsen <anna-maria@...utronix.de>
Cc:     linux-kernel@...r.kernel.org,
        Peter Zijlstra <peterz@...radead.org>,
        John Stultz <jstultz@...gle.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Eric Dumazet <edumazet@...gle.com>,
        "Rafael J . Wysocki" <rafael.j.wysocki@...el.com>,
        Arjan van de Ven <arjan@...radead.org>,
        "Paul E . McKenney" <paulmck@...nel.org>,
        Frederic Weisbecker <fweisbec@...il.com>,
        Rik van Riel <riel@...riel.com>
Subject: Re: [PATCH v4 14/16] timer: Implement the hierarchical pull model

On Fri, Nov 04, 2022 at 03:57:35PM +0100, Anna-Maria Behnsen wrote:
> +static bool tmigr_update_events(struct tmigr_group *group,
> +				struct tmigr_group *child,
> +				struct tmigr_walk *data)
> +{
> +	struct tmigr_event *evt, *first_childevt;
> +	bool walk_done;
> +	u64 nextexp;
> +
> +	if (child) {
> +		if (data->childstate.active)
> +			return true;
> +
> +		raw_spin_lock(&child->lock);
> +		raw_spin_lock_nested(&group->lock, SINGLE_DEPTH_NESTING);
> +
> +		first_childevt = tmigr_next_groupevt(child);
> +		nextexp = child->next_expiry;
> +		evt = &child->groupevt;
> +	} else {
> +		nextexp = data->nextexp;
> +
> +		/*
> +		 * Set @data->nextexp to KTIME_MAX; it is reused for first
> +		 * global event which needs to be handled by migrator (in
> +		 * toplevel group)
> +		 */
> +		data->nextexp = KTIME_MAX;
> +
> +		first_childevt = evt = data->evt;
> +		if (test_bit(0, &evt->ignore))
> +			return true;
> +
> +		raw_spin_lock(&group->lock);
> +	}
> +
> +	if (nextexp == KTIME_MAX) {
> +		set_bit(0, &evt->ignore);
> +		walk_done = true;
> +		goto unlock;
> +	}
> +
> +	walk_done = !group->parent;
> +
> +	/*
> +	 * Update of event cpu and ignore bit is required only when @child
> +	 * is set (child is equal or higher than lvl0), but it doesn't
> +	 * matter if it is written once more to per cpu event; make the
> +	 * update unconditional.
> +	 */
> +	evt->cpu = first_childevt->cpu;
> +	clear_bit(0, &evt->ignore);
> +
> +	/*
> +	 * If child event is already queued in group, remove it from queue
> +	 * when expiry time changed only
> +	 */
> +	if (timerqueue_node_queued(&evt->nextevt)) {
> +		if (evt->nextevt.expires == nextexp)
> +			goto check_toplvl;
> +
> +		if (evt->nextevt.expires != nextexp &&

At this point evt->nextevt.expires != nextexp has to be true.

> +		    !timerqueue_del(&group->events, &evt->nextevt))
> +			group->next_expiry = KTIME_MAX;
> +	}
> +
> +	evt->nextevt.expires = nextexp;
> +
> +	if (timerqueue_add(&group->events, &evt->nextevt))
> +		group->next_expiry = nextexp;
> +
> +check_toplvl:
> +	if (walk_done && (data->groupstate.migrator == TMIGR_NONE)) {
> +		/*
> +		 * Toplevel group is idle and it has to be ensured global
> +		 * timers are handled in time. (This could be optimized by
> +		 * keeping track of the last global scheduled event and
> +		 * only arming it on CPU if the new event is earlier. Not
> +		 * sure if its worth the complexity.)
> +		 */
> +		data->nextexp = tmigr_next_groupevt_expires(group);
> +	}
> +
> +unlock:
> +	raw_spin_unlock(&group->lock);
> +
> +	if (child)
> +		raw_spin_unlock(&child->lock);
> +
> +	return walk_done;
> +}
[...]
> +static int tmigr_cpu_offline(unsigned int cpu)
> +{
> +	struct tmigr_cpu *tmc = this_cpu_ptr(&tmigr_cpu);
> +
> +	raw_spin_lock_irq(&tmc->lock);
> +	tmc->online = 0;
> +	__tmigr_cpu_deactivate(tmc, KTIME_MAX);

This means that if the CPU is going idle for some time during
the hotplug process (ie: at some point between CPUHP_AP_TMIGR_ONLINE
and CPUHP_TEARDOWN_CPU), then a global timer may be delayed for that long.

I guess it shouldn't be too bad but worth mentioning...

Although if it happens to be a problem it could be solved with simply allowing
tmigr_cpu_deactivate() when !tmc->online.

Thanks.