linux-kernel - Re: [PATCH 0.5/3] livepatch: Convert stack entries array to percpu


Message-ID: <ZBBRbfAL5+ZI77XN@alley>
Date:   Tue, 14 Mar 2023 11:50:21 +0100
From:   Petr Mladek <pmladek@...e.com>
To:     Josh Poimboeuf <jpoimboe@...nel.org>
Cc:     live-patching@...r.kernel.org, linux-kernel@...r.kernel.org,
        Seth Forshee <sforshee@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Song Liu <song@...nel.org>,
        Mark Rutland <mark.rutland@....com>,
        Joe Lawrence <joe.lawrence@...hat.com>,
        Miroslav Benes <mbenes@...e.cz>,
        Jiri Kosina <jikos@...nel.org>, Ingo Molnar <mingo@...hat.com>,
        Rik van Riel <riel@...riel.com>
Subject: Re: [PATCH 0.5/3] livepatch: Convert stack entries array to percpu

On Mon 2023-03-13 16:33:46, Josh Poimboeuf wrote:
> On Fri, Mar 03, 2023 at 03:00:13PM +0100, Petr Mladek wrote:
> > > MAX_STACK_ENTRIES is 100, which seems excessive.  If we halved that, the
> > > array would be "only" 400 bytes, which is *almost* reasonable to
> > > allocate on the stack?
> > 
> > It is just for the stack in the process context. Right?
> > 
> > I think that I have never seen a stack with over 50 entries. And in
> > the worst case, a larger number of entries would "just" result in
> > a non-reliable stack which might be acceptable.
> > 
> > It looks acceptable to me.
> > 
> > > Alternatively we could have a percpu entries array... :-/
> > 
> > That said, percpu entries would be fine as well. It sounds like
> > a good price for the livepatching feature. I think that livepatching
> > is used on big systems anyway.
> > 
> > I slightly prefer the per-cpu solution.
> 
> Booting a kernel with PREEMPT+LOCKDEP gave me a high-water mark of 60+
> stack entries, seen when probing a device.  I decided not to mess with
> MAX_STACK_ENTRIES, and instead just convert the entries to percpu.  This
> patch could be inserted at the beginning of the set.

Good to know.
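
For reference, the size arithmetic behind the quoted discussion can be sketched as below. This is only an illustration: entries_bytes() is a hypothetical helper, not something from the patch. Where unsigned long is 8 bytes (64-bit kernels), it gives 800 bytes for the current MAX_STACK_ENTRIES of 100 and 400 bytes for the halved value. The observed high-water mark of 60+ entries would not fit in the halved array.

```c
#include <stddef.h>

/* Value quoted in the thread for kernel/livepatch/transition.c. */
#define MAX_STACK_ENTRIES 100

/*
 * Hypothetical helper (not part of the patch): bytes needed for an
 * array holding 'nr' saved return addresses.
 */
static size_t entries_bytes(size_t nr)
{
	return nr * sizeof(unsigned long);
}
```

Calling entries_bytes(MAX_STACK_ENTRIES / 2) gives the footprint of the "halved" variant on whatever platform the snippet is compiled for.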

> 
> ---8<---
> 
> Subject: [PATCH 0.5/3] livepatch: Convert stack entries array to percpu
> 
> --- a/kernel/livepatch/transition.c
> +++ b/kernel/livepatch/transition.c
> @@ -240,12 +242,15 @@ static int klp_check_stack_func(struct klp_func *func, unsigned long *entries,
>   */
>  static int klp_check_stack(struct task_struct *task, const char **oldname)
>  {
> -	static unsigned long entries[MAX_STACK_ENTRIES];
> +	unsigned long *entries = this_cpu_ptr(klp_stack_entries);
>  	struct klp_object *obj;
>  	struct klp_func *func;
>  	int ret, nr_entries;
>  
> -	ret = stack_trace_save_tsk_reliable(task, entries, ARRAY_SIZE(entries));
> +	/* Protect 'klp_stack_entries' */
> +	lockdep_assert_preemption_disabled();

I am thinking about adding:

	/*
	 * Stay on the safe side even when cond_resched() is called from
	 * an IRQ context by mistake.
	 */
	if (!in_task())
		return -EINVAL;

Or is this already prevented in some other way?
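
To make the ordering concrete, here is a rough userspace model of the control flow with such a guard placed before the shared buffer is touched. It is a sketch under stated assumptions, not kernel code: model_in_task, model_cpu, model_entries, model_depth and model_trace_save() are hypothetical stand-ins for in_task(), the current CPU, the per-CPU klp_stack_entries array and stack_trace_save_tsk_reliable(). The model's "too deep" failure is an assumption about how an oversized trace would be reported.

```c
#include <errno.h>
#include <stdbool.h>

#define MAX_STACK_ENTRIES 100
#define MODEL_NR_CPUS 2			/* hypothetical CPU count */

/* Stand-in for the per-CPU 'klp_stack_entries' array. */
static unsigned long model_entries[MODEL_NR_CPUS][MAX_STACK_ENTRIES];
static int model_cpu;			/* stand-in for the current CPU */
static bool model_in_task = true;	/* stand-in for in_task() */
static int model_depth = 60;		/* simulated stack depth */

/*
 * Stand-in for stack_trace_save_tsk_reliable(): fills 'store' and returns
 * the number of entries, or a negative value when the simulated stack
 * does not fit (assumed behavior for this model only).
 */
static int model_trace_save(unsigned long *store, unsigned int size)
{
	unsigned int i;

	if (model_depth < 0 || (unsigned int)model_depth > size)
		return -1;
	for (i = 0; i < (unsigned int)model_depth; i++)
		store[i] = 0x1000UL + i;
	return model_depth;
}

static int model_check_stack(void)
{
	unsigned long *entries;
	int ret;

	/* Guard first: never use the shared buffer outside task context. */
	if (!model_in_task)
		return -EINVAL;

	entries = model_entries[model_cpu];
	ret = model_trace_save(entries, MAX_STACK_ENTRIES);
	if (ret < 0)
		return -EINVAL;
	return ret;	/* number of saved entries */
}
```

In this model, a call made with model_in_task set to false fails with -EINVAL before the buffer is written, which is the property the proposed check is meant to provide.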

> +
> +	ret = stack_trace_save_tsk_reliable(task, entries, MAX_STACK_ENTRIES);
>  	if (ret < 0)
>  		return -EINVAL;
>  	nr_entries = ret;

Otherwise, it looks good to me.

Best Regards,
Petr