linux-kernel - Re: [PATCH v2 1/2] livepatch: Remove duplicate warning about missing reliable stacktrace support

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20190507142847.pre7tvm4oysimfww@pathway.suse.cz>
Date:   Tue, 7 May 2019 16:28:47 +0200
From:   Petr Mladek <pmladek@...e.com>
To:     Josh Poimboeuf <jpoimboe@...hat.com>
Cc:     Jiri Kosina <jikos@...nel.org>, Miroslav Benes <mbenes@...e.cz>,
        Joe Lawrence <joe.lawrence@...hat.com>,
        Kamalesh Babulal <kamalesh@...ux.vnet.ibm.com>,
        live-patching@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 1/2] livepatch: Remove duplicate warning about missing
 reliable stacktrace support

On Tue 2019-05-07 07:03:50, Josh Poimboeuf wrote:
> On Tue, May 07, 2019 at 01:29:50PM +0200, Petr Mladek wrote:
> > On Mon 2019-05-06 20:43:32, Josh Poimboeuf wrote:
> > > On Mon, May 06, 2019 at 07:40:32PM -0500, Josh Poimboeuf wrote:
> > > > On Tue, Apr 30, 2019 at 11:10:48AM +0200, Petr Mladek wrote:
> > > > > --- a/kernel/livepatch/transition.c
> > > > > +++ b/kernel/livepatch/transition.c
> > > > > @@ -263,8 +263,15 @@ static int klp_check_stack(struct task_struct *task, char *err_buf)
> > > > >  	trace.nr_entries = 0;
> > > > >  	trace.max_entries = MAX_STACK_ENTRIES;
> > > > >  	trace.entries = entries;
> > > > > +
> > > > >  	ret = save_stack_trace_tsk_reliable(task, &trace);
> > > > > -	WARN_ON_ONCE(ret == -ENOSYS);
> > > > > +	/*
> > > > > +	 * pr_warn() under task rq lock might cause a deadlock.
> > > > > +	 * Fortunately, missing reliable stacktrace support has
> > > > > +	 * already been handled when the livepatch was enabled.
> > > > > +	 */
> > > > > +	if (ret == -ENOSYS)
> > > > > +		return ret;
> > > > 
> > > > I find the comment to be a bit wordy and confusing (and vague).
> > 
> > Then please provide a better one. I have no idea what might make
> > you happy and am not interested into an endless disputing.
>
> Something like this would be clearer:
> 
> 	if (ret == -ENOSYS) {
> 		/*
> 		 * This arch doesn't support reliable stack tracing.  No
> 		 * need to print a warning; that has already been done
> 		 * by klp_enable_patch().
> 		 */
> 		return ret;
> 	}

I do not mind.

> > > > Also this check is effectively the same as the klp_have_reliable_stack()
> > > > check which is done in kernel/livepatch/core.c.  So I think it would be
> > > > clearer and more consistent if the same check is done here:
> > > > 
> > > > 	if (!klp_have_reliable_stack())
> > > > 		return -ENOSYS;
> > 
> > Huh, it smells with over engineering to me.
> 
> How so?  It makes the code more readable and the generated code should
> be much better because it becomes a build-time check.

save_stack_trace_tsk_reliable() returns various error codes.
We catch a specific one because otherwise the message below
might be misleading.

I do not see why we should prevent this error by calling
a custom hack: klp_have_reliable_stack()?

Regarding reliability. If anyone changes semantic of
save_stack_trace_tsk_reliable() error codes, they would likely
check if all users (one at the moment) handle it correctly.

On the other hand, the dependency between the -ENOSYS
return value and klp_have_reliable_stack() is far from
obvious.

If we want to discuss and fix this to the death. We should change
the return value from -ENOSYS to -EOPNOTSUPP. The reason
is the same as in the commit 375bfca3459db1c5596
("livepatch: core: Return EOPNOTSUPP instead of ENOSYS").

Note that EOPNOTSUPP is the same errno as ENOTSUP, see
man 3 errno.


> But I think Miroslav's suggestion to revert 1d98a69e5cef would be even
> better.

AFAIK, Miroslav wanted to point out that your opinion was inconsistent.


> > > > 	ret = save_stack_trace_tsk_reliable(task, &trace);
> > > > 
> > > > 	[ no need to check ret for ENOSYS here ]
> > > > 
> > > > Then, IMO, no comment is needed.
> > > 
> > > BTW, if you agree with this approach then we can leave the
> > > WARN_ON_ONCE() in save_stack_trace_tsk_reliable() after all.
> > 
> > I really like the removal of the WARN_ON_ONCE(). I consider
> > it an old fashioned way used when people are lazy to handle
> > errors. It might make sense when the backtrace helps to locate
> > the context but the context is well known here. Finally,
> > WARN() should be used with care. It might cause reboot
> > with panic_on_warn.
> 
> The warning makes the function consistent with the other weak functions
> in stacktrace.c and clarifies that it should never be called unless an
> arch has misconfigured something.  And if we aren't even checking the
> specific ENOSYS error as I proposed then this warning would make the
> error more obvious.

I consider both WARN() and error value as superfluous. I like the
error value because it allows users to handle the situation as
they need it.

Best Regards,
Petr

PS: This is my last mail in the thread this week. I will eventually
return to it with a clear head next week. It is all nitpicking from
my POV and I have better things to do.