linux-kernel - Re: [PATCH v3 11/19] unwind: Add deferred user space unwinding API

Message-ID: <CAEf4BzZksT=GTs268KBiCsYxUcvWz5KUghjKQQR8OxGdoBt=6A@mail.gmail.com>
Date: Thu, 31 Oct 2024 16:28:08 -0700
From: Andrii Nakryiko <andrii.nakryiko@...il.com>
To: Josh Poimboeuf <jpoimboe@...nel.org>
Cc: x86@...nel.org, Peter Zijlstra <peterz@...radead.org>, 
	Steven Rostedt <rostedt@...dmis.org>, Ingo Molnar <mingo@...nel.org>, 
	Arnaldo Carvalho de Melo <acme@...nel.org>, linux-kernel@...r.kernel.org, 
	Indu Bhagat <indu.bhagat@...cle.com>, Mark Rutland <mark.rutland@....com>, 
	Alexander Shishkin <alexander.shishkin@...ux.intel.com>, Jiri Olsa <jolsa@...nel.org>, 
	Namhyung Kim <namhyung@...nel.org>, Ian Rogers <irogers@...gle.com>, 
	Adrian Hunter <adrian.hunter@...el.com>, linux-perf-users@...r.kernel.org, 
	Mark Brown <broonie@...nel.org>, linux-toolchains@...r.kernel.org, 
	Jordan Rome <jordalgo@...a.com>, Sam James <sam@...too.org>, linux-trace-kernel@...r.kerne.org, 
	Jens Remus <jremus@...ux.ibm.com>, Mathieu Desnoyers <mathieu.desnoyers@...icios.com>, 
	Florian Weimer <fweimer@...hat.com>, Andy Lutomirski <luto@...nel.org>
Subject: Re: [PATCH v3 11/19] unwind: Add deferred user space unwinding API

On Thu, Oct 31, 2024 at 4:13 PM Josh Poimboeuf <jpoimboe@...nel.org> wrote:
>
> On Thu, Oct 31, 2024 at 02:22:48PM -0700, Andrii Nakryiko wrote:
> > > Problem is, the unwinder doesn't know in advance which tasks will be
> > > unwound.
> > >
> > > Its first clue is unwind_user_register(), would it make sense for the
> > > caller to clarify whether all tasks need to be unwound or only a
> > > specific subset?
> > >
> > > Its second clue is unwind_user_deferred(), which is called for the task
> > > itself.  But by then it's too late because it needs to access the
> > > per-task data from (potentially) irq context so it can't do a lazy
> > > allocation.
> > >
> > > I'm definitely open to ideas...
> >
> > The laziest thing would be to perform GFP_ATOMIC allocation, and if
> > that fails, oops, too bad, no stack trace for you (but, generally
> > speaking, no big deal). Advantages are clear, though, right? Single
> > pointer in task_struct, which most of the time will be NULL, so no
> > unnecessary overheads.
>
> GFP_ATOMIC is limited, I don't think we want the unwinder to trigger
> OOM.
>

So every task_struct on the system growing by 104 bytes, *permanently*
and *unconditionally*, is not a concern, but a lazy GFP_ATOMIC
allocation made only when you actually need it is?
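
To make the lazy-allocation idea concrete, here is a minimal userspace
sketch. It is an illustration under stated assumptions, not code from the
patch set: struct task, struct unwind_state, atomic_alloc() and the other
names are hypothetical, and calloc() stands in for a kmalloc() with
GFP_ATOMIC. The task carries only a pointer, which stays NULL until an
unwind is first requested; if the allocation fails, the caller simply gets
no trace for that request.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_ENTRIES 8	/* hypothetical size, kept small for illustration */

/* Hypothetical per-task unwind state (the data that would otherwise be
 * embedded in every task_struct). */
struct unwind_state {
	unsigned long entries[MAX_ENTRIES];
	unsigned int nr;
};

/* Stand-in for task_struct: a single pointer, NULL until first use. */
struct task {
	struct unwind_state *unwind;
};

/* Knob to simulate a failed atomic allocation (assumption for the demo). */
static int simulate_alloc_failure;

/* Stand-in for kmalloc(size, GFP_ATOMIC | __GFP_ZERO): may return NULL. */
static void *atomic_alloc(size_t size)
{
	if (simulate_alloc_failure)
		return NULL;
	return calloc(1, size);
}

/* Return the task's unwind state, allocating it on first use.
 * NULL means the allocation failed: no stack trace this time. */
static struct unwind_state *unwind_state_get(struct task *t)
{
	assert(t != NULL);
	if (!t->unwind)
		t->unwind = atomic_alloc(sizeof(*t->unwind));
	return t->unwind;
}

/* Release the state when the task goes away. */
static void task_exit(struct task *t)
{
	free(t->unwind);
	t->unwind = NULL;
}
```

Tasks that are never unwound pay only for the pointer, and a later
request can retry the allocation because the pointer is still NULL after
a failure.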

> > It's the last point that's important to make usability so much
> > simpler, avoiding unnecessary custom timeouts and stuff like that.
> > Regardless of whether stack trace capture succeeds, the user is
> > guaranteed to get a "notification" about the outcome.
> >
> > Hope this helps.
> >
> > But basically, if I called unwind_user_deferred(), I expect to get
> > some callback, guaranteed, with the result or failure. The only thing
> > that's not guaranteed (and which makes timeouts bad) is *when* this
> > will happen. Because stack trace capture can be arbitrarily delayed
> > and stuff. That's fine, but that also shows why timeout is tricky and
> > necessarily fragile.
>
> That sounds reasonable.  In the OOM error case I can just pass a small
> (stack allocated) one-entry trace with only regs->ip.
>

SGTM
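
For illustration, the agreed behaviour (the callback always fires, and on
allocation failure it receives a one-entry trace holding only the IP) can
be sketched in userspace C roughly as follows. This is an assumption-laden
sketch, not the patch's implementation: struct trace, unwind_cb_t,
unwind_and_notify(), record_cb() and the simulate_oom knob are all
hypothetical names, malloc() stands in for the kernel allocator, and the
"walk" just fills in placeholder addresses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_FRAMES 4	/* hypothetical trace depth for the demo */

struct trace {
	const unsigned long *entries;
	unsigned int nr;
};

/* Hypothetical callback type: invoked exactly once per request. */
typedef void (*unwind_cb_t)(const struct trace *trace, void *cookie);

/* Knob to simulate the OOM case (assumption for the demo). */
static int simulate_oom;

/* Always calls cb(): with a full trace when the buffer allocation
 * succeeds, otherwise with a stack-allocated one-entry trace that
 * contains only the instruction pointer. */
static void unwind_and_notify(unsigned long ip, unwind_cb_t cb, void *cookie)
{
	unsigned long *buf;
	struct trace trace;
	unsigned int i;

	assert(cb != NULL);

	buf = simulate_oom ? NULL : malloc(MAX_FRAMES * sizeof(*buf));
	if (!buf) {
		unsigned long one = ip;	/* lives on the stack, no allocation */

		trace.entries = &one;
		trace.nr = 1;
		cb(&trace, cookie);
		return;
	}

	/* Placeholder for a real stack walk: fake caller addresses. */
	for (i = 0; i < MAX_FRAMES; i++)
		buf[i] = ip + i * 0x10;

	trace.entries = buf;
	trace.nr = MAX_FRAMES;
	cb(&trace, cookie);
	free(buf);
}

/* Example consumer: records what it was told. */
struct result {
	int calls;
	unsigned int nr;
	unsigned long first;
};

static void record_cb(const struct trace *trace, void *cookie)
{
	struct result *r = cookie;

	r->calls++;
	r->nr = trace->nr;
	r->first = trace->entries[0];
}
```

The consumer never needs a timeout: it gets one callback per request
either way, and a one-entry trace tells it the unwind was degraded.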

> --
> Josh
>