Message-ID: <A0C713E9-6ACE-4235-87C2-2653A8F22F7B@fb.com>
Date:   Tue, 8 Jan 2019 23:54:04 +0000
From:   Song Liu <songliubraving@...com>
To:     Peter Zijlstra <peterz@...radead.org>
CC:     lkml <linux-kernel@...r.kernel.org>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        "acme@...nel.org" <acme@...nel.org>,
        "ast@...nel.org" <ast@...nel.org>,
        "daniel@...earbox.net" <daniel@...earbox.net>,
        Kernel Team <Kernel-team@...com>,
        Andi Kleen <andi@...stfloor.org>
Subject: Re: [PATCH v5 perf, bpf-next 3/7] perf, bpf: introduce
 PERF_RECORD_BPF_EVENT



> On Jan 8, 2019, at 11:43 AM, Peter Zijlstra <peterz@...radead.org> wrote:
> 
> On Tue, Jan 08, 2019 at 07:10:20PM +0000, Song Liu wrote:
>>> On Jan 8, 2019, at 10:41 AM, Peter Zijlstra <peterz@...radead.org> wrote:
>>> On Thu, Dec 20, 2018 at 10:29:00AM -0800, Song Liu wrote:
>>>> @@ -986,9 +987,35 @@ enum perf_event_type {
>>>> 	 */
>>>> 	PERF_RECORD_KSYMBOL			= 17,
>>>> 
>>>> +	/*
>>>> +	 * Record bpf events:
>>>> +	 *  enum perf_bpf_event_type {
>>>> +	 *	PERF_BPF_EVENT_UNKNOWN		= 0,
>>>> +	 *	PERF_BPF_EVENT_PROG_LOAD	= 1,
>>>> +	 *	PERF_BPF_EVENT_PROG_UNLOAD	= 2,
>>>> +	 *  };
>>>> +	 *
>>>> +	 * struct {
>>>> +	 *	struct perf_event_header	header;
>>>> +	 *	u16				type;
>>>> +	 *	u16				flags;
>>>> +	 *	u32				id;
>>>> +	 *	u8				tag[BPF_TAG_SIZE];
>>>> +	 *	struct sample_id		sample_id;
>>>> +	 * };
>>>> +	 */
>>>> +	PERF_RECORD_BPF_EVENT			= 18,
>>>> +
>>> 
>>> Elsewhere today, I raised the point that by the time (however short
>>> interval) userspace gets around to reading this event, the actual
>>> program could be gone again.
>>> 
>>> In this case the program has been with us for a very short period
>>> indeed; but it could still have generated some samples or otherwise
>>> generated trace data.
>> 
>> Since we already have the separate KSYMBOL events, BPF_EVENT is only
>> required for advanced use cases, like annotation. So I guess missing
>> it for very short-lived programs should not be a huge problem?
>> 
>>> It was suggested to allow pinning modules/programs to avoid this
>>> situation, but that of course has other undesirable effects, such as a
>>> trivial DoS.
>>> 
>>> A truly horrible hack would be to include an open filedesc in the event
>>> that needs closing to release the resource, but I'm sorry for even
>>> suggesting that **shudder**.
>>> 
>>> Do we have any sane ideas?
>> 
>> How about we gate the open filedesc solution with an option, and limit
>> that option to root only? If this still sounds hacky, maybe we should
>> just ignore the case where short-lived programs are missed?
> 
> I'm afraid we might also 'need' this for the kallsym thing.
> 
> The problem is that things like Intel PT (ARM Coresight too IIRC) encode
> a bitstream of branch-taken decisions. The only way to decode that and
> reconstruct the actual code-flow is with an exact matching text image.
> 
> In order to have this matching text we need to be able to copy out every
> piece of dynamic text (from kcore) that has ever executed before it
> disappears.
> 
> Elsewhere (*), Andi suggests to have a kind of text-free fence
> interface, where userspace can call a complete. And I suppose as long as we
> know there is a consumer, we also know we'll not be blocked
> indefinitely. So it would have to be slightly more complicated than
> suggested, but I think that is something we could work with.
> 
> It would also not complicate these events.
> 
> 
> 
> [*] https://lkml.kernel.org/r/20190108172721.GN6118@tassilo.jf.intel.com

I think the Intel PT case works at instruction granularity (rather than
ksymbol granularity)? If so, modules, BPF, and PT could still share
the ksymbol record for basic profiling. Advanced use cases like
annotation would depend on user space recording BPF_EVENT (and the
equivalent for other cases) in time. But at least the ksymbol is already there.
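
To illustrate what "ksymbol granularity" means for basic profiling, here is a
minimal sketch. It assumes each ksymbol record gives user space a start
address, a length, and a name; the struct ksym_range and the ksym_lookup()
helper are hypothetical names made up for this example, not part of the patch
or of perf. Resolving a sample only needs the address range, not the text
bytes that PT decoding would require.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory copy of one ksymbol record (illustration only). */
struct ksym_range {
	uint64_t addr;		/* start address of the symbol's text */
	uint32_t len;		/* length of the text in bytes */
	const char *name;	/* symbol name, e.g. a BPF program name */
};

/*
 * Return the name of the range containing ip, or NULL if no range matches.
 * Linear scan for clarity; a real tool would likely keep the ranges sorted.
 */
static const char *ksym_lookup(const struct ksym_range *tab, size_t n,
			       uint64_t ip)
{
	assert(tab != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		/* ip lies in [addr, addr + len), written to avoid overflow */
		if (ip >= tab[i].addr && ip - tab[i].addr < tab[i].len)
			return tab[i].name;
	}
	return NULL;
}
```

A sample whose ip falls inside a recorded range still resolves to a name even
if the program has since been unloaded, which is why the ksymbol record alone
is enough for this case.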

Does this make sense?  

Thanks,
Song 


