Message-ID: <26B5B18A-EC0B-4661-BE6F-E5D96DCDE0B9@fb.com>
Date: Tue, 31 Aug 2021 16:41:01 +0000
From: Song Liu <songliubraving@...com>
To: Peter Zijlstra <peterz@...radead.org>
CC: "open list:BPF (Safe dynamic programs and tools)"
<bpf@...r.kernel.org>, LKML <linux-kernel@...r.kernel.org>,
Arnaldo Carvalho de Melo <acme@...nel.org>,
Ingo Molnar <mingo@...hat.com>,
kajoljain <kjain@...ux.ibm.com>, Kernel Team <Kernel-team@...com>
Subject: Re: [PATCH v3 bpf-next 2/3] bpf: introduce helper
bpf_get_branch_snapshot
> On Aug 31, 2021, at 8:32 AM, Peter Zijlstra <peterz@...radead.org> wrote:
>
> On Mon, Aug 30, 2021 at 02:41:05PM -0700, Song Liu wrote:
>
>> @@ -564,6 +565,18 @@ static void notrace inc_misses_counter(struct bpf_prog *prog)
>> u64 notrace __bpf_prog_enter(struct bpf_prog *prog)
>> __acquires(RCU)
>> {
> preempt_disable_notrace();
>
>> +#ifdef CONFIG_PERF_EVENTS
>> + /* Calling migrate_disable costs two entries in the LBR. To save
>> + * some entries, we call perf_snapshot_branch_stack before
>> + * migrate_disable to save some entries. This is OK because we
>> + * care about the branch trace before entering the BPF program.
>> + * If migrate happens exactly here, there isn't much we can do to
>> + * preserve the data.
>> + */
>> + if (prog->call_get_branch)
>> + static_call(perf_snapshot_branch_stack)(
>> + this_cpu_ptr(&bpf_perf_branch_snapshot));
>
> Here the comment is accurate, but if you recall the calling context
> requirements of perf_snapshot_branch_stack from the last patch, you'll
> see it requires you have at the very least preemption disabled, which
> you just violated.
>
> I think you'll find that (on x86 at least) the suggested
> preempt_disable_notrace() incurs no additional branches.
>
> Still, there is the next point to consider...
>
>> +#endif
>> rcu_read_lock();
>> migrate_disable();
>
> preempt_enable_notrace();
Do we want preempt_enable_notrace() after migrate_disable()? It feels a
little weird to me.
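For my own clarity, here is roughly what the hunk would look like with the two lines you suggest folded in (just a sketch of the ordering, not runnable outside kernel context; the rest of the function is elided):

	u64 notrace __bpf_prog_enter(struct bpf_prog *prog)
		__acquires(RCU)
	{
		/* Suggested: disable preemption first, so that
		 * perf_snapshot_branch_stack's calling-context requirement
		 * (at least preemption disabled) is satisfied.
		 */
		preempt_disable_notrace();
	#ifdef CONFIG_PERF_EVENTS
		if (prog->call_get_branch)
			static_call(perf_snapshot_branch_stack)(
				this_cpu_ptr(&bpf_perf_branch_snapshot));
	#endif
		rcu_read_lock();
		migrate_disable();
		/* Suggested: re-enable preemption only after migrate_disable()
		 * has pinned the task to this CPU.
		 */
		preempt_enable_notrace();
		...
	}

If that ordering is what you have in mind, my question above is whether the preempt_enable_notrace() really belongs after migrate_disable(), or whether it should come earlier.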
>
>> if (unlikely(__this_cpu_inc_return(*(prog->active)) != 1)) {
>
>> @@ -1863,9 +1892,23 @@ void bpf_put_raw_tracepoint(struct bpf_raw_event_map *btp)
>> preempt_enable();
>> }
>>
>> +DEFINE_PER_CPU(struct perf_branch_snapshot, bpf_perf_branch_snapshot);
>> +
>> static __always_inline
>> void __bpf_trace_run(struct bpf_prog *prog, u64 *args)
>> {
>> +#ifdef CONFIG_PERF_EVENTS
>> + /* Calling migrate_disable costs two entries in the LBR. To save
>> + * some entries, we call perf_snapshot_branch_stack before
>> + * migrate_disable to save some entries. This is OK because we
>> + * care about the branch trace before entering the BPF program.
>> + * If migrate happens exactly here, there isn't much we can do to
>> + * preserve the data.
>> + */
>> + if (prog->call_get_branch)
>> + static_call(perf_snapshot_branch_stack)(
>> + this_cpu_ptr(&bpf_perf_branch_snapshot));
>> +#endif
>> cant_sleep();
>
> In the face of ^^^^^^ the comment makes no sense. Still, what are the
> nesting rules for __bpf_trace_run() and __bpf_prog_enter() ? I'm
> thinking the trace one can nest inside an occurence of prog, at which
> point you have pieces.
I think broken LBR records are something we cannot really avoid in the
case of nesting. OTOH, these should be rare cases and will not hurt the
results in most of the use cases.
I should probably tighten the rules in the verifier to only apply this to
__bpf_prog_enter (where we have the primary use case). We can enable it
for other program types when there are other use cases.
Thanks,
Song