Message-ID: <87pneht3re.fsf@nanos.tec.linutronix.de>
Date: Fri, 14 Feb 2020 19:36:37 +0100
From: Thomas Gleixner <tglx@...utronix.de>
To: David Miller <davem@...emloft.net>
Cc: linux-kernel@...r.kernel.org, bpf@...r.kernel.org,
netdev@...r.kernel.org, ast@...nel.org, daniel@...earbox.net,
bigeasy@...utronix.de, peterz@...radead.org, williams@...hat.com,
rostedt@...dmis.org, juri.lelli@...hat.com, mingo@...nel.org
Subject: Re: [RFC patch 00/19] bpf: Make BPF and PREEMPT_RT co-exist
David Miller <davem@...emloft.net> writes:
> From: Thomas Gleixner <tglx@...utronix.de>
> Date: Fri, 14 Feb 2020 14:39:17 +0100
>
>> This is a follow up to the initial patch series which David posted a
>> while ago:
>>
>> https://lore.kernel.org/bpf/20191207.160357.828344895192682546.davem@davemloft.net/
>>
>> which was (while non-functional on RT) a good starting point for further
>> investigations.
>
> This looks really good after a cursory review, thanks for doing this work.
>
> I was personally unaware of the pre-allocation rules for MAPs used by
> tracing et al. And that definitely shapes how this should be handled.
Hmm. I just noticed that my analysis only holds for PERF events. But
that's broken on mainline already.

Assume the following simplified callchain:

    kmalloc() from regular non BPF context
      cache empty
      freelist empty
      lock(zone->lock);
        tracepoint or kprobe
          BPF()
            update_elem()
              lock(bucket)
                kmalloc()
                  cache empty
                  freelist empty
                  lock(zone->lock);  <- DEADLOCK

So really, preallocation _must_ be enforced for all variants of
intrusive instrumentation. There is no if and but, it's simply mandatory
as all intrusive instrumentation has to follow the only sensible
principle: KISS = Keep It Safe and Simple.

The above is a perfectly valid scenario and works with perf and tracing,
so it has to work with BPF in the same safe way.
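
The callchain and the preallocation argument can be illustrated with a
small userspace sketch. This is not kernel code and does not claim to
show how the BPF map implementation works. All names (toy_lock,
update_elem_alloc, update_elem_prealloc, kmalloc_slowpath_with_probe)
are hypothetical, and a failed trylock stands in for what would be a
hard deadlock on a real non-recursive spinlock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy non-recursive lock (hypothetical). A second acquisition from the
 * same context fails here; a real spinlock would spin forever. */
struct toy_lock { bool held; };

static bool toy_trylock(struct toy_lock *l)
{
    if (l->held)
        return false;
    l->held = true;
    return true;
}

static void toy_unlock(struct toy_lock *l)
{
    assert(l->held);
    l->held = false;
}

static struct toy_lock zone_lock;   /* stands in for zone->lock */
static struct toy_lock bucket_lock; /* stands in for the map bucket lock */

#define POOL_SIZE 4
static int pool[POOL_SIZE];         /* elements set aside up front */
static size_t pool_used;

enum update_result { UPDATE_OK, UPDATE_DEADLOCK, UPDATE_NOMEM };
typedef enum update_result (*probe_fn)(void);

/* update_elem() with run-time allocation: the nested allocation hits
 * the slow path and needs zone_lock again. */
static enum update_result update_elem_alloc(void)
{
    enum update_result ret = UPDATE_OK;

    if (!toy_trylock(&bucket_lock))
        return UPDATE_DEADLOCK;
    if (!toy_trylock(&zone_lock))
        ret = UPDATE_DEADLOCK;      /* zone_lock already held by caller */
    else
        toy_unlock(&zone_lock);
    toy_unlock(&bucket_lock);
    return ret;
}

/* update_elem() with preallocation: takes an element from the pool and
 * never touches zone_lock. */
static enum update_result update_elem_prealloc(void)
{
    enum update_result ret = UPDATE_NOMEM;

    if (!toy_trylock(&bucket_lock))
        return UPDATE_DEADLOCK;
    if (pool_used < POOL_SIZE) {
        pool[pool_used++] = 1;      /* mark one preallocated slot used */
        ret = UPDATE_OK;
    }
    toy_unlock(&bucket_lock);
    return ret;
}

/* Outer kmalloc() slow path: takes zone_lock, then a tracepoint or
 * kprobe fires and runs the attached probe while the lock is held. */
static enum update_result kmalloc_slowpath_with_probe(probe_fn probe)
{
    enum update_result ret;

    if (!toy_trylock(&zone_lock))
        return UPDATE_DEADLOCK;
    ret = probe();
    toy_unlock(&zone_lock);
    return ret;
}
```

Calling kmalloc_slowpath_with_probe(update_elem_alloc) reports the
nested zone_lock acquisition, while the same call with
update_elem_prealloc completes because the probe never enters the
allocator.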

I might be missing some magic enforcement of that, but I got lost in the
maze.

Thanks,

        tglx