Message-ID: <87pneht3re.fsf@nanos.tec.linutronix.de>
Date:   Fri, 14 Feb 2020 19:36:37 +0100
From:   Thomas Gleixner <tglx@...utronix.de>
To:     David Miller <davem@...emloft.net>
Cc:     linux-kernel@...r.kernel.org, bpf@...r.kernel.org,
        netdev@...r.kernel.org, ast@...nel.org, daniel@...earbox.net,
        bigeasy@...utronix.de, peterz@...radead.org, williams@...hat.com,
        rostedt@...dmis.org, juri.lelli@...hat.com, mingo@...nel.org
Subject: Re: [RFC patch 00/19] bpf: Make BPF and PREEMPT_RT co-exist

David Miller <davem@...emloft.net> writes:

> From: Thomas Gleixner <tglx@...utronix.de>
> Date: Fri, 14 Feb 2020 14:39:17 +0100
>
>> This is a follow up to the initial patch series which David posted a
>> while ago:
>> 
>>  https://lore.kernel.org/bpf/20191207.160357.828344895192682546.davem@davemloft.net/
>> 
>> which was (while non-functional on RT) a good starting point for further
>> investigations.
>
> This looks really good after a cursory review, thanks for doing this work.
>
> I was personally unaware of the pre-allocation rules for MAPs used by
> tracing et al.  And that definitely shapes how this should be handled.

Hmm. I just noticed that my analysis only holds for PERF events. But
that's broken on mainline already.

Assume the following simplified callchain:

       kmalloc() from regular non BPF context
         cache empty
           freelist empty
             lock(zone->lock);
                tracepoint or kprobe
                  BPF()
                    update_elem()
                      lock(bucket)
                        kmalloc()
                          cache empty
                            freelist empty
                              lock(zone->lock);  <- DEADLOCK

So really, preallocation _must_ be enforced for all variants of
intrusive instrumentation. There are no ifs and buts, it's simply
mandatory as all intrusive instrumentation has to follow the only
sensible principle: KISS = Keep It Safe and Simple.

The above is a perfectly valid scenario and works with perf and tracing,
so it has to work with BPF in the same safe way.
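
What preallocation buys can be sketched as follows, assuming a hypothetical fixed-size element pool (`toy_pool`, `toy_elem` and `TOY_POOL_SIZE` are made-up names, not the actual BPF map implementation). All memory is set up when the map is created, so the path that runs from instrumentation context only pops from a free list and never enters the allocator. A real implementation would additionally need synchronization that is itself safe in that context, which this sketch omits.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_POOL_SIZE 4		/* hypothetical maximum number of entries */

struct toy_elem {
	int key;
	int value;
	struct toy_elem *next;	/* free list linkage */
};

struct toy_pool {
	struct toy_elem slots[TOY_POOL_SIZE];
	struct toy_elem *free;
};

/* Runs at map creation time, in a context where allocation is safe. */
static void toy_pool_init(struct toy_pool *p)
{
	p->free = NULL;
	for (size_t i = 0; i < TOY_POOL_SIZE; i++) {
		p->slots[i].next = p->free;
		p->free = &p->slots[i];
	}
}

/*
 * Runs from the instrumentation path. It never calls the allocator and
 * returns NULL once the pool is exhausted.
 */
static struct toy_elem *toy_pool_get(struct toy_pool *p)
{
	struct toy_elem *e = p->free;

	if (e)
		p->free = e->next;
	return e;
}

/* Returns an element to the pool; it must have come from this pool. */
static void toy_pool_put(struct toy_pool *p, struct toy_elem *e)
{
	assert(e >= p->slots && e < p->slots + TOY_POOL_SIZE);
	e->next = p->free;
	p->free = e;
}
```

The trade-off is that an update fails when the pool is empty instead of growing the map, which is the price for never touching zone->lock from a tracepoint or kprobe.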

I might be missing some magic enforcement of that, but I got lost in the
maze.

Thanks,

        tglx
