netdev - Re: [PATCH net] bpf: expose netns inode to bpf programs


Message-ID: <CALCETrXgdY_Kt8wn4uiATUnNJ3YXttCUREgEeQReG7u29Lc44g@mail.gmail.com>
Date:   Thu, 26 Jan 2017 11:07:01 -0800
From:   Andy Lutomirski <luto@...capital.net>
To:     Alexei Starovoitov <ast@...com>
Cc:     Linus Torvalds <torvalds@...ux-foundation.org>,
        "Eric W. Biederman" <ebiederm@...ssion.com>,
        "David S . Miller" <davem@...emloft.net>,
        Daniel Borkmann <daniel@...earbox.net>,
        David Ahern <dsa@...ulusnetworks.com>,
        Tejun Heo <tj@...nel.org>, Thomas Graf <tgraf@...g.ch>,
        Network Development <netdev@...r.kernel.org>
Subject: Re: [PATCH net] bpf: expose netns inode to bpf programs

On Thu, Jan 26, 2017 at 10:32 AM, Alexei Starovoitov <ast@...com> wrote:
> On 1/26/17 10:12 AM, Andy Lutomirski wrote:
>>
>> On Thu, Jan 26, 2017 at 9:46 AM, Alexei Starovoitov <ast@...com> wrote:
>>>
>>> On 1/26/17 8:37 AM, Andy Lutomirski wrote:
>>>>>
>>>>>
>>>>> Think of bpf programs as safe kernel modules. They don't have
>>>>> confined boundaries and program authors, if not careful, can shoot
>>>>> themselves in the foot. We're not trying to prevent that because
>>>>> it's impossible to check that the program is sane. Just like
>>>>> it's impossible to check that kernel module is sane.
>>>>> But in case of bpf we check that bpf program is _safe_ from the kernel
>>>>> point of view. If it's doing some garbage, it's program's business.
>>>>> Does it make more sense now?
>>>>>
>>>>
>>>> With all due respect, I think this is not an acceptable way to think
>>>> about BPF at all.  If you think of BPF this way, I think there needs
>>>> to be a real discussion at KS or similar as to whether this is okay.
>>>> The reason is simple: the kernel promises a stable ABI to userspace
>>>> but not to kernel modules.  By thinking of BPF as more like a module,
>>>> you're taking a big shortcut that will either result in ABI breakage
>>>> down the road or in committing to a problematic stable ABI.
>>>
>>>
>>>
>>> you misunderstood the analogy.
>>> bpf abi is certainly stable. that's why we were careful of not
>>> exposing anything to it that is not already stable.
>>>
>>
>> In that case I don't understand what you're trying to say.  Eric
>> thinks your patch exposes a bad interface.  A bad interface for
>> userspace is a very different thing from a bad interface available to
>> kernel modules.  Are you saying that BPF is kernel-module-like in that
>> the ABI exposed to BPF programs doesn't need to meet the same quality
>> standards as userspace ABIs?
>
>
> of course not.
> ns.inum is already exposed to user space as a value.
> This patch exposes it to bpf program in a convenient and stable way,

Here's what I'm imagining Eric is thinking:

ns.inum is currently exposed to userspace via procfs.  In principle,
though, the value could be local to a namespace, which would let CRIU
preserve namespace inode numbers across a checkpoint+restore
operation.  If that happened, the contained and restored procfs would
see a different inode number than the outermost procfs.
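
For reference, a minimal sketch of how userspace sees this value today,
assuming Linux with procfs mounted at /proc.  The helper names
(parse_ns_link, netns_inum) are hypothetical and only for illustration;
the "net:[<inum>]" link format is the one documented in namespaces(7).

```python
import os
import re


def parse_ns_link(target):
    """Split a namespace link target such as 'net:[4026531992]'
    into (namespace type, inode number)."""
    m = re.fullmatch(r"(\w+):\[(\d+)\]", target)
    if m is None:
        raise ValueError(f"not a namespace link target: {target!r}")
    return m.group(1), int(m.group(2))


def netns_inum(pid="self"):
    """Return the net namespace inode number as this procfs reports it.

    Assumption: /proc/<pid>/ns/net exists and is readable by the caller.
    """
    # stat() follows the magic link, so st_ino is the namespace inode.
    return os.stat(f"/proc/{pid}/ns/net").st_ino


if __name__ == "__main__":
    try:
        print("readlink:", parse_ns_link(os.readlink("/proc/self/ns/net")))
        print("stat:    ", netns_inum())
    except OSError as exc:
        print("net namespace link not available:", exc)
```

Both paths go through procfs, which is what would leave room for the
reported number to depend on the procfs instance that is asked.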

If you start exposing the raw ns.inum field to BPF programs and those
programs are not themselves scoped to a namespace, then this could
create a problem for CRIU.

But you told Eric that his nack doesn't matter, and maybe it would be
nice to ask him to clarify instead.