Message-ID: <20180706163041.xstyfednmgho23m3@ast-mbp.dhcp.thefacebook.com>
Date:   Fri, 6 Jul 2018 09:30:42 -0700
From:   Alexei Starovoitov <alexei.starovoitov@...il.com>
To:     Jakub Kicinski <jakub.kicinski@...ronome.com>
Cc:     Daniel Borkmann <daniel@...earbox.net>,
        Saeed Mahameed <saeedm@...lanox.com>,
        "saeedm@....mellanox.co.il" <saeedm@....mellanox.co.il>,
        "alexander.h.duyck@...el.com" <alexander.h.duyck@...el.com>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        Tariq Toukan <tariqt@...lanox.com>,
        "john.fastabend@...il.com" <john.fastabend@...il.com>,
        "brouer@...hat.com" <brouer@...hat.com>,
        "borkmann@...earbox.net" <borkmann@...earbox.net>,
        "peter.waskiewicz.jr@...el.com" <peter.waskiewicz.jr@...el.com>
Subject: Re: [RFC bpf-next 2/6] net: xdp: RX meta data infrastructure

On Thu, Jul 05, 2018 at 10:18:23AM -0700, Jakub Kicinski wrote:
> 
> I'm also not 100% on board with the argument that "future" FW can
> reshuffle things whatever way it wants to.  Is the assumption that
> future ASICs/FW will be designed to always use the "blessed" BTF
> format?  Or will it be reconfigurable at runtime?

Let's set configuration of metadata aside for a second.

Describing metadata layout in BTF allows NICs to disclose everything
the NIC has to users in a standard and generic way.
Whether firmware is reconfigurable on the fly or has to be reflashed
and hw powercycled to have a new md layout (and corresponding BTF description)
is a separate discussion.
Saeed's proposal introduces the concept of 'offset' inside 'struct xdp_md_info'
to reach 'hash' value in metadata.
Essentially it's a run-time way to access 'hash' instead of build-time.
So the bpf program would need two loads (first the offset, then the field itself)
to read the csum or hash field instead of one.
With BTF the layout of metadata is known to the program at build-time.
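
To make the one-load vs two-load difference concrete, here is a minimal
plain C sketch. The struct names and fields below are hypothetical and are
not the actual 'struct xdp_md_info' from Saeed's patches nor any real
generated header; they only illustrate run-time vs build-time offsets.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical run-time descriptor: the offset of 'hash' is data. */
struct md_info_sketch {
	uint16_t hash_offset;
};

/* Hypothetical build-time layout, as a generated header might declare it. */
struct md_layout_sketch {
	uint32_t csum;
	uint32_t hash;
};

/* Run-time access: load the offset first, then load the field (two loads). */
static uint32_t read_hash_runtime(const struct md_info_sketch *info,
				  const uint8_t *md)
{
	uint32_t hash;

	memcpy(&hash, md + info->hash_offset, sizeof(hash));
	return hash;
}

/* Build-time access: the offset is a compile-time constant (one load). */
static uint32_t read_hash_buildtime(const struct md_layout_sketch *md)
{
	return md->hash;
}
```

Both helpers return the same value when hash_offset equals
offsetof(struct md_layout_sketch, hash); the difference is only in how the
offset is obtained.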

To reiterate the proposal:
- driver+firmware keep layout of the metadata in BTF format (either in the driver
  or driver can read it from firmware)
- 'bpftool read-metadata-desc eth0 > md_desc.h' command will query the driver and
  generate a normal C header file based on BTF in the given NIC
- user does #include "md_desc.h" and bpf program can access md->csum or md->hash
  with direct single load out of metadata area in front of the packet
- llvm compiles the bpf program and records how the program does these md->csum
  accesses in BTF format as well (the compiler will be keeping such records
  for __sk_buff and all other structs too, but that's a separate discussion)
- during sys_bpf(prog_load) the kernel checks (via supplied BTF) that the way the program
  accesses metadata (and other structs) matches BTF from the driver,
  so no surprises if driver+firmware got updated, but program is not recompiled
- every NIC can have its own layout of metadata and its own meaning of the fields,
  but it would be good to standardize at least a few common fields like hash
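
The prog_load check in the list above could look roughly like the sketch
below. It is not real BTF and not kernel code: 'struct md_field' is a
simplified stand-in for one BTF struct member (name, byte offset, byte size),
and md_layout_matches() is a hypothetical helper name.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for one BTF struct member. */
struct md_field {
	const char *name;
	uint32_t offset;	/* byte offset inside the metadata area */
	uint32_t size;		/* byte size of the field */
};

static const struct md_field *find_field(const struct md_field *fields,
					 size_t n, const char *name)
{
	for (size_t i = 0; i < n; i++)
		if (strcmp(fields[i].name, name) == 0)
			return &fields[i];
	return NULL;
}

/*
 * Return true only if every access recorded for the program matches the
 * layout reported by the driver: same name, same offset, same size.
 */
static bool md_layout_matches(const struct md_field *prog, size_t n_prog,
			      const struct md_field *drv, size_t n_drv)
{
	for (size_t i = 0; i < n_prog; i++) {
		const struct md_field *d = find_field(drv, n_drv, prog[i].name);

		if (!d || d->offset != prog[i].offset || d->size != prog[i].size)
			return false;
	}
	return true;
}
```

A program compiled against an old layout where 'hash' sits at offset 4 would
be rejected once the driver reports 'hash' at offset 0, which is the "no
surprises" property.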

Once this is working we can do more cool things with BTF.
Like doing offset rewriting at program load time similar to what we plan
to do for tracing. Tracing programs will be doing 'task->pid' access
and the kernel will adjust offsetof(struct task_struct, pid) during program load
depending on BTF for the kernel.
We can do the same trick for networking metadata.
The program will contain a load instruction for md->hash that will get automatically
adjusted to the proper offset depending on BTF of the 'hash' field in the NIC.
For now I'm proposing _not_ to go that far with offset rewriting and start
with simple steps described above.
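
For reference, the offset rewriting idea can be modeled in a few lines of
plain C. This is a toy model and an assumption about the shape of the
mechanism, not the kernel's implementation: 'struct toy_load_insn' is not a
real bpf instruction, and relocate_loads() is a hypothetical helper.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Field name and its offset as described by the NIC's layout. */
struct nic_field {
	const char *name;
	int16_t offset;
};

/*
 * Toy load instruction: 'off' is the offset the compiler emitted,
 * 'field' names the member the load refers to.
 */
struct toy_load_insn {
	int16_t off;
	const char *field;
};

/*
 * Patch every load so that 'off' matches the NIC's layout.
 * Returns 0 on success, -1 if the NIC does not provide some field.
 */
static int relocate_loads(struct toy_load_insn *insns, size_t n_insns,
			  const struct nic_field *fields, size_t n_fields)
{
	for (size_t i = 0; i < n_insns; i++) {
		int found = 0;

		for (size_t j = 0; j < n_fields; j++) {
			if (strcmp(insns[i].field, fields[j].name) == 0) {
				insns[i].off = fields[j].offset;
				found = 1;
				break;
			}
		}
		if (!found)
			return -1;
	}
	return 0;
}
```

With rewriting, a program built with 'hash' at offset 4 would still load on a
NIC that places 'hash' at offset 12, instead of being rejected by the layout
check.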
