lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 20 Nov 2019 14:47:58 -0800
From:   Andrii Nakryiko <andrii.nakryiko@...il.com>
To:     Alexei Starovoitov <ast@...com>
Cc:     Alexei Starovoitov <alexei.starovoitov@...il.com>,
        Andrii Nakryiko <andriin@...com>, bpf <bpf@...r.kernel.org>,
        Networking <netdev@...r.kernel.org>,
        Daniel Borkmann <daniel@...earbox.net>,
        Kernel Team <Kernel-team@...com>
Subject: Re: [PATCH bpf-next 5/6] libbpf: support libbpf-provided extern variables

On Wed, Nov 20, 2019 at 1:39 PM Alexei Starovoitov <ast@...com> wrote:
>
> On 11/19/19 7:44 PM, Andrii Nakryiko wrote:
> > So, to summarize, we proceed with uint64_t for everything, with added
> > bits of weak vs strong handling. Then in parallel we'll work on adding
> > BTF for externs and __builtin_extern_resolved (and corresponding new
> > kind of BTF relocation) and will keep making this whole API even
> > better, while already having something useful and extensible.
>
> I didn't know extern can be weak. That was a good find.
> Indeed undefined config_* can be represented as uint64 zero value.
> But I still have an issue with 'proceed uint64_t for everything',
> since strings and tri-state don't fit into uint64.

well, great that we are at least aligned w.r.t. int/hex :)

not clear about bool, though. They are subset of tristate, so should
they be 'y'/'n' (which is certainly weird, isn't it?) raw characters
(with zero byte termination? Or just plain single char?) or C's 0 and
1?

>
> How about strtol be applied by libbpf only for things that parse
> successfully (like decimal and hex) and everything else will be
> represented raw ?
> Like CONFIG=y will be 121.
> CONFIG=m will be 109

Again, conceptually, bool is subset of tristate. So if we go with real
bools, then 0 and 1 should match to tristate's y/n as well.

For tristate we also can have 8-byte enum, if you think that's better
than short 0/1/2 (or whatever, -1, if you don't like 2):

enum kconfig_tristate {
        NO = 0,
        YES = 1,
        MODULE = 2,
        __KCONFIG_TRISTATE_INVALID = 0xFFFFFFFFFFFFFFFF
};

> CONFIG="abc" will be 0x636261
> In other words CONFIG_A is an address and
> 'extern weak uint64_t CONFIG_A' reads raw bytes from that location.
> When that CONFIG_A is undefined in /boot/config.gz the u64 read
> from that address will return zero.
> u8 read from that address will return zero too.

I'm puzzled on why strings should be represented as "extern int64_t"?
What about strings longer than 7 bytes (+1 zero terminator)? IMO,
strings should be strings, i.e., const char []. Or const char *, if we
had a support (see my previous email). Cramming arbitrary strings into
uint64_t seems weird and I fail to see the reason.


So just to summarize how I understand situation right now. Please let
me know if that's not true.

1. We proceed with extern uint64_t for what's detected as integer/hex
(at runtime).
2. Weak/strong externs determine how we handle unresolved config
values. Weak - default to 0, strong - ensure it can't be read. But
this might be complicated by our further choices below.

Things that I'd like to clarify. And I'll list options we've discussed.

1. For bools:
  a) parse them as 0/1, for now use them as uint64_t. Once BTF for
externs available, users will be able to pick bool instead.
  b) parse them as 0/1 (false, true), represent as single byte bool.
  c) leave them as raw single character: 'y' or 'n'
    c1) represent as single byte?
    c2) represent as 8 bytes to match uint64_t?
  d) leave them as a raw string (zero-terminated)
    d1) represent as fixed 8-byte string?..
    d2) represent as variable-sized raw string?

2. For tristate, pretty much the same set of questions, except there
is one more state 'm'. One extra option: represent as 8-byte enum,
maybe? We can do 4-byte enum as well, but that has all the downsides
of variable sized externs without yet having BTF type info.

3. For strings:
  a) don't add them yet, if string is detected in runtime, return
error (my original proposal);
  b) add them as up to 7 characters fixed strings you proposed? What
do we do with longer strings?
  c) add them as variable-sized zero-terminated strings (libbpf will
determine size at runtime). See below for consequences.

If we pick any combination where fields are not uniform (so, if we add
strings right now, or have bool as 1 byte, or whatever), we'll have to
specify what size value we fall back to if weakly-defined extern is
not resolved. Is it going to be zero uint64_t? This is wasteful for
what's expected to be a string (and dirty). On the other hand,
specifying it as empty single-byte string is not going to work for
CONFIG_XXX that is undefined integer. Generally speaking, having
variable sized representation is very confusing until we have expected
type information or at least expected byte size.

Given all this, I think realistically we can pick few combinations:

1. Only support int/hex as uint64_t. Anything y/n/m or a string will
fail in runtime.
Pros:
  - we get at least something to use in practice (LINUX_KERNEL_VERSION
and int CONFIG_XXXs).
Cons:
  - undefined weak extern will have to be assumed either uint64_t 0
(and succeed) or undefined string (and fail). No clearly best behavior
here.
  - no ability to do "feature detection" logic in BPF program.
2. Stick to uint64_t for bool/tristate/int/hex. Don't support strings
yet. Improve in backwards compatible way once we get BTF type info for
externs (adding strings at that time).
Pros:
  - well-defined and logical behavior
  - easily extensible once we get BTF for externs. No new flags, no
changes in behavior.
Cons:
  - no strings yet
  - Alexei doesn't like y/n/m as 0/1/2
3. Mix of the options for bool/tristate/string I layed out above.
Pros:
  - we can get reasonable strings (or up to 7-byte ones, but not sure why);
  - bool/tristate can be accessed, though, IMO, in an ugly
character-based fashion.
Cons:
  - some arbitrary choices need to be made, that couldn't be changed
even when we have BTF for externs.
  - needs further discussion and specification to narrow down all the downsides.

My preferences would be 2, if not, then 1.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ