linux-kernel - Re: [PATCH bpf-next v2 2/3] bpf: btf: add btf print functionality

Message-ID: <20180704163106.GA2200@w1t1fb>
Date:   Wed, 4 Jul 2018 17:31:07 +0100
From:   Okash Khawaja <osk@...com>
To:     Jakub Kicinski <jakub.kicinski@...ronome.com>
CC:     Daniel Borkmann <daniel@...earbox.net>,
        Martin KaFai Lau <kafai@...com>,
        Alexei Starovoitov <ast@...nel.org>,
        Yonghong Song <yhs@...com>,
        Quentin Monnet <quentin.monnet@...ronome.com>,
        "David S. Miller" <davem@...emloft.net>, <netdev@...r.kernel.org>,
        <kernel-team@...com>, <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH bpf-next v2 2/3] bpf: btf: add btf print functionality

hi,

On Tue, Jul 03, 2018 at 03:23:31PM -0700, Jakub Kicinski wrote:
> On Tue, 3 Jul 2018 22:46:00 +0100, Okash Khawaja wrote:
> > On Mon, Jul 02, 2018 at 10:06:59PM -0700, Jakub Kicinski wrote:
> > > On Mon, 2 Jul 2018 11:39:15 -0700, Okash Khawaja wrote:  
> > > > +#define BITS_PER_BYTE_MASK (BITS_PER_BYTE - 1)
> > > > +#define BITS_PER_BYTE_MASKED(bits) ((bits) & BITS_PER_BYTE_MASK)  
> > > 
> > > Perhaps it's just me but BIT_OFFSET or BIT_COUNT as a name of this macro
> > > would make it more obvious to parse in the code below.  
> > I don't mind either. However these macro names are also used inside
> > kernel for same purpose. For sake of consistency, I'd recommend we keep
> > them :)
> 
> Ugh, okay :)
> 
> > > > +	} print_num;
> > > > +
> > > > +	total_bits_offset = bit_offset + BTF_INT_OFFSET(int_type);
> > > > +	data += BITS_ROUNDDOWN_BYTES(total_bits_offset);
> > > > +	bit_offset = BITS_PER_BYTE_MASKED(total_bits_offset);
> > > > +	bits_to_copy = bits + bit_offset;
> > > > +	bytes_to_copy = BITS_ROUNDUP_BYTES(bits_to_copy);
> > > > +
> > > > +	print_num.u64_num = 0;
> > > > +	memcpy(&print_num.u64_num, data, bytes_to_copy);  
> > > 
> > > This scheme is unlikely to work on big endian machines...  
> > Can you give an example how?
> 
> On BE:
> 
> Input:         [0x01, 0x82]
> Bit length:    15
> Bytes to copy:  2
> bit_offset:     0
> upper_bits:     7
> 
> print_num.u64_num = 0;
> # [0, 0, 0, 0,   0, 0, 0, 0]
> 
> memcpy(&print_num.u64_num, data, bytes_to_copy);  
> # [0x01, 0x82, 0, 0,   0, 0, 0, 0]
> 
> mask = (1 << upper_bits) - 1;
> # mask = 0x7f
> 
> print_num.u8_nums[bytes_to_copy - 1] &= mask;
> # [0x01, 0x02, 0, 0,   0, 0, 0, 0]
> 
> printf("0x%llx", print_num.u64_num);
> # 0x0102000000000000 AKA 72620543991349248
> # expected:
> # 0x0102             AKA 258
> 
> Am I missing something?
yes you're right, good catch! i'll fix this. thanks very much :)
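
For reference, the placement problem in the trace above can be reproduced on any host by building the 64-bit value byte by byte instead of relying on the machine's own byte order. This is only a sketch: load_be64(), be_copy_to_front() and be_copy_to_back() are hypothetical helper names, not part of the patch, and right-aligning the copied bytes is just one possible remedy, not necessarily what the next revision will do.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Interpret 8 bytes the way a big-endian CPU would read a u64,
 * independent of the host's real byte order. */
static uint64_t load_be64(const uint8_t b[8])
{
	uint64_t v = 0;

	for (int i = 0; i < 8; i++)
		v = (v << 8) | b[i];
	return v;
}

/* Models memcpy(&print_num.u64_num, data, n) on big endian: the
 * copied bytes land at the lowest addresses, which are the most
 * significant bytes of the u64. */
static uint64_t be_copy_to_front(const uint8_t *data, size_t n)
{
	uint8_t buf[8] = { 0 };

	if (n > sizeof(buf))
		n = sizeof(buf);
	memcpy(buf, data, n);
	return load_be64(buf);
}

/* Assumed alternative: right-align the copied bytes so they end up
 * in the least significant bytes of the u64 on big endian. */
static uint64_t be_copy_to_back(const uint8_t *data, size_t n)
{
	uint8_t buf[8] = { 0 };

	if (n > sizeof(buf))
		n = sizeof(buf);
	memcpy(buf + (sizeof(buf) - n), data, n);
	return load_be64(buf);
}
```

With the already-masked bytes { 0x01, 0x02 } and n = 2, be_copy_to_front() yields 0x0102000000000000 (the value from the trace), while be_copy_to_back() yields 0x0102.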

> 
> > > > +	upper_bits = BITS_PER_BYTE_MASKED(bits_to_copy);
> > > > +	if (upper_bits) {
> > > > +		uint8_t mask = (1 << upper_bits) - 1;
> > > > +
> > > > +		print_num.u8_nums[bytes_to_copy - 1] &= mask;
> > > > +	}
> > > > +
> > > > +	print_num.u64_num >>= bit_offset;
> > > > +
> > > > +	if (is_plain_text)
> > > > +		jsonw_printf(jw, "0x%llx", print_num.u64_num);
> > > > +	else
> > > > +		jsonw_printf(jw, "%llu", print_num.u64_num);
> > > > +}
> > > > +
> > > > +static int btf_dumper_int(const struct btf_type *t, uint8_t bit_offset,
> > > > +			  const void *data, json_writer_t *jw,
> > > > +			  bool is_plain_text)
> > > > +{
> > > > +	uint32_t *int_type = (uint32_t *)(t + 1);
> > > > +	uint32_t bits = BTF_INT_BITS(*int_type);
> > > > +	int ret = 0;
> > > > +
> > > > +	/* if this is bit field */
> > > > +	if (bit_offset || BTF_INT_OFFSET(*int_type) ||
> > > > +	    BITS_PER_BYTE_MASKED(bits)) {
> > > > +		btf_dumper_int_bits(*int_type, bit_offset, data, jw,
> > > > +				    is_plain_text);
> > > > +		return ret;
> > > > +	}
> > > > +
> > > > +	switch (BTF_INT_ENCODING(*int_type)) {
> > > > +	case 0:
> > > > +		if (BTF_INT_BITS(*int_type) == 64)
> > > > +			jsonw_printf(jw, "%lu", *((uint64_t *)data));
> > > > +		else if (BTF_INT_BITS(*int_type) == 32)
> > > > +			jsonw_printf(jw, "%u", *((uint32_t *)data));
> > > > +		else if (BTF_INT_BITS(*int_type) == 16)
> > > > +			jsonw_printf(jw, "%hu", *((uint16_t *)data));
> > > > +		else if (BTF_INT_BITS(*int_type) == 8)
> > > > +			jsonw_printf(jw, "%hhu", *((uint8_t *)data));
> > > > +		else
> > > > +			btf_dumper_int_bits(*int_type, bit_offset, data, jw,
> > > > +					    is_plain_text);
> > > > +		break;
> > > > +	case BTF_INT_SIGNED:
> > > > +		if (BTF_INT_BITS(*int_type) == 64)
> > > > +			jsonw_printf(jw, "%ld", *((int64_t *)data));
> > > > +		else if (BTF_INT_BITS(*int_type) == 32)
> > > > +			jsonw_printf(jw, "%d", *((int32_t *)data));
> > > > +		else if (BTF_INT_BITS(*int_type) ==  16)  
> > > 
> > > Please drop the double space.  Both for 16 where it makes no sense and
> > > for 8 where it's marginally useful but not really.
> > >   
> > > > +			jsonw_printf(jw, "%hd", *((int16_t *)data));
> > > > +		else if (BTF_INT_BITS(*int_type) ==  8)
> > > > +			jsonw_printf(jw, "%hhd", *((int8_t *)data));
> > > > +		else
> > > > +			btf_dumper_int_bits(*int_type, bit_offset, data, jw,
> > > > +					    is_plain_text);
> > > > +		break;
> > > > +	case BTF_INT_CHAR:
> > > > +		if (*((char *)data) == '\0')
> > > > +			jsonw_null(jw);  
> > > 
> > > Mm.. I don't think 0 char is equivalent to null.  
> > Yes, thanks. Will fix.
> > 
> > >   
> > > > +		else if (isprint(*((char *)data)))
> > > > +			jsonw_printf(jw, "\"%c\"", *((char *)data));  
> > > 
> > > This looks very suspicious.  So if I see a "6" for a char field it's
> > > either a 6 ('\u0006') or a 54 ('6')...  
> > > It will always be 54. Maybe I missed your point. Could you explain why
> > > it would be other than 54?
> 
> Ah, I think I missed that %c is in quotes...
> 
> > > > +		else
> > > > +			if (is_plain_text)
> > > > +				jsonw_printf(jw, "%hhx", *((char *)data));
> 
> This seems to be missing a "0x" prefix?
yes it does. will add 0x.

> 
> > > > +			else
> > > > +				jsonw_printf(jw, "%hhd", *((char *)data));  
> > > 
> > > ... I think you need to always print a string, and express it as
> > > \u00%02hhx for non-printable.  
> > Okay that makes sense
> 
> Yeah, IDK, char can be used as a byte as well as a string.  In eBPF
> it may actually be more likely to just be used as a raw byte buffer...
> Either way I think it may be nice to keep it consistent, at least for
> the JSON output could we do either always ints or always characters?
yes, makes sense. i'll keep them always characters.
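
A rough sketch of what "always characters" could look like, following the \u00%02hhx suggestion quoted above. format_json_char() is a hypothetical helper that writes into a caller-supplied buffer; it is not the jsonw_* API, and the real change may well be structured differently. It additionally escapes '"' and '\\', which is an assumption on my side: JSON requires it, but the json writer may already handle that.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

/* Format one char as a JSON string so the output type is always a
 * string: printable characters are emitted literally, everything
 * else as a \u00XX escape. Returns what snprintf() returns. */
static int format_json_char(char c, char *out, size_t len)
{
	/* Cast first: passing a negative char to isprint() is undefined. */
	unsigned char uc = (unsigned char)c;

	/* Quote and backslash must be escaped inside a JSON string. */
	if (uc == '"' || uc == '\\')
		return snprintf(out, len, "\"\\%c\"", uc);

	if (isprint(uc))
		return snprintf(out, len, "\"%c\"", uc);

	/* Non-printable byte, including '\0': emit "\u00XX". */
	return snprintf(out, len, "\"\\u00%02x\"", (unsigned int)uc);
}
```

With this, the character '6' (54) comes out as "6" and the raw byte 6 comes out as "\u0006", so the two cases discussed above stay distinguishable, and a zero byte becomes "\u0000" instead of a JSON null.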