lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <3186bf18-a8fd-4b30-a080-61beb13f19f7@linux.dev>
Date: Wed, 13 Dec 2023 22:46:11 -0800
From: Martin KaFai Lau <martin.lau@...ux.dev>
To: Kuniyuki Iwashima <kuniyu@...zon.com>
Cc: andrii@...nel.org, ast@...nel.org, bpf@...r.kernel.org,
 daniel@...earbox.net, dxu@...uu.xyz, edumazet@...gle.com,
 kuni1840@...il.com, netdev@...r.kernel.org,
 Yonghong Song <yonghong.song@...ux.dev>
Subject: Re: [PATCH v5 bpf-next 6/6] selftest: bpf: Test
 bpf_sk_assign_tcp_reqsk().

On 12/13/23 7:18 PM, Kuniyuki Iwashima wrote:
>>> +static int tcp_parse_option(__u32 index, struct tcp_syncookie *ctx)
>>> +{
>>> +	struct tcp_options_received *tcp_opt = &ctx->attr.tcp_opt;
>>> +	char opcode, opsize;
>>> +
>>> +	if (ctx->ptr + 1 > ctx->data_end)
>>> +		goto stop;
>>> +
>>> +	opcode = *ctx->ptr++;
>>> +
>>> +	if (opcode == TCPOPT_EOL)
>>> +		goto stop;
>>> +
>>> +	if (opcode == TCPOPT_NOP)
>>> +		goto next;
>>> +
>>> +	if (ctx->ptr + 1 > ctx->data_end)
>>> +		goto stop;
>>> +
>>> +	opsize = *ctx->ptr++;
>>> +
>>> +	if (opsize < 2)
>>> +		goto stop;
>>> +
>>> +	switch (opcode) {
>>> +	case TCPOPT_MSS:
>>> +		if (opsize == TCPOLEN_MSS && ctx->tcp->syn &&
>>> +		    ctx->ptr + (TCPOLEN_MSS - 2) < ctx->data_end)
>>> +			tcp_opt->mss_clamp = get_unaligned_be16(ctx->ptr);
>>> +		break;
>>> +	case TCPOPT_WINDOW:
>>> +		if (opsize == TCPOLEN_WINDOW && ctx->tcp->syn &&
>>> +		    ctx->ptr + (TCPOLEN_WINDOW - 2) < ctx->data_end) {
>>> +			tcp_opt->wscale_ok = 1;
>>> +			tcp_opt->snd_wscale = *ctx->ptr;
>> When writing to a bitfield of "struct tcp_options_received" which is a kernel
>> struct, it needs to use the CO-RE api. The BPF_CORE_WRITE_BITFIELD has not been
>> landed yet:
>> https://lore.kernel.org/bpf/4d3dd215a4fd57d980733886f9c11a45e1a9adf3.1702325874.git.dxu@dxuuu.xyz/
>>
>> The same for reading bitfield but BPF_CORE_READ_BITFIELD() has already been
>> implemented in bpf_core_read.h
>>
>> Once the BPF_CORE_WRITE_BITFIELD is landed, this test needs to be changed to use
>> the BPF_CORE_{READ,WRITE}_BITFIELD.
> IIUC, the CO-RE api assumes that the offset of bitfields could be changed.
> 
> If the size of struct tcp_cookie_attributes is changed, kfunc will not work
> in this test.  So, BPF_CORE_WRITE_BITFIELD() works only when the size of
> tcp_cookie_attributes is unchanged but fields in tcp_options_received are
> rearranged or expanded to use the unused@ bits ?

Right, CO-RE helps to figure out the offset of a member in the running kernel.

> 
> Also, do we need to use BPF_CORE_READ() for other non-bitfields in
> strcut tcp_options_received (and ecn_ok in struct tcp_cookie_attributes
> just in case other fields are added to tcp_cookie_attributes and ecn_ok
> is rearranged) ?

BPF_CORE_READ is a CO-RE friendly macro for using bpf_probe_read_kernel(). 
bpf_probe_read_kernel() is mostly for the tracing use case where the ptr is not 
safe to read directly.

It is not the case for the tcp_options_received ptr in this tc-bpf use case or 
other stack allocated objects. In general, no need to use BPF_CORE_READ. The 
relocation will be done by the libbpf for tcp_opt->mss_clamp (e.g.).

Going back to bitfield, it needs BPF_CORE_*_BITFIELD because the offset may not 
be right after __attribute__((preserve_access_index)), cc: Yonghong and Andrii 
who know more details than I do.

A verifier error has been reported: 
https://lore.kernel.org/bpf/391d524c496acc97a8801d8bea80976f58485810.1700676682.git.dxu@dxuuu.xyz/.

I also hit an error earlier in 
https://lore.kernel.org/all/20220817061847.4182339-1-kafai@fb.com/ when not 
using BPF_CORE_READ_BITFIELD. I don't exactly remember how the instruction looks 
like but it was reading a wrong value instead of verifier error.

================

Going back to this patch set here.

After sleeping on it longer, I am thinking it is better not to reuse 'struct 
tcp_options_received' (meaning no bitfield) in the bpf_sk_assign_tcp_reqsk() 
kfunc API.

There is not much benefit in reusing 'tcp_options_received'. When new tcp option 
was ever added to tcp_options_received, it is not like bpf_sk_assign_tcp_reqsk 
will support it automatically. It needs to relay this new option back to the 
allocated req. Unlike tcp_sock or req which may have a lot of them such that it 
is useful to have a compact tcp_options_received, the tc-bpf use case here is to 
allocate it once in the stack. Also, not all the members in tcp_options_received 
is useful, e.g. num_sacks, ts_recent_stamp, and user_mss are not used. Leaving 
it there being ignored by bpf_sk_assign_tcp_reqsk is confusing.

How about using a full u8 for each necessary member and directly add them to 
struct tcp_cookie_attributes instead of nesting them into another struct. After 
taking out the unnecessary members, the size may not end up to be much bigger.

The bpf prog can then directly access attr->tstamp_ok more naturally. The 
changes to patch 5 and 6 should be mostly mechanical changes.

I would also rename s/tcp_cookie_attributes/bpf_tcp_req_attrs/.

wdyt?



Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ