netdev - Re: [PATCH bpf-next 6/8] bpf: tcp: bpf iter batching and lock

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <6e5864b8-ab64-68a6-b2d3-d5c9b9468f57@fb.com>
Date:   Tue, 29 Jun 2021 11:55:52 -0700
From:   Yonghong Song <yhs@...com>
To:     Martin KaFai Lau <kafai@...com>
CC:     <bpf@...r.kernel.org>, Alexei Starovoitov <ast@...nel.org>,
        Daniel Borkmann <daniel@...earbox.net>,
        Eric Dumazet <edumazet@...gle.com>, <kernel-team@...com>,
        Neal Cardwell <ncardwell@...gle.com>, <netdev@...r.kernel.org>,
        Yuchung Cheng <ycheng@...gle.com>
Subject: Re: [PATCH bpf-next 6/8] bpf: tcp: bpf iter batching and lock_sock



On 6/29/21 11:06 AM, Martin KaFai Lau wrote:
> On Tue, Jun 29, 2021 at 10:57:46AM -0700, Yonghong Song wrote:
>>
>>
>> On 6/29/21 10:44 AM, Martin KaFai Lau wrote:
>>> On Tue, Jun 29, 2021 at 10:27:17AM -0700, Yonghong Song wrote:
>>> [ ... ]
>>>
>>>>> +static int bpf_iter_tcp_realloc_batch(struct bpf_tcp_iter_state *iter,
>>>>> +				      unsigned int new_batch_sz)
>>>>> +{
>>>>> +	struct sock **new_batch;
>>>>> +
>>>>> +	new_batch = kvmalloc(sizeof(*new_batch) * new_batch_sz, GFP_USER);
>>>>
>>>> Since we return -ENOMEM below, should we have __GFP_NOWARN in kvmalloc
>>>> flags?
>>> will add in v2.
>>>
>>>>
>>>>> +	if (!new_batch)
>>>>> +		return -ENOMEM;
>>>>> +
>>>>> +	bpf_iter_tcp_put_batch(iter);
>>>>> +	kvfree(iter->batch);
>>>>> +	iter->batch = new_batch;
>>>>> +	iter->max_sk = new_batch_sz;
>>>>> +
>>>>> +	return 0;
>>>>> +}
>>>>> +
>>>> [...]
>>>>> +
>>>>>     static int bpf_iter_tcp_seq_show(struct seq_file *seq, void *v)
>>>>>     {
>>>>>     	struct bpf_iter_meta meta;
>>>>>     	struct bpf_prog *prog;
>>>>>     	struct sock *sk = v;
>>>>> +	bool slow;
>>>>>     	uid_t uid;
>>>>> +	int ret;
>>>>>     	if (v == SEQ_START_TOKEN)
>>>>>     		return 0;
>>>>> +	if (sk_fullsock(sk))
>>>>> +		slow = lock_sock_fast(sk);
>>>>> +
>>>>> +	if (unlikely(sk_unhashed(sk))) {
>>>>> +		ret = SEQ_SKIP;
>>>>> +		goto unlock;
>>>>> +	}
>>>>
>>>> I am not a tcp expert. Maybe a dummy question.
>>>> Is it possible to do setsockopt() for listening socket?
>>>> What will happen if the listening sock is unhashed after the
>>>> above check?
>>> It won't happen because the sk has been locked before doing the
>>> unhashed check.
>>
>> Ya, that is true. I guess I probably mean TCP_TIME_WAIT and
>> TCP_NEW_SYN_RECV sockets. We cannot do setsockopt() for
>> TCP_TIME_WAIT sockets since user space shouldn't be able
>> to access the socket any more.
>>
>> But how about TCP_NEW_SYN_RECV sockets?
> _bpf_setsockopt() will return -EINVAL for non fullsock.

That makes sense. I think whether we could block calling 
bpf_setsockopt() for unsupported sockets outside bpf program.
But indeed letting bpf to do filtering in such cases should
be simpler.