Message-ID: <be239a5581e5b7d5c6f310c2a4c11282aa5896b5.camel@gmail.com>
Date: Wed, 17 Jul 2024 14:10:35 -0700
From: Eduard Zingerman <eddyz87@...il.com>
To: Shung-Hsi Yu <shung-hsi.yu@...e.com>, Xu Kuohai
<xukuohai@...weicloud.com>
Cc: bpf@...r.kernel.org, netdev@...r.kernel.org, Alexei Starovoitov
<ast@...nel.org>, Andrii Nakryiko <andrii@...nel.org>, Daniel Borkmann
<daniel@...earbox.net>, Martin KaFai Lau <martin.lau@...ux.dev>, Song Liu
<song@...nel.org>, Yonghong Song <yonghong.song@...ux.dev>, John Fastabend
<john.fastabend@...il.com>, KP Singh <kpsingh@...nel.org>, Stanislav
Fomichev <sdf@...gle.com>, Hao Luo <haoluo@...gle.com>, Jiri Olsa
<jolsa@...nel.org>, Roberto Sassu <roberto.sassu@...wei.com>, Edward Cree
<ecree.xilinx@...il.com>, Eric Dumazet <edumazet@...gle.com>, Jakub
Kicinski <kuba@...nel.org>, Harishankar Vishwanathan
<harishankar.vishwanathan@...il.com>, Santosh Nagarakatte
<santosh.nagarakatte@...gers.edu>, Srinivas Narayana
<srinivas.narayana@...gers.edu>, Matan Shachnai <m.shachnai@...gers.edu>
Subject: Re: [RFC bpf-next] bpf, verifier: improve signed ranges inference
for BPF_AND
On Tue, 2024-07-16 at 22:52 +0800, Shung-Hsi Yu wrote:
[...]
> To allow verification of such an instruction pattern, update
> scalar*_min_max_and() to infer signed ranges directly from the signed
> ranges of the operands. With BPF_AND, the resulting value can only gain
> more unset '0' bits, thus its bit pattern only moves toward
> 0x0000000000000000. The difficulty lies in how to deal with signs. While
> a non-negative (positive or zero) value simply grows smaller, a negative
> number can grow smaller, but may also underflow and become a larger value.
>
> To better address this situation we split the signed ranges into
> negative range and non-negative range cases, ignoring the mixed sign
> cases for now; and only consider how to calculate smax_value.
>
> Since negative range & negative range preserves the sign bit, we know
> the result is still a negative value, thus it only moves toward S64_MIN
> but never underflows, so a safe bet is to use the value in the ranges
> that is closest to 0, thus "max(dst_reg->smax_value, src_reg->smax_value)".
> For negative range & non-negative range the sign bit is always cleared,
> thus we know the result is non-negative and only moves toward 0, so a
> safe bet is to use the smax_value of the non-negative range. Last but not
> least, non-negative range & non-negative range is still a non-negative
> value and only moves toward 0; however, same as the unsigned range
> case, the maximum is actually capped by the lesser of the two, and thus
> min(dst_reg->smax_value, src_reg->smax_value).
>
> Listing out the above reasoning as a table (dst_reg abbreviated as dst,
> src_reg abbreviated as src, smax_value abbreviated as smax) we get:
>
>                         |                         src_reg
>        smax = ?         +---------------------------+---------------------------
>                         |         negative          |       non-negative
> ---------+--------------+---------------------------+---------------------------
>          | negative     | max(dst->smax, src->smax) | src->smax
>  dst_reg +--------------+---------------------------+---------------------------
>          | non-negative | dst->smax                 | min(dst->smax, src->smax)
>
> However, this is quite complicated. Luckily, it can be simplified given
> the following observations:
>
> max(dst_reg->smax_value, src_reg->smax_value) >= src_reg->smax_value
> max(dst_reg->smax_value, src_reg->smax_value) >= dst_reg->smax_value
> max(dst_reg->smax_value, src_reg->smax_value) >= min(dst_reg->smax_value, src_reg->smax_value)
>
> So we could substitute the cells in the table above all with max(...),
> and arrive at:
>
>                         |                         src_reg
>        smax' = ?        +---------------------------+---------------------------
>                         |         negative          |       non-negative
> ---------+--------------+---------------------------+---------------------------
>          | negative     | max(dst->smax, src->smax) | max(dst->smax, src->smax)
>  dst_reg +--------------+---------------------------+---------------------------
>          | non-negative | max(dst->smax, src->smax) | max(dst->smax, src->smax)
>
> Meaning that simply using
>
> max(dst_reg->smax_value, src_reg->smax_value)
>
> to calculate the resulting smax_value would work across all sign combinations.
>
>
> For smin_value, we know that non-negative range & non-negative range
> and negative range & non-negative range both result in a non-negative
> value, so an easy guess is to use the minimum non-negative value, thus 0.
>
>                         |                          src_reg
>        smin = ?         +----------------------------+---------------------------
>                         |          negative          |       non-negative
> ---------+--------------+----------------------------+---------------------------
>          | negative     | ?                          | 0
>  dst_reg +--------------+----------------------------+---------------------------
>          | non-negative | 0                          | 0
>
> This leaves the negative range & negative range case to be considered. We
> know that negative range & negative range always yields a negative value,
> so a preliminary guess would be S64_MIN. However, that guess is too
> imprecise to help with the r0 <<= 62, r0 s>>= 63, r0 &= -13 pattern
> we're trying to deal with here.
>
> This can be further improved with the observation that for negative range
> & negative range, the smallest possible value must be one that has the
> longest _common_ most-significant set '1' bits sequence, thus we can use
> min(dst_reg->smin_value, src_reg->smin_value) as the starting point, as
> the smaller value will be the one with the shorter most-significant set
> '1' bits sequence. But that alone is not enough, as we do not know whether
> the rest of the bits would be set, so the safest guess would be one that
> clears all bits after the most-significant set '1' bits sequence,
> something akin to bit_floor(), but for rounding to a negative power-of-2
> instead.
>
> negative_bit_floor(0xffff000000000003) == 0xffff000000000000
> negative_bit_floor(0xf0ff0000ffff0000) == 0xf000000000000000
> negative_bit_floor(0xfffffb0000000000) == 0xfffff80000000000
>
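The rounding described above can be sketched as follows. This is a minimal Python model, not the patch's kernel C: only the name negative_bit_floor() comes from the quoted text, while the `bits` parameter and the implementation are assumptions made for illustration. It treats the value as an unsigned bit pattern and keeps only the leading run of '1' bits.

```python
def negative_bit_floor(v: int, bits: int = 64) -> int:
    """Keep the leading run of set bits of a `bits`-wide pattern, clear the rest.

    `v` is an unsigned bit pattern (for example 0xffff000000000003).
    A pattern whose top bit is clear (a non-negative value) yields 0.
    """
    mask = (1 << bits) - 1
    inverted = ~v & mask
    # bit_length() of the inverted pattern is the position (counted from 1)
    # of the most significant '0' bit in v; that bit and everything below it
    # get cleared.
    cleared = inverted.bit_length()
    return mask & ~((1 << cleared) - 1)


# The three examples from the quoted description.
assert negative_bit_floor(0xffff000000000003) == 0xffff000000000000
assert negative_bit_floor(0xf0ff0000ffff0000) == 0xf000000000000000
assert negative_bit_floor(0xfffffb0000000000) == 0xfffff80000000000
```

In this model a pattern of all ones (-1) is returned unchanged, since it has no '0' bit to round at.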
> With negative range & negative range solved, we now have:
>
>                         |                           src_reg
>        smin = ?         +------------------------------+---------------------------
>                         |           negative           |       non-negative
> ---------+--------------+------------------------------+---------------------------
>          | negative     | negative_bit_floor(          | 0
>          |              |   min(dst->smin, src->smin)) |
>  dst_reg +--------------+------------------------------+---------------------------
>          | non-negative | 0                            | 0
>
> This can be further simplified since min(dst->smin, src->smin) < 0 when both
> dst_reg and src_reg have a negative range, which means using
>
> negative_bit_floor(min(dst_reg->smin_value, src_reg->smin_value))
>
> to calculate the resulting smin_value would work across all sign combinations.
>
> Together these allow us to infer the signed range of the result of a BPF_AND
> operation using the signed ranges of its operands.
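As a worked example of the two formulas above, here is a small Python model applied to the r0 &= -13 step of the pattern mentioned in the description. It is only a sketch: and_signed_bounds() is a hypothetical helper name, Python's unbounded signed integers stand in for s64, and the input range [-1, 0] for r0 after the arithmetic shift is an assumption, not something stated in the quoted text.

```python
def negative_bit_floor(v: int) -> int:
    """Round a negative value down to a negative power of 2; return 0 otherwise."""
    if v >= 0:
        return 0
    # ~v is non-negative; its bit length is the position of the highest
    # '0' bit of v, which is where the leading run of '1' bits ends.
    return -(1 << (~v).bit_length())


def and_signed_bounds(dst, src):
    """Signed bounds of dst & src, following the formulas in the description."""
    dst_smin, dst_smax = dst
    src_smin, src_smax = src
    smin = negative_bit_floor(min(dst_smin, src_smin))
    smax = max(dst_smax, src_smax)
    return smin, smax


r0 = (-1, 0)       # assumed range of r0 after "r0 s>>= 63"
imm = (-13, -13)   # the constant operand of "r0 &= -13"
smin, smax = and_signed_bounds(r0, imm)
print(smin, smax)  # -16 0

# The two concrete results, -1 & -13 and 0 & -13, both fall inside the bounds.
assert all(smin <= (a & -13) <= smax for a in (-1, 0))
```

The inferred range [-16, 0] is wider than the two reachable values (-13 and 0), but far tighter than the S64_MIN fallback.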
Hi Shung-Hsi,
This seems quite elegant.
As an additional check, I ran a simple brute-force test over all possible
ranges of 6-bit integers, and the bounds are computed safely.
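A check of this kind could look roughly like the sketch below. It is an illustrative Python model, not the program used for the check mentioned above: find_counterexample() and its `width` parameter are hypothetical names, and Python's signed integers stand in for the narrow register type. It enumerates every pair of signed ranges of the given width and verifies that every a & b stays within the inferred bounds.

```python
from itertools import combinations_with_replacement


def negative_bit_floor(v: int) -> int:
    """Round a negative value down to a negative power of 2; return 0 otherwise."""
    if v >= 0:
        return 0
    return -(1 << (~v).bit_length())


def find_counterexample(width: int):
    """Return (dst_range, src_range, a, b) violating the bounds, or None."""
    lo, hi = -(1 << (width - 1)), (1 << (width - 1)) - 1
    # Every (smin, smax) pair with smin <= smax.
    ranges = list(combinations_with_replacement(range(lo, hi + 1), 2))
    for dst_smin, dst_smax in ranges:
        for src_smin, src_smax in ranges:
            smin = negative_bit_floor(min(dst_smin, src_smin))
            smax = max(dst_smax, src_smax)
            for a in range(dst_smin, dst_smax + 1):
                for b in range(src_smin, src_smax + 1):
                    # Python's & on signed ints matches two's complement AND.
                    if not smin <= (a & b) <= smax:
                        return (dst_smin, dst_smax), (src_smin, src_smax), a, b
    return None


# 4 bits keeps the pure-Python run short; width=6 is the same loop but
# takes on the order of billions of iterations.
assert find_counterexample(4) is None
```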
[...]