Message-ID: <20200729011459.wix5xadqoir54rod@kafai-mbp.dhcp.thefacebook.com>
Date:   Tue, 28 Jul 2020 18:14:59 -0700
From:   Martin KaFai Lau <kafai@...com>
To:     John Fastabend <john.fastabend@...il.com>
CC:     <daniel@...earbox.net>, <ast@...nel.org>, <netdev@...r.kernel.org>,
        <bpf@...r.kernel.org>
Subject: Re: [bpf PATCH 1/3] bpf: sock_ops ctx access may stomp registers in
 corner case

On Tue, Jul 28, 2020 at 05:44:19PM -0700, John Fastabend wrote:
> Martin KaFai Lau wrote:
> > On Tue, Jul 28, 2020 at 01:55:22PM -0700, John Fastabend wrote:
> > > Martin KaFai Lau wrote:
> > > > On Tue, Jul 28, 2020 at 08:43:46AM -0700, John Fastabend wrote:
> > > > > I had a sockmap program that after doing some refactoring started spewing
> > > > > this splat at me:
> > > > > 
> > > > > [18610.807284] BUG: unable to handle kernel NULL pointer dereference at 0000000000000001
> > > > > [...]
> > > > > [18610.807359] Call Trace:
> > > > > [18610.807370]  ? 0xffffffffc114d0d5
> > > > > [18610.807382]  __cgroup_bpf_run_filter_sock_ops+0x7d/0xb0
> > > > > [18610.807391]  tcp_connect+0x895/0xd50
> > > > > [18610.807400]  tcp_v4_connect+0x465/0x4e0
> > > > > [18610.807407]  __inet_stream_connect+0xd6/0x3a0
> > > > > [18610.807412]  ? __inet_stream_connect+0x5/0x3a0
> > > > > [18610.807417]  inet_stream_connect+0x3b/0x60
> > > > > [18610.807425]  __sys_connect+0xed/0x120
> > > > > 
> > > 
> > > [...]
> > > 
> > > > > So three additional instructions if dst == src register, but I scanned
> > > > > my current code base and did not see this pattern anywhere, so it should
> > > > > not be a big deal. Further, it seems no one else has hit this, or at
> > > > > least reported it, so it must be a fairly rare pattern.
> > > > > 
> > > > > Fixes: 9b1f3d6e5af29 ("bpf: Refactor sock_ops_convert_ctx_access")
> > > > I think this issue dates back at least to
> > > > commit 34d367c59233 ("bpf: Make SOCK_OPS_GET_TCP struct independent").
> > > > There have been a few refactorings since then, so fixing it in much older
> > > > code may not be worth it since it is rare?
> > > 
> > > OK I just did a quick git annotate and pulled out the last patch
> > > there. I didn't go any farther back. The failure is rare and has
> > > the nice property that it always crashes hard. For example, I found
> > > it by simply running some of our Go tests after doing the refactor.
> > > I guess if it were in some path that doesn't get tested, like an
> > > error case or something, you might have an ugly surprise in production.
> > > I can imagine a case where tracking this down might be difficult.
> > > 
> > > OTOH the backport won't be automatic past some of those reworks.
> > > 
> > > > 
> > > > > Signed-off-by: John Fastabend <john.fastabend@...il.com>
> > > > > ---
> > > > >  net/core/filter.c |   26 ++++++++++++++++++++++++--
> > > > >  1 file changed, 24 insertions(+), 2 deletions(-)
> > > > > 
> > > > > diff --git a/net/core/filter.c b/net/core/filter.c
> > > > > index 29e34551..c50cb80 100644
> > > > > --- a/net/core/filter.c
> > > > > +++ b/net/core/filter.c
> > > > > @@ -8314,15 +8314,31 @@ static u32 sock_ops_convert_ctx_access(enum bpf_access_type type,
> > > > >  /* Helper macro for adding read access to tcp_sock or sock fields. */
> > > > >  #define SOCK_OPS_GET_FIELD(BPF_FIELD, OBJ_FIELD, OBJ)			      \
> > > > >  	do {								      \
> > > > > +		int fullsock_reg = si->dst_reg, reg = BPF_REG_9, jmp = 2;     \
> > > > >  		BUILD_BUG_ON(sizeof_field(OBJ, OBJ_FIELD) >		      \
> > > > >  			     sizeof_field(struct bpf_sock_ops, BPF_FIELD));   \
> > > > > +		if (si->dst_reg == reg || si->src_reg == reg)		      \
> > > > > +			reg--;						      \
> > > > > +		if (si->dst_reg == reg || si->src_reg == reg)		      \
> > > > > +			reg--;						      \
> > > > > +		if (si->dst_reg == si->src_reg) {			      \
> > > > > +			*insn++ = BPF_STX_MEM(BPF_DW, si->src_reg, reg,	      \
> > > > > +					  offsetof(struct bpf_sock_ops_kern,  \
> > > > > +				          temp));			      \
> > > > Instead of sock_ops->temp, can BPF_REG_AX be used here as a temp?
> > > > e.g., bpf_convert_shinfo_access() already uses it as a temp.
> 
> OTOH it looks like we will cause the bpf_jit_blind_insn() to abort on those
> instructions.
You are right.  I think eventually a "BPF_JMP_IMM(BPF_JEQ, BPF_REG_AX, 0, ...);"
has to be done here, and it compares against an IMM of 0 while BPF_REG_AX holds
is_fullsock.
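
Roughly what I have in mind for the AX variant (just a sketch to make the
point, jump offset illustrative, not tested):

	*insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(struct bpf_sock_ops_kern,
					       is_fullsock),
			      BPF_REG_AX, si->src_reg,
			      offsetof(struct bpf_sock_ops_kern,
				       is_fullsock));
	/* AX holds is_fullsock; si->src_reg is left untouched */
	*insn++ = BPF_JMP_IMM(BPF_JEQ, BPF_REG_AX, 0, 2 /* skip the loads */);
	/* ... sk and OBJ_FIELD loads into si->dst_reg as today ... */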

In bpf_jit_blind_insn(), the "case BPF_JMP | BPF_JEQ | BPF_K" handling
will break the above BPF_JMP_IMM.
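
For reference, my recollection of what that case does with a K-form JEQ is
roughly the following (paraphrased from memory, not a verbatim copy):

	/* the K immediate is blinded by routing it through BPF_REG_AX */
	*to++ = BPF_ALU64_IMM(BPF_MOV, BPF_REG_AX, imm_rnd ^ from->imm);
	*to++ = BPF_ALU64_IMM(BPF_XOR, BPF_REG_AX, imm_rnd);
	*to++ = BPF_JMP_REG(from->code, from->dst_reg, BPF_REG_AX, off);

and there is also an early bail-out when an insn already has BPF_REG_AX as
dst or src, which is the abort you mentioned.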

I think the "temp" approach has to be used as in this patch.
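
To spell out what that gives us for the dst == src case, the generated code
ends up looking roughly like the sketch below (illustrative only; jump
offsets and the exact placement of the restores are simplified, and "reg" is
the scratch register picked by the reg-- checks in the patch):

	/* dst_reg == src_reg == rX; reg is a scratch register != rX */
	*(u64 *)(rX + temp) = reg	/* spill scratch into sock_ops->temp  */
	reg = *(rX + is_fullsock)	/* test via scratch, rX stays intact  */
	if reg == 0 goto restore	/* not a fullsock, skip the read      */
	reg = *(u64 *)(rX + temp)	/* refill scratch while rX still
					 * points at the ctx
					 */
	rX = *(u64 *)(rX + sk)		/* sk pointer into dst                */
	rX = *(rX + OBJ_FIELD)		/* the requested field                */
	goto done
restore:
	reg = *(u64 *)(rX + temp)	/* restore scratch on the other path  */
done: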

> 
> I'm not sure it matters for performance, seeing we are in a bit of an
> edge case. IIRC Daniel wrote that code, so maybe it's best to see if he has
> any opinions.
> 
> @Daniel, do you have a preference? If we use REG_RAX it seems the insns
> will be skipped over by bpf_jit_blind_insn(); OTOH it's slightly faster, I
> guess, to skip the load/store.
> 
> > > 
> > > Sure, I will roll a v2; I agree that rax is a bit nicer. I guess for
> > > bpf-next we can roll the load over to use rax as well? Once the
> > > fix is in place I'll take a look; it would be nice for consistency.
> > Agreed, it would be nice to do the same in SOCK_OPS_SET_FIELD() as well,
> > and this improvement could be done in bpf-next.
> > 
> > > 
> > > > 
> > > > Also, it seems the "sk" access in sock_ops_convert_ctx_access() suffers
> > > > a similar issue.
> > > 
> > > Good catch. I'll fix it up as well, maybe with a second patch and test.
> > > The patches might be a bit more verbose, but that makes it easier to
> > > track the bugs, I think.
> > Thanks for taking care of it!
> 
> 
