netdev - Re: [PATCH] 9p: fix crash when transaction killed

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <m2a6487f23.fsf@gmail.com>
Date:   Wed, 30 Nov 2022 21:15:12 +0800
From:   Schspa Shi <schspa@...il.com>
To:     asmadeus@...ewreck.org
Cc:     ericvh@...il.com, lucho@...kov.net, linux_oss@...debyte.com,
        davem@...emloft.net, edumazet@...gle.com, kuba@...nel.org,
        pabeni@...hat.com, v9fs-developer@...ts.sourceforge.net,
        netdev@...r.kernel.org, linux-kernel@...r.kernel.org,
        syzbot+8f1060e2aaf8ca55220b@...kaller.appspotmail.com
Subject: Re: [PATCH] 9p: fix crash when transaction killed


asmadeus@...ewreck.org writes:

> Schspa Shi wrote on Wed, Nov 30, 2022 at 04:14:32PM +0800:
>> >  - reqs are alloced in a kmem_cache created with SLAB_TYPESAFE_BY_RCU.
>> >  This means that if we get a req from idr_find, even if it has just been
>> >  freed, it either is still in the state it was freed at (hence refcount
>> >  0, we ignore it) or is another req coming from the same cache (if
>> 
>> If the req was newly alloced(It was at a new page), refcount maybe not
>> 0, there will be problem in this case. It seems we can't relay on this.
>> 
>> We need to set the refcount to zero before add it to idr in p9_tag_alloc.
>
> Hmm, if it's reused then it's zero by definition, but if it's a new
> allocation (uninitialized) then anything goes; that lookup could find
> and increase it before the refcount_set, and we'd have an off by one
> leading to use after free. Good catch!
>
> Initializing it to zero will lead to the client busy-looping until after
> the refcount is properly set, which should work.

Why? It looks no different from the previous process here. Initializing
it to zero should makes no difference.

> Setting refcount early might have us use an re-used req before the tag
> has been changed so that one cannot move.
>
> Could you test with just that changed if syzbot still reproduces this
> bug? (perhaps add a comment if you send this)
>

I have upload a new v2 change for this. But I can't easily reproduce
this problem.

> ------
> diff --git a/net/9p/client.c b/net/9p/client.c
> index aaa37b07e30a..aa64724f6a69 100644
> --- a/net/9p/client.c
> +++ b/net/9p/client.c
> @@ -297,6 +297,7 @@ p9_tag_alloc(struct p9_client *c, int8_t type, uint t_size, uint r_size,
>  	p9pdu_reset(&req->rc);
>  	req->t_err = 0;
>  	req->status = REQ_STATUS_ALLOC;
> +	refcount_set(&req->refcount, 0);
>  	init_waitqueue_head(&req->wq);
>  	INIT_LIST_HEAD(&req->req_list);
>
> ----- 
>
>> >  refcount isn't zero, we can check its tag)
>> 
>> As for the release case, the next request will have the same tag with
>> high probability. It's better to make the tag value to be an increase
>> sequence, thus will avoid very much possible req reuse.
>
> I'd love to be able to do this, but it would break some servers that
> assume tags are small (e.g. using it as an index for a tag array)
> ... I thought nfs-ganesha was doing this but they properly put in in
> buckets, so that's one less server to worry about, but I wouldn't put
> it past some simple servers to do that; having a way to lookup a given
> tag for flush is an implementation requirement.
>
> That shouldn't be a problem though as that will just lead to either fail
> the guard check after lookup (m->rreq->status != REQ_STATUS_SENT) or be
> processed as a normal reply if it's already been sent by the other
> thread at this point.
> OTOH, that m->rreq->status isn't protected by m->req_lock in trans_fd,
> and that is probably another bug...

Yes, I aggree with you, another BUG.

-- 
BRs
Schspa Shi