Message-ID: <HE1PR0802MB25555E7AAFA66DA3FE025D0AF4230@HE1PR0802MB2555.eurprd08.prod.outlook.com>
Date:   Mon, 14 Sep 2020 07:32:41 +0000
From:   Jianyong Wu <Jianyong.Wu@....com>
To:     Dominique Martinet <asmadeus@...ewreck.org>
CC:     "ericvh@...il.com" <ericvh@...il.com>,
        "lucho@...kov.net" <lucho@...kov.net>,
        "v9fs-developer@...ts.sourceforge.net" 
        <v9fs-developer@...ts.sourceforge.net>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Justin He <Justin.He@....com>,
        Greg Kurz <gkurz@...ux.vnet.ibm.com>
Subject: RE: [PATCH RFC 4/4] 9p: fix race issue in fid contention.

Hi Dominique,

> -----Original Message-----
> From: Dominique Martinet <asmadeus@...ewreck.org>
> Sent: Monday, September 14, 2020 1:56 PM
> To: Jianyong Wu <Jianyong.Wu@....com>
> Cc: ericvh@...il.com; lucho@...kov.net; v9fs-
> developer@...ts.sourceforge.net; linux-kernel@...r.kernel.org; Justin He
> <Justin.He@....com>; Greg Kurz <gkurz@...ux.vnet.ibm.com>
> Subject: Re: [PATCH RFC 4/4] 9p: fix race issue in fid contention.
>
>
> Thanks for having a look at this!
>
> Jianyong Wu wrote on Mon, Sep 14, 2020:
> > Eric's and Greg's patches offer a mechanism to fix the open-unlink-f*syscall
> > bug in 9p, but there is a race in fid contention.
> > Greg's patch stores the fids of all opened files in the corresponding
> > inode, so every fid lookup can retrieve a fid from the inode
> > preferentially. But there is no mechanism to handle fid contention.
> > For example, two threads can get the same fid at the same
> > time, and one of them can clunk the fid before the other thread is ready to
> > discard it. This scenario leads to fatal problems, even a
> > kernel core dump.
>
> Ah, so that's what the problem was. Good job finding the problem!
>
Thanks! My pleasure.
>
> > I introduce a mechanism to fix this race. A counter field is
> > added to the p9_fid struct to hold the reference count of the
> > fid. When a fid is obtained from the inode, the counter is
> > incremented, and it is decremented when the user is done with the fid.
> > The fid is guaranteed not to be clunked before the reference count
> > drops to 0, so a clunked fid can never be used.
> > Since the fid does not need to be retrieved from the inode in every
> > case, an enum value denoting the source of the fid is also added to
> > p9_fid, so that the reference counter is handled only for fids obtained
> > from the inode.
>
> If there is no contention then an always-one refcount and an enum are the
> same thing.
> I'd rather not make a difference but make it a full-fledged refcount thing; the
> enum in the code introduces quite a bit of code churn that doesn't strike me
> as useful (and I don't like int arguments like this, but if we can just do away
> with it there's no need to argue about that)
>
> Not having exceptions for that will also make the code around
> fid_atomic_dec much simpler: just have clunk do an atomic dec and only do
> the actual clunk if that hit zero, and we should be able to get rid of that
> helper?
>
Sorry, I don't think an always-one refcount will work here, as the fid is clunked only by the
file context itself, not by every consumer of the fid. We can't decrement the refcount at just one
static point. Am I wrong?
The enum value is not functionally necessary, but I think it can reduce fid contention, as there are
many scenarios in which taking the fid from the inode is not necessary.
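
For reference, here is a minimal userspace sketch of the scheme suggested in the quoted text: the refcount is set to 1 on allocation, incremented on lookup, decremented in clunk, and the real clunk is sent only when the count hits 0. This is an illustration under assumptions, not the actual patch. All names (struct fid_sketch, fid_alloc, fid_get, fid_put, clunks_sent) are hypothetical, C11 atomics stand in for the kernel's refcount helpers, and the real Tclunk request is replaced by a counter.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct p9_fid; not the kernel definition. */
struct fid_sketch {
	int fid;          /* protocol fid number */
	atomic_int count; /* references currently held on this fid */
};

static int clunks_sent; /* counts simulated clunk requests */

/* Allocation hands the caller the first reference (count = 1). */
static struct fid_sketch *fid_alloc(int num)
{
	struct fid_sketch *f = malloc(sizeof(*f));

	if (!f)
		return NULL;
	f->fid = num;
	atomic_init(&f->count, 1);
	return f;
}

/*
 * Lookup path: take an extra reference. Assumption: the caller either
 * already holds a reference or holds the lock protecting the lookup
 * list, so the fid cannot be freed underneath it.
 */
static struct fid_sketch *fid_get(struct fid_sketch *f)
{
	atomic_fetch_add(&f->count, 1);
	return f;
}

/*
 * Clunk path: drop one reference. Only the caller that drops the last
 * reference performs the (simulated) clunk and frees the fid.
 * Returns 1 if the clunk was actually sent, 0 otherwise.
 */
static int fid_put(struct fid_sketch *f)
{
	int old = atomic_fetch_sub(&f->count, 1);

	assert(old > 0); /* dropping a reference nobody holds is a bug */
	if (old != 1)
		return 0;
	clunks_sent++; /* the real code would send the clunk here */
	free(f);
	return 1;
}
```

In this sketch the file's release path calls fid_put() like any other user, so nobody has to wait for the counter to reach 0: whichever user drops the last reference sends the clunk. Whether that fits the release path in 9p is exactly the open question in this thread.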

>
> Timing wise it's a bit awkward but I just dug out the async clunk mechanism I
> wrote two years ago, that will conflict with this patch but might also help a bit
> I guess?
> I should probably have reposted them...
>
Interesting!

>
> So to recap:
>  - Let's try some more straight-forward refcounting: set to 1 on alloc,
> increment when it's found in fid.c, decrement in clunk and only send the
> actual clunk if counter hit 0
>
I think it may not work.

>  - Ideally base yourself off my 9p-test branch to have async clunk:
> https://github.com/martinetd/linux/commits/9p-test
> I've been promising to push it to next this week™ for a couple of weeks but if
> something is based on it I won't be able to delay this much longer, it'll get
> pushed to 5.10 cycle anyway.
> (I'll resend the patches to be clean)
>
>  - (please, no polling 10ms then leaking something!)
>
Yeah, unfortunately it will sometimes leak the fid. I'm afraid the CPU may get stuck here: we
must wait here (in v9fs_dir_release) for the counter to drop to 0, as this is the only place where the fid is released.
That's the problem.

Thanks
Jianyong
> Thanks,
> --
> Dominique