[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <fe7d3b03-4d08-34ac-a695-a5c57c751aeb@huawei.com>
Date: Tue, 27 Jun 2023 17:08:27 +0800
From: Baokun Li <libaokun1@...wei.com>
To: Jan Kara <jack@...e.cz>
CC: <linux-fsdevel@...r.kernel.org>, <linux-ext4@...r.kernel.org>,
<linux-kernel@...r.kernel.org>, <yi.zhang@...wei.com>,
<yangerkun@...wei.com>, <chengzhihao1@...wei.com>,
<yukuai3@...wei.com>, Baokun Li <libaokun1@...wei.com>
Subject: Re: [PATCH] quota: fix race condition between dqput() and
dquot_mark_dquot_dirty()
Hello!
On 2023/6/27 16:34, Jan Kara wrote:
> Hello!
>
> On Mon 26-06-23 21:55:49, Baokun Li wrote:
>> On 2023/6/26 21:09, Jan Kara wrote:
>>> On Sun 25-06-23 15:56:10, Baokun Li wrote:
>>>>>> I think we can simply focus on the race between the DQ_ACTIVE_B flag and
>>>>>> the DQ_MOD_B flag, which is the core problem, because the same quota
>>>>>> should not have both flags. These two flags are protected by dq_list_lock
>>>>>> and dquot->dq_lock respectively, so it makes sense to add a
>>>>>> wait_on_dquot() to ensure the accuracy of DQ_ACTIVE_B.
>>>>> But the fundamental problem is not only the race with DQ_MOD_B setting. The
>>>>> dquot structure can be completely freed by the time
>>>>> dquot_claim_space_nodirty() calls dquot_mark_dquot_dirty() on it. That's
>>>>> why I think making __dquot_transfer() obey dquot_srcu rules is the right
>>>>> solution.
>>>> Yes, now I also think that making __dquot_transfer() obey dquot_srcu
>>>> rules is a better solution. But with inode->i_lock protection, why would
>>>> the dquot structure be completely freed?
>>> Well, when dquot_claim_space_nodirty() calls mark_all_dquot_dirty() it does
>>> not hold any locks (only dquot_srcu). So nothing prevents dquot_transfer()
>>> to go, swap dquot structure pointers and drop dquot references and after
>>> that mark_all_dquot_dirty() can use a stale pointer to call
>>> mark_dquot_dirty() on already freed memory.
>>>
>> No, this doesn't look like it's going to happen. The
>> mark_all_dquot_dirty() uses a pointer array pointer, the dquot in the
>> array is dynamically changing, so after swap dquot structure pointers,
>> mark_all_dquot_dirty() uses the new pointer, and the stale pointer is
>> always destroyed after swap, so there is no case of using the stale
>> pointer here.
> There is a case - CPU0 can prefetch the values from dquots[] array into its
> local cache, then CPU1 can update the dquots[] array (these writes can
> happily stay in CPU1 store cache invisible to other CPUs) and free the
> dquots via dqput(). Then CPU0 can pass the prefetched dquot pointers to
> mark_dquot_dirty(). There are no locks or memory barries preventing CPUs
> from ordering instructions and memory operations like this in the code...
> You can read Documentation/memory-barriers.txt about all the perils current
> CPU architecture brings wrt coordination of memory accesses among CPUs ;)
>
> Honza
Got it!
Sorry for misunderstanding you (I thought "completely freed" meant
dquot_destroy(), but you should have meant dquot_release()).
Then this is exactly the scenario in the first function graph in my patch
description. In this scenario the data from the old dquot has been moved
to the new dquot. The old dquot may set DQ_MOD_B bit after it has been
cleared of DQ_ACTIVE_B bit.
We want to do exactly the thing to avoid this situation.
--
With Best Regards,
Baokun Li
.
Powered by blists - more mailing lists