lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <d6f46ea8-0f09-4942-6818-a58005c8a0c1@linux.dev>
Date: Fri, 23 Aug 2024 11:37:05 +0800
From: Hao Ge <hao.ge@...ux.dev>
To: Suren Baghdasaryan <surenb@...gle.com>, Miaohe Lin <linmiaohe@...wei.com>
Cc: kent.overstreet@...ux.dev, linux-mm@...ck.org,
 linux-kernel@...r.kernel.org, Hao Ge <gehao@...inos.cn>,
 stable@...r.kernel.org, nao.horiguchi@...il.com, akpm@...ux-foundation.org,
 pasha.tatashin@...een.com, david@...hat.com
Subject: Re: [PATCH] codetag: debug: mark codetags for pages which
 transitioned from being poison to unpoison as empty

Hi Suren and Miaohe


On 8/23/24 09:47, Hao Ge wrote:
> Hi Suren and Miaohe
>
>
> Thank you all for taking the time to discuss this issue.
>
>
> On 8/23/24 06:50, Suren Baghdasaryan wrote:
>> On Thu, Aug 22, 2024 at 2:46 AM Hao Ge <hao.ge@...ux.dev> wrote:
>>> Hi Miaohe
>>>
>>>
>>> Thank you for taking the time to review this patch.
>>>
>>>
>>> On 8/22/24 16:04, Miaohe Lin wrote:
>>>> On 2024/8/22 10:58, Hao Ge wrote:
>>>>> From: Hao Ge <gehao@...inos.cn>
>>>>>
>>>> Thanks for your patch.
>>>>
>>>>> The PG_hwpoison page will be caught and isolated on the entrance to
>>>>> the free buddy page pool. so,when we clear this flag and return it
>>>>> to the buddy system,mark codetags for pages as empty.
>>>>>
>>>> Is below scene cause the problem?
>>>>
>>>> 1. Pages are allocated. pgalloc_tag_add() will be called when 
>>>> prep_new_page().
>>>>
>>>> 2. Pages are hwpoisoned. memory_failure() will set PG_hwpoison flag 
>>>> and pgalloc_tag_sub()
>>>> will be called when pages are caught and isolated on the entrance 
>>>> to buddy.
>> Hi Folks,
>> Thanks for reporting this! Could you please describe in more details
>> how memory_failure() ends up calling pgalloc_tag_sub()? It's not
>> obvious to me which path leads to pgalloc_tag_sub(), so I must be
>> missing something.
>
>
> OK,Let me describe the scenario I encountered.
>
> In the Link [1] I mentioned,here is the logic behind it:
>
> It performed the following operations:
>
> madvise(ptrs[num_alloc], pagesize, MADV_SOFT_OFFLINE)
>
> and then the kernel's call stack looks like this:
>
> do_madvise
>
> soft_offline_page
>
> page_handle_poison
>
> __folio_put
>
> free_unref_page
>

I just reviewed it and I think I missed a stack.

Actually, it's like this

do_madvise

soft_offline_page

soft_offline_in_use_page

page_handle_poison

__folio_put

free_unref_page


And I've come up with a minimal solution. If everyone agrees, I'll send 
the patch.look this

https://elixir.bootlin.com/linux/v6.11-rc4/source/mm/page_alloc.c#L1056

Let's directly call clear_page_tag_ref after pgalloc_tag_sub.

Thanks

BR

Hao


> It will set a flag within the following function and then release the 
> page.
>
> https://elixir.bootlin.com/linux/v6.11-rc4/source/mm/memory-failure.c#L206 
>
>
> and and then,because you set the PG_hwpoison flag, so the page will be 
> caught and isolated on the
>
> entrance to the free buddy page pool. look here:
>
> https://elixir.bootlin.com/linux/v6.11-rc4/source/mm/page_alloc.c#L1052
>
> At this very moment, we call pgalloc_tag_sub.
>
> So,when we callunpoison_memoryclear this flag and return the page to 
> the buddy system, the problem arises.
>
>
>> On a conceptual level I want to understand if the page isolated in
>> this manner should be considered freed or not. If it shouldn't be
>> considered free then I think the right fix would be to avoid
>> pgalloc_tag_sub() when this isolation happens.
>> Thanks,
>> Suren.
>
> In my understanding, the purpose of unpoison_memory is to reclaim 
> poisoned pages.
>
> I dug up the patch that introduced this function back then
>
> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/mm/memory-failure.c?id=847ce401df392b0704369fd3f75df614ac1414b4 
>
>
> Therefore, this is reasonable.
>
> Thanks
>
> Best regards
>
> Hao
>
>>
>>>> 3. unpoison_memory cleared flags and sent the pages to buddy. 
>>>> pgalloc_tag_sub() will be
>>>> called again in free_pages_prepare().
>>>>
>>>> So there is a imbalance that pgalloc_tag_add() is called once and 
>>>> pgalloc_tag_sub() is called twice?
>>> As you said, that's exactly the case.
>>>> If so, let's think about more complicated scene:
>>>>
>>>> 1. Same as above.
>>>>
>>>> 2. Pages are hwpoisoned. But memory_failure() fails to handle it. 
>>>> So PG_hwpoison flag is set
>>>> but pgalloc_tag_sub() is not called (pages are not sent to buddy).
>>>>
>>>> 3. unpoison_memory cleared flags and calls clear_page_tag_ref() 
>>>> without calling pgalloc_tag_sub()
>>>> first. Will this cause problem?
>>>>
>>>> Though this should be really rare...
>>>>
>>>> Thanks.
>>>> .
>>> Great, I didn't anticipate this scenario.
>>>
>>> When we call clear_page_tag_ref() without calling pgalloc_tag_sub(),
>>>
>>> It will cause exceptions 
>>> in|tag->counters->bytes|and|tag->counters->calls|.
>>>
>>> We can add a layer of protection to handle it
>>>
>>> The pseudocode is as follows:
>>>
>>> if (mem_alloc_profiling_enabled()) {
>>>           union codetag_ref *ref = get_page_tag_ref(page);
>>>
>>>           if (ref) {
>>>               if( ref->ct != NULL && !is_codetag_empty(ref))
>>>               {
>>>                   tag = ct_to_alloc_tag(ref->ct);
>>>                   this_cpu_sub(tag->counters->bytes, bytes);
>>>                   this_cpu_dec(tag->counters->calls);
>>>               }
>>>               set_codetag_empty(ref);
>>>               put_page_tag_ref(ref);
>>>           }
>>> }
>>>
>>> Hi Suren and Kent
>>>
>>> Do you have any suggestions for this? If it's okay, I'll add comments
>>> and include this pseudocode in|clear_page_tag_ref|.
>>>
>>>>> It was detected by [1] and the following WARN occurred:
>>>>>
>>>>> [  113.930443][ T3282] ------------[ cut here ]------------
>>>>> [  113.931105][ T3282] alloc_tag was not set
>>>>> [  113.931576][ T3282] WARNING: CPU: 2 PID: 3282 at 
>>>>> ./include/linux/alloc_tag.h:130 pgalloc_tag_sub.part.66+0x154/0x164
>>>>> [  113.932866][ T3282] Modules linked in: hwpoison_inject fuse 
>>>>> ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 
>>>>> xt_conntrack ebtable_nat ebtable_broute ip6table_nat ip6table_man4
>>>>> [  113.941638][ T3282] CPU: 2 UID: 0 PID: 3282 Comm: madvise11 
>>>>> Kdump: loaded Tainted: G        W 6.11.0-rc4-dirty #18
>>>>> [  113.943003][ T3282] Tainted: [W]=WARN
>>>>> [  113.943453][ T3282] Hardware name: QEMU KVM Virtual Machine, 
>>>>> BIOS unknown 2/2/2022
>>>>> [  113.944378][ T3282] pstate: 40400005 (nZcv daif +PAN -UAO -TCO 
>>>>> -DIT -SSBS BTYPE=--)
>>>>> [  113.945319][ T3282] pc : pgalloc_tag_sub.part.66+0x154/0x164
>>>>> [  113.946016][ T3282] lr : pgalloc_tag_sub.part.66+0x154/0x164
>>>>> [  113.946706][ T3282] sp : ffff800087093a10
>>>>> [  113.947197][ T3282] x29: ffff800087093a10 x28: ffff0000d7a9d400 
>>>>> x27: ffff80008249f0a0
>>>>> [  113.948165][ T3282] x26: 0000000000000000 x25: ffff80008249f2b0 
>>>>> x24: 0000000000000000
>>>>> [  113.949134][ T3282] x23: 0000000000000001 x22: 0000000000000001 
>>>>> x21: 0000000000000000
>>>>> [  113.950597][ T3282] x20: ffff0000c08fcad8 x19: ffff80008251e000 
>>>>> x18: ffffffffffffffff
>>>>> [  113.952207][ T3282] x17: 0000000000000000 x16: 0000000000000000 
>>>>> x15: ffff800081746210
>>>>> [  113.953161][ T3282] x14: 0000000000000000 x13: 205d323832335420 
>>>>> x12: 5b5d353031313339
>>>>> [  113.954120][ T3282] x11: ffff800087093500 x10: 000000000000005d 
>>>>> x9 : 00000000ffffffd0
>>>>> [  113.955078][ T3282] x8 : 7f7f7f7f7f7f7f7f x7 : ffff80008236ba90 
>>>>> x6 : c0000000ffff7fff
>>>>> [  113.956036][ T3282] x5 : ffff000b34bf4dc8 x4 : ffff8000820aba90 
>>>>> x3 : 0000000000000001
>>>>> [  113.956994][ T3282] x2 : ffff800ab320f000 x1 : 841d1e35ac932e00 
>>>>> x0 : 0000000000000000
>>>>> [  113.957962][ T3282] Call trace:
>>>>> [  113.958350][ T3282] pgalloc_tag_sub.part.66+0x154/0x164
>>>>> [  113.959000][ T3282]  pgalloc_tag_sub+0x14/0x1c
>>>>> [  113.959539][ T3282]  free_unref_page+0xf4/0x4b8
>>>>> [  113.960096][ T3282]  __folio_put+0xd4/0x120
>>>>> [  113.960614][ T3282]  folio_put+0x24/0x50
>>>>> [  113.961103][ T3282]  unpoison_memory+0x4f0/0x5b0
>>>>> [  113.961678][ T3282]  hwpoison_unpoison+0x30/0x48 [hwpoison_inject]
>>>>> [  113.962436][ T3282] simple_attr_write_xsigned.isra.34+0xec/0x1cc
>>>>> [  113.963183][ T3282]  simple_attr_write+0x38/0x48
>>>>> [  113.963750][ T3282]  debugfs_attr_write+0x54/0x80
>>>>> [  113.964330][ T3282]  full_proxy_write+0x68/0x98
>>>>> [  113.964880][ T3282]  vfs_write+0xdc/0x4d0
>>>>> [  113.965372][ T3282]  ksys_write+0x78/0x100
>>>>> [  113.965875][ T3282]  __arm64_sys_write+0x24/0x30
>>>>> [  113.966440][ T3282]  invoke_syscall+0x7c/0x104
>>>>> [  113.966984][ T3282] el0_svc_common.constprop.1+0x88/0x104
>>>>> [  113.967652][ T3282]  do_el0_svc+0x2c/0x38
>>>>> [  113.968893][ T3282]  el0_svc+0x3c/0x1b8
>>>>> [  113.969379][ T3282]  el0t_64_sync_handler+0x98/0xbc
>>>>> [  113.969980][ T3282]  el0t_64_sync+0x19c/0x1a0
>>>>> [  113.970511][ T3282] ---[ end trace 0000000000000000 ]---
>>>>>
>>>>> Link [1]: 
>>>>> https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/syscalls/madvise/madvise11.c
>>>>>
>>>>> Fixes: a8fc28dad6d5 ("alloc_tag: introduce clear_page_tag_ref() 
>>>>> helper function")
>>>>> Cc: stable@...r.kernel.org # v6.10
>>>>> Signed-off-by: Hao Ge <gehao@...inos.cn>
>>>>> ---
>>>>>    mm/memory-failure.c | 6 ++++++
>>>>>    1 file changed, 6 insertions(+)
>>>>>
>>>>> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
>>>>> index 7066fc84f351..570388c41532 100644
>>>>> --- a/mm/memory-failure.c
>>>>> +++ b/mm/memory-failure.c
>>>>> @@ -2623,6 +2623,12 @@ int unpoison_memory(unsigned long pfn)
>>>>>
>>>>>               folio_put(folio);
>>>>>               if (TestClearPageHWPoison(p)) {
>>>>> +                    /* the PG_hwpoison page will be caught and 
>>>>> isolated
>>>>> +                     * on the entrance to the free buddy page pool.
>>>>> +                     * so,when we clear this flag and return it 
>>>>> to the buddy system,
>>>>> +                     * clear it's codetag
>>>>> +                     */
>>>>> +                    clear_page_tag_ref(p);
>>>>>                       folio_put(folio);
>>>>>                       ret = 0;
>>>>>               }
>>>>>
>>>>>
>>> Thanks
>>>
>>> BR
>>>
>>> Hao
>>>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ