Message-ID: <20230406014419epcms1p3f285b6e3fdbb1457db1bcbaab9e863be@epcms1p3>
Date:   Thu, 06 Apr 2023 10:44:19 +0900
From:   Jaewon Kim <jaewon31.kim@...sung.com>
To:     Andrew Morton <akpm@...ux-foundation.org>
CC:     "jstultz@...gle.com" <jstultz@...gle.com>,
        "tjmercier@...gle.com" <tjmercier@...gle.com>,
        "sumit.semwal@...aro.org" <sumit.semwal@...aro.org>,
        "daniel.vetter@...ll.ch" <daniel.vetter@...ll.ch>,
        "hannes@...xchg.org" <hannes@...xchg.org>,
        "mhocko@...nel.org" <mhocko@...nel.org>,
        "linux-mm@...ck.org" <linux-mm@...ck.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "jaewon31.kim@...il.com" <jaewon31.kim@...il.com>
Subject: RE: [PATCH v2] dma-buf/heaps: system_heap: Avoid DoS by limiting
 single allocations to half of all memory

>On Thu,  6 Apr 2023 09:08:54 +0900 Jaewon Kim <jaewon31.kim@...sung.com> wrote:
>
>> Normal free:212600kB min:7664kB low:57100kB high:106536kB
>>   reserved_highatomic:4096KB active_anon:276kB inactive_anon:180kB
>>   active_file:1200kB inactive_file:0kB unevictable:2932kB
>>   writepending:0kB present:4109312kB managed:3689488kB mlocked:2932kB
>>   pagetables:13600kB bounce:0kB free_pcp:0kB local_pcp:0kB
>>   free_cma:200844kB
>> Out of memory and no killable processes...
>> Kernel panic - not syncing: System is deadlocked on memory
>> 
>> An OoM panic was reported, there were only native processes which are
>> non-killable as OOM_SCORE_ADJ_MIN.
>> 
>> After looking into the dump, I've found the dma-buf system heap was
>> trying to allocate a huge size. It seems to be a signed negative value.
>> 
>> dma_heap_ioctl_allocate(inline)
>>     |  heap_allocation = 0xFFFFFFC02247BD38 -> (
>>     |    len = 0xFFFFFFFFE7225100,
>> 
>> Actually the old ion system heap had policy which does not allow that
>> huge size with commit c9e8440eca61 ("staging: ion: Fix overflow and list
>> bugs in system heap"). We need this change again. Single allocation
>> should not be bigger than half of all memory.
>> 
>> ...
>>
>> --- a/drivers/dma-buf/heaps/system_heap.c
>> +++ b/drivers/dma-buf/heaps/system_heap.c
>> @@ -351,6 +351,9 @@ static struct dma_buf *system_heap_allocate(struct dma_heap *heap,
>>  	struct page *page, *tmp_page;
>>  	int i, ret = -ENOMEM;
>>  
>> +	if (len / PAGE_SIZE > totalram_pages() / 2)
>> +		return ERR_PTR(-ENOMEM);
>> +
>
>This seems so random.  Why ram/2 rather than ram/3 or 17*ram/35?

Hello

Thank you for your comment.

I took the limit from the old ion driver code. My thinking was that a request
for more than half of all memory is unrealistic: it is either an unintended
size, such as a negative value, or a size so large that it causes slowness or
an OoM panic.
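
To illustrate how a negative size turns into a huge request, here is a minimal
userspace sketch, not the kernel code. SKETCH_PAGE_SIZE, len_from_signed() and
exceeds_total_ram() are hypothetical names. The 4 KiB page size is an
assumption, and the caller passes a page count in place of totalram_pages().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 4 KiB pages. Stand-in for the kernel's PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 4096ULL

/* Reinterpret a signed 64-bit value as an unsigned length, which is
 * what happens when a negative size reaches an unsigned len field. */
static uint64_t len_from_signed(int64_t v)
{
	return (uint64_t)v;
}

/* Same shape as the proposed check: reject a request whose page count
 * exceeds the total number of RAM pages (supplied by the caller here). */
static bool exceeds_total_ram(uint64_t len, uint64_t total_pages)
{
	assert(total_pages > 0);
	return len / SKETCH_PAGE_SIZE > total_pages;
}
```

With the len value from the dump, 0xFFFFFFFFE7225100, and a page count derived
from the managed:3689488kB figure of the Normal zone (922372 pages at 4 KiB,
used only as an illustrative stand-in for totalram_pages()),
exceeds_total_ram() returns true, so the request would be rejected before any
reclaim starts.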

>
>Better behavior would be to try to allocate what the caller asked
>for and if that doesn't work out, fail gracefully after freeing the
>partial allocations which have been performed thus far.  If dma_buf
>is changed to do this then that change is useful in many scenarios other
>than this crazy corner case.

I think you are describing __GFP_RETRY_MAYFAIL. T.J. Mercier recommended it
earlier; here is what we discussed:
https://lore.kernel.org/linux-mm/20230331005140epcms1p1ac5241f02f645e9dbc29626309a53b24@epcms1p1/

I was worried about a case in which we need an OOM kill to get more memory,
but I have changed my mind. That case seems to be rare. I think it is now time
to make a decision and not allow OOM kills for dma-buf system heap
allocations.
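
The "fail gracefully after freeing the partial allocations" behavior can be
sketched in userspace C as follows. This is only an illustration of the
pattern, not the system heap code: struct chunk, alloc_chunks(), free_chunks()
and budgeted_alloc() are hypothetical names, and budgeted_alloc() is a fake
allocator that starts failing once its budget runs out, standing in for an
allocation that is allowed to fail rather than trigger an OOM kill.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef void *(*chunk_alloc_fn)(size_t);

struct chunk {
	struct chunk *next;
	void *mem;
};

/* Hypothetical fake allocator: grants at most `budget` more chunks. */
static size_t budget;

static void *budgeted_alloc(size_t size)
{
	if (budget == 0)
		return NULL;
	budget--;
	return malloc(size);
}

/* Release every chunk on the list, including its backing memory. */
static void free_chunks(struct chunk *head)
{
	while (head) {
		struct chunk *next = head->next;

		free(head->mem);
		free(head);
		head = next;
	}
}

/* Try to allocate nchunks chunks. On the first failure, free what was
 * allocated so far and return NULL instead of retrying forever. */
static struct chunk *alloc_chunks(size_t nchunks, size_t chunk_size,
				  chunk_alloc_fn alloc)
{
	struct chunk *head = NULL;

	assert(alloc != NULL);
	for (size_t i = 0; i < nchunks; i++) {
		struct chunk *c = malloc(sizeof(*c));

		if (!c)
			goto fail;
		c->mem = alloc(chunk_size);
		if (!c->mem) {
			free(c);
			goto fail;
		}
		c->next = head;
		head = c;
	}
	return head;

fail:
	free_chunks(head);
	return NULL;
}
```

The caller sees a plain failure and no memory stays pinned by the partial
attempt.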

But I still want to block sizes larger than total RAM. For a size that can
never be satisfied, I think we do not have to do memory reclaim or kill
processes, and we can avoid a frozen screen from the user's perspective.

This is what I ultimately want. Can we check totalram_pages() and apply
__GFP_RETRY_MAYFAIL?

--- a/drivers/dma-buf/heaps/system_heap.c
+++ b/drivers/dma-buf/heaps/system_heap.c
@@ -41,7 +41,7 @@ struct dma_heap_attachment {
        bool mapped;
 };
 
-#define LOW_ORDER_GFP (GFP_HIGHUSER | __GFP_ZERO | __GFP_COMP)
+#define LOW_ORDER_GFP (GFP_HIGHUSER | __GFP_ZERO | __GFP_COMP | __GFP_RETRY_MAYFAIL)
 #define MID_ORDER_GFP (LOW_ORDER_GFP | __GFP_NOWARN)
 #define HIGH_ORDER_GFP  (((GFP_HIGHUSER | __GFP_ZERO | __GFP_NOWARN \
                                | __GFP_NORETRY) & ~__GFP_RECLAIM) \
@@ -351,6 +351,9 @@ static struct dma_buf *system_heap_allocate(struct dma_heap *heap,
        struct page *page, *tmp_page;
        int i, ret = -ENOMEM;
 
+       if (len / PAGE_SIZE > totalram_pages())
+               return ERR_PTR(-ENOMEM);
+

BR
Jaewon Kim
