linux-kernel - Re: [PATCH 0/5] blkcg: Limit maximum number of aio requests available for cgroup

Message-ID: <707ca8fa-aee1-f068-b8ab-de5004d3d7ac@virtuozzo.com>
Date:   Tue, 5 Dec 2017 02:14:54 +0300
From:   Kirill Tkhai <ktkhai@...tuozzo.com>
To:     Jeff Moyer <jmoyer@...hat.com>
Cc:     Tejun Heo <tj@...nel.org>, axboe@...nel.dk, bcrl@...ck.org,
        viro@...iv.linux.org.uk, linux-block@...r.kernel.org,
        linux-kernel@...r.kernel.org, linux-aio@...ck.org, oleg@...hat.com
Subject: Re: [PATCH 0/5] blkcg: Limit maximum number of aio requests available
 for cgroup

On 05.12.2017 01:59, Jeff Moyer wrote:
> Kirill Tkhai <ktkhai@...tuozzo.com> writes:
> 
>> On 05.12.2017 00:52, Tejun Heo wrote:
>>> Hello, Kirill.
>>>
>>> On Tue, Dec 05, 2017 at 12:44:00AM +0300, Kirill Tkhai wrote:
>>>>> Can you please explain how this is a fundamental resource which can't
>>>>> be controlled otherwise?
>>>>
>>>> Currently, aio_nr and aio_max_nr are global. In the case of containers this
>>>> means that a single container may occupy all the aio requests available
>>>> in the system and deprive others of the possibility to use aio
>>>> at all. This may happen because of evil intentions of the container's
>>>> user or because of a program error, when the user does this accidentally.
>>>
>>> Hmm... I see.  It feels really wrong to me to make this a first class
>>> resource because there is a system wide limit.  The only reason I can
>>> think of for the system wide limit is to prevent too much kernel
>>> memory from being consumed by creating a lot of aios but that squarely falls
>>> inside cgroup memory controller protection.  If there are other
>>> reasons why the number of aios should be limited system-wide, please
>>> bring them up.
>>>
>>> If the only reason is kernel memory consumption protection, the only
>>> thing we need to do is make sure that memory used for aio commands
>>> is accounted against cgroup kernel memory consumption and
>>> relax/remove the system wide limit.
>>
>> So, we just use the GFP_KERNEL_ACCOUNT flag for allocation of internal aio
>> structures and pages, and all the memory will be accounted in kmem and
>> limited by memcg. Looks very good.
>>
>> One detail about memory consumption. io_submit() calls the
>> file_operations::write_iter and read_iter primitives. It's not clear to me
>> whether they consume the same memory as if the writev() or readv() system
>> calls were used instead. writev() may delay the actual write until the dirty
>> pages limit is reached, so it seems the accounting logic should be the
>> same. So aio mustn't use more unaccounted system memory in file system
>> internals than a plain writev() does.
>>
>> Could you please say if you have thoughts about this?
> 
> I think you just need to account the completion ring.

A request of type struct aio_kiocb consumes much more memory than
struct io_event does. Shouldn't we account it too?
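
As rough arithmetic for the comparison above, here is a userspace sketch,
not kernel code. struct io_event_mirror only copies the four 64-bit fields
of the UAPI struct io_event; the ring header size and the per-request size
passed to the helpers are assumed placeholders, not measured values of the
kernel's ring header or of sizeof(struct aio_kiocb), and ring_bytes() and
request_bytes() are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace mirror of the UAPI struct io_event: four 64-bit fields. */
struct io_event_mirror {
	uint64_t data;	/* user data passed in with the request */
	uint64_t obj;	/* pointer to the originating request */
	int64_t  res;	/* result code */
	int64_t  res2;	/* secondary result */
};

/* One completion slot is 32 bytes on common ABIs. */
static_assert(sizeof(struct io_event_mirror) == 32,
	      "io_event mirror is expected to be 32 bytes");

/*
 * Bytes needed for a completion ring with nr_events slots.
 * header_bytes is an assumption supplied by the caller.
 */
static inline size_t ring_bytes(size_t nr_events, size_t header_bytes)
{
	return header_bytes + nr_events * sizeof(struct io_event_mirror);
}

/*
 * Bytes needed for nr_inflight request objects.
 * per_request_bytes is a placeholder for the in-kernel request size.
 */
static inline size_t request_bytes(size_t nr_inflight, size_t per_request_bytes)
{
	return nr_inflight * per_request_bytes;
}
```

With 128 slots and an assumed 32-byte header, ring_bytes(128, 32) gives
4128 bytes. With a placeholder of 200 bytes per request,
request_bytes(128, 200) gives 25600 bytes, so under these assumptions the
in-flight requests outweigh the ring they complete into.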

Kirill