lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Wed, 8 Jul 2015 21:53:31 +0200
From:	"Michael Kerrisk (man-pages)" <mtk.manpages@...il.com>
To:	Doug Ledford <dledford@...hat.com>
Cc:	Davidlohr Bueso <dave@...olabs.net>,
	Marcus Gelderie <redmnic@...il.com>,
	lkml <linux-kernel@...r.kernel.org>,
	David Howells <dhowells@...hat.com>,
	Alexander Viro <viro@...iv.linux.org.uk>,
	John Duffy <jb_duffy@...nternet.com>,
	Arto Bendiken <arto@...diken.net>,
	Linux API <linux-api@...r.kernel.org>,
	Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [PATCH v3] ipc: Modify message queue accounting to not take
 kernel data structures into account

On 8 July 2015 at 21:17, Doug Ledford <dledford@...hat.com> wrote:
> On 07/07/2015 09:01 AM, Michael Kerrisk (man-pages) wrote:
>> Hi David,
>>
>> On 7 July 2015 at 07:16, Davidlohr Bueso <dave@...olabs.net> wrote:
>>> On Mon, 2015-07-06 at 17:49 +0200, Marcus Gelderie wrote:
>>>> A while back, the message queue implementation in the kernel was
>>>> improved to use btrees to speed up retrieval of messages (commit
>>>> d6629859b36). The patch introducing the improved kernel handling of
>>>> message queues (using btrees) has, as a by-product, changed the
>>>> meaning of the QSIZE field in the pseudo-file created for the queue.
>>>> Before, this field reflected the size of the user-data in the queue.
>>>> Since, it also takes kernel data structures into account. For
>>>> example, if 13 bytes of user data are in the queue, on my machine the
>>>> file reports a size of 61 bytes.
>>>>
>>>> There was some discussion on this topic before (for example
>>>> https://lkml.org/lkml/2014/10/1/115). Commenting on a th lkml, Michael
>>>> Kerrisk gave the following background (https://lkml.org/lkml/2015/6/16/74):
>>>>
>>>>     The pseudofiles in the mqueue filesystem (usually mounted at
>>>>     /dev/mqueue) expose fields with metadata describing a message
>>>>     queue. One of these fields, QSIZE, as originally implemented,
>>>>     showed the total number of bytes of user data in all messages in
>>>>     the message queue, and this feature was documented from the
>>>>     beginning in the mq_overview(7) page. In 3.5, some other (useful)
>>>>     work happened to break the user-space API in a couple of places,
>>>>     including the value exposed via QSIZE, which now includes a measure
>>>>     of kernel overhead bytes for the queue, a figure that renders QSIZE
>>>>     useless for its original purpose, since there's no way to deduce
>>>>     the number of overhead bytes consumed by the implementation.
>>>>     (The other user-space breakage was subsequently fixed.)
>>>
>>> Michael, this breakage was never finally documented in the manpage,
>>> right?
>>
>> Right.
>>
>>> I took a look and there is no mention, but it was a quick look.
>>> It's just that if this patch goes in, I'd hate ending up with something
>>> like this in the manpage:
>>>
>>> as of 3.5
>>>   <accounts for kernel overhead>
>>>
>>> as of 4.3
>>>   <behavior reverted back to not include kernel overhead... *sigh*>
>>>
>>> If there are changes to be made to the manpage, it should probably be
>>> posted with this patch, methinks.
>>
>> I think something like the above will have to end up in the man page.
>> The only thing I'd add is that the fix also has gone to -stable
>> kernels. At least: I think this patch should also be marked for
>> -stable. I'll take care of the man page updates as the patch goes
>> through.
>>
>>>> This patch removes the accounting of kernel data structures in the
>>>> queue. Reporting the size of these data-structures in the QSIZE field
>>>> was a breaking change (see Michael's comment above). Without the QSIZE
>>>> field reporting the total size of user-data in the queue, there is no
>>>> way to deduce this number.
>>>>
>>>> It should be noted that the resource limit RLIMIT_MSGQUEUE is counted
>>>> against the worst-case size of the queue (in both the old and the new
>>>> implementation). Therefore, the kernel overhead accounting in QSIZE is
>>>> not necessary to help the user understand the limitations RLIMIT imposes
>>>> on the processes.
>>>
>>> Also, I would suggest adding some comment in struct mqueue_inode_info
>>> for future reference, ie:
>>>
>>> -       unsigned long qsize; /* size of queue in memory (sum of all msgs) */
>>> +       /*
>>> +        * Size of queue in memory (sum of all msgs). Accounts for
>>> +        * only userspace overhead; ignoring any in-kernel rbtree nodes.
>>> +        */
>>> +       unsigned long qsize;
>>>
>>> But no big deal in any case.
>>>
>>> I think this is the right approach,
>>
>> Me too.
>>
>>> but would still like to know if Doug
>>> has any concerns about it.
>>
>> Looking on gmane, Doug does not appear to have been active on any
>> lists since late May! Not sure why.
>
> I responded yesterday in the v2 patch thread I believe.  In any case, I
> think this patch is fine, and can go to stable.  This patch doesn't
> actually change the math related to the rlimit checks (which is the main
> thing I wanted to correct in my original patches), instead it corrects a
> mistake I made.  At the time, I mistakenly thought that the qsize
> included the current message data total + the struct msg_msg size total.
>  It didn't, it was just the current user data total.  I added the rbtree
> nodes in order to keep the total accurate but I shouldn't have added the
> rbtree nodes to this total because none of the other kernel usage was
> previously included.
>
> Acked-by: Doug Ledford <dledford@...hat.com>

+
Acked-by: Michael Kerrisk <mtk.manpages@...il.com>

Cheers,

Michael

-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ