Date: Sun, 9 Aug 2015 19:10:20 -0700
From: Andy Lutomirski <luto@...capital.net>
To: Daniel Mack <daniel@...que.org>
Cc: Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Tom Gundersen <teg@...m.no>, "Kalle A. Sandstrom" <ksandstr@....fi>,
	Borislav Petkov <bp@...en8.de>,
	One Thousand Gnomes <gnomes@...rguk.ukuu.org.uk>,
	Havoc Pennington <havoc.pennington@...il.com>,
	Djalal Harouni <tixxdz@...ndz.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"Eric W. Biederman" <ebiederm@...ssion.com>,
	cee1 <fykcee1@...il.com>, David Herrmann <dh.herrmann@...il.com>,
	"linux-mm@...ck.org" <linux-mm@...ck.org>
Subject: Re: kdbus: to merge or not to merge?

On Sun, Aug 9, 2015 at 3:11 PM, Daniel Mack <daniel@...que.org> wrote:
>
> Internally, the connection pool is simply a shmem-backed file. From the
> context of the HELLO ioctl, we are calling into shmem_file_setup(), so
> the file is eventually owned by the task connecting to the bus. One
> reason why we do the shmem file allocation in the kernel and on behalf
> of the userspace task is that we clear the VM_MAYWRITE bit to prevent
> the task from writing to the pool through its mapped buffer. We also do
> not set VM_NORESERVE, so the entire buffer is pre-accounted for the
> task that created the connection.

I don't have access to the system I've been using for testing right now,
but I wonder how the kdbus pool stacks up against the entire rest of the
memory allocations of an average desktop process.

> The pool implementation uses an r/b tree to organize the buffer into
> slices. Those slices can be kept by userspace as long as the parsing
> implementation needs access to them. When finished, the slices are
> freed. A simple ring buffer cannot cope with the gaps this leaves
> behind.
>
> When a connection buffer is written to, this is done from the context
> of another task, which calls into the kdbus code through one of the
> ioctls. The memcg implementation should hence charge the task that acts
> as the writer, which is maybe not ideal but can be changed easily with
> some additions to the internal APIs. We omitted it for the current
> version, which is non-intrusive with regard to other kernel subsystems.

This has at least the following weakness: I can very easily get systemd
to write to my shmem-backed pool -- I simply subscribe to one of its
broadcasts. If I cause such a write to be very slow (intentionally or
otherwise), then PID 1 blocks. If you change the memcg code to charge me
instead of PID 1 (as it should, IMO), then the problem gets worse.

> The kdbus implementation is actually comparable to two tasks X and Y
> which both have their own buffer file open and mmap()ed, and which both
> pass their FD to the other side. If X now writes to Y's file, and that
> causes a page fault, X is accounted for it, correct?

If PID 1 accepted a memfd from me (even a properly sealed one) and wrote
to it, I would wonder whether that were actually a good idea.

Does this scheme have any actual, measurable advantage over the
traditional model of a small non-paged buffer in the kernel (i.e. the
way sockets work), with explicit userspace memfd use where appropriate?

--Andy
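
A minimal kernel-side sketch of the pool mechanism Daniel describes in
the first quoted paragraph. The function names (pool_new, pool_mmap) are
hypothetical, and this is an approximation of the design rather than the
actual kdbus symbols: the shmem file is created in the kernel on the
connecting task's behalf, and the mmap handler strips write access so
the owner can only read its own pool.

#include <linux/file.h>
#include <linux/fs.h>
#include <linux/mm.h>
#include <linux/shmem_fs.h>

static struct file *pool_new(size_t size)
{
	/* flags == 0: VM_NORESERVE is not set, so the entire buffer is
	 * accounted to the connecting task up front. */
	return shmem_file_setup("kdbus-pool", size, 0);
}

static int pool_mmap(struct file *pool_file, struct vm_area_struct *vma)
{
	/* Deny writable mappings outright... */
	if (vma->vm_flags & VM_WRITE)
		return -EPERM;

	/* ...and clear VM_MAYWRITE so a later mprotect(PROT_WRITE)
	 * fails as well. */
	vma->vm_flags &= ~VM_MAYWRITE;

	/* Back the mapping with the pool's shmem file and let shmem's
	 * own mmap implementation handle the faults. */
	if (vma->vm_file)
		fput(vma->vm_file);
	vma->vm_file = get_file(pool_file);

	return pool_file->f_op->mmap(pool_file, vma);
}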
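
The r/b-tree slice bookkeeping maps onto the kernel rbtree API roughly
as follows. Again a sketch: struct kdbus_slice is a hypothetical
descriptor keyed by pool offset, so that gaps left by freed slices can
be located and reused later -- exactly what a plain ring buffer cannot
do once slices are released out of order.

#include <linux/rbtree.h>
#include <linux/types.h>

/* One allocated region of the pool (illustrative layout). */
struct kdbus_slice {
	struct rb_node node;	/* sorted by offset within the pool */
	size_t off;		/* offset into the pool file */
	size_t size;		/* length of the slice */
	bool free;		/* reusable gap, or a live message? */
};

/* Insert a slice into the per-pool tree, ordered by offset. */
static void slice_insert(struct rb_root *root, struct kdbus_slice *slice)
{
	struct rb_node **n = &root->rb_node, *parent = NULL;

	while (*n) {
		struct kdbus_slice *cur =
			rb_entry(*n, struct kdbus_slice, node);

		parent = *n;
		if (slice->off < cur->off)
			n = &(*n)->rb_left;
		else
			n = &(*n)->rb_right;
	}

	rb_link_node(&slice->node, parent, n);
	rb_insert_color(&slice->node, root);
}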
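
For comparison, the "traditional model" in Andy's closing question --
explicit userspace memfd use, with the fd passed over an AF_UNIX socket
via SCM_RIGHTS -- looks roughly like the sketch below.
make_sealed_payload is an illustrative name; memfd_create() needs a
reasonably recent glibc (older systems call it via syscall(2)).

#define _GNU_SOURCE
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

/* Create a sealed, immutable payload the receiver can mmap and parse
 * in place. Returns the fd, or -1 on error. */
static int make_sealed_payload(const void *buf, size_t len)
{
	int fd = memfd_create("payload", MFD_ALLOW_SEALING);
	if (fd < 0)
		return -1;

	if (ftruncate(fd, len) < 0 ||
	    pwrite(fd, buf, len, 0) != (ssize_t)len)
		goto err;

	/* Once sealed, neither side can shrink, grow, or write the
	 * file, so a slow or malicious sender can no longer affect the
	 * receiver -- the property lost when another task writes
	 * directly into my pool. */
	if (fcntl(fd, F_ADD_SEALS,
		  F_SEAL_SHRINK | F_SEAL_GROW | F_SEAL_WRITE) < 0)
		goto err;

	return fd;
err:
	close(fd);
	return -1;
}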