lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Thu, 7 Jan 2016 05:03:07 +0530
From:	Parav Pandit <pandit.parav@...il.com>
To:	Tejun Heo <tj@...nel.org>
Cc:	cgroups@...r.kernel.org, linux-doc@...r.kernel.org,
	linux-kernel@...r.kernel.org, linux-rdma@...r.kernel.org,
	lizefan@...wei.com, Johannes Weiner <hannes@...xchg.org>,
	Doug Ledford <dledford@...hat.com>,
	Liran Liss <liranl@...lanox.com>,
	"Hefty, Sean" <sean.hefty@...el.com>,
	Jason Gunthorpe <jgunthorpe@...idianresearch.com>,
	Haggai Eran <haggaie@...lanox.com>,
	Jonathan Corbet <corbet@....net>, james.l.morris@...cle.com,
	serge@...lyn.com, Or Gerlitz <ogerlitz@...lanox.com>,
	Matan Barak <matanb@...lanox.com>, raindel@...lanox.com,
	akpm@...ux-foundation.org, linux-security-module@...r.kernel.org
Subject: Re: [PATCHv1 3/6] rdmacg: implements rdma cgroup

On Wed, Jan 6, 2016 at 3:31 AM, Tejun Heo <tj@...nel.org> wrote:
> Hello,
>
> On Wed, Jan 06, 2016 at 12:28:03AM +0530, Parav Pandit wrote:
>> +/* hash table to keep map of tgid to owner cgroup */
>> +DEFINE_HASHTABLE(pid_cg_map_tbl, 7);
>> +DEFINE_SPINLOCK(pid_cg_map_lock);    /* lock to protect hash table access */
>> +
>> +/* Keeps mapping of pid to its owning cgroup at rdma level,
>> + * This mapping doesn't change, even if process migrates from one to other
>> + * rdma cgroup.
>> + */
>> +struct pid_cg_map {
>> +     struct pid *pid;                /* hash key */
>> +     struct rdma_cgroup *cg;
>> +
>> +     struct hlist_node hlist;        /* pid to cgroup hash table link */
>> +     atomic_t refcnt;                /* count active user tasks to figure out
>> +                                      * when to free the memory
>> +                                      */
>> +};
>
> Ugh, there's something clearly wrong here.  Why does the rdma
> controller need to keep track of pid cgroup membership?
>
Rdma resource can be allocated by parent process, used and freed by
child process.
Child process could belong to different rdma cgroup.
Parent process might have been terminated after creation of rdma
cgroup. (Followed by cgroup might have been deleted too).
Its discussed in https://lkml.org/lkml/2015/11/2/307

In nutshell, there is process that clearly owns the rdma resource.
So to keep the design simple, rdma resource is owned by the creator
process and cgroup without modifying the task_struct.

>> +static void _dealloc_cg_rpool(struct rdma_cgroup *cg,
>> +                           struct cg_resource_pool *rpool)
>> +{
>> +     spin_lock(&cg->cg_list_lock);
>> +
>> +     /* if its started getting used by other task,
>> +      * before we take the spin lock, then skip,
>> +      * freeing it.
>> +      */
>
> Please follow CodingStyle.
>
>> +     if (atomic_read(&rpool->refcnt) == 0) {
>> +             list_del_init(&rpool->cg_list);
>> +             spin_unlock(&cg->cg_list_lock);
>> +
>> +             _free_cg_rpool(rpool);
>> +             return;
>> +     }
>> +     spin_unlock(&cg->cg_list_lock);
>> +}
>> +
>> +static void dealloc_cg_rpool(struct rdma_cgroup *cg,
>> +                          struct cg_resource_pool *rpool)
>> +{
>> +     /* Don't free the resource pool which is created by the
>> +      * user, otherwise we miss the configured limits. We don't
>> +      * gain much either by splitting storage of limit and usage.
>> +      * So keep it around until user deletes the limits.
>> +      */
>> +     if (atomic_read(&rpool->creator) == RDMACG_RPOOL_CREATOR_DEFAULT)
>> +             _dealloc_cg_rpool(cg, rpool);
>
> I'm pretty sure you can get away with an fixed length array of
> counters.  Please keep it simple.  It's a simple hard limit enforcer.
> There's no need to create a massive dynamic infrastrucure.
>
Every resource pool for verbs resource is fixed length array. Length
of the array is defined by the IB stack modules.
This array is per cgroup, per device.
Its per device, because we agreed that we want to address requirement
of controlling/configuring them on per device basis.
Devices appear and disappear. Therefore they are allocated dynamically.
Otherwise this array could be static in cgroup structure.



> Thanks.
>
> --
> tejun
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ