Message-ID: <6599ad830805081445w5991b47cld2861aab26ac6323@mail.gmail.com>
Date:	Thu, 8 May 2008 14:45:13 -0700
From:	"Paul Menage" <menage@...gle.com>
To:	balbir@...ux.vnet.ibm.com
Cc:	linux-mm@...ck.org, "Sudhir Kumar" <skumar@...ux.vnet.ibm.com>,
	"YAMAMOTO Takashi" <yamamoto@...inux.co.jp>, lizf@...fujitsu.com,
	linux-kernel@...r.kernel.org,
	"David Rientjes" <rientjes@...gle.com>,
	"Pavel Emelianov" <xemul@...nvz.org>,
	"Andrew Morton" <akpm@...ux-foundation.org>,
	"KAMEZAWA Hiroyuki" <kamezawa.hiroyu@...fujitsu.com>
Subject: Re: [-mm][PATCH 3/4] Add rlimit controller accounting and control

On Thu, May 8, 2008 at 7:35 AM, Balbir Singh <balbir@...ux.vnet.ibm.com> wrote:
>
>  I currently intend to use this controller for controlling memory related
>  rlimits, like address space and mlock'ed memory. How about we use something like
>  "memrlimit"?

Sounds reasonable.

>
>  Good suggestion, but it will be hard if not impossible to account the data
>  correctly as it changes, if we do the accounting/summation at bind time. We'll
>  need a really big lock to do it, something I want to avoid. Did you have
>  something else in mind?

Yes, it'll be tricky but I think worthwhile. I believe it can be done
without the charge/uncharge code needing to take a global lock, except
for when we're actually binding/unbinding, with careful use of RCU.

My first thought for how to do this was that we have a field
"transitioning" that indicates whether we're transitioning between
bound and unbound, and a bind_mutex. By default the charge/uncharge path uses
RCU, but by marking that we're in a transition state, the charge path
will use the mutex instead. By waiting for all existing chargers that
are using RCU to exit, we can then take the lock and synchronize with
the chargers.

So the charge/uncharge path would do:

  int locked = 0;

  rcu_read_lock();
  if (ss->transitioning) {
    rcu_read_unlock();
    locked = 1;
    mutex_lock(&ss->bind_mutex);
  }
  if (ss->active) {
    /* do charge/uncharge stuff, which must not block */
  }
  if (locked) {
    mutex_unlock(&ss->bind_mutex);
  } else {
    rcu_read_unlock();
  }

and the bind path would do something like:

ss->transitioning = 1;
synchronize_rcu();
mutex_lock(&ss->bind_mutex);
for_each_mm(mm) {
  down_read(&mm->mmap_sem);
  add_charge_for_mm();
  up_read(&mm->mmap_sem);
}
mutex_unlock(&ss->bind_mutex);
ss->transitioning = 0;

But this would break because we're nesting mmap_sem inside bind_mutex
in the bind path, but in the charge path we're nesting bind_mutex
inside mmap_sem. So we'd probably need to define a new bit
MMF_RLIMIT_ACCOUNTED in mm->flags to indicate whether that mm's
address space usage is accounted for. Once we've done that, we can use
mmap_sem to synchronize changes to the per-mm charged status for free,
since we already hold mmap_sem whenever we're doing the charging,
right? So it becomes simple:

charge path:

if (!test_bit(MMF_RLIMIT_ACCOUNTED, &mm->flags))
  return 0;
/* do charge/uncharge stuff */

bind path:

while ((mm = find_unaccounted_mm())) {
  down_write(&mm->mmap_sem);
  add_charge_for_mm();
  set_bit(MMF_RLIMIT_ACCOUNTED, &mm->flags);
  up_write(&mm->mmap_sem);
}

Paul