Message-Id: <20080407231007.4946410d.akpm@linux-foundation.org>
Date:	Mon, 7 Apr 2008 23:10:07 -0700
From:	Andrew Morton <akpm@...ux-foundation.org>
To:	Gerlof Langeveld <gerlof@...omputing.nl>
Cc:	linux-kernel@...r.kernel.org, Balbir Singh <balbir@...ibm.com>,
	Pavel Emelyanov <xemul@...nvz.org>
Subject: Re: [PATCH 1/3] accounting: task counters for disk/network

On Tue, 8 Apr 2008 07:48:37 +0200 Gerlof Langeveld <gerlof@...omputing.nl> wrote:

> > > --- linux-2.6.24.4-vanilla/block/ll_rw_blk.c	2008-03-24 19:49:18.000000000 +0100
> > > +++ linux-2.6.24.4-modified/block/ll_rw_blk.c	2008-03-25 13:52:14.000000000 +0100
> > > @@ -2739,6 +2739,19 @@ static void drive_stat_acct(struct reque
> > >  		disk_round_stats(rq->rq_disk);
> > >  		rq->rq_disk->in_flight++;
> > >  	}
> > > +
> > > +#ifdef CONFIG_TASK_IO_ACCOUNTING
> > > +	switch (rw) {
> > > +	case READ:
> > > +		current->group_leader->ioac.dsk_rio += new_io;
> > > +		current->group_leader->ioac.dsk_rsz += rq->nr_sectors;
> > > +		break;
> > > +	case WRITE:
> > > +		current->group_leader->ioac.dsk_wio += new_io;
> > > +		current->group_leader->ioac.dsk_wsz += rq->nr_sectors;
> > > +		break;
> > > +	}
> > > +#endif
> > 
> > For many workloads, this will cause almost all writeout to be accounted to
> > pdflush and perhaps kswapd.  This makes the per-task write accounting
> > largely useless.
> 
> There are several situations in which writeouts are accounted to the
> user process itself, e.g. when issuing direct writes (open mode O_DIRECT)
> or synchronous writes (open mode O_SYNC, syscall sync/fsync, synchronous
> file attribute, synchronously mounted filesystem).

yup.
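
To make the synchronous case concrete, here is a minimal userspace sketch of a
writer that opens its file with O_SYNC, so that each write() returns only once
the data has been written out.  With a counter bumped at request submission,
as in the quoted patch, such a writer would typically be charged for its own
I/O.  The helper name write_sync() is hypothetical and used only for
illustration.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: write the string buf to path using O_SYNC.
 * Returns 0 on success, -1 on any error or short write.
 */
int write_sync(const char *path, const char *buf)
{
	size_t len = strlen(buf);
	int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC | O_SYNC, 0600);
	int rc;

	if (fd < 0)
		return -1;

	/* With O_SYNC, write() does not return until the data is written out. */
	rc = (write(fd, buf, len) == (ssize_t)len) ? 0 : -1;

	if (close(fd) < 0)
		rc = -1;
	return rc;
}
```

A buffered writer that omits O_SYNC and never calls fsync() would instead leave
its dirty pages for a later writeback pass, which is the case discussed next.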

> Apart from that, swapping out of process pages by kswapd is currently not
> accounted at all as shown by the following snapshot of 'atop' on a heavily
> swapping system:

Under heavy load, callers into alloc_pages() will themselves perform disk
writeout.  So under the proposed scheme, process A will be accounted for
writeout which was in fact caused by process B.
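
A toy model of that misattribution, as a sketch only: none of the names below
are kernel identifiers, and the "reclaim" loop is a stand-in for whatever
writeout an allocating task ends up doing under memory pressure.  The counter
is charged to the submitting task, as in the quoted patch.

```c
#include <stddef.h>

/* Hypothetical stand-ins for a task and a page; not kernel structures. */
struct toy_task {
	unsigned long dsk_wio;		/* write requests charged to this task */
};

struct toy_page {
	struct toy_task *dirtier;	/* who dirtied it; unused by the accounting */
	int dirty;
};

/* Submission-time accounting: charge whoever is running right now. */
static void toy_submit_write(struct toy_task *current_task, struct toy_page *pg)
{
	current_task->dsk_wio++;
	pg->dirty = 0;
}

/*
 * The allocating task writes out every dirty page it finds, regardless of
 * which task dirtied it.  Returns the number of pages written.
 */
unsigned long toy_alloc_under_pressure(struct toy_task *current_task,
				       struct toy_page *pages, size_t n)
{
	unsigned long written = 0;

	for (size_t i = 0; i < n; i++) {
		if (pages[i].dirty) {
			toy_submit_write(current_task, &pages[i]);
			written++;
		}
	}
	return written;
}
```

If task B dirties three pages and task A then allocates under pressure, A's
dsk_wio ends up at 3 and B's stays at 0 in this model.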

> So the extra counters can be considered as a useful addition to the I/O 
> counters that are currently maintained.

mmm, maybe.  But if we implement a partial solution like this we really
should have a plan to finish it off.

There have been numerous attempts at this, which tend to involve adding
backpointers to the pageframe structure and such.
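
As a rough illustration of the backpointer idea (a sketch under assumptions,
not any of the actual proposals): each page records the task that dirtied it,
and writeout charges that owner rather than the submitter.  All names here are
hypothetical.

```c
#include <stddef.h>

/* Hypothetical stand-ins; not kernel structures. */
struct bp_task {
	unsigned long dsk_wio;		/* write requests charged to this task */
};

struct bp_page {
	struct bp_task *owner;		/* backpointer: task that dirtied the page */
	int dirty;
};

/* Record the first task to dirty a clean page as its owner. */
void bp_mark_dirty(struct bp_page *pg, struct bp_task *t)
{
	if (!pg->dirty) {
		pg->owner = t;
		pg->dirty = 1;
	}
}

/* Charge the recorded owner; fall back to the submitter if none is known. */
void bp_writeout(struct bp_page *pg, struct bp_task *submitter)
{
	struct bp_task *charged = pg->owner ? pg->owner : submitter;

	if (!pg->dirty)
		return;
	charged->dsk_wio++;
	pg->dirty = 0;
	pg->owner = NULL;
}
```

In this model the price is one extra pointer per page, plus a policy decision
about pages dirtied by more than one task (here the first dirtier is charged).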

This sort of accounting will presumably be needed by a disk bandwidth
cgroup controller.  Perhaps the containers/cgroup people have plans or code
already?

