Date: Mon, 19 Jul 2010 14:47:20 -0400
From: Jeff Moyer <jmoyer@...hat.com>
To: Vivek Goyal <vgoyal@...hat.com>
Cc: linux-kernel@...r.kernel.org, axboe@...nel.dk, nauman@...gle.com,
	dpshah@...gle.com, guijianfeng@...fujitsu.com, czoccolo@...il.com
Subject: Re: [PATCH 1/3] cfq-iosched: Improve time slice charging logic

Vivek Goyal <vgoyal@...hat.com> writes:

> - Currently in CFQ there are many situations where we don't know how
>   much time slice has been consumed by a queue. For example, all
>   the random reader/writer queues where we don't idle on
>   individual queues and we expire the queue immediately
>   after the request dispatch.
>
> - In this case the time consumed by a queue is just a memory copy
>   operation. Actually, time measurement is possible only if we
>   idle on a queue and allow dispatch from a queue for a significant
>   amount of time.
>
> - As of today, in such cases we calculate the time since the
>   dispatch from the queue started and charge all that time.
>   Generally this rounds to 1 jiffy, but in some cases it can
>   be more. For example, if we are driving a high request queue
>   depth and the driver is too busy and does not ask for new
>   requests for 8-10 jiffies. In such cases, the active queue
>   gets charged very unfairly.
>
> - So fundamentally, the whole notion of charging for time slice
>   is valid only if we have been idling on the queue. Otherwise,
>   on an NCQ queue there might be other requests on the queue
>   and we cannot do the time slice calculation.
>
> - This patch tweaks the slice charging logic a bit so that
>   in the cases where we can't know the amount of time, we
>   start charging in terms of number of requests dispatched
>   (IOPS). This practically switches the CFQ fairness model to
>   fairness in terms of IOPS with slice_idle=0.
>
> - As of today this will primarily be useful only with the
>   group_idle patches, so that we get fairness in terms of
>   IOPS across groups.
> The idea is that on fast storage
> one can run CFQ with slice_idle=0 and still get the IO
> controller working without losing too much
> throughput.

I'm not fluent in the cgroup code, my apologies for that.  However, just
trying to make sense of this is giving me a headache.  Now, in some
cases you are using IOPS *in place of* jiffies.  How are we to know
which is which, and in what cases?

It sounds like this is addressing an important problem, but I'm having a
hard time picking out what that problem is.  Is this problem noticeable
for competing sync-noidle workloads (competing between groups, that is)?
If not, then what?

Thanks,
Jeff

> Signed-off-by: Vivek Goyal <vgoyal@...hat.com>
> ---
>  block/cfq-iosched.c |   24 +++++++++++++++++++++---
>  1 files changed, 21 insertions(+), 3 deletions(-)
>
> diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
> index 7982b83..f44064c 100644
> --- a/block/cfq-iosched.c
> +++ b/block/cfq-iosched.c
> @@ -896,16 +896,34 @@ static inline unsigned int cfq_cfqq_slice_usage(struct cfq_queue *cfqq)
>  		 * if there are multiple queues in the group, each can dispatch
>  		 * a single request on seeky media and cause lots of seek time
>  		 * and group will never know it.
> +		 *
> +		 * If drive is NCQ and we are driving deep queue depths, then
> +		 * it is not reasonable to charge the slice since dispatch
> +		 * started because this time will include time taken by all
> +		 * the other requests in the queue.
> +		 *
> +		 * Actually there is no reasonable way to know the disk time
> +		 * here and we need to come up with some approximation. If
> +		 * disk is non NCQ, we should be driving request queue depth
> +		 * 1, then charge for time since dispatch start and this will
> +		 * account for seek time properly on seeky media. If request
> +		 * queue depth is high, then charge for number of requests
> +		 * dispatched from the queue. This will sort of become
> +		 * charging in terms of IOPS.
> 		 */
> -		slice_used = max_t(unsigned, (jiffies - cfqq->dispatch_start),
> -					1);
> +		if (cfqq->cfqd->hw_tag == 0)
> +			slice_used = max_t(unsigned,
> +				(jiffies - cfqq->dispatch_start), 1);
> +		else
> +			slice_used = cfqq->slice_dispatch;
>  	} else {
>  		slice_used = jiffies - cfqq->slice_start;
>  		if (slice_used > cfqq->allocated_slice)
>  			slice_used = cfqq->allocated_slice;
>  	}
>
> -	cfq_log_cfqq(cfqq->cfqd, cfqq, "sl_used=%u", slice_used);
> +	cfq_log_cfqq(cfqq->cfqd, cfqq, "sl_used=%u, sl_disp=%u", slice_used,
> +			cfqq->slice_dispatch);
>  	return slice_used;
>  }
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/