linux-kernel - Re: [PATCH v6 2/2] Output stall data in debugfs

Message-ID: <1313091323.8491.30.camel@twins>
Date:	Thu, 11 Aug 2011 21:35:22 +0200
From:	Peter Zijlstra <peterz@...radead.org>
To:	Alex Neronskiy <zakmagnus@...omium.org>
Cc:	linux-kernel@...r.kernel.org, Ingo Molnar <mingo@...e.hu>,
	Don Zickus <dzickus@...hat.com>,
	Mandeep Singh Baines <msb@...omium.org>,
	Alex Neronskiy <zakmagnus@...omium.com>
Subject: Re: [PATCH v6 2/2] Output stall data in debugfs

On Wed, 2011-08-10 at 11:02 -0700, Alex Neronskiy wrote:
> @@ -210,22 +236,27 @@ void touch_softlockup_watchdog_sync(void)
>  /* watchdog detector functions */
>  static void update_hardstall(unsigned long stall, int this_cpu)
>  {
>         if (stall > hardstall_thresh && stall > worst_hardstall) {
>                 unsigned long flags;
> +               spin_lock_irqsave(&hardstall_write_lock, flags);
> +               if (stall > worst_hardstall) {
> +                       int write_ind = hard_read_ind;
> +                       int locked = spin_trylock(&hardstall_locks[write_ind]);
> +                       /* cannot wait, so if there's contention,
> +                        * switch buffers */
> +                       if (!locked)
> +                               write_ind = !write_ind;
> +
>                         worst_hardstall = stall;
> +                       hardstall_traces[write_ind].nr_entries = 0;
> +                       save_stack_trace(&hardstall_traces[write_ind]);
>  
> +                       /* tell readers to use the new buffer from now on */
> +                       hard_read_ind = write_ind;
> +                       if (locked)
> +                               spin_unlock(&hardstall_locks[write_ind]);
> +               }
> +               spin_unlock_irqrestore(&hardstall_write_lock, flags);
>         }
>  } 

That must be the most convoluted locking I've seen in a while.. OMG!

What's wrong with something like:

static void update_stall(struct stall *s, unsigned long stall)
{
	if (stall <= s->worst)
		return;

again:
	if (!raw_spin_trylock(&s->lock[s->idx])) {
		s->idx ^= 1;
		goto again;
	}

	if (stall <= s->worst)
		goto unlock;

	s->worst = stall;
	s->trace[s->idx].nr_entries = 0;
	save_stack_trace(&s->trace[s->idx]);

unlock:
	raw_spin_unlock(&s->lock[s->idx]);
}


And have your read side do:


static int show_stall_trace(struct seq_file *f, void *v)
{
	struct stall *s = f->private;
	int i, idx = ACCESS_ONCE(s->idx);

	mutex_lock(&stall_mutex);

	raw_spin_lock(&s->lock[idx]);
	/* s->worst is an unsigned long, hence %lu */
	seq_printf(f, "stall: %lu\n", s->worst);
	for (i = 0; i < s->trace[idx].nr_entries; i++) {
		/* print from the buffer we locked, not always trace[0] */
		seq_printf(f, "[<%pK>] %pS\n",
			(void *)s->trace[idx].entries[i],
			(void *)s->trace[idx].entries[i]);
	}
	raw_spin_unlock(&s->lock[idx]);

	mutex_unlock(&stall_mutex);

	return 0;
}


Yes, it's racy on s->worst, but who cares (if you do care, you can keep a
per-buffer copy in s->delay[idx] or so). Also, it might be better to drop
the spinlock and simply use an atomic bitop to set an in-use flag; there is
no reason to disable preemption over the seq_printf() loop.
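
For illustration only, the in-use flag variant could look roughly like the
userspace C11 sketch below. It is not kernel code and not a tested patch.
Assumptions: atomic_exchange() on an atomic_bool stands in for the kernel's
test_and_set_bit(); struct trace_buf, MAX_ENTRIES, stall_init() and the pcs
argument are hypothetical stand-ins for struct stack_trace and
save_stack_trace(); delay[] is the per-buffer copy of the stall value
mentioned above; there is one writer at a time, and readers are serialized
against each other (the stall_mutex in the read side above).

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

#define MAX_ENTRIES 16	/* hypothetical trace depth */

/* Hypothetical userspace stand-in for struct stack_trace. */
struct trace_buf {
	unsigned int nr_entries;
	unsigned long entries[MAX_ENTRIES];
};

struct stall {
	unsigned long worst;
	unsigned long delay[2];		/* per-buffer copy of the stall value */
	struct trace_buf trace[2];
	atomic_bool in_use[2];		/* replaces the two raw spinlocks */
	atomic_int idx;			/* buffer readers should look at */
};

static void stall_init(struct stall *s)
{
	s->worst = 0;
	for (int i = 0; i < 2; i++) {
		s->delay[i] = 0;
		s->trace[i].nr_entries = 0;
		atomic_init(&s->in_use[i], false);
	}
	atomic_init(&s->idx, 0);
}

/* Writer: must never wait on a reader, so it flips to the other buffer. */
static void update_stall(struct stall *s, unsigned long stall,
			 const unsigned long *pcs, unsigned int n)
{
	unsigned int i;
	int idx;

	if (stall <= s->worst)
		return;

	idx = atomic_load(&s->idx);
	/* A reader holds at most one buffer, so this loop terminates. */
	while (atomic_exchange(&s->in_use[idx], true))
		idx ^= 1;

	s->worst = stall;
	s->delay[idx] = stall;
	s->trace[idx].nr_entries = 0;
	for (i = 0; i < n && i < MAX_ENTRIES; i++)
		s->trace[idx].entries[s->trace[idx].nr_entries++] = pcs[i];

	/* Publish the buffer just written, then release it. */
	atomic_store(&s->idx, idx);
	atomic_store(&s->in_use[idx], false);
}

/* Reader: claims the published buffer; returns the number of entries. */
static int show_stall_trace(struct stall *s, FILE *f)
{
	unsigned int i;
	int idx;

	/* Retry only while the writer is in the middle of an update. */
	do {
		idx = atomic_load(&s->idx);
	} while (atomic_exchange(&s->in_use[idx], true));

	fprintf(f, "stall: %lu\n", s->delay[idx]);
	for (i = 0; i < s->trace[idx].nr_entries; i++)
		fprintf(f, "[<%#lx>]\n", s->trace[idx].entries[i]);

	atomic_store(&s->in_use[idx], false);
	return (int)i;
}
```

Because the reader only holds a flag, it can be preempted in the middle of
the print loop without blocking the writer; the writer just takes the other
buffer and republishes idx. Printing delay[idx] instead of worst also keeps
the reported value consistent with the trace being shown.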

