Message-Id: <20141211014415.4168B1400B7@ozlabs.org>
Date: Thu, 11 Dec 2014 12:44:15 +1100 (AEDT)
From: Michael Ellerman <mpe@...erman.id.au>
To: sukadev@...ux.vnet.ibm.com, Michael Ellerman <mpe@...erman.id.au>
Cc: linuxppc-dev@...abs.org, Jiri Olsa <jolsa@...hat.com>,
dev@...yps.com, linux-kernel@...r.kernel.org,
Arnaldo Carvalho de Melo <acme@...nel.org>
Subject: Re: [1/2] perf/powerpc/hv-24x7: Use per-cpu page buffer
On Wed, 2014-12-10 at 22:29:13 UTC, sukadev@...ux.vnet.ibm.com wrote:
> Michael Ellerman [mpe@...erman.id.au] wrote:
> | On Tue, 2014-12-09 at 23:06 -0800, Sukadev Bhattiprolu wrote:
> | > From 470c16c8955672103a9529c78dffbb239e9e27b8 Mon Sep 17 00:00:00 2001
> | > From: Sukadev Bhattiprolu <sukadev@...ux.vnet.ibm.com>
> | > Date: Tue, 9 Dec 2014 22:17:46 -0500
> | > Subject: [PATCH 1/2] perf/poweprc/hv-24x7: Use per-cpu page buffer
> | >
> | > diff --git a/arch/powerpc/perf/hv-24x7.c b/arch/powerpc/perf/hv-24x7.c
> | > index dba3408..18e1f49 100644
> | > --- a/arch/powerpc/perf/hv-24x7.c
> | > +++ b/arch/powerpc/perf/hv-24x7.c
> | > @@ -217,11 +217,14 @@ static bool is_physical_domain(int domain)
> | > domain == HV_24X7_PERF_DOMAIN_PHYSICAL_CORE;
> | > }
> | >
> | > +DEFINE_PER_CPU(char, hv_24x7_reqb[4096]);
> | > +DEFINE_PER_CPU(char, hv_24x7_resb[4096]);
> |
> | Do we need it to be 4K aligned also? I would guess so.
>
> Yes, fixed in the patch below.
OK.
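
For reference, a minimal userspace sketch of what a 4K-aligned 4K buffer looks like. It assumes C11 alignas() as a stand-in for the kernel's alignment attribute and uses plain static buffers, so the per-cpu aspect is not modelled. The names HV_BUF_SIZE, hv_request_buffer(), hv_result_buffer() and hv_buf_is_aligned() are made up for illustration and are not from the patch.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>

#define HV_BUF_SIZE 4096	/* buffer size used in the patch */

/*
 * Userspace stand-ins for the request and result buffers. alignas() asks
 * the compiler to place each buffer on a 4K boundary, so a 4K buffer
 * never straddles two 4K pages.
 */
static alignas(HV_BUF_SIZE) char reqb[HV_BUF_SIZE];
static alignas(HV_BUF_SIZE) char resb[HV_BUF_SIZE];

/* Hypothetical accessors, only here so the buffers can be inspected. */
char *hv_request_buffer(void)
{
	return reqb;
}

char *hv_result_buffer(void)
{
	return resb;
}

/* Returns non-zero if p sits on a 4K boundary. */
int hv_buf_is_aligned(const void *p)
{
	return ((uintptr_t)p % HV_BUF_SIZE) == 0;
}
```

Without the alignment request a char array only has byte alignment, so nothing stops it from crossing a page boundary.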
> |
> | Rather than declaring these as char arrays and then casting below, can you pull
> | the struct definitions up and then declare the per cpu variables with the
> | proper type.
>
> Well, the structures, used for communication with HV, have variable length
> arrays, like:
>
> struct hv_24x7_request_buffer {
> ...
> struct hv_24x7_request requests[];
> };
>
> i.e the buffer needs to be larger than reported by sizeof(). So we
> allocate a large buffer and cast it. Not sure if there is a trick to
> get DEFINE_PER_CPU() to do that.
So the array is variable length, but no larger than 4K. At least I hope so,
because you're using a 4K buffer :)
The neatest way to handle that is to make it a union, with the struct and a 4K
char buffer.
But we can do that as a cleanup later.
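
A rough userspace sketch of the union idea, under stated assumptions: the field layout of hv_req_entry and hv_req_buffer below is invented for illustration and is not the real hypervisor layout, and hv_req_capacity() and hv_req_add() are hypothetical helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HV_BUF_SIZE 4096

/* Hypothetical request entry; the real layout is set by the HV interface. */
struct hv_req_entry {
	uint16_t domain;
	uint16_t offset;
	uint32_t length;
};

/* Header followed by a flexible array, like the structs discussed above. */
struct hv_req_buffer {
	uint8_t  version;
	uint8_t  reserved[3];
	uint32_t num_requests;
	struct hv_req_entry requests[];
};

/*
 * The char member forces the union to be a full 4K, while the struct
 * member gives typed access to the header and entries without a cast.
 */
union hv_req_page {
	struct hv_req_buffer req;
	char bytes[HV_BUF_SIZE];
};

static_assert(sizeof(union hv_req_page) == HV_BUF_SIZE,
	      "request page must be exactly 4K");

/* Number of entries that fit after the header in one 4K page. */
size_t hv_req_capacity(void)
{
	return (HV_BUF_SIZE - offsetof(struct hv_req_buffer, requests)) /
	       sizeof(struct hv_req_entry);
}

/* Append one entry; returns 0 on success, -1 if the page is full. */
int hv_req_add(union hv_req_page *page, uint16_t domain)
{
	struct hv_req_buffer *rb = &page->req;

	if (rb->num_requests >= hv_req_capacity())
		return -1;
	rb->requests[rb->num_requests++].domain = domain;
	return 0;
}
```

The variable of union type can then be declared directly with the proper type, and sizeof() on it reports the full 4K rather than just the header.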
> | > + memset(request_buffer, 0, 4096);
> | > + memset(result_buffer, 0, 4096);
> |
> | Do we have to memset them? That's not going to speed things up.
>
> I agree about the speed, specially since we have a larger buffer. But we
> are reusing the buffer for independent events and some fields need to be 0
> (hence the zalloc in the current code).
Sure, so you could explicitly initialise those fields to zero.
But that can also be another cleanup.
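
Something like the following sketch, which resets only a header instead of clearing the whole 4K. Which fields actually need to be zero is not spelled out in this thread, so the hv_res_header layout and the hv_res_reset() helper are assumptions for illustration only.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define HV_BUF_SIZE 4096

/* Hypothetical result header; the field names are illustrative only. */
struct hv_res_header {
	uint8_t  version;
	uint8_t  detail;
	uint16_t num_results;
	uint32_t failing_index;
};

/* 4K page with typed access to the header. */
union hv_res_page {
	struct hv_res_header hdr;
	char bytes[HV_BUF_SIZE];
};

/*
 * Zero only the header fields before reusing the buffer for the next
 * event. This writes sizeof(struct hv_res_header) bytes rather than
 * 4096, and leaves the stale payload after the header untouched.
 */
void hv_res_reset(union hv_res_page *page)
{
	page->hdr = (struct hv_res_header){ 0 };
}
```

This only works if the consumer never reads payload beyond what the header says is valid; otherwise the full memset is the safe choice.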
I'll take this as it is.
cheers