Message-ID: <CAHp75Vc22JkiHpSWsnG72BWOV=Oc4WgKBWp8Ly8GXzjcepC9jg@mail.gmail.com>
Date:   Tue, 30 May 2017 19:52:50 +0300
From:   Andy Shevchenko <andy.shevchenko@...il.com>
To:     Vitaly Kuznetsov <vkuznets@...hat.com>
Cc:     "x86@...nel.org" <x86@...nel.org>, devel@...uxdriverproject.org,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "K. Y. Srinivasan" <kys@...rosoft.com>,
        Haiyang Zhang <haiyangz@...rosoft.com>,
        Stephen Hemminger <sthemmin@...rosoft.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>,
        "H. Peter Anvin" <hpa@...or.com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Jork Loeser <Jork.Loeser@...rosoft.com>,
        Simon Xiao <sixiao@...rosoft.com>,
        Andy Lutomirski <luto@...nel.org>
Subject: Re: [PATCH v5 08/10] x86/hyper-v: use hypercall for remote TLB flush

On Tue, May 30, 2017 at 2:34 PM, Vitaly Kuznetsov <vkuznets@...hat.com> wrote:
> Hyper-V host can suggest us to use hypercall for doing remote TLB flush,
> this is supposed to work faster than IPIs.
>
> Implementation details: to do HvFlushVirtualAddress{Space,List} hypercalls
> we need to put the input somewhere in memory and we don't really want to
> have memory allocation on each call so we pre-allocate per cpu memory areas
> on boot. These areas are of fixes size, limit them with an arbitrary number
> of 16 (16 gvas are able to specify 16 * 4096 pages).
>
> pv_ops patching is happening very early so we need to separate
> hyperv_setup_mmu_ops() and hyper_alloc_mmu().
>
> It is possible and easy to implement local TLB flushing too and there is
> even a hint for that. However, I don't see a room for optimization on the
> host side as both hypercall and native tlb flush will result in vmexit. The
> hint is also not set on modern Hyper-V versions.

> @@ -0,0 +1,121 @@
> +#include <linux/types.h>
> +#include <linux/hyperv.h>
> +#include <linux/slab.h>
> +#include <linux/log2.h>

Alphabetical order, please.

Plus an empty line after this group of includes.

> +#include <asm/mshyperv.h>
> +#include <asm/tlbflush.h>
> +#include <asm/msr.h>
> +#include <asm/fpu/api.h>

Can these be alphabetically ordered?

> +/* HvFlushVirtualAddressSpace, HvFlushVirtualAddressList hypercalls */
> +struct hv_flush_pcpu {
> +       __u64 address_space;
> +       __u64 flags;
> +       __u64 processor_mask;
> +       __u64 gva_list[];
> +};

I don't know what the style is there, but in Linux the __uXX types are
usually reserved for user API.
Is that the case here? If not, can we use plain uXX types instead?

> +/* Each gva in gva_list encodes up to 4096 pages to flush */
> +#define HV_TLB_FLUSH_UNIT (PAGE_SIZE * PAGE_SIZE)

According to the comment, this should rather be
(4096 * PAGE_SIZE)

Yes, theoretically PAGE_SIZE may not be 4096.
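
To illustrate the difference, here is a minimal user-space sketch (not
kernel code). EX_PAGES_PER_GVA and both helper functions are hypothetical
names introduced only for this example, and the page shift is passed in
as a parameter instead of using the kernel's PAGE_SHIFT.

```c
#include <stdint.h>

/* Hypothetical name: pages covered by one gva_list entry, per the comment. */
#define EX_PAGES_PER_GVA 4096ULL

/* Flush unit in bytes when written as (4096 * PAGE_SIZE). */
static uint64_t ex_flush_unit_fixed(unsigned int page_shift)
{
	return EX_PAGES_PER_GVA * (1ULL << page_shift);
}

/* Flush unit in bytes when written as (PAGE_SIZE * PAGE_SIZE). */
static uint64_t ex_flush_unit_squared(unsigned int page_shift)
{
	uint64_t page_size = 1ULL << page_shift;

	return page_size * page_size;
}
```

With a page shift of 12 both forms give 16 MiB. With any other page
shift they diverge, and only the first one still matches the comment.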

> +static void hyperv_flush_tlb_others(const struct cpumask *cpus,
> +                                   struct mm_struct *mm, unsigned long start,
> +                                   unsigned long end)
> +{

> +       if (cpumask_equal(cpus, cpu_present_mask)) {
> +               flush->flags |= HV_FLUSH_ALL_PROCESSORS;
> +       } else {
> +               for_each_cpu(cpu, cpus) {
> +                       vcpu = hv_cpu_number_to_vp_number(cpu);

> +                       if (vcpu != -1 && vcpu < 64)

Just
if (vcpu < 64)
?
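
Whether the shorter check is equivalent depends on the type of vcpu,
which the quoted hunk does not show. A minimal user-space sketch of both
cases follows; all three function names are hypothetical and exist only
for this example.

```c
#include <stdbool.h>

/* Check as written in the patch, assuming a signed vcpu. */
static bool ex_vcpu_ok_full(int vcpu)
{
	return vcpu != -1 && vcpu < 64;
}

/* Shortened check on a signed value: -1 is accepted. */
static bool ex_vcpu_ok_short_signed(int vcpu)
{
	return vcpu < 64;
}

/* Shortened check on an unsigned value: -1 wraps to UINT_MAX and is rejected. */
static bool ex_vcpu_ok_short_unsigned(unsigned int vcpu)
{
	return vcpu < 64;
}
```

So the short form is a drop-in replacement only if vcpu is unsigned.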

> +                               __set_bit(vcpu, (unsigned long *)
> +                                         &flush->processor_mask);
> +                       else
> +                               goto do_native;
> +               }
> +       }

> +       if (end == TLB_FLUSH_ALL) {
> +               flush->flags |= HV_FLUSH_NON_GLOBAL_MAPPINGS_ONLY;
> +               status = hv_do_hypercall(HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE,
> +                                        flush, NULL);
> +       } else if (end && ((end - start)/HV_TLB_FLUSH_UNIT) > max_gvas) {
> +               status = hv_do_hypercall(HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE,
> +                                        flush, NULL);

Yes! Looks much cleaner.

> +       } else {
> +               cur = start;
> +               gva_n = 0;
> +               do {
> +                       flush->gva_list[gva_n] = cur & PAGE_MASK;

> +                       /*
> +                        * Lower 12 bits encode the number of additional
> +                        * pages to flush (in addition to the 'cur' page).
> +                        */
> +                       if (end >= cur + HV_TLB_FLUSH_UNIT)
> +                               flush->gva_list[gva_n] |= ~PAGE_MASK;
> +                       else if (end > cur)
> +                               flush->gva_list[gva_n] |=
> +                                       (end - cur - 1) >> PAGE_SHIFT;

You can also simplify this slightly by introducing

unsigned long diff = end > cur ? end - cur : 0;

if (diff >= HV_TLB_FLUSH_UNIT)
    flush->gva_list[gva_n] |= ~PAGE_MASK;
else if (diff)
    flush->gva_list[gva_n] |= (diff - 1) >> PAGE_SHIFT;
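
As a sanity check, a user-space sketch comparing the two variants. The
EX_* macros are stand-ins for the kernel's PAGE_SHIFT, PAGE_SIZE,
PAGE_MASK and HV_TLB_FLUSH_UNIT, assuming 4 KiB pages; the function
names are hypothetical.

```c
#include <stdint.h>

#define EX_PAGE_SHIFT	12
#define EX_PAGE_SIZE	(1ULL << EX_PAGE_SHIFT)
#define EX_PAGE_MASK	(~(EX_PAGE_SIZE - 1))
#define EX_FLUSH_UNIT	(4096 * EX_PAGE_SIZE)

/* Encoding of one gva_list entry as in the quoted patch. */
static uint64_t ex_encode_orig(uint64_t cur, uint64_t end)
{
	uint64_t gva = cur & EX_PAGE_MASK;

	if (end >= cur + EX_FLUSH_UNIT)
		gva |= ~EX_PAGE_MASK;
	else if (end > cur)
		gva |= (end - cur - 1) >> EX_PAGE_SHIFT;
	return gva;
}

/* Same encoding using the intermediate 'diff' variable. */
static uint64_t ex_encode_diff(uint64_t cur, uint64_t end)
{
	uint64_t diff = end > cur ? end - cur : 0;
	uint64_t gva = cur & EX_PAGE_MASK;

	if (diff >= EX_FLUSH_UNIT)
		gva |= ~EX_PAGE_MASK;
	else if (diff)
		gva |= (diff - 1) >> EX_PAGE_SHIFT;
	return gva;
}
```

For a three-page range starting at 0x10000 both return 0x10002, that is
the page address plus two additional pages in the lower 12 bits.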

> +
> +                       cur += HV_TLB_FLUSH_UNIT;

> +                       ++gva_n;

Make it a post-increment. It's better for the reader (no need to spend
extra attention on why it's a pre-increment).

> +
> +               } while (cur < end);

> +       if (!(status & 0xffff))

This is not the first time I see this magic number.

Perhaps

#define STATUS_BLA_BLA_MASK GENMASK(15,0)

if (!(status & STATUS_BLA_BLA_MASK))

in all appropriate places?
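
A user-space sketch of what that could look like. EX_GENMASK_ULL is a
stand-in for the kernel's GENMASK(), and both the mask name and the
helper are hypothetical. It assumes that a zero value in the low 16 bits
of the status means success, which the quoted check suggests but does
not show.

```c
#include <stdbool.h>
#include <stdint.h>

/* User-space stand-in for the kernel's GENMASK(h, l), 64-bit wide. */
#define EX_GENMASK_ULL(h, l) \
	(((~0ULL) << (l)) & (~0ULL >> (63 - (h))))

/* Hypothetical name; the actual name is left open above. */
#define EX_HV_STATUS_MASK	EX_GENMASK_ULL(15, 0)

/* Assumption: low 16 bits of the hypercall status are zero on success. */
static bool ex_hv_status_ok(uint64_t status)
{
	return !(status & EX_HV_STATUS_MASK);
}
```

Every call site then reads if (ex_hv_status_ok(status)) instead of
repeating the 0xffff constant.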

> +#define HV_FLUSH_ALL_PROCESSORS                        0x00000001
> +#define HV_FLUSH_ALL_VIRTUAL_ADDRESS_SPACES    0x00000002
> +#define HV_FLUSH_NON_GLOBAL_MAPPINGS_ONLY      0x00000004
> +#define HV_FLUSH_USE_EXTENDED_RANGE_FORMAT     0x00000008

BIT() ?
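
For illustration, the same four flags expressed with a BIT()-style macro.
EX_BIT is a user-space stand-in for the kernel's BIT(), and the EX_
prefix on the flag names is only there to keep the sketch separate from
the real definitions.

```c
/* User-space stand-in for the kernel's BIT(nr). */
#define EX_BIT(nr)	(1UL << (nr))

#define EX_HV_FLUSH_ALL_PROCESSORS		EX_BIT(0)
#define EX_HV_FLUSH_ALL_VIRTUAL_ADDRESS_SPACES	EX_BIT(1)
#define EX_HV_FLUSH_NON_GLOBAL_MAPPINGS_ONLY	EX_BIT(2)
#define EX_HV_FLUSH_USE_EXTENDED_RANGE_FORMAT	EX_BIT(3)
```

The values are unchanged (0x1, 0x2, 0x4, 0x8); only the notation makes
the bit positions explicit.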

-- 
With Best Regards,
Andy Shevchenko
