linux-kernel - Re: [PATCH 1/4] arm64: hyperv: Add core Hyper-V include files

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Fri, 7 Dec 2018 13:42:35 +0000
From:   Will Deacon <will.deacon@....com>
To:     kys@...rosoft.com
Cc:     catalin.marinas@...m.com, mark.rutland@....com,
        marc.zyngier@....com, linux-arm-kernel@...ts.infradead.org,
        gregkh@...uxfoundation.org, linux-kernel@...r.kernel.org,
        devel@...uxdriverproject.org, olaf@...fle.de, apw@...onical.com,
        jasowang@...hat.com, sthemmin@...rosoft.com,
        Michael.H.Kelley@...rosoft.com, vkuznets@...hat.com,
        Michael Kelley <mikelley@...rosoft.com>
Subject: Re: [PATCH 1/4] arm64: hyperv: Add core Hyper-V include files

Hi all,

On Thu, Nov 22, 2018 at 03:10:56AM +0000, kys@...uxonhyperv.com wrote:
> From: Michael Kelley <mikelley@...rosoft.com>
> 
> hyperv-tlfs.h defines Hyper-V interfaces from the Hyper-V Top Level
> Functional Spec (TLFS). The TLFS is distinctly oriented to x86/x64,
> and Hyper-V has not separated out the architecture-dependent parts into
> x86/x64 vs. ARM64. So hyperv-tlfs.h includes information for ARM64
> that is not yet formally published. The TLFS is available here:

When do you plan to publish the spec? It's pretty hard to review this stuff
without knowing what it's supposed to look like.

>   docs.microsoft.com/en-us/virtualization/hyper-v-on-windows/reference/tlfs
> 
> mshyperv.h defines Linux-specific structures and routines for
> interacting with Hyper-V. It is split into an ARM64 specific file
> and an architecture independent file in include/asm-generic.
> 
> Signed-off-by: Michael Kelley <mikelley@...rosoft.com>
> Signed-off-by: K. Y. Srinivasan <kys@...rosoft.com>
> ---
>  MAINTAINERS                          |   3 +
>  arch/arm64/include/asm/hyperv-tlfs.h | 338 +++++++++++++++++++++++++++
>  arch/arm64/include/asm/mshyperv.h    | 116 +++++++++
>  include/asm-generic/mshyperv.h       | 240 +++++++++++++++++++
>  4 files changed, 697 insertions(+)
>  create mode 100644 arch/arm64/include/asm/hyperv-tlfs.h
>  create mode 100644 arch/arm64/include/asm/mshyperv.h
>  create mode 100644 include/asm-generic/mshyperv.h
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index f4855974f325..72f19cef4c48 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -6835,6 +6835,8 @@ F:	arch/x86/include/asm/trace/hyperv.h
>  F:	arch/x86/include/asm/hyperv-tlfs.h
>  F:	arch/x86/kernel/cpu/mshyperv.c
>  F:	arch/x86/hyperv
> +F:	arch/arm64/include/asm/hyperv-tlfs.h
> +F:	arch/arm64/include/asm/mshyperv.h
>  F:	drivers/hid/hid-hyperv.c
>  F:	drivers/hv/
>  F:	drivers/input/serio/hyperv-keyboard.c
> @@ -6846,6 +6848,7 @@ F:	drivers/video/fbdev/hyperv_fb.c
>  F:	net/vmw_vsock/hyperv_transport.c
>  F:	include/linux/hyperv.h
>  F:	include/uapi/linux/hyperv.h
> +F:	include/asm-generic/mshyperv.h
>  F:	tools/hv/
>  F:	Documentation/ABI/stable/sysfs-bus-vmbus
>  
> diff --git a/arch/arm64/include/asm/hyperv-tlfs.h b/arch/arm64/include/asm/hyperv-tlfs.h
> new file mode 100644
> index 000000000000..924e37600e92
> --- /dev/null
> +++ b/arch/arm64/include/asm/hyperv-tlfs.h
> @@ -0,0 +1,338 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +
> +/*
> + * This file contains definitions from the Hyper-V Hypervisor Top-Level
> + * Functional Specification (TLFS):
> + * https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdocs.microsoft.com%2Fen-us%2Fvirtualization%2Fhyper-v-on-windows%2Freference%2Ftlfs&amp;data=02%7C01%7Ckys%40microsoft.com%7Cc831a45fd63e4a4b083908d641216aa8%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C636768009113747528&amp;sdata=jRSrs9ZWXdmeS7LQUEpoSyUfBS7a5KLYy%2FolFdE2tI0%3D&amp;reserved=0

As mentioned elsewhere, please use a better link here and drop the license
boilerplate below.

> + *
> + * Copyright (C) 2018, Microsoft, Inc.
> + *
> + * Author : Michael Kelley <mikelley@...rosoft.com>
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms of the GNU General Public License version 2 as published
> + * by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful, but
> + * WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE or
> + * NON INFRINGEMENT.  See the GNU General Public License for more
> + * details.
> + */
> +
> +#ifndef _ASM_ARM64_HYPERV_H
> +#define _ASM_ARM64_HYPERV_H

__ASM_HYPER_V_H please

> +
> +#include <linux/types.h>
> +
> +/*
> + * These Hyper-V registers provide information equivalent to the CPUID
> + * instruction on x86/x64.
> + */
> +#define HV_REGISTER_HYPERVISOR_VERSION		0x00000100 /*CPUID 0x40000002 */
> +#define	HV_REGISTER_PRIVILEGES_AND_FEATURES	0x00000200 /*CPUID 0x40000003 */
> +#define	HV_REGISTER_FEATURES			0x00000201 /*CPUID 0x40000004 */
> +#define	HV_REGISTER_IMPLEMENTATION_LIMITS	0x00000202 /*CPUID 0x40000005 */
> +#define HV_ARM64_REGISTER_INTERFACE_VERSION	0x00090006 /*CPUID 0x40000001 */
> +
> +/*
> + * Feature identification. HvRegisterPrivilegesAndFeaturesInfo returns a
> + * 128-bit value with flags indicating which features are available to the
> + * partition based upon the current partition privileges. The 128-bit
> + * value is broken up with different portions stored in different 32-bit
> + * fields in the ms_hyperv structure.
> + */
> +
> +/* Partition Reference Counter available*/
> +#define HV_MSR_TIME_REF_COUNT_AVAILABLE		(1 << 1)
> +
> +/*
> + * Synthetic Timers available
> + */
> +#define HV_MSR_SYNTIMER_AVAILABLE		(1 << 3)
> +
> +/* Frequency MSRs available */
> +#define HV_FEATURE_FREQUENCY_MSRS_AVAILABLE	(1 << 8)
> +
> +/* Reference TSC available */
> +#define HV_MSR_REFERENCE_TSC_AVAILABLE		(1 << 9)
> +
> +/* Crash MSR available */
> +#define HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE	(1 << 10)
> +
> +
> +/*
> + * This group of flags is in the high order 64-bits of the returned
> + * 128-bit value.
> + */
> +
> +/* STIMER direct mode is available */
> +#define HV_STIMER_DIRECT_MODE_AVAILABLE		(1 << 19)
> +
> +/*
> + * Implementation recommendations in register
> + * HvRegisterFeaturesInfo. Indicates which behaviors the hypervisor
> + * recommends the OS implement for optimal performance.
> + */
> +
> +/*
> + * Recommend not using Auto EOI
> + */
> +#define HV_DEPRECATING_AEOI_RECOMMENDED		(1 << 9)
> +
> +/*
> + * Synthetic register definitions equivalent to MSRs on x86/x64
> + */
> +#define HV_REGISTER_CRASH_P0		0x00000210
> +#define HV_REGISTER_CRASH_P1		0x00000211
> +#define HV_REGISTER_CRASH_P2		0x00000212
> +#define HV_REGISTER_CRASH_P3		0x00000213
> +#define HV_REGISTER_CRASH_P4		0x00000214
> +#define HV_REGISTER_CRASH_CTL		0x00000215
> +
> +#define HV_REGISTER_GUEST_OSID		0x00090002
> +#define HV_REGISTER_VPINDEX		0x00090003
> +#define HV_REGISTER_TIME_REFCOUNT	0x00090004
> +#define HV_REGISTER_REFERENCE_TSC	0x00090017
> +
> +#define HV_REGISTER_SINT0		0x000A0000
> +#define HV_REGISTER_SINT1		0x000A0001
> +#define HV_REGISTER_SINT2		0x000A0002
> +#define HV_REGISTER_SINT3		0x000A0003
> +#define HV_REGISTER_SINT4		0x000A0004
> +#define HV_REGISTER_SINT5		0x000A0005
> +#define HV_REGISTER_SINT6		0x000A0006
> +#define HV_REGISTER_SINT7		0x000A0007
> +#define HV_REGISTER_SINT8		0x000A0008
> +#define HV_REGISTER_SINT9		0x000A0009
> +#define HV_REGISTER_SINT10		0x000A000A
> +#define HV_REGISTER_SINT11		0x000A000B
> +#define HV_REGISTER_SINT12		0x000A000C
> +#define HV_REGISTER_SINT13		0x000A000D
> +#define HV_REGISTER_SINT14		0x000A000E
> +#define HV_REGISTER_SINT15		0x000A000F
> +#define HV_REGISTER_SCONTROL		0x000A0010
> +#define HV_REGISTER_SVERSION		0x000A0011
> +#define HV_REGISTER_SIFP		0x000A0012
> +#define HV_REGISTER_SIPP		0x000A0013
> +#define HV_REGISTER_EOM			0x000A0014
> +#define HV_REGISTER_SIRBP		0x000A0015
> +
> +#define HV_REGISTER_STIMER0_CONFIG	0x000B0000
> +#define HV_REGISTER_STIMER0_COUNT	0x000B0001
> +#define HV_REGISTER_STIMER1_CONFIG	0x000B0002
> +#define HV_REGISTER_STIMER1_COUNT	0x000B0003
> +#define HV_REGISTER_STIMER2_CONFIG	0x000B0004
> +#define HV_REGISTER_STIMER2_COUNT	0x000B0005
> +#define HV_REGISTER_STIMER3_CONFIG	0x000B0006
> +#define HV_REGISTER_STIMER3_COUNT	0x000B0007
> +
> +/*
> + * Crash notification flags.
> + */
> +#define HV_CRASH_CTL_CRASH_NOTIFY_MSG	BIT_ULL(62)
> +#define HV_CRASH_CTL_CRASH_NOTIFY	BIT_ULL(63)
> +
> +/*
> + * The guest OS needs to register the guest ID with the hypervisor.
> + * The guest ID is a 64 bit entity and the structure of this ID is
> + * specified in the Hyper-V specification:
> + *
> + * msdn.microsoft.com/en-us/library/windows/hardware/ff542653%28v=vs.85%29.aspx

Dead link :(

> + *
> + * While the current guideline does not specify how Linux guest ID(s)
> + * need to be generated, our plan is to publish the guidelines for
> + * Linux and other guest operating systems that currently are hosted
> + * on Hyper-V. The implementation here conforms to this yet
> + * unpublished guidelines.

Again, any timeline on publishing this stuff? Right now, this is just a file
full of magic numbers and I'm not sure we're in a good position to maintain
it based on the idea that you have a cunning plan.

> + *
> + *
> + * Bit(s)
> + * 63 - Indicates if the OS is Open Source or not; 1 is Open Source
> + * 62:56 - Os Type; Linux is 0x100
> + * 55:48 - Distro specific identification
> + * 47:16 - Linux kernel version number
> + * 15:0  - Distro specific identification
> + *
> + *
> + */
> +#define HV_LINUX_VENDOR_ID              0x8100
> +
> +/* Declare the various hypercall operations. */
> +#define HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE	0x0002
> +#define HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST	0x0003
> +#define HVCALL_NOTIFY_LONG_SPIN_WAIT		0x0008
> +#define HVCALL_SEND_IPI				0x000b
> +#define HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE_EX	0x0013
> +#define HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST_EX	0x0014
> +#define HVCALL_SEND_IPI_EX			0x0015
> +#define HVCALL_GET_VP_REGISTERS			0x0050
> +#define HVCALL_SET_VP_REGISTERS			0x0051
> +#define HVCALL_POST_MESSAGE			0x005c
> +#define HVCALL_SIGNAL_EVENT			0x005d
> +#define HVCALL_RETARGET_INTERRUPT		0x007e
> +#define HVCALL_START_VIRTUAL_PROCESSOR		0x0099
> +#define HVCALL_GET_VP_INDEX_FROM_APICID		0x009a

This all sounds very x86y...

> +/* Declare standard hypercall field values. */
> +#define HV_PARTITION_ID_SELF                    ((u64)-1)
> +#define HV_VP_INDEX_SELF                        ((u32)-2)
> +
> +#define HV_HYPERCALL_FAST_BIT                   BIT(16)
> +#define HV_HYPERCALL_REP_COUNT_1                BIT_ULL(32)
> +#define HV_HYPERCALL_RESULT_MASK                GENMASK_ULL(15, 0)
> +
> +/* Define the hypercall status result */
> +
> +union hv_hypercall_status {
> +	u64 as_uint64;
> +	struct {
> +		u16 status;
> +		u16 reserved;
> +		u16 reps_completed;  /* Low 12 bits */
> +		u16 reserved2;
> +	};
> +};
> +
> +/* hypercall status code */
> +#define HV_STATUS_SUCCESS			0
> +#define HV_STATUS_INVALID_HYPERCALL_CODE	2
> +#define HV_STATUS_INVALID_HYPERCALL_INPUT	3
> +#define HV_STATUS_INVALID_ALIGNMENT		4
> +#define HV_STATUS_INSUFFICIENT_MEMORY		11
> +#define HV_STATUS_INVALID_CONNECTION_ID		18
> +#define HV_STATUS_INSUFFICIENT_BUFFERS		19
> +
> +/* Define output layout for Get VP Register hypercall */
> +struct hv_get_vp_register_output {
> +	u64 registervaluelow;
> +	u64 registervaluehigh;
> +};
> +
> +#define HV_FLUSH_ALL_PROCESSORS			BIT(0)
> +#define HV_FLUSH_ALL_VIRTUAL_ADDRESS_SPACES	BIT(1)
> +#define HV_FLUSH_NON_GLOBAL_MAPPINGS_ONLY	BIT(2)
> +#define HV_FLUSH_USE_EXTENDED_RANGE_FORMAT	BIT(3)
> +
> +enum HV_GENERIC_SET_FORMAT {
> +	HV_GENERIC_SET_SPARSE_4K,
> +	HV_GENERIC_SET_ALL,
> +};
> +
> +/*
> + * The Hyper-V TimeRefCount register and the TSC
> + * page provide a guest VM clock with 100ns tick rate
> + */
> +#define HV_CLOCK_HZ (NSEC_PER_SEC/100)
> +
> +/*
> + * The fields in this structure are set by Hyper-V and read
> + * by the Linux guest.  They should be accessed with READ_ONCE()
> + * so the compiler doesn't optimize in a way that will cause
> + * problems.
> + */
> +struct ms_hyperv_tsc_page {
> +	u32 tsc_sequence;
> +	u32 reserved1;
> +	u64 tsc_scale;
> +	s64 tsc_offset;
> +	u64 reserved2[509];
> +};

It might be clearer to express this as a union with the page size:

struct ms_hyperv_tsc_page {
	union {
		struct {
			__u32	tsc_sequence;
			__u32	reserved1;
		};
		__u8	reserved2[HV_HYP_PAGE_SIZE];
	};
};

What's the required alignment on this structure?
Also, do you intend for 32-bit guests to use this ABI as well?

> +
> +/* Define the number of synthetic interrupt sources. */
> +#define HV_SYNIC_SINT_COUNT		(16)
> +/* Define the expected SynIC version. */
> +#define HV_SYNIC_VERSION_1		(0x1)
> +
> +#define HV_SYNIC_CONTROL_ENABLE		(1ULL << 0)
> +#define HV_SYNIC_SIMP_ENABLE		(1ULL << 0)
> +#define HV_SYNIC_SIEFP_ENABLE		(1ULL << 0)
> +#define HV_SYNIC_SINT_MASKED		(1ULL << 16)
> +#define HV_SYNIC_SINT_AUTO_EOI		(1ULL << 17)
> +#define HV_SYNIC_SINT_VECTOR_MASK	(0xFF)
> +
> +#define HV_SYNIC_STIMER_COUNT		(4)
> +
> +/* Define synthetic interrupt controller message constants. */
> +#define HV_MESSAGE_SIZE			(256)
> +#define HV_MESSAGE_PAYLOAD_BYTE_COUNT	(240)
> +#define HV_MESSAGE_PAYLOAD_QWORD_COUNT	(30)
> +
> +/* Define hypervisor message types. */
> +enum hv_message_type {
> +	HVMSG_NONE			= 0x00000000,
> +
> +	/* Memory access messages. */
> +	HVMSG_UNMAPPED_GPA		= 0x80000000,
> +	HVMSG_GPA_INTERCEPT		= 0x80000001,
> +
> +	/* Timer notification messages. */
> +	HVMSG_TIMER_EXPIRED		= 0x80000010,
> +
> +	/* Error messages. */
> +	HVMSG_INVALID_VP_REGISTER_VALUE	= 0x80000020,
> +	HVMSG_UNRECOVERABLE_EXCEPTION	= 0x80000021,
> +	HVMSG_UNSUPPORTED_FEATURE	= 0x80000022,
> +
> +	/* Trace buffer complete messages. */
> +	HVMSG_EVENTLOG_BUFFERCOMPLETE	= 0x80000040,
> +};
> +
> +/* Define synthetic interrupt controller message flags. */
> +union hv_message_flags {
> +	__u8 asu8;
> +	struct {
> +		__u8 msg_pending:1;
> +		__u8 reserved:7;
> +	};
> +};
> +
> +/* Define port identifier type. */
> +union hv_port_id {
> +	__u32 asu32;
> +	struct {
> +		__u32 id:24;
> +		__u32 reserved:8;
> +	} u;
> +};

Hmm, I thought bitfields weren't a good idea for this sort of thing. Isn't
it up to the compiler how they are laid out in memory?

> +
> +/* Define synthetic interrupt controller message header. */
> +struct hv_message_header {
> +	__u32 message_type;
> +	__u8 payload_size;
> +	union hv_message_flags message_flags;
> +	__u8 reserved[2];
> +	union {
> +		__u64 sender;
> +		union hv_port_id port;
> +	};
> +};
> +
> +/* Define synthetic interrupt controller message format. */
> +struct hv_message {
> +	struct hv_message_header header;
> +	union {
> +		__u64 payload[HV_MESSAGE_PAYLOAD_QWORD_COUNT];
> +	} u;
> +};
> +
> +/* Define the synthetic interrupt message page layout. */
> +struct hv_message_page {
> +	struct hv_message sint_message[HV_SYNIC_SINT_COUNT];
> +};
> +
> +/* Define timer message payload structure. */
> +struct hv_timer_message_payload {
> +	__u32 timer_index;
> +	__u32 reserved;
> +	__u64 expiration_time;	/* When the timer expired */
> +	__u64 delivery_time;	/* When the message was delivered */
> +};
> +
> +#define HV_STIMER_ENABLE		(1ULL << 0)
> +#define HV_STIMER_PERIODIC		(1ULL << 1)
> +#define HV_STIMER_LAZY			(1ULL << 2)
> +#define HV_STIMER_AUTOENABLE		(1ULL << 3)
> +#define HV_STIMER_SINT(config)		(__u8)(((config) >> 16) & 0x0F)
> +
> +#endif
> diff --git a/arch/arm64/include/asm/mshyperv.h b/arch/arm64/include/asm/mshyperv.h
> new file mode 100644
> index 000000000000..a87c431d58b3
> --- /dev/null
> +++ b/arch/arm64/include/asm/mshyperv.h
> @@ -0,0 +1,116 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +
> +/*
> + * Linux-specific definitions for managing interactions with Microsoft's
> + * Hyper-V hypervisor. The definitions in this file are specific to
> + * the ARM64 architecture.  See include/asm-generic/mshyperv.h for
> + * definitions are that architecture independent.
> + *
> + * Definitions that are specified in the Hyper-V Top Level Functional
> + * Spec (TLFS) should not go in this file, but should instead go in
> + * hyperv-tlfs.h.
> + *
> + * Copyright (C) 2018, Microsoft, Inc.
> + *
> + * Author : Michael Kelley <mikelley@...rosoft.com>
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms of the GNU General Public License version 2 as published
> + * by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful, but
> + * WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE or
> + * NON INFRINGEMENT.  See the GNU General Public License for more
> + * details.
> + */
> +
> +#ifndef _ASM_ARM64_MSHYPERV_H
> +#define _ASM_ARM64_MSHYPERV_H
> +
> +#include <linux/types.h>
> +#include <linux/interrupt.h>
> +#include <linux/clocksource.h>
> +#include <linux/irq.h>
> +#include <linux/irqdesc.h>
> +#include <asm/hyperv-tlfs.h>
> +
> +/*
> + * Define the IRQ numbers/vectors used by Hyper-V VMbus interrupts
> + * and by STIMER0 Direct Mode interrupts. Hyper-V should be supplying
> + * these values through ACPI, but there are no other interrupting
> + * devices in a Hyper-V VM on ARM64, so it's OK to hard code for now.
> + * The "CALLBACK_VECTOR" terminology is a left-over from the x86/x64
> + * world that is used in architecture independent Hyper-V code.
> + */
> +#define HYPERVISOR_CALLBACK_VECTOR 16
> +#define	HV_STIMER0_IRQNR	   17
> +
> +extern u64 hv_do_hvc(u64 control, ...);
> +extern u64 hv_do_hvc_fast_get(u64 control, u64 input1, u64 input2, u64 input3,
> +		struct hv_get_vp_register_output *output);
> +
> +/*
> + * Declare calls to get and set Hyper-V VP register values on ARM64, which
> + * requires a hypercall.
> + */
> +extern void hv_set_vpreg(u32 reg, u64 value);
> +extern u64 hv_get_vpreg(u32 reg);
> +extern void hv_get_vpreg_128(u32 reg, struct hv_get_vp_register_output *result);
> +
> +/*
> + * Use the Hyper-V provided stimer0 as the timer that is made
> + * available to the architecture independent Hyper-V drivers.
> + */
> +#define hv_init_timer(timer, tick) \
> +		hv_set_vpreg(HV_REGISTER_STIMER0_COUNT + (2*timer), tick)
> +#define hv_init_timer_config(timer, val) \
> +		hv_set_vpreg(HV_REGISTER_STIMER0_CONFIG + (2*timer), val)
> +#define hv_get_current_tick(tick) \
> +		(tick = hv_get_vpreg(HV_REGISTER_TIME_REFCOUNT))
> +
> +#define hv_get_simp(val) (val = hv_get_vpreg(HV_REGISTER_SIPP))
> +#define hv_set_simp(val) hv_set_vpreg(HV_REGISTER_SIPP, val)
> +
> +#define hv_get_siefp(val) (val = hv_get_vpreg(HV_REGISTER_SIFP))
> +#define hv_set_siefp(val) hv_set_vpreg(HV_REGISTER_SIFP, val)
> +
> +#define hv_get_synic_state(val) (val = hv_get_vpreg(HV_REGISTER_SCONTROL))
> +#define hv_set_synic_state(val) hv_set_vpreg(HV_REGISTER_SCONTROL, val)
> +
> +#define hv_get_vp_index(index) (index = hv_get_vpreg(HV_REGISTER_VPINDEX))
> +
> +#define hv_signal_eom()	hv_set_vpreg(HV_REGISTER_EOM, 0)
> +
> +/*
> + * Hyper-V SINT registers are numbered sequentially, so we can just
> + * add the SINT number to the register number of SINT0
> + */
> +#define hv_get_synint_state(sint_num, val) \
> +		(val = hv_get_vpreg(HV_REGISTER_SINT0 + sint_num))
> +#define hv_set_synint_state(sint_num, val) \
> +		hv_set_vpreg(HV_REGISTER_SINT0 + sint_num, val)
> +
> +#define hv_get_crash_ctl(val) \
> +		(val = hv_get_vpreg(HV_REGISTER_CRASH_CTL))
> +
> +#if IS_ENABLED(CONFIG_HYPERV)
> +#define hv_enable_stimer0_percpu_irq(irq)	enable_percpu_irq(irq, 0)
> +#define hv_disable_stimer0_percpu_irq(irq)	disable_percpu_irq(irq)
> +#endif
> +
> +/* ARM64 specific code to read the hardware clock */
> +static inline u64 hv_read_hwclock(void)
> +{
> +	u64 result;
> +
> +	isb();
> +	result = read_sysreg(cntvct_el0);
> +	isb();
> +
> +	return result;
> +}

Shouldn't you use arch_timer_read_counter() or arch_counter_get_cntvct()
instead?

> +#include <asm-generic/mshyperv.h>
> +
> +#endif
> diff --git a/include/asm-generic/mshyperv.h b/include/asm-generic/mshyperv.h
> new file mode 100644
> index 000000000000..fbe1ec89c85a
> --- /dev/null
> +++ b/include/asm-generic/mshyperv.h
> @@ -0,0 +1,240 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +
> +/*
> + * Linux-specific definitions for managing interactions with Microsoft's
> + * Hyper-V hypervisor. The definitions in this file are architecture
> + * independent. See arch/<arch>/include/asm/mshyperv.h for definitions
> + * that are specific to architecture <arch>.
> + *
> + * Definitions that are specified in the Hyper-V Top Level Functional
> + * Spec (TLFS) should not go in this file, but should instead go in
> + * hyperv-tlfs.h.
> + *
> + * Copyright (C) 2018, Microsoft, Inc.
> + *
> + * Author : Michael Kelley <mikelley@...rosoft.com>
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms of the GNU General Public License version 2 as published
> + * by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful, but
> + * WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE or
> + * NON INFRINGEMENT.  See the GNU General Public License for more
> + * details.
> + */
> +
> +#ifndef _ASM_GENERIC_MSHYPERV_H
> +#define _ASM_GENERIC_MSHYPERV_H
> +
> +#include <linux/types.h>
> +#include <linux/interrupt.h>
> +#include <linux/clocksource.h>
> +#include <linux/irq.h>
> +#include <linux/irqdesc.h>
> +#include <asm/hyperv-tlfs.h>
> +
> +/*
> + * Hyper-V always runs with a page size of 4096. These definitions
> + * are used when communicating with Hyper-V using guest physical
> + * pages and guest physical page addresses, since the guest page
> + * size may not be 4096 on ARM64.
> + */
> +#define HV_HYP_PAGE_SIZE	4096
> +#define HV_HYP_PAGE_SHIFT	12

Maybe define the page size in terms of the shift?

> +#define HV_HYP_PAGE_MASK	(~(HV_HYP_PAGE_SIZE - 1))
> +
> +
> +struct ms_hyperv_info {
> +	u32 features;
> +	u32 misc_features;
> +	u32 hints;
> +	u32 max_vp_index;
> +	u32 max_lp_index;
> +};

What is the define endianness for in-memory structures shared with the
hypervisor?

> +extern struct ms_hyperv_info ms_hyperv;
> +
> +extern u64 hv_do_hypercall(u64 control, void *inputaddr, void *outputaddr);
> +extern u64 hv_do_fast_hypercall8(u16 control, u64 input8);
> +
> +/*
> + * The guest OS needs to register the guest ID with the hypervisor.
> + * The guest ID is a 64 bit entity and the structure of this ID is
> + * specified in the Hyper-V specification:
> + *
> + * msdn.microsoft.com/en-us/library/windows/hardware/ff542653%28v=vs.85%29.aspx
> + *
> + * While the current guideline does not specify how Linux guest ID(s)
> + * need to be generated, our plan is to publish the guidelines for
> + * Linux and other guest operating systems that currently are hosted
> + * on Hyper-V. The implementation here conforms to this yet
> + * unpublished guidelines.
> + *
> + *
> + * Bit(s)
> + * 63 - Indicates if the OS is Open Source or not; 1 is Open Source
> + * 62:56 - Os Type; Linux is 0x100
> + * 55:48 - Distro specific identification
> + * 47:16 - Linux kernel version number
> + * 15:0  - Distro specific identification

This looks familiar...

> + * Generate the guest ID based on the guideline described above.
> + */
> +
> +static inline  __u64 generate_guest_id(__u64 d_info1, __u64 kernel_version,
> +				       __u64 d_info2)
> +{
> +	__u64 guest_id = 0;
> +
> +	guest_id = (((__u64)HV_LINUX_VENDOR_ID) << 48);
> +	guest_id |= (d_info1 << 48);
> +	guest_id |= (kernel_version << 16);
> +	guest_id |= d_info2;
> +
> +	return guest_id;
> +}
> +
> +
> +/* Free the message slot and signal end-of-message if required */
> +static inline void vmbus_signal_eom(struct hv_message *msg, u32 old_msg_type)
> +{
> +	/*
> +	 * On crash we're reading some other CPU's message page and we need
> +	 * to be careful: this other CPU may already had cleared the header
> +	 * and the host may already had delivered some other message there.
> +	 * In case we blindly write msg->header.message_type we're going
> +	 * to lose it. We can still lose a message of the same type but
> +	 * we count on the fact that there can only be one
> +	 * CHANNELMSG_UNLOAD_RESPONSE and we don't care about other messages
> +	 * on crash.
> +	 */
> +	if (cmpxchg(&msg->header.message_type, old_msg_type,
> +		    HVMSG_NONE) != old_msg_type)
> +		return;
> +
> +	/*
> +	 * Make sure the write to MessageType (ie set to
> +	 * HVMSG_NONE) happens before we read the
> +	 * MessagePending and EOMing. Otherwise, the EOMing
> +	 * will not deliver any more messages since there is
> +	 * no empty slot
> +	 */
> +	mb();

A successful cmpxchg() already implies full barriers, so this isn't needed.
I also wonder whether cmpxchg_acquire() would be sufficient here.

> +
> +	if (msg->header.message_flags.msg_pending) {
> +		/*
> +		 * This will cause message queue rescan to
> +		 * possibly deliver another msg from the
> +		 * hypervisor
> +		 */
> +		hv_signal_eom();
> +	}
> +}
> +
> +void hv_setup_vmbus_irq(void (*handler)(void));
> +void hv_remove_vmbus_irq(void);
> +void hv_enable_vmbus_irq(void);
> +void hv_disable_vmbus_irq(void);
> +
> +void hv_setup_kexec_handler(void (*handler)(void));
> +void hv_remove_kexec_handler(void);
> +void hv_setup_crash_handler(void (*handler)(struct pt_regs *regs));
> +void hv_remove_crash_handler(void);
> +
> +#if IS_ENABLED(CONFIG_HYPERV)
> +extern struct clocksource *hyperv_cs;
> +
> +/*
> + * Hypervisor's notion of virtual processor ID is different from
> + * Linux' notion of CPU ID. This information can only be retrieved
> + * in the context of the calling CPU. Setup a map for easy access
> + * to this information.
> + */
> +extern u32 *hv_vp_index;
> +extern u32 hv_max_vp_index;
> +
> +/* Sentinel value for an uninitialized entry in hv_vp_index array */
> +#define VP_INVAL	U32_MAX
> +
> +/**
> + * hv_cpu_number_to_vp_number() - Map CPU to VP.
> + * @cpu_number: CPU number in Linux terms
> + *
> + * This function returns the mapping between the Linux processor
> + * number and the hypervisor's virtual processor number, useful
> + * in making hypercalls and such that talk about specific
> + * processors.
> + *
> + * Return: Virtual processor number in Hyper-V terms
> + */
> +static inline int hv_cpu_number_to_vp_number(int cpu_number)
> +{
> +	return hv_vp_index[cpu_number];
> +}
> +
> +void hyperv_report_panic(struct pt_regs *regs, long err);
> +void hyperv_report_panic_msg(phys_addr_t pa, size_t size);
> +bool hv_is_hyperv_initialized(void);
> +void hyperv_cleanup(void);
> +#else /* CONFIG_HYPERV */
> +static inline bool hv_is_hyperv_initialized(void) { return false; }
> +static inline void hyperv_cleanup(void) {}
> +#endif /* CONFIG_HYPERV */
> +
> +#if IS_ENABLED(CONFIG_HYPERV)
> +extern int hv_setup_stimer0_irq(int *irq, int *vector, void (*handler)(void));
> +extern void hv_remove_stimer0_irq(int irq);
> +#endif
> +
> +static inline u64 hv_read_tsc_page_tsc(const struct ms_hyperv_tsc_page *tsc_pg,
> +				       u64 *cur_tsc)
> +{
> +	u64	scale, offset;
> +	u32	sequence;
> +
> +	/*
> +	 * The protocol for reading Hyper-V TSC page is specified in Hypervisor
> +	 * Top-Level Functional Specification.  To get the reference time we
> +	 * must do the following:
> +	 * - READ ReferenceTscSequence
> +	 *   A special '0' value indicates the time source is unreliable and we
> +	 *   need to use something else.
> +	 * - ReferenceTime =
> +	 *     ((HWclock val) * ReferenceTscScale) >> 64) + ReferenceTscOffset
> +	 * - READ ReferenceTscSequence again. In case its value has changed
> +	 *   since our first reading we need to discard ReferenceTime and repeat
> +	 *   the whole sequence as the hypervisor was updating the page in
> +	 *   between.
> +	 */
> +	do {
> +		sequence = READ_ONCE(tsc_pg->tsc_sequence);
> +		/*
> +		 * Make sure we read sequence before we read other values from
> +		 * TSC page.
> +		 */
> +		smp_rmb();
> +
> +		scale = READ_ONCE(tsc_pg->tsc_scale);
> +		offset = READ_ONCE(tsc_pg->tsc_offset);
> +		*cur_tsc = hv_read_hwclock();
> +
> +		/*
> +		 * Make sure we read sequence after we read all other values
> +		 * from TSC page.
> +		 */
> +		smp_rmb();
> +
> +	} while (READ_ONCE(tsc_pg->tsc_sequence) != sequence);

Could you explain what the writer side looks like, please? I'm a bit
confused as to why we're not using the bottom bit of the sequence
number to detect a concurrent update.

Will