[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <2025051838-send-hurler-581c@gregkh>
Date: Sun, 18 May 2025 08:41:38 +0200
From: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
To: linux-kernel@...r.kernel.org,
akpm@...ux-foundation.org,
torvalds@...ux-foundation.org,
stable@...r.kernel.org
Cc: lwn@....net,
jslaby@...e.cz,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>
Subject: Re: Linux 6.1.139
diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu
index 78c26280c473..1468609052c7 100644
--- a/Documentation/ABI/testing/sysfs-devices-system-cpu
+++ b/Documentation/ABI/testing/sysfs-devices-system-cpu
@@ -514,6 +514,7 @@ Description: information about CPUs heterogeneity.
What: /sys/devices/system/cpu/vulnerabilities
/sys/devices/system/cpu/vulnerabilities/gather_data_sampling
+ /sys/devices/system/cpu/vulnerabilities/indirect_target_selection
/sys/devices/system/cpu/vulnerabilities/itlb_multihit
/sys/devices/system/cpu/vulnerabilities/l1tf
/sys/devices/system/cpu/vulnerabilities/mds
diff --git a/Documentation/admin-guide/hw-vuln/index.rst b/Documentation/admin-guide/hw-vuln/index.rst
index 3e4a14e38b49..dc69ba0b05e4 100644
--- a/Documentation/admin-guide/hw-vuln/index.rst
+++ b/Documentation/admin-guide/hw-vuln/index.rst
@@ -22,3 +22,4 @@ are configurable at compile, boot or run time.
gather_data_sampling.rst
srso
reg-file-data-sampling
+ indirect-target-selection
diff --git a/Documentation/admin-guide/hw-vuln/indirect-target-selection.rst b/Documentation/admin-guide/hw-vuln/indirect-target-selection.rst
new file mode 100644
index 000000000000..4788e14ebce0
--- /dev/null
+++ b/Documentation/admin-guide/hw-vuln/indirect-target-selection.rst
@@ -0,0 +1,156 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+Indirect Target Selection (ITS)
+===============================
+
+ITS is a vulnerability in some Intel CPUs that support Enhanced IBRS and were
+released before Alder Lake. ITS may allow an attacker to control the prediction
+of indirect branches and RETs located in the lower half of a cacheline.
+
+ITS is assigned CVE-2024-28956 with a CVSS score of 4.7 (Medium).
+
+Scope of Impact
+---------------
+- **eIBRS Guest/Host Isolation**: Indirect branches in KVM/kernel may still be
+ predicted with unintended target corresponding to a branch in the guest.
+
+- **Intra-Mode BTI**: In-kernel training such as through cBPF or other native
+ gadgets.
+
+- **Indirect Branch Prediction Barrier (IBPB)**: After an IBPB, indirect
+ branches may still be predicted with targets corresponding to direct branches
+ executed prior to the IBPB. This is fixed by the IPU 2025.1 microcode, which
+ should be available via distro updates. Alternatively microcode can be
+ obtained from Intel's github repository [#f1]_.
+
+Affected CPUs
+-------------
+Below is the list of ITS affected CPUs [#f2]_ [#f3]_:
+
+ ======================== ============ ==================== ===============
+ Common name Family_Model eIBRS Intra-mode BTI
+ Guest/Host Isolation
+ ======================== ============ ==================== ===============
+ SKYLAKE_X (step >= 6) 06_55H Affected Affected
+ ICELAKE_X 06_6AH Not affected Affected
+ ICELAKE_D 06_6CH Not affected Affected
+ ICELAKE_L 06_7EH Not affected Affected
+ TIGERLAKE_L 06_8CH Not affected Affected
+ TIGERLAKE 06_8DH Not affected Affected
+ KABYLAKE_L (step >= 12) 06_8EH Affected Affected
+ KABYLAKE (step >= 13) 06_9EH Affected Affected
+ COMETLAKE 06_A5H Affected Affected
+ COMETLAKE_L 06_A6H Affected Affected
+ ROCKETLAKE 06_A7H Not affected Affected
+ ======================== ============ ==================== ===============
+
+- All affected CPUs enumerate Enhanced IBRS feature.
+- IBPB isolation is affected on all ITS affected CPUs, and need a microcode
+ update for mitigation.
+- None of the affected CPUs enumerate BHI_CTRL which was introduced in Golden
+ Cove (Alder Lake and Sapphire Rapids). This can help guests to determine the
+ host's affected status.
+- Intel Atom CPUs are not affected by ITS.
+
+Mitigation
+----------
+As only the indirect branches and RETs that have their last byte of instruction
+in the lower half of the cacheline are vulnerable to ITS, the basic idea behind
+the mitigation is to not allow indirect branches in the lower half.
+
+This is achieved by relying on existing retpoline support in the kernel, and in
+compilers. ITS-vulnerable retpoline sites are runtime patched to point to newly
+added ITS-safe thunks. These safe thunks consists of indirect branch in the
+second half of the cacheline. Not all retpoline sites are patched to thunks, if
+a retpoline site is evaluated to be ITS-safe, it is replaced with an inline
+indirect branch.
+
+Dynamic thunks
+~~~~~~~~~~~~~~
+From a dynamically allocated pool of safe-thunks, each vulnerable site is
+replaced with a new thunk, such that they get a unique address. This could
+improve the branch prediction accuracy. Also, it is a defense-in-depth measure
+against aliasing.
+
+Note, for simplicity, indirect branches in eBPF programs are always replaced
+with a jump to a static thunk in __x86_indirect_its_thunk_array. If required,
+in future this can be changed to use dynamic thunks.
+
+All vulnerable RETs are replaced with a static thunk, they do not use dynamic
+thunks. This is because RETs get their prediction from RSB mostly that does not
+depend on source address. RETs that underflow RSB may benefit from dynamic
+thunks. But, RETs significantly outnumber indirect branches, and any benefit
+from a unique source address could be outweighed by the increased icache
+footprint and iTLB pressure.
+
+Retpoline
+~~~~~~~~~
+Retpoline sequence also mitigates ITS-unsafe indirect branches. For this
+reason, when retpoline is enabled, ITS mitigation only relocates the RETs to
+safe thunks. Unless user requested the RSB-stuffing mitigation.
+
+Mitigation in guests
+^^^^^^^^^^^^^^^^^^^^
+All guests deploy ITS mitigation by default, irrespective of eIBRS enumeration
+and Family/Model of the guest. This is because eIBRS feature could be hidden
+from a guest. One exception to this is when a guest enumerates BHI_DIS_S, which
+indicates that the guest is running on an unaffected host.
+
+To prevent guests from unnecessarily deploying the mitigation on unaffected
+platforms, Intel has defined ITS_NO bit(62) in MSR IA32_ARCH_CAPABILITIES. When
+a guest sees this bit set, it should not enumerate the ITS bug. Note, this bit
+is not set by any hardware, but is **intended for VMMs to synthesize** it for
+guests as per the host's affected status.
+
+Mitigation options
+^^^^^^^^^^^^^^^^^^
+The ITS mitigation can be controlled using the "indirect_target_selection"
+kernel parameter. The available options are:
+
+ ======== ===================================================================
+ on (default) Deploy the "Aligned branch/return thunks" mitigation.
+ If spectre_v2 mitigation enables retpoline, aligned-thunks are only
+ deployed for the affected RET instructions. Retpoline mitigates
+ indirect branches.
+
+ off Disable ITS mitigation.
+
+ vmexit Equivalent to "=on" if the CPU is affected by guest/host isolation
+ part of ITS. Otherwise, mitigation is not deployed. This option is
+ useful when host userspace is not in the threat model, and only
+ attacks from guest to host are considered.
+
+ force Force the ITS bug and deploy the default mitigation.
+ ======== ===================================================================
+
+Sysfs reporting
+---------------
+
+The sysfs file showing ITS mitigation status is:
+
+ /sys/devices/system/cpu/vulnerabilities/indirect_target_selection
+
+Note, microcode mitigation status is not reported in this file.
+
+The possible values in this file are:
+
+.. list-table::
+
+ * - Not affected
+ - The processor is not vulnerable.
+ * - Vulnerable
+ - System is vulnerable and no mitigation has been applied.
+ * - Vulnerable, KVM: Not affected
+ - System is vulnerable to intra-mode BTI, but not affected by eIBRS
+ guest/host isolation.
+ * - Mitigation: Aligned branch/return thunks
+ - The mitigation is enabled, affected indirect branches and RETs are
+ relocated to safe thunks.
+
+References
+----------
+.. [#f1] Microcode repository - https://github.com/intel/Intel-Linux-Processor-Microcode-Data-Files
+
+.. [#f2] Affected Processors list - https://www.intel.com/content/www/us/en/developer/topic-technology/software-security-guidance/processors-affected-consolidated-product-cpu-model.html
+
+.. [#f3] Affected Processors list (machine readable) - https://github.com/intel/Intel-affected-processor-list
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 18c8fc60db93..6938c8cd7a6f 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -2025,6 +2025,20 @@
different crypto accelerators. This option can be used
to achieve best performance for particular HW.
+ indirect_target_selection= [X86,Intel] Mitigation control for Indirect
+ Target Selection(ITS) bug in Intel CPUs. Updated
+ microcode is also required for a fix in IBPB.
+
+ on: Enable mitigation (default).
+ off: Disable mitigation.
+ force: Force the ITS bug and deploy default
+ mitigation.
+ vmexit: Only deploy mitigation if CPU is affected by
+ guest/host isolation part of ITS.
+
+ For details see:
+ Documentation/admin-guide/hw-vuln/indirect-target-selection.rst
+
init= [KNL]
Format: <full_path>
Run specified binary instead of /sbin/init as init
@@ -3263,6 +3277,7 @@
expose users to several CPU vulnerabilities.
Equivalent to: if nokaslr then kpti=0 [ARM64]
gather_data_sampling=off [X86]
+ indirect_target_selection=off [X86]
kvm.nx_huge_pages=off [X86]
l1tf=off [X86]
mds=off [X86]
diff --git a/Makefile b/Makefile
index d8d39fe5ef70..80748b7c3540 100644
--- a/Makefile
+++ b/Makefile
@@ -1,7 +1,7 @@
# SPDX-License-Identifier: GPL-2.0
VERSION = 6
PATCHLEVEL = 1
-SUBLEVEL = 138
+SUBLEVEL = 139
EXTRAVERSION =
NAME = Curry Ramen
diff --git a/arch/arm64/boot/dts/freescale/imx8mm-verdin.dtsi b/arch/arm64/boot/dts/freescale/imx8mm-verdin.dtsi
index a9bfcdf378ce..fb75800e8999 100644
--- a/arch/arm64/boot/dts/freescale/imx8mm-verdin.dtsi
+++ b/arch/arm64/boot/dts/freescale/imx8mm-verdin.dtsi
@@ -144,6 +144,19 @@ reg_usdhc2_vmmc: regulator-usdhc2 {
startup-delay-us = <20000>;
};
+ reg_usdhc2_vqmmc: regulator-usdhc2-vqmmc {
+ compatible = "regulator-gpio";
+ pinctrl-names = "default";
+ pinctrl-0 = <&pinctrl_usdhc2_vsel>;
+ gpios = <&gpio1 4 GPIO_ACTIVE_HIGH>;
+ regulator-max-microvolt = <3300000>;
+ regulator-min-microvolt = <1800000>;
+ states = <1800000 0x1>,
+ <3300000 0x0>;
+ regulator-name = "PMIC_USDHC_VSELECT";
+ vin-supply = <®_nvcc_sd>;
+ };
+
reserved-memory {
#address-cells = <2>;
#size-cells = <2>;
@@ -262,7 +275,7 @@ &gpio1 {
"SODIMM_19",
"",
"",
- "",
+ "PMIC_USDHC_VSELECT",
"",
"",
"",
@@ -788,6 +801,7 @@ &usdhc2 {
pinctrl-2 = <&pinctrl_usdhc2_200mhz>, <&pinctrl_usdhc2_cd>;
pinctrl-3 = <&pinctrl_usdhc2_sleep>, <&pinctrl_usdhc2_cd_sleep>;
vmmc-supply = <®_usdhc2_vmmc>;
+ vqmmc-supply = <®_usdhc2_vqmmc>;
};
&wdog1 {
@@ -1210,13 +1224,17 @@ pinctrl_usdhc2_pwr_en: usdhc2pwrengrp {
<MX8MM_IOMUXC_NAND_CLE_GPIO3_IO5 0x6>; /* SODIMM 76 */
};
+ pinctrl_usdhc2_vsel: usdhc2vselgrp {
+ fsl,pins =
+ <MX8MM_IOMUXC_GPIO1_IO04_GPIO1_IO4 0x10>; /* PMIC_USDHC_VSELECT */
+ };
+
/*
* Note: Due to ERR050080 we use discrete external on-module resistors pulling-up to the
* on-module +V3.3_1.8_SD (LDO5) rail and explicitly disable the internal pull-ups here.
*/
pinctrl_usdhc2: usdhc2grp {
fsl,pins =
- <MX8MM_IOMUXC_GPIO1_IO04_USDHC2_VSELECT 0x10>,
<MX8MM_IOMUXC_SD2_CLK_USDHC2_CLK 0x90>, /* SODIMM 78 */
<MX8MM_IOMUXC_SD2_CMD_USDHC2_CMD 0x90>, /* SODIMM 74 */
<MX8MM_IOMUXC_SD2_DATA0_USDHC2_DATA0 0x90>, /* SODIMM 80 */
@@ -1227,7 +1245,6 @@ pinctrl_usdhc2: usdhc2grp {
pinctrl_usdhc2_100mhz: usdhc2-100mhzgrp {
fsl,pins =
- <MX8MM_IOMUXC_GPIO1_IO04_USDHC2_VSELECT 0x10>,
<MX8MM_IOMUXC_SD2_CLK_USDHC2_CLK 0x94>,
<MX8MM_IOMUXC_SD2_CMD_USDHC2_CMD 0x94>,
<MX8MM_IOMUXC_SD2_DATA0_USDHC2_DATA0 0x94>,
@@ -1238,7 +1255,6 @@ pinctrl_usdhc2_100mhz: usdhc2-100mhzgrp {
pinctrl_usdhc2_200mhz: usdhc2-200mhzgrp {
fsl,pins =
- <MX8MM_IOMUXC_GPIO1_IO04_USDHC2_VSELECT 0x10>,
<MX8MM_IOMUXC_SD2_CLK_USDHC2_CLK 0x96>,
<MX8MM_IOMUXC_SD2_CMD_USDHC2_CMD 0x96>,
<MX8MM_IOMUXC_SD2_DATA0_USDHC2_DATA0 0x96>,
@@ -1250,7 +1266,6 @@ pinctrl_usdhc2_200mhz: usdhc2-200mhzgrp {
/* Avoid backfeeding with removed card power */
pinctrl_usdhc2_sleep: usdhc2slpgrp {
fsl,pins =
- <MX8MM_IOMUXC_GPIO1_IO04_USDHC2_VSELECT 0x0>,
<MX8MM_IOMUXC_SD2_CLK_USDHC2_CLK 0x0>,
<MX8MM_IOMUXC_SD2_CMD_USDHC2_CMD 0x0>,
<MX8MM_IOMUXC_SD2_DATA0_USDHC2_DATA0 0x0>,
diff --git a/arch/arm64/include/asm/cputype.h b/arch/arm64/include/asm/cputype.h
index cdb024dd33f5..fe022fe2d4f6 100644
--- a/arch/arm64/include/asm/cputype.h
+++ b/arch/arm64/include/asm/cputype.h
@@ -81,6 +81,7 @@
#define ARM_CPU_PART_CORTEX_A78AE 0xD42
#define ARM_CPU_PART_CORTEX_X1 0xD44
#define ARM_CPU_PART_CORTEX_A510 0xD46
+#define ARM_CPU_PART_CORTEX_X1C 0xD4C
#define ARM_CPU_PART_CORTEX_A520 0xD80
#define ARM_CPU_PART_CORTEX_A710 0xD47
#define ARM_CPU_PART_CORTEX_A715 0xD4D
@@ -159,6 +160,7 @@
#define MIDR_CORTEX_A78AE MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A78AE)
#define MIDR_CORTEX_X1 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X1)
#define MIDR_CORTEX_A510 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A510)
+#define MIDR_CORTEX_X1C MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X1C)
#define MIDR_CORTEX_A520 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A520)
#define MIDR_CORTEX_A710 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A710)
#define MIDR_CORTEX_A715 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A715)
diff --git a/arch/arm64/include/asm/insn.h b/arch/arm64/include/asm/insn.h
index 834bff720582..0075dc5d8238 100644
--- a/arch/arm64/include/asm/insn.h
+++ b/arch/arm64/include/asm/insn.h
@@ -619,6 +619,7 @@ u32 aarch64_insn_gen_cas(enum aarch64_insn_register result,
}
#endif
u32 aarch64_insn_gen_dmb(enum aarch64_insn_mb_type type);
+u32 aarch64_insn_gen_dsb(enum aarch64_insn_mb_type type);
s32 aarch64_get_branch_offset(u32 insn);
u32 aarch64_set_branch_offset(u32 insn, s32 offset);
diff --git a/arch/arm64/include/asm/spectre.h b/arch/arm64/include/asm/spectre.h
index 6f9f4f68b9e2..9de17d4bb036 100644
--- a/arch/arm64/include/asm/spectre.h
+++ b/arch/arm64/include/asm/spectre.h
@@ -96,6 +96,9 @@ enum mitigation_state arm64_get_meltdown_state(void);
enum mitigation_state arm64_get_spectre_bhb_state(void);
bool is_spectre_bhb_affected(const struct arm64_cpu_capabilities *entry, int scope);
+extern bool __nospectre_bhb;
+u8 get_spectre_bhb_loop_value(void);
+bool is_spectre_bhb_fw_mitigated(void);
void spectre_bhb_enable_mitigation(const struct arm64_cpu_capabilities *__unused);
#endif /* __ASSEMBLY__ */
#endif /* __ASM_SPECTRE_H */
diff --git a/arch/arm64/kernel/proton-pack.c b/arch/arm64/kernel/proton-pack.c
index 3d12fc5af87b..fcc641f30c93 100644
--- a/arch/arm64/kernel/proton-pack.c
+++ b/arch/arm64/kernel/proton-pack.c
@@ -903,6 +903,7 @@ static u8 spectre_bhb_loop_affected(void)
MIDR_ALL_VERSIONS(MIDR_CORTEX_A78AE),
MIDR_ALL_VERSIONS(MIDR_CORTEX_A78C),
MIDR_ALL_VERSIONS(MIDR_CORTEX_X1),
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_X1C),
MIDR_ALL_VERSIONS(MIDR_CORTEX_A710),
MIDR_ALL_VERSIONS(MIDR_CORTEX_X2),
MIDR_ALL_VERSIONS(MIDR_NEOVERSE_N2),
@@ -1010,6 +1011,11 @@ bool is_spectre_bhb_affected(const struct arm64_cpu_capabilities *entry,
return true;
}
+u8 get_spectre_bhb_loop_value(void)
+{
+ return max_bhb_k;
+}
+
static void this_cpu_set_vectors(enum arm64_bp_harden_el1_vectors slot)
{
const char *v = arm64_get_bp_hardening_vector(slot);
@@ -1030,7 +1036,7 @@ static void this_cpu_set_vectors(enum arm64_bp_harden_el1_vectors slot)
isb();
}
-static bool __read_mostly __nospectre_bhb;
+bool __read_mostly __nospectre_bhb;
static int __init parse_spectre_bhb_param(char *str)
{
__nospectre_bhb = true;
@@ -1108,6 +1114,11 @@ void spectre_bhb_enable_mitigation(const struct arm64_cpu_capabilities *entry)
update_mitigation_state(&spectre_bhb_state, state);
}
+bool is_spectre_bhb_fw_mitigated(void)
+{
+ return test_bit(BHB_FW, &system_bhb_mitigations);
+}
+
/* Patched to NOP when enabled */
void noinstr spectre_bhb_patch_loop_mitigation_enable(struct alt_instr *alt,
__le32 *origptr,
diff --git a/arch/arm64/lib/insn.c b/arch/arm64/lib/insn.c
index 49e972beeac7..44bb90ee2f41 100644
--- a/arch/arm64/lib/insn.c
+++ b/arch/arm64/lib/insn.c
@@ -5,6 +5,7 @@
*
* Copyright (C) 2014-2016 Zi Shen Lim <zlim.lnx@...il.com>
*/
+#include <linux/bitfield.h>
#include <linux/bitops.h>
#include <linux/bug.h>
#include <linux/printk.h>
@@ -1630,43 +1631,41 @@ u32 aarch64_insn_gen_extr(enum aarch64_insn_variant variant,
return aarch64_insn_encode_register(AARCH64_INSN_REGTYPE_RM, insn, Rm);
}
-u32 aarch64_insn_gen_dmb(enum aarch64_insn_mb_type type)
+static u32 __get_barrier_crm_val(enum aarch64_insn_mb_type type)
{
- u32 opt;
- u32 insn;
-
switch (type) {
case AARCH64_INSN_MB_SY:
- opt = 0xf;
- break;
+ return 0xf;
case AARCH64_INSN_MB_ST:
- opt = 0xe;
- break;
+ return 0xe;
case AARCH64_INSN_MB_LD:
- opt = 0xd;
- break;
+ return 0xd;
case AARCH64_INSN_MB_ISH:
- opt = 0xb;
- break;
+ return 0xb;
case AARCH64_INSN_MB_ISHST:
- opt = 0xa;
- break;
+ return 0xa;
case AARCH64_INSN_MB_ISHLD:
- opt = 0x9;
- break;
+ return 0x9;
case AARCH64_INSN_MB_NSH:
- opt = 0x7;
- break;
+ return 0x7;
case AARCH64_INSN_MB_NSHST:
- opt = 0x6;
- break;
+ return 0x6;
case AARCH64_INSN_MB_NSHLD:
- opt = 0x5;
- break;
+ return 0x5;
default:
- pr_err("%s: unknown dmb type %d\n", __func__, type);
+ pr_err("%s: unknown barrier type %d\n", __func__, type);
return AARCH64_BREAK_FAULT;
}
+}
+
+u32 aarch64_insn_gen_dmb(enum aarch64_insn_mb_type type)
+{
+ u32 opt;
+ u32 insn;
+
+ opt = __get_barrier_crm_val(type);
+ if (opt == AARCH64_BREAK_FAULT)
+ return AARCH64_BREAK_FAULT;
insn = aarch64_insn_get_dmb_value();
insn &= ~GENMASK(11, 8);
@@ -1674,3 +1673,18 @@ u32 aarch64_insn_gen_dmb(enum aarch64_insn_mb_type type)
return insn;
}
+
+u32 aarch64_insn_gen_dsb(enum aarch64_insn_mb_type type)
+{
+ u32 opt, insn;
+
+ opt = __get_barrier_crm_val(type);
+ if (opt == AARCH64_BREAK_FAULT)
+ return AARCH64_BREAK_FAULT;
+
+ insn = aarch64_insn_get_dsb_base_value();
+ insn &= ~GENMASK(11, 8);
+ insn |= (opt << 8);
+
+ return insn;
+}
diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c
index c04ace8f4843..09b70a9b5c39 100644
--- a/arch/arm64/net/bpf_jit_comp.c
+++ b/arch/arm64/net/bpf_jit_comp.c
@@ -7,6 +7,7 @@
#define pr_fmt(fmt) "bpf_jit: " fmt
+#include <linux/arm-smccc.h>
#include <linux/bitfield.h>
#include <linux/bpf.h>
#include <linux/filter.h>
@@ -17,6 +18,7 @@
#include <asm/asm-extable.h>
#include <asm/byteorder.h>
#include <asm/cacheflush.h>
+#include <asm/cpufeature.h>
#include <asm/debug-monitors.h>
#include <asm/insn.h>
#include <asm/patching.h>
@@ -653,7 +655,51 @@ static void build_plt(struct jit_ctx *ctx)
plt->target = (u64)&dummy_tramp;
}
-static void build_epilogue(struct jit_ctx *ctx)
+/* Clobbers BPF registers 1-4, aka x0-x3 */
+static void __maybe_unused build_bhb_mitigation(struct jit_ctx *ctx)
+{
+ const u8 r1 = bpf2a64[BPF_REG_1]; /* aka x0 */
+ u8 k = get_spectre_bhb_loop_value();
+
+ if (!IS_ENABLED(CONFIG_MITIGATE_SPECTRE_BRANCH_HISTORY) ||
+ cpu_mitigations_off() || __nospectre_bhb ||
+ arm64_get_spectre_v2_state() == SPECTRE_VULNERABLE)
+ return;
+
+ if (capable(CAP_SYS_ADMIN))
+ return;
+
+ if (supports_clearbhb(SCOPE_SYSTEM)) {
+ emit(aarch64_insn_gen_hint(AARCH64_INSN_HINT_CLEARBHB), ctx);
+ return;
+ }
+
+ if (k) {
+ emit_a64_mov_i64(r1, k, ctx);
+ emit(A64_B(1), ctx);
+ emit(A64_SUBS_I(true, r1, r1, 1), ctx);
+ emit(A64_B_(A64_COND_NE, -2), ctx);
+ emit(aarch64_insn_gen_dsb(AARCH64_INSN_MB_ISH), ctx);
+ emit(aarch64_insn_get_isb_value(), ctx);
+ }
+
+ if (is_spectre_bhb_fw_mitigated()) {
+ emit(A64_ORR_I(false, r1, AARCH64_INSN_REG_ZR,
+ ARM_SMCCC_ARCH_WORKAROUND_3), ctx);
+ switch (arm_smccc_1_1_get_conduit()) {
+ case SMCCC_CONDUIT_HVC:
+ emit(aarch64_insn_get_hvc_value(), ctx);
+ break;
+ case SMCCC_CONDUIT_SMC:
+ emit(aarch64_insn_get_smc_value(), ctx);
+ break;
+ default:
+ pr_err_once("Firmware mitigation enabled with unknown conduit\n");
+ }
+ }
+}
+
+static void build_epilogue(struct jit_ctx *ctx, bool was_classic)
{
const u8 r0 = bpf2a64[BPF_REG_0];
const u8 r6 = bpf2a64[BPF_REG_6];
@@ -675,10 +721,13 @@ static void build_epilogue(struct jit_ctx *ctx)
emit(A64_POP(r8, r9, A64_SP), ctx);
emit(A64_POP(r6, r7, A64_SP), ctx);
+ if (was_classic)
+ build_bhb_mitigation(ctx);
+
/* Restore FP/LR registers */
emit(A64_POP(A64_FP, A64_LR, A64_SP), ctx);
- /* Set return value */
+ /* Move the return value from bpf:r0 (aka x7) to x0 */
emit(A64_MOV(1, A64_R(0), r0), ctx);
/* Authenticate lr */
@@ -1527,7 +1576,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
}
ctx.epilogue_offset = ctx.idx;
- build_epilogue(&ctx);
+ build_epilogue(&ctx, was_classic);
build_plt(&ctx);
extable_align = __alignof__(struct exception_table_entry);
@@ -1563,7 +1612,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
goto out_off;
}
- build_epilogue(&ctx);
+ build_epilogue(&ctx, was_classic);
build_plt(&ctx);
/* 3. Extra pass to validate JITed code. */
diff --git a/arch/mips/include/asm/ptrace.h b/arch/mips/include/asm/ptrace.h
index 4a2b40ce39e0..841612913f0d 100644
--- a/arch/mips/include/asm/ptrace.h
+++ b/arch/mips/include/asm/ptrace.h
@@ -65,7 +65,8 @@ static inline void instruction_pointer_set(struct pt_regs *regs,
/* Query offset/name of register from its name/offset */
extern int regs_query_register_offset(const char *name);
-#define MAX_REG_OFFSET (offsetof(struct pt_regs, __last))
+#define MAX_REG_OFFSET \
+ (offsetof(struct pt_regs, __last) - sizeof(unsigned long))
/**
* regs_get_register() - get register value from its offset
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 48ab6e8f2d1d..8e66bb443351 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -2575,6 +2575,17 @@ config MITIGATION_SPECTRE_BHI
indirect branches.
See <file:Documentation/admin-guide/hw-vuln/spectre.rst>
+config MITIGATION_ITS
+ bool "Enable Indirect Target Selection mitigation"
+ depends on CPU_SUP_INTEL && X86_64
+ depends on RETPOLINE && RETHUNK
+ default y
+ help
+ Enable Indirect Target Selection (ITS) mitigation. ITS is a bug in
+ BPU on some Intel CPUs that may allow Spectre V2 style attacks. If
+ disabled, mitigation cannot be enabled via cmdline.
+ See <file:Documentation/admin-guide/hw-vuln/indirect-target-selection.rst>
+
endif
config ARCH_HAS_ADD_PAGES
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index a114338380a6..4f5968810450 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -1559,7 +1559,9 @@ SYM_CODE_END(rewind_stack_and_make_dead)
* ORC to unwind properly.
*
* The alignment is for performance and not for safety, and may be safely
- * refactored in the future if needed.
+ * refactored in the future if needed. The .skips are for safety, to ensure
+ * that all RETs are in the second half of a cacheline to mitigate Indirect
+ * Target Selection, rather than taking the slowpath via its_return_thunk.
*/
SYM_FUNC_START(clear_bhb_loop)
push %rbp
@@ -1569,10 +1571,22 @@ SYM_FUNC_START(clear_bhb_loop)
call 1f
jmp 5f
.align 64, 0xcc
+ /*
+ * Shift instructions so that the RET is in the upper half of the
+ * cacheline and don't take the slowpath to its_return_thunk.
+ */
+ .skip 32 - (.Lret1 - 1f), 0xcc
ANNOTATE_INTRA_FUNCTION_CALL
1: call 2f
- RET
+.Lret1: RET
.align 64, 0xcc
+ /*
+ * As above shift instructions for RET at .Lret2 as well.
+ *
+ * This should be ideally be: .skip 32 - (.Lret2 - 2f), 0xcc
+ * but some Clang versions (e.g. 18) don't like this.
+ */
+ .skip 32 - 18, 0xcc
2: movl $5, %eax
3: jmp 4f
nop
@@ -1580,7 +1594,7 @@ SYM_FUNC_START(clear_bhb_loop)
jnz 3b
sub $1, %ecx
jnz 1b
- RET
+.Lret2: RET
5: lfence
pop %rbp
RET
diff --git a/arch/x86/include/asm/alternative.h b/arch/x86/include/asm/alternative.h
index 9542c582d546..fd3b73017738 100644
--- a/arch/x86/include/asm/alternative.h
+++ b/arch/x86/include/asm/alternative.h
@@ -5,6 +5,7 @@
#include <linux/types.h>
#include <linux/stringify.h>
#include <asm/asm.h>
+#include <asm/bug.h>
#define ALTINSTR_FLAG_INV (1 << 15)
#define ALT_NOT(feat) ((feat) | ALTINSTR_FLAG_INV)
@@ -81,6 +82,37 @@ extern void apply_ibt_endbr(s32 *start, s32 *end);
struct module;
+#ifdef CONFIG_MITIGATION_ITS
+extern void its_init_mod(struct module *mod);
+extern void its_fini_mod(struct module *mod);
+extern void its_free_mod(struct module *mod);
+extern u8 *its_static_thunk(int reg);
+#else /* CONFIG_MITIGATION_ITS */
+static inline void its_init_mod(struct module *mod) { }
+static inline void its_fini_mod(struct module *mod) { }
+static inline void its_free_mod(struct module *mod) { }
+static inline u8 *its_static_thunk(int reg)
+{
+ WARN_ONCE(1, "ITS not compiled in");
+
+ return NULL;
+}
+#endif
+
+#if defined(CONFIG_RETHUNK) && defined(CONFIG_OBJTOOL)
+extern bool cpu_wants_rethunk(void);
+extern bool cpu_wants_rethunk_at(void *addr);
+#else
+static __always_inline bool cpu_wants_rethunk(void)
+{
+ return false;
+}
+static __always_inline bool cpu_wants_rethunk_at(void *addr)
+{
+ return false;
+}
+#endif
+
#ifdef CONFIG_SMP
extern void alternatives_smp_module_add(struct module *mod, char *name,
void *locks, void *locks_end,
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 5709cbba28e7..28edef597282 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -445,6 +445,7 @@
#define X86_FEATURE_BHI_CTRL (21*32+ 2) /* "" BHI_DIS_S HW control available */
#define X86_FEATURE_CLEAR_BHB_HW (21*32+ 3) /* "" BHI_DIS_S HW control enabled */
#define X86_FEATURE_CLEAR_BHB_LOOP_ON_VMEXIT (21*32+ 4) /* "" Clear branch history at vmexit using SW loop */
+#define X86_FEATURE_INDIRECT_THUNK_ITS (21*32 + 5) /* "" Use thunk for indirect branches in lower half of cacheline */
/*
* BUG word(s)
@@ -495,4 +496,6 @@
#define X86_BUG_RFDS X86_BUG(1*32 + 2) /* CPU is vulnerable to Register File Data Sampling */
#define X86_BUG_BHI X86_BUG(1*32 + 3) /* CPU is affected by Branch History Injection */
#define X86_BUG_IBPB_NO_RET X86_BUG(1*32 + 4) /* "ibpb_no_ret" IBPB omits return target predictions */
+#define X86_BUG_ITS X86_BUG(1*32 + 5) /* CPU is affected by Indirect Target Selection */
+#define X86_BUG_ITS_NATIVE_ONLY X86_BUG(1*32 + 6) /* CPU is affected by ITS, VMX is not affected */
#endif /* _ASM_X86_CPUFEATURES_H */
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 5ffcfcc41282..727947ed5e5e 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -185,6 +185,14 @@
* VERW clears CPU Register
* File.
*/
+#define ARCH_CAP_ITS_NO BIT_ULL(62) /*
+ * Not susceptible to
+ * Indirect Target Selection.
+ * This bit is not set by
+ * HW, but is synthesized by
+ * VMMs for guests to know
+ * their affected status.
+ */
#define ARCH_CAP_XAPIC_DISABLE BIT(21) /*
* IA32_XAPIC_DISABLE_STATUS MSR
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index daf58a96e0a7..cccd5072e09f 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -119,9 +119,8 @@
.endm
/*
- * Equivalent to -mindirect-branch-cs-prefix; emit the 5 byte jmp/call
- * to the retpoline thunk with a CS prefix when the register requires
- * a RAX prefix byte to encode. Also see apply_retpolines().
+ * Emits a conditional CS prefix that is compatible with
+ * -mindirect-branch-cs-prefix.
*/
.macro __CS_PREFIX reg:req
.irp rs,r8,r9,r10,r11,r12,r13,r14,r15
@@ -245,8 +244,12 @@
_ASM_PTR " 999b\n\t" \
".popsection\n\t"
+#define ITS_THUNK_SIZE 64
+
typedef u8 retpoline_thunk_t[RETPOLINE_THUNK_SIZE];
+typedef u8 its_thunk_t[ITS_THUNK_SIZE];
extern retpoline_thunk_t __x86_indirect_thunk_array[];
+extern its_thunk_t __x86_indirect_its_thunk_array[];
#ifdef CONFIG_RETHUNK
extern void __x86_return_thunk(void);
@@ -254,6 +257,12 @@ extern void __x86_return_thunk(void);
static inline void __x86_return_thunk(void) {}
#endif
+#ifdef CONFIG_MITIGATION_ITS
+extern void its_return_thunk(void);
+#else
+static inline void its_return_thunk(void) {}
+#endif
+
extern void retbleed_return_thunk(void);
extern void srso_return_thunk(void);
extern void srso_alias_return_thunk(void);
@@ -280,20 +289,23 @@ extern void (*x86_return_thunk)(void);
#ifdef CONFIG_X86_64
+/*
+ * Emits a conditional CS prefix that is compatible with
+ * -mindirect-branch-cs-prefix.
+ */
+#define __CS_PREFIX(reg) \
+ ".irp rs,r8,r9,r10,r11,r12,r13,r14,r15\n" \
+ ".ifc \\rs," reg "\n" \
+ ".byte 0x2e\n" \
+ ".endif\n" \
+ ".endr\n"
+
/*
* Inline asm uses the %V modifier which is only in newer GCC
* which is ensured when CONFIG_RETPOLINE is defined.
*/
-# define CALL_NOSPEC \
- ALTERNATIVE_2( \
- ANNOTATE_RETPOLINE_SAFE \
- "call *%[thunk_target]\n", \
- "call __x86_indirect_thunk_%V[thunk_target]\n", \
- X86_FEATURE_RETPOLINE, \
- "lfence;\n" \
- ANNOTATE_RETPOLINE_SAFE \
- "call *%[thunk_target]\n", \
- X86_FEATURE_RETPOLINE_LFENCE)
+#define CALL_NOSPEC __CS_PREFIX("%V[thunk_target]") \
+ "call __x86_indirect_thunk_%V[thunk_target]\n"
# define THUNK_TARGET(addr) [thunk_target] "r" (addr)
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 69f85e274611..3bc6be6404d2 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -18,6 +18,8 @@
#include <linux/mmu_context.h>
#include <linux/bsearch.h>
#include <linux/sync_core.h>
+#include <linux/moduleloader.h>
+#include <linux/cleanup.h>
#include <asm/text-patching.h>
#include <asm/alternative.h>
#include <asm/sections.h>
@@ -30,6 +32,8 @@
#include <asm/fixmap.h>
#include <asm/paravirt.h>
#include <asm/asm-prototypes.h>
+#include <asm/cfi.h>
+#include <asm/set_memory.h>
int __read_mostly alternatives_patched;
@@ -396,6 +400,212 @@ static int emit_indirect(int op, int reg, u8 *bytes)
return i;
}
+#ifdef CONFIG_MITIGATION_ITS
+
+#ifdef CONFIG_MODULES
+static struct module *its_mod;
+static void *its_page;
+static unsigned int its_offset;
+
+/* Initialize a thunk with the "jmp *reg; int3" instructions. */
+static void *its_init_thunk(void *thunk, int reg)
+{
+ u8 *bytes = thunk;
+ int i = 0;
+
+ if (reg >= 8) {
+ bytes[i++] = 0x41; /* REX.B prefix */
+ reg -= 8;
+ }
+ bytes[i++] = 0xff;
+ bytes[i++] = 0xe0 + reg; /* jmp *reg */
+ bytes[i++] = 0xcc;
+
+ return thunk;
+}
+
+void its_init_mod(struct module *mod)
+{
+ if (!cpu_feature_enabled(X86_FEATURE_INDIRECT_THUNK_ITS))
+ return;
+
+ mutex_lock(&text_mutex);
+ its_mod = mod;
+ its_page = NULL;
+}
+
+void its_fini_mod(struct module *mod)
+{
+ if (!cpu_feature_enabled(X86_FEATURE_INDIRECT_THUNK_ITS))
+ return;
+
+ WARN_ON_ONCE(its_mod != mod);
+
+ its_mod = NULL;
+ its_page = NULL;
+ mutex_unlock(&text_mutex);
+
+ for (int i = 0; i < mod->its_num_pages; i++) {
+ void *page = mod->its_page_array[i];
+ set_memory_ro((unsigned long)page, 1);
+ set_memory_x((unsigned long)page, 1);
+ }
+}
+
+void its_free_mod(struct module *mod)
+{
+ if (!cpu_feature_enabled(X86_FEATURE_INDIRECT_THUNK_ITS))
+ return;
+
+ for (int i = 0; i < mod->its_num_pages; i++) {
+ void *page = mod->its_page_array[i];
+ module_memfree(page);
+ }
+ kfree(mod->its_page_array);
+}
+
+DEFINE_FREE(its_execmem, void *, if (_T) module_memfree(_T));
+
+static void *its_alloc(void)
+{
+ void *page __free(its_execmem) = module_alloc(PAGE_SIZE);
+
+ if (!page)
+ return NULL;
+
+ if (its_mod) {
+ void *tmp = krealloc(its_mod->its_page_array,
+ (its_mod->its_num_pages+1) * sizeof(void *),
+ GFP_KERNEL);
+ if (!tmp)
+ return NULL;
+
+ its_mod->its_page_array = tmp;
+ its_mod->its_page_array[its_mod->its_num_pages++] = page;
+ }
+
+ return no_free_ptr(page);
+}
+
+static void *its_allocate_thunk(int reg)
+{
+ int size = 3 + (reg / 8);
+ void *thunk;
+
+ if (!its_page || (its_offset + size - 1) >= PAGE_SIZE) {
+ its_page = its_alloc();
+ if (!its_page) {
+ pr_err("ITS page allocation failed\n");
+ return NULL;
+ }
+ memset(its_page, INT3_INSN_OPCODE, PAGE_SIZE);
+ its_offset = 32;
+ }
+
+ /*
+ * If the indirect branch instruction will be in the lower half
+ * of a cacheline, then update the offset to reach the upper half.
+ */
+ if ((its_offset + size - 1) % 64 < 32)
+ its_offset = ((its_offset - 1) | 0x3F) + 33;
+
+ thunk = its_page + its_offset;
+ its_offset += size;
+
+ set_memory_rw((unsigned long)its_page, 1);
+ thunk = its_init_thunk(thunk, reg);
+ set_memory_ro((unsigned long)its_page, 1);
+ set_memory_x((unsigned long)its_page, 1);
+
+ return thunk;
+}
+#else /* CONFIG_MODULES */
+
+static void *its_allocate_thunk(int reg)
+{
+ return NULL;
+}
+
+#endif /* CONFIG_MODULES */
+
+static int __emit_trampoline(void *addr, struct insn *insn, u8 *bytes,
+ void *call_dest, void *jmp_dest)
+{
+ u8 op = insn->opcode.bytes[0];
+ int i = 0;
+
+ /*
+ * Clang does 'weird' Jcc __x86_indirect_thunk_r11 conditional
+ * tail-calls. Deal with them.
+ */
+ if (is_jcc32(insn)) {
+ bytes[i++] = op;
+ op = insn->opcode.bytes[1];
+ goto clang_jcc;
+ }
+
+ if (insn->length == 6)
+ bytes[i++] = 0x2e; /* CS-prefix */
+
+ switch (op) {
+ case CALL_INSN_OPCODE:
+ __text_gen_insn(bytes+i, op, addr+i,
+ call_dest,
+ CALL_INSN_SIZE);
+ i += CALL_INSN_SIZE;
+ break;
+
+ case JMP32_INSN_OPCODE:
+clang_jcc:
+ __text_gen_insn(bytes+i, op, addr+i,
+ jmp_dest,
+ JMP32_INSN_SIZE);
+ i += JMP32_INSN_SIZE;
+ break;
+
+ default:
+ WARN(1, "%pS %px %*ph\n", addr, addr, 6, addr);
+ return -1;
+ }
+
+ WARN_ON_ONCE(i != insn->length);
+
+ return i;
+}
+
+static int emit_its_trampoline(void *addr, struct insn *insn, int reg, u8 *bytes)
+{
+ u8 *thunk = __x86_indirect_its_thunk_array[reg];
+ u8 *tmp = its_allocate_thunk(reg);
+
+ if (tmp)
+ thunk = tmp;
+
+ return __emit_trampoline(addr, insn, bytes, thunk, thunk);
+}
+
+/* Check if an indirect branch is at ITS-unsafe address */
+static bool cpu_wants_indirect_its_thunk_at(unsigned long addr, int reg)
+{
+ if (!cpu_feature_enabled(X86_FEATURE_INDIRECT_THUNK_ITS))
+ return false;
+
+ /* Indirect branch opcode is 2 or 3 bytes depending on reg */
+ addr += 1 + reg / 8;
+
+ /* Lower-half of the cacheline? */
+ return !(addr & 0x20);
+}
+
+u8 *its_static_thunk(int reg)
+{
+ u8 *thunk = __x86_indirect_its_thunk_array[reg];
+
+ return thunk;
+}
+
+#endif
+
/*
* Rewrite the compiler generated retpoline thunk calls.
*
@@ -466,6 +676,15 @@ static int patch_retpoline(void *addr, struct insn *insn, u8 *bytes)
bytes[i++] = 0xe8; /* LFENCE */
}
+#ifdef CONFIG_MITIGATION_ITS
+ /*
+ * Check if the address of last byte of emitted-indirect is in
+ * lower-half of the cacheline. Such branches need ITS mitigation.
+ */
+ if (cpu_wants_indirect_its_thunk_at((unsigned long)addr + i, reg))
+ return emit_its_trampoline(addr, insn, reg, bytes);
+#endif
+
ret = emit_indirect(op, reg, bytes + i);
if (ret < 0)
return ret;
@@ -537,6 +756,21 @@ void __init_or_module noinline apply_retpolines(s32 *start, s32 *end)
#ifdef CONFIG_RETHUNK
+bool cpu_wants_rethunk(void)
+{
+ return cpu_feature_enabled(X86_FEATURE_RETHUNK);
+}
+
+bool cpu_wants_rethunk_at(void *addr)
+{
+ if (!cpu_feature_enabled(X86_FEATURE_RETHUNK))
+ return false;
+ if (x86_return_thunk != its_return_thunk)
+ return true;
+
+ return !((unsigned long)addr & 0x20);
+}
+
/*
* Rewrite the compiler generated return thunk tail-calls.
*
@@ -552,13 +786,12 @@ static int patch_return(void *addr, struct insn *insn, u8 *bytes)
{
int i = 0;
- if (cpu_feature_enabled(X86_FEATURE_RETHUNK)) {
- if (x86_return_thunk == __x86_return_thunk)
- return -1;
-
+ /* Patch the custom return thunks... */
+ if (cpu_wants_rethunk_at(addr)) {
i = JMP32_INSN_SIZE;
__text_gen_insn(bytes, JMP32_INSN_OPCODE, addr, x86_return_thunk, i);
} else {
+ /* ... or patch them out if not needed. */
bytes[i++] = RET_INSN_OPCODE;
}
@@ -914,6 +1147,8 @@ static noinline void __init int3_selftest(void)
void __init alternative_instructions(void)
{
+ u64 ibt;
+
int3_selftest();
/*
@@ -951,6 +1186,9 @@ void __init alternative_instructions(void)
*/
paravirt_set_cap();
+ /* Keep CET-IBT disabled until caller/callee are patched */
+ ibt = ibt_save();
+
/*
* First patch paravirt functions, such that we overwrite the indirect
* call with the direct call.
@@ -972,6 +1210,8 @@ void __init alternative_instructions(void)
apply_ibt_endbr(__ibt_endbr_seal, __ibt_endbr_seal_end);
+ ibt_restore(ibt);
+
#ifdef CONFIG_SMP
/* Patch to UP if other cpus not imminent. */
if (!noreplace_smp && (num_present_cpus() == 1 || setup_max_cpus <= 1)) {
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 0be0edb07a2a..766cee7fa905 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -48,6 +48,7 @@ static void __init srbds_select_mitigation(void);
static void __init l1d_flush_select_mitigation(void);
static void __init gds_select_mitigation(void);
static void __init srso_select_mitigation(void);
+static void __init its_select_mitigation(void);
/* The base value of the SPEC_CTRL MSR without task-specific bits set */
u64 x86_spec_ctrl_base;
@@ -66,6 +67,14 @@ static DEFINE_MUTEX(spec_ctrl_mutex);
void (*x86_return_thunk)(void) __ro_after_init = &__x86_return_thunk;
+static void __init set_return_thunk(void *thunk)
+{
+ if (x86_return_thunk != __x86_return_thunk)
+ pr_warn("x86/bugs: return thunk changed\n");
+
+ x86_return_thunk = thunk;
+}
+
/* Update SPEC_CTRL MSR and its cached copy unconditionally */
static void update_spec_ctrl(u64 val)
{
@@ -174,6 +183,7 @@ void __init cpu_select_mitigations(void)
*/
srso_select_mitigation();
gds_select_mitigation();
+ its_select_mitigation();
}
/*
@@ -1081,7 +1091,7 @@ static void __init retbleed_select_mitigation(void)
setup_force_cpu_cap(X86_FEATURE_UNRET);
if (IS_ENABLED(CONFIG_RETHUNK))
- x86_return_thunk = retbleed_return_thunk;
+ set_return_thunk(retbleed_return_thunk);
if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD &&
boot_cpu_data.x86_vendor != X86_VENDOR_HYGON)
@@ -1142,6 +1152,116 @@ static void __init retbleed_select_mitigation(void)
pr_info("%s\n", retbleed_strings[retbleed_mitigation]);
}
+#undef pr_fmt
+#define pr_fmt(fmt) "ITS: " fmt
+
+enum its_mitigation_cmd {
+ ITS_CMD_OFF,
+ ITS_CMD_ON,
+ ITS_CMD_VMEXIT,
+};
+
+enum its_mitigation {
+ ITS_MITIGATION_OFF,
+ ITS_MITIGATION_VMEXIT_ONLY,
+ ITS_MITIGATION_ALIGNED_THUNKS,
+};
+
+static const char * const its_strings[] = {
+ [ITS_MITIGATION_OFF] = "Vulnerable",
+ [ITS_MITIGATION_VMEXIT_ONLY] = "Mitigation: Vulnerable, KVM: Not affected",
+ [ITS_MITIGATION_ALIGNED_THUNKS] = "Mitigation: Aligned branch/return thunks",
+};
+
+static enum its_mitigation its_mitigation __ro_after_init = ITS_MITIGATION_ALIGNED_THUNKS;
+
+static enum its_mitigation_cmd its_cmd __ro_after_init =
+ IS_ENABLED(CONFIG_MITIGATION_ITS) ? ITS_CMD_ON : ITS_CMD_OFF;
+
+static int __init its_parse_cmdline(char *str)
+{
+ if (!str)
+ return -EINVAL;
+
+ if (!IS_ENABLED(CONFIG_MITIGATION_ITS)) {
+ pr_err("Mitigation disabled at compile time, ignoring option (%s)", str);
+ return 0;
+ }
+
+ if (!strcmp(str, "off")) {
+ its_cmd = ITS_CMD_OFF;
+ } else if (!strcmp(str, "on")) {
+ its_cmd = ITS_CMD_ON;
+ } else if (!strcmp(str, "force")) {
+ its_cmd = ITS_CMD_ON;
+ setup_force_cpu_bug(X86_BUG_ITS);
+ } else if (!strcmp(str, "vmexit")) {
+ its_cmd = ITS_CMD_VMEXIT;
+ } else {
+ pr_err("Ignoring unknown indirect_target_selection option (%s).", str);
+ }
+
+ return 0;
+}
+early_param("indirect_target_selection", its_parse_cmdline);
+
+static void __init its_select_mitigation(void)
+{
+ enum its_mitigation_cmd cmd = its_cmd;
+
+ if (!boot_cpu_has_bug(X86_BUG_ITS) || cpu_mitigations_off()) {
+ its_mitigation = ITS_MITIGATION_OFF;
+ return;
+ }
+
+ /* Exit early to avoid irrelevant warnings */
+ if (cmd == ITS_CMD_OFF) {
+ its_mitigation = ITS_MITIGATION_OFF;
+ goto out;
+ }
+ if (spectre_v2_enabled == SPECTRE_V2_NONE) {
+ pr_err("WARNING: Spectre-v2 mitigation is off, disabling ITS\n");
+ its_mitigation = ITS_MITIGATION_OFF;
+ goto out;
+ }
+ if (!IS_ENABLED(CONFIG_RETPOLINE) || !IS_ENABLED(CONFIG_RETHUNK)) {
+ pr_err("WARNING: ITS mitigation depends on retpoline and rethunk support\n");
+ its_mitigation = ITS_MITIGATION_OFF;
+ goto out;
+ }
+ if (IS_ENABLED(CONFIG_DEBUG_FORCE_FUNCTION_ALIGN_64B)) {
+ pr_err("WARNING: ITS mitigation is not compatible with CONFIG_DEBUG_FORCE_FUNCTION_ALIGN_64B\n");
+ its_mitigation = ITS_MITIGATION_OFF;
+ goto out;
+ }
+ if (boot_cpu_has(X86_FEATURE_RETPOLINE_LFENCE)) {
+ pr_err("WARNING: ITS mitigation is not compatible with lfence mitigation\n");
+ its_mitigation = ITS_MITIGATION_OFF;
+ goto out;
+ }
+
+ switch (cmd) {
+ case ITS_CMD_OFF:
+ its_mitigation = ITS_MITIGATION_OFF;
+ break;
+ case ITS_CMD_VMEXIT:
+ if (boot_cpu_has_bug(X86_BUG_ITS_NATIVE_ONLY)) {
+ its_mitigation = ITS_MITIGATION_VMEXIT_ONLY;
+ goto out;
+ }
+ fallthrough;
+ case ITS_CMD_ON:
+ its_mitigation = ITS_MITIGATION_ALIGNED_THUNKS;
+ if (!boot_cpu_has(X86_FEATURE_RETPOLINE))
+ setup_force_cpu_cap(X86_FEATURE_INDIRECT_THUNK_ITS);
+ setup_force_cpu_cap(X86_FEATURE_RETHUNK);
+ set_return_thunk(its_return_thunk);
+ break;
+ }
+out:
+ pr_info("%s\n", its_strings[its_mitigation]);
+}
+
#undef pr_fmt
#define pr_fmt(fmt) "Spectre V2 : " fmt
@@ -1656,10 +1776,11 @@ static void __init bhi_select_mitigation(void)
return;
}
- if (spec_ctrl_bhi_dis())
+ if (!IS_ENABLED(CONFIG_X86_64))
return;
- if (!IS_ENABLED(CONFIG_X86_64))
+ /* Mitigate in hardware if supported */
+ if (spec_ctrl_bhi_dis())
return;
/* Mitigate KVM by default */
@@ -2591,10 +2712,10 @@ static void __init srso_select_mitigation(void)
if (boot_cpu_data.x86 == 0x19) {
setup_force_cpu_cap(X86_FEATURE_SRSO_ALIAS);
- x86_return_thunk = srso_alias_return_thunk;
+ set_return_thunk(srso_alias_return_thunk);
} else {
setup_force_cpu_cap(X86_FEATURE_SRSO);
- x86_return_thunk = srso_return_thunk;
+ set_return_thunk(srso_return_thunk);
}
srso_mitigation = SRSO_MITIGATION_SAFE_RET;
} else {
@@ -2774,6 +2895,11 @@ static ssize_t rfds_show_state(char *buf)
return sysfs_emit(buf, "%s\n", rfds_strings[rfds_mitigation]);
}
+static ssize_t its_show_state(char *buf)
+{
+ return sysfs_emit(buf, "%s\n", its_strings[its_mitigation]);
+}
+
static char *stibp_state(void)
{
if (spectre_v2_in_eibrs_mode(spectre_v2_enabled) &&
@@ -2958,6 +3084,9 @@ static ssize_t cpu_show_common(struct device *dev, struct device_attribute *attr
case X86_BUG_RFDS:
return rfds_show_state(buf);
+ case X86_BUG_ITS:
+ return its_show_state(buf);
+
default:
break;
}
@@ -3037,4 +3166,9 @@ ssize_t cpu_show_reg_file_data_sampling(struct device *dev, struct device_attrib
{
return cpu_show_common(dev, attr, buf, X86_BUG_RFDS);
}
+
+ssize_t cpu_show_indirect_target_selection(struct device *dev, struct device_attribute *attr, char *buf)
+{
+ return cpu_show_common(dev, attr, buf, X86_BUG_ITS);
+}
#endif
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 62aaabea7e33..48cc1612df49 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1251,6 +1251,10 @@ static const __initconst struct x86_cpu_id cpu_vuln_whitelist[] = {
#define GDS BIT(6)
/* CPU is affected by Register File Data Sampling */
#define RFDS BIT(7)
+/* CPU is affected by Indirect Target Selection */
+#define ITS BIT(8)
+/* CPU is affected by Indirect Target Selection, but guest-host isolation is not affected */
+#define ITS_NATIVE_ONLY BIT(9)
static const struct x86_cpu_id cpu_vuln_blacklist[] __initconst = {
VULNBL_INTEL_STEPPINGS(IVYBRIDGE, X86_STEPPING_ANY, SRBDS),
@@ -1262,22 +1266,25 @@ static const struct x86_cpu_id cpu_vuln_blacklist[] __initconst = {
VULNBL_INTEL_STEPPINGS(BROADWELL_G, X86_STEPPING_ANY, SRBDS),
VULNBL_INTEL_STEPPINGS(BROADWELL_X, X86_STEPPING_ANY, MMIO),
VULNBL_INTEL_STEPPINGS(BROADWELL, X86_STEPPING_ANY, SRBDS),
- VULNBL_INTEL_STEPPINGS(SKYLAKE_X, X86_STEPPING_ANY, MMIO | RETBLEED | GDS),
+ VULNBL_INTEL_STEPPINGS(SKYLAKE_X, X86_STEPPINGS(0x0, 0x5), MMIO | RETBLEED | GDS),
+ VULNBL_INTEL_STEPPINGS(SKYLAKE_X, X86_STEPPING_ANY, MMIO | RETBLEED | GDS | ITS),
VULNBL_INTEL_STEPPINGS(SKYLAKE_L, X86_STEPPING_ANY, MMIO | RETBLEED | GDS | SRBDS),
VULNBL_INTEL_STEPPINGS(SKYLAKE, X86_STEPPING_ANY, MMIO | RETBLEED | GDS | SRBDS),
- VULNBL_INTEL_STEPPINGS(KABYLAKE_L, X86_STEPPING_ANY, MMIO | RETBLEED | GDS | SRBDS),
- VULNBL_INTEL_STEPPINGS(KABYLAKE, X86_STEPPING_ANY, MMIO | RETBLEED | GDS | SRBDS),
+ VULNBL_INTEL_STEPPINGS(KABYLAKE_L, X86_STEPPINGS(0x0, 0xb), MMIO | RETBLEED | GDS | SRBDS),
+ VULNBL_INTEL_STEPPINGS(KABYLAKE_L, X86_STEPPING_ANY, MMIO | RETBLEED | GDS | SRBDS | ITS),
+ VULNBL_INTEL_STEPPINGS(KABYLAKE, X86_STEPPINGS(0x0, 0xc), MMIO | RETBLEED | GDS | SRBDS),
+ VULNBL_INTEL_STEPPINGS(KABYLAKE, X86_STEPPING_ANY, MMIO | RETBLEED | GDS | SRBDS | ITS),
VULNBL_INTEL_STEPPINGS(CANNONLAKE_L, X86_STEPPING_ANY, RETBLEED),
- VULNBL_INTEL_STEPPINGS(ICELAKE_L, X86_STEPPING_ANY, MMIO | MMIO_SBDS | RETBLEED | GDS),
- VULNBL_INTEL_STEPPINGS(ICELAKE_D, X86_STEPPING_ANY, MMIO | GDS),
- VULNBL_INTEL_STEPPINGS(ICELAKE_X, X86_STEPPING_ANY, MMIO | GDS),
- VULNBL_INTEL_STEPPINGS(COMETLAKE, X86_STEPPING_ANY, MMIO | MMIO_SBDS | RETBLEED | GDS),
- VULNBL_INTEL_STEPPINGS(COMETLAKE_L, X86_STEPPINGS(0x0, 0x0), MMIO | RETBLEED),
- VULNBL_INTEL_STEPPINGS(COMETLAKE_L, X86_STEPPING_ANY, MMIO | MMIO_SBDS | RETBLEED | GDS),
- VULNBL_INTEL_STEPPINGS(TIGERLAKE_L, X86_STEPPING_ANY, GDS),
- VULNBL_INTEL_STEPPINGS(TIGERLAKE, X86_STEPPING_ANY, GDS),
+ VULNBL_INTEL_STEPPINGS(ICELAKE_L, X86_STEPPING_ANY, MMIO | MMIO_SBDS | RETBLEED | GDS | ITS | ITS_NATIVE_ONLY),
+ VULNBL_INTEL_STEPPINGS(ICELAKE_D, X86_STEPPING_ANY, MMIO | GDS | ITS | ITS_NATIVE_ONLY),
+ VULNBL_INTEL_STEPPINGS(ICELAKE_X, X86_STEPPING_ANY, MMIO | GDS | ITS | ITS_NATIVE_ONLY),
+ VULNBL_INTEL_STEPPINGS(COMETLAKE, X86_STEPPING_ANY, MMIO | MMIO_SBDS | RETBLEED | GDS | ITS),
+ VULNBL_INTEL_STEPPINGS(COMETLAKE_L, X86_STEPPINGS(0x0, 0x0), MMIO | RETBLEED | ITS),
+ VULNBL_INTEL_STEPPINGS(COMETLAKE_L, X86_STEPPING_ANY, MMIO | MMIO_SBDS | RETBLEED | GDS | ITS),
+ VULNBL_INTEL_STEPPINGS(TIGERLAKE_L, X86_STEPPING_ANY, GDS | ITS | ITS_NATIVE_ONLY),
+ VULNBL_INTEL_STEPPINGS(TIGERLAKE, X86_STEPPING_ANY, GDS | ITS | ITS_NATIVE_ONLY),
VULNBL_INTEL_STEPPINGS(LAKEFIELD, X86_STEPPING_ANY, MMIO | MMIO_SBDS | RETBLEED),
- VULNBL_INTEL_STEPPINGS(ROCKETLAKE, X86_STEPPING_ANY, MMIO | RETBLEED | GDS),
+ VULNBL_INTEL_STEPPINGS(ROCKETLAKE, X86_STEPPING_ANY, MMIO | RETBLEED | GDS | ITS | ITS_NATIVE_ONLY),
VULNBL_INTEL_STEPPINGS(ALDERLAKE, X86_STEPPING_ANY, RFDS),
VULNBL_INTEL_STEPPINGS(ALDERLAKE_L, X86_STEPPING_ANY, RFDS),
VULNBL_INTEL_STEPPINGS(RAPTORLAKE, X86_STEPPING_ANY, RFDS),
@@ -1341,6 +1348,32 @@ static bool __init vulnerable_to_rfds(u64 x86_arch_cap_msr)
return cpu_matches(cpu_vuln_blacklist, RFDS);
}
+static bool __init vulnerable_to_its(u64 x86_arch_cap_msr)
+{
+ /* The "immunity" bit trumps everything else: */
+ if (x86_arch_cap_msr & ARCH_CAP_ITS_NO)
+ return false;
+ if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL)
+ return false;
+
+ /* None of the affected CPUs have BHI_CTRL */
+ if (boot_cpu_has(X86_FEATURE_BHI_CTRL))
+ return false;
+
+ /*
+ * If a VMM did not expose ITS_NO, assume that a guest could
+ * be running on a vulnerable hardware or may migrate to such
+ * hardware.
+ */
+ if (boot_cpu_has(X86_FEATURE_HYPERVISOR))
+ return true;
+
+ if (cpu_matches(cpu_vuln_blacklist, ITS))
+ return true;
+
+ return false;
+}
+
static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c)
{
u64 x86_arch_cap_msr = x86_read_arch_cap_msr();
@@ -1455,9 +1488,12 @@ static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c)
if (vulnerable_to_rfds(x86_arch_cap_msr))
setup_force_cpu_bug(X86_BUG_RFDS);
- /* When virtualized, eIBRS could be hidden, assume vulnerable */
- if (!(x86_arch_cap_msr & ARCH_CAP_BHI_NO) &&
- !cpu_matches(cpu_vuln_whitelist, NO_BHI) &&
+ /*
+ * Intel parts with eIBRS are vulnerable to BHI attacks. Parts with
+ * BHI_NO still need to use the BHI mitigation to prevent Intra-mode
+ * attacks. When virtualized, eIBRS could be hidden, assume vulnerable.
+ */
+ if (!cpu_matches(cpu_vuln_whitelist, NO_BHI) &&
(boot_cpu_has(X86_FEATURE_IBRS_ENHANCED) ||
boot_cpu_has(X86_FEATURE_HYPERVISOR)))
setup_force_cpu_bug(X86_BUG_BHI);
@@ -1465,6 +1501,12 @@ static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c)
if (cpu_has(c, X86_FEATURE_AMD_IBPB) && !cpu_has(c, X86_FEATURE_AMD_IBPB_RET))
setup_force_cpu_bug(X86_BUG_IBPB_NO_RET);
+ if (vulnerable_to_its(x86_arch_cap_msr)) {
+ setup_force_cpu_bug(X86_BUG_ITS);
+ if (cpu_matches(cpu_vuln_blacklist, ITS_NATIVE_ONLY))
+ setup_force_cpu_bug(X86_BUG_ITS_NATIVE_ONLY);
+ }
+
if (cpu_matches(cpu_vuln_whitelist, NO_MELTDOWN))
return;
diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
index ec51ce713dea..e73743701530 100644
--- a/arch/x86/kernel/ftrace.c
+++ b/arch/x86/kernel/ftrace.c
@@ -360,7 +360,7 @@ create_trampoline(struct ftrace_ops *ops, unsigned int *tramp_size)
goto fail;
ip = trampoline + size;
- if (cpu_feature_enabled(X86_FEATURE_RETHUNK))
+ if (cpu_wants_rethunk_at(ip))
__text_gen_insn(ip, JMP32_INSN_OPCODE, ip, x86_return_thunk, JMP32_INSN_SIZE);
else
memcpy(ip, retq, sizeof(retq));
diff --git a/arch/x86/kernel/module.c b/arch/x86/kernel/module.c
index c032edcd3d95..7728060b640c 100644
--- a/arch/x86/kernel/module.c
+++ b/arch/x86/kernel/module.c
@@ -285,10 +285,16 @@ int module_finalize(const Elf_Ehdr *hdr,
void *pseg = (void *)para->sh_addr;
apply_paravirt(pseg, pseg + para->sh_size);
}
+
+ its_init_mod(me);
+
if (retpolines) {
void *rseg = (void *)retpolines->sh_addr;
apply_retpolines(rseg, rseg + retpolines->sh_size);
}
+
+ its_fini_mod(me);
+
if (returns) {
void *rseg = (void *)returns->sh_addr;
apply_returns(rseg, rseg + returns->sh_size);
@@ -320,4 +326,5 @@ int module_finalize(const Elf_Ehdr *hdr,
void module_arch_cleanup(struct module *mod)
{
alternatives_smp_module_del(mod);
+ its_free_mod(mod);
}
diff --git a/arch/x86/kernel/static_call.c b/arch/x86/kernel/static_call.c
index ca8dfc0c9edb..93809a4dc793 100644
--- a/arch/x86/kernel/static_call.c
+++ b/arch/x86/kernel/static_call.c
@@ -79,7 +79,7 @@ static void __ref __static_call_transform(void *insn, enum insn_type type,
break;
case RET:
- if (cpu_feature_enabled(X86_FEATURE_RETHUNK))
+ if (cpu_wants_rethunk_at(insn))
code = text_gen_insn(JMP32_INSN_OPCODE, insn, x86_return_thunk);
else
code = &retinsn;
diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
index 78ccb5ec3c0e..35aeec9d2267 100644
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -528,6 +528,16 @@ INIT_PER_CPU(irq_stack_backing_store);
"SRSO function pair won't alias");
#endif
+#if defined(CONFIG_MITIGATION_ITS) && !defined(CONFIG_DEBUG_FORCE_FUNCTION_ALIGN_64B)
+. = ASSERT(__x86_indirect_its_thunk_rax & 0x20, "__x86_indirect_thunk_rax not in second half of cacheline");
+. = ASSERT(((__x86_indirect_its_thunk_rcx - __x86_indirect_its_thunk_rax) % 64) == 0, "Indirect thunks are not cacheline apart");
+. = ASSERT(__x86_indirect_its_thunk_array == __x86_indirect_its_thunk_rax, "Gap in ITS thunk array");
+#endif
+
+#if defined(CONFIG_MITIGATION_ITS) && !defined(CONFIG_DEBUG_FORCE_FUNCTION_ALIGN_64B)
+. = ASSERT(its_return_thunk & 0x20, "its_return_thunk not in second half of cacheline");
+#endif
+
#endif /* CONFIG_X86_64 */
#ifdef CONFIG_KEXEC_CORE
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 752fa116a261..a6dc8f662fa4 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1614,7 +1614,7 @@ static unsigned int num_msr_based_features;
ARCH_CAP_PSCHANGE_MC_NO | ARCH_CAP_TSX_CTRL_MSR | ARCH_CAP_TAA_NO | \
ARCH_CAP_SBDR_SSDP_NO | ARCH_CAP_FBSDP_NO | ARCH_CAP_PSDP_NO | \
ARCH_CAP_FB_CLEAR | ARCH_CAP_RRSBA | ARCH_CAP_PBRSB_NO | ARCH_CAP_GDS_NO | \
- ARCH_CAP_RFDS_NO | ARCH_CAP_RFDS_CLEAR | ARCH_CAP_BHI_NO)
+ ARCH_CAP_RFDS_NO | ARCH_CAP_RFDS_CLEAR | ARCH_CAP_BHI_NO | ARCH_CAP_ITS_NO)
static u64 kvm_get_arch_capabilities(void)
{
@@ -1653,6 +1653,8 @@ static u64 kvm_get_arch_capabilities(void)
data |= ARCH_CAP_MDS_NO;
if (!boot_cpu_has_bug(X86_BUG_RFDS))
data |= ARCH_CAP_RFDS_NO;
+ if (!boot_cpu_has_bug(X86_BUG_ITS))
+ data |= ARCH_CAP_ITS_NO;
if (!boot_cpu_has(X86_FEATURE_RTM)) {
/*
diff --git a/arch/x86/lib/retpoline.S b/arch/x86/lib/retpoline.S
index 7880e2a7ec6a..e42661500094 100644
--- a/arch/x86/lib/retpoline.S
+++ b/arch/x86/lib/retpoline.S
@@ -258,6 +258,45 @@ SYM_FUNC_START(entry_untrain_ret)
SYM_FUNC_END(entry_untrain_ret)
__EXPORT_THUNK(entry_untrain_ret)
+#ifdef CONFIG_MITIGATION_ITS
+
+.macro ITS_THUNK reg
+
+SYM_INNER_LABEL(__x86_indirect_its_thunk_\reg, SYM_L_GLOBAL)
+ UNWIND_HINT_EMPTY
+ ANNOTATE_NOENDBR
+ ANNOTATE_RETPOLINE_SAFE
+ jmp *%\reg
+ int3
+ .align 32, 0xcc /* fill to the end of the line */
+ .skip 32, 0xcc /* skip to the next upper half */
+.endm
+
+/* ITS mitigation requires thunks be aligned to upper half of cacheline */
+.align 64, 0xcc
+.skip 32, 0xcc
+SYM_CODE_START(__x86_indirect_its_thunk_array)
+
+#define GEN(reg) ITS_THUNK reg
+#include <asm/GEN-for-each-reg.h>
+#undef GEN
+
+ .align 64, 0xcc
+SYM_CODE_END(__x86_indirect_its_thunk_array)
+
+.align 64, 0xcc
+.skip 32, 0xcc
+SYM_CODE_START(its_return_thunk)
+ UNWIND_HINT_FUNC
+ ANNOTATE_NOENDBR
+ ANNOTATE_UNRET_SAFE
+ ret
+ int3
+SYM_CODE_END(its_return_thunk)
+EXPORT_SYMBOL(its_return_thunk)
+
+#endif /* CONFIG_MITIGATION_ITS */
+
SYM_CODE_START(__x86_return_thunk)
UNWIND_HINT_FUNC
ANNOTATE_NOENDBR
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 8d46b9c0e920..4918268d2007 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -617,7 +617,11 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
choose_new_asid(next, next_tlb_gen, &new_asid, &need_flush);
- /* Let nmi_uaccess_okay() know that we're changing CR3. */
+ /*
+ * Indicate that CR3 is about to change. nmi_uaccess_okay()
+ * and others are sensitive to the window where mm_cpumask(),
+ * CR3 and cpu_tlbstate.loaded_mm are not all in sync.
+ */
this_cpu_write(cpu_tlbstate.loaded_mm, LOADED_MM_SWITCHING);
barrier();
}
@@ -880,8 +884,16 @@ static void flush_tlb_func(void *info)
static bool should_flush_tlb(int cpu, void *data)
{
+ struct mm_struct *loaded_mm = per_cpu(cpu_tlbstate.loaded_mm, cpu);
struct flush_tlb_info *info = data;
+ /*
+ * Order the 'loaded_mm' and 'is_lazy' against their
+ * write ordering in switch_mm_irqs_off(). Ensure
+ * 'is_lazy' is at least as new as 'loaded_mm'.
+ */
+ smp_rmb();
+
/* Lazy TLB will get flushed at the next context switch. */
if (per_cpu(cpu_tlbstate_shared.is_lazy, cpu))
return false;
@@ -890,8 +902,15 @@ static bool should_flush_tlb(int cpu, void *data)
if (!info->mm)
return true;
+ /*
+ * While switching, the remote CPU could have state from
+ * either the prev or next mm. Assume the worst and flush.
+ */
+ if (loaded_mm == LOADED_MM_SWITCHING)
+ return true;
+
/* The target mm is loaded, and the CPU is not lazy. */
- if (per_cpu(cpu_tlbstate.loaded_mm, cpu) == info->mm)
+ if (loaded_mm == info->mm)
return true;
/* In cpumask, but not the loaded mm? Periodically remove by flushing. */
diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index 92db785a0a8e..f3068bb53c4d 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -36,6 +36,8 @@ static u8 *emit_code(u8 *ptr, u32 bytes, unsigned int len)
#define EMIT2(b1, b2) EMIT((b1) + ((b2) << 8), 2)
#define EMIT3(b1, b2, b3) EMIT((b1) + ((b2) << 8) + ((b3) << 16), 3)
#define EMIT4(b1, b2, b3, b4) EMIT((b1) + ((b2) << 8) + ((b3) << 16) + ((b4) << 24), 4)
+#define EMIT5(b1, b2, b3, b4, b5) \
+ do { EMIT1(b1); EMIT4(b2, b3, b4, b5); } while (0)
#define EMIT1_off32(b1, off) \
do { EMIT1(b1); EMIT(off, 4); } while (0)
@@ -462,7 +464,11 @@ static void emit_indirect_jump(u8 **pprog, int reg, u8 *ip)
{
u8 *prog = *pprog;
- if (cpu_feature_enabled(X86_FEATURE_RETPOLINE_LFENCE)) {
+ if (IS_ENABLED(CONFIG_MITIGATION_ITS) &&
+ cpu_feature_enabled(X86_FEATURE_INDIRECT_THUNK_ITS)) {
+ OPTIMIZER_HIDE_VAR(reg);
+ emit_jump(&prog, its_static_thunk(reg), ip);
+ } else if (cpu_feature_enabled(X86_FEATURE_RETPOLINE_LFENCE)) {
EMIT_LFENCE();
EMIT2(0xFF, 0xE0 + reg);
} else if (cpu_feature_enabled(X86_FEATURE_RETPOLINE)) {
@@ -481,7 +487,7 @@ static void emit_return(u8 **pprog, u8 *ip)
{
u8 *prog = *pprog;
- if (cpu_feature_enabled(X86_FEATURE_RETHUNK)) {
+ if (cpu_wants_rethunk()) {
emit_jump(&prog, x86_return_thunk, ip);
} else {
EMIT1(0xC3); /* ret */
@@ -947,6 +953,47 @@ static void emit_nops(u8 **pprog, int len)
#define RESTORE_TAIL_CALL_CNT(stack) \
EMIT3_off32(0x48, 0x8B, 0x85, -round_up(stack, 8) - 8)
+static int emit_spectre_bhb_barrier(u8 **pprog, u8 *ip,
+ struct bpf_prog *bpf_prog)
+{
+ u8 *prog = *pprog;
+ u8 *func;
+
+ if (cpu_feature_enabled(X86_FEATURE_CLEAR_BHB_LOOP)) {
+ /* The clearing sequence clobbers eax and ecx. */
+ EMIT1(0x50); /* push rax */
+ EMIT1(0x51); /* push rcx */
+ ip += 2;
+
+ func = (u8 *)clear_bhb_loop;
+
+ if (emit_call(&prog, func, ip))
+ return -EINVAL;
+ EMIT1(0x59); /* pop rcx */
+ EMIT1(0x58); /* pop rax */
+ }
+ /* Insert IBHF instruction */
+ if ((cpu_feature_enabled(X86_FEATURE_CLEAR_BHB_LOOP) &&
+ cpu_feature_enabled(X86_FEATURE_HYPERVISOR)) ||
+ cpu_feature_enabled(X86_FEATURE_CLEAR_BHB_HW)) {
+ /*
+ * Add an Indirect Branch History Fence (IBHF). IBHF acts as a
+ * fence preventing branch history from before the fence from
+ * affecting indirect branches after the fence. This is
+ * specifically used in cBPF jitted code to prevent Intra-mode
+ * BHI attacks. The IBHF instruction is designed to be a NOP on
+ * hardware that doesn't need or support it. The REP and REX.W
+ * prefixes are required by the microcode, and they also ensure
+ * that the NOP is unlikely to be used in existing code.
+ *
+ * IBHF is not a valid instruction in 32-bit mode.
+ */
+ EMIT5(0xF3, 0x48, 0x0F, 0x1E, 0xF8); /* ibhf */
+ }
+ *pprog = prog;
+ return 0;
+}
+
static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image, u8 *rw_image,
int oldproglen, struct jit_context *ctx, bool jmp_padding)
{
@@ -1760,6 +1807,15 @@ st: if (is_imm8(insn->off))
seen_exit = true;
/* Update cleanup_addr */
ctx->cleanup_addr = proglen;
+
+ if (bpf_prog_was_classic(bpf_prog) &&
+ !capable(CAP_SYS_ADMIN)) {
+ u8 *ip = image + addrs[i - 1];
+
+ if (emit_spectre_bhb_barrier(&prog, ip, bpf_prog))
+ return -EINVAL;
+ }
+
pop_callee_regs(&prog, callee_regs_used);
EMIT1(0xC9); /* leave */
emit_return(&prog, image + addrs[i - 1] + (prog - temp));
diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
index 31da94afe4f3..27aff2503765 100644
--- a/drivers/base/cpu.c
+++ b/drivers/base/cpu.c
@@ -595,6 +595,12 @@ ssize_t __weak cpu_show_reg_file_data_sampling(struct device *dev,
return sysfs_emit(buf, "Not affected\n");
}
+ssize_t __weak cpu_show_indirect_target_selection(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ return sysfs_emit(buf, "Not affected\n");
+}
+
static DEVICE_ATTR(meltdown, 0444, cpu_show_meltdown, NULL);
static DEVICE_ATTR(spectre_v1, 0444, cpu_show_spectre_v1, NULL);
static DEVICE_ATTR(spectre_v2, 0444, cpu_show_spectre_v2, NULL);
@@ -609,6 +615,7 @@ static DEVICE_ATTR(retbleed, 0444, cpu_show_retbleed, NULL);
static DEVICE_ATTR(gather_data_sampling, 0444, cpu_show_gds, NULL);
static DEVICE_ATTR(spec_rstack_overflow, 0444, cpu_show_spec_rstack_overflow, NULL);
static DEVICE_ATTR(reg_file_data_sampling, 0444, cpu_show_reg_file_data_sampling, NULL);
+static DEVICE_ATTR(indirect_target_selection, 0444, cpu_show_indirect_target_selection, NULL);
static struct attribute *cpu_root_vulnerabilities_attrs[] = {
&dev_attr_meltdown.attr,
@@ -625,6 +632,7 @@ static struct attribute *cpu_root_vulnerabilities_attrs[] = {
&dev_attr_gather_data_sampling.attr,
&dev_attr_spec_rstack_overflow.attr,
&dev_attr_reg_file_data_sampling.attr,
+ &dev_attr_indirect_target_selection.attr,
NULL
};
diff --git a/drivers/gpu/drm/amd/amdgpu/hdp_v5_2.c b/drivers/gpu/drm/amd/amdgpu/hdp_v5_2.c
index f52552c5fa27..6b9f2e1d9d69 100644
--- a/drivers/gpu/drm/amd/amdgpu/hdp_v5_2.c
+++ b/drivers/gpu/drm/amd/amdgpu/hdp_v5_2.c
@@ -34,7 +34,17 @@ static void hdp_v5_2_flush_hdp(struct amdgpu_device *adev,
if (!ring || !ring->funcs->emit_wreg) {
WREG32_NO_KIQ((adev->rmmio_remap.reg_offset + KFD_MMIO_REMAP_HDP_MEM_FLUSH_CNTL) >> 2,
0);
- RREG32_NO_KIQ((adev->rmmio_remap.reg_offset + KFD_MMIO_REMAP_HDP_MEM_FLUSH_CNTL) >> 2);
+ if (amdgpu_sriov_vf(adev)) {
+ /* this is fine because SR_IOV doesn't remap the register */
+ RREG32_NO_KIQ((adev->rmmio_remap.reg_offset + KFD_MMIO_REMAP_HDP_MEM_FLUSH_CNTL) >> 2);
+ } else {
+ /* We just need to read back a register to post the write.
+ * Reading back the remapped register causes problems on
+ * some platforms so just read back the memory size register.
+ */
+ if (adev->nbio.funcs->get_memsize)
+ adev->nbio.funcs->get_memsize(adev);
+ }
} else {
amdgpu_ring_emit_wreg(ring,
(adev->rmmio_remap.reg_offset + KFD_MMIO_REMAP_HDP_MEM_FLUSH_CNTL) >> 2,
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 7dee02e8ba6f..43aaf3a42b2e 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -10693,7 +10693,7 @@ int amdgpu_dm_process_dmub_aux_transfer_sync(
* Transient states before tunneling is enabled could
* lead to this error. We can ignore this for now.
*/
- if (p_notify->result != AUX_RET_ERROR_PROTOCOL_ERROR) {
+ if (p_notify->result == AUX_RET_ERROR_PROTOCOL_ERROR) {
DRM_WARN("DPIA AUX failed on 0x%x(%d), error %d\n",
payload->address, payload->length,
p_notify->result);
@@ -10702,22 +10702,14 @@ int amdgpu_dm_process_dmub_aux_transfer_sync(
goto out;
}
+ payload->reply[0] = adev->dm.dmub_notify->aux_reply.command & 0xF;
+ if (adev->dm.dmub_notify->aux_reply.command & 0xF0)
+ /* The reply is stored in the top nibble of the command. */
+ payload->reply[0] = (adev->dm.dmub_notify->aux_reply.command >> 4) & 0xF;
- payload->reply[0] = adev->dm.dmub_notify->aux_reply.command;
- if (!payload->write && p_notify->aux_reply.length &&
- (payload->reply[0] == AUX_TRANSACTION_REPLY_AUX_ACK)) {
-
- if (payload->length != p_notify->aux_reply.length) {
- DRM_WARN("invalid read length %d from DPIA AUX 0x%x(%d)!\n",
- p_notify->aux_reply.length,
- payload->address, payload->length);
- *operation_result = AUX_RET_ERROR_INVALID_REPLY;
- goto out;
- }
-
+ if (!payload->write && p_notify->aux_reply.length)
memcpy(payload->data, p_notify->aux_reply.data,
p_notify->aux_reply.length);
- }
/* success */
ret = p_notify->aux_reply.length;
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
index 5eb994ed5471..ade5dd5d8231 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
@@ -55,6 +55,9 @@ bool is_timing_changed(struct dc_stream_state *cur_stream,
struct dc_stream_state *new_stream);
+/*
+ * This function handles both native AUX and I2C-Over-AUX transactions.
+ */
static ssize_t dm_dp_aux_transfer(struct drm_dp_aux *aux,
struct drm_dp_aux_msg *msg)
{
@@ -91,15 +94,25 @@ static ssize_t dm_dp_aux_transfer(struct drm_dp_aux *aux,
if (adev->dm.aux_hpd_discon_quirk) {
if (msg->address == DP_SIDEBAND_MSG_DOWN_REQ_BASE &&
operation_result == AUX_RET_ERROR_HPD_DISCON) {
- result = 0;
+ result = msg->size;
operation_result = AUX_RET_SUCCESS;
}
}
- if (payload.write && result >= 0)
- result = msg->size;
+ /*
+ * result equals to 0 includes the cases of AUX_DEFER/I2C_DEFER
+ */
+ if (payload.write && result >= 0) {
+ if (result) {
+ /*one byte indicating partially written bytes. Force 0 to retry*/
+ drm_info(adev_to_drm(adev), "amdgpu: AUX partially written\n");
+ result = 0;
+ } else if (!payload.reply[0])
+ /*I2C_ACK|AUX_ACK*/
+ result = msg->size;
+ }
- if (result < 0)
+ if (result < 0) {
switch (operation_result) {
case AUX_RET_SUCCESS:
break;
@@ -118,6 +131,13 @@ static ssize_t dm_dp_aux_transfer(struct drm_dp_aux *aux,
break;
}
+ drm_info(adev_to_drm(adev), "amdgpu: DP AUX transfer fail:%d\n", operation_result);
+ }
+
+ if (payload.reply[0])
+ drm_info(adev_to_drm(adev), "amdgpu: AUX reply command not ACK: 0x%02x.",
+ payload.reply[0]);
+
return result;
}
diff --git a/drivers/gpu/drm/panel/panel-simple.c b/drivers/gpu/drm/panel/panel-simple.c
index a35e94e2ffd0..8a86f5d5c708 100644
--- a/drivers/gpu/drm/panel/panel-simple.c
+++ b/drivers/gpu/drm/panel/panel-simple.c
@@ -921,27 +921,28 @@ static const struct panel_desc auo_g070vvn01 = {
},
};
-static const struct drm_display_mode auo_g101evn010_mode = {
- .clock = 68930,
- .hdisplay = 1280,
- .hsync_start = 1280 + 82,
- .hsync_end = 1280 + 82 + 2,
- .htotal = 1280 + 82 + 2 + 84,
- .vdisplay = 800,
- .vsync_start = 800 + 8,
- .vsync_end = 800 + 8 + 2,
- .vtotal = 800 + 8 + 2 + 6,
+static const struct display_timing auo_g101evn010_timing = {
+ .pixelclock = { 64000000, 68930000, 85000000 },
+ .hactive = { 1280, 1280, 1280 },
+ .hfront_porch = { 8, 64, 256 },
+ .hback_porch = { 8, 64, 256 },
+ .hsync_len = { 40, 168, 767 },
+ .vactive = { 800, 800, 800 },
+ .vfront_porch = { 4, 8, 100 },
+ .vback_porch = { 4, 8, 100 },
+ .vsync_len = { 8, 16, 223 },
};
static const struct panel_desc auo_g101evn010 = {
- .modes = &auo_g101evn010_mode,
- .num_modes = 1,
+ .timings = &auo_g101evn010_timing,
+ .num_timings = 1,
.bpc = 6,
.size = {
.width = 216,
.height = 135,
},
.bus_format = MEDIA_BUS_FMT_RGB666_1X7X3_SPWG,
+ .bus_flags = DRM_BUS_FLAG_DE_HIGH,
.connector_type = DRM_MODE_CONNECTOR_LVDS,
};
diff --git a/drivers/gpu/drm/v3d/v3d_sched.c b/drivers/gpu/drm/v3d/v3d_sched.c
index 5b729013fd26..41493cf3d03b 100644
--- a/drivers/gpu/drm/v3d/v3d_sched.c
+++ b/drivers/gpu/drm/v3d/v3d_sched.c
@@ -289,11 +289,16 @@ v3d_gpu_reset_for_timeout(struct v3d_dev *v3d, struct drm_sched_job *sched_job)
return DRM_GPU_SCHED_STAT_NOMINAL;
}
-/* If the current address or return address have changed, then the GPU
- * has probably made progress and we should delay the reset. This
- * could fail if the GPU got in an infinite loop in the CL, but that
- * is pretty unlikely outside of an i-g-t testcase.
- */
+static void
+v3d_sched_skip_reset(struct drm_sched_job *sched_job)
+{
+ struct drm_gpu_scheduler *sched = sched_job->sched;
+
+ spin_lock(&sched->job_list_lock);
+ list_add(&sched_job->list, &sched->pending_list);
+ spin_unlock(&sched->job_list_lock);
+}
+
static enum drm_gpu_sched_stat
v3d_cl_job_timedout(struct drm_sched_job *sched_job, enum v3d_queue q,
u32 *timedout_ctca, u32 *timedout_ctra)
@@ -303,9 +308,16 @@ v3d_cl_job_timedout(struct drm_sched_job *sched_job, enum v3d_queue q,
u32 ctca = V3D_CORE_READ(0, V3D_CLE_CTNCA(q));
u32 ctra = V3D_CORE_READ(0, V3D_CLE_CTNRA(q));
+ /* If the current address or return address have changed, then the GPU
+ * has probably made progress and we should delay the reset. This
+ * could fail if the GPU got in an infinite loop in the CL, but that
+ * is pretty unlikely outside of an i-g-t testcase.
+ */
if (*timedout_ctca != ctca || *timedout_ctra != ctra) {
*timedout_ctca = ctca;
*timedout_ctra = ctra;
+
+ v3d_sched_skip_reset(sched_job);
return DRM_GPU_SCHED_STAT_NOMINAL;
}
@@ -345,11 +357,13 @@ v3d_csd_job_timedout(struct drm_sched_job *sched_job)
struct v3d_dev *v3d = job->base.v3d;
u32 batches = V3D_CORE_READ(0, V3D_CSD_CURRENT_CFG4);
- /* If we've made progress, skip reset and let the timer get
- * rearmed.
+ /* If we've made progress, skip reset, add the job to the pending
+ * list, and let the timer get rearmed.
*/
if (job->timedout_batches != batches) {
job->timedout_batches = batches;
+
+ v3d_sched_skip_reset(sched_job);
return DRM_GPU_SCHED_STAT_NOMINAL;
}
diff --git a/drivers/iio/accel/adis16201.c b/drivers/iio/accel/adis16201.c
index dfb8e2e5bdf5..f063226a42f3 100644
--- a/drivers/iio/accel/adis16201.c
+++ b/drivers/iio/accel/adis16201.c
@@ -211,9 +211,9 @@ static const struct iio_chan_spec adis16201_channels[] = {
BIT(IIO_CHAN_INFO_CALIBBIAS), 0, 14),
ADIS_AUX_ADC_CHAN(ADIS16201_AUX_ADC_REG, ADIS16201_SCAN_AUX_ADC, 0, 12),
ADIS_INCLI_CHAN(X, ADIS16201_XINCL_OUT_REG, ADIS16201_SCAN_INCLI_X,
- BIT(IIO_CHAN_INFO_CALIBBIAS), 0, 14),
+ BIT(IIO_CHAN_INFO_CALIBBIAS), 0, 12),
ADIS_INCLI_CHAN(Y, ADIS16201_YINCL_OUT_REG, ADIS16201_SCAN_INCLI_Y,
- BIT(IIO_CHAN_INFO_CALIBBIAS), 0, 14),
+ BIT(IIO_CHAN_INFO_CALIBBIAS), 0, 12),
IIO_CHAN_SOFT_TIMESTAMP(7)
};
diff --git a/drivers/iio/accel/adxl355_core.c b/drivers/iio/accel/adxl355_core.c
index 4bc648eac8b2..e73573cd773a 100644
--- a/drivers/iio/accel/adxl355_core.c
+++ b/drivers/iio/accel/adxl355_core.c
@@ -175,7 +175,7 @@ struct adxl355_data {
u8 transf_buf[3];
struct {
u8 buf[14];
- s64 ts;
+ aligned_s64 ts;
} buffer;
} __aligned(IIO_DMA_MINALIGN);
};
diff --git a/drivers/iio/accel/adxl367.c b/drivers/iio/accel/adxl367.c
index f1a41b92543a..af0ad4e5e4a5 100644
--- a/drivers/iio/accel/adxl367.c
+++ b/drivers/iio/accel/adxl367.c
@@ -622,18 +622,14 @@ static int _adxl367_set_odr(struct adxl367_state *st, enum adxl367_odr odr)
if (ret)
return ret;
+ st->odr = odr;
+
/* Activity timers depend on ODR */
ret = _adxl367_set_act_time_ms(st, st->act_time_ms);
if (ret)
return ret;
- ret = _adxl367_set_inact_time_ms(st, st->inact_time_ms);
- if (ret)
- return ret;
-
- st->odr = odr;
-
- return 0;
+ return _adxl367_set_inact_time_ms(st, st->inact_time_ms);
}
static int adxl367_set_odr(struct iio_dev *indio_dev, enum adxl367_odr odr)
diff --git a/drivers/iio/adc/ad7606_spi.c b/drivers/iio/adc/ad7606_spi.c
index 287a0591533b..67c96572cecc 100644
--- a/drivers/iio/adc/ad7606_spi.c
+++ b/drivers/iio/adc/ad7606_spi.c
@@ -127,7 +127,7 @@ static int ad7606_spi_reg_read(struct ad7606_state *st, unsigned int addr)
{
.tx_buf = &st->d16[0],
.len = 2,
- .cs_change = 0,
+ .cs_change = 1,
}, {
.rx_buf = &st->d16[1],
.len = 2,
diff --git a/drivers/iio/adc/dln2-adc.c b/drivers/iio/adc/dln2-adc.c
index 97d162a3cba4..49a2588e7431 100644
--- a/drivers/iio/adc/dln2-adc.c
+++ b/drivers/iio/adc/dln2-adc.c
@@ -483,7 +483,7 @@ static irqreturn_t dln2_adc_trigger_h(int irq, void *p)
struct iio_dev *indio_dev = pf->indio_dev;
struct {
__le16 values[DLN2_ADC_MAX_CHANNELS];
- int64_t timestamp_space;
+ aligned_s64 timestamp_space;
} data;
struct dln2_adc_get_all_vals dev_data;
struct dln2_adc *dln2 = iio_priv(indio_dev);
diff --git a/drivers/iio/imu/st_lsm6dsx/st_lsm6dsx_buffer.c b/drivers/iio/imu/st_lsm6dsx/st_lsm6dsx_buffer.c
index e49f2d120ed3..f5a68059f0a6 100644
--- a/drivers/iio/imu/st_lsm6dsx/st_lsm6dsx_buffer.c
+++ b/drivers/iio/imu/st_lsm6dsx/st_lsm6dsx_buffer.c
@@ -370,6 +370,9 @@ int st_lsm6dsx_read_fifo(struct st_lsm6dsx_hw *hw)
if (fifo_status & cpu_to_le16(ST_LSM6DSX_FIFO_EMPTY_MASK))
return 0;
+ if (!pattern_len)
+ pattern_len = ST_LSM6DSX_SAMPLE_SIZE;
+
fifo_len = (le16_to_cpu(fifo_status) & fifo_diff_mask) *
ST_LSM6DSX_CHAN_SIZE;
fifo_len = (fifo_len / pattern_len) * pattern_len;
@@ -587,6 +590,9 @@ int st_lsm6dsx_read_tagged_fifo(struct st_lsm6dsx_hw *hw)
if (!fifo_len)
return 0;
+ if (!pattern_len)
+ pattern_len = ST_LSM6DSX_TAGGED_SAMPLE_SIZE;
+
for (read_len = 0; read_len < fifo_len; read_len += pattern_len) {
err = st_lsm6dsx_read_block(hw,
ST_LSM6DSX_REG_FIFO_OUT_TAG_ADDR,
diff --git a/drivers/iio/temperature/maxim_thermocouple.c b/drivers/iio/temperature/maxim_thermocouple.c
index c28a7a6dea5f..555a61e2f3fd 100644
--- a/drivers/iio/temperature/maxim_thermocouple.c
+++ b/drivers/iio/temperature/maxim_thermocouple.c
@@ -121,9 +121,9 @@ static const struct maxim_thermocouple_chip maxim_thermocouple_chips[] = {
struct maxim_thermocouple_data {
struct spi_device *spi;
const struct maxim_thermocouple_chip *chip;
+ char tc_type;
u8 buffer[16] __aligned(IIO_DMA_MINALIGN);
- char tc_type;
};
static int maxim_thermocouple_read(struct maxim_thermocouple_data *data,
diff --git a/drivers/input/keyboard/mtk-pmic-keys.c b/drivers/input/keyboard/mtk-pmic-keys.c
index 9b34da0ec260..2d8999023468 100644
--- a/drivers/input/keyboard/mtk-pmic-keys.c
+++ b/drivers/input/keyboard/mtk-pmic-keys.c
@@ -133,8 +133,8 @@ static void mtk_pmic_keys_lp_reset_setup(struct mtk_pmic_keys *keys,
u32 value, mask;
int error;
- kregs_home = keys->keys[MTK_PMIC_HOMEKEY_INDEX].regs;
- kregs_pwr = keys->keys[MTK_PMIC_PWRKEY_INDEX].regs;
+ kregs_home = ®s->keys_regs[MTK_PMIC_HOMEKEY_INDEX];
+ kregs_pwr = ®s->keys_regs[MTK_PMIC_PWRKEY_INDEX];
error = of_property_read_u32(keys->dev->of_node, "power-off-time-sec",
&long_press_debounce);
diff --git a/drivers/input/mouse/synaptics.c b/drivers/input/mouse/synaptics.c
index d8c90a23a101..6cc66774c107 100644
--- a/drivers/input/mouse/synaptics.c
+++ b/drivers/input/mouse/synaptics.c
@@ -163,6 +163,7 @@ static const char * const topbuttonpad_pnp_ids[] = {
static const char * const smbus_pnp_ids[] = {
/* all of the topbuttonpad_pnp_ids are valid, we just add some extras */
+ "DLL060d", /* Dell Precision M3800 */
"LEN0048", /* X1 Carbon 3 */
"LEN0046", /* X250 */
"LEN0049", /* Yoga 11e */
@@ -189,11 +190,15 @@ static const char * const smbus_pnp_ids[] = {
"LEN2054", /* E480 */
"LEN2055", /* E580 */
"LEN2068", /* T14 Gen 1 */
+ "SYN1221", /* TUXEDO InfinityBook Pro 14 v5 */
+ "SYN3003", /* HP EliteBook 850 G1 */
"SYN3015", /* HP EliteBook 840 G2 */
"SYN3052", /* HP EliteBook 840 G4 */
"SYN3221", /* HP 15-ay000 */
"SYN323d", /* HP Spectre X360 13-w013dx */
"SYN3257", /* HP Envy 13-ad105ng */
+ "TOS01f6", /* Dynabook Portege X30L-G */
+ "TOS0213", /* Dynabook Portege X30-D */
NULL
};
diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
index 492a3b38f552..a20cf54d12dc 100644
--- a/drivers/md/dm-table.c
+++ b/drivers/md/dm-table.c
@@ -1243,7 +1243,7 @@ static int dm_keyslot_evict(struct blk_crypto_profile *profile,
t = dm_get_live_table(md, &srcu_idx);
if (!t)
- return 0;
+ goto put_live_table;
for (unsigned int i = 0; i < t->num_targets; i++) {
struct dm_target *ti = dm_table_get_target(t, i);
@@ -1254,6 +1254,7 @@ static int dm_keyslot_evict(struct blk_crypto_profile *profile,
(void *)key);
}
+put_live_table:
dm_put_live_table(md, srcu_idx);
return 0;
}
diff --git a/drivers/net/can/m_can/m_can.c b/drivers/net/can/m_can/m_can.c
index f28bdb5badd0..56b0f49c8116 100644
--- a/drivers/net/can/m_can/m_can.c
+++ b/drivers/net/can/m_can/m_can.c
@@ -2047,9 +2047,9 @@ EXPORT_SYMBOL_GPL(m_can_class_register);
void m_can_class_unregister(struct m_can_classdev *cdev)
{
+ unregister_candev(cdev->net);
if (cdev->is_peripheral)
can_rx_offload_del(&cdev->offload);
- unregister_candev(cdev->net);
}
EXPORT_SYMBOL_GPL(m_can_class_unregister);
diff --git a/drivers/net/can/spi/mcp251xfd/mcp251xfd-core.c b/drivers/net/can/spi/mcp251xfd/mcp251xfd-core.c
index 6fecfe4cd080..21ae3a89924e 100644
--- a/drivers/net/can/spi/mcp251xfd/mcp251xfd-core.c
+++ b/drivers/net/can/spi/mcp251xfd/mcp251xfd-core.c
@@ -75,6 +75,24 @@ static const struct can_bittiming_const mcp251xfd_data_bittiming_const = {
.brp_inc = 1,
};
+/* The datasheet of the mcp2518fd (DS20006027B) specifies a range of
+ * [-64,63] for TDCO, indicating a relative TDCO.
+ *
+ * Manual tests have shown, that using a relative TDCO configuration
+ * results in bus off, while an absolute configuration works.
+ *
+ * For TDCO use the max value (63) from the data sheet, but 0 as the
+ * minimum.
+ */
+static const struct can_tdc_const mcp251xfd_tdc_const = {
+ .tdcv_min = 0,
+ .tdcv_max = 63,
+ .tdco_min = 0,
+ .tdco_max = 63,
+ .tdcf_min = 0,
+ .tdcf_max = 0,
+};
+
static const char *__mcp251xfd_get_model_str(enum mcp251xfd_model model)
{
switch (model) {
@@ -510,8 +528,7 @@ static int mcp251xfd_set_bittiming(const struct mcp251xfd_priv *priv)
{
const struct can_bittiming *bt = &priv->can.bittiming;
const struct can_bittiming *dbt = &priv->can.data_bittiming;
- u32 val = 0;
- s8 tdco;
+ u32 tdcmod, val = 0;
int err;
/* CAN Control Register
@@ -575,11 +592,16 @@ static int mcp251xfd_set_bittiming(const struct mcp251xfd_priv *priv)
return err;
/* Transmitter Delay Compensation */
- tdco = clamp_t(int, dbt->brp * (dbt->prop_seg + dbt->phase_seg1),
- -64, 63);
- val = FIELD_PREP(MCP251XFD_REG_TDC_TDCMOD_MASK,
- MCP251XFD_REG_TDC_TDCMOD_AUTO) |
- FIELD_PREP(MCP251XFD_REG_TDC_TDCO_MASK, tdco);
+ if (priv->can.ctrlmode & CAN_CTRLMODE_TDC_AUTO)
+ tdcmod = MCP251XFD_REG_TDC_TDCMOD_AUTO;
+ else if (priv->can.ctrlmode & CAN_CTRLMODE_TDC_MANUAL)
+ tdcmod = MCP251XFD_REG_TDC_TDCMOD_MANUAL;
+ else
+ tdcmod = MCP251XFD_REG_TDC_TDCMOD_DISABLED;
+
+ val = FIELD_PREP(MCP251XFD_REG_TDC_TDCMOD_MASK, tdcmod) |
+ FIELD_PREP(MCP251XFD_REG_TDC_TDCV_MASK, priv->can.tdc.tdcv) |
+ FIELD_PREP(MCP251XFD_REG_TDC_TDCO_MASK, priv->can.tdc.tdco);
return regmap_write(priv->map_reg, MCP251XFD_REG_TDC, val);
}
@@ -2083,10 +2105,12 @@ static int mcp251xfd_probe(struct spi_device *spi)
priv->can.do_get_berr_counter = mcp251xfd_get_berr_counter;
priv->can.bittiming_const = &mcp251xfd_bittiming_const;
priv->can.data_bittiming_const = &mcp251xfd_data_bittiming_const;
+ priv->can.tdc_const = &mcp251xfd_tdc_const;
priv->can.ctrlmode_supported = CAN_CTRLMODE_LOOPBACK |
CAN_CTRLMODE_LISTENONLY | CAN_CTRLMODE_BERR_REPORTING |
CAN_CTRLMODE_FD | CAN_CTRLMODE_FD_NON_ISO |
- CAN_CTRLMODE_CC_LEN8_DLC;
+ CAN_CTRLMODE_CC_LEN8_DLC | CAN_CTRLMODE_TDC_AUTO |
+ CAN_CTRLMODE_TDC_MANUAL;
set_bit(MCP251XFD_FLAGS_DOWN, priv->flags);
priv->ndev = ndev;
priv->spi = spi;
@@ -2179,8 +2203,8 @@ static void mcp251xfd_remove(struct spi_device *spi)
struct mcp251xfd_priv *priv = spi_get_drvdata(spi);
struct net_device *ndev = priv->ndev;
- can_rx_offload_del(&priv->offload);
mcp251xfd_unregister(priv);
+ can_rx_offload_del(&priv->offload);
spi->max_speed_hz = priv->spi_max_speed_hz_orig;
free_candev(ndev);
}
diff --git a/drivers/net/dsa/b53/b53_common.c b/drivers/net/dsa/b53/b53_common.c
index ea7d2c01e672..1a23fcc0445c 100644
--- a/drivers/net/dsa/b53/b53_common.c
+++ b/drivers/net/dsa/b53/b53_common.c
@@ -373,15 +373,17 @@ static void b53_enable_vlan(struct b53_device *dev, int port, bool enable,
b53_read8(dev, B53_VLAN_PAGE, B53_VLAN_CTRL5, &vc5);
}
+ vc1 &= ~VC1_RX_MCST_FWD_EN;
+
if (enable) {
vc0 |= VC0_VLAN_EN | VC0_VID_CHK_EN | VC0_VID_HASH_VID;
- vc1 |= VC1_RX_MCST_UNTAG_EN | VC1_RX_MCST_FWD_EN;
+ vc1 |= VC1_RX_MCST_UNTAG_EN;
vc4 &= ~VC4_ING_VID_CHECK_MASK;
if (enable_filtering) {
vc4 |= VC4_ING_VID_VIO_DROP << VC4_ING_VID_CHECK_S;
vc5 |= VC5_DROP_VTABLE_MISS;
} else {
- vc4 |= VC4_ING_VID_VIO_FWD << VC4_ING_VID_CHECK_S;
+ vc4 |= VC4_NO_ING_VID_CHK << VC4_ING_VID_CHECK_S;
vc5 &= ~VC5_DROP_VTABLE_MISS;
}
@@ -393,7 +395,7 @@ static void b53_enable_vlan(struct b53_device *dev, int port, bool enable,
} else {
vc0 &= ~(VC0_VLAN_EN | VC0_VID_CHK_EN | VC0_VID_HASH_VID);
- vc1 &= ~(VC1_RX_MCST_UNTAG_EN | VC1_RX_MCST_FWD_EN);
+ vc1 &= ~VC1_RX_MCST_UNTAG_EN;
vc4 &= ~VC4_ING_VID_CHECK_MASK;
vc5 &= ~VC5_DROP_VTABLE_MISS;
@@ -1475,12 +1477,21 @@ int b53_vlan_add(struct dsa_switch *ds, int port,
bool untagged = vlan->flags & BRIDGE_VLAN_INFO_UNTAGGED;
bool pvid = vlan->flags & BRIDGE_VLAN_INFO_PVID;
struct b53_vlan *vl;
+ u16 old_pvid, new_pvid;
int err;
err = b53_vlan_prepare(ds, port, vlan);
if (err)
return err;
+ b53_read16(dev, B53_VLAN_PAGE, B53_VLAN_PORT_DEF_TAG(port), &old_pvid);
+ if (pvid)
+ new_pvid = vlan->vid;
+ else if (!pvid && vlan->vid == old_pvid)
+ new_pvid = b53_default_pvid(dev);
+ else
+ new_pvid = old_pvid;
+
vl = &dev->vlans[vlan->vid];
b53_get_vlan_entry(dev, vlan->vid, vl);
@@ -1497,10 +1508,10 @@ int b53_vlan_add(struct dsa_switch *ds, int port,
b53_set_vlan_entry(dev, vlan->vid, vl);
b53_fast_age_vlan(dev, vlan->vid);
- if (pvid && !dsa_is_cpu_port(ds, port)) {
+ if (!dsa_is_cpu_port(ds, port) && new_pvid != old_pvid) {
b53_write16(dev, B53_VLAN_PAGE, B53_VLAN_PORT_DEF_TAG(port),
- vlan->vid);
- b53_fast_age_vlan(dev, vlan->vid);
+ new_pvid);
+ b53_fast_age_vlan(dev, old_pvid);
}
return 0;
@@ -1912,7 +1923,7 @@ EXPORT_SYMBOL(b53_br_join);
void b53_br_leave(struct dsa_switch *ds, int port, struct dsa_bridge bridge)
{
struct b53_device *dev = ds->priv;
- struct b53_vlan *vl = &dev->vlans[0];
+ struct b53_vlan *vl;
s8 cpu_port = dsa_to_port(ds, port)->cpu_dp->index;
unsigned int i;
u16 pvlan, reg, pvid;
@@ -1938,6 +1949,7 @@ void b53_br_leave(struct dsa_switch *ds, int port, struct dsa_bridge bridge)
dev->ports[port].vlan_ctl_mask = pvlan;
pvid = b53_default_pvid(dev);
+ vl = &dev->vlans[pvid];
/* Make this port join all VLANs without VLAN entries */
if (is58xx(dev)) {
@@ -1946,12 +1958,12 @@ void b53_br_leave(struct dsa_switch *ds, int port, struct dsa_bridge bridge)
if (!(reg & BIT(cpu_port)))
reg |= BIT(cpu_port);
b53_write16(dev, B53_VLAN_PAGE, B53_JOIN_ALL_VLAN_EN, reg);
- } else {
- b53_get_vlan_entry(dev, pvid, vl);
- vl->members |= BIT(port) | BIT(cpu_port);
- vl->untag |= BIT(port) | BIT(cpu_port);
- b53_set_vlan_entry(dev, pvid, vl);
}
+
+ b53_get_vlan_entry(dev, pvid, vl);
+ vl->members |= BIT(port) | BIT(cpu_port);
+ vl->untag |= BIT(port) | BIT(cpu_port);
+ b53_set_vlan_entry(dev, pvid, vl);
}
EXPORT_SYMBOL(b53_br_leave);
diff --git a/drivers/net/phy/microchip.c b/drivers/net/phy/microchip.c
index 623607fd2cef..0b88635f4fbc 100644
--- a/drivers/net/phy/microchip.c
+++ b/drivers/net/phy/microchip.c
@@ -31,6 +31,47 @@ static int lan88xx_write_page(struct phy_device *phydev, int page)
return __phy_write(phydev, LAN88XX_EXT_PAGE_ACCESS, page);
}
+static int lan88xx_phy_config_intr(struct phy_device *phydev)
+{
+ int rc;
+
+ if (phydev->interrupts == PHY_INTERRUPT_ENABLED) {
+ /* unmask all source and clear them before enable */
+ rc = phy_write(phydev, LAN88XX_INT_MASK, 0x7FFF);
+ rc = phy_read(phydev, LAN88XX_INT_STS);
+ rc = phy_write(phydev, LAN88XX_INT_MASK,
+ LAN88XX_INT_MASK_MDINTPIN_EN_ |
+ LAN88XX_INT_MASK_LINK_CHANGE_);
+ } else {
+ rc = phy_write(phydev, LAN88XX_INT_MASK, 0);
+ if (rc)
+ return rc;
+
+ /* Ack interrupts after they have been disabled */
+ rc = phy_read(phydev, LAN88XX_INT_STS);
+ }
+
+ return rc < 0 ? rc : 0;
+}
+
+static irqreturn_t lan88xx_handle_interrupt(struct phy_device *phydev)
+{
+ int irq_status;
+
+ irq_status = phy_read(phydev, LAN88XX_INT_STS);
+ if (irq_status < 0) {
+ phy_error(phydev);
+ return IRQ_NONE;
+ }
+
+ if (!(irq_status & LAN88XX_INT_STS_LINK_CHANGE_))
+ return IRQ_NONE;
+
+ phy_trigger_machine(phydev);
+
+ return IRQ_HANDLED;
+}
+
static int lan88xx_suspend(struct phy_device *phydev)
{
struct lan88xx_priv *priv = phydev->priv;
@@ -351,9 +392,8 @@ static struct phy_driver microchip_phy_driver[] = {
.config_aneg = lan88xx_config_aneg,
.link_change_notify = lan88xx_link_change_notify,
- /* Interrupt handling is broken, do not define related
- * functions to force polling.
- */
+ .config_intr = lan88xx_phy_config_intr,
+ .handle_interrupt = lan88xx_handle_interrupt,
.suspend = lan88xx_suspend,
.resume = genphy_resume,
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 4c40ebb9503d..fbe3fb4fbe95 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -4881,7 +4881,8 @@ static void nvme_fw_act_work(struct work_struct *work)
msleep(100);
}
- if (!nvme_change_ctrl_state(ctrl, NVME_CTRL_LIVE))
+ if (!nvme_change_ctrl_state(ctrl, NVME_CTRL_CONNECTING) ||
+ !nvme_change_ctrl_state(ctrl, NVME_CTRL_LIVE))
return;
nvme_start_queues(ctrl);
diff --git a/drivers/staging/axis-fifo/axis-fifo.c b/drivers/staging/axis-fifo/axis-fifo.c
index 0a85ea667a1b..3369892eacf4 100644
--- a/drivers/staging/axis-fifo/axis-fifo.c
+++ b/drivers/staging/axis-fifo/axis-fifo.c
@@ -400,16 +400,14 @@ static ssize_t axis_fifo_read(struct file *f, char __user *buf,
bytes_available = ioread32(fifo->base_addr + XLLF_RLR_OFFSET);
if (!bytes_available) {
- dev_err(fifo->dt_device, "received a packet of length 0 - fifo core will be reset\n");
- reset_ip_core(fifo);
+ dev_err(fifo->dt_device, "received a packet of length 0\n");
ret = -EIO;
goto end_unlock;
}
if (bytes_available > len) {
- dev_err(fifo->dt_device, "user read buffer too small (available bytes=%zu user buffer bytes=%zu) - fifo core will be reset\n",
+ dev_err(fifo->dt_device, "user read buffer too small (available bytes=%zu user buffer bytes=%zu)\n",
bytes_available, len);
- reset_ip_core(fifo);
ret = -EINVAL;
goto end_unlock;
}
@@ -418,8 +416,7 @@ static ssize_t axis_fifo_read(struct file *f, char __user *buf,
/* this probably can't happen unless IP
* registers were previously mishandled
*/
- dev_err(fifo->dt_device, "received a packet that isn't word-aligned - fifo core will be reset\n");
- reset_ip_core(fifo);
+ dev_err(fifo->dt_device, "received a packet that isn't word-aligned\n");
ret = -EIO;
goto end_unlock;
}
@@ -440,7 +437,6 @@ static ssize_t axis_fifo_read(struct file *f, char __user *buf,
if (copy_to_user(buf + copied * sizeof(u32), tmp_buf,
copy * sizeof(u32))) {
- reset_ip_core(fifo);
ret = -EFAULT;
goto end_unlock;
}
@@ -549,7 +545,6 @@ static ssize_t axis_fifo_write(struct file *f, const char __user *buf,
if (copy_from_user(tmp_buf, buf + copied * sizeof(u32),
copy * sizeof(u32))) {
- reset_ip_core(fifo);
ret = -EFAULT;
goto end_unlock;
}
@@ -782,9 +777,6 @@ static int axis_fifo_parse_dt(struct axis_fifo *fifo)
goto end;
}
- /* IP sets TDFV to fifo depth - 4 so we will do the same */
- fifo->tx_fifo_depth -= 4;
-
ret = get_dts_property(fifo, "xlnx,use-rx-data", &fifo->has_rx_fifo);
if (ret) {
dev_err(fifo->dt_device, "missing xlnx,use-rx-data property\n");
diff --git a/drivers/staging/iio/adc/ad7816.c b/drivers/staging/iio/adc/ad7816.c
index 6c14d7bcdd67..081b17f49863 100644
--- a/drivers/staging/iio/adc/ad7816.c
+++ b/drivers/staging/iio/adc/ad7816.c
@@ -136,7 +136,7 @@ static ssize_t ad7816_store_mode(struct device *dev,
struct iio_dev *indio_dev = dev_to_iio_dev(dev);
struct ad7816_chip_info *chip = iio_priv(indio_dev);
- if (strcmp(buf, "full")) {
+ if (strcmp(buf, "full") == 0) {
gpiod_set_value(chip->rdwr_pin, 1);
chip->mode = AD7816_FULL;
} else {
diff --git a/drivers/usb/cdns3/cdnsp-gadget.c b/drivers/usb/cdns3/cdnsp-gadget.c
index 0044897ee800..913050230d1a 100644
--- a/drivers/usb/cdns3/cdnsp-gadget.c
+++ b/drivers/usb/cdns3/cdnsp-gadget.c
@@ -138,6 +138,26 @@ static void cdnsp_clear_port_change_bit(struct cdnsp_device *pdev,
(portsc & PORT_CHANGE_BITS), port_regs);
}
+static void cdnsp_set_apb_timeout_value(struct cdnsp_device *pdev)
+{
+ struct cdns *cdns = dev_get_drvdata(pdev->dev);
+ __le32 __iomem *reg;
+ void __iomem *base;
+ u32 offset = 0;
+ u32 val;
+
+ if (!cdns->override_apb_timeout)
+ return;
+
+ base = &pdev->cap_regs->hc_capbase;
+ offset = cdnsp_find_next_ext_cap(base, offset, D_XEC_PRE_REGS_CAP);
+ reg = base + offset + REG_CHICKEN_BITS_3_OFFSET;
+
+ val = le32_to_cpu(readl(reg));
+ val = CHICKEN_APB_TIMEOUT_SET(val, cdns->override_apb_timeout);
+ writel(cpu_to_le32(val), reg);
+}
+
static void cdnsp_set_chicken_bits_2(struct cdnsp_device *pdev, u32 bit)
{
__le32 __iomem *reg;
@@ -1776,6 +1796,8 @@ static void cdnsp_get_rev_cap(struct cdnsp_device *pdev)
reg += cdnsp_find_next_ext_cap(reg, 0, RTL_REV_CAP);
pdev->rev_cap = reg;
+ pdev->rtl_revision = readl(&pdev->rev_cap->rtl_revision);
+
dev_info(pdev->dev, "Rev: %08x/%08x, eps: %08x, buff: %08x/%08x\n",
readl(&pdev->rev_cap->ctrl_revision),
readl(&pdev->rev_cap->rtl_revision),
@@ -1801,6 +1823,15 @@ static int cdnsp_gen_setup(struct cdnsp_device *pdev)
pdev->hci_version = HC_VERSION(pdev->hcc_params);
pdev->hcc_params = readl(&pdev->cap_regs->hcc_params);
+ /*
+ * Override the APB timeout value to give the controller more time for
+ * enabling UTMI clock and synchronizing APB and UTMI clock domains.
+ * This fix is platform specific and is required to fixes issue with
+ * reading incorrect value from PORTSC register after resuming
+ * from L1 state.
+ */
+ cdnsp_set_apb_timeout_value(pdev);
+
cdnsp_get_rev_cap(pdev);
/* Make sure the Device Controller is halted. */
diff --git a/drivers/usb/cdns3/cdnsp-gadget.h b/drivers/usb/cdns3/cdnsp-gadget.h
index a61aef0dc273..5cffc1444d3a 100644
--- a/drivers/usb/cdns3/cdnsp-gadget.h
+++ b/drivers/usb/cdns3/cdnsp-gadget.h
@@ -520,6 +520,9 @@ struct cdnsp_rev_cap {
#define REG_CHICKEN_BITS_2_OFFSET 0x48
#define CHICKEN_XDMA_2_TP_CACHE_DIS BIT(28)
+#define REG_CHICKEN_BITS_3_OFFSET 0x4C
+#define CHICKEN_APB_TIMEOUT_SET(p, val) (((p) & ~GENMASK(21, 0)) | (val))
+
/* XBUF Extended Capability ID. */
#define XBUF_CAP_ID 0xCB
#define XBUF_RX_TAG_MASK_0_OFFSET 0x1C
@@ -1359,6 +1362,7 @@ struct cdnsp_port {
* @rev_cap: Controller Capabilities Registers.
* @hcs_params1: Cached register copies of read-only HCSPARAMS1
* @hcc_params: Cached register copies of read-only HCCPARAMS1
+ * @rtl_revision: Cached controller rtl revision.
* @setup: Temporary buffer for setup packet.
* @ep0_preq: Internal allocated request used during enumeration.
* @ep0_stage: ep0 stage during enumeration process.
@@ -1413,6 +1417,8 @@ struct cdnsp_device {
__u32 hcs_params1;
__u32 hcs_params3;
__u32 hcc_params;
+ #define RTL_REVISION_NEW_LPM 0x2700
+ __u32 rtl_revision;
/* Lock used in interrupt thread context. */
spinlock_t lock;
struct usb_ctrlrequest setup;
diff --git a/drivers/usb/cdns3/cdnsp-pci.c b/drivers/usb/cdns3/cdnsp-pci.c
index a85db23fa19f..b7a1f28faa1f 100644
--- a/drivers/usb/cdns3/cdnsp-pci.c
+++ b/drivers/usb/cdns3/cdnsp-pci.c
@@ -33,6 +33,8 @@
#define CDNS_DRD_ID 0x0100
#define CDNS_DRD_IF (PCI_CLASS_SERIAL_USB << 8 | 0x80)
+#define CHICKEN_APB_TIMEOUT_VALUE 0x1C20
+
static struct pci_dev *cdnsp_get_second_fun(struct pci_dev *pdev)
{
/*
@@ -144,6 +146,14 @@ static int cdnsp_pci_probe(struct pci_dev *pdev,
cdnsp->otg_irq = pdev->irq;
}
+ /*
+ * Cadence PCI based platform require some longer timeout for APB
+ * to fixes domain clock synchronization issue after resuming
+ * controller from L1 state.
+ */
+ cdnsp->override_apb_timeout = CHICKEN_APB_TIMEOUT_VALUE;
+ pci_set_drvdata(pdev, cdnsp);
+
if (pci_is_enabled(func)) {
cdnsp->dev = dev;
cdnsp->gadget_init = cdnsp_gadget_init;
@@ -153,8 +163,6 @@ static int cdnsp_pci_probe(struct pci_dev *pdev,
goto free_cdnsp;
}
- pci_set_drvdata(pdev, cdnsp);
-
device_wakeup_enable(&pdev->dev);
if (pci_dev_run_wake(pdev))
pm_runtime_put_noidle(&pdev->dev);
diff --git a/drivers/usb/cdns3/cdnsp-ring.c b/drivers/usb/cdns3/cdnsp-ring.c
index 47096b8e3179..6247584cb939 100644
--- a/drivers/usb/cdns3/cdnsp-ring.c
+++ b/drivers/usb/cdns3/cdnsp-ring.c
@@ -308,7 +308,8 @@ static bool cdnsp_ring_ep_doorbell(struct cdnsp_device *pdev,
writel(db_value, reg_addr);
- cdnsp_force_l0_go(pdev);
+ if (pdev->rtl_revision < RTL_REVISION_NEW_LPM)
+ cdnsp_force_l0_go(pdev);
/* Doorbell was set. */
return true;
diff --git a/drivers/usb/cdns3/core.h b/drivers/usb/cdns3/core.h
index 57d47348dc19..ac30ee21309d 100644
--- a/drivers/usb/cdns3/core.h
+++ b/drivers/usb/cdns3/core.h
@@ -79,6 +79,8 @@ struct cdns3_platform_data {
* @pdata: platform data from glue layer
* @lock: spinlock structure
* @xhci_plat_data: xhci private data structure pointer
+ * @override_apb_timeout: hold value of APB timeout. For value 0 the default
+ * value in CHICKEN_BITS_3 will be preserved.
* @gadget_init: pointer to gadget initialization function
*/
struct cdns {
@@ -117,6 +119,7 @@ struct cdns {
struct cdns3_platform_data *pdata;
spinlock_t lock;
struct xhci_plat_priv *xhci_plat_data;
+ u32 override_apb_timeout;
int (*gadget_init)(struct cdns *cdns);
};
diff --git a/drivers/usb/class/usbtmc.c b/drivers/usb/class/usbtmc.c
index c2e666e82857..2f92905e05ca 100644
--- a/drivers/usb/class/usbtmc.c
+++ b/drivers/usb/class/usbtmc.c
@@ -482,6 +482,7 @@ static int usbtmc_get_stb(struct usbtmc_file_data *file_data, __u8 *stb)
u8 *buffer;
u8 tag;
int rv;
+ long wait_rv;
dev_dbg(dev, "Enter ioctl_read_stb iin_ep_present: %d\n",
data->iin_ep_present);
@@ -511,16 +512,17 @@ static int usbtmc_get_stb(struct usbtmc_file_data *file_data, __u8 *stb)
}
if (data->iin_ep_present) {
- rv = wait_event_interruptible_timeout(
+ wait_rv = wait_event_interruptible_timeout(
data->waitq,
atomic_read(&data->iin_data_valid) != 0,
file_data->timeout);
- if (rv < 0) {
- dev_dbg(dev, "wait interrupted %d\n", rv);
+ if (wait_rv < 0) {
+ dev_dbg(dev, "wait interrupted %ld\n", wait_rv);
+ rv = wait_rv;
goto exit;
}
- if (rv == 0) {
+ if (wait_rv == 0) {
dev_dbg(dev, "wait timed out\n");
rv = -ETIMEDOUT;
goto exit;
@@ -539,6 +541,8 @@ static int usbtmc_get_stb(struct usbtmc_file_data *file_data, __u8 *stb)
dev_dbg(dev, "stb:0x%02x received %d\n", (unsigned int)*stb, rv);
+ rv = 0;
+
exit:
/* bump interrupt bTag */
data->iin_bTag += 1;
@@ -602,9 +606,9 @@ static int usbtmc488_ioctl_wait_srq(struct usbtmc_file_data *file_data,
{
struct usbtmc_device_data *data = file_data->data;
struct device *dev = &data->intf->dev;
- int rv;
u32 timeout;
unsigned long expire;
+ long wait_rv;
if (!data->iin_ep_present) {
dev_dbg(dev, "no interrupt endpoint present\n");
@@ -618,25 +622,24 @@ static int usbtmc488_ioctl_wait_srq(struct usbtmc_file_data *file_data,
mutex_unlock(&data->io_mutex);
- rv = wait_event_interruptible_timeout(
- data->waitq,
- atomic_read(&file_data->srq_asserted) != 0 ||
- atomic_read(&file_data->closing),
- expire);
+ wait_rv = wait_event_interruptible_timeout(
+ data->waitq,
+ atomic_read(&file_data->srq_asserted) != 0 ||
+ atomic_read(&file_data->closing),
+ expire);
mutex_lock(&data->io_mutex);
/* Note! disconnect or close could be called in the meantime */
if (atomic_read(&file_data->closing) || data->zombie)
- rv = -ENODEV;
+ return -ENODEV;
- if (rv < 0) {
- /* dev can be invalid now! */
- pr_debug("%s - wait interrupted %d\n", __func__, rv);
- return rv;
+ if (wait_rv < 0) {
+ dev_dbg(dev, "%s - wait interrupted %ld\n", __func__, wait_rv);
+ return wait_rv;
}
- if (rv == 0) {
+ if (wait_rv == 0) {
dev_dbg(dev, "%s - wait timed out\n", __func__);
return -ETIMEDOUT;
}
@@ -830,6 +833,7 @@ static ssize_t usbtmc_generic_read(struct usbtmc_file_data *file_data,
unsigned long expire;
int bufcount = 1;
int again = 0;
+ long wait_rv;
/* mutex already locked */
@@ -942,19 +946,24 @@ static ssize_t usbtmc_generic_read(struct usbtmc_file_data *file_data,
if (!(flags & USBTMC_FLAG_ASYNC)) {
dev_dbg(dev, "%s: before wait time %lu\n",
__func__, expire);
- retval = wait_event_interruptible_timeout(
+ wait_rv = wait_event_interruptible_timeout(
file_data->wait_bulk_in,
usbtmc_do_transfer(file_data),
expire);
- dev_dbg(dev, "%s: wait returned %d\n",
- __func__, retval);
+ dev_dbg(dev, "%s: wait returned %ld\n",
+ __func__, wait_rv);
+
+ if (wait_rv < 0) {
+ retval = wait_rv;
+ goto error;
+ }
- if (retval <= 0) {
- if (retval == 0)
- retval = -ETIMEDOUT;
+ if (wait_rv == 0) {
+ retval = -ETIMEDOUT;
goto error;
}
+
}
urb = usb_get_from_anchor(&file_data->in_anchor);
@@ -1380,7 +1389,10 @@ static ssize_t usbtmc_read(struct file *filp, char __user *buf,
if (!buffer)
return -ENOMEM;
- mutex_lock(&data->io_mutex);
+ retval = mutex_lock_interruptible(&data->io_mutex);
+ if (retval < 0)
+ goto exit_nolock;
+
if (data->zombie) {
retval = -ENODEV;
goto exit;
@@ -1503,6 +1515,7 @@ static ssize_t usbtmc_read(struct file *filp, char __user *buf,
exit:
mutex_unlock(&data->io_mutex);
+exit_nolock:
kfree(buffer);
return retval;
}
diff --git a/drivers/usb/gadget/udc/tegra-xudc.c b/drivers/usb/gadget/udc/tegra-xudc.c
index fd7a9535973e..771466cdf385 100644
--- a/drivers/usb/gadget/udc/tegra-xudc.c
+++ b/drivers/usb/gadget/udc/tegra-xudc.c
@@ -1743,6 +1743,10 @@ static int __tegra_xudc_ep_disable(struct tegra_xudc_ep *ep)
val = xudc_readl(xudc, CTRL);
val &= ~CTRL_RUN;
xudc_writel(xudc, val, CTRL);
+
+ val = xudc_readl(xudc, ST);
+ if (val & ST_RC)
+ xudc_writel(xudc, ST_RC, ST);
}
dev_info(xudc->dev, "ep %u disabled\n", ep->index);
diff --git a/drivers/usb/host/uhci-platform.c b/drivers/usb/host/uhci-platform.c
index b2049b47a08d..53d2fbee76e2 100644
--- a/drivers/usb/host/uhci-platform.c
+++ b/drivers/usb/host/uhci-platform.c
@@ -122,7 +122,7 @@ static int uhci_hcd_platform_probe(struct platform_device *pdev)
}
/* Get and enable clock if any specified */
- uhci->clk = devm_clk_get(&pdev->dev, NULL);
+ uhci->clk = devm_clk_get_optional(&pdev->dev, NULL);
if (IS_ERR(uhci->clk)) {
ret = PTR_ERR(uhci->clk);
goto err_rmr;
diff --git a/drivers/usb/host/xhci-tegra.c b/drivers/usb/host/xhci-tegra.c
index 51eabc5e8770..14a772feab79 100644
--- a/drivers/usb/host/xhci-tegra.c
+++ b/drivers/usb/host/xhci-tegra.c
@@ -1228,6 +1228,7 @@ static void tegra_xhci_id_work(struct work_struct *work)
tegra->otg_usb3_port = tegra_xusb_padctl_get_usb3_companion(tegra->padctl,
tegra->otg_usb2_port);
+ pm_runtime_get_sync(tegra->dev);
if (tegra->host_mode) {
/* switch to host mode */
if (tegra->otg_usb3_port >= 0) {
@@ -1257,6 +1258,7 @@ static void tegra_xhci_id_work(struct work_struct *work)
}
tegra_xhci_set_port_power(tegra, true, true);
+ pm_runtime_mark_last_busy(tegra->dev);
} else {
if (tegra->otg_usb3_port >= 0)
@@ -1264,6 +1266,7 @@ static void tegra_xhci_id_work(struct work_struct *work)
tegra_xhci_set_port_power(tegra, true, false);
}
+ pm_runtime_put_autosuspend(tegra->dev);
}
#if IS_ENABLED(CONFIG_PM) || IS_ENABLED(CONFIG_PM_SLEEP)
diff --git a/drivers/usb/typec/tcpm/tcpm.c b/drivers/usb/typec/tcpm/tcpm.c
index 4291093f110d..f40eabb7e24f 100644
--- a/drivers/usb/typec/tcpm/tcpm.c
+++ b/drivers/usb/typec/tcpm/tcpm.c
@@ -5136,7 +5136,7 @@ static void _tcpm_cc_change(struct tcpm_port *port, enum typec_cc_status cc1,
case SNK_TRY_WAIT_DEBOUNCE:
if (!tcpm_port_is_sink(port)) {
port->max_wait = 0;
- tcpm_set_state(port, SRC_TRYWAIT, 0);
+ tcpm_set_state(port, SRC_TRYWAIT, PD_T_PD_DEBOUNCE);
}
break;
case SRC_TRY_WAIT:
diff --git a/drivers/usb/typec/ucsi/displayport.c b/drivers/usb/typec/ucsi/displayport.c
index 2431febc4615..8c19081c3255 100644
--- a/drivers/usb/typec/ucsi/displayport.c
+++ b/drivers/usb/typec/ucsi/displayport.c
@@ -296,6 +296,8 @@ void ucsi_displayport_remove_partner(struct typec_altmode *alt)
if (!dp)
return;
+ cancel_work_sync(&dp->work);
+
dp->data.conf = 0;
dp->data.status = 0;
dp->initialized = false;
diff --git a/drivers/xen/xenbus/xenbus.h b/drivers/xen/xenbus/xenbus.h
index 2754bdfadcb8..4ba73320694a 100644
--- a/drivers/xen/xenbus/xenbus.h
+++ b/drivers/xen/xenbus/xenbus.h
@@ -77,6 +77,7 @@ enum xb_req_state {
struct xb_req_data {
struct list_head list;
wait_queue_head_t wq;
+ struct kref kref;
struct xsd_sockmsg msg;
uint32_t caller_req_id;
enum xsd_sockmsg_type type;
@@ -103,6 +104,7 @@ int xb_init_comms(void);
void xb_deinit_comms(void);
int xs_watch_msg(struct xs_watch_event *event);
void xs_request_exit(struct xb_req_data *req);
+void xs_free_req(struct kref *kref);
int xenbus_match(struct device *_dev, struct device_driver *_drv);
int xenbus_dev_probe(struct device *_dev);
diff --git a/drivers/xen/xenbus/xenbus_comms.c b/drivers/xen/xenbus/xenbus_comms.c
index e5fda0256feb..82df2da1b880 100644
--- a/drivers/xen/xenbus/xenbus_comms.c
+++ b/drivers/xen/xenbus/xenbus_comms.c
@@ -309,8 +309,8 @@ static int process_msg(void)
virt_wmb();
req->state = xb_req_state_got_reply;
req->cb(req);
- } else
- kfree(req);
+ }
+ kref_put(&req->kref, xs_free_req);
}
mutex_unlock(&xs_response_mutex);
@@ -386,14 +386,13 @@ static int process_writes(void)
state.req->msg.type = XS_ERROR;
state.req->err = err;
list_del(&state.req->list);
- if (state.req->state == xb_req_state_aborted)
- kfree(state.req);
- else {
+ if (state.req->state != xb_req_state_aborted) {
/* write err, then update state */
virt_wmb();
state.req->state = xb_req_state_got_reply;
wake_up(&state.req->wq);
}
+ kref_put(&state.req->kref, xs_free_req);
mutex_unlock(&xb_write_mutex);
diff --git a/drivers/xen/xenbus/xenbus_dev_frontend.c b/drivers/xen/xenbus/xenbus_dev_frontend.c
index 0792fda49a15..c495cff3da30 100644
--- a/drivers/xen/xenbus/xenbus_dev_frontend.c
+++ b/drivers/xen/xenbus/xenbus_dev_frontend.c
@@ -406,7 +406,7 @@ void xenbus_dev_queue_reply(struct xb_req_data *req)
mutex_unlock(&u->reply_mutex);
kfree(req->body);
- kfree(req);
+ kref_put(&req->kref, xs_free_req);
kref_put(&u->kref, xenbus_file_free);
diff --git a/drivers/xen/xenbus/xenbus_xs.c b/drivers/xen/xenbus/xenbus_xs.c
index 12e02eb01f59..a4dd92719e7a 100644
--- a/drivers/xen/xenbus/xenbus_xs.c
+++ b/drivers/xen/xenbus/xenbus_xs.c
@@ -112,6 +112,12 @@ static void xs_suspend_exit(void)
wake_up_all(&xs_state_enter_wq);
}
+void xs_free_req(struct kref *kref)
+{
+ struct xb_req_data *req = container_of(kref, struct xb_req_data, kref);
+ kfree(req);
+}
+
static uint32_t xs_request_enter(struct xb_req_data *req)
{
uint32_t rq_id;
@@ -237,6 +243,12 @@ static void xs_send(struct xb_req_data *req, struct xsd_sockmsg *msg)
req->caller_req_id = req->msg.req_id;
req->msg.req_id = xs_request_enter(req);
+ /*
+ * Take 2nd ref. One for this thread, and the second for the
+ * xenbus_thread.
+ */
+ kref_get(&req->kref);
+
mutex_lock(&xb_write_mutex);
list_add_tail(&req->list, &xb_write_list);
notify = list_is_singular(&xb_write_list);
@@ -261,8 +273,8 @@ static void *xs_wait_for_reply(struct xb_req_data *req, struct xsd_sockmsg *msg)
if (req->state == xb_req_state_queued ||
req->state == xb_req_state_wait_reply)
req->state = xb_req_state_aborted;
- else
- kfree(req);
+
+ kref_put(&req->kref, xs_free_req);
mutex_unlock(&xb_write_mutex);
return ret;
@@ -291,6 +303,7 @@ int xenbus_dev_request_and_reply(struct xsd_sockmsg *msg, void *par)
req->cb = xenbus_dev_queue_reply;
req->par = par;
req->user_req = true;
+ kref_init(&req->kref);
xs_send(req, msg);
@@ -319,6 +332,7 @@ static void *xs_talkv(struct xenbus_transaction t,
req->num_vecs = num_vecs;
req->cb = xs_wake_up;
req->user_req = false;
+ kref_init(&req->kref);
msg.req_id = 0;
msg.tx_id = t.id;
diff --git a/fs/namespace.c b/fs/namespace.c
index 57166cc7e511..0dcd57a75ad4 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -629,7 +629,7 @@ int __legitimize_mnt(struct vfsmount *bastard, unsigned seq)
return 0;
mnt = real_mount(bastard);
mnt_add_count(mnt, 1);
- smp_mb(); // see mntput_no_expire()
+ smp_mb(); // see mntput_no_expire() and do_umount()
if (likely(!read_seqretry(&mount_lock, seq)))
return 0;
if (bastard->mnt_flags & MNT_SYNC_UMOUNT) {
@@ -1707,6 +1707,7 @@ static int do_umount(struct mount *mnt, int flags)
umount_tree(mnt, UMOUNT_PROPAGATE);
retval = 0;
} else {
+ smp_mb(); // paired with __legitimize_mnt()
shrink_submounts(mnt);
retval = -EBUSY;
if (!propagate_mount_busy(mnt, 2)) {
diff --git a/fs/ocfs2/journal.c b/fs/ocfs2/journal.c
index 775766856a54..7f5fd16d3740 100644
--- a/fs/ocfs2/journal.c
+++ b/fs/ocfs2/journal.c
@@ -173,7 +173,7 @@ int ocfs2_recovery_init(struct ocfs2_super *osb)
struct ocfs2_recovery_map *rm;
mutex_init(&osb->recovery_lock);
- osb->disable_recovery = 0;
+ osb->recovery_state = OCFS2_REC_ENABLED;
osb->recovery_thread_task = NULL;
init_waitqueue_head(&osb->recovery_event);
@@ -192,31 +192,53 @@ int ocfs2_recovery_init(struct ocfs2_super *osb)
return 0;
}
-/* we can't grab the goofy sem lock from inside wait_event, so we use
- * memory barriers to make sure that we'll see the null task before
- * being woken up */
static int ocfs2_recovery_thread_running(struct ocfs2_super *osb)
{
- mb();
return osb->recovery_thread_task != NULL;
}
-void ocfs2_recovery_exit(struct ocfs2_super *osb)
+static void ocfs2_recovery_disable(struct ocfs2_super *osb,
+ enum ocfs2_recovery_state state)
{
- struct ocfs2_recovery_map *rm;
-
- /* disable any new recovery threads and wait for any currently
- * running ones to exit. Do this before setting the vol_state. */
mutex_lock(&osb->recovery_lock);
- osb->disable_recovery = 1;
+ /*
+ * If recovery thread is not running, we can directly transition to
+ * final state.
+ */
+ if (!ocfs2_recovery_thread_running(osb)) {
+ osb->recovery_state = state + 1;
+ goto out_lock;
+ }
+ osb->recovery_state = state;
+ /* Wait for recovery thread to acknowledge state transition */
+ wait_event_cmd(osb->recovery_event,
+ !ocfs2_recovery_thread_running(osb) ||
+ osb->recovery_state >= state + 1,
+ mutex_unlock(&osb->recovery_lock),
+ mutex_lock(&osb->recovery_lock));
+out_lock:
mutex_unlock(&osb->recovery_lock);
- wait_event(osb->recovery_event, !ocfs2_recovery_thread_running(osb));
- /* At this point, we know that no more recovery threads can be
- * launched, so wait for any recovery completion work to
- * complete. */
+ /*
+ * At this point we know that no more recovery work can be queued so
+ * wait for any recovery completion work to complete.
+ */
if (osb->ocfs2_wq)
flush_workqueue(osb->ocfs2_wq);
+}
+
+void ocfs2_recovery_disable_quota(struct ocfs2_super *osb)
+{
+ ocfs2_recovery_disable(osb, OCFS2_REC_QUOTA_WANT_DISABLE);
+}
+
+void ocfs2_recovery_exit(struct ocfs2_super *osb)
+{
+ struct ocfs2_recovery_map *rm;
+
+ /* disable any new recovery threads and wait for any currently
+ * running ones to exit. Do this before setting the vol_state. */
+ ocfs2_recovery_disable(osb, OCFS2_REC_WANT_DISABLE);
/*
* Now that recovery is shut down, and the osb is about to be
@@ -1439,6 +1461,18 @@ static int __ocfs2_recovery_thread(void *arg)
}
}
restart:
+ if (quota_enabled) {
+ mutex_lock(&osb->recovery_lock);
+ /* Confirm that recovery thread will no longer recover quotas */
+ if (osb->recovery_state == OCFS2_REC_QUOTA_WANT_DISABLE) {
+ osb->recovery_state = OCFS2_REC_QUOTA_DISABLED;
+ wake_up(&osb->recovery_event);
+ }
+ if (osb->recovery_state >= OCFS2_REC_QUOTA_DISABLED)
+ quota_enabled = 0;
+ mutex_unlock(&osb->recovery_lock);
+ }
+
status = ocfs2_super_lock(osb, 1);
if (status < 0) {
mlog_errno(status);
@@ -1536,27 +1570,29 @@ static int __ocfs2_recovery_thread(void *arg)
ocfs2_free_replay_slots(osb);
osb->recovery_thread_task = NULL;
- mb(); /* sync with ocfs2_recovery_thread_running */
+ if (osb->recovery_state == OCFS2_REC_WANT_DISABLE)
+ osb->recovery_state = OCFS2_REC_DISABLED;
wake_up(&osb->recovery_event);
mutex_unlock(&osb->recovery_lock);
- if (quota_enabled)
- kfree(rm_quota);
+ kfree(rm_quota);
return status;
}
void ocfs2_recovery_thread(struct ocfs2_super *osb, int node_num)
{
+ int was_set = -1;
+
mutex_lock(&osb->recovery_lock);
+ if (osb->recovery_state < OCFS2_REC_WANT_DISABLE)
+ was_set = ocfs2_recovery_map_set(osb, node_num);
trace_ocfs2_recovery_thread(node_num, osb->node_num,
- osb->disable_recovery, osb->recovery_thread_task,
- osb->disable_recovery ?
- -1 : ocfs2_recovery_map_set(osb, node_num));
+ osb->recovery_state, osb->recovery_thread_task, was_set);
- if (osb->disable_recovery)
+ if (osb->recovery_state >= OCFS2_REC_WANT_DISABLE)
goto out;
if (osb->recovery_thread_task)
diff --git a/fs/ocfs2/journal.h b/fs/ocfs2/journal.h
index 689c340c6363..5ebde794a653 100644
--- a/fs/ocfs2/journal.h
+++ b/fs/ocfs2/journal.h
@@ -148,6 +148,7 @@ void ocfs2_wait_for_recovery(struct ocfs2_super *osb);
int ocfs2_recovery_init(struct ocfs2_super *osb);
void ocfs2_recovery_exit(struct ocfs2_super *osb);
+void ocfs2_recovery_disable_quota(struct ocfs2_super *osb);
int ocfs2_compute_replay_slots(struct ocfs2_super *osb);
void ocfs2_free_replay_slots(struct ocfs2_super *osb);
diff --git a/fs/ocfs2/ocfs2.h b/fs/ocfs2/ocfs2.h
index 740b64238312..22882c636bfb 100644
--- a/fs/ocfs2/ocfs2.h
+++ b/fs/ocfs2/ocfs2.h
@@ -284,6 +284,21 @@ enum ocfs2_mount_options
#define OCFS2_OSB_ERROR_FS 0x0004
#define OCFS2_DEFAULT_ATIME_QUANTUM 60
+enum ocfs2_recovery_state {
+ OCFS2_REC_ENABLED = 0,
+ OCFS2_REC_QUOTA_WANT_DISABLE,
+ /*
+ * Must be OCFS2_REC_QUOTA_WANT_DISABLE + 1 for
+ * ocfs2_recovery_disable_quota() to work.
+ */
+ OCFS2_REC_QUOTA_DISABLED,
+ OCFS2_REC_WANT_DISABLE,
+ /*
+ * Must be OCFS2_REC_WANT_DISABLE + 1 for ocfs2_recovery_exit() to work
+ */
+ OCFS2_REC_DISABLED,
+};
+
struct ocfs2_journal;
struct ocfs2_slot_info;
struct ocfs2_recovery_map;
@@ -346,7 +361,7 @@ struct ocfs2_super
struct ocfs2_recovery_map *recovery_map;
struct ocfs2_replay_map *replay_map;
struct task_struct *recovery_thread_task;
- int disable_recovery;
+ enum ocfs2_recovery_state recovery_state;
wait_queue_head_t checkpoint_event;
struct ocfs2_journal *journal;
unsigned long osb_commit_interval;
diff --git a/fs/ocfs2/quota_local.c b/fs/ocfs2/quota_local.c
index 4b4fa58cd32f..0ca8975a1df4 100644
--- a/fs/ocfs2/quota_local.c
+++ b/fs/ocfs2/quota_local.c
@@ -453,8 +453,7 @@ struct ocfs2_quota_recovery *ocfs2_begin_quota_recovery(
/* Sync changes in local quota file into global quota file and
* reinitialize local quota file.
- * The function expects local quota file to be already locked and
- * s_umount locked in shared mode. */
+ * The function expects local quota file to be already locked. */
static int ocfs2_recover_local_quota_file(struct inode *lqinode,
int type,
struct ocfs2_quota_recovery *rec)
@@ -585,7 +584,6 @@ int ocfs2_finish_quota_recovery(struct ocfs2_super *osb,
{
unsigned int ino[OCFS2_MAXQUOTAS] = { LOCAL_USER_QUOTA_SYSTEM_INODE,
LOCAL_GROUP_QUOTA_SYSTEM_INODE };
- struct super_block *sb = osb->sb;
struct ocfs2_local_disk_dqinfo *ldinfo;
struct buffer_head *bh;
handle_t *handle;
@@ -597,7 +595,6 @@ int ocfs2_finish_quota_recovery(struct ocfs2_super *osb,
printk(KERN_NOTICE "ocfs2: Finishing quota recovery on device (%s) for "
"slot %u\n", osb->dev_str, slot_num);
- down_read(&sb->s_umount);
for (type = 0; type < OCFS2_MAXQUOTAS; type++) {
if (list_empty(&(rec->r_list[type])))
continue;
@@ -674,7 +671,6 @@ int ocfs2_finish_quota_recovery(struct ocfs2_super *osb,
break;
}
out:
- up_read(&sb->s_umount);
kfree(rec);
return status;
}
@@ -840,8 +836,7 @@ static int ocfs2_local_free_info(struct super_block *sb, int type)
ocfs2_release_local_quota_bitmaps(&oinfo->dqi_chunk);
/*
- * s_umount held in exclusive mode protects us against racing with
- * recovery thread...
+ * ocfs2_dismount_volume() has already aborted quota recovery...
*/
if (oinfo->dqi_rec) {
ocfs2_free_quota_recovery(oinfo->dqi_rec);
diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
index 8f7bb76d9cde..b30f9d71a35d 100644
--- a/fs/ocfs2/super.c
+++ b/fs/ocfs2/super.c
@@ -1869,6 +1869,9 @@ static void ocfs2_dismount_volume(struct super_block *sb, int mnt_err)
/* Orphan scan should be stopped as early as possible */
ocfs2_orphan_scan_stop(osb);
+ /* Stop quota recovery so that we can disable quotas */
+ ocfs2_recovery_disable_quota(osb);
+
ocfs2_disable_quotas(osb);
/* All dquots should be freed by now */
diff --git a/fs/smb/server/oplock.c b/fs/smb/server/oplock.c
index 2fcfabc35b06..258ed9978b90 100644
--- a/fs/smb/server/oplock.c
+++ b/fs/smb/server/oplock.c
@@ -1536,7 +1536,7 @@ struct lease_ctx_info *parse_lease_state(void *open_req, bool is_dir)
if (le16_to_cpu(cc->DataOffset) + le32_to_cpu(cc->DataLength) <
sizeof(struct create_lease_v2) - 4)
- return NULL;
+ goto err_out;
memcpy(lreq->lease_key, lc->lcontext.LeaseKey, SMB2_LEASE_KEY_SIZE);
if (is_dir) {
@@ -1557,7 +1557,7 @@ struct lease_ctx_info *parse_lease_state(void *open_req, bool is_dir)
if (le16_to_cpu(cc->DataOffset) + le32_to_cpu(cc->DataLength) <
sizeof(struct create_lease))
- return NULL;
+ goto err_out;
memcpy(lreq->lease_key, lc->lcontext.LeaseKey, SMB2_LEASE_KEY_SIZE);
lreq->req_state = lc->lcontext.LeaseState;
@@ -1566,6 +1566,9 @@ struct lease_ctx_info *parse_lease_state(void *open_req, bool is_dir)
lreq->version = 1;
}
return lreq;
+err_out:
+ kfree(lreq);
+ return NULL;
}
/**
diff --git a/fs/smb/server/vfs.c b/fs/smb/server/vfs.c
index 25e8004a5905..cf1b241a1578 100644
--- a/fs/smb/server/vfs.c
+++ b/fs/smb/server/vfs.c
@@ -440,6 +440,13 @@ static int ksmbd_vfs_stream_write(struct ksmbd_file *fp, char *buf, loff_t *pos,
goto out;
}
+ if (v_len <= *pos) {
+ pr_err("stream write position %lld is out of bounds (stream length: %zd)\n",
+ *pos, v_len);
+ err = -EINVAL;
+ goto out;
+ }
+
if (v_len < size) {
wbuf = kvzalloc(size, GFP_KERNEL);
if (!wbuf) {
diff --git a/include/linux/cpu.h b/include/linux/cpu.h
index 4b06b1f1e267..186e0e0f2e40 100644
--- a/include/linux/cpu.h
+++ b/include/linux/cpu.h
@@ -76,6 +76,8 @@ extern ssize_t cpu_show_gds(struct device *dev,
struct device_attribute *attr, char *buf);
extern ssize_t cpu_show_reg_file_data_sampling(struct device *dev,
struct device_attribute *attr, char *buf);
+extern ssize_t cpu_show_indirect_target_selection(struct device *dev,
+ struct device_attribute *attr, char *buf);
extern __printf(4, 5)
struct device *cpu_device_create(struct device *parent, void *drvdata,
diff --git a/include/linux/module.h b/include/linux/module.h
index 35876e89eb93..41c4c472ed17 100644
--- a/include/linux/module.h
+++ b/include/linux/module.h
@@ -536,6 +536,11 @@ struct module {
atomic_t refcnt;
#endif
+#ifdef CONFIG_MITIGATION_ITS
+ int its_num_pages;
+ void **its_page_array;
+#endif
+
#ifdef CONFIG_CONSTRUCTORS
/* Constructor functions. */
ctor_fn_t *ctors;
diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index d001a69fcb7d..aef8c7304d45 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -1031,6 +1031,9 @@ static inline notrace void rcu_read_unlock_sched_notrace(void)
#define kvfree_rcu(...) KVFREE_GET_MACRO(__VA_ARGS__, \
kvfree_rcu_arg_2, kvfree_rcu_arg_1)(__VA_ARGS__)
+#define kvfree_rcu_mightsleep(ptr) kvfree_rcu_arg_1(ptr)
+#define kfree_rcu_mightsleep(ptr) kvfree_rcu_mightsleep(ptr)
+
#define KVFREE_GET_MACRO(_1, _2, NAME, ...) NAME
#define kvfree_rcu_arg_2(ptr, rhf) \
do { \
diff --git a/include/linux/types.h b/include/linux/types.h
index ea8cf60a8a79..f1df5d8df2ef 100644
--- a/include/linux/types.h
+++ b/include/linux/types.h
@@ -109,8 +109,9 @@ typedef u64 u_int64_t;
typedef s64 int64_t;
#endif
-/* this is a special 64bit data type that is 8-byte aligned */
+/* These are the special 64-bit data types that are 8-byte aligned */
#define aligned_u64 __aligned_u64
+#define aligned_s64 __aligned_s64
#define aligned_be64 __aligned_be64
#define aligned_le64 __aligned_le64
diff --git a/include/net/flow.h b/include/net/flow.h
index 079cc493fe67..5a17fa6e016f 100644
--- a/include/net/flow.h
+++ b/include/net/flow.h
@@ -115,11 +115,10 @@ static inline void flowi4_init_output(struct flowi4 *fl4, int oif,
}
/* Reset some input parameters after previous lookup */
-static inline void flowi4_update_output(struct flowi4 *fl4, int oif, __u8 tos,
+static inline void flowi4_update_output(struct flowi4 *fl4, int oif,
__be32 daddr, __be32 saddr)
{
fl4->flowi4_oif = oif;
- fl4->flowi4_tos = tos;
fl4->daddr = daddr;
fl4->saddr = saddr;
}
diff --git a/include/net/route.h b/include/net/route.h
index 4185e6da9ef8..cdca622c5c6f 100644
--- a/include/net/route.h
+++ b/include/net/route.h
@@ -325,8 +325,7 @@ static inline struct rtable *ip_route_connect(struct flowi4 *fl4, __be32 dst,
if (IS_ERR(rt))
return rt;
ip_rt_put(rt);
- flowi4_update_output(fl4, oif, fl4->flowi4_tos, fl4->daddr,
- fl4->saddr);
+ flowi4_update_output(fl4, oif, fl4->daddr, fl4->saddr);
}
security_sk_classify_flow(sk, flowi4_to_flowi_common(fl4));
return ip_route_output_flow(net, fl4, sk);
@@ -341,8 +340,7 @@ static inline struct rtable *ip_route_newports(struct flowi4 *fl4, struct rtable
fl4->fl4_dport = dport;
fl4->fl4_sport = sport;
ip_rt_put(rt);
- flowi4_update_output(fl4, sk->sk_bound_dev_if,
- RT_CONN_FLAGS(sk), fl4->daddr,
+ flowi4_update_output(fl4, sk->sk_bound_dev_if, fl4->daddr,
fl4->saddr);
security_sk_classify_flow(sk, flowi4_to_flowi_common(fl4));
return ip_route_output_flow(sock_net(sk), fl4, sk);
diff --git a/include/uapi/linux/types.h b/include/uapi/linux/types.h
index 308433be33c2..c39147653094 100644
--- a/include/uapi/linux/types.h
+++ b/include/uapi/linux/types.h
@@ -49,6 +49,7 @@ typedef __u32 __bitwise __wsum;
* No conversions are necessary between 32-bit user-space and a 64-bit kernel.
*/
#define __aligned_u64 __u64 __attribute__((aligned(8)))
+#define __aligned_s64 __s64 __attribute__((aligned(8)))
#define __aligned_be64 __be64 __attribute__((aligned(8)))
#define __aligned_le64 __le64 __attribute__((aligned(8)))
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index ceac15a6bf3b..f39d66589180 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -372,24 +372,6 @@ static struct io_kiocb *__io_prep_linked_timeout(struct io_kiocb *req)
return req->link;
}
-static inline struct io_kiocb *io_prep_linked_timeout(struct io_kiocb *req)
-{
- if (likely(!(req->flags & REQ_F_ARM_LTIMEOUT)))
- return NULL;
- return __io_prep_linked_timeout(req);
-}
-
-static noinline void __io_arm_ltimeout(struct io_kiocb *req)
-{
- io_queue_linked_timeout(__io_prep_linked_timeout(req));
-}
-
-static inline void io_arm_ltimeout(struct io_kiocb *req)
-{
- if (unlikely(req->flags & REQ_F_ARM_LTIMEOUT))
- __io_arm_ltimeout(req);
-}
-
static void io_prep_async_work(struct io_kiocb *req)
{
const struct io_op_def *def = &io_op_defs[req->opcode];
@@ -437,7 +419,6 @@ static void io_prep_async_link(struct io_kiocb *req)
static void io_queue_iowq(struct io_kiocb *req)
{
- struct io_kiocb *link = io_prep_linked_timeout(req);
struct io_uring_task *tctx = req->task->io_uring;
BUG_ON(!tctx);
@@ -462,8 +443,6 @@ static void io_queue_iowq(struct io_kiocb *req)
trace_io_uring_queue_async_work(req, io_wq_is_hashed(&req->work));
io_wq_enqueue(tctx->io_wq, &req->work);
- if (link)
- io_queue_linked_timeout(link);
}
static __cold void io_queue_deferred(struct io_ring_ctx *ctx)
@@ -840,6 +819,14 @@ bool io_post_aux_cqe(struct io_ring_ctx *ctx,
{
bool filled;
+ /*
+ * If multishot has already posted deferred completions, ensure that
+ * those are flushed first before posting this one. If not, CQEs
+ * could get reordered.
+ */
+ if (!wq_list_empty(&ctx->submit_state.compl_reqs))
+ __io_submit_flush_completions(ctx);
+
io_cq_lock(ctx);
filled = io_fill_cqe_aux(ctx, user_data, res, cflags, allow_overflow);
io_cq_unlock_post(ctx);
@@ -1741,17 +1728,24 @@ static bool io_assign_file(struct io_kiocb *req, unsigned int issue_flags)
return !!req->file;
}
+#define REQ_ISSUE_SLOW_FLAGS (REQ_F_CREDS | REQ_F_ARM_LTIMEOUT)
+
static int io_issue_sqe(struct io_kiocb *req, unsigned int issue_flags)
{
const struct io_op_def *def = &io_op_defs[req->opcode];
const struct cred *creds = NULL;
+ struct io_kiocb *link = NULL;
int ret;
if (unlikely(!io_assign_file(req, issue_flags)))
return -EBADF;
- if (unlikely((req->flags & REQ_F_CREDS) && req->creds != current_cred()))
- creds = override_creds(req->creds);
+ if (unlikely(req->flags & REQ_ISSUE_SLOW_FLAGS)) {
+ if ((req->flags & REQ_F_CREDS) && req->creds != current_cred())
+ creds = override_creds(req->creds);
+ if (req->flags & REQ_F_ARM_LTIMEOUT)
+ link = __io_prep_linked_timeout(req);
+ }
if (!def->audit_skip)
audit_uring_entry(req->opcode);
@@ -1761,8 +1755,12 @@ static int io_issue_sqe(struct io_kiocb *req, unsigned int issue_flags)
if (!def->audit_skip)
audit_uring_exit(!ret, ret);
- if (creds)
- revert_creds(creds);
+ if (unlikely(creds || link)) {
+ if (creds)
+ revert_creds(creds);
+ if (link)
+ io_queue_linked_timeout(link);
+ }
if (ret == IOU_OK) {
if (issue_flags & IO_URING_F_COMPLETE_DEFER)
@@ -1809,8 +1807,6 @@ void io_wq_submit_work(struct io_wq_work *work)
else
req_ref_get(req);
- io_arm_ltimeout(req);
-
/* either cancelled or io-wq is dying, so don't touch tctx->iowq */
if (work->flags & IO_WQ_WORK_CANCEL) {
fail:
@@ -1908,15 +1904,11 @@ struct file *io_file_get_normal(struct io_kiocb *req, int fd)
static void io_queue_async(struct io_kiocb *req, int ret)
__must_hold(&req->ctx->uring_lock)
{
- struct io_kiocb *linked_timeout;
-
if (ret != -EAGAIN || (req->flags & REQ_F_NOWAIT)) {
io_req_complete_failed(req, ret);
return;
}
- linked_timeout = io_prep_linked_timeout(req);
-
switch (io_arm_poll_handler(req, 0)) {
case IO_APOLL_READY:
io_kbuf_recycle(req, 0);
@@ -1929,9 +1921,6 @@ static void io_queue_async(struct io_kiocb *req, int ret)
case IO_APOLL_OK:
break;
}
-
- if (linked_timeout)
- io_queue_linked_timeout(linked_timeout);
}
static inline void io_queue_sqe(struct io_kiocb *req)
@@ -1945,9 +1934,7 @@ static inline void io_queue_sqe(struct io_kiocb *req)
* We async punt it if the file wasn't marked NOWAIT, or if the file
* doesn't support non-blocking read/write attempts
*/
- if (likely(!ret))
- io_arm_ltimeout(req);
- else
+ if (unlikely(ret))
io_queue_async(req, ret);
}
diff --git a/kernel/params.c b/kernel/params.c
index 5b92310425c5..27954dfe3d20 100644
--- a/kernel/params.c
+++ b/kernel/params.c
@@ -945,7 +945,9 @@ int module_sysfs_initialized;
static void module_kobj_release(struct kobject *kobj)
{
struct module_kobject *mk = to_module_kobject(kobj);
- complete(mk->kobj_completion);
+
+ if (mk->kobj_completion)
+ complete(mk->kobj_completion);
}
struct kobj_type module_ktype = {
diff --git a/net/can/gw.c b/net/can/gw.c
index 23a3d89cad81..5933423663cd 100644
--- a/net/can/gw.c
+++ b/net/can/gw.c
@@ -130,7 +130,7 @@ struct cgw_job {
u32 handled_frames;
u32 dropped_frames;
u32 deleted_frames;
- struct cf_mod mod;
+ struct cf_mod __rcu *cf_mod;
union {
/* CAN frame data source */
struct net_device *dev;
@@ -459,6 +459,7 @@ static void can_can_gw_rcv(struct sk_buff *skb, void *data)
struct cgw_job *gwj = (struct cgw_job *)data;
struct canfd_frame *cf;
struct sk_buff *nskb;
+ struct cf_mod *mod;
int modidx = 0;
/* process strictly Classic CAN or CAN FD frames */
@@ -506,7 +507,8 @@ static void can_can_gw_rcv(struct sk_buff *skb, void *data)
* When there is at least one modification function activated,
* we need to copy the skb as we want to modify skb->data.
*/
- if (gwj->mod.modfunc[0])
+ mod = rcu_dereference(gwj->cf_mod);
+ if (mod->modfunc[0])
nskb = skb_copy(skb, GFP_ATOMIC);
else
nskb = skb_clone(skb, GFP_ATOMIC);
@@ -529,8 +531,8 @@ static void can_can_gw_rcv(struct sk_buff *skb, void *data)
cf = (struct canfd_frame *)nskb->data;
/* perform preprocessed modification functions if there are any */
- while (modidx < MAX_MODFUNCTIONS && gwj->mod.modfunc[modidx])
- (*gwj->mod.modfunc[modidx++])(cf, &gwj->mod);
+ while (modidx < MAX_MODFUNCTIONS && mod->modfunc[modidx])
+ (*mod->modfunc[modidx++])(cf, mod);
/* Has the CAN frame been modified? */
if (modidx) {
@@ -546,11 +548,11 @@ static void can_can_gw_rcv(struct sk_buff *skb, void *data)
}
/* check for checksum updates */
- if (gwj->mod.csumfunc.crc8)
- (*gwj->mod.csumfunc.crc8)(cf, &gwj->mod.csum.crc8);
+ if (mod->csumfunc.crc8)
+ (*mod->csumfunc.crc8)(cf, &mod->csum.crc8);
- if (gwj->mod.csumfunc.xor)
- (*gwj->mod.csumfunc.xor)(cf, &gwj->mod.csum.xor);
+ if (mod->csumfunc.xor)
+ (*mod->csumfunc.xor)(cf, &mod->csum.xor);
}
/* clear the skb timestamp if not configured the other way */
@@ -581,9 +583,20 @@ static void cgw_job_free_rcu(struct rcu_head *rcu_head)
{
struct cgw_job *gwj = container_of(rcu_head, struct cgw_job, rcu);
+ /* cgw_job::cf_mod is always accessed from the same cgw_job object within
+ * the same RCU read section. Once cgw_job is scheduled for removal,
+ * cf_mod can also be removed without mandating an additional grace period.
+ */
+ kfree(rcu_access_pointer(gwj->cf_mod));
kmem_cache_free(cgw_cache, gwj);
}
+/* Return cgw_job::cf_mod with RTNL protected section */
+static struct cf_mod *cgw_job_cf_mod(struct cgw_job *gwj)
+{
+ return rcu_dereference_protected(gwj->cf_mod, rtnl_is_locked());
+}
+
static int cgw_notifier(struct notifier_block *nb,
unsigned long msg, void *ptr)
{
@@ -616,6 +629,7 @@ static int cgw_put_job(struct sk_buff *skb, struct cgw_job *gwj, int type,
{
struct rtcanmsg *rtcan;
struct nlmsghdr *nlh;
+ struct cf_mod *mod;
nlh = nlmsg_put(skb, pid, seq, type, sizeof(*rtcan), flags);
if (!nlh)
@@ -650,82 +664,83 @@ static int cgw_put_job(struct sk_buff *skb, struct cgw_job *gwj, int type,
goto cancel;
}
+ mod = cgw_job_cf_mod(gwj);
if (gwj->flags & CGW_FLAGS_CAN_FD) {
struct cgw_fdframe_mod mb;
- if (gwj->mod.modtype.and) {
- memcpy(&mb.cf, &gwj->mod.modframe.and, sizeof(mb.cf));
- mb.modtype = gwj->mod.modtype.and;
+ if (mod->modtype.and) {
+ memcpy(&mb.cf, &mod->modframe.and, sizeof(mb.cf));
+ mb.modtype = mod->modtype.and;
if (nla_put(skb, CGW_FDMOD_AND, sizeof(mb), &mb) < 0)
goto cancel;
}
- if (gwj->mod.modtype.or) {
- memcpy(&mb.cf, &gwj->mod.modframe.or, sizeof(mb.cf));
- mb.modtype = gwj->mod.modtype.or;
+ if (mod->modtype.or) {
+ memcpy(&mb.cf, &mod->modframe.or, sizeof(mb.cf));
+ mb.modtype = mod->modtype.or;
if (nla_put(skb, CGW_FDMOD_OR, sizeof(mb), &mb) < 0)
goto cancel;
}
- if (gwj->mod.modtype.xor) {
- memcpy(&mb.cf, &gwj->mod.modframe.xor, sizeof(mb.cf));
- mb.modtype = gwj->mod.modtype.xor;
+ if (mod->modtype.xor) {
+ memcpy(&mb.cf, &mod->modframe.xor, sizeof(mb.cf));
+ mb.modtype = mod->modtype.xor;
if (nla_put(skb, CGW_FDMOD_XOR, sizeof(mb), &mb) < 0)
goto cancel;
}
- if (gwj->mod.modtype.set) {
- memcpy(&mb.cf, &gwj->mod.modframe.set, sizeof(mb.cf));
- mb.modtype = gwj->mod.modtype.set;
+ if (mod->modtype.set) {
+ memcpy(&mb.cf, &mod->modframe.set, sizeof(mb.cf));
+ mb.modtype = mod->modtype.set;
if (nla_put(skb, CGW_FDMOD_SET, sizeof(mb), &mb) < 0)
goto cancel;
}
} else {
struct cgw_frame_mod mb;
- if (gwj->mod.modtype.and) {
- memcpy(&mb.cf, &gwj->mod.modframe.and, sizeof(mb.cf));
- mb.modtype = gwj->mod.modtype.and;
+ if (mod->modtype.and) {
+ memcpy(&mb.cf, &mod->modframe.and, sizeof(mb.cf));
+ mb.modtype = mod->modtype.and;
if (nla_put(skb, CGW_MOD_AND, sizeof(mb), &mb) < 0)
goto cancel;
}
- if (gwj->mod.modtype.or) {
- memcpy(&mb.cf, &gwj->mod.modframe.or, sizeof(mb.cf));
- mb.modtype = gwj->mod.modtype.or;
+ if (mod->modtype.or) {
+ memcpy(&mb.cf, &mod->modframe.or, sizeof(mb.cf));
+ mb.modtype = mod->modtype.or;
if (nla_put(skb, CGW_MOD_OR, sizeof(mb), &mb) < 0)
goto cancel;
}
- if (gwj->mod.modtype.xor) {
- memcpy(&mb.cf, &gwj->mod.modframe.xor, sizeof(mb.cf));
- mb.modtype = gwj->mod.modtype.xor;
+ if (mod->modtype.xor) {
+ memcpy(&mb.cf, &mod->modframe.xor, sizeof(mb.cf));
+ mb.modtype = mod->modtype.xor;
if (nla_put(skb, CGW_MOD_XOR, sizeof(mb), &mb) < 0)
goto cancel;
}
- if (gwj->mod.modtype.set) {
- memcpy(&mb.cf, &gwj->mod.modframe.set, sizeof(mb.cf));
- mb.modtype = gwj->mod.modtype.set;
+ if (mod->modtype.set) {
+ memcpy(&mb.cf, &mod->modframe.set, sizeof(mb.cf));
+ mb.modtype = mod->modtype.set;
if (nla_put(skb, CGW_MOD_SET, sizeof(mb), &mb) < 0)
goto cancel;
}
}
- if (gwj->mod.uid) {
- if (nla_put_u32(skb, CGW_MOD_UID, gwj->mod.uid) < 0)
+ if (mod->uid) {
+ if (nla_put_u32(skb, CGW_MOD_UID, mod->uid) < 0)
goto cancel;
}
- if (gwj->mod.csumfunc.crc8) {
+ if (mod->csumfunc.crc8) {
if (nla_put(skb, CGW_CS_CRC8, CGW_CS_CRC8_LEN,
- &gwj->mod.csum.crc8) < 0)
+ &mod->csum.crc8) < 0)
goto cancel;
}
- if (gwj->mod.csumfunc.xor) {
+ if (mod->csumfunc.xor) {
if (nla_put(skb, CGW_CS_XOR, CGW_CS_XOR_LEN,
- &gwj->mod.csum.xor) < 0)
+ &mod->csum.xor) < 0)
goto cancel;
}
@@ -1059,7 +1074,7 @@ static int cgw_create_job(struct sk_buff *skb, struct nlmsghdr *nlh,
struct net *net = sock_net(skb->sk);
struct rtcanmsg *r;
struct cgw_job *gwj;
- struct cf_mod mod;
+ struct cf_mod *mod;
struct can_can_gw ccgw;
u8 limhops = 0;
int err = 0;
@@ -1078,37 +1093,48 @@ static int cgw_create_job(struct sk_buff *skb, struct nlmsghdr *nlh,
if (r->gwtype != CGW_TYPE_CAN_CAN)
return -EINVAL;
- err = cgw_parse_attr(nlh, &mod, CGW_TYPE_CAN_CAN, &ccgw, &limhops);
+ mod = kmalloc(sizeof(*mod), GFP_KERNEL);
+ if (!mod)
+ return -ENOMEM;
+
+ err = cgw_parse_attr(nlh, mod, CGW_TYPE_CAN_CAN, &ccgw, &limhops);
if (err < 0)
- return err;
+ goto out_free_cf;
- if (mod.uid) {
+ if (mod->uid) {
ASSERT_RTNL();
/* check for updating an existing job with identical uid */
hlist_for_each_entry(gwj, &net->can.cgw_list, list) {
- if (gwj->mod.uid != mod.uid)
+ struct cf_mod *old_cf;
+
+ old_cf = cgw_job_cf_mod(gwj);
+ if (old_cf->uid != mod->uid)
continue;
/* interfaces & filters must be identical */
- if (memcmp(&gwj->ccgw, &ccgw, sizeof(ccgw)))
- return -EINVAL;
+ if (memcmp(&gwj->ccgw, &ccgw, sizeof(ccgw))) {
+ err = -EINVAL;
+ goto out_free_cf;
+ }
- /* update modifications with disabled softirq & quit */
- local_bh_disable();
- memcpy(&gwj->mod, &mod, sizeof(mod));
- local_bh_enable();
+ rcu_assign_pointer(gwj->cf_mod, mod);
+ kfree_rcu_mightsleep(old_cf);
return 0;
}
}
/* ifindex == 0 is not allowed for job creation */
- if (!ccgw.src_idx || !ccgw.dst_idx)
- return -ENODEV;
+ if (!ccgw.src_idx || !ccgw.dst_idx) {
+ err = -ENODEV;
+ goto out_free_cf;
+ }
gwj = kmem_cache_alloc(cgw_cache, GFP_KERNEL);
- if (!gwj)
- return -ENOMEM;
+ if (!gwj) {
+ err = -ENOMEM;
+ goto out_free_cf;
+ }
gwj->handled_frames = 0;
gwj->dropped_frames = 0;
@@ -1118,7 +1144,7 @@ static int cgw_create_job(struct sk_buff *skb, struct nlmsghdr *nlh,
gwj->limit_hops = limhops;
/* insert already parsed information */
- memcpy(&gwj->mod, &mod, sizeof(mod));
+ RCU_INIT_POINTER(gwj->cf_mod, mod);
memcpy(&gwj->ccgw, &ccgw, sizeof(ccgw));
err = -ENODEV;
@@ -1145,9 +1171,11 @@ static int cgw_create_job(struct sk_buff *skb, struct nlmsghdr *nlh,
if (!err)
hlist_add_head_rcu(&gwj->list, &net->can.cgw_list);
out:
- if (err)
+ if (err) {
kmem_cache_free(cgw_cache, gwj);
-
+out_free_cf:
+ kfree(mod);
+ }
return err;
}
@@ -1207,19 +1235,22 @@ static int cgw_remove_job(struct sk_buff *skb, struct nlmsghdr *nlh,
/* remove only the first matching entry */
hlist_for_each_entry_safe(gwj, nx, &net->can.cgw_list, list) {
+ struct cf_mod *cf_mod;
+
if (gwj->flags != r->flags)
continue;
if (gwj->limit_hops != limhops)
continue;
+ cf_mod = cgw_job_cf_mod(gwj);
/* we have a match when uid is enabled and identical */
- if (gwj->mod.uid || mod.uid) {
- if (gwj->mod.uid != mod.uid)
+ if (cf_mod->uid || mod.uid) {
+ if (cf_mod->uid != mod.uid)
continue;
} else {
/* no uid => check for identical modifications */
- if (memcmp(&gwj->mod, &mod, sizeof(mod)))
+ if (memcmp(cf_mod, &mod, sizeof(mod)))
continue;
}
diff --git a/net/core/filter.c b/net/core/filter.c
index d713696d0832..497b41ac399d 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -2500,6 +2500,7 @@ int skb_do_redirect(struct sk_buff *skb)
goto out_drop;
skb->dev = dev;
dev_sw_netstats_rx_add(dev, skb->len);
+ skb_scrub_packet(skb, false);
return -EAGAIN;
}
return flags & BPF_F_NEIGH ?
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index 0811cc8956a3..40ee2d6ef229 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -3162,16 +3162,13 @@ static void add_v4_addrs(struct inet6_dev *idev)
struct in6_addr addr;
struct net_device *dev;
struct net *net = dev_net(idev->dev);
- int scope, plen, offset = 0;
+ int scope, plen;
u32 pflags = 0;
ASSERT_RTNL();
memset(&addr, 0, sizeof(struct in6_addr));
- /* in case of IP6GRE the dev_addr is an IPv6 and therefore we use only the last 4 bytes */
- if (idev->dev->addr_len == sizeof(struct in6_addr))
- offset = sizeof(struct in6_addr) - 4;
- memcpy(&addr.s6_addr32[3], idev->dev->dev_addr + offset, 4);
+ memcpy(&addr.s6_addr32[3], idev->dev->dev_addr, 4);
if (!(idev->dev->flags & IFF_POINTOPOINT) && idev->dev->type == ARPHRD_SIT) {
scope = IPV6_ADDR_COMPATv4;
@@ -3481,7 +3478,13 @@ static void addrconf_gre_config(struct net_device *dev)
return;
}
- if (dev->type == ARPHRD_ETHER) {
+ /* Generate the IPv6 link-local address using addrconf_addr_gen(),
+ * unless we have an IPv4 GRE device not bound to an IP address and
+ * which is in EUI64 mode (as __ipv6_isatap_ifid() would fail in this
+ * case). Such devices fall back to add_v4_addrs() instead.
+ */
+ if (!(dev->type == ARPHRD_IPGRE && *(__be32 *)dev->dev_addr == 0 &&
+ idev->cnf.addr_gen_mode == IN6_ADDR_GEN_MODE_EUI64)) {
addrconf_addr_gen(idev, true);
return;
}
diff --git a/net/netfilter/ipset/ip_set_hash_gen.h b/net/netfilter/ipset/ip_set_hash_gen.h
index ef04e556aadb..0bd6bf46f05f 100644
--- a/net/netfilter/ipset/ip_set_hash_gen.h
+++ b/net/netfilter/ipset/ip_set_hash_gen.h
@@ -63,7 +63,7 @@ struct hbucket {
#define ahash_sizeof_regions(htable_bits) \
(ahash_numof_locks(htable_bits) * sizeof(struct ip_set_region))
#define ahash_region(n, htable_bits) \
- ((n) % ahash_numof_locks(htable_bits))
+ ((n) / jhash_size(HTABLE_REGION_BITS))
#define ahash_bucket_start(h, htable_bits) \
((htable_bits) < HTABLE_REGION_BITS ? 0 \
: (h) * jhash_size(HTABLE_REGION_BITS))
diff --git a/net/netfilter/ipvs/ip_vs_xmit.c b/net/netfilter/ipvs/ip_vs_xmit.c
index d40a4ca2b27f..e1437c72ca6e 100644
--- a/net/netfilter/ipvs/ip_vs_xmit.c
+++ b/net/netfilter/ipvs/ip_vs_xmit.c
@@ -119,13 +119,12 @@ __mtu_check_toobig_v6(const struct sk_buff *skb, u32 mtu)
return false;
}
-/* Get route to daddr, update *saddr, optionally bind route to saddr */
+/* Get route to daddr, optionally bind route to saddr */
static struct rtable *do_output_route4(struct net *net, __be32 daddr,
- int rt_mode, __be32 *saddr)
+ int rt_mode, __be32 *ret_saddr)
{
struct flowi4 fl4;
struct rtable *rt;
- bool loop = false;
memset(&fl4, 0, sizeof(fl4));
fl4.daddr = daddr;
@@ -135,23 +134,17 @@ static struct rtable *do_output_route4(struct net *net, __be32 daddr,
retry:
rt = ip_route_output_key(net, &fl4);
if (IS_ERR(rt)) {
- /* Invalid saddr ? */
- if (PTR_ERR(rt) == -EINVAL && *saddr &&
- rt_mode & IP_VS_RT_MODE_CONNECT && !loop) {
- *saddr = 0;
- flowi4_update_output(&fl4, 0, 0, daddr, 0);
- goto retry;
- }
IP_VS_DBG_RL("ip_route_output error, dest: %pI4\n", &daddr);
return NULL;
- } else if (!*saddr && rt_mode & IP_VS_RT_MODE_CONNECT && fl4.saddr) {
+ }
+ if (rt_mode & IP_VS_RT_MODE_CONNECT && fl4.saddr) {
ip_rt_put(rt);
- *saddr = fl4.saddr;
- flowi4_update_output(&fl4, 0, 0, daddr, fl4.saddr);
- loop = true;
+ flowi4_update_output(&fl4, 0, daddr, fl4.saddr);
+ rt_mode = 0;
goto retry;
}
- *saddr = fl4.saddr;
+ if (ret_saddr)
+ *ret_saddr = fl4.saddr;
return rt;
}
@@ -344,19 +337,15 @@ __ip_vs_get_out_rt(struct netns_ipvs *ipvs, int skb_af, struct sk_buff *skb,
if (ret_saddr)
*ret_saddr = dest_dst->dst_saddr.ip;
} else {
- __be32 saddr = htonl(INADDR_ANY);
-
noref = 0;
/* For such unconfigured boxes avoid many route lookups
* for performance reasons because we do not remember saddr
*/
rt_mode &= ~IP_VS_RT_MODE_CONNECT;
- rt = do_output_route4(net, daddr, rt_mode, &saddr);
+ rt = do_output_route4(net, daddr, rt_mode, ret_saddr);
if (!rt)
goto err_unreach;
- if (ret_saddr)
- *ret_saddr = saddr;
}
local = (rt->rt_flags & RTCF_LOCAL) ? 1 : 0;
diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
index c517b24b3093..a87c25e06baf 100644
--- a/net/openvswitch/actions.c
+++ b/net/openvswitch/actions.c
@@ -954,8 +954,7 @@ static int output_userspace(struct datapath *dp, struct sk_buff *skb,
upcall.cmd = OVS_PACKET_CMD_ACTION;
upcall.mru = OVS_CB(skb)->mru;
- for (a = nla_data(attr), rem = nla_len(attr); rem > 0;
- a = nla_next(a, &rem)) {
+ nla_for_each_nested(a, attr, rem) {
switch (nla_type(a)) {
case OVS_USERSPACE_ATTR_USERDATA:
upcall.userdata = a;
diff --git a/net/sched/sch_htb.c b/net/sched/sch_htb.c
index fb0fb8825574..29f394fe3998 100644
--- a/net/sched/sch_htb.c
+++ b/net/sched/sch_htb.c
@@ -345,7 +345,8 @@ static void htb_add_to_wait_tree(struct htb_sched *q,
*/
static inline void htb_next_rb_node(struct rb_node **n)
{
- *n = rb_next(*n);
+ if (*n)
+ *n = rb_next(*n);
}
/**
@@ -606,8 +607,8 @@ static inline void htb_activate(struct htb_sched *q, struct htb_class *cl)
*/
static inline void htb_deactivate(struct htb_sched *q, struct htb_class *cl)
{
- WARN_ON(!cl->prio_activity);
-
+ if (!cl->prio_activity)
+ return;
htb_deactivate_prios(q, cl);
cl->prio_activity = 0;
}
@@ -1482,8 +1483,6 @@ static void htb_qlen_notify(struct Qdisc *sch, unsigned long arg)
{
struct htb_class *cl = (struct htb_class *)arg;
- if (!cl->prio_activity)
- return;
htb_deactivate(qdisc_priv(sch), cl);
}
@@ -1735,8 +1734,7 @@ static int htb_delete(struct Qdisc *sch, unsigned long arg,
if (cl->parent)
cl->parent->children--;
- if (cl->prio_activity)
- htb_deactivate(q, cl);
+ htb_deactivate(q, cl);
if (cl->cmode != HTB_CAN_SEND)
htb_safe_rb_erase(&cl->pq_node,
@@ -1948,8 +1946,7 @@ static int htb_change_class(struct Qdisc *sch, u32 classid,
/* turn parent into inner node */
qdisc_purge_queue(parent->leaf.q);
parent_qdisc = parent->leaf.q;
- if (parent->prio_activity)
- htb_deactivate(q, parent);
+ htb_deactivate(q, parent);
/* remove from evt list because of level change */
if (parent->cmode != HTB_CAN_SEND) {
diff --git a/net/sctp/protocol.c b/net/sctp/protocol.c
index bcd3384ab07a..036dc574af4f 100644
--- a/net/sctp/protocol.c
+++ b/net/sctp/protocol.c
@@ -497,9 +497,7 @@ static void sctp_v4_get_dst(struct sctp_transport *t, union sctp_addr *saddr,
continue;
fl4->fl4_sport = laddr->a.v4.sin_port;
- flowi4_update_output(fl4,
- asoc->base.sk->sk_bound_dev_if,
- RT_CONN_FLAGS_TOS(asoc->base.sk, tos),
+ flowi4_update_output(fl4, asoc->base.sk->sk_bound_dev_if,
daddr->v4.sin_addr.s_addr,
laddr->a.v4.sin_addr.s_addr);
Powered by blists - more mailing lists