Changes in 4.9.204 net/mlx4_en: fix mlx4 ethtool -N insertion net: rtnetlink: prevent underflows in do_setvfinfo() sfc: Only cancel the PPS workqueue if it exists net/mlx5e: Fix set vf link state error flow net/sched: act_pedit: fix WARN() in the traffic path gpio: max77620: Fixup debounce delays tools: gpio: Correctly add make dependencies for gpio_utils Revert "fs: ocfs2: fix possible null-pointer dereferences in ocfs2_xa_prepare_entry()" mm/ksm.c: don't WARN if page is still mapped in remove_stable_node() platform/x86: asus-nb-wmi: Support ALS on the Zenbook UX430UQ platform/x86: asus-wmi: Only Tell EC the OS will handle display hotkeys from asus_nb_wmi mwifiex: Fix NL80211_TX_POWER_LIMITED ALSA: isight: fix leak of reference to firewire unit in error path of .probe callback printk: fix integer overflow in setup_log_buf() gfs2: Fix marking bitmaps non-full synclink_gt(): fix compat_ioctl() powerpc: Fix signedness bug in update_flash_db() powerpc/eeh: Fix use of EEH_PE_KEEP on wrong field brcmsmac: AP mode: update beacon when TIM changes ath10k: allocate small size dma memory in ath10k_pci_diag_write_mem spi: sh-msiof: fix deferred probing mmc: mediatek: fix cannot receive new request when msdc_cmd_is_ready fail btrfs: handle error of get_old_root gsmi: Fix bug in append_to_eventlog sysfs handler misc: mic: fix a DMA pool free failure m68k: fix command-line parsing when passed from u-boot amiflop: clean up on errors during setup scsi: ips: fix missing break in switch KVM/x86: Fix invvpid and invept register operand size in 64-bit mode scsi: isci: Use proper enumerated type in atapi_d2h_reg_frame_handler scsi: isci: Change sci_controller_start_task's return type to sci_status scsi: iscsi_tcp: Explicitly cast param in iscsi_sw_tcp_host_get_param clk: mmp2: fix the clock id for sdh2_clk and sdh3_clk ASoC: tegra_sgtl5000: fix device_node refcounting scsi: dc395x: fix dma API usage in srb_done scsi: dc395x: fix DMA API usage in sg_update_list net: fix warning in 
af_unix net: ena: Fix Kconfig dependency on X86 xfs: fix use-after-free race in xfs_buf_rele kprobes, x86/ptrace.h: Make regs_get_kernel_stack_nth() not fault on bad stack ALSA: i2c/cs8427: Fix int to char conversion macintosh/windfarm_smu_sat: Fix debug output USB: misc: appledisplay: fix backlight update_status return code usbip: tools: fix atoi() on non-null terminated string SUNRPC: Fix a compile warning for cmpxchg64() sunrpc: safely reallow resvport min/max inversion atm: zatm: Fix empty body Clang warnings s390/perf: Return error when debug_register fails spi: omap2-mcspi: Set FIFO DMA trigger level to word length sparc: Fix parport build warnings. ceph: fix dentry leak in ceph_readdir_prepopulate rtc: s35390a: Change buf's type to u8 in s35390a_init f2fs: fix to spread clear_cold_data() mISDN: Fix type of switch control variable in ctrl_teimanager qlcnic: fix a return in qlcnic_dcb_get_capability() net: ethernet: ti: cpsw: unsync mcast entries while switch promisc mode mfd: arizona: Correct calling of runtime_put_sync mfd: mc13xxx-core: Fix PMIC shutdown when reading ADC values mfd: max8997: Enale irq-wakeup unconditionally selftests/ftrace: Fix to test kprobe $comm arg only if available thermal: rcar_thermal: Prevent hardware access during system suspend powerpc/process: Fix flush_all_to_thread for SPE sparc64: Rework xchg() definition to avoid warnings. 
fs/ocfs2/dlm/dlmdebug.c: fix a sleep-in-atomic-context bug in dlm_print_one_mle() mm/page-writeback.c: fix range_cyclic writeback vs writepages deadlock macsec: update operstate when lower device changes macsec: let the administrator set UP state even if lowerdev is down um: Make line/tty semantics use true write IRQ linux/bitmap.h: handle constant zero-size bitmaps correctly linux/bitmap.h: fix type of nbits in bitmap_shift_right() hfsplus: fix BUG on bnode parent update hfs: fix BUG on bnode parent update hfsplus: prevent btree data loss on ENOSPC hfs: prevent btree data loss on ENOSPC hfsplus: fix return value of hfsplus_get_block() hfs: fix return value of hfs_get_block() hfsplus: update timestamps on truncate() hfs: update timestamp on truncate() fs/hfs/extent.c: fix array out of bounds read of array extent mm/memory_hotplug: make add_memory() take the device_hotplug_lock igb: shorten maximum PHC timecounter update interval ntb_netdev: fix sleep time mismatch ntb: intel: fix return value for ndev_vec_mask() arm64: makefile fix build of .i file in external module case ocfs2: don't put and assigning null to bh allocated outside ocfs2: fix clusters leak in ocfs2_defrag_extent() net: do not abort bulk send on BQL status sched/fair: Don't increase sd->balance_interval on newidle balance audit: print empty EXECVE args wlcore: Fix the return value in case of error in 'wlcore_vendor_cmd_smart_config_start()' rtl8xxxu: Fix missing break in switch brcmsmac: never log "tid x is not agg'able" by default wireless: airo: potential buffer overflow in sprintf() rtlwifi: rtl8192de: Fix misleading REG_MCUFWDL information scsi: mpt3sas: Fix Sync cache command failure during driver unload scsi: mpt3sas: Fix driver modifying persistent data in Manufacturing page11 scsi: megaraid_sas: Fix msleep granularity scsi: lpfc: fcoe: Fix link down issue after 1000+ link bounces dlm: fix invalid free dlm: don't leak kernel pointer to userspace ACPICA: Use %d for signed int print formatting 
instead of %u net: bcmgenet: return correct value 'ret' from bcmgenet_power_down sock: Reset dst when changing sk_mark via setsockopt pinctrl: qcom: spmi-gpio: fix gpio-hog related boot issues pinctrl: lpc18xx: Use define directive for PIN_CONFIG_GPIO_PIN_INT pinctrl: zynq: Use define directive for PIN_CONFIG_IO_STANDARD PCI: keystone: Use quirk to limit MRRS for K2G spi: omap2-mcspi: Fix DMA and FIFO event trigger size mismatch mm/memory_hotplug: Do not unlock when fails to take the device_hotplug_lock Bluetooth: Fix invalid-free in bcsp_close() KVM: MMU: Do not treat ZONE_DEVICE pages as being reserved ath9k_hw: fix uninitialized variable data dm: use blk_set_queue_dying() in __dm_destroy() arm64: fix for bad_mode() handler to always result in panic cpufreq: Skip cpufreq resume if it's not suspended ocfs2: remove ocfs2_is_o2cb_active() ARM: 8904/1: skip nomap memblocks while finding the lowmem/highmem boundary ARC: perf: Accommodate big-endian CPU x86/insn: Fix awk regexp warnings x86/speculation: Fix incorrect MDS/TAA mitigation status x86/speculation: Fix redundant MDS mitigation message nfc: port100: handle command failure cleanly l2tp: don't use l2tp_tunnel_find() in l2tp_ip and l2tp_ip6 media: vivid: Set vid_cap_streaming and vid_out_streaming to true media: vivid: Fix wrong locking that causes race conditions on streaming stop media: usbvision: Fix races among open, close, and disconnect cpufreq: Add NULL checks to show() and store() methods of cpufreq media: uvcvideo: Fix error path in control parsing failure media: b2c2-flexcop-usb: add sanity checking media: cxusb: detect cxusb_ctrl_msg error in query media: imon: invalid dereference in imon_touch_event virtio_console: reset on out of memory virtio_console: don't tie bufs to a vq virtio_console: allocate inbufs in add_port() only if it is needed virtio_ring: fix return code on DMA mapping fails virtio_console: fix uninitialized variable use virtio_console: drop custom control queue cleanup 
virtio_console: move removal code usbip: tools: fix fd leakage in the function of read_attr_usbip_status usb-serial: cp201x: support Mark-10 digital force gauge USB: chaoskey: fix error case of a timeout appledisplay: fix error handling in the scheduled work USB: serial: mos7840: add USB ID to support Moxa UPort 2210 USB: serial: mos7720: fix remote wakeup USB: serial: mos7840: fix remote wakeup USB: serial: option: add support for DW5821e with eSIM support USB: serial: option: add support for Foxconn T77W968 LTE modules staging: comedi: usbduxfast: usbduxfast_ai_cmdtest rounding error powerpc/64s: support nospectre_v2 cmdline option powerpc/book3s64: Fix link stack flush on context switch KVM: PPC: Book3S HV: Flush link stack on guest exit to host kernel Linux 4.9.204 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
.. SPDX-License-Identifier: GPL-2.0

TAA - TSX Asynchronous Abort
======================================

TAA is a hardware vulnerability that allows unprivileged speculative access to
data which is available in various CPU internal buffers by using asynchronous
aborts within an Intel TSX transactional region.

Affected processors
-------------------

This vulnerability only affects Intel processors that support Intel
Transactional Synchronization Extensions (TSX) when the TAA_NO bit (bit 8)
is 0 in the IA32_ARCH_CAPABILITIES MSR.  On processors where the MDS_NO bit
(bit 5) is 0 in the IA32_ARCH_CAPABILITIES MSR, the existing MDS mitigations
also mitigate against TAA.

Whether a processor is affected or not can be read out from the TAA
vulnerability file in sysfs. See :ref:`tsx_async_abort_sys_info`.

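The bit tests described above can be sketched in a few lines of Python. This
is only an illustration: the function name and the returned keys are
hypothetical, the raw MSR value is assumed to have been obtained elsewhere,
and only the two bit positions named in the text (MDS_NO = bit 5,
TAA_NO = bit 8) are used.

```python
# Bit positions in IA32_ARCH_CAPABILITIES, as stated in the text above.
MDS_NO_BIT = 5
TAA_NO_BIT = 8


def decode_arch_capabilities(msr_value, has_tsx):
    """Interpret a raw IA32_ARCH_CAPABILITIES value (hypothetical helper).

    msr_value: integer content of the MSR, read by some other means.
    has_tsx:   True if the processor supports Intel TSX.
    """
    mds_no = bool((msr_value >> MDS_NO_BIT) & 1)
    taa_no = bool((msr_value >> TAA_NO_BIT) & 1)
    # Affected only if TSX is supported and TAA_NO is 0.
    taa_affected = has_tsx and not taa_no
    return {
        "mds_no": mds_no,
        "taa_no": taa_no,
        "taa_affected": taa_affected,
        # With MDS_NO == 0 the existing MDS mitigations also cover TAA.
        "mds_mitigation_covers_taa": taa_affected and not mds_no,
    }


if __name__ == "__main__":
    # Example value with only MDS_NO set: MDS is fixed, TAA is not.
    print(decode_arch_capabilities(1 << MDS_NO_BIT, has_tsx=True))
```

On a running system the sysfs file remains the authoritative answer; the
sketch only makes the two bit checks explicit.
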
Related CVEs
------------

The following CVE entry is related to this TAA issue:

==============  =====  ===================================================
CVE-2019-11135  TAA    TSX Asynchronous Abort (TAA) condition on some
                       microprocessors utilizing speculative execution may
                       allow an authenticated user to potentially enable
                       information disclosure via a side channel with
                       local access.
==============  =====  ===================================================

Problem
-------

When performing store, load or L1 refill operations, processors write
data into temporary microarchitectural structures (buffers). The data in
those buffers can be forwarded to load operations as an optimization.

Intel TSX is an extension to the x86 instruction set architecture that adds
hardware transactional memory support to improve performance of multi-threaded
software. TSX lets the processor expose and exploit concurrency hidden in an
application by dynamically avoiding unnecessary synchronization.

TSX supports atomic memory transactions that are either committed (success) or
aborted. During an abort, operations that happened within the transactional region
are rolled back. An asynchronous abort takes place, among other options, when a
different thread accesses a cache line that is also used within the transactional
region when that access might lead to a data race.

Immediately after an uncompleted asynchronous abort, certain speculatively
executed loads may read data from those internal buffers and pass it to dependent
operations. This can then be used to infer the value via a cache side channel
attack.

Because the buffers are potentially shared between Hyper-Threads, cross
Hyper-Thread attacks are possible.

The victim of a malicious actor does not need to make use of TSX. Only the
attacker needs to begin a TSX transaction and raise an asynchronous abort
which in turn potentially leaks data stored in the buffers.

More detailed technical information is available in the TAA specific x86
architecture section: :ref:`Documentation/x86/tsx_async_abort.rst <tsx_async_abort>`.


Attack scenarios
----------------

Attacks against the TAA vulnerability can be implemented from unprivileged
applications running on hosts or guests.

As for MDS, the attacker has no control over the memory addresses that can
be leaked. Only the victim is responsible for bringing data to the CPU. As
a result, the malicious actor has to sample as much data as possible and
then postprocess it to try to infer any useful information from it.

A potential attacker only has read access to the data. Also, there is no direct
privilege escalation by using this technique.


.. _tsx_async_abort_sys_info:

TAA system information
-----------------------

The Linux kernel provides a sysfs interface to enumerate the current TAA status
of mitigated systems. The relevant sysfs file is:

/sys/devices/system/cpu/vulnerabilities/tsx_async_abort

The possible values in this file are:

.. list-table::

   * - 'Vulnerable'
     - The CPU is affected by this vulnerability and the microcode and kernel mitigation are not applied.
   * - 'Vulnerable: Clear CPU buffers attempted, no microcode'
     - The system tries to clear the buffers but the microcode might not support the operation.
   * - 'Mitigation: Clear CPU buffers'
     - The microcode has been updated to clear the buffers. TSX is still enabled.
   * - 'Mitigation: TSX disabled'
     - TSX is disabled.
   * - 'Not affected'
     - The CPU is not affected by this issue.

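As a rough illustration of how a monitoring script might consume this file,
the following Python sketch maps the file content to the meanings listed
above. The helper names are hypothetical. Stripping anything after a ``;`` is
an assumption made here to tolerate extra detail that a kernel might append
to the value; it is not described in this document.

```python
from pathlib import Path

TAA_SYSFS = Path("/sys/devices/system/cpu/vulnerabilities/tsx_async_abort")

# Values and meanings taken from the list above.
KNOWN_STATES = {
    "Vulnerable": "microcode and kernel mitigation are not applied",
    "Vulnerable: Clear CPU buffers attempted, no microcode":
        "buffer clearing attempted, microcode might not support it",
    "Mitigation: Clear CPU buffers": "buffers are cleared, TSX still enabled",
    "Mitigation: TSX disabled": "TSX is disabled",
    "Not affected": "CPU is not affected",
}


def classify_taa_status(text):
    """Return (state, meaning) for the raw content of the sysfs file."""
    # Assumption: ignore any suffix after ';' and surrounding whitespace.
    state = text.strip().split(";", 1)[0].strip()
    return state, KNOWN_STATES.get(state, "unrecognized value")


def read_taa_status(path=TAA_SYSFS):
    """Read and classify the sysfs file; None if it cannot be read."""
    try:
        return classify_taa_status(path.read_text())
    except OSError:
        return None


if __name__ == "__main__":
    print(read_taa_status())
```

A result of ``None`` simply means the file could not be read, for example on
a kernel that predates the TAA reporting or on a non-Linux system.
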
.. _ucode_needed:

Best effort mitigation mode
^^^^^^^^^^^^^^^^^^^^^^^^^^^

If the processor is vulnerable, but the availability of the microcode-based
mitigation mechanism is not advertised via CPUID, the kernel selects a best
effort mitigation mode.  This mode invokes the mitigation instructions
without a guarantee that they clear the CPU buffers.

This is done to address virtualization scenarios where the host has the
microcode update applied, but the hypervisor is not yet updated to expose the
CPUID to the guest. If the host has updated microcode the protection takes
effect; otherwise a few CPU cycles are wasted pointlessly.

The state in the tsx_async_abort sysfs file reflects this situation
accordingly.


Mitigation mechanism
--------------------

The kernel detects the affected CPUs and the presence of the microcode which is
required. If a CPU is affected and the microcode is available, then the kernel
enables the mitigation by default.


The mitigation can be controlled at boot time via a kernel command line option.
See :ref:`taa_mitigation_control_command_line`.

.. _virt_mechanism:

Virtualization mitigation
^^^^^^^^^^^^^^^^^^^^^^^^^

Affected systems where the host has TAA microcode and TAA is mitigated by
having disabled TSX previously, are not vulnerable regardless of the status
of the VMs.

In all other cases, if the host either does not have the TAA microcode or
the kernel is not mitigated, the system might be vulnerable.


.. _taa_mitigation_control_command_line:

Mitigation control on the kernel command line
---------------------------------------------

The kernel command line allows the TAA mitigations to be controlled at boot
time with the option "tsx_async_abort=". The valid arguments for this option are:

============  =============================================================
off           This option disables the TAA mitigation on affected platforms.
              If the system has TSX enabled (see next parameter) and the CPU
              is affected, the system is vulnerable.

full          TAA mitigation is enabled. If TSX is enabled, on an affected
              system it will clear CPU buffers on ring transitions. On
              systems which are MDS-affected and deploy MDS mitigation,
              TAA is also mitigated. Specifying this option on those
              systems will have no effect.

full,nosmt    The same as tsx_async_abort=full, with SMT disabled on
              vulnerable CPUs that have TSX enabled. This is the complete
              mitigation. When TSX is disabled, SMT is not disabled because
              the CPU is not vulnerable to cross-thread TAA attacks.
============  =============================================================

Not specifying this option is equivalent to "tsx_async_abort=full". For
processors that are affected by both TAA and MDS, specifying just
"tsx_async_abort=off" without an accompanying "mds=off" will have no
effect as the same mitigation is used for both vulnerabilities.

The kernel command line also allows the TSX feature to be controlled using the
parameter "tsx=" on CPUs which support TSX control. MSR_IA32_TSX_CTRL is used
to control the TSX feature and the enumeration of the TSX feature bits (RTM
and HLE) in CPUID.

The valid options are:

============  =============================================================
off           Disables TSX on the system.

              Note that this option takes effect only on newer CPUs which are
              not vulnerable to MDS, i.e., have MSR_IA32_ARCH_CAPABILITIES.MDS_NO=1
              and which get the new IA32_TSX_CTRL MSR through a microcode
              update. This new MSR allows for the reliable deactivation of
              the TSX functionality.

on            Enables TSX.

              Although there are mitigations for all known security
              vulnerabilities, TSX has been known to be an accelerator for
              several previous speculation-related CVEs, and so there may be
              unknown security risks associated with leaving it enabled.

auto          Disables TSX if X86_BUG_TAA is present, otherwise enables TSX
              on the system.
============  =============================================================

Not specifying this option is equivalent to "tsx=off".

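To make the two options and their documented defaults concrete, here is a
small Python sketch that extracts them from a command line string such as the
content of /proc/cmdline. The function name is hypothetical. Treating the last
occurrence as the winner and ignoring unrecognized values are assumptions of
this sketch, not statements about how the kernel parses these parameters.

```python
# Valid arguments as listed in the two tables above.
VALID = {
    "tsx_async_abort": {"off", "full", "full,nosmt"},
    "tsx": {"off", "on", "auto"},
}

# Defaults as stated in the text: tsx_async_abort=full and tsx=off.
DEFAULTS = {"tsx_async_abort": "full", "tsx": "off"}


def parse_taa_cmdline(cmdline):
    """Return the effective tsx_async_abort= and tsx= settings (sketch)."""
    opts = dict(DEFAULTS)
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        # Assumption: later occurrences override earlier ones and
        # values outside the documented set are ignored.
        if sep and key in VALID and value in VALID[key]:
            opts[key] = value
    return opts


if __name__ == "__main__":
    print(parse_taa_cmdline("ro quiet tsx=on tsx_async_abort=full,nosmt"))
```
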
The following combinations of the "tsx_async_abort" and "tsx" are possible. For
affected platforms tsx=auto is equivalent to tsx=off and the result will be:

=========  ==========================  =========================================
tsx=on     tsx_async_abort=full        The system will use VERW to clear CPU
                                       buffers. Cross-thread attacks are still
                                       possible on SMT machines.
tsx=on     tsx_async_abort=full,nosmt  As above, cross-thread attacks on SMT
                                       mitigated.
tsx=on     tsx_async_abort=off         The system is vulnerable.
tsx=off    tsx_async_abort=full        TSX might be disabled if microcode
                                       provides a TSX control MSR. If so,
                                       system is not vulnerable.
tsx=off    tsx_async_abort=full,nosmt  Ditto
tsx=off    tsx_async_abort=off         Ditto
=========  ==========================  =========================================


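The combination table can also be read as a simple lookup. The Python sketch
below encodes it for affected platforms only; the function and constant names
are hypothetical and the outcome strings are condensed from the table.

```python
# Outcome shared by all tsx=off rows of the table.
TSX_OFF_OUTCOME = (
    "TSX might be disabled if microcode provides a TSX control MSR; "
    "if so, the system is not vulnerable"
)

# Keyed by (tsx, tsx_async_abort), for affected platforms.
OUTCOMES = {
    ("on", "full"): "VERW clears CPU buffers; cross-thread attacks "
                    "still possible on SMT machines",
    ("on", "full,nosmt"): "VERW clears CPU buffers; cross-thread attacks "
                          "on SMT mitigated",
    ("on", "off"): "The system is vulnerable",
    ("off", "full"): TSX_OFF_OUTCOME,
    ("off", "full,nosmt"): TSX_OFF_OUTCOME,
    ("off", "off"): TSX_OFF_OUTCOME,
}


def taa_outcome(tsx="off", tsx_async_abort="full"):
    """Look up the documented outcome on an affected platform (sketch)."""
    if tsx == "auto":
        # On affected platforms tsx=auto is equivalent to tsx=off.
        tsx = "off"
    return OUTCOMES[(tsx, tsx_async_abort)]


if __name__ == "__main__":
    print(taa_outcome("on", "full,nosmt"))
```
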
For unaffected platforms "tsx=on" and "tsx_async_abort=full" do not clear CPU
buffers.  For platforms without TSX control (MSR_IA32_ARCH_CAPABILITIES.MDS_NO=0)
the "tsx" command line argument has no effect.

For the affected platforms the table below indicates the mitigation status for
the combinations of CPUID bit MD_CLEAR and IA32_ARCH_CAPABILITIES MSR bits
MDS_NO and TSX_CTRL_MSR.

=======  =========  =============  ========================================
MDS_NO   MD_CLEAR   TSX_CTRL_MSR   Status
=======  =========  =============  ========================================
0        0          0              Vulnerable (needs microcode)
0        1          0              MDS and TAA mitigated via VERW
1        1          0              MDS fixed, TAA vulnerable if TSX enabled
                                   because MD_CLEAR has no meaning and
                                   VERW is not guaranteed to clear buffers
1        X          1              MDS fixed, TAA can be mitigated by
                                   VERW or TSX_CTRL_MSR
=======  =========  =============  ========================================

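A minimal Python sketch of the status table follows, with ``None`` standing
in for the "X" (don't care) entry. The function name is hypothetical, and
combinations that the table does not list deliberately return ``None`` rather
than a guessed status.

```python
# Rows of the table: (MDS_NO, MD_CLEAR, TSX_CTRL_MSR, status).
# None in a pattern position means "don't care" (X in the table).
STATUS_ROWS = [
    (0, 0, 0, "Vulnerable (needs microcode)"),
    (0, 1, 0, "MDS and TAA mitigated via VERW"),
    (1, 1, 0, "MDS fixed, TAA vulnerable if TSX enabled because MD_CLEAR "
              "has no meaning and VERW is not guaranteed to clear buffers"),
    (1, None, 1, "MDS fixed, TAA can be mitigated by VERW or TSX_CTRL_MSR"),
]


def taa_status(mds_no, md_clear, tsx_ctrl_msr):
    """Return the documented status, or None if the table has no such row."""
    given = (mds_no, md_clear, tsx_ctrl_msr)
    for *pattern, status in STATUS_ROWS:
        if all(p is None or p == g for p, g in zip(pattern, given)):
            return status
    return None


if __name__ == "__main__":
    print(taa_status(0, 1, 0))
```
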
Mitigation selection guide
--------------------------

1. Trusted userspace and guests
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

If all user space applications are from a trusted source and do not execute
untrusted code which is supplied externally, then the mitigation can be
disabled. The same applies to virtualized environments with trusted guests.


2. Untrusted userspace and guests
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

If there are untrusted applications or guests on the system, enabling TSX
might allow a malicious actor to leak data from the host or from other
processes running on the same physical core.

If the microcode is available and TSX is disabled on the host, attacks
are prevented in a virtualized environment as well, even if the VMs do not
explicitly enable the mitigation.


.. _taa_default_mitigations:

Default mitigations
-------------------

The kernel's default action for vulnerable processors is:

  - Deploy TSX disable mitigation (tsx_async_abort=full tsx=off).