Changes in 4.9.298 Bluetooth: bfusb: fix division by zero in send path USB: core: Fix bug in resuming hub's handling of wakeup requests USB: Fix "slab-out-of-bounds Write" bug in usb_hcd_poll_rh_status mfd: intel-lpss: Fix too early PM enablement in the ACPI ->probe() can: gs_usb: fix use of uninitialized variable, detach device on reception of invalid USB data can: gs_usb: gs_can_start_xmit(): zero-initialize hf->{flags,reserved} random: fix data race on crng_node_pool random: fix data race on crng init time staging: wlan-ng: Avoid bitwise vs logical OR warning in hfa384x_usb_throttlefn() drm/i915: Avoid bitwise vs logical OR warning in snb_wm_latency_quirk() media: uvcvideo: fix division by zero at stream start rtlwifi: rtl8192cu: Fix WARNING when calling local_irq_restore() with interrupts enabled HID: uhid: Fix worker destroying device without any protection HID: wacom: Avoid using stale array indicies to read contact count nfc: llcp: fix NULL error pointer dereference on sendmsg() after failed bind() rtc: cmos: take rtc_lock while reading from CMOS media: flexcop-usb: fix control-message timeouts media: mceusb: fix control-message timeouts media: em28xx: fix control-message timeouts media: cpia2: fix control-message timeouts media: s2255: fix control-message timeouts media: dib0700: fix undefined behavior in tuner shutdown media: redrat3: fix control-message timeouts media: pvrusb2: fix control-message timeouts media: stk1160: fix control-message timeouts can: softing_cs: softingcs_probe(): fix memleak on registration failure PCI: Add function 1 DMA alias quirk for Marvell 88SE9125 SATA controller shmem: fix a race between shmem_unused_huge_shrink and shmem_evict_inode Bluetooth: cmtp: fix possible panic when cmtp_init_sockets() fails wcn36xx: Indicate beacon not connection loss on MISSED_BEACON_IND Bluetooth: stop proccessing malicious adv data media: dmxdev: fix UAF when dvb_register_device() fails crypto: qce - fix uaf on qce_ahash_register_one tty: serial: 
atmel: Check return code of dmaengine_submit() tty: serial: atmel: Call dma_async_issue_pending() netfilter: bridge: add support for pppoe filtering arm64: dts: qcom: msm8916: fix MMC controller aliases drm/amdgpu: Fix a NULL pointer dereference in amdgpu_connector_lcd_native_mode() drm/radeon/radeon_kms: Fix a NULL pointer dereference in radeon_driver_open_kms() serial: amba-pl011: do not request memory region twice floppy: Fix hang in watchdog when disk is ejected media: dib8000: Fix a memleak in dib8000_init() media: saa7146: mxb: Fix a NULL pointer dereference in mxb_attach() media: si2157: Fix "warm" tuner state detection media: msi001: fix possible null-ptr-deref in msi001_probe() usb: ftdi-elan: fix memory leak on device disconnect pcmcia: rsrc_nonstatic: Fix a NULL pointer dereference in __nonstatic_find_io_region() pcmcia: rsrc_nonstatic: Fix a NULL pointer dereference in nonstatic_find_mem_region() ppp: ensure minimum packet size in ppp_write() fsl/fman: Check for null pointer after calling devm_ioremap spi: spi-meson-spifc: Add missing pm_runtime_disable() in meson_spifc_probe can: softing: softing_startstop(): fix set but not used variable warning can: xilinx_can: xcan_probe(): check for error irq pcmcia: fix setting of kthread task states net: mcs7830: handle usb read errors properly ext4: avoid trim error on fs with small groups ALSA: jack: Add missing rwsem around snd_ctl_remove() calls ALSA: PCM: Add missing rwsem around snd_ctl_remove() calls ALSA: hda: Add missing rwsem around snd_ctl_remove() calls RDMA/hns: Validate the pkey index powerpc/prom_init: Fix improper check of prom_getprop() ALSA: oss: fix compile error when OSS_DEBUG is enabled char/mwave: Adjust io port register size scsi: ufs: Fix race conditions related to driver data RDMA/core: Let ib_find_gid() continue search even after empty entry dmaengine: pxa/mmp: stop referencing config->slave_id ASoC: samsung: idma: Check of ioremap return value misc: lattice-ecp3-config: Fix task hung 
when firmware load failed mips: lantiq: add support for clk_set_parent() mips: bcm63xx: add support for clk_set_parent() RDMA/cxgb4: Set queue pair state when being queried Bluetooth: Fix debugfs entry leak in hci_register_dev() fs: dlm: filter user dlm messages for kernel locks ar5523: Fix null-ptr-deref with unexpected WDCMSG_TARGET_START reply usb: gadget: f_fs: Use stream_open() for endpoint files HID: apple: Do not reset quirks when the Fn key is not found media: b2c2: Add missing check in flexcop_pci_isr: gpiolib: acpi: Do not set the IRQ type if the IRQ is already in use HSI: core: Fix return freed object in hsi_new_client mwifiex: Fix skb_over_panic in mwifiex_usb_recv() floppy: Add max size check for user space request media: saa7146: hexium_orion: Fix a NULL pointer dereference in hexium_attach() media: m920x: don't use stack on USB reads iwlwifi: mvm: synchronize with FW after multicast commands ath10k: Fix tx hanging net: bonding: debug: avoid printing debug logs when bond is not notifying peers media: igorplugusb: receiver overflow should be reported media: saa7146: hexium_gemini: Fix a NULL pointer dereference in hexium_attach() usb: hub: Add delay for SuperSpeed hub resume to let links transit to U0 ath9k: Fix out-of-bound memcpy in ath9k_hif_usb_rx_stream um: registers: Rename function names to avoid conflicts and build problems jffs2: GC deadlock reading a page that is used in jffs2_write_begin() ACPICA: Utilities: Avoid deleting the same object twice in a row ACPICA: Executer: Fix the REFCLASS_REFOF case in acpi_ex_opcode_1A_0T_1R() btrfs: remove BUG_ON() in find_parent_nodes() btrfs: remove BUG_ON(!eie) in find_parent_nodes net: mdio: Demote probed message to debug print dm btree: add a defensive bounds check to insert_at() dm space map common: add bounds check to sm_ll_lookup_bitmap() serial: pl010: Drop CR register reset on set_termios serial: core: Keep mctrl register state and cached copy in sync parisc: Avoid calling faulthandler_disabled() 
twice powerpc/6xx: add missing of_node_put powerpc/powernv: add missing of_node_put powerpc/cell: add missing of_node_put powerpc/btext: add missing of_node_put i2c: i801: Don't silently correct invalid transfer size powerpc/smp: Move setup_profiling_timer() under CONFIG_PROFILING i2c: mpc: Correct I2C reset procedure w1: Misuse of get_user()/put_user() reported by sparse ALSA: seq: Set upper limit of processed events i2c: designware-pci: Fix to change data types of hcnt and lcnt parameters MIPS: Octeon: Fix build errors using clang scsi: sr: Don't use GFP_DMA ASoC: mediatek: mt8173: fix device_node leak power: bq25890: Enable continuous conversion for ADC at charging ubifs: Error path in ubifs_remount_rw() seems to wrongly free write buffers iwlwifi: mvm: Increase the scan timeout guard to 30 seconds ext4: set csum seed in tmp inode while migrating to extents ext4: Fix BUG_ON in ext4_bread when write quota data ext4: don't use the orphan list when migrating an inode fuse: fix bad inode fuse: fix live lock in fuse_iget() drm/radeon: fix error handling in radeon_driver_open_kms RDMA/hns: Modify the mapping attribute of doorbell to device RDMA/rxe: Fix a typo in opcode name powerpc/fsl/dts: Enable WA for erratum A-009885 on fman3l MDIO buses net/fsl: xgmac_mdio: Fix incorrect iounmap when removing module parisc: pdc_stable: Fix memory leak in pdcs_register_pathentries af_unix: annote lockless accesses to unix_tot_inflight & gc_in_progress net: axienet: Wait for PhyRstCmplt after core reset net: axienet: fix number of TX ring slots for available check netns: add schedule point in ops_exit_list() libcxgb: Don't accidentally set RTO_ONLINK in cxgb_find_route() dmaengine: at_xdmac: Don't start transactions at tx_submit level dmaengine: at_xdmac: Print debug message after realeasing the lock dmaengine: at_xdmac: Fix lld view setting dmaengine: at_xdmac: Fix at_xdmac_lld struct definition net_sched: restore "mpu xxx" handling bcmgenet: add WOL IRQ check scripts/dtc: 
dtx_diff: remove broken example from help text lib82596: Fix IRQ check in sni_82596_probe Revert "gup: document and work around "COW can break either way" issue" gup: document and work around "COW can break either way" issue drm/ttm/nouveau: don't call tt destroy callback on alloc failure. gianfar: simplify FCS handling and fix memory leak gianfar: fix jumbo packets+napi+rx overrun crash cipso,calipso: resolve a number of problems with the DOI refcounts rbtree: cache leftmost node internally lib/timerqueue: Rely on rbtree semantics for next timer mm: add follow_pte_pmd() KVM: do not assume PTE is writable after follow_pfn KVM: Use kvm_pfn_t for local PFN variable in hva_to_pfn_remapped() KVM: do not allow mapping valid but non-reference-counted pages Linux 4.9.298 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: Ifcea82a702a0906d9090c89785363c2d5423f652
285 lines
7.1 KiB
C
/*
 * Lockless get_user_pages_fast for s390
 *
 * Copyright IBM Corp. 2010
 * Author(s): Martin Schwidefsky <schwidefsky@de.ibm.com>
 */
#include <linux/sched.h>
#include <linux/mm.h>
#include <linux/hugetlb.h>
#include <linux/vmstat.h>
#include <linux/pagemap.h>
#include <linux/rwsem.h>
#include <asm/pgtable.h>

/*
 * The performance critical leaf functions are made noinline otherwise gcc
 * inlines everything into a single function which results in too much
 * register pressure.
 */
static inline int gup_pte_range(pmd_t *pmdp, pmd_t pmd, unsigned long addr,
		unsigned long end, int write, struct page **pages, int *nr)
{
	struct page *head, *page;
	unsigned long mask;
	pte_t *ptep, pte;

	mask = (write ? _PAGE_PROTECT : 0) | _PAGE_INVALID | _PAGE_SPECIAL;

	ptep = ((pte_t *) pmd_deref(pmd)) + pte_index(addr);
	do {
		pte = *ptep;
		barrier();
		/* Similar to the PMD case, NUMA hinting must take slow path */
		if (pte_protnone(pte))
			return 0;
		if ((pte_val(pte) & mask) != 0)
			return 0;
		VM_BUG_ON(!pfn_valid(pte_pfn(pte)));
		page = pte_page(pte);
		head = compound_head(page);
		if (WARN_ON_ONCE(page_ref_count(head) < 0)
		    || !page_cache_get_speculative(head))
			return 0;
		if (unlikely(pte_val(pte) != pte_val(*ptep))) {
			put_page(head);
			return 0;
		}
		VM_BUG_ON_PAGE(compound_head(page) != head, page);
		pages[*nr] = page;
		(*nr)++;

	} while (ptep++, addr += PAGE_SIZE, addr != end);

	return 1;
}
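
The loop in gup_pte_range() follows a snapshot, pin, recheck pattern: it reads the entry once, takes a speculative reference on the page, then re-reads the entry and drops the reference if the entry changed in between. The block below is a minimal userspace sketch of that pattern using C11 atomics, not kernel code. The names `struct obj`, `struct slot`, `get_unless_zero()`, `put_ref()` and `lockless_get()` are hypothetical stand-ins invented for illustration; they only model the roles played by `struct page`, the page-table entry, `page_cache_get_speculative()` and `put_page()`.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page: only a reference count. */
struct obj {
	atomic_int refcount;
};

/* Hypothetical stand-in for a table slot that may change concurrently. */
struct slot {
	_Atomic(struct obj *) entry;
};

/* Take a reference only if the count has not already dropped to zero. */
static bool get_unless_zero(struct obj *o)
{
	int old = atomic_load(&o->refcount);

	while (old > 0) {
		/* On failure, old is refreshed with the current count. */
		if (atomic_compare_exchange_weak(&o->refcount, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop one reference; the count must have been positive. */
static void put_ref(struct obj *o)
{
	int old = atomic_fetch_sub(&o->refcount, 1);

	assert(old > 0);
}

/* Snapshot, pin, recheck. Returns the pinned object, or NULL on failure. */
static struct obj *lockless_get(struct slot *s)
{
	struct obj *snap = atomic_load(&s->entry);

	if (!snap || !get_unless_zero(snap))
		return NULL;
	/* The slot changed under us: undo the pin and let the caller retry. */
	if (atomic_load(&s->entry) != snap) {
		put_ref(snap);
		return NULL;
	}
	return snap;
}
```

In this sketch a NULL return plays the role of the `return 0` paths above: the caller gives up on the lockless attempt and is expected to use a slower, locked path instead.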

static inline int gup_huge_pmd(pmd_t *pmdp, pmd_t pmd, unsigned long addr,
		unsigned long end, int write, struct page **pages, int *nr)
{
	struct page *head, *page;
	unsigned long mask;
	int refs;

	mask = (write ? _SEGMENT_ENTRY_PROTECT : 0) | _SEGMENT_ENTRY_INVALID;
	if ((pmd_val(pmd) & mask) != 0)
		return 0;
	VM_BUG_ON(!pfn_valid(pmd_val(pmd) >> PAGE_SHIFT));

	refs = 0;
	head = pmd_page(pmd);
	page = head + ((addr & ~PMD_MASK) >> PAGE_SHIFT);
	do {
		VM_BUG_ON(compound_head(page) != head);
		pages[*nr] = page;
		(*nr)++;
		page++;
		refs++;
	} while (addr += PAGE_SIZE, addr != end);

	if (WARN_ON_ONCE(page_ref_count(head) < 0)
	    || !page_cache_add_speculative(head, refs)) {
		*nr -= refs;
		return 0;
	}

	if (unlikely(pmd_val(pmd) != pmd_val(*pmdp))) {
		*nr -= refs;
		while (refs--)
			put_page(head);
		return 0;
	}

	return 1;
}


static inline int gup_pmd_range(pud_t *pudp, pud_t pud, unsigned long addr,
		unsigned long end, int write, struct page **pages, int *nr)
{
	unsigned long next;
	pmd_t *pmdp, pmd;

	pmdp = (pmd_t *) pudp;
	if ((pud_val(pud) & _REGION_ENTRY_TYPE_MASK) == _REGION_ENTRY_TYPE_R3)
		pmdp = (pmd_t *) pud_deref(pud);
	pmdp += pmd_index(addr);
	do {
		pmd = *pmdp;
		barrier();
		next = pmd_addr_end(addr, end);
		if (pmd_none(pmd))
			return 0;
		if (unlikely(pmd_large(pmd))) {
			/*
			 * NUMA hinting faults need to be handled in the GUP
			 * slowpath for accounting purposes and so that they
			 * can be serialised against THP migration.
			 */
			if (pmd_protnone(pmd))
				return 0;
			if (!gup_huge_pmd(pmdp, pmd, addr, next,
					  write, pages, nr))
				return 0;
		} else if (!gup_pte_range(pmdp, pmd, addr, next,
					  write, pages, nr))
			return 0;
	} while (pmdp++, addr = next, addr != end);

	return 1;
}

static int gup_huge_pud(pud_t *pudp, pud_t pud, unsigned long addr,
		unsigned long end, int write, struct page **pages, int *nr)
{
	struct page *head, *page;
	unsigned long mask;
	int refs;

	mask = (write ? _REGION_ENTRY_PROTECT : 0) | _REGION_ENTRY_INVALID;
	if ((pud_val(pud) & mask) != 0)
		return 0;
	VM_BUG_ON(!pfn_valid(pud_pfn(pud)));

	refs = 0;
	head = pud_page(pud);
	page = head + ((addr & ~PUD_MASK) >> PAGE_SHIFT);
	do {
		VM_BUG_ON_PAGE(compound_head(page) != head, page);
		pages[*nr] = page;
		(*nr)++;
		page++;
		refs++;
	} while (addr += PAGE_SIZE, addr != end);

	if (WARN_ON_ONCE(page_ref_count(head) < 0)
	    || !page_cache_add_speculative(head, refs)) {
		*nr -= refs;
		return 0;
	}

	if (unlikely(pud_val(pud) != pud_val(*pudp))) {
		*nr -= refs;
		while (refs--)
			put_page(head);
		return 0;
	}

	return 1;
}

static inline int gup_pud_range(pgd_t *pgdp, pgd_t pgd, unsigned long addr,
		unsigned long end, int write, struct page **pages, int *nr)
{
	unsigned long next;
	pud_t *pudp, pud;

	pudp = (pud_t *) pgdp;
	if ((pgd_val(pgd) & _REGION_ENTRY_TYPE_MASK) == _REGION_ENTRY_TYPE_R2)
		pudp = (pud_t *) pgd_deref(pgd);
	pudp += pud_index(addr);
	do {
		pud = *pudp;
		barrier();
		next = pud_addr_end(addr, end);
		if (pud_none(pud))
			return 0;
		if (unlikely(pud_large(pud))) {
			if (!gup_huge_pud(pudp, pud, addr, next, write, pages,
					  nr))
				return 0;
		} else if (!gup_pmd_range(pudp, pud, addr, next, write, pages,
					  nr))
			return 0;
	} while (pudp++, addr = next, addr != end);

	return 1;
}

/*
 * Like get_user_pages_fast() except it's IRQ-safe in that it won't fall
 * back to the regular GUP.
 */
int __get_user_pages_fast(unsigned long start, int nr_pages, int write,
			  struct page **pages)
{
	struct mm_struct *mm = current->mm;
	unsigned long addr, len, end;
	unsigned long next, flags;
	pgd_t *pgdp, pgd;
	int nr = 0;

	start &= PAGE_MASK;
	addr = start;
	len = (unsigned long) nr_pages << PAGE_SHIFT;
	end = start + len;
	if ((end <= start) || (end > TASK_SIZE))
		return 0;
	/*
	 * local_irq_save() doesn't prevent pagetable teardown, but does
	 * prevent the pagetables from being freed on s390.
	 *
	 * So long as we atomically load page table pointers versus teardown,
	 * we can follow the address down to the page and take a ref on it.
	 */
	local_irq_save(flags);
	pgdp = pgd_offset(mm, addr);
	do {
		pgd = *pgdp;
		barrier();
		next = pgd_addr_end(addr, end);
		if (pgd_none(pgd))
			break;
		if (!gup_pud_range(pgdp, pgd, addr, next, write, pages, &nr))
			break;
	} while (pgdp++, addr = next, addr != end);
	local_irq_restore(flags);

	return nr;
}
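
The range check at the top of __get_user_pages_fast() rejects three cases with a single comparison pair: an empty request, a request whose end address wraps around the address space, and a request that extends past the user address limit. The following is a small userspace sketch of that arithmetic, not the kernel code itself. `SKETCH_PAGE_SHIFT`, `SKETCH_PAGE_MASK` and `range_ok()` are hypothetical names, the 4 KiB page size is an assumption, and the `limit` parameter stands in for `TASK_SIZE`.

```c
#include <stdbool.h>

/* Assumed 4 KiB pages; the real values come from the architecture. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_MASK  (~((1UL << SKETCH_PAGE_SHIFT) - 1))

/*
 * Returns true if [start, start + nr_pages pages) is a non-empty range
 * that neither wraps around nor extends past limit.
 */
static bool range_ok(unsigned long start, int nr_pages, unsigned long limit)
{
	unsigned long len, end;

	start &= SKETCH_PAGE_MASK;		/* round down to a page boundary */
	len = (unsigned long) nr_pages << SKETCH_PAGE_SHIFT;
	end = start + len;			/* may wrap; caught just below */

	/* end <= start covers nr_pages <= 0 as well as address wrap-around. */
	if (end <= start || end > limit)
		return false;
	return true;
}
```

A negative `nr_pages` converts to a very large unsigned length, so `end` wraps below `start` and the same `end <= start` test rejects it without a separate sign check.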

/**
 * get_user_pages_fast() - pin user pages in memory
 * @start:	starting user address
 * @nr_pages:	number of pages from start to pin
 * @write:	whether pages will be written to
 * @pages:	array that receives pointers to the pages pinned.
 *		Should be at least nr_pages long.
 *
 * Attempt to pin user pages in memory without taking mm->mmap_sem.
 * If not successful, it will fall back to taking the lock and
 * calling get_user_pages().
 *
 * Returns number of pages pinned. This may be fewer than the number
 * requested. If nr_pages is 0 or negative, returns 0. If no pages
 * were pinned, returns -errno.
 */
int get_user_pages_fast(unsigned long start, int nr_pages, int write,
			struct page **pages)
{
	int nr, ret;

	might_sleep();
	start &= PAGE_MASK;
	/*
	 * The FAST_GUP case requires FOLL_WRITE even for pure reads,
	 * because get_user_pages() may need to cause an early COW in
	 * order to avoid confusing the normal COW routines. So only
	 * targets that are already writable are safe to do by just
	 * looking at the page tables.
	 */
	nr = __get_user_pages_fast(start, nr_pages, 1, pages);
	if (nr == nr_pages)
		return nr;

	/* Try to get the remaining pages with get_user_pages */
	start += nr << PAGE_SHIFT;
	pages += nr;
	ret = get_user_pages_unlocked(start, nr_pages - nr, pages,
				      write ? FOLL_WRITE : 0);
	/* Have to be a bit careful with return values */
	if (nr > 0)
		ret = (ret < 0) ? nr : ret + nr;
	return ret;
}
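
The "careful with return values" step at the end of get_user_pages_fast() merges two results: the count pinned by the lockless fast path and the count or negative errno from the slow path. An error from the slow path is only reported if the fast path pinned nothing; otherwise the pages already pinned are reported so the caller can release them. The sketch below isolates that rule as a pure function. `combine_gup_results()` is a hypothetical helper written for illustration and does not exist in the kernel.

```c
/*
 * fast: pages pinned by the lockless path (>= 0).
 * slow: pages pinned by the fallback path, or a negative errno.
 * Returns the value the caller of the combined operation should see.
 */
static int combine_gup_results(int fast, int slow)
{
	if (fast > 0)
		return (slow < 0) ? fast : slow + fast;
	/* Nothing pinned by the fast path: pass the slow result through. */
	return slow;
}
```

For example, a fast path that pinned three pages followed by a slow path that failed yields 3, while a fast path that pinned nothing followed by the same failure yields the negative errno unchanged.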