Changes in 4.9.291 binder: use euid from cred instead of using task binder: use cred instead of task for selinux checks xhci: Fix USB 3.1 enumeration issues by increasing roothub power-on-good delay Input: elantench - fix misreporting trackpoint coordinates Input: i8042 - Add quirk for Fujitsu Lifebook T725 libata: fix read log timeout value ocfs2: fix data corruption on truncate mmc: dw_mmc: Dont wait for DRTO on Write RSP error parisc: Fix ptrace check on syscall return media: ite-cir: IR receiver stop working after receive overflow ALSA: ua101: fix division by zero at probe ALSA: 6fire: fix control and bulk message timeouts ALSA: line6: fix control and interrupt message timeouts ALSA: synth: missing check for possible NULL after the call to kstrdup ALSA: timer: Fix use-after-free problem ALSA: timer: Unconditionally unlink slave instances, too x86/irq: Ensure PI wakeup handler is unregistered before module unload sfc: Don't use netif_info before net_device setup hyperv/vmbus: include linux/bitops.h mmc: winbond: don't build on M68K bpf: Prevent increasing bpf_jit_limit above max xen/netfront: stop tx queues during live migration spi: spl022: fix Microwire full duplex mode watchdog: Fix OMAP watchdog early handling vmxnet3: do not stop tx queues after netif_device_detach() btrfs: fix lost error handling when replaying directory deletes hwmon: (pmbus/lm25066) Add offset coefficients regulator: s5m8767: do not use reset value as DVS voltage if GPIO DVS is disabled regulator: dt-bindings: samsung,s5m8767: correct s5m8767,pmic-buck-default-dvs-idx property EDAC/sb_edac: Fix top-of-high-memory value for Broadwell/Haswell mwifiex: fix division by zero in fw download path ath6kl: fix division by zero in send path ath6kl: fix control-message timeout PCI: Mark Atheros QCA6174 to avoid bus reset rtl8187: fix control-message timeouts evm: mark evm_fixmode as __ro_after_init wcn36xx: Fix HT40 capability for 2Ghz band mwifiex: Read a PCI register after writing the TX ring write pointer wcn36xx: handle connection loss indication RDMA/qedr: Fix NULL deref for query_qp on the GSI QP signal: Remove the bogus sigkill_pending in ptrace_stop signal/mips: Update (_save|_restore)_fp_context to fail with -EFAULT power: supply: max17042_battery: Prevent int underflow in set_soc_threshold power: supply: max17042_battery: use VFSOC for capacity when no rsns powerpc/85xx: Fix oops when mpc85xx_smp_guts_ids node cannot be found serial: core: Fix initializing and restoring termios speed ALSA: mixer: oss: Fix racy access to slots ALSA: mixer: fix deadlock in snd_mixer_oss_set_volume PCI: aardvark: Read all 16-bits from PCIE_MSI_PAYLOAD_REG quota: check block number when reading the block in quota file quota: correct error number in free_dqentry() iio: dac: ad5446: Fix ad5622_write() return value USB: serial: keyspan: fix memleak on probe errors USB: iowarrior: fix control-message timeouts Bluetooth: sco: Fix lock_sock() blockage by memcpy_from_msg() Bluetooth: fix use-after-free error in lock_sock_nested() platform/x86: wmi: do not fail if disabling fails MIPS: lantiq: dma: add small delay after reset MIPS: lantiq: dma: reset correct number of channel locking/lockdep: Avoid RCU-induced noinstr fail smackfs: Fix use-after-free in netlbl_catmap_walk() x86: Increase exception stack sizes media: mt9p031: Fix corrupted frame after restarting stream media: netup_unidvb: handle interrupt properly according to the firmware media: uvcvideo: Set capability in s_param media: s5p-mfc: fix possible null-pointer dereference in s5p_mfc_probe() media: mceusb: return without resubmitting URB in case of -EPROTO error. ia64: don't do IA64_CMPXCHG_DEBUG without CONFIG_PRINTK ACPICA: Avoid evaluating methods too early during system resume media: usb: dvd-usb: fix uninit-value bug in dibusb_read_eeprom_byte() tracefs: Have tracefs directories not set OTH permission bits by default ath: dfs_pattern_detector: Fix possible null-pointer dereference in channel_detector_create() ACPI: battery: Accept charges over the design capacity as full memstick: r592: Fix a UAF bug when removing the driver lib/xz: Avoid overlapping memcpy() with invalid input with in-place decompression lib/xz: Validate the value before assigning it to an enum variable tracing/cfi: Fix cmp_entries_* functions signature mismatch mwl8k: Fix use-after-free in mwl8k_fw_state_machine() PM: hibernate: Get block device exclusively in swsusp_check() iwlwifi: mvm: disable RX-diversity in powersave smackfs: use __GFP_NOFAIL for smk_cipso_doi() ARM: clang: Do not rely on lr register for stacktrace ARM: 9136/1: ARMv7-M uses BE-8, not BE-32 spi: bcm-qspi: Fix missing clk_disable_unprepare() on error in bcm_qspi_probe() parisc: fix warning in flush_tlb_all parisc/kgdb: add kgdb_roundup() to make kgdb work with idle polling cgroup: Make rebind_subsystems() disable v2 controllers all at once media: dvb-usb: fix ununit-value in az6027_rc_query media: mtk-vpu: Fix a resource leak in the error handling path of 'mtk_vpu_probe()' media: si470x: Avoid card name truncation cpuidle: Fix kobject memory leaks in error paths ath9k: Fix potential interrupt storm on queue reset crypto: qat - detect PFVF collision after ACK crypto: qat - disregard spurious PFVF interrupts b43legacy: fix a lower bounds test b43: fix a lower bounds test memstick: avoid out-of-range warning memstick: jmb38x_ms: use appropriate free function in jmb38x_ms_alloc_host() hwmon: Fix possible memleak in __hwmon_device_register() ath10k: fix max antenna gain unit drm/msm: uninitialized variable in msm_gem_import() net: stream: don't purge sk_error_queue in sk_stream_kill_queues() mmc: mxs-mmc: disable regulator on error and in the remove function platform/x86: thinkpad_acpi: Fix bitwise vs. logical warning mwifiex: Send DELBA requests according to spec phy: micrel: ksz8041nl: do not use power down mode smackfs: use netlbl_cfg_cipsov4_del() for deleting cipso_v4_doi s390/gmap: don't unconditionally call pte_unmap_unlock() in __gmap_zap() irq: mips: avoid nested irq_enter() samples/kretprobes: Fix return value if register_kretprobe() failed libertas_tf: Fix possible memory leak in probe and disconnect libertas: Fix possible memory leak in probe and disconnect crypto: pcrypt - Delay write to padata->info RDMA/rxe: Fix wrong port_cap_flags ARM: s3c: irq-s3c24xx: Fix return value check for s3c24xx_init_intc() scsi: dc395: Fix error case unwinding MIPS: loongson64: make CPU_LOONGSON64 depends on MIPS_FP_SUPPORT JFS: fix memleak in jfs_mount arm: dts: omap3-gta04a4: accelerometer irq fix soc/tegra: Fix an error handling path in tegra_powergate_power_up() memory: fsl_ifc: fix leak of irq and nand_irq in fsl_ifc_ctrl_probe video: fbdev: chipsfb: use memset_io() instead of memset() serial: 8250_dw: Drop wrong use of ACPI_PTR() usb: gadget: hid: fix error code in do_config() power: supply: rt5033_battery: Change voltage values to µV scsi: csiostor: Uninitialized data in csio_ln_vnp_read_cbfn() RDMA/mlx4: Return missed an error if device doesn't support steering serial: xilinx_uartps: Fix race condition causing stuck TX power: supply: bq27xxx: Fix kernel crash on IRQ handler register error pnfs/flexfiles: Fix misplaced barrier in nfs4_ff_layout_prepare_ds drm/plane-helper: fix uninitialized variable reference PCI: aardvark: Don't spam about PIO Response Status fs: orangefs: fix error return code of orangefs_revalidate_lookup() mtd: spi-nor: hisi-sfc: Remove excessive clk_disable_unprepare() dmaengine: at_xdmac: fix AT_XDMAC_CC_PERID() macro auxdisplay: img-ascii-lcd: Fix lock-up when displaying empty string netfilter: nfnetlink_queue: fix OOB when mac header was cleared dmaengine: dmaengine_desc_callback_valid(): Check for `callback_result` m68k: set a default value for MEMORY_RESERVE watchdog: f71808e_wdt: fix inaccurate report in WDIOC_GETTIMEOUT scsi: qla2xxx: Turn off target reset during issue_lip i2c: xlr: Fix a resource leak in the error handling path of 'xlr_i2c_probe()' xen-pciback: Fix return in pm_ctrl_init() net: davinci_emac: Fix interrupt pacing disable ACPI: PMIC: Fix intel_pmic_regs_handler() read accesses bonding: Fix a use-after-free problem when bond_sysfs_slave_add() failed mm/zsmalloc.c: close race window between zs_pool_dec_isolated() and zs_unregister_migration() llc: fix out-of-bound array index in llc_sk_dev_hash() nfc: pn533: Fix double free when pn533_fill_fragment_skbs() fails vsock: prevent unnecessary refcnt inc for nonblocking connect USB: chipidea: fix interrupt deadlock ARM: 9156/1: drop cc-option fallbacks for architecture selection powerpc/bpf: Validate branch ranges powerpc/bpf: Fix BPF_SUB when imm == 0x80000000 mm, oom: pagefault_out_of_memory: don't force global OOM for dying tasks mm, oom: do not trigger out_of_memory from the #PF PCI: Add PCI_EXP_DEVCTL_PAYLOAD_* macros net: mdio-mux: fix unbalanced put_device parisc/entry: fix trace test in syscall exit path PCI/MSI: Destroy sysfs before freeing entries scsi: lpfc: Fix list_add() corruption in lpfc_drain_txq() usb: musb: tusb6010: check return value after calling platform_get_resource() scsi: advansys: Fix kernel pointer leak ARM: dts: omap: fix gpmc,mux-add-data type usb: host: ohci-tmio: check return value after calling platform_get_resource() tty: tty_buffer: Fix the softlockup issue in flush_to_ldisc MIPS: sni: Fix the build scsi: target: Fix ordered tag handling scsi: target: Fix alua_tg_pt_gps_count tracking powerpc/5200: dts: fix memory node unit name ALSA: gus: fix null pointer dereference on pointer block powerpc/dcr: Use cmplwi instead of 3-argument cmpli sh: check return code of request_irq maple: fix wrong return value of maple_bus_init(). sh: fix kconfig unmet dependency warning for FRAME_POINTER sh: define __BIG_ENDIAN for math-emu mips: BCM63XX: ensure that CPU_SUPPORTS_32BIT_KERNEL is set sched/core: Mitigate race cpus_share_cache()/update_top_cache_domain() net: bnx2x: fix variable dereferenced before check iavf: Fix for the false positive ASQ/ARQ errors while issuing VF reset mips: bcm63xx: add support for clk_get_parent() platform/x86: hp_accel: Fix an error handling path in 'lis3lv02d_probe()' NFC: reorganize the functions in nci_request NFC: reorder the logic in nfc_{un,}register_device perf/x86/intel/uncore: Fix filter_tid mask for CHA events on Skylake Server perf/x86/intel/uncore: Fix IIO event constraints for Skylake Server tun: fix bonding active backup with arp monitoring hexagon: export raw I/O routines for modules mm: kmemleak: slob: respect SLAB_NOLEAKTRACE flag btrfs: fix memory ordering between normal and ordered work functions parisc/sticon: fix reverse colors cfg80211: call cfg80211_stop_ap when switch from P2P_GO type drm/udl: fix control-message timeout drm/amdgpu: fix set scaling mode Full/Full aspect/Center not works on vga and dvi connectors batman-adv: Keep fragments equally sized batman-adv: Fix own OGM check in aggregated OGMs batman-adv: mcast: fix duplicate mcast packets in BLA backbone from LAN batman-adv: mcast: fix duplicate mcast packets from BLA backbone to mesh batman-adv: Consider fragmentation for needed_headroom batman-adv: Reserve needed_*room for fragments batman-adv: Don't always reallocate the fragmentation skb head ASoC: DAPM: Cover regression by kctl change notification fix usb: max-3421: Use driver data instead of maintaining a list of bound devices soc/tegra: pmc: Fix imbalanced clock disabling in error code path Linux 4.9.291 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I23d798c10aebab1e51add60ccb34a8b289d49a4d
472 lines
12 KiB
C
472 lines
12 KiB
C
/*
|
|
* Copyright (C) 2007 Oracle. All rights reserved.
|
|
* Copyright (C) 2014 Fujitsu. All rights reserved.
|
|
*
|
|
* This program is free software; you can redistribute it and/or
|
|
* modify it under the terms of the GNU General Public
|
|
* License v2 as published by the Free Software Foundation.
|
|
*
|
|
* This program is distributed in the hope that it will be useful,
|
|
* but WITHOUT ANY WARRANTY; without even the implied warranty of
|
|
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
|
|
* General Public License for more details.
|
|
*
|
|
* You should have received a copy of the GNU General Public
|
|
* License along with this program; if not, write to the
|
|
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
|
|
* Boston, MA 021110-1307, USA.
|
|
*/
|
|
|
|
#include <linux/kthread.h>
|
|
#include <linux/slab.h>
|
|
#include <linux/list.h>
|
|
#include <linux/spinlock.h>
|
|
#include <linux/freezer.h>
|
|
#include "async-thread.h"
|
|
#include "ctree.h"
|
|
|
|
#define WORK_DONE_BIT 0
|
|
#define WORK_ORDER_DONE_BIT 1
|
|
#define WORK_HIGH_PRIO_BIT 2
|
|
|
|
#define NO_THRESHOLD (-1)
|
|
#define DFT_THRESHOLD (32)
|
|
|
|
struct __btrfs_workqueue {
|
|
struct workqueue_struct *normal_wq;
|
|
|
|
/* File system this workqueue services */
|
|
struct btrfs_fs_info *fs_info;
|
|
|
|
/* List head pointing to ordered work list */
|
|
struct list_head ordered_list;
|
|
|
|
/* Spinlock for ordered_list */
|
|
spinlock_t list_lock;
|
|
|
|
/* Thresholding related variants */
|
|
atomic_t pending;
|
|
|
|
/* Up limit of concurrency workers */
|
|
int limit_active;
|
|
|
|
/* Current number of concurrency workers */
|
|
int current_active;
|
|
|
|
/* Threshold to change current_active */
|
|
int thresh;
|
|
unsigned int count;
|
|
spinlock_t thres_lock;
|
|
};
|
|
|
|
struct btrfs_workqueue {
|
|
struct __btrfs_workqueue *normal;
|
|
struct __btrfs_workqueue *high;
|
|
};
|
|
|
|
static void normal_work_helper(struct btrfs_work *work);
|
|
|
|
#define BTRFS_WORK_HELPER(name) \
|
|
void btrfs_##name(struct work_struct *arg) \
|
|
{ \
|
|
struct btrfs_work *work = container_of(arg, struct btrfs_work, \
|
|
normal_work); \
|
|
normal_work_helper(work); \
|
|
}
|
|
|
|
struct btrfs_fs_info *
|
|
btrfs_workqueue_owner(struct __btrfs_workqueue *wq)
|
|
{
|
|
return wq->fs_info;
|
|
}
|
|
|
|
struct btrfs_fs_info *
|
|
btrfs_work_owner(struct btrfs_work *work)
|
|
{
|
|
return work->wq->fs_info;
|
|
}
|
|
|
|
bool btrfs_workqueue_normal_congested(struct btrfs_workqueue *wq)
|
|
{
|
|
/*
|
|
* We could compare wq->normal->pending with num_online_cpus()
|
|
* to support "thresh == NO_THRESHOLD" case, but it requires
|
|
* moving up atomic_inc/dec in thresh_queue/exec_hook. Let's
|
|
* postpone it until someone needs the support of that case.
|
|
*/
|
|
if (wq->normal->thresh == NO_THRESHOLD)
|
|
return false;
|
|
|
|
return atomic_read(&wq->normal->pending) > wq->normal->thresh * 2;
|
|
}
|
|
|
|
BTRFS_WORK_HELPER(worker_helper);
|
|
BTRFS_WORK_HELPER(delalloc_helper);
|
|
BTRFS_WORK_HELPER(flush_delalloc_helper);
|
|
BTRFS_WORK_HELPER(cache_helper);
|
|
BTRFS_WORK_HELPER(submit_helper);
|
|
BTRFS_WORK_HELPER(fixup_helper);
|
|
BTRFS_WORK_HELPER(endio_helper);
|
|
BTRFS_WORK_HELPER(endio_meta_helper);
|
|
BTRFS_WORK_HELPER(endio_meta_write_helper);
|
|
BTRFS_WORK_HELPER(endio_raid56_helper);
|
|
BTRFS_WORK_HELPER(endio_repair_helper);
|
|
BTRFS_WORK_HELPER(rmw_helper);
|
|
BTRFS_WORK_HELPER(endio_write_helper);
|
|
BTRFS_WORK_HELPER(freespace_write_helper);
|
|
BTRFS_WORK_HELPER(delayed_meta_helper);
|
|
BTRFS_WORK_HELPER(readahead_helper);
|
|
BTRFS_WORK_HELPER(qgroup_rescan_helper);
|
|
BTRFS_WORK_HELPER(extent_refs_helper);
|
|
BTRFS_WORK_HELPER(scrub_helper);
|
|
BTRFS_WORK_HELPER(scrubwrc_helper);
|
|
BTRFS_WORK_HELPER(scrubnc_helper);
|
|
BTRFS_WORK_HELPER(scrubparity_helper);
|
|
|
|
static struct __btrfs_workqueue *
|
|
__btrfs_alloc_workqueue(struct btrfs_fs_info *fs_info, const char *name,
|
|
unsigned int flags, int limit_active, int thresh)
|
|
{
|
|
struct __btrfs_workqueue *ret = kzalloc(sizeof(*ret), GFP_KERNEL);
|
|
|
|
if (!ret)
|
|
return NULL;
|
|
|
|
ret->fs_info = fs_info;
|
|
ret->limit_active = limit_active;
|
|
atomic_set(&ret->pending, 0);
|
|
if (thresh == 0)
|
|
thresh = DFT_THRESHOLD;
|
|
/* For low threshold, disabling threshold is a better choice */
|
|
if (thresh < DFT_THRESHOLD) {
|
|
ret->current_active = limit_active;
|
|
ret->thresh = NO_THRESHOLD;
|
|
} else {
|
|
/*
|
|
* For threshold-able wq, let its concurrency grow on demand.
|
|
* Use minimal max_active at alloc time to reduce resource
|
|
* usage.
|
|
*/
|
|
ret->current_active = 1;
|
|
ret->thresh = thresh;
|
|
}
|
|
|
|
if (flags & WQ_HIGHPRI)
|
|
ret->normal_wq = alloc_workqueue("%s-%s-high", flags,
|
|
ret->current_active, "btrfs",
|
|
name);
|
|
else
|
|
ret->normal_wq = alloc_workqueue("%s-%s", flags,
|
|
ret->current_active, "btrfs",
|
|
name);
|
|
if (!ret->normal_wq) {
|
|
kfree(ret);
|
|
return NULL;
|
|
}
|
|
|
|
INIT_LIST_HEAD(&ret->ordered_list);
|
|
spin_lock_init(&ret->list_lock);
|
|
spin_lock_init(&ret->thres_lock);
|
|
trace_btrfs_workqueue_alloc(ret, name, flags & WQ_HIGHPRI);
|
|
return ret;
|
|
}
|
|
|
|
static inline void
|
|
__btrfs_destroy_workqueue(struct __btrfs_workqueue *wq);
|
|
|
|
struct btrfs_workqueue *btrfs_alloc_workqueue(struct btrfs_fs_info *fs_info,
|
|
const char *name,
|
|
unsigned int flags,
|
|
int limit_active,
|
|
int thresh)
|
|
{
|
|
struct btrfs_workqueue *ret = kzalloc(sizeof(*ret), GFP_KERNEL);
|
|
|
|
if (!ret)
|
|
return NULL;
|
|
|
|
ret->normal = __btrfs_alloc_workqueue(fs_info, name,
|
|
flags & ~WQ_HIGHPRI,
|
|
limit_active, thresh);
|
|
if (!ret->normal) {
|
|
kfree(ret);
|
|
return NULL;
|
|
}
|
|
|
|
if (flags & WQ_HIGHPRI) {
|
|
ret->high = __btrfs_alloc_workqueue(fs_info, name, flags,
|
|
limit_active, thresh);
|
|
if (!ret->high) {
|
|
__btrfs_destroy_workqueue(ret->normal);
|
|
kfree(ret);
|
|
return NULL;
|
|
}
|
|
}
|
|
return ret;
|
|
}
|
|
|
|
/*
|
|
* Hook for threshold which will be called in btrfs_queue_work.
|
|
* This hook WILL be called in IRQ handler context,
|
|
* so workqueue_set_max_active MUST NOT be called in this hook
|
|
*/
|
|
static inline void thresh_queue_hook(struct __btrfs_workqueue *wq)
|
|
{
|
|
if (wq->thresh == NO_THRESHOLD)
|
|
return;
|
|
atomic_inc(&wq->pending);
|
|
}
|
|
|
|
/*
|
|
* Hook for threshold which will be called before executing the work,
|
|
* This hook is called in kthread content.
|
|
* So workqueue_set_max_active is called here.
|
|
*/
|
|
static inline void thresh_exec_hook(struct __btrfs_workqueue *wq)
|
|
{
|
|
int new_current_active;
|
|
long pending;
|
|
int need_change = 0;
|
|
|
|
if (wq->thresh == NO_THRESHOLD)
|
|
return;
|
|
|
|
atomic_dec(&wq->pending);
|
|
spin_lock(&wq->thres_lock);
|
|
/*
|
|
* Use wq->count to limit the calling frequency of
|
|
* workqueue_set_max_active.
|
|
*/
|
|
wq->count++;
|
|
wq->count %= (wq->thresh / 4);
|
|
if (!wq->count)
|
|
goto out;
|
|
new_current_active = wq->current_active;
|
|
|
|
/*
|
|
* pending may be changed later, but it's OK since we really
|
|
* don't need it so accurate to calculate new_max_active.
|
|
*/
|
|
pending = atomic_read(&wq->pending);
|
|
if (pending > wq->thresh)
|
|
new_current_active++;
|
|
if (pending < wq->thresh / 2)
|
|
new_current_active--;
|
|
new_current_active = clamp_val(new_current_active, 1, wq->limit_active);
|
|
if (new_current_active != wq->current_active) {
|
|
need_change = 1;
|
|
wq->current_active = new_current_active;
|
|
}
|
|
out:
|
|
spin_unlock(&wq->thres_lock);
|
|
|
|
if (need_change) {
|
|
workqueue_set_max_active(wq->normal_wq, wq->current_active);
|
|
}
|
|
}
|
|
|
|
static void run_ordered_work(struct __btrfs_workqueue *wq,
|
|
struct btrfs_work *self)
|
|
{
|
|
struct list_head *list = &wq->ordered_list;
|
|
struct btrfs_work *work;
|
|
spinlock_t *lock = &wq->list_lock;
|
|
unsigned long flags;
|
|
void *wtag;
|
|
bool free_self = false;
|
|
|
|
while (1) {
|
|
spin_lock_irqsave(lock, flags);
|
|
if (list_empty(list))
|
|
break;
|
|
work = list_entry(list->next, struct btrfs_work,
|
|
ordered_list);
|
|
if (!test_bit(WORK_DONE_BIT, &work->flags))
|
|
break;
|
|
/*
|
|
* Orders all subsequent loads after reading WORK_DONE_BIT,
|
|
* paired with the smp_mb__before_atomic in btrfs_work_helper
|
|
* this guarantees that the ordered function will see all
|
|
* updates from ordinary work function.
|
|
*/
|
|
smp_rmb();
|
|
|
|
/*
|
|
* we are going to call the ordered done function, but
|
|
* we leave the work item on the list as a barrier so
|
|
* that later work items that are done don't have their
|
|
* functions called before this one returns
|
|
*/
|
|
if (test_and_set_bit(WORK_ORDER_DONE_BIT, &work->flags))
|
|
break;
|
|
trace_btrfs_ordered_sched(work);
|
|
spin_unlock_irqrestore(lock, flags);
|
|
work->ordered_func(work);
|
|
|
|
/* now take the lock again and drop our item from the list */
|
|
spin_lock_irqsave(lock, flags);
|
|
list_del(&work->ordered_list);
|
|
spin_unlock_irqrestore(lock, flags);
|
|
|
|
if (work == self) {
|
|
/*
|
|
* This is the work item that the worker is currently
|
|
* executing.
|
|
*
|
|
* The kernel workqueue code guarantees non-reentrancy
|
|
* of work items. I.e., if a work item with the same
|
|
* address and work function is queued twice, the second
|
|
* execution is blocked until the first one finishes. A
|
|
* work item may be freed and recycled with the same
|
|
* work function; the workqueue code assumes that the
|
|
* original work item cannot depend on the recycled work
|
|
* item in that case (see find_worker_executing_work()).
|
|
*
|
|
* Note that the work of one Btrfs filesystem may depend
|
|
* on the work of another Btrfs filesystem via, e.g., a
|
|
* loop device. Therefore, we must not allow the current
|
|
* work item to be recycled until we are really done,
|
|
* otherwise we break the above assumption and can
|
|
* deadlock.
|
|
*/
|
|
free_self = true;
|
|
} else {
|
|
/*
|
|
* We don't want to call the ordered free functions with
|
|
* the lock held though. Save the work as tag for the
|
|
* trace event, because the callback could free the
|
|
* structure.
|
|
*/
|
|
wtag = work;
|
|
work->ordered_free(work);
|
|
trace_btrfs_all_work_done(wq->fs_info, wtag);
|
|
}
|
|
}
|
|
spin_unlock_irqrestore(lock, flags);
|
|
|
|
if (free_self) {
|
|
wtag = self;
|
|
self->ordered_free(self);
|
|
trace_btrfs_all_work_done(wq->fs_info, wtag);
|
|
}
|
|
}
|
|
|
|
static void normal_work_helper(struct btrfs_work *work)
|
|
{
|
|
struct __btrfs_workqueue *wq;
|
|
void *wtag;
|
|
int need_order = 0;
|
|
|
|
/*
|
|
* We should not touch things inside work in the following cases:
|
|
* 1) after work->func() if it has no ordered_free
|
|
* Since the struct is freed in work->func().
|
|
* 2) after setting WORK_DONE_BIT
|
|
* The work may be freed in other threads almost instantly.
|
|
* So we save the needed things here.
|
|
*/
|
|
if (work->ordered_func)
|
|
need_order = 1;
|
|
wq = work->wq;
|
|
/* Safe for tracepoints in case work gets freed by the callback */
|
|
wtag = work;
|
|
|
|
trace_btrfs_work_sched(work);
|
|
thresh_exec_hook(wq);
|
|
work->func(work);
|
|
if (need_order) {
|
|
/*
|
|
* Ensures all memory accesses done in the work function are
|
|
* ordered before setting the WORK_DONE_BIT. Ensuring the thread
|
|
* which is going to executed the ordered work sees them.
|
|
* Pairs with the smp_rmb in run_ordered_work.
|
|
*/
|
|
smp_mb__before_atomic();
|
|
set_bit(WORK_DONE_BIT, &work->flags);
|
|
run_ordered_work(wq, work);
|
|
}
|
|
if (!need_order)
|
|
trace_btrfs_all_work_done(wq->fs_info, wtag);
|
|
}
|
|
|
|
void btrfs_init_work(struct btrfs_work *work, btrfs_work_func_t uniq_func,
|
|
btrfs_func_t func,
|
|
btrfs_func_t ordered_func,
|
|
btrfs_func_t ordered_free)
|
|
{
|
|
work->func = func;
|
|
work->ordered_func = ordered_func;
|
|
work->ordered_free = ordered_free;
|
|
INIT_WORK(&work->normal_work, uniq_func);
|
|
INIT_LIST_HEAD(&work->ordered_list);
|
|
work->flags = 0;
|
|
}
|
|
|
|
static inline void __btrfs_queue_work(struct __btrfs_workqueue *wq,
|
|
struct btrfs_work *work)
|
|
{
|
|
unsigned long flags;
|
|
|
|
work->wq = wq;
|
|
thresh_queue_hook(wq);
|
|
if (work->ordered_func) {
|
|
spin_lock_irqsave(&wq->list_lock, flags);
|
|
list_add_tail(&work->ordered_list, &wq->ordered_list);
|
|
spin_unlock_irqrestore(&wq->list_lock, flags);
|
|
}
|
|
trace_btrfs_work_queued(work);
|
|
queue_work(wq->normal_wq, &work->normal_work);
|
|
}
|
|
|
|
void btrfs_queue_work(struct btrfs_workqueue *wq,
|
|
struct btrfs_work *work)
|
|
{
|
|
struct __btrfs_workqueue *dest_wq;
|
|
|
|
if (test_bit(WORK_HIGH_PRIO_BIT, &work->flags) && wq->high)
|
|
dest_wq = wq->high;
|
|
else
|
|
dest_wq = wq->normal;
|
|
__btrfs_queue_work(dest_wq, work);
|
|
}
|
|
|
|
static inline void
|
|
__btrfs_destroy_workqueue(struct __btrfs_workqueue *wq)
|
|
{
|
|
destroy_workqueue(wq->normal_wq);
|
|
trace_btrfs_workqueue_destroy(wq);
|
|
kfree(wq);
|
|
}
|
|
|
|
void btrfs_destroy_workqueue(struct btrfs_workqueue *wq)
|
|
{
|
|
if (!wq)
|
|
return;
|
|
if (wq->high)
|
|
__btrfs_destroy_workqueue(wq->high);
|
|
__btrfs_destroy_workqueue(wq->normal);
|
|
kfree(wq);
|
|
}
|
|
|
|
void btrfs_workqueue_set_max(struct btrfs_workqueue *wq, int limit_active)
|
|
{
|
|
if (!wq)
|
|
return;
|
|
wq->normal->limit_active = limit_active;
|
|
if (wq->high)
|
|
wq->high->limit_active = limit_active;
|
|
}
|
|
|
|
void btrfs_set_work_high_priority(struct btrfs_work *work)
|
|
{
|
|
set_bit(WORK_HIGH_PRIO_BIT, &work->flags);
|
|
}
|
|
|
|
void btrfs_flush_workqueue(struct btrfs_workqueue *wq)
|
|
{
|
|
if (wq->high)
|
|
flush_workqueue(wq->high->normal_wq);
|
|
|
|
flush_workqueue(wq->normal->normal_wq);
|
|
}
|