linux-kernel-test/fs
Dave Chinner ae687e58b3 xfs: use NOIO contexts for vm_map_ram
When we map pages in the buffer cache, we can do so in GFP_NOFS
contexts. However, the vmap interfaces do not provide any method of
communicating this information to memory reclaim, and hence we get
lockdep complaining about it regularly and occasionally see hangs
that may be vmap related reclaim deadlocks. We can also see these
same problems from anywhere where we use vmalloc for a large buffer
(e.g. attribute code) inside a transaction context.

A typical lockdep report shows up as a reclaim state warning like so:

[14046.101458] =================================
[14046.102850] [ INFO: inconsistent lock state ]
[14046.102850] 3.14.0-rc4+ #2 Not tainted
[14046.102850] ---------------------------------
[14046.102850] inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W} usage.
[14046.102850] kswapd0/14 [HC0[0]:SC0[0]:HE1:SE1] takes:
[14046.102850]  (&xfs_dir_ilock_class){++++?+}, at: [<791a04bb>] xfs_ilock+0xff/0x16a
[14046.102850] {RECLAIM_FS-ON-W} state was registered at:
[14046.102850]   [<7904cdb1>] mark_held_locks+0x81/0xe7
[14046.102850]   [<7904d390>] lockdep_trace_alloc+0x5c/0xb4
[14046.102850]   [<790c2c28>] kmem_cache_alloc_trace+0x2b/0x11e
[14046.102850]   [<790ba7f4>] vm_map_ram+0x119/0x3e6
[14046.102850]   [<7914e124>] _xfs_buf_map_pages+0x5b/0xcf
[14046.102850]   [<7914ed74>] xfs_buf_get_map+0x67/0x13f
[14046.102850]   [<7917506f>] xfs_attr_rmtval_set+0x396/0x4d5
[14046.102850]   [<7916e8bb>] xfs_attr_leaf_addname+0x18f/0x37d
[14046.102850]   [<7916ed9e>] xfs_attr_set_int+0x2f5/0x3e8
[14046.102850]   [<7916eefc>] xfs_attr_set+0x6b/0x74
[14046.102850]   [<79168355>] xfs_xattr_set+0x61/0x81
[14046.102850]   [<790e5b10>] generic_setxattr+0x59/0x68
[14046.102850]   [<790e4c06>] __vfs_setxattr_noperm+0x58/0xce
[14046.102850]   [<790e4d0a>] vfs_setxattr+0x8e/0x92
[14046.102850]   [<790e4ddd>] setxattr+0xcf/0x159
[14046.102850]   [<790e5423>] SyS_lsetxattr+0x88/0xbb
[14046.102850]   [<79268438>] sysenter_do_call+0x12/0x36

Now, we can't completely remove these traces - mainly because
vm_map_ram() will do GFP_KERNEL allocation and that generates the
above warning before we get into the reclaim code, but we can turn
them all into false positive warnings.

To do that, use the method that DM and other IO context code uses to
avoid this problem: there is a process flag to tell memory reclaim
not to do IO that we can set appropriately. That prevents GFP_KERNEL
context reclaim being done from deep inside the vmalloc code in
places we can't directly pass a GFP_NOFS context to. That interface
has a pair of wrapper functions: memalloc_noio_save() and
memalloc_noio_restore().
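
The save/restore pairing described above can be sketched in userspace as follows. This is a minimal model, not the kernel implementation: every `model_*` name and the `MODEL_PF_MEMALLOC_NOIO` value are hypothetical stand-ins for the process flag and the wrapper pair, and a plain global replaces the per-task flags word. What it is meant to show is why the save side returns the previous state: nested sections then restore correctly.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins: the kernel keeps this bit in the task's flags. */
#define MODEL_PF_MEMALLOC_NOIO 0x1u

static unsigned int model_task_flags;

/* Set the NOIO bit and return its previous state so callers can nest. */
static unsigned int model_noio_save(void)
{
	unsigned int old = model_task_flags & MODEL_PF_MEMALLOC_NOIO;

	model_task_flags |= MODEL_PF_MEMALLOC_NOIO;
	return old;
}

/* Put the NOIO bit back to the state returned by the matching save. */
static void model_noio_restore(unsigned int old)
{
	model_task_flags = (model_task_flags & ~MODEL_PF_MEMALLOC_NOIO) | old;
}

/* What reclaim would consult before issuing IO in this model. */
static bool model_reclaim_may_do_io(void)
{
	return !(model_task_flags & MODEL_PF_MEMALLOC_NOIO);
}

/* Nested sections: the inner restore must not re-enable IO early. */
static void model_nested_demo(void)
{
	unsigned int outer = model_noio_save();
	unsigned int inner = model_noio_save();

	assert(!model_reclaim_may_do_io());
	model_noio_restore(inner);
	assert(!model_reclaim_may_do_io());	/* still inside the outer section */
	model_noio_restore(outer);
	assert(model_reclaim_may_do_io());	/* back to the original state */
}
```

Because the restore side only reinstates what the save side observed, a caller that is already in a NOIO section can safely call code that opens another one.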

Adding them around vm_map_ram and the vzalloc call in
kmem_alloc_large() will prevent deadlocks and most lockdep reports
for this issue. Also, convert the vzalloc() call in
kmem_alloc_large() to use __vmalloc() so that we can pass the
correct gfp context to the data page allocation routine inside
__vmalloc(). This makes it clear that GFP_NOFS context is important
to this vmalloc call.
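
As an illustration of wrapping a callee that cannot be told about the allocation context, here is a small userspace sketch. It is an assumption-laden model rather than the patch itself: `model_opaque_alloc()` stands in for a vmalloc-style routine whose internal allocations take no gfp argument, `model_alloc_large()` and the `nofs_context` parameter are hypothetical, and `calloc()` replaces the real allocator.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define MODEL_NOIO 0x1u			/* hypothetical flag bit */

static unsigned int model_flags;		/* models the task's flags word */
static bool model_last_alloc_was_noio;	/* what the callee observed */

/* Stand-in for a callee whose internal allocations take no gfp argument. */
static void *model_opaque_alloc(size_t size)
{
	model_last_alloc_was_noio = (model_flags & MODEL_NOIO) != 0;
	return calloc(1, size);
}

/* Enter a NOIO section around the callee only when the caller needs it. */
static void *model_alloc_large(size_t size, bool nofs_context)
{
	unsigned int saved = 0;
	void *ptr;

	if (nofs_context) {
		saved = model_flags & MODEL_NOIO;
		model_flags |= MODEL_NOIO;
	}

	ptr = model_opaque_alloc(size);

	if (nofs_context)
		model_flags = (model_flags & ~MODEL_NOIO) | saved;

	return ptr;
}

/* Both call paths: restricted context sees NOIO, normal context does not. */
static void model_alloc_demo(void)
{
	void *p = model_alloc_large(64, true);

	assert(p != NULL && model_last_alloc_was_noio);
	assert(model_flags == 0);		/* flag restored after the call */
	free(p);

	p = model_alloc_large(64, false);
	assert(p != NULL && !model_last_alloc_was_noio);
	free(p);
}
```

The flag is scoped to the single call, so allocations made by the same task outside the wrapped region are unaffected.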

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-03-07 16:19:14 +11:00
9p Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2014-01-28 08:38:04 -08:00
adfs adfs: delayed freeing of sbi 2013-10-24 23:43:27 -04:00
affs affs: use ->kill_sb() to simplify ->put_super() and failure exits of ->mount() 2014-01-25 03:13:01 -05:00
afs afs: proc cells and rootcell are writeable 2014-02-01 10:59:39 -08:00
autofs4 autofs: fix symlinks aren't checked for expiry 2014-01-23 16:36:59 -08:00
befs befs: iget_locked() doesn't return an ERR_PTR 2014-01-25 03:14:38 -05:00
bfs truncate: drop 'oldsize' truncate_pagecache() parameter 2013-09-12 15:38:02 -07:00
btrfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs 2014-01-30 20:08:20 -08:00
cachefiles Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2013-11-13 15:34:18 +09:00
ceph ceph: fix missing dput in ceph_set_acl 2014-01-31 08:14:06 -08:00
cifs cifs: Fix check for regular file in couldbe_mf_symlink() 2014-01-31 09:06:43 -06:00
coda coda_revalidate_inode(): switch to passing inode... 2013-11-09 00:16:21 -05:00
configfs configfs: fix race between dentry put and lookup 2013-11-21 16:42:27 -08:00
cramfs cramfs: take headers to fs/cramfs 2014-01-25 03:13:02 -05:00
debugfs debugfs: use list_next_entry() in debugfs_remove_recursive() 2013-11-13 12:09:24 +09:00
devpts devpts: plug the memory leak in kill_sb 2013-11-13 12:09:36 +09:00
dlm Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2014-01-25 11:17:34 -08:00
ecryptfs ecryptfs: fix failure handling in ->readlink() 2014-01-25 03:13:00 -05:00
efivarfs consolidate simple ->d_delete() instances 2013-11-15 22:04:17 -05:00
efs efs: get rid of ->put_super() 2014-01-25 03:13:02 -05:00
exofs exofs: Print less in r4w 2014-01-23 18:54:14 +02:00
exportfs exportfs: fix quadratic behavior in filehandle lookup 2013-11-09 00:16:38 -05:00
ext2 ext2/3/4: use generic posix ACL infrastructure 2014-01-25 23:58:19 -05:00
ext3 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2014-01-28 08:38:04 -08:00
ext4 Merge branch 'for-3.14/core' of git://git.kernel.dk/linux-block 2014-01-30 11:19:05 -08:00
f2fs Merge branch 'for-3.14/core' of git://git.kernel.dk/linux-block 2014-01-30 11:19:05 -08:00
fat fat: rcu-delay unloading nls and freeing sbi 2013-10-24 23:43:28 -04:00
freevxfs [readdir] convert freevxfs 2013-06-29 12:56:53 +04:00
fscache Merge branch 'for-3.13/core' of git://git.kernel.dk/linux-block 2013-11-14 12:08:14 +09:00
fuse Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2014-01-28 08:38:04 -08:00
gfs2 Merge branch 'for-3.14/core' of git://git.kernel.dk/linux-block 2014-01-30 11:19:05 -08:00
hfs fs/hfs/btree.h: remove duplicate defines 2013-11-13 12:09:32 +09:00
hfsplus Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2014-02-01 10:43:45 -08:00
hostfs um: hostfs: make functions static 2014-01-26 11:51:09 +01:00
hpfs hpfs: optimize quad buffer loading 2014-02-02 16:24:07 -08:00
hppfs clean up scary strncpy(dst, src, strlen(src)) uses 2013-07-03 16:07:41 -07:00
hugetlbfs cope with potentially long ->d_dname() output for shmem/hugetlb 2013-08-24 12:10:17 -04:00
isofs isofs: don't pass dentry to isofs_hash{i,}_common() 2013-10-24 23:34:59 -04:00
jbd jbd: Revise KERN_EMERG error messages 2013-12-04 12:27:46 +01:00
jbd2 jbd2: rename obsoleted msg JBD->JBD2 2013-12-08 21:14:59 -05:00
jffs2 MTD updates for 3.14: 2014-01-28 18:56:37 -08:00
jfs Minor bug fix for linux-3.14 2014-01-31 08:14:35 -08:00
kernfs kernfs: associate a new kernfs_node with its parent on creation 2014-01-17 11:50:07 -08:00
lockd LOCKD: Don't call utsname()->nodename from nlmclnt_setlockargs 2013-08-05 15:03:46 -04:00
logfs Merge branch 'for-3.14/core' of git://git.kernel.dk/linux-block 2014-01-30 11:19:05 -08:00
minix fs/minix: Drop dependency on H8300 2013-09-16 18:20:25 -07:00
ncpfs ncpfs: rcu-delay unload_nls() and freeing ncp_server 2013-10-24 23:43:28 -04:00
nfs NFS client bugfixes for Linux 3.14 2014-01-31 15:39:07 -08:00
nfs_common
nfsd Merge branch 'for-3.14' of git://linux-nfs.org/~bfields/linux 2014-01-30 10:18:43 -08:00
nilfs2 Merge branch 'for-3.14/core' of git://git.kernel.dk/linux-block 2014-01-30 11:19:05 -08:00
nls nls: have register_nls() set ->owner 2014-01-25 03:14:05 -05:00
notify fanotify: Fix use after free for permission events 2014-01-29 13:57:17 +01:00
ntfs iget/iget5: don't bother with ->i_lock until we find a match 2013-11-09 00:16:31 -05:00
ocfs2 Merge branch 'for-3.14/core' of git://git.kernel.dk/linux-block 2014-01-30 11:19:05 -08:00
omfs truncate: drop 'oldsize' truncate_pagecache() parameter 2013-09-12 15:38:02 -07:00
openpromfs [readdir] convert openpromfs 2013-06-29 12:56:32 +04:00
proc fs/proc/array.c: change do_task_stat() to use while_each_thread() 2014-01-23 16:37:02 -08:00
pstore pstore: Don't allow high traffic options on fragile devices 2013-12-20 13:12:01 -08:00
qnx4 qnx4: clean qnx4_fill_super() up 2014-01-25 03:13:03 -05:00
qnx6 [readdir] convert qnx6 2013-06-29 12:56:39 +04:00
quota genetlink: make multicast groups const, prevent abuse 2013-11-19 16:39:06 -05:00
ramfs fs/ramfs: move ramfs_aops to inode.c 2014-01-23 16:36:58 -08:00
reiserfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2014-01-28 08:38:04 -08:00
romfs romfs: fix returm err while getting inode in fill_super 2014-01-23 16:37:04 -08:00
squashfs Squashfs: fix failure to unlock pages on decompress error 2013-11-24 01:02:50 +00:00
sysfs Revert "kernfs, sysfs, driver-core: implement kernfs_remove_self() and its wrappers" 2014-01-13 14:05:13 -08:00
sysv sysv: Add forgotten superblock lock init for v7 fs 2013-09-29 22:02:02 -04:00
ubifs fs/ubifs: use rbtree postorder iteration helper instead of opencoding 2014-01-23 16:37:03 -08:00
udf udf: Fix lockdep warning from udf_symlink() 2013-12-23 22:02:16 +01:00
ufs truncate: drop 'oldsize' truncate_pagecache() parameter 2013-09-12 15:38:02 -07:00
xfs xfs: use NOIO contexts for vm_map_ram 2014-03-07 16:19:14 +11:00
aio.c Merge git://git.kvack.org/~bcrl/aio-next 2013-12-22 11:03:49 -08:00
anon_inodes.c ... and kill anon_inode_getfile_private() 2013-11-09 00:16:28 -05:00
attr.c fs: fix iversion handling 2013-12-05 16:36:21 -06:00
bad_inode.c [readdir] ->readdir() is gone 2013-06-29 12:57:04 +04:00
binfmt_aout.c dump_skip(): dump_seek() replacement taking coredump_params 2013-11-09 00:16:26 -05:00
binfmt_elf_fdpic.c elf{,_fdpic} coredump: get rid of pointless if (siginfo->si_signo) 2013-11-09 00:16:30 -05:00
binfmt_elf.c fs: binfmt_elf: remove unused defines INTERPRETER_NONE and INTERPRETER_ELF 2014-01-23 16:36:58 -08:00
binfmt_em86.c file->f_op is never NULL... 2013-10-24 23:34:54 -04:00
binfmt_flat.c
binfmt_misc.c
binfmt_script.c
binfmt_som.c
bio-integrity.c bio-integrity: Fix bio_integrity_verify segment start bug 2014-01-21 20:32:05 -08:00
bio.c Revert "block: Warn and free bio if bi_end_io is not set" 2014-01-08 14:14:22 -07:00
block_dev.c a trivial writeback fix 2013-09-13 23:06:40 -04:00
buffer.c block: Replace __this_cpu_ptr with raw_cpu_ptr 2013-12-03 19:19:41 -07:00
char_dev.c Merge branch 'for-3.13/core' of git://git.kernel.dk/linux-block 2013-11-14 12:08:14 +09:00
compat_binfmt_elf.c
compat_ioctl.c fs/compat_ioctl.c: fix an underflow issue (harmless) 2014-01-21 16:19:42 -08:00
compat.c [readdir] constify ->actor 2013-06-29 12:57:05 +04:00
coredump.c coredump: make __get_dumpable/get_dumpable inline, kill fs/coredump.h 2014-01-23 16:37:01 -08:00
dcache.c __dentry_path() fixes 2014-01-26 12:37:55 -05:00
dcookies.c fs/compat: fix lookup_dcookie() parameter handling 2014-01-29 16:22:40 -08:00
direct-io.c block: Abstract out bvec iterator 2013-11-23 22:33:47 -08:00
drop_caches.c shrinker: add node awareness 2013-09-10 18:56:31 -04:00
eventfd.c eventfd_ctx_fdget(): use fdget() instead of fget() 2014-01-25 03:13:04 -05:00
eventpoll.c epoll: do not take the nested ep->mtx on EPOLL_CTL_DEL 2014-01-02 14:40:30 -08:00
exec.c fs/exec.c: call arch_pick_mmap_layout() only once 2014-01-23 16:37:02 -08:00
fcntl.c file->f_op is never NULL... 2013-10-24 23:34:54 -04:00
fhandle.c
file_table.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2013-11-13 15:34:18 +09:00
file.c fs: __fget_light() can use __fget() in slow path 2014-01-25 03:14:38 -05:00
filesystems.c
fs_struct.c seqcount: Add lockdep functionality to seqcount/seqlock structures 2013-11-06 12:40:26 +01:00
fs-writeback.c writeback: Fix data corruption on NFS 2013-12-14 04:21:26 +08:00
inode.c locks: break delegations on any attribute modification 2013-11-09 00:16:44 -05:00
internal.h get rid of s_files and files_lock 2013-11-09 00:16:20 -05:00
ioctl.c file->f_op is never NULL... 2013-10-24 23:34:54 -04:00
ioprio.c
Kconfig fs: remove generic_acl 2014-01-26 08:26:40 -05:00
Kconfig.binfmt
libfs.c consolidate simple ->d_delete() instances 2013-11-15 22:04:17 -05:00
locks.c locks: missing unlock on error in generic_add_lease() 2013-11-13 07:30:53 -05:00
Makefile Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2014-01-28 08:38:04 -08:00
mbcache.c fs: convert fs shrinkers to new scan/count API 2013-09-10 18:56:31 -04:00
mount.h vfs: Is mounted should be testing mnt_ns for NULL or error. 2014-01-26 08:26:42 -05:00
mpage.c block: Abstract out bvec iterator 2013-11-23 22:33:47 -08:00
namei.c Fix mountpoint reference leakage in linkat 2014-01-31 17:33:13 -05:00
namespace.c Driver core / sysfs patches for 3.14-rc1 2014-01-20 15:49:44 -08:00
no-block.c
open.c locks: break delegations on any attribute modification 2013-11-09 00:16:44 -05:00
pipe.c fs/pipe.c: skip file_update_time on frozen fs 2014-01-23 16:37:00 -08:00
pnode.c split __lookup_mnt() in two functions 2013-10-24 23:35:00 -04:00
pnode.h vfs: Don't copy mount bind mounts of /proc/<pid>/ns/mnt between namespaces 2013-08-26 18:42:15 -07:00
posix_acl.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2014-01-28 08:38:04 -08:00
proc_namespace.c fs/proc_namespace.c: simplify testing nsp and nsp->mnt_ns 2014-01-23 16:37:02 -08:00
read_write.c fs/compat: fix parameter handling for compat readv/writev syscalls 2014-01-29 16:22:39 -08:00
readdir.c file->f_op is never NULL... 2013-10-24 23:34:54 -04:00
select.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2013-11-13 15:34:18 +09:00
seq_file.c seq_file: always clear m->count when we free m->buf 2013-11-18 19:07:53 -08:00
signalfd.c
splice.c fuse: fix pipe_buf_operations 2014-01-22 19:36:57 +01:00
stack.c
stat.c vfs: split out vfs_getattr_nosec 2013-11-09 00:16:31 -05:00
statfs.c vfs: allow O_PATH file descriptors for fstatfs() 2013-10-12 13:12:31 -07:00
super.c fs/super.c: sync ro remount after blocking writers 2014-01-31 14:29:36 -05:00
sync.c Merge branch 'akpm' (patches from Andrew Morton) 2013-11-13 15:45:43 +09:00
timerfd.c
utimes.c locks: break delegations on any attribute modification 2013-11-09 00:16:44 -05:00
xattr.c