linux-kernel-test/mm
Artem Bityutskiy 6628bc74f1 writeback: do not lose wakeup events when forking bdi threads
This patch fixes the following issue:

INFO: task mount.nfs4:1120 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
mount.nfs4    D 00000000fffc6a21     0  1120   1119 0x00000000
 ffff880235643948 0000000000000046 ffffffff00000000 ffffffff00000000
 ffff880235643fd8 ffff880235314760 00000000001d44c0 ffff880235643fd8
 00000000001d44c0 00000000001d44c0 00000000001d44c0 00000000001d44c0
Call Trace:
 [<ffffffff813bc747>] schedule_timeout+0x34/0xf1
 [<ffffffff813bc530>] ? wait_for_common+0x3f/0x130
 [<ffffffff8106b50b>] ? trace_hardirqs_on+0xd/0xf
 [<ffffffff813bc5c3>] wait_for_common+0xd2/0x130
 [<ffffffff8104159c>] ? default_wake_function+0x0/0xf
 [<ffffffff813beaa0>] ? _raw_spin_unlock+0x26/0x2a
 [<ffffffff813bc6bb>] wait_for_completion+0x18/0x1a
 [<ffffffff81101a03>] sync_inodes_sb+0xca/0x1bc
 [<ffffffff811056a6>] __sync_filesystem+0x47/0x7e
 [<ffffffff81105798>] sync_filesystem+0x47/0x4b
 [<ffffffff810e7ffd>] generic_shutdown_super+0x22/0xd2
 [<ffffffff810e80f8>] kill_anon_super+0x11/0x4f
 [<ffffffffa00d06d7>] nfs4_kill_super+0x3f/0x72 [nfs]
 [<ffffffff810e7b68>] deactivate_locked_super+0x21/0x41
 [<ffffffff810e7fd6>] deactivate_super+0x40/0x45
 [<ffffffff810fc66c>] mntput_no_expire+0xb8/0xed
 [<ffffffff810fc73b>] release_mounts+0x9a/0xb0
 [<ffffffff810fc7bb>] put_mnt_ns+0x6a/0x7b
 [<ffffffffa00d0fb2>] nfs_follow_remote_path+0x19a/0x296 [nfs]
 [<ffffffffa00d11ca>] nfs4_try_mount+0x75/0xaf [nfs]
 [<ffffffffa00d1790>] nfs4_get_sb+0x276/0x2ff [nfs]
 [<ffffffff810e7dba>] vfs_kern_mount+0xb8/0x196
 [<ffffffff810e7ef6>] do_kern_mount+0x48/0xe8
 [<ffffffff810fdf68>] do_mount+0x771/0x7e8
 [<ffffffff810fe062>] sys_mount+0x83/0xbd
 [<ffffffff810089c2>] system_call_fastpath+0x16/0x1b

The reason for this hang was a race condition: when the forker thread is
forking a bdi thread, we use 'kthread_run()', so we run it _before_ we make it
visible in 'bdi->wb.task'. The bdi thread runs, does all its work, and goes to
sleep while 'bdi->wb.task' is still NULL. This is a dangerous time window.

If someone queues work for this bdi during this window, they do not see the bdi
thread and wake up the forker thread instead. But the forker has already
forked this bdi thread; it just has not made it visible yet.

The result is that we lose the wake up event for this bdi thread and the NFS4
code waits forever.

To fix the problem, we should use 'kthread_create()' for creating bdi threads,
then make them visible in 'bdi->wb.task', and only after this wake them up.
This is exactly what this patch does.
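
The two orderings can be compared with a small single-threaded model in
userspace C. This is only a sketch of the interleaving described above, not
kernel code: 'struct model_bdi', 'struct model_thread' and all 'model_*'
functions are hypothetical stand-ins, and the model assumes the forker does
nothing useful with the extra wakeup it receives.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the bdi thread. */
struct model_thread {
	bool asleep;
};

/* Hypothetical stand-in for the bdi; 'task' plays the role of bdi->wb.task. */
struct model_bdi {
	struct model_thread *task;
	int pending;		/* work items queued but not yet processed */
	int forker_wakeups;	/* wakeups that went to the forker instead */
};

/* One run of the bdi thread: drain all queued work, then go to sleep. */
static void model_thread_run(struct model_bdi *bdi, struct model_thread *t)
{
	bdi->pending = 0;
	t->asleep = true;
}

/* Queue work: wake the bdi thread if it is visible, else wake the forker. */
static void model_queue_work(struct model_bdi *bdi)
{
	bdi->pending++;
	if (bdi->task != NULL)
		model_thread_run(bdi, bdi->task);
	else
		bdi->forker_wakeups++;	/* assumed to be a no-op for this bdi */
}

/*
 * Replay the interleaving from the commit message and return the number of
 * work items left unprocessed.
 *
 * publish_before_wake == false: kthread_run()-like order (run, then publish).
 * publish_before_wake == true:  create, publish, then wake.
 */
int simulate_lost_work(bool publish_before_wake)
{
	struct model_thread t = { false };
	struct model_bdi bdi = { NULL, 0, 0 };

	/* Old order: the thread runs and sleeps before it is visible. */
	if (!publish_before_wake)
		model_thread_run(&bdi, &t);

	/* Work arrives while 'task' is still NULL, so the forker is woken. */
	model_queue_work(&bdi);

	/* The forker makes the thread visible. */
	bdi.task = &t;

	/* New order: the first wakeup happens only after publishing. */
	if (publish_before_wake)
		model_thread_run(&bdi, &t);

	return bdi.pending;
}
```

In this model, 'simulate_lost_work(false)' leaves one work item pending with
the thread asleep, which corresponds to the lost wakeup, while
'simulate_lost_work(true)' leaves none because the thread's first run happens
after it became visible.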

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-27 09:16:18 +02:00
backing-dev.c writeback: do not lose wakeup events when forking bdi threads 2010-08-27 09:16:18 +02:00
bootmem.c x86,nobootmem: make alloc_bootmem_node fall back to other node when 32bit numa is used 2010-07-20 16:25:40 -07:00
bounce.c include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
compaction.c mm: compaction: add a tunable that decides when memory should be compacted and when it should be reclaimed 2010-05-25 08:06:59 -07:00
debug-pagealloc.c generic debug pagealloc 2009-04-01 08:59:13 -07:00
dmapool.c dmapools: protect page_list walk in show_pools() 2009-06-30 18:56:00 -07:00
fadvise.c readahead: introduce FMODE_RANDOM for POSIX_FADV_RANDOM 2010-03-06 11:26:25 -08:00
failslab.c include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
filemap_xip.c include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
filemap.c gcc-4.6: mm: fix unused but set warnings 2010-08-09 20:44:58 -07:00
fremap.c mm: clean up mm_counter 2010-03-06 11:26:23 -08:00
highmem.c mm,kdb,kgdb: Add a debug reference for the kdb kmap usage 2010-08-05 09:22:24 -05:00
hugetlb.c Merge branch 'hwpoison' of git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-mce-2.6 2010-08-12 10:15:10 -07:00
hwpoison-inject.c HWPOISON, hugetlb: support hwpoison injection for hugepage 2010-08-11 09:23:11 +02:00
init-mm.c mm: provide init_mm mm_context initializer 2010-08-09 20:44:54 -07:00
internal.h HWPOISON: add an interface to switch off/on all the page filters 2009-12-16 12:19:59 +01:00
Kconfig lmb: rename to memblock 2010-07-14 17:14:00 +10:00
Kconfig.debug trivial: improve help text for mm debug config options 2009-09-21 15:14:57 +02:00
kmemcheck.c kmemcheck: Fix build errors due to missing slab.h 2010-03-30 22:02:32 +09:00
kmemleak-test.c percpu: clean up percpu variable definitions 2009-06-24 15:13:48 +09:00
kmemleak.c kmemleak: Fix typo in the comment 2010-08-08 21:57:23 +01:00
ksm.c ksm: cleanup for mm_slots_hash 2010-08-09 20:45:03 -07:00
maccess.c maccess,probe_kernel: Allow arch specific override probe_kernel_(read|write) 2010-01-07 11:58:36 -06:00
madvise.c HWPOISON: Add a madvise() injector for soft page offlining 2009-12-16 12:20:00 +01:00
Makefile lmb: rename to memblock 2010-07-14 17:14:00 +10:00
memblock.c memblock: Fix memblock_is_region_reserved() to return a boolean 2010-08-09 11:21:38 +10:00
memcontrol.c memcg: convert to use zone_to_nid() from bare zone->zone_pgdat->node_id 2010-08-11 08:59:19 -07:00
memory_hotplug.c mem-hotplug: fix potential race while building zonelist for new populated zone 2010-05-25 08:07:02 -07:00
memory-failure.c Merge branch 'hwpoison' of git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-mce-2.6 2010-08-12 10:15:10 -07:00
memory.c mm: make stack guard page logic use vm_prev pointer 2010-08-21 08:50:00 -07:00
mempolicy.c mempolicy: reduce stack size of migrate_pages() 2010-08-09 20:44:58 -07:00
mempool.c mm: remove broken 'kzalloc' mempool 2009-09-22 07:17:35 -07:00
migrate.c mm: extend KSM refcounts to the anon_vma root 2010-08-09 20:44:55 -07:00
mincore.c mincore: do nested page table walks 2010-05-25 08:06:58 -07:00
mlock.c mm: make the mlock() stack guard page checks stricter 2010-08-21 08:49:50 -07:00
mm_init.c
mmap.c mm: make the vma list be doubly linked 2010-08-21 08:49:21 -07:00
mmu_context.c exit: fix oops in sync_mm_rss 2010-03-24 16:31:21 -07:00
mmu_notifier.c include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
mmzone.c [ARM] Double check memmap is actually valid with a memmap has unexpected holes V2 2009-05-18 11:22:24 +01:00
mprotect.c include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
mremap.c include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
msync.c sanitize vfs_fsync calling conventions 2010-05-21 18:31:21 -04:00
nommu.c mm: make the vma list be doubly linked 2010-08-21 08:49:21 -07:00
oom_kill.c oom: __task_cred() need rcu_read_lock() 2010-08-20 09:34:55 -07:00
page_alloc.c vmscan: kill prev_priority completely 2010-08-09 20:45:00 -07:00
page_cgroup.c kmemleak: Annotate false positive in init_section_page_cgroup() 2010-07-19 11:54:14 +01:00
page_io.c block: unify flags for struct bio and struct request 2010-08-07 18:20:39 +02:00
page_isolation.c memory hotplug: fix page_zone() calculation in test_pages_isolated() 2008-11-06 15:41:19 -08:00
page-writeback.c lib/radix-tree.c: fix overflow in radix_tree_range_tag_if_tagged() 2010-08-20 09:34:55 -07:00
pagewalk.c pagemap: fix pfn calculation for hugepage 2010-04-07 08:38:04 -07:00
percpu_up.c percpu: don't implicitly include slab.h from percpu.h 2010-03-30 22:02:32 +09:00
percpu-km.c percpu: implement kernel memory based chunk allocation 2010-05-01 08:30:50 +02:00
percpu-vm.c percpu: move vmalloc based chunk management into percpu-vm.c 2010-05-01 08:30:50 +02:00
percpu.c percpu: allow limited allocation before slab is online 2010-06-27 18:50:00 +02:00
prio_tree.c
quicklist.c include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
readahead.c readahead.c: fix comment 2010-05-25 08:07:00 -07:00
rmap.c Merge branch 'hwpoison' of git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-mce-2.6 2010-08-12 10:15:10 -07:00
shmem.c shmem: put_super must percpu_counter_destroy 2010-08-17 18:33:11 -07:00
slab.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6 2010-08-22 10:08:52 -07:00
slob.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6 2010-08-06 11:44:08 -07:00
slub.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6 2010-08-06 11:44:08 -07:00
sparse-vmemmap.c include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
sparse.c sparsemem: on no vmemmap path put mem_map on node high too 2010-05-25 08:06:56 -07:00
swap_state.c include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
swap.c mm: export lru_cache_add_*() to modules 2010-05-25 15:06:06 +02:00
swapfile.c hibernation: freeze swap at hibernation 2010-08-09 20:45:04 -07:00
thrash.c mm: pass mm to grab_swap_token 2009-06-23 12:50:05 -07:00
truncate.c check ATTR_SIZE contraints in inode_change_ok 2010-08-09 16:47:39 -04:00
util.c mm: use memdup_user 2010-08-09 20:44:54 -07:00
vmalloc.c Merge branch 'stable/xen-swiotlb-0.8.6' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen 2010-08-12 09:09:41 -07:00
vmscan.c memcg: remove nid and zid argument from mem_cgroup_soft_limit_reclaim() 2010-08-11 08:59:19 -07:00
vmstat.c vmscan: kill prev_priority completely 2010-08-09 20:45:00 -07:00