__mlock_vma_pages_range() doesn't necessarily mlock pages. It depends on
vma flags. The same codepath is used for MAP_POPULATE.
Let's rename __mlock_vma_pages_range() to populate_vma_page_range().
This patch also drops mlock_vma_pages_range() references from
documentation. It has gone in cea10a19b7 ("mm: directly use
__mlock_vma_pages_range() in find_extend_vma()").
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
After commit a1fde08c74 ("VM: skip the stack guard page lookup in
get_user_pages only for mlock") FOLL_MLOCK has lost its original
meaning: we don't necessarily mlock the page if the flags is set -- we
also take VM_LOCKED into consideration.
Since we use the same codepath for __mm_populate(), let's rename
FOLL_MLOCK to FOLL_POPULATE.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
With gcc version 4.7.3 (Ubuntu/Linaro 4.7.3-12ubuntu1) :
mm/migrate.c: In function `migrate_pages':
mm/migrate.c:1148:1: internal compiler error: in push_minipool_fix, at config/arm/arm.c:13500
Please submit a full bug report,
with preprocessed source if appropriate.
See <file:///usr/share/doc/gcc-4.7/README.Bugs> for instructions.
Preprocessed source stored into /tmp/ccPoM1tr.out file, please attach this to your bugreport.
make[1]: *** [mm/migrate.o] Error 1
make: *** [mm/migrate.o] Error 2
Mark unmap_and_move() (which is used in a single place only) "noinline"
to work around this compiler bug.
[akpm@linux-foundation.org: make it conditional on gcc-4.7.3 and arm]
[khilman@kernel.org: fine-tune compiler versions]
[akpm@linux-foundation.org: fix comment]
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reported-by: Kevin Hilman <khilman@kernel.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Tested-by: Kevin Hilman <khilman@linaro.org>
Tested-by: Lina Iyer <lina.iyer@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Have kvm_guest_init() use hardlockup_detector_disable() instead of
watchdog_enable_hardlockup_detector(false).
Remove the watchdog_hardlockup_detector_is_enabled() and the
watchdog_enable_hardlockup_detector() function which are no longer needed.
Signed-off-by: Ulrich Obergfell <uobergfe@redhat.com>
Signed-off-by: Don Zickus <dzickus@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Rename the update_timers*() functions to update_watchdog*().
Remove the boolean argument from watchdog_enable_all_cpus() because
update_watchdog_all_cpus() is now a generic function to change the run
state of the lockup detectors and to have the lockup detectors use a new
sample period.
Signed-off-by: Ulrich Obergfell <uobergfe@redhat.com>
Signed-off-by: Don Zickus <dzickus@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
With the current user interface of the watchdog mechanism it is only
possible to disable or enable both lockup detectors at the same time.
This series introduces new kernel parameters and changes the semantics of
some existing kernel parameters, so that the hard lockup detector and the
soft lockup detector can be disabled or enabled individually. With this
series applied, the user interface is as follows.
- parameters in /proc/sys/kernel
. soft_watchdog
This is a new parameter to control and examine the run state of
the soft lockup detector.
. nmi_watchdog
The semantics of this parameter have changed. It can now be used
to control and examine the run state of the hard lockup detector.
. watchdog
This parameter is still available to control the run state of both
lockup detectors at the same time. If this parameter is examined,
it shows the logical OR of soft_watchdog and nmi_watchdog.
. watchdog_thresh
The semantics of this parameter are not affected by the patch.
- kernel command line parameters
. nosoftlockup
The semantics of this parameter have changed. It can now be used
to disable the soft lockup detector at boot time.
. nmi_watchdog=0 or nmi_watchdog=1
Disable or enable the hard lockup detector at boot time. The patch
introduces '=1' as a new option.
. nowatchdog
The semantics of this parameter are not affected by the patch. It
is still available to disable both lockup detectors at boot time.
Also, remove the proc_dowatchdog() function which is no longer needed.
[dzickus@redhat.com: wrote changelog]
[dzickus@redhat.com: update documentation for kernel params and sysctl]
Signed-off-by: Ulrich Obergfell <uobergfe@redhat.com>
Signed-off-by: Don Zickus <dzickus@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If watchdog_nmi_enable() fails to set up the hardware perf event of one
CPU, the entire hard lockup detector is deemed unreliable. Hence, disable
the hard lockup detector and shut down the hardware perf events on all
CPUs.
[dzickus@redhat.com: update comments to explain some code]
Signed-off-by: Ulrich Obergfell <uobergfe@redhat.com>
Signed-off-by: Don Zickus <dzickus@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Separate handlers for each watchdog parameter in /proc/sys/kernel replace
the proc_dowatchdog() function. Three of those handlers merely call
proc_watchdog_common() with one different argument.
Signed-off-by: Ulrich Obergfell <uobergfe@redhat.com>
Signed-off-by: Don Zickus <dzickus@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Three of four handlers for the watchdog parameters in /proc/sys/kernel
essentially have to do the same thing.
if the parameter is being read {
return the state of the corresponding bit(s) in 'watchdog_enabled'
} else {
set/clear the state of the corresponding bit(s) in 'watchdog_enabled'
update the run state of the lockup detector(s)
}
Hence, introduce a common function that can be called by those handlers.
The callers pass a 'bit mask' to this function to indicate which bit(s)
should be set/cleared in 'watchdog_enabled'.
This function handles an uncommon race with watchdog_nmi_enable() where a
concurrent update of 'watchdog_enabled' is possible. We use 'cmpxchg' to
detect the concurrency. [This avoids introducing a new spinlock or a
mutex to synchronize updates of 'watchdog_enabled'. Using the same lock
or mutex in watchdog thread context and in system call context needs to be
considered carefully because it can make the code prone to deadlock
situations in connection with parking/unparking the watchdog threads.]
Signed-off-by: Ulrich Obergfell <uobergfe@redhat.com>
Signed-off-by: Don Zickus <dzickus@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This series removes proc_dowatchdog(). Since multiple new functions need
the 'watchdog_proc_mutex' to serialize access to the watchdog parameters
in /proc/sys/kernel, move the mutex outside of any function.
Signed-off-by: Ulrich Obergfell <uobergfe@redhat.com>
Signed-off-by: Don Zickus <dzickus@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This series introduces a separate handler for each watchdog parameter in
/proc/sys/kernel. The separate handlers need a common function that they
can call to update the run state of the lockup detectors, or to have the
lockup detectors use a new sample period.
Signed-off-by: Ulrich Obergfell <uobergfe@redhat.com>
Signed-off-by: Don Zickus <dzickus@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The hardlockup and softockup had always been tied together. Due to the
request of KVM folks, they had a need to have one enabled but not the
other. Internally rework the code to split things apart more cleanly.
There is a bunch of churn here, but the end result should be code that
should be easier to maintain and fix without knowing the internals of what
is going on.
This patch (of 9):
Introduce new definitions and variables to separate the user interface in
/proc/sys/kernel from the internal run state of the lockup detectors. The
internal run state is represented by two bits in a new variable that is
named 'watchdog_enabled'. This helps simplify the code, for example:
- In order to check if any of the two lockup detectors is enabled,
it is sufficient to check if 'watchdog_enabled' is not zero.
- In order to enable/disable one or both lockup detectors,
it is sufficient to set/clear one or both bits in 'watchdog_enabled'.
- Concurrent updates of 'watchdog_enabled' need not be synchronized via
a spinlock or a mutex. Updates can either be atomic or concurrency can
be detected by using 'cmpxchg'.
Signed-off-by: Ulrich Obergfell <uobergfe@redhat.com>
Signed-off-by: Don Zickus <dzickus@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
ocfs2 does
mlog_errno(v);
return v;
in many places. Change mlog_errno() so we can do
return mlog_errno(v);
For some weird reason this patch reduces the size of ocfs2 by 6k:
akpm3:/usr/src/25> size fs/ocfs2/ocfs2.ko
text data bss dec hex filename
1146613 82767 832192 2061572 1f7504 fs/ocfs2/ocfs2.ko-before
1140857 82767 832192 2055816 1f5e88 fs/ocfs2/ocfs2.ko-after
[dan.carpenter@oracle.com: double evaluation concerns in mlog_errno()]
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: alex chen <alex.chen@huawei.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If ocfs2 lockres has not been initialized before calling ocfs2_dlm_lock,
the lock won't be dropped and then will lead umount hung. The case is
described below:
ocfs2_mknod
ocfs2_mknod_locked
__ocfs2_mknod_locked
ocfs2_journal_access_di
Failed because of -ENOMEM or other reasons, the inode lockres
has not been initialized yet.
iput(inode)
ocfs2_evict_inode
ocfs2_delete_inode
ocfs2_inode_lock
ocfs2_inode_lock_full_nested
__ocfs2_cluster_lock
Succeeds and allocates a new dlm lockres.
ocfs2_clear_inode
ocfs2_open_unlock
ocfs2_drop_inode_locks
ocfs2_drop_lock
Since lockres has not been initialized, the lock
can't be dropped and the lockres can't be
migrated, thus umount will hang forever.
Signed-off-by: Alex Chen <alex.chen@huawei.com>
Reviewed-by: Joseph Qi <joseph.qi@huawei.com>
Reviewed-by: joyce.xue <xuejiufei@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
debugfs_create_dir and debugfs_create_file may return -ENODEV when debugfs
is not configured, so the return value should be checked against
ERROR_VALUE as well, otherwise the later dereference of the dentry pointer
would crash the kernel.
This patch tries to solve this problem by fixing certain checks. However,
I have that found other call sites are protected by #ifdef CONFIG_DEBUG_FS.
In current implementation, if CONFIG_DEBUG_FS is defined, then the above
two functions will never return any ERROR_VALUE. So another possibility
to fix this is to surround all the buggy checks/functions with the same
#ifdef CONFIG_DEBUG_FS. But I'm not sure if this would break any functionality,
as only OCFS2_FS_STATS declares dependency on DEBUG_FS.
Signed-off-by: Chengyu Song <csong84@gatech.edu>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In ocfs2_local_alloc_find_clear_bits and ocfs2_get_dentry, variable
numfound and set may be uninitialized and then used in tracepoint. In
ocfs2_xattr_block_get and ocfs2_delete_xattr_in_bucket, variable block_off
and xv may be uninitialized and then used in the following logic due to
unchecked return value.
This patch fixes these possible issues.
Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
ocfs2_block_group_clear_bits will clear bits in block group bitmap.
Once it succeeds but fails in the following step, it will cause block
group bitmap mismatch the corresponding count recorded in dinode.
So rollback the cleared bits if error occurs.
Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In ocfs2_direct_IO_write, we use ocfs2_zero_extend to zero allocated
clusters in case of cluster not aligned. But ocfs2_zero_extend uses page
cache, this may happen that it clears the data which blockdev_direct_IO
has already written.
We should use blkdev_issue_zeroout instead of ocfs2_zero_extend during
direct IO.
So fix this issue by introducing ocfs2_direct_IO_zero_extend and
ocfs2_direct_IO_extend_no_holes.
Reported-by: Yiwen Jiang <jiangyiwen@huawei.com>
Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
Tested-by: Yiwen Jiang <jiangyiwen@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
ocfs2_free_path() was called in some cases by
ocfs2_figure_merge_contig_type() during error handling even if the passed
variables "left_path" and "right_path" contained still a null pointer.
Corresponding implementation details could be improved by adjustments for
jump labels according to the current Linux coding style convention.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
kfree() was called in a few cases by ocfs2_convert_inline_data_to_extents()
during error handling even if the passed variable "pages" contained a
null pointer.
* Return from this implementation directly after failure detection for
the function call "kcalloc".
* Corresponding details could be improved by the introduction of another
jump label.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
kfree(), ocfs2_free_path() and __ocfs2_free_slot_info() test whether their
argument is NULL and then return immediately. Thus the test around their
calls is not needed.
This issue was detected by using the Coccinelle software.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
dwarf_reg_pool and dwarf_frame_pool are not properly destroyed when
cleaning up the dwarf unwinder. Destroy them with mempool_destroy().
Also mark dwarf_unwinder_cleanup() as __init.
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull GFS2 updates from Bob Peterson:
"Here is a list of patches we've accumulated for GFS2 for the current
upstream merge window.
Most of the patches fix GFS2 quotas, which were not properly enforced.
There's another that adds me as a GFS2 co-maintainer, and a couple
patches that fix a kernel panic doing splice_write on GFS2 as well as
a few correctness patches"
* tag 'gfs2-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2:
gfs2: fix quota refresh race in do_glock()
gfs2: incorrect check for debugfs returns
gfs2: allow fallocate to max out quotas/fs efficiently
gfs2: allow quota_check and inplace_reserve to return available blocks
gfs2: perform quota checks against allocation parameters
GFS2: Move gfs2_file_splice_write outside of #ifdef
GFS2: Allocate reservation during splice_write
GFS2: gfs2_set_acl(): Cache "no acl" as well
Add myself (Bob Peterson) as a maintainer of GFS2
Pull jfs update from David Kleikamp:
"Not much this time. Just a one-liner format fix"
* tag 'jfs-4.1' of git://github.com/kleikamp/linux-shaggy:
jfs: %pf is only for function pointers
Pull pstore fix from Tony Luck:
"Just one pstore fix this release"
* tag 'please-pull-pstore' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux:
pstore: Fix the ramoops module parameters update
VFs were being improperly added to the switch's multicast group. The
error stems from the fact that incorrect arguments were passed to the
"update_mc_addr" function. It would seem to be a copy paste error since
the parameters are similar to the "update_uc_addr" function.
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Ngai-Mint Kwan <ngai-mint.kwan@intel.com>
Acked-by: Matthew Vick <matthew.vick@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>