Commit Graph

398407 Commits

Author SHA1 Message Date
Olof Johansson
45150c43b1 direct-io: Use return from cmpxchg to decide if assignment happened
Not using the return value can in the generic case be racy, so it's
in general good practice to check the return value instead.

This also resolved the warning caused on ARM and other architectures:

  fs/direct-io.c: In function 'sb_init_dio_done_wq':
  fs/direct-io.c:557:2: warning: value computed is not used [-Wunused-value]
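
The practice described above can be sketched in userspace C11, where atomic_compare_exchange_strong() plays the role of cmpxchg. This is only an illustration, not the fs/direct-io.c code: struct workqueue, shared_wq and install_wq_once() are hypothetical names.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

struct workqueue { int id; };   /* hypothetical stand-in for the real object */

static _Atomic(struct workqueue *) shared_wq;   /* NULL until first install */

/*
 * Install a freshly allocated object only if nobody beat us to it.
 * Returns true if our object was installed, false if allocation failed
 * or another object was already there (ours is freed in that case).
 */
static bool install_wq_once(int id)
{
    struct workqueue *wq = malloc(sizeof(*wq));
    struct workqueue *expected = NULL;

    if (!wq)
        return false;
    wq->id = id;

    /* The return value tells us whether the exchange really took place. */
    if (!atomic_compare_exchange_strong(&shared_wq, &expected, wq)) {
        free(wq);               /* lost the race: keep the existing object */
        return false;
    }
    return true;
}
```

Deciding on the return value, rather than re-reading the shared pointer afterwards, avoids a window in which a second thread could change it.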

Signed-off-by: Olof Johansson <olof@lixom.net>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: H Peter Anvin <hpa@zytor.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-09-09 10:47:42 -07:00
Waiman Long
232d2d60aa dcache: Translating dentry into pathname without taking rename_lock
When running the AIM7's short workload, Linus' lockref patch eliminated
most of the spinlock contention. However, there were still some left:

     8.46%     reaim  [kernel.kallsyms]     [k] _raw_spin_lock
                 |--42.21%-- d_path
                 |          proc_pid_readlink
                 |          SyS_readlinkat
                 |          SyS_readlink
                 |          system_call
                 |          __GI___readlink
                 |
                 |--40.97%-- sys_getcwd
                 |          system_call
                 |          __getcwd

The big one here is the rename_lock (seqlock) contention in d_path()
and the getcwd system call. This patch will eliminate the need to take
the rename_lock while translating dentries into the full pathnames.

The need to take the rename_lock is to make sure that no rename
operation can be ongoing while the translation is in progress. However,
only one thread can take the rename_lock thus blocking all the other
threads that need it even though the translation process won't make
any change to the dentries.

This patch will replace the writer's write_seqlock/write_sequnlock
sequence of the rename_lock of the callers of the prepend_path() and
__dentry_path() functions with the reader's read_seqbegin/read_seqretry
sequence within these 2 functions. As a result, the code will have to
retry if one or more rename operations had been performed. In addition,
RCU read lock will be taken during the translation process to make sure
that no dentries will go away. To prevent live-lock from happening,
the code will fall back to taking the rename_lock if read_seqretry()
fails three times.
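
The retry-then-fall-back scheme described above can be sketched in userspace C11 roughly as follows. This is a simplified illustration, not the kernel seqlock API: seq_guard, guarded_read() and demo_walk() are hypothetical names, a spinning atomic_flag stands in for the lock, and the fallback reader only takes the lock instead of performing a full write_seqlock().

```c
#include <stdatomic.h>
#include <stdbool.h>

#define MAX_LOCKLESS_TRIES 3    /* retry budget taken from the text above */

struct seq_guard {
    atomic_uint seq;            /* odd while a writer is in progress */
    atomic_flag lock;           /* simple spinlock for writers and fallback */
};

static void guard_init(struct seq_guard *g)
{
    atomic_init(&g->seq, 0);
    atomic_flag_clear(&g->lock);
}

static void guard_lock(struct seq_guard *g)
{
    while (atomic_flag_test_and_set_explicit(&g->lock, memory_order_acquire))
        ;                       /* spin */
}

static void guard_unlock(struct seq_guard *g)
{
    atomic_flag_clear_explicit(&g->lock, memory_order_release);
}

/* Writer side: take the lock, make the sequence odd, then even again. */
static void write_begin(struct seq_guard *g) { guard_lock(g); atomic_fetch_add(&g->seq, 1); }
static void write_end(struct seq_guard *g)   { atomic_fetch_add(&g->seq, 1); guard_unlock(g); }

typedef int (*walk_fn)(void *ctx);

/*
 * Run 'walk' locklessly up to MAX_LOCKLESS_TRIES times. If a writer
 * invalidated every attempt, run it once more with the lock held.
 */
static int guarded_read(struct seq_guard *g, walk_fn walk, void *ctx,
                        bool *used_lock)
{
    int result;

    for (int tries = 0; tries < MAX_LOCKLESS_TRIES; tries++) {
        unsigned start;

        /* Wait for an even (stable) sequence number before starting. */
        while ((start = atomic_load_explicit(&g->seq, memory_order_acquire)) & 1)
            ;
        result = walk(ctx);
        atomic_thread_fence(memory_order_acquire);
        if (atomic_load_explicit(&g->seq, memory_order_relaxed) == start) {
            *used_lock = false;
            return result;      /* no writer interfered */
        }
    }

    guard_lock(g);              /* fallback: exclude writers entirely */
    result = walk(ctx);
    guard_unlock(g);
    *used_lock = true;
    return result;
}

/* Demo helper: a walk during which a writer slips in 'noise' times. */
struct demo_ctx { struct seq_guard *g; int noise; int calls; };

static int demo_walk(void *p)
{
    struct demo_ctx *c = p;

    c->calls++;
    if (c->noise > 0) {
        c->noise--;
        write_begin(c->g);      /* simulate a rename completing mid-walk */
        write_end(c->g);
    }
    return 42;
}
```

With no interference the walk runs once without the lock. With three interfering writes it runs four times, the last one under the lock. Real code would also need the walked data to be safe to read while a writer is active.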

To further reduce spinlock contention, this patch does not take the
dentry's d_lock when copying the filename from the dentries. Instead,
it treats the name pointer and length as unreliable and just copies
the string over byte by byte until it hits a null byte or the end of
the string as specified by the length. This should avoid stepping into
an invalid memory address. The error cases are left to be handled by
the sequence number check.
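
A minimal sketch of the byte-by-byte copy described above, assuming a plain forward copy into a caller-provided buffer. copy_name_bytes() is a hypothetical helper, not the kernel's prepend_name().

```c
#include <stddef.h>

/*
 * Copy at most 'len' bytes from 'name' into 'dst', stopping early at a
 * NUL byte. Neither the pointer nor the length is trusted to describe a
 * complete string, so every byte is checked. Returns the number of bytes
 * copied. 'dst' must have room for 'len' bytes.
 */
static size_t copy_name_bytes(char *dst, const char *name, size_t len)
{
    size_t n = 0;

    while (n < len) {
        char c = name[n];

        if (c == '\0')
            break;              /* shorter than the advertised length */
        dst[n++] = c;
    }
    return n;
}
```

A result that came from a stale name or length is not detected here. As the text says, that is left to the sequence number check.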

The following code refactorings are also made:
1. Move prepend('/') into prepend_name() to remove one conditional
   check.
2. Move the global root check in prepend_path() back to the top of
   the while loop.

With this patch, the _raw_spin_lock will now account for only 1.2%
of the total CPU cycles for the short workload. This patch also has
the effect of reducing the effect of running perf on its profile
since the perf command itself can be a heavy user of the d_path()
function depending on the complexity of the workload.

When taking the perf profile of the high-systime workload, the amount
of spinlock contention contributed by running perf without this patch
was about 16%. With this patch, the spinlock contention caused by
the running of perf will go away and we will have a more accurate
perf profile.

Signed-off-by: Waiman Long <Waiman.Long@hp.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-09-09 13:44:16 -04:00
Linus Torvalds
ef9a61bef9 Merge tag 'for-linus-20130909' of git://git.infradead.org/linux-mtd
Pull mtd updates from David Woodhouse:
 - factor out common code from MTD tests
 - nand-gpio cleanup and portability to non-ARM
 - m25p80 support for 4-byte addressing chips, other new chips
 - pxa3xx cleanup and support for new platforms
 - remove obsolete alauda, octagon-5066 drivers
 - erase/write support for bcm47xxsflash
 - improve detection of ECC requirements for NAND, controller setup
 - NFC acceleration support for atmel-nand, read/write via SRAM
 - etc

* tag 'for-linus-20130909' of git://git.infradead.org/linux-mtd: (184 commits)
  mtd: chips: Add support for PMC SPI Flash chips in m25p80.c
  mtd: ofpart: use for_each_child_of_node() macro
  mtd: mtdswap: replace strict_strtoul() with kstrtoul()
  mtd cs553x_nand: use kzalloc() instead of memset
  mtd: atmel_nand: fix error return code in atmel_nand_probe()
  mtd: bcm47xxsflash: writing support
  mtd: bcm47xxsflash: implement erasing support
  mtd: bcm47xxsflash: convert to module_platform_driver instead of init/exit
  mtd: bcm47xxsflash: convert kzalloc to avoid invalid access
  mtd: remove alauda driver
  mtd: nand: mxc_nand: mark 'const' properly
  mtd: maps: cfi_flagadm: add missing __iomem annotation
  mtd: spear_smi: add missing __iomem annotation
  mtd: r852: Staticize local symbols
  mtd: nandsim: Staticize local symbols
  mtd: impa7: add missing __iomem annotation
  mtd: sm_ftl: Staticize local symbols
  mtd: m25p80: add support for mr25h10
  mtd: m25p80: make CONFIG_M25PXX_USE_FAST_READ safe to enable
  mtd: m25p80: Pass flags through CAT25_INFO macro
  ...
2013-09-09 10:33:19 -07:00
Linus Torvalds
b5f0998cae Merge tag 'firewire-updates' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394
Pull firewire updates from Stefan Richter:

 - Fix a regression since 3.2 inclusive: The subsystem workqueue
   deadlocked between transaction completion handling and bus reset
   handling if the worker pool could not be increased in time.

 - janitorial updates

* tag 'firewire-updates' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394:
  firewire: ohci: Fix deadlock at bus reset
  firewire: ohci: Change module_pci_driver to module_init/module_exit
  firewire: ohci: beautify some macro definitions
  firewire: ohci: change confusing name of a struct member
  firewire: core: typecast from gfp_t to bool more safely
  firewire: WQ_NON_REENTRANT is meaningless and going away
2013-09-09 10:32:03 -07:00
Dan Williams
ab5f8c6ee8 MAINTAINERS: update email for Dan Williams
Returned to intel.com

Cc: Vinod Koul <vinod.koul@intel.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Jon Mason <jon.mason@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Neil Brown <neilb@suse.de>
Cc: Shaohua Li <shli@kernel.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2013-09-09 10:29:56 -07:00
Linus Torvalds
64c353864e Merge branch 'for-v3.12' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping
Pull DMA mapping update from Marek Szyprowski:
 "This contains an addition of Device Tree support for reserved memory
  regions (Contiguous Memory Allocator is one of the drivers for it) and
  changes required by the KVM extensions for PowerPC architecture"

* 'for-v3.12' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping:
  ARM: init: add support for reserved memory defined by device tree
  drivers: of: add initialization code for dma reserved memory
  drivers: of: add function to scan fdt nodes given by path
  drivers: dma-contiguous: clean source code and prepare for device tree
2013-09-09 10:26:33 -07:00
Sachin Kamat
a577659f42 dma: mv_xor: Fix incorrect error path
Return directly if memory allocation fails. There is no need
of dma_free_coherent().

Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Cc: Saeed Bishara <saeed@marvell.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2013-09-09 10:26:04 -07:00
Linus Torvalds
d8cacd3a25 Merge tag 'virtio-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux
Pull virtio update from Rusty Russell:
 "More console fixes; these are the theoretical ones which didn't get
  CC:stable.  But for that reason, I did a merge with master partway
  through to avoid an unnecessary conflict.

  Also: a fun lguest bug turns out if you don't clear the TF flag when
  trapping Bad Things happen to the guest kernel as the stack
  overflows..."

* tag 'virtio-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
  virtio_pci: pm: Use CONFIG_PM_SLEEP instead of CONFIG_PM
  lguest: fix GPF in guest when using gdb.
  lguest: fix guest kernel stack overflow when TF bit set.
  lguest: fix BUG_ON() in invalid guest page table.
  virtio: console: prevent use-after-free of port name in port unplug
  virtio: console: cleanup an error message
  virtio: console: fix locking around send_sigio_to_port()
  virtio: console: add locking in port unplug path
  virtio: console: add locks around buffer removal in port unplug path
  tools/lguest: offer VIRTIO_F_ANY_LAYOUT for net device.
  virtio tools: add .gitignore
  lguest: Point to the right directory for the lguest launcher
2013-09-09 10:20:54 -07:00
Linus Torvalds
d75671e36e Merge tag 'vfio-v3.12-rc0' of git://github.com/awilliam/linux-vfio
Pull VFIO update from Alex Williamson:
 "VFIO updates include safer default file flags for VFIO device fds, an
  external user interface exported to allow other modules to hold
  references to VFIO groups, a fix to test for extended config space on
  PCIe and PCI-x, and new hot reset interfaces for PCI devices which
  allows the user to do PCI bus/slot resets when all of the devices
  affected by the reset are owned by the user.

  For this last feature, the PCI bus reset interface, I depend on
  changes already merged from Bjorn's PCI pull request.  I therefore
  merged my tree up to commit cb3e433, which I think was the correct
  action, but as Stephen Rothwell noted, I failed to provide a commit
  message indicating why the merge was required.  Sorry for that.
  Thanks, Alex"

* tag 'vfio-v3.12-rc0' of git://github.com/awilliam/linux-vfio:
  vfio: fix documentation
  vfio-pci: PCI hot reset interface
  vfio-pci: Test for extended config space
  vfio-pci: Use fdget() rather than eventfd_fget()
  vfio: Add O_CLOEXEC flag to vfio device fd
  vfio: use get_unused_fd_flags(0) instead of get_unused_fd()
  vfio: add external user support
2013-09-09 10:19:36 -07:00
Konrad Rzeszutek Wilk
c3b7cb1fd8 xen/spinlock: Don't use __initdata for xen_pv_spin
As we get compile warnings about .init.data being
used by non-init functions.

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2013-09-09 13:08:49 -04:00
Linus Torvalds
bf97293eb8 Merge tag 'nfs-for-3.12-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client updates from Trond Myklebust:
 "Highlights include:

   - Fix NFSv4 recovery so that it doesn't recover lost locks in cases
     such as lease loss due to a network partition, where doing so may
     result in data corruption.  Add a kernel parameter to control
     choice of legacy behaviour or not.
   - Performance improvements when 2 processes are writing to the same
     file.
   - Flush data to disk when an RPCSEC_GSS session timeout is imminent.
   - Implement NFSv4.1 SP4_MACH_CRED state protection to prevent other
     NFS clients from being able to manipulate our lease and file
     locking state.
   - Allow sharing of RPCSEC_GSS caches between different rpc clients.
   - Fix the broken NFSv4 security auto-negotiation between client and
     server.
   - Fix rmdir() to wait for outstanding sillyrename unlinks to complete
   - Add a tracepoint framework for debugging NFSv4 state recovery
     issues.
   - Add tracing to the generic NFS layer.
   - Add tracing for the SUNRPC socket connection state.
   - Clean up the rpc_pipefs mount/umount event management.
   - Merge more patches from Chuck in preparation for NFSv4 migration
     support"

* tag 'nfs-for-3.12-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (107 commits)
  NFSv4: use mach cred for SECINFO_NO_NAME w/ integrity
  NFS: nfs_compare_super shouldn't check the auth flavour unless 'sec=' was set
  NFSv4: Allow security autonegotiation for submounts
  NFSv4: Disallow security negotiation for lookups when 'sec=' is specified
  NFSv4: Fix security auto-negotiation
  NFS: Clean up nfs_parse_security_flavors()
  NFS: Clean up the auth flavour array mess
  NFSv4.1 Use MDS auth flavor for data server connection
  NFS: Don't check lock owner compatibility unless file is locked (part 2)
  NFS: Don't check lock owner compatibility in writes unless file is locked
  nfs4: Map NFS4ERR_WRONG_CRED to EPERM
  nfs4.1: Add SP4_MACH_CRED write and commit support
  nfs4.1: Add SP4_MACH_CRED stateid support
  nfs4.1: Add SP4_MACH_CRED secinfo support
  nfs4.1: Add SP4_MACH_CRED cleanup support
  nfs4.1: Add state protection handler
  nfs4.1: Minimal SP4_MACH_CRED implementation
  SUNRPC: Replace pointer values with task->tk_pid and rpc_clnt->cl_clid
  SUNRPC: Add an identifier for struct rpc_clnt
  SUNRPC: Ensure rpc_task->tk_pid is available for tracepoints
  ...
2013-09-09 09:19:15 -07:00
Linus Torvalds
16d70e1529 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse
Pull fuse bugfixes from Miklos Szeredi:
 "Just a bunch of bugfixes"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
  fuse: use list_for_each_entry() for list traversing
  fuse: readdir: check for slash in names
  fuse: hotfix truncate_pagecache() issue
  fuse: invalidate inode attributes on xattr modification
  fuse: postpone end_page_writeback() in fuse_writepage_locked()
2013-09-09 09:18:23 -07:00
Linus Torvalds
6c337ad6cc Merge tag 'gfs2-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-3.0-nmw
Pull GFS2 updates from Steven Whitehouse:
 "This is possibly the smallest ever set of GFS2 patches for a merge
  window.  Also, most of them are bug fixes this time.

  Two of my three patches (moving gfs2_sync_meta and merging the two
  writepage implementations) are clean ups with the third (taking the
  glock ref in examine_bucket) being a fix for a difficult to hit race
  condition.

  The removal of an unused memory barrier is a clean up from Bob
  Peterson, and the "spectator" relates to a rarely used mount option.
  Ben Marzinski's patch fixes a corner case where the incorrect inode
  flags were being set, resulting in incorrect behaviour on fsync"

* tag 'gfs2-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-3.0-nmw:
  GFS2: dirty inode correctly in gfs2_write_end
  GFS2: Don't flag consistency error if first mounter is a spectator
  GFS2: Remove unnecessary memory barrier
  GFS2: Merge ordered and writeback writepage
  GFS2: Take glock reference in examine_bucket()
  GFS2: Move gfs2_sync_meta to lops.c
2013-09-09 09:16:51 -07:00
Linus Torvalds
6cccc7d301 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client
Pull ceph updates from Sage Weil:
 "This includes both the first pile of Ceph patches (which I sent to
  torvalds@vger, sigh) and a few new patches that add support for
  fscache for Ceph.  That includes a few fscache core fixes that David
  Howells asked go through the Ceph tree.  (Thanks go to Milosz Tanski
  for putting this feature together)

  This first batch of patches (included here) had (has) several
  important RBD bug fixes, hole punch support, several different
  cleanups in the page cache interactions, improvements in the truncate
  code (new truncate mutex to avoid shenanigans with i_mutex), and a
  series of fixes in the synchronous striping read/write code.

  On top of that is a random collection of small fixes all across the
  tree (error code checks and error path cleanup, obsolete wq flags,
  etc)"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (43 commits)
  ceph: use d_invalidate() to invalidate aliases
  ceph: remove ceph_lookup_inode()
  ceph: trivial buildbot warnings fix
  ceph: Do not do invalidate if the filesystem is mounted nofsc
  ceph: page still marked private_2
  ceph: ceph_readpage_to_fscache didn't check if marked
  ceph: clean PgPrivate2 on returning from readpages
  ceph: use fscache as a local persistent cache
  fscache: Netfs function for cleanup post readpages
  FS-Cache: Fix heading in documentation
  CacheFiles: Implement interface to check cache consistency
  FS-Cache: Add interface to check consistency of a cached object
  rbd: fix null dereference in dout
  rbd: fix buffer size for writes to images with snapshots
  libceph: use pg_num_mask instead of pgp_num_mask for pg.seed calc
  rbd: fix I/O error propagation for reads
  ceph: use vfs __set_page_dirty_nobuffers interface instead of doing it inside filesystem
  ceph: allow sync_read/write to return the partially successful size of read/write.
  ceph: fix bugs about handling short-read for sync read mode.
  ceph: remove useless variable revoked_rdcache
  ...
2013-09-09 09:13:22 -07:00
Linus Torvalds
255ae3fbd2 Merge tag 'metag-for-v3.12' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/metag
Pull metag architecture changes from James Hogan:
 - Device tree updates for TZ1090 GPIO drivers merged via GPIO tree.
 - Add driver for ImgTec PDC irqchip as found in TZ1090 SoC.
 - Add linux-metag mailing list to MAINTAINERS file.

* tag 'metag-for-v3.12' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/metag:
  irq-imgpdc: add ImgTec PDC irqchip driver
  MAINTAINERS: add linux-metag mailing list
  metag: tz1090: instantiate gpio-tz1090-pdc
  metag: tz1090: select and instantiate gpio-tz1090
  metag: tz1090: select and instantiate irq-imgpdc
2013-09-09 09:09:44 -07:00
Konrad Rzeszutek Wilk
fb78e58c27 Revert "xen/spinlock: Disable IRQ spinlock (PV) allocation on PVHVM"
This reverts commit 70dd4998cb.

Now that the bugs have been resolved we can re-enable the
PV ticketlock implementation under PVHVM Xen guests.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
2013-09-09 12:06:45 -04:00
Konrad Rzeszutek Wilk
3310bbedac xen/spinlock: Don't setup xen spinlock IPI kicker if disabled.
There is no need to setup this kicker IPI if we are never going
to use the paravirtualized ticketlock mechanism.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
2013-09-09 12:06:38 -04:00
Konrad Rzeszutek Wilk
26a7999527 xen/smp: Update pv_lock_ops functions before alternative code starts under PVHVM
Before this patch we would patch all of the pv_lock_ops sites
using alternative assembler. Then, later in the bootup cycle, we would
change unlock_kick and lock_spinning to the Xen-specific versions
without re-patching.

That meant that for the core of the kernel we would be running
with the baremetal version of unlock_kick and lock_spinning while
for modules we would have the proper Xen specific slowpaths.

As most modules use some API from the core kernel, that ended
up with slowpath lockers waiting forever to be kicked (because they
would be using the Xen-specific slowpath logic). The
kick never came because the unlock path that was taken was the
baremetal one.

On PV we do not have the problem as we initialise before the
alternative code kicks in.

The fix is to make the updating of the pv_lock_ops function
be done before the alternative code starts patching.

Note that this patch fixes issues discovered by commit
f10cd522c5
("xen: disable PV spinlocks on HVM") wherein it mentioned

   PV spinlocks cannot possibly work with the current code because they are
   enabled after pvops patching has already been done, and because PV
   spinlocks use a different data structure than native spinlocks so we
   cannot switch between them dynamically.

The first problem is solved by this patch.

The second problem has been solved by commit
816434ec4a
(Merge branch 'x86-spinlocks-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip)

P.S.
There is still the commit 70dd4998cb
(xen/spinlock: Disable IRQ spinlock (PV) allocation on PVHVM) to
revert but that can be done later after all other bugs have been
fixed.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
2013-09-09 12:06:31 -04:00
Konrad Rzeszutek Wilk
6055aaf87d xen/spinlock: We don't need the old structure anymore
As we are using the generic ticketlock structs and these
old structures are not needed anymore.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
2013-09-09 12:06:24 -04:00
Konrad Rzeszutek Wilk
1fb3a8b2cf xen/spinlock: Fix locking path engaging too soon under PVHVM.
The xen_lock_spinning has a check for the kicker interrupts
and if it is not initialized it will spin normally (not enter
the slowpath).

But for the PVHVM case we would initialize the kicker interrupt
before the CPU came online. This meant that if the booting
CPU used a spinlock and went into the slowpath, it would
block forever. It would block forever because, during bootup,
the spinlock would be taken _before_ the CPU
sets itself to be online (more on this further), and we would
then poll on the event channel forever.

The bootup CPU (see commit fc78d343fa
"xen/smp: initialize IPI vectors before marking CPU online"
for details) and the CPU that started the bootup consult
the cpu_online_mask to determine whether the booting CPU should
get an IPI. The booting CPU has to set itself in this mask via:

  set_cpu_online(smp_processor_id(), true);

However, if the spinlock is taken before this (and it is) and
it polls on an event channel - it will never be woken up as
the kernel will never send an IPI to an offline CPU.

Note that the PVHVM logic in sending IPIs is using the HVM
path which has numerous checks using the cpu_online_mask
and cpu_active_mask. See the above-mentioned git commit for details.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
2013-09-09 12:06:16 -04:00
Konrad Rzeszutek Wilk
65320fceda Merge tag 'v3.11-rc7' into stable/for-linus-3.12
Linux 3.11-rc7

As we need the git commit 28817e9de4f039a1a8c1fe1df2fa2df524626b9e
Author: Chuck Anderson <chuck.anderson@oracle.com>
Date:   Tue Aug 6 15:12:19 2013 -0700

    xen/smp: initialize IPI vectors before marking CPU online

* tag 'v3.11-rc7': (443 commits)
  Linux 3.11-rc7
  ARC: [lib] strchr breakage in Big-endian configuration
  VFS: collect_mounts() should return an ERR_PTR
  bfs: iget_locked() doesn't return an ERR_PTR
  efs: iget_locked() doesn't return an ERR_PTR()
  proc: kill the extra proc_readfd_common()->dir_emit_dots()
  cope with potentially long ->d_dname() output for shmem/hugetlb
  usb: phy: fix build breakage
  USB: OHCI: add missing PCI PM callbacks to ohci-pci.c
  staging: comedi: bug-fix NULL pointer dereference on failed attach
  lib/lz4: correct the LZ4 license
  memcg: get rid of swapaccount leftovers
  nilfs2: fix issue with counting number of bio requests for BIO_EOPNOTSUPP error detection
  nilfs2: remove double bio_put() in nilfs_end_bio_write() for BIO_EOPNOTSUPP error
  drivers/platform/olpc/olpc-ec.c: initialise earlier
  ipv4: expose IPV4_DEVCONF
  ipv6: handle Redirect ICMP Message with no Redirected Header option
  be2net: fix disabling TX in be_close()
  Revert "ACPI / video: Always call acpi_video_init_brightness() on init"
  Revert "genetlink: fix family dump race"
  ...

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2013-09-09 12:05:37 -04:00
Linus Torvalds
89c5a9461d Merge tag 'arc-v3.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc
Pull ARC changes from Vineet Gupta:

 - ARC MM changes:
    - preparation for MMUv4 (accommodate new PTE bits, new cmds)
    - Rework the ASID allocation algorithm to remove asid-mm reverse map
 - Boilerplate code consolidation in Exception Handlers
 - Disable FRAME_POINTER for ARC
 - Unaligned Access Emulation for Big-Endian from Noam
 - Bunch of fixes (udelay, missing accessors) from Mischa

* tag 'arc-v3.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
  ARC: fix new Section mismatches in build (post __cpuinit cleanup)
  Kconfig.debug: Add FRAME_POINTER anti-dependency for ARC
  ARC: Fix __udelay calculation
  ARC: remove console_verbose() from setup_arch()
  ARC: Add read*_relaxed to asm/io.h
  ARC: Handle un-aligned user space access in BE.
  ARC: [ASID] Track ASID allocation cycles/generations
  ARC: [ASID] activate_mm() == switch_mm()
  ARC: [ASID] get_new_mmu_context() to conditionally allocate new ASID
  ARC: [ASID] Refactor the TLB paranoid debug code
  ARC: [ASID] Remove legacy/unused debug code
  ARC: No need to flush the TLB in early boot
  ARC: MMUv4 preps/3 - Abstract out TLB Insert/Delete
  ARC: MMUv4 preps/2 - Reshuffle PTE bits
  ARC: MMUv4 preps/1 - Fold PTE K/U access flags
  ARC: Code cosmetics (Nothing semantical)
  ARC: Entry Handler tweaks: Optimize away redundant IRQ_DISABLE_SAVE
  ARC: Exception Handlers Code consolidation
  ARC: Add some .gitignore entries
2013-09-09 09:05:33 -07:00
Bartlomiej Zolnierkiewicz
2bc552df76 of/platform: add error reporting to of_amba_device_create()
Add error reporting to of_amba_device_create() so the user knows
when (and why) some device tree nodes fail to initialize.

[ The issue was spotted on Universal C210 board (using revision 0 of
  ARM Exynos4210 SoC) on which initialization was silently failing
  for PL330 MDMA1 device tree node (it was using the wrong address
  resulting in amba_device_add() returning -ENODEV). ]

Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Grant Likely <grant.likely@linaro.org>
2013-09-09 17:04:52 +01:00
Linus Torvalds
833ae40b51 Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu
Pull m68knommu fixes from Greg Ungerer:
 "Just a small collection of cleanups and fixes this time, no big
  changes.  The most interesting are to make the m68k and m68knommu
  consistently use CONFIG_IOMAP, clean out some unused board config
  options and flush the cache on signal stack creation"

* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu:
  m68k: remove 16 unused boards in Kconfig.machine
  m68k: define 'VM_DATA_DEFAULT_FLAGS' no matter whether has 'NOMMU' or not
  m68knommu: use generic iomap to support ioread*/iowrite*
  m68k/coldfire: flush cache when creating the signal stack frame
  m68knommu: Mark functions only called from setup_arch() __init
2013-09-09 09:04:46 -07:00
Linus Torvalds
20e029d791 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml
Pull UML updates from Richard Weinberger:
 "This pile contains mostly fixes and improvements for issues identified
  by Richard W M Jones while adding UML as backend to libguestfs"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml:
  um: Add irq chip um/mask handlers
  um: prctl: Do not include linux/ptrace.h
  um: Run UML in its own session.
  um: Cleanup SIGTERM handling
  um: ubd: Introduce submit_request()
  um: ubd: Add REQ_FLUSH support
  um: Implement probe_kernel_read()
  um: hostfs: Fix writeback
2013-09-09 09:03:46 -07:00
Yijing Wang
d84ff46a9e irq/of: Fix comment typo for irq_of_parse_and_map
Fix trivial comment typo for irq_of_parse_and_map().

Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Grant Likely <grant.likely@linaro.org>
2013-09-09 17:03:19 +01:00
Konrad Rzeszutek Wilk
c3f31f6a6f Merge branch 'x86/spinlocks' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into stable/for-linus-3.12
* 'x86/spinlocks' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/kvm/guest: Fix sparse warning: "symbol 'klock_waiting' was not declared as static"
  kvm: Paravirtual ticketlocks support for linux guests running on KVM hypervisor
  kvm guest: Add configuration support to enable debug information for KVM Guests
  kvm uapi: Add KICK_CPU and PV_UNHALT definition to uapi
  xen, pvticketlock: Allow interrupts to be enabled while blocking
  x86, ticketlock: Add slowpath logic
  jump_label: Split jumplabel ratelimit
  x86, pvticketlock: When paravirtualizing ticket locks, increment by 2
  x86, pvticketlock: Use callee-save for lock_spinning
  xen, pvticketlocks: Add xen_nopvspin parameter to disable xen pv ticketlocks
  xen, pvticketlock: Xen implementation for PV ticket locks
  xen: Defer spinlock setup until boot CPU setup
  x86, ticketlock: Collapse a layer of functions
  x86, ticketlock: Don't inline _spin_unlock when using paravirt spinlocks
  x86, spinlock: Replace pv spinlocks with pv ticketlocks
2013-09-09 12:01:15 -04:00
Julien Grall
e1a9c16b30 xen/arm: disable cpuidle and cpufreq when linux is running as dom0
When linux is running as dom0, Xen doesn't show the physical cpu but a
virtual CPU.
On some ARM SoCs (for instance the Exynos 5250), linux registers callbacks
for cpuidle and cpufreq. When these callbacks are called, they directly
modify the physical CPU, not the virtual one. This can impact the whole board
instead of only dom0.

Signed-off-by: Julien Grall <julien.grall@linaro.org>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2013-09-09 11:35:26 +00:00
Boris Ostrovsky
d7f8f48d1e xen/p2m: Don't call get_balloon_scratch_page() twice, keep interrupts disabled for multicalls
m2p_remove_override() calls get_balloon_scratch_page() in
MULTI_update_va_mapping() even though it already has pointer to this page from
the earlier call (in scratch_page). This second call doesn't have a matching
put_balloon_scratch_page() thus not restoring preempt count back. (Also, there
is no put_balloon_scratch_page() in the error path.)

In addition, the second multicall uses __xen_mc_entry() which does not disable
interrupts. Rearrange xen_mc_* calls to keep interrupts off while performing
multicalls.

This commit fixes a regression introduced by:

commit ee0726407f
Author: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Date:   Tue Jul 23 17:23:54 2013 +0000

    xen/m2p: use GNTTABOP_unmap_and_replace to reinstate the original mapping

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
2013-09-09 10:50:52 +00:00
Rob Herring
9dd4b2944c ARM: xen: only set pm function ptrs for Xen guests
xen_pm_init was unconditionally setting pm_power_off and arm_pm_restart
function pointers. This breaks multi-platform kernels. Make this
conditional on running as a Xen guest and make it a late_initcall to
ensure it is setup after platform code for Dom0.

Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
CC: stable@vger.kernel.org
2013-09-09 10:50:26 +00:00
Heiko Carstens
9e75c6274a s390/irq: reduce size of external interrupt handler hash array
Change the hash algorithm a bit so it produces only values in the
range of 0..31.
This allows reducing the size of the external interrupt handler hash
array even further while making sure that each of the known interrupt
sources keeps its unique hash with the slightly modified algorithm:

0x1004 --> 12
0x1201 --> 10
0x1202 --> 11
0x1406 --> 16
0x1407 --> 17
0x2401 --> 19
0x2603 --> 22
0x4000 --> 0

This also means that the entire array now fits into exactly one cache
line; so add a proper align statement as well.
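
The commit text above does not spell out the modified algorithm. As an illustration only, the following shift-add-mask hash happens to reproduce every mapping in the table; treat the formula, EXT_HASH_SIZE and ext_hash() as assumptions rather than a quotation of the patch.

```c
#include <stdint.h>

#define EXT_HASH_SIZE 32        /* table size implied by the 0..31 range */

/*
 * Fold the high bits of the interrupt code into the low bits, then mask
 * to the table size (which must be a power of two).
 */
static inline unsigned ext_hash(uint16_t code)
{
    return (code + (code >> 9)) & (EXT_HASH_SIZE - 1);
}
```

For example, 0x1004 >> 9 is 8, and 0x1004 + 8 = 0x100c, whose low five bits are 12, matching the first row of the table.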

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2013-09-09 08:57:32 +02:00
Ian Kent
ac83871996 autofs4 - fix device ioctl mount lookup
When reconnecting to automounts at startup an autofs ioctl is used
to find the device and inode of existing mounts so they can be used
to open a file descriptor of possibly covered mounts.

At this time the caller might not yet "own" the mount so it can
trigger calling ->d_automount(). This causes automount to hang when
trying to reconnect to direct or offset mount types.

Consequently kern_path() can't be used but kern_path_mountpoint() can be.

Signed-off-by: Ian Kent <raven@themaw.net>
Cc: Jeff Layton <jlayton@redhat.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-09-08 22:07:47 -04:00
Linus Torvalds
e5c832d555 vfs: fix dentry RCU to refcounting possibly sleeping dput()
This is the fix that the last two commits indirectly led up to - making
sure that we don't call dput() in a bad context on the dentries we've
looked up in RCU mode after the sequence count validation fails.

This basically expands d_rcu_to_refcount() into the callers, and then
fixes the callers to delay the dput() in the failure case until _after_
we've dropped all locks and are no longer in an RCU-locked region.

The case of 'complete_walk()' was trivial, since its failure case did
the unlock_rcu_walk() directly after the call to d_rcu_to_refcount(),
and as such that is just a pure expansion of the function with a trivial
movement of the resulting dput() to after 'unlock_rcu_walk()'.

In contrast, the unlazy_walk() case was much more complicated, because
not only does it convert two different dentries from RCU to be reference
counted, but it used to not call unlock_rcu_walk() at all, and instead
just returned an error and let the caller clean everything up in
"terminate_walk()".

Happily, one of the dentries in question (called "parent" inside
unlazy_walk()) is the dentry of "nd->path", which terminate_walk() wants
a refcount to anyway for the non-RCU case.

So what the new and improved unlazy_walk() does is to first turn that
dentry into a refcounted one, and once that is set up, the error cases
can continue to use the terminate_walk() helper for cleanup, but for the
non-RCU case.  Which makes it possible to drop out of RCU mode if we
actually hit the sequence number failure case.

Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-09-08 18:13:49 -07:00
Aaron Lu
9e266ece21 virtio_pci: pm: Use CONFIG_PM_SLEEP instead of CONFIG_PM
The virtio_pci_freeze/restore functions are defined under CONFIG_PM but are
used by the SET_SYSTEM_SLEEP_PM_OPS macro, which is defined under
CONFIG_PM_SLEEP. So if CONFIG_PM_SLEEP is not configured but
CONFIG_PM_RUNTIME is, the following warning messages appear:

drivers/virtio/virtio_pci.c:770:12: warning: ‘virtio_pci_freeze’ defined but not used [-Wunused-function]
 static int virtio_pci_freeze(struct device *dev)
            ^
drivers/virtio/virtio_pci.c:790:12: warning: ‘virtio_pci_restore’ defined but not used [-Wunused-function]
 static int virtio_pci_restore(struct device *dev)
            ^
Fix it by changing CONFIG_PM to CONFIG_PM_SLEEP.
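
The guard mismatch can be sketched in standalone C. Every name below
(demo_device, demo_pm_ops, DEMO_SET_SLEEP_OPS, demo_freeze, demo_restore) is a
stand-in for illustration and not the kernel's definition; the only point is
that the callbacks and the macro that references them sit under the same
symbol.

```c
#include <stddef.h>

/* Stand-in for the Kconfig symbol; delete this line to model a build
 * without system sleep support. */
#define CONFIG_PM_SLEEP 1

struct demo_device {
	int frozen;
};

struct demo_pm_ops {
	const char *name;
	int (*freeze)(struct demo_device *dev);
	int (*restore)(struct demo_device *dev);
};

/* Hypothetical analogue of a sleep-ops macro: it expands to nothing when
 * CONFIG_PM_SLEEP is off, so its arguments are then never referenced. */
#ifdef CONFIG_PM_SLEEP
#define DEMO_SET_SLEEP_OPS(f, r) .freeze = (f), .restore = (r),
#else
#define DEMO_SET_SLEEP_OPS(f, r)
#endif

/* The callbacks use the same guard as the macro, so they are never
 * "defined but not used". */
#ifdef CONFIG_PM_SLEEP
static int demo_freeze(struct demo_device *dev)
{
	dev->frozen = 1;
	return 0;
}

static int demo_restore(struct demo_device *dev)
{
	dev->frozen = 0;
	return 0;
}
#endif

static const struct demo_pm_ops demo_ops = {
	.name = "demo",
	DEMO_SET_SLEEP_OPS(demo_freeze, demo_restore)
};
```

If the callbacks were instead guarded by a broader symbol than the macro, a
configuration with only the broader symbol set would compile the functions but
drop every reference to them, which is the situation the warnings describe.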

Signed-off-by: Aaron Lu <aaron.lu@intel.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-09-09 10:02:53 +09:30
Al Viro
2d86465101 introduce kern_path_mountpoint()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-09-08 20:20:23 -04:00
Al Viro
197df04c74 rename user_path_umountat() to user_path_mountpoint_at()
... and move the extern from linux/namei.h to fs/internal.h,
along with that of vfs_path_lookup().

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-09-08 20:20:21 -04:00
Al Viro
35759521ee take unlazy_walk() into umount_lookup_last()
... and massage it a bit to reduce nesting

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-09-08 20:20:19 -04:00
Linus Torvalds
0d98439ea3 vfs: use lockref "dead" flag to mark unrecoverably dead dentries
This simplifies the RCU to refcounting code in particular.

I was originally intending to leave this for later, but walking through
all the dput() logic (see previous commit), I realized that the dput()
"might_sleep()" check was misleadingly weak.  And I removed it as
misleading, both for performance profiling and for debugging.

However, the might_sleep() debugging case is actually true: the final
dput() can indeed sleep, if the inode of the dentry that you are
releasing ends up sleeping at iput time (see dentry_iput()).  So the
problem with the might_sleep() in dput() wasn't that it wasn't true, it
was that it wasn't actually testing and triggering on the interesting
case.

In particular, just about *any* dput() can indeed sleep, if you happen
to race with another thread deleting the file in question, and you then
lose the race to be the last dput() for that file.  But because it's
a very rare race, the debugging code would never trigger it in practice.

Why is this problematic? The new d_rcu_to_refcount() (see commit
15570086b5: "vfs: reimplement d_rcu_to_refcount() using
lockref_get_or_lock()") does a dput() for the failure case, and it does
it under the RCU lock.  So potentially sleeping really is a bug.

But there's no way I'm going to fix this with the previous complicated
"lockref_get_or_lock()" interface.  And rather than revert to the old
and crufty nested dentry locking code (which did get this right by
delaying the reference count updates until they were verified to be
safe), let's make forward progress.

Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-09-08 13:46:52 -07:00
Linus Torvalds
8aab6a2733 vfs: reorganize dput() memory accesses
This is me being a bit OCD after all the dentry optimization work this
merge window: profiles end up showing 'dput()' as a rather expensive
operation, and there were two unrelated bad reasons for that.

The first reason was reading d_lockref.count for debugging purposes,
which touches the lockref cacheline (for reads) before we really need to.
More importantly, the debugging test in question is _wrong_, and has
hidden bugs.  It's true that we can only sleep when the count goes down
to zero, but the test as-is hides the much more subtle bug that happens
if we race with somebody else deleting the file.

Anyway we _will_ touch that cacheline, but let's do it for a write and
in the right routine (ie in "lockref_put_or_lock()") which annotates the
costs better.  So remove the misleading debug code.

The other was an unnecessary access to the cacheline that contains the
d_lru list, just to check whether we already were on the LRU list or
not.  This is exactly what we have d_flags for, so that we can avoid
touching extra cache lines for the common case.  So just add another bit
for "is this dentry on the LRU".

Finally, mark the tests properly likely/unlikely, so that the common
fast-paths are dense in the instruction stream.

This makes the profiles look much saner.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-09-08 13:26:18 -07:00
Linus Torvalds
b409624ad5 Merge git://git.infradead.org/users/willy/linux-nvme
Pull NVM Express driver update from Matthew Wilcox.

* git://git.infradead.org/users/willy/linux-nvme:
  NVMe: Merge issue on character device bring-up
  NVMe: Handle ioremap failure
  NVMe: Add pci suspend/resume driver callbacks
  NVMe: Use normal shutdown
  NVMe: Separate controller init from disk discovery
  NVMe: Separate queue alloc/free from create/delete
  NVMe: Group pci related actions in functions
  NVMe: Disk stats for read/write commands only
  NVMe: Bring up cdev on set feature failure
  NVMe: Fix checkpatch issues
  NVMe: Namespace IDs are unsigned
  NVMe: Update nvme_id_power_state with latest spec
  NVMe: Split header file into user-visible and kernel-visible pieces
  NVMe: Call nvme_process_cq from submission path
  NVMe: Remove "process_cq did something" message
  NVMe: Return correct value from interrupt handler
  NVMe: Disk IO statistics
  NVMe: Restructure MSI / MSI-X setup
  NVMe: Use kzalloc instead of kmalloc+memset
2013-09-07 20:19:02 -07:00
Linus Torvalds
c4c1725228 Merge tag 'ntb-3.12' of git://github.com/jonmason/ntb
Pull NTB (non-transparent bridge) updates from Jon Mason:
 "NTB driver bug fixes to address issues in NTB-RP enablement, spad,
  debugfs, and USD/DSD identification.

  Add a workaround on Xeon NTB devices for b2bdoorbell errata.  Also,
  add new NTB driver features to support 32bit x86, DMA engine support,
  and NTB-RP support.

  Finally, a few clean-ups and update to MAINTAINERS for the NTB git
  tree and wiki location"

* tag 'ntb-3.12' of git://github.com/jonmason/ntb:
  ntb: clean up unnecessary MSI/MSI-X capability find
  MAINTAINERS: Add Website and Git Tree for NTB
  NTB: Update Version
  NTB: Comment Fix
  NTB: Remove unused variable
  NTB: Remove References of non-B2B BWD HW
  NTB: NTB-RP support
  NTB: Rename Variables for NTB-RP
  NTB: Use DMA Engine to Transmit and Receive
  NTB: Enable 32bit Support
  NTB: Update Device IDs
  NTB: BWD Link Recovery
  NTB: Xeon Errata Workaround
  NTB: Correct debugfs to work with more than 1 NTB Device
  NTB: Correct USD/DSD Identification
  NTB: Correct Number of Scratch Pad Registers
  NTB: Add Error Handling in ntb_device_setup
2013-09-07 20:17:44 -07:00
Linus Torvalds
8de4651abe Merge tag 'mfd-3.12-1' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-next
Pull MFD (multi-function device) updates from Samuel Ortiz:
 "For the 3.12 merge window we have one new driver for the DA9063 PMIC
  from Dialog Semiconductor.

  Besides that driver we also have:

   - Device tree support for the s2mps11 driver

   - More devm_* conversion for the pm8921, max89xx, menelaus, tps65010,
     wl1273 and pcf50633-adc drivers.

   - A conversion to threaded IRQ and IRQ domain for the twl6030 driver.

   - A fairly big update for the rtsx driver: Better power saving
     support, better vendor settings handling, and a few fixes.

   - Support for a couple more boards (COMe-bHL6 and COMe-cTH6) for the
     Kontron driver.

   - A conversion to the dev_get_platdata() API for all MFD drivers.

   - A removal of non-DT (legacy) support for the twl6040 driver.

   - A few fixes and additions (Mic detect level) to the wm5110 register
     tables.

   - Regmap support for the davinci_voicecodec driver.

   - The usual bunch of minor cleanups and janitorial fixes"

* tag 'mfd-3.12-1' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-next: (81 commits)
  mfd: ucb1x00-core: Rewrite ucb1x00_add_dev()
  mfd: ab8500-debugfs: Apply a check for -ENOMEM after allocating memory for event name
  mfd: ab8500-debugfs: Apply a check for -ENOMEM after allocating memory for sysfs
  mfd: timberdale: Use module_pci_driver
  mfd: timberdale: Remove redundant break
  mfd: timberdale: Staticize local variables
  mfd: ab8500-debugfs: Staticize local variables
  mfd: db8500-prcmu: Staticize clk_mgt
  mfd: db8500-prcmu: Use ANSI function declaration
  mfd: omap-usb-host: Staticize usbhs_driver_name
  mfd: 88pm805: Fix potential NULL pdata dereference
  mfd: 88pm800: Fix potential NULL pdata dereference
  mfd: twl6040: Use regmap for register cache
  mfd: davinci_voicecodec: Provide a regmap for register I/O
  mfd: davinci_voicecodec: Remove unused read and write functions
  mmc: memstick: rtsx: Modify copyright comments
  mmc: rtsx: Clear SD_CLK toggle enable bit if switching voltage fail
  mfd: mmc: rtsx: Change default tx phase
  mfd: pcf50633-adc: Use devm_*() functions
  mfd: rtsx: Copyright modifications
  ...
2013-09-07 20:14:19 -07:00
Linus Torvalds
327fff3e13 Merge branch 'misc' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild
Pull misc kbuild updates from Michal Marek:
 "In the kbuild misc branch, I have:
   - make rpm-pkg updates, most importantly the rpm package now calls
     /sbin/installkernel
   - make deb-pkg: debuginfo split, correct kernel image path for
     parisc, mips and powerpc and a couple more minor fixes
   - New coccinelle check"

* 'misc' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
  scripts/checkkconfigsymbols.sh: replace echo -e with printf
  Provide version number for Debian firmware package
  coccinelle: replace 0/1 with false/true in functions returning bool
  deb-pkg: add a hook argument to match debian hooks parameters
  deb-pkg: fix installed image path on parisc, mips and powerpc
  deb-pkg: split debug symbols in their own package
  deb-pkg: use KCONFIG_CONFIG instead of .config file directly
  rpm-pkg: add generation of kernel-devel
  rpm-pkg: install firmware files in kernel relative directory
  rpm-pkg: add %post section to create initramfs and grub hooks
2013-09-07 19:47:35 -07:00
Linus Torvalds
1ff5e37e72 Merge branch 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild
Pull kbuild update from Michal Marek:
 "Only these two commits are in the kbuild branch this time:
   - Using filechk for include/config/kernel.release
   - Cleanup in scripts/sortextable.c"

* 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
  kbuild: Do not overwrite include/config/kernel.release needlessly
  scripts: remove unused function in sortextable.c
2013-09-07 19:46:50 -07:00
Al Viro
4e10f3c988 Kill indirect include of file.h from eventfd.h, use fdget() in cgroup.c
kernel/cgroup.c is the only place in the tree that relies on eventfd.h
pulling file.h; move that include there.  Switch from eventfd_fget()/fput()
to fdget()/fdput(), while we are at it - eventfd_ctx_fileget() will fail
on non-eventfd descriptors just fine, no need to do that check twice...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-09-07 19:54:57 -04:00
Al Viro
d040790391 prune_super(): sb->s_op is never NULL
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-09-07 19:54:56 -04:00
Al Viro
dfc59e2c90 exportfs: don't assume that ->iterate() won't feed us too long entries
On some filesystems it's impossible even with fs corruption, but we'd
better not rely on that, given the memcpy() into an on-stack array that
we are doing there.
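
The kind of defensive check this implies can be sketched as below, assuming a
fixed-size on-stack buffer. DEMO_NAME_MAX, struct demo_name_buf and
demo_copy_name are illustrative names and are not taken from fs/exportfs.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Illustrative limit on the name length the buffer can hold. */
#define DEMO_NAME_MAX 255

struct demo_name_buf {
	char name[DEMO_NAME_MAX + 1];	/* room for the terminating NUL */
};

/*
 * Copy a directory entry name into a fixed-size buffer. The length comes
 * from the filesystem's iterate callback, so it is checked before the
 * memcpy() instead of being trusted.
 */
static bool demo_copy_name(struct demo_name_buf *buf, const char *name,
			   size_t len)
{
	if (len > DEMO_NAME_MAX)
		return false;	/* too long: refuse instead of overflowing */

	memcpy(buf->name, name, len);
	buf->name[len] = '\0';
	return true;
}
```

A caller would typically treat a false return as "skip this entry" or as an
error, depending on how strict it wants to be about corrupted input.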

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-09-07 19:54:55 -04:00
Al Viro
5d8943b04b afs: get rid of redundant ->d_name.len checks
No dentry can get to directory modification methods without
having passed either ->lookup() or ->atomic_open(); if name is
rejected by those two (or by ->d_hash()) with an error, it won't
be seen by anything else.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-09-07 19:54:55 -04:00
Linus Torvalds
e7d33bb5ea lockref: add ability to mark lockrefs "dead"
The only actual current lockref user (dcache) uses zero reference counts
even for perfectly live dentries, because it's a cache: there may not be
any users, but that doesn't mean that we want to throw away the dentry.

At the same time, the dentry cache does have a notion of a truly "dead"
dentry that we must not even increment the reference count of, because
we have pruned it and it is not valid.

Currently that distinction is not visible in the lockref itself, and the
dentry cache validation uses "lockref_get_or_lock()" to either get a new
reference to a dentry that already had existing references (and thus
cannot be dead), or get the dentry lock so that we can then verify the
dentry and increment the reference count under the lock if that
verification was successful.

That's all somewhat complicated.

This adds the concept of being "dead" to the lockref itself, by simply
using a count that is negative.  This allows a usage scenario where we
can increment the refcount of a dentry without having to validate it,
and pushing the special "we killed it" case into the lockref code.

The dentry code itself doesn't actually use this yet, and it's probably
too late in the merge window to do that code (the dentry_kill() code
with its "should I decrement the count" logic really is pretty complex
code), but let's introduce the concept at the lockref level now.
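
The "negative count means dead" idea can be sketched with C11 atomics. This is
a simplified model and not the kernel's lockref: it leaves out the spinlock
that a real lockref carries next to the count, and demo_ref,
demo_ref_get_not_dead, demo_ref_put and demo_ref_mark_dead are hypothetical
names. The value -128 is an arbitrary choice; any negative number would do
here.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct demo_ref {
	/* >= 0: live (0 means cached but currently unused); < 0: dead */
	atomic_int count;
};

/* Take a reference unless the object has been marked dead. */
static bool demo_ref_get_not_dead(struct demo_ref *ref)
{
	int old = atomic_load(&ref->count);

	while (old >= 0) {
		/* On failure 'old' is refreshed with the current value and
		 * the liveness check is repeated before retrying. */
		if (atomic_compare_exchange_weak(&ref->count, &old, old + 1))
			return true;
	}
	return false;	/* dead: the count was not touched */
}

/* Drop a reference; reaching zero does not make the object dead. */
static void demo_ref_put(struct demo_ref *ref)
{
	atomic_fetch_sub(&ref->count, 1);
}

/* Mark the object dead so that later "get" attempts fail. */
static void demo_ref_mark_dead(struct demo_ref *ref)
{
	atomic_store(&ref->count, -128);
}
```

The caller no longer needs a separate validation step after taking the
reference: a count of zero still allows a get, and only an explicitly killed
object refuses one.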

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-09-07 15:49:18 -07:00
Weston Andros Adamson
b1b3e13694 NFSv4: use mach cred for SECINFO_NO_NAME w/ integrity
Commit 97431204ea introduced a regression
that causes SECINFO_NO_NAME to fail without sending an RPC if:

 1) the nfs_client's rpc_client is using krb5i/p (now tried by default)
 2) the current user doesn't have valid kerberos credentials

This situation is quite common - as of now a sec=sys mount would use
krb5i for the nfs_client's rpc_client and a user would hardly be faulted
for not having run kinit.

The solution is to use the machine cred when trying to use an integrity
protected auth flavor for SECINFO_NO_NAME.

Older servers may not support using the machine cred or an integrity
protected auth flavor for SECINFO_NO_NAME in every circumstance, so we fall
back to using the user's cred and the filesystem's auth flavor in this case.

We run into another problem when running against linux nfs servers -
they return NFS4ERR_WRONGSEC when using integrity auth flavor (unless the
mount is also that flavor) even though that is not a valid error for
SECINFO*.  Even though it's against spec, handle WRONGSEC errors on
SECINFO_NO_NAME by falling back to using the user cred and the
filesystem's auth flavor.

Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-09-07 18:39:25 -04:00