Commit Graph

321194 Commits

Author SHA1 Message Date
Linus Torvalds
76159c20c0 Merge branch 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging
Pull a hwmon update from Jean Delvare.

* 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
  hwmon: struct x86_cpu_id arrays can be __initconst
2012-07-30 10:10:26 -07:00
Linus Torvalds
219c673438 Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linux
Pull Exynos DRM changes from Dave Airlie:
 "So I totally missed Inki's pull request for -next, its fully exynos
  self contained."

(I took just the actual commits, not Dave's two extraneous merges)

* 'drm-next' of git://people.freedesktop.org/~airlied/linux: (30 commits)
  drm/exynos: fixed exception to page allocation failure
  drm/exynos: use __free_page() to deallocate memory
  drm/exynos: fixed a comment to gem size.
  drm/exynos: removed unnecessary variable
  drm/exynos: do not release memory region from exporter.
  drm/exynos: set buffer type from exporter.
  drm/exynos: use alloc_page() to allocate pages.
  drm/exynos: fixed build warning.
  drm/exynos: fixed edid data setting at vidi connection request
  drm/exynos: check if raw edid data is fake or not for test
  drm/exynos: set edid fake data only for test.
  drm/exynos: removed unnecessary declaration.
  drm/exynos: fix buffer pitch calculation
  drm/exynos: check for null in return value of dma_buf_map_attachment()
  drm/exynos: return NULL if exynos_pages_to_sg fails
  drm/exynos: Use devm_* functions in exynos_mixer.c
  drm/exynos: Use devm_* functions in exynos_hdmi.c
  drm/exynos: Use devm_* functions in exynos_drm_fimd.c
  drm/exynos: Add missing static storage class specifier
  drm/exynos: add property for crtc mode
  ...
2012-07-30 10:06:23 -07:00
Linus Torvalds
f1115bb686 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input updates from Dmitry Torokhov:
 "A new driver for FT5x06 based EDT displays and a couple of other
  driver changes"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: synaptics - handle out of bounds values from the hardware
  Input: wacom - add support to Cintiq 22HD
  Input: add driver for FT5x06 based EDT displays
2012-07-30 10:01:45 -07:00
Linus Torvalds
76c97e6c75 Merge tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
Pull hwmon fixes from Guenter Roeck:

 - Fix timing problems in applesmc driver

 - Improve device removal in jc42 driver

 - Fix build warning in acpi_power_meter driver

* tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
  hwmon: (applesmc) Decode and act on read/write status codes
  hwmon: (jc42) Don't reset hysteresis on device removal
  hwmon: (jc42) Simplify hysteresis mask
  hwmon: (acpi_power_meter) Fix build warning
2012-07-30 09:58:10 -07:00
Linus Torvalds
8da8533dfb Merge git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-edac
Pull EDAC patches from Mauro Carvalho Chehab:

 - the second part of the EDAC rework:
    - add the sysfs nodes that export the real memory layout, instead
      of the fake one (needed to properly represent Intel memory
      controllers since 2002)
    - convert EDAC MC to use "struct device" instead of creating the
      sysfs nodes via the kobj API
    - add a tracepoint to represent memory errors

 - some cleanup patches

 - some fixes at i5000, i5400 and EDAC core

 - a new EDAC driver for Calxeda.

* git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-edac: (33 commits)
  edac i5000, i5400: fix pointer math in i5000_get_mc_regs()
  edac: allow specifying the error count with fake_inject
  edac: add support for Calxeda highbank L2 cache ecc
  edac: add support for Calxeda highbank memory controller
  edac: create top-level debugfs directory
  sb_edac: properly handle error count
  i7core_edac: properly handle error count
  edac: edac_mc_handle_error(): add an error_count parameter
  edac: remove arch-specific parameter for the error handler
  amd64_edac: Don't pass driver name as an error parameter
  edac_mc: check for allocation failure in edac_mc_alloc()
  edac: Increase version to 3.0.0
  edac_mc: Cleanup per-dimm_info debug messages
  edac: Convert debugfX to edac_dbg(X,
  edac: Use more normal debugging macro style
  edac: Don't add __func__ or __FILE__ for debugf[0-9] msgs
  Edac: Add ABI Documentation for the new device nodes
  edac: move documentation ABI to ABI/testing/sysfs-devices-edac
  i7core_edac: change the mem allocation scheme to make Documentation/kobject.txt happy
  edac: change the mem allocation scheme to make Documentation/kobject.txt happy
  ...
2012-07-30 09:53:50 -07:00
Linus Torvalds
f50f118c49 Merge tag 'boards2' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull arm-soc board updates from Olof Johansson:
 "This branch contains board updates, mostly for shmobile, but also a
  couple for PXA.

  The shmobile platforms are still in the early stages of DT enablement,
  so there's a bit more updates here than we'd ideally want to see:
   - regulator updates to provide some fixed regulators on several
     boards
   - gpio support updates for multiple boards
   - misc updates for recently-introduced boards armadillo800eva and
     kzm9g
   - defconfig updates"

* tag 'boards2' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (37 commits)
  ARM: shmobile: kzm9g: defconfig enable INOTIFY_USER
  ARM: mach-shmobile: armadillo800eva: defconfig Allow use of armhf userspace
  ARM: shmobile: armadillo800eva: A3SP domain includes USB
  ARM: shmobile: armadillo800eva: A4LC domain includes LCDC
  ARM: shmobile: armadillo800eva: USB Func enables external IRQ mode
  ARM: mach-shmobile: kzm9d: Add defconfig
  ARM: mach-shmobile: select the fixed regulator driver on several boards
  ARM: mach-shmobile: add SDHI2 to the 2.8V fixed regulator consumers on kzm9g
  ARM: pxa: hx4700: Use DEFINE_RES_* macros consistently
  ARM: pxa: remove eseries.h
  ARM: mach-shmobile: add fixed voltage regulators to marzen
  ARM: mach-shmobile: add fixed voltage regulators to kzm9g
  ARM: mach-shmobile: add fixed voltage regulators to kzm9d
  ARM: mach-shmobile: add fixed voltage regulators to kota2
  ARM: mach-shmobile: add fixed voltage regulators to g4evm
  ARM: mach-shmobile: add fixed voltage regulators to bonito
  ARM: mach-shmobile: add fixed voltage regulators to armadillo800eva
  ARM: mach-shmobile: add fixed voltage regulators to ap4evb
  ARM: mach-shmobile: add fixed voltage regulators to ag5evm
  ARM: mach-shmobile: add 3.3V and 1.8V fixed regulators to mackerel
  ...
2012-07-30 09:48:00 -07:00
Linus Torvalds
b7574a22a2 Merge tag 'soc2' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull arm-soc soc updates from Olof Johansson:
 "This is the second batch of SoC updates for the 3.6 merge window,
  containing parts that arrived close to the merge window opening and
  thus needed to sit in linux-next for a while.

  Most of the content is updates to Renesas shmobile, with a couple of Samsung
  Exynos patches in the mix."

* tag 'soc2' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (39 commits)
  ARM: S3C64XX: Add header file protection macros in pm-core.h
  [CPUFREQ] EXYNOS5250: Add support max 1.7GHz for EXYNOS5250
  ARM: EXYNOS: Add G2D related clock entries for SMDK4X12
  ARM: EXYNOS: Move G2D clock entries to clock-exynos4210.c file
  ARM: shmobile: Fix build problem in pm-sh7372.c for unusual .config
  ARM: shmobile: Take cpuidle dependencies into account correctly
  ARM: mach-shmobile: sh7377 generic board support via DT
  ARM: mach-shmobile: r8a7740 generic board support via DT
  ARM: shmobile: sh7372: completely switch over to using pm-rmobile API
  ARM: shmobile: ap4evb: switch to using pm-rmobile API
  ARM: shmobile: mackerel: switch to using pm-rmobile API
  ARM: shmobile: sh7372: add pm-rmobile domain support
  ARM: shmobile: r8a7740: add A4LC pm domain support
  ARM: shmobile: r8a7740: add A3SP pm domain support
  ARM: shmobile: r8a7740: add A4S pm domain support
  ARM: shmobile: r8a7740: fixup: MSEL1CR 7bit control
  ARM: shmobile: soc-core: add R-mobile PM domain common APIs
  ARM: shmobile: sh7372 A3SM CPUIdle support
  ARM: shmobile: Use INTCA with sh7372 A3SM power domain
  ARM: mach-shmobile: Convert sh_clk_mstp32_register to sh_clk_mstp_register
  ...
2012-07-30 09:45:53 -07:00
Linus Torvalds
148b729b9f Merge tag 'firewire-updates' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394
Pull firewire updates from Stefan Richter:

 - Small fixes and optimizations.

 - A new sysfs attribute to tell local and remote nodes apart.
   Useful to set special permissions/ ownership of local nodes'
   /dev/fw*, to start daemons on them (for diagnostics, management,
   AV targets, VersaPHY initiator or targets...), to pick up their
   GUID to use it as GUID of an SBP2 target instance, and of course
   for informational purposes.

* tag 'firewire-updates' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394:
  firewire: core: document is_local sysfs attribute
  firewire: core: add is_local sysfs device attribute
  firewire: ohci: initialize multiChanMode bits after reset
  firewire: core: fix multichannel IR with buffers larger than 2 GB
  firewire: ohci: sanity-check MMIO resource
  firewire: ohci: lazy bus time initialization
  firewire: core: allocate the low memory region
  firewire: core: make address handler length 64 bits
2012-07-30 09:32:39 -07:00
Alex Elder
d1f57ea663 rbd: kill num_reply parameters
Several functions include a num_reply parameter, but it is never
used.  Just get rid of it everywhere--it seems to be something
that never got fully implemented.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2012-07-30 09:30:09 -07:00
Alex Elder
43ae470112 rbd: option symbol renames
Use the name "ceph_opts" consistently (rather than just "opt") for
pointers to a ceph_options structure.

Change the few spots that don't use "rbd_opts" for a rbd_options
pointer to match the rest.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2012-07-30 09:30:08 -07:00
Alex Elder
aded07ea9f rbd: more symbol renames
Rename variables named "obj" which represent object names so they're
consistently named "object_name".

Rename the "cls" and "method" parameters in rbd_req_sync_exec()
to be "class_name" and "method_name", and make similar changes
to the names of local variables in that function representing
the lengths of those names.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2012-07-30 09:30:07 -07:00
Alex Elder
0bed54dc9a rbd: rename some fields in struct rbd_dev
An rbd image is not a single object, but a logical construct made up
of an aggregation of objects.

Rename some fields in struct rbd_dev, in hopes of reinforcing this.
    obj         --> image_name
    obj_len     --> image_name_len
    obj_md_name --> header_name

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2012-07-30 09:30:06 -07:00
Alex Elder
0ce1a79413 rbd: use rbd_dev consistently
Most variables that represent a struct rbd_device are named
"rbd_dev", but in some cases "dev" is used instead.  Change all the
"dev" references so they use "rbd_dev" consistently, to make it
clear from the name that we're working with an RBD device (as
opposed to, for example, a struct device).  Similarly, change the
name of the "dev" field in struct rbd_notify_info to be "rbd_dev".

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2012-07-30 09:30:05 -07:00
Alex Elder
820a5f3e94 rbd: dynamically allocate snapshot name
There is no need to impose a small limit on the length of the snapshot
name recorded for an rbd image in a struct rbd_dev.  Remove the
limitation by allocating space for the snapshot name dynamically.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2012-07-30 09:30:04 -07:00
Alex Elder
bf3e5ae112 rbd: dynamically allocate image name
There is no need to impose a small limit on the length of the rbd image
name recorded in a struct rbd_dev.  Remove the limitation by
allocating space for the image name dynamically.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2012-07-30 09:30:04 -07:00
Alex Elder
cb8627c76d rbd: dynamically allocate image header name
There is no need to impose a small limit on the length of the header
name recorded for an rbd image in a struct rbd_dev.  Remove the
limitation by allocating space for the header name dynamically.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2012-07-30 09:30:03 -07:00
Alex Elder
849b4260d4 rbd: dynamically allocate object prefix
There is no need to impose a small limit on the length of the object
prefix recorded for an rbd image in a struct rbd_image_header.
Remove the limitation by allocating space for the object prefix
dynamically.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2012-07-30 09:30:02 -07:00
Alex Elder
d22f76e703 rbd: dynamically allocate pool name
There is no need to impose a small limit on the length of the pool name
recorded for an rbd image in a struct rbd_device.  Remove the
limitation by allocating space for the pool name dynamically.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2012-07-30 09:30:01 -07:00
Alex Elder
9bb2f334b9 rbd: create pool_id device attribute
Add an entry under /sys/bus/rbd/devices/<N>/ named "pool_id" that
provides the id for the pool the rbd image is associated with.  This
is in addition to the pool name already provided.

Rename the "poolid" field in struct rbd_device  to be "pool_id".

Update the documentation to reflect the addition of this new entry.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2012-07-30 09:30:00 -07:00
Alex Elder
ca1e49a6af rbd: rename rbd_dev->block_name
Each rbd image has a name that forms the basis of all data objects
backing the device.  Old (format 1) images refer to this name as the
"block name," while new (format 2) images use the term "object
prefix" for this.

Change the field name in the in-core rbd image header structure to
reflect the more modern usage.  We intentionally keep the name
"block_name" in the on-disk definition for format 1 image headers.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2012-07-30 09:29:59 -07:00
Alex Elder
ea3352f4aa rbd: define dup_token()
Define a new function dup_token(), to be used during argument
parsing for making dynamically-allocated copies of tokens being
parsed.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2012-07-30 09:29:58 -07:00
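To illustrate the idea behind the dup_token() commit above, here is a minimal userspace sketch of copying the next token out of an argument string into a dynamically allocated buffer. It is not the kernel implementation: the name `dup_token_sketch`, its signature, the use of `malloc()`, and the whitespace-delimited token rule are all assumptions made for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical userspace analogue of a token-duplicating helper.
 *
 * Skips leading whitespace in *buf, copies the next run of
 * non-whitespace characters into a freshly allocated, NUL-terminated
 * buffer, and advances *buf past that token.
 *
 * Returns NULL only if allocation fails.  When no token is left, an
 * empty string is returned.  The caller frees the result.  If lenp is
 * non-NULL it receives the token length.
 */
char *dup_token_sketch(const char **buf, size_t *lenp)
{
    const char *p;
    size_t len = 0;
    char *copy;

    assert(buf != NULL && *buf != NULL);

    /* Skip whitespace in front of the token. */
    p = *buf;
    while (*p != '\0' && isspace((unsigned char)*p))
        p++;

    /* Measure the token itself. */
    while (p[len] != '\0' && !isspace((unsigned char)p[len]))
        len++;

    copy = malloc(len + 1);
    if (copy == NULL)
        return NULL;
    memcpy(copy, p, len);
    copy[len] = '\0';

    *buf = p + len;
    if (lenp != NULL)
        *lenp = len;
    return copy;
}
```

Because each copy is sized to its token, a parser built this way needs no fixed-size name buffers, which is the property the later "dynamically allocate ..." commits in this series rely on.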
Alex Elder
f8c36c58ac libceph: define ceph_extract_encoded_string()
This adds a new utility routine which will return a dynamically-
allocated buffer containing a string that has been decoded from ceph
over-the-wire format.  It also returns the length of the string
if the address of a size variable is supplied to receive it.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2012-07-30 09:29:57 -07:00
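The decoding routine described in the commit above can be sketched in userspace as follows. This is an illustration only: it assumes the wire format is a 32-bit little-endian length followed by that many bytes, and the name `extract_encoded_string_sketch`, its signature, and its error convention (NULL on truncated input or allocation failure) are hypothetical rather than taken from libceph.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical sketch: decode one length-prefixed string.
 *
 * *p points at the encoded data and end points one past the last
 * valid byte.  On success, returns a newly allocated NUL-terminated
 * copy, advances *p past the string, and stores the length in *lenp
 * if lenp is non-NULL.  On failure, returns NULL and leaves *p alone.
 */
char *extract_encoded_string_sketch(const unsigned char **p,
                                    const unsigned char *end,
                                    size_t *lenp)
{
    const unsigned char *cur;
    uint32_t len;
    char *out;

    assert(p != NULL && *p != NULL && *p <= end);
    cur = *p;

    /* Need four bytes for the length prefix. */
    if ((size_t)(end - cur) < 4)
        return NULL;
    len = (uint32_t)cur[0] | ((uint32_t)cur[1] << 8) |
          ((uint32_t)cur[2] << 16) | ((uint32_t)cur[3] << 24);
    cur += 4;

    /* Reject a length that runs past the end of the buffer. */
    if ((size_t)(end - cur) < len)
        return NULL;

    out = malloc((size_t)len + 1);
    if (out == NULL)
        return NULL;
    memcpy(out, cur, len);
    out[len] = '\0';

    *p = cur + len;
    if (lenp != NULL)
        *lenp = len;
    return out;
}
```

Returning the length through an optional pointer mirrors the commit's description: callers that only want the string pass NULL.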
Alex Elder
ad4f232f28 rbd: drop a useless local variable
In rbd_req_sync_notify_ack(), a local variable was needlessly being
used to hold a null pointer.  Just pass NULL instead.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2012-07-30 09:29:56 -07:00
Alex Elder
c61a1abd21 libceph: fix off-by-one bug in ceph_encode_filepath()
There is a BUG_ON() call that doesn't account for the single byte
structure version at the start of an encoded filepath in
ceph_encode_filepath().  Fix that.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2012-07-30 09:29:55 -07:00
Sage Weil
8842b3be96 ceph: clean up useless d_parent checks
d_parent is never NULL, and IS_ROOT() is the proper way to check for a
(non-self-referential) parent.

Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Sage Weil <sage@inktank.com>
2012-07-30 09:29:54 -07:00
Guanjun He
a2a3258417 libceph: prevent the race of incoming work during teardown
Add an atomic variable 'stopping' as a flag in struct ceph_messenger,
set this flag to 1 in function ceph_destroy_client(), and add a check in
function ceph_data_ready() that tests the flag and, if it is set (1), just returns.

Signed-off-by: Guanjun He <gjhe@suse.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2012-07-30 09:29:53 -07:00
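The "stopping" flag pattern from the commit above can be sketched in userspace with C11 atomics. This is a simplified illustration, not the libceph code: the struct `messenger_sketch`, the `handled` counter, and the function names are hypothetical, and a real teardown path would also need to wait for work already in flight.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a messenger with a teardown flag. */
struct messenger_sketch {
    atomic_int stopping;   /* set to 1 once teardown has begun */
    int handled;           /* number of events actually processed */
};

void messenger_init(struct messenger_sketch *m)
{
    assert(m != NULL);
    atomic_init(&m->stopping, 0);
    m->handled = 0;
}

/* Called from the teardown path: refuse new incoming work from now on. */
void messenger_stop(struct messenger_sketch *m)
{
    assert(m != NULL);
    atomic_store(&m->stopping, 1);
}

/*
 * Called when data arrives.  Returns false without queuing any work
 * if teardown has started, true if the event was handled.
 */
bool data_ready(struct messenger_sketch *m)
{
    assert(m != NULL);
    if (atomic_load(&m->stopping))
        return false;
    m->handled++;
    return true;
}
```

The flag is checked at the entry point for incoming data, so events that arrive after teardown begins are dropped instead of scheduling work against a structure that is going away.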
Sage Weil
a16cb1f707 libceph: fix messenger retry
In ancient times, the messenger could both initiate and accept connections.
An artifact of that was data structures to store/process an incoming
ceph_msg_connect request and send an outgoing ceph_msg_connect_reply.
Sadly, the negotiation code was referencing those structures and ignoring
important information (like the peer's connect_seq) from the correct ones.

Among other things, this fixes tight reconnect loops where the server sends
RETRY_SESSION and we (the client) retry with the same connect_seq as last
time.  This bug is pretty easily triggered by injecting socket failures on the
MDS and running some fs workload like workunits/direct_io/test_sync_io.

Signed-off-by: Sage Weil <sage@inktank.com>
2012-07-30 09:29:52 -07:00
Sage Weil
cd43045c2d libceph: initialize rb, list nodes in ceph_osd_request
These don't strictly need to be initialized based on how they are used, but
it is good practice to do so.

Reported-by: Alex Elder <elder@inktank.com>
Signed-off-by: Sage Weil <sage@inktank.com>
2012-07-30 09:29:51 -07:00
Sage Weil
d50b409fb8 libceph: initialize msgpool message types
Initialize the type field for messages in a msgpool.  The caller was doing
this for osd ops, but not for the reply messages.

Reported-by: Alex Elder <elder@inktank.com>
Signed-off-by: Sage Weil <sage@inktank.com>
2012-07-30 09:29:50 -07:00
Linus Torvalds
9ec97169e7 Merge branch 'for-3.6' of git://gitorious.org/linux-pwm/linux-pwm
Pull PWM subsystem from Thierry Reding:
 "The new PWM subsystem aims at collecting all implementations of the
  legacy PWM API and to eventually replace it completely.

  The subsystem has been in development for over half a year now and
  many drivers have already been converted.  It has been in linux-next
  for a couple of weeks and there have been no major issues so I think
  it is ready for inclusion in your tree."

Arnd Bergmann <arnd@arndb.de>:
 "Very much Ack on the new subsystem.  It uses the interface
  declarations as the previously separate pwm drivers, so nothing
  changes for now in the drivers using it, although it enables us to
  change those more easily in the future if we want to.

  This work is also one of the missing pieces that are required to
  eventually build ARM kernels for multiple platforms, which is
  currently prohibited (amongst other things) by the fact that you cannot
  have more than one driver exporting the pwm functions."

Tested-and-acked-by: Alexandre Courbot <acourbot@nvidia.com>
Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Acked-by: Philip, Avinash <avinashphilip@ti.com> # TI's AM33xx platforms
Acked-By: Alexandre Pereira da Silva <aletes.xgr@gmail.com> # LPC32XX
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Sachin Kamat <sachin.kamat@linaro.org>

Fix up trivial conflicts with other cleanups and DT updates.

* 'for-3.6' of git://gitorious.org/linux-pwm/linux-pwm: (36 commits)
  pwm: pwm-tiehrpwm: PWM driver support for EHRPWM
  pwm: pwm-tiecap: PWM driver support for ECAP APWM
  pwm: fix used-uninitialized warning in pwm_get()
  pwm: add lpc32xx PWM support
  pwm_backlight: pass correct brightness to callback
  pwm: Use pr_* functions in pwm-samsung.c file
  pwm: Convert pwm-samsung to use devm_* APIs
  pwm: Convert pwm-tegra to use devm_clk_get()
  pwm: pwm-mxs: Return proper error if pwmchip_remove() fails
  pwm: pwm-bfin: Return proper error if pwmchip_remove() fails
  pwm: pxa: Propagate pwmchip_remove() error
  pwm: Convert pwm-pxa to use devm_* APIs
  pwm: Convert pwm-vt8500 to use devm_* APIs
  pwm: Convert pwm-imx to use devm_* APIs
  pwm: Conflict with legacy PWM API
  pwm: pwm-mxs: add pinctrl support
  pwm: pwm-mxs: use devm_* managed functions
  pwm: pwm-mxs: use global reset function stmp_reset_block
  pwm: pwm-mxs: encode soc name in compatible string
  pwm: Take over maintainership of the PWM subsystem
  ...
2012-07-30 09:22:37 -07:00
Hans Verkuil
9bc31633c2 [media] Fix VIDIOC_TRY_EXT_CTRLS regression
Fixes an omission in the new v4l2_ioctls table: VIDIOC_TRY_EXT_CTRLS
must get the INFO_FL_CTRL flag, just like all the other control
related ioctls, otherwise the ioctl core won't know it also has
to check whether v4l2_fh->ctrl_handler is non-zero before it can
decide that this ioctl is not implemented.

Caught by v4l2-compliance while I was testing the mem2mem_testdev driver.

Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2012-07-30 11:55:24 -03:00
Roland Dreier
1da9b6b43e Merge branches 'cma', 'ipoib', 'ocrdma' and 'qib' into for-next
2012-07-30 07:47:27 -07:00
Shlomo Pongratz
b63b70d877 IPoIB: Use a private hash table for path lookup in xmit path
Dave Miller <davem@davemloft.net> provided a detailed description of
why the way IPoIB is using neighbours for its own ipoib_neigh struct
is buggy:

    Any time an ipoib_neigh is changed, a sequence like the following is made:

    			spin_lock_irqsave(&priv->lock, flags);
    			/*
    			 * It's safe to call ipoib_put_ah() inside
    			 * priv->lock here, because we know that
    			 * path->ah will always hold one more reference,
    			 * so ipoib_put_ah() will never do more than
    			 * decrement the ref count.
    			 */
    			if (neigh->ah)
    				ipoib_put_ah(neigh->ah);
    			list_del(&neigh->list);
    			ipoib_neigh_free(dev, neigh);
    			spin_unlock_irqrestore(&priv->lock, flags);
    			ipoib_path_lookup(skb, n, dev);

    This doesn't work, because you're leaving a stale pointer to the freed up
    ipoib_neigh in the special neigh->ha pointer cookie.  Yes, it even fails
    with all the locking done to protect _changes_ to *ipoib_neigh(n), and
    with the code in ipoib_neigh_free() that NULLs out the pointer.

    The core issue is that read side calls to *to_ipoib_neigh(n) are not
    being synchronized at all, they are performed without any locking.  So
    whether we hold the lock or not when making changes to *ipoib_neigh(n)
    you still can have threads see references to freed up ipoib_neigh
    objects.

    	cpu 1			cpu 2
    	n = *ipoib_neigh()
    				*ipoib_neigh() = NULL
    				kfree(n)
    	n->foo == OOPS

    [..]

    Perhaps the ipoib code can have a private path database it manages
    entirely itself, which holds all the necessary information and is
    looked up by some generic key which is available easily at transmit
    time and does not involve generic neighbour entries.

See <http://marc.info/?l=linux-rdma&m=132812793105624&w=2> and
<http://marc.info/?l=linux-rdma&w=2&r=1&s=allows+references+to+freed+memory&q=b>
for the full discussion.

This patch aims to solve the race conditions found in the IPoIB driver.

The patch removes the connection between the core networking neighbour
structure and the ipoib_neigh structure.  In addition to avoiding the
race described above, it allows us to handle SKBs carrying IP packets
that don't have any associated neighbour.

We add an ipoib_neigh hash table with N buckets where the key is the
destination hardware address.  The ipoib_neigh is fetched from the
hash table instead of from the stashed location in the neighbour
structure. The hash table uses both RCU and reference counting to
guarantee that no ipoib_neigh instance is ever deleted while in use.

Fetching the ipoib_neigh structure instance from the hash also makes
the special code in ipoib_start_xmit that handles remote and local
bonding failover redundant.

Aged ipoib_neigh instances are deleted by a garbage collection task
that runs every M seconds and deletes every ipoib_neigh instance that
was idle for at least 2*M seconds. The deletion is safe since the
ipoib_neigh instances are protected using RCU and reference count
mechanisms.

The number of buckets (N) and frequency of running the GC thread (M)
are taken from the exported arp_tbl.

Signed-off-by: Shlomo Pongratz <shlomop@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-07-30 07:46:50 -07:00
Marek Szyprowski
97ef952a20 ARM: dma-mapping: add support for DMA_ATTR_SKIP_CPU_SYNC attribute
This patch adds support for the DMA_ATTR_SKIP_CPU_SYNC attribute for the
dma_(un)map_(single,page,sg) family of functions. It lets dma mapping clients
create a mapping for the buffer for the given device without performing
a CPU cache synchronization. CPU cache synchronization can be skipped for
buffers that are known to be already in the 'device' domain (CPU
caches have been already synchronized or there are only coherent mappings
for the buffer). For advanced users only, please use it with care.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Kyungmin Park <kyungmin.park@samsung.com>
2012-07-30 12:25:47 +02:00
Marek Szyprowski
bdf5e4871f common: DMA-mapping: add DMA_ATTR_SKIP_CPU_SYNC attribute
This patch adds DMA_ATTR_SKIP_CPU_SYNC attribute to the DMA-mapping
subsystem.

By default dma_map_{single,page,sg} functions family transfer a given
buffer from CPU domain to device domain. Some advanced use cases might
require sharing a buffer between more than one device. This requires
having a mapping created separately for each device and is usually
performed by calling dma_map_{single,page,sg} function more than once
for the given buffer with device pointer to each device taking part in
the buffer sharing. The first call transfers a buffer from 'CPU' domain
to 'device' domain, which synchronizes CPU caches for the given region
(usually it means that the cache has been flushed or invalidated
depending on the dma direction). However, next calls to
dma_map_{single,page,sg}() for other devices will perform exactly the
same synchronization operation on the CPU cache. CPU cache synchronization
might be a time consuming operation, especially if the buffers are
large, so it is highly recommended to avoid it if possible.
DMA_ATTR_SKIP_CPU_SYNC allows platform code to skip synchronization of
the CPU cache for the given buffer assuming that it has been already
transferred to 'device' domain. This attribute can be also used for
dma_unmap_{single,page,sg} functions family to force buffer to stay in
device domain after releasing a mapping for it. Use this attribute with
care!

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Kyungmin Park <kyungmin.park@samsung.com>
2012-07-30 12:25:47 +02:00
Marek Szyprowski
dc2832e1e7 ARM: dma-mapping: add support for dma_get_sgtable()
This patch adds support for the dma_get_sgtable() function, which is required
to let drivers share the buffers allocated by the DMA-mapping subsystem.

Generic implementation based on virt_to_page() is not suitable for ARM
dma-mapping subsystem.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Kyungmin Park <kyungmin.park@samsung.com>
2012-07-30 12:25:47 +02:00
Marek Szyprowski
d2b7428eb0 common: dma-mapping: introduce dma_get_sgtable() function
This patch adds the dma_get_sgtable() function, which is required to let
drivers share the buffers allocated by the DMA-mapping subsystem. Right
now the driver gets a dma address of the allocated buffer and the kernel
virtual mapping for it. If it wants to share it with another device (= map
into its dma address space) it usually hacks around kernel virtual
addresses to get pointers to pages or assumes that both devices share
the DMA address space. Both solutions are just hacks for the special
cases, which should be avoided in the final version of buffer sharing.

To solve this issue in a generic way, a new call to DMA mapping has been
introduced - dma_get_sgtable(). It allocates a scatter-list which
describes the allocated buffer and lets the driver(s) use it with
other device(s) by calling dma_map_sg() on it.

This patch provides a generic implementation based on virt_to_page()
call. Architectures which require more sophisticated translation might
provide their own get_sgtable() methods.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Kyungmin Park <kyungmin.park@samsung.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-07-30 12:25:46 +02:00
Marek Szyprowski
955c757e09 ARM: dma-mapping: add support for DMA_ATTR_NO_KERNEL_MAPPING attribute
This patch adds support for the DMA_ATTR_NO_KERNEL_MAPPING attribute for
IOMMU allocations, which lets drivers save precious kernel virtual
address space for large buffers that are intended to be accessed only
from userspace.

This patch is heavily based on initial work kindly provided by Abhinav
Kochhar <abhinav.k@samsung.com>.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Kyungmin Park <kyungmin.park@samsung.com>
2012-07-30 12:25:46 +02:00
Marek Szyprowski
d5724f172f common: DMA-mapping: add DMA_ATTR_NO_KERNEL_MAPPING attribute
This patch adds the DMA_ATTR_NO_KERNEL_MAPPING attribute, which lets the
platform avoid creating a kernel virtual mapping for the allocated
buffer. On some architectures creating such a mapping is a non-trivial task
and consumes very limited resources (like kernel virtual address space
or dma consistent address space). Buffers allocated with this attribute
can only be passed to user space by calling dma_mmap_attrs().

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Kyungmin Park <kyungmin.park@samsung.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-07-30 12:25:46 +02:00
Marek Szyprowski
64ccc9c033 common: dma-mapping: add support for generic dma_mmap_* calls
Commit 9adc5374 ('common: dma-mapping: introduce mmap method') added a
generic method for implementing mmap user call to dma_map_ops structure.

This patch converts ARM and PowerPC architectures (the only providers of
dma_mmap_coherent/dma_mmap_writecombine calls) to use this generic
dma_map_ops based call and adds a generic cross architecture
definition for dma_mmap_attrs, dma_mmap_coherent, dma_mmap_writecombine
functions.

The generic mmap virt_to_page-based fallback implementation is provided for
architectures which don't provide their own implementation for mmap method.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Kyungmin Park <kyungmin.park@samsung.com>
2012-07-30 12:25:46 +02:00
Marek Szyprowski
9fa8af91f0 ARM: dma-mapping: fix error path for memory allocation failure
This patch fixes an incorrect check in the error path. When the allocation of
the first page fails, a kernel oops occurs due to accessing element -1 of
the pages array.

Reported-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2012-07-30 12:25:45 +02:00
Marek Szyprowski
50262a4bf3 ARM: dma-mapping: add more sanity checks in arm_dma_mmap()
Add some sanity checks and forbid mmapping of buffers into vma areas larger
than the allocated dma buffer.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2012-07-30 12:25:45 +02:00
Marek Szyprowski
e9da6e9905 ARM: dma-mapping: remove custom consistent dma region
This patch changes the dma-mapping subsystem to use generic vmalloc areas
for all consistent dma allocations. This increases the total size limit
of the consistent allocations and removes platform hacks and a lot of
duplicated code.

Atomic allocations are served from a special pool preallocated at boot,
because vmalloc areas cannot be reliably created in atomic context.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Kyungmin Park <kyungmin.park@samsung.com>
Reviewed-by: Minchan Kim <minchan@kernel.org>
2012-07-30 12:25:45 +02:00
Marek Szyprowski
5e6cafc83e mm: vmalloc: use const void * for caller argument
'const void *' is a safer type for the caller argument. This patch
updates all references to the caller argument type.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Kyungmin Park <kyungmin.park@samsung.com>
Reviewed-by: Minchan Kim <minchan@kernel.org>
2012-07-30 12:25:44 +02:00
Tomasz Stanislawski
efc42bc980 scatterlist: add sg_alloc_table_from_pages function
This patch adds a new constructor for an sg table. The table is constructed
from an array of struct pages. Each contiguous chunk of pages is merged
into a single sg node. A user may provide an offset and a size of a buffer if
the buffer is not page-aligned.

The function is intended for DMABUF exporters, which often perform conversion
from a page array to a scatterlist. Moreover, the scatterlist should be
squashed in order to save memory and to speed up the process of DMA mapping
using dma_map_sg.

The code is based on the patch 'v4l: vb2-dma-contig: add support for
scatterlist in userptr mode' and hints from Laurent Pinchart.
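
The merging of contiguous pages described above can be illustrated with a
small user-space sketch. It assumes pages are represented by their page
frame numbers and only counts the sg nodes such a merge would need;
count_chunks() is a hypothetical helper, not the kernel function.

```c
#include <stddef.h>

/*
 * Count the physically contiguous runs in an array of page frame
 * numbers. Each run would become one node of the merged table.
 */
static size_t count_chunks(const unsigned long *pfns, size_t n)
{
	size_t chunks, i;

	if (n == 0)
		return 0;

	chunks = 1;
	for (i = 1; i < n; i++) {
		/* A new chunk starts wherever the frame numbers break. */
		if (pfns[i] != pfns[i - 1] + 1)
			chunks++;
	}
	return chunks;
}
```

For a fully contiguous buffer this yields a single node regardless of the
number of pages, which is where the memory saving comes from.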

Signed-off-by: Tomasz Stanislawski <t.stanislaws@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
CC: Andrew Morton <akpm@linux-foundation.org>
2012-07-30 12:25:44 +02:00
Shuah Khan
73a1180e14 mm: Fix build warning in kmem_cache_create()
The label oops is used inside a CONFIG_DEBUG_VM ifdef block but is
defined outside any CONFIG_DEBUG_VM ifdef block. This results in the
following build warning when built with CONFIG_DEBUG_VM disabled. Fix
this by moving the definition of label oops inside a CONFIG_DEBUG_VM
block.

mm/slab_common.c: In function ‘kmem_cache_create’:
mm/slab_common.c:101:1: warning: label ‘oops’ defined but not used
[-Wunused-label]
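
The pattern can be sketched in plain C as follows. SKETCH_DEBUG is a
hypothetical stand-in for CONFIG_DEBUG_VM and create_sketch() is not the
kernel function; the point is only that the label sits under the same
ifdef as the goto that uses it.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical creation routine with optional debug-only checks. */
static int create_sketch(const char *name)
{
	int ret = 0;

#ifdef SKETCH_DEBUG
	/* Debug-only sanity check: the only user of the label below. */
	if (!name || strlen(name) == 0) {
		ret = -1;
		goto oops;
	}
#endif

	/* ... the normal creation work would go here ... */
	(void)name;

#ifdef SKETCH_DEBUG
oops:	/* Defined only when the goto above is compiled in. */
#endif
	return ret;
}
```

Built without SKETCH_DEBUG, neither the goto nor the label exists, so the
compiler has no unused label to warn about.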

Signed-off-by: Shuah Khan <shuah.khan@hp.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2012-07-30 13:15:40 +03:00
Jan Beulich
e273bd98c9 hwmon: struct x86_cpu_id arrays can be __initconst
... as being referenced from __init code only.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
2012-07-30 11:33:00 +02:00
Oleg Nesterov
194f8dcbe9 uprobes: __replace_page() needs munlock_vma_page()
Like do_wp_page(), __replace_page() should do munlock_vma_page()
for the case when the old page still has other !VM_LOCKED
mappings. Unfortunately this needs mm/internal.h.

Also, move put_page() outside of ptl lock. This doesn't really
matter but looks a bit better.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Srikar Dronamraju <srikar.vnet.ibm.com>
Cc: Anton Arapov <anton@redhat.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/20120729182249.GA20372@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-07-30 11:27:25 +02:00
Oleg Nesterov
57683f72b8 uprobes: Rename vma_address() and make it return "unsigned long"
1. vma_address() returns loff_t, this looks confusing and this
   is unnecessary after the previous change. Make it return "ulong",
   all callers truncate the result anyway.

2. Its name conflicts with mm/rmap.c:vma_address(), rename it to
   offset_to_vaddr(), this matches vaddr_to_offset().

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Srikar Dronamraju <srikar.vnet.ibm.com>
Cc: Anton Arapov <anton@redhat.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/20120729182247.GA20365@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-07-30 11:27:25 +02:00
Oleg Nesterov
f4d6dfe551 uprobes: Fix register_for_each_vma()->vma_address() check
1. register_for_each_vma() checks that vma_address() == vaddr,
   but this is not enough. We should also ensure that
   vaddr >= vm_start, find_vma() guarantees "vaddr < vm_end" only.

2. After the previous changes, register_for_each_vma() is the
   only reason why vma_address() has to return loff_t, all other
   users know that we have the valid mapping at this offset and
   thus the overflow is not possible.

   Change the code to use vaddr_to_offset() instead, imho this looks
   more clean/understandable and now we can change vma_address().

3. While at it, remove the unnecessary type-cast.
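
The check from points 1 and 2 can be sketched in user space as below.
struct vma_sketch, SKETCH_PAGE_SHIFT and both functions are simplified,
hypothetical stand-ins for the kernel's types and helpers; the upper
bound, which the message says find_vma() already guarantees, is
re-checked here only because the sketch has no lookup step.

```c
#include <stdbool.h>

#define SKETCH_PAGE_SHIFT 12	/* assumed 4 KiB pages */

/* Simplified stand-in for the fields of a vma used by the check. */
struct vma_sketch {
	unsigned long vm_start;	/* first address of the mapping */
	unsigned long vm_end;	/* first address past the mapping */
	unsigned long vm_pgoff;	/* file offset of vm_start, in pages */
};

/* File offset that corresponds to vaddr inside this mapping. */
static long long vaddr_to_offset_sketch(const struct vma_sketch *vma,
					unsigned long vaddr)
{
	return ((long long)vma->vm_pgoff << SKETCH_PAGE_SHIFT) +
	       (long long)(vaddr - vma->vm_start);
}

/* True if vaddr lies in the vma and maps exactly the given file offset. */
static bool vma_matches(const struct vma_sketch *vma, unsigned long vaddr,
			long long offset)
{
	/*
	 * Without the lower-bound test, vaddr - vm_start would wrap
	 * around for an address below the mapping.
	 */
	if (vaddr < vma->vm_start || vaddr >= vma->vm_end)
		return false;

	return vaddr_to_offset_sketch(vma, vaddr) == offset;
}
```

Comparing offsets rather than addresses keeps the arithmetic inside the
mapping, so the address-returning helper no longer needs a wide return
type.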

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Srikar Dronamraju <srikar.vnet.ibm.com>
Cc: Anton Arapov <anton@redhat.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/20120729182244.GA20362@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-07-30 11:27:24 +02:00