Commit Graph

16306 Commits

Author SHA1 Message Date
Linus Torvalds
888a6f77e0 Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (52 commits)
  sched: fix RCU lockdep splat from task_group()
  rcu: using ACCESS_ONCE() to observe the jiffies_stall/rnp->qsmask value
  sched: suppress RCU lockdep splat in task_fork_fair
  net: suppress RCU lockdep false positive in sock_update_classid
  rcu: move check from rcu_dereference_bh to rcu_read_lock_bh_held
  rcu: Add advice to PROVE_RCU_REPEATEDLY kernel config parameter
  rcu: Add tracing data to support queueing models
  rcu: fix sparse errors in rcutorture.c
  rcu: only one evaluation of arg in rcu_dereference_check() unless sparse
  kernel: Remove undead ifdef CONFIG_DEBUG_LOCK_ALLOC
  rcu: fix _oddness handling of verbose stall warnings
  rcu: performance fixes to TINY_PREEMPT_RCU callback checking
  rcu: upgrade stallwarn.txt documentation for CPU-bound RT processes
  vhost: add __rcu annotations
  rcu: add comment stating that list_empty() applies to RCU-protected lists
  rcu: apply TINY_PREEMPT_RCU read-side speedup to TREE_PREEMPT_RCU
  rcu: combine duplicate code, courtesy of CONFIG_PREEMPT_RCU
  rcu: Upgrade srcu_read_lock() docbook about SRCU grace periods
  rcu: document ways of stalling updates in low-memory situations
  rcu: repair code-duplication FIXMEs
  ...
2010-10-21 12:54:12 -07:00
Linus Torvalds
a8fe150098 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6: (26 commits)
  selinux: include vmalloc.h for vmalloc_user
  secmark: fix config problem when CONFIG_NF_CONNTRACK_SECMARK is not set
  selinux: implement mmap on /selinux/policy
  SELinux: allow userspace to read policy back out of the kernel
  SELinux: drop useless (and incorrect) AVTAB_MAX_SIZE
  SELinux: deterministic ordering of range transition rules
  kernel: roundup should only reference arguments once
  kernel: rounddown helper function
  secmark: export secctx, drop secmark in procfs
  conntrack: export lsm context rather than internal secid via netlink
  security: secid_to_secctx returns len when data is NULL
  secmark: make secmark object handling generic
  secmark: do not return early if there was no error
  AppArmor: Ensure the size of the copy is < the buffer allocated to hold it
  TOMOYO: Print URL information before panic().
  security: remove unused parameter from security_task_setscheduler()
  tpm: change 'tpm_suspend_pcr' to be module parameter
  selinux: fix up style problem on /selinux/status
  selinux: change to new flag variable
  selinux: really fix dependency causing parallel compile failure.
  ...
2010-10-21 12:41:19 -07:00
Linus Torvalds
2017bd1945 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (22 commits)
  ceph: do not carry i_lock for readdir from dcache
  fs/ceph/xattr.c: Use kmemdup
  rbd: passing wrong variable to bvec_kunmap_irq()
  rbd: null vs ERR_PTR
  ceph: fix num_pages_free accounting in pagelist
  ceph: add CEPH_MDS_OP_SETDIRLAYOUT and associated ioctl.
  ceph: don't crash when passed bad mount options
  ceph: fix debugfs warnings
  block: rbd: removing unnecessary test
  block: rbd: fixed may leaks
  ceph: switch from BKL to lock_flocks()
  ceph: preallocate flock state without locks held
  ceph: add pagelist_reserve, pagelist_truncate, pagelist_set_cursor
  ceph: use mapping->nrpages to determine if mapping is empty
  ceph: only invalidate on check_caps if we actually have pages
  ceph: do not hide .snap in root directory
  rbd: introduce rados block device (rbd), based on libceph
  ceph: factor out libceph from Ceph file system
  ceph-rbd: osdc support for osd call and rollback operations
  ceph: messenger and osdc changes for rbd
  ...
2010-10-21 12:38:28 -07:00
Eric Paris
ff660c80d0 secmark: fix config problem when CONFIG_NF_CONNTRACK_SECMARK is not set
When CONFIG_NF_CONNTRACK_SECMARK is not set we accidentally attempt to use
the secmark fielf of struct nf_conn.  Problem is when that config isn't set
the field doesn't exist.  whoops.  Wrap the incorrect usage in the config.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2010-10-21 10:13:00 +11:00
Eric Paris
1ae4de0cdf secmark: export secctx, drop secmark in procfs
The current secmark code exports a secmark= field which just indicates if
there is special labeling on a packet or not.  We drop this field as it
isn't particularly useful and instead export a new field secctx= which is
the actual human readable text label.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: James Morris <jmorris@namei.org>
2010-10-21 10:12:52 +11:00
Eric Paris
1cc63249ad conntrack: export lsm context rather than internal secid via netlink
The conntrack code can export the internal secid to userspace.  These are
dynamic, can change on lsm changes, and have no meaning in userspace.  We
should instead be sending lsm contexts to userspace instead.  This patch sends
the secctx (rather than secid) to userspace over the netlink socket.  We use a
new field CTA_SECCTX and stop using the the old CTA_SECMARK field since it did
not send particularly useful information.

Signed-off-by: Eric Paris <eparis@redhat.com>
Reviewed-by: Paul Moore <paul.moore@hp.com>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: James Morris <jmorris@namei.org>
2010-10-21 10:12:51 +11:00
Eric Paris
2606fd1fa5 secmark: make secmark object handling generic
Right now secmark has lots of direct selinux calls.  Use all LSM calls and
remove all SELinux specific knowledge.  The only SELinux specific knowledge
we leave is the mode.  The only point is to make sure that other LSMs at
least test this generic code before they assume it works.  (They may also
have to make changes if they do not represent labels as strings)

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Paul Moore <paul.moore@hp.com>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: James Morris <jmorris@namei.org>
2010-10-21 10:12:48 +11:00
Eric Paris
15714f7b58 secmark: do not return early if there was no error
Commit 4a5a5c73 attempted to pass decent error messages back to userspace for
netfilter errors.  In xt_SECMARK.c however the patch screwed up and returned
on 0 (aka no error) early and didn't finish setting up secmark.  This results
in a kernel BUG if you use SECMARK.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
2010-10-21 10:12:47 +11:00
Sage Weil
240634e9b3 ceph: fix num_pages_free accounting in pagelist
Decrement the free page counter when removing a page from the free_list.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-10-20 15:38:23 -07:00
Yehuda Sadeh
010e3b48fc ceph: don't crash when passed bad mount options
This only happened when parse_extra_token was not passed
to ceph_parse_option() (hence, only happened in rbd).

Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
2010-10-20 15:38:22 -07:00
Greg Farnum
ac0b74d8a1 ceph: add pagelist_reserve, pagelist_truncate, pagelist_set_cursor
These facilitate preallocation of pages so that we can encode into the pagelist
in an atomic context.

Signed-off-by: Greg Farnum <gregf@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>
2010-10-20 15:38:16 -07:00
Yehuda Sadeh
602adf4002 rbd: introduce rados block device (rbd), based on libceph
The rados block device (rbd), based on osdblk, creates a block device
that is backed by objects stored in the Ceph distributed object storage
cluster.  Each device consists of a single metadata object and data
striped over many data objects.

The rbd driver supports read-only snapshots.

Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>
2010-10-20 15:38:13 -07:00
Yehuda Sadeh
3d14c5d2b6 ceph: factor out libceph from Ceph file system
This factors out protocol and low-level storage parts of ceph into a
separate libceph module living in net/ceph and include/linux/ceph.  This
is mostly a matter of moving files around.  However, a few key pieces
of the interface change as well:

 - ceph_client becomes ceph_fs_client and ceph_client, where the latter
   captures the mon and osd clients, and the fs_client gets the mds client
   and file system specific pieces.
 - Mount option parsing and debugfs setup is correspondingly broken into
   two pieces.
 - The mon client gets a generic handler callback for otherwise unknown
   messages (mds map, in this case).
 - The basic supported/required feature bits can be expanded (and are by
   ceph_fs_client).

No functional change, aside from some subtle error handling cases that got
cleaned up in the refactoring process.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-10-20 15:37:28 -07:00
Linus Torvalds
799c10559d De-pessimize rds_page_copy_user
Don't try to "optimize" rds_page_copy_user() by using kmap_atomic() and
the unsafe atomic user mode accessor functions.  It's actually slower
than the straightforward code on any reasonable modern CPU.

Back when the code was written (although probably not by the time it was
actually merged, though), 32-bit x86 may have been the dominant
architecture.  And there kmap_atomic() can be a lot faster than kmap()
(unless you have very good locality, in which case the virtual address
caching by kmap() can overcome all the downsides).

But these days, x86-64 may not be more populous, but it's getting there
(and if you care about performance, it's definitely already there -
you'd have upgraded your CPU's already in the last few years).  And on
x86-64, the non-kmap_atomic() version is faster, simply because the code
is simpler and doesn't have the "re-try page fault" case.

People with old hardware are not likely to care about RDS anyway, and
the optimization for the 32-bit case is simply buggy, since it doesn't
verify the user addresses properly.

Reported-by: Dan Rosenberg <drosenberg@vsecurity.com>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-15 11:09:28 -07:00
Kees Cook
b00916b189 net: clear heap allocations for privileged ethtool actions
Several other ethtool functions leave heap uncleared (potentially) by
drivers. Some interfaces appear safe (eeprom, etc), in that the sizes
are well controlled. In some situations (e.g. unchecked error conditions),
the heap will remain unchanged in areas before copying back to userspace.
Note that these are less of an issue since these all require CAP_NET_ADMIN.

Cc: stable@kernel.org
Signed-off-by: Kees Cook <kees.cook@canonical.com>
Acked-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-11 12:23:25 -07:00
Jiri Slaby
5518b29f22 ATM: mpc, fix use after free
Stanse found that mpc_push frees skb and then it dereferences it. It
is a typo, new_skb should be dereferenced there.

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-11 11:05:42 -07:00
Kees Cook
ae6df5f96a net: clear heap allocation for ETHTOOL_GRXCLSRLALL
Calling ETHTOOL_GRXCLSRLALL with a large rule_cnt will allocate kernel
heap without clearing it. For the one driver (niu) that implements it,
it will leave the unused portion of heap unchanged and copy the full
contents back to userspace.

Signed-off-by: Kees Cook <kees.cook@canonical.com>
Acked-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-08 10:48:28 -07:00
David S. Miller
94b105723a Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 2010-10-08 10:36:51 -07:00
Paul E. McKenney
1144182a87 net: suppress RCU lockdep false positive in sock_update_classid
> ===================================================
> [ INFO: suspicious rcu_dereference_check() usage. ]
> ---------------------------------------------------
> include/linux/cgroup.h:542 invoked rcu_dereference_check() without protection!
>
> other info that might help us debug this:
>
>
> rcu_scheduler_active = 1, debug_locks = 0
> 1 lock held by swapper/1:
>  #0:  (net_mutex){+.+.+.}, at: [<ffffffff813e9010>]
> register_pernet_subsys+0x1f/0x47
>
> stack backtrace:
> Pid: 1, comm: swapper Not tainted 2.6.35.4-28.fc14.x86_64 #1
> Call Trace:
>  [<ffffffff8107bd3a>] lockdep_rcu_dereference+0xaa/0xb3
>  [<ffffffff813e04b9>] sock_update_classid+0x7c/0xa2
>  [<ffffffff813e054a>] sk_alloc+0x6b/0x77
>  [<ffffffff8140b281>] __netlink_create+0x37/0xab
>  [<ffffffff813f941c>] ? rtnetlink_rcv+0x0/0x2d
>  [<ffffffff8140cee1>] netlink_kernel_create+0x74/0x19d
>  [<ffffffff8149c3ca>] ? __mutex_lock_common+0x339/0x35b
>  [<ffffffff813f7e9c>] rtnetlink_net_init+0x2e/0x48
>  [<ffffffff813e8d7a>] ops_init+0xe9/0xff
>  [<ffffffff813e8f0d>] register_pernet_operations+0xab/0x130
>  [<ffffffff813e901f>] register_pernet_subsys+0x2e/0x47
>  [<ffffffff81db7bca>] rtnetlink_init+0x53/0x102
>  [<ffffffff81db835c>] netlink_proto_init+0x126/0x143
>  [<ffffffff81db8236>] ? netlink_proto_init+0x0/0x143
>  [<ffffffff810021b8>] do_one_initcall+0x72/0x186
>  [<ffffffff81d78ebc>] kernel_init+0x23b/0x2c9
>  [<ffffffff8100aae4>] kernel_thread_helper+0x4/0x10
>  [<ffffffff8149e2d0>] ? restore_args+0x0/0x30
>  [<ffffffff81d78c81>] ? kernel_init+0x0/0x2c9
>  [<ffffffff8100aae0>] ? kernel_thread_helper+0x0/0x10

The sock_update_classid() function calls task_cls_classid(current),
but the calling task cannot go away, so there is no danger of
the associated structures disappearing.  Insert an RCU read-side
critical section to suppress the false positive.

Reported-by: Subrata Modak <subrata@linux.vnet.ibm.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2010-10-07 10:02:28 -07:00
John W. Linville
4efe7f51be Revert "mac80211: use netif_receive_skb in ieee80211_tx_status callpath"
This reverts commit 5ed3bc7288.

It turns-out that not all drivers are calling ieee80211_tx_status from a
compatible context.  Revert this for now and try again later...

Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-10-07 11:35:40 -04:00
David S. Miller
fb3dbece26 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/padovan/bluetooth-2.6 2010-10-07 00:59:39 -07:00
Ingo Molnar
556ef63255 Merge commit 'v2.6.36-rc7' into core/rcu
Merge reason: Update from -rc3 to -rc7.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-10-07 09:43:45 +02:00
Ingo Molnar
d4f8f217b8 Merge branch 'rcu/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-2.6-rcu into core/rcu 2010-10-07 09:43:11 +02:00
David S. Miller
12e94471b2 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6 2010-10-06 19:11:17 -07:00
Johannes Berg
44271488b9 mac80211: delete AddBA response timer
We never delete the addBA response timer, which
is typically fine, but if the station it belongs
to is deleted very quickly after starting the BA
session, before the peer had a chance to reply,
the timer may fire after the station struct has
been freed already. Therefore, we need to delete
the timer in a suitable spot -- best when the
session is being stopped (which will happen even
then) in which case the delete will be a no-op
most of the time.

I've reproduced the scenario and tested the fix.

This fixes the crash reported at
http://mid.gmane.org/4CAB6F96.6090701@candelatech.com

Cc: stable@kernel.org
Reported-by: Ben Greear <greearb@candelatech.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-10-06 15:58:29 -04:00
Eric Dumazet
79315068f4 caif: fix two caif_connect() bugs
caif_connect() might dereference a netdevice after dev_put() it.

It also doesnt check dev_get_by_index() return value and could
dereference a NULL pointer.

Fix it, using RCU to avoid taking a reference.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Sjur Braendeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-05 20:35:53 -07:00
Dan Carpenter
4e18b3edf7 cls_u32: signedness bug
skb_headroom() is unsigned so "skb_headroom(skb) + toff" is also
unsigned and can't be less than zero.  This test was added in 66d50d25:
"u32: negative offset fix"  It was supposed to fix a regression.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-05 00:40:39 -07:00
Gustavo F. Padovan
eaa71b318c Bluetooth: Disallow to change L2CAP_OPTIONS values when connected
L2CAP doesn't permit change like MTU, FCS, TxWindow values while the
connection is alive, we can only set that before the
connection/configuration process. That can lead to bugs in the L2CAP
operation.

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2010-10-04 19:28:52 -03:00
Linus Torvalds
4a73a43741 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
  vlan: dont drop packets from unknown vlans in promiscuous mode
  Phonet: Correct header retrieval after pskb_may_pull
  um: Proper Fix for f25c80a4: remove duplicate structure field initialization
  ip_gre: Fix dependencies wrt. ipv6.
  net-2.6: SYN retransmits: Add new parameter to retransmits_timed_out()
  iwl3945: queue the right work if the scan needs to be aborted
  mac80211: fix use-after-free
2010-10-04 11:11:01 -07:00
Dan Rosenberg
51e97a12be sctp: Fix out-of-bounds reading in sctp_asoc_get_hmac()
The sctp_asoc_get_hmac() function iterates through a peer's hmac_ids
array and attempts to ensure that only a supported hmac entry is
returned.  The current code fails to do this properly - if the last id
in the array is out of range (greater than SCTP_AUTH_HMAC_ID_MAX), the
id integer remains set after exiting the loop, and the address of an
out-of-bounds entry will be returned and subsequently used in the parent
function, causing potentially ugly memory corruption.  This patch resets
the id integer to 0 on encountering an invalid id so that NULL will be
returned after finishing the loop if no valid ids are found.

Signed-off-by: Dan Rosenberg <drosenberg@vsecurity.com>
Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-03 21:58:49 -07:00
Dan Rosenberg
d7e0d19aa0 sctp: prevent reading out-of-bounds memory
Two user-controlled allocations in SCTP are subsequently dereferenced as
sockaddr structs, without checking if the dereferenced struct members fall
beyond the end of the allocated chunk.  There doesn't appear to be any
information leakage here based on how these members are used and
additional checking, but it's still worth fixing.

[akpm@linux-foundation.org: remove unfashionable newlines, fix gmail tab->space conversion]
Signed-off-by: Dan Rosenberg <dan.j.rosenberg@gmail.com>
Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-03 21:58:48 -07:00
David Stevens
5b7c840667 ipv4: correct IGMP behavior on v3 query during v2-compatibility mode
A recent patch to allow IGMPv2 responses to IGMPv3 queries
bypasses length checks for valid query lengths, incorrectly
resets the v2_seen timer, and does not support IGMPv1.

The following patch responds with a v2 report as required
by IGMPv2 while correcting the other problems introduced
by the patch.

Signed-Off-By: David L Stevens <dlstevens@us.ibm.com>

Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-03 21:58:47 -07:00
Ben Hutchings
c5d3557103 Revert "ipv4: Make INET_LRO a bool instead of tristate."
This reverts commit e81963b180.

LRO is now deprecated in favour of GRO, and only a few drivers use it,
so it is desirable to build it as a module in distribution kernels.

The original change to prevent building it as a module was made in an
attempt to avoid the case where some dependents are set to y and some
to m, and INET_LRO can be set to m rather than y.  However, the
Kconfig system will reliably set INET_LRO=y in this case.

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-03 21:46:23 -07:00
Nagendra Tomar
482964e56e net: Fix the condition passed to sk_wait_event()
This patch fixes the condition (3rd arg) passed to sk_wait_event() in
sk_stream_wait_memory(). The incorrect check in sk_stream_wait_memory()
causes the following soft lockup in tcp_sendmsg() when the global tcp
memory pool has exhausted.

>>> snip <<<

localhost kernel: BUG: soft lockup - CPU#3 stuck for 11s! [sshd:6429]
localhost kernel: CPU 3:
localhost kernel: RIP: 0010:[sk_stream_wait_memory+0xcd/0x200]  [sk_stream_wait_memory+0xcd/0x200] sk_stream_wait_memory+0xcd/0x200
localhost kernel:
localhost kernel: Call Trace:
localhost kernel:  [sk_stream_wait_memory+0x1b1/0x200] sk_stream_wait_memory+0x1b1/0x200
localhost kernel:  [<ffffffff802557c0>] autoremove_wake_function+0x0/0x40
localhost kernel:  [ipv6:tcp_sendmsg+0x6e6/0xe90] tcp_sendmsg+0x6e6/0xce0
localhost kernel:  [sock_aio_write+0x126/0x140] sock_aio_write+0x126/0x140
localhost kernel:  [xfs:do_sync_write+0xf1/0x130] do_sync_write+0xf1/0x130
localhost kernel:  [<ffffffff802557c0>] autoremove_wake_function+0x0/0x40
localhost kernel:  [hrtimer_start+0xe3/0x170] hrtimer_start+0xe3/0x170
localhost kernel:  [vfs_write+0x185/0x190] vfs_write+0x185/0x190
localhost kernel:  [sys_write+0x50/0x90] sys_write+0x50/0x90
localhost kernel:  [system_call+0x7e/0x83] system_call+0x7e/0x83

>>> snip <<<

What is happening is, that the sk_wait_event() condition passed from
sk_stream_wait_memory() evaluates to true for the case of tcp global memory
exhaustion. This is because both sk_stream_memory_free() and vm_wait are true
which causes sk_wait_event() to *not* call schedule_timeout().
Hence sk_stream_wait_memory() returns immediately to the caller w/o sleeping.
This causes the caller to again try allocation, which again fails and again
calls sk_stream_wait_memory(), and so on.

[ Bug introduced by commit c1cbe4b7ad
  ("[NET]: Avoid atomic xchg() for non-error case") -DaveM ]

Signed-off-by: Nagendra Singh Tomar <tomer_iisc@yahoo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-03 20:41:32 -07:00
Maciej Żenczykowski
ae878ae280 net: Fix IPv6 PMTU disc. w/ asymmetric routes
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-03 14:49:00 -07:00
Eric Dumazet
173e79fb70 vlan: dont drop packets from unknown vlans in promiscuous mode
Roger Luethi noticed packets for unknown VLANs getting silently dropped
even in promiscuous mode.

Check for promiscuous mode in __vlan_hwaccel_rx() and vlan_gro_common()
before drops.

As suggested by Patrick, mark such packets to have skb->pkt_type set to
PACKET_OTHERHOST to make sure they are dropped by IP stack.

Reported-by: Roger Luethi <rl@hellgate.ch>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-30 18:04:21 -07:00
David S. Miller
9262919531 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 2010-09-30 12:02:22 -07:00
Gustavo F. Padovan
b0239c80fe Revert "Bluetooth: Don't accept ConfigReq if we aren't in the BT_CONFIG state"
This reverts commit 8cb8e6f168.

That commit introduced a regression with the Bluetooth Profile Tuning
Suite(PTS), Reverting this make sure that L2CAP is in a qualificable
state.

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2010-09-30 12:19:35 -03:00
Gustavo F. Padovan
fad003b6c8 Bluetooth: Fix inconsistent lock state with RFCOMM
When receiving a rfcomm connection with the old dund deamon a
inconsistent lock state happens. That's because interrupts were already
disabled by l2cap_conn_start() when rfcomm_sk_state_change() try to lock
the spin_lock.

As result we may have a inconsistent lock state for l2cap_conn_start()
after rfcomm_sk_state_change() calls bh_lock_sock() and disable interrupts
as well.

[ 2833.151999]
[ 2833.151999] =================================
[ 2833.151999] [ INFO: inconsistent lock state ]
[ 2833.151999] 2.6.36-rc3 #2
[ 2833.151999] ---------------------------------
[ 2833.151999] inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
[ 2833.151999] krfcommd/2306 [HC0[0]:SC0[0]:HE1:SE1] takes:
[ 2833.151999]  (slock-AF_BLUETOOTH){+.?...}, at: [<ffffffffa00bcb56>] rfcomm_sk_state_change+0x46/0x170 [rfcomm]
[ 2833.151999] {IN-SOFTIRQ-W} state was registered at:
[ 2833.151999]   [<ffffffff81094346>] __lock_acquire+0x5b6/0x1560
[ 2833.151999]   [<ffffffff8109534a>] lock_acquire+0x5a/0x70
[ 2833.151999]   [<ffffffff81392b6c>] _raw_spin_lock+0x2c/0x40
[ 2833.151999]   [<ffffffffa00a5092>] l2cap_conn_start+0x92/0x640 [l2cap]
[ 2833.151999]   [<ffffffffa00a6a3f>] l2cap_sig_channel+0x6bf/0x1320 [l2cap]
[ 2833.151999]   [<ffffffffa00a9173>] l2cap_recv_frame+0x133/0x770 [l2cap]
[ 2833.151999]   [<ffffffffa00a997b>] l2cap_recv_acldata+0x1cb/0x390 [l2cap]
[ 2833.151999]   [<ffffffffa000db4b>] hci_rx_task+0x2ab/0x450 [bluetooth]
[ 2833.151999]   [<ffffffff8106b22b>] tasklet_action+0xcb/0xe0
[ 2833.151999]   [<ffffffff8106b91e>] __do_softirq+0xae/0x150
[ 2833.151999]   [<ffffffff8102bc0c>] call_softirq+0x1c/0x30
[ 2833.151999]   [<ffffffff8102ddb5>] do_softirq+0x75/0xb0
[ 2833.151999]   [<ffffffff8106b56d>] irq_exit+0x8d/0xa0
[ 2833.151999]   [<ffffffff8104484b>] smp_apic_timer_interrupt+0x6b/0xa0
[ 2833.151999]   [<ffffffff8102b6d3>] apic_timer_interrupt+0x13/0x20
[ 2833.151999]   [<ffffffff81029dfa>] cpu_idle+0x5a/0xb0
[ 2833.151999]   [<ffffffff81381ded>] rest_init+0xad/0xc0
[ 2833.151999]   [<ffffffff817ebc4d>] start_kernel+0x2dd/0x2e8
[ 2833.151999]   [<ffffffff817eb2e6>] x86_64_start_reservations+0xf6/0xfa
[ 2833.151999]   [<ffffffff817eb3ce>] x86_64_start_kernel+0xe4/0xeb
[ 2833.151999] irq event stamp: 731
[ 2833.151999] hardirqs last  enabled at (731): [<ffffffff8106b762>] local_bh_enable_ip+0x82/0xe0
[ 2833.151999] hardirqs last disabled at (729): [<ffffffff8106b93e>] __do_softirq+0xce/0x150
[ 2833.151999] softirqs last  enabled at (730): [<ffffffff8106b96e>] __do_softirq+0xfe/0x150
[ 2833.151999] softirqs last disabled at (711): [<ffffffff8102bc0c>] call_softirq+0x1c/0x30
[ 2833.151999]
[ 2833.151999] other info that might help us debug this:
[ 2833.151999] 2 locks held by krfcommd/2306:
[ 2833.151999]  #0:  (rfcomm_mutex){+.+.+.}, at: [<ffffffffa00bb744>] rfcomm_run+0x174/0xb20 [rfcomm]
[ 2833.151999]  #1:  (&(&d->lock)->rlock){+.+...}, at: [<ffffffffa00b9223>] rfcomm_dlc_accept+0x53/0x100 [rfcomm]
[ 2833.151999]
[ 2833.151999] stack backtrace:
[ 2833.151999] Pid: 2306, comm: krfcommd Tainted: G        W   2.6.36-rc3 #2
[ 2833.151999] Call Trace:
[ 2833.151999]  [<ffffffff810928e1>] print_usage_bug+0x171/0x180
[ 2833.151999]  [<ffffffff810936c3>] mark_lock+0x333/0x400
[ 2833.151999]  [<ffffffff810943ca>] __lock_acquire+0x63a/0x1560
[ 2833.151999]  [<ffffffff810948b5>] ? __lock_acquire+0xb25/0x1560
[ 2833.151999]  [<ffffffff8109534a>] lock_acquire+0x5a/0x70
[ 2833.151999]  [<ffffffffa00bcb56>] ? rfcomm_sk_state_change+0x46/0x170 [rfcomm]
[ 2833.151999]  [<ffffffff81392b6c>] _raw_spin_lock+0x2c/0x40
[ 2833.151999]  [<ffffffffa00bcb56>] ? rfcomm_sk_state_change+0x46/0x170 [rfcomm]
[ 2833.151999]  [<ffffffffa00bcb56>] rfcomm_sk_state_change+0x46/0x170 [rfcomm]
[ 2833.151999]  [<ffffffffa00b9239>] rfcomm_dlc_accept+0x69/0x100 [rfcomm]
[ 2833.151999]  [<ffffffffa00b9a49>] rfcomm_check_accept+0x59/0xd0 [rfcomm]
[ 2833.151999]  [<ffffffffa00bacab>] rfcomm_recv_frame+0x9fb/0x1320 [rfcomm]
[ 2833.151999]  [<ffffffff813932bb>] ? _raw_spin_unlock_irqrestore+0x3b/0x60
[ 2833.151999]  [<ffffffff81093acd>] ? trace_hardirqs_on_caller+0x13d/0x180
[ 2833.151999]  [<ffffffff81093b1d>] ? trace_hardirqs_on+0xd/0x10
[ 2833.151999]  [<ffffffffa00bb7f1>] rfcomm_run+0x221/0xb20 [rfcomm]
[ 2833.151999]  [<ffffffff813905e7>] ? schedule+0x287/0x780
[ 2833.151999]  [<ffffffffa00bb5d0>] ? rfcomm_run+0x0/0xb20 [rfcomm]
[ 2833.151999]  [<ffffffff81081026>] kthread+0x96/0xa0
[ 2833.151999]  [<ffffffff8102bb14>] kernel_thread_helper+0x4/0x10
[ 2833.151999]  [<ffffffff813936bc>] ? restore_args+0x0/0x30
[ 2833.151999]  [<ffffffff81080f90>] ? kthread+0x0/0xa0
[ 2833.151999]  [<ffffffff8102bb10>] ? kernel_thread_helper+0x0/0x10

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2010-09-30 12:19:35 -03:00
Gustavo F. Padovan
ccbb84af28 Bluetooth: Simplify L2CAP Streaming mode sending
As we don't have any error control on the Streaming mode, i.e., we don't
need to keep a copy of the skb for later resending we don't need to
call skb_clone() on it.
Then we can go one further here, and dequeue the skb before sending it,
that also means we don't need to look to sk->sk_send_head anymore.

The patch saves memory and time when sending Streaming mode data, so
it is good to mainline.

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2010-09-30 12:19:35 -03:00
Andrei Emeltchenko
8183b775bc Bluetooth: fix MTU L2CAP configuration parameter
When receiving L2CAP negative configuration response with respect
to MTU parameter we modify wrong field. MTU here means proposed
value of MTU that the remote device intends to transmit. So for local
L2CAP socket it is pi->imtu.

Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@nokia.com>
Acked-by: Ville Tervo <ville.tervo@nokia.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2010-09-30 12:19:35 -03:00
Mat Martineau
8c462b6047 Bluetooth: Only enable L2CAP FCS for ERTM or streaming
This fixes a bug which caused the FCS setting to show L2CAP_FCS_CRC16
with L2CAP modes other than ERTM or streaming.  At present, this only
affects the FCS value shown with getsockopt() for basic mode.

Signed-off-by: Mat Martineau <mathewm@codeaurora.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2010-09-30 12:19:35 -03:00
Kumar Sanghvi
a91e7d471e Phonet: Correct header retrieval after pskb_may_pull
Retrieve the header after doing pskb_may_pull since, pskb_may_pull
could change the buffer structure.

This is based on the comment given by Eric Dumazet on Phonet
Pipe controller patch for a similar problem.

Signed-off-by: Kumar Sanghvi <kumar.sanghvi@stericsson.com>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-29 19:41:04 -07:00
David S. Miller
68c1f3a96c ip_gre: Fix dependencies wrt. ipv6.
The GRE tunnel driver needs to invoke icmpv6 helpers in the
ipv6 stack when ipv6 support is enabled.

Therefore if IPV6 is enabled, we have to enforce that GRE's
enabling (modular or static) matches that of ipv6.

Reported-by: Patrick McHardy <kaber@trash.net>
Reported-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-28 22:37:56 -07:00
Damian Lukowski
4d22f7d372 net-2.6: SYN retransmits: Add new parameter to retransmits_timed_out()
Fixes kernel Bugzilla Bug 18952

This patch adds a syn_set parameter to the retransmits_timed_out()
routine and updates its callers. If not set, TCP_RTO_MIN is taken
as the calculation basis as before. If set, TCP_TIMEOUT_INIT is
used instead, so that sysctl_syn_retries represents the actual
amount of SYN retransmissions in case no SYNACKs are received when
establishing a new connection.

Signed-off-by: Damian Lukowski <damian@tvk.rwth-aachen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-28 13:08:32 -07:00
Linus Torvalds
a2724f28d9 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (47 commits)
  tcp: Fix >4GB writes on 64-bit.
  net/9p: Mount only matching virtio channels
  de2104x: fix ethtool
  tproxy: check for transparent flag in ip_route_newports
  ipv6: add IPv6 to neighbour table overflow warning
  tcp: fix TSO FACK loss marking in tcp_mark_head_lost
  3c59x: fix regression from patch "Add ethtool WOL support"
  ipv6: add a missing unregister_pernet_subsys call
  s390: use free_netdev(netdev) instead of kfree()
  sgiseeq: use free_netdev(netdev) instead of kfree()
  rionet: use free_netdev(netdev) instead of kfree()
  ibm_newemac: use free_netdev(netdev) instead of kfree()
  smsc911x: Add MODULE_ALIAS()
  net: reset skb queue mapping when rx'ing over tunnel
  br2684: fix scheduling while atomic
  de2104x: fix TP link detection
  de2104x: fix power management
  de2104x: disable autonegotiation on broken hardware
  net: fix a lockdep splat
  e1000e: 82579 do not gate auto config of PHY by hardware during nominal use
  ...
2010-09-28 12:01:26 -07:00
David S. Miller
01db403cf9 tcp: Fix >4GB writes on 64-bit.
Fixes kernel bugzilla #16603

tcp_sendmsg() truncates iov_len to an 'int' which a 4GB write to write
zero bytes, for example.

There is also the problem higher up of how verify_iovec() works.  It
wants to prevent the total length from looking like an error return
value.

However it does this using 'int', but syscalls return 'long' (and
thus signed 64-bit on 64-bit machines).  So it could trigger
false-positives on 64-bit as written.  So fix it to use 'long'.

Reported-by: Olaf Bonorden <bono@onlinehome.de>
Reported-by: Daniel Büse <dbuese@gmx.de>
Reported-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-27 20:24:54 -07:00
Sven Eckelmann
0b20406cda net/9p: Mount only matching virtio channels
p9_virtio_create will only compare the the channel's tag characters
against the device name till the end of the channel's tag but not till
the end of the device name. This means that if a user defines channels
with the tags foo and foobar then he would mount foo when he requested
foonot and may mount foo when he requested foobar.

Thus it is necessary to check both string lengths against each other in
case of a successful partial string match.

Signed-off-by: Sven Eckelmann <sven.eckelmann@gmx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-27 15:54:44 -07:00
Ulrich Weber
7e1b33e5ea ipv6: add IPv6 to neighbour table overflow warning
IPv4 and IPv6 have separate neighbour tables, so
the warning messages should be distinguishable.

[ Add a suitable message prefix on the ipv4 side as well -DaveM ]

Signed-off-by: Ulrich Weber <uweber@astaro.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-27 15:02:18 -07:00
Yuchung Cheng
b3de7559af tcp: fix TSO FACK loss marking in tcp_mark_head_lost
When TCP uses FACK algorithm to mark lost packets in
tcp_mark_head_lost(), if the number of packets in the (TSO) skb is
greater than the number of packets that should be marked lost, TCP
incorrectly exits the loop and marks no packets lost in the skb. This
underestimates tp->lost_out and affects the recovery/retransmission.
This patch fargments the skb and marks the correct amount of packets
lost.

Signed-off-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-27 14:55:57 -07:00