Commit Graph

22723 Commits

Author SHA1 Message Date
Allan Stephens
336ebf5bf5 tipc: Add routines for safe checking of node's network address
Introduces routines that test whether a given network address is
equal to a node's own network address or if it lies within the node's
own network cluster, and which work properly regardless of whether
the node is using the default network address <0.0.0> or a non-zero
network address that is assigned later on. In essence, these routines
ensure that address <0.0.0> is treated as an alias for "this node",
regardless of which network address the node is actually using.

Old users of the pre-existing more strict match in_own_cluster()
have been accordingly redirected to what is now called
in_own_cluster_exact() --- which does not extend matching to <0,0,0>.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2012-04-19 15:46:39 -04:00
Allan Stephens
fd6eced8a4 tipc: Don't record failed publication attempt as a success
No longer increments counter of number of publications by a node
if an attempt to add a new publication fails. This prevents TIPC from
incorrectly blocking future publications because the configured maximum
number of publications has been reached.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2012-04-19 15:46:37 -04:00
Allan Stephens
1110b8d33a tipc: Update node-scope publications when network address is assigned
Ensures that node-scope name publications that exist prior to the
configuration of a node's network address are properly re-initialized
with that address when it is assigned. TIPC's node-scope publications
are now tracked using a publications list like the lists used for
cluster-scope and zone-scope publications so they can be easily updated
when required.

The inclusion of node scope name publications in a conventional publication
list means that they must now also be withdrawn, just like cluster and zone
scope publications are currently withdrawn.  So some conditional tests on
scope ==/!= TIPC_NODE_SCOPE are inserted/removed accordingly.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2012-04-19 15:46:36 -04:00
Allan Stephens
a909804f7c tipc: Separate cluster-scope and zone-scope names into distinct lists
Utilizes distinct lists to track zone-scope and cluster-scope names
published by a node. For now, TIPC continues to process the entries
in both lists in the same way; however, an upcoming patch will utilize
the existence of the lists to prevent the sending of cluster-scope names
to nodes that are not part of the local cluster.

To achieve this, an array of publication lists is introduced, so
that they can be iterated over and accessed via publ->scope as
an index where convenient.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2012-04-19 15:46:05 -04:00
Allan Stephens
e11aa05971 tipc: Factor out name publication code to a separate function
This is done so that it can be reused with differing publication
lists, instead of being hard coded to the cluster publicaton list.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2012-04-18 09:36:02 -04:00
Allan Stephens
3f8375fee3 tipc: introduce publication lists struct
There is currently a single list that is containing both cluster-scope and
zone-scope publications, and the list count is a separate free floating
variable.  Create a struct to bind the count to the list, and to pave
the way for factoring out the publications into zone/cluster/node scope.

The current "publ_root" most matches what will be the cluster scope
list, so it is named accordingly in this commit.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2012-04-18 09:36:02 -04:00
majianpeng
798ec84d45 net/core:Remove memleak reports by kmemleak_not_leak.
Signed-off-by: majianpeng <majianpeng@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-18 00:20:28 -04:00
majianpeng
7f59388108 net/ipv4:Remove two memleak reports by kmemleak_not_leak.
Signed-off-by: majianpeng <majianpeng@gmail.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-18 00:20:28 -04:00
Daniel Baluta
a75afd4770 can: fix sparse warning for cgw_list
Make cgw_list static to remove the following sparse warning:
net/can/gw.c:69:1: warning: symbol 'cgw_list' was not declared.
Should it be static?

Signed-off-by: Daniel Baluta <dbaluta@ixiacom.com>
Acked-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2012-04-16 21:08:18 +02:00
David S. Miller
56845d78ce Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/ethernet/atheros/atlx/atl1.c
	drivers/net/ethernet/atheros/atlx/atl1.h

Resolved a conflict between a DMA error bug fix and NAPI
support changes in the atl1 driver.

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-15 13:19:04 -04:00
John Fastabend
3ff661c38c net: rtnetlink notify events for FDB NTF_SELF adds and deletes
It is useful to be able to monitor for FDB events in user space.
This patch adds support to generate netlink events when a change
is made to a device supporting the FDB ops.

This brings embedded switches inline with the SW net/bridge which
triggers events on FDB updates as well.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-15 13:06:04 -04:00
John Fastabend
d83b060360 net: add fdb generic dump routine
This adds a generic dump routine drivers can call. It
should be sufficient to handle any bridging model that
uses the unicast address list. This should be most SR-IOV
enabled NICs.

v2: return error on nlmsg_put and use -EMSGSIZE instead
    of -ENOMEM this is inline other usages

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-15 13:06:04 -04:00
John Fastabend
12a9463445 net: addr_list: add exclusive dev_uc_add and dev_mc_add
This adds a dev_uc_add_excl() and dev_mc_add_excl() calls
similar to the original dev_{uc|mc}_add() except it sets
the global bit and returns -EEXIST for duplicat entires.

This is useful for drivers that support SR-IOV, macvlan
devices and any other devices that need to manage the
unicast and multicast lists.

v2: fix typo UNICAST should be MULTICAST in dev_mc_add_excl()

CC: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-15 13:06:04 -04:00
John Fastabend
77162022ab net: add generic PF_BRIDGE:RTM_ FDB hooks
This adds two new flags NTF_MASTER and NTF_SELF that can
now be used to specify where PF_BRIDGE netlink commands should
be sent. NTF_MASTER sends the commands to the 'dev->master'
device for parsing. Typically this will be the linux net/bridge,
or open-vswitch devices. Also without any flags set the command
will be handled by the master device as well so that current user
space tools continue to work as expected.

The NTF_SELF flag will push the PF_BRIDGE commands to the
device. In the basic example below the commands are then parsed
and programmed in the embedded bridge.

Note if both NTF_SELF and NTF_MASTER bits are set then the
command will be sent to both 'dev->master' and 'dev' this allows
user space to easily keep the embedded bridge and software bridge
in sync.

There is a slight complication in the case with both flags set
when an error occurs. To resolve this the rtnl handler clears
the NTF_ flag in the netlink ack to indicate which sets completed
successfully. The add/del handlers will abort as soon as any
error occurs.

To support this new net device ops were added to call into
the device and the existing bridging code was refactored
to use these. There should be no required changes in user space
to support the current bridge behavior.

A basic setup with a SR-IOV enabled NIC looks like this,

          veth0  veth2
            |      |
          ------------
          |  bridge0 |   <---- software bridging
          ------------
               /
               /
  ethx.y      ethx
    VF         PF
     \         \          <---- propagate FDB entries to HW
     \         \
  --------------------
  |  Embedded Bridge |    <---- hardware offloaded switching
  --------------------

In this case the embedded bridge must be managed to allow 'veth0'
to communicate with 'ethx.y' correctly. At present drivers managing
the embedded bridge either send frames onto the network which
then get dropped by the switch OR the embedded bridge will flood
these frames. With this patch we have a mechanism to manage the
embedded bridge correctly from user space. This example is specific
to SR-IOV but replacing the VF with another PF or dropping this
into the DSA framework generates similar management issues.

Examples session using the 'br'[1] tool to add, dump and then
delete a mac address with a new "embedded" option and enabled
ixgbe driver:

# br fdb add 22:35:19:ac:60:59 dev eth3
# br fdb
port    mac addr                flags
veth0   22:35:19:ac:60:58       static
veth0   9a:5f:81:f7:f6:ec       local
eth3    00:1b:21:55:23:59       local
eth3    22:35:19:ac:60:59       static
veth0   22:35:19:ac:60:57       static
#br fdb add 22:35:19:ac:60:59 embedded dev eth3
#br fdb
port    mac addr                flags
veth0   22:35:19:ac:60:58       static
veth0   9a:5f:81:f7:f6:ec       local
eth3    00:1b:21:55:23:59       local
eth3    22:35:19:ac:60:59       static
veth0   22:35:19:ac:60:57       static
eth3    22:35:19:ac:60:59       local embedded
#br fdb del 22:35:19:ac:60:59 embedded dev eth3

I added a couple lines to 'br' to set the flags correctly is all. It
is my opinion that the merit of this patch is now embedded and SW
bridges can both be modeled correctly in user space using very nearly
the same message passing.

[1] 'br' tool was published as an RFC here and will be renamed 'bridge'
    http://patchwork.ozlabs.org/patch/117664/

Thanks to Jamal Hadi Salim, Stephen Hemminger and Ben Hutchings for
valuable feedback, suggestions, and review.

v2: fixed api descriptions and error case with both NTF_SELF and
    NTF_MASTER set plus updated patch description.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-15 13:06:04 -04:00
Herbert Xu
c5c2326059 bridge: Add multicast_querier toggle and disable queries by default
Sending general queries was implemented as an optimisation to speed
up convergence on start-up.  In order to prevent interference with
multicast routers a zero source address has to be used.

Unfortunately these packets appear to cause some multicast-aware
switches to misbehave, e.g., by disrupting multicast packets to us.

Since the multicast snooping feature still functions without sending
our own queries, this patch will change the default to not send
queries.

For those that need queries in order to speed up convergence on start-up,
a toggle is provided to restore the previous behaviour.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-15 12:51:35 -04:00
Herbert Xu
c83b8fab06 bridge: Restart queries when last querier expires
As it stands when we discover that a real querier (one that queries
with a non-zero source address) we stop querying.  However, even
after said querier has fallen off the edge of the earth, we will
never restart querying (unless the bridge itself is restarted).

This patch fixes this by kicking our own querier into gear when
the timer for other queriers expire.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-15 12:51:34 -04:00
Herbert Xu
748572162a bridge: Add br_multicast_start_querier
This patch adds the helper br_multicast_start_querier so that
the code which starts the queriers in br_multicast_toggle can
be reused elsewhere.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-15 12:51:34 -04:00
Eric Dumazet
95c9617472 net: cleanup unsigned to unsigned int
Use of "unsigned int" is preferred to bare "unsigned" in net tree.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-15 12:44:40 -04:00
Daniel Baluta
5e73ea1a31 ipv4: fix checkpatch errors
Fix checkpatch errors of the following type:
	* ERROR: "foo * bar" should be "foo *bar"
	* ERROR: "(foo*)" should be "(foo *)"

Signed-off-by: Daniel Baluta <dbaluta@ixiacom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-15 12:37:19 -04:00
David S. Miller
cf22f9a2b8 ipv6: Remove unused argument to addrconf_dad_start().
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-14 21:37:40 -04:00
Vijay Subramanian
a8cb05b238 tcp: Remove redundant code entering quickack mode
tcp_enter_quickack_mode() already calls tcp_incr_quickack() and sets
icsk->icsk_ack.ato  to TCP_ATO_MIN. This patch removes the duplication.

Signed-off-by: Vijay Subramanian <subramanian.vijay@gmail.com>
Reviewed-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-14 15:29:02 -04:00
Alex Copot
aacd9289af tcp: bind() use stronger condition for bind_conflict
We must try harder to get unique (addr, port) pairs when
doing port autoselection for sockets with SO_REUSEADDR
option set.

We achieve this by adding a relaxation parameter to
inet_csk_bind_conflict. When 'relax' parameter is off
we return a conflict whenever the current searched
pair (addr, port) is not unique.

This tries to address the problems reported in patch:
	8d238b25b1
	Revert "tcp: bind() fix when many ports are bound"

Tests where ran for creating and binding(0) many sockets
on 100 IPs. The results are, on average:

	* 60000 sockets, 600 ports / IP:
		* 0.210 s, 620 (IP, port) duplicates without patch
		* 0.219 s, no duplicates with patch
	* 100000 sockets, 1000 ports / IP:
		* 0.371 s, 1720 duplicates without patch
		* 0.373 s, no duplicates with patch
	* 200000 sockets, 2000 ports / IP:
		* 0.766 s, 6900 duplicates without patch
		* 0.768 s, no duplicates with patch
	* 500000 sockets, 5000 ports / IP:
		* 2.227 s, 41500 duplicates without patch
		* 2.284 s, no duplicates with patch

Signed-off-by: Alex Copot <alex.mihai.c@gmail.com>
Signed-off-by: Daniel Baluta <dbaluta@ixiacom.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-14 15:28:55 -04:00
Eric Dumazet
c72e118334 inet: makes syn_ack_timeout mandatory
There are two struct request_sock_ops providers, tcp and dccp.

inet_csk_reqsk_queue_prune() can avoid testing syn_ack_timeout being
NULL if we make it non NULL like syn_ack_timeout

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Cc: dccp@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-14 15:24:26 -04:00
Eric Dumazet
fd4f2cead6 tcp: RFC6298 supersedes RFC2988bis
Updates some comments to track RFC6298

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: H.K. Jerry Chu <hkchu@google.com>
Cc: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-14 15:24:26 -04:00
stephen hemminger
87b6d218f3 tunnel: implement 64 bits statistics
Convert the per-cpu statistics kept for GRE, IPIP, and SIT tunnels
to use 64 bit statistics.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-14 14:47:05 -04:00
Rémi Denis-Courmont
8e052e5289 Phonet: missing headers (sparse)
Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-13 14:03:16 -04:00
Rémi Denis-Courmont
3c490a87f6 Phonet: phonet_net_id can be static (sparse)
Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-13 14:03:16 -04:00
Hiroaki SHIMODA
dcd2ba92e8 neighbour: Make neigh_table_init_no_netlink() static.
neigh_table_init_no_netlink() is only used in net/core/neighbour.c file.

Signed-off-by: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-13 14:00:44 -04:00
Eric Dumazet
447167bf56 udp: intoduce udp_encap_needed static_key
Most machines dont use UDP encapsulation (L2TP)

Adds a static_key so that udp_queue_rcv_skb() doesnt have to perform a
test if L2TP never setup the encap_rcv on a socket.

Idea of this patch came after Simon Horman proposal to add a hook on TCP
as well.

If static_key is not yet enabled, the fast path does a single JMP .

When static_key is enabled, JMP destination is patched to reach the real
encap_type/encap_rcv logic, possibly adding cache misses.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Simon Horman <horms@verge.net.au>
Cc: dev@openvswitch.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-13 13:39:37 -04:00
stephen hemminger
efacb309b5 rtnetlink & bonding: change args got get_tx_queues
Change get_tx_queues, drop unsused arg/return value real_tx_queues,
and use return by value (with error) rather than call by reference.

Probably bonding should just change to LLTX and the whole get_tx_queues
API could disappear!

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-13 13:31:00 -04:00
David Ward
6f66cdc3e5 net/garp: fix GID rbtree ordering
The comparison operators were backwards in both garp_attr_lookup and
garp_attr_create, so the entire GID rbtree was in reverse order.
(There was no practical side effect to this though, except that PDUs
were sent with attributes listed in reverse order, which is still
valid by the protocol. This change is only for clarity.)

Signed-off-by: David Ward <david.ward@ll.mit.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-13 13:10:33 -04:00
David Woodhouse
9d02daf754 pppoatm: Fix excessive queue bloat
We discovered that PPPoATM has an excessively deep transmit queue. A
queue the size of the default socket send buffer (wmem_default) is
maintained between the PPP generic core and the ATM device.

Fix it to queue a maximum of *two* packets. The one the ATM device is
currently working on, and one more for the ATM driver to process
immediately in its TX done interrupt handler. The PPP core is designed
to feed packets to the channel with minimal latency, so that really
ought to be enough to keep the ATM device busy.

While we're at it, fix the fact that we were triggering the wakeup
tasklet on *every* pppoatm_pop() call. The comment saying "this is
inefficient, but doing it right is too hard" turns out to be overly
pessimistic... I think :)

On machines like the Traverse Geos, with a slow Geode CPU and two
high-speed ADSL2+ interfaces, there were reports of extremely high CPU
usage which could partly be attributed to the extra wakeups.

(The wakeup handling could actually be made a whole lot easier if we
 stop checking sk->sk_sndbuf altogether. Given that we now only queue
 *two* packets ever, one wonders what the point is. As it is, you could
 already deadlock the thing by setting the sk_sndbuf to a value lower
 than the MTU of the device, and it'd just block for ever.)

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-13 13:03:45 -04:00
Gao feng
1716a96101 ipv6: fix problem with expired dst cache
If the ipv6 dst cache which copy from the dst generated by ICMPV6 RA packet.
this dst cache will not check expire because it has no RTF_EXPIRES flag.
So this dst cache will always be used until the dst gc run.

Change the struct dst_entry,add a union contains new pointer from and expires.
When rt6_info.rt6i_flags has no RTF_EXPIRES flag,the dst.expires has no use.
we can use this field to point to where the dst cache copy from.
The dst.from is only used in IPV6.

rt6_check_expired check if rt6_info.dst.from is expired.

ip6_rt_copy only set dst.from when the ort has flag RTF_ADDRCONF
and RTF_DEFAULT.then hold the ort.

ip6_dst_destroy release the ort.

Add some functions to operate the RTF_EXPIRES flag and expires(from) together.
and change the code to use these new adding functions.

Changes from v5:
modify ip6_route_add and ndisc_router_discovery to use new adding functions.

Only set dst.from when the ort has flag RTF_ADDRCONF
and RTF_DEFAULT.then hold the ort.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-13 12:58:29 -04:00
Dmitry Tarnyagin
447648128e caif: set traffic class for caif packets
Set traffic class for CAIF packets, based on socket
priority, CAIF protocol type, or type of message.

Traffic class mapping for different packet types:
 - control:       TC_PRIO_CONTROL;
 - flow control:  TC_PRIO_CONTROL;
 - at:            TC_PRIO_CONTROL;
 - rfm:           TC_PRIO_INTERACTIVE_BULK;
 - other sockets: equals to socket's TC;
 - network data:  no change.

Signed-off-by: Dmitry Tarnyagin <dmitry.tarnyagin@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-13 11:37:36 -04:00
David S. Miller
e65ac4d545 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull in the 'net' tree to get CAIF bug fixes upon which
the following set of CAIF feature patches depend.

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-13 11:36:48 -04:00
Tomasz Gregorek
5c699fb7d8 caif: Fix memory leakage in the chnl_net.c.
Added kfree_skb() calls in the chnk_net.c file on
the error paths.

Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-13 11:01:44 -04:00
James Chapman
c9be48dc8b l2tp: don't overwrite source address in l2tp_ip_bind()
Applications using L2TP/IP sockets want to be able to bind() an L2TP/IP
socket to set the local tunnel id while leaving the auto-assigned source
address alone. So if no source address is supplied, don't overwrite
the address already stored in the socket.

Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-13 11:01:44 -04:00
James Chapman
d1f224ae18 l2tp: fix refcount leak in l2tp_ip sockets
The l2tp_ip socket close handler does not update the module refcount
correctly which prevents module unload after the first bind() call on
an L2TPv3 IP encapulation socket.

Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-13 11:01:44 -04:00
Julia Lawall
89eb06f11c net/key/af_key.c: add missing kfree_skb
At the point of this error-handling code, alloc_skb has succeded, so free
the resulting skb by jumping to the err label.

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-13 11:01:44 -04:00
Eric W. Biederman
03478756b1 phonet: Sort out initiailziation and cleanup code.
Recently an oops was reported in phonet if there was a failure during
network namespace creation.

[  163.733755] ------------[ cut here ]------------
[  163.734501] kernel BUG at include/net/netns/generic.h:45!
[  163.734501] invalid opcode: 0000 [#1] PREEMPT SMP
[  163.734501] CPU 2
[  163.734501] Pid: 19145, comm: trinity Tainted: G        W 3.4.0-rc1-next-20120405-sasha-dirty #57
[  163.734501] RIP: 0010:[<ffffffff824d6062>]  [<ffffffff824d6062>] phonet_pernet+0x182/0x1a0
[  163.734501] RSP: 0018:ffff8800674d5ca8  EFLAGS: 00010246
[  163.734501] RAX: 000000003fffffff RBX: 0000000000000000 RCX: ffff8800678c88d8
[  163.734501] RDX: 00000000003f4000 RSI: ffff8800678c8910 RDI: 0000000000000282
[  163.734501] RBP: ffff8800674d5cc8 R08: 0000000000000000 R09: 0000000000000000
[  163.734501] R10: 0000000000000000 R11: 0000000000000000 R12: ffff880068bec920
[  163.734501] R13: ffffffff836b90c0 R14: 0000000000000000 R15: 0000000000000000
[  163.734501] FS:  00007f055e8de700(0000) GS:ffff88007d000000(0000) knlGS:0000000000000000
[  163.734501] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  163.734501] CR2: 00007f055e6bb518 CR3: 0000000070c16000 CR4: 00000000000406e0
[  163.734501] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  163.734501] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  163.734501] Process trinity (pid: 19145, threadinfo ffff8800674d4000, task ffff8800678c8000)
[  163.734501] Stack:
[  163.734501]  ffffffff824d5f00 ffffffff810e2ec1 ffff880067ae0000 00000000ffffffd4
[  163.734501]  ffff8800674d5cf8 ffffffff824d667a ffff880067ae0000 00000000ffffffd4
[  163.734501]  ffffffff836b90c0 0000000000000000 ffff8800674d5d18 ffffffff824d707d
[  163.734501] Call Trace:
[  163.734501]  [<ffffffff824d5f00>] ? phonet_pernet+0x20/0x1a0
[  163.734501]  [<ffffffff810e2ec1>] ? get_parent_ip+0x11/0x50
[  163.734501]  [<ffffffff824d667a>] phonet_device_destroy+0x1a/0x100
[  163.734501]  [<ffffffff824d707d>] phonet_device_notify+0x3d/0x50
[  163.734501]  [<ffffffff810dd96e>] notifier_call_chain+0xee/0x130
[  163.734501]  [<ffffffff810dd9d1>] raw_notifier_call_chain+0x11/0x20
[  163.734501]  [<ffffffff821cce12>] call_netdevice_notifiers+0x52/0x60
[  163.734501]  [<ffffffff821cd235>] rollback_registered_many+0x185/0x270
[  163.734501]  [<ffffffff821cd334>] unregister_netdevice_many+0x14/0x60
[  163.734501]  [<ffffffff823123e3>] ipip_exit_net+0x1b3/0x1d0
[  163.734501]  [<ffffffff82312230>] ? ipip_rcv+0x420/0x420
[  163.734501]  [<ffffffff821c8515>] ops_exit_list+0x35/0x70
[  163.734501]  [<ffffffff821c911b>] setup_net+0xab/0xe0
[  163.734501]  [<ffffffff821c9416>] copy_net_ns+0x76/0x100
[  163.734501]  [<ffffffff810dc92b>] create_new_namespaces+0xfb/0x190
[  163.734501]  [<ffffffff810dca21>] unshare_nsproxy_namespaces+0x61/0x80
[  163.734501]  [<ffffffff810afd1f>] sys_unshare+0xff/0x290
[  163.734501]  [<ffffffff8187622e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[  163.734501]  [<ffffffff82665539>] system_call_fastpath+0x16/0x1b
[  163.734501] Code: e0 c3 fe 66 0f 1f 44 00 00 48 c7 c2 40 60 4d 82 be 01 00 00 00 48 c7 c7 80 d1 23 83 e8 48 2a c4 fe e8 73 06 c8 fe 48 85 db 75 0e <0f> 0b 0f 1f 40 00 eb fe 66 0f 1f 44 00 00 48 83 c4 10 48 89 d8
[  163.734501] RIP  [<ffffffff824d6062>] phonet_pernet+0x182/0x1a0
[  163.734501]  RSP <ffff8800674d5ca8>
[  163.861289] ---[ end trace fb5615826c548066 ]---

After investigation it turns out there were two issues.
1) Phonet was not implementing network devices but was using register_pernet_device
   instead of register_pernet_subsys.

   This was allowing there to be cases when phonenet was not initialized and
   the phonet net_generic was not set for a network namespace when network
   device events were being reported on the netdevice_notifier for a network
   namespace leading to the oops above.

2) phonet_exit_net was implementing a confusing and special case of handling all
   network devices from going away that it was hard to see was correct, and would
   only occur when the phonet module was removed.

   Now that unregister_netdevice_notifier has been modified to synthesize unregistration
   events for the network devices that are extant when called this confusing special
   case in phonet_exit_net is no longer needed.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-13 11:01:43 -04:00
Eric W. Biederman
7d3d43dab4 net: In unregister_netdevice_notifier unregister the netdevices.
We already synthesize events in register_netdevice_notifier and synthesizing
events in unregister_netdevice_notifier allows to us remove the need for
special case cleanup code.

This change should be safe as it adds no new cases for existing callers
of unregiser_netdevice_notifier to handle.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-13 11:01:43 -04:00
David S. Miller
816a7854d5 Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next 2012-04-12 20:12:31 -04:00
David S. Miller
011e3c6325 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2012-04-12 19:41:23 -04:00
Eldad Zack
c1412fce7e net/ipv6/exthdrs.c: Strict PadN option checking
Added strict checking of PadN, as PadN can be used to increase header
size and thus push the protocol header into the 2nd fragment.

PadN is used to align the options within the Hop-by-Hop or
Destination Options header to 64-bit boundaries. The maximum valid
size is thus 7 bytes.
RFC 4942 recommends to actively check the "payload" itself and
ensure that it contains only zeroes.

See also RFC 4942 section 2.1.9.5.

Signed-off-by: Eldad Zack <eldad@fogrefinery.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-12 17:36:44 -04:00
Linus Torvalds
174808af90 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Fix bluetooth userland regression reported by Keith Packard, from
    Gustavo Padovan.

 2) Revert ath9k PS idle change, from Sujith Manoharan.

 3) Correct default TCP memory limits (again), from Eric Dumazet.

 4) Fix tcp_rcv_rtt_update() accidental use of unscaled RTT, from Neal
    Cardwell.

 5) We made a facility for layers like wireless to say how much tailroom
    they need in the SKB for link layer stuff such as wireless
    encryption etc., but TCP works hard to fill every SKB out to the end
    defeating this specification.

    This leads to every TCP packet getting reallocated by the wireless
    code in order to have the right amount of tailroom available.

    Fix TCP to only fill SKBs out to the real amount of data area it
    asked for during the allocation, this way it won't eat into the
    slack added for the device's tailroom needs.

    Reported by Marc Merlin and fixed by Eric Dumazet.

 6) Leaks, endian bugs, and new device IDs in bluetooth from Santosh
    Nayak, João Paulo Rechi Vita, Cho, Yu-Chen, Andrei Emeltchenko,
    AceLan Kao, and Andrei Emeltchenko.

 7) OOPS on tty_close fix in bluetooth's hci_ldisc from Johan Hovold.

 8) netfilter erroneously scales TCP window twice, fix from Changli Gao.

 9) Memleak fix in wext-core from Julia Lawall.

10) Consistently handle invalid TCP packets in ipv4 vs.  ipv6 conntrack,
    from Jozsef Kadlecsik.

11) Validate IP header length properly in netfilter conntrack's
    ipv4_get_l4proto().

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (39 commits)
  NFC: Fix the LLCP Tx fragmentation loop
  rtlwifi: Add missing DMA buffer unmapping for PCI drivers
  rtlwifi: Preallocate USB read buffers and eliminate kalloc in read routine
  tcp: avoid order-1 allocations on wifi and tx path
  net: allow pskb_expand_head() to get maximum tailroom
  bridge: Do not send queries on multicast group leaves
  MAINTAINERS: Mark NATSEMI driver as orphan'd.
  tcp: fix tcp_rcv_rtt_update() use of an unscaled RTT sample
  tcp: restore correct limit
  Revert "ath9k: fix going to full-sleep on PS idle"
  rt2x00: Fix rfkill_polling register function.
  bcma: fix build error on MIPS; implicit pcibios_enable_device
  netfilter: nf_conntrack: fix incorrect logic in nf_conntrack_init_net
  netfilter: nf_ct_ipv4: packets with wrong ihl are invalid
  netfilter: nf_ct_ipv4: handle invalid IPv4 and IPv6 packets consistently
  net/wireless/wext-core.c: add missing kfree
  rtlwifi: Fix oops on rate-control failure
  mac80211: Convert WARN_ON to WARN_ON_ONCE
  rtlwifi: rtl8192de: Fix firmware initialization
  nl80211: ensure interface is up in various APIs
  ...
2012-04-12 14:04:33 -07:00
Jeffrin Jose
46ba5b23c3 net: Fixed coding style issues relating to braces.
Fixed coding style issues in net/core/utils.c
in relation with braces placement.

Signed-off-by: Jeffrin Jose <ahiliation@yahoo.co.in>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-12 16:35:48 -04:00
Eldad Zack
6fdbd1648b net/ipv6/ipv6_sockglue.c: Removed redundant extern
extern int sysctl_mld_max_msf is already defined in linux/ipv6.h.

Signed-off-by: Eldad Zack <eldad@fogrefinery.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-12 16:14:47 -04:00
Shuah Khan
e1e420c71b net/core: simple_strtoul cleanup
Changed net/core/net-sysfs.c: netdev_store() to use kstrtoul()
instead of obsolete simple_strtoul().

Signed-off-by: Shuah Khan <shuahkhan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-12 16:09:08 -04:00
Alexey I. Froloff
e35f30c131 Treat ND option 31 as userland (DNSSL support)
As specified in RFC6106, DNSSL option contains one or more domain names
of DNS suffixes.  8-bit identifier of the DNSSL option type as assigned
by the IANA is 31.  This option should also be treated as userland.

Signed-off-by: Alexey I. Froloff <raorn@raorn.name>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-12 15:56:57 -04:00
John W. Linville
7eab0f64a9 Merge branch 'master' into for-davem
Conflicts:
	drivers/net/wireless/iwlwifi/iwl-testmode.c
	net/wireless/nl80211.c
2012-04-12 14:41:59 -04:00