Add support for multiple hardware receive queues. The ingress traffic is hashed into one of the receive queues based on IP or TCP or both headers. The max no. of receive queues supported is 8.
Signed-off-by: Vasanthy Kolluri <vkolluri@cisco.com>
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
Signed-off-by: David Wang <dwang2@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
* 'upstream' of git://git.linux-mips.org/pub/scm/upstream-linus:
MIPS: O32 compat/N32: Fix to use compat syscall wrappers for AIO syscalls.
MAINTAINERS: Change list for ioc_serial to linux-serial.
SERIAL: ioc3_serial: Return -ENOMEM on memory allocation failure
MIPS: jz4740: Fix Kbuild Platform file.
MIPS: Repair Kbuild make clean breakage.
If the host is slow in reading data or doesn't read data at all,
blocking write calls not only blocked the program that called write()
but the entire guest itself.
To overcome this, let's not block till the host signals it has given
back the virtio ring element we passed it. Instead, send the buffer to
the host and return to userspace. This operation then becomes similar
to how non-blocking writes work, so let's use the existing code for this
path as well.
This code change also ensures blocking write calls do get blocked if
there's not enough room in the virtio ring as well as they don't return
-EAGAIN to userspace.
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Acked-by: Hans de Goede <hdegoede@redhat.com>
CC: stable@kernel.org
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
There is no point using RCU for dst we allocate for a very short time
(used once).
Change dst_release() to take DST_NOCACHE into account, but also change
skb_dst_set_noref() to force a refcount increment for such dst.
This is a _huge_ gain, because we dont waste memory to store xx thousand
of dsts. Instead of queueing them to RCU, we can free them instantly.
CPU caches can stay hot, re-using same memory blocks to hold temporary
dsts.
Note : remove unneeded smp_mb__before_atomic_dec(); in dst_release(),
since atomic_dec_return() implies a full memory barrier.
Stress test, 160.000.000 udp frames sent, IP route cache disabled
(DDOS).
Before:
real 0m38.091s
user 0m13.189s
sys 7m53.018s
After:
real 0m29.946s
user 0m12.157s
sys 7m40.605s
For reference, if IP route cache was enabled :
real 0m32.030s
user 0m10.521s
sys 8m15.243s
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch introduces netif_alloc_netdev_queues which is called from
register_device instead of alloc_netdev_mq. This makes TX queue
allocation symmetric with RX allocation. Also, queue locks allocation
is done in netdev_init_one_queue. Change set_real_num_tx_queues to
fail if requested number < 1 or greater than number of allocated
queues.
Signed-off-by: Tom Herbert <therbert@google.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Clean up in RX queue allocation. In netif_set_real_num_rx_queues
return error on attempt to set zero queues, or requested number is
greater than number of allocated queues. In netif_alloc_rx_queues,
do BUG_ON if queue_count is zero.
Signed-off-by: Tom Herbert <therbert@google.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In an erlier patch I modified napi_poll so that devices with IFF_MASTER polled
the per_cpu list instead of the device list for napi. I did this because the
bonding driver has no napi instances to poll, it instead expects to check the
slave devices napi instances, which napi_poll was unaware of. Looking at this
more closely however, I now see this isn't strictly needed. As the bond driver
poll_controller calls the slaves poll_controller via netpoll_poll_dev, which
recursively calls poll_napi on each slave, allowing those napi instances to get
serviced. The earlier patch isn't at all harmfull, its just not needed, so lets
revert it to make the code cleaner. Sorry for the noise,
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Reviewed-by: WANG Cong <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some recent testing in netpoll with bonding showed this backtrace
------------[ cut here ]------------
kernel BUG at drivers/net/bonding/bonding.h:134!
invalid opcode: 0000 [#1] SMP
last sysfs file: /sys/devices/pci0000:00/0000:00:1d.2/usb7/devnum
CPU 0
Pid: 1876, comm: rmmod Not tainted 2.6.36-rc3+ #10 D26928/
RIP: 0010:[<ffffffffa0514ba4>] [<ffffffffa0514ba4>] bond_uninit+0x6f4/0x7a0
RSP: 0018:ffff88003b1b5d58 EFLAGS: 00010296
RAX: ffff88003b9b6200 RBX: ffff8800373e8e00 RCX: 00000000000f4240
RDX: 00000000ffffffff RSI: 0000000000000286 RDI: 0000000000000286
RBP: ffff88003b1b5dc8 R08: 0000000000000000 R09: 00000001af7de920
R10: 0000000000000000 R11: ffff880002495e98 R12: ffff880037922700
R13: ffff880038c31000 R14: ffff880037922730 R15: 0000000000000286
FS: 00007f90e6d72700(0000) GS:ffff880002400000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 000000346f0d9ad0 CR3: 000000003b263000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process rmmod (pid: 1876, threadinfo ffff88003b1b4000, task ffff88003b36aa80)
Stack:
00000000ffffffff ffff88003b1b5d7a ffff8800379221e8 ffff880037922000
<0> ffff88003b1b5dc8 ffffffff813eb5fb ffff88003b1b5da8 0000000031b177a3
<0> ffff88003b1b5da8 ffff880037922000 ffff88003b1b5e48 ffff88003b1b5e48
Call Trace:
[<ffffffff813eb5fb>] ? rtmsg_ifinfo+0xcb/0xf0
[<ffffffff813daad8>] rollback_registered_many+0x168/0x280
[<ffffffff813dac09>] unregister_netdevice_many+0x19/0x80
[<ffffffff813e97b3>] __rtnl_kill_links+0x63/0x90
[<ffffffff813e980b>] __rtnl_link_unregister+0x2b/0x60
[<ffffffff813e9bde>] rtnl_link_unregister+0x1e/0x30
[<ffffffffa052124b>] bonding_exit+0x37/0x51 [bonding]
[<ffffffff81098b2e>] sys_delete_module+0x19e/0x270
[<ffffffff810bb2b2>] ? audit_syscall_entry+0x252/0x280
[<ffffffff8100b0b2>] system_call_fastpath+0x16/0x1b
RIP [<ffffffffa0514ba4>] bond_uninit+0x6f4/0x7a0 [bonding]
RSP <ffff88003b1b5d58>
---[ end trace 1395ad691cea24d1 ]---
It occurs because of my recent netpoll blocking patches, which I added to avoid
recursive deadlock in the bonding driver. It relies on some per cpu bits, but
the shutdown path forces some rescheduling as we cancel workqueues for the
driver and wait for some device refcounts. If after the forced reschedule, we
wind up on a different cpu we trigger the bughalt in unblock_netpoll_tx.
The fix is to remove the netpoll block/unblock calls from bond_release_all.
This is safe to do because bond_uninit, which is called via ndo_uninit in
rollback_registered_many, doesn't occur until we send a NETDEV_UNREGISTER event,
which triggers netconsole to remove us as a netpoll client, so we are guaranteed
not to recurse into our own tx path here.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Reviewed-by: WANG Cong <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
IOC3 is also being used on SGI MIPS systems but this particular driver is
only being used on IA64 systems so linux-mips made no sense as a list. Pat
also thinks linux-serial@vger.kernel.org is the better list.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
When running make clean, Kbuild doesn't process the .config file, so nothing
generates a platform-y variable. We can get it to descend into the platform
directories by setting $(obj-).
The dec Platform file was unconditionally setting platform-, obliterating
its previous contents and preventing some directories from being cleaned.
This is change to an append operation '+=' to allow cavium-octeon to be
cleaned.
Signed-off-by: David Daney <ddaney@caviumnetworks.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Patchwork: https://patchwork.linux-mips.org/patch/1718/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
kvm reloads the host's fs and gs blindly, however the underlying segment
descriptors may be invalid due to the user modifying the ldt after loading
them.
Fix by using the safe accessors (loadsegment() and load_gs_index()) instead
of home grown unsafe versions.
This is CVE-2010-3698.
KVM-Stable-Tag.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
The possible deadlock (on 57710 devices only) will prevent from the
device to generate interrupts.
Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Lists were initialized after the module was registered. Multiple ipvsadm
processes at module load triggered a race condition that resulted in a null
pointer dereference in do_ip_vs_get_ctl(). As a result, __ip_vs_mutex
was left locked preventing all further ipvsadm commands.
Signed-off-by: Eduardo J. Blanco <ejblanco@google.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
This code has been broken forever, but in several different and
creative ways.
So far as I can work out, the R6040 MAC filter has 4 exact-match
entries, the first of which the driver uses for its assigned unicast
address, plus a 64-entry hash-based filter for multicast addresses
(maybe unicast as well?).
The original version of this code would write the first 4 multicast
addresses as exact-match entries from offset 1 (bug #1: there is no
entry 4 so this could write to some PHY registers). It would fill the
remainder of the exact-match entries with the broadcast address (bug #2:
this would overwrite the last used entry). If more than 4 multicast
addresses were configured, it would set up the hash table, write some
random crap to the MAC control register (bug #3) and finally walk off
the end of the list when filling the exact-match entries (bug #4).
All of this seems to be pointless, since it sets the promiscuous bit
when the interface is made promiscuous or if >4 multicast addresses
are enabled, and never clears it (bug #5, masking bug #2).
The recent(ish) changes to the multicast list fixed bug #4, but
completely removed the limit on iteration over the exact-match entries
(bug #6).
Bug #4 was reported as
<https://bugzilla.kernel.org/show_bug.cgi?id=15355> and more recently
as <http://bugs.debian.org/600155>. Florian Fainelli attempted to fix
these in commit 3bcf8229a8, but that
actually dealt with bugs #1-3, bug #4 having been fixed in mainline at
that point.
That commit fixes the most important current bug #6.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Cc: stable@kernel.org [2.6.35 only]
Signed-off-by: David S. Miller <davem@davemloft.net>
Convert inetdev_by_index() to not increment in_dev refcount.
Callers hold RCU or RTNL, and should not decrement in_dev refcount.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We hold RTNL in ip_mc_find_dev(), no need to touch device refcount.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If you are genuinely using one of these legacy MCA drivers
then you are tragically on hardware where you really don't
have the extra CPU cycles to be wasting on this.
In addition, it makes two less cases for people to inadvertently
blindly copy flags from without explicitly thinking whether it
makes sense -- see the addition to feature-removal.txt as per
commit 9d9b8fb0e5.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
IPv6 encapsulation uses a bad source address for the tunnel.
i.e. VIP will be used as local-addr and encap. dst addr.
Decapsulation will not accept this.
Example
LVS (eth1 2003::2:0:1/96, VIP 2003::2:0:100)
(eth0 2003::1:0:1/96)
RS (ethX 2003::1:0:5/96)
tcpdump
2003::2:0:100 > 2003::1:0:5: IP6 (hlim 63, next-header TCP (6) payload length: 40) 2003::3:0:10.50991 > 2003::2:0:100.http: Flags [S], cksum 0x7312 (correct), seq 3006460279, win 5760, options [mss 1440,sackOK,TS val 1904932 ecr 0,nop,wscale 3], length 0
In Linux IPv6 impl. you can't have a tunnel with an any cast address
receiving packets (I have not tried to interpret RFC 2473)
To have receive capabilities the tunnel must have:
- Local address set as multicast addr or an unicast addr
- Remote address set as an unicast addr.
- Loop back addres or Link local address are not allowed.
This causes us to setup a tunnel in the Real Server with the
LVS as the remote address, here you can't use the VIP address since it's
used inside the tunnel.
Solution
Use outgoing interface IPv6 address (match against the destination).
i.e. use ip6_route_output() to look up the route cache and
then use ipv6_dev_get_saddr(...) to set the source address of the
encapsulated packet.
Additionally, cache the results in new destination
fields: dst_cookie and dst_saddr and properly check the
returned dst from ip6_route_output. We now add xfrm_lookup
call only for the tunneling method where the source address
is a local one.
Signed-off-by:Hans Schillstrom <hans.schillstrom@ericsson.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
This patch allows to listen to events that inform about
expectations destroyed.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
* 'upstream' of git://git.linux-mips.org/pub/scm/upstream-linus:
MIPS: Enable ISA_DMA_API config to fix build failure
MIPS: 32-bit: Fix build failure in asm/fcntl.h
MIPS: Remove all generated vmlinuz* files on "make clean"
MIPS: do_sigaltstack() expects userland pointers
MIPS: Fix error values in case of bad_stack
MIPS: Sanitize restart logics
MIPS: secure_computing, syscall audit: syscall number should in r2, not r0.
MIPS: Don't block signals if we'd failed to setup a sigframe
This patch reverts the driver to enabling/disabling the NFC interrupt
mask rather than enabling/disabling the system interrupt. This cleans
up the driver so that it doesn't rely on interrupts being disabled
within the interrupt handler.
For i.MX21 we keep the current behaviour, that is calling
enable_irq/disable_irq_nosync to enable/disable interrupts. This patch
is based on earlier work by John Ogness.
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Acked-by: John Ogness <john.ogness@linutronix.de>
Tested-by: John Ogness <john.ogness@linutronix.de>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'for-linus/i2c/2636-rc8' of git://git.fluff.org/bjdooks/linux:
i2c-imx: do not allow interruptions when waiting for I2C to complete
i2c-davinci: Fix TX setup for more SoCs
Add ISA_DMA_API config item and select it when GENERIC_ISA_DMA enabled.
This fixes build failure on allmodconfig like following:
CC sound/isa/es18xx.o
sound/isa/es18xx.c: In function 'snd_es18xx_playback1_prepare':
sound/isa/es18xx.c:501:9: error: implicit declaration of function 'snd_dma_program'
sound/isa/es18xx.c: In function 'snd_es18xx_playback_pointer':
sound/isa/es18xx.c:818:3: error: implicit declaration of function 'snd_dma_pointer'
make[3]: *** [sound/isa/es18xx.o] Error 1
make[2]: *** [sound/isa/es18xx.o] Error 2
make[1]: *** [sub-make] Error 2
make: *** [all] Error 2
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/1717/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>