Add the ability to set/clear labels assigned to a conntrack
via ctnetlink.
To allow userspace to only alter specific bits, Pablo suggested to add
a new CTA_LABELS_MASK attribute:
The new set of active labels is then determined via
active = (active & ~mask) ^ changeset
i.e., the mask selects those bits in the existing set that should be
changed.
This follows the same method already used by MARK and CONNMARK targets.
Omitting CTA_LABELS_MASK is the same as setting all bits in CTA_LABELS_MASK
to 1: The existing set is replaced by the one from userspace.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Introduce CTA_LABELS attribute to send a bit-vector of currently active labels
to userspace.
Future patch will permit userspace to also set/delete active labels.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
similar to connmarks, except labels are bit-based; i.e.
all labels may be attached to a flow at the same time.
Up to 128 labels are supported. Supporting more labels
is possible, but requires increasing the ct offset delta
from u8 to u16 type due to increased extension sizes.
Mapping of bit-identifier to label name is done in userspace.
The extension is enabled at run-time once "-m connlabel" netfilter
rules are added.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Most SIP devices use a source port of 5060/udp on SIP requests, so the
response automatically comes back to port 5060:
phone_ip:5060 -> proxy_ip:5060 REGISTER
proxy_ip:5060 -> phone_ip:5060 100 Trying
The newer Cisco IP phones, however, use a randomly chosen high source
port for the SIP request but expect the response on port 5060:
phone_ip:49173 -> proxy_ip:5060 REGISTER
proxy_ip:5060 -> phone_ip:5060 100 Trying
Standard Linux NAT, with or without nf_nat_sip, will send the reply back
to port 49173, not 5060:
phone_ip:49173 -> proxy_ip:5060 REGISTER
proxy_ip:5060 -> phone_ip:49173 100 Trying
But the phone is not listening on 49173, so it will never see the reply.
This patch modifies nf_*_sip to work around this quirk by extracting
the SIP response port from the Via: header, iff the source IP in the
packet header matches the source IP in the SIP request.
Signed-off-by: Kevin Cernekee <cernekee@gmail.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
bnx2x does an internal GRO pass but doesn't provide gso_segs, thus
breaking qdisc_pkt_len_init() in case ingress qdisc is used.
We store gso_segs in NAPI_GRO_CB(skb)->count, where tcp_gro_complete()
expects to find the number of aggregated segments.
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix the 64bit optimized version of ipv6_prefix_equal to convert the
bitmask to network byte order only after the bit-shift.
The bug was introduced in:
3867517 ipv6: 64bit version of ipv6_prefix_equal().
Signed-off-by: Fabio Baltieri <fabio.baltieri@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Increase the amount of memory usage limits for incomplete
IP fragments.
Arguing for new thresh high/low values:
High threshold = 4 MBytes
Low threshold = 3 MBytes
The fragmentation memory accounting code, tries to account for the
real memory usage, by measuring both the size of frag queue struct
(inet_frag_queue (ipv4:ipq/ipv6:frag_queue)) and the SKB's truesize.
We want to be able to handle/hold-on-to enough fragments, to ensure
good performance, without causing incomplete fragments to hurt
scalability, by causing the number of inet_frag_queue to grow too much
(resulting longer searches for frag queues).
For IPv4, how much memory does the largest frag consume.
Maximum size fragment is 64K, which is approx 44 fragments with
MTU(1500) sized packets. Sizeof(struct ipq) is 200. A 1500 byte
packet results in a truesize of 2944 (not 2048 as I first assumed)
(44*2944)+200 = 129736 bytes
The current default high thresh of 262144 bytes, is obviously
problematic, as only two 64K fragments can fit in the queue at the
same time.
How many 64K fragment can we fit into 4 MBytes:
4*2^20/((44*2944)+200) = 32.34 fragment in queues
An attacker could send a separate/distinct fake fragment packets per
queue, causing us to allocate one inet_frag_queue per packet, and thus
attacking the hash table and its lists.
How many frag queue do we need to store, and given a current hash size
of 64, what is the average list length.
Using one MTU sized fragment per inet_frag_queue, each consuming
(2944+200) 3144 bytes.
4*2^20/(2944+200) = 1334 frag queues -> 21 avg list length
An attack could send small fragments, the smallest packet I could send
resulted in a truesize of 896 bytes (I'm a little surprised by this).
4*2^20/(896+200) = 3827 frag queues -> 59 avg list length
When increasing these number, we also need to followup with
improvements, that is going to help scalability. Simply increasing
the hash size, is not enough as the current implementation does not
have a per hash bucket locking.
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
While a privileged program can open a raw socket, attach some
restrictive filter and drop its privileges (or send the socket to an
unprivileged program through some Unix socket), the filter can still
be removed or modified by the unprivileged program. This commit adds a
socket option to lock the filter (SO_LOCK_FILTER) preventing any
modification of a socket filter program.
This is similar to OpenBSD BIOCLOCK ioctl on bpf sockets, except even
root is not allowed change/drop the filter.
The state of the lock can be read with getsockopt(). No error is
triggered if the state is not changed. -EPERM is returned when a user
tries to remove the lock or to change/remove the filter while the lock
is active. The check is done directly in sk_attach_filter() and
sk_detach_filter() and does not affect only setsockopt() syscall.
Signed-off-by: Vincent Bernat <bernat@luffy.cx>
Signed-off-by: David S. Miller <davem@davemloft.net>
After commit 6f0333b ("r8169: use 50% less ram for RX ring") the rx
ring buffers are always copied making dirty_rx useless.
Signed-off-by: Timo Teräs <timo.teras@iki.fi>
Acked-by: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some new APIs will require tracing a chandef without
it being part of a channel context, so separate out
the tracing macros for that.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
To ease further DFS development regarding interface combinations, use
the interface combinations structure to test for radar capabilities.
Drivers can specify which channel widths they support, and in which
modes. Right now only a single AP interface is allowed, but as the
DFS code evolves other combinations can be enabled.
Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
The NL80211_ATTR_USE_MFP attribute was originally added for
NL80211_CMD_ASSOCIATE, but it is actually as useful (if not even more
useful) with NL80211_CMD_CONNECT, so process that attribute with the
connect command, too.
Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Mesh PERR action frames are robust and thus may be encrypted, so add
proper head/tailroom to allow this. Fixes this warning when operating
a Mesh STA on ath5k:
WARNING: at net/mac80211/wpa.c:427 ccmp_encrypt_skb.isra.5+0x7b/0x1a0 [mac80211]()
Call Trace:
[<c011c5e7>] warn_slowpath_common+0x63/0x78
[<c011c60b>] warn_slowpath_null+0xf/0x13
[<e090621d>] ccmp_encrypt_skb.isra.5+0x7b/0x1a0 [mac80211]
[<e090685c>] ieee80211_crypto_ccmp_encrypt+0x1f/0x37 [mac80211]
[<e0917113>] invoke_tx_handlers+0xcad/0x10bd [mac80211]
[<e0917665>] ieee80211_tx+0x87/0xb3 [mac80211]
[<e0918932>] ieee80211_tx_pending+0xcc/0x170 [mac80211]
[<c0121c43>] tasklet_action+0x3e/0x65
Signed-off-by: Bob Copeland <me@bobcopeland.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
A user reported warnings in ath5k due to transmitting frames with no
rates set up. The frames were Mesh PERR frames, and some debugging
showed an empty control block with just the vif pointer:
> [ 562.522682] XXX txinfo: 00000000: 00 00 00 00 00 00 00 00 00 00 00
> 00 00 00 00 00 ................
> [ 562.522688] XXX txinfo: 00000010: 00 00 00 00 00 00 00 00 54 b8 f2
> db 00 00 00 00 ........T.......
> [ 562.522693] XXX txinfo: 00000020: 00 00 00 00 00 00 00 00 00 00 00
> 00 00 00 00 00 ................
Set the IEEE80211_TX_INTFL_NEED_TXPROCESSING flag to ensure that
rate control gets run before the frame is sent.
Signed-off-by: Bob Copeland <me@bobcopeland.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
As per email discussion Jouni Malinen pointed out that:
"P2P message exchanges can be executed on the current operating channel
of any operation (both P2P and non-P2P station). These can be on 5 GHz
and even on 60 GHz (so yes, you _can_ do GO Negotiation on 60 GHz).
As an example, it would be possible to receive a GO Negotiation Request
frame on a 5 GHz only radio and then to complete GO Negotiation on that
band. This can happen both when connected to a P2P group (through client
discoverability mechanism) and when connected to a legacy AP (assuming
the station receive Probe Request frame from full scan in the beginning
of P2P device discovery)."
This means that P2P messages can be sent over different radio devices.
However, these should use the same P2P device address so it should be
able to provision this from user-space. This patch adds a parameter for
this to struct vif_params which should only be used during creation of
the P2P device interface.
Cc: Jouni Malinen <j@w1.fi>
Cc: Greg Goldman <ggoldman@broadcom.com>
Cc: Jithu Jance <jithu@broadcom.com>
Signed-off-by: Arend van Spriel <arend@broadcom.com>
[add error checking]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Add the nl80211_mesh_power_mode enumeration which holds possible
values for the mesh power mode. These modes are unknown, active,
light sleep and deep sleep.
Add power_mode entry to the mesh config structure to hold the
user-configured default mesh power mode. This value will be used
for new peer links.
Add the dot11MeshAwakeWindowDuration value to the mesh config.
The awake window is a duration in TU describing how long the STA
will stay awake after transmitting its beacon in PS mode.
Add access routines to:
- get/set local link-specific power mode (STA)
- get remote STA's link-specific power mode (STA)
- get remote STA's non-peer power mode (STA)
- get/set default mesh power mode (mesh config)
- get/set mesh awake window duration (mesh config)
All config changes may be done at mesh runtime and take effect
immediately.
Signed-off-by: Marco Porsch <marco@cozybit.com>
Signed-off-by: Ivan Bezyazychnyy <ivan.bezyazychnyy@gmail.com>
Signed-off-by: Mike Krinkin <krinkin.m.u@gmail.com>
[fix commit message line length, error handling in set station]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Move the default mesh beacon interval and DTIM period to cfg80211
and make them accessible to nl80211. This enables setting both
values when joining an MBSS.
Previously the DTIM parameter was not set by mac80211 so the
driver's default value was used.
Signed-off-by: Marco Porsch <marco@cozybit.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
This functions will be used for mesh beacons, too.
Signed-off-by: Marco Porsch <marco@cozybit.com>
[some formatting fixes]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
The established peer link count is indicated in mesh beacons and
used for other internal tasks. Previously it was not updated when
authenticated peering is performed in userspace.
Signed-off-by: Marco Porsch <marco@cozybit.com>
Acked-by: Thomas Pedersen <thomas@cozybit.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Ranges are taken from IEEE 802.11-2012, common sense or current
implementation requirements.
Signed-off-by: Marco Porsch <marco@cozybit.com>
Acked-by: Thomas Pedersen <thomas@cozybit.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
v4: hold rtnl lock for the whole netpoll_setup()
v3: remove the comment
v2: use RCU read lock
This patch fixes the following warning:
[ 72.013864] RTNL: assertion failed at net/core/dev.c (4955)
[ 72.017758] Pid: 668, comm: netpoll-prep-v6 Not tainted 3.8.0-rc1+ #474
[ 72.019582] Call Trace:
[ 72.020295] [<ffffffff8176653d>] netdev_master_upper_dev_get+0x35/0x58
[ 72.022545] [<ffffffff81784edd>] netpoll_setup+0x61/0x340
[ 72.024846] [<ffffffff815d837e>] store_enabled+0x82/0xc3
[ 72.027466] [<ffffffff815d7e51>] netconsole_target_attr_store+0x35/0x37
[ 72.029348] [<ffffffff811c3479>] configfs_write_file+0xe2/0x10c
[ 72.030959] [<ffffffff8115d239>] vfs_write+0xaf/0xf6
[ 72.032359] [<ffffffff81978a05>] ? sysret_check+0x22/0x5d
[ 72.033824] [<ffffffff8115d453>] sys_write+0x5c/0x84
[ 72.035328] [<ffffffff819789d9>] system_call_fastpath+0x16/0x1b
In case of other races, hold rtnl lock for the entire netpoll_setup() function.
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The VMXNET3 device provides RSS hash value for received packets,
but it is not being used.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rather than generating a different RSS key on each boot, just use
a predetermined value that will map same flow to same value on
every device for more predictable testing. This is already done
on most hardware drivers.
Initial key value just some arbitrary bits extracted once
from /dev/random.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This static variable is never set, it initializes to 0 which
is VMXNET3_INTR_BUDDYSHARE, and never changes.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
An atomic counter of devices present is maintained but never used.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use the standard netdev_xxx() and dev_xxx() wrappers to format
log messages.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use netdev_dbg() rather than dev_dbg() because the former prints
the device name which is more useful than the pci name.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This messages that occur during boot time from this device
when netdev_err is called before calling register_netdevice().
Switch to using dev_XXX macros which correlate message with PCI info which
is available.
Rather than fixing the features message, just remove it since
the information is redundant and available through ethtool.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The uncommitted[] array was set but never used except in a debug
message. Remove it.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use netdev_alloc_skb_align, rather than open code using dev_alloc_skb.
Change allocation at startup to use GFP_KERNEL.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeff Kirsher says:
====================
This series contains updates to e1000e only.
v2- updates patch 09/15 "e1000e: resolve checkpatch PREFER_PR_LEVEL warning"
based on feedback from Joe Perches.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Allow mesh interface to disable the power save which is by default
turn on in certain chipset. Testing with 2 units of ZCN-1523H-5-16
featuring AR9280 chipset which have power save enabled by default.
Constant reset if the average signal of the peer mesh STA is below
-80 dBm and power save is enabled.
Signed-off-by: Chun-Yeow Yeoh <yeohchunyeow@gmail.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
When the driver's resume function can't completely
restore the configuration in the device, it returns
1 from the callback which will be treated like a HW
restart request, but done directly.
In this case, also call the driver's restart_complete()
function so it can finish the reconfiguration there.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
My commit 529ba6e931
("mac80211: clean up association better in suspend")
introduced a bug when resuming from WoWLAN when a
device reset is desired. This case must not use the
suspend_bss_conf as it hasn't been stored.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
When building the 80211 DocBook, scripts/kernel-doc reports
the following type of warnings:
Warning(include/net/cfg80211.h:334): No description found for return value of 'cfg80211_get_chandef_type'
These warnings are only reported when scripts/kernel-doc
runs in verbose mode.
To fix these use "Return:" to describe function return values.
Signed-off-by: Yacine Belkadi <yacine.belkadi.1@gmail.com>
[adjust for freq_reg_info() change]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Channel contexts are not always used with monitor interfaces. If no channel
context is set, use the oper channel, otherwise tx fails.
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
[check local->use_chanctx]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Since:
commit b23b025fe2
Author: Ben Greear <greearb@candelatech.com>
Date: Fri Feb 4 11:54:17 2011 -0800
mac80211: Optimize scans on current operating channel.
we do not disable PS while going back to operational channel (on
ieee80211_scan_state_suspend) and deffer that until scan finish.
But since we are allowed to send frames, we can send a frame to AP
without PM bit set, so disable PS on AP side. Then when we switch
to off-channel (in ieee80211_scan_state_resume) we do not enable PS.
Hence we are off-channel with PS disabled, frames are not buffered
by AP.
To fix remove offchannel_ps_disable argument and always enable PS when
going off-channel and disable it when going on-channel, like it was
before.
Cc: stable@vger.kernel.org # 2.6.39+
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Tested-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
During FT roaming, wpa_supplicant attempts to set the
key before association. This used to be rejected, but
as a side effect of my commit 66e67e4189
("mac80211: redesign auth/assoc") the key was accepted
causing hardware crypto to not be used for it as the
station isn't added to the driver yet.
It would be possible to accept the key and then add it
to the driver when the station has been added. However,
this may run into issues with drivers using the state-
based station adding if they accept the key only after
association like it used to be.
For now, revert to the behaviour from before the auth
and assoc change.
Cc: stable@vger.kernel.org
Reported-by: Cédric Debarge <cedric.debarge@acksys.fr>
Tested-by: Cédric Debarge <cedric.debarge@acksys.fr>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Also when things go wrong (queues don't get emtpy), try to
get some data from the HW.
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
The rate scaling won't treat the information in a frame
with IEEE80211_TX_CTL_AMPDU set if IEEE80211_TX_STAT_AMPDU
is cleared. But all the frames coming from an AGG tx queue
have IEEE80211_TX_CTL_AMPDU set, and IEEE80211_TX_STAT_AMPDU
is set only if the frame was sent in an AMPDU.
This means that all the data in frames in AGG tx queues that
aren't sent as an AMPDU is thrown away.
This is even more harmful when in bad link conditions, the
frames are sent in an AMPDU and then finally sent as single
frame. So a lot of failures weren't reported and the rate
scaling got stuck in high rates leading to very poor
connectivity.
Fix that by clearing IEEE80211_TX_CTL_AMPDU when the frame
isn't part of an AMPDU.
This bug was introduced by
2eb81a40aa
iwlwifi: don't clear CTL_AMPDU on frame status
This fix basically reverts the aforementioned commit.
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
On resuming, the opmode may have to be able to talk
to the WoWLAN/D3 firmware in order to query it about
its status and wakeup reasons. To do that, the opmode
has to call the new d3_resume() transport API which
will set up the device for command communcation.
Reviewed-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Writing 130 dwords into the device one by one is
rather inefficient, every one needs to lock, grab
NIC access (a few register reads/writes) and then
write the address and data registers.
Use the new memory clearing function to make this
easier and faster.
Reviewed-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Enabling the RF-kill interrupt is sufficient for getting
RF-kill notifications, and no other interrupt is needed
as the device isn't functional when suspended and will be
restarted/reconfigured when mac80211 resumes it later.
Reviewed-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>