net_sched: SFB flow scheduler

This is the Stochastic Fair Blue scheduler, based on work from :

W. Feng, D. Kandlur, D. Saha, K. Shin. Blue: A New Class of Active Queue
Management Algorithms. U. Michigan CSE-TR-387-99, April 1999.

http://www.thefengs.com/wuchang/blue/CSE-TR-387-99.pdf

This implementation is based on work done by Juliusz Chroboczek

General SFB algorithm can be found in figure 14, page 15:

B[l][n] : L x N array of bins (L levels, N bins per level)
enqueue()
Calculate hash function values h{0}, h{1}, .. h{L-1}
Update bins at each level
for i = 0 to L - 1
   if (B[i][h{i}].qlen > bin_size)
      B[i][h{i}].p_mark += p_increment;
   else if (B[i][h{i}].qlen == 0)
      B[i][h{i}].p_mark -= p_decrement;
p_min = min(B[0][h{0}].p_mark ... B[L-1][h{L-1}].p_mark);
if (p_min == 1.0)
    ratelimit();
else
    mark/drop with probabilty p_min;

I did the adaptation of Juliusz code to meet current kernel standards,
and various changes to address previous comments :

http://thread.gmane.org/gmane.linux.network/90225
http://thread.gmane.org/gmane.linux.network/90375

Default flow classifier is the rxhash introduced by RPS in 2.6.35, but
we can use an external flow classifier if wanted.

tc qdisc add dev $DEV parent 1:11 handle 11:  \
        est 0.5sec 2sec sfb limit 128

tc filter add dev $DEV protocol ip parent 11: handle 3 \
        flow hash keys dst divisor 1024

Notes:

1) SFB default child qdisc is pfifo_fast. It can be changed by another
qdisc but a child qdisc MUST not drop a packet previously queued. This
is because SFB needs to handle a dequeued packet in order to maintain
its virtual queue states. pfifo_head_drop or CHOKe should not be used.

2) ECN is enabled by default, unlike RED/CHOKe/GRED

With help from Patrick McHardy & Andi Kleen

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Juliusz Chroboczek <Juliusz.Chroboczek@pps.jussieu.fr>
CC: Stephen Hemminger <shemminger@vyatta.com>
CC: Patrick McHardy <kaber@trash.net>
CC: Andi Kleen <andi@firstfloor.org>
CC: John W. Linville <linville@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This commit is contained in:
Eric Dumazet
2011-02-23 10:56:17 +00:00
committed by David S. Miller
parent dee9f4bceb
commit e13e02a3c6
4 changed files with 760 additions and 0 deletions

View File

@ -522,4 +522,43 @@ struct tc_mqprio_qopt {
__u16 offset[TC_QOPT_MAX_QUEUE];
};
/* SFB */
enum {
TCA_SFB_UNSPEC,
TCA_SFB_PARMS,
__TCA_SFB_MAX,
};
#define TCA_SFB_MAX (__TCA_SFB_MAX - 1)
/*
* Note: increment, decrement are Q0.16 fixed-point values.
*/
struct tc_sfb_qopt {
__u32 rehash_interval; /* delay between hash move, in ms */
__u32 warmup_time; /* double buffering warmup time in ms (warmup_time < rehash_interval) */
__u32 max; /* max len of qlen_min */
__u32 bin_size; /* maximum queue length per bin */
__u32 increment; /* probability increment, (d1 in Blue) */
__u32 decrement; /* probability decrement, (d2 in Blue) */
__u32 limit; /* max SFB queue length */
__u32 penalty_rate; /* inelastic flows are rate limited to 'rate' pps */
__u32 penalty_burst;
};
struct tc_sfb_xstats {
__u32 earlydrop;
__u32 penaltydrop;
__u32 bucketdrop;
__u32 queuedrop;
__u32 childdrop; /* drops in child qdisc */
__u32 marked;
__u32 maxqlen;
__u32 maxprob;
__u32 avgprob;
};
#define SFB_MAX_PROB 0xFFFF
#endif