Mysterious packet loss

Zbyněk Burget zburget at burgnet.cz
Fri Apr 3 11:58:29 CEST 2009


Hello, list!

I have run into an interesting problem on a router:

Intel S3000AHX, CPU Core 2 Duo E6300 (1.86 GHz)
two integrated Gbit interfaces

em0: chip=0x108c8086
82573E Intel Corporation 82573E Gigabit Ethernet Controller (Copper)

em1: chip=0x10768086
82541EI Gigabit Ethernet Controller
(I do not understand why they do not fit two identical ones)

em1 is configured as the internal interface, em0 as the external one.

With "some" kind of traffic through the router (which I have not yet
managed to pin down), dropouts start to appear on the external
interface. As it happens, on both sides of the router there is a Cisco
2960 switch behind a 10 GHz wireless link, so the same test can be run
in both directions.
As a test I use the command:
ping -i.1 -c100 -s1500 IP_address_of_switch
towards the inside (i.e. via em1) there is not the slightest problem
towards the outside (via em0) packet loss is between 1 and 9(!)%
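
The comparison above could be automated. What follows is only a minimal
Python sketch, not something that runs on the router in question: it
issues the same ping command and computes the loss from the summary
line. The function names are made up for illustration, and the regular
expression assumes the usual "N packets transmitted, M packets
received" summary that ping prints.

```python
import re
import subprocess

# Matches the summary line of ping, for example
# "100 packets transmitted, 97 packets received, 3.0% packet loss".
# The word "packets" before "received" is optional because some ping
# variants omit it.
SUMMARY_RE = re.compile(r"(\d+) packets transmitted, (\d+) (?:packets )?received")


def loss_percent(ping_output):
    """Return the packet loss in percent, computed from ping's summary line."""
    match = SUMMARY_RE.search(ping_output)
    if match is None:
        raise ValueError("no ping summary line found")
    sent, received = int(match.group(1)), int(match.group(2))
    if sent == 0:
        raise ValueError("no packets were transmitted")
    return 100.0 * (sent - received) / sent


def run_ping(address, count=100, interval=0.1, size=1500):
    """Run the test command from the post and return its text output."""
    cmd = ["ping", "-i", str(interval), "-c", str(count), "-s", str(size), address]
    # ping may exit non-zero (for example when no reply arrives at all),
    # so the exit status is deliberately not checked here.
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.stdout
```

Calling loss_percent(run_ping(address)) once for the inner switch and
once for the outer one gives two numbers that can be logged side by
side. Note that intervals below one second usually require root.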

For example, right now about 25 Mbit/s download + 5 Mbit/s upload
(5-minute average) is flowing through the interface and the problem
does not show up. Yesterday, at a time when 18 down + 5 up was flowing,
I did observe the problem, so it is probably not the volume of traffic.

The router runs IPFW + NAT + DUMMYNET.
Packets are sent to the DUMMYNET rules only on the internal interface,
so I have ruled out its possible influence.

NAT loads one CPU core to about 50%, so it is probably not some part of
the network subsystem overloading the CPU either.

We have established that the 10 GHz wireless link is not the cause (the
units on both sides have been replaced): from the outside there is no
loss towards the wireless unit, but there is towards the router.

sysctl dev.em.0.stats=1:
Excessive collisions = 0
Sequence errors = 0
Defer count = 0
Missed Packets = 0
Receive No Buffers = 0
Receive Length Errors = 0
Receive errors = 0
Crc errors = 0
Alignment errors = 0
Collision/Carrier extension errors = 0
RX overruns = 0
watchdog timeouts = 0
XON Rcvd = 0
XON Xmtd = 0
XOFF Rcvd = 0
XOFF Xmtd = 0
Good Packets Rcvd = 29472571
Good Packets Xmtd = 22267200
TSO Contexts Xmtd = 0
TSO Contexts Failed = 0

for em1 it is practically identical
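
To check such a counter dump quickly, the lines can be parsed and every
non-zero error counter flagged. The Python sketch below is only an
illustration: it assumes the counters have been saved as text in the
"name = value" form shown above (optionally with a device prefix such
as "em0: "), and the helper names are made up.

```python
import re

# Counter lines for em0, copied from the output posted above.
EM0_STATS = """\
Excessive collisions = 0
Sequence errors = 0
Defer count = 0
Missed Packets = 0
Receive No Buffers = 0
Receive Length Errors = 0
Receive errors = 0
Crc errors = 0
Alignment errors = 0
Collision/Carrier extension errors = 0
RX overruns = 0
watchdog timeouts = 0
XON Rcvd = 0
XON Xmtd = 0
XOFF Rcvd = 0
XOFF Xmtd = 0
Good Packets Rcvd = 29472571
Good Packets Xmtd = 22267200
TSO Contexts Xmtd = 0
TSO Contexts Failed = 0
"""

# Optional "em0: " style prefix, then "name = integer".
LINE_RE = re.compile(r"^(?:\w+:\s*)?(.+?)\s*=\s*(\d+)\s*$")

# Counters that describe normal traffic rather than a problem.
TRAFFIC_ONLY = {"Good Packets Rcvd", "Good Packets Xmtd", "TSO Contexts Xmtd"}


def parse_counters(text):
    """Turn 'name = value' lines into a dict of integer counters."""
    counters = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            counters[match.group(1)] = int(match.group(2))
    return counters


def suspicious(counters):
    """Return the non-zero counters that are not plain traffic counters."""
    return {name: value for name, value in counters.items()
            if value != 0 and name not in TRAFFIC_ONLY}


print(suspicious(parse_counters(EM0_STATS)))  # empty dict for the dump above
```

For the em0 values above nothing is flagged, which matches the
observation that the driver statistics show no errors.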

sysctl dev.em.0.debug=1:
Adapter hardware address = 0xc478921c
CTRL = 0x50140248 RCTL = 0x8002
Packet buffer = Tx=20k Rx=12k
Flow control watermarks high = 10240 low = 8740
tx_int_delay = 66, tx_abs_int_delay = 66
rx_int_delay = 32, rx_abs_int_delay = 66
fifo workaround = 0, fifo_reset_count = 0
hw tdh = 222, hw tdt = 222
hw rdh = 117, hw rdt = 116
Num Tx descriptors avail = 256
Tx Descriptors not avail1 = 0
Tx Descriptors not avail2 = 0
Std mbuf failed = 0
Std mbuf cluster failed = 0
Driver dropped packets = 0
Driver tx dma failure in encap = 0
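
The head/tail values in this output can be read as ring positions. The
Python sketch below is a rough reading aid resting on two assumptions:
that the usual Intel convention applies (hardware owns the descriptors
from head up to, but not including, tail), and that both rings hold 256
descriptors. The Tx figure agrees with "Num Tx descriptors avail = 256"
above; the Rx ring size is a guess.

```python
def ring_distance(head, tail, ring_size):
    """Descriptors from head up to (not including) tail, with wrap-around."""
    return (tail - head) % ring_size


RING_SIZE = 256  # assumption: 256 descriptors in each ring

# hw tdh = 222, hw tdt = 222: descriptors still waiting to be transmitted.
tx_pending = ring_distance(222, 222, RING_SIZE)

# hw rdh = 117, hw rdt = 116: receive buffers currently offered to the NIC.
rx_free = ring_distance(117, 116, RING_SIZE)

print(tx_pending, rx_free)  # prints "0 255"
```

Under these assumptions the transmit ring is empty and the receive ring
is almost completely free, so neither ring looks backed up at the
moment the output was taken.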

I do not see anything in the netstat output either that would hint at
an error. I have no idea where else to look, or which figures might
give a clue.
Does anyone have an idea where useful information might be found?

Zbynek


