Forums.ATC.no

Technical => General technical => Topic started by: ATC on 25 June 2010, 09:21 am

Title: N6040: Inconsistent network connectivity
Posted by: ATC on 25 June 2010, 09:21 am
We had a hard-to-diagnose issue with one of our N series 6040 NAS controllers. One of the vFilers would only respond to certain types of traffic from certain hosts on the network.

- Some protocols, such as HTTP, worked. Others, such as ICMP and DNS, failed.
- Different hosts on the network saw different results: some could communicate with the vFiler while others could not.
- Failing over to the other controller in the cluster seemed to remedy the situation.
- Other vFilers on the same controller showed different results.
- Reducing the MTU size had no effect.
- Removing the failover/partner configuration or disabling the interface on the partner controller had no effect.

Title: [Solved] N6040: Inconsistent network connectivity
Posted by: ATC on 25 June 2010, 09:21 am
Inspecting the routing table on the problem controller revealed an inconsistency:

N6040-B> route -s
Routing tables

Internet:
Destination      Gateway            Flags     Refs     Use  Interface
default          172.29.20.1        UGS        38     6075  e0M
10.80.2/24       link#12            UC          0        0  vif1-2
10.80.2.1        0:10:db:ff:10:1    UHL         0        0  vif1-2
10.80.2.250      0:50:56:b1:22:a3   UHL         0    16136  vif1-2
10.80.2.255      ff:ff:ff:ff:ff:ff  UHL         0      359  vif1-2
10.80.4/24       link#13            UC          0        0  vif1-4
10.80.4.41       0:50:56:b1:76:f0   UHL         0       90  e0M
10.80.4.42       0:50:56:b1:50:a9   UHL         0       24  e0M
10.80.4.250      0:50:56:b1:32:2e   UHL         0    10652  e0M
10.80.6/24       link#14            UC          0        0  vif1-6
[snip]

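Here, communication with 10.80.4.41 and 10.80.4.42 was the problem. The routing table shows why: the host entries for both addresses are bound to e0M, although the 10.80.4/24 network is attached to vif1-4. The controller had in fact chosen the wrong interface for these hosts.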
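For anyone who wants to check a long routing table for this kind of mismatch, here is a minimal Python sketch. It is not something we ran on the controller; it assumes you have pasted the `route -s` output into a string, and the function names (`parse_routes`, `find_mismatches`) are made up for illustration. It flags every host entry whose interface differs from the interface of the directly connected network that contains it.

```python
import ipaddress

# Excerpt of the `route -s` output shown above, pasted as text.
SAMPLE = """\
Destination      Gateway            Flags     Refs     Use  Interface
default          172.29.20.1        UGS        38     6075  e0M
10.80.2/24       link#12            UC          0        0  vif1-2
10.80.2.1        0:10:db:ff:10:1    UHL         0        0  vif1-2
10.80.2.250      0:50:56:b1:22:a3   UHL         0    16136  vif1-2
10.80.2.255      ff:ff:ff:ff:ff:ff  UHL         0      359  vif1-2
10.80.4/24       link#13            UC          0        0  vif1-4
10.80.4.41       0:50:56:b1:76:f0   UHL         0       90  e0M
10.80.4.42       0:50:56:b1:50:a9   UHL         0       24  e0M
10.80.4.250      0:50:56:b1:32:2e   UHL         0    10652  e0M
10.80.6/24       link#14            UC          0        0  vif1-6
"""


def parse_routes(text):
    """Split the table into (network, interface) and (host, interface) pairs.

    Assumption: every route line has exactly six whitespace-separated
    columns, as in the excerpt above. Other lines are ignored.
    """
    networks, hosts = [], []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 6 or fields[0] in ("Destination", "default"):
            continue
        dest, _gateway, flags, _refs, _use, iface = fields
        if "/" in dest:
            # BSD-style shorthand: "10.80.4/24" means 10.80.4.0/24.
            addr, prefix = dest.split("/")
            octets = addr.split(".")
            octets += ["0"] * (4 - len(octets))
            net = ipaddress.ip_network(".".join(octets) + "/" + prefix)
            networks.append((net, iface))
        elif "H" in flags:
            # "H" marks a host entry (here: one learned through ARP).
            hosts.append((ipaddress.ip_address(dest), iface))
    return networks, hosts


def find_mismatches(text):
    """Return (host, host_interface, network_interface) for each mismatch."""
    networks, hosts = parse_routes(text)
    mismatches = []
    for host, host_iface in hosts:
        for net, net_iface in networks:
            if host in net and host_iface != net_iface:
                mismatches.append((str(host), host_iface, net_iface))
    return mismatches


if __name__ == "__main__":
    for host, actual, expected in find_mismatches(SAMPLE):
        print(f"{host}: on {actual}, but its network is on {expected}")
```

On the excerpt above this reports 10.80.4.41 and 10.80.4.42, and also 10.80.4.250, which the table likewise lists on e0M. We only investigated the first two, so treat the output as a list of candidates to look at, not as proof of a fault.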

We simply deleted the corresponding ARP entries:

N6040-B> arp -d 10.80.4.41
10.80.4.41 (10.80.4.41) deleted
N6040-B> arp -d 10.80.4.42
10.80.4.42 (10.80.4.42) deleted

Communication immediately worked as it should:

N6040-B> vfiler run vf-sikker ping 10.80.4.41

===== vf-sikker
10.80.4.41 is alive

N6040-B> vfiler run vf-sikker ping 10.80.4.42

===== vf-sikker
10.80.4.42 is alive


Checking the routing table again confirmed that the incorrect ARP entries had been the cause of our problems. Both hosts are now reached through vif1-4:

N6040-B> route -s
Routing tables

Internet:
Destination      Gateway            Flags     Refs     Use  Interface
default          172.29.20.1        UGS        38     6410  e0M
10.80.2/24       link#12            UC          0        0  vif1-2
10.80.2.1        0:10:db:ff:10:1    UHL         0        0  vif1-2
10.80.2.250      0:50:56:b1:22:a3   UHL         0    16203  vif1-2
10.80.2.255      ff:ff:ff:ff:ff:ff  UHL         0      359  vif1-2
10.80.4/24       link#13            UC          0        0  vif1-4
10.80.4.1        0:10:db:ff:10:1    UHL         0        0  vif1-4
10.80.4.41       0:50:56:b1:76:f0   UHL         0        1  vif1-4
10.80.4.42       0:50:56:b1:50:a9   UHL         0        1  vif1-4
10.80.4.250      0:50:56:b1:32:2e   UHL         0    10697  e0M
[snip]
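
To compare the two `route -s` listings without reading them line by line, something like the following Python sketch could be used. It is only an illustration: the helper names (`host_interfaces`, `changed_entries`) are hypothetical, the input is assumed to be the pasted console output, and only the 10.80.4.x host lines from the two tables are included to keep it short.

```python
import ipaddress

# Host entries for 10.80.4.x before the ARP entries were deleted.
BEFORE = """\
10.80.4.41       0:50:56:b1:76:f0   UHL         0       90  e0M
10.80.4.42       0:50:56:b1:50:a9   UHL         0       24  e0M
10.80.4.250      0:50:56:b1:32:2e   UHL         0    10652  e0M
"""

# The same part of the table after the ARP entries were deleted.
AFTER = """\
10.80.4.1        0:10:db:ff:10:1    UHL         0        0  vif1-4
10.80.4.41       0:50:56:b1:76:f0   UHL         0        1  vif1-4
10.80.4.42       0:50:56:b1:50:a9   UHL         0        1  vif1-4
10.80.4.250      0:50:56:b1:32:2e   UHL         0    10697  e0M
"""


def host_interfaces(text):
    """Map each host destination to its interface.

    Assumption: route lines have six whitespace-separated columns and
    host entries carry the "H" flag, as in the listings above.
    """
    table = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 6 and fields[0] != "Destination" and "H" in fields[2]:
            table[fields[0]] = fields[5]
    return table


def changed_entries(before, after):
    """Return {destination: (old_interface, new_interface)} for changes.

    None stands for an entry that is missing from one of the listings.
    """
    old, new = host_interfaces(before), host_interfaces(after)
    changes = {}
    for dest in sorted(set(old) | set(new), key=ipaddress.ip_address):
        if old.get(dest) != new.get(dest):
            changes[dest] = (old.get(dest), new.get(dest))
    return changes


if __name__ == "__main__":
    for dest, (was, now) in changed_entries(BEFORE, AFTER).items():
        print(f"{dest}: {was} -> {now}")
```

For these two excerpts the sketch lists 10.80.4.41 and 10.80.4.42 as moved from e0M to vif1-4 and 10.80.4.1 as a new entry on vif1-4. 10.80.4.250 is not listed because it is on e0M in both tables.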