5 ways to troubleshoot OSPF route flapping



1. Checking Whether Router LSAs Trigger Route Calculation

Run the display ospf spf-statistics verbose command in any view to check route calculation statistics in an OSPF process. Focus on the Type field in the command output. This field indicates the type of LSA that triggered route calculation: Router, Network, Sum-Net, External, or NSSA.
 

[HUAWEI] display ospf spf-statistics verbose 
          OSPF Process 1 with Router ID 13.0.0.3
Routing table change statistics:
Index: 1
           Time     : 2014-01-10 15:50:52-08:00
           Intra    : 0 Added,0 Deleted, 0 Modified
           Inter    : 0 Added,0 Deleted, 0 Modified
           External : 0 Added,0 Deleted, 0 Modified
           The reason of calculation is:Topo
    Type          LS ID             Adv Router
           1       Router        13.0.0.1          13.0.0.1
Index: 2
           Time     : 2014-01-10 15:50:52-08:00
           Intra    : 0 Added,0 Deleted, 0 Modified
           Inter    : 0 Added,0 Deleted, 0 Modified
           External : 0 Added,0 Deleted, 0 Modified
           The reason of calculation is:LSA
    Type LS ID             Adv Router
           1       Router        3.3.3.3           3.3.3.3

 
 
Rectify the fault based on the Type field:
 
If the Type field displays Router, go to step 2.
If the Type field displays Network, perform Checking Whether Network LSAs Trigger Route Calculation.
If the Type field displays Sum-Net, perform Checking Whether Sum-Net LSAs Trigger Route Calculation.
If the Type field displays External or NSSA, perform Checking Whether External LSAs or NSSA LSAs Trigger Route Calculation.
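When the command output is long, the Type and Adv Router fields can be tallied offline. The following Python sketch is an illustration only: it assumes the output has been saved as text in the layout shown above, and the names count_triggers and NEXT_PROCEDURE are made up for this example.

```python
import re
from collections import Counter

# Matches trigger rows such as "1       Router        13.0.0.1          13.0.0.1".
TRIGGER_ROW = re.compile(
    r"^\s*\d+\s+(Router|Network|Sum-Net|External|NSSA)\s+(\S+)\s+(\S+)\s*$"
)

# Procedure in this article to follow for each LSA type.
NEXT_PROCEDURE = {
    "Router": "Checking Whether Router LSAs Trigger Route Calculation",
    "Network": "Checking Whether Network LSAs Trigger Route Calculation",
    "Sum-Net": "Checking Whether Sum-Net LSAs Trigger Route Calculation",
    "External": "Checking Whether External LSAs or NSSA LSAs Trigger Route Calculation",
    "NSSA": "Checking Whether External LSAs or NSSA LSAs Trigger Route Calculation",
}


def count_triggers(output: str) -> Counter:
    """Count (LSA type, advertising router) pairs that triggered route calculation."""
    counts = Counter()
    for line in output.splitlines():
        match = TRIGGER_ROW.match(line)
        if match:
            lsa_type, _ls_id, adv_router = match.groups()
            counts[(lsa_type, adv_router)] += 1
    return counts


# Trimmed copy of the sample output above.
sample = """\
Index: 1
           The reason of calculation is:Topo
    Type          LS ID             Adv Router
           1       Router        13.0.0.1          13.0.0.1
Index: 2
           The reason of calculation is:LSA
    Type LS ID             Adv Router
           1       Router        3.3.3.3           3.3.3.3
"""

for (lsa_type, adv_router), hits in count_triggers(sample).most_common():
    print(f"{lsa_type} LSA from {adv_router}: {hits} time(s) -> {NEXT_PROCEDURE[lsa_type]}")
```

The most frequent (type, advertising router) pair is usually the best place to start.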
If the Adv Router field in the command output displays the router ID of the local device, the Router LSA generated by the local device is being refreshed. If the Adv Router field displays the router ID of another device, log in to this device to find the LSA refresh reason.
Run the display interface command in any view to check the interface status, focusing on the Last physical up time and Last physical down time fields. These fields show when an interface last went Up or Down. If their values keep changing for an interface between successive runs of the command, that interface is flapping.

<HUAWEI> display interface
GigabitEthernet1/0/0 current state : UP (ifindex: 4)
Line protocol current state : UP
Description:
Route Port,The Maximum Transmit Unit is 1500
IP Sending Frames' Format is PKTFMT_ETHNT_2, Hardware address is 0025-9e03-cba1
The Vendor PN is HFBR-5710L      
The Vendor Name is AVAGO           
Port BW: 1G, Transceiver max BW: 1G, Transceiver Mode: MultiMode
WaveLength: 850nm, Transmission Distance: 550m
Loopback:none, full-duplex mode, negotiation: negotiation disable, Pause Flowcontrol:Receive Enable and Send Enable
Last physical up time   : 2017-12-25 21:04:29
Last physical down time : 2017-12-25 21:03:07
Current system time: 2017-12-25 10:12:50
Statistics last cleared:never
    Last 10 seconds input rate: 0 bits/sec, 0 packets/sec
    Last 10 seconds output rate: 0 bits/sec, 0 packets/sec
    Input peak rate 768 bits/sec, Record time: 2013-12-28 21:04:40
    Output peak rate 853 bits/sec, Record time: 2013-12-28 21:04:40
    Input: 571520 bytes, 4465 packets
    Output: 571648 bytes, 4466 packets
    Input:
      Unicast: 0 packets, Multicast: 4465 packets
      Broadcast: 0 packets, JumboOctets: 0 packets
      CRC: 0 packets, Symbol: 0 packets
      Overrun: 0 packets, InRangeLength: 0 packets
      LongPacket: 0 packets, Jabber: 0 packets, Alignment: 0 packets
      Fragment: 0 packets, Undersized Frame: 0 packets
      RxPause: 0 packets
    Output:
      Unicast: 0 packets, Multicast: 4466 packets
      Broadcast: 0 packets, JumboOctets: 0 packets
      Lost: 0 packets, Overflow: 0 packets, Underrun: 0 packets
      System: 0 packets, Overruns: 0 packets
      TxPause: 0 packets
    Last 10 seconds input utility rate:  0.00%
    Last 10 seconds output utility rate: 0.00%
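
Comparing the two timestamps by eye across many interfaces is error-prone. As a rough sketch, the Python below compares two saved copies of the display interface output and lists the interfaces whose Last physical up time or Last physical down time changed. The function names and the second interface in the sample are invented for illustration.

```python
import re

TIMESTAMP = r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})"
INTERFACE_HEADER = re.compile(r"^(\S+) current state :")
UP_TIME = re.compile(r"Last physical up time\s*:\s*" + TIMESTAMP)
DOWN_TIME = re.compile(r"Last physical down time\s*:\s*" + TIMESTAMP)


def physical_times(output: str) -> dict:
    """Map each interface name to its (last up, last down) timestamps."""
    times = {}
    current = None
    for line in output.splitlines():
        header = INTERFACE_HEADER.match(line)
        if header:
            current = header.group(1)
            times[current] = [None, None]
            continue
        if current is None:
            continue
        up = UP_TIME.search(line)
        if up:
            times[current][0] = up.group(1)
        down = DOWN_TIME.search(line)
        if down:
            times[current][1] = down.group(1)
    return {name: tuple(pair) for name, pair in times.items()}


def flapping_interfaces(earlier: str, later: str) -> list:
    """Interfaces whose up/down timestamps differ between two snapshots."""
    before, after = physical_times(earlier), physical_times(later)
    return sorted(n for n in after if n in before and after[n] != before[n])


# Hypothetical snapshots; only the relevant lines are kept.
earlier = """\
GigabitEthernet1/0/0 current state : UP (ifindex: 4)
Line protocol current state : UP
Last physical up time   : 2017-12-25 21:04:29
Last physical down time : 2017-12-25 21:03:07
GigabitEthernet1/0/1 current state : UP (ifindex: 5)
Line protocol current state : UP
Last physical up time   : 2017-12-20 08:00:00
Last physical down time : 2017-12-20 07:59:00
"""
later = earlier.replace("21:04:29", "21:10:12").replace("21:03:07", "21:09:58")

print(flapping_interfaces(earlier, later))
```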

 
Focus on these flapping interfaces, and run the display ospf process-id interface command in any view to check OSPF interface information and determine whether the IP Address field displays the IP addresses of these flapping interfaces.

<HUAWEI> display ospf 1 interface   
    OSPF Process 1 with Router ID 192.168.1.1
                  Interfaces
 Area: 0.0.0.0
 IP Address      Type         State    Cost    Pri   DR            BDR
 192.168.1.1     P2P          P-2-P    1562    1     0.0.0.0       0.0.0.0
 Area: 0.0.0.1
 IP Address      Type         State    Cost    Pri   DR            BDR
 172.16.0.1      Broadcast    DR       1       1     172.16.0.1    0.0.0.0

 
 
If an OSPF interface flaps, check whether the device’s physical link is faulty. To rectify the interface and link faults, see Interconnected Optical Ports Cannot Go Up or Interconnected Electrical Interfaces Cannot Go Up.
If no OSPF interface flaps, check whether neighbors flap. To do so, run the display ospf process-id peer last-nbr-down command in any view to check brief information about the last neighbor that went Down in the OSPF area.

<HUAWEI> display ospf 1 peer last-nbr-down 
          OSPF Process 1 with Router ID 192.168.1.1
                         Last Down OSPF Peer
         Neighbor Ip Address : 23.1.0.3
         Neighbor Area   Id  : 0.0.0.1
         Neighbor Router Id  : 23.0.0.3
         Interface           : Eth3/0/1.1 (22)
         Immediate Reason    : Neighbor Down Due to 1-Wayhello Received
         Primary Reason      : 1-Wayhello Received
         Down Time           : 2013-12-28 09:29:43
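
If this output is collected repeatedly, a change in the Down Time field between runs suggests that the neighbor went Down again. The Python sketch below shows one way to check this. It assumes every field sits on its own "name : value" line as in the sample above, and the function names are illustrative only.

```python
def parse_last_nbr_down(output: str) -> dict:
    """Turn 'Field : value' lines into a dict, collapsing padding in field names."""
    record = {}
    for line in output.splitlines():
        name, separator, value = line.partition(":")
        if separator:
            record[" ".join(name.split())] = value.strip()
    return record


def neighbor_went_down_again(earlier: str, later: str) -> bool:
    """True if the Down Time field changed between two runs of the command."""
    return (
        parse_last_nbr_down(earlier)["Down Time"]
        != parse_last_nbr_down(later)["Down Time"]
    )


first_run = """\
          OSPF Process 1 with Router ID 192.168.1.1
                         Last Down OSPF Peer
         Neighbor Ip Address : 23.1.0.3
         Neighbor Area   Id  : 0.0.0.1
         Neighbor Router Id  : 23.0.0.3
         Interface           : Eth3/0/1.1 (22)
         Immediate Reason    : Neighbor Down Due to 1-Wayhello Received
         Primary Reason      : 1-Wayhello Received
         Down Time           : 2013-12-28 09:29:43
"""
# Hypothetical second run in which the neighbor went Down again later.
second_run = first_run.replace("09:29:43", "09:31:10")

record = parse_last_nbr_down(second_run)
print(record["Neighbor Router Id"], "|", record["Primary Reason"], "|", record["Down Time"])
print("went down again:", neighbor_went_down_again(first_run, second_run))
```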

 
If a neighbor flaps, rectify the flapping fault. For details, see OSPF Neighbors Flap.
 
If the LSA refresh reason still cannot be located, check whether router IDs conflict. Run the display ospf lsdb | include Router command multiple times in any view to check OSPF LSDB information.

<HUAWEI> display ospf lsdb | include Router    
          OSPF Process 1 with Router ID 192.168.1.1                   
                  Link State Database                              
                                                               
                          Area: 0.0.0.0               
 Type     LinkState ID    AdvRouter      Age  Len   Sequence   Metric
 Router   192.168.1.1     192.168.1.1    98   36     8000000B       1     
 Router   1.1.1.1         1.1.1.1        92   36     80000005       1   
                                                                 
                          Area: 0.0.0.1                          
 Type     LinkState ID    AdvRouter      Age  Len   Sequence   Metric
 Router   192.168.1.1     192.168.1.1    58   36     8000001B       1
 Router   2.2.2.2         2.2.2.2        32   36     80000025       1

 
If the Sequence field value of a Router LSA keeps increasing and the Age field value is always small, there is a high probability that a router ID conflict occurs in the OSPF area.
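The check described above (Sequence keeps increasing while Age stays small) can be applied to several saved copies of the display ospf lsdb | include Router output. The following Python is a minimal sketch under stated assumptions: the 30-second age threshold is an arbitrary example value, sequence-number wrap-around is ignored, and the function names are hypothetical.

```python
import re

ROUTER_ROW = re.compile(
    r"^\s*Router\s+(\S+)\s+(\S+)\s+(\d+)\s+(\d+)\s+([0-9A-Fa-f]{8})\s+(\d+)\s*$"
)
AREA_ROW = re.compile(r"^\s*Area:\s*(\S+)")


def router_lsas(output: str) -> dict:
    """Map (area, link state ID, advertising router) to (age, sequence number)."""
    lsas = {}
    area = None
    for line in output.splitlines():
        area_match = AREA_ROW.match(line)
        if area_match:
            area = area_match.group(1)
            continue
        row = ROUTER_ROW.match(line)
        if row:
            ls_id, adv_router, age, _length, sequence, _metric = row.groups()
            lsas[(area, ls_id, adv_router)] = (int(age), int(sequence, 16))
    return lsas


def suspect_lsas(snapshots: list, max_age: int = 30) -> list:
    """LSAs whose sequence number grows in every snapshot while the age stays small."""
    if len(snapshots) < 2:
        raise ValueError("at least two snapshots are needed")
    parsed = [router_lsas(s) for s in snapshots]
    suspects = []
    for key in parsed[0]:
        if not all(key in snap for snap in parsed):
            continue
        ages = [snap[key][0] for snap in parsed]
        seqs = [snap[key][1] for snap in parsed]
        growing = all(b > a for a, b in zip(seqs, seqs[1:]))
        if growing and max(ages) <= max_age:
            suspects.append(key)
    return suspects


def snapshot(local_age: int, remote_age: int, remote_seq: str) -> str:
    """Build a made-up LSDB listing for the demonstration."""
    return (
        "                          Area: 0.0.0.0\n"
        " Type     LinkState ID    AdvRouter      Age  Len   Sequence   Metric\n"
        f" Router   192.168.1.1     192.168.1.1    {local_age}   36     8000000B       1\n"
        f" Router   1.1.1.1         1.1.1.1        {remote_age}   36     {remote_seq}       1\n"
    )


snapshots = [snapshot(98, 2, "80000005"), snapshot(108, 1, "80000009"), snapshot(118, 3, "8000000D")]
print(suspect_lsas(snapshots))
```

In this invented data only the Router LSA for 1.1.1.1 is reported, because the LSA for 192.168.1.1 ages normally and keeps the same sequence number.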
Resolve the router ID conflict in the OSPF area.
Run the display trapbuffer command in any view to check alarm information on the device, and check whether an alarm similar to the following router ID conflict alarm exists:

OSPF/2/RTRID_CONFLCT:OID [oid] Router IDs conflict in an intra area.
(ProcessId=[integer], AreaId=[ipaddr], SelfIfnetIndex=[integer], NbrIpAddr=[ipaddr], RouterId=[ipaddr], NbrRtrId=[ipaddr])
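
When the trap buffer is saved to a file, the parameters of this alarm can be extracted for later comparison. The Python below is a sketch only: it assumes the alarm keeps the name=value layout shown above, and every value in the sample alarm is invented for the example.

```python
import re

# Alarm name followed by the first parenthesized parameter list.
ALARM = re.compile(r"OSPF/2/RTRID_CONFLCT:.*?\((?P<fields>[^)]*)\)", re.DOTALL)


def parse_conflict_alarm(text: str):
    """Return the alarm parameters as a dict, or None if no such alarm is present."""
    match = ALARM.search(text)
    if match is None:
        return None
    fields = {}
    for item in match.group("fields").split(","):
        name, _, value = item.partition("=")
        fields[name.strip()] = value.strip()
    return fields


# Made-up alarm instance; real OIDs and values will differ.
sample_alarm = (
    "OSPF/2/RTRID_CONFLCT:OID 1.2.3 Router IDs conflict in an intra area.\n"
    "(ProcessId=1, AreaId=0.0.0.0, SelfIfnetIndex=4, NbrIpAddr=23.1.0.3, "
    "RouterId=1.1.1.1, NbrRtrId=1.1.1.1)"
)

alarm = parse_conflict_alarm(sample_alarm)
print(alarm["ProcessId"], alarm["AreaId"], alarm["RouterId"], alarm["NbrIpAddr"])
```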

Generally, a router ID conflict in an OSPF area resolves automatically because a new router ID is elected after a certain period. However, automatic resolution may take a long time, so you can resolve the conflict manually as follows.
First, determine whether the refreshed Router LSA is advertised by the local device: check whether the AdvRouter field value in the display ospf lsdb | include Router command output is the same as the router ID of the local OSPF process.
 
If this Router LSA is advertised by the device to which you currently log in, you need to log in to another device in this OSPF area to perform subsequent operations. This is because only the Router LSA advertised by the local device can be displayed using the display ospf lsdb | include Router command when a router ID conflict occurs.
If this Router LSA is not advertised by the device to which you currently log in, perform subsequent operations on this device.
Log in to the device, and run the display ospf lsdb router link-state-id command multiple times in any view to observe the Link ID field value of this Router LSA. In this command, link-state-id is the LinkState ID field value of the abnormal Router LSA in the display ospf lsdb | include Router command output.

<HUAWEI> display ospf lsdb router 1.1.1.1
          OSPF Process 100 with Router ID 13.0.0.1
                          Area: 0.0.0.0
                  Link State Database
  Type      : Router
  Ls id     : 1.1.1.1
  Adv rtr   : 1.1.1.1
  Ls age    : 501
  Len       : 60
  Options   :  ABR  E
  seq#      : 8000024d
  chksum    : 0xf58c
  Link count: 3
   * Link ID: 13.0.0.1
     Data   : 100.0.0.2
     Link Type: P-2-P
     Metric : 1562
   * Link ID: 100.0.0.0
     Data   : 255.255.255.0
     Link Type: StubNet
     Metric : 1562
     Priority : Low
   * Link ID: 23.0.0.2
     Data   : 23.0.0.2
     Link Type: TransNet
     Metric : 1         
   * Link ID: 100.3.3.3
     Data   : 255.255.255.255
     Link Type: StubNet
     Metric : 0
     Priority : Medium

 

Table 1 Link ID and Data fields
Link Type | Link ID                        | Data
P2P       | Router ID of a neighbor        | IP address of the P2P interface that connects this device to the current area
TransNet  | Interface IP address of the DR | IP address of the broadcast interface that connects this device to the current area
StubNet   | Network ID                     | Network mask

If the Link ID field displays different values, and one of the values indicates the Router LSA advertised by the device with a conflicting router ID, find the device’s interface IP address or loopback interface IP address and then locate the device.
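To see which Link ID values differ between runs, the saved outputs of display ospf lsdb router link-state-id can be compared as sets. The following Python is a hedged sketch: the function names are illustrative, and the two sample runs are shortened, made-up variants of the output above.

```python
import re

LINK_ID = re.compile(r"\*\s*Link ID:\s*(\S+)")


def link_ids(output: str) -> set:
    """Collect the Link ID values listed in one Router LSA display."""
    return set(LINK_ID.findall(output))


def changing_link_ids(snapshots: list) -> set:
    """Link IDs that appear in some snapshots of the same Router LSA but not in all."""
    if not snapshots:
        return set()
    sets = [link_ids(s) for s in snapshots]
    return set.union(*sets) - set.intersection(*sets)


# Two hypothetical runs for Ls id 1.1.1.1, taken a few seconds apart.
run_1 = """\
  Type      : Router
  Ls id     : 1.1.1.1
  Adv rtr   : 1.1.1.1
   * Link ID: 13.0.0.1
     Data   : 100.0.0.2
     Link Type: P-2-P
   * Link ID: 100.0.0.0
     Data   : 255.255.255.0
     Link Type: StubNet
"""
run_2 = """\
  Type      : Router
  Ls id     : 1.1.1.1
  Adv rtr   : 1.1.1.1
   * Link ID: 23.0.0.2
     Data   : 23.0.0.2
     Link Type: TransNet
   * Link ID: 100.0.0.0
     Data   : 255.255.255.0
     Link Type: StubNet
"""

print(sorted(changing_link_ids([run_1, run_2])))
```

Link IDs that come and go between runs point to the two different devices that are originating the LSA. Use Table 1 to interpret each value.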
Router LSAs are refreshed quickly when a router ID conflict occurs. Therefore, you may be unable to locate the device with the conflicting router ID using the preceding procedures. In this situation, log in to each device in the OSPF area and run the display ospf brief command in any view to check the router ID of each process until you locate the device with the conflicting router ID.

<HUAWEI> display ospf brief
          OSPF Process 1 with Router ID 3.3.3.3
                  OSPF Protocol Information
 RouterID: 3.3.3.3          Border Router:  AREA  AS
 Multi-VPN-Instance is not enabled
 Global DS-TE Mode: Non-Standard IETF Mode
 Graceful-restart capability: disabled
 Helper support capability  : not configured
 Applications Supported: MPLS Traffic-Engineering
 Spf-schedule-interval: max 10000ms, start 500ms, hold 1000ms
 Default ASE parameters: Metric: 1 Tag: 1 Type: 2
 Route Preference: 10

 
After you find the two devices with conflicting router IDs, log in to one of them and run the ospf process-id router-id router-id command in the system view to change the router ID of the specified OSPF process. Then run the reset ospf process-id process command in the user view so that the new router ID takes effect.

<HUAWEI> display ospf 1 brief //Check the current router ID of OSPF process 1 (3.3.3.3 in this example).
          OSPF Process 1 with Router ID 3.3.3.3
                  OSPF Protocol Information
 RouterID: 3.3.3.3          Border Router:  AREA  AS
 Multi-VPN-Instance is not enabled
 Global DS-TE Mode: Non-Standard IETF Mode
 Graceful-restart capability: disabled
 Helper support capability  : not configured
 Applications Supported: MPLS Traffic-Engineering
 Spf-schedule-interval: max 10000ms, start 500ms, hold 1000ms
 Default ASE parameters: Metric: 1 Tag: 1 Type: 2
 Route Preference: 10
 …
<HUAWEI> system-view
[HUAWEI] ospf 1 router-id 2.2.2.2 //Change the OSPF router ID to 2.2.2.2.
Info: The configuration succeeded. You need to restart the OSPF process to validate the new router ID.
[HUAWEI-ospf-1] quit
[HUAWEI] quit
<HUAWEI> reset ospf 1 process //Restart OSPF process 1 to make the configured router ID take effect.
Warning: The OSPF process will be reset. Continue? [Y/N]:y
<HUAWEI> display ospf brief
          OSPF Process 1 with Router ID 2.2.2.2
                  OSPF Protocol Information
 RouterID: 2.2.2.2          Border Router:  AREA  AS
 Multi-VPN-Instance is not enabled
 Global DS-TE Mode: Non-Standard IETF Mode
 Graceful-restart capability: disabled
 Helper support capability  : not configured
 Applications Supported: MPLS Traffic-Engineering
 Spf-schedule-interval: max 10000ms, start 500ms, hold 1000ms
 Default ASE parameters: Metric: 1 Tag: 1 Type: 2
 Route Preference: 10
 ASE Route Preference: 150
 SPF Computation Count: 4

 

2. Checking Whether Network LSAs Trigger Route Calculation

Log in to the device indicated by the Adv Router field in the display ospf spf-statistics verbose command output, and run the display ospf peer last-nbr-down command in any view to check whether neighbors flap.
 

<HUAWEI> display ospf peer last-nbr-down
          OSPF Process 1 with Router ID 10.1.1.1
                         Last Down OSPF Peer
         Neighbor Ip Address : 20.2.1.2
         Neighbor Area   Id  : 0.0.0.0
         Neighbor Router Id  : 2.2.2.2
         Interface           : Vlanif100
         Immediate Reason    : Neighbor Down Due to Kill Neighbor
         Primary Reason      : Logical Interface State Change
         Down Time           : 2012-09-14 17:17:7

 
If a neighbor flaps, rectify the flapping fault. For details, see OSPF Neighbors Flap.
If no neighbor flaps, go to step 2.
Run the display ospf lsdb network link-state-id command multiple times in any view to check OSPF LSDB information and focus on the seq# and Ls age fields. If the seq# field value of a Network LSA keeps increasing and the Ls age field value is always small, the LSA keeps being refreshed. There is a high probability that an IP address conflict occurs in the OSPF area. This IP address is indicated by the Adv rtr field.

<HUAWEI> display ospf lsdb network 101.0.0.1   
       OSPF Process 1 with Router ID 100.3.3.4
                          Area: 0.0.0.0
                  Link State Database
  Type      : Network
  Ls id     : 101.0.0.1
  Adv rtr   : 1.1.1.1
  Ls age    : 1789
  Len       : 32
  Options   :  E
  seq#      : 80000003
  chksum    : 0xe7f5
  Net mask  : 255.255.255.0
  Priority  : Low
     Attached Router    1.1.1.1
     Attached Router    3.3.3.3
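
For reference, a healthy LSA is normally re-originated only about every 30 minutes, so a sequence number that jumps between two closely spaced runs stands out. The Python sketch below computes that jump from two saved outputs. It assumes each output contains exactly one LSA with every field on its own line, and the function names and the second run are made up for illustration.

```python
import re

FIELD = re.compile(r"^\s*(Ls id|Adv rtr|Ls age|seq#)\s*:\s*(\S+)", re.MULTILINE)


def lsa_header(output: str) -> dict:
    """Extract Ls id, Adv rtr, Ls age and seq# from one verbose LSA display."""
    fields = dict(FIELD.findall(output))
    return {
        "ls_id": fields["Ls id"],
        "adv_rtr": fields["Adv rtr"],
        "age": int(fields["Ls age"]),
        "seq": int(fields["seq#"], 16),  # sequence numbers are shown in hexadecimal
    }


def refresh_count(earlier: str, later: str) -> int:
    """How many times the LSA was re-originated between two snapshots."""
    first, second = lsa_header(earlier), lsa_header(later)
    if (first["ls_id"], first["adv_rtr"]) != (second["ls_id"], second["adv_rtr"]):
        raise ValueError("snapshots describe different LSAs")
    return second["seq"] - first["seq"]


# Hypothetical pair of runs taken about ten seconds apart.
run_a = """\
  Type      : Network
  Ls id     : 101.0.0.1
  Adv rtr   : 1.1.1.1
  Ls age    : 5
  Len       : 32
  seq#      : 80000003
"""
run_b = run_a.replace("Ls age    : 5", "Ls age    : 2").replace("80000003", "80000007")

print(refresh_count(run_a, run_b), "refreshes, latest age", lsa_header(run_b)["age"])
```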

 
 
If an IP address conflict occurs, go to step 3.
If no reason is found, go to Collecting Information and Seeking Technical Support.
 
Find the device with a conflicting IP address and change its IP address.
Find the device with a conflicting IP address using either of the following methods, and specify another IP address for the device based on network planning to avoid an IP address conflict:
Log in to a non-DR device on the same network segment in the OSPF area, and run the tracert ip-address command multiple times in any view to check whether different paths are displayed at different times. Then find the device with the conflicting IP address and change its IP address.

<HUAWEI> tracert 1.1.1.1
traceroute to 1.1.1.1 (1.1.1.1), max hops: 30, packet length: 40, press CTRL_C to break
1 10.3.112.1   10 ms 10 ms 10 ms
2 10.32.216.1  19 ms 19 ms 19 ms
3 * * *
4 * * *
5 * * *
6 * * *
7 1.1.1.1   339 ms 279 ms 279 ms

 
Check the configuration file of each device in the OSPF area of the live network, find the device with a conflicting IP address, and change its IP address.

3. Checking Whether Sum-Net LSAs Trigger Route Calculation

If Sum-Net LSAs trigger route calculation, the fault source resides outside the local OSPF area. You need to find the area where this fault occurs. Run the display ospf lsdb summary command multiple times in any view of the current device to check Summary LSAs and determine which prefixes are changing.
 

<HUAWEI> display ospf lsdb summary
          OSPF Process 1 with Router ID 100.3.3.4
                          Area: 0.0.0.0
                  Link State Database
  Type      : Sum-Net
  Ls id     : 100.0.0.0
  Adv rtr   : 1.1.1.1
  Ls age    : 1216
  Len       : 28
  Options   :  E
  seq#      : 80000088
  chksum    : 0x60ec
  Net mask  : 255.255.255.0
  Tos 0  metric: 1562
  Priority  : Low

 
Log in to the ABR indicated by the Adv rtr field in the command output, and run the display ospf routing ip-address command in any view to check the OSPF routing table. Determine the area from which the route is learned, and rectify the fault in that area.
 

 <HUAWEI> display ospf routing 1.1.1.1
         OSPF Process 1 with Router ID 3.3.3.3
 Destination : 1.1.1.1/32
 AdverRouter : 1.1.1.1                  Area      : 0.0.0.1
 Cost        : 1562                     Type      : Stub
 NextHop     : 101.0.0.1                Interface : Serial0/0/1
 Priority    : Medium                   Age       : 00h00m46s
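
The Area field is the value of interest in this output. As a small illustration, the Python below reads the two-column "name : value" layout into a dictionary. It assumes values contain no spaces or colons, which holds for the sample above but may not hold for every output, and parse_route is a hypothetical helper name.

```python
import re

# One "name : value" pair; a line may carry two such pairs side by side.
PAIR = re.compile(r"(\w+)\s*:\s*(\S+)")


def parse_route(output: str) -> dict:
    """Collect every 'name : value' pair from a display ospf routing entry."""
    return dict(PAIR.findall(output))


sample = """\
         OSPF Process 1 with Router ID 3.3.3.3
 Destination : 1.1.1.1/32
 AdverRouter : 1.1.1.1                  Area      : 0.0.0.1
 Cost        : 1562                     Type      : Stub
 NextHop     : 101.0.0.1                Interface : Serial0/0/1
 Priority    : Medium                   Age       : 00h00m46s
"""

route = parse_route(sample)
print(f"{route['Destination']} is learned from area {route['Area']} via {route['Interface']}")
```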

 

4. Checking Whether External LSAs or NSSA LSAs Trigger Route Calculation

If External LSAs or NSSA LSAs trigger route calculation, the fault location methods are similar. The following example describes how to rectify the fault when External LSAs trigger route calculation.
A common reason for this fault is that the route sources of OSPF external routes change. If you log in to the ASBR (indicated by the Adv Router field), you will see that routes of other routing protocols are flapping. Rectify the fault according to the troubleshooting procedures of these routing protocols.
Check whether incorrect route preferences are configured when routing protocols import routes from each other.
 
For example, OSPF is configured between SwitchA and SwitchB; a static route is imported; a lower preference is configured for the static route. SwitchA receives an OSPF route from SwitchB, so this OSPF route replaces the low-preference static route in the IP routing table. However, after this static route disappears from the IP routing table, the OSPF route source disappears, so the OSPF route also disappears. Then the static route replaces the OSPF route in the IP routing table. This process repeats.
 
To rectify the fault, configure a correct preference for routes based on network planning.
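
The loop in the example above can be reproduced with a toy model. The Python sketch below is a deliberate simplification and not a model of any real device: it assumes that the OSPF external route exists only while the static route was active in the previous round, and that the route with the smaller preference value wins. The value 150 matches the ASE Route Preference shown in the display ospf brief output earlier, while 200 and 60 are example values chosen for the static route.

```python
def simulate(static_pref: int, ospf_ase_pref: int = 150, rounds: int = 6) -> list:
    """Return which route is active on SwitchA in each round of the toy model."""
    history = []
    ospf_available = False  # nothing has been imported into OSPF yet
    for _ in range(rounds):
        # A smaller preference value means a more preferred route.
        if ospf_available and ospf_ase_pref < static_pref:
            winner = "ospf"
        else:
            winner = "static"
        history.append(winner)
        # The OSPF route is sourced from the static route, so it is advertised
        # in the next round only if the static route is active now.
        ospf_available = winner == "static"
    return history


print("static preference 200:", simulate(200))  # static route less preferred than OSPF ASE
print("static preference 60: ", simulate(60))   # static route more preferred than OSPF ASE
```

With the static route less preferred than the OSPF external route, the active route alternates every round. With the static route more preferred, the table stays stable.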
 
If external routes do not change, check whether interface or neighbor flapping occurs on the ASBR according to the preceding procedures.
If you still cannot find the reason for route flapping, a router ID conflict may have occurred. In any view of the current device, run the display ospf lsdb ase command multiple times to check AS-External LSA information. If the content of the External LSAs remains unchanged but the seq# field value keeps increasing and the Ls age field value stays small, a router ID conflict has probably occurred in the current AS. To rectify the fault, find the device with the conflicting router ID and assign a new router ID to the device based on network planning.
 

<HUAWEI> display ospf lsdb ase
         OSPF Process 1 with Router ID 100.3.3.4
                  Link State Database
  Type      : External
  Ls id     : 200.1.1.0
  Adv rtr   : 3.3.3.3
  Ls age    : 712
  Len       : 36
  Options   :  E
  seq#      : 80000005
  chksum    : 0x53a2
  Net mask  : 255.255.255.0
  TOS 0  Metric: 1
  E type    : 2
  Forwarding Address : 0.0.0.0
  Tag       : 1
  Priority  : Low

 
 

5. Collecting Information and Seeking Technical Support

If the fault persists, collect related information and seek technical support.
Collect fault information.
Collect operation results of the preceding steps and record the results in a file.
Collect all diagnostic information and export the information to a file.
Run the display diagnostic-information file-name command in the user view to collect diagnostic information and save the information to a file.

<HUAWEI> display diagnostic-information dia-info.txt
Now saving the diagnostic information to the device
 100%
Info: The diagnostic information was saved to the device successfully.

 
 
When the diagnostic file is generated, you can export the file from the switch using FTP, SFTP, or SCP.
 
NOTICE:
You can run the dir command in the user view to check whether the file is generated.
You can also run the display diagnostic-information command and save terminal logs in a diagnostic file on a disk.
If this command displays a long output, you can press Ctrl+C to abort this command.
This command is used to collect diagnostic information for fault location. Executing this command may affect the system performance. For example, it may cause a high CPU usage. Therefore, do not run this command when the switch is running properly.
Do not run the display diagnostic-information command on multiple terminals connected to the switch at the same time. Otherwise, the CPU usage of the switch will increase sharply, causing system performance deterioration.
 
Collect the device log and trap information and export the information to files.
Run the save logfile all command in the user view to save the logs in the user log buffer area and diagnostic log buffer area to the user log file and diagnostic log file, respectively.

<HUAWEI> save logfile all
Info: Save logfile successfully.
Info: Save diagnostic logfile successfully.

 
 
When the log files are generated, you can export them from the switch using FTP, SFTP, or SCP.
 
NOTE:
You can also run the display logbuffer and display trapbuffer commands to check the log and trap information on the device, and save the information in diagnostic files on a disk.
 
If you have any questions, contact csd@telecomate.com for technical support.
NOTE:
Technical support personnel will provide instructions for you to submit all the collected information and files, so that they can locate faults.