Difference between revisions of "CCNP TSHOOT 642-832/Chapter 2"
From Teknologisk videncenter
Latest revision as of 14:28, 6 June 2010
Introduction to Troubleshooting Processes
Troubleshooting Methods
Defining Troubleshooting
- Step 1: Problem Report
- Step 2: Problem Diagnosis
- Step 3: Problem Resolution
Diagnosing a Problem
Step | Description |
---|---|
Collect Information | A problem report often lacks sufficient information. Collect additional information, for example from network management tools or by interviewing the user. |
Examine Collected Information | Examine the collected information, for example by comparing it to baseline information. |
Eliminate Potential Causes | Based on knowledge of the network and the collected information, start eliminating potential causes. |
Hypothesize Underlying Cause | After eliminating potential causes, hypothesize the most likely cause of the problem. |
Verify Hypothesis | Test whether the hypothesis is correct, that is, whether acting on it resolves the problem. |
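The Value of a Structured Troubleshooting Approach
[Figure: Structured Troubleshooting Approach]
[Figure: "Shoot from the hip" Troubleshooting Approach]
Popular Troubleshooting Methods
[Figure: Top-down, Bottom-up, or Divide and conquer approaches]
[Figure: Follow the Path of Traffic]
[Figure: Component Swapping]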
Structured Troubleshooting Procedure
Combining the three-step troubleshooting procedure described earlier with the subprocesses of the Problem Diagnosis step gives the following sequence:
1. Problem Report
2. Collect Information
3. Examine Collected Information
4. Eliminate Potential Causes
5. Hypothesize Underlying Cause
6. Verify Hypothesis
7. Problem Resolution
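The sequence above can be sketched as a small state model. This is only an illustrative sketch: the names `Step` and `next_step` are hypothetical, and the rule that a failed hypothesis sends the troubleshooter back to collecting information is an assumption, not something these notes state.

```python
from enum import IntEnum
from typing import Optional


class Step(IntEnum):
    """The seven steps of the structured troubleshooting procedure."""
    PROBLEM_REPORT = 1
    COLLECT_INFORMATION = 2
    EXAMINE_COLLECTED_INFORMATION = 3
    ELIMINATE_POTENTIAL_CAUSES = 4
    HYPOTHESIZE_UNDERLYING_CAUSE = 5
    VERIFY_HYPOTHESIS = 6
    PROBLEM_RESOLUTION = 7


def next_step(current: Step, hypothesis_confirmed: bool = True) -> Optional[Step]:
    """Return the step that follows `current`, or None when the procedure is done."""
    if current is Step.PROBLEM_RESOLUTION:
        return None
    if current is Step.VERIFY_HYPOTHESIS and not hypothesis_confirmed:
        # Assumption (not from the notes): a rejected hypothesis means
        # going back to gather more information.
        return Step.COLLECT_INFORMATION
    return Step(current + 1)
```

Tracking the current step per trouble ticket in this way makes it easy to see, for example, that a ticket has been sitting in the information-collection step for a long time.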
Problem Report
- The report often lacks information.
- Determine whether you are authorized to resolve the problem or need to forward it to someone who is.
- Interview the user who reported the problem.
Collect Information
- Collect information from routers and switches (show and debug commands, logs, NMS, etc.).
Examine Collected Information
- Identify indicators pointing to the underlying cause of the problem
- Find evidence that can be used to eliminate potential causes
Find a balance between two questions:
- What is occurring on the network?
- What should be occurring on the network?
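One way to make the two questions concrete is to hold the expected state and the observed state side by side and list every difference. The sketch below assumes both states have already been collected into plain dictionaries; the function name `find_deviations` and the example keys are hypothetical.

```python
def find_deviations(expected, observed):
    """Return {key: (expected_value, observed_value)} for every mismatch.

    A key that is missing on one side is reported with None for that side.
    """
    deviations = {}
    for key in sorted(set(expected) | set(observed)):
        want = expected.get(key)
        got = observed.get(key)
        if want != got:
            deviations[key] = (want, got)
    return deviations


# Hypothetical example: what should be occurring vs. what is occurring.
should_be = {"ospf_neighbors": 2, "GigabitEthernet0/1": "up"}
is_now = {"ospf_neighbors": 1, "GigabitEthernet0/1": "up"}
print(find_deviations(should_be, is_now))
```

Each reported deviation is a candidate indicator of the underlying cause, and every key that matches is evidence for eliminating a potential cause.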
Eliminate Potential Causes
- Check basic facts to rule causes out, for example: is OSPF running?
Hypothesize Underlying Cause
- If the problem cannot be resolved right away (for example, because of a lack of authority or unavailable devices), a temporary fix could resolve the problem here and now.
Verify Hypothesis
- Implement the fix, following a plan.
Problem Resolution
- Document the resolved problem.
Including Troubleshooting in Routine Network Maintenance
- Keep documentation up to date and reliable
Maintaining Network Documentation
- Require documentation - Make documentation a required component of the troubleshooting flow, so that documenting becomes a habit.
- Schedule documentation checks - Routinely verify that the documentation is still accurate.
- Automate documentation - Any configuration change on a device should be reflected in the documentation; comparing the two reveals differences.
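A comparison between a documented configuration and the configuration currently running on a device could be automated along these lines. This is a minimal sketch using Python's standard `difflib` module; it assumes both configurations are already available as text, and the function name `config_drift` and the sample configuration lines are hypothetical.

```python
import difflib


def config_drift(documented, running):
    """Return unified-diff lines between the documented and running config.

    An empty list means the documentation matches the device.
    """
    return list(
        difflib.unified_diff(
            documented.splitlines(),
            running.splitlines(),
            fromfile="documented",
            tofile="running",
            lineterm="",
        )
    )


# Hypothetical example: the default route was changed but never documented.
documented = "hostname R1\nip route 0.0.0.0 0.0.0.0 10.0.0.1"
running = "hostname R1\nip route 0.0.0.0 0.0.0.0 10.0.0.2"
for line in config_drift(documented, running):
    print(line)
```

Lines starting with `-` exist only in the documentation and lines starting with `+` exist only on the device, so any output is a prompt to update one or the other.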
Establish a Baseline
R1#show processes cpu sorted 5min
CPU utilization for five seconds: 5%/0%; one minute: 5%; five minutes: 5%
PID Runtime(ms) Invoked uSecs 5Sec 1Min 5Min TTY Process
112 3719202 1028129 3617 0.15% 0.13% 0.15% 0 HRPC qos request
4 2111767 304796 6928 0.00% 0.09% 0.06% 0 Check heaps
139 232 304 763 0.15% 0.11% 0.05% 2 Virtual Exec
142 554927 2283402 243 0.15% 0.10% 0.05% 0 IP Input
34 57731 47158926 1 0.00% 0.02% 0.00% 0 RedEarth Tx Mana
7 637306 3220477 197 0.15% 0.03% 0.00% 0 ARP Input
6 0 2 0 0.00% 0.00% 0.00% 0 Timers
Output omitted...
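Output like the above becomes a baseline only once it is recorded and compared with later measurements. The sketch below parses the CPU summary line from captured `show processes cpu` output and flags averages that exceed a stored baseline. In the `5%/0%` pair, the first value is total utilization and the second is time spent at interrupt level. The function names, the dictionary keys, and the 20-point margin are all assumptions chosen for illustration.

```python
import re

# Matches the summary line printed at the top of "show processes cpu".
CPU_LINE = re.compile(
    r"CPU utilization for five seconds: (\d+)%/(\d+)%; "
    r"one minute: (\d+)%; five minutes: (\d+)%"
)


def parse_cpu_summary(output):
    """Extract the CPU utilization percentages from captured command output."""
    match = CPU_LINE.search(output)
    if match is None:
        raise ValueError("no CPU utilization summary line found")
    five_sec, interrupt, one_min, five_min = (int(g) for g in match.groups())
    return {
        "five_sec": five_sec,    # total utilization, last five seconds
        "interrupt": interrupt,  # portion spent at interrupt level
        "one_min": one_min,
        "five_min": five_min,
    }


def exceeds_baseline(current, baseline, margin=20):
    """Return the averages that are more than `margin` points above baseline."""
    return [
        key
        for key in ("five_sec", "one_min", "five_min")
        if current[key] > baseline[key] + margin
    ]


baseline = parse_cpu_summary(
    "CPU utilization for five seconds: 5%/0%; one minute: 5%; five minutes: 5%"
)
print(baseline)
```

Storing the parsed values with a timestamp at regular intervals gives the "what should be occurring" reference that the examination step relies on.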
Communicating throughout the Troubleshooting Process
Step | Description |
---|---|
Problem Report | When a user reports a problem, communicate with the user to clarify it. |
Collect Information | Collecting information, for example from a service provider, requires communication. |
Examine Collected Information | Communicate with other IT staff. |
Eliminate Potential Causes | The troubleshooter may communicate with a consultant. |
Hypothesize Underlying Cause | Communication with other IT staff helps to clarify the underlying problem. |
Verify Hypothesis | Before making changes on the live network that cause outages, notify users about the interruption. |
Problem Resolution | Inform the user who reported the problem, and have the user confirm that the problem is resolved. |
Change Management
The process of Change Management includes using policies that dictate rules regarding how and when a change can be made and how that change is documented.