Advanced SAN Troubleshooting

Mike Frase Session BRKSAN 3708

Session_3708

© 2008 Cisco Systems, Inc. All rights reserved.

Cisco Public


Agenda
Fibre Channel & MDS Switch Basics
  FC Operations Review
  Addressing, FC Services
  Domains, Zoning

MDS Serviceability Tools
  FCanalyzer
  SPAN & PAA (Wireshark usage)
  SAN/OS (Output analysis, debug, logs, Cores)
  Performance Manager (Licensed part of Fabric Manager)
  NTOP (Using Netflow and SPAN w/PAA)

Troubleshooting
  Device connections
  ISLs
  Zoning
  IVR
  NPV

Simple to Complex - Troubleshooting at all Levels
Simple SAN – small port count
  No Inter-switch links
  Dual Fabric
  Minimal SAN/OS feature use

Director Class collapsed CORE SAN – large port count
  High availability Design
  Port Channels, FSPF Routing
  Multi SAN/OS feature use

Core Director SAN w/ Bladeserver & edge switches
  Multi Domains
  Interop
  NPV - Top of Rack

Central SAN to Backup DR SAN
  Inter VSAN Routing
  Distant extended ISLs, FCIP


Fibre Channel Operations

Physical Layer basics
Understanding FC addressing
  Needing to live within the limits of the standards

Fibre Channel Protocol Services & SAN/OS
  Refreshers on FLOGI, PLOGI, standards operations as they relate to SAN/OS
The ISL connection

Domains - Operational understanding
Zoning – Basic & Enhanced operation

Fibre Channel Layers
Structure Is Divided into Five Levels of Functionality
FC-0: defines the physical interface characteristics
  Signaling rates, cables, connectors, distance capabilities, etc.

FC-1: defines how characters are encoded/decoded for transmission
  Transmission characters are given desirable characteristics

FC-2: defines how information is transported
  Frames, sequences, exchanges, login sessions

FC-3: placeholder for future functions

FC-4: defines how different protocols are mapped to use Fibre Channel
  SCSI, IP, Virtual Interface Architecture, others

[Diagram: layer stack, top to bottom: FC-4s (SCSI, IP), FC-3 Common Services, FC-2 Signaling Protocol, FC-1 Transmission Code, FC-0 Physical Interface; FC-0 through FC-2 together form FC-PH]


Detailed SFP levels

This detailed SFP transceiver output is only available on new Cisco-qualified 4 Gig and 10 Gig SFPs


Fibre Channel Transmission Words: FC-1
Ordered sets are transmitted continuously to indicate that specific conditions within the link are encountered
Ordered sets are transmitted while the condition exists
Four primitive sequences can help determine where a problem exists:
  Not Operational Sequence (NOS)
  Offline Sequence (OLS)
  Link Reset Sequence (LR)
  Link Reset Response Sequence (LRR)


Link Initialization Flow Fibre Channel Layer 1 Protocol (FC-1)
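[Diagram: Port A and Port B recover from a link failure condition. Both ports begin in the AC state; the primitive sequences exchanged are, in order, NOS, OLS, LR, LRR, Idle, Idle; the ports pass through the LF, OL and LR states and return to AC.]

These are all special ordered sets of 8B/10B coding:
  NOS = Not Operational Sequence
  OLS = Offline Sequence
  LR = Link Reset
  LRR = Link Reset Response

Port states:
  AC = Active State
  LF = Link Failure State
  OL = Offline State
  LR = Link Recovery State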
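
The handshake on this slide can be sketched as a lookup from the primitive sequence a port receives to the one it sends back. This is a simplified illustration that follows only the order shown here (NOS, OLS, LR, LRR, Idle); the names `REPLY_TO` and `handshake` are hypothetical, and timers and error handling are ignored.

```python
# Simplified model of the FC-1 link initialization handshake.
# Assumption: every received primitive sequence triggers exactly one reply,
# in the order shown on the slide (NOS -> OLS -> LR -> LRR -> Idle).
REPLY_TO = {
    "NOS": "OLS",    # Not Operational received -> send Offline Sequence
    "OLS": "LR",     # Offline Sequence received -> send Link Reset
    "LR": "LRR",     # Link Reset received -> send Link Reset Response
    "LRR": "IDLE",   # Link Reset Response received -> send Idle
    "IDLE": "IDLE",  # Idle received -> keep sending Idle (link is Active)
}


def handshake(first="NOS"):
    """Return the primitives exchanged, starting with `first`,
    until both ports are sending Idle."""
    seen = [first]
    while not (len(seen) >= 2 and seen[-1] == "IDLE" and seen[-2] == "IDLE"):
        seen.append(REPLY_TO[seen[-1]])
    return seen


print(handshake())
```

When primitive sequence counters on an interface keep climbing, the first primitive that repeats hints at how far the link gets before it falls back.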

Fibre Channel Port Issues
In Order for an F_Port to Come Up on an MDS Switch:
  1. The switch port must first acquire bit and word synchronization with the N_Port
  2. N_Port must issue a FLOGI to the MDS
Primitive Sequence Counters Can Determine Layer 0–1 Problems

Tip:
Clear counters and monitor to verify active issues. Use the Device Manager monitor tool to monitor live. Set and activate Threshold Manager to alert you.
MDS_Switch# clear counters interface fc 1/1

Line Card Basics
"show module" will display slot locations and type of card
General interface information and statistics can be gathered from the switch main-level prompt
More detailed gathering of ASIC counters may be required to troubleshoot difficult issues; attaching to the module is then required

Attach to a module with the "attach" command. All modules can be attached to, including the standby supervisor and IPS blade. Type "exit" to detach.

Monitor Link Init State – Gen 1 LC
(Flow from Bottom to Top)
Attached to Mod 1 FC Port 1/2

module-1# show hardware internal fc-mac port 2 stateinfo
F-Port Point to Point Negotiated
LINK: 052 022180c5 LR_RECEIVE(03) =>ACTIVE(01)
LINK: 051 022180c3 OLS_TRANSMIT(07) =>LR_RECEIVE(03)
LINK: 050 022180c1 SENTINAL(00) =>OLS_TRANSMIT(07)
LOOP: 049 022180c1 HW_ALPAS(0d) MASTER_LISA_WAIT(1f)=> OLD_PORT(3f)
LOOP: 048 022180c1 HW_ALPAS(0d) MASTER_LISA(1e)=> MASTER_LISA_WAIT(1f)
LOOP: 047 022180c1 HW_ALPAS(0d) MASTER_LIHA_WAIT(1d)=> MASTER_LISA(1e)
LOOP: 046 022180c1 HW_ALPAS(0d) MASTER_LIHA(1c)=> MASTER_LIHA_WAIT(1d)
LOOP: 045 022180c1 HW_ALPAS(0d) MASTER_LIPA_WAIT(1b)=> MASTER_LIHA(1c)
LOOP: 044 022180c1 HW_ALPAS(0d) MASTER_LIPA(1a)=> MASTER_LIPA_WAIT(1b)
LOOP: 043 022180c1 HW_ALPAS(0d) MASTER_LIFA_WAIT(19)=> MASTER_LIPA(1a)
LOOP: 042 022180c1 HW_ALPAS(0d) MASTER_LIFA(18)=> MASTER_LIFA_WAIT(19)
LOOP: 041 022180c1 HW_X_ARB(0c) MASTER_START(17)=> MASTER_LIFA(18)
LOOP: 040 022180c1 HW_LISM0(0a) OPEN_INIT_SELECT_MASTER(06)=> MASTER_START(17)
LOOP: 039 022180bf HW_R_LIP(09) OPEN_INIT_START(05)=> OPEN_INIT_SELECT_MASTER(06)
LOOP: 038 022180bf HW_X_LIP(08) NORMAL_INITIALIZE(04)=> OPEN_INIT_START(05)
LOOP: 037 022180bf HW_R_LIP(09) LPSM_STARTED(01)=> NORMAL_INITIALIZE(04)
LOOP: 036 022180b0 HW_OLDP(07) LPSM_DISABLED(00)=> LPSM_STARTED(01)
LINK: 035 022170e8 ACTIVE(01) =>SENTINAL(00)

Port Tries Loop First When Port Is Set to Auto

Interface Shut/ No Shut via Configuration
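
Because the stateinfo log reads from bottom to top, it can help to re-sort it before reading. The sketch below is one way to do that offline, assuming each transition sits on its own line in the `LINK:`/`LOOP:` form shown in this output; the regular expression and the `transitions` helper are illustrative and not part of any Cisco tool.

```python
import re

# One transition per line, e.g.
#   LINK: 052 022180c5 LR_RECEIVE(03) =>ACTIVE(01)
#   LOOP: 049 022180c1 HW_ALPAS(0d) MASTER_LISA_WAIT(1f)=> OLD_PORT(3f)
LINE = re.compile(
    r"^(LINK|LOOP):\s+(\d+)\s+[0-9a-f]+\s+"  # type, sequence number, timestamp
    r"(?:\w+\(\w+\)\s+)?"                     # optional hardware event (LOOP lines)
    r"(\w+)\(\w+\)\s*=>\s*(\w+)\(\w+\)$"      # old state => new state
)


def transitions(log_text):
    """Return (sequence, type, old_state, new_state) tuples, oldest first."""
    rows = []
    for raw in log_text.splitlines():
        match = LINE.match(raw.strip())
        if match:
            kind, seq, old, new = match.groups()
            rows.append((int(seq), kind, old, new))
    return sorted(rows)


sample = """\
LINK: 052 022180c5 LR_RECEIVE(03) =>ACTIVE(01)
LINK: 051 022180c3 OLS_TRANSMIT(07) =>LR_RECEIVE(03)
LOOP: 049 022180c1 HW_ALPAS(0d) MASTER_LISA_WAIT(1f)=> OLD_PORT(3f)
LINK: 035 022170e8 ACTIVE(01) =>SENTINAL(00)
"""

for seq, kind, old, new in transitions(sample):
    print(f"{seq:03d} {kind} {old} -> {new}")
```

Sorted this way, the shut/no shut appears first (ACTIVE to SENTINAL), followed by the loop attempt and then the point-to-point bring-up.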


Monitor Link Init State - Gen-2 LC

The newer Generation-2 line cards (4 Gig) present link events differently for debugging
Still requires attach to the module

module-7# show hardware internal fc-mac port 1 link-event
================= FCP Port#1 Link State Machine Event Log ==================
MMDDYY HHMMSS usecs          Event
032907 014953 369768 (0000)  E_LINK_IDLE
032907 014953 368963 (0000)  E_LINK_LR
032907 014953 365690 (0000)  E_LINK_NOS
032907 014953 365593 (0001)  E_LINK_MIN_OLS
032907 014953 360463 (016D)  E_LINK_LINK_INIT
032907 014949 710690 (413C)  E_LINK_CLEANUP
Current State column, top to bottom as extracted: LINK_ACTIVE, LINK_LR_RX, LINK_NOS_RX, LINK_OLS_TX, LINK_INIT, LINK_DIS, LINK_ACTIVE


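N-Port Virtualization Operations NPIV-NPV
[Diagram: a host N port connects to an F port on the NPV enabled switch; the NP port of the NPV enabled switch connects to an F port on the NPV-core switch with NPIV configured. Frames labeled in the flow: FLOGI (acc), PLOGI (acc), FDISC, ACC, PLOGI (acc), FLOGI, PLOGI, PRLI.]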


Understanding FC Addressing


Addresses
From Fabric Manager
FCID Domain/Area/Host Port World Wide Name Vendor name derived from standards assigned OUI 0b / 00 / 01
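
As a small worked example of the Domain/Area/Port split, the sketch below breaks a 24-bit FCID into its three bytes. The function name `split_fcid` is hypothetical; the example value is the 0b / 00 / 01 address from this slide.

```python
def split_fcid(fcid):
    """Split a 24-bit FCID into (domain, area, port) bytes."""
    if not 0 <= fcid <= 0xFFFFFF:
        raise ValueError("FCID must fit in 24 bits")
    return (fcid >> 16) & 0xFF, (fcid >> 8) & 0xFF, fcid & 0xFF


domain, area, port = split_fcid(0x0B0001)
print(f"domain=0x{domain:02x} area=0x{area:02x} port=0x{port:02x}")
```

The domain byte is the one to match against the domain list when working out which switch a device is attached to.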


Addresses

From CLI

show fcns database
You will require the domain list from the switches to know where the devices are physically connected on the SAN; a device could be IVR'ed from another VSAN
sh fcdomain domain-list vsan 2

Target or initiator role is passed to the switch via Process Login (PRLI) and ACCEPT frames during nameserver login

The OUI is part of the PWWN and is IEEE assigned. 00:e0:8b is assigned to QLogic, 00:00:c9 to Emulex, and 00:04:cf to Seagate; there are many others
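
A rough sketch of pulling the OUI out of a PWWN is shown below. It assumes the common WWN layouts (NAA types 1 and 2 carry the OUI in bytes 2 to 4, NAA type 5 carries it in the 24 bits after the first nibble); the lookup table holds only the three vendors named on this slide, and the sample PWWNs in the usage lines are made up for illustration.

```python
# OUIs named on the slide; a real table would be far larger.
KNOWN_OUIS = {
    "00:e0:8b": "QLogic",
    "00:00:c9": "Emulex",
    "00:04:cf": "Seagate",
}


def oui_of(wwn):
    """Return the OUI of a colon-separated 8-byte WWN as 'xx:xx:xx'."""
    raw = bytes(int(part, 16) for part in wwn.split(":"))
    if len(raw) != 8:
        raise ValueError("expected an 8-byte WWN")
    naa = raw[0] >> 4
    if naa in (1, 2):
        oui = raw[2:5]                      # bytes 2..4 hold the OUI
    elif naa == 5:
        value = int.from_bytes(raw, "big")  # 4-bit NAA, 24-bit OUI, 36-bit vendor field
        oui = ((value >> 36) & 0xFFFFFF).to_bytes(3, "big")
    else:
        raise ValueError(f"NAA type {naa} is not handled in this sketch")
    return ":".join(f"{byte:02x}" for byte in oui)


def vendor_of(wwn):
    return KNOWN_OUIS.get(oui_of(wwn), "unknown")


# Hypothetical PWWNs, for illustration only.
print(vendor_of("21:00:00:e0:8b:01:02:03"))
print(vendor_of("10:00:00:00:c9:12:34:56"))
```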

Addresses
From Performance Manager

The address list can be detached and imported into a spreadsheet (xls) for reference when troubleshooting


Fabric Services
FC Standards
FFFFFE: Well-known address of login server
  Basic responsibility is to assign the FCID

FFFFFC: Well-known address of fabric directory name server
  Basic responsibility is registration of devices and distribution of the database

FFFFFD: Well-known address of fabric controller
  Basic duties are managing the fabric, initialization and FSPF routing; it is responsible for RSCNs and listens for SCNs

FFFCxx: Cisco specific, representing the name server in a VSAN and domain xx
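
When reading a trace, it saves time to label these addresses automatically. The sketch below maps an FCID to the service names used on this slide; the `describe` helper is hypothetical, and the FFFCxx wording follows the slide's description.

```python
WELL_KNOWN = {
    0xFFFFFE: "login server",
    0xFFFFFC: "name server",
    0xFFFFFD: "fabric controller",
}


def describe(fcid):
    """Label an FCID with the fabric service it addresses, if any."""
    if fcid in WELL_KNOWN:
        return WELL_KNOWN[fcid]
    if fcid >> 8 == 0xFFFC:
        # FFFCxx: per-domain address, xx is the domain ID
        return f"name server in domain 0x{fcid & 0xFF:02x}"
    return "device address"


for address in (0xFFFFFE, 0xFFFC7E, 0x7E0100):
    print(f"{address:06x}: {describe(address)}")
```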

Wireshark Uses
Wireshark (once known as Ethereal) is part of the SAN/OS system image and can be run directly on the switch via SSH/Telnet (fcanalyzer command).

Wireshark on a PC, combined with Cisco SPAN and a Port Analyzer Adapter (PAA), can be used as an inline tool with no disruption to traffic.
The combination of Wireshark on a PC with a PAA gives a complete look at the flow beyond the FLOGI/PLOGI process.
We will look more at Wireshark best practices later in this session.

FCP Login Fundamentals
Link initialization: NOS, OLS, LR, LRR
FLOGI: Fabric Login to Login Server; FCID assigned by Login Server
PLOGI: Port Login to Fabric Nameserver
SCR: State Change Registration with Fabric Controller
Query Nameserver for FC-4 type devices (storage)
PLOGI to storage


Switch to switch ISL Communication Fundamentals
ISLs in the topology change the complexity of the SAN

• Common Fabric Parameters required
• Principal switch selection
• Domain IDs unique across SAN / VSAN
• FSPF routing is activated
• Zone Merging must occur
• Possible VSAN trunking involved
• Possible Port Channeling involved

We will be looking at this in greater detail in the troubleshooting section

Domains


The Domain ID
Domain IDs are assigned by the principal switch based on the non-principal switch's requesting domain ID. If it is available, the domain ID is assigned to that switch. If not, a domain ID is picked from a "Least Recently Used" free list. On a fresh switch, the search for the free domain starts from 239 and goes in decreasing order. Before a switch ever joins a fabric, each switch assigns itself a domain ID based on its configured domain ID. If the configured domain ID type is preferred and configured domain ID is 0, then it assigns itself a random domain ID.
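
The assignment rule above can be sketched as follows. This is a simplified model under stated assumptions: the free list is just 239 counting down (the fresh-switch case), the "Least Recently Used" ordering is not modeled, and the static versus preferred distinction is ignored. The function name `assign_domain` is hypothetical.

```python
def assign_domain(requested, in_use, free_list=None):
    """Pick a domain ID the way the principal switch is described to.

    requested: domain ID asked for by the joining switch (0 = no preference)
    in_use:    set of domain IDs already present in the fabric
    free_list: candidate IDs in search order; defaults to 239 downward
    """
    if free_list is None:
        free_list = range(239, 0, -1)  # fresh switch: search starts at 239
    if requested and requested not in in_use:
        return requested               # requested ID is available
    for candidate in free_list:
        if candidate not in in_use:
            return candidate           # first free ID in search order
    raise RuntimeError("no free domain ID")


print(assign_domain(77, in_use={1, 2}))      # requested ID is free
print(assign_domain(77, in_use={77}))        # requested ID is taken
print(assign_domain(0, in_use={239, 238}))   # no preference
```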


Domain IDs
[Screenshot callouts]
Running Domains in a VSAN
Configured Domains in a VSAN
Choices for action to configure the Domain ID: BF, RCF
Duplicate domain IDs are allowed in different VSANs. If IVR must run between VSANs that use the same domain IDs, IVR NAT is required.
Local is the switch I am connected to


Principal Switch Priority - Operation
MDS-switch(config)# fcdomain priority 2 vsan 10
MDS-switch(config)# do show fcdomain vsan 10
The local switch is a Subordinated Switch.

Local switch run time information:
        State: Stable
        Local switch WWN: 20:0a:00:05:30:01:97:43
        Running fabric name: 20:0a:00:05:30:00:49:df
        Running priority: 128
        Current domain ID: 0x4d(77)

Local switch configuration information:
        State: Enabled
        FCID persistence: Enabled
        Auto-reconfiguration: Disabled
        Contiguous-allocation: Disabled
        Configured fabric name: 20:01:00:05:30:00:28:df
        Optimize Mode: Disabled
        Configured priority: 2
        Configured domain ID: 0x00(0) (preferred)

Principal switch run time information:
        Running priority: 4

MDS-switch(config)# fcdomain restart vsan 10

Callouts:
  Priority is set per VSAN. Try to reserve priority 1 so that it can trump any switch; the lowest number always wins
  Running priority: does not immediately go into effect, still at the default 128
  Configured priority: 2
  Principal switch running priority: priority of the principal switch
  fcdomain restart: must do a non-disruptive (BF) restart
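
The "lowest number always wins" rule can be illustrated with a short sketch. The tie-break on the lowest switch WWN is an assumption taken from general FC-SW behavior and is not stated on this slide; `elect_principal` is a hypothetical name, and the sample values reuse the priorities and WWNs from the output above.

```python
def wwn_key(wwn):
    """Turn 'aa:bb:...' into bytes so WWNs compare numerically."""
    return bytes.fromhex(wwn.replace(":", ""))


def elect_principal(switches):
    """Lowest priority wins; lowest WWN breaks ties (assumed tie-break)."""
    return min(switches, key=lambda s: (s["priority"], wwn_key(s["wwn"])))


local = {"name": "local", "priority": 128, "wwn": "20:0a:00:05:30:01:97:43"}
current = {"name": "current principal", "priority": 4,
           "wwn": "20:0a:00:05:30:00:49:df"}

print(elect_principal([local, current])["name"])  # running priority still 128

local["priority"] = 2                             # after the fcdomain restart
print(elect_principal([local, current])["name"])
```

This mirrors why the configured priority of 2 has no effect until the restart: the selection only considers running priorities.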


Domain ID - be in the know!
Per VSAN
  It is a per-VSAN configuration

Build Fabric or Reconfigure Fabric
  Two choices when defining a Domain ID, per FC-SW standards
  BF = non-disruptive to the complete fabric, but forces changes that could cause logged-in devices to log in again
  RCF = disruptive to the fabric; complete loss of path to a switch

Interop modes and effects on Domain ID assignments
  Depending on interop mode, the range of domains may be limited to 97-127 due to McData's 31 domain ID limit

Planning, proper administration
  Consider avoiding duplicate domain IDs across all VSANs if plans may someday include IVR; NAT will then not be required
  Best practice is to configure core domain IDs and the principal switch, primary and secondary
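
As a quick check of the arithmetic behind the interop limit, the range 97 through 127 holds exactly 31 domain IDs. The sketch below validates a planned domain ID against that range; the helper name `allowed` is hypothetical, and the non-interop range of 1 to 239 is the standard domain ID range.

```python
INTEROP_RANGE = range(97, 128)   # 97..127 inclusive
NATIVE_RANGE = range(1, 240)     # 1..239 inclusive


def allowed(domain_id, interop=False):
    """True if domain_id is usable in the chosen mode."""
    return domain_id in (INTEROP_RANGE if interop else NATIVE_RANGE)


print(len(INTEROP_RANGE))          # number of IDs available in interop mode
print(allowed(77, interop=True))   # 0x4d from the earlier example
print(allowed(104, interop=True))
```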

Zoning


Zoning – Basic operational understanding
What can we Zone
RSCNs and Zones
Zoning Standards – Basic vs. Enhanced
Zone Distribution, Export, Import, Merge


Physical points for zone members
Zoning choices:
  PWWN or FCID
  Switch FC interface or Fabric Port WWN (FWWN)
[Diagram: VSAN 100 spanning Domain 0x4 and Domain 0x5, with switch interfaces 1/8, 7/11, 1/23, 1/2, 7/13 and 3/5 labeled]


RSCNs and Zoning
Devices must register with the MDS if they would like to receive RSCNs.
The MDS generates a local RSCN to devices within the affected zone when a zoned member logs in or out of the fabric.
Devices that log in or out of the fabric and are not part of the zone will not generate RSCNs to devices in the zone.
SW_RSCNs will be sent to all switches, and they in turn will decide if a local RSCN needs to be sent (based on zoning and affected devices).
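
The rules above reduce to a set intersection: the devices that share a zone with the changed device, limited to those that registered for state change notifications. The sketch below illustrates this with made-up device and zone names; `rscn_recipients` is a hypothetical helper, not switch code.

```python
def rscn_recipients(changed, zones, registered):
    """Devices that should get a local RSCN when `changed` logs in or out.

    zones:      mapping of zone name -> set of member names
    registered: devices that sent an SCR to the fabric controller
    """
    peers = set()
    for members in zones.values():
        if changed in members:
            peers |= set(members)      # everyone zoned with the changed device
    peers.discard(changed)
    return peers & set(registered)     # only registered devices are notified


zones = {
    "z1": {"host1", "array1"},
    "z2": {"host2", "array1"},
    "z3": {"host3", "tape1"},
}
registered = {"host1", "host2", "host3"}

print(sorted(rscn_recipients("array1", zones, registered)))
print(sorted(rscn_recipients("unzoned_host", zones, registered)))
```

A host that never sees a storage change is therefore either not zoned with the device or never sent its SCR, and both are visible in a login trace.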

Zone Server Modes
Zone server supports 2 different modes
  Basic mode: represents the zone server behavior of the FC-GS3/FC-SW2 standard. All SAN-OS releases support basic mode.
  Enhanced mode: represents the zone server behavior of the FC-GS4/FC-SW3 standard. SAN-OS 2.0 and later are required for enhanced mode.


Activate Zoneset Flow Across ISLs
Zoneset Distribution Would Go to Every Domain Within the VSAN
  ACA (Acquire Change Authorization): lock the fabric; answered with ACC
  SFC (Stage Fabric Configuration): move the zone data to the switches; answered with ACC
  UFC (Update Fabric Configuration): trigger switches to activate the new zoneset; answered with ACC
  RCA (Release Change Authorization): unlock the fabric; answered with ACC

ACK1 Frames Are Not Shown Here
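
The four-step exchange behaves like a lock, stage, commit, unlock sequence sent to every domain. The sketch below models it with a caller-supplied `send` function that returns True for an accept; the function names, the "RJT" label and the choice to always release the lock after a failure are assumptions made for illustration.

```python
def activate_zoneset(domains, send):
    """Run ACA -> SFC -> UFC against every domain, then RCA.

    send(domain, request) returns True when the domain answers with ACC.
    Returns (success, log) where log lists (request, domain, reply).
    """
    log = []
    success = True
    for request in ("ACA", "SFC", "UFC"):
        for domain in domains:
            accepted = send(domain, request)
            log.append((request, domain, "ACC" if accepted else "RJT"))
            success = success and accepted
        if not success:
            break                      # stop at the first phase with a reject
    for domain in domains:             # assumed: the lock is always released
        accepted = send(domain, "RCA")
        log.append(("RCA", domain, "ACC" if accepted else "RJT"))
    return success, log


ok, log = activate_zoneset([0x0A, 0xCB], lambda domain, request: True)
print(ok, [request for request, _, _ in log])
```

In a trace of a failed activation, the phase where the first non-accept appears shows whether the problem is the fabric lock, the staged data or the activation itself.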

Enhanced Zoning
Enhanced zoning provides the following advantages
  Disallow parallel configuration attempts
  Standardized generation of RSCN
  Reduced payload size of the SFC frame
  Fabric-wide policy enforcement (default zone, merge control)
  Enhanced error reporting
  Distributing zonesets without activation
  Unique Vendor Types
  FWWN based member type standardized
  Enhanced interop through ESS (Exchange Switch Support) defined in SW-3


Active only operation

• Only the active zone data is sent
• FCaliases, zones and zonesets that are not part of the zoneset being activated are not sent
• The running configuration on switch Fishfry will not show the active zoning information.

Full Zoneset active operation

• The complete zone data is sent
• FCaliases, zones and zonesets that are not part of the zoneset being activated are sent
• The running configuration on Fishfry will show the active zoning information.

Recommendations
If the SAN administrators wish to be able to manage zones from any switch in the fabric, then configure all switches/VSANs for ‘distribute full zone database’.
If the SAN administrator wishes to manage zone changes from only 1 switch in the fabric, then they can leave the default configuration of ‘distribute active zone database only’.
Inconsistent zone distribution policies can cause problems when a zoneset is modified on a switch that may not have had the most current information in its configuration when the change was made.


MDS9000 SAN Tools

FCanalyzer
SPAN & PAA (Wireshark usage)
SAN/OS (Output analysis, debug, logs, Cores)
Performance Manager (Licensed part of Fabric Manager)
NTOP (Using Netflow and SPAN w/PAA)

How Do We Troubleshoot the Network
[Diagram: troubleshooting access points on a SAN with iSCSI servers, FC servers, storage, and a Fabric Manager and Device Manager PC on the out-of-band Ethernet management network]
  SPAN (port monitor) on Cisco L2 switch; run Wireshark on an IP host
  FC Analyzer local
  FC Analyzer remote, to a Wireshark PC
  PAA
  Debugs and show commands, via Telnet or console terminal
  SPAN FC ports, port channels, iSCSI or FCIP port to SD port on MDS

Gathering Protocol Traces for Analysis
Using built-in FC Analyzer (CLI)
  Linux version from Wireshark.com
Using Wireshark on PC (local and remote)
Using the MDS port analyzer adapter w/SPAN
Using an external FC Analyzer tester in line or with SPAN

Non-Disruptive to Switch Operations and Traffic on the SAN


MDS FC Analyzer (SAN/OS Embedded)
Output is displayed to the console in readable sniffer-like format
Is only used to monitor Fibre Channel traffic to and from the supervisor on the MDS9000
  Traffic such as fabric logins, FSPF routing, switch-to-switch control traffic
Output can go directly to your console screen or to a workstation running the Wireshark program

Note: SPAN Is Used for FC Port to FC Port Monitoring

FC Analyzer Options
Local or remote: Where to send the trace. Can be to local devices or to a remote PAA attached to a different MDS switch.
Brief or detailed: Header information vs. full output of frame including hex. Detail is default.
Limit-captured-frames: Number of frames to capture. By default FC analyzer will trace 100 frames. Specifying zero is unlimited frame capture.
Limit-capture-size: Allows capture of N bytes of each frame. Useful for not capturing frame data when it is not relevant to troubleshooting.


Display-Filter Options
When not specified, FC analyzer will capture all traffic on VSAN 1
Example: fcanalyzer local brief

To specify a different VSAN, use a display-filter and specify the VSAN to be captured in hex or decimal
Example for VSAN 100 (note the two equal signs):
fcanalyzer local display-filter mdshdr.vsan==0x64
(or, in decimal, mdshdr.vsan==100)

To capture for a specific address in VSAN 100:
Example:
fcanalyzer local brief display-filter ((fc.d_id==64.01.00)or(fc.s_id==64.01.00))

The remote capture method is recommended because filtering is easier in the Wireshark GUI
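
Building these filter strings by hand invites hex mistakes, so a small helper can generate them. The sketch below only assembles the command forms shown on this slide; the helper names are hypothetical and nothing here talks to a switch.

```python
def vsan_filter(vsan):
    """Display filter for one VSAN, written in hex as on the slide."""
    return f"mdshdr.vsan==0x{vsan:x}"


def fcid_dotted(fcid):
    """Format a 24-bit FCID as dd.aa.pp, the form used in fc.s_id / fc.d_id."""
    return ".".join(f"{(fcid >> shift) & 0xFF:02x}" for shift in (16, 8, 0))


def address_filter(fcid):
    """Match frames sent to or from one FCID."""
    addr = fcid_dotted(fcid)
    return f"((fc.d_id=={addr})or(fc.s_id=={addr}))"


def command(filter_expr, brief=True):
    """Assemble an 'fcanalyzer local' command line with a display filter."""
    parts = ["fcanalyzer", "local"]
    if brief:
        parts.append("brief")
    parts += ["display-filter", filter_expr]
    return " ".join(parts)


print(command(vsan_filter(100), brief=False))
print(command(address_filter(0x640100)))
```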

Using Write Option for fcanalyzer
Using the write option sends the output of fcanalyzer to a file on the switch, in the directory called volatile. This trace file can then be copied off the MDS switch and viewed with the Wireshark application on a PC
Example: Capture 250 frames of all traffic on VSAN 200

fcanalyzer local display-filter mdshdr.vsan==0xc8 write volatile:capture.trc limit-captured-frames 250

The file name on the volatile: filesystem will have extra characters appended when the file is written. Issue the following commands to see the contents of the filesystem and then copy the file via tftp/ftp
dir volatile:
copy volatile:capture_00001_20050321172628.trc tftp://<tftp server ip addr>/capture.trc

FCAnalyzer Local Brief
Capture Is Done in Configuration Mode
TOP-9216i# conf t
Enter configuration commands, one per line. End with CNTL/Z.
TOP-9216i(config)# fcanalyzer local brief display-filter mdshdr.vsan==2
Warning: Couldn't obtain netmask info (eth2: no IPv4 address assigned).
Capturing on eth2
2.829871 00.00.00 -> ff.ff.fe 0x2288 0xffff FC ELS FLOGI
2.853261 ff.ff.fe -> 7e.01.00 0x2288 0xc728 FC ELS ACC (FLOGI)
2.853422 7e.01.00 -> ff.ff.fc 0x22a0 0xffff FC ELS PLOGI
2.853592 ff.ff.fc -> 7e.01.00 0x22a0 0xc729 FC ELS ACC (PLOGI)
2.853565 7e.01.00 -> ff.ff.fd 0x22b8 0xffff FC ELS SCR
2.859648 ff.fc.7e -> 7e.01.00 0xc72c 0xffff FC ELS PLOGI
2.860885 7e.01.00 -> ff.fc.7e 0xc72c 0x22d0 FC ELS ACC (PLOGI)
2.861007 ff.fc.7e -> 7e.01.00 0xc72b 0xffff FC ELS PRLI
2.861175 7e.01.00 -> ff.fc.7e 0xc72b 0x22e8 FC ELS ACC (PRLI)
2.862053 7e.01.00 -> ff.ff.fc 0x2300 0xffff dNS RFT_ID
2.865904 ff.fc.7e -> ff.fc.0a 0xc72e 0xffff SW_ILS SW_RSCN
2.865981 ff.fc.7e -> ff.fc.cb 0xc72f 0xffff SW_ILS SW_RSCN
2.866153 ff.fc.0a -> ff.fc.7e 0xc72e 0x77f9 FC Link Ctl, ACK1
2.866297 ff.fc.cb -> ff.fc.7e 0xc72f 0x77fa FC Link Ctl, ACK1
2.866445 ff.fc.0a -> ff.fc.7e 0xc72e 0x77f9 SW_ILS SW_ACC (SW_RSCN)
2.866496 ff.fc.7e -> ff.fc.0a 0xc72e 0x77f9 FC Link Ctl, ACK1
2.866615 ff.ff.fd -> 7e.01.00 0x22b8 0xc72a FC ELS ACC (SCR)
2.868792 ff.fc.cb -> ff.fc.7e 0xc72f 0x77fa SW_ILS SW_ACC (SW_RSCN)
2.868857 ff.fc.7e -> ff.fc.cb 0xc72f 0x77fa FC Link Ctl, ACK1
2.871132 ff.fc.7e -> 7e.01.00 0xc730 0xffff FC ELS LOGO
2.872013 7e.01.00 -> ff.fc.7e 0xc730 0x2318 FC ELS ACC (LOGO)
2.872021 7e.01.00 -> ff.fc.7e 0x2318 0xffff FC ELS PLOGI
2.872139 ff.fc.7e -> 7e.01.00 0x2318 0xc731 FC ELS LS_RJT (PLOGI)
2.872163 ff.fc.cb -> ff.fc.7e 0x77fb 0xffff dNS GE_ID
2.872234 ff.fc.7e -> ff.fc.cb 0x77fb 0xc732 FC Link Ctl, ACK1
2.891239 ff.fc.7e -> ff.fc.cb 0x77fb 0xc732 dNS ACC (GE_ID)
2.891359 ff.ff.fc -> 7e.01.00 0x2300 0xc72d dNS ACC (RFT_ID)
2.891469 7e.01.00 -> ff.ff.fc 0x2330 0xffff dNS RFF_ID
2.891613 ff.fc.cb -> ff.fc.7e 0x77fb 0xc732 FC Link Ctl, ACK1
2.900160 ff.ff.fc -> 7e.01.00 0x2330 0xc733 dNS ACC (RFF_ID)
2.900394 7e.01.00 -> ff.ff.fc 0x2378 0xffff dNS RNN_ID
2.901916 ff.ff.fc -> 7e.01.00 0x2378 0xc734 dNS ACC (RNN_ID)
2.902151 7e.01.00 -> ff.ff.fc 0x23a8 0xffff dNS RSNN_NN
2.908296 ff.ff.fc -> 7e.01.00 0x23a8 0xc735 dNS ACC (RSNN_NN)
2.908444 7e.01.00 -> ff.ff.fc 0x23c0 0xffff dNS GNN_FT
2.919880 ff.ff.fc -> 7e.01.00 0x23c0 0xc736 dNS ACC (GNN_FT)

Brief option used to get single-line captions, along with a display filter to narrow output to only VSAN 2
Control-C stops trace capture

Display filters are a must for narrowing output on a busy network. See the MDS Config Guide for other filter types. Capture is done by default to the console screen, so make sure you are able to save output to a large capture buffer or log with your telnet application
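
Once the brief output has been saved to a log file, it can be scanned for rejects offline. The sketch below assumes each frame occupies one line in the form `time source -> destination hex hex description`; treating the two hex columns as OX_ID and RX_ID is an assumption, and the `parse` and `rejects` helpers are illustrative. The sample lines are taken from the trace above.

```python
import re

FRAME = re.compile(
    r"^\s*(\d+\.\d+)\s+(\S+)\s+->\s+(\S+)\s+"   # time, source, destination
    r"(0x[0-9a-f]+)\s+(0x[0-9a-f]+)\s+(.+?)\s*$"  # two exchange IDs, description
)


def parse(line):
    """Return a dict for one brief-format frame line, or None if no match."""
    match = FRAME.match(line)
    if not match:
        return None
    time, src, dst, ox_id, rx_id, info = match.groups()
    return {"time": float(time), "src": src, "dst": dst,
            "ox_id": int(ox_id, 16), "rx_id": int(rx_id, 16), "info": info}


def rejects(lines):
    """Frames whose description contains a reject (for example LS_RJT)."""
    frames = (parse(line) for line in lines)
    return [frame for frame in frames if frame and "RJT" in frame["info"]]


sample = [
    "2.829871 00.00.00 -> ff.ff.fe 0x2288 0xffff FC ELS FLOGI",
    "2.853261 ff.ff.fe -> 7e.01.00 0x2288 0xc728 FC ELS ACC (FLOGI)",
    "2.872139 ff.fc.7e -> 7e.01.00 0x2318 0xc731 FC ELS LS_RJT (PLOGI)",
]

for frame in rejects(sample):
    print(frame["time"], frame["src"], "->", frame["dst"], frame["info"])
```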

SPAN & PAA


Use of SPAN Feature
Used for FC port to FC port analyzing
Same type of tool as used on Cisco Catalyst® products. Cisco Catalyst uses port monitor.
Can be left configured on the switch
Ingress and egress ports are sent to an FC port set up as a SPAN Destination (SD-port type)
No limits to where the ports are located on the MDS switch network
Used to output to third-party test equipment or to the Cisco Port Analyzer Adapter

Best Practices using SPAN
MDS9500/9200
  16 SPAN sessions, multiple sources & destinations

MDS9124
  Only 1 SPAN session, 1 direction at a time

MDS9020
  No SPAN capabilities

Try to design into the fabric solution an SD port dedicated to SPAN usage: 1 per fabric (with remote SPAN) or 1 per switch (without remote SPAN)
Bladeswitch ports can also be configured for SPAN
Use filter mnemonics with FCAnalyzer to limit capture
Learn to use Wireshark
[Diagram labels: FC linecard, HP blade, IBM blade, MDS9124]


SAN/OS Tools


Command Line Debugging
Available debugs depend on features enabled in SAN/OS
Many, many different options to select when turning on debugs
Where is the output going?
  Logfile: data file in switch memory
  Captured directly to screen via console, telnet or SSH

Requires admin privileges to run debugs
Debugs can only be run from the CLI
No debugging available from Fabric Manager or Device Manager

Debug Logging
1. TOP-9216i# debug logfile networkers_debugs size 5000
   Tip: use show debug to see the name of the debug file

2. Display captured debug to screen
   TOP-9216i# show debug logfile networkers_debugs

3. Copy debug file off MDS to a server
   TOP-9216i# copy log:networkers_debugs ftp:
   Enter hostname for the ftp server: 10.91.42.166
   Enter username: networkers
   Password: networkers

To delete the debug logfile
   TOP-9216i# clear debug-logfile networkers_debugs
   Or
   TOP-9216i# undebug all
   Or the debug logfile will be cleared and overwritten when the next debug logfile is created; only one debug logfile is allowed by the system

Debugs to Direct Telnet Window
Use a telnet/SSH or console application that will capture the expected output to a buffer or file
"undebug all", or "no debug" of the specific debug command, is required to turn the trace off
The debugs are not persistent across reboots
Most debugs are readable and easy to understand; some are not


Design for Troubleshooting
Leverage VSAN design to support troubleshooting methodology
Have a SPAN port allocated in port count needs
Integrate a Cisco Port Analyzer Adapter (PAA) into the SAN design
Provision an analyzer (Finisar, Agilent, other) in the network; keep it operational with FTP access to extract traces
Design for Syslog servers and scheduled configuration saves (use CFS to simplify)


Performance Manager


Fabric Manager Performance Manager
Licensed part of Fabric Manager
Requirement for proactive SCSI I/O performance monitoring across the MDS fabric
Accounting logs to monitor fabric changes
Net Flow tool
  Create flows
  Use for setting benchmarks for I/O use
Use to monitor port groups


Accounting

Web interface allows for simple accounting views


PM Output – Monitor Port Groups
Port-group 3 on a 24 port LC


PM- Monitor Netflows
Flow creation done with Flow Wizard
Historical look at Initiator to Target flow


NTOP


NTOP Network Traffic Probe http://www.ntop.org/
Use PAA-2 to SPAN selected critical paths or devices.
Set encapsulation on the SD port to eisl to capture all VSAN traffic
SPAN'ed traffic can be dynamically edited and changed as needed based on the issue at hand
[Diagram: SPAN traffic from the SD port goes through the PAA-2 to the NTOP web server (Linux preferred)]

NTOP Determining SCSI Latencies on the SAN


Performance of I/O across Fabric

LUN 0 Traffic

Information in NTOP is referenced by FCID. Fabric Manager, Device Manager and the CLI each offer several places to look up which device an FCID belongs to.


Troubleshooting with the Tools

Device Connections
ISLs
Zoning
IVR
NPV

MDS 9124
Simple SAN FC Device Operation
MDS 9124: 1 SPAN session, look at Tx first, then Rx
SPAN the host session to the SD port to see the complete FC login and monitor SCSI traffic
[Diagram: Fibre Channel from the SD port to the PAA, then Ethernet to the Wireshark application]

Note: We saw CLI of FCAnalyzer usage in slide 20

Host Login Trace
Transmit side of trace
  Decoded in FC-GS standard

Response side of trace


Inter Switch Link
• VSAN 1 & 100 already configured, up and running in the core production SAN
• Add new 9509 switch that has VSAN 1 & 100 configured on it
• New switch using default Domain settings
• No known conflicts with zonesets

[Diagram callouts: picture of VSAN 100; VSAN 100 domain IDs; VSAN 1 domain IDs; add new switch and ISL to fabric]


ISL Trace - TE Port SPAN to PAA
[Figure: TE port SPAN’ed to the SD port; the PAA converts Fibre Channel to Ethernet for the Wireshark application]

BEAR# sh span sess 3
Session 3 (active)
   Destination is fc7/24
   No session filters configured
   Ingress (rx) sources are fc1/2,
   Egress (tx) sources are fc1/2,

SPAN view from Fabric Manager

SPAN view from Device Manager


ISL Trace
ACK1 filter applied
Exchange Link Parameters
Exchange Switch Capabilities
Exchange Peer Parameters (proprietary to MDS; used only to determine whether the connecting switch is another MDS)
Exchange Fabric Parameters
Build Fabric
Domain ID Assign by Existing Principal Switch
Request Domain ID from New Switch
Enhanced Zoning Merge Request Resource Allocation (New in FCSW3 Standard)
Zone Merge Request
The next slide breaks down how the Domain IDs were selected and distributed

Domain ID assignments & Distribution
VSAN 1 VSAN 100

Request for ID 104 Decimal

Request for ID 113 Decimal

EFP distributes to new switch the list of where other domains are found
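
The requested Domain IDs are quoted in decimal, while traces and FCIDs show them in hex. A minimal Python sketch of the conversion follows, assuming the standard 24-bit FCID layout of domain, area, and port bytes. The function names are illustrative only and are not part of any Cisco tool.

```python
from typing import Tuple


def domain_to_hex(domain_id: int) -> str:
    """Format a decimal Domain ID the way it appears in a trace or FCID."""
    if not 1 <= domain_id <= 239:
        raise ValueError("switch Domain IDs range from 1 to 239")
    return f"0x{domain_id:02x}"


def split_fcid(fcid: int) -> Tuple[int, int, int]:
    """Split a 24-bit FCID into its (domain, area, port) bytes."""
    if not 0 <= fcid <= 0xFFFFFF:
        raise ValueError("an FCID must fit in 24 bits")
    return (fcid >> 16) & 0xFF, (fcid >> 8) & 0xFF, fcid & 0xFF


if __name__ == "__main__":
    print(domain_to_hex(104))    # 0x68
    print(domain_to_hex(113))    # 0x71
    print(split_fcid(0x0501E1))  # (5, 1, 225)
```

The first byte returned by split_fcid is the Domain ID of the switch the device logged in to, which is often the quickest way to tie an FCID seen in a trace back to a switch.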


Debugging Zoning
Understanding what is active
Fabric Manager / CLI
Basic/Enhanced
Zoneset activate failure?
Zoneset merge failure?
Members not able to communicate?
Host cannot see storage?


Show Zoneset Active
Shows the zoneset activated in the fabric
An asterisk to the left of a device indicates that the device has registered with the name server
Will not show a zone that has been configured if the changes have not been activated
Zoneset and zone names are case sensitive
An Inter-VSAN routing zone is added to the zoneset via the IVR wizard if this VSAN has a member being zoned to another VSAN


Zoneset activate failure
Look at the messages on the seed switch to determine what caused the activation to fail: ‘show logging log’
(the seed is the switch from which the change was initiated)

For multi-switch fabrics, check that the ISL or TE-ISL is operational:
show interface fcx/x or show interface port-channel x
show interface trunk vsan x
show fcdomain domain-list vsan x
show zone internal


Change event history
Command to view change history for VSAN 100:
BEAR# show zone internal change event-history vsan 100
Change Protocol Event Log For VSAN: 100
>>>>FSM has 50 logged transitions<<<<
46) Transition at Tue May 1 10:37:34 2007
    Prev State: [ACA Sent]
    Trig event: [RCVD_ACC] (Dom:5)
    Next State: [ACA Sent]
47) Transition at Tue May 1 10:37:34 2007
    Prev State: [ACA Sent]
    Trig event: [RCVD_ALL_ACC]
    Next State: [ACA Complete]
48) Transition at Tue May 1 10:37:34 2007
    Prev State: [ACA Complete]
    Trig event: [SEND_RCA]
    Next State: [RCA Sent]
49) Transition at Tue May 1 10:37:34 2007
    Prev State: [RCA Sent]
    Trig event: [RCVD_ACC] (Dom:5)
    Next State: [RCA Sent]
50) Transition at Tue May 1 10:37:34 2007
    Prev State: [RCA Sent]
    Trig event: [RCVD_ALL_ACC]
    Next State: [Idle]
Curr state: [Idle]

Each numbered entry is one event. The time of the event is noted, and the previous state, triggering event, and next state are shown. The domain that initiated the event is also shown.
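
When the history runs to dozens of transitions, it can help to pull the entries into a structure and check that the change protocol returned to Idle. The sketch below is one possible way to do that in Python. It assumes the text layout shown in the output above; the regular expression and function name are illustrative, not a Cisco-provided parser.

```python
import re

# One logged transition: sequence number, timestamp, states, trigger, optional domain.
TRANSITION = re.compile(
    r"(\d+)\)\s+Transition at\s+(.+?)\s+"
    r"Prev State:\s+\[(.+?)\]\s+"
    r"Trig event:\s+\[(.+?)\](?:\s+\(Dom:(\d+)\))?\s+"
    r"Next State:\s+\[(.+?)\]"
)


def parse_transitions(text):
    """Return one dict per transition found in event-history output."""
    events = []
    for m in TRANSITION.finditer(text):
        seq, when, prev, trigger, dom, nxt = m.groups()
        events.append({
            "seq": int(seq),
            "time": when,
            "prev": prev,
            "event": trigger,
            "domain": int(dom) if dom else None,  # only present on some events
            "next": nxt,
        })
    return events


SAMPLE = """
49) Transition at Tue May 1 10:37:34 2007
    Prev State: [RCA Sent]
    Trig event: [RCVD_ACC] (Dom:5)
    Next State: [RCA Sent]
50) Transition at Tue May 1 10:37:34 2007
    Prev State: [RCA Sent]
    Trig event: [RCVD_ALL_ACC]
    Next State: [Idle]
"""

if __name__ == "__main__":
    events = parse_transitions(SAMPLE)
    for e in events:
        print(e["seq"], e["prev"], "->", e["next"], "on", e["event"], "domain", e["domain"])
    print("returned to Idle:", events[-1]["next"] == "Idle")
```

A history whose last transition does not end in Idle suggests the change is still in progress or stalled waiting on a domain, which is a useful pointer for where to look next.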


Zone merge failure options
Confirm that there is a discrepancy in the zonesets on opposite sides of the isolated ISL or E-ISL.
Determine which of the 2 fabrics contains the desired active zoneset, then use the zoneset import/export command. This command only works if the ISL/E-ISL is isolated.
Prune the VSAN from the TE port, and add it back.
Edit one or the other zoneset and then shut/no shut the ISL or E-ISL. This action will impact all VSANs on the E-ISL, even those that are not isolated.


Host can’t see storage?
View ‘show zoneset active vsan x’ on each switch.
Look for the * next to the affected device. The * indicates that the device is in the zone and is active in the name server.
If the * is present on each switch, the zoning displays look as good as they get.
Verify that the zone members are correct for the affected devices. Are the correct PWWN and FCID displayed?

BEAR# show zoneset active vsan 100
zoneset name VS100_Zoneset vsan 100
  zone name Net-2 vsan 100
  * fcid 0x0501e1 [pwwn 22:00:00:20:37:c5:36:f0]
  * fcid 0x040001 [pwwn 10:00:00:00:c9:2f:99:3d] [DA-Net-2]
  zone name Snipe vsan 100
  * fcid 0x040100 [pwwn 21:01:00:e0:8b:25:a2:8f]
  * fcid 0x050200 [pwwn 21:00:00:e0:8b:05:a2:8f] [Snipe]
  * fcid 0x050300 [pwwn 20:04:00:a0:b8:0c:64:51]
    pwwn 20:05:00:a0:b8:0c:64:51
  zone name Net-2-extrastor vsan 100
  * fcid 0x0501e2 [pwwn 22:00:00:20:37:c5:22:de]
  * fcid 0x040001 [pwwn 10:00:00:00:c9:2f:99:3d] [DA-Net-2]

Device is not active in the fabric but is in the zone

Device is active in the fabric and in the zone
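
Scanning a long active zoneset by eye for members without an asterisk is error-prone. The following Python sketch lists zone members that are configured but not logged in. It assumes the line layout of the output above, where active members begin with "*" and inactive ones begin directly with "pwwn" or "fcid"; the function name is hypothetical.

```python
def inactive_members(output):
    """Return (zone, member) pairs for members shown without a leading '*'."""
    zone = None
    missing = []
    for raw in output.splitlines():
        line = raw.strip()
        if line.startswith("zone name "):
            zone = line.split()[2]          # remember which zone we are inside
        elif line.startswith(("pwwn ", "fcid ")):
            missing.append((zone, line))    # no '*', so not in the name server
    return missing


SAMPLE = """
zoneset name VS100_Zoneset vsan 100
  zone name Snipe vsan 100
  * fcid 0x040100 [pwwn 21:01:00:e0:8b:25:a2:8f]
  * fcid 0x050200 [pwwn 21:00:00:e0:8b:05:a2:8f] [Snipe]
  * fcid 0x050300 [pwwn 20:04:00:a0:b8:0c:64:51]
    pwwn 20:05:00:a0:b8:0c:64:51
"""

if __name__ == "__main__":
    for zone, member in inactive_members(SAMPLE):
        print(f"zone {zone}: {member} is zoned but not active in the fabric")
```

Running the same check against the output of every switch in the VSAN shows quickly whether a device is missing everywhere or only from one switch's view.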


See if the Switch can see the Storage but still not the host
MDS has a pseudo initiator device that can log into targets (if target accepts PLOGI from the switch PWWN)

LUNs

Targets


Zoneset activation - Trace
Activate zone in VSAN 100, using VSAN filter option on the SPAN session

BEAR# show span session 3
Session 3 (active)
   Destination is fc7/24
   Session filter vsans are 100
   Ingress (rx) sources are port-channel 1,
   Egress (tx) sources are port-channel 1,

Success


Recovering from Zone Issues
Zoneset import/export command
MDS_Switch# show int fc1/1
fc1/1 is trunking (Not all VSANs UP on the trunk)
    Hardware is Fibre Channel, SFP is short wave laser w/o OFC (SN)
    Port WWN is 20:c5:00:05:30:00:49:1e
    Peer port WWN is 20:81:00:0d:ec:0f:b4:c0
    Admin port mode is E, trunk mode is on
    snmp traps are enabled
    Port mode is TE
    Port vsan is 1
    Speed is 2 Gbps
    Transmit B2B Credit is 255
    Receive B2B Credit is 12
    Receive data field Size is 2112
    Beacon is turned off
    Trunk vsans (admin allowed and active) (1,10)
    Trunk vsans (up) (1)
    Trunk vsans (isolated) (10)          <-- VSAN 10 is isolated

Show the zoneset that is deemed undesirable:
MDS_Switch# show zoneset act v 10
zoneset name z10 vsan 10
  zone name duplicate vsan 10
    pwwn 10:10:10:10:10:10:10:10

Command to bypass merge checking and force import of zone data:
MDS_Switch# zoneset import interface fc1/1 vsan 10

<show port internal info> will give greater detail on the merge failure reason

Recovering from Zone Issues
Zoneset import/export command - After the import
MDS_Switch# show int fc1/1
fc1/1 is trunking
    Hardware is Fibre Channel, SFP is short wave laser w/o OFC (SN)
    Port WWN is 20:c5:00:05:30:00:49:1e
    Peer port WWN is 20:81:00:0d:ec:0f:b4:c0
    Admin port mode is E, trunk mode is on
    snmp traps are enabled
    Port mode is TE
    Port vsan is 1
    Speed is 2 Gbps
    Transmit B2B Credit is 255
    Receive B2B Credit is 12
    Receive data field Size is 2112
    Beacon is turned off
    Trunk vsans (admin allowed and active) (1,10)
    Trunk vsans (up) (1,10)
    Trunk vsans (isolated) ()
    Trunk vsans (initializing) ()

MDS_Switch# show zoneset act vsan 10
zoneset name v10 vsan 10
  zone name duplicate vsan 10
    pwwn 21:00:00:20:37:a9:cd:6e

admin:Import option is set on interface fc1/1 on VSAN 10

VSAN 10 is no longer isolated

VSAN 10 zoneset has been imported

Entry in the accounting log reflecting the import action
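
The before and after views differ only in the "Trunk vsans" lines, so a small script can confirm across many ISLs that nothing is left isolated. This is a hedged Python sketch that assumes the "Trunk vsans (state) (list)" wording shown in the outputs above; the helper names are illustrative, and ranges such as 1-5 are expanded on the assumption that the switch may print them that way.

```python
import re


def expand_vsans(spec):
    """Turn '1,10' or '1-3,10' into a list of VSAN numbers."""
    vsans = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-", 1)
            vsans.extend(range(int(low), int(high) + 1))
        else:
            vsans.append(int(part))
    return vsans


def trunk_vsans(output, state):
    """Return the VSANs listed for one trunk state, e.g. 'up' or 'isolated'."""
    pattern = r"Trunk vsans \(" + re.escape(state) + r"\)\s+\(([^)]*)\)"
    match = re.search(pattern, output)
    return expand_vsans(match.group(1)) if match else []


BEFORE = """
    Trunk vsans (admin allowed and active) (1,10)
    Trunk vsans (up) (1)
    Trunk vsans (isolated) (10)
"""
AFTER = """
    Trunk vsans (admin allowed and active) (1,10)
    Trunk vsans (up) (1,10)
    Trunk vsans (isolated) ()
"""

if __name__ == "__main__":
    print("isolated before import:", trunk_vsans(BEFORE, "isolated"))
    print("isolated after import: ", trunk_vsans(AFTER, "isolated"))
```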


Zone merge failure observations
Guernsey# show zoneset active vsan 1
zoneset name networkers vsan 1
  zone name Networkers2007 vsan 1
    pwwn 20:20:20:20:20:20:20:22

BEAR# show zoneset active vsan 1
zoneset name networkers vsan 1
  zone name Networkers2007 vsan 1
    pwwn 20:20:20:20:20:20:20:11

Different switches, same zone name, different member

BEAR %ZONE-2-ZS_MERGE_FAILED: %$VSAN 1%$ Zone merge failure, isolating interface fc1/2 error: Member mismatch
Guernsey %ZONE-2-ZS_MERGE_FAILED: %$VSAN 1%$ Zone merge failure, isolating interface fc2/1 error: Received rjt from adjacent switch
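
The failure above comes down to one zone name carrying different members on the two switches. Before bringing up an ISL, the two active zonesets can be compared offline. The Python sketch below illustrates only the "same name, different members" check from this slide, using hand-built dictionaries; it is not a full implementation of the merge rules, and the function name is hypothetical.

```python
def conflicting_zones(local, remote):
    """Return zone names present on both sides whose member sets differ."""
    shared = local.keys() & remote.keys()
    return sorted(name for name in shared if local[name] != remote[name])


# Active zoneset of each switch as {zone name: set of members}.
bear = {"Networkers2007": {"pwwn 20:20:20:20:20:20:20:11"}}
guernsey = {"Networkers2007": {"pwwn 20:20:20:20:20:20:20:22"}}

if __name__ == "__main__":
    for name in conflicting_zones(bear, guernsey):
        print(f"zone {name}: members differ, expect merge failure and isolation")
```

Zones that exist on only one side are not reported here, since this sketch targets only the member mismatch case shown above.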

Zone merge failure as seen on the ISL interface

BEAR# show interface fc1/2
fc1/2 is trunking (Not all VSANs UP on the trunk)
    Hardware is Fibre Channel, SFP is short wave laser w/o OFC (SN)
    Port WWN is 20:02:00:0c:85:67:b1:c0
    Peer port WWN is 20:41:00:0d:ec:01:40:80
    Admin port mode is E, trunk mode is on
    snmp link state traps are enabled
    Port mode is TE
    Port vsan is 1
    Speed is 2 Gbps
    Transmit B2B Credit is 12
    Receive B2B Credit is 255
    Receive data field Size is 2112
    Beacon is turned off
    Trunk vsans (admin allowed and active) (1,100)
    Trunk vsans (up) (100)               <-- VSANs with no merge conflicts will show here
    Trunk vsans (isolated) (1)           <-- VSAN 1 is isolated
    Trunk vsans (initializing) ()

show port internal info interface fc1/2

Trace of Zone failure

BEAR# show span session 3
Session 3 (active)
   Destination is fc7/24
   Session filter vsans are 1
   Ingress (rx) sources are fc1/2,
   Egress (tx) sources are fc1/2,


MR (merge request) frame from < FCanalyzer detail >
FCIDs are those of the Fabric Controller (well-known address 0xFFFFFD)

Trace taken from the config prompt on the switch
Capture to a file if need be
Wireshark can also be used to view and analyze the captured file

MR command code


Enhanced Zone analysis
BEAR# show zone analysis ?
  active    Show active zoneset analysis           <-- analyze the active zoneset on a vsan
  vsan      Show analysis in the specified VSAN    <-- analyze the full database for a vsan
  zoneset   Show zoneset analysis                  <-- analyze a specific zoneset for a vsan

BEAR# show zone analysis zoneset VS100_Zoneset vsan 100
Zoning database analysis vsan 100
  Zoneset analysis: VS100_Zoneset
    Num zonesets: 1
    Num zones: 3
    Num aliases: 6
    Num attribute groups: 1
    Formatted size: 580 bytes / 2048 Kb            <-- SFC will be 580 bytes of the max 2048 Kb
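
The formatted size line shows how close the zoning database is to the 2048 Kb maximum. A small Python sketch of that arithmetic follows. It assumes the "N bytes / M Kb" wording shown in the output above and takes Kb to mean units of 1024 bytes, which is an assumption; the function name is illustrative.

```python
import re


def zone_db_usage(analysis_output):
    """Return (size_bytes, max_bytes, percent_used) from the formatted size line."""
    match = re.search(r"size:\s*(\d+)\s*bytes\s*/\s*(\d+)\s*Kb", analysis_output)
    if match is None:
        raise ValueError("no formatted size line found")
    size_bytes = int(match.group(1))
    max_bytes = int(match.group(2)) * 1024   # assumes Kb means 1024 bytes
    return size_bytes, max_bytes, 100.0 * size_bytes / max_bytes


if __name__ == "__main__":
    size, limit, pct = zone_db_usage("Formatted size: 580 bytes / 2048 Kb")
    print(f"{size} of {limit} bytes used ({pct:.3f}%)")
```

For the example zoneset this is a tiny fraction of the limit; the check becomes useful on large fabrics where the full database grows with every alias and zone.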

Troubleshooting Zone/ACL Issues
What to collect
On the Supervisor:
switch# show tech-support details
switch# show tech-support acltcam-soc
switch# show tech-support zbm        // This is Zone Block Manager
switch# show tech-support zone

On the linecard (attach module 1):
module-1# show hardware internal packet-flow dropped
• check the port-stats for port P & acl-stats on the port-grp N where you see drops
module-1# show hardware internal fwd port <p> port-stats
module-1# show hardware internal fwd 0 port-group <n> acl-stats
module-1# show hardware internal errors
module-1# show hardware internal fwd 0 error-statistics


Best Practices
Make periodic backups of the zoning database
Prior to any changes, make a backup of the current zoning
Single initiator zones
Meaningful zone names
Default-zone set to deny
Distribute full zoneset
Use aliases; device-alias preferred
Manage from a designated seed switch or switches

IVR Inter VSAN Routing


Reading IVR

Where, What

Host on 9216i switch needs to get at Disk on different VSAN on different switch
IVR only enabled where needed - only 9216i

VSAN10

VSAN 200

No NAT required because Domain ID’s are unique


CLI for IVR – Show Configuration
Showing IVR-VSAN Topology Configuration (should match in both switches)
switch# show ivr vsan-topology
AFID  SWITCH WWN               Active  Cfg.  VSANS
--------------------------------------------------
   1  20:00:00:05:30:00:3c:5e  yes     yes   1,4
   1  20:00:00:05:30:00:58:de  yes     yes   2,4

Showing IVR-Zone Configuration (should match in both switches)
switch# show ivr zoneset active
zoneset name ZoneSet1
  zone name Zone_VSAN1-2
  * pwwn 21:00:00:e0:8b:02:ca:4a vsan 1
  * pwwn 21:00:00:20:37:c8:5c:6b vsan 2
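
Because the topology has to match on every IVR-enabled switch, comparing the collected outputs mechanically is safer than reading them side by side. The Python sketch below assumes each switch's topology has already been reduced to a set of (AFID, switch WWN, VSAN list) rows; the switch names and the deliberate mismatch in the demo data are invented for illustration.

```python
def topology_mismatches(views):
    """Compare every switch's topology rows against the first switch (by name).

    Returns {switch: rows that differ from the reference} for mismatching switches.
    """
    names = sorted(views)
    reference = views[names[0]]
    return {
        name: views[name] ^ reference          # rows present on only one side
        for name in names[1:]
        if views[name] != reference
    }


ROW_A = (1, "20:00:00:05:30:00:3c:5e", (1, 4))
ROW_B = (1, "20:00:00:05:30:00:58:de", (2, 4))

views = {
    "switch-a": {ROW_A, ROW_B},
    "switch-b": {ROW_A, (1, "20:00:00:05:30:00:58:de", (2,))},  # hypothetical stale entry
}

if __name__ == "__main__":
    for switch, rows in topology_mismatches(views).items():
        print(switch, "differs from the reference in:", sorted(rows))
```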


IVR initiated distribution of information via CFS (Cisco Fabric Services)
Information distribution is initiated by the IVR process, not by the user
Events that alter the IVR topology database:
VSAN creation
Link shutdown
Must do IVR COMMIT to initiate distribution for config changes


Show IVR, what can I look at in 3.0?
Sunny# show ivr ?
  fcdomain                     Display IVR persistent fcdomain database
  internal                     Show ivr internal information
  merge                        Show ivr merge status
  pending                      Show ivr pending configuration
  pending-diff                 Show ivr pending-diff
  service-group                Show IVR service groups
  session                      Show ivr session status
  tech-support                 Show information for IVR technical support staff
  virtual-domains              Show IVR virtual domains for all local VSANs
  virtual-fcdomain-add-status  Show IVR-virtual fcdomain status
  virtual-switch-wwn           Show IVR-virtual switch WWN
  vsan-topology                Show IVR VSAN topology
  zone                         Inter vsan zone show commands
  zoneset                      Inter vsan routing zoneset show commands
  |                            Output modifiers.
  >                            Output Redirection.
  <cr>                         Carriage return.


Best Practices for IVR
Management
Configure IVR only in needed border switches
Configure/manage from Fabric Manager
Do not use IVR topology auto discovery in production (pre 3.0 SAN-OS, it adds every VSAN to IVR)
Use transit VSANs for FCIP links or FCIP port channels

Domain ID’s & VSAN’s
Plan out your VSAN numbers and domain IDs
Use static domain IDs
Use RDI mode to reserve domain IDs

Zoning
Keep default zone policy at deny
Manage local zones from IVR enabled switch

Keep all IVR enabled switches at the same SAN OS version
Do not mix IVR NAT with FCIP write acceleration

IVR Troubleshooting
Database checks
Are devices logged into their local VSAN? (show flogi database)
Are devices exported into the remote FCNS in both directions? (show fcns database)
Are FC devices in the native VSAN FCNS in all switches in that VSAN? (show fcns database)
Do FC devices show correctly in the transit VSAN, if one is in use?
Does the device have a valid PWWN and NWWN? IVR checks before exporting.

Zone checks
Use the command line to view the active local zoneset (show zoneset active)
Does the same IVR zone show up in both the local and remote VSAN’s active zoneset?
Did IVR zoneset activation succeed in all VSANs for the affected devices?
Look for the * next to all IVR’d devices in both VSANs’ local active zonesets

Miscellaneous Checks
Is it possible that a NATted FCID changed because of a reload, causing AIX or HP-UX to have target binding issues?
Ensure the HBA is not configured to time out PLOGI too quickly. IVR NAT delays the ACC to PLOGI for a few seconds. Most HBAs have a 10 second timeout.
Is the IVR VSAN topology exactly the same in every IVR enabled switch in the fabric? (show ivr vsan-topology)

Debugging IVR – Internal Commands as of 3.0

Many internal commands are available to assist Support in narrowing down issues
Most of these commands are needed only when requested by Support


Upgrading & What to collect for Support
Upgrades need to be planned and well thought out
The mix of IVR1 (non-NAT) and IVR2 (NAT) can be tricky and confusing to configure without introducing traffic disruption
Downgrades might require the entire fabric to be downgraded

show tech-support ivr (on each IVR enabled switch)
show tech-support details (on the affected switches)
FM screen snapshots


NPV Troubleshooting


NPV: Internal Logins (FLOGIs)
When an NP port comes up NPV itself first FLOGIs into the core
[Figure: NPV-core switch F port P1 connected to NP port P2 (fc1/2) on the NPV switch]

npv-core# show fcns database detail
------------------------
VSAN:1 FCID:0x010001
------------------------
port-wwn (vendor)           :20:02:00:0d:ec:2f:c1:40 (Cisco)    <-- fwwn of Port P2
node-wwn                    :20:01:00:0d:ec:2f:c1:41            <-- VSAN WWN of NPV
class                       :2,3
node-ip-addr                :172.20.150.38                      <-- IP Address of NPV
ipa                         :ff ff ff ff ff ff ff ff
fc4-types:fc4_features      :npv
symbolic-port-name          :para-3:fc1/2                       <-- Switch name of NPV & Interface name of P2
symbolic-node-name          :para-3
port-type                   :N
port-ip-addr                :0.0.0.0
fabric-port-wwn             :20:02:00:0d:ec:04:99:40            <-- fwwn of Port P1
hard-addr                   :0x000000
permanent-port-wwn (vendor) :20:02:00:0d:ec:2f:c1:40 (Cisco)

NPV: Logins from End Devices (FDISCs)
An end port logs into the npv-core as follows:
[Figure: NPV-core switch F port P1 connects to NP port P2 (fc1/2) on the NPV switch; NPV F port P3 (fc1/1) connects to end-device N port P4, FCID 0x331e00]

VSAN:1 FCID:0x331e00
------------------------
port-wwn (vendor)           :2f:ff:00:06:2b:10:c4:4c (LSI)
node-wwn                    :2f:ff:00:06:2b:10:c4:4c
class                       :3
node-ip-addr                :0.0.0.0
ipa                         :ff ff ff ff ff ff ff ff
fc4-types:fc4_features      :scsi-fcp:init
symbolic-port-name          :LSI7404XP-LC BR A.1 03-01081-03A FW:01.03.17 Port 0
symbolic-node-name          :
port-type                   :N
port-ip-addr                :0.0.0.0
fabric-port-wwn             :20:02:00:0d:ec:04:99:40            <-- fwwn of Port P1
hard-addr                   :0x000000
permanent-port-wwn (vendor) :20:02:00:0d:ec:2f:c1:40 (Cisco)    <-- fwwn of Port P2

NPV converts FLOGI to FDISC

NPV: Distribution of end Device Logins
An example of current mapping of ports
[Figure: NPV switch with NP ports P1, P2, P3 uplinked to the NPV-core switch, and an F port facing the FC device]

NP Port   No. of mapped F ports
P1        5 (fc1/1, 1/5 ...)
P2        2 (fc1/7, fc1/21)
P3        9 (fc1/2, fc1/8, ...)

Next F port on NPV would be assigned to NP Port P2
(NP port with minimum number of mapped F ports)
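
The assignment rule described above is a least-loaded selection. A minimal Python sketch follows, using the mapped F port counts from this slide. Breaking ties by port name is an assumption made only to keep the example deterministic; the slide does not say how the switch resolves ties.

```python
def pick_np_port(mapped_counts):
    """Return the NP port with the fewest mapped F ports.

    Ties are broken by port name (an assumption for this sketch).
    """
    if not mapped_counts:
        raise ValueError("no NP port is available")
    return min(mapped_counts, key=lambda port: (mapped_counts[port], port))


if __name__ == "__main__":
    counts = {"P1": 5, "P2": 2, "P3": 9}
    chosen = pick_np_port(counts)
    print("next F port maps to", chosen)   # P2
    counts[chosen] += 1                     # record the new mapping
```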


NPV: What happens when an NP port goes down?
All F ports mapped to that NP port are re-initialized (shut, no shut)
These N ports would attempt re-login

If another NP port is available
N ports would be logged in via the available NP port(s)
Logins would be distributed as per the previous slide

If no NP port is available
F ports would remain in down state waiting for an NP port

When the failed NP port comes back up
The logins are NOT re-distributed (to avoid disruption)
[Figure: NPV switch with NP ports P1, P2, P3 and three F ports]
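
The failure and recovery behavior can be modeled in a few lines to reason about where logins end up. This Python sketch is a simplified model of the rules on this slide, not the switch's implementation: orphaned F ports are placed one at a time on the least-loaded surviving NP port (ties broken by name, an assumption), and a recovered NP port comes back empty.

```python
def fail_np_port(mapping, failed):
    """Remove a failed NP port and remap its F ports; return F ports left down."""
    orphans = mapping.pop(failed)
    if not mapping:
        return orphans                      # no NP port left: F ports stay down
    for f_port in orphans:
        target = min(mapping, key=lambda p: (len(mapping[p]), p))
        mapping[target].append(f_port)      # re-login via an available NP port
    return []


def recover_np_port(mapping, port):
    """A recovered NP port rejoins with no logins; existing ones are not moved."""
    mapping[port] = []


if __name__ == "__main__":
    mapping = {
        "P1": ["fc1/1", "fc1/5"],
        "P2": ["fc1/7"],
        "P3": ["fc1/2", "fc1/8", "fc1/9"],
    }
    fail_np_port(mapping, "P3")
    recover_np_port(mapping, "P3")
    for port in sorted(mapping):
        print(port, mapping[port])
```

The model makes the operational consequence visible: after a flap, the recovered uplink carries nothing until new F ports log in, so uplink load can stay unbalanced for a long time.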


NPV: What happens when an F port goes down?
NPV sends a LOGO to NPV-core on behalf of the N port
If there were multiple FCIDs assigned to that N port (NPIV-enabled host), then a LOGO is sent to the NPV-core for each FCID

The F port is not allowed to come back up until the LOGO(s) are completed

A conflict in Port-security, DPVM, or FC-SP configuration on the NPV-core switch can log out an FCID or prevent it from coming up
[Figure: NPV switch with NP ports P1, P2, P3 and an F port facing the FC device]


NPV Switch Logon –
SPAN of the F port on the core MDS to an SD port with PAA attached

FLOGI from the NPV MDS switch to the core MDS
PLOGI of the NPV switch to the core MDS
Other PLOGI parameter exchanges, much as an HBA would do

FDISC: the converted FLOGI of the connecting host or storage device on the NPV switch
PLOGI accept to the real HBA on the MDS NPV switch; FCID of the attached device is 0b0002
Remaining exchanges are from device 0b0002 to the core MDS switch


NPV Related Show Commands on NPV Switch
The following show commands can be used on the NPV switch to display information on the NPV devices.
The familiar fcns and flogi databases are not available to view, because these services do not run on an NPV-enabled switch.
show npv flogi-table
show npv status
show tech-support npv
show npv internal event-history { errors | events | ext-if-fsm [ <interface> ] | flogi-fsm [ interface <interface> | pwwn <wwn> ] }
show npv internal event-history msgs
show npv internal event-history svr-if-fsm [ interface <interface> ]
show npv internal info
show npv internal info external-interface { all | <interface> }
show npv internal info global
show npv internal info interface { all | <interface> }
show npv internal info server-interface { all | <interface> }
show npv internal mem-stats [ detail ]
show npv internal pending-queue interface { all | <interface> }

debug npv { all | errors | events | ... }
show debug npv


NPV Related Show Commands on NPV-Core Switch
The following show commands can be used on the NPV-core switch to display information on the NPV devices. Because these outputs are based on name server information, the commands can be run from any non-NPV MDS switch running release 3.2(1) or later.
show fcns database npv [ detail [ vsan <vsan range> ]]
show fcns database npv [ node_wwn <wwn> ] [ vsan <vsan range> ]

Example Outputs
npv# show fcns database npv
VSAN 1:
-------------------------------------------------------------------------------
NPV NODE-NAME            NPV IP_ADDR  NPV IF  CORE SWITCH WWN          CORE IF
-------------------------------------------------------------------------------
20:00:00:0d:ec:3d:62:80  10.1.96.24   fc1/20  20:00:00:0d:ec:2d:af:40  fc4/4
20:00:00:0d:ec:3d:62:80  10.1.96.24   fc1/19  20:00:00:0d:ec:2d:af:40  fc4/3
20:00:00:0d:ec:3d:62:80  10.1.96.24   fc1/17  20:00:00:0d:ec:2d:af:40  fc4/1

...


NPV Related Commands on NPV-Core Switch
npv# show fcns database npv detail
------------------------------------------------------------
VSAN:1 NPV Node Name: 20:00:00:0d:ec:3d:62:80
------------------------------------------------------------
NPV Fabric Port-WWN         :20:14:00:0d:ec:3d:62:80
class                       :2,3
NPV IP Address              :10.1.96.24
ipa                         :ff ff ff ff ff ff ff ff
fc4-types:fc4_features      :npv
NPV Switch Name:Interface   :sw24-gd96:fc1/20
port-type                   :N P
Core Switch fabric-port-wwn :20:c4:00:0d:ec:2d:af:40
permanent-port-wwn (vendor) :20:14:00:0d:ec:3d:62:80 (Cisco)
....

npv# show fcns database npv node_wwn 20:00:00:0d:ec:3d:42:40
VSAN 1:
--------------------------------------------------------------------------
FCID      TYPE  PWWN                     (VENDOR)  FC4-TYPE:FEATURE
--------------------------------------------------------------------------
0x330f00  N     2f:ff:00:06:2b:10:c7:b2  (LSI)     scsi-fcp:init
0x331000  N     2f:ff:00:06:2b:10:c7:b3  (LSI)     scsi-fcp:init
Total number of npv-attached entries = 2


Wrap-up
The cornerstone of SAN troubleshooting is understanding how the standards operate
Each SAN/OS feature has its own methods of troubleshooting; the basic, most commonly seen issues were covered in this session
Familiarity with the tools available in SAN/OS will shorten time to resolution, whether the cause is an OS bug or an operational issue
It helps to have additional insight into HBA operations and firmware, array tools, and applications
Interop with Brocade/McData requires another skill set
Understanding the architecture of Cisco Fabric Manager deployments and best practices is also a plus

Appendix - Extras


Necessary Network Settings
Switch Fabric clocks synchronized
Use NTP along with CFS to simplify the job
Use Fabric Manager to set all clocks and time zones

Setup system logs, syslogd server
Use CFS to simplify

Save off switch configurations or even complete show tech-support details regularly
Use CLI scheduler in SAN/OS to simplify process

Core Dumps


Errors on Application Server Logs, User Having Performance Problems
Use the logging capabilities on the switch to piece together network issues. Put together:
1. What interfaces are on the fabric (up, down, flapping)
2. What was the health of each interface (virtual and physical interfaces)
3. Examine interface health (FC protocol errors, physical layer errors)
4. Are effects being seen on the complete network fabric or within the VSAN (RSCNs, Zones, ISLs, Errors on common controllers)


Gathering Internal Counter Information for Unknown Issues and Plaguing Connectivity Problems
Determine which ports on which switches you need to examine by narrowing down to physical switch, VSAN, and zone
Look at the path through the complete network: initiator and target side, along with ISLs
One-stop shopping on each switch for all the data:
Attach to the line card module that has the interface you need information on

module-1# terminal length 0
Set the scrollback buffer or log to a file in the telnet tool

module-1# show hardware internal debug-info interface fc1/2
The proper information will be output based on the line card type

Show system error
How to get information about device errors

Error message gleaned from logs
2007 Feb 4 17:05:27 BEAR %MODULE-2-MOD_DIAG_FAIL: Module 1 (serial: JAB070507SB) Reported failure on ports 1/5-1/8 (Fibre Channel) due to X-bar Interface ASIC Error in device 5 (device error 0xc05006a3)

At the command line on the switch, decode the error:
BEAR# show system error-id 0xc05006a3
Error Description: Device Name:[U-frontier] Instance:[0] Error Type:[interrupt error] code:[163]


Tidbits & Reasons

• Reason for shutdown 'bit error rate exceeded': we are following the FICON specification
• By actively disabling a bad link, we can minimize the side effects on other good links
• A bad link can end up causing link flaps on good links. This was because of error handling mechanisms implemented in some storage systems.

Short cut to find commands using pipe
MDS-Switch# which | include fcns
clear fcns statistics vsan<VSAN INT RANGE>
debug fcns all
debug fcns all vsan[o]<VSAN INT>[o]
debug fcns errors
debug fcns errors vsan[o]<VSAN INT>[o]
debug fcns events mts
debug fcns events mts vsan[o]<VSAN INT>[o]
debug fcns events query
debug fcns events query vsan[o]<VSAN INT>[o]
debug fcns events register
debug fcns events register vsan[o]<VSAN INT>[o]
show debug fcns
show fcns database
show fcns database detail[o]
show fcns database detail[o] vsan<VSAN INT RANGE>
show fcns database domain<INTEGER>
show fcns database domain<INTEGER> detail[o]
show fcns database domain<INTEGER> detail[o] vsan<VSAN INT RANGE>
show fcns database domain<INTEGER> vsan<VSAN INT RANGE>
show fcns database fcid<FC ID> detail[o] vsan<VSAN INT>
show fcns database fcid<FC ID> vsan<VSAN INT>
show fcns database local
show fcns database local detail
show fcns database local detail vsan<VSAN INT RANGE>
show fcns database local vsan<VSAN INT RANGE>
show fcns database vsan<VSAN INT RANGE>
show fcns internal event-history
show fcns internal event-log
show fcns internal info
show fcns internal info vsan[o]<VSAN INT RANGE>[o]
show fcns internal reject-log vsan<VSAN INT RANGE>
show fcns statistics
show fcns statistics detail[o]
show fcns statistics detail[o] vsan<VSAN INT RANGE>
show fcns statistics vsan<VSAN INT RANGE>
show logging level fcns
show tech-support fcns
show tech-support fcns vsan<VSAN INT RANGE>


Clearing Asic Counters
The following command shows the device IDs for the various components:
module-1# clear asic-cnt list-all-devices
You can then use clear all, which despite its name does not clear every device:
module-1# clear asic-cnt all
Cleared counters for asic type id = 4, name = 'luxor'
Cleared counters for asic type id = 5, name = 'U-frontier'
Cleared counters for asic type id = 6, name = 'D-frontier'
Cleared counters for asic type id = 7, name = 'aladdin'
Cleared counters for asic type id = 8, name = 'ssa'
In addition to the above ASICs, always clear riviera as well:
module-1# clear asic-cnt device-id 3
Clearing counters for devId = 3, name = 'riviera'


Determining Link-Event reasons


Core Dumps
Show cores
Guernsey# sh cores
Module-num  Process-name  PID    Core-create-time
----------  ------------  -----  ----------------
1           cimxmlserver  20029  Jul 18 08:39

Configure switch for core dumps
Switch# sh system cores
Cores are transferred to tftp://10.91.42.133/

Show process log will display cores dumped to server
Switch# sh processes log
Process       PID    Normal-exit  Stack  Core  Log-create-time
------------  -----  -----------  -----  ----  ------------------------
SystemHealth  27828  N            Y      N     Tue Dec 7 19:08:09 2004
SystemHealth  27880  N            Y      N     Tue Dec 7 19:08:20 2004
SystemHealth  27934  N            Y      N     Tue Dec 7 19:08:30 2004
sme           2030   N            Y      N     Sun Sep 23 18:47:15 2007
sme           2306   N            Y      N     Sun Sep 23 18:47:17 2007
syslogd       2271   N            N      N     Thu Sep 7 13:29:12 2006
syslogd       2442   N            N      N     Thu Sep 7 13:30:12 2006
syslogd       2510   N            N      N     Thu Sep 7 13:31:12 2006

Q and A

