James C. McPherson
Solaris Datapath Engineering, Storage Group, Sun Microsystems
Getting to know the SAN stack
Topics to cover (1)
● The big picture (SAN driver stack)
● Switches and attached devices
● HBA, multipathing, target and layered drivers
● fp, fctl, fcsm, fcp, fcip
● Userland libraries
● Userland utilities
● Debugging features - mdb
Topics to cover (2)
● Debugging features – Solaris CAT
● Some useful data structures
● Putting it all together
● The NWS consolidation
● Source code location and structure
● References for further reading
The big picture (physical, kernel, userland)
[Diagram: the SAN stack, from userland down to the hardware]
Userland utilities (luxadm | fcinfo | cfgadm)
Userland libraries (libg_fc | liba5k | libHBAAPI | libsun_fc)
Layered drivers: md (svm), vxvm, vxdmp
Target drivers: sd, st, sgen
scsi_vhci / STMS (mpxio)
fcp, fcsm; fcip, iSCSI (streams, dlpi)
fp, fctl (Fibre Channel Transport Layer)
HBAs: QLogic, Emulex, JNI
Switches: Brocade, McData, QLogic
Physical devices: disks, tapes, ....
The big picture (inside the kernel)
[Diagram: kernel layers between userland and the FC transport]
userland
Layered storage drivers: md, rdac, vxvm, zfs, vxdmp
SCSI Target drivers: sd, st, sgen...
scsi_vhci (aka STMS or mpxio)
Solaris SCSA Framework; DLPI
TCP/IP Streams stack and modules
fcp, fcsm; fcip, iSCSI
fp, fctl (Fibre Channel Transport Layer)
Switches and Attached Devices
Devices attached to a fibre switch are usually SCSI targets or HBAs (host bus adapters)
SCSI targets can be disks or disk-array luns, tapes, media changers
HBAs can work as targets and initiators for storage (SAN) or networking purposes (IP over FC)
Switches provide a routing service for FibreChannel packets between attached devices and hosts
Switches provide naming, configuration and other utility services
HBA drivers: qlc, emlxs and jfca
Drivers provided to Sun by the HBA manufacturers for both sparc and x86|x64 architectures:
Qlogic: qlc
Emulex: emlxs
AMCC (formerly JNI Corp): jfca (sparc only, EOL)
Sun Storage (group/division/...)'s Solaris Datapath Engineering has the source for each of these drivers for reference purposes, but....
HBA manufacturers do bugfixes and sustaining
HBA drivers: qlc, emlxs and jfca
Sun does not write special fcode or bios for HBAs
Yes, cards that Sun sells will show up as SUNW,qlc or SUNW,emlxs
BUT that is identification done by Qlogic and Emulex at their factories.
If you watch the OBP console output closely at power-on, you'll see something like this:
/pci@7c0/pci@0/pci@8: Device 0 lpfc SUNW,emlxs fp disk lpfc SUNW,emlxs fp disk
The PCI vid/did combinations are the same as those that appear on linux or MS-Windows
Qlogic Based HBA Support Matrix
(use the "prtpicl -v" command to find the following IDs)

Vendor  HBA                         Vendor ID  Device ID  Subsys Vendor ID  Subsys ID  Notes
Sun     SG-XPCI1FC-QLC              1077       6322       1077              132        Not Supported, Not Supported
Sun     X6799A                      1077       2200A      1077              4082
Sun     SG-XPCI1FC-QF2 or x6767A    1077       2310       1077              106
Sun     SG-XPCI2FC-QF2 or x6768A    1077       2312       1077              10A
Sun     X6727A                      1077       2200A      1077              4083
Sun     SG-XPCI1FC-QF4              1077       2422       1077              140
Sun     SG-XPCI2FC-QF4              1077       2422       1077              141
Sun     SG-XPCIE1FC-QF4             1077       2432       1077              142
Sun     SG-XPCIE2FC-QF4             1077       2432       1077              143
Qlogic  QCP2340                     1077       2312       1077              109
Qlogic  QCP2342                     1077       2312       1077              10B
Qlogic  QLA200                      1077       6312       1077              119        Not Supported, Not Supported
Qlogic  QLA210                      1077       6322       1077              12F
Qlogic  QLA2310                     1077       2310       1077              106
Qlogic  QLA2310F/QLA2310FL          1077       2310       1077              9
Qlogic  QLA2340/QLA2340L            1077       2312       1077              100
Qlogic  QLA2342/QLA2342L            1077       2312       1077              101
Qlogic  QLA2344/QLA2344-P           1077       (2)2312    1077              102
Qlogic  QLA2440                     1077       2422       1077              145        Not Supported, Not Supported
Qlogic  QLA2460                     1077       2422       1077              133
Qlogic  QLA2462                     1077       2422       1077              134
Qlogic  QLE2360                     1077       2432       1077              117
Qlogic  QLE2362                     1077       2432       1077              118
Qlogic  QLE2440                     1077       2432       1077              147        Not Supported, Not Supported
Qlogic  QLE2460                     1077       2432       1077              137
Qlogic  QLE2462                     1077       2432       1077              138
Qlogic  QSB2340                     1077       2312       1077              104
Qlogic  QSB2342                     1077       2312       1077              105

Minimum Solaris Version or Patch Level columns: Solaris 8, Solaris 9, Solaris 10 x86, Solaris 10 Sparc
Values shown: SAN 4.4.8; S10 update 1 or S10 + 119131-13; S10 update 1 or S10 + 119130-13
Emulex Based HBA Support Matrix
(use the "prtpicl -v" command to find the following IDs)

Vendor  HBA              Vendor ID  Device ID  Subsys Vendor ID  Subsys ID  Model         Code Name
Sun     SG-XPCI1FC-EM2   10df       fc00       10df              fc00       LP10000-S     Rainbow
Sun     SG-XPCI2FC-EM2   10df       fc00       10df              fc00       LP10000DC-S
Sun     SG-XPCIE1FC-EM4  10df       fc20       10df              fc21       LPe11000-S    SummitE
Sun     SG-XPCIE2FC-EM4  10df       fc20       10df              fc22       LPe11002-S
Sun     SG-XPCI1FC-EM4   10df       Fc10       10df              Fc11       LP11000-S     PyramidE
Sun     SG-XPCI2FC-EM4   10df       Fc10       10df              Fc12       LP11002-S
Emulex  LP10000          10df       fa00       10df              fa00       LP10000       N/A
Emulex  LP10000DC        10df       fa00       10df              fa00       LP10000DC
Emulex  LP10000ExDC      10df       fa00       10df              fa00       LP10000ExDC
Emulex  LPe11000         10df       fe00       10df              fe00       LPe11000
Emulex  LPe11002         10df       fe00       10df              fe00       LPe11002
Emulex  LP11000          10df       fd00       10df              fd00       LP11000
Emulex  LP11002          10df       fd00       10df              fd00       LP11002
Emulex  LP9802           10df       f980       10df              f980       LP9802
Emulex  LP9002DC         10df       f900       10df              f900       LP9002DC
Emulex  LP9002L          10df       f900       10df              f900       LP9002L
Emulex  LP9002S          10df       f095       10df              f095       LP9002S

Minimum Solaris/Patch Level columns: Solaris 8, Solaris 9, Solaris 10 x86, Solaris 10 Sparc
SG-XPCI1FC-EM2, SG-XPCI2FC-EM2: SAN 4.4.6; S10 update 1 or S10 + 120223-04; S10 update 1 or S10 + 120222-04
SG-XPCIE1FC-EM4, SG-XPCIE2FC-EM4: Not Supported; S10 update 2 or S10 + Patch 102223-06; S10 update 2 or S10 + Patch 120222-06
SG-XPCI1FC-EM4, SG-XPCI2FC-EM4: TBD
Emulex models: SAN 4.4.7
Also shown: S10 update 1 or S10 + 120223-04; S10 update 1 or S10 + 120222-04

Notes: Model numbers ending in -S are Sun HBAs, model numbers with no - extension are Emulex HBAs, model numbers ending with -E are EMC HBAs. All will enumerate under the emlxs driver.
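The suffix rule in the note above can be sketched as a small helper. This is only an illustration of the naming convention; the function name hba_brand is hypothetical and not part of any Sun or Emulex tool.

```python
def hba_brand(model: str) -> str:
    """Classify an emlxs-driven HBA by its model-number suffix (per the note above)."""
    if model.endswith("-S"):
        return "Sun"
    if model.endswith("-E"):
        return "EMC"
    if "-" not in model:
        return "Emulex"
    # Any other suffix is not covered by the note.
    return "unknown"


for model in ("LP10000-S", "LPe11002", "LP10000DC"):
    print(model, hba_brand(model))
```

Whatever the brand, the card still attaches to the emlxs driver, so this only matters when matching a card against the support matrix.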
Fibre Channel Fabric Boot
Columns on the slide: HBA; Sparc Boot minimum Solaris version or patch level (Solaris 8, Solaris 9, Solaris 10); X86/X64 Boot minimum level (Solaris 10); Code Name; UBC/Fcode/BIOS

HBA                                 Code Name    Levels shown
SG-XPCI1FC-QLC                      Prism        Not Supported; S10 Update 1
X6799A                              Amber        SAN 4.4.2; S10 FCS
SG-XPCI1FC-QF2 or x6767A            Amber 2
SG-XPCI2FC-QF2 or x6768A            Crystal 2a
X6727A                              Crystal +
SG-XPCI1FC-QF4, SG-XPCI2FC-QF4      Pyramid      SAN 4.4.8; S10 Update 1
SG-XPCIE1FC-QF4, SG-XPCIE2FC-QF4    Summit       Not Supported
SG-XPCI1FC-EM2, SG-XPCI2FC-EM2      Rainbow      SAN 4.4.7
SG-XPCIE1FC-EM4, SG-XPCIE2FC-EM4    SummitE      Not Supported; S10 Update 2
SG-XPCI1FC-EM4, SG-XPCI2FC-EM4      PyramidE     TBD

UBC/Fcode/BIOS minimum levels shown:
Fcode: 1.14.11 BIOS: TBD
UBC: TBD Fcode: 1.11 BIOS 1.04
SG-XPCI1FC-EM2: UBC: 5.01a4 Fcode 1.50a4 BIOS 1.70a3
Target and layered drivers
Multipathing drivers such as scsi_vhci (aka StorEdge Traffic Management Software/STMS/MPxIO/Solaris Multipathing) sit below the target drivers in the kernel
The fcp driver provides device discovery in the SCSI layer
Target drivers are sd/ssd (for disks), st (for tapes) and sgen (for “generic” devices such as tape changers); they sit on top of scsi_vhci/vxdmp if used.
So-called “layered” drivers provide more functionality on top of the target drivers. The two best examples of these are md (for SVM) and vxvm (for Veritas Volume Manager).
The zfs driver is also layered on top of the target drivers
Multipathing with scsi_vhci
The scsi_vhci driver is a T10 ALUA-compliant multipathing and failover driver
ALUA is Asymmetric Logical Unit Access, defined in the T10 SCSI Primary Commands standard (SPC-3 and later, including SPC-4).
If your storage is T10 ALUA-compliant and operates in an active/active mode, it should “just work” with scsi_vhci
Refer to “Solaris Fibre Channel and Storage Multipathing”: http://docs.sun.com/source/819-0139
Various vendor-specific implementations of ALUA hardware are also supported (typically EMC and Engenio/LSI)
The fp (FC Port) driver
Performs the login to and logout of switches: PLOGI + LOGO (point-to-point), FLOGI + LOGO (fabric)
Handles basic accept and reject: BA_ACC, BA_RJT
Handles the Extended Link Services accept and reject: ELS_ACC, ELS_RJT
Logs in and out of, and queries the fabric name service
Creates the per-port loop map as required
Passes information about changed (new, old, disconnected) luns to fcp and thence to devfsadmd threads
The fcsm (FC San Management) driver
Implements the FC Management Server (Fabric) configuration commands
Provides relatively-direct access to switch Management Server operations via the ioctl(2) interface
You're unlikely to use this directly
The fctl (FC Transport Layer) driver
Handles “orphan” ports
Maintains the list of WWNs attached to this port (both local and remote)
Doles out the SFK work through job_requests
Provides utility functions for the rest of the SFK stack
The fcp (FC Protocol - layer4) driver
Does the work of encapsulating SCSI commands inside FC frame structures
When your thread needs SCSA access, fcp does the job
Handles scsi target device discovery
Talks to the NDI (bus nexus) and MDI (multipath driver interface) frameworks for device discovery
Generically, fcp routes SCSI packets to and from targets
The fcip (IP+Arp encapsulated in FC) driver
Does the work of encapsulating IP and ARP packets inside FC frame structures
Maintains the routing table for fcip instances
Userland Libraries
libg_fc and liba5k are used mainly for luxadm (libg_fc and liba5k are sparc only)
libHBAAPI and libsun_fc are used for cfgadm_fp
libima and libsun_ima are the Multipath Management API
libHBAAPI and libima are SNIA source code
libsun_fc and libsun_ima are the vendor-specific plugins that Sun provides for libHBAAPI and libima
Userland Utilities
We've got four:
luxadm (supposed to go away... not soon enough!): luxadm(1M)
cfgadm (with the fp plugin): cfgadm(1M), cfgadm_fp(1M)
fcinfo (new in Solaris 10 update 1): fcinfo(1M)
iscsiadm (new in Solaris 10 update 1): iscsiadm(1M)
Userland Utilities
# fcinfo hba-port
HBA Port WWN: 210000e08b954220
OS Device Name: /dev/cfg/c2
Manufacturer: QLogic Corp.
Model: QLE2462
Type: N-port
State: online
Supported Speeds: 1Gb 2Gb 4Gb
Current Speed: 4Gb
Node WWN: 200000e08b954220
HBA Port WWN: 210100e08bb54220
OS Device Name: /dev/cfg/c3
Manufacturer: QLogic Corp.
Model: QLE2462
Type: N-port
State: online
Supported Speeds: 1Gb 2Gb 4Gb
Current Speed: 2Gb
Node WWN: 200100e08bb54220
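When this output has to be checked across many hosts, it can be parsed mechanically. The following is a minimal sketch that assumes the "Key: value" layout shown above; the function name and the trimmed sample text are illustrative only, and real fcinfo output may carry more fields.

```python
def parse_fcinfo_hba_ports(text: str) -> list[dict[str, str]]:
    """Split 'fcinfo hba-port' style output into one dict per HBA port."""
    ports: list[dict[str, str]] = []
    for line in text.splitlines():
        if ":" not in line:
            continue
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        # Each port record starts with its "HBA Port WWN" line.
        if key == "HBA Port WWN":
            ports.append({})
        if ports:
            ports[-1][key] = value
    return ports


# Trimmed copy of the output above (assumed layout).
SAMPLE = """\
HBA Port WWN: 210000e08b954220
        OS Device Name: /dev/cfg/c2
        Current Speed: 4Gb
HBA Port WWN: 210100e08bb54220
        OS Device Name: /dev/cfg/c3
        Current Speed: 2Gb
"""

ports = parse_fcinfo_hba_ports(SAMPLE)
not_at_4gb = [p["OS Device Name"] for p in ports if p["Current Speed"] != "4Gb"]
print(not_at_4gb)
```

A check like this would flag the second port above, which supports 4Gb but has negotiated 2Gb.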
Userland Utilities
# fcinfo remote-port -p 210000e08b110125
Remote Port WWN: 256000c0ffc7ecd2
Active FC4 Types: SCSI
SCSI Target: yes
Node WWN: 206000c0ff07ecd2
# fcinfo remote-port -l -p 210000e08b110125
Remote Port WWN: 256000c0ffc7ecd2
Active FC4 Types: SCSI
SCSI Target: yes
Node WWN: 206000c0ff07ecd2
Link Error Statistics:
Link Failure Count: 0
Loss of Sync Count: 0
Loss of Signal Count: 0
Primitive Seq Protocol Error Count: 0
Invalid Tx Word Count: 0
Invalid CRC Count: 0
Userland Utilities
# fcinfo remote-port -s -p 210000e08b110125
Remote Port WWN: 256000c0ffc7ecd2
Active FC4 Types: SCSI
SCSI Target: yes
Node WWN: 206000c0ff07ecd2
LUN: 0
Vendor: SUN
Product: StorEdge 3511
OS Device Name: /dev/rdsk/c0t600C0FF00000000007ECD20CD4BBE500d0s2
LUN: 1
Vendor: SUN
Product: StorEdge 3511
OS Device Name: /dev/rdsk/c0t600C0FF00000000007ECD20CD4BBE501d0s2
LUN: 2
Vendor: SUN
Product: StorEdge 3511
OS Device Name: /dev/rdsk/c0t600C0FF00000000007ECD20CD4BBE502d0s2
....
Userland Utilities
# cfgadm -la -o show_SCSI_LUN
Ap_Id Type Receptacle Occupant Condition
c1 fc-private connected configured unknown
c1::200400a0b81770cf,0 disk connected configured unknown
c1::200400a0b81770cf,1 disk connected configured unknown
c1::200400a0b81770cf,31 disk connected configured unknown
c2 fc-fabric connected configured unknown
c2::210100e08b275cb5 unknown connected unconfigured unknown
c2::210100e08b27abb5 unknown connected unconfigured unknown
c2::50020f2300000cf0,0 disk connected configured unknown
c2::50020f2300000cf0,1 disk connected configured unknown
c2::50020f2300000cf0,2 disk connected configured unknown
c2::50020f2300000cf0,3 disk connected configured unknown
c2::50020f2300004bf0,0 disk connected configured unknown
c2::50020f2300004bf0,1 disk connected configured unknown
c2::50020f2300004bf0,2 disk connected configured unknown
...
Note: the two “unknown” entries under c2 are HBAs visible from this port, in the same zone as this port
Userland Utilities
# cfgadm -la -o show_FCP_dev
Ap_Id Type Receptacle Occupant Condition
c1 fc-private connected configured unknown
c1::200400a0b81770cf,0 disk connected configured unknown
c1::200400a0b81770cf,1 disk connected configured unknown
c1::200400a0b81770cf,31 disk connected configured unknown
c2 fc-fabric connected configured unknown
c2::210100e08b275cb5 unknown connected unconfigured unknown
c2::210100e08b27abb5 unknown connected unconfigured unknown
c2::50020f2300000cf0,0 disk connected configured unknown
c2::50020f2300000cf0,1 disk connected configured unknown
c2::50020f2300000cf0,2 disk connected configured unknown
c2::50020f2300000cf0,3 disk connected configured unknown
c2::50020f2300004bf0,0 disk connected configured unknown
c2::50020f2300004bf0,1 disk connected configured unknown
c2::50020f2300004bf0,2 disk connected configured unknown
....
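The Ap_Id column in the listings above packs the controller, the remote port WWN and (for LUN-level entries) the LUN number into one string. A minimal sketch of splitting it apart, assuming the controller::WWN,LUN layout seen in this output; the ApId type and parse_ap_id name are hypothetical.

```python
from typing import NamedTuple, Optional


class ApId(NamedTuple):
    controller: str
    port_wwn: Optional[str]
    lun: Optional[int]


def parse_ap_id(ap_id: str) -> ApId:
    """Split a cfgadm fp Ap_Id such as 'c2::50020f2300000cf0,3'."""
    controller, sep, rest = ap_id.partition("::")
    if not sep:
        # Bare controller entry, e.g. 'c1'.
        return ApId(controller, None, None)
    wwn, comma, lun = rest.partition(",")
    return ApId(controller, wwn, int(lun) if comma else None)


for ap in ("c1", "c2::210100e08b275cb5", "c2::50020f2300000cf0,3"):
    print(parse_ap_id(ap))
```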
Debugging features (1)
We provide dcmds and walkers in mdb
SFK support is being added to Solaris CAT v5
If you run 'touch /var/adm/sun_fc.debug' then any command which uses libsun_fc will write trace information to it
luxadm allows you to look at various link statuses
cfgadm shows basic multipath status information
fcinfo shows you what local ports you have, and also what remote ports, devices and targets are attached
Debugging features (2) – dcmds in mdb
Some of our dcmds are
::fcptrace       displays the FCP trace buffer
::fptrace        displays the FP trace buffer
::fcip           displays FCIP instances
::fcport         displays FCP port instances
::remote_port    displays remote FC port instances
::ulps           displays the Upper Layer Protocol modules installed and their IDs
::ulpmods        displays the port to ULP mapping
::emlxs_msgbuf   dumps the emlxs driver message buffer
::emlxs_show     shows any structure in the emlxs driver
::qlclinks       prints qlc link information
::qlcstate       prints qlc adapter state information
Of the above, the fcptrace and fptrace buffers are probably the most useful to get started with debugging an issue
Debugging features (3) – dcmds in mdb
You can find out what dcmds and walkers are in a particular module by running
::dmods -l [module name]
NWS modules for mdb are fctl, fcp, fcip, qlc and emlxs
> ::dmods -l fctl
fctl
dcmd fcport - Display a Leadville fc_local_port structure
dcmd fcptrace - Dump the fcp trace buffer, optionally supplying starting and ending packet numbers.
dcmd fptrace - Dump the fp trace buffer, optionally supplying starting and ending packet numbers.
dcmd ports - Leadville port list
dcmd remote_port - Display fc_remote_port structures
dcmd ulpmods - Leadville ULP module list
dcmd ulps - Leadville ULP list
walk job_request - walk list of job_request structures for a local port
walk orphan - walk list of orphan structures for a local port
walk pd_by_did - walk list of fc_remote_port structures hashed by D_ID
walk pd_by_pwwn - walk list of fc_remote_port structures hashed by PWWN
walk ports - walk list of Leadville port structures
walk ulpmods - walk list of Leadville ULP module structures
walk ulps - walk list of Leadville ULP structures
Debugging features (4) – fcp trace buffer
The FCP and FP drivers keep a trace buffer (pointed to by (ss)fcp_logq and (ss)fp_logq) where events of interest are noted
> ::fcptrace
[Tue Dec 20 14:38:49 2005] 13=>ssfcp(2)::PLOGI to d_id=0x80200 succeeded, wwn=c0006025d2ecc7ff
[Tue Dec 20 14:38:49 2005] 14=>ssfcp(2)::ssfcp_send_els: d_id=0x80200 ELS 0x20 (PRLI)
[Tue Dec 20 14:38:49 2005] 15=>ssfcp(2)::ssfcp_send_els: returning 0
[Tue Dec 20 14:38:49 2005] 16=>ssfcp(2)::ELS (20) callback state=0x1 for 80200
[Tue Dec 20 14:38:49 2005] 17=>ssfcp(2)::PRLI to d_id=0x80200 succeeded
...
[Tue Dec 20 14:38:49 2005] 20=>ssfcp(2)::ssfcp_handle_reportlun: port=2, tgt D_ID=0x80200
[Tue Dec 20 14:38:49 2005] 21=>ssfcp(2)::!Dynamically discovered 18 LUNs for D_ID=80200
Each entry has a timestamp, a monotonically-increasing sequence number, the fcp or fp instance making the entry (named ssfcp prior to Nevada), and the actual message itself
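The entry layout described above (timestamp, sequence number, module and instance, message) is regular enough to parse with one regular expression. This is a sketch based only on the sample lines shown here; the pattern and the parse_trace_line name are assumptions, not part of mdb or Solaris CAT.

```python
import re

# [timestamp] seq=>module(instance)::message
TRACE_RE = re.compile(
    r"\[(?P<timestamp>[^\]]+)\]\s+"
    r"(?P<seq>\d+)=>"
    r"(?P<module>\w+)\((?P<instance>\d+)\)::"
    r"(?P<message>.*)"
)


def parse_trace_line(line: str):
    """Return the fields of one trace entry as a dict, or None if it does not match."""
    match = TRACE_RE.match(line.strip())
    if match is None:
        return None
    entry = match.groupdict()
    entry["seq"] = int(entry["seq"])
    entry["instance"] = int(entry["instance"])
    return entry


sample = ("[Tue Dec 20 14:38:49 2005] 13=>ssfcp(2)::PLOGI to d_id=0x80200 "
          "succeeded, wwn=c0006025d2ecc7ff")
entry = parse_trace_line(sample)
print(entry["seq"], entry["module"], entry["instance"], entry["message"])
```

Because the sequence number increases monotonically, sorting or gap-checking on seq is a quick way to spot entries missing from a pasted trace.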
Debugging features (5) – fp trace buffer
> ::fptrace
[Tue Dec 20 14:38:49 2005] 47=>fp(2)::RSCN with D_ID page; port=ffffffff85180000, d_id=80200, pd=0
[Tue Dec 20 14:38:49 2005] 48=>fp(2)::NS Query response, cmd_code=112, xfer_len=8
[Tue Dec 20 14:38:49 2005] 49=>fp(2)::GPN_ID results; 25 60 0 ffffffc0 ffffffff
[Tue Dec 20 14:38:49 2005] 50=>fp(2)::NS Query Response for D_ID page; rev=1, in_id=0, cmdrsp=8002, reason=0, expln=0, rval=0
[Tue Dec 20 14:38:49 2005] 51=>fp(2)::new port attached to domain, calling fp_validate_area_domain
[Tue Dec 20 14:38:49 2005] 52=>fp(2)::GAN response; port=ffffffff85180000, d_id=80200
[Tue Dec 20 14:38:49 2005] 53=>fp(2)::GAN response details; port=ffffffff85180000, d_id=80200, type_id=20801, pwwn=25 60 0 c0 ff c7 ec d2, nwwn=20 60 0 c0 ff 7 ec d2
[Tue Dec 20 14:38:49 2005] 54=>fp(2)::GAN PD stuffing; pd=ffffffff867c8800, port_id=20801, sym_len=28 fc4-type=10000
[Tue Dec 20 14:38:49 2005] 55=>fp(2)::GAN response; port=ffffffff85180000, d_id=80600
[Tue Dec 20 14:38:49 2005] 56=>fp(2)::fp_validate_area_domain: get_devcount found 1 devices attached to port 0xffffffff85180000
Note that we've got the same format as for the FCP trace buffer....
Debugging features (6) – fp trace buffer
It's really important that we get trace buffer contents when you log a bug
FC analyser traces tell us what happens between the HBA and the attached storage
Trace buffers tell us what Leadville {fp|fcp|fcip|fctl|fcsm} does (and what they think is going on)
When you log a bug, give us a crash dump (live or dead) and we'll work out whether an FC analyser trace is needed.
Debugging features (7) – ports!
> ::ports
Port I# State Soft FCA Handle Port DIP FCA Port DIP
ffffffff85180000 2 401 0 ffffffff8252f200 ffffffff83c78dc0 ffffffff807613d8
ffffffff84058000 3 0 0 ffffffff8252fb00 ffffffff81d3b400 ffffffff807611f8
ffffffff83332000 1 0 0 ffffffff835d5040 ffffffff83c77dc8 ffffffff807615b8
ffffffff83372000 0 0 0 ffffffff82eff040 ffffffff835c5098 ffffffff80761798
The fields have types as follows:
Port: fc_local_port (snv) or fc_port_t (s10 and earlier)
I#: instance number
State: port speed (msb) and actual state (lsb)
Soft: soft_state defined in fc_portif.h
FCA Handle: opaque pointer for the local port device
Port DIP: port dip (dev_info_t)
FCA Port DIP: fca dip (dev_info_t)
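Splitting the State column into its two halves can be sketched as follows. This assumes the column is printed in hexadecimal and that "msb" and "lsb" mean the high and low byte of a 16-bit value; neither assumption is confirmed by the slide, and the function name is hypothetical. What each resulting value means is defined in the fctl headers and is not decoded here.

```python
def split_port_state(state: int) -> tuple[int, int]:
    """Return (speed_bits, state_bits) from a ::ports State value.

    Assumption: speed lives in the high byte, link state in the low byte.
    """
    speed_bits = (state >> 8) & 0xFF
    state_bits = state & 0xFF
    return speed_bits, state_bits


# "401" is the State shown for instance 2 in the listing above (assumed hex).
speed, link = split_port_state(int("401", 16))
print(hex(speed), hex(link))
```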
Debugging features (8) – ULP info
> ::ulps
ULP Name Type Revision
FCSM 20 2
SunFC FCIP v20050927-1.43 5 2
fcp 8 2
> ::ulpmods
Type Port Handle dstate statec
8 ffffffff83372000 1 2
8 ffffffff83332000 1 2
8 ffffffff84058000 1 2
8 ffffffff85180000 1 0
5 ffffffff83332000 1 2
5 ffffffff83372000 1 2
5 ffffffff84058000 1 2
5 ffffffff85180000 1 0
20 ffffffff85180000 1 0
20 ffffffff83332000 1 2
20 ffffffff83372000 1 2
20 ffffffff84058000 1 2
ULPs are Upper Layer Protocols
Debugging features (9) – mdb walkers
Walkers in mdb:
cmds             walk list of SCSI commands in fcp's per-lun queue
fcp              walk list of Leadville fcp instances
fcpX_cache       walk the fcpX_cache cache
fcsm_job_cache   walk the fcsm_job_cache cache
fctl_cache       walk the fctl_cache cache
fpX_cache        walk the fpX_cache cache
luns             walk list of LUNs in an fcp target
pd_by_did        walk list of fc_remote_port structures hashed by D_ID
pd_by_pwwn       walk list of fc_remote_port structures hashed by PWWN
ports            walk list of Leadville port structures
targets          walk list of fcp targets attached to the local port
ulpmods          walk list of Leadville ULP module structures
ulps             walk list of Leadville ULP structures
qlcstates        walk list of qlc ql_state_t structures
Debugging features (10) – mdb walkers
Usage example (Solaris 10):
*ssfcp_port_head::walk fcp|::walk targets|::walk luns
which gives you a struct ssfcp_lun which you can mdb-pipe to ::print like this:
*ssfcp_port_head::walk fcp|::walk targets|::walk luns|::print -t struct ssfcp_lun
Debugging features (11) – Solaris CAT
The Solaris CAT support functions and routines mirror those of mdb
Note: currently under development, tentatively targeted for Solaris CAT v5
Command is called “san” and will have these options:
fptrace [-s m][-e n]              dumps the FP trace buffer (-s start, -e end packet #)
fcptrace [-s m][-e n]             dumps the FCP trace buffer (-s start, -e end packet #)
ports [-l|-r|-h|-v] [WWN|port#]   shows all ports; -l local ports, -r remote ports, -h hba info, -v all info; with WWN, shows info for WWN only; with port#, shows info for port instance# only
targets [WWN|port#]               shows targets; with WWN or port#, shows targets attached to that WWN or port instance only
luns {WWN|port#}                  shows luns attached to WWN or port instance
Debugging features (12) – Data structures
There are several data structures which you should know about:
fp_cmd
fc_packet
fc_local_port / fc_remote_port (nevada)
fc_port / fc_port_device (s10)
job_request
fc_orphan
fca_port
You can explore these in mdb with ::print -t struct [name of structure]
You can explore these in Solaris CAT v4.2 and later with stype [name of structure]
Search for their definitions at http://cvs.opensolaris.org/source/xref/nwsc/src/sun_nws
Debugging features (13) – Data structures
We embed multipathing pointers within the (ss)fcp_lun structure:
struct ssfcp_lun { (size: 0xc8 bytes)
    ...
    int lun_mpxio;
    typedef child_info_t * = void * *lun_cip; (offset 0x20 bytes, size 0x8 bytes)
    ...
}
If lun_mpxio is 0 (ie, not a multipathed lun), then the lun_cip pointer is a struct dev_info
If lun_mpxio is 1 (multipathed), then the lun_cip pointer is a struct mdi_pathinfo
Putting it all together (1) – from app to target
App issues write(2) or read(2) command
System call interface routes this to SCSA layer, calls down into fcp
fcp encapsulates the SCSI command it receives from the target driver into a fibre-channel packet, fills in the address portions appropriately, looks up the correct hba to send the fc packet to and sends it out
The hba accepts the packet and sends it out over the fibre
If we go through a switch, the switch routes the packet to the correct device, otherwise the device we send the packet to accepts it and deals with it
Putting it all together (2) – from target to app
Target puts data into its buffer, sends an “interrupt” out to the switch (or hba if directly attached).
Switch routes packet to D_ID (destination ID)
The hba accepts the packet and sends it to fcp
fcp un-encapsulates the data from the packet structures, and when the SCSI packet is complete, passes the SCSI packet up to a multipathing (scsi_vhci/vxdmp) or target driver (sd/st/sgen/sg)
The target driver passes the SCSI packet up to a layered driver such as md/vxvm/zfs/... if required
Either the target driver or the layered driver then provides your data in a buffer to your app
It's just like any other SCSI-attached storage, so that you (app writer/sysadmin) don't have to worry about the mechanics of what happens..... It Just Works! (tm)
Putting it all together – Device Discovery (1)
A new device is added to the zone, so the switch sends an RSCN (Registered State Change Notification) to each port which has registered to receive RSCNs
Each port then invokes its RSCN handler. For fp, that's fp_validate_unsol_rscn()
fp repeatedly interrogates the switch with a GPN_ID query (Get Port Name, by ID) and then either
fills out the list of “old” devices and makes sure the LILP map is correct, or
queries the switch's Name Service for each device in the zone and then re-validates the port descriptor tables
If required, we then hand it all off to fcp....
Putting it all together – Device Discovery (2)
Each fcp instance has a hotplug task (fcp_hp_task), which gets triggered by fp/fctl/fcsm
fcp_hp_task then calls fcp_trigger_lun to either take the device offline, or bring it online
If we're offlining the lun, then we call fcp_offline_lun, which palms the work off to ndi_devi_offline for a non-mpxio lun, or mdi_pi_offline otherwise
If we're onlining the lun, we call fcp_get_cip to get the appropriate device path and then throw the device at ndi_devi_online (non-mpxio) or mdi_pi_online.
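The online/offline decision described above boils down to a two-way choice on direction and on whether the lun is multipathed. The sketch below only models that choice in userland Python and returns the name of the kernel routine that would be used; it is not the driver's code, and the function name hotplug_target_call is hypothetical.

```python
def hotplug_target_call(online: bool, is_mpxio: bool) -> str:
    """Name the framework routine fcp hands the lun to, per the slide text."""
    if online:
        # Multipathed luns go through MDI, everything else through NDI.
        return "mdi_pi_online" if is_mpxio else "ndi_devi_online"
    return "mdi_pi_offline" if is_mpxio else "ndi_devi_offline"


for online in (True, False):
    for is_mpxio in (False, True):
        print(online, is_mpxio, hotplug_target_call(online, is_mpxio))
```

The is_mpxio flag plays the role of lun_mpxio in the (ss)fcp_lun structure: it decides whether the lun is treated as a dev_info node (NDI) or as an mdi_pathinfo node (MDI).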
What about standards?
Acronym     Title
FC-PH       Fibre Channel Physical Interface
FC-PH-2     Fibre Channel Physical Interface, gen 2
FC-PH-3     Fibre Channel Physical Interface, gen 3
FC-AL       Arbitrated Loop
FC-AL-2     Arbitrated Loop, gen 2
FC-FG       Generic Fabric
FC-SW       Switched Fabric
FC-GS       Generic Services
FC-GS-2     Generic Services, gen 2
FC-LE       Link Encapsulation
FC-SB       Single-Byte command set mapping
FC-PLDA     Private Loop Direct Attach
10 BIT      10-bit Interface
FC-FLA      Fabric Loop Attachment
SCSI-FCP    SCSI-3 encapsulation in FC
SCSI-GPP    Generic Packetized Protocol
www.t11.org for FC
www.t10.org for SCSI
Code Pointers
The NWS consolidation is available at http://cvs.opensolaris.org/source/xref/nwsc
You can search and browse the source using OpenGrok
You can download snapshots of the NWS consolidation from http://www.opensolaris.org/os/downloads
MPAPI: http://mp-mgmt-api.sourceforge.net
http://www.snia.org/apps/org/workgroup/os-attach
References and Further Reading (1)
www.sun.com/storagetek/storage_networking    SAN Foundation Kit home
www.opensolaris.org    OpenSolaris!
docs.sun.com/source/819-0139    Solaris Fibre Channel and Storage Multipathing
sunsolve.sun.com    infodocs and patches
www.sun.com/bigadmin    A portal for system administration topics
References and Further Reading (2)
www.t11.org    FibreChannel Standards Committee
www.t10.org    SCSI Standards Committee
www.snia.org    Storage Networking Industry Association
www.qlogic.com    QLogic Corporation
www.emulex.com    Emulex Corporation
Blogs
blogs.sun.com/roller/page/jmcp    James McPherson
blogs.sun.com/roller/page/torrey    Torrey McMahon
blogs.sun.com/roller/page/AaronDailey    Aaron Dailey
blogs.sun.com/roller/page/dweibel    David Weibel
blogs.sun.com/roller/page/hbainsights    Sumit Gupta
What Questions Do You Have?
Image copyright © 2006 James C. McPherson