Sun Microsystems, Inc.
www.sun.com

Submit comments about this document at: http://www.sun.com/hwdocs/feedback

Sun Fire™ V445 Server Administration Guide

Part No. 819-3741-13
September 2007, Revision A

Copyright 2007 Sun Microsystems, Inc., 4150 Network Circle, Santa Clara, California 95054, U.S.A. All rights reserved.

Sun Microsystems, Inc. has intellectual property rights relating to technology that is described in this document. In particular, and without limitation, these intellectual property rights may include one or more of the U.S. patents listed at http://www.sun.com/patents and one or more additional patents or pending patent applications in the U.S. and in other countries.

This document and the product to which it pertains are distributed under licenses restricting their use, copying, distribution, and decompilation. No part of the product or of this document may be reproduced in any form by any means without prior written authorization of Sun and its licensors, if any.

Third-party software, including font technology, is copyrighted and licensed from Sun suppliers.

Parts of the product may be derived from Berkeley BSD systems, licensed from the University of California. UNIX is a registered trademark in the U.S. and in other countries, exclusively licensed through X/Open Company, Ltd.

Sun, Sun Microsystems, the Sun logo, Sun Fire, Solaris, VIS, Sun StorEdge, Solstice DiskSuite, Java, SunVTS and the Solaris logo are trademarks or registered trademarks of Sun Microsystems, Inc. in the U.S. and in other countries.

All SPARC trademarks are used under license and are trademarks or registered trademarks of SPARC International, Inc. in the U.S. and in other countries. Products bearing SPARC trademarks are based upon an architecture developed by Sun Microsystems, Inc.

The OPEN LOOK and Sun™ Graphical User Interface was developed by Sun Microsystems, Inc. for its users and licensees. Sun acknowledges the pioneering efforts of Xerox in researching and developing the concept of visual or graphical user interfaces for the computer industry. Sun holds a non-exclusive license from Xerox to the Xerox Graphical User Interface, which license also covers Sun’s licensees who implement OPEN LOOK GUIs and otherwise comply with Sun’s written license agreements.

U.S. Government Rights – Commercial use. Government users are subject to the Sun Microsystems, Inc. standard license agreement and applicable provisions of the FAR and its supplements.

DOCUMENTATION IS PROVIDED "AS IS" AND ALL EXPRESS OR IMPLIED CONDITIONS, REPRESENTATIONS AND WARRANTIES, INCLUDING ANY IMPLIED WARRANTY OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE OR NON-INFRINGEMENT, ARE DISCLAIMED, EXCEPT TO THE EXTENT THAT SUCH DISCLAIMERS ARE HELD TO BE LEGALLY INVALID.

Contents

Preface

1. System Overview

Sun Fire V445 Server Overview

Processors and Memory

External Ports

Gigabit Ethernet Ports

10BASE-T Network Management Port

Serial Management and DB-9 Ports

USB Ports

RAID 0,1 Internal Hard Drives

PCI Subsystem

Power Supplies

System Fan Trays

ALOM System Controller Card

Hardware Disk Mirroring and Striping

Predictive Self-Healing

New Features

Locating Front Panel Features

Front Panel Indicators

Power Button

USB Ports

SAS Disk Drives

Removable Media Drive

Locating Back Panel Features

Back Panel Indicators

Power Supplies

PCI Slots

System Controller Ports

Network Management Port

Serial Management Port

System I/O Ports

USB Ports

Gigabit Ethernet Ports

DB-9 Serial Port

Reliability, Availability, and Serviceability (RAS) Features

Sun Cluster Software

Sun Management Center Software

2. Configuring the System Console

About Communicating With the System

About Using the System Console

Default System Console Connection Through the Serial Management and Network Management Ports

Access Through the Network Management Port

ALOM

Alternative System Console Configuration

Accessing the System Console Through a Graphics Monitor

About the sc> Prompt

Access Through Multiple Controller Sessions

Ways of Reaching the sc> Prompt

About the ok Prompt

Entering the ok Prompt

Graceful Shutdown

ALOM System Controller break or console Command

L1-A (Stop-A) Keys or Break Key

Externally Initiated Reset (XIR)

Manual System Reset

About Switching Between the ALOM System Controller and the System Console

Entering the ok Prompt

▼ To Enter the ok Prompt

Using the Serial Management Port

▼ To Use the Serial Management Port

Activating the Network Management Port

▼ To Activate the Network Management Port

Accessing the System Console With a Terminal Server

▼ To Access the System Console With a Terminal Server Through the Serial Management Port

▼ To Access the System Console With a Terminal Server Through the TTYB Port

What Next

Accessing the System Console With a Tip Connection

▼ To Access the System Console With a Tip Connection Through the Serial Management Port

▼ To Access the System Console With a Tip Connection Through the TTYB Port

Modifying the /etc/remote File

▼ To Modify the /etc/remote File

Accessing the System Console With an Alphanumeric Terminal

▼ To Access the System Console With an Alphanumeric Terminal Through the Serial Management Port

▼ To Access the System Console With an Alphanumeric Terminal Through the TTYB Port

Verifying Serial Port Settings on TTYB

▼ To Verify Serial Port Settings on TTYB

Accessing the System Console With a Local Graphics Monitor

▼ To Access the System Console With a Local Graphics Monitor

Reference for System Console OpenBoot Configuration Variable Settings

3. Powering On and Powering Off the System

Before You Begin

Powering On the Server Remotely

▼ To Power On the Server Remotely

Powering On the Server Locally

▼ To Power On the Server Locally

Powering Off the System Remotely

▼ To Power Off the System Remotely From the ok Prompt

▼ To Power Off the System Remotely From the ALOM System Controller Prompt

Powering Off the Server Locally

▼ To Power Off the Server Locally

Initiating a Reconfiguration Boot

▼ To Initiate a Reconfiguration Boot

Selecting a Boot Device

▼ To Select a Boot Device

4. Configuring Hardware

About the CPU/Memory Modules

DIMMs

Memory Interleaving

Independent Memory Subsystems

DIMM Configuration Rules

About the ALOM System Controller Card

Configuration Rules

About the PCI Cards and Buses

Configuration Rules

About the SAS Controller

About the SAS Backplane

Configuration Rules

About Hot-Pluggable and Hot-Swappable Components

Hard Disk Drives

Power Supplies

System Fan Trays

USB Components

About the Internal Disk Drives

Configuration Rules

About the Power Supplies

Performing a Power Supply Hot-Swap Operation

Power Supply Configuration Rules

About the System Fan Trays

System Fan Configuration Rules

About the USB Ports

Configuration Rules

About the Serial Ports

5. Managing RAS Features and System Firmware

About Reliability, Availability, and Serviceability Features

Hot-Pluggable and Hot-Swappable Components

n+2 Power Supply Redundancy

ALOM System Controller

Environmental Monitoring and Control

Automatic System Restoration

Sun StorEdge Traffic Manager

Hardware Watchdog Mechanism and XIR

Support for RAID Storage Configurations

Error Correction and Parity Checking

About the ALOM System Controller Command Prompt

Logging In to the ALOM System Controller

▼ To Log In to the ALOM System Controller

About the scadm Utility

Viewing Environmental Information

▼ To View Environmental Information

Controlling the Locator Indicator

▼ To Control the Locator Indicator

About Performing OpenBoot Emergency Procedures

Stop-A Function

Stop-N Function

▼ To Emulate the Stop-N Function

Stop-F Function

Stop-D Function

About Automatic System Restoration

Unconfiguring a Device Manually

▼ To Unconfigure a Device Manually

Reconfiguring a Device Manually

▼ To Reconfigure a Device Manually

Enabling the Hardware Watchdog Mechanism and Its Options

▼ To Enable the Hardware Watchdog Mechanism and Its Options

About Multipathing Software

6. Managing Disk Volumes

About Disk Volumes

About Volume Management Software

VERITAS Dynamic Multipathing

Sun StorEdge Traffic Manager

About RAID Technology

Disk Concatenation

RAID 0: Disk Striping or Integrated Stripe (IS)

RAID 1: Disk Mirroring or Integrated Mirror (IM)

Hot-Spares

About Hardware Disk Mirroring

About Physical Disk Slot Numbers, Physical Device Names, and Logical Device Names

Creating a Hardware Disk Mirror

▼ To Create a Hardware Disk Mirror

Creating a Hardware Mirrored Volume of the Default Boot Device

▼ To Create a Hardware Mirrored Volume of the Default Boot Device

Creating a Hardware Striped Volume

Configuring and Labeling a Hardware RAID Volume for Use in the Solaris Operating System

Deleting a Hardware Disk Mirror

▼ To Delete a Hardware Disk Mirror

Performing a Mirrored Disk Hot-Plug Operation

▼ To Perform a Mirrored Disk Hot-Plug Operation

Performing a Nonmirrored Disk Hot-Plug Operation

▼ To View the Status of the SCSI Devices

▼ To Perform a Nonmirrored Disk Hot-Plug Operation

7. Managing Network Interfaces

About the Network Interfaces

About Redundant Network Interfaces

Attaching a Twisted-Pair Ethernet Cable

▼ To Attach a Twisted-Pair Ethernet Cable

Configuring the Primary Network Interface

▼ To Configure the Primary Network Interface

Configuring Additional Network Interfaces

▼ To Configure Additional Network Interfaces

8. Diagnostics

Diagnostic Tools Overview

About Sun Advanced Lights-Out Manager 1.0 (ALOM)

ALOM Management Ports

Setting the admin Password for ALOM

Basic ALOM Functions

▼ To Switch to the ALOM Prompt

▼ To Switch to the Server Console Prompt

About Status Indicators

About POST Diagnostics

OpenBoot PROM Enhancements for Diagnostic Operation

What’s New in Diagnostic Operation

About the New and Redefined Configuration Variables

About the Default Configuration

About Service Mode

About Initiating Service Mode

About Overriding Service Mode Settings

About Normal Mode

About Initiating Normal Mode

About the post Command

▼ To Initiate Service Mode

▼ To Initiate Normal Mode

Reference for Estimating System Boot Time (to the ok Prompt)

Boot Time Estimates for Typical Configurations

Estimating Boot Time for Your System

Reference for Sample Outputs

Reference for Determining Diagnostic Mode

Quick Reference for Diagnostic Operation

OpenBoot Diagnostics

▼ To Start OpenBoot Diagnostics

Controlling OpenBoot Diagnostics Tests

test and test-all Commands

OpenBoot Diagnostics Error Messages

About OpenBoot Commands

probe-scsi-all

probe-ide

show-devs

▼ To Run OpenBoot Commands

About Predictive Self-Healing

Predictive Self-Healing Tools

Using the Predictive Self-Healing Commands

Using the fmdump Command

Using the fmadm faulty Command

Using the fmstat Command

About Traditional Solaris OS Diagnostic Tools

Error and System Message Log Files

Solaris System Information Commands

Using the prtconf Command

Using the prtdiag Command

Using the prtfru Command

Using the psrinfo Command

Using the showrev Command

▼ To Run Solaris System Information Commands

Viewing Recent Diagnostic Test Results

▼ To View Recent Test Results

Setting OpenBoot Configuration Variables

▼ To View and Set OpenBoot Configuration Variables

Additional Diagnostic Tests for Specific Devices

Using the probe-scsi Command to Confirm That Hard Disk Drives are Active

Using the probe-ide Command To Confirm That the DVD Drive is Connected

Using the watch-net and watch-net-all Commands to Check the Network Connections

About Automatic Server Restart

About Automatic System Restoration

Auto-Boot Options

▼ To Set the Auto-Boot Switches

Error Handling Summary

Reset Scenarios

Automatic System Restoration User Commands

Enabling Automatic System Restoration

Disabling Automatic System Restoration

▼ To Disable Automatic System Restoration

Displaying Automatic System Restoration Information

About SunVTS

SunVTS Software and Security

Using SunVTS

▼ To Find Out Whether SunVTS Is Installed

Installing SunVTS

Viewing SunVTS Documentation

About Sun Management Center

How Sun Management Center Works

Using Sun Management Center

Other Sun Management Center Features

Informal Tracking

Hardware Diagnostic Suite

Interoperability With Third-Party Monitoring Tools

Obtaining the Latest Information

Hardware Diagnostic Suite

When to Run Hardware Diagnostic Suite

Requirements for Using Hardware Diagnostic Suite

9. Troubleshooting

Troubleshooting Options

About Updated Troubleshooting Information

Product Notes

Web Sites

SunSolve Online

Big Admin

About Firmware and Software Patch Management

About Sun Install Check Tool

About Sun Explorer Data Collector

About Sun Remote Services Net Connect

About Configuring the System for Troubleshooting

Hardware Watchdog Mechanism

Automatic System Restoration Settings

Remote Troubleshooting Capabilities

System Console Logging

Predictive Self-Healing

Core Dump Process

Enabling the Core Dump Process

▼ To Enable the Core Dump Process

Testing the Core Dump Setup

▼ To Test the Core Dump Setup

A. Connector Pinouts

Reference for the Serial Management Port Connector

Serial Management Connector Diagram

Serial Management Connector Signals

Reference for the Network Management Port Connector

Network Management Connector Diagram

Network Management Connector Signals

Reference for the Serial Port Connector

Serial Port Connector Diagram

Serial Port Connector Signals

Reference for the USB Connectors

USB Connector Diagram

USB Connector Signals

Reference for the Gigabit Ethernet Connectors

Gigabit Ethernet Connector Diagram

Gigabit Ethernet Connector Signals

B. System Specifications

Reference for Physical Specifications

Reference for Electrical Specifications

Reference for Environmental Specifications

Reference for Agency Compliance Specifications

Reference for Clearance and Service Access Specifications

C. OpenBoot Configuration Variables

Index

Figures

FIGURE 1-1 Front Panel Features

FIGURE 1-2 Front Panel System Status Indicators

FIGURE 1-3 Power Button Location

FIGURE 1-4 USB Ports Location

FIGURE 1-5 Hard Disk Drives Location

FIGURE 1-6 Removable Media Drive Location

FIGURE 1-7 Back Panel Features

FIGURE 1-8 PCI Slot Locations

FIGURE 1-9 Network and Serial Management Port Locations

FIGURE 1-10 System I/O Port Locations

FIGURE 1-11 Gigabit Ethernet Port Locations

FIGURE 2-1 Directing the System Console to Different Ports and Different Devices

FIGURE 2-2 Serial Management Port (Default Console Connection)

FIGURE 2-3 Separate System Console and System Controller Channels

FIGURE 2-4 Patch Panel Connection Between a Terminal Server and a Sun Fire V445 Server

FIGURE 2-7 Tip Connection Between a Sun Fire V445 Server and Another Sun System

FIGURE 4-1 Memory Module Groups 0 and 1

FIGURE 4-2 ALOM System Controller Card

FIGURE 4-3 ALOM System Controller Card Ports

FIGURE 4-4 PCI Slots

FIGURE 4-5 Hard Disk Drives and Indicators

FIGURE 4-6 Power Supplies and Indicators

FIGURE 4-7 System Fan Trays and Fan Indicators

FIGURE 8-7 Diagnostic Mode Flowchart

FIGURE A-1 Serial Management Connector Diagram

FIGURE A-2 Network Management Connector Diagram

FIGURE A-3 Serial Port Connector Diagram

FIGURE A-4 USB Connector Diagram

FIGURE A-5 Gigabit Ethernet Connector Diagram

Tables

TABLE 1-1 Sun Fire V445 Server Features at a Glance

TABLE 1-2 System Status Indicators

TABLE 1-3 System Diagnostic Indicators

TABLE 1-4 Network Management Port Indicator

TABLE 1-5 Ethernet Indicators

TABLE 2-1 Ways of Communicating With the System

TABLE 2-3 Ways of Accessing the ok Prompt

TABLE 2-9 Pin Crossovers for Connecting to a Typical Terminal Server

TABLE 2-20 OpenBoot Configuration Variables That Affect the System Console

TABLE 4-1 Memory Module Groups 0 and 1

TABLE 4-2 PCI Bus Characteristics, Associated Bridge Chips, Motherboard Devices, and PCI Slots

TABLE 4-3 PCI Slot Device Names and Paths

TABLE 4-4 Hard Disk Drive Status Indicators

TABLE 4-5 Power Supply Status Indicators

TABLE 4-6 Fan Tray Status Indicators

TABLE 5-15 Device Identifiers and Devices

TABLE 6-1 Disk Slot Numbers, Logical Device Names, and Physical Device Names

TABLE 8-1 Summary of Diagnostic Tools

TABLE 8-2 What ALOM Monitors

TABLE 8-7 OpenBoot Configuration Variables That Control Diagnostic Testing and Automatic System Restoration

TABLE 8-8 Service Mode Overrides

TABLE 8-9 Scenarios for Overriding Service Mode Settings

TABLE 8-10 Summary of Diagnostic Operation

TABLE 8-13 Sample obdiag Menu

TABLE 8-17 Keywords for the test-args OpenBoot Configuration Variable

TABLE 8-23 System Generated Predictive Self-Healing Message

TABLE 8-30 showrev -p Command Output

TABLE 8-31 Using Solaris Information Display Commands

TABLE 8-35 SunVTS Tests

TABLE 8-38 What Sun Management Center Monitors

TABLE 8-39 Sun Management Center Features

TABLE 9-1 OpenBoot Configuration Variable Settings to Enable Automatic System Restoration

TABLE A-1 Serial Management Connector Signals

TABLE A-2 Network Management Connector Signals

TABLE A-3 Serial Port Connector Signals

TABLE A-4 USB Connector Signals

TABLE A-5 Gigabit Ethernet Connector Signals

TABLE B-1 Dimensions and Weight

TABLE B-2 Electrical Specifications

TABLE B-3 Environmental Specifications

TABLE B-4 Agency Compliance Specifications

TABLE B-5 Clearance and Service Access Specifications

TABLE C-1 OpenBoot Configuration Variables Stored on a ROM Chip

Preface

The Sun Fire V445 Server Administration Guide is intended for experienced system administrators. It includes general descriptive information about the Sun Fire™ V445 server and detailed instructions for configuring and administering the server.

To use the information in this manual, you must have working knowledge of computer network concepts and terms, and advanced familiarity with the Solaris™ Operating System (OS).

How This Book Is Organized

The Sun Fire V445 Server Administration Guide is divided into the following chapters:

• Chapter 1 presents an illustrated overview of the system and a description of the system’s reliability, availability, and serviceability (RAS) features, as well as new features introduced with this server.

• Chapter 2 describes the system console and how to access it.

• Chapter 3 describes how to power on and power off the system, and how to initiate a reconfiguration boot.

• Chapter 4 describes and illustrates system hardware components. It also includes configuration information for CPU/Memory modules and DIMMs.

• Chapter 5 describes the tools used to configure system firmware, including Sun™ Advanced Lights Out Manager (ALOM) system controller environmental monitoring, automatic system recovery (ASR), hardware watchdog mechanism, and multipathing software. In addition, it describes how to unconfigure and reconfigure a device manually.

• Chapter 6 describes how to manage internal disk volumes and devices.

• Chapter 7 provides instructions for configuring network interfaces.

• Chapter 8 describes how to perform system diagnostics.

• Chapter 9 describes how to troubleshoot the system.

This manual also includes the following appendices:

• Appendix A details connector pinouts.

• Appendix B provides tables of various system specifications.

• Appendix C provides a list of all OpenBoot™ configuration variables, and a short description of each.

Using UNIX Commands

This document might not contain information about basic UNIX® commands and procedures such as shutting down the system, booting the system, and configuring devices.

See one or more of the following for this information:

• Online documentation for the Solaris OS at docs.sun.com

• Other software documentation that you received with your system

Typographic Conventions

TABLE P-1

Typeface* | Meaning | Examples
AaBbCc123 (monospace) | The names of commands, files, and directories; on-screen computer output | Edit your .login file. Use ls -a to list all files. % You have mail.
AaBbCc123 (monospace bold) | What you type, when contrasted with on-screen computer output | % su Password:
AaBbCc123 (italic) | Book titles, new words or terms, words to be emphasized | Read Chapter 6 in the User’s Guide. These are called class options. You must be superuser to do this.
AaBbCc123 (italic) | Command-line variable; replace with a real name or value | To delete a file, type rm filename.

* The settings on your browser might differ from these settings.

System Prompts

TABLE P-2

Type of Prompt | Prompt
C shell | machine-name%
C shell superuser | machine-name#
Bourne shell and Korn shell | $
Bourne shell and Korn shell superuser | #
ALOM system controller | sc>
OpenBoot firmware | ok
OpenBoot Diagnostics | obdiag>

Related Documentation

The documents listed as online are available at:

http://www.sun.com/products-n-solutions/hardware/docs/

TABLE P-3

Application | Title | Part Number | Format | Location
Late-breaking product information | Sun Fire V445 Server Product Notes | 819-3744 | PDF | Online
Installation overview | Sun Fire V445 Server Getting Started Guide | 819-4664 | Printed; PDF | Shipping kit; Online
Installation | Sun Fire V445 Server Installation Guide | 819-3743 | PDF | Online
Service | Sun Fire V445 Server Service Manual | 819-3742 | PDF | Online
Site planning | Site Planning Guide for Sun Servers | 819-5730 | PDF | Online
Site planning data sheet | Sun Fire V445 Server Site Planning Guide | 819-3745 | Printed; PDF | Shipping kit; Online
Sun Advanced Lights Out Manager (ALOM) system controller | Sun Advanced Lights Out Manager (ALOM) 1.6 Online Help | 817-1960 | PDF | Online

Documentation, Support, and Training

Sun Function | URL
Documentation | http://www.sun.com/documentation/
Support | http://www.sun.com/support/
Training | http://www.sun.com/training/

Third-Party Web Sites

Sun is not responsible for the availability of third-party web sites mentioned in this document. Sun does not endorse and is not responsible or liable for any content, advertising, products, or other materials that are available on or through such sites or resources. Sun will not be responsible or liable for any actual or alleged damage or loss caused by or in connection with the use of or reliance on any such content, goods, or services that are available on or through such sites or resources.

Sun Welcomes Your Comments

Sun is interested in improving its documentation and welcomes your comments and suggestions. You can submit your comments by going to:

http://www.sun.com/hwdocs/feedback

Please include the title and part number of your document with your feedback:

Sun Fire V445 Server Administration Guide, part number 819-3741


CHAPTER 1

System Overview

This chapter introduces you to the Sun Fire V445 server and describes its features. The following sections are included:

• “Sun Fire V445 Server Overview” on page 1
• “New Features” on page 7
• “Locating Front Panel Features” on page 9
• “Locating Back Panel Features” on page 16
• “Reliability, Availability, and Serviceability (RAS) Features” on page 22
• “Sun Cluster Software” on page 22
• “Sun Management Center Software” on page 23

Note – This document does not provide instructions for installing or removing hardware components. For instructions on preparing the system for servicing and procedures to install and remove the server components described in this document, refer to the Sun Fire V445 Server Service Manual.

Sun Fire V445 Server Overview

The Sun Fire V445 server is a high-performance, shared memory, symmetric multiprocessing server that supports up to four UltraSPARC® IIIi processors and uses the Fire ASIC PCIe NorthBridge along with PCI-X and PCIe expansion slots. The UltraSPARC IIIi processor has a 1 Mbyte L2 cache and implements the SPARC® V9 Instruction Set Architecture (ISA) and the Visual Instruction Set extensions (Sun VIS software) that accelerate multimedia, networking, encryption, and Java™ software processing. The Fire ASIC provides higher I/O performance and interfaces with the I/O subsystem, which contains four 10/100/1000 Mb Ethernet ports, eight SAS disk drives, one DVD-RW drive, four USB ports, a POSIX compliant DB-9 serial port, and service processor communication ports. The PCI expansion subsystem is configurable with a variety of plug-in third-party adapters.


System reliability, availability, and serviceability (RAS) are enhanced by features that include hot-pluggable disk drives and redundant, hot-swappable power supplies and fan trays. RAS features are described in Chapter 5.

The system, which is mountable in a 4-post rack, measures 6.85 inches high (4 rack units - U), 17.48 inches wide, and 25 inches deep (17.5 cm x 44.5 cm x 64.4 cm). The system weighs approximately 75 lb (34.02 kg). Robust remote access is provided with Advanced Lights Out Manager (ALOM) software, which also controls powering on/off and diagnostics. The system also meets RoHS requirements.

TABLE 1-1 provides a brief description of the Sun Fire V445 server features. More details on these features are provided in the following subsections.

TABLE 1-1 Sun Fire V445 Server Features at a Glance

Feature                      Description

Processor                    4 UltraSPARC IIIi CPUs

Memory                       16 slots that can be populated with one of the following types of DDR1 DIMMs:
                             • 512 MB (8 GB maximum)
                             • 1 GB (16 GB maximum)
                             • 2 GB (32 GB maximum)

External ports               • 4 Gigabit Ethernet ports – Support several modes of operations at 10, 100, and 1000 megabits per second (Mbps)
                             • 1 10BASE-T network management port – Reserved for the ALOM system controller and the system console
                             • 2 Serial ports – One POSIX compliant DB-9 connector, and one RJ-45 serial management connector on the ALOM system controller card
                             • 4 USB ports – USB 2.0 compliant and support 480 Mbps, 12 Mbps, and 1.5 Mbps speeds

Internal hard drives         8 2.5 inch (5.1 cm) high, hot-pluggable Serial Attached SCSI (SAS) disk drives

Other internal peripherals   1 DVD/ROM/RW device

PCI interfaces               8 PCI slots: four 8 lane PCIe slots (2 of which also support 16 lane form factor cards) and 4 PCI-X slots

Power                        4 550-watt hot-swappable power supplies, each with its own cooling fan

Cooling                      6 hot-swappable high-power fan trays (one fan per tray) organized into three redundant pairs – 1 redundant pair for disk drives – 2 redundant pairs for the CPU/memory modules, memory DIMMs, I/O subsystem, and front-to-rear cooling of the system


Processors and Memory

Processing power is provided by up to four CPU/Memory modules. Each module incorporates one UltraSPARC IIIi processor, and slots for four double data rate (DDR) dual inline memory modules (DIMMs).

System main memory is provided by up to 16 DDR synchronous dynamic random access memory DIMMs. The system supports 512-Mbyte, 1-Gbyte, and 2-Gbyte DIMMs. Total system memory is shared by all CPUs in the system and ranges from a minimum of 1 Gbyte (one CPU/memory module with two 512-Mbyte DIMMs) to a maximum of 32 Gbytes (four modules fully populated with 2-Gbyte DIMMs). For more information about system memory, see “DIMMs” on page 74.
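The memory range quoted above follows from simple multiplication, which the short Python sketch below reproduces. The function and constant names are hypothetical and exist only for illustration; the sketch checks the slot counts and DIMM sizes stated in this section and does not model any DIMM population rules covered elsewhere in the guide.

```python
# Illustrative arithmetic only; names are hypothetical, not part of any Sun tool.
SUPPORTED_DIMM_GB = (0.5, 1, 2)  # 512-Mbyte, 1-Gbyte, and 2-Gbyte DIMMs
MAX_MODULES = 4                  # up to four CPU/Memory modules
SLOTS_PER_MODULE = 4             # four DIMM slots per module


def total_memory_gb(modules, dimms_per_module, dimm_gb):
    """Return total system memory in Gbytes for a uniform configuration."""
    if not 1 <= modules <= MAX_MODULES:
        raise ValueError("modules must be between 1 and 4")
    if not 1 <= dimms_per_module <= SLOTS_PER_MODULE:
        raise ValueError("dimms_per_module must be between 1 and 4")
    if dimm_gb not in SUPPORTED_DIMM_GB:
        raise ValueError("unsupported DIMM size")
    return modules * dimms_per_module * dimm_gb
```

For example, `total_memory_gb(1, 2, 0.5)` gives the 1-Gbyte minimum and `total_memory_gb(4, 4, 2)` gives the 32-Gbyte maximum.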

External Ports

The Sun Fire V445 server provides four Gigabit Ethernet ports, one 10BASE-T network management port, two Serial ports, and four USB ports.

Gigabit Ethernet Ports

The four on-board Gigabit Ethernet ports located on the back panel support several modes of operations at 10, 100, and 1000 megabits per second (Mbps). Additional Ethernet interfaces or connections to other network types can be provided by installing the appropriate PCI interface cards. Multiple network interfaces can be combined with Solaris Internet Protocol (IP) network multipathing software to provide hardware redundancy and failover capability, as well as load balancing on outbound traffic. Should one of the interfaces fail, the software can automatically switch all network traffic to an alternate interface to maintain network availability. For more information about network connections, see “Configuring the Primary Network Interface” on page 144 and “Configuring Additional Network Interfaces” on page 145.

TABLE 1-1 Sun Fire V445 Server Features at a Glance (Continued)

Feature                      Description

Remote management            A serial port for the ALOM management controller card and a 10BASE-T network management port for remote access to system functions and the system controller

Disk Mirroring               Hardware RAID 0,1 support for internal disk drives

RAS features                 Robust reliability, availability, and serviceability (RAS) features are supported. See Chapter 5 for details.

Firmware                     Sun system firmware containing:
                             • OpenBoot PROM for system settings and power-on self-test (POST) support
                             • ALOM for remote management administration

Operating system             The Solaris OS is preinstalled on disk 0.

10BASE-T Network Management Port

The network management port (labeled NET MGT) is located on the chassis back panel. This port is reserved for use with the ALOM system controller and the system console.

This port provides direct network access to the ALOM system controller card and its firmware. This port also provides access to the system console, power-on self-test (POST) output messages, and ALOM system controller messages. Use this port to perform remote administration, including externally initiated resets (XIR).

Serial Management and DB-9 Ports

The DB-9 port is POSIX compliant with a general-purpose DB-9 connector (labeled TTYB) on the system back panel. The serial management port is an RJ-45 connector (labeled SERIAL MGT) on the chassis back panel, and is reserved for use with the ALOM system controller and the system console.

The serial management port enables you to set up a system console device, without configuring an existing port. All power-on self-test (POST) and ALOM system controller messages are directed to the serial management port by default. For more information, see “About the Serial Ports” on page 96.

USB Ports

The front and back panels both provide two Universal Serial Bus (USB) ports for connecting peripheral devices such as modems, printers, scanners, digital cameras, or a Sun Type-6 USB keyboard and mouse. The USB ports are USB 2.0 compliant, and support 480 Mbps, 12 Mbps, and 1.5 Mbps speeds. For additional details, see “About the USB Ports” on page 95.


RAID 0,1 Internal Hard Drives

Internal disk storage is provided by up to eight 2.5 inch (5.1 cm) high, hot-pluggable, SAS disk drives. The basic system includes a SAS disk backplane that accommodates eight disks capable of data transfer rates of up to 320 megabytes per second. See “About the Internal Disk Drives” on page 87 and “Locating Back Panel Features” on page 16.

External multidisk storage subsystems and redundant array of independent disks (RAID) storage arrays can be supported by installing peripheral component interconnect (PCI) host adapter cards along with the appropriate system software. Software drivers supporting SCSI and other types of devices are included in the Solaris OS. In addition, the system supports internal hardware mirroring (RAID 0,1) using the on-board SAS controller. See “About RAID Technology” on page 120.

PCI Subsystem

System I/O is handled by two expanded Peripheral Component Interconnect (PCIe) buses and two PCI-X buses. The system has eight PCI slots: four 8 lane PCIe slots (two of which also support 16 lane form factor cards) and four PCI-X slots. The PCI-X slots operate at up to 133 MHz, are 64-bit capable, and support legacy PCI devices. All PCI-X slots comply with PCI Local Bus Specification Rev 2.2 and PCI-X Local Bus Specification Rev 1.0. All PCIe slots comply with PCIe Base Specification r1.0a and PCI Standard SHPC Specification, r1.1. For additional details, see “About the PCI Cards and Buses” on page 81.

Power Supplies

The basic system includes four 550-watt power supplies, each with its own cooling fan. The power supplies are plugged into a separate power distribution board (PDB). This board is connected to the motherboard through 12-volt high current bus bars. Two power supplies provide sufficient current (1100 DC watts) for maximum configuration. The other power supplies provide 2+2 redundancy, enabling the system to continue operating if up to two power supplies fail.

The power supplies are hot-swappable – you can remove and replace a faulty power supply without shutting down the system. With four separate AC inlets you can wire the server with a fully redundant AC circuit. A failed power supply does not need to remain installed to sustain proper cooling. For more information about the power supplies, see “About the Power Supplies” on page 89.
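The 2+2 redundancy figures above can be checked with a few lines of arithmetic. The Python sketch below is illustrative only: the names are hypothetical, and it simply restates the numbers given in this section (four 550-watt supplies, two of which are sufficient for a maximum configuration).

```python
# Illustrative arithmetic only; names are hypothetical.
SUPPLY_WATTS = 550        # rating of each power supply
INSTALLED_SUPPLIES = 4    # supplies in the basic system
REQUIRED_SUPPLIES = 2     # supplies needed for a maximum configuration


def available_dc_watts(working_supplies):
    """DC watts available from the given number of working supplies."""
    return working_supplies * SUPPLY_WATTS


def can_power_max_config(working_supplies):
    """True if enough supplies are working to power a fully configured system."""
    return working_supplies >= REQUIRED_SUPPLIES


def tolerated_failures():
    """Number of supplies that can fail while the system keeps operating."""
    return INSTALLED_SUPPLIES - REQUIRED_SUPPLIES
```

With two working supplies, `available_dc_watts(2)` returns 1100, matching the figure quoted in the text, and `tolerated_failures()` returns 2.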


System Fan Trays

The system is equipped with six fan trays organized into three redundant pairs. One redundant pair is for cooling the disk drives. The other two redundant pairs cool the CPU/Memory modules, memory DIMMs, and I/O subsystem, and provide front-to-rear cooling of the system. Not all fans must be present to provide adequate cooling – only one fan per redundant pair must be present.

Note – All system cooling is provided by the fan trays – power supply fans do not provide system cooling.

See “About the System Fan Trays” on page 92 for details.
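The rule that one fan per redundant pair must be present can be expressed as a small check. The following Python sketch is a hypothetical model for illustration, not the logic used by the system firmware; each pair is represented as a tuple of two booleans indicating whether each fan tray is present and working.

```python
# Hypothetical model of the "one fan per redundant pair" rule.
def cooling_adequate(fan_pairs):
    """Return True if every redundant pair has at least one working fan tray.

    fan_pairs is a sequence of (bool, bool) tuples, one per redundant pair.
    """
    return all(any(pair) for pair in fan_pairs)
```

For instance, `cooling_adequate([(True, False), (True, True), (False, True)])` is true because each pair still has one fan, while losing both fans in any single pair makes the result false.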

ALOM System Controller Card

The Sun ALOM system controller card enables system management and administration for the Sun Fire V445 server over a serial line or an Ethernet network.

The ALOM system controller provides remote system administration for geographically distributed or physically inaccessible systems. These features include powering on/off the system and enabling diagnostics. The firmware installed on the ALOM system controller card enables you to monitor the system, without having to install any supporting software.

The ALOM system controller card runs independently of the host system, and operates off of standby power from the system’s power supplies. This allows the ALOM system controller to serve as a lights out management tool that continues to function even when the server operating system goes offline or when the server is powered off.

Hardware Disk Mirroring and Striping

The SAS controller supports hardware disk mirroring and striping (RAID 0,1) capabilities for all internal disk drives, resulting in improved disk drive performance, data integrity, data availability, and fault recovery.

Predictive Self-Healing

Sun Fire V445 servers with Solaris 10 or later feature the latest fault management technologies. With Solaris 10, Sun introduces a new architecture for building and deploying systems and services capable of predictive self-healing. Self-healing technology enables Sun systems to accurately predict component failures and mitigate many serious problems before they actually occur. This technology is incorporated into both the hardware and software of the Sun Fire V445 server.

At the heart of the Predictive Self-Healing capabilities is the Solaris™ Fault Manager, a service that receives data relating to hardware and software errors, and automatically and silently diagnoses the underlying problem. Once a problem is diagnosed, a set of agents automatically responds by logging the event and, if necessary, taking the faulty component offline. By automatically diagnosing problems, business-critical applications and essential system services can continue uninterrupted in the event of software failures, or major hardware component failures.

New Features

The Sun Fire V445 server provides faster computing in a denser, more power-efficient package. The following key new features are included:

• UltraSPARC IIIi CPU

  The UltraSPARC IIIi CPU provides a faster JBus system interface bus that considerably enhances system performance.

• Higher I/O Performance With Fire ASIC, PCIe, and PCI-X

  The Sun Fire V445 server provides higher I/O performance with PCIe cards integrated with the latest Fire chip (NorthBridge). This integration allows higher bandwidth and lower latency datapaths between the I/O subsystem and the CPUs. The server supports two full height or low profile/full depth 16 lane (wired 8 lane) PCIe cards and two full height or low profile/half depth 8 lane PCIe cards. The system also supports four PCI-X slots that operate at up to 133 MHz, are 64-bit capable, and support legacy PCI cards.

  The Fire ASIC is a high-performance JBus to PCIe host bridge. On the host bus side, Fire supports a coherent, split transaction, 128-bit JBus interface. On the I/O side, Fire supports two 8 lane serial PCIe interconnects.

• SAS Disk Subsystem

  Compact 2.5-inch disk drives provide faster, denser, more flexible, and more robust storage. Hardware RAID 0/1 is supported across all eight disks.

• ALOM Control of System Settings

  The Sun Fire V445 server provides robust remote access to system functions and the system controller. The physical system control keyswitch has been removed and the switch settings (power on/off, diagnostic mode) are now emulated with ALOM and software commands.

Other new features include the following:


• Four hot-swap power supplies enable fully redundant AC/DC capabilities (N+N)

• Fan trays are redundant and hot-swappable (N+1)

• Increased data integrity and availability for all SAS disk drives using HW RAID (0+1) controller

• Persistent storage of firmware initialization and probing

• Persistent storage of error state on error reset events

• Persistent storage of diagnostic output

• Persistent storage of configuration change events

• Automated diagnosis of CPU, memory, and I/O fault events during runtime (Solaris 10 and subsequent compatible versions of Solaris OS)

• Dynamic FRUID support of environmental events

• Software readable chassis serial number for asset management


Locating Front Panel Features

The illustration below shows the system features that you can access from the front panel.

FIGURE 1-1 Front Panel Features
Callouts: status indicators/control panel, SAS disk drives (8), removable media drive, USB ports

For information about front panel controls and indicators, see “Front Panel Indicators” on page 10.

The system is configured with up to eight disk drives, which are accessible from the front of the system.

Front Panel Indicators

Several front panel indicators provide general system status, alert you to system problems, and help you to determine the location of system faults.

During system startup, the indicators are toggled on and off to verify that each one is working correctly. Indicators located on the front panel work in conjunction with specific fault indicators. For example, a fault in the power supply subsystem illuminates the power supply Service Required indicator on the affected power supply, as well as the system Service Required indicator. Since all front panel status indicators are powered by the system’s standby power source, fault indicators remain lit for any fault condition that results in a system shutdown.

At the top left of the system as you look at its front are six system status indicators. The Power/OK indicator and the Service Required indicator provide a snapshot of the overall system status. The Locator indicator helps you to quickly locate a specific system even though it may be one of numerous systems in a room. The Locator indicator/button is at the far left in the cluster, and is lit remotely by the system administrator, or toggled on and off locally by pressing the button.

FIGURE 1-2 Front Panel System Status Indicators

Each system status indicator has a corresponding indicator on the back panel.

Listed from left to right, the system status indicators operate as described in the following table.

TABLE 1-3 lists additional fault indicators, and describes the type of service required.

For hard disk drive indicator descriptions, see TABLE 4-4. For fan tray indicator descriptions located on the top panel of the server, see TABLE 4-6.

TABLE 1-2 System Status Indicators

Name               Description

Locator            This white indicator is lit by a Solaris command, Sun Management Center command, or ALOM commands to help you locate the system. There is also a Locator indicator button that allows you to reset the Locator indicator. For information on controlling the Locator indicator, see “Controlling the Locator Indicator” on page 108.

Service Required   This amber indicator lights steadily when a system fault is detected. For example, the system Service Required indicator lights when a fault occurs in a power supply or disk drive.
                   In addition to the system Service Required indicator, other fault indicators might also be lit, depending on the nature of the fault. If the system Service Required indicator is lit, check the status of other fault indicators on the front panel and other FRUs to determine the nature of the fault. See Chapter 8 and Chapter 9.

System Activity    This green indicator blinks slowly then quickly during startup. The Power/OK indicator lights continuously when the system power is on and the Solaris Operating System is loaded and running.

TABLE 1-3 System Diagnostic Indicators

Name                  Description

Fan Tray Fault        This indicator indicates a fault in a fan tray. Additional indicators on the top panel indicate which fan tray requires service.

Power Supply Fault    This indicator indicates a fault in a power supply. Look at the individual power supply status indicators (on the back panel) to determine which power supply requires service.

CPU Overtemperature   This indicator indicates that a CPU has detected an overtemperature condition. Look for any fan failures, as well as a local overtemperature condition around the server.

Power Button


The system Power button is recessed to prevent accidentally turning the system on or off. If the operating system is running, pressing and releasing the Power button initiates a graceful software system shutdown. Pressing and holding down the Power button for four seconds causes an immediate hardware shutdown.

Caution – Whenever possible, use the graceful shutdown method. Forcing an immediate hardware shutdown can cause disk drive corruption and loss of data.
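The two Power button behaviors described above can be summarized as a simple decision. The Python sketch below is a hypothetical model for illustration only; the function name, the return strings, and the assumption that a hold of exactly four seconds or longer counts as a forced shutdown are not taken from any Sun interface.

```python
# Hypothetical model of the Power button behavior described in this section.
HOLD_SECONDS_FOR_HARD_SHUTDOWN = 4.0  # assumption: "four seconds" means >= 4.0


def power_button_action(held_seconds, os_running):
    """Return the action this section describes for a given button press."""
    if held_seconds >= HOLD_SECONDS_FOR_HARD_SHUTDOWN:
        return "immediate hardware shutdown"
    if os_running:
        return "graceful software shutdown"
    # A short press while the operating system is not running is not
    # described in this section, so the sketch does not guess.
    return "not described in this section"
```

A short press with the operating system running maps to the graceful shutdown, which is the method the caution above recommends.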

FIGURE 1-3 Power Button Location

USB Ports

The Sun Fire V445 server has four Universal Serial Bus (USB) ports: two on the front panel, and two on the back panel. All four USB ports comply with the USB 2.0 specification.


FIGURE 1-4 USB Ports Location

For more information about the USB ports, see “About the USB Ports” on page 95.


SAS Disk Drives


The system has up to eight hot-pluggable internal SAS disk drives.

FIGURE 1-5 Hard Disk Drives Location

For more information about how to configure internal disk drives, see “About the Internal Disk Drives” on page 87.

Removable Media Drive

The Sun Fire V445 server has a DVD-ROM drive in a removable media bay. This drive also has DVD-RW and CD-RW capabilities.


FIGURE 1-6 Removable Media Drive Location

For more information about servicing the DVD-ROM drive, see the Sun Fire V445 Server Service Manual.


Locating Back Panel Features

The illustration below shows the system features that are accessible from the back panel.

FIGURE 1-7 Back Panel Features
Callouts: power supplies, PCI-X card slots, PCIe card slots, external ports, system status indicators


Back Panel Indicators

The back panel system status indicators consist of the Locator indicator, Service Required indicator, and the System Activity indicator. These indicators are located in the bottom center of the back panel, and operate as described in TABLE 1-2.

For power supply indicator descriptions, see TABLE 4-5. For fan tray indicator descriptions located on the top panel of the server, see TABLE 4-6.

Power Supplies

There are four AC/DC redundant (N+N) and hot-swappable power supplies, where two power supplies are sufficient to power a fully configured system.

For more information about power supplies, see the following sections in the Sun Fire V445 Server Service Manual:

• “About Hot-Pluggable Components”
• “Removing a Power Supply”
• “Installing a Power Supply”
• “Reference for Power Supply Status LEDs”

For more information about power supplies, see “About the Power Supplies” on page 89.

PCI Slots

The Sun Fire V445 server has four PCIe slots and four PCI-X slots. (One of the PCI-X slots is occupied by the LSI Logic 1068X SAS controller.) These are labeled on the back panel.


FIGURE 1-8 PCI Slot Locations

For more information about how to install a PCI card, see the Sun Fire V445 Server Service Manual.

For more information about PCI cards, see “About the PCI Cards and Buses” on page 81.

Slot labels shown in FIGURE 1-8: PCI0, PCI1, PCI2, PCI3, PCI4, PCI5, PCI6, PCI7

System Controller Ports


There are two system controller ports. Both use an RJ-45 connector.

FIGURE 1-9 Network and Serial Management Port Locations
Callouts: network management port (NET MGT), serial management port (SER MGT)

Network Management Port

This port provides direct network access to the ALOM system controller, when configured, and can access the ALOM prompt and system console output.

Note – The system controller is accessed through the serial management port by default. You must reconfigure the system controller to use the network management port. See “Activating the Network Management Port” on page 42.

The network management port has a Link indicator that operates as described in TABLE 1-4.

TABLE 1-4 Network Management Port Indicator

Name    Description
Link    This green indicator is lit when an Ethernet connection is present.

Serial Management Port

The serial management port provides the default connection to the system controller and can access the ALOM prompt and system console output. You can connect to the serial management port using a VT100 terminal, a tip connection, or a terminal server.

System I/O Ports

FIGURE 1-10 System I/O Port Locations
Callouts: Gigabit Ethernet ports, USB ports (USB0, USB1), DB-9 serial port (TTYB)

USB Ports

There are two USB ports on the back panel. These comply with the USB 2.0 specification.

For more information about the USB ports, see “About the USB Ports” on page 95.

Gigabit Ethernet Ports

The Sun Fire V445 server has four Gigabit Ethernet ports.

FIGURE 1-11 Gigabit Ethernet Port Locations
Callouts: NET0, NET1, NET2, NET3

Each Gigabit Ethernet port has a corresponding status indicator, described in TABLE 1-5.

DB-9 Serial Port

There is a POSIX compliant DB-9 serial port labeled TTYB. In addition, you may configure the RJ-45 serial management port as a conventional serial port. See “About the Serial Ports” on page 96.

TABLE 1-5 Ethernet Indicators

Color     Description
(None)    No connection present.
Green     This indicates a 10/100 Megabit Ethernet connection. The indicator blinks to indicate network activity.
Amber     This indicates a Gigabit Ethernet connection. The indicator blinks to indicate network activity.
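The indicator meanings in TABLE 1-5 amount to a small lookup table. The Python sketch below encodes them as a hypothetical helper, for example for annotating notes taken during a site inspection; the function name and the description strings are illustrative assumptions, not output from any Sun tool.

```python
# Hypothetical lookup that mirrors TABLE 1-5; not part of any Sun software.
LINK_BY_COLOR = {
    None: "no connection present",
    "green": "10/100 Megabit Ethernet connection",
    "amber": "Gigabit Ethernet connection",
}


def describe_indicator(color, blinking=False):
    """Describe an Ethernet port indicator given its color and blink state."""
    if color not in LINK_BY_COLOR:
        raise ValueError("unknown indicator color: %r" % (color,))
    description = LINK_BY_COLOR[color]
    # A blinking indicator signals network activity on an established link.
    if color is not None and blinking:
        description += ", network activity"
    return description
```

Calling `describe_indicator("amber", blinking=True)` describes a Gigabit link that is carrying traffic, while `describe_indicator(None)` corresponds to an unlit indicator.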


Reliability, Availability, and Serviceability (RAS) Features

The Sun Fire V445 server provides the following RAS features:

• Hot-pluggable disk drives

• Redundant, hot-swappable power supplies, fan trays, and USB components

• Sun ALOM system controller with SSH connections for all remote monitoring and control

• Environmental monitoring

• Automatic system restoration (ASR) capabilities for PCI cards and memory DIMMs

• Hardware watchdog mechanism and externally initiated reset (XIR) capability

• Internal hardware disk mirroring (RAID 0/1)

• Support for disk and network multipathing with automatic failover

• Error correction and parity checking for improved data integrity

• Easy access to all internal replaceable components

• Full in-rack serviceability for all components

• Persistent storage for all configuration change events

• Persistent storage for all system console output

See Chapter 5 for information on how to configure these features.

Sun Cluster Software

Sun Cluster software enables you to connect up to eight Sun servers in a cluster configuration. A cluster is a group of nodes that are interconnected to work as a single, highly available and scalable system. A node is a single instance of Solaris software. The software can be running on a standalone server or on a domain within a standalone server. With Sun Cluster software, you can add or remove nodes while online, and mix and match servers to meet your specific needs.

Sun Cluster software delivers high availability through automatic fault detection and recovery, and scalability, ensuring that mission-critical applications and services are always available when needed.

With Sun Cluster software installed, other nodes in the cluster will automatically take over and assume the workload when a node goes down. The software delivers predictability and fast recovery capabilities through features such as local application restart, individual application failover, and local network adapter failover. Sun Cluster software significantly reduces downtime and increases productivity by helping to ensure continuous service to all users.

The software lets you run both standard and parallel applications on the same cluster. It supports the dynamic addition or removal of nodes, and enables Sun servers and storage products to be clustered together in a variety of configurations. Existing resources are used more efficiently, resulting in additional cost savings.

Sun Cluster software allows nodes to be separated by up to 10 kilometers. This way, in the event of a disaster in one location, all mission-critical data and services remain available from the other unaffected locations.

For more information, see the documentation supplied with the Sun Cluster software.

Sun Management Center Software

Sun Management Center software is an open, extensible system monitoring and management tool. The software is written in Java and uses Simple Network Management Protocol (SNMP) to provide enterprise-wide monitoring of Sun servers and workstations, including their subsystems, components, and peripheral devices.

For more information, see “About Sun Management Center” on page 218.


CHAPTER 2


Configuring the System Console

This chap ter explains wh at th e system console is, describes the d ifferent w ays of configu ring it on a Sun Fire V445 server, and helps you un d erstan d its relation to thesystem controller.

Tasks covered in this chapter include:

• “Entering the ok Prompt” on page 40
• “Using the Serial Management Port” on page 41
• “Activating the Network Management Port” on page 42
• “Accessing the System Console With a Terminal Server” on page 44
• “Accessing the System Console With a Tip Connection” on page 47
• “Modifying the /etc/remote File” on page 51
• “Accessing the System Console With an Alphanumeric Terminal” on page 53
• “To Verify Serial Port Settings on TTYB” on page 55
• “Accessing the System Console With a Local Graphics Monitor” on page 56

Other information in this chapter includes:

• “About Communicating With the System” on page 26
• “About the sc> Prompt” on page 32
• “About the ok Prompt” on page 35
• “About Switching Between the ALOM System Controller and the System Console” on page 38
• “Reference for System Console OpenBoot Configuration Variable Settings” on page 59

About Communicating With the System

To install your system software or to diagnose problems, you need some way to interact at a low level with the system. The system console is Sun’s facility for doing this. You use the system console to view messages and issue commands. There can be only one system console per computer.

The serial management port (SERIAL MGT) is the default port for accessing the system console upon initial system installation. After installation, you can configure the system console to accept input from and send output to different devices. See TABLE 2-1 for a summary.

TABLE 2-1 Ways of Communicating With the System

Devices Available for Accessing the System Console | During Installation* | After Installation

A terminal server attached to the serial management port (SERIAL MGT) or TTYB. See:
• “Using the Serial Management Port” on page 41
• “To Access the System Console With a Terminal Server Through the Serial Management Port” on page 44
• “To Verify Serial Port Settings on TTYB” on page 55
• “Reference for System Console OpenBoot Configuration Variable Settings” on page 59

An alphanumeric terminal or similar device attached to the serial management port (SERIAL MGT) or TTYB. See:
• “Using the Serial Management Port” on page 41
• “Accessing the System Console With an Alphanumeric Terminal” on page 53
• “To Verify Serial Port Settings on TTYB” on page 55
• “Reference for System Console OpenBoot Configuration Variable Settings” on page 59

A tip line attached to the serial management port (SERIAL MGT) or TTYB. See:
• “Using the Serial Management Port” on page 41
• “Accessing the System Console With a Tip Connection” on page 47
• “Modifying the /etc/remote File” on page 51
• “To Verify Serial Port Settings on TTYB” on page 55
• “Reference for System Console OpenBoot Configuration Variable Settings” on page 59

An Ethernet line connected to the network management port (NET MGT). See:
• “Activating the Network Management Port” on page 42

A local graphics monitor (frame buffer card, graphics monitor, mouse, and so forth). See:
• “To Access the System Console With a Local Graphics Monitor” on page 56
• “Reference for System Console OpenBoot Configuration Variable Settings” on page 59

* After initial system installation, you can redirect the system console to take its input from and send its output to the serial port TTYB.

About Using the System Console

The system console device can be either a standard alphanumeric terminal, terminal server, Tip connection from another Sun system, or a local graphics monitor. The default connection is through the serial management port (labeled SERIAL MGT) on the chassis back panel. You can also connect an alphanumeric terminal to the serial (DB-9) connector (as TTYB) on the system back panel. A local graphics monitor requires installation of a PCI graphics card, monitor, USB keyboard, and mouse. You can also access the system console through a network connection with the network management port.

The system console displays status and error messages generated by firmware-based tests during system startup. After those tests have been run, you can enter special commands that affect the firmware and alter system behavior. For more information about tests that run during the boot process, see Chapter 8 and Chapter 9.

Once the OS is booted, the system console displays UNIX system messages and accepts UNIX commands.

To use the system console, you need some means of getting data in to and out of the system, which means attaching some kind of hardware to the system. Initially, you might have to configure that hardware, and load and configure appropriate software as well.

You also must ensure that the system console is directed to the appropriate port on the Sun Fire V445 server’s back panel – generally, the one to which your hardware console device is attached. (See FIGURE 2-1.) You do this by setting the input-device and output-device OpenBoot configuration variables.
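As a rough illustration of how the two variables pair up, the following shell sketch prints the input-device and output-device values that FIGURE 2-1 lists for each console path. The function name console_vars and its keywords (mgt, ttyb, graphics) are hypothetical and exist only for this example. The pairing of ttya with the management ports is an assumption based on the Solaris OS seeing the serial management port as TTYA.

```shell
#!/bin/sh
# Hypothetical helper: print the OpenBoot console variable settings
# for a given console path. The values come from FIGURE 2-1; the
# keyword names (mgt, ttyb, graphics) are invented for this sketch.
console_vars() {
    case "$1" in
        mgt)      echo "input-device=ttya output-device=ttya" ;;
        ttyb)     echo "input-device=ttyb output-device=ttyb" ;;
        graphics) echo "input-device=keyboard output-device=screen" ;;
        *)        echo "unknown console path: $1" >&2; return 1 ;;
    esac
}

# Show the settings for a console redirected to the TTYB serial port.
console_vars ttyb
```

The sketch only prints the settings. Applying them is done with the procedures referenced in “Reference for System Console OpenBoot Configuration Variable Settings” on page 59.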

FIGURE 2-1 Directing the System Console to Different Ports and Different Devices

The following subsections provide background information and references to instructions appropriate for the particular device you choose to access the system console. For instructions on attaching and configuring a device to access the system console, see:

• “Using the Serial Management Port” on page 41
• “Activating the Network Management Port” on page 42
• “Accessing the System Console With a Terminal Server” on page 44
• “Accessing the System Console With a Tip Connection” on page 47

[FIGURE 2-1 shows the Sun Fire V445 server ports (SERIAL MGT, NET MGT, ttyb, and graphics card), the console devices that can be attached to them (terminal server, alphanumeric terminal, Tip line, and graphics monitor), and the OpenBoot configuration variable settings that direct the system console:
input-device=ttya, output-device=ttya
input-device=ttyb, output-device=ttyb
input-device=keyboard, output-device=screen]

Default System Console Connection Through the Serial Management and Network Management Ports

On Sun Fire V445 servers, the system console comes preconfigured to allow input and output only by means of hardware devices connected to the serial or network management ports. However, because the network management port is not available until network parameters are assigned, your first connection must be to the serial management port. The network can be configured once the system is connected to power and ALOM completes its self test.

Typically, you connect one of the following hardware devices to the serial management port:

• Terminal server
• Alphanumeric terminal or similar device
• A Tip line connected to another Sun computer

This provides for secure access at the installation site.

FIGURE 2-2 Serial Management Port (Default Console Connection)

[Figure callouts: Network management port (NET MGT), Serial management port (SER MGT)]

Using a Tip line might be preferable to connecting an alphanumeric terminal, since the tip command allows you to use windowing and OS features on the machine being used to connect to the Sun Fire V445 server.

Although the Solaris OS sees the serial management port as TTYA, the serial management port is not a general-purpose serial port. If you want to use a general-purpose serial port with your server – to connect a serial printer, for instance – use the regular 9-pin serial port on the back panel of the Sun Fire V445. The Solaris OS sees this port as TTYB.

For instructions on accessing the system console through a terminal server, see “Accessing the System Console With a Terminal Server” on page 44.

For instructions on accessing the system console through an alphanumeric terminal, see “Accessing the System Console With an Alphanumeric Terminal” on page 53.

For instructions on accessing the system console with a Tip line, see “To Access the System Console With a Tip Connection Through the Serial Management Port” on page 48.

Access Through the Network Management Port

Once you have configured the network management port, you can connect an Ethernet-capable device to the system console through your network. This connection provides for remote monitoring and control. In addition, up to four simultaneous connections to the system controller sc> prompt are available through the network management port. For more information, see “Activating the Network Management Port” on page 42.

For more information about the system console and the ALOM system controller, see:

• “About the sc> Prompt” on page 32
• “About the ok Prompt” on page 35

ALOM

ALOM software is preinstalled on the server’s system controller (SC) and is enabled at the first power on. ALOM provides remote powering on and off, diagnostics capabilities, environmental control, and monitoring operations for the server. The primary functions of ALOM include the following:

• Operation of system indicators
• Fan speed monitoring and adjustment
• Temperature monitoring and alerts
• Power supply health monitoring and control
• USB overcurrent monitoring and alerts
• Hot-plug configuration change monitoring and alerts
• Dynamic FRU ID data transactions

For more information about ALOM software, see “About the ALOM System Controller Card” on page 77.

Alternative System Console Configuration

In the default configuration, system controller alerts and system console output appear interspersed in the same window. After initial system installation, you can redirect the system console to take its input from and send its output to the serial port TTYB, or to a graphics card’s port.

A serial port and the PCI slots are located on the rear panel. Two USB ports are located on the front panel.

The chief advantage of redirecting the system console to another port is that it allows you to divide system controller alerts and system console output into two separate windows.

However, there are some serious disadvantages to alternative console configuration:

• POST output can only be directed to the serial management and network management ports. It cannot be directed to TTYB or to a graphics card’s port.
• If you have directed the system console to TTYB, you cannot use this port for any other serial device.
• In a default configuration, the serial management and network management ports enable you to open up to four additional windows by which you can view, but not affect, system console activity. You cannot open these windows if the system console is redirected to TTYB or to a graphics card’s port.
• In a default configuration, the serial management and network management ports enable you to switch between viewing system console and system controller output on the same device by typing a simple escape sequence or command. The escape sequence and commands do not work if the system console is redirected to TTYB or to a graphics card’s port.
• The system controller keeps a log of console messages, but some messages are not logged if the system console is redirected to TTYB or to a graphics card’s port. The omitted information could be important if you need to contact Sun customer service with a problem.

For all the preceding reasons, the best practice is to leave the system console in its default configuration.

You change the system console configuration by setting OpenBoot configuration variables. See “Reference for System Console OpenBoot Configuration Variable Settings” on page 59.

You can also set OpenBoot configuration variables using the ALOM system controller. For details, see the Sun Advanced Lights Out Manager (ALOM) Online Help.

Accessing the System Console Through a Graphics Monitor

The Sun Fire V445 server is shipped without a mouse, keyboard, monitor, or frame buffer for the display of bitmapped graphics. To install a graphics monitor on the server, you must install a frame buffer card into a PCI slot, and attach a monitor, mouse, and keyboard to the appropriate back panel ports.

After starting the system, you might need to install the correct software driver for the PCI card you have installed. For detailed hardware instructions, see “To Access the System Console With a Local Graphics Monitor” on page 56.

Note – Power-on self-test (POST) diagnostics cannot display status and error messages to a local graphics monitor.

About the sc> Prompt

The ALOM system controller runs independently of the Sun Fire V445 server and regardless of system power state. When you connect a Sun Fire V445 server to AC power, the ALOM system controller immediately starts up, and begins monitoring the system.

Note – To view ALOM system controller boot messages, you must connect an alphanumeric terminal to the serial management port before connecting the AC power cords to the Sun Fire V445 server.

You can log in to the ALOM system controller at any time, regardless of system power state, as long as AC power is connected to the system and you have a way of interacting with the system. You can also access the ALOM system controller prompt (sc>) from the ok prompt or from the Solaris prompt, provided the system console is configured to be accessible through the serial management and network management ports. For more information, see:

• “Entering the ok Prompt” on page 40
• “About Switching Between the ALOM System Controller and the System Console” on page 38

The sc> prompt indicates that you are interacting with the ALOM system controller directly. It is the first prompt you see when you log in to the system through the serial management port or network management port, regardless of system power state.

Note – When you access the ALOM system controller for the first time, it forces you to create a user name and password for subsequent access. After this initial configuration, you will be prompted to enter a user name and password every time you access the ALOM system controller.

Access Through Multiple Controller Sessions

Up to five ALOM system controller sessions can be active concurrently, one session through the serial management port and up to four sessions through the network management port.

Users of each of these sessions can issue commands at the sc> prompt, but only one user session can have write access to the system console at any time. The other sessions accessing the system console will have read-only capability.

For more information, see:

• “Using the Serial Management Port” on page 41
• “Activating the Network Management Port” on page 42

Any additional ALOM system controller sessions afford passive views of system console activity, until the active user of the system console logs out. However, the console -f command, if you enable it, allows users to seize access to the system console from one another. For more information, see the Sun Advanced Lights Out Manager (ALOM) Online Help.

Ways of Reaching the sc> Prompt

There are several ways to get to the sc> prompt. These are:

• If the system console is directed to the serial management and network management ports, you can type the ALOM system controller escape sequence (#.).

Note – #. (pound period) is the default setting for the escape sequence to enter ALOM. It is a configurable variable.

• You can log in directly to the ALOM system controller from a device connected to the serial management port. See “Using the Serial Management Port” on page 41.
• You can log in directly to the ALOM system controller using a connection through the network management port. See “Activating the Network Management Port” on page 42.

About the ok Prompt

A Sun Fire V445 server with the Solaris OS installed is capable of operating at different run levels. A synopsis of run levels follows. For a full description, see the Solaris system administration documentation.

Most of the time, you operate a Sun Fire V445 server at run level 2 or run level 3, which are multiuser states with access to full system and network resources. Occasionally, you might operate the system at run level 1, which is a single-user administrative state. However, the lowest operational state is run level 0. At this state, it is safe to turn off power to the system.

When a Sun Fire V445 server is at run level 0, the ok prompt appears. This prompt indicates that the OpenBoot firmware is in control of the system.

There are a number of scenarios in which OpenBoot firmware control can happen.

• By default, the system powers up to OpenBoot firmware control before the OS is installed.
• The system boots to the ok prompt when the auto-boot? OpenBoot configuration variable is set to false.
• The system transitions to run level 0 in an orderly way when the OS is halted.
• The system reverts to OpenBoot firmware control when the OS crashes.
• When a serious hardware problem develops while the system is running, the OS transitions smoothly to run level 0.
• You deliberately place the server under firmware control in order to execute firmware-based commands or to run diagnostic tests.

It is the last of these scenarios that most often concerns you as an administrator, since there will be times when you need to reach the ok prompt. The several ways to do this are outlined in “Entering the ok Prompt” on page 35. For detailed instructions, see “Entering the ok Prompt” on page 40.

Entering the ok Prompt

There are several ways to enter the ok prompt, depending on the state of the system and the means by which you are accessing the system console. In order of desirability, these are:

• Graceful shutdown
• ALOM system controller break or console command
• L1-A (Stop-A) keys or Break key
• Externally initiated reset (XIR)
• Manual system reset

A description of each method follows. For instructions, see “Entering the ok Prompt” on page 40.

Graceful Shutdown

The preferred method of reaching the ok prompt is to shut down the OS by issuing an appropriate command (for example, the shutdown, init, or uadmin command) as described in Solaris system administration documentation. You can also use the system Power button to initiate a graceful system shutdown.

Gracefully shutting down the system prevents data loss, enables you to warn users beforehand, and causes minimal disruption. You can usually perform a graceful shutdown, provided the Solaris OS is running and the hardware has not experienced serious failure.

You can also perform a graceful system shutdown from the ALOM system controller command prompt.

For more information, see:

• “Powering Off the Server Locally” on page 66
• “Powering Off the System Remotely” on page 64

ALOM System Controller break or console Command

Typing break from the sc> prompt forces a running Sun Fire V445 server to move to OpenBoot firmware control. If the OS is already halted, you can use the console command instead of break to reach the ok prompt.

If you issue a break at the SC, you remain at the sc> prompt. To use the OpenBoot prompt, enter the console command. For example:

TABLE 2-2
hostname> #. [characters are not echoed to the screen]
sc> break -y [break on its own will generate a confirmation prompt]
sc> console
ok

After forcing the system into OpenBoot firmware control, be aware that issuing certain OpenBoot commands (like probe-scsi, probe-scsi-all, or probe-ide) might hang the system.

L1-A (Stop-A) Keys or Break Key

When it is impossible or impractical to shut down the system gracefully, you can get to the ok prompt by typing the L1-A (Stop-A) key sequence from a Sun keyboard, or, if you have an alphanumeric terminal attached to the Sun Fire V445 server, by pressing the Break key.

After forcing the system into OpenBoot firmware control, be aware that issuing certain OpenBoot commands (like probe-scsi, probe-scsi-all, or probe-ide) might hang the system.

Note – These methods of reaching the ok prompt will only work if the system console has been redirected to the appropriate port. For details, see “Reference for System Console OpenBoot Configuration Variable Settings” on page 59.

Externally Initiated Reset (XIR)

Use the ALOM system controller reset -x command to execute an externally initiated reset (XIR). Forcing an XIR might be effective in breaking the deadlock that is hanging up the system. However, an XIR also precludes the orderly shutdown of applications, and so it is not the preferred method of reaching the ok prompt, unless you are troubleshooting these types of system hangs. Generating an XIR has the advantage of allowing you to issue the sync command to produce a dump file of the current system state for diagnostic purposes.

For more information, see:

• Chapter 8 and Chapter 9
• Sun Advanced Lights Out Manager (ALOM) Online Help

Caution – Because an XIR precludes an orderly shutdown of applications, it should only be attempted if previously described methods do not work.

Manual System Reset

Use the ALOM system controller reset command, or poweron and poweroff commands, to reset the server. Reaching the ok prompt by performing a manual system reset or by power-cycling the system should be the method of last resort. Doing this results in the loss of all system coherence and state information. A manual system reset could corrupt the server’s file systems, although the fsck command usually restores them.

Caution – Forcing a manual system reset results in loss of system state data, and should be attempted only as a last resort. After a manual system reset, all state information is lost, which inhibits troubleshooting the cause of the problem until the problem reoccurs.

Caution – When you access the ok prompt from a functioning Sun Fire V445 server, you are suspending the Solaris OS and placing the system under firmware control. Any processes that were running under the OS are also suspended, and the state of such processes might not be recoverable.

The commands you run from the ok prompt have the potential to affect the state of the system. This means that it is not always possible to resume execution of the OS from the point at which it was suspended. The diagnostic tests you run from the ok prompt will affect the state of the system. This means that it is not possible to resume execution of the OS from the point at which it was suspended.

Although the go command will resume execution in most circumstances, in general, each time you force the system down to the ok prompt, you should expect to have to reboot the system to get back to the OS.

As a rule, before suspending the OS, you should back up files, warn users of the impending shutdown, and halt the system in an orderly manner. However, it is not always possible to take such precautions, especially if the system is malfunctioning.

For more information about the OpenBoot firmware, see the OpenBoot 4.x Command Reference Manual. An online version of the manual is included with the OpenBoot Collection AnswerBook that ships with Solaris software.

About Switching Between the ALOM System Controller and the System Console

The Sun Fire V445 server features two management ports, labeled SERIAL MGT and NET MGT, located on the server’s back panel. If the system console is directed to use the serial management and network management ports (its default configuration), these ports provide access to both the system console and the ALOM system controller, each on separate channels (FIGURE 2-3).

FIGURE 2-3 Separate System Console and System Controller Channels

[Figure: the NET MGT or SERIAL MGT port carries two channels, the system console (ok and # prompts) and the ALOM system controller (sc> prompt). The console command and the #. escape sequence switch between them.]

If the system console is configured to be accessible from the serial management and network management ports, when you connect through one of these ports you can access either the ALOM command-line interface or the system console. You can switch between the ALOM system controller and the system console at any time, but you cannot access both at the same time from a single terminal or shell tool.

The prompt displayed on the terminal or shell tool tells you which channel you are accessing:

• The # or % prompt indicates that you are at the system console and that the Solaris OS is running.
• The ok prompt indicates that you are at the system console and that the server is running under OpenBoot firmware control.
• The sc> prompt indicates that you are at the ALOM system controller.
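The three cases above can be sketched as a small shell function that maps a prompt string to the channel it indicates. The function name which_channel is hypothetical, and the sketch assumes the bare prompt characters. A real Solaris prompt may carry a host name or other text in front of the # or % character, which this example does not handle.

```shell
#!/bin/sh
# Hypothetical helper: report which channel a prompt string belongs to,
# following the three cases listed above.
which_channel() {
    case "$1" in
        'sc>')   echo "ALOM system controller" ;;
        'ok')    echo "system console (OpenBoot firmware control)" ;;
        '#'|'%') echo "system console (Solaris OS running)" ;;
        *)       echo "unknown" ;;
    esac
}

# Classify the system controller prompt.
which_channel 'sc>'
```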

Note – If no text or prompt appears, it might be the case that no console messages were recently generated by the system. If this happens, pressing the terminal’s Enter or Return key should produce a prompt.

To reach the system console from the ALOM system controller, type the console command at the sc> prompt. To reach the ALOM system controller from the system console, type the system controller escape sequence, which by default is #. (pound period).

For more information, see:

• “About Communicating With the System” on page 26
• “About the sc> Prompt” on page 32
• “About the ok Prompt” on page 35
• “Using the Serial Management Port” on page 41
• Sun Advanced Lights Out Manager (ALOM) Online Help

Entering the ok Prompt

This procedure provides several ways of reaching the ok prompt. The methods are not equally desirable. For details about when to use each method, see “About the ok Prompt” on page 35.

Caution – Dropping the Sun Fire V445 server to the ok prompt suspends all application and OS software. After you issue firmware commands and run firmware-based tests from the ok prompt, the system might not be able to resume where it left off.

To Enter the ok Prompt

1. If at all possible, back up system data before starting this procedure.

For information about the appropriate backup and shutdown procedures, refer to Solaris system administration documentation.

2. Exit or stop all applications and warn users of the impending loss of service.

3. Decide which method you need to use to reach the ok prompt.

See “About the ok Prompt” on page 35 for details.

4. Refer to TABLE 2-3 for instructions.

TABLE 2-3 Ways of Accessing the ok Prompt

Access Method | What to Do

Graceful shutdown of the Solaris OS
• From a shell or command tool window, issue an appropriate command (for example, the shutdown or init command) as described in Solaris system administration documentation.

L1-A (Stop-A) keys or Break key
• From a Sun keyboard connected directly to the Sun Fire V445 server, press the Stop and A keys simultaneously.*
–or–
• From an alphanumeric terminal configured to access the system console, press the Break key.

ALOM system controller console or break command
• From the sc> prompt, type the break command. The console command also works, provided the OS software is not running and the server is already under OpenBoot firmware control.

Externally initiated reset (XIR)
• From the sc> prompt, type the reset -x command.

Manual system reset
• From the sc> prompt, type the reset command.

* Requires the OpenBoot configuration variable input-device=keyboard. For more information, see “Accessing the System Console With a Local Graphics Monitor” on page 56 and “Reference for System Console OpenBoot Configuration Variable Settings” on page 59.

Using the Serial Management Port

This procedure assumes that the system console is directed to use the serial management and network management ports (the default configuration).

When you are accessing the system console using a device connected to the serial management port, your first point of access is the ALOM system controller and its sc> prompt. After connecting to the ALOM system controller, you can switch to the system console itself.

For more information about the ALOM system controller card, see:

• “About the ALOM System Controller Card” on page 77
• Sun Advanced Lights Out Manager (ALOM) Online Help

Ensure that the serial port on your connecting device is set to the following parameters:

• 9600 baud
• 8 bits
• No parity
• 1 stop bit
• No handshaking
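On a UNIX-like connecting host, the serial parameters listed for the serial management port correspond roughly to the stty settings below. This is a sketch only: the function name serial_mgt_stty_args is hypothetical, the flags are the POSIX stty ones, and the flag that disables hardware handshaking differs between platforms, so it is left out here.

```shell
#!/bin/sh
# Hypothetical helper: print POSIX stty arguments that match the serial
# management port parameters.
#   9600          9600 baud
#   cs8           8 data bits
#   -parenb       no parity
#   -cstopb       1 stop bit
#   -ixon -ixoff  no software (XON/XOFF) handshaking
serial_mgt_stty_args() {
    echo "9600 cs8 -parenb -cstopb -ixon -ixoff"
}

# Print the arguments so they can be reviewed or passed to stty.
serial_mgt_stty_args
```

To apply the settings, the output would be passed to stty with the serial device of the connecting host as standard input. The device path depends on the host and is not specified in this guide.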

To Use the Serial Management Port

1. Establish an ALOM system controller session.

See Sun Advanced Lights Out Manager (ALOM) Online Help for instructions.

2. To connect to the system console, at the ALOM system controller command prompt, type:

sc> console

The console command switches you to the system console.

3. To switch back to the sc> prompt, type the #. escape sequence.

TABLE 2-4
ok #. [characters are not echoed to the screen]

For instructions on how to use the ALOM system controller, see:

• Sun Advanced Lights Out Manager (ALOM) Online Help

Activating the Network Management Port

You must assign an Internet Protocol (IP) address to the network management port before you can use it. If you are configuring the network management port for the first time, you must first connect to the ALOM system controller using the serial management port and assign an IP address to the network management port. You can either assign an IP address manually, or you can configure the port to obtain an IP address using the Dynamic Host Configuration Protocol (DHCP) from another server.

Data centers frequently devote a separate subnet to system management. If your data center has such a configuration, connect the network management port to this subnet.

Note – The network management port is a 10BASE-T port. The IP address assigned to the network management port is a unique IP address, separate from the main Sun Fire V445 server IP address, and is dedicated for use only with the ALOM system controller. For more information, see “About the ALOM System Controller Card” on page 77.

To Activate the Network Management Port

1. Connect an Ethernet cable to the network management port.

2. Log in to the ALOM system controller through the serial management port.

For more information about connecting to the serial management port, see “Using the Serial Management Port” on page 41.

3. Assign IP addresses by typing one of the following commands:

• If your network uses static IP addresses, type:

sc> setsc if_network true
sc> setsc netsc_ipaddr ip-address
sc> setsc netsc_ipnetmask ip-address
sc> setsc netsc_ipgateway ip-address

Note – The if_network command requires resetting the SC before the changes take effect. Reset the SC with the resetsc command after changing network parameters.

• If your network uses Dynamic Host Configuration Protocol (DHCP), type:

sc> setsc netsc_dhcp

4. To select the communications protocol (Telnet, SSH, or none), type:

sc> setsc if_connection none|ssh|telnet

Note – none is the default.

5. To verify the network settings, type:

sc> shownetwork

6. Log out of the ALOM system controller session.

To connect through the network management port, use the telnet command to the IP address you specified in Step 3 of the preceding procedure.
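The static-address steps in the preceding procedure can be sketched as a small shell helper that prints the sc> commands for review before you type them at the ALOM prompt. This is only an illustration: the function name alom_static_net_cmds and the example addresses are assumptions, not part of ALOM, and the helper does not connect to the system controller.

```shell
#!/bin/sh
# Hypothetical helper (not an ALOM tool): print the sc> commands from Step 3
# for a static IP configuration, followed by the resetsc mentioned in the note.

alom_static_net_cmds() {
    # $1 = IP address, $2 = netmask, $3 = gateway (all supplied by the caller)
    if [ "$#" -ne 3 ]; then
        echo "usage: alom_static_net_cmds ip-address netmask gateway" >&2
        return 1
    fi
    echo "setsc if_network true"
    echo "setsc netsc_ipaddr $1"
    echo "setsc netsc_ipnetmask $2"
    echo "setsc netsc_ipgateway $3"
    # The SC must be reset before the network changes take effect.
    echo "resetsc"
}

# Example with placeholder addresses; substitute the values for your subnet.
alom_static_net_cmds 192.0.2.10 255.255.255.0 192.0.2.1
```

Printing the commands rather than sending them keeps the sketch independent of how you reach the sc> prompt (serial management port or an existing network session).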

Accessing the System Console With a Terminal Server

The following procedure assumes that you are accessing the system console by connecting a terminal server to the serial management port (SERIAL MGT) of the Sun Fire V445 server.

To Access the System Console With a Terminal Server Through the Serial Management Port

1. Complete the physical connection from the serial management port to your terminal server.

The serial management port on the Sun Fire V445 server is a data terminal equipment (DTE) port. The pinouts for the serial management port correspond with the pinouts for the RJ-45 ports on the Serial Interface Breakout Cable supplied by Cisco for use with the Cisco AS2511-RJ terminal server. If you use a terminal server made by another manufacturer, check that the serial port pinouts of the Sun Fire V445 server match those of the terminal server you plan to use.

If the pinouts for the server serial ports correspond with the pinouts for the RJ-45 ports on the terminal server, you have two connection options:

• Connect a serial interface breakout cable directly to the Sun Fire V445 server. See “Using the Serial Management Port” on page 41.

• Connect a serial interface breakout cable to a patch panel and use the straight-through patch cable (supplied by Sun) to connect the patch panel to the server.

FIGURE 2-4 Patch Panel Connection Between a Terminal Server and a Sun Fire V445 Server (figure labels: terminal server, straight-through cable, patch panel with ports 1 through 15, patch cable to serial management port, Sun Fire V445 server)

If the pinouts for the serial management port do not correspond with the pinouts for the RJ-45 ports on the terminal server, you need to make a crossover cable that takes each pin on the Sun Fire V445 server serial management port to the corresponding pin in the terminal server’s serial port.

TABLE 2-9 shows the crossovers that the cable must perform.

TABLE 2-9 Pin Crossovers for Connecting to a Typical Terminal Server

Sun Fire V445 Serial Port (RJ-45 Connector) Pin    Terminal Server Serial Port Pin
Pin 1 (RTS)                                        Pin 1 (CTS)
Pin 2 (DTR)                                        Pin 2 (DSR)
Pin 3 (TXD)                                        Pin 3 (RXD)
Pin 4 (Signal Ground)                              Pin 4 (Signal Ground)
Pin 5 (Signal Ground)                              Pin 5 (Signal Ground)
Pin 6 (RXD)                                        Pin 6 (TXD)
Pin 7 (DSR / DCD)                                  Pin 7 (DTR)
Pin 8 (CTS)                                        Pin 8 (RTS)

2. Open a terminal session on the connecting device, and type:

% telnet IP-address-of-terminal-server port-number

For example, for a Sun Fire V445 server connected to port 10000 on a terminal server whose IP address is 192.20.30.10, you would type:

% telnet 192.20.30.10 10000

To Access the System Console With a Terminal Server Through the TTYB Port

1. Redirect the system console by changing OpenBoot configuration variables.

At the ok prompt, type:

ok setenv input-device ttyb
ok setenv output-device ttyb

Note – Redirecting the system console does not redirect POST output. You can only view POST messages from the serial and network management port devices.

Note – There are many other OpenBoot configuration variables. Although these variables do not affect which hardware device is used to access the system console, some of them affect which diagnostic tests the system runs and which messages the system displays at its console. See Chapter 8 and Chapter 9.

2. To cause the changes to take effect, power off the system. Type:

ok power-off

The system permanently stores the parameter changes and powers off.

Note – You can also power off the system using the front panel Power button.

3. Connect the null modem serial cable to the TTYB port on the Sun Fire V445 server.

If required, use the DB-9 or DB-25 cable adapter supplied with the server.

4. Power on the system.

See Chapter 3 for power-on procedures.

What Next

Continue with your installation or diagnostic test session as appropriate. When you are finished, end your session by typing the terminal server’s escape sequence and exit the window.

For more information about connecting to and using the ALOM system controller, see:

• Sun Advanced Lights Out Manager (ALOM) Online Help

If you have redirected the system console to TTYB and want to change the system console settings back to use the serial management and network management ports, see:

• “Reference for System Console OpenBoot Configuration Variable Settings” on page 59
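When building or checking a cable against TABLE 2-9, it can help to have the pin-to-signal mapping in a form you can query. The following is a minimal sketch; the function name terminal_server_signal is hypothetical, and the mapping simply restates the right-hand column of the table.

```shell
#!/bin/sh
# Sketch: look up the terminal server signal that each serial management
# port pin must reach, as listed in TABLE 2-9. Not a Sun-supplied tool.

terminal_server_signal() {
    case "$1" in
        1)   echo "CTS" ;;
        2)   echo "DSR" ;;
        3)   echo "RXD" ;;
        4|5) echo "Signal Ground" ;;
        6)   echo "TXD" ;;
        7)   echo "DTR" ;;
        8)   echo "RTS" ;;
        *)   echo "unknown pin: $1" >&2; return 1 ;;
    esac
}

# Print the full mapping for all eight pins.
for pin in 1 2 3 4 5 6 7 8; do
    echo "server pin $pin -> terminal server pin $pin ($(terminal_server_signal "$pin"))"
done
```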

Accessing the System Console With a Tip Connection

This procedure assumes that you are accessing the Sun Fire V445 server system console by connecting the serial port of another Sun system to the serial management port (SERIAL MGT) of the Sun Fire V445 server (FIGURE 2-7).

FIGURE 2-7 Tip Connection Between a Sun Fire V445 Server and Another Sun System (figure labels: serial port of another Sun system, Tip connection, serial management port)

To Access the System Console With a Tip Connection Through the Serial Management Port

1. Connect the RJ-45 serial cable and, if required, the DB-9 or DB-25 adapter provided.

The cable and adapter connect between another Sun system’s serial port (typically TTYB) and the serial management port on the back panel of the Sun Fire V445 server. Pinouts, part numbers, and other details about the serial cable and adapter are provided in the Sun Fire V445 Server Parts Installation and Removal Guide.

2. Ensure that the /etc/remote file on the Sun system contains an entry for hardwire.

Most releases of Solaris OS software shipped since 1992 contain an /etc/remote file with the appropriate hardwire entry. However, if the Sun system is running an older version of Solaris OS software, or if the /etc/remote file has been modified, you might need to edit it. See “Modifying the /etc/remote File” on page 51 for details.

3. In a shell tool window on the Sun system, type:

% tip hardwire

The Sun system responds by displaying:

connected

The shell tool is now a Tip window directed to the Sun Fire V445 server through the Sun system’s serial port. This connection is established and maintained even when the Sun Fire V445 server is completely powered off or just starting up.

Note – Use a shell tool or a CDE or JDS terminal (such as dtterm), not a command tool. Some tip commands might not work properly in a command tool window.

To Access the System Console With a Tip Connection Through the TTYB Port

1. Redirect the system console by changing the OpenBoot configuration variables.

At the ok prompt on the Sun Fire V445 server, type:

ok setenv input-device ttyb
ok setenv output-device ttyb

Note – You can only access the sc> prompt and view POST messages from either the serial management port or the network management port.

Note – There are many other OpenBoot configuration variables. Although these variables do not affect which hardware device is used to access the system console, some of them affect which diagnostic tests the system runs and which messages the system displays at its console. See Chapter 8 and Chapter 9.

2. To cause the changes to take effect, power off the system. Type:

ok power-off

The system permanently stores the parameter changes and powers off.

Note – You can also power off the system using the front panel Power button.

3. Connect the null modem serial cable to the TTYB port on the Sun Fire V445 server.

If required, use the DB-9 or DB-25 cable adapter supplied with the server.

4. Power on the system.

See Chapter 3 for power-on procedures.

Continue with your installation or diagnostic test session as appropriate. When you are finished using the tip window, end your Tip session by typing ~. (the tilde symbol followed by a period) and exit the window. For more information about tip commands, see the tip man page.

For more information about connecting to and using the ALOM system controller, see:

• Sun Advanced Lights Out Manager (ALOM) Online Help

If you have redirected the system console to TTYB and want to change the system console settings back to use the serial management and network management ports, see:

• “Reference for System Console OpenBoot Configuration Variable Settings” on page 59

Modifying the /etc/remote File

This procedure might be necessary if you are accessing the Sun Fire V445 server using a Tip connection from a Sun system running an older version of the Solaris OS software. You might also need to perform this procedure if the /etc/remote file on the Sun system has been altered and no longer contains an appropriate hardwire entry.

This procedure assumes that you are logged in as superuser to the system console of a Sun system that you intend to use to establish a tip connection to the Sun Fire V445 server.

To Modify the /etc/remote File

1. Determine the release level of Solaris OS software installed on the Sun system. Type:

# uname -r

The system responds with a release number.

2. Do one of the following, depending on the number displayed.

• If the number displayed by the uname -r command is 5.0 or higher:

The Solaris software shipped with an appropriate entry for hardwire in the /etc/remote file. If you have reason to suspect that this file was altered and the hardwire entry modified or deleted, check the entry against the following example, and edit it as needed.

hardwire:\
:dv=/dev/term/b:br#9600:el=^C^S^Q^U^D:ie=%$:oe=^D:

Note – If you intend to use the Sun system’s serial port A rather than serial port B, edit this entry by replacing /dev/term/b with /dev/term/a.

• If the number displayed by the uname -r command is less than 5.0:

Check the /etc/remote file and add the following entry, if it does not already exist.

hardwire:\
:dv=/dev/ttyb:br#9600:el=^C^S^Q^U^D:ie=%$:oe=^D:

Note – If you intend to use the Sun system’s serial port A rather than serial port B, edit this entry by replacing /dev/ttyb with /dev/ttya.

The /etc/remote file is now properly configured. Continue establishing a Tip connection to the Sun Fire V445 server system console. See:

• “Accessing the System Console With a Tip Connection” on page 47

If you have redirected the system console to TTYB and want to change the system console settings back to use the serial management and network management ports, see:

• “Reference for System Console OpenBoot Configuration Variable Settings” on page 59
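The release check and device choice described in this procedure can be sketched in shell as follows. The function names hardwire_device and has_hardwire_entry are hypothetical, and the sketch assumes that the release string begins with a numeric major version (as uname -r output does on Solaris). It only reports what the procedure would expect; it does not edit /etc/remote.

```shell
#!/bin/sh
# Sketch of the decision in "To Modify the /etc/remote File".
# These helpers are illustrative and are not shipped with Solaris.

# hardwire_device RELEASE [PORT]
# Print the dv= device path expected in the hardwire entry.
# PORT is the serial port letter (a or b); it defaults to b.
hardwire_device() {
    release=$1
    port=${2:-b}
    major=${release%%.*}          # "5.10" -> "5", "4.1.3" -> "4"
    if [ "$major" -ge 5 ]; then
        echo "/dev/term/$port"    # release 5.0 or higher
    else
        echo "/dev/tty$port"      # release lower than 5.0
    fi
}

# has_hardwire_entry FILE
# Succeed if FILE contains a line that starts a hardwire entry.
has_hardwire_entry() {
    grep '^hardwire:' "$1" >/dev/null 2>&1
}

hardwire_device 5.10       # prints /dev/term/b
hardwire_device 4.1.3 a    # prints /dev/ttya

if has_hardwire_entry /etc/remote; then
    echo "hardwire entry found in /etc/remote"
else
    echo "no hardwire entry found in /etc/remote"
fi
```

On the Sun system itself you could pass the live release with hardwire_device "$(uname -r)" and compare the result with the dv= field of the existing entry.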

Accessing the System Console With an Alphanumeric Terminal

This procedure assumes that you are accessing the Sun Fire V445 server system console by connecting the serial port of an alphanumeric terminal to the serial management port (SERIAL MGT) of the Sun Fire V445 server.

To Access the System Console With an Alphanumeric Terminal Through the Serial Management Port

1. Attach one end of the serial cable to the alphanumeric terminal’s serial port.

Use a null modem serial cable or an RJ-45 serial cable and null modem adapter. Plug this cable in to the terminal’s serial port connector.

2. Attach the opposite end of the serial cable to the serial management port on the Sun Fire V445 server.

3. Connect the alphanumeric terminal’s power cord to an AC outlet.

4. Set the alphanumeric terminal to receive:

• 9600 baud
• 8 bits
• No parity
• 1 stop bit
• No handshake protocol

See the documentation accompanying your terminal for information about how to configure it.

To Access the System Console With an Alphanumeric Terminal Through the TTYB Port

1. Redirect the system console by changing the OpenBoot configuration variables.

At the ok prompt, type:

ok setenv input-device ttyb
ok setenv output-device ttyb

Note – You can only access the sc> prompt and view POST messages from either the serial management port or the network management port.

Note – There are many other OpenBoot configuration variables. Although these variables do not affect which hardware device is used to access the system console, some of them affect which diagnostic tests the system runs and which messages the system displays at its console. See Chapter 8 and Chapter 9.

2. To cause the changes to take effect, power off the system. Type:

ok power-off

The system permanently stores the parameter changes and powers off.

Note – You can also power off the system using the front panel Power button.

3. Connect the null modem serial cable to the TTYB port on the Sun Fire V445 server.

If required, use the DB-9 or DB-25 cable adapter supplied with the server.

4. Power on the system.

See Chapter 3 for power-on procedures.

You can issue system commands and view system messages using the alphanumeric terminal. Continue with your installation or diagnostic procedure, as needed. When you are finished, type the alphanumeric terminal’s escape sequence.

For more information about connecting to and using the ALOM system controller, see:

• Sun Advanced Lights Out Manager (ALOM) Online Help

If you have redirected the system console to TTYB and want to change the system console settings back to use the serial management and network management ports, see:

• “Reference for System Console OpenBoot Configuration Variable Settings” on page 59

Verifying Serial Port Settings on TTYB

This procedure enables you to verify the baud rate and other serial port settings used by the Sun Fire V445 server to communicate with a device attached to its TTYB port.

Note – The serial management port always operates at 9600 baud, 8 bits, with no parity and 1 stop bit.

You must be logged in to the Sun Fire V445 server, and the server must be running Solaris OS software.

To Verify Serial Port Settings on TTYB

1. Open a shell tool window.

2. Type:

# eeprom | grep ttyb-mode

3. Look for the following output:

ttyb-mode = 9600,8,n,1,-

This line indicates that the Sun Fire V445 server’s serial port TTYB is configured for:

• 9600 baud
• 8 bits
• No parity
• 1 stop bit
• No handshake protocol

For more information about serial port settings, see the eeprom man page. For more information about the ttyb-mode OpenBoot configuration variable, see Appendix C.
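The five comma-separated fields of the ttyb-mode value map directly onto the settings listed above. As a minimal sketch, the hypothetical function describe_ttyb_mode below splits the value and labels each field. It translates only the two codes this section documents (n for no parity and - for no handshake) and passes any other code through unchanged.

```shell
#!/bin/sh
# Sketch: label the fields of a ttyb-mode value such as "9600,8,n,1,-".
# describe_ttyb_mode is an illustrative helper, not a Solaris command.

describe_ttyb_mode() {
    # Split the single argument on commas into positional parameters.
    old_ifs=$IFS
    IFS=,
    set -- $1
    IFS=$old_ifs

    if [ "$#" -ne 5 ]; then
        echo "expected 5 comma-separated fields" >&2
        return 1
    fi

    # Only the codes shown in this section are translated.
    case "$3" in
        n) parity="none" ;;
        *) parity=$3 ;;
    esac
    case "$5" in
        -) handshake="none" ;;
        *) handshake=$5 ;;
    esac

    echo "baud=$1 data_bits=$2 parity=$parity stop_bits=$4 handshake=$handshake"
}

describe_ttyb_mode "9600,8,n,1,-"
# prints: baud=9600 data_bits=8 parity=none stop_bits=1 handshake=none
```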

Accessing the System Console With a Local Graphics Monitor

After initial system installation, you can install a local graphics monitor and configure it to access the system console. You cannot use a local graphics monitor to perform initial system installation, nor can you use a local graphics monitor to view power-on self-test (POST) messages.

To install a local graphics monitor, you must have:

• A supported PCI-based graphics frame buffer card and software driver. An 8/24-Bit Color Graphics PCI adapter frame buffer card (Sun part number X3768A or X3769A) is currently supported.

• A monitor with appropriate resolution to support the frame buffer

• A Sun-compatible USB keyboard (Sun USB Type–6 keyboard)

• A Sun-compatible USB mouse (Sun USB mouse) and mouse pad

To Access the System Console With a Local Graphics Monitor

1. Install the graphics card into an appropriate PCI slot.

Installation must be performed by a qualified service provider. For further information, see the Sun Fire V445 Server Installation Guide or contact your qualified service provider.

2. Attach the monitor’s video cable to the graphics card’s video port.

Tighten the thumbscrews to secure the connection.

3. Connect the monitor’s power cord to an AC outlet.

6. Obtain the ok prompt.

For more information, see “Entering the ok Prompt” on page 40.

7. Set OpenBoot configuration variables appropriately.

From the existing system console, type:

ok setenv input-device keyboard
ok setenv output-device screen

Note – There are many other OpenBoot configuration variables. Although these variables do not affect which hardware device is used to access the system console, some of them affect which diagnostic tests the system runs and which messages the system displays at its console. See Chapter 8 and Chapter 9.

8. To cause the changes to take effect, type:

ok reset-all

The system stores the parameter changes, and boots automatically when the OpenBoot configuration variable auto-boot? is set to true (its default value).

Note – To store parameter changes, you can also power cycle the system using the Power button.

You can issue system commands and view system messages using your local graphics monitor. Continue with your installation or diagnostic procedure, as needed.

If you want to redirect the system console back to the serial management and network management ports, see:

• “Reference for System Console OpenBoot Configuration Variable Settings” on page 59.

Reference for System Console OpenBoot Configuration Variable Settings

The Sun Fire V445 system console is directed to the serial management and network management ports (SERIAL MGT and NET MGT) by default. However, you can redirect the system console to the serial DB-9 port (TTYB), or to a local graphics monitor, keyboard, and mouse. You can also redirect the system console back to the serial management and network management ports.

Certain OpenBoot configuration variables control from where system console input is taken and to where its output is directed. The table below shows how to set these variables in order to use the serial management and network management ports, TTYB, or a local graphics monitor as the system console connection.

TABLE 2-20 OpenBoot Configuration Variables That Affect the System Console

OpenBoot Configuration    System Console Output
Variable Name             Serial and Network      Serial Port (TTYB)*    Local Graphics Monitor*
                          Management Ports
output-device             ttya                    ttyb                   screen
input-device              ttya                    ttyb                   keyboard

* POST output will still be directed to the serial management port, as POST has no mechanism to direct its output to a graphics monitor.

The serial management port and network management port are present in the OpenBoot configuration variables as ttya. However, the serial management port does not function as a standard serial connection. If you want to connect a conventional serial device (such as a printer) to the system, you need to connect it to TTYB, not the serial management port. See “About the Serial Ports” on page 96 for more information.

The sc> prompt and POST messages are only available through the serial management port and network management port. In addition, the ALOM system controller console command is ineffective when the system console is redirected to TTYB or a local graphics monitor.

In addition to the OpenBoot configuration variables described in TABLE 2-20, there are other variables that affect and determine system behavior. These variables are created during system configuration and stored on a ROM chip.
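The settings in TABLE 2-20 can be restated as a lookup that prints the two setenv commands for a chosen console connection. This is a sketch only: the function name console_setenv_cmds and the target labels mgmt, ttyb, and graphics are assumptions made for illustration, and the output is meant to be typed at the ok prompt, not run in a shell.

```shell
#!/bin/sh
# Sketch: print the OpenBoot setenv commands from TABLE 2-20 for a given
# console target. The target labels are hypothetical names for the three
# table columns.

console_setenv_cmds() {
    case "$1" in
        mgmt)     in_dev=ttya;     out_dev=ttya ;;    # serial and network management ports
        ttyb)     in_dev=ttyb;     out_dev=ttyb ;;    # serial port (TTYB)
        graphics) in_dev=keyboard; out_dev=screen ;;  # local graphics monitor
        *)
            echo "unknown console target: $1 (use mgmt, ttyb, or graphics)" >&2
            return 1
            ;;
    esac
    echo "setenv input-device $in_dev"
    echo "setenv output-device $out_dev"
}

# Example: commands that return the console to the management ports.
console_setenv_cmds mgmt
```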

CHAPTER 3

Powering On and Powering Off the System

This chapter describes how to power on and power off the system, and how to initiate a reconfiguration boot.

This chapter explains the following tasks:

• “Powering On the Server Remotely” on page 62
• “Powering On the Server Locally” on page 63
• “Powering Off the System Remotely” on page 64
• “Powering Off the Server Locally” on page 66
• “Initiating a Reconfiguration Boot” on page 66
• “Selecting a Boot Device” on page 69

Before You Begin

Note – Before powering on the system, you must attach a system console device to gain access to the system. See Chapter 2. ALOM automatically boots up when the system is plugged in.

The following is a brief summary of powering on the system properly:

1. Attach a system console device to the serial management port and turn the console device on.

Serial management access is only possible during first-time startup.

2. Plug in the system power cords.

ALOM boots and starts issuing console messages. At this time, you can assign a username and password.

3. Power on the system. Once powered on, type console to get to the ok prompt to watch the system boot sequence.

Powering On the Server Remotely

To issue software commands, you need to set up an alphanumeric terminal connection, a local graphics monitor connection, ALOM system controller connection, or a Tip connection to the Sun Fire V445 server. See Chapter 2 for more information about connecting the Sun Fire V445 server to a terminal or similar device.

Do not use this power-on procedure if you have just added any new internal option or external storage device, or if you have removed a storage device without replacing it. To power on the system under those circumstances, you must initiate a reconfiguration boot. For those instructions, see:

• “Initiating a Reconfiguration Boot” on page 66

Caution – Before you power on the system, ensure that the system doors and all panels are properly installed.

Caution – Never move the system when the system power is on. Movement can cause catastrophic disk drive failure. Always power off the system before moving it.

For more information, see:

• “About Communicating With the System” on page 26
• “About the sc> Prompt” on page 32

To Power On the Server Remotely

1. Log in to the ALOM system controller.

2. Type:

sc> poweron

Powering On the Server Locally

Do not use this power-on procedure if you have just added any new internal option or external storage device, or if you have removed a storage device without replacing it. To power on the system under those circumstances, you must initiate a reconfiguration boot. For those instructions, see:

• “Initiating a Reconfiguration Boot” on page 66

Caution – Never move the system when the system power is on. Movement can cause catastrophic disk drive failure. Always power off the system before moving it.

Caution – Before you power on the system, ensure that the system doors and all panels are properly installed.

To Power On the Server Locally

1. Turn on power to any external peripherals and storage devices.

Read the documentation supplied with the device for specific instructions.

2. Establish a connection to the system console.

If you are powering on the system for the first time, connect a device to the serial management port using one of the methods described in Chapter 2. Otherwise, use one of the methods for connecting to the system console, also described in Chapter 2.

3. Connect the AC power cords.

Note – As soon as the AC power cords are connected to the system, the ALOM system controller boots and displays its power-on self-test (POST) messages. Though the system power is still off, the ALOM system controller is up and running, and monitoring the system. Regardless of system power state, as long as the power cords are connected and providing standby power, the ALOM system controller is on and monitoring the system.

4. Press and release the Power button with a ball-point pen to power on the system.

The power supply Power OK indicators light when power is applied to the system. Verbose POST output is immediately displayed to the system console if diagnostics are enabled at power-on, and the system console is directed to the serial and network management ports.

It can take from 30 seconds to 20 minutes before text messages appear on the system monitor (if one is attached) or the system prompt appears on an attached terminal. This time depends on the system configuration (number of CPUs, memory modules, PCI cards, and console configuration), and the level of power-on self-test (POST) and OpenBoot Diagnostics tests being performed. The System Activity indicator lights when the server is running under control of the Solaris OS.

Powering Off the System Remotely

To issue software commands, you need to set up an alphanumeric terminal connection, a local graphics monitor connection, ALOM system controller connection, or a Tip connection to the Sun Fire V445 server. See Chapter 2 for more information about connecting the Sun Fire V445 server to a terminal or similar device.

You can power off the system remotely either from the ok prompt or from the ALOM system controller sc> prompt.

Caution – Applications running on the Solaris OS can be adversely affected by a poorly executed system shutdown. Ensure that you stop and exit applications, and shut down the OS before powering off the server.

For more information, see:

• “About Communicating With the System” on page 26
• “About the ok Prompt” on page 35
• “Entering the ok Prompt” on page 40
• “About the sc> Prompt” on page 32

To Power Off the System Remotely From the ok Prompt

1. Notify users that the server will be powered off.

2. Back up the system files and data, if necessary.

3. Obtain the ok prompt.

See “Entering the ok Prompt” on page 40.

4. Issue the following command:

ok power-off

To Power Off the System Remotely From the ALOM System Controller Prompt

1. Notify users that the system will be powered off.

2. Back up the system files and data, if necessary.

3. Log in to the ALOM system controller.

See “Using the Serial Management Port” on page 41.

4. Issue the following command:

sc> poweroff

Powering Off the Server Locally

Caution – Applications running on the Solaris OS can be adversely affected by a poorly executed system shutdown. Ensure that you stop and exit applications, and shut down the OS before powering off the server.

To Power Off the Server Locally

1. Notify users that the server will be powered down.

2. Back up the system files and data, if necessary.

3. Press and release the Power button with a ball-point pen.

The system begins a graceful software system shutdown.

Note – Pressing and releasing the Power button initiates a graceful software system shutdown. Pressing and holding in the Power button for four seconds causes an immediate hardware shutdown. Whenever possible, you should use the graceful shutdown method. Forcing an immediate hardware shutdown can cause disk drive corruption and loss of data. Use that method only as a last resort.

4. Wait for the system to power off.

The power supply Power OK indicators extinguish when the system is powered off.

Caution – Ensure no other users have access to power on the system or system components while working on internal components.

Initiating a Reconfiguration BootAfter installing an y new internal option or external storage device, you mu stperform a reconfiguration boot so that the OS is able to recognize newly installeddevices. In add ition, if you rem ove any d evice and do n ot install a replacementdevice prior to rebooting th e system, you mu st perform a reconfiguration boot for

the OS to recognize the configuration change. This requirement also app lies to an ycomponent th at is connected to the system I2C bus to ensure proper environmentalmonitoring.

This requirement does not apply to any component that is:

• Installed or removed as part of a hot-plug operation

• Installed or removed before the OS is installed

• Installed as an identical replacement for a component that is already recognized by the OS
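The rule and its three exceptions can be expressed as a small decision helper. The following Python sketch is illustrative only; the function name and its parameters are hypothetical and do not correspond to any Sun tool.

```python
def needs_reconfiguration_boot(hot_plug: bool,
                               before_os_install: bool,
                               identical_replacement: bool) -> bool:
    """Return True when a device change calls for a reconfiguration boot.

    Hypothetical helper: each flag mirrors one of the three exceptions
    listed in this section.
    """
    # Any single exception removes the requirement.
    return not (hot_plug or before_os_install or identical_replacement)


# A new option installed in a system that already runs the OS.
print(needs_reconfiguration_boot(False, False, False))
# A disk drive added as part of a hot-plug operation.
print(needs_reconfiguration_boot(True, False, False))
```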

To issue software commands, you need to set up an alphanumeric terminal connection, a local graphics monitor connection, ALOM system controller connection, or a Tip connection to the Sun Fire V445 server. See Chapter 2 for more information about connecting the Sun Fire V445 server to a terminal or similar device.

This procedure assumes that you are accessing the system console using the serial management or network management port.

For more information, see:

• “About Communicating With the System” on page 26

• “About the sc> Prompt” on page 32

• “About the ok Prompt” on page 35

• “About Switching Between the ALOM System Controller and the System Console” on page 38

• “Entering the ok Prompt” on page 40

▼ To Initiate a Reconfiguration Boot

1. Turn on power to any external peripherals and storage devices.

Read the documentation supplied with the device for specific instructions.

2. Turn on power to the alphanumeric terminal or local graphics monitor, or log in to the ALOM system controller.

3. Use ALOM to initiate Diagnostics mode to run power-on self-test (POST) and OpenBoot Diagnostics tests to verify that the system functions correctly with the new part(s) you just installed.

4. Press the Power button with a ball-point pen to power on the system.

5. If you are logged in to the sc> prompt, switch to the ok prompt. Type:

sc> console

6. When the system banner is displayed on the system console, immediately stop the boot process to access the system ok prompt.

The system banner contains the Ethernet address and host ID. To stop the boot process, use one of the following methods:

• Hold down the Stop (or L1) key and press A on your keyboard.

• Press the Break key on the terminal keyboard.

• Type the break command from the sc> prompt.

7. At the ok prompt, type:

ok setenv auto-boot? false

ok reset-all

You must set the auto-boot? variable to false and issue the reset-all command to ensure that the system correctly initiates upon reboot. If you do not issue these commands, the system might fail to initialize, because the boot process was stopped in Step 6.

8. At the ok prompt, type:

ok setenv auto-boot? true

You must set the auto-boot? variable back to true so that the system boots automatically after a system reset.

9. At the ok prompt, type:

ok boot -r

The boot -r command rebuilds the device tree for the system, incorporating any newly installed options so that the OS will recognize them.

Note – A system banner appears in 30 seconds to 20 minutes. This time depends on the system configuration (number of CPUs, memory modules, PCI cards) and the level of POST and OpenBoot Diagnostics tests being performed. For more information about OpenBoot configuration variables, see Appendix C.

The system front panel LED indicators provide power-on status information. For information about the system indicators, see:

• “Front Panel Indicators” on page 10

• “Back Panel Indicators” on page 17
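As a quick reference, the typed commands from Steps 5 through 9 can be collected in order. The Python sketch below only formats them as a checklist; the list and function names are hypothetical, the sc> entry applies only if you start at the sc> prompt, and the keyboard action in Step 6 is not represented.

```python
# Console commands from Steps 5 through 9, in the order they are typed.
# Each entry is (prompt, command). The structure is illustrative only.
RECONFIG_BOOT_COMMANDS = [
    ("sc>", "console"),
    ("ok", "setenv auto-boot? false"),
    ("ok", "reset-all"),
    ("ok", "setenv auto-boot? true"),
    ("ok", "boot -r"),
]


def render_checklist(commands):
    """Format the commands as numbered lines for an operator checklist."""
    return [f"{n}. {prompt} {cmd}"
            for n, (prompt, cmd) in enumerate(commands, start=1)]


for line in render_checklist(RECONFIG_BOOT_COMMANDS):
    print(line)
```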

If the system encounters a problem during startup (running in the normal mode), try restarting the system in Diagnostics mode to determine the source of the problem. Use ALOM or the OpenBoot prompt (ok prompt) to switch to Diagnostics mode and power cycle the system. See “Powering Off the Server Locally” on page 66.

For information about system diagnostics and troubleshooting, see Chapter 8.


Selecting a Boot Device

You specify the boot device by setting an OpenBoot configuration variable called boot-device. The default setting of this variable is disk net. With this setting, the firmware first attempts to boot from the system hard disk drive, and if that fails, from the on-board net0 Gigabit Ethernet interface.

Before you can select a boot device, you must complete system installation according to the instructions in the Sun Fire V445 Server Installation Guide.

This procedure assumes that you are familiar with the OpenBoot firmware and that you know how to enter the OpenBoot environment. For more information, see:

• “About the ok Prompt” on page 35

Note – The serial management port on the ALOM system controller card is preconfigured as the default system console port. For more information, see Chapter 2.

If you want to boot from a network, you must connect the network interface to the network. See “Attaching a Twisted-Pair Ethernet Cable” on page 143.

▼ To Select a Boot Device

• At the ok prompt, type:

ok setenv boot-device device-specifier

where the device-specifier is one of the following:

• cdrom – Specifies the DVD-ROM drive

• disk – Specifies the system boot disk (internal disk 0 by default)

• disk0 – Specifies internal disk 0

• disk1 – Specifies internal disk 1

• disk2 – Specifies internal disk 2

• disk3 – Specifies internal disk 3

• disk4 – Specifies internal disk 4

• disk5 – Specifies internal disk 5

• disk6 – Specifies internal disk 6

• disk7 – Specifies internal disk 7

• net, net0, net1 – Specifies the network interfaces

• full path name – Specifies the device or network interface by its full path name
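The following Python sketch shows one way to assemble a boot-device setting from the specifiers listed above before typing it at the ok prompt. It is an illustration under stated assumptions: the helper name is hypothetical, and treating any string that starts with "/" as a full path name is a simplification, not an OpenBoot rule.

```python
# Device aliases listed in this section.
KNOWN_ALIASES = {"cdrom", "disk", "net", "net0", "net1"} | {
    f"disk{n}" for n in range(8)
}


def boot_device_command(specifiers):
    """Build a 'setenv boot-device' line from one or more device specifiers.

    A specifier is accepted if it is a listed alias or looks like a full
    device path (assumed here to start with '/').
    """
    if not specifiers:
        raise ValueError("at least one device specifier is required")
    for spec in specifiers:
        if spec not in KNOWN_ALIASES and not spec.startswith("/"):
            raise ValueError(f"unrecognized device specifier: {spec}")
    return "setenv boot-device " + " ".join(specifiers)


# Reproduce the default setting: boot from disk first, then from the network.
print(boot_device_command(["disk", "net"]))
```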

Note – The Solaris OS modifies the boot-device variable to its full path name, not the alias name. If you choose a nondefault boot-device variable, the Solaris OS specifies the full device path of the boot device.

Note – You can also specify the name of the program to be booted as well as the way the boot program operates. For more information, see the OpenBoot 4.x Command Reference Manual in the OpenBoot Collection AnswerBook for your specific Solaris OS release.

If you want to specify a network interface other than an on-board Ethernet interface as the default boot device, you can determine the full path name of each interface by typing:

ok show-devs

The show-devs command lists the system devices and displays the full path name of each PCI device.

For more information about using the OpenBoot firmware, refer to the OpenBoot 4.x Command Reference Manual in the OpenBoot Collection AnswerBook for your specific Solaris release.


CHAPTER 4

Configuring Hardware


This chapter provides hardware configuration information for the Sun Fire V445 server.

Note – This chapter does not provide instructions for installing or removing hardware components. For instructions on preparing the system for servicing and procedures to install and remove the server components described in this chapter, refer to the Sun Fire V445 Server Service Manual.

Topics in this chapter include:

• “About the CPU/Memory Modules” on page 73

• “About the ALOM System Controller Card” on page 77

• “About the PCI Cards and Buses” on page 81

• “About the SAS Controller” on page 84

• “About the SAS Backplane” on page 85

• “About Hot-Pluggable and Hot-Swappable Components” on page 85

• “About the Internal Disk Drives” on page 87

• “About the Power Supplies” on page 89

• “About the System Fan Trays” on page 92

• “About the USB Ports” on page 95

• “About the Serial Ports” on page 96

About the CPU/Memory Modules

The system motherboard provides slots for up to four CPU/Memory modules. Each CPU/Memory module incorporates one UltraSPARC IIIi processor, and slots for up to four DIMMs. The CPUs in the system are numbered from 0 to 3, depending on the slot where each CPU resides.

Note – CPU/Memory modules on a Sun Fire V445 server are not hot-pluggable or hot-swappable.

The UltraSPARC IIIi processor is a high-performance, highly integrated superscalar processor implementing the SPARC V9 64-bit architecture. The UltraSPARC IIIi processor can support both 2D and 3D graphics, as well as image processing, video compression and decompression, and video effects through the sophisticated Visual Instruction Set extension (Sun VIS software). The VIS software provides high levels of multimedia performance, including two streams of MPEG-2 decompression at full broadcast quality with no additional hardware support.

The Sun Fire V445 server employs a shared-memory multiprocessor architecture with all processors sharing the same physical address space. The system processors, main memory, and I/O subsystem communicate via a high-speed system interconnect bus. In a system configured with multiple CPU/Memory modules, all main memory is accessible from any processor over the system bus. The main memory is logically shared by all processors and I/O devices in the system. However, memory is controlled and allocated by the CPU on its host module, that is, the DIMMs on CPU/Memory module 0 are managed by CPU 0.

DIMMs

The Sun Fire V445 server uses 2.5-volt, high-capacity double data rate dual inline memory modules (DDR DIMMs) with error-correcting code (ECC). The system supports DIMMs with 512-Mbyte, 1-Gbyte, and 2-Gbyte capacities. Each CPU/Memory module contains slots for four DIMMs. Total system memory ranges from a minimum of 1 Gbyte (one CPU/Memory module with two 512-Mbyte DIMMs) to a maximum of 32 Gbytes (four modules fully populated with 2-Gbyte DIMMs).

Within each CPU/Memory module, the four DIMM slots are organized into groups of two. The system reads from, or writes to, both DIMMs in a group simultaneously. DIMMs, therefore, must be added in pairs. The figure below shows the DIMM slots and DIMM groups on a Sun Fire V445 server CPU/Memory module. Adjacent slots belong to the same DIMM group. The two groups are designated 0 and 1 as shown in FIGURE 4-1.

FIGURE 4-1 Memory Module Groups 0 and 1

TABLE 4-1 lists the DIMMs on the CPU/Memory module, and to which group each DIMM belongs.

TABLE 4-1 Memory Module Groups 0 and 1

Label    Group    Physical Group
B1/D1    B1       1 (must be installed as a pair)
B1/D0    B1       1 (must be installed as a pair)
B0/D1    B0       0 (must be installed as a pair)
B0/D0    B0       0 (must be installed as a pair)

The DIMMs must be added in pairs within the same DIMM group, and each pair used must have two identical DIMMs installed – that is, both DIMMs in each group must be from the same manufacturer and must have the same capacity (for example, two 512-Mbyte DIMMs or two 1-Gbyte DIMMs).

Note – Each CPU/Memory module must be populated with a minimum of two DIMMs, installed in either group 0 or group 1.

Caution – DIMMs are made of electronic components that are extremely sensitive to static electricity. Static from your clothes or work environment can destroy the modules. Do not remove a DIMM from its antistatic packaging until you are ready to install it on the CPU/Memory module. Handle the modules only by their edges. Do not touch the components or any metal parts. Always wear an antistatic grounding strap when you handle the modules. For more information, refer to the Sun Fire V445 Server Installation Guide and the Sun Fire V445 Server Service Manual.

For guidelines and complete instructions on how to install and identify DIMMs in a CPU/Memory module, refer to the Sun Fire V445 Server Service Manual and the Sun Fire V445 Server Installation Guide.

Memory Interleaving


You can maximize the system’s memory bandwidth by taking advantage of its memory interleaving capabilities. The Sun Fire V445 server supports two-way interleaving. In most cases, higher interleaving results in improved system performance. However, actual performance results can vary depending on the system application. Two-way interleaving occurs automatically in any DIMM bank where the DIMM capacities in DIMM group 0 match the capacities used in DIMM group 1. For optimum performance, install identical DIMMs in all four slots in a CPU/Memory module.
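The interleaving condition is a simple comparison between the two DIMM groups on a module. The Python sketch below illustrates it; the function name and the use of a single capacity value per group (valid because both DIMMs in a group must be identical) are assumptions made for this example.

```python
def two_way_interleaving(group0_mbytes, group1_mbytes):
    """Return True if a module qualifies for two-way interleaving.

    Each argument is the capacity (in Mbytes) of the DIMMs installed in
    that group, or None if the group is empty.
    """
    if group0_mbytes is None or group1_mbytes is None:
        # Only one group populated: nothing to interleave against.
        return False
    return group0_mbytes == group1_mbytes


print(two_way_interleaving(1024, 1024))  # matching groups
print(two_way_interleaving(512, 1024))   # mismatched capacities
print(two_way_interleaving(512, None))   # group 1 empty
```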

Independent Memory Subsystems

Each Sun Fire V445 server CPU/Memory module contains an independent memory subsystem. Memory controller logic incorporated into the UltraSPARC IIIi CPU allows each CPU to control its own memory subsystem.

The Sun Fire V445 server uses a shared memory architecture. During normal system operations, the total system memory is shared by all CPUs in the system.

DIMM Configuration Rules

• You must physically remove a CPU/Memory module from the system before you can install or remove DIMMs.

• You must add DIMMs in pairs.

• Each group used must have two identical DIMMs installed – that is, both DIMMs must be from the same manufacturer and must have the same density and capacity (for example, two 512-Mbyte DIMMs or two 1-Gbyte DIMMs).

• For maximum memory performance and to take full advantage of the Sun Fire V445 server’s memory interleaving features, use identical DIMMs in all four slots of a CPU/Memory module.
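The pairing rules can be checked mechanically before you open the system. The following Python sketch is a simplified illustration: the function name and data layout are hypothetical, and it compares only manufacturer and capacity (density is not modeled).

```python
def check_module(groups):
    """Check one CPU/Memory module against the DIMM configuration rules.

    groups: dict mapping group number (0 or 1) to a list of
    (manufacturer, capacity_mbytes) tuples, one per installed DIMM.
    Returns a list of rule violations (empty if the layout is acceptable).
    """
    problems = []
    populated = 0
    for number in (0, 1):
        dimms = groups.get(number, [])
        if not dimms:
            continue
        populated += 1
        if len(dimms) != 2:
            problems.append(f"group {number}: DIMMs must be installed in pairs")
        elif dimms[0] != dimms[1]:
            problems.append(f"group {number}: both DIMMs must be identical")
    if populated == 0:
        problems.append("module needs at least two DIMMs in group 0 or group 1")
    return problems


# A valid minimum layout: one identical pair in group 0.
print(check_module({0: [("VendorA", 512), ("VendorA", 512)]}))
# A mismatched pair in group 1.
print(check_module({1: [("VendorA", 512), ("VendorA", 1024)]}))
```

The vendor name in the usage lines is a placeholder, not a real part source.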

For information about installing or removing DIMMs, see the Sun Fire V445 Server Parts Installation and Removal Guide.

About the ALOM System Controller Card

The Sun Advanced Lights Out Manager (ALOM) system controller card enables access, monitoring, and control of the Sun Fire V445 server from a remote location. It is a fully independent processor card with its own resident firmware, self-diagnostics, and OS.

In addition, the ALOM system controller card functions as the default console connection to the system, through its serial management port. For more information about using the ALOM system controller as the default console connection, see:

• “About Communicating With the System” on page 26

• “Using the Serial Management Port” on page 41

When you first power on the system, the ALOM system controller card provides a default connection to the system console through its serial management port. After initial setup, you can assign an IP address to the network management port and connect the network management port to a network. You can run diagnostic tests, view diagnostic and error messages, reboot your server, and display environmental status information using the ALOM system controller software. Even if the operating system is down or the system is powered off, the ALOM system controller can send an email alert about hardware failures, or other important events that can occur on the server.

The ALOM system controller provides the following features:

• Secure Shell (SSH) or Telnet connectivity – Network connectivity can also be disabled

• Remote powering on/off the system and diagnostics

• Default system console connection through its serial management port to an alphanumeric terminal, terminal server, or modem

• Network management port for remote monitoring and control over a network, after initial setup

• Remote system monitoring and error reporting, including diagnostic output

• Remote reboot, power-on, power-off, and reset functions

• Ability to monitor system environmental conditions remotely

• Ability to run diagnostic tests using a remote connection

• Ability to remotely capture and store boot and run logs, which you can review or replay later

• Remote event notification for overtemperature conditions, power supply faults, system shutdown, or system resets

• Remote access to detailed event logs

FIGURE 4-2 ALOM System Controller Card

The ALOM system controller card features serial and 10BASE-T Ethernet interfaces that provide multiple ALOM system controller software users with simultaneous access to the Sun Fire V445 server. ALOM system controller software users are provided secure password-protected access to the system’s Solaris and OpenBoot console functions. ALOM system controller users also have full control over power-on self-test (POST) and OpenBoot Diagnostics tests.

Caution – Although access to the ALOM system controller through the network management port is secure, access through the serial management port is not secure. Therefore, avoid connecting a serial modem to the serial management port.

Note – The ALOM system controller serial management port (labeled SERIAL MGT) and network management port (labeled NET MGT) are present in the Solaris OS device tree as /dev/ttya, and in the OpenBoot configuration variables as ttya. However, the serial management port does not function as a standard serial connection. If you want to attach a standard serial device to the system (such as a printer), you need to use the DB-9 connector on the system back panel, which corresponds to /dev/ttyb in the Solaris device tree, and as ttyb in the OpenBoot configuration variables. See “About the Serial Ports” on page 96 for more information.

The ALOM system controller card runs independently of the host server, and operates off standby power from the server power supplies. The card features on-board devices that interface with the server environmental monitoring subsystem and can automatically alert administrators to system problems. Together, these features enable the ALOM system controller card and ALOM system controller software to serve as a lights out management tool that continues to function even when the server OS goes offline or when the server is powered off.

The ALOM system controller card plugs in to a dedicated slot on the motherboard and provides the following ports (as shown in FIGURE 4-3) through an opening in the system’s back panel:

• Serial communication port via an RJ-45 connector (serial management port, labeled SERIAL MGT)

• 10-Mbps Ethernet port via an RJ-45 twisted-pair Ethernet (TPE) connector (network management port, labeled NET MGT) with green Link/Activity indicator


FIGURE 4-3 ALOM System Controller Card Ports

Configuration Rules

Caution – The system supplies power to the ALOM system controller card even when the system is powered off. To avoid personal injury or damage to the ALOM system controller card, you must disconnect the AC power cords from the system before servicing the ALOM system controller card. The ALOM system controller card is not hot-swappable or hot-pluggable.

• The ALOM system controller card is installed in a dedicated slot on the system motherboard. Never move the ALOM system controller card to another system slot, because it is not a PCI-compatible card. In addition, do not attempt to install a PCI card into the ALOM system controller slot.

• Avoid connecting a serial modem to the serial management port because it is not secure.

• The ALOM system controller card is not a hot-pluggable component. Before installing or removing the ALOM system controller card, you must power off the system and disconnect all system power cords.

• The serial management port on the ALOM system controller cannot be used as a conventional serial port. If your configuration requires a standard serial connection, use the DB-9 port labeled “TTYB” instead.


• The 100BASE-T network management port on the ALOM system controller is reserved for use with the ALOM system controller and the system console. The network management port does not support connections to Gigabit networks. If your configuration requires a high-speed Ethernet port, use one of the Gigabit Ethernet ports instead. For information on configuring the Gigabit Ethernet ports, see Chapter 7.

• The ALOM system controller card must be installed in the system for the system to function properly.


About the PCI Cards and Buses

All system communication with storage, peripherals and network interface devices is mediated by four buses using three Peripheral Component Interconnect (PCI) bridge chips on the system motherboard. The Fire ASIC PCIe Northbridge manages communication between the system main interconnect bus (J-Bus) and two PCIe buses. In addition, two PCIe/PCI-X bridge ASICs manage communication from each PCIe bus to two PCI-X buses, giving the system a total of four PCI buses. The four PCI buses support up to four PCIe interface cards and four PCI-X interface cards, as well as multiple motherboard devices.

TABLE 4-2 describes the PCI bus characteristics and maps each bus to its associated bridge chip, integrated devices, and PCI card slots. All slots comply with PCI Local Bus Specification Revision 2.2.

Note – PCI cards in a Sun Fire V445 server are not hot-pluggable or hot-swappable.

TABLE 4-2 PCI Bus Characteristics, Associated Bridge Chips, Motherboard Devices, and PCI Slots

PCIe Bus    Data Rate / Bandwidth    Integrated Devices                PCI Slot Type / Number / Capability
A           2.5 Gb/sec*, 8 lanes     Gigabit Ethernet 0                PCIe Slot 0 x16 (wired x8)
                                     Gigabit Ethernet 1                PCIe Slot 6 x8 (wired x16)
                                     PCI-X Bridge 0                    SAS Controller Expansion Connector**
                                                                       PCI-X Slot 2 64-bit 133MHz 3.3v
                                                                       PCI-X Slot 3 64-bit 133MHz 3.3v
B           2.5 Gb/sec*, 8 lanes     PCI-X Bridge 1                    PCI-X Slot 4 64-bit 133MHz 3.3v***
                                     Gigabit Ethernet 2                PCI-X Slot 5 64-bit 133MHz 3.3v
                                     Gigabit Ethernet 3                PCIe Slot 1 x16 (wired x8)
                                     Southbridge M1575                 PCIe Slot 7 x8 (wired x16)
                                     (USB 2.0 Controller,
                                     DVD-ROM Controller,
                                     Miscellaneous System Devices)

* Data Rate shown is per lane and per direction.

** Internal SAS Controller Card Expansion Connector not in use at time of this release

*** Slot Consumed by the SAS1068 Disk Controller

FIGURE 4-4 shows the PCI card slots on the motherboard.

FIGURE 4-4 PCI Slots

TABLE 4-3 lists the device name and path for the eight PCI slots.

TABLE 4-3 PCI Slot Device Names and Paths

PCI Slot        PCIe Bus    Device Name and Base Path (not full path)
PCIe Slot 0     A           /pci@1e,600000/pci@0
PCIe Slot 1     B           /pci@1f,700000/pci@0
PCI-X Slot 2    A           /pci@1e,600000/pci@0
PCI-X Slot 3    A           /pci@1e,600000/pci@0
PCI-X Slot 4    B           /pci@1f,700000/pci@0
PCI-X Slot 5    B           /pci@1f,700000/pci@0
PCIe Slot 6     A           /pci@1e,600000/pci@0
PCIe Slot 7     B           /pci@1f,700000/pci@0

Configuration Rules

• Slots (on the left) accept two long PCI-X cards and two long PCIe cards.

• Slots (on the right) accept two short PCI-X cards and two short PCIe cards.

• All PCI-X slots comply with PCI-X local bus specification rev 1.0.

• All PCIe slots comply with PCIe base specification r1.0a and PCI standard SHPC specification, r1.1.

• All PCI-X slots accept either 32-bit or 64-bit PCI cards.

• All PCI-X slots comply with PCI Local Bus Specification Revision 2.2.

• All PCI-X slots accept universal PCI cards.

• Compact PCI (cPCI) cards and SBus cards are not supported.

• You can improve overall system availability by installing redundant network or storage interfaces on separate PCI buses. For additional information, see “About Multipathing Software” on page 115.
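When planning redundant interfaces, the slot-to-bus assignments in TABLE 4-3 tell you whether two slots share a PCIe bus. The Python sketch below is an illustration only: the names are hypothetical, and it treats the PCIe bus letter from TABLE 4-3 as the unit of separation, which is an assumption of this example.

```python
# Slot-to-PCIe-bus assignments taken from TABLE 4-3.
SLOT_BUS = {0: "A", 1: "B", 2: "A", 3: "A", 4: "B", 5: "B", 6: "A", 7: "B"}


def on_separate_buses(slot_a, slot_b):
    """Return True if the two PCI slots hang off different PCIe buses."""
    return SLOT_BUS[slot_a] != SLOT_BUS[slot_b]


print(on_separate_buses(0, 1))  # PCIe Slot 0 (bus A) and PCIe Slot 1 (bus B)
print(on_separate_buses(2, 3))  # PCI-X Slots 2 and 3 are both on bus A
```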

Note – A 33-MHz PCI card plugged in to any of the 66-MHz or 133-MHz slots causes that bus to operate at 33 MHz. PCI-X slots 2 and 3 run at the speed of the slowest card installed. PCI-X slots 4 and 5 run at the speed of the slowest card installed. If two PCI-X 133-MHz cards are installed on the same bus (PCI-X Slots 2 and 3) they each run at 100 MHz. 133-MHz operation is only possible when only one slot is populated with one PCI-X 133-MHz capable card.
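The speed rules in the preceding Note can be sketched as a small function. This Python example is a simplified model, not a statement of how the bridge hardware negotiates speed; the function name is hypothetical, and card speeds are given as rated values in MHz.

```python
def pcix_bus_speed_mhz(card_speeds):
    """Estimate the operating speed of one PCI-X bus (a pair of slots).

    card_speeds: rated speeds in MHz of the cards installed on that bus
    (one or two entries).
    """
    if not 1 <= len(card_speeds) <= 2:
        raise ValueError("a PCI-X bus here holds one or two cards")
    slowest = min(card_speeds)
    # Two 133-MHz cards sharing a bus each run at 100 MHz.
    if slowest == 133 and len(card_speeds) == 2:
        return 100
    # Otherwise the bus runs at the speed of the slowest card.
    return slowest


print(pcix_bus_speed_mhz([133]))       # single 133-MHz card
print(pcix_bus_speed_mhz([133, 133]))  # two 133-MHz cards
print(pcix_bus_speed_mhz([33, 133]))   # a 33-MHz card slows the bus
```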

For information about installing or removing PCI cards, see the Sun Fire V445 Server Service Manual.

About the SAS Controller

The Sun Fire V445 server supports two configurations for the SAS controller: the Standard configuration and the Alternate configuration. The Standard configuration embeds the SAS controller logic on the motherboard. The Alternate configuration uses an intelligent, two-channel, SAS controller. This controller resides on PCI Bus 2B and supports a 64-bit, 66-MHz PCI interface.

Either configuration provides hardware RAID mirroring (RAID 0,1) capability with higher performance than conventional software RAID mirroring. Up to two pairs of hard disk drives can be mirrored using a SAS controller.

For more information about RAID configurations, see “About RAID Technology” on page 120. For more information about configuring hardware mirroring using the SAS controller, see “Creating a Hardware Disk Mirror” on page 124.

About the SAS Backplane

The Sun Fire V445 server includes a single SAS backplane with connections for up to eight internal hard disk drives, all of which are hot-pluggable.

The SAS disk backplane accepts eight, low-profile (2.5-inch), SAS disk drives. Each hard disk drive is connected to the backplane with a standard SAS hot-plug disk connector, which makes it easy to add or remove hard disk drives from the system. Disks using SCA connectors provide better serviceability than disks using other types of connectors.

For information about installing or removing a SAS backplane, refer to the Sun Fire V445 Server Service Manual.

Configuration Rules

• The SAS backplane requires low-profile (2.5-inch) hard disk drives.

• The SAS disks are hot-pluggable.

For information about installing or removing the SAS backplane, refer to the Sun Fire V445 Server Service Manual.

About Hot-Pluggable and Hot-Swappable Components

In a Sun Fire V445 server, the SAS disk drives are hot-pluggable components. Hot-pluggable components are those that you can install or remove while the system is running, without affecting system operation. However, you must prepare the OS prior to the hot-plug operation by performing certain system administration tasks.

The power supplies, fan trays, and USB components are hot-swappable. Hot-swappable components are those that you can remove and replace without software preparation and without affecting system operation. No other components are hot-swappable.

Caution – You must always leave in place a minimum of two operational power supplies and one operational fan tray in each of the three fan tray pairs.

Caution – The ALOM system controller card is not a hot-pluggable component. To avoid personal injury and damage to the card, you must power off the system and disconnect all AC power cords before installing or removing an ALOM system controller card.

Caution – The PCI cards are not hot-pluggable components. To avoid damage to the cards, you must power off the system before removing or installing PCI cards. Access to the PCI slots requires removing the top cover, which automatically powers down the system.


Hard Disk Drives

Before performing hard disk drive hot-plug operations, use the Solaris cfgadm(1m) utility to prepare the OS. The cfgadm utility is a command-line tool for managing hot-plug operations on Sun Fire V445 internal disk drives and external storage arrays. Refer to the cfgadm man page.

For more information about the disk drives, see “About the Internal Disk Drives” on page 87. For general hard disk hot-plug procedures, refer to the Sun Fire V445 Server Service Manual. For procedures to perform a hard disk hot-plug operation on mirrored and nonmirrored disks, see “Performing a Mirrored Disk Hot-Plug Operation” on page 134 and “Performing a Nonmirrored Disk Hot-Plug Operation” on page 136.

Caution – When hot-plugging a hard disk drive, first ensure that the drive’s blue OK-to-Remove indicator is lit. Then, after disconnecting the drive from the SAS backplane, allow 30 seconds or so for the drive to spin down completely before removing it. Failing to let the drive spin down before removing it could damage the drive. See Chapter 6.

Power Supplies

Sun Fire V445 server power supplies are hot-swappable. A power supply is hot-swappable only when it is part of a redundant power configuration, which is a system configured with more than two power supplies in working condition.

Caution – Removing a supply that is one of only two installed could cause undefined behavior in the server and could lead to system shutdown.
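The redundancy condition above reduces to a count of working supplies. The Python sketch below illustrates that check; the names are hypothetical and the code does not query any hardware.

```python
MIN_OPERATIONAL_SUPPLIES = 2


def can_hot_swap_supply(working_supplies):
    """Return True if one supply can be removed safely.

    Removal is acceptable only if at least two working supplies remain,
    which is the same as starting with more than two.
    """
    return working_supplies - 1 >= MIN_OPERATIONAL_SUPPLIES


print(can_hot_swap_supply(4))  # redundant configuration
print(can_hot_swap_supply(2))  # only two supplies working
```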

For additional information, see “About the Power Supplies” on page 89. For instructions on removing or installing power supplies, refer to the Sun Fire V445 Server Service Manual.

System Fan Trays

For procedures on removing and installing fan trays, refer to the Sun Fire V445 Server Service Manual.

Caution – At least one fan must remain operational in each of the three pairs of fan trays to maintain adequate system cooling.
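The fan tray constraint can be sketched the same way: every one of the three pairs needs at least one operational tray. In this Python illustration, the function name and the tuple representation of a pair are hypothetical.

```python
def cooling_ok(fan_pairs):
    """Return True if each fan tray pair has at least one operational tray.

    fan_pairs: three (bool, bool) tuples, where True means the tray in
    that position is operational.
    """
    if len(fan_pairs) != 3:
        raise ValueError("the system has three fan tray pairs")
    return all(first or second for first, second in fan_pairs)


# One tray removed from the second pair: cooling is still adequate.
print(cooling_ok([(True, True), (True, False), (True, True)]))
# Both trays of the third pair out: cooling is not adequate.
print(cooling_ok([(True, True), (True, True), (False, False)]))
```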

USB Components

There are two USB ports located on the front panel and two on the back panel. For details on the supported components, see “About the USB Ports” on page 95.

About the Internal Disk Drives

The Sun Fire V445 server supports up to eight internal, hot-pluggable 2.5 inch SAS disk drives, attached to a backplane. The system also includes an internal SAS controller. See “About the SAS Controller” on page 84.

Indicators are associated with each drive, indicating the drive’s operating status, hot-plug readiness, and any fault conditions associated with the drive.

FIGURE 4-5 shows the system’s eight internal hard disk drives and highlights the series of indicators on each drive. Disk drives are numbered 0, 1, 2, 3, 4, 5, 6, and 7, with drive 0 being the default system disk.

FIGURE 4-5 Hard Disk Drives and Indicators

Figure callouts: Power/Access, Service Required, OK-to-Remove

See TABLE 4-4 for a description of hard disk drive indicators and their function.

Note – If a hard disk drive is faulty, the system Service Required indicator is also lit. See “Front Panel Indicators” on page 10 for more information.

The hot-plug feature of the system’s internal hard disk drives enables you to add, remove, or replace disks while the system continues to operate. This capability significantly reduces system downtime associated with hard disk drive replacement.

TABLE 4-4 Hard Disk Drive Status Indicators

LED            Color   Description
OK-to-Remove   Blue    On - The drive is ready for hot-plug removal.
                       Off - Normal operation.
Unused         Amber
Activity       Green   On - Drive is receiving power. Solidly lit if drive is idle. Flashes while the drive processes a command.
                       Off - Power is off.

Disk drive hot-plug procedures require software commands for preparing the system prior to removing a hard disk drive and for reconfiguring the OS after installing a drive. For detailed instructions, see Chapter 6 and also the Sun Fire V445 Server Service Manual.

The Solaris Volume Manager software supplied as part of the Solaris OS allows you to use internal hard disk drives in three software RAID configurations: RAID 0 (striping), RAID 1 (mirroring), and RAID 0+1 (striping plus mirroring). You can also configure drives as hot-spares, disks installed and ready to operate if other disks fail. In addition, you can configure hardware mirroring using the system’s SAS controller. For more information about all supported RAID configurations, see “About RAID Technology” on page 120. For more information about configuring hardware mirroring, see “Creating a Hardware Disk Mirror” on page 124.


Configuration Rules

■ You must use Sun standard 3.5-inch wide and 2.54-inch high (8.89-cm x 5.08-cm) hard disk drives that are SCSI-compatible and run at 10,000 revolutions per minute (rpm). Drives must be either the single-ended or low-voltage differential (LVD) type.

■ The SCSI target address (SCSI ID) of each hard disk drive is determined by the slot location where the drive is connected to the SAS backplane. There is no need to set any SCSI ID jumpers on the hard disk drives themselves.

About the Power Supplies

The Power Distribution Board distributes DC power from four power supplies to all internal system components. The system’s four power supplies, called power supply 0, power supply 1, power supply 2, and power supply 3, plug in directly to connectors on the power distribution board. Each power supply has a separate AC inlet. Two independent AC power sources should be used to provide redundant AC inlet power. All four power supplies share equally in satisfying the power demands of the system, and any two of them can satisfy the entire load of a system with a maximum configuration. AC power is brought to each power supply with a power cord (total of four power cords).

The Sun Fire V445 server’s power supplies are modular, hot-swappable units. They are customer-replaceable units (CRUs) designed for fast, easy installation or removal, even while the system is fully operational. Power supplies are installed in bays at the rear of the system.

The power supplies operate over an AC input range of 100-240 VAC, 47-63 Hz. Each power supply can provide up to 550 watts of 12V DC power. Each power supply contains a series of status indicators, visible when looking at the back panel of the system. FIGURE 4-6 shows the location of the power supplies and indicators.

FIGURE 4-6 Power Supplies and Indicators (callouts: DC Power On, Service Required, AC Power Present)

See TABLE 4-5 for a description of power supply indicators and their function, listed from top to bottom.

Note – If a power supply is faulty, the system Service Required indicator is also lit. See “Front Panel Indicators” on page 10 for more information.

TABLE 4-5 Power Supply Status Indicators

Indicator          Color   Notes
DC Power On        Green   This indicator is lit when the system is powered on and the power supply is operating normally.
Service Required   Amber   This indicator is lit if there is a fault in the power supply.
AC Power Present   Green   This indicator is lit when the power supply is plugged in and AC power is available, regardless of system power state.

Power supplies in a redundant configuration feature a hot-swap capability. You can remove and replace a faulty power supply without shutting down the OS or turning off the system power.

A power supply can be hot-swapped only when there are at least two other power supplies online and working properly. In addition, the cooling fans in each power supply are designed to operate independently of the power supplies. If a power supply fails, but its fans are still operable, the fans continue to operate by drawing power from the other power supplies through the power distribution board.

For additional details, see “About Hot-Pluggable and Hot-Swappable Components” on page 85. For information about removing and installing power supplies, see “Performing a Power Supply Hot-Swap Operation” on page 91, and refer to your Sun Fire V445 Server Service Manual.

Performing a Power Supply Hot-Swap Operation

You can hot-swap any power supply while two others are installed, online, and operational. Check the Service Required indicators to verify which power supply has failed. The failed power supply causes the amber system Service Required indicator and power supply Service Required indicator to light.

To complete this procedure, refer to the Sun Fire V445 Server Service Manual.

Power Supply Configuration Rules

■ Hot-swap a power supply only when there are at least two other power supplies online and working properly.

■ Good practice is to connect the four power supplies to two separate AC circuits, two supplies per circuit, which enables the system to remain operational if one of the AC circuits fails. Consult your local electrical codes for any additional requirements.
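The two rules above can be expressed as simple checks. The following Python sketch is illustrative only: the PowerSupply record, the circuit labels "A" and "B", and the function names are assumptions made for this example and are not part of any Sun software.

```python
from dataclasses import dataclass


@dataclass
class PowerSupply:
    slot: int      # power supply number, 0 through 3
    online: bool   # True if installed, online, and working properly
    circuit: str   # hypothetical label for the AC circuit feeding this supply


def can_hot_swap(supplies, slot):
    """Return True if at least two supplies other than `slot` are online."""
    others_online = sum(1 for ps in supplies if ps.slot != slot and ps.online)
    return others_online >= 2


def survives_circuit_loss(supplies, circuit):
    """Return True if at least two online supplies remain when `circuit` fails."""
    remaining = sum(1 for ps in supplies if ps.online and ps.circuit != circuit)
    return remaining >= 2


# Two supplies per AC circuit, as the good-practice rule suggests.
supplies = [
    PowerSupply(0, True, "A"),
    PowerSupply(1, True, "A"),
    PowerSupply(2, True, "B"),
    PowerSupply(3, True, "B"),
]
print(can_hot_swap(supplies, 0))
print(survives_circuit_loss(supplies, "A"))
```

The threshold of two reflects the statement that any two supplies can carry the entire load of a fully configured system.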


About the System Fan Trays

The system is equipped with six fan trays organized into three redundant pairs. One redundant pair is for cooling the disk drives. The other two redundant pairs cool the CPU/Memory modules, memory DIMMs, and I/O subsystem, and provide front-to-rear cooling of the system. Not all fans must be present to provide adequate cooling; only one fan per redundant pair must be present.

Note – All system cooling is provided by the fan trays. Power supply fans do not provide system cooling.

The fans in the system plug directly into the motherboard. Each fan is mounted on its own tray and is individually hot-swappable. If either fan in a pair fails, the remaining fan is adequate to keep its portion of the system cool. The presence and health of the fans are indicated through six bicolor indicators located on the SAS backplane.

Open the fan tray doors on the top cover of the server to access the system fans. Power supplies are cooled separately, each power supply with its own internal fan.

Caution – Fan trays contain sharp moving parts. Use extreme caution when servicing fan trays and blowers.

FIGURE 4-7 shows all six system fan trays and their corresponding indicators. For each fan in the system, the environmental monitoring subsystem monitors fan speed in revolutions per minute.


FIGURE 4-7 System Fan Trays and Fan Indicators

Refer to these indicators to determine which fan tray needs to be replaced.

TABLE 4-6 provides a description of the fan tray indicators.

Note – If a fan tray is not present, its corresponding indicator is not lit.

Note – If a fan tray is faulty, the system Service Required indicator is also lit. See “Front Panel Indicators” on page 10 for more information.

TABLE 4-6 Fan Tray Status Indicators

Indicator          Color    Notes
Power/OK           Green    This indicator is lit when the system is running and the fan tray is operating normally.
Service Required   Yellow   This indicator is lit when the system is running and the fan tray is faulty.

The environmental subsystem monitors all fans in the system, and prints a warning and lights the system Service Required indicator if any fan falls below its nominal operating speed. This provides an early warning of an impending fan failure, enabling you to schedule downtime for replacement before an overtemperature condition shuts down the system unexpectedly.

For a fan failure, the following indicators are lit:

Front panel:

■ Service Required (amber)
■ Operating (green)
■ Fan failure (amber)
■ CPU over temperature (if the system is overheating)

Top panel:

■ Specific fan failure (amber)
■ All other fans (green)

Back panel:

■ Service Required (amber)
■ Running (green)

In addition, the environmental subsystem prints a warning and lights the system Service Required indicator if internal temperature rises above a predetermined threshold, either due to fan failure or external environmental conditions. For additional details, see Chapter 8.

System Fan Configuration Rules

The minimum system configuration requires at least one fan operating per redundant pair.
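The rule of one operating fan per redundant pair can be sketched as a short check. In this Python example the assignment of fan tray numbers to pairs is hypothetical, since the text does not state which trays form each pair, and the function names are invented for illustration.

```python
# Hypothetical grouping of the six fan trays into three redundant pairs.
FAN_PAIRS = {
    "pair_a": (0, 1),
    "pair_b": (2, 3),
    "pair_c": (4, 5),
}


def cooling_adequate(operational_fans):
    """Return True if every redundant pair has at least one operating fan."""
    return all(
        any(fan in operational_fans for fan in pair)
        for pair in FAN_PAIRS.values()
    )


def pairs_without_redundancy(operational_fans):
    """List pairs that are down to a single operating fan."""
    return [
        name
        for name, pair in FAN_PAIRS.items()
        if sum(fan in operational_fans for fan in pair) == 1
    ]


print(cooling_adequate({0, 2, 4}))
print(pairs_without_redundancy({0, 1, 2, 4, 5}))
```

A pair reported by pairs_without_redundancy still meets the minimum configuration but has no remaining spare fan.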

Note – For instructions on how to remove and install fan trays, refer to the Sun Fire V445 Server Service Manual.

About the USB Ports

The system front and back panels each provide two external Universal Serial Bus (USB) ports, on two independent controllers, to connect USB peripheral devices such as:

■ Sun Type-6 USB keyboard
■ Sun opto-mechanical three-button USB mouse
■ Modems
■ Printers
■ Scanners
■ Digital cameras

The USB ports comply with the Open Host Controller Interface (Open HCI) specification for USB Revision 1.1 and are also USB 2.0 compliant (EHCI), so they support data transmission at 480 Mbps as well as 12 Mbps and 1.5 Mbps. The ports support isochronous and asynchronous modes. Note that the USB data transmission speed is significantly faster than that of the standard serial ports, which operate at a maximum rate of 460.8 Kbaud.

The USB ports are accessible by connecting a USB cable to a back panel USB connector. The connectors at each end of a USB cable are keyed so that you cannot connect them incorrectly. One connector plugs in to the system or USB hub. The other connector plugs in to the peripheral device. Up to 126 USB devices can be connected to each controller simultaneously, through the use of USB hubs. The USB ports provide power for smaller USB devices such as modems. Larger USB devices, such as scanners, require their own power source.

For the USB port locations, see “Locating Back Panel Features” on page 16 and “Locating Front Panel Features” on page 9. Also see “Reference for the USB Connectors” on page 239.

Configuration Rules

■ USB ports support hot-swapping. You can connect and disconnect the USB cable and peripheral devices while the system is running, without issuing software commands and without affecting system operations. However, you can only hot-swap USB components while the OS is running.

■ Hot-swapping USB components is not supported when the system ok prompt is displayed or before the OS boots.

■ You can connect up to 126 devices to each of the two USB controllers, for a total of 252 USB devices per system.

About the Serial Ports

The default console connection to the Sun Fire V445 server is through the RJ-45 serial management port (labeled SERIAL MGT) on the back panel of the ALOM system controller card. This port operates only at 9600 baud.

Note – The serial management port is not a standard serial port. For a standard and POSIX compliant serial port, use the DB-9 port on the system back panel, which corresponds to TTYB.

The system also provides a standard serial communication port through a DB-9 port (labeled TTYB) located on the back panel. This port corresponds to TTYB, and supports baud rates of 50, 75, 110, 134, 150, 200, 300, 600, 1200, 1800, 2400, 4800, 9600, 19200, 38400, 57600, 115200, 153600, 230400, 307200, and 460800. The port is accessible by connecting a serial cable to the back panel serial port connector.
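When scripting a terminal configuration, it can help to validate a requested speed against the list above before applying it. The following Python sketch simply encodes the listed baud rates; the constant and function names are made up for this example.

```python
# Baud rates listed for the DB-9 (TTYB) port.
SUPPORTED_BAUD_RATES = (
    50, 75, 110, 134, 150, 200, 300, 600, 1200, 1800, 2400, 4800, 9600,
    19200, 38400, 57600, 115200, 153600, 230400, 307200, 460800,
)

# The serial management port operates at a single fixed speed.
SERIAL_MGT_BAUD = 9600


def is_supported_baud(rate):
    """Return True if `rate` is one of the listed TTYB baud rates."""
    return rate in SUPPORTED_BAUD_RATES


print(is_supported_baud(9600))
print(is_supported_baud(14400))
```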

For the serial port location, see “Locating Back Panel Features” on page 16. Also see “Reference for the Serial Port Connector” on page 238. For more information about the serial management port, see Chapter 2.

CHAPTER 5

Managing RAS Features and System Firmware

This chapter describes how to manage reliability, availability, and serviceability (RAS) features and system firmware, including the Sun Advanced Lights Out Manager (ALOM) system controller, automatic system restoration (ASR), and the hardware watchdog mechanism. In addition, this chapter describes how to unconfigure and reconfigure a device manually, and introduces multipathing software.

This chapter contains the following sections:

■ “About Reliability, Availability, and Serviceability Features” on page 98
■ “About the ALOM System Controller Command Prompt” on page 103
■ “Logging In to the ALOM System Controller” on page 104
■ “About the scadm Utility” on page 106
■ “Viewing Environmental Information” on page 107
■ “Controlling the Locator Indicator” on page 108
■ “About Performing OpenBoot Emergency Procedures” on page 109
■ “About Automatic System Restoration” on page 111
■ “Unconfiguring a Device Manually” on page 112
■ “Reconfiguring a Device Manually” on page 114
■ “Enabling the Hardware Watchdog Mechanism and Its Options” on page 114
■ “About Multipathing Software” on page 115

Note – This chapter does not cover detailed troubleshooting and diagnostic procedures. For information about fault isolation and diagnostic procedures, see Chapter 8 and Chapter 9.

About Reliability, Availability, and Serviceability Features

Reliability, availability, and serviceability (RAS) are aspects of a system’s design that affect its ability to operate continuously and to minimize the time necessary to service the system.

■ Reliability refers to a system’s ability to operate continuously without failures and to maintain data integrity.

■ System availability encompasses the ability of a system to both recover in the presence of a fault with no impact to the operational environment, and restore in the presence of a fault with minimal impact to the operational environment.

■ Serviceability refers to the time it takes to diagnose and complete the repair policy of a system, following a system failure.

Together, reliability, availability, and serviceability features provide near continuous system operation.

To deliver high levels of reliability, availability, and serviceability, the Sun Fire V445 server offers the following features:

■ Hot-pluggable disk drives
■ Redundant, hot-swappable power supplies, fan trays, and USB components
■ Sun Advanced Lights Out Manager (ALOM) system controller with SSH connections for all remote monitoring and control
■ Environmental monitoring
■ Automatic system restoration (ASR) capabilities for PCI cards and memory DIMMs
■ Hardware watchdog mechanism and externally initiated reset (XIR) capability
■ Internal hardware disk mirroring (RAID 0/1)
■ Support for disk and network multipathing with automatic failover
■ Error correction and parity checking for improved data integrity
■ Easy access to all internal replaceable components
■ Full in-rack serviceability for all components

Hot-Pluggable and Hot-Swappable Components

Sun Fire V445 hardware is designed to support hot-plugging of internal disk drives. By using the proper software commands, you can install or remove these components while the system is running. The server also supports hot-swapping of power supplies, fan trays, and USB components. These components can be removed and installed without issuing software commands. Hot-plug and hot-swap technology significantly increase the system’s serviceability and availability, by providing you with the ability to do the following:

■ Increase storage capacity dynamically to handle larger work loads and to improve system performance

■ Replace disk drives and power supplies without service disruption

For additional information about the system’s hot-pluggable and hot-swappable components, see “About Hot-Pluggable and Hot-Swappable Components” on page 85.


n+2 Power Supply Redundancy

The system features four hot-swappable power supplies, any two of which are capable of handling the system’s entire load. Thus, the four power supplies provide N+N redundancy, enabling the system to continue operating should up to two of the power supplies or their AC power sources fail.

For more information about power supplies, redundancy, and configuration rules, see “About the Power Supplies” on page 89.

ALOM System Controller

Sun Advanced Lights Out Manager (ALOM) system controller is a secure server management tool that comes preinstalled on the Sun Fire V445 server, in the form of a module with preinstalled firmware. It lets you monitor and control your server over a serial line or over a network. The ALOM system controller provides remote system administration for geographically distributed or physically inaccessible systems. You can connect to the ALOM system controller card using a local alphanumeric terminal, a terminal server, or a modem connected to its serial management port, or over a network using its 10BASE-T network management port.

For more details about the ALOM system controller hardware, see “About the ALOM System Controller Card” on page 77.

For information about configuring and using the ALOM system controller, see:

■ “About the ALOM System Controller Command Prompt” on page 103
■ “Logging In to the ALOM System Controller” on page 104
■ “About the scadm Utility” on page 106
■ Sun Advanced Lights Out Manager (ALOM) Online Help

Environmental Monitoring and Control

The Sun Fire V445 server features an environmental monitoring subsystem that protects the server and its components against:

■ Extreme temperatures
■ Lack of adequate airflow through the system
■ Operating with missing or misconfigured components
■ Power supply failures
■ Internal hardware faults

Monitoring and control capabilities are handled by the ALOM system controller firmware. This ensures that monitoring capabilities remain operational even if the system has halted or is unable to boot, and without requiring the system to dedicate CPU and memory resources to monitor itself. If the ALOM system controller fails, the operating system reports the failure and takes over limited environmental monitoring and control functions.

The environmental monitoring subsystem uses an industry-standard I2C bus. The I2C bus is a simple two-wire serial bus used throughout the system to allow the monitoring and control of temperature sensors, fan trays, power supplies, and status indicators.

Temperature sensors are located throughout the system to monitor the ambient temperature of the system, the CPUs, and the CPU die temperature. The monitoring subsystem polls each sensor and uses the sampled temperatures to report and respond to any overtemperature or undertemperature conditions. Additional I2C sensors detect component presence and component faults.

The hardware and software together ensure that the temperatures within the enclosure do not exceed predetermined “safe operation” ranges. If the temperature observed by a sensor falls below a low-temperature warning threshold or rises above a high-temperature warning threshold, the monitoring subsystem software lights the system Service Required indicators on the front and back panels. If the temperature condition persists and reaches a critical threshold, the system initiates a graceful system shutdown. In the event of a failure of the ALOM system controller, backup sensors are used to protect the system from serious damage, by initiating a forced hardware shutdown.
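The threshold behavior described above can be sketched as a small classifier. This Python example borrows the six column names that the showenvironment command prints (LowHard, LowSoft, LowWarn, HighWarn, HighSoft, HighHard). Treating the "soft" thresholds as the graceful-shutdown point and the "hard" thresholds as the forced-shutdown point is an assumption made for illustration; the text does not define the columns. The class and function names are likewise hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    low_hard: int
    low_soft: int
    low_warn: int
    high_warn: int
    high_soft: int
    high_hard: int


def classify(temp, t):
    """Map a sampled temperature (Celsius) to an assumed response level."""
    if temp <= t.low_hard or temp >= t.high_hard:
        return "hard-shutdown"   # assumed: forced hardware shutdown
    if temp <= t.low_soft or temp >= t.high_soft:
        return "soft-shutdown"   # assumed: graceful system shutdown
    if temp < t.low_warn or temp > t.high_warn:
        return "warning"         # Service Required indicators are lit
    return "ok"


# Threshold values as shown for a CPU core sensor in showenvironment output.
cpu_core = Thresholds(-20, -10, 0, 108, 113, 120)
for sample in (72, 110, 113, 121):
    print(sample, classify(sample, cpu_core))
```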

All error and warning messages are sent to the system console and logged in the /var/adm/messages file. Service Required indicators remain lit after an automatic system shutdown to aid in problem diagnosis.

The monitoring subsystem is also designed to detect fan failures. The system features integral power supply fan trays, and six fan trays each containing one fan. Four fans are for cooling CPU/Memory modules and two fans are for cooling the disk drive. All fans are hot-swappable. If any fan fails, the monitoring subsystem detects the failure and generates an error message to the system console, logs the message in the /var/adm/messages file, and lights the Service Required indicators.

The power subsystem is monitored in a similar fashion. Polling the power supply status periodically, the monitoring subsystem indicates the status of each supply’s DC outputs, AC inputs, and presence.

Note – The power supply fans are not required for system cooling. However, if a power supply fails, its fan obtains power from other power supplies and through the motherboard to maintain the cooling function.

If a power supply problem is detected, an error message is sent to the system console and logged in the /var/adm/messages file. Additionally, indicators located on each power supply light to indicate failures. The system Service Required indicator lights to indicate a system fault. The ALOM system controller console alerts record power supply failures.

Automatic System Restoration

The system provides automatic system restoration (ASR) from component failures in memory modules and PCI cards.

The ASR features enable the system to resume operation after experiencing certain nonfatal hardware faults or failures. Automatic self-test features enable the system to detect failed hardware components. An autoconfiguring capability designed into the system’s boot firmware enables the system to unconfigure failed components and to restore system operation. As long as the system can operate without the failed component, the ASR features enable the system to reboot automatically, without operator intervention.

During the power-on sequence, if a faulty component is detected, the component is marked as failed and, if the system can function, the boot sequence continues. In a running system, some types of failures can cause the system to fail. If this happens, the ASR functionality enables the system to reboot immediately if it is possible for the system to detect the failed component and operate without it. This prevents a faulty hardware component from keeping the entire system down or causing the system to crash repeatedly.

Note – Control over the system ASR functionality is provided by several OpenBoot commands and configuration variables. For additional details, see “About Automatic System Restoration” on page 209.

Sun StorEdge Traffic Manager

Sun StorEdge™ Traffic Manager, a feature found in the Solaris OS and later versions, is a native multipathing solution for storage devices such as Sun StorEdge disk arrays. Sun StorEdge Traffic Manager provides the following features:

■ Host-level multipathing
■ Physical host controller interface (pHCI) support
■ Sun StorEdge T3, Sun StorEdge 3510, and Sun StorEdge A5x00 support
■ Load balancing

For more information, see “Sun StorEdge Traffic Manager” on page 119. Also consult your Solaris software documentation.

Hardware Watchdog Mechanism and XIR

To detect and respond to a system hang, should one ever occur, the Sun Fire V445 server features a hardware “watchdog” mechanism, which is a hardware timer that is continually reset as long as the operating system is running. In the event of a system hang, the operating system is no longer able to reset the timer. The timer will then expire and cause an automatic externally initiated reset (XIR), eliminating the need for operator intervention. When the hardware watchdog mechanism issues the XIR, debug information is displayed on the system console. The hardware watchdog mechanism is present by default, but it requires some additional setup in the Solaris OS.
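The watchdog principle described above (a timer that the operating system must keep resetting) can be illustrated with a small simulation. This Python sketch is a conceptual model only: the timeout value, the time units, and all names are invented for the example and do not reflect the actual hardware timer or its configuration.

```python
class WatchdogTimer:
    """Simulated countdown timer that must be reset before it expires."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.deadline = timeout  # simulated time starts at 0

    def reset(self, now):
        """Called by a healthy OS to push the deadline forward."""
        self.deadline = now + self.timeout

    def expired(self, now):
        return now >= self.deadline


def first_xir_time(timeout, reset_times, end_time):
    """Return the simulated time at which an XIR would fire, or None."""
    watchdog = WatchdogTimer(timeout)
    resets = set(reset_times)
    for now in range(end_time + 1):
        if watchdog.expired(now):
            return now           # timer expired: reset would be issued here
        if now in resets:
            watchdog.reset(now)  # OS is alive and resets the timer
    return None


# A healthy OS resets the timer regularly; a hung OS stops after t=3.
print(first_xir_time(5, [0, 3, 6, 9], 12))
print(first_xir_time(5, [0, 3], 20))
```

In the second call the simulated operating system stops resetting the timer, so the timer runs out one timeout period after the last reset.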

The XIR feature is also available for you to invoke manually at the ALOM system controller prompt. You use the ALOM system controller reset -x command manually when the system is unresponsive and an L1-A (Stop-A) keyboard command or alphanumeric terminal Break key does not work. When you issue the reset -x command manually, the system is immediately returned to the OpenBoot ok prompt. From there, you can use OpenBoot commands to debug the system.

For more information, see:

■ “Enabling the Hardware Watchdog Mechanism and Its Options” on page 114
■ Chapter 8 and Chapter 9

Support for RAID Storage Configurations

By attaching one or more external storage devices to the Sun Fire V445 server, you can use a redundant array of independent disks (RAID) software application such as Solstice DiskSuite™ to configure system disk storage in a variety of different RAID levels. Configuration options include RAID 0 (striping), RAID 1 (mirroring), RAID 0+1 (striping plus mirroring), RAID 1+0 (mirroring plus striping), and RAID 5 (striping with interleaved parity). You choose the appropriate RAID configuration based on the price, performance, reliability, and availability goals for your system. You can also configure one or more disk drives to serve as “hot spares” to fill in automatically in the event of a disk drive failure.

In addition to software RAID configurations, you can set up a hardware RAID 1 (mirroring) configuration for any pair of internal disk drives using the SAS controller, providing a high-performance solution for disk drive mirroring.

For more information, see:

■ “About Volume Management Software” on page 118
■ “About RAID Technology” on page 120
■ “Creating a Hardware Disk Mirror” on page 124

Error Correction and Parity Checking

DIMMs employ error-correcting code (ECC) to ensure high levels of data integrity. The system reports and logs correctable ECC errors. (A correctable ECC error is any single-bit error in a 128-bit field.) Such errors are corrected as soon as they are detected. The ECC implementation can also detect double-bit errors in the same 128-bit field and multiple-bit errors in the same nibble (4 bits). In addition to providing ECC protection for data, parity protection is also used on the PCI and UltraSCSI buses, and in the UltraSPARC IIIi CPU internal caches. ECC detection and correction for DRAM is present in the 1 Mbyte on-chip ecache SRAM of the UltraSPARC-IIIi processor.
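The error classes named above can be made concrete with a small model. The following Python sketch does not implement an ECC code; it only compares a stored 128-bit value with the value read back and reports which of the described classes the difference falls into. The function name and the "unspecified" label for cases the text does not cover are assumptions for this example.

```python
FIELD_BITS = 128
NIBBLE_BITS = 4


def classify_ecc_error(stored, read):
    """Classify the bit difference between two 128-bit values."""
    diff = (stored ^ read) & ((1 << FIELD_BITS) - 1)
    flipped = bin(diff).count("1")
    if flipped == 0:
        return "no error"
    if flipped == 1:
        return "correctable"            # single-bit error in the field
    if flipped == 2:
        return "detected"               # double-bit error in the field
    # Which aligned 4-bit groups contain at least one flipped bit?
    nibbles = {
        bit // NIBBLE_BITS
        for bit in range(FIELD_BITS)
        if (diff >> bit) & 1
    }
    if len(nibbles) == 1:
        return "detected"               # multiple bits within one nibble
    return "unspecified"                # outside the cases the text describes


print(classify_ecc_error(0b0000, 0b0001))
print(classify_ecc_error(0b0000, 0b0111))
```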

About the ALOM System Controller Command Prompt

The ALOM system controller supports a total of five concurrent sessions per server: four connections available through the network management port and one connection through the serial management port.

Note – Some of the ALOM system controller commands are also available through the Solaris scadm utility. For more information, see the Sun Advanced Lights Out Manager (ALOM) Online Help.

After you log in to your ALOM account, the ALOM system controller command prompt (sc>) appears, and you can enter ALOM system controller commands. If the command you want to use has multiple options, you can either enter the options individually or grouped together, as shown in the following example. The commands are identical.

TABLE 5-1

sc> poweroff -f -y

sc> poweroff -fy
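The equivalence of the two forms follows the familiar convention for single-letter options. As an analogy only (this is Python's getopt module, not the ALOM command parser), the following sketch shows that separate and grouped flags parse to the same result; the parse_flags helper is a hypothetical name.

```python
import getopt


def parse_flags(argv, optstring="fy"):
    """Parse single-letter flags; return (sorted flags, remaining args)."""
    opts, rest = getopt.getopt(argv, optstring)
    return sorted(flag for flag, _value in opts), rest


separate = parse_flags(["-f", "-y"])
grouped = parse_flags(["-fy"])
print(separate == grouped)
```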


Logging In to the ALOM System Controller

All environmental monitoring and control is handled by the ALOM system controller. The ALOM system controller command prompt (sc>) provides you with a way of interacting with the system controller. For more information about the sc> prompt, see “About the sc> Prompt” on page 32.

For instructions on connecting to the ALOM system controller, see:

■ “Using the Serial Management Port” on page 41
■ “Activating the Network Management Port” on page 42

▼ To Log In to the ALOM System Controller

Note – This procedure assumes that the system console is directed to use the serial management and network management ports (the default configuration).

1. If you are logged in to the system console, type #. to get to the sc> prompt.

Press the hash key, followed by the period key. Then press the Return key.

2. At the login prompt, enter the login name and press Return.

The default login name is admin.

TABLE 5-2

Sun(tm) Advanced Lights Out Manager 1.1

Please login: admin

3. At the password prompt, enter the password and press Return twice to get to the sc> prompt.

TABLE 5-3

Please Enter password:

sc>

Note – There is no default password. You must assign a password during initial system configuration. For more information, see your Sun Fire V445 Server Installation Guide and Sun Advanced Lights Out Manager (ALOM) Online Help.

Caution – In order to provide optimum system security, best practice is to change the default system login name and password during initial setup.

Using the ALOM system controller, you can monitor the system, turn the Locator indicator on and off, or perform maintenance tasks on the ALOM system controller card itself. For more information, see:

■ Sun Advanced Lights Out Manager (ALOM) Online Help

About the scadm Utility

The System Controller Administration (scadm) utility, which is part of the Solaris OS, enables you to perform many ALOM tasks while logged in to the host server. The scadm commands control several functions. Some functions allow you to view or set ALOM environment variables.

Note – Do not use the scadm utility while SunVTS™ diagnostics are running. See your SunVTS documentation for more information.

You m ust be logged in to the system as sup eruser to use the scadm utility. Thescadm utility uses the following syntax:

The scadm utility sends its outpu t to stdout. You can also use scadm in scripts toman age and configure ALOM from the h ost system.

For more information about the scadm utility, refer to t he following :

s scadm man pages Sun Advanced Lights Out Manager (A LOM) Online Help

TABLE 5-4

# scadm command 

Viewing Environmental Information

Use the showenvironment command to view environment information.

▼ To View Environmental Information

1. Log in to the ALOM system controller.

2. Use the showenvironment command to display a snapshot of the server’s environmental status.

The information this command can display includes temperature, power supply status, front panel indicator status, and so on. The display uses a format similar to that of the UNIX command prtdiag(1M).

TABLE 5-5

sc> showenvironment

=============== Environmental Status ===============

------------------------------------------------------------------------------
System Temperatures (Temperatures in Celsius):
------------------------------------------------------------------------------
Sensor         Status Temp LowHard LowSoft LowWarn HighWarn HighSoft HighHard
------------------------------------------------------------------------------
C1.P0.T_CORE   OK       72     -20     -10       0      108      113      120
C1.P0.T_CORE   OK       68     -20     -10       0      108      113      120
C2.P0.T_CORE   OK       70     -20     -10       0      108      113      120
C3.P0.T_CORE   OK       70     -20     -10       0      108      113      120
C0.T_AMB       OK       23     -20     -10       0       60       65       75
C1.T_AMB       OK       23     -20     -10       0       60       65       75
C2.T_AMB       OK       23     -20     -10       0       60       65       75
C3.T_AMB       OK       23     -20     -10       0       60       65       75
FIRE.T_CORE    OK       40     -20     -10       0       80       85       92
MB.IO_T_AMB    OK       31     -20     -10       0       70       75       82
FIOB.T_AMB     OK       26     -18     -10       0       65       75       85
MB.T_AMB       OK       28     -20     -10       0       70       75       82
....

Note – Some environmental information might not be available when the server is in Standby mode.

Note – You do not need ALOM system controller user permissions to use this command.
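Because the temperature rows have a fixed column layout, a captured copy of the output can be checked from a script. The following is a minimal sketch, not an ALOM feature: it assumes the showenvironment output has already been saved or piped in as text, and that each sensor row has the nine columns of the sample output. The function name near_high_warn and its margin argument are made up for illustration.

```shell
# Hypothetical helper (not an ALOM command): list sensors whose current
# temperature is within MARGIN degrees Celsius of the HighWarn threshold.
# Input: temperature rows captured from showenvironment, on stdin.
# Output: sensor name, current temperature, HighWarn threshold.
near_high_warn() {
    margin=${1:-10}
    awk -v margin="$margin" '
        # Sensor rows have nine fields. The header row is skipped because
        # its Temp and HighWarn fields are not numbers.
        NF == 9 && $3 ~ /^-?[0-9]+$/ && $7 ~ /^-?[0-9]+$/ {
            if ($3 + 0 >= $7 - margin)
                print $1, $3, $7
        }'
}

# Two rows copied from the sample output, checked with a 40-degree margin.
# Only the first row is reported (72 is within 40 degrees of 108).
near_high_warn 40 <<'EOF'
Sensor Status Temp LowHard LowSoft LowWarn HighWarn HighSoft HighHard
C1.P0.T_CORE OK 72 -20 -10 0 108 113 120
MB.T_AMB OK 28 -20 -10 0 70 75 82
EOF
```

The margin is a policy choice of the sketch. The thresholds themselves come from the HighWarn column reported by the system controller.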

Controlling the Locator Indicator

The Locator indicator locates the server in a data center or lab. When the Locator indicator is enabled, the white Locator indicator flashes. You can control the Locator indicator either from the Solaris command prompt or from the sc> prompt. You can also reset the Locator indicator with the Locator indicator button.

▼ To Control the Locator Indicator

1. To turn on the Locator indicator, do one of the following:

■ In the Solaris OS, log in as superuser and type:

TABLE 5-6

# /usr/sbin/locator -n
Locator LED is on.

■ From the ALOM system controller command prompt, type:

TABLE 5-7

sc> locator on
Locator LED is on.

2. To turn off the Locator indicator, do one of the following:

■ In Solaris, log in as root and type:

TABLE 5-8

# /usr/sbin/locator -f
Locator LED is off.

■ From the ALOM system controller command prompt, type:

TABLE 5-9

sc> locator off
Locator LED is off.

3. To display the state of the Locator indicator, do one of the following:

■ In the Solaris OS, log in as root and type:

TABLE 5-10

# /usr/sbin/locator
The ’system’ locator is on.

■ From the ALOM system controller command prompt, type:

TABLE 5-11

sc> locator
The ’system’ locator is on.

Note – You do not need user permissions to use the locator commands.

About Performing OpenBoot Emergency Procedures

The introduction of Universal Serial Bus (USB) keyboards with the newest Sun systems has made it necessary to change some of the OpenBoot emergency procedures. Specifically, the Stop-N, Stop-D, and Stop-F commands that were available on systems with non-USB keyboards are not supported on systems that use USB keyboards, such as the Sun Fire V445 server. If you are familiar with the earlier (non-USB) keyboard functionality, this section describes the analogous OpenBoot emergency procedures available in newer systems that use USB keyboards.

The following sections describe how to perform the functions of the Stop commands on systems that use USB keyboards, such as the Sun Fire V445 server. These same functions are available through Sun Advanced Lights Out Manager (ALOM) system controller software.

Stop-A Function

The Stop-A (Abort) key sequence works the same as it does on systems with standard keyboards, except that it does not work during the first few seconds after the server is reset. In addition, you can issue the ALOM system controller break command. For more information, see “Entering the ok Prompt” on page 35.

Stop-N Function

The Stop-N function is not available. However, you can reset OpenBoot configuration variables to their default values by completing the following steps, provided the system console is configured to be accessible using either the serial management port or the network management port.

▼ To Emulate the Stop-N Function

1. Log in to the ALOM system controller.

2. Type:

TABLE 5-12

sc> bootmode reset_nvram
sc>
SC Alert: SC set bootmode to reset_nvram, will expire 20030218184441.
bootmode
Bootmode: reset_nvram
Expires TUE FEB 18 18:44:41 2003

This command resets the OpenBoot configuration variables to their default values.

3. To reset the system, type:

TABLE 5-13

sc> reset
Are you sure you want to reset the system [y/n]? y
sc> console

4. To view console output as the system boots with default OpenBoot configuration variables, switch to console mode.

TABLE 5-14

sc> console
ok

5. Type set-defaults to discard any customized IDPROM values and to restore the default settings for all OpenBoot configuration variables.

Stop-F Function

The Stop-F function is not available on systems with USB keyboards.

Stop-D Function

The Stop-D (Diags) key sequence is not supported on systems with USB keyboards. However, the Stop-D function can be closely emulated with ALOM software by enabling the Diagnostics mode.

In addition, you can emulate the Stop-D function using the ALOM system controller bootmode diag command. For more information, see the Sun Advanced Lights Out Manager (ALOM) Online Help.

About Automatic System Restoration

The system provides automatic system restoration (ASR) from failures in memory modules or PCI cards.

Automatic system restoration functionality enables the system to resume operation after experiencing certain nonfatal hardware faults or failures. When ASR is enabled, the system’s firmware diagnostics automatically detect failed hardware components. An autoconfiguring capability designed into the OpenBoot firmware enables the system to unconfigure failed components and to restore system operation. As long as the system is capable of operating without the failed component, the ASR features enable the system to reboot automatically, without operator intervention.

For more information about ASR, see “About Automatic System Restoration” on page 209.

Unconfiguring a Device Manually

To support a degraded boot capability, the OpenBoot firmware provides the asr-disable command, which enables you to unconfigure system devices manually. This command “marks” a specified device as disabled, by creating an appropriate status property in the corresponding device tree node. By convention, the Solaris OS does not activate a driver for any device so marked.

▼ To Unconfigure a Device Manually

1. At the ok prompt, type:

ok asr-disable device-identifier

where device-identifier is one of the following:

■ Any full physical device path as reported by the OpenBoot show-devs command
■ Any valid device alias as reported by the OpenBoot devalias command
■ Any device identifier from the following table

Note – The device identifiers are not case-sensitive. You can type them as uppercase or lowercase characters.

TABLE 5-15 Device Identifiers and Devices

Device Identifiers                                   Devices
cpu0-bank0, cpu0-bank1, cpu0-bank2, cpu0-bank3, ...  Memory banks 0 – 3 for each CPU
  cpu3-bank0, cpu3-bank1, cpu3-bank2, cpu3-bank3
cpu0-bank*, cpu1-bank*, ... cpu3-bank*               All memory banks for each CPU
ide                                                  On-board IDE controller
net0, net1, net2, net3                               On-board Ethernet controllers
ob-scsi                                              SAS controller
pci0, ... pci7                                       PCI slots 0 – 7
pci-slot*                                            All PCI slots
pci*                                                 All on-board PCI devices (on-board Ethernet, SAS) and all PCI slots
hba8, hba9                                           PCI bridge chips 0 and 1, respectively
usb0, ..., usb4                                      USB devices
*                                                    All devices

■ You can determine full physical device paths by typing:

ok show-devs

The show-devs command lists the system devices and displays the full path name of each device.

■ You can display a list of current device aliases by typing:

ok devalias

■ You can also create your own device alias for a physical device by typing:

ok devalias alias-name physical-device-path

where alias-name is the alias that you want to assign, and physical-device-path is the full physical device path for the device.

Note – If you manually disable a device using asr-disable, and then assign a different alias to the device, the device remains disabled even though the device alias has changed.

2. To cause the parameter change to take effect, type:

ok reset-all

The system permanently stores the parameter change.

Note – To store parameter changes, you can also power cycle the system using the front panel Power button.
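When device identifiers are recorded in site runbooks or generated by scripts, it can help to check them against TABLE 5-15 before anyone types them at the ok prompt. The following is a minimal sketch under stated assumptions: the function name is_asr_identifier is hypothetical, the check runs on an administrative workstation rather than in OpenBoot, and it covers only the identifiers listed in TABLE 5-15 (not full physical device paths or device aliases, which asr-disable also accepts).

```shell
# Hypothetical helper: succeed if the argument is one of the device
# identifiers listed in TABLE 5-15. Physical device paths and devalias
# names are outside the scope of this check.
is_asr_identifier() {
    # The identifiers are not case-sensitive, so compare in lowercase.
    ident=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    case $ident in
        # A backslash makes the asterisk literal inside a case pattern.
        cpu[0-3]-bank[0-3] | cpu[0-3]-bank\*) return 0 ;;
        ide | ob-scsi | net[0-3] | pci[0-7] | usb[0-4] | hba8 | hba9) return 0 ;;
        pci-slot\* | pci\* | \*) return 0 ;;
        *) return 1 ;;
    esac
}

for candidate in NET0 'cpu2-bank*' pci9; do
    if is_asr_identifier "$candidate"; then
        echo "$candidate: listed"
    else
        echo "$candidate: not listed"
    fi
done
```

The loop reports NET0 and cpu2-bank* as listed and pci9 as not listed, because TABLE 5-15 defines PCI slots 0 through 7 only.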

Reconfiguring a Device Manually

You can use the OpenBoot asr-enable command to reconfigure any device that you previously unconfigured with the asr-disable command.

▼ To Reconfigure a Device Manually

1. At the ok prompt, type:

ok asr-enable device-identifier

where the device-identifier is one of the following:

■ Any full physical device path as reported by the OpenBoot show-devs command
■ Any valid device alias as reported by the OpenBoot devalias command
■ Any device identifier from the following table

Note – The device identifiers are not case-sensitive. You can type them as uppercase or lowercase characters.

For a list of device identifiers and devices, see TABLE 5-15.

Enabling the Hardware Watchdog Mechanism and Its Options

For background information about the hardware watchdog mechanism and related externally initiated reset (XIR) functionality, see:

■ “Hardware Watchdog Mechanism and XIR” on page 102

▼ To Enable the Hardware Watchdog Mechanism and Its Options

1. Edit the /etc/system file to include the following entry:

set watchdog_enable = 1

2. To obtain the ok prompt, type:

TABLE 5-16

# init 0

3. Reboot the system so that the changes can take effect.

4. To have the hardware watchdog mechanism automatically reboot the system in case of system hang, at the ok prompt, type:

ok setenv error-reset-recovery boot

5. To generate automated crash dumps in case of system hang, at the ok prompt, type:

ok setenv error-reset-recovery none

The sync option leaves you at the ok prompt to debug the system. For more information about OpenBoot configuration variables, see Appendix C.

About Multipathing Software

Multipathing software allows you to define and control redundant physical paths to I/O devices, such as storage devices and network interfaces. If the active path to a device becomes unavailable, the software can automatically switch to an alternate path to maintain availability. This capability is known as automatic failover. To take advantage of multipathing capabilities, you must configure the server with redundant hardware, such as redundant network interfaces or two host bus adapters connected to the same dual-ported storage array.
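The idea behind automatic failover can be illustrated with a few lines of shell. This is only a conceptual sketch and does not describe how any particular multipathing product is implemented. The function name select_path, the path names hba0 and hba1, and the online and failed state labels are all hypothetical.

```shell
# Conceptual sketch of automatic failover: given an ordered list of
# NAME=STATE pairs, print the first path whose state is "online".
# The names and state labels are illustrative only.
select_path() {
    for entry in "$@"; do
        candidate=${entry%%=*}   # text before the first "="
        state=${entry#*=}        # text after the first "="
        if [ "$state" = "online" ]; then
            echo "$candidate"
            return 0
        fi
    done
    echo "no path available" >&2
    return 1
}

# The preferred path has failed, so the alternate path is chosen.
select_path hba0=failed hba1=online
```

With only one physical path there is nothing to fail over to, which is why the redundant hardware described above is a prerequisite.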

For the Sun Fire V445 server, three different types of multipathing software areavailable:

■ Solaris IP Network Multipathing software provides multipathing and load-balancing capabilities for IP network interfaces.

■ Sun StorEdge™ Traffic Manager is an architecture fully integrated within the Solaris OS (beginning with the Solaris 8 release) that enables I/O devices to be accessed through multiple host controller interfaces from a single instance of the I/O device.

■ VERITAS Volume Manager

For information about setting up redundant hardware interfaces for networks, see “About Redundant Network Interfaces” on page 142.

For instructions on how to configure and administer Solaris IP Network Multipathing, consult the IP Network Multipathing Administration Guide provided with your specific Solaris release.

For information about Sun StorEdge Traffic Manager, see “Sun StorEdge Traffic Manager” on page 102 and refer to your Solaris OS documentation.

For information about VERITAS Volume Manager and its DMP feature, see “About Volume Management Software” on page 118 and refer to the documentation provided with the VERITAS Volume Manager software.

CHAPTER 6

Managing Disk Volumes

This chapter describes redundant array of independent disks (RAID) concepts, how to manage disk volumes, and how to configure hardware mirroring using the SAS controller.

This chapter contains the following sections:

■ “About Disk Volumes” on page 118
■ “About Volume Management Software” on page 118
■ “About RAID Technology” on page 120
■ “About Hardware Disk Mirroring” on page 122
■ “About Physical Disk Slot Numbers, Physical Device Names, and Logical Device Names” on page 123
■ “Creating a Hardware Disk Mirror” on page 124
■ “Creating a Hardware Mirrored Volume of the Default Boot Device” on page 126
■ “Creating a Hardware Striped Volume” on page 128
■ “Configuring and Labeling a Hardware RAID Volume for Use in the Solaris Operating System” on page 129
■ “Deleting a Hardware Disk Mirror” on page 132
■ “Performing a Mirrored Disk Hot-Plug Operation” on page 134
■ “Performing a Nonmirrored Disk Hot-Plug Operation” on page 136

About Disk Volumes

Disk volumes are logical disk devices comprising one or more physical disks or partitions from several different disks.

Once you create a volume, the OS uses and maintains the volume as if it were a single disk. By providing this logical volume management layer, the software overcomes the restrictions imposed by physical disk devices.

Sun’s volume management products also provide RAID data redundancy and performance features. RAID is a technology that helps protect against disk and hardware failures. Through RAID technology, volume management software is able to provide high data availability, excellent I/O performance, and simplified administration.

About Volume Management Software

Volume management software lets you create disk volumes. Sun Microsystems offers two different volume management applications for use on the Sun Fire V445 server:

■ Solaris Volume Manager software
■ VERITAS Volume Manager software

Sun’s volume management applications offer the following features:

■ Support for several types of RAID configurations, which provide varying degrees of availability, capacity, and performance
■ Hot-spare facilities, which provide for automatic data recovery when disks fail
■ Performance analysis tools, which enable you to monitor I/O performance and isolate bottlenecks
■ A graphical user interface (GUI), which simplifies storage management
■ Support for online resizing, which enables volumes and their file systems to grow and shrink online
■ Online reconfiguration facilities, which let you change to a different RAID configuration or modify characteristics of an existing configuration

VERITAS Dynamic Multipathing

VERITAS Volume Manager software actively supports multiported disk arrays. It automatically recognizes multiple I/O paths to a particular disk device within an array. Called Dynamic Multipathing (DMP), this capability provides increased reliability by providing a path failover mechanism. If one connection to a disk is lost, VERITAS Volume Manager continues to access the data over the remaining connections. This multipathing capability also provides greater I/O throughput by automatically balancing the I/O load uniformly across multiple I/O paths to each disk device.

Sun StorEdge Traffic Manager

A newer alternative to DMP that is also supported by the Sun Fire V445 server is Sun StorEdge Traffic Manager software. Sun StorEdge Traffic Manager is a server-based dynamic path failover software solution, used to improve the overall availability of business applications. Sun StorEdge Traffic Manager (previously known as multiplexed input/output, or MPxIO) is included in the Solaris OS.

The Sun StorEdge Traffic Manager software integrates multiple path I/O capabilities, automatic load balancing, and path failover functions into one package for Sun servers connected to supported Sun StorEdge systems. Sun StorEdge Traffic Manager can provide you with increased system performance and availability for building mission-critical storage area networks (SANs).

The Sun StorEdge Traffic Manager architecture provides the following capabilities:

■ Helps protect against I/O outages due to I/O controller failures. Should one I/O controller fail, Sun StorEdge Traffic Manager automatically switches to an alternate controller.
■ Increases I/O performance by load balancing across multiple I/O channels.

Sun StorEdge T3, Sun StorEdge 3510, and Sun StorEdge A5x00 storage arrays are all supported by Sun StorEdge Traffic Manager on a Sun Fire V445 server. Supported I/O controllers are single and dual fibre-channel network adapters, including the following:

■ PCI Single Fibre-Channel Host Adapter (Sun part number x6799A)
■ PCI Dual Fibre-Channel Network Adapter (Sun part number x6727A)
■ 2 GByte PCI Single Fibre-Channel Host Adapter (Sun part number x6767A)
■ 2 GByte PCI Dual Fibre-Channel Network Adapter (Sun part number x6768A)

Note – Sun StorEdge Traffic Manager is not supported for boot disks containing the root (/) file system. You can use hardware mirroring or VERITAS Volume Manager instead. See “Creating a Hardware Disk Mirror” on page 124 and “About Volume Management Software” on page 118.

Refer to the documentation supplied with the VERITAS Volume Manager and Solaris Volume Manager software. For more information about Sun StorEdge Traffic Manager, see your Solaris system administration documentation.

About RAID Technology

VERITAS Volume Manager and Solstice DiskSuite™ software support RAID technology to optimize performance, availability, and cost per user. RAID technology reduces recovery time in the event of file system errors, and increases data availability even in the event of a disk failure. There are several levels of RAID configurations that provide varying degrees of data availability with corresponding trade-offs in performance and cost.

This section describes some of the most popular and useful of those configurations, including:

■ Disk concatenation
■ Disk striping, integrated stripe (IS), or IS volumes (RAID 0)
■ Disk mirroring, integrated mirror (IM), or IM volumes (RAID 1)
■ Hot-spares

Disk Concatenation

Disk concatenation is a method for increasing logical volume size beyond the capacity of one disk drive by creating one large volume from two or more smaller drives. This lets you create arbitrarily large partitions. Using this method, the concatenated disks are filled with data sequentially, with the second disk being written to when no space remains on the first, the third when no space remains on the second, and so on.
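The sequential fill order amounts to simple arithmetic: subtract each disk's capacity from the logical block number until the block falls inside a disk. The following sketch illustrates the concept only and is not how any volume manager is implemented. The function name concat_locate is hypothetical, and disk sizes are given in blocks.

```shell
# Conceptual sketch: locate a logical block in a concatenated volume.
# Usage: concat_locate BLOCK SIZE0 SIZE1 ...   (sizes in blocks)
concat_locate() {
    block=$1
    shift
    disk=0
    for size in "$@"; do
        if [ "$block" -lt "$size" ]; then
            echo "disk $disk offset $block"
            return 0
        fi
        # The block lies past this disk, so move on to the next one.
        block=$((block - size))
        disk=$((disk + 1))
    done
    echo "block is beyond the end of the volume" >&2
    return 1
}

# Two disks of 100 and 200 blocks: logical block 150 is the 51st block
# of the second disk, so this prints "disk 1 offset 50".
concat_locate 150 100 200
```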

RAID 0: Disk Striping or Integrated Stripe (IS)

Disk striping, Integrated Stripe (IS), or RAID 0 is a technique for increasing system throughput by using several disk drives in parallel. In nonstriped disks, the OS writes a single block to a single disk. In a striped arrangement, each block is divided and portions of the data are written to different disks simultaneously.

System performance using RAID 0 will be better than using RAID 1, but the possibility of data loss is greater because there is no way to retrieve or reconstruct data stored on a failed disk drive.
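One common way to spread data across the member disks is round-robin placement of fixed-size portions (stripe units). The sketch below assumes that layout purely for illustration. This guide does not state the stripe unit size or ordering that the SAS controller uses, and the function name stripe_locate is hypothetical.

```shell
# Conceptual sketch: map a stripe unit number to a member disk, assuming
# round-robin placement across NDISKS disks (an assumption of this
# example, not a documented property of the controller).
# Usage: stripe_locate UNIT NDISKS
stripe_locate() {
    unit=$1
    ndisks=$2
    echo "disk $((unit % ndisks)) row $((unit / ndisks))"
}

# With three disks, units 0, 1, and 2 land on disks 0, 1, and 2 and can
# be transferred in parallel. Unit 7 prints "disk 1 row 2".
stripe_locate 7 3
```

Because every disk holds a portion of the data and no copy exists elsewhere, losing any one member disk affects the whole volume.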

RAID 1: Disk Mirroring or Integrated Mirror (IM)

Disk mirroring, Integrated Mirror (IM), or RAID 1 is a technique that uses data redundancy – two complete copies of all data stored on two separate disks – to protect against loss of data due to disk failure. One logical volume is duplicated on two separate disks.

Whenever the OS needs to write to a mirrored volume, both disks are updated. The disks are maintained at all times with exactly the same information. When the OS needs to read from the mirrored volume, it reads from whichever disk is more readily accessible at the moment, which can result in enhanced performance for read operations.

RAID 1 offers the highest level of data protection, but storage costs are high, and write performance compared to RAID 0 is reduced since all data must be stored twice.
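The storage cost of mirroring can be made concrete with a small calculation. This sketch assumes equal-size member disks and ignores any space a controller reserves for its own metadata. The function name raid_usable_gb and the 73-GB disk size are illustrative assumptions.

```shell
# Conceptual sketch: usable capacity in GB for the two RAID levels
# discussed in this section, assuming equal-size member disks.
# Usage: raid_usable_gb LEVEL NDISKS DISK_GB
raid_usable_gb() {
    level=$1
    ndisks=$2
    disk_gb=$3
    case $level in
        0) echo $((ndisks * disk_gb)) ;;  # striping: all capacity holds data
        1) echo "$disk_gb" ;;             # mirroring: second disk is a copy
        *) echo "unsupported RAID level: $level" >&2; return 1 ;;
    esac
}

raid_usable_gb 0 2 73   # two striped disks: prints 146
raid_usable_gb 1 2 73   # two mirrored disks: prints 73
```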

On the Sun Fire V445 server, you can configure hardware disk mirroring using the SAS controller. This provides higher performance than with conventional software mirroring using volume management software. For more information, see:

■ “Creating a Hardware Disk Mirror” on page 124
■ “Deleting a Hardware Disk Mirror” on page 132
■ “Performing a Mirrored Disk Hot-Plug Operation” on page 134

Hot-Spares

In a hot-spares arrangement, one or more disk drives are installed in the system but are unused during normal operation. This configuration is also referred to as hot relocation. Should one of the active drives fail, the data on the failed disk is automatically reconstructed and generated on a hot-spare disk, enabling the entire data set to maintain its availability.

About Hardware Disk Mirroring

On the Sun Fire V445 server, the SAS controller supports mirroring and striping using the Solaris OS raidctl utility.

A hardware RAID volume created under the raidctl utility behaves slightly differently than one created using volume management software. Under a software volume, each device has its own entry in the virtual device tree, and read/write operations are performed to both virtual devices. Under hardware RAID volumes, only one device appears in the device tree. Member disk devices are invisible to the operating system, and are accessed only by the SAS controller.

Note – The Sun Fire V445 server’s on-board controller can configure as many as two RAID sets. Prior to volume creation, ensure that the member disks are available and that there are not two sets already created.

Caution – Creating RAID volumes using the on-board controller destroys all data on the member disks. The disk controller’s volume initialization procedure reserves a portion of each physical disk for metadata and other internal information used by the controller. Once the volume initialization is complete, you can configure the volume and label it using format(1M). You can then use the volume in the Solaris Operating System.

Caution – If a RAID Volume is created using the on-board controller and a disk drive in the volume set is removed without deleting the RAID Volume, the disk will not be usable in the Solaris Operating System unless special procedures are followed. Contact Sun Services if you have removed a disk from a RAID Volume and cannot reuse the drive.

About Physical Disk Slot Numbers, Physical Device Names, and Logical Device Names

In order to perform a disk hot-plug procedure, you must know the physical or logical device name for the drive that you want to install or remove. If your system encounters a disk error, often you can find messages about failing or failed disks in the system console. This information is also logged in the /var/adm/messages file(s).

These error messages typically refer to a failed hard disk drive by its physical device name (such as /devices/pci@1f,700000/scsi@2/sd@1,0) or by its logical device name (such as c1t1d0). In addition, some applications might report a disk slot number (0 through 7).

You can use TABLE 6-1 to associate internal disk slot numbers with the logical and physical device names for each hard disk drive.

TABLE 6-1 Disk Slot Numbers, Logical Device Names, and Physical Device Names

Disk Slot Number  Logical Device Name*  Physical Device Name
Slot 0            c1t0d0                /pci@1f,700000/pci@0/pci@2/pci@0/pci@8/LSILogic,sas@1/sd@0,0
Slot 1            c1t1d0                /pci@1f,700000/pci@0/pci@2/pci@0/pci@8/LSILogic,sas@1/sd@1,0
Slot 2            c1t2d0                /pci@1f,700000/pci@0/pci@2/pci@0/pci@8/LSILogic,sas@1/sd@2,0
Slot 3            c1t3d0                /pci@1f,700000/pci@0/pci@2/pci@0/pci@8/LSILogic,sas@1/sd@3,0
Slot 4            c1t4d0                /pci@1f,700000/pci@0/pci@2/pci@0/pci@8/LSILogic,sas@1/sd@4,0
Slot 5            c1t5d0                /pci@1f,700000/pci@0/pci@2/pci@0/pci@8/LSILogic,sas@1/sd@5,0
Slot 6            c1t6d0                /pci@1f,700000/pci@0/pci@2/pci@0/pci@8/LSILogic,sas@1/sd@6,0
Slot 7            c1t7d0                /pci@1f,700000/pci@0/pci@2/pci@0/pci@8/LSILogic,sas@1/sd@7,0

* The logical device names might appear differently on your system, depending on the number and type of add-on disk controllers installed.

Creating a Hardware Disk Mirror

Perform this procedure to create an internal hardware disk mirror (IM or RAID 1) configuration on your system.

Verify which disk drive corresponds with which logical device name and physical device name. See:

■ “About Physical Disk Slot Numbers, Physical Device Names, and Logical Device Names” on page 123

▼ To Create a Hardware Disk Mirror

1. To verify that a hardware disk mirror does not already exist, type:

TABLE 6-2

# raidctl
No RAID volumes found.

The example indicates that no RAID volume exists. In another case:

TABLE 6-3

# raidctl
RAID    Volume  RAID    RAID    Disk
Volume  Type    Status  Disk    Status
------------------------------------------------------
c0t4d0  IM      OK      c0t5d0  OK
                        c0t4d0  OK

The example indicates a hardware mirror has degraded at disk c1t2d0.

Note – The logical device names might appear differently on your system, depending on the number and type of add-on disk controllers installed.

2. Type:

TABLE 6-4

# raidctl -c master slave

For example:

TABLE 6-5

# raidctl -c c1t0d0 c1t1d0

When you create a RAID mirror, the slave drive (in this case, c1t1d0) disappears from the Solaris device tree.

3. To check the status of a RAID mirror, type:

TABLE 6-6

# raidctl
RAID    RAID       RAID    Disk
Volume  Status     Disk    Status
--------------------------------------------------------
c1t0d0  RESYNCING  c1t0d0  OK
                   c1t1d0  OK

The example indicates that the RAID mirror is still resynchronizing with the backup drive.

Note – The process of synchronizing a drive may take up to 60 minutes.

The example below shows that the RAID mirror is completely restored and online.

TABLE 6-7

# raidctl
RAID    RAID       RAID    Disk
Volume  Status     Disk    Status
------------------------------------
c1t0d0  OK         c1t0d0  OK
                   c1t1d0  OK

Under RAID 1 (disk mirroring), all data is duplicated on both drives. If a disk fails, replace it with a working drive and restore the mirror. For instructions, see:

■ “Performing a Mirrored Disk Hot-Plug Operation” on page 134

For more information about the raidctl utility, see the raidctl(1M) man page.
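Because resynchronization can take a long time, it can be convenient to extract the volume status from raidctl output in a script rather than reading the table by eye. The following is a minimal sketch under stated assumptions: the function name raid_volume_status is hypothetical, and it handles only the four-column layout shown in the resynchronization examples (the five-column layout with a Volume Type column is not parsed).

```shell
# Hypothetical helper: print "volume status" for each RAID volume row of
# raidctl output supplied on stdin. Only the four-column layout
# (Volume, Status, Disk, Disk Status) is recognized.
raid_volume_status() {
    # Volume rows have four fields and start with a cNtNdN device name.
    # Header rows, the rule line, and the two-field rows for additional
    # member disks do not match and are skipped.
    awk 'NF == 4 && $1 ~ /^c[0-9]+t[0-9]+d[0-9]+$/ { print $1, $2 }'
}

# Sample text from the resynchronization example: prints "c1t0d0 RESYNCING".
raid_volume_status <<'EOF'
RAID RAID RAID Disk
Volume Status Disk Status
--------------------------------------------------------
c1t0d0 RESYNCING c1t0d0 OK
c1t1d0 OK
EOF
```

On the server itself, the same function could read live output through a pipe, for example raidctl | raid_volume_status, run as superuser.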

Creating a Hardware Mirrored Volume of the Default Boot Device

Due to the volume initialization that occurs on the disk controller when a new volume is created, the volume must be configured and labeled using the format(1M) utility prior to use with the Solaris Operating System (see “Configuring and Labeling a Hardware RAID Volume for Use in the Solaris Operating System” on page 129). Because of this limitation, raidctl(1M) blocks the creation of a hardware RAID volume if any of the member disks currently have a file system mounted.

This section describes the procedure required to create a hardware RAID volume containing the default boot device. Since the boot device always has a mounted file system when booted, an alternate boot medium must be employed, and the volume created in that environment. One alternate medium is a network installation image in single-user mode (refer to the Solaris 10 Installation Guide for information about configuring and using network-based installations).

▼ To Create a Hardware Mirrored Volume of the Default Boot Device

1. Determine which disk is the default boot device.

From the OpenBoot ok prompt, type the printenv command, and if necessary the devalias command, to identify the default boot device. For example:

TABLE 6-8

ok printenv boot-device
boot-device = disk

ok devalias disk
disk /pci@780/pci@0/pci@9/scsi@0/disk@0,0

2. Type the boot net -s command.

TABLE 6-9

ok boot net -s

3. Once the system has booted, use the raidctl(1M) utility to create a hardware mirrored volume, using the default boot device as the primary disk.

See “Configuring and Labeling a Hardware RAID Volume for Use in the Solaris Operating System” on page 129. For example:

TABLE 6-10

# raidctl -c c0t0d0 c0t1d0
Creating RAID volume c0t0d0 will destroy all data on member disks,
proceed(yes/no)? yes
Volume c0t0d0 created
#

4. Install the volume with the Solaris Operating System using any supported method.

The hardware RAID volume c0t0d0 appears as a disk to the Solaris installation program.

Note – The logical device names might appear differently on your system, depending on the number and type of add-on disk controllers installed.

Creating a Hardware Striped Volume

Use this procedure to create a hardware striped (IS or RAID 0) volume.

1. Verify which hard drive corresponds with which logical device name and physical device name.

See “About Physical Disk Slot Numbers, Physical Device Names, and Logical Device Names” on page 123.

To verify the current RAID configuration, type:

TABLE 6-11

# raidctl
No RAID volumes found.

The preceding example indicates that no RAID volume exists.

Note – The logical device names might appear differently on your system, depending on the number and type of add-on disk controllers installed.

2. Type:

TABLE 6-12

# raidctl -c -r 0 disk1 disk2 ...

The creation of the RAID volume is interactive, by default. For example:

TABLE 6-13

# raidctl -c -r 0 c0t1d0 c0t2d0 c0t3d0
Creating RAID volume c0t1d0 will destroy all data on member disks, proceed
(yes/no)? yes
Volume 'c0t1d0' created
#

When you create a RAID striped volume, the other member drives (in this case, c0t2d0 and c0t3d0) disappear from the Solaris device tree.

As an alternative, you can use the -f option to force the creation if you are sure of the member disks, and sure that the data on all other member disks can be lost. For example:

TABLE 6-14

# raidctl -f -c -r 0 c0t1d0 c0t2d0 c0t3d0
Volume 'c0t1d0' created
#

3. To check the status of a RAID striped volume, type:

TABLE 6-15

# raidctl
RAID     Volume   RAID     RAID     Disk
Volume   Type     Status   Disk     Status
--------------------------------------------------------
c0t1d0   IS       OK       c0t1d0   OK
                           c0t2d0   OK
                           c0t3d0   OK

The example shows that the RAID striped volume is online and functioning.

Under RAID 0 (disk striping), there is no replication of data across drives. The data is written to the RAID volume across all member disks in a round-robin fashion. If any one disk is lost, all data on the volume is lost. For this reason, RAID 0 cannot be used to ensure data integrity or availability, but can be used to increase write performance in some scenarios.

For more information about the raidctl utility, see the raidctl(1M) man page.

Configuring and Labeling a Hardware RAID Volume for Use in the Solaris Operating System

After creating a RAID volume using raidctl, use format(1M) to configure and label the volume before attempting to use it in the Solaris Operating System.

1. Start the format utility.

TABLE 6-16

# format

The format utility might generate messages about corruption of the current label on the volume, which you are going to change. You can safely ignore these messages.

2. Select the disk name that represents the RAID volume that you have configured.

In this example, c0t2d0 is the logical name of the volume.

TABLE 6-17

# format
Searching for disks...done
AVAILABLE DISK SELECTIONS:
       0. c0t0d0 <SUN72G cyl 14084 alt 2 hd 24 sec 424>
          /pci@780/pci@0/pci@9/scsi@0/sd@0,0
       1. c0t1d0 <SUN72G cyl 14084 alt 2 hd 24 sec 424>
          /pci@780/pci@0/pci@9/scsi@0/sd@1,0
       2. c0t2d0 <SUN72G cyl 14084 alt 2 hd 24 sec 424>
          /pci@780/pci@0/pci@9/scsi@0/sd@2,0
Specify disk (enter its number): 2
selecting c0t2d0
[disk formatted]
FORMAT MENU:
        disk       - select a disk
        type       - select (define) a disk type
        partition  - select (define) a partition table
        current    - describe the current disk
        format     - format and analyze the disk
        fdisk      - run the fdisk program
        repair     - repair a defective sector
        label      - write label to the disk
        analyze    - surface analysis
        defect     - defect list management
        backup     - search for backup labels
        verify     - read and display labels
        save       - save new disk/partition definitions
        inquiry    - show vendor, product and revision
        volname    - set 8-character volume name
        !<cmd>     - execute <cmd>, then return
        quit

Caution – If a RAID Volume is created using the on-board controller and a disk drive in the volume set is removed without deleting the RAID Volume, the disk will not be usable in the Solaris Operating System unless special procedures are followed. Contact Sun Services if you have removed a disk from a RAID Volume and cannot reuse the drive.

3. Type the type command at the format> prompt, then select 0 (zero) to auto configure the volume.

For example:

TABLE 6-18

format> type
AVAILABLE DRIVE TYPES:
        0. Auto configure
        1. DEFAULT
        2. SUN72G
        3. SUN72G
        4. other
Specify disk type (enter its number)[3]: 0
c0t2d0: configured with capacity of 68.23GB
<LSILOGIC-LogicalVolume-3000 cyl 69866 alt 2 hd 16 sec 128>
selecting c0t2d0
[disk formatted]

4. Use the partition command to partition, or slice, the volume according to your desired configuration.

See the format(1M) man page for additional details.

5. Write the new label to the disk using the label command.

TABLE 6-19

format> label
Ready to label disk, continue? yes

6. Verify that the new label has been written by printing the disk list using the disk command.

TABLE 6-20

format> disk
AVAILABLE DISK SELECTIONS:
       0. c0t0d0 <SUN72G cyl 14084 alt 2 hd 24 sec 424>
          /pci@780/pci@0/pci@9/scsi@0/sd@0,0
       1. c0t1d0 <SUN72G cyl 14084 alt 2 hd 24 sec 424>
          /pci@780/pci@0/pci@9/scsi@0/sd@1,0
       2. c0t2d0 <LSILOGIC-LogicalVolume-3000 cyl 69866 alt 2 hd 16 sec 128>
          /pci@780/pci@0/pci@9/scsi@0/sd@2,0
Specify disk (enter its number)[2]:

Note that c0t2d0 now has a type indicating it is an LSILOGIC-LogicalVolume.

7. Exit the format utility.

The volume can now be used in the Solaris Operating System.

Note – The logical device names might appear differently on your system, depending on the number and type of add-on disk controllers installed.
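The check in Step 6 can also be done by filtering the disk listing rather than reading it by eye. The following is a minimal sketch under stated assumptions: the function name list_raid_volumes is hypothetical, and the listing is assumed to have the layout shown in this procedure, with the entry number, the logical device name, and the type string in angle brackets on one line. It is written for a POSIX shell; on Solaris 10, nawk might be needed in place of awk.

```shell
# list_raid_volumes
# Reads a format(1M) "AVAILABLE DISK SELECTIONS" listing on standard input
# and prints the logical device name of each entry whose type string begins
# with LSILOGIC-LogicalVolume. Hypothetical helper for illustration.
list_raid_volumes() {
    awk '
        # Entry lines look like: "2. c0t2d0 <LSILOGIC-LogicalVolume-3000 ..."
        $1 ~ /^[0-9]+\.$/ && $3 ~ /^<LSILOGIC-LogicalVolume/ { print $2 }
    '
}

# Usage with a listing saved to a file (the file name is an example):
#   list_raid_volumes < disk-listing.txt
```

A disk that still reports its physical type, such as SUN72G, is not printed, which suggests that the label for the RAID volume has not yet been written.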

Deleting a Hardware Disk Mirror

Perform this procedure to remove a hardware disk mirror configuration from your system.

Verify which disk drive corresponds with which logical device name and physical device name. See:

■ “About Physical Disk Slot Numbers, Physical Device Names, and Logical Device Names” on page 123

▼ To Delete a Hardware Disk Mirror

1. Determine the name of the mirrored volume. Type:

TABLE 6-21

# raidctl
RAID     RAID     RAID     Disk
Volume   Status   Disk     Status
------------------------------------
c1t0d0   OK       c1t0d0   OK
                  c1t1d0   OK

In this example, the mirrored volume is c1t0d0.

Note – The logical device names might appear differently on your system, depending on the number and type of add-on disk controllers installed.

2. To delete the volume, type:

TABLE 6-22

# raidctl -d mirrored-volume

For example:

TABLE 6-23

# raidctl -d c1t0d0
RAID Volume 'c1t0d0' deleted

3. To confirm that you have deleted the RAID array, type:

TABLE 6-24

# raidctl

For example:

TABLE 6-25

# raidctl
No RAID volumes found

For more information, see the raidctl(1M) man page.

Performing a Mirrored Disk Hot-Plug Operation

Verify which disk drive corresponds with which logical device name and physical device name. See:

■ “About Physical Disk Slot Numbers, Physical Device Names, and Logical Device Names” on page 123

You need to refer to the following document to perform this procedure:

■ Sun Fire V445 Server Service Manual

▼ To Perform a Mirrored Disk Hot-Plug Operation

Caution – Ensure that the disk drive OK-to-Remove indicator is lit, indicating that the disk drive is offline. If the disk drive is still online, you risk removing the disk during a read/write operation, which could result in data loss.

1. To confirm a failed disk, type:

TABLE 6-26

# raidctl

For example:

TABLE 6-27

# raidctl
RAID     RAID       RAID     Disk
Volume   Status     Disk     Status
----------------------------------------
c1t1d0   DEGRADED   c1t1d0   OK
                    c1t2d0   DEGRADED

This example indicates that the disk mirror has degraded due to a failure in disk c1t2d0.

Note – The logical device names might appear differently on your system, depending on the number and type of add-on disk controllers installed.

2. Remove the disk drive, as described in the Sun Fire V445 Server Service Manual.

There is no need to issue a software command to bring the drive offline when the drive has failed and the OK-to-Remove indicator is lit.

3. Install a new disk drive, as described in the Sun Fire V445 Server Service Manual.

The RAID utility automatically restores the data to the disk.

4. To check the status of a RAID rebuild, type:

TABLE 6-28

# raidctl

For example:

TABLE 6-29

# raidctl
RAID     RAID        RAID     Disk
Volume   Status      Disk     Status
----------------------------------------
c1t1d0   RESYNCING   c1t1d0   OK
                     c1t2d0   OK

This example indicates that RAID volume c1t1d0 is resynchronizing.

If you issue the command again some minutes later, it indicates that the RAID mirror is finished resynchronizing and is back online:

TABLE 6-30

# raidctl
RAID     RAID     RAID     Disk
Volume   Status   Disk     Status
----------------------------------------
c1t1d0   OK       c1t1d0   OK
                  c1t2d0   OK

For more information, see the raidctl(1M) man page.
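The volume status that this procedure reads from the raidctl listing (DEGRADED, RESYNCING, or OK) can be extracted with a short filter, for example to repeat the check until the resynchronization finishes. The following is a sketch, not a documented interface: the function name raid_volume_status is hypothetical, and it assumes the column layouts shown in this chapter, in which a volume line ends with the volume status, the first member disk, and that disk's status. It is written for a POSIX shell and an awk that accepts the -v option (on Solaris 10, nawk).

```shell
# raid_volume_status VOLUME
# Reads raidctl status output on standard input and prints the status of
# VOLUME. On a volume line the last three fields are assumed to be:
# volume status, first member disk, disk status.
# Hypothetical helper for illustration; not part of raidctl.
raid_volume_status() {
    awk -v vol="$1" '
        # Continuation lines list only a member disk and its status,
        # so they have fewer than four fields and are skipped.
        $1 == vol && NF >= 4 { print $(NF - 2) }
    '
}

# Usage on a live system (not run here):
#   raidctl | raid_volume_status c1t1d0
```

Taking the status from the third field from the end lets the same filter read both the four-column mirror listing and the five-column listing that includes a volume type.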

Performing a Nonmirrored Disk Hot-Plug Operation

Verify which disk drive corresponds with which logical device name and physical device name. See:

■ “About Physical Disk Slot Numbers, Physical Device Names, and Logical Device Names” on page 123

Ensure that no applications or processes are accessing the disk drive.

You need to refer to the following document to perform this procedure:

■ Sun Fire V445 Server Service Manual

▼ To View the Status of the SCSI Devices

1. Type:

TABLE 6-31

# cfgadm -al

For example:

TABLE 6-32

# cfgadm -al
Ap_Id                Type         Receptacle   Occupant       Condition
c0                   scsi-bus     connected    configured     unknown
c0::dsk/c0t0d0       CD-ROM       connected    configured     unknown
c1                   scsi-bus     connected    configured     unknown
c1::dsk/c1t0d0       disk         connected    configured     unknown
c1::dsk/c1t1d0       disk         connected    configured     unknown
c1::dsk/c1t2d0       disk         connected    configured     unknown
c1::dsk/c1t3d0       disk         connected    configured     unknown
c2                   scsi-bus     connected    configured     unknown
c2::dsk/c2t2d0       disk         connected    configured     unknown
usb0/1               unknown      empty        unconfigured   ok
usb0/2               unknown      empty        unconfigured   ok
usb1/1               unknown      empty        unconfigured   ok
usb1/2               unknown      empty        unconfigured   ok
#

Note – The logical device names might appear differently on your system, depending on the number and type of add-on disk controllers installed.

The -al options return the status of all SCSI devices, including buses and USB devices. (In this example, no USB devices are connected to the system.)

Note that while you can use the Solaris OS cfgadm install_device and cfgadm remove_device commands to perform a disk drive hot-plug procedure, these commands issue the following warning message when you invoke these commands on a bus containing the system disk:

TABLE 6-33

# cfgadm -x remove_device c0::dsk/c1t1d0
Removing SCSI device: /devices/pci@1f,4000/scsi@3/sd@1,0
This operation will suspend activity on SCSI bus: c0
Continue (yes/no)? y
dev = /devices/pci@1f,4000/scsi@3/sd@1,0
cfgadm: Hardware specific failure: failed to suspend:
     Resource              Information
------------------  -------------------------
/dev/dsk/c1t0d0s0   mounted filesystem "/"
/dev/dsk/c1t0d0s6   mounted filesystem "/usr"

This warning is issued because these commands attempt to quiesce the SAS bus, but the Sun Fire V445 server firmware prevents it. This warning message can be safely ignored in the Sun Fire V445 server, but the following procedure avoids this warning message altogether.

▼ To Perform a Nonmirrored Disk Hot-Plug Operation

1. To remove the disk drive from the device tree, type:

TABLE 6-34

# cfgadm -c unconfigure Ap-Id

For example:

TABLE 6-35

# cfgadm -c unconfigure c1::dsk/c1t3d0

This example removes c1t3d0 from the device tree. The blue OK-to-Remove indicator lights.

2. To verify that the device has been removed from the device tree, type:

TABLE 6-36

# cfgadm -al
Ap_Id                Type         Receptacle   Occupant       Condition
c0                   scsi-bus     connected    configured     unknown
c0::dsk/c0t0d0       CD-ROM       connected    configured     unknown
c1                   scsi-bus     connected    configured     unknown
c1::dsk/c1t0d0       disk         connected    configured     unknown
c1::dsk/c1t1d0       disk         connected    configured     unknown
c1::dsk/c1t2d0       disk         connected    configured     unknown
c1::dsk/c1t3d0       unavailable  connected    unconfigured   unknown
c2                   scsi-bus     connected    configured     unknown
c2::dsk/c2t2d0       disk         connected    configured     unknown
usb0/1               unknown      empty        unconfigured   ok
usb0/2               unknown      empty        unconfigured   ok
usb1/1               unknown      empty        unconfigured   ok
usb1/2               unknown      empty        unconfigured   ok
#

c1t3d0 is now unavailable and unconfigured. The corresponding disk drive OK-to-Remove indicator is lit.

3. Remove the disk drive, as described in the Sun Fire V445 Server Parts Installation and Removal Guide.

The blue OK-to-Remove indicator goes out when you remove the disk drive.

4. Install a new disk drive, as described in the Sun Fire V445 Server Parts Installation and Removal Guide.

5. To configure the new disk drive, type:

TABLE 6-37

# cfgadm -c configure Ap-Id

For example:

TABLE 6-38

# cfgadm -c configure c1::dsk/c1t3d0

The green Activity indicator flashes as the new disk at c1t3d0 is added to the device tree.

6. To verify that the new disk drive is in the device tree, type:

TABLE 6-39

# cfgadm -al
Ap_Id                Type         Receptacle   Occupant       Condition
c0                   scsi-bus     connected    configured     unknown
c0::dsk/c0t0d0       CD-ROM       connected    configured     unknown
c1                   scsi-bus     connected    configured     unknown
c1::dsk/c1t0d0       disk         connected    configured     unknown
c1::dsk/c1t1d0       disk         connected    configured     unknown
c1::dsk/c1t2d0       disk         connected    configured     unknown
c1::dsk/c1t3d0       disk         connected    configured     unknown
c2                   scsi-bus     connected    configured     unknown
c2::dsk/c2t2d0       disk         connected    configured     unknown
usb0/1               unknown      empty        unconfigured   ok
usb0/2               unknown      empty        unconfigured   ok
usb1/1               unknown      empty        unconfigured   ok
usb1/2               unknown      empty        unconfigured   ok
#

Note that c1t3d0 is now listed as configured.
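The two verification steps in this procedure both read the Occupant column for one attachment point in the cfgadm -al listing. The following sketch shows one way to pull out that value. The function name ap_occupant is hypothetical, and the sketch assumes the five-column layout shown in the examples (Ap_Id, Type, Receptacle, Occupant, Condition). It is written for a POSIX shell and an awk that accepts the -v option (on Solaris 10, nawk).

```shell
# ap_occupant AP_ID
# Reads a "cfgadm -al" listing on standard input and prints the Occupant
# state (fourth column) of the attachment point AP_ID.
# Hypothetical helper for illustration; not part of cfgadm.
ap_occupant() {
    awk -v id="$1" '$1 == id { print $4 }'
}

# Usage on a live system (not run here):
#   cfgadm -al | ap_occupant c1::dsk/c1t3d0
```

A result of unconfigured corresponds to the state shown after Step 1, and configured corresponds to the state shown after Step 5.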


CHAPTER 7

Managing Network Interfaces

This chapter describes how to manage network interfaces.

This chapter contains the following sections:

■ “About the Network Interfaces” on page 141
■ “About Redundant Network Interfaces” on page 142
■ “Attaching a Twisted-Pair Ethernet Cable” on page 143
■ “Configuring the Primary Network Interface” on page 144
■ “Configuring Additional Network Interfaces” on page 145

About the Network Interfaces

The Sun Fire V445 server provides four on-board Sun Gigabit Ethernet interfaces, which reside on the system motherboard and conform to the IEEE 802.3z Ethernet standard. For an illustration of the Ethernet ports, see FIGURE 1-7. The Ethernet interfaces operate at 10 Mbps, 100 Mbps, and 1000 Mbps.

Four back panel ports with RJ-45 connectors provide access to the on-board Ethernet interfaces. Each interface is configured with a unique Media Access Control (MAC) address. Each connector features two LED indicators, as described in TABLE 1-5. Additional Ethernet interfaces or connections to other network types are available by installing the appropriate PCI interface cards.

The system’s on-board interfaces can be configured for redundancy, or an additional network interface card can serve as a redundant network interface for one of the system’s on-board interfaces. If the active network interface becomes unavailable, the system can automatically switch to the redundant interface to maintain availability. This capability is known as automatic failover and must be configured at the Solaris OS level. In addition, this configuration provides outbound data load balancing for increased performance. For additional details, see “About Redundant Network Interfaces” on page 142.

The Ethernet driver is installed automatically during the Solaris installation procedure.

For instructions on configuring the system network interfaces, see:

■ “Configuring the Primary Network Interface” on page 144
■ “Configuring Additional Network Interfaces” on page 145

About Redundant Network Interfaces

Two Sun Gigabit Ethernet interfaces (bge0 and bge1) are on one controller and two (bge2 and bge3) are on another controller. These interfaces are connected to the Broadcom 5714 chips, which are Dual Ethernet controller and PCI-X bridge components.

You can configure your system with redundant network interfaces to provide a highly available network connection. Such a configuration relies on special Solaris software features to detect a failed or failing network interface and automatically switch all network traffic over to the redundant interface. This capability is known as automatic failover.

To set up redundant network interfaces, you can enable automatic failover between the two similar interfaces using the IP Network Multipathing feature of the Solaris OS. For additional details, see “About Multipathing Software” on page 115. You can also install a pair of identical PCI network interface cards, or add a single card that provides an interface identical to one of the two on-board Ethernet interfaces.

To ensure maximum redundancy, each on-board Ethernet interface resides on a different PCI bus. To help further maximize system availability, ensure that any additional network interfaces added for redundancy also reside on separate PCI buses, which are supported by separate PCI bridges. For additional details, see “About the PCI Cards and Buses” on page 81.

Attaching a Twisted-Pair Ethernet Cable

You must complete this task in the following section.

▼ To Attach a Twisted-Pair Ethernet Cable

1. Install the server into the rack.

Refer to the Sun Fire V445 Server Installation Guide.

2. Locate the RJ-45 twisted-pair Ethernet (TPE) connector for the appropriate Ethernet interface: the left top (net0), left bottom (net1), right top (net2), or right bottom (net3).

See “Locating Back Panel Features” on page 16. For a PCI Ethernet adapter card, see the documentation supplied with the card.

3. Connect a Category-5 unshielded twisted-pair (UTP) cable to the appropriate RJ-45 connector on the system back panel.

You should hear the connector tab click into place. The UTP cable length must not exceed 100 meters (328 feet).

4. Connect the other end of the cable to the RJ-45 outlet of the appropriate network device.

You should hear the connector tab click into place.

Consult your network documentation if you need more information about how to connect to your network.

If you are installing your system, complete the installation procedure, as described in the Sun Fire V445 Server Installation Guide.

If you are adding an additional network interface to the system, you need to configure that interface. See:

■ “Configuring Additional Network Interfaces” on page 145

Configuring the Primary Network Interface

For background information, see:

■ Sun Fire V445 Server Installation Guide
■ “About the Network Interfaces” on page 141

If you are using a PCI network interface card, see the documentation supplied with the card.

▼ To Configure the Primary Network Interface

1. Choose a network port, using the following table as a guide.

Ethernet Port   PCI OpenBoot PROM Device Alias   Device Path
0               net0                             /pci@1e,600000/pci@0/pci@1/pci@0/network@4
1               net1                             /pci@1e,600000/pci@0/pci@1/pci@0/network@4,1
2               net2                             /pci@1f,700000/pci@0/pci@2/pci@0/network@4
3               net3                             /pci@1f,700000/pci@0/pci@2/pci@0/network@4,1

2. Attach an Ethernet cable to the port you chose.

See “Attaching a Twisted-Pair Ethernet Cable” on page 143.

3. Choose a network host name for the system and make a note of it.

You need to furnish the name in a later step.

The host name must be unique within the network. It can consist only of alphanumeric characters and the dash (-). Do not use a dot in the host name. Do not begin the name with a number or a special character. The name must not be longer than 30 characters.

4. Determine the unique Internet Protocol (IP) address of the network interface and make a note of it.

You need to furnish the address in a later step.

An IP address must be assigned by the network administrator. Each network device or interface must have a unique IP address.
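The host name rules in this procedure (alphanumeric characters and the dash only, no dot, no leading number or special character, at most 30 characters) can be checked mechanically before the name is entered during installation. The following is a minimal sketch of such a check. The function name valid_hostname is hypothetical, and the check covers only the syntax rules stated here; it cannot confirm that the name is unique within the network. It is written for a POSIX shell (on Solaris 10, /usr/xpg4/bin/sh or ksh rather than the legacy /bin/sh).

```shell
# valid_hostname NAME
# Returns 0 if NAME follows the host name rules stated in this guide,
# and 1 otherwise. Hypothetical helper for illustration.
valid_hostname() {
    name=$1
    # Reject empty names and names longer than 30 characters.
    [ -n "$name" ] || return 1
    [ "${#name}" -le 30 ] || return 1
    case $name in
        [!A-Za-z]*)      return 1 ;;  # must not begin with a number or special character
        *[!A-Za-z0-9-]*) return 1 ;;  # letters, digits, and the dash only (no dot)
    esac
    return 0
}

# Usage:
#   valid_hostname sunrise-2 && echo "name is acceptable"
```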

During installation of the Solaris OS, the software automatically detects the system’s on-board network interfaces and any installed PCI network interface cards for which native Solaris device drivers exist. The OS then asks you to select one of the interfaces as the primary network interface and prompts you for its host name and IP address. You can configure only one network interface during installation of the OS. You must configure any additional interfaces separately, after the OS is installed. For more information, see “Configuring Additional Network Interfaces” on page 145.

Note – The Sun Fire V445 server conforms to the Ethernet 10/100BASE-T standard, which states that the Ethernet 10BASE-T link integrity test function should always be enabled on both the host system and the Ethernet hub. If you have problems establishing a connection between this system and your hub, verify that the Ethernet hub also has the link test function enabled. Consult the manual provided with your hub for more information about the link integrity test function.

After completing this procedure, the primary network interface is ready for operation. However, in order for other network devices to communicate with the system, you must enter the system’s IP address and host name into the namespace on the network name server. For information about setting up a network name service, consult:

■ Solaris Naming Configuration Guide for your specific Solaris release

The device driver for the system’s on-board Sun Gigabit Ethernet interfaces is automatically installed with the Solaris release. For information about operating characteristics and configuration parameters for this driver, refer to the following document:

■ Platform Notes: The Sun GigaSwift Ethernet Device Driver

This document is available on the Solaris on Sun Hardware AnswerBook, which is provided on the Solaris CD or DVD for your specific Solaris release.

If you want to set up an additional network interface, you must configure it separately, after installing the OS. See:

■ “Configuring Additional Network Interfaces” on page 145

Configuring Additional Network Interfaces

Perform the following tasks to prepare an additional network interface:

■ Install the Sun Fire V445 server as described in the Sun Fire V445 Server Installation Guide.
■ If you are setting up a redundant network interface, see “About Redundant Network Interfaces” on page 142.
■ If you need to install a PCI network interface card, follow the installation instructions in the Sun Fire V445 Server Parts Installation and Removal Guide.
■ Attach an Ethernet cable to the appropriate port on the system back panel. See “Attaching a Twisted-Pair Ethernet Cable” on page 143. If you are using a PCI network interface card, see the documentation supplied with the card.

Note – All internal options, except hard disk drives, must be installed by qualified service personnel only. Installation procedures for these components are covered in the Sun Fire V445 Server Parts Installation and Removal Guide.

▼ To Configure Additional Network Interfaces

1. Choose a network host name for each new interface.

You need to furnish the name in a later step.

The host name must be unique within the network. It can consist only of alphanumeric characters and the dash (-). Do not use a dot in the host name. Do not begin the name with a number or a special character. The name must not be longer than 30 characters.

Usually an interface host name is based on the system host name. For more information, see the installation instructions accompanying the Solaris software.

2. Determine the Internet Protocol (IP) address for each new interface.

You need to furnish the IP address in a later step.

An IP address must be assigned by your network administrator. Each interface on a network must have a unique IP address.

3. Boot the OS, if it is not already running.

Be sure to perform a reconfiguration boot if you just added a new PCI network interface card. See “Initiating a Reconfiguration Boot” on page 66.

4. Log in to the system as superuser.

5. Create an appropriate /etc/hostname file for each new network interface.

The name of the file you create should be of the form /etc/hostname.typenum, where type is the network interface type identifier (some common types are ce, le, hme, eri, and ge) and num is the device instance number of the interface according to the order in which it was installed in the system.

For example, the file names for the system’s first two on-board Gigabit Ethernet interfaces are /etc/hostname.bge0 and /etc/hostname.bge1. If you add a PCI Fast Ethernet adapter card as a third interface, its file name should be /etc/hostname.eri0. At least one of these files, the primary network interface, should exist already, having been created automatically during the Solaris installation process.

Note – The documentation accompanying the network interface card should identify its type. Alternatively, you can enter the show-devs command from the ok prompt to obtain a list of all installed devices.

6. Edit the /etc/hostname file(s) created in Step 5 to add the host name(s) determined in Step 1.

Following is an example of the /etc/hostname files required for a system called sunrise, which has two on-board Sun Gigabit Ethernet interfaces (bge0 and bge1) and an Intel Ophir Gigabit Ethernet adapter (e1000g0). A network connected to the on-board bge0 and bge1 interfaces will know the system as sunrise and sunrise-1, while networks connected to the PCI-based e1000g0 interface will know the system as sunrise-2.

sunrise # cat /etc/hostname.bge0
sunrise
sunrise # cat /etc/hostname.bge1
sunrise-1
sunrise # cat /etc/hostname.e1000g0
sunrise-2

7. Create an entry in the /etc/hosts file for each active network interface.

An entry consists of the IP address and the host name for each interface.

The following example shows an /etc/hosts file with entries for the three network interfaces used as examples in this procedure.

sunrise # cat /etc/hosts
#
# Internet host table
#
127.0.0.1       localhost
129.144.10.57   sunrise loghost
129.144.14.26   sunrise-1
129.144.11.83   sunrise-2

8. Manually configure and enable each new interface using the ifconfig command.

For example, for the interface e1000g0, type:

# ifconfig e1000g0 plumb inet ip-address netmask ip-netmask .... up

For more information, see the ifconfig(1M) man page.
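Steps 6 and 7 must agree with each other: every host name placed in an /etc/hostname file needs a matching entry in /etc/hosts. The following sketch cross-checks the two. The function name check_interface_names is hypothetical, and the directory argument is an assumption made so that the check can be tried on copies of the files rather than on /etc itself. It takes the first word on the first line of each hostname file as the host name, and it is written for a POSIX shell and an awk that accepts the -v option (on Solaris 10, /usr/xpg4/bin/sh or ksh, and nawk).

```shell
# check_interface_names ETC_DIR
# For every hostname.<interface> file in ETC_DIR, verifies that the host
# name it contains also appears in ETC_DIR/hosts. Prints one line per
# interface and returns nonzero if any name is missing.
# Hypothetical helper for illustration.
check_interface_names() {
    etc_dir=$1
    status=0
    for file in "$etc_dir"/hostname.*; do
        [ -f "$file" ] || continue
        iface=${file##*/hostname.}
        # The host name is taken to be the first word on the first line.
        name=$(awk 'NR == 1 { print $1 }' "$file")
        [ -n "$name" ] || continue
        if awk -v host="$name" '
                /^#/ { next }   # skip comment lines
                { for (i = 2; i <= NF; i++) if ($i == host) found = 1 }
                END { exit (found ? 0 : 1) }
            ' "$etc_dir/hosts"
        then
            echo "$iface: $name found in hosts"
        else
            echo "$iface: $name missing from hosts"
            status=1
        fi
    done
    return "$status"
}

# Usage against the live files (not run here):
#   check_interface_names /etc
```

With the example files from this procedure, each of the three interfaces would be reported as found.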

Note – The Sun Fire V445 server conforms to the Ethernet 10/100BASE-T standard, which states that the Ethernet 10BASE-T link integrity test function should always be enabled on both the host system and the Ethernet hub. If you have problems establishing a connection between this system and your Ethernet hub, verify that the hub also has the link test function enabled. Consult the manual provided with your hub for more information about the link integrity test function.

After completing this procedure, any new network interfaces are ready for operation. However, in order for other network devices to communicate with the system through the new interface, the IP address and host name for each new interface must be entered into the namespace on the network name server. For information about setting up a network name service, consult:

■ Solaris Naming Configuration Guide for your specific Solaris release

The ce device driver for each of the system’s on-board Sun Gigabit Ethernet interfaces is automatically configured during Solaris installation. For information about operating characteristics and configuration parameters for these drivers, refer to the following document:

■ Platform Notes: The Sun GigaSwift Ethernet Device Driver

This document is available on the Solaris on Sun Hardware AnswerBook, which is provided on the Solaris CD or DVD for your specific Solaris release.

CHAPTER 8

Diagnostics

This chapter describes the diagnostic tools available for the Sun Fire V445 server.

Topics in this chapter include:

■ “Diagnostic Tools Overview” on page 152
■ “About Sun Advanced Lights-Out Manager 1.0 (ALOM)” on page 154
■ “About Status Indicators” on page 157
■ “About POST Diagnostics” on page 157
■ “OpenBoot PROM Enhancements for Diagnostic Operation” on page 158
■ “OpenBoot Diagnostics” on page 177
■ “About OpenBoot Commands” on page 182
■ “About Predictive Self-Healing” on page 186
■ “About Traditional Solaris OS Diagnostic Tools” on page 191
■ “Viewing Recent Diagnostic Test Results” on page 204
■ “Setting OpenBoot Configuration Variables” on page 204
■ “Additional Diagnostic Tests for Specific Devices” on page 206
■ “About Automatic Server Restart” on page 208
■ “About Automatic System Restoration” on page 209
■ “About SunVTS” on page 215
■ “About Sun Management Center” on page 218
■ “Hardware Diagnostic Suite” on page 221

Diagnostic Tools Overview

Sun provides a range of diagnostic tools for use with the Sun Fire V445 server.

The diagnostic tools are summarized in TABLE 8-1.

TABLE 8-1 Summary of Diagnostic Tools

| Diagnostic Tool | Type | What It Does | Accessibility and Availability | Remote Capability |
|---|---|---|---|---|
| ALOM system controller | Hardware and Software | Monitors environmental conditions, performs basic fault isolation, and provides remote console access | Can function on standby power and without OS | Designed for remote access |
| LED indicators | Hardware | Indicates status of overall system and particular components | Accessed from system chassis. Available anytime power is available | Local, but can be viewed with the ALOM system console |
| POST | Firmware | Tests core components of system | Runs automatically on startup. Available when the OS is not running | Local, but can be viewed with ALOM system controller |
| OpenBoot Diagnostics | Firmware | Tests system components, focusing on peripherals and I/O devices | Runs automatically or interactively. Available when the OS is not running | Local, but can be viewed with ALOM system controller |
| OpenBoot commands | Firmware | Displays various kinds of system information | Available when the OS is not running | Local, but can be accessed with ALOM system controller |
| Solaris 10 Predictive Self-Healing | Software | Monitors system errors and reports and disables faulty hardware | Runs in the background when the OS is running | Local, but can be accessed with ALOM system controller |
| Traditional Solaris OS commands | Software | Displays various kinds of system information | Requires OS | Local, but can be accessed with ALOM system controller |
| SunVTS | Software | Exercises and stresses the system, running tests in parallel | Requires OS. Optional package that needs to be installed separately | View and control over network |
| Sun Management Center | Software | Monitors both hardware environmental conditions and software performance of multiple machines. Generates alerts for various conditions | Requires OS to be running on both monitored and master servers. Requires a dedicated database on the master server | Designed for remote access |
| Hardware Diagnostic Suite | Software | Exercises an operational system by running sequential tests. Also reports failed FRUs | Separately purchased optional add-on to Sun Management Center. Requires OS and Sun Management Center | Designed for remote access |


About Sun Advanced Lights-Out Manager 1.0 (ALOM)

The Sun Fire V445 server ships with Sun Advanced Lights Out Manager (ALOM) 1.0 installed. The system console is directed to ALOM by default and is configured to show server console information on startup.

ALOM enables you to monitor and control your server over either a serial connection (using the SERIAL MGT port) or an Ethernet connection (using the NET MGT port). For information on configuring an Ethernet connection, refer to the ALOM Online Help.

Note – The ALOM serial port, labeled SERIAL MGT, is for server management only. If you need a general purpose serial port, use the serial port labeled TTYB.

ALOM can send email notification of hardware failures and other events related to the server or to ALOM.


The ALOM circuitry uses standby power from the server. This means that:

• ALOM is active as soon as the server is connected to a power source, and until power is removed by unplugging the power cable.

• ALOM firmware and software continue to be effective when the server OS goes offline.

See TABLE 8-2 for a list of the components monitored by ALOM and the information it provides for each.

TABLE 8-2 What ALOM Monitors

| Component | Information |
|---|---|
| Hard disk drives | Presence and status |
| System and CPU fans | Speed and status |
| CPUs | Presence, temperature and any thermal warning or failure conditions |
| Power supplies | Presence and status |
| System temperature | Ambient temperature and any thermal warning or failure conditions |
| Server front panel | Status indicator |
| Voltage | Status and thresholds |
| SAS and USB circuit breakers | Status |

ALOM Management Ports

The default management port is labeled SERIAL MGT. This port uses an RJ-45 connector and is for server management only – it supports only ASCII connections to an external console. Use this port when you first begin to operate the server.

Another serial port – labeled TTYB – is available for general purpose serial data transfer. This port uses a DB-9 connector. For information on pinouts, refer to the Sun Fire V445 Server Installation Guide.

In addition, the server has one 10BASE-T Ethernet management domain interface, labeled NET MGT. To use this port, ALOM configuration is required. For more information, see the ALOM Online Help.

Setting the admin Password for ALOM

When you switch to the ALOM prompt after initial power-on, you will be logged in as the admin user and prompted to set a password. You must set this password in order to execute certain commands.

If you are prompted to do so, set a password for the admin user.

The password must:

• contain at least two alphabetic characters
• contain at least one numeric or one special character
• be at least six characters long
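The three password rules above can be checked before you type a candidate password at the ALOM prompt. The following Python sketch is illustrative only and is not part of ALOM. The function name is hypothetical, and treating any ASCII punctuation mark as a "special character" is an assumption, because this guide does not define the exact character set that ALOM accepts.

```python
import string


def is_valid_admin_password(password: str) -> bool:
    """Check a candidate password against the three rules listed above."""
    # Rule 1: at least two alphabetic characters.
    alphabetic = sum(ch.isalpha() for ch in password)
    # Rule 2: at least one numeric or special character.
    # Assumption: "special" means ASCII punctuation.
    numeric_or_special = sum(
        ch.isdigit() or ch in string.punctuation for ch in password
    )
    # Rule 3: at least six characters long.
    return len(password) >= 6 and alphabetic >= 2 and numeric_or_special >= 1
```

A password such as abc12! satisfies all three rules, while abcdef fails because it contains no numeric or special character.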

Once the password is set, the admin user has full permissions and can execute all ALOM CLI commands.

Basic ALOM Functions

This section covers some basic ALOM functions. For comprehensive documentation, refer to the ALOM Online Help.

To Switch to the ALOM Prompt

• Type the default keystroke sequence:

    # #.

Note – When you switch to the ALOM prompt, you will be logged in with the userid admin. See “Setting the admin Password for ALOM” on page 155.

To Switch to the Server Console Prompt

• Type:

    sc> console

More than one ALOM user can be connected to the server console stream at a time, but only one user is permitted to type input characters to the console.

If another user is logged on and has write capability, you will see the message below after issuing the console command:

    sc> Console session already in use. [view mode]

To take console write capability away from another user, type:

    sc> console -f
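The console sharing rule described above (many connected users, one writer, and a forced takeover with console -f) can be pictured with a small model. This Python sketch is a toy illustration, not ALOM code. The class name, method name, and user names are hypothetical, and what happens to the displaced user's session is not modeled because this guide does not describe it.

```python
class ConsoleStream:
    """Toy model of the server console: many viewers, at most one writer."""

    def __init__(self):
        self.writer = None  # user who currently holds write capability

    def connect(self, user, force=False):
        """Return the mode the user ends up in: 'write' or 'view'.

        force=True stands in for the -f option of the console command.
        """
        if self.writer is None or self.writer == user or force:
            self.writer = user
            return "write"
        # Another user holds write capability, so this session is view-only.
        return "view"
```

In this model, the first user to connect gets write capability, a second user gets view mode, and that second user can take write capability by connecting again with force=True.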

About Status Indicators

For a summary of the server’s LED status indicators, see “Front Panel Indicators” on page 10 and “Back Panel Indicators” on page 17.

About POST Diagnostics

POST is a firmware program that is useful in determining if a portion of the system has failed. POST verifies the core functionality of the system, including the CPU module(s), motherboard, memory, and some on-board I/O devices, and generates messages that can determine the nature of a hardware failure. POST can be run even if the system is unable to boot.

POST detects CPU and memory subsystem faults and is located in a SEEPROM on the MBC (ALOM) board. POST can be set to run by the OpenBoot program at power-on by setting three environment variables: diag-switch?, diag-trigger, and diag-level.

POST runs automatically when the system power is applied, or following a noncritical error reset, if all of the following conditions apply:

• diag-switch? is set to true or false (default is false)
• diag-level is set to min, max, or menus (default is min)
• diag-trigger is set to power-on-reset and error-reset (default is power-on-reset and error-reset)

If diag-level is set to min or max, POST performs an abbreviated or extended test, respectively. If diag-level is set to menus, a menu of all the tests executed at power-up is displayed. POST diagnostic and error message reports are displayed on a console.

For information on starting and controlling POST diagnostics, see “About the post Command” on page 165.

OpenBoot PROM Enhancements for Diagnostic Operation

This section describes the diagnostic operation enhancements provided by OpenBoot PROM Version 4.15 and later and presents information about how to use the resulting new operational features. Note that the behavior of certain operational features on your system might differ from the behavior described in this section.

What’s New in Diagnostic Operation

The following features are the diagnostic operation enhancements:

• New and redefined configuration variables simplify diagnostic controls and allow you to customize a “normal mode” of diagnostic operation for your environment. See “About the New and Redefined Configuration Variables” on page 158.
• New standard (default) configuration enables and runs diagnostics and enables Automatic System Restoration (ASR) capabilities at power-on and after error reset events. See “About the Default Configuration” on page 159.
• Service mode establishes a Sun prescribed methodology for isolating and diagnosing problems. See “About Service Mode” on page 162.
• The post command executes the power-on self-test (POST) and provides options that enable you to specify the level of diagnostic testing and verbosity of diagnostic output. See “About the post Command” on page 165.

About the New and Redefined Configuration Variables

New and redefined configuration variables simplify diagnostic operation and provide you with more control over the amount of diagnostic output. The following list summarizes the configuration variable changes. See TABLE 8-7 for complete descriptions of the variables.

• New variables:
    • service-mode? – Diagnostics are executed at a Sun-prescribed level.
    • diag-trigger – Replaces and consolidates the functions of post-trigger and obdiag-trigger.
    • verbosity – Controls the amount and detail of firmware output.
• Redefined variable:
    • diag-switch? parameter has modified behaviors for controlling diagnostic execution in normal mode on Sun UltraSPARC based volume servers. Behavior of the diag-switch? parameter is unchanged on Sun workstations.
• Default value changes:
    • auto-boot-on-error? – New default value is true.
    • diag-level – New default value is max.
    • error-reset-recovery – New default value is sync.

About the Default Configuration

The new standard (default) configuration runs diagnostic tests and enables full ASR capabilities during power-on and after the occurrence of an error reset (RED State Exception Reset, CPU Watchdog Reset, System Watchdog Reset, Software-Instruction Reset, or Hardware Fatal Reset). This is a change from the previous default configuration, which did not run diagnostic tests. When you power on your system for the first time, the change will be visible to you through the increased boot time and the display of approximately two screens of diagnostic output produced by POST and OpenBoot Diagnostics.

Note – The standard (default) configuration does not increase system boot time after a reset that is initiated by user commands from OpenBoot (reset-all or boot) or from Solaris (reboot, shutdown, or init).

The visible changes are due to the default settings of two configuration variables, diag-level (max) and verbosity (normal):

• diag-level (max) specifies maximum diagnostic testing, including extensive memory testing, which increases system boot time. See “Reference for Estimating System Boot Time (to the ok Prompt)” on page 168 for more information about the increased boot time.
• verbosity (normal) specifies that diagnostic messages and information will be displayed, which usually produces approximately two screens of output. See “Reference for Sample Outputs” on page 170 for diagnostic output samples of verbosity settings min and normal.

After initial power-on, you can customize the standard (default) configuration by setting the configuration variables to define a “normal mode” of operation that is appropriate for your production environment. TABLE 8-7 lists and describes the defaults and keywords of the OpenBoot configuration variables that control diagnostic testing and ASR capabilities. These are the variables you will set to define your normal mode of operation.

Note – The standard (default) configuration is recommended for improved fault isolation and system restoration, and for increased system availability.

TABLE 8-7 OpenBoot Configuration Variables That Control Diagnostic Testing and Automatic System Restoration

OpenBoot Configuration Variable: Description and Keywords

auto-boot?
    Determines whether the system automatically boots. Default is true.
    • true – System automatically boots after initialization, provided no firmware-based (diagnostics or OpenBoot) errors are detected.
    • false – System remains at the ok prompt until you type boot.

auto-boot-on-error?
    Determines whether the system attempts a degraded boot after a nonfatal error. Default is true.
    • true – System automatically boots after a nonfatal error if the variable auto-boot? is also set to true.
    • false – System remains at the ok prompt.

boot-device
    Specifies the name of the default boot device, which is also the normal mode boot device.

boot-file
    Specifies the default boot arguments, which are also the normal mode boot arguments.

diag-device
    Specifies the name of the boot device that is used when diag-switch? is true.

diag-file
    Specifies the boot arguments that are used when diag-switch? is true.

diag-level
    Specifies the level or type of diagnostics that are executed. Default is max.
    • off – No testing.
    • min – Basic tests are run.
    • max – More extensive tests might be run, depending on the device. Memory is extensively checked.

diag-out-console
    Redirects system console output to the system controller.
    • true – Redirects output to the system controller.
    • false – Restores output to the local console.
    Note: See your system documentation for information about redirecting system console output to the system controller. (Not all systems are equipped with a system controller.)

diag-passes
    Specifies the number of consecutive executions of OpenBoot Diagnostics self-tests that are run from the OpenBoot Diagnostics (obdiag) menu. Default is 1.
    Note: diag-passes applies only to systems with firmware that contains OpenBoot Diagnostics and has no effect outside the OpenBoot Diagnostics menu.

diag-script
    Determines which devices are tested by OpenBoot Diagnostics. Default is normal.
    • none – OpenBoot Diagnostics do not run.
    • normal – Tests all devices that are expected to be present in the system’s baseline configuration for which self-tests exist.
    • all – Tests all devices that have self-tests.

diag-switch?
    Controls diagnostic execution in normal mode. Default is false.
    For servers:
    • true – Diagnostics are only executed on power-on reset events, but the level of test coverage, verbosity, and output is determined by user-defined settings.
    • false – Diagnostics are executed upon next system reset, but only for those class of reset events specified by the OpenBoot configuration variable diag-trigger. The level of test coverage, verbosity, and output is determined by user-defined settings.
    For workstations:
    • true – Diagnostics are only executed on power-on reset events, but the level of test coverage, verbosity, and output is determined by user-defined settings.
    • false – Diagnostics are disabled.

diag-trigger
    Specifies the class of reset event that causes diagnostics to run automatically. Default setting is power-on-reset error-reset.
    • none – Diagnostic tests are not executed.
    • error-reset – Reset that is caused by certain hardware error events such as RED State Exception Reset, Watchdog Resets, Software-Instruction Reset, or Hardware Fatal Reset.
    • power-on-reset – Reset that is caused by power cycling the system.
    • user-reset – Reset that is initiated by an OS panic or by user-initiated commands from OpenBoot (reset-all or boot) or from Solaris (reboot, shutdown, or init).
    • all-resets – Any kind of system reset.
    Note: Both POST and OpenBoot Diagnostics run at the specified reset event if the variable diag-script is set to normal or all. If diag-script is set to none, only POST runs.

error-reset-recovery
    Specifies recovery action after an error reset. Default is sync.
    • none – No recovery action.
    • boot – System attempts to boot.
    • sync – Firmware attempts to execute a Solaris sync callback routine.

service-mode?
    Controls whether the system is in service mode. Default is false.
    • true – Service mode. Diagnostics are executed at Sun-specified levels, overriding but preserving user settings.
    • false – Normal mode. Diagnostics execution depends entirely on the settings of diag-switch? and other user-defined OpenBoot configuration variables.

test-args
    Customizes OpenBoot Diagnostics tests. Allows a text string of reserved keywords (separated by commas) to be specified in the following ways:
    • As an argument to the test command at the ok prompt.
    • As an OpenBoot variable to the setenv command at the ok or obdiag prompt.
    Note: The variable test-args applies only to systems with firmware that contains OpenBoot Diagnostics. See your system documentation for a list of keywords.

verbosity
    Controls the amount and detail of OpenBoot, POST, and OpenBoot Diagnostics output. Default is normal.
    • none – Only error and fatal messages are displayed on the system console. Banner is not displayed.
      Note: Problems in systems with verbosity set to none might be deemed not diagnosable, rendering the system unserviceable by Sun.
    • min – Notice, error, warning, and fatal messages are displayed on the system console. Transitional states and banner are also displayed.
    • normal – Summary progress and operational messages are displayed on the system console in addition to the messages displayed by the min setting. The work-in-progress indicator shows the status and progress of the boot sequence.
    • max – Detailed progress and operational messages are displayed on the system console in addition to the messages displayed by the min and normal settings.

About Service Mode

Service mode is an operational mode defined by Sun that facilitates fault isolation and recovery of systems that appear to be nonfunctional. When initiated, service mode overrides the settings of key OpenBoot configuration variables.

Note that service mode does not change your stored settings. After initialization (at the ok prompt), all OpenBoot PROM configuration variables revert to the user-defined settings. In this way, you or your service provider can quickly invoke a known and maximum level of diagnostics and still preserve your normal mode settings.
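The way service mode overrides settings without changing the stored values can be sketched as an overlay on top of your own configuration. The Python below is an illustration only, not firmware behavior. The function name and the dictionary representation are hypothetical, the override values are the ones listed in TABLE 8-8, and input-device and output-device are left out because this guide gives their service mode value only as "Factory default".

```python
# Override values taken from TABLE 8-8 (input-device and output-device omitted).
SERVICE_MODE_OVERRIDES = {
    "auto-boot?": "false",
    "diag-level": "max",
    "diag-trigger": "power-on-reset error-reset user-reset",
    "verbosity": "max",
    "diag-script": "normal",
    "test-args": "subtests,verbose",
}


def effective_settings(stored):
    """Return the settings in effect for this boot; `stored` is never modified."""
    effective = dict(stored)  # work on a copy so user settings are preserved
    if stored.get("service-mode?") == "true":
        effective.update(SERVICE_MODE_OVERRIDES)
    return effective
```

Because the overrides are applied to a copy, the stored dictionary still holds the user-defined values afterward, which mirrors the statement that service mode does not change your stored settings.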

TABLE 8-8 lists the OpenBoot configuration variables that are affected by service mode and the overrides that are applied when you select service mode.

TABLE 8-8 Service Mode Overrides

| OpenBoot Configuration Variable | Service Mode Override |
|---|---|
| auto-boot? | false |
| diag-level | max |
| diag-trigger | power-on-reset error-reset user-reset |
| input-device | Factory default |
| output-device | Factory default |
| verbosity | max |

The following apply only to systems with firmware that contains OpenBoot Diagnostics:

| OpenBoot Configuration Variable | Service Mode Override |
|---|---|
| diag-script | normal |
| test-args | subtests,verbose |

About Initiating Service Mode

Enhancements provide a software mechanism for specifying service mode:

service-mode? configuration variable – When set to true, initiates service mode. (Service mode should be used only by authorized Sun service providers.)

Note – The diag-switch? configuration variable should remain at the default setting (false) for normal operation. To specify diagnostic testing for your OS, see “To Initiate Normal Mode” on page 167.

For instructions, see “To Initiate Service Mode” on page 167.

About Overriding Service Mode Settings

When the system is in service mode, three commands can override service mode settings. TABLE 8-9 describes the effect of each command.

TABLE 8-9 Scenarios for Overriding Service Mode Settings

| Command | Issued From | What It Does |
|---|---|---|
| post | ok prompt | OpenBoot firmware forces a one-time execution of normal mode diagnostics. For information about normal mode, see “About Normal Mode” on page 164. For information about post command options, see “About the post Command” on page 165. |
| bootmode diag | system controller | OpenBoot firmware overrides service mode settings and forces a one-time execution of normal mode diagnostics. (1) |
| bootmode skip_diag | system controller | OpenBoot firmware suppresses service mode and bypasses all firmware diagnostics. (1) |

1 – If the system is not reset within 10 minutes of issuing the bootmode system controller command, the command is cleared.

Note – Not all systems are equipped with a system controller.

About Normal Mode

Normal mode is the customized operational mode that you define for your environment. To define normal mode, set the values of the OpenBoot configuration variables that control diagnostic testing. See TABLE 8-7 for the list of variables that control diagnostic testing.

Note – The standard (default) configuration is recommended for improved fault isolation and system restoration, and for increased system availability.
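When you choose normal mode values, it can help to see how the variables in TABLE 8-7 combine to decide which diagnostics run after a given reset. The following Python sketch is one reading of the server rows of that table, not firmware logic. The function name and dictionary representation are hypothetical, and treating diag-level = off as "nothing runs" is an assumption based on the "No testing" description.

```python
def diagnostics_for_reset(event, settings):
    """Return the diagnostics expected to run on a server for a reset event.

    `event` is "power-on-reset", "error-reset", or "user-reset".
    Missing variables fall back to the defaults listed in TABLE 8-7.
    """
    if settings.get("diag-level", "max") == "off":
        return []  # assumption: "No testing" means no diagnostics run
    if settings.get("diag-switch?", "false") == "true":
        # Servers: diagnostics only on power-on reset events.
        triggered = event == "power-on-reset"
    else:
        # diag-trigger holds one or more space-separated reset classes.
        triggers = settings.get("diag-trigger", "power-on-reset error-reset").split()
        triggered = "all-resets" in triggers or event in triggers
    if not triggered:
        return []
    if settings.get("diag-script", "normal") == "none":
        return ["POST"]  # OpenBoot Diagnostics do not run
    return ["POST", "OpenBoot Diagnostics"]
```

With the default values, this sketch reports that both POST and OpenBoot Diagnostics run after a power-on reset and that nothing runs after a user-initiated reset, which agrees with the note about boot time after reset-all, boot, reboot, shutdown, or init.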

When you are deciding whether to enable diagnostic testing in your normal environment, remember that you always should run diagnostics to troubleshoot an existing problem or after the following events:

• Initial system installation
• New hardware installation and replacement of defective hardware
• Hardware configuration modification
• Hardware relocation
• Firmware upgrade
• Power interruption or failure
• Hardware errors
• Severe or inexplicable software problems

About Initiating Normal Mode

If you define normal mode for your environment, you can specify normal mode with the following method:

System controller bootmode diag command – When you issue this command, it specifies normal mode with the configuration values defined by you, with the following exceptions:

• If you defined diag-level = off, bootmode diag specifies diagnostics at diag-level = min.
• If you defined verbosity = none, bootmode diag specifies diagnostics at verbosity = min.

Note – The next reset cycle must occur within 10 minutes of issuing the bootmode diag command or the bootmode command is cleared and normal mode is not initiated.
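The two exceptions and the 10-minute window described above can be summarized in a short sketch. This Python is illustrative only and does not reflect how the system controller is implemented. The function name, the dictionary representation, and the use of seconds for timestamps are assumptions made for the example.

```python
BOOTMODE_WINDOW_SECONDS = 10 * 60  # the 10-minute limit stated in the note above


def bootmode_diag_settings(normal_mode, issued_at, reset_at):
    """Return the settings used at the next reset, or None if the command expired.

    `issued_at` and `reset_at` are timestamps in seconds (hypothetical inputs).
    """
    if reset_at - issued_at > BOOTMODE_WINDOW_SECONDS:
        return None  # command was cleared; normal mode is not initiated
    settings = dict(normal_mode)  # leave the user-defined values untouched
    if settings.get("diag-level") == "off":
        settings["diag-level"] = "min"
    if settings.get("verbosity") == "none":
        settings["verbosity"] = "min"
    return settings
```

For example, user values of diag-level = off and verbosity = none are both raised to min when the reset happens five minutes after the command, and the command has no effect if the reset happens after the window has passed.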

For instructions, see “To Initiate Normal Mode” on page 167.

About the post Command

The post command enables you to easily invoke POST diagnostics and to control the level of testing and the amount of output. When you issue the post command, OpenBoot firmware performs the following actions:

• Initiates a user reset
• Triggers a one-time execution of POST at the test level and verbosity that you specify
• Clears old test results
• Displays and logs the new test results

Note – The post command overrides service mode settings and pending system controller bootmode diag and bootmode skip_diag commands.

The syntax for the post command is:

post [level [verbosity]]

where:

• level = min or max
• verbosity = min, normal, or max

The level and verbosity options provide the same functions as the OpenBoot configuration variables diag-level and verbosity. To determine which settings you should use for the post command options, see TABLE 8-7 for descriptions of the keywords for diag-level and verbosity.

You can specify settings for:

• Both level and verbosity
• level only (If you specify a verbosity setting, you must also specify a level setting.)
• Neither level nor verbosity


If you specify a setting for level only, the post command uses the normal mode value for verbosity with the following exception:

• If the normal mode value of verbosity = none, post uses verbosity = min.

If you specify settings for neither level nor verbosity, the post command uses the normal mode values you specified for the configuration variables, diag-level and verbosity, with two exceptions:

• If the normal mode value of diag-level = off, post uses level = min.
• If the normal mode value of verbosity = none, post uses verbosity = min.
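The fallback rules above for the optional level and verbosity arguments can be written out as a small resolver. This Python sketch only restates those rules and is not the firmware's implementation. The function name and the dictionary of normal mode values are hypothetical.

```python
def resolve_post_options(normal_mode, level=None, verbosity=None):
    """Return the (level, verbosity) pair that a post invocation would use.

    `normal_mode` holds the stored diag-level and verbosity values.
    """
    if verbosity is not None and level is None:
        # A verbosity setting is only allowed together with a level setting.
        raise ValueError("a verbosity setting requires a level setting")
    if level is None:
        level = normal_mode["diag-level"]
        if level == "off":
            level = "min"  # post never runs with testing turned off
    if verbosity is None:
        verbosity = normal_mode["verbosity"]
        if verbosity == "none":
            verbosity = "min"  # post always produces at least min output
    return level, verbosity
```

For instance, with normal mode values of diag-level = off and verbosity = none, issuing post with no arguments resolves to level min and verbosity min.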

To Initiate Service Mode

For background information, see “About Service Mode” on page 162.

1. Set the service-mode? variable. At the ok prompt, type:

    ok setenv service-mode? true

For service mode to take effect, you must reset the system.

2. At the ok prompt, type:

    ok reset-all

To Initiate Normal Mode

For background information, see “About Normal Mode” on page 164.

1. At the ok prompt, type:

    ok setenv service-mode? false

The system will not actually enter normal mode until the next reset.

2. Type:

    ok reset-all

Reference for Estimating System Boot Time (to the ok Prompt)

Note – The standard (default) configuration does not increase system boot time after a reset that is initiated by user commands from OpenBoot (reset-all or boot) or from Solaris (reboot, shutdown, or init).

The measurement of system boot time begins when you power on (or reset) the system and ends when the OpenBoot ok prompt appears. During the boot time period, the firmware executes diagnostics (POST and OpenBoot Diagnostics) and performs OpenBoot initialization. The time required to run OpenBoot Diagnostics and to perform OpenBoot setup, configuration, and initialization is generally similar for all systems, depending on the number of I/O cards installed when diag-script is set to all. However, at the default settings (diag-level = max and verbosity = normal), POST executes extensive memory tests, which will increase system boot time.

System boot time will vary from system to system, depending on the configuration of system memory and the number of CPUs:

• Because each CPU tests its associated memory and POST performs the memory tests simultaneously, memory test time will depend on the amount of memory on the most populated CPU.
• Because the competition for system resources makes CPU testing a less linear process than memory testing, CPU test time will depend on the number of CPUs.

If you need to know the approximate boot time of your new system before you power on for the first time, the following sections describe two methods you can use to estimate boot time:

• If your system configuration matches one of the three typical configurations cited in “Boot Time Estimates for Typical Configurations” on page 169, you can use the approximate boot time given for the appropriate configuration.
• If you know how the memory is configured among the CPUs, you can estimate the boot time for your specific system configuration using the method described in “Estimating Boot Time for Your System” on page 169.

Boot Time Estimates for Typical Configurations

The following are three typical configurations and the approximate boot time you can expect for each:

• Small configuration (2 CPUs and 4 Gbytes of memory) – Boot time is approximately 5 minutes.
• Medium configuration (4 CPUs and 16 Gbytes of memory) – Boot time is approximately 10 minutes.
• Large configuration (4 CPUs and 32 Gbytes of memory) – Boot time is approximately 15 minutes.


Estimating Boot Time for Your System

Generally, for systems configured with default settings, the times required to execute OpenBoot Diagnostics and to perform OpenBoot setup, configuration, and initialization are the same for all systems:

• 1 minute for OpenBoot Diagnostics testing. Systems with a greater number of devices to be tested might require more time.
• 2 minutes for OpenBoot setup, configuration, and initialization

To estimate the time required to run POST memory tests, you need to know the amount of memory associated with the most populated CPU. To estimate the time required to run POST CPU tests, you need to know the number of CPUs. Use the following guidelines to estimate memory and CPU test times:

• 2 minutes per Gbyte of memory associated with the most populated CPU
• 1 minute per CPU
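The guidelines above add up to a simple formula: 2 minutes per Gbyte on the most populated CPU, plus 1 minute per CPU, plus the fixed 1 minute for OpenBoot Diagnostics and 2 minutes for OpenBoot initialization. The Python sketch below applies that arithmetic. The function name and the list-of-Gbytes input are hypothetical, and the result is a rough estimate for default settings rather than a measured value.

```python
def estimate_boot_minutes(gbytes_per_cpu):
    """Estimate boot time (to the ok prompt) in minutes at default settings.

    `gbytes_per_cpu` lists the Gbytes of memory associated with each CPU.
    """
    post_memory = 2 * max(gbytes_per_cpu)  # 2 min per Gbyte on the most populated CPU
    post_cpu = 1 * len(gbytes_per_cpu)     # 1 min per CPU
    obdiag = 1                             # OpenBoot Diagnostics
    openboot_init = 2                      # OpenBoot setup, configuration, initialization
    return post_memory + post_cpu + obdiag + openboot_init
```

For a system with 8 CPUs whose most populated CPU has 8 Gbytes, the estimate is 16 + 8 + 1 + 2 = 27 minutes.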

The following example shows how to estimate the system boot time of a sample configuration consisting of 8 CPUs and 32 Gbytes of system memory, with 8 Gbytes of memory on the most populated CPU.

Sample Configuration

| CPU | Memory |
|---|---|
| CPU0 | 8 Gbytes |
| CPU1 | 4 Gbytes |
| CPU2 | 8 Gbytes |
| CPU3 | 4 Gbytes |
| CPU4 | 2 Gbytes |
| CPU5 | 2 Gbytes |
| CPU6 | 2 Gbytes |
| CPU7 | 2 Gbytes |

8 Gbytes on most populated CPU
8 CPUs in the system

Estimation of Boot Time

| Step | Calculation | Time |
|---|---|---|
| POST memory test | 8 Gbytes x 2 min per Gbyte | 16 min |
| POST CPU test | 8 CPUs x 1 min per CPU | 8 min |
| OpenBoot Diagnostics | | 1 min |
| OpenBoot initialization | | 2 min |
| Total system boot time (to the ok prompt) | | 27 min |

Reference for Sample Outputs

At the default setting of verbosity = normal, POST and OpenBoot Diagnostics generate less diagnostic output (about 2 pages) than was produced before the OpenBoot PROM enhancements (over 10 pages). This section includes output samples for verbosity settings at min and normal.

Note – The diag-level configuration variable also affects how much output the system generates. The following samples were produced with diag-level set to max, the default setting.

The following sample shows the firmware output after a power reset when verbosity is set to min. At this verbosity setting, OpenBoot firmware displays notice, error, warning, and fatal messages but does not display progress or operational messages. Transitional states and the power-on banner are also displayed. Since no error conditions were encountered, this sample shows only the POST execution message, the system’s install banner, and the device self-tests conducted by OpenBoot Diagnostics.


Executing POST w/%o0 = 0000.0400.0101.2041

Sun Fire V445, Keyboard Present

Copyright 1998-2006 Sun Microsystems, Inc. All rights reserved.

OpenBoot 4.15.0, 4096 MB memory installed, Serial #12980804.

Ethernet address 8:0:20:c6:12:44, Host ID: 80c61244.

Running diagnostic script obdiag/normal

Testing /pci@8,600000/network@1

Testing /pci@8,600000/SUNW,qlc@2

Testing /pci@9,700000/ebus@1/i2c@1,2e

Testing /pci@9,700000/ebus@1/i2c@1,30

Testing /pci@9,700000/ebus@1/i2c@1,50002e

Testing /pci@9,700000/ebus@1/i2c@1,500030

Testing /pci@9,700000/ebus@1/bbc@1,0

Testing /pci@9,700000/ebus@1/bbc@1,500000

Testing /pci@8,700000/scsi@1

Testing /pci@9,700000/network@1,1

Testing /pci@9,700000/usb@1,3

Testing /pci@9,700000/ebus@1/gpio@1,300600

Testing /pci@9,700000/ebus@1/pmc@1,300700

Testing /pci@9,700000/ebus@1/rtc@1,300070

{7} ok

The following sample shows the diagnostic output after a power reset when verbosity is set to normal, the default setting. At this verbosity setting, the OpenBoot firmware displays summary progress or operational messages in addition to the notice, error, warning, and fatal messages; transitional states; and install banner displayed by the min setting. On the console, the work-in-progress indicator shows the status and progress of the boot sequence.


Sun Fire V445, Keyboard Present

Copyright 1998-2004 Sun Microsystems, Inc. All rights reserved.

OpenBoot 4.15.0, 4096 MB memory installed, Serial #12980804.

Ethernet address 8:0:20:c6:12:44, Host ID: 80c61244.

Running diagnostic script obdiag/normal

Testing /pci@8,600000/network@1

Testing /pci@8,600000/SUNW,qlc@2

Testing /pci@9,700000/ebus@1/i2c@1,2e

Testing /pci@9,700000/ebus@1/i2c@1,30

Testing /pci@9,700000/ebus@1/i2c@1,50002e

Testing /pci@9,700000/ebus@1/i2c@1,500030

Testing /pci@9,700000/ebus@1/bbc@1,0

Testing /pci@9,700000/ebus@1/bbc@1,500000

Testing /pci@8,700000/scsi@1

Testing /pci@9,700000/network@1,1

Testing /pci@9,700000/usb@1,3

Testing /pci@9,700000/ebus@1/gpio@1,300600

Testing /pci@9,700000/ebus@1/pmc@1,300700

Testing /pci@9,700000/ebus@1/rtc@1,300070

{7} ok

Reference for Determining Diagnostic Mode

The flowchart in FIGURE 8-7 summarizes graphically how various system controller and OpenBoot variables affect whether a system boots in normal or service mode, as well as whether any overrides occur.

CODE EXAMPLE 8-1

{3} ok post

SC Alert: Host System has Reset

Executing Power On Self Test

Q#0>

0>@(#)Sun Fire[TM] V445 POST 4.22.11 2006/06/12 15:10

/export/delivery/delivery/4.22/4.22.11/post4.22.x/Fiesta/boston/

integrated (root)

0>Copyright © 2006 Sun Microsystems, Inc. All rights reserved

SUN PROPRIETARY/CONFIDENTIAL.

Use is subject to license terms.

0>OBP->POST Call with %o0=00000800.01012000.

0>Diag level set to MIN.

0>Verbosity level set to NORMAL.

0>Start Selftest.....

0>CPUs present in system: 0 1 2 3

0>Test CPU(s)....Done

0>Interrupt Crosscall....Done

0>Init Memory....|

SC Alert: Host System has Reset

'Done

0>PLL Reset....Done

0>Init Memory....Done

0>Test Memory....Done

0>IO-Bridge Tests....Done

0>INFO:


0> POST Passed all devices.

0>

0>POST: Return to OBP.

SC Alert: Host System has Reset

Configuring system memory & CPU(s)

Probing system devices

Probing memory

Probing I/O buses

screen not found.

keyboard not found.

Keyboard not present. Using ttya for input and output.

Probing system devices

Probing memory

Probing I/O buses

Sun Fire V445, No Keyboard

Copyright 2006 Sun Microsystems, Inc. All rights reserved.

OpenBoot 4.22.11, 24576 MB memory installed, Serial #64548465.

Ethernet address 0:3:ba:d8:ee:71, Host ID: 83d8ee71.


[Flowchart: after a system reset, the path depends on the system controller bootmode setting (skip_diag, diag, or normal), the service-mode? variable (true or false), the diag-switch? variable (true or false), the diag-trigger variable (error-reset, power-on-reset, user-reset, or none), and whether the reset is a power-on reset. The outcomes are service mode (Sun-prescribed level of diagnostics), normal mode (full user control), and system controller bootmode one-shot execution with some overrides, followed by OpenBoot test, configure, and initialize, ending at the ok prompt. Bold type indicates default values.]

FIGURE 8-7 Diagnostic Mode Flowchart

Quick Reference for Diagnostic Operation

TABLE 8-10 summarizes the effects of the following user actions on diagnostic operation:

• Set service-mode? to true

• Issue the bootmode commands, bootmode diag or bootmode skip_diag

• Issue the post command

TABLE 8-10 Summary of Diagnostic Operation

Service Mode

User Action: Set service-mode? to true

Sets Configuration Variables: Note: Service mode overrides the settings of the following configuration variables without changing your stored settings:
• auto-boot? = false
• diag-level = max
• diag-trigger = power-on-reset error-reset user-reset
• input-device = Factory default
• output-device = Factory default
• verbosity = max
The following apply only to systems with firmware that contains OpenBoot Diagnostics:
• diag-script = normal
• test-args = subtests,verbose

And Initiates: Service mode (defined by Sun)

Normal Mode

User Action: Set service-mode? to false

Sets Configuration Variables:
• auto-boot? = user-defined setting
• auto-boot-on-error? = user-defined setting
• diag-level = user-defined setting
• verbosity = user-defined setting
• diag-script = user-defined setting
• diag-trigger = user-defined setting
• input-device = user-defined setting
• output-device = user-defined setting

And Initiates: Normal mode (user-defined)

bootmode Commands

User Action: Issue bootmode diag command

Sets Configuration Variables: Overrides service mode settings and uses normal mode settings with the following exceptions:
• diag-level = min if normal mode value = off
• verbosity = min if normal mode value = none

And Initiates: Normal mode diagnostics with the exceptions in the preceding column.

User Action: Issue bootmode skip_diag command

And Initiates: OpenBoot initialization without running diagnostics

post Command

Note: If the value of diag-script = normal or all, OpenBoot Diagnostics also run.

User Action: Issue post command

And Initiates: POST diagnostics

User Action: Specify both level and verbosity

Sets Configuration Variables: level and verbosity = user-defined values

User Action: Specify neither level nor verbosity

Sets Configuration Variables: level and verbosity = normal mode values with the following exceptions:
• level = min if normal mode value of diag-level = none
• verbosity = min if normal mode value of verbosity = none

User Action: Specify level only

Sets Configuration Variables: level = user-defined value
verbosity = normal mode value for verbosity (Exception: verbosity = min if normal mode value of verbosity = none)

OpenBoot Diagnostics

Like POST diagnostics, OpenBoot Diagnostics code is firmware-based and resides in the boot PROM.

▼ To Start OpenBoot Diagnostics

1. Type:

ok setenv diag-switch? true
ok setenv auto-boot? false
ok reset-all

2. Type:

ok obdiag

This command displays the OpenBoot Diagnostics menu. See TABLE 8-13.

TABLE 8-13 Sample obdiag Menu

                          obdiag

1 LSILogic,sas@1      2 flashprom@0,0      3 network@0
4 rmc-comm@0,c28000   5 rtc@0,70           6 serial@0,c2c000
7 serial@3,fffff8

Commands: test test-all except help what setenv set-default exit

diag-passes=1 diag-level=min test-args=args

Note – If you have a PCI card installed in the server, then additional tests will appear on the obdiag menu.

3. Type:

obdiag> test n

where n represents the number corresponding to the test you want to run.

A summary of the tests is available. At the obdiag> prompt, type:

obdiag> help

4. To run all tests, type:

obdiag> test-all

Hit the spacebar to interrupt testing

Testing /pci@1f,700000/pci@0/pci@2/pci@0/pci@8/LSILogic,sas@1

......... passed

Testing /ebus@1f,464000/flashprom@0,0................................. passed

Testing /pci@1f,700000/pci@0/pci@2/pci@0/pci@8/pci@2/network@0

Internal loopback test -- succeeded.

Link is -- up

........ passed

Testing /ebus@1f,464000/rmc-comm@0,c28000

............................. passed

Testing /pci@1f,700000/pci@0/pci@1/pci@0/isa@1e/rtc@0,70

.............. passed

Testing /ebus@1f,464000/serial@0,c2c000

............................... passed

Testing /ebus@1f,464000/serial@3,fffff8

............................... passed

Pass:1 (of 1) Errors:0 (of 0) Tests Failed:0 Elapsed Time: 0:0:1:1

Hit any key to return to the main menu


Note – From the obdiag prompt you can select a device from the list and test it. However, at the ok prompt you need to use the full device path. In addition, the device needs to have a self-test method, otherwise errors will result.

Controlling OpenBoot Diagnostics Tests

Most of the OpenBoot configuration variables you use to control POST (see TABLE 8-7) also affect OpenBoot Diagnostics tests.

• Use the diag-level variable to control the OpenBoot Diagnostics testing level.

• Use test-args to customize how the tests run.

By default, test-args is set to contain an empty string. You can modify test-args using one or more of the reserved keywords shown in TABLE 8-17.

TABLE 8-17 Keywords for the test-args OpenBoot Configuration Variable

Keyword     What It Does

bist        Invokes built-in self-test (BIST) on external and peripheral devices
debug       Displays all debug messages
iopath      Verifies bus/interconnect integrity
loopback    Exercises external loopback path for the device
media       Verifies external and peripheral device media accessibility
restore     Attempts to restore original state of the device if the previous execution of the test failed
silent      Displays only errors rather than the status of each test
subtests    Displays main test and each subtest that is called
verbose     Displays detailed messages of status of all tests
callers=N   Displays backtrace of N callers when an error occurs
            • callers=0 - displays backtrace of all callers before the error
errors=N    Continues executing the test until N errors are encountered
            • errors=0 - displays all error reports without terminating testing


If you want to make multiple customizations to the OpenBoot Diagnostics testing, you can set test-args to a comma-separated list of keywords, as in this example:

ok setenv test-args debug,loopback,media

test and test-all Commands

You can also run OpenBoot Diagnostics tests directly from the ok prompt. To do this, type the test command, followed by the full hardware path of the device (or set of devices) to be tested. For example:

ok test /pci@x,y/SUNW,qlc@2

Note – Knowing how to construct an appropriate hardware device path requires precise knowledge of the hardware architecture of the Sun Fire V445 system.

To customize an individual test, you can use test-args as follows:

ok test /usb@1,3:test-args={verbose,debug}

This affects only the current test without changing the value of the test-args OpenBoot configuration variable.

You can test all the devices in the device tree with the test-all command:

ok test-all

If you specify a path argument to test-all, then only the specified device and its children are tested. The following example shows the command to test the USB bus and all devices with self-tests that are connected to the USB bus:

ok test-all /pci@9,700000/usb@1,3


OpenBoot Diagnostics Error Messages

OpenBoot Diagnostics error results are reported in a tabular format that contains a short summary of the problem, the hardware device affected, the subtest that failed, and other diagnostic information. The following example displays a sample OpenBoot Diagnostics error message.

CODE EXAMPLE 8-2 OpenBoot Diagnostics Error Message

Testing /pci@1e,600000/isa@7/flashprom@2,0

ERROR : There is no POST in this FLASHPROM or POST header is

unrecognized

DEVICE : /pci@1e,600000/isa@7/flashprom@2,0

SUBTEST : selftest:crc-subtest

MACHINE : Sun Fire V445

SERIAL# : 51347798

DATE : 03/05/2003 15:17:31 GMT

CONTROLS: diag-level=max test-args=errors=1

Error: /pci@1e,600000/isa@7/flashprom@2,0 selftest failed, return code = 1

Selftest at /pci@1e,600000/isa@7/flashprom@2,0 (errors=1) .............

failed

Pass:1 (of 1) Errors:1 (of 1) Tests Failed:1 Elapsed Time: 0:0:0:1


About OpenBoot Commands

OpenBoot commands are commands you type from the ok prompt. OpenBoot commands that can provide useful diagnostic information are:

• probe-scsi-all

• probe-ide

• show-devs

probe-scsi-all

The probe-scsi-all command diagnoses problems with the SAS devices.

Caution – If you used the halt command or the Stop-A key sequence to reach the ok prompt, then issuing the probe-scsi-all command can hang the system.

The probe-scsi-all command communicates with all SAS devices connected to on-board SAS controllers and accesses devices connected to any host adapters installed in PCI slots.

For any SAS device that is connected and active, the probe-scsi-all command displays its loop ID, host adapter, logical unit number, unique World Wide Name (WWN), and a device description that includes type and manufacturer.

The following is sample output from the probe-scsi-all command.

CODE EXAMPLE 8-3 Sample probe-scsi-all Command Output

{3} ok probe-scsi-all

/pci@1f,700000/pci@0/pci@2/pci@0/pci@8/LSILogic,sas@1

MPT Version 1.05, Firmware Version 1.08.04.00

Target 0

Unit 0 Disk SEAGATE ST973401LSUN72G 0356 143374738 Blocks, 73 GB

SASAddress 5000c50000246b35 PhyNum 0

Target 1

Unit 0 Disk SEAGATE ST973401LSUN72G 0356 143374738 Blocks, 73 GB

SASAddress 5000c50000246bc1 PhyNum 1

Target 4 Volume 0

Unit 0 Disk LSILOGICLogical Volume 3000 16515070 Blocks, 8455 MB

Target 6

Unit 0 Disk FUJITSU MAV2073RCSUN72G 0301 143374738 Blocks, 73 GB

SASAddress 500000e0116a81c2 PhyNum 6

{3} ok

probe-ide

The probe-ide command communicates with all Integrated Drive Electronics (IDE) devices connected to the IDE bus. This is the internal system bus for media devices such as the DVD drive.

Caution – If you used the halt command or the Stop-A key sequence to reach the ok prompt, then issuing the probe-ide command can hang the system.

The following is sample output from the probe-ide command.

CODE EXAMPLE 8-4 Sample probe-ide Command Output

{1} ok probe-ide

Device 0 ( Primary Master )

Removable ATAPI Model: DV-28E-B

Device 1 ( Primary Slave )

Not Present

Device 2 ( Secondary Master )

Not Present

Device 3 ( Secondary Slave )

Not Present


show-devs

The show-devs command lists the hardware device paths for each device in the firmware device tree. CODE EXAMPLE 8-5 shows some sample output.

CODE EXAMPLE 8-5 show-devs Command Output (Truncated)

/i2c@1f,520000

/ebus@1f,464000

/pci@1f,700000

/pci@1e,600000

/memory-controller@3,0

/SUNW,UltraSPARC-IIIi@3,0

/memory-controller@2,0

/SUNW,UltraSPARC-IIIi@2,0

/memory-controller@1,0

/SUNW,UltraSPARC-IIIi@1,0

/memory-controller@0,0

/SUNW,UltraSPARC-IIIi@0,0

/virtual-memory

/memory@m0,0

/aliases

/options

/openprom

/chosen

/packages

/i2c@1f,520000/cpu-fru-prom@0,e8


/i2c@1f,520000/dimm-spd@0,e6

/i2c@1f,520000/dimm-spd@0,e4

.

.

.

/pci@1f,700000/pci@0

/pci@1f,700000/pci@0/pci@9

/pci@1f,700000/pci@0/pci@8

/pci@1f,700000/pci@0/pci@2

/pci@1f,700000/pci@0/pci@1

/pci@1f,700000/pci@0/pci@2/pci@0

/pci@1f,700000/pci@0/pci@2/pci@0/pci@8

/pci@1f,700000/pci@0/pci@2/pci@0/network@4,1

/pci@1f,700000/pci@0/pci@2/pci@0/network@4

/pci@1f,700000/pci@0/pci@2/pci@0/pci@8/pci@2

/pci@1f,700000/pci@0/pci@2/pci@0/pci@8/LSILogic,sas@1

/pci@1f,700000/pci@0/pci@2/pci@0/pci@8/pci@2/network@0

/pci@1f,700000/pci@0/pci@2/pci@0/pci@8/LSILogic,sas@1/disk

/pci@1f,700000/pci@0/pci@2/pci@0/pci@8/LSILogic,sas@1/tape

▼ To Run OpenBoot Commands

1. Halt the system to reach the ok prompt.

How you do this depends on the system’s condition. If possible, you should warn users before you shut the system down.

2. Type the appropriate command at the console prompt.

About Predictive Self-Healing

In Solaris 10 systems, the Solaris Predictive Self-Healing (PSH) technology enables the Sun Fire V445 server to diagnose problems while the Solaris OS is running, and mitigate many problems before they negatively affect operations.

The Solaris OS uses the fault manager daemon, fmd(1M), which starts at boot time and runs in the background to monitor the system. If a component generates an error, the daemon handles the error by correlating the error with data from previous errors and other related information to diagnose the problem. Once diagnosed, the fault manager daemon assigns the problem a Universal Unique Identifier (UUID) that distinguishes the problem across any set of systems. When possible, the fault manager daemon initiates steps to self-heal the failed component and take the component offline. The daemon also logs the fault to the syslogd daemon and provides a fault notification with a message ID (MSGID). You can use the message ID to get additional information about the problem from Sun’s knowledge article database.

The Predictive Self-Healing technology covers the following Sun Fire V445 server components:

• UltraSPARC IIIi processors
• Memory
• I/O bus

The PSH console message provides the following information:

• Type
• Severity
• Description
• Automated Response
• Impact
• Suggested Action for System Administrator

If the Solaris PSH facility has detected a faulty component, use the fmdump command (described in the following subsections) to identify the fault. Faulty FRUs are identified in fault messages using the FRU name.

Use the following web site to interpret faults and obtain information on a fault:

http://www.sun.com/msg/

This web site directs you to provide the message ID that your system displayed. The web site then provides knowledge articles about the fault and corrective action to resolve the fault. The fault information and documentation at this web site is updated regularly.

You can find more detailed descriptions of Solaris 10 Predictive Self-Healing at the following web site:

http://www.sun.com/bigadmin/features/articles/selfheal.html

Predictive Self-Healing Tools

In summary, the Solaris Fault Manager daemon (fmd) performs the following functions:

• Receives telemetry information about problems detected by the system software.

• Diagnoses the problems and provides system generated messages.

• Initiates pro-active self-healing activities such as disabling faulty components.

TABLE 8-23 shows a typical message generated when a fault occurs on your system.

The message appears on your console and is recorded in the /var/adm/messages file.


Note – The messages in TABLE 8-23 indicate that the fault has already been diagnosed. Any corrective action that the system can perform has already taken place. If your server is still running, it continues to run.

TABLE 8-23 System Generated Predictive Self-Healing Message

Output Displayed:
Jul 1 14:30:20 sunrise EVENT-TIME: Tue Nov 1 16:30:20 PST 2005
Description:
EVENT-TIME: The time stamp of the diagnosis

Output Displayed:
Jul 1 14:30:20 sunrise PLATFORM: SUNW,A70, CSN: -, HOSTNAME: sunrise
Description:
PLATFORM: A description of the system encountering the problem

Output Displayed:
Jul 1 14:30:20 sunrise SOURCE: eft, REV: 1.13
Description:
SOURCE: Information on the Diagnosis Engine used to determine the fault

Output Displayed:
Jul 1 14:30:20 sunrise EVENT-ID: afc7e660-d609-4b2f-86b8-ae7c6b8d50c4
Description:
EVENT-ID: The Universally Unique event ID (UUID) for this fault

Output Displayed:
Jul 1 14:30:20 sunrise DESC:
Jul 1 14:30:20 sunrise A problem was detected in the PCI-Express subsystem
Description:
DESC: A basic description of the failure

Output Displayed:
Jul 1 14:30:20 sunrise Refer to http://sun.com/msg/SUN4-8000-0Y for more information.
Description:
WEBSITE: Where to find specific information and actions for this fault

Output Displayed:
Jul 1 14:30:20 sunrise AUTO-RESPONSE: One or more device instances may be disabled
Description:
AUTO-RESPONSE: What, if anything, the system did to alleviate any follow-on issues

Output Displayed:
Jul 1 14:30:20 sunrise IMPACT: Loss of services provided by the device instances associated with this fault
Description:
IMPACT: A description of what that response may have done

Output Displayed:
Jul 1 14:30:20 sunrise REC-ACTION: Schedule a repair procedure to replace the affected device. Use Nov 1 14:30:20 sunrise fmdump -v -u EVENT_ID to identify the device or contact Sun for support.
Description:
REC-ACTION: A short description of what the system administrator should do


Using the Predictive Self-Healing Commands

For complete information about Predictive Self-Healing commands, refer to the Solaris 10 man pages. This section describes some details of the following commands:

• fmdump(1M)
• fmadm(1M)
• fmstat(1M)

Using the fmdump Command

After the message in TABLE 8-23 is displayed, more information about the fault is available. The fmdump command displays the contents of any log files associated with the Solaris Fault Manager.

The fmdump command produces output similar to the following. This example assumes there is only one fault.

# fmdump
TIME UUID SUNW-MSG-ID
Jul 02 10:04:15.4911 0ee65618-2218-4997-c0dc-b5c410ed8ec2 SUN4-8000-0Y

fmdump -V

The -V option provides more details.

# fmdump -V -u 0ee65618-2218-4997-c0dc-b5c410ed8ec2
TIME UUID SUNW-MSG-ID
Jul 02 10:04:15.4911 0ee65618-2218-4997-c0dc-b5c410ed8ec2 SUN4-8000-0Y
100% fault.io.fire.asic
FRU: hc://product-id=SUNW,A70/motherboard=0
rsrc: hc:///motherboard=0/hostbridge=0/pciexrc=0

Three lines of new output are delivered with the -V option.

• The first line is a summary of information displayed previously in the console message but includes the timestamp, the UUID, and the Message-ID.

• The second line is a declaration of the certainty of the diagnosis. In this case the failure is in the ASIC described. If the diagnosis could involve multiple components, two lines would be displayed here with 50 percent in each, for example.

• The FRU line declares the part that needs to be replaced to return the system to a fully operational state.

• The rsrc line describes what component was taken out of service as a result of this fault.

fmdump -e

To get information about the errors that caused this failure, use the -e option.

# fmdump -e
TIME CLASS
Nov 02 10:04:14.3008 ereport.io.fire.jbc.mb_per
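If you collect fmdump output in a script, the summary row can be split into its three columns to build the follow-up command and the knowledge article address. The Python sketch below is illustrative only: the function names are hypothetical, the row layout is assumed to match the sample fmdump output in this section (a timestamp followed by the UUID and message ID), and the URL pattern is inferred from the address shown in the console message.

```python
def parse_fmdump_summary(row):
    """Split one fmdump summary row into (time, uuid, msg_id).

    Assumes the layout of the sample output: a timestamp made of
    several space-separated fields, then the UUID, then the message ID.
    """
    fields = row.split()
    if len(fields) < 3:
        raise ValueError("unexpected fmdump row: " + row)
    timestamp = " ".join(fields[:-2])
    return timestamp, fields[-2], fields[-1]


def detail_command(uuid):
    """Build the fmdump command that shows details for one fault."""
    return "fmdump -V -u " + uuid


def knowledge_article_url(msg_id):
    """Build the article address (pattern assumed from the console message)."""
    return "http://www.sun.com/msg/" + msg_id


row = "Jul 02 10:04:15.4911 0ee65618-2218-4997-c0dc-b5c410ed8ec2 SUN4-8000-0Y"
timestamp, uuid, msg_id = parse_fmdump_summary(row)
print(detail_command(uuid))
print(knowledge_article_url(msg_id))
```

Splitting from the right keeps the parser independent of how many fields the timestamp contains.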

Using the fmadm faulty Command

The fmadm faulty command lists and modifies system configuration parameters that are maintained by the Solaris Fault Manager. The fmadm faulty command is primarily used to determine the status of a component involved in a fault.

# fmadm faulty
STATE    RESOURCE / UUID
-------- -------------------------------------------------------------
degraded dev:////pci@1e,600000
         0ee65618-2218-4997-c0dc-b5c410ed8ec2

The PCI device is degraded and is associated with the same UUID as seen above. You may also see faulted states.

fmadm config

The fmadm config command output shows the version numbers of the diagnosis engines in use by your system, and also displays their current state. You can check these versions against information on the http://sunsolve.sun.com web site to determine if your server is using the latest diagnostic engines.

# fmadm config
MODULE              VERSION STATUS  DESCRIPTION
cpumem-diagnosis    1.5     active  UltraSPARC-III/IV CPU/Memory Diagnosis
cpumem-retire       1.1     active  CPU/Memory Retire Agent
eft                 1.16    active  eft diagnosis engine
fmd-self-diagnosis  1.0     active  Fault Manager Self-Diagnosis
io-retire           1.0     active  I/O Retire Agent
snmp-trapgen        1.0     active  SNMP Trap Generation Agent
sysevent-transport  1.0     active  SysEvent Transport Agent
syslog-msgs         1.0     active  Syslog Messaging Agent
zfs-diagnosis       1.0     active  ZFS Diagnosis Engine

Using the fmstat Command

The fmstat command can report statistics associated with the Solaris Fault Manager. The fmstat command shows information about diagnosis engine (DE) performance. In the example below, the eft DE (also seen in the console output) has received an event which it accepted. A case is opened for that event and a diagnosis is performed to solve the cause for the failure.

# fmstat
module              ev_recv ev_acpt wait  svc_t   %w %b open solve memsz bufsz
cpumem-diagnosis    0       0       0.0   0.0     0  0  0    0     3.0K  0
cpumem-retire       0       0       0.0   0.0     0  0  0    0     0     0
eft                 0       0       0.0   0.0     0  0  0    0     713K  0
fmd-self-diagnosis  0       0       0.0   0.0     0  0  0    0     0     0
io-retire           0       0       0.0   0.0     0  0  0    0     0     0
snmp-trapgen        0       0       0.0   0.0     0  0  0    0     32b   0
sysevent-transport  0       0       0.0   6704.4  1  0  0    0     0     0
syslog-msgs         0       0       0.0   0.0     0  0  0    0     0     0
zfs-diagnosis       0       0       0.0   0.0     0  0  0    0     0     0

About Traditional Solaris OS Diagnostic Tools

If a system passes OpenBoot Diagnostics tests, it normally attempts to boot its multiuser OS. For most Sun systems, this means the Solaris OS. Once the server is running in multiuser mode, you have access to the software-based exerciser tools, SunVTS and Sun Management Center. These tools enable you to monitor the server, exercise it, and isolate faults.

Note – If you set the auto-boot? OpenBoot configuration variable to false, the OS does not boot following completion of the firmware-based tests.

In addition to the tools mentioned above, you can refer to error and system message log files, and Solaris system information commands.

Error and System Message Log Files

Error and other system messages are saved in the /var/adm/messages file. Messages are logged to this file from many sources, including the OS, the environmental control subsystem, and various software applications.

Solaris System Information Commands

The following Solaris commands display data that you can use when assessing the condition of a Sun Fire V445 server:

• prtconf
• prtdiag
• prtfru
• psrinfo
• showrev

This section describes the information these commands give you. For more information on using these commands, refer to the Solaris man pages.


Using the prtconf Command

The prtconf command displays the Solaris device tree. This tree includes all the devices probed by OpenBoot firmware, as well as additional devices, like individual disks. The output of prtconf also includes the total amount of system memory. CODE EXAMPLE 8-6 shows an excerpt of prtconf output (truncated to save space).

CODE EXAMPLE 8-6 prtconf Command Output (Truncated)

# prtconf

System Configuration: Sun Microsystems sun4u

Memory size: 1024 Megabytes

System Peripherals (Software Nodes):

SUNW,Sun-Fire-V445

packages (driver not attached)

SUNW,builtin-drivers (driver not attached)

deblocker (driver not attached)

disk-label (driver not attached)

terminal-emulator (driver not attached)

dropins (driver not attached)

kbd-translator (driver not attached)

obp-tftp (driver not attached)

SUNW,i2c-ram-device (driver not attached)

SUNW,fru-device (driver not attached)

ufs-file-system (driver not attached)

chosen (driver not attached)

openprom (driver not attached)

client-services (driver not attached)

options, instance #0

aliases (driver not attached)

memory (driver not attached)

virtual-memory (driver not attached)

SUNW,UltraSPARC-IIIi (driver not attached)

memory-controller, instance #0

SUNW,UltraSPARC-IIIi (driver not attached)

memory-controller, instance #1 ...

The prtconf command -p option produces output similar to the OpenBoot show-devs command. This output lists only those devices compiled by the system firmware.

Using the prtdiag Command

The prtdiag command displays a table of diagnostic information that summarizes the status of system components.


Base Address Size Interleave Factor Contains

-----------------------------------------------------------------------

0x0 8GB 16 BankIDs

0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15

0x1000000000 8GB 16 BankIDs

16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31

0x2000000000 4GB 4 BankIDs 32,33,34,35

0x3000000000 4GB 4 BankIDs 48,49,50,51

Bank Table:

-----------------------------------------------------------

Physical Location

ID ControllerID GroupID Size Interleave Way

-----------------------------------------------------------

0 0 0 512MB 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15

1 0 0 512MB

2 0 1 512MB

3 0 1 512MB

4 0 0 512MB

5 0 0 512MB

6 0 1 512MB

7 0 1 512MB

8 0 1 512MB

9 0 1 512MB

10 0 0 512MB

11 0 0 512MB

12 0 1 512MB

13 0 1 512MB

CODE EXAMPLE 8-7 prtdiag Command Output (Continued)

14 0 0 512MB

15 0 0 512MB

16 1 0 512MB 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15

17 1 0 512MB

18 1 1 512MB

19 1 1 512MB

20 1 0 512MB

21 1 0 512MB

22 1 1 512MB

23 1 1 512MB

24 1 1 512MB

25 1 1 512MB

26 1 0 512MB

27 1 0 512MB

28 1 1 512MB

29 1 1 512MB

30 1 0 512MB

31 1 0 512MB

32 2 0 1GB 0,1,2,3

33 2 1 1GB

34 2 1 1GB

35 2 0 1GB

48 3 0 1GB 0,1,2,3

49 3 1 1GB

50 3 1 1GB

51 3 0 1GB

Memory Module Groups:

--------------------------------------------------

ControllerID GroupID Labels Status

--------------------------------------------------

0 0 MB/C0/P0/B0/D0

0 0 MB/C0/P0/B0/D1

0 1 MB/C0/P0/B1/D0

0 1 MB/C0/P0/B1/D1

1 0 MB/C1/P0/B0/D0

1 0 MB/C1/P0/B0/D1

1 1 MB/C1/P0/B1/D0

1 1 MB/C1/P0/B1/D1

2 0 MB/C2/P0/B0/D0

2 0 MB/C2/P0/B0/D1

2 1 MB/C2/P0/B1/D0

2 1 MB/C2/P0/B1/D1

3 0 MB/C3/P0/B0/D0

3 0 MB/C3/P0/B0/D1

3 1 MB/C3/P0/B1/D0

3 1 MB/C3/P0/B1/D1

=============================== usb Devices ===============================

Name Port#

------------ -----

hub HUB0

bash-3.00#

Verbose output with fan tach fail

============================ Environmental Status ============================

Fan Status:

-------------------------------------------

Location Sensor Status

-------------------------------------------

MB/FT0/F0 TACH okay

MB/FT1/F0 TACH failed (0 rpm)

MB/FT2/F0 TACH okay

MB/FT5/F0 TACH okay

PS1 FF_FAN okay

PS3 FF_FAN okay

Temperature sensors:

-----------------------------------------

Location Sensor Status

-----------------------------------------

MB/C0/P0 T_CORE okay

MB/C1/P0 T_CORE okay

MB/C2/P0 T_CORE okay

MB/C3/P0 T_CORE okay

MB/C0 T_AMB okay

MB/C1 T_AMB okay

MB/C2 T_AMB okay

MB/C3 T_AMB okay

MB T_CORE okay

MB IO_T_AMB okay

MB/FIOB T_AMB okay

MB T_AMB okay

PS1 FF_OT okay

PS3 FF_OT okay

------------------------------------

Current sensors:

----------------------------------------

Location Sensor Status

----------------------------------------

MB/USB0 I_USB0 okay

MB/USB1 I_USB1 okay

In addition to the information in CODE EXAMPLE 8-7, prtdiag with the verbose option (-v) also reports on front panel status, disk status, fan status, power supplies, hardware revisions, and system temperatures.
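The memory segment table and the bank table in CODE EXAMPLE 8-7 can be cross-checked against each other: the size reported for a segment should equal the combined size of the banks it contains. The following Python sketch is illustrative only. It hard-codes the values printed in the example rather than parsing live prtdiag output, and the names bank_size_mb, SEGMENTS, and segment_size_mb are assumptions made for this illustration, not part of any Sun tool.

```python
# Bank sizes in megabytes, keyed by bank ID, copied from the Bank Table.
bank_size_mb = {}
bank_size_mb.update({bank: 512 for bank in range(0, 32)})
bank_size_mb.update({bank: 1024 for bank in (32, 33, 34, 35, 48, 49, 50, 51)})

# (base address, reported size in gigabytes, bank IDs) from the segment table.
SEGMENTS = [
    (0x0, 8, range(0, 16)),
    (0x1000000000, 8, range(16, 32)),
    (0x2000000000, 4, range(32, 36)),
    (0x3000000000, 4, range(48, 52)),
]


def segment_size_mb(bank_ids):
    """Add up the sizes of the banks that make up one memory segment."""
    return sum(bank_size_mb[bank] for bank in bank_ids)


for base, reported_gb, bank_ids in SEGMENTS:
    computed_mb = segment_size_mb(bank_ids)
    matches = computed_mb == reported_gb * 1024
    print(hex(base), computed_mb, "MB", "consistent" if matches else "MISMATCH")

print("total:", sum(bank_size_mb.values()), "MB")
```

For the values in the example, every segment is consistent with its banks, and the bank sizes add up to 24 Gbytes. The interleave factor of each segment also equals the number of banks listed for it.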

CODE EXAMPLE 8-8 prtdiag Verbose Output

System Temperatures (Celsius):

-------------------------------

Device Temperature Status

---------------------------------------

CPU0 59 OK

CPU2 64 OK

DBP0 22 OK
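Each row of the temperature table ends in a Status value, so captured output can be screened with a short script. The following Python sketch assumes that the table has been saved as text in the three-column layout used in this chapter. The function name and the decision to treat any status other than OK as a problem are assumptions made for illustration; they are not part of prtdiag.

```python
def find_temperature_problems(table_text):
    """Return (device, temperature, status) tuples whose status is not OK.

    Assumes data rows of the form "<device> <temperature> <status>";
    title, header, and separator lines are skipped.
    """
    problems = []
    for line in table_text.splitlines():
        fields = line.split()
        # A data row has exactly three fields and a numeric temperature.
        if len(fields) != 3 or not fields[1].isdigit():
            continue
        device, temperature, status = fields
        if status != "OK":
            problems.append((device, int(temperature), status))
    return problems


# Sample rows in the same layout as the System Temperatures table.
sample = """\
System Temperatures (Celsius):
-------------------------------
Device Temperature Status
---------------------------------------
CPU0 62 OK
CPU1 102 ERROR
"""

for device, temperature, status in find_temperature_problems(sample):
    print(device, temperature, status)
```

A script like this is only a convenience for scanning saved output. The Status column that prtdiag prints remains the authoritative indication.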

In the event of an overtemperature condition, prtdiag reports an error in the Status column.

CODE EXAMPLE 8-9 prtdiag Overtemperature Indication Output

System Temperatures (Celsius):

-------------------------------

Device Temperature Status

---------------------------------------

CPU0 62 OK

CPU1 102 ERROR

Similarly, if there is a failure of a particular component, prtdiag reports a fault in the appropriate Status column.

CODE EXAMPLE 8-10 prtdiag Fault Indication Output

Fan Status:

-----------

Bank RPM Status

---- ----- ------

CPU0 4166 [NO_FAULT]

CPU1 0000 [FAULT]

Using the prtfru Command

The Sun Fire V445 system maintains a hierarchical list of all FRUs in the system, as well as specific information about various FRUs.

/InstallationR[0]/Parent_Part_Number: 5017066

/InstallationR[0]/Parent_Serial_Number: BM004E

/InstallationR[0]/Parent_Dash_Level: 05

/InstallationR[0]/System_Id:

/InstallationR[0]/System_Tz: 238

/InstallationR[0]/Geo_North: 15658734

/InstallationR[0]/Geo_East: 15658734

/InstallationR[0]/Geo_Alt: 238

/InstallationR[0]/Geo_Location:

/InstallationR[1]

/InstallationR[1]/UNIX_Timestamp32: Mon Mar 6 10:08:30 EST 2006

/InstallationR[1]/Fru_Path: MB.SEEPROM

/InstallationR[1]/Parent_Part_Number: 3753302

/InstallationR[1]/Parent_Serial_Number: 0001

/InstallationR[1]/Parent_Dash_Level: 03

/InstallationR[1]/System_Id:

/InstallationR[1]/System_Tz: 238

/InstallationR[1]/Geo_North: 15658734

/InstallationR[1]/Geo_East: 15658734

/InstallationR[1]/Geo_Alt: 238

/InstallationR[1]/Geo_Location:

/InstallationR[2]

/InstallationR[2]/UNIX_Timestamp32: Tue Apr 18 10:00:45 EDT 2006

/InstallationR[2]/Fru_Path: MB.SEEPROM

/InstallationR[2]/Parent_Part_Number: 5017066

/InstallationR[2]/Parent_Serial_Number: BM004E

/InstallationR[2]/Parent_Dash_Level: 05

/InstallationR[2]/System_Id:

CODE EXAMPLE 8-12 prtfru -c Command Output (Continued)

/InstallationR[2]/System_Tz: 0

/InstallationR[2]/Geo_North: 12704

/InstallationR[2]/Geo_East: 1

/InstallationR[2]/Geo_Alt: 251

/InstallationR[2]/Geo_Location:

/InstallationR[3]

/InstallationR[3]/UNIX_Timestamp32: Fri Apr 21 08:50:32 EDT 2006

/InstallationR[3]/Fru_Path: MB.SEEPROM

/InstallationR[3]/Parent_Part_Number: 3753302

/InstallationR[3]/Parent_Serial_Number: 0001

/InstallationR[3]/Parent_Dash_Level: 03

/InstallationR[3]/System_Id:

/InstallationR[3]/System_Tz: 0

/InstallationR[3]/Geo_North: 1

/InstallationR[3]/Geo_East: 16531457

/InstallationR[3]/Geo_Alt: 251

/InstallationR[3]/Geo_Location:

/Status_EventsR (0 iterations)

SEGMENT: PE

/Power_EventsR (50 iterations)

/Power_EventsR[0]

/Power_EventsR[0]/UNIX_Timestamp32: Mon Jul 10 12:34:20 EDT 2006

/Power_EventsR[0]/Event: power_on

/Power_EventsR[1]

/Power_EventsR[1]/UNIX_Timestamp32: Mon Jul 10 12:34:49 EDT 2006

/Power_EventsR[1]/Event: power_off

/Power_EventsR[2]

/Power_EventsR[2]/UNIX_Timestamp32: Mon Jul 10 12:35:27 EDT 2006

/Power_EventsR[2]/Event: power_on

/Power_EventsR[3]

/Power_EventsR[3]/UNIX_Timestamp32: Mon Jul 10 12:58:43 EDT 2006

/Power_EventsR[3]/Event: power_off

/Power_EventsR[4]

/Power_EventsR[4]/UNIX_Timestamp32: Mon Jul 10 13:07:27 EDT 2006

/Power_EventsR[4]/Event: power_on

/Power_EventsR[5]

/Power_EventsR[5]/UNIX_Timestamp32: Mon Jul 10 14:07:20 EDT 2006

/Power_EventsR[5]/Event: power_off

/Power_EventsR[6]

/Power_EventsR[6]/UNIX_Timestamp32: Mon Jul 10 14:07:21 EDT 2006

/Power_EventsR[6]/Event: power_on

/Power_EventsR[7]

/Power_EventsR[7]/UNIX_Timestamp32: Mon Jul 10 14:17:01 EDT 2006

/Power_EventsR[7]/Event: power_off

/Power_EventsR[8]

/Power_EventsR[8]/UNIX_Timestamp32: Mon Jul 10 14:40:22 EDT 2006

/Power_EventsR[8]/Event: power_on

/Power_EventsR[9]

/Power_EventsR[9]/UNIX_Timestamp32: Mon Jul 10 14:42:38 EDT 2006

/Power_EventsR[9]/Event: power_off

/Power_EventsR[10]

/Power_EventsR[10]/UNIX_Timestamp32: Mon Jul 10 16:12:35 EDT 2006

/Power_EventsR[10]/Event: power_on

/Power_EventsR[11]

/Power_EventsR[11]/UNIX_Timestamp32: Tue Jul 11 08:53:47 EDT 2006

/Power_EventsR[11]/Event: power_off

/Power_EventsR[12]

Data displayed by the prtfru command varies depending on the type of FRU. In general, it includes:

■ FRU description

■ Manufacturer name and location

■ Part number and serial number

■ Hardware revision levels
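The Power_EventsR records in CODE EXAMPLE 8-12 follow a regular path-and-value layout, which makes them straightforward to collect into a power history. The following Python sketch assumes that prtfru -c output has been captured as text in that layout. The function name and the returned tuple shape are choices made for this illustration and are not part of prtfru.

```python
import re

# Matches lines such as "/Power_EventsR[3]/Event: power_off".
FIELD = re.compile(r"^/Power_EventsR\[(\d+)\]/(UNIX_Timestamp32|Event):\s*(.*)$")


def parse_power_events(prtfru_text):
    """Collect Power_EventsR records as (index, timestamp, event) tuples."""
    records = {}
    for line in prtfru_text.splitlines():
        match = FIELD.match(line.strip())
        if not match:
            continue  # Skip bare record headers and unrelated lines.
        index = int(match.group(1))
        records.setdefault(index, {})[match.group(2)] = match.group(3)
    return [
        (index, fields.get("UNIX_Timestamp32"), fields.get("Event"))
        for index, fields in sorted(records.items())
    ]


sample = """\
/Power_EventsR[0]
/Power_EventsR[0]/UNIX_Timestamp32: Mon Jul 10 12:34:20 EDT 2006
/Power_EventsR[0]/Event: power_on
/Power_EventsR[1]
/Power_EventsR[1]/UNIX_Timestamp32: Mon Jul 10 12:34:49 EDT 2006
/Power_EventsR[1]/Event: power_off
"""

for index, timestamp, event in parse_power_events(sample):
    print(index, timestamp, event)
```

A record whose timestamp or event line is missing from the captured text is returned with None in that position, so truncated output does not cause an error.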

When used with the -p option, the showrev command displays installed patches. TABLE 8-30 shows a partial sample output from the showrev command with the -p option.

▼ To Run Solaris System Information Commands

1. Decide what kind of system information you want to display.

For more information, see “Solaris System Information Commands” on page 192.

2. Type the appropriate command at a console prompt.

See TABLE 8-31 for a summary of the commands.

TABLE 8-30 showrev -p Command Output

Patch: 109729-01 Obsoletes: Requires: Incompatibles: Packages: SUNWcsu

Patch: 109783-01 Obsoletes: Requires: Incompatibles: Packages: SUNWcsu

Patch: 109807-01 Obsoletes: Requires: Incompatibles: Packages: SUNWcsu

Patch: 109809-01 Obsoletes: Requires: Incompatibles: Packages: SUNWcsu

Patch: 110905-01 Obsoletes: Requires: Incompatibles: Packages: SUNWcsu

Patch: 110910-01 Obsoletes: Requires: Incompatibles: Packages: SUNWcsu

Patch: 110914-01 Obsoletes: Requires: Incompatibles: Packages: SUNWcsu

Patch: 108964-04 Obsoletes: Requires: Incompatibles: Packages: SUNWcsr
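Each line of the showrev -p listing in TABLE 8-30 carries the same five labeled fields, so a saved listing can be turned into records for searching or comparison. The following Python sketch assumes the line layout shown in TABLE 8-30 and assumes that multiple values in a field are separated by commas. The function name and dictionary keys are choices made for this illustration.

```python
import re

# One showrev -p line: a patch ID followed by four labeled, possibly empty lists.
PATCH_LINE = re.compile(
    r"Patch:\s*(\S+)\s+Obsoletes:\s*(.*?)\s*Requires:\s*(.*?)\s*"
    r"Incompatibles:\s*(.*?)\s*Packages:\s*(.*)$"
)


def parse_showrev_line(line):
    """Return a dict for one showrev -p line, or None if it does not match."""
    match = PATCH_LINE.match(line.strip())
    if match is None:
        return None
    patch_id, *raw_lists = match.groups()
    record = {"patch": patch_id}
    keys = ("obsoletes", "requires", "incompatibles", "packages")
    for key, raw in zip(keys, raw_lists):
        # Assumption: multiple values are comma-separated.
        record[key] = [item.strip() for item in raw.split(",") if item.strip()]
    return record


line = "Patch: 108964-04 Obsoletes: Requires: Incompatibles: Packages: SUNWcsr"
print(parse_showrev_line(line))
```

Lines that do not follow this layout return None, which lets a caller skip prompts or blank lines in a captured session.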

TABLE 8-31 Using Solaris Information Display Commands

Command  What It Displays  What to Type  Notes

fmadm  Fault management information  /usr/sbin/fmadm  Lists information and changes settings.

fmdump  Fault management information  /usr/sbin/fmdump  Use the -v option for additional detail.

prtconf  System configuration information  /usr/sbin/prtconf  –

prtdiag  Diagnostic and configuration information  /usr/platform/sun4u/sbin/prtdiag  Use the -v option for additional detail.

prtfru  FRU hierarchy and SEEPROM memory contents  /usr/sbin/prtfru  Use the -l option to display hierarchy. Use the -c option to display SEEPROM data.

psrinfo  Date and time each CPU came online; processor clock speed  /usr/sbin/psrinfo  Use the -v option to obtain clock speed and other data.

showrev  Hardware and software revision information  /usr/bin/showrev  Use the -p option to show software patches.

Viewing Recent Diagnostic Test Results

A summary of the results of the most recent power-on self-test (POST) is saved across power cycles.

▼ To View Recent Test Results

1. Obtain the ok prompt.

2. To see a summary of the most recent POST results, type:

TABLE 8-32

ok show-post-results

Setting OpenBoot Configuration Variables

Switches and diagnostic configuration variables stored in the IDPROM determine how and when power-on self-test (POST) diagnostics and OpenBoot Diagnostics tests are performed. This section explains how to access and modify OpenBoot configuration variables. For a list of important OpenBoot configuration variables, see TABLE 8-7.

Changes to OpenBoot configuration variables usually take effect upon the next reboot.

▼ To View and Set OpenBoot Configuration Variables

1. Obtain the ok prompt.

■ To display the current values of all OpenBoot configuration variables, use the printenv command.

The following example shows a short excerpt of this command’s output.

TABLE 8-33

ok printenv

Variable Name Value Default Value

diag-level min min

diag-switch? false false

■ To set or change the value of an OpenBoot configuration variable, use the setenv command:

TABLE 8-34

ok setenv diag-level max

diag-level = max

To set OpenBoot configuration variables that accept multiple keywords, separate keywords with a space.
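Because printenv lists the current value next to the default value, a saved listing can be checked for variables that have been changed from their defaults. The following Python sketch assumes a captured listing in the three-column layout of the printenv excerpt above. It handles only rows whose value and default are each a single word; variables with empty or multiple-keyword values are skipped, which is a limitation of this illustration. The function name is hypothetical.

```python
def changed_variables(printenv_text):
    """Map variable name to (value, default) where the two differ.

    Only simple rows of exactly three whitespace-separated fields are
    considered; the prompt line and the column header are skipped.
    """
    changed = {}
    for line in printenv_text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue
        name, value, default = fields
        if value != default:
            changed[name] = (value, default)
    return changed


# Hypothetical listing captured after diag-level has been set to max.
sample = """\
ok printenv
Variable Name Value Default Value
diag-level max min
diag-switch? false false
"""

print(changed_variables(sample))
```

A more complete parser would need to know the column positions used by the firmware, because values that contain spaces cannot be separated reliably by splitting on whitespace.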

Additional Diagnostic Tests for Specific Devices

Using the probe-scsi Command to Confirm That Hard Disk Drives Are Active

The probe-scsi command transmits an inquiry to SAS devices connected to the system’s internal SAS interface. If a SAS device is connected and active, the command displays the unit number, device type, and manufacturer name for that device.

CODE EXAMPLE 8-15 probe-scsi Output Message

ok probe-scsi

Target 0

Unit 0 Disk SEAGATE ST336605LSUN36G 4207

Target 1

Unit 0 Disk SEAGATE ST336605LSUN36G 0136

The probe-scsi-all command transmits an inquiry to all SAS devices connected to both the system’s internal and its external SAS interfaces. CODE EXAMPLE 8-16 shows sample output from a server with no externally connected SAS devices but containing two 36 Gbyte Hard Disk Drives, both of them active.

CODE EXAMPLE 8-16 probe-scsi-all Output Message

ok probe-scsi-all

/pci@1f,0/pci@1/scsi@8,1

/pci@1f,0/pci@1/scsi@8

Target 0

Unit 0 Disk SEAGATE ST336605LSUN36G 4207

Target 1

Unit 0 Disk SEAGATE ST336605LSUN36G 0136

Using the probe-ide Command to Confirm That the DVD Drive Is Connected

The probe-ide command transmits an inquiry command to internal and external IDE devices connected to the system’s on-board IDE interface. The following sample output reports a DVD drive installed (as Device 0) and active in a server.

CODE EXAMPLE 8-17 probe-ide Output Message

ok probe-ide

Device 0 ( Primary Master )

Removable ATAPI Model: DV-28E-B

Device 1 ( Primary Slave )

Not Present

Device 2 ( Secondary Master )

Not Present

Device 3 ( Secondary Slave )

Not Present

Using the watch-net and watch-net-all Commands to Check the Network Connections

The watch-net diagnostics test monitors Ethernet packets on the primary network interface. The watch-net-all diagnostics test monitors Ethernet packets on the primary network interface and on any additional network interfaces connected to the system board. Good packets received by the system are indicated by a period (.). Errors such as the framing error and the cyclic redundancy check (CRC) error are indicated with an X and an associated error description.

If the kernel hangs and the watchdog times out, ALOM reports and logs the event and performs one of three user-configurable actions.

■ xir: This is the default action and causes the server to capture CPU register and memory contents to the dump-device using the firmware-level sync command. If the sync hangs, ALOM falls back to a hard reset after 15 minutes.

Note – Do not confuse this OpenBoot sync command with the Solaris OS sync command, which results in I/O writes of buffered data to the disk drives, prior to unmounting file systems.

■ Reset: This is a hard reset and results in a rapid system recovery, but diagnostic data regarding the hang is not stored, and file system damage may result.

■ None: This results in the system being left in the hung state indefinitely after the watchdog timeout has been reported.

For more information, see the sys_autorestart section of the ALOM Online Help.

About Automatic System Restoration

Note – Automatic System Restoration (ASR) is not the same as Automatic Server Restart, which the Sun Fire V445 server also supports.

Automatic System Restoration (ASR) consists of self-test features and an auto-configuring capability to detect failed hardware components and unconfigure them. By doing this, the server is able to resume operating after certain nonfatal hardware faults or failures have occurred.

If a component is one that is monitored by ASR, and the server is capable of operating without it, the server will automatically reboot if that component should develop a fault or fail.

ASR monitors the following components:

■ Memory modules

■ PCI cards

If a fault is detected during the power-on sequence, the faulty component is disabled. If the system remains capable of functioning, the boot sequence continues.

If a fault occurs on a running server, and it is possible for the server to run without the failed component, the server automatically reboots. This prevents a faulty hardware component from keeping the entire system down or causing the system to crash repeatedly.

To support such a degraded boot capability, the OpenBoot firmware uses the 1275 Client Interface (via the device tree) to mark a device as either failed or disabled, by creating an appropriate status property in the device tree node. The Solaris OS will not activate a driver for any subsystem so marked.

As long as a failed component is electrically dormant (not causing random bus errors or signal noise, for example), the system will reboot automatically and resume operation while a service call is made.

Note – ASR is enabled by default.

Auto-Boot Options

The OpenBoot firmware stores two configuration variables, auto-boot? and auto-boot-on-error?, on a ROM chip. The default setting on the Sun Fire V445 server for both of these variables is true.

The auto-boot? setting controls whether or not the firmware automatically boots the OS after each reset. The auto-boot-on-error? setting controls whether the system will attempt a degraded boot when a subsystem failure is detected. Both the auto-boot? and auto-boot-on-error? settings must be set to true (default) to enable an automatic degraded boot.

▼ To Set the Auto-Boot Switches

1. Type:

ok setenv auto-boot? true

ok setenv auto-boot-on-error? true

Note – With both of these variables set to true, the system attempts a degraded boot in response to any fatal nonrecoverable error.

Error Handling Summary

Error handling during the power-on sequence falls into one of the following three cases:

■ If no errors are detected by POST or OpenBoot Diagnostics, the system attempts to boot if auto-boot? is true.

■ If only nonfatal errors are detected by POST or OpenBoot Diagnostics, the system attempts to boot if auto-boot? is true and auto-boot-on-error? is true. Nonfatal errors include the following:

    ■ SAS subsystem failure. In this case, a working alternate path to the boot disk is required. For more information, see “About Multipathing Software” on page 115.

    ■ Ethernet interface failure.

    ■ USB interface failure.

    ■ Serial interface failure.

    ■ PCI card failure.

    ■ Memory failure. Given a failed DIMM, the firmware unconfigures the entire logical bank associated with the failed module. Another nonfailing logical bank must be present in the system for the system to attempt a degraded boot. See “About the CPU/Memory Modules” on page 73.

Note – If POST or OpenBoot Diagnostics detects a nonfatal error associated with the normal boot device, the OpenBoot firmware automatically unconfigures the failed device and tries the next-in-line boot device, as specified by the boot-device configuration variable.

■ If a critical or fatal error is detected by POST or OpenBoot Diagnostics, the system will not boot regardless of the settings of auto-boot? or auto-boot-on-error?. Critical and fatal nonrecoverable errors include the following:

    ■ Any CPU failed

    ■ All logical memory banks failed

    ■ Flash RAM cyclical redundancy check (CRC) failure

    ■ Critical field-replaceable unit (FRU) PROM configuration data failure

    ■ Critical application-specific integrated circuit (ASIC) failure

For more information about troubleshooting fatal errors, see Chapter 9.
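The three cases in the error handling summary reduce to a small decision rule. The following Python sketch restates that rule so that the effect of the two variables can be tabulated. It models only the summary above; the function name and the error class labels "none", "nonfatal", and "fatal" are assumptions made for this illustration and do not correspond to any firmware interface.

```python
def will_attempt_boot(error_class, auto_boot, auto_boot_on_error):
    """Return True if the system attempts to boot after power-on diagnostics.

    error_class is "none", "nonfatal", or "fatal" (critical errors count
    as fatal). The two flags mirror auto-boot? and auto-boot-on-error?.
    """
    if error_class == "fatal":
        # Fatal errors block the boot regardless of either setting.
        return False
    if error_class == "nonfatal":
        # A degraded boot needs both variables set to true.
        return auto_boot and auto_boot_on_error
    if error_class == "none":
        return auto_boot
    raise ValueError("unknown error class: %r" % (error_class,))


for error_class in ("none", "nonfatal", "fatal"):
    for auto_boot in (True, False):
        for on_error in (True, False):
            result = will_attempt_boot(error_class, auto_boot, on_error)
            print(error_class, auto_boot, on_error, "->", result)
```

The table printed by the loop makes one point from the summary easy to see: setting auto-boot-on-error? to false affects only the nonfatal case.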

Automatic System Restoration User Commands

The OpenBoot commands .asr, asr-disable, and asr-enable are available for obtaining ASR status information and for manually unconfiguring or reconfiguring system devices. For more information, see “Unconfiguring a Device Manually” on page 112.

Enabling Automatic System Restoration

The ASR feature is enabled by default. ASR is always enabled when the diag-switch? OpenBoot variable is set to true, and when the diag-trigger setting is set to error-reset.

To activate any parameter changes, type the following at the ok prompt:

ok reset-all

The system permanently stores the parameter changes and boots automatically when the OpenBoot configuration variable auto-boot? is set to true (default).

Note – To store parameter changes, you can also power cycle the system using the front panel Power button.

Disabling Automatic System Restoration

After you disable the automatic system restoration (ASR) feature, it is not activated again until you enable it at the system ok prompt.

▼ To Disable Automatic System Restoration

1. At the ok prompt, type:

ok setenv auto-boot-on-error? false

About SunVTS

SunVTS is a software suite that performs system and subsystem stress testing. You can view and control a SunVTS session over a network. Using a remote machine, you can view the progress of a testing session, change testing options, and control all testing features of another machine on the network.

You can run SunVTS software in five different test modes:

■ Connection test mode provides a low-stress, quick testing of the availability and connectivity of selected devices. These tests are nonintrusive, meaning they release the devices after a quick test, and they do not place a heavy load on system activity.

■ Functional test mode provides robust testing of your system and devices. It uses your system resources for thorough testing and it assumes that no other applications are running.

■ Exclusive test mode enables performing the tests that require no other SunVTS tests or applications running at the same time.

■ Online test mode enables performance of SunVTS testing while other customer applications are running.

■ Auto Config automatically detects all subsystems and exercises them in one of two ways:

    ■ Confidence testing – Performs one pass of tests on all subsystems, and then stops. For typical system configurations, this requires one or two hours.

    ■ Comprehensive testing – Tests all subsystems repeatedly for up to 24 hours.

Since SunVTS software can run many tests in parallel and consume many system resources, you should be cautious when using it on a production system. If you are stress-testing a system using the Functional test mode, do not run anything else on that system at the same time.

To install and use SunVTS, a system must be running a Solaris OS compatible with the SunVTS version. Since SunVTS software packages are optional, they may not be installed on your system. See “To Find Out Whether SunVTS Is Installed” on page 217 for instructions.

SunVTS Software and Security

During SunVTS software installation, you must choose between Basic or Sun Enterprise Authentication Mechanism™ security. Basic security uses a local security file in the SunVTS installation directory to limit the users, groups, and hosts permitted to use SunVTS software. Sun Enterprise Authentication Mechanism security is based on the standard network authentication protocol Kerberos and provides secure user authentication, data integrity and privacy for transactions over networks.

If your site uses Sun Enterprise Authentication Mechanism security, you must have the Sun Enterprise Authentication Mechanism client and server software installed in your networked environment and configured properly in both Solaris and SunVTS software. If your site does not use Sun Enterprise Authentication Mechanism security, do not choose the Sun Enterprise Authentication Mechanism option during SunVTS software installation.

If you enable the wrong security scheme during installation, or if you improperly configure the security scheme you choose, you may find yourself unable to run SunVTS tests. For more information, see the SunVTS User’s Guide and the instructions accompanying the Sun Enterprise Authentication Mechanism software.

Using SunVTS

SunVTS, the Sun Validation and Test Suite, is an online diagnostics tool that you can use to verify the configuration and functionality of hardware controllers, devices, and platforms. It runs in the Solaris OS and presents the following interfaces:

■ Command line interface

■ Serial (TTY) interface

SunVTS software enables you to view and control testing sessions on a remotely connected server. TABLE 8-35 lists some of the tests that are available:

TABLE 8-35 SunVTS Tests

SunVTS Test  Description

cputest  Tests the CPU

disktest  Tests the local disk drives

dvdtest  Tests the DVD-ROM drive

fputest  Tests the floating-point unit

nettest  Tests the Ethernet hardware on the system board and the networking hardware on any optional PCI cards

netlbtest  Performs a loopback test to check that the Ethernet adapter can send and receive packets

pmemtest  Tests the physical memory (read only)

sutest  Tests the server’s on-board serial ports

vmemtest  Tests the virtual memory (a combination of the swap partition and the physical memory)

env6test  Tests the environmental devices

ssptest  Tests ALOM hardware devices

i2c2test  Tests I2C devices for correct operation

▼ To Find Out Whether SunVTS Is Installed

● Type:

TABLE 8-36

# pkginfo -l SUNWvts

If SunVTS software is loaded, information about the package will be displayed.

If SunVTS software is not loaded, you will see the following error message:

TABLE 8-37

ERROR: information for "SUNWvts" was not found

Installing SunVTS

By default, SunVTS is not installed on the Sun Fire V445 servers. However, it is available in Solaris_10/ExtraValue/CoBundled/SunVTS_X.X on the Solaris 10 DVD supplied in the Solaris Media Kit. For information about downloading SunVTS from the Sun Download Center, refer to the Sun Hardware Platform Guide for the Solaris version you are using.

To find out more about using SunVTS, refer to the SunVTS documentation that corresponds to the Solaris release that you are running.

Viewing SunVTS Documentation

The SunVTS documents are accessible in the Solaris on Sun Hardware documentation collection at http://docs.sun.com.

Sun Management Center software extends and enhances the management capability of Sun’s hardware and software products.

Sun Management Center software is geared primarily toward system administrators who have large data centers to monitor or other installations that have many computer platforms to monitor. If you administer a more modest installation, you need to weigh Sun Management Center software’s benefits against the requirement of maintaining a significant database (typically over 700 Mbytes) of system status information.

The servers being monitored must be up and running if you want to use Sun Management Center, since this tool relies on the Solaris OS. For instructions on using this tool to monitor a Sun Fire V445 server, see Chapter 8.

TABLE 8-39 Sun Management Center Features

Feature  Description

System management  Monitors and manages the system at the hardware and operating system levels. Monitored hardware includes boards, tapes, power supplies, and disks.

Operating system management  Monitors and manages operating system parameters including load, resource usage, disk space, and network statistics.

Application and business system management  Provides technology to monitor business applications such as trading systems, accounting systems, inventory systems, and real-time control systems.

Scalability  Provides an open, scalable, and flexible solution to configure and manage multiple management administrative domains (consisting of many systems) spanning an enterprise. The software can be configured and used in a centralized or distributed fashion by multiple users.

How Sun Management Center Works

Sun Management Center consists of three components:

■ Agent

■ Server

■ Monitor

You install agents on systems to be monitored. The agents collect system status information from log files, device trees, and platform-specific sources, and report that data to the server component.

The server component maintains a large database of status information for a wide range of Sun platforms. This database is updated frequently, and includes information about boards, tapes, power supplies, and disks as well as OS parameters like load, resource usage, and disk space. You can create alarm thresholds and be notified when these are exceeded.

The monitor components present the collected data to you in a standard format. Sun Management Center software provides both a standalone Java application and a Web browser-based interface. The Java interface affords physical and logical views of the system for highly intuitive monitoring.

Using Sun Management Center

Sun Management Center software is aimed at system administrators who have large data centers to monitor or other installations that have many computer platforms to monitor. If you administer a smaller installation, you need to weigh Sun Management Center software’s benefits against the requirement of maintaining a significant database (typically over 700 Mbytes) of system status information.

The servers to be monitored must be running, because Sun Management Center relies on the Solaris OS for its operation.

For detailed instructions, see the Sun Management Center Software User’s Guide.

Other Sun Management Center Features

Sun Management Center software provides you with additional tools, which can operate with management utilities made by other companies.

The tools are an informal tracking mechanism and the optional add-on, Hardware Diagnostic Suite.

Informal Tracking

Sun Man agement Center agent software mu st be loaded on any system you w ant tomonitor. How ever, the prod uct enables you to informally track a su pp orted p latformeven w hen th e agent software has n ot been installed on it. In this case, you do n othave full monitoring capability, but you can add the system to you r brow ser, have

Sun Man agement Center p eriodically check wh ether it is up and run ning, and notifyyou if it goes out of commission.

Hardw are Diagnostic SuiteThe Hardware Diagnostic Suite is a package that you can pu rchase as an ad d-on toSun Management Center. The suite enables you to exercise a system while it is stillup and running in a production environment. See “Hard ware Diagnostic Suite” onpage 221 for more information.

Interoperability With Third-Party Monitoring ToolsIf you ad minister a heterogeneous netw ork and use a third-party netw ork-basedsystem monitoring or m anagemen t tool, you m ight be able to take advantage of SunManagement Center software’s support for Tivoli Enterprise Console, BMC Patrol,and H P Openview.

Obtaining the Latest Information

For the latest information about this product, go to the Sun Management Center web site:

http://www.sun.com/sunmanagementcenter

Hardware Diagnostic Suite

The Sun Management Center features an optional Hardware Diagnostic Suite, which you can purchase as an add-on. The Hardware Diagnostic Suite is designed to exercise a production system by running tests sequentially.


Sequential testing means the Hardware Diagnostic Suite has a low impact on the system. Unlike SunVTS, which stresses a system by consuming its resources with many parallel tests (see “About SunVTS” on page 215), the Hardware Diagnostic Suite lets the server run other applications while testing proceeds.

When to Run Hardware Diagnostic Suite

The best use of the Hardware Diagnostic Suite is to disclose a suspected or intermittent problem with a noncritical part on an otherwise functioning machine. Examples might include questionable disk drives or memory modules on a machine that has ample or redundant disk and memory resources.

In cases like these, the Hardware Diagnostic Suite runs unobtrusively until it identifies the source of the problem. The machine under test can be kept in production mode until and unless it must be shut down for repair. If the faulty part is hot-pluggable or hot-swappable, the entire diagnose-and-repair cycle can be completed with minimal impact to system users.

Requirements for Using Hardware Diagnostic Suite

Since it is a part of Sun Management Center, you can only run Hardware Diagnostic Suite if you have set up your data center to run Sun Management Center. This means you have to dedicate a master server to run the Sun Management Center server software that supports Sun Management Center software’s database of platform status information. In addition, you must install and set up Sun Management Center agent software on the systems to be monitored. Finally, you need to install the console portion of Sun Management Center software, which serves as your interface to the Hardware Diagnostic Suite.

Instructions for setting up Sun Management Center, as well as for using the Hardware Diagnostic Suite, can be found in the Sun Management Center Software User’s Guide.


CHAPTER 9

Troubleshooting

This chapter describes the diagnostic tools available for the Sun Fire V445 server.

Topics in this chapter include:

■ “Troubleshooting Options”
■ “About Updated Troubleshooting Information”
■ “About Firmware and Software Patch Management”
■ “About Sun Install Check Tool”
■ “About Sun Explorer Data Collector”
■ “About Sun Remote Services Net Connect”
■ “About Configuring the System for Troubleshooting”
■ “Core Dump Process”
■ “Enabling the Core Dump Process”
■ “Testing the Core Dump Setup”

Troubleshooting Options


There are several troubleshooting options that you can implement when you set up and configure the Sun Fire V445 server. By setting up your system with troubleshooting in mind, you can save time and minimize disruptions if the system encounters any problems.

Tasks covered in this chapter include:

■ “Enabling the Core Dump Process”
■ “Testing the Core Dump Setup”

Other information in this chapter includes:

■ “About Updated Troubleshooting Information”
■ “About Firmware and Software Patch Management”
■ “About Sun Install Check Tool”
■ “About Sun Explorer Data Collector”
■ “About Configuring the System for Troubleshooting”

About Updated Troubleshooting Information

You can obtain the most current server troubleshooting information in the Sun Fire V445 Server Product Notes and at Sun web sites. These resources can help you understand and diagnose problems that you might encounter.

Product Notes

Sun Fire V445 Server Product Notes contain late-breaking information about the system, including the following:

■ Current recommended and required software patches
■ Updated hardware and driver compatibility information
■ Known issues and bug descriptions, including solutions and workarounds

The latest product notes are available at:

http://www.sun.com/documentation

Web Sites


The following Sun web sites provide troubleshooting and other useful information.

SunSolve Online

This site presents a collection of resources for Sun technical and support information. Access to some of the information on this site depends on the level of your service contract with Sun. This site includes the following:

■ Patch Support Portal – Everything you need to download and install patches, including tools, product patches, security patches, signed patches, x86 drivers, and more.
■ Sun Install Check tool – A utility you can use to verify proper installation and configuration of a new Sun Fire server. This resource checks a Sun Fire server for valid patches, hardware, OS, and configuration.
■ Sun System Handbook – A document that contains technical information and provides access to discussion groups for most Sun hardware, including the Sun Fire V445 server.
■ Support documents, security bulletins, and related links.

The SunSolve Online Web site is at:

http://sunsolve.sun.com

Big Admin

This site is a one-stop resource for Sun system administrators. The Big Admin website is at:

http://www.sun.com/bigadmin

About Firmware and Software Patch Management

Sun makes every attempt to ensure that each system is shipped with the latest firmware and software. However, in complex systems, bugs and problems are discovered in the field after systems leave the factory. Often, these problems are fixed with patches to the system’s firmware. Keeping your system’s firmware and Solaris OS current with the latest recommended and required patches can help you avoid problems that others might have already discovered and solved.

Firmware and OS updates are often required to diagnose or fix a problem. Schedule regular updates of your system’s firmware and software so that you will not have to update the firmware or software at an inconvenient time.

You can find the latest patches and updates for the Sun Fire V445 server at the Web sites listed in “Web Sites” on page 224.

About Sun Install Check Tool

When you install the Sun Install Check tool, you also install Sun Explorer Data Collector. The Sun Install Check tool uses Sun Explorer Data Collector to help you confirm that Sun Fire V445 server installation has been completed optimally. Together, they can evaluate your system for the following:

■ Minimum required OS level
■ Presence of key critical patches
■ Proper system firmware levels
■ Unsupported hardware components

When Sun Install Check tool and Sun Explorer Data Collector identify potential problems, a report is generated that provides specific instructions to remedy the issue.

The Sun Install Check tool is available at:

http://sunsolve.sun.com

At that site, click on the link to the Sun Install Check tool.

See also “About Sun Explorer Data Collector” on page 226.

About Sun Explorer Data Collector

The Sun Explorer Data Collector is a system data collection tool that Sun support services engineers sometimes use when troubleshooting Sun systems. In certain support situations, Sun support services engineers might ask you to install and run this tool. If you installed the Sun Install Check tool at initial installation, you also installed Sun Explorer Data Collector. If you did not install the Sun Install Check tool, you can install Sun Explorer Data Collector later without the Sun Install Check tool. By installing this tool as part of your initial system setup, you avoid having to install the tool at a later, and often inconvenient, time.

Both the Sun Install Check tool (with bundled Sun Explorer Data Collector) and theSun Explorer Data Collector (standalone) are available at:

http://sunsolve.sun.com

At that site, click on the appropriate link.

About Sun Remote Services Net Connect

Sun Remote Services (SRS) Net Connect is a collection of system management services designed to help you better control your computing environment. These Web-delivered services enable you to monitor systems, to create performance and trend reports, and to receive automatic notification of system events. These services help you to act more quickly when a system event occurs and to manage potential issues before they become problems.

More information about SRS Net Connect is available at:

http://www.sun.com/service/support/srs/netconnect

About Configuring the System for Troubleshooting

System failures are characterized by certain symptoms. Each symptom can be traced to one or more problems or causes by using specific troubleshooting tools and techniques. This section describes troubleshooting tools and techniques that you can control through configuration variables.

Hardware Watchdog Mechanism


The hardware watchdog mechanism is a hardware timer that is continually reset as long as the OS is running. If the system hangs, the OS is no longer able to reset the timer. The timer then expires and causes an automatic externally initiated reset (XIR), displaying debug information on the system console. The hardware watchdog mechanism is enabled by default. If the hardware watchdog mechanism is disabled, the Solaris OS must be configured before the hardware watchdog mechanism can be reenabled.

The configuration variable error-reset-recovery allows you to control how the hardware watchdog mechanism behaves when the timer expires. The following are the error-reset-recovery settings:

■ boot (default) – Resets the timer and attempts to reboot the system.
■ sync (recommended) – Attempts to automatically generate a core dump file, reset the timer, and reboot the system.
■ none (equivalent to issuing a manual XIR from the ALOM system controller) – Drops the server to the ok prompt, enabling you to issue commands and debug the system.

For more information about the hardware watchdog mechanism and XIR, see Chapter 5.
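The watchdog behavior described in this section can be pictured with a small model. The following Python sketch is illustrative only: the class name, the 60-second timeout, and the action strings are assumptions made for the example and do not come from the firmware. It shows a timer that a running OS keeps resetting, and the error-reset-recovery action that applies once the timer is allowed to expire.

```python
from dataclasses import dataclass

# Behaviors of the three error-reset-recovery settings, paraphrased from this section.
RECOVERY_ACTIONS = {
    "boot": "reset the timer and attempt to reboot",
    "sync": "attempt a core dump, reset the timer, and reboot",
    "none": "drop to the ok prompt",
}


@dataclass
class WatchdogModel:
    """Toy model of a watchdog timer; timeout_s is an illustrative value."""
    timeout_s: float
    recovery: str = "boot"
    elapsed_s: float = 0.0

    def pat(self) -> None:
        """A running OS resets the timer."""
        self.elapsed_s = 0.0

    def tick(self, seconds: float):
        """Advance time; return the recovery action if the timer expired, else None."""
        self.elapsed_s += seconds
        if self.elapsed_s >= self.timeout_s:
            return RECOVERY_ACTIONS[self.recovery]
        return None


wd = WatchdogModel(timeout_s=60, recovery="sync")
wd.tick(30)
wd.pat()              # a healthy OS keeps resetting the timer
print(wd.tick(30))    # None: the timer was reset in time
print(wd.tick(30))    # the OS has hung, so the "sync" action is reported
```

The model only mirrors the decision logic. On the real system the timer and the reset are handled by hardware and firmware.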

Automatic System Restoration Settings

The Automatic System Restoration (ASR) features enable the system to resume operation after experiencing certain nonfatal hardware faults or failures. When ASR is enabled, the system’s firmware diagnostics automatically detect failed hardware components. An auto-configuring capability designed into the OpenBoot firmware enables the system to unconfigure failed components and to restore system operation. As long as the system is capable of operating without the failed component, the ASR features enable the system to reboot automatically, without operator intervention.

How you configure ASR settings has an effect not only on how the system handles certain types of failures but also on how you go about troubleshooting certain problems.

For day-to-day operations, enable ASR by setting OpenBoot configuration variablesas shown in TABLE 9-1.

TABLE 9-1 OpenBoot Configuration Variable Settings to Enable Automatic System Restoration

Variable               Setting
auto-boot?             true
auto-boot-on-error?    true
diag-level             max
diag-switch?           true
diag-trigger           all-resets
diag-device            (Set to the boot-device value)

Configuring your system this way ensures that diagnostic tests run automatically when most serious hardware and software errors occur. With this ASR configuration, you can save time diagnosing problems since POST and OpenBoot Diagnostics test results are already available after the system encounters an error.

For more information about how ASR works, and complete instructions for enabling ASR capability, see “About Automatic System Restoration” on page 209.
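As a convenience, the settings in TABLE 9-1 can be turned into the lines you would type at the ok prompt. The Python sketch below is a minimal illustration: the function name and the placeholder boot-device value "disk" are assumptions for the example, and the output relies on the OpenBoot setenv command taking a variable name followed by a value.

```python
# Values copied from TABLE 9-1; diag-device is set to whatever boot-device holds.
ASR_SETTINGS = {
    "auto-boot?": "true",
    "auto-boot-on-error?": "true",
    "diag-level": "max",
    "diag-switch?": "true",
    "diag-trigger": "all-resets",
}


def asr_setenv_lines(boot_device: str) -> list[str]:
    """Build one setenv line per ASR variable, ending with diag-device."""
    settings = {**ASR_SETTINGS, "diag-device": boot_device}
    return [f"setenv {name} {value}" for name, value in settings.items()]


# "disk" is a placeholder; substitute the boot-device value of your own system.
for line in asr_setenv_lines("disk"):
    print("ok " + line)
```

Generating the lines from one table keeps a site checklist consistent across servers. Confirm the resulting values against the firmware documentation before applying them.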

Remote Troubleshooting Capabilities

You can use the Sun Advanced Lights Out Manager (ALOM) system controller to troubleshoot and diagnose the system remotely. The ALOM system controller enables you to do the following:

■ Turn system power on and off
■ Control the Locator indicator
■ Change OpenBoot configuration variables
■ View system environmental status information
■ View system event logs

In addition, you can use the ALOM system controller to access the system console,provided it has not been redirected. System console access enables you to do thefollowing:

■ Run OpenBoot Diagnostics tests
■ View Solaris OS output
■ View POST output
■ Issue firmware commands at the ok prompt
■ View error events when the Solaris OS terminates abruptly

For more information about ALOM system controller, see Chapter 5 or the Sun Advanced Lights Out Manager (ALOM) Online Help.

For more information about the system console, see Chapter 2.

System Console Logging

Console logging is the ability to collect and log system console output. Console logging captures console messages so that system failure data, like Fatal Reset error details and POST output, can be recorded and analyzed.


Console logging is especially valuable when troubleshooting Fatal Reset errors and RED State Exceptions. In these conditions, the Solaris OS terminates abruptly, and although it sends messages to the system console, the OS software does not log any messages in traditional file system locations like the /var/adm/messages file.

The error logging daemon, syslogd, automatically records various system warnings and errors in message files. By default, many of these system messages are displayed on the system console and are stored in the /var/adm/messages file.

Note – Solaris 10 moves CPU and memory hardware detected data from the /var/adm/messages file to the fault management components. This makes it easier to locate hardware events and facilitates predictive self-healing.

You can direct where system log messages are stored or have them sent to a remote system by setting up system message logging. For more information, see “How to Customize System Message Logging” in the System Administration Guide: Advanced Administration, which is part of the Solaris System Administrator Collection.

In some failure situations, a large stream of data is sent to the system console. Because ALOM system controller log messages are written into a circular buffer that holds 64 Kbytes of data, it is possible that the output identifying the original failing component can be overwritten. Therefore, you may want to explore further system console logging options, such as SRS Net Connect or third-party vendor solutions. For more information about SRS Net Connect, see “About Sun Remote Services Net Connect” on page 227.
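To see why the first failure message can be lost, consider how a circular buffer behaves. The Python sketch below is not the ALOM implementation: the class name and the message text are hypothetical, and only the 64-Kbyte capacity comes from this section. It shows the oldest bytes being discarded once later output fills the buffer.

```python
from collections import deque


class RingLog:
    """Fixed-capacity byte buffer that discards the oldest bytes when full."""

    def __init__(self, capacity: int):
        self._buf = deque(maxlen=capacity)

    def write(self, message: str) -> None:
        # Appending to a full deque silently drops bytes from the other end.
        self._buf.extend(message.encode("ascii"))

    def contents(self) -> str:
        return bytes(self._buf).decode("ascii")


log = RingLog(capacity=64 * 1024)            # 64 Kbytes, as stated in this section
log.write("FIRST FAILURE: component X\n")     # hypothetical message text
for i in range(5000):                         # a flood of follow-on messages
    log.write(f"follow-on error {i}\n")

print("FIRST FAILURE" in log.contents())      # False: the original line was overwritten
```

Capturing console output outside the buffer, as the logging options mentioned above do, avoids this loss.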

More information about SRS Net Connect is available at:

http://www.sun.com/service/support/

Certain third-party vendors offer data logging terminal servers and centralized system console management solutions that monitor and log output from many systems. Depending on the number of systems you are administering, these might offer solutions for logging system console information.

For more information about the system console, see Chapter 2.

Predictive Self-Healing

The Solaris Fault Manager daemon, fmd(1M), runs in the background on every Solaris 10 or later system and receives telemetry information about problems detected by the system software. The fault manager then uses this information to diagnose detected problems and initiate proactive self-healing activities such as disabling faulty components.


fmdump(1M), fmadm(1M), and fmstat(1M) are the three core commands that administer the system generated messages produced by the Solaris Fault Manager. See “About Predictive Self-Healing” on page 186 for details. Also refer to the man pages for these commands.

Core Dump Process

In some failure situations, a Sun engineer might need to analyze a system core dump file to determine the root cause of a system failure. Although the core dump process is enabled by default, you should configure your system so that the core dump file is saved in a location with adequate space. You might also want to change the default core dump directory to another locally mounted location so that you can better manage any system core dumps. In certain testing and pre-production environments, this is recommended since core dump files can take up a large amount of file system space.

Swap space is used to save the dump of system memory. By default, Solaris software uses the first swap device that is defined. This first swap device is known as the dump device.

During a system core dump, the system saves the content of kernel core memory to the dump device. The dump content is compressed during the dump process at a 3:1 ratio; that is, if the system were using 6 Gbytes of kernel memory, the dump file will be about 2 Gbytes. For a typical system, the dump device should be at least one third the size of the total system memory.

See “Enabling the Core Dump Process” on page 231 for instructions on how to calculate the amount of available swap space.
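The sizing arithmetic in this section can be worked through as follows. The Python sketch is a rough estimate under stated assumptions: the function names are hypothetical, a Gbyte is taken as 1024 cubed bytes, and the block count 4097312 is the sample value from the swap -l output shown in “Enabling the Core Dump Process”. It converts 512-byte swap blocks to bytes and applies the approximate 3:1 compression ratio.

```python
BLOCK_BYTES = 512         # swap -l reports sizes in 512-byte blocks
COMPRESSION_RATIO = 3     # approximate 3:1 ratio described in this section
GIB = 1024 ** 3           # assumption: 1 Gbyte taken as 2**30 bytes


def swap_bytes(blocks: int) -> int:
    """Convert a block count from swap -l into bytes."""
    return blocks * BLOCK_BYTES


def estimated_dump_bytes(kernel_memory_bytes: int) -> int:
    """Estimate the compressed dump size for a given amount of kernel memory."""
    return kernel_memory_bytes // COMPRESSION_RATIO


def dump_fits(kernel_memory_bytes: int, swap_blocks: int) -> bool:
    """Check whether the estimated dump fits on a dump device of swap_blocks blocks."""
    return estimated_dump_bytes(kernel_memory_bytes) <= swap_bytes(swap_blocks)


print(swap_bytes(4097312))                   # 2097823744
print(estimated_dump_bytes(6 * GIB) / GIB)   # 2.0
print(dump_fits(6 * GIB, 4097312))           # False
```

In this example the sample swap device holds slightly less than 2 Gbytes, so an estimated 2-Gbyte dump would not fit. Because the 3:1 ratio is approximate, allowing extra headroom is prudent.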

Enabling the Core Dump Process

This is normally a task that you would complete just prior to placing a system into the production environment.

Access the system console. See:

■ “About Communicating With the System” on page 26

▼ To Enable the Core Dump Process


1. Check that the core dump process is enabled. As root, type the dumpadm command.

# dumpadm
Dump content: kernel pages
Dump device: /dev/dsk/c0t0d0s1 (swap)
Savecore directory: /var/crash/machinename
Savecore enabled: yes

By default, the core dump process is enabled in the Solaris 8 OS.

2. Verify that there is sufficient swap space to dump memory. Type the swap -l command.

# swap -l
swapfile             dev    swaplo  blocks   free
/dev/dsk/c0t3d0s0    32,24  16      4097312  4062048
/dev/dsk/c0t1d0s0    32,8   16      4097312  4060576
/dev/dsk/c0t1d0s1    32,9   16      4097312  4065808

To determine how many bytes of swap space are available, multiply the number in the blocks column by 512. Taking the number of blocks from the first entry, c0t3d0s0, calculate as follows:

4097312 x 512 = 2097823744

The result is approximately 2 Gbytes.

3. Verify that there is sufficient file system space for the core dump files. Type the df -k command.

# df -k /var/crash/`uname -n`

By default the location where savecore files are stored is:

/var/crash/`uname -n`

For instance, for the mysystem server, the default directory is:

/var/crash/mysystem

The file system specified must have space for the core dump files.

If you see messages from savecore indicating not enough space in the /var/crash/ directory, any other locally mounted (not NFS) file system can be used.


Following is a sample message from savecore.

System dump time: Wed Apr 23 17:03:48 2003
savecore: not enough space in /var/crash/sf440-a (216 MB avail, 246 MB needed)

Perform Steps 4 and 5 if there is not enough space.

4. Type the df -kl command to identify locations with more space.

# df -kl
Filesystem          kbytes    used    avail     capacity  Mounted on
/dev/dsk/c1t0d0s0   832109    552314  221548    72%       /
/proc               0         0       0         0%        /proc
fd                  0         0       0         0%        /dev/fd
mnttab              0         0       0         0%        /etc/mnttab
swap                3626264   16      362624    81%       /var/run
swap                3626656   408     362624    81%       /tmp
/dev/dsk/c1t0d0s7   33912732  9       33573596  1%        /export/home

5. Type the dumpadm -s command to specify a location for the dump file.

# dumpadm -s /export/home/
Dump content: kernel pages
Dump device: /dev/dsk/c3t5d0s1 (swap)
Savecore directory: /export/home
Savecore enabled: yes

The dumpadm -s command enables you to specify the savecore directory, which is the location where the dump files are saved. See the dumpadm(1M) man page for more information.

Testing the Core Dump Setup

Before placing the system into a production environment, it might be useful to test whether the core dump setup works. This procedure might take some time depending on the amount of installed memory.

Back up all your data and access the system console. See:

■ “About Communicating With the System” on page 26

▼ To Test the Core Dump Setup

1. Gracefully shut down the system using the shutdown command.

2. At the ok prompt, issue the sync command.

You should see “dumping” messages on the system console.

The system reboots. During this process, you can see the savecore messages.

3. Wait for the system to finish rebooting.

4. Look for system core dump files in your savecore directory.

The files are named unix.y and vmcore.y, where y is the integer dump number. There should also be a bounds file that contains the next crash number savecore will use.

If a core dump is not generated, perform the procedure described in “Enabling the Core Dump Process” on page 231.
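The check in Step 4 can also be scripted. The Python sketch below is a minimal illustration, not a Sun-supplied tool: the function name is hypothetical, it assumes the bounds file holds the next dump number as plain text, and it is exercised against a throwaway directory rather than a real savecore directory.

```python
import re
import tempfile
from pathlib import Path


def find_dumps(savecore_dir: Path) -> dict:
    """Return dump numbers that have both unix.y and vmcore.y, plus the bounds value."""
    kinds_by_number = {}
    for entry in savecore_dir.iterdir():
        match = re.fullmatch(r"(unix|vmcore)\.(\d+)", entry.name)
        if match:
            kinds_by_number.setdefault(int(match.group(2)), set()).add(match.group(1))
    complete = sorted(
        number for number, kinds in kinds_by_number.items()
        if kinds == {"unix", "vmcore"}
    )
    bounds = savecore_dir / "bounds"
    # Assumption: bounds contains the next dump number as plain text.
    next_number = int(bounds.read_text().strip()) if bounds.exists() else None
    return {"complete_dumps": complete, "next_number": next_number}


# Demonstration against a temporary directory that mimics a savecore directory.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp)
    for name in ("unix.0", "vmcore.0"):
        (demo / name).write_bytes(b"")
    (demo / "bounds").write_text("1\n")
    print(find_dumps(demo))   # {'complete_dumps': [0], 'next_number': 1}
```

On a live system you would pass the savecore directory reported by dumpadm instead of the temporary directory.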


APPENDIX A

Connector Pinouts

This appendix provides reference information about the system back panel ports and pin assignments.

Topics covered in this appendix include:

■ “Reference for the Serial Management Port Connector”
■ “Reference for the Network Management Port Connector”
■ “Reference for the Serial Port Connector”
■ “Reference for the USB Connectors”
■ “Reference for the Gigabit Ethernet Connectors”

Reference for the Serial Management Port Connector

The serial management connector (labeled SERIAL MGT) is an RJ-45 connector located on the back panel. This port is the default connection to the system console.


Serial Management Connector Diagram

FIGURE A-1 Serial Management Connector Diagram

Serial Management Connector Signals

For Serial Management connector signals, see TABLE A-1.

TABLE A-1 Serial Management Connector Signals

Pin  Signal Description     Pin  Signal Description
1    Request to Send        5    Ground
2    Data Terminal Ready    6    Receive Data
3    Transmit Data          7    Data Set Ready
4    Ground                 8    Clear to Send


Reference for the Network Management Port Connector

The network management connector (labeled NET MGT) is an RJ-45 connector located on the ALOM card and can be accessed from the back panel. This port needs to be configured prior to use.

Network Management Connector Diagram

FIGURE A-2 Network Management Connector Diagram

Network Management Connector Signals

For Network Management connector signals, see TABLE A-2.

TABLE A-2 Network Management Connector Signals

Pin  Signal Description         Pin  Signal Description
1    Transmit Data +            5    Common Mode Termination
2    Transmit Data –            6    Receive Data –
3    Receive Data +             7    Common Mode Termination
4    Common Mode Termination    8    Common Mode Termination


Reference for the Serial Port Connector

The serial port connector (TTYB) is a DB-9 connector that can be accessed from the back panel.

Serial Port Connector Diagram

FIGURE A-3 Serial Port Connector Diagram

Serial Port Connector Signals

For serial port connector signals, see TABLE A-3.

TABLE A-3 Serial Port Connector Signals

Pin  Signal Description     Pin  Signal Description
1    Data Carrier Detect    6    Data Set Ready
2    Receive Data           7    Request to Send
3    Transmit Data          8    Clear to Send
4    Data Terminal Ready    9    Ring Indicate
5    Ground

Reference for the USB Connectors

Two Universal Serial Bus (USB) ports are located on the motherboard in a double-stacked layout and can be accessed from the back panel.

USB Connector Diagram

FIGURE A-4 USB Connector Diagram (ports USB2 and USB3; pins 1 through 4 in rows A and B)


USB Connector Signals

For USB connector signals, see TABLE A-4.

TABLE A-4 USB Connector Signals

Pin  Signal Description    Pin  Signal Description
A1   +5 V (fused)          B1   +5 V (fused)
A2   USB0/1-               B2   USB2/3-
A3   USB0/1+               B3   USB2/3+
A4   Ground                B4   Ground

Reference for the Gigabit Ethernet Connectors

Four RJ-45 Gigabit Ethernet connectors (NET0, NET1, NET2, NET3) are located on the system motherboard and can be accessed from the back panel. The Ethernet interfaces operate at 10 Mbit/sec, 100 Mbit/sec, and 1000 Mbit/sec.

Gigabit Ethernet Connector Diagram


FIGURE A-5 Gigabit Ethernet Connector Diagram

Gigabit Ethernet Connector Signals

For Gigabit Ethernet connector signals, see TABLE A-5.

TABLE A-5 Gigabit Ethernet Connector Signals

Pin  Signal Description          Pin  Signal Description
1    Transmit/Receive Data 0 +   5    Transmit/Receive Data 2 –
2    Transmit/Receive Data 0 –   6    Transmit/Receive Data 1 –
3    Transmit/Receive Data 1 +   7    Transmit/Receive Data 3 +
4    Transmit/Receive Data 2 +   8    Transmit/Receive Data 3 –


APPENDIX B

System Specifications

This appendix provides the following specifications for the Sun Fire V445 server:

■ “Reference for Physical Specifications”
■ “Reference for Electrical Specifications”
■ “Reference for Environmental Specifications”
■ “Reference for Agency Compliance Specifications”
■ “Reference for Clearance and Service Access Specifications”


Reference for Physical Specifications

The dimensions and weight of the system are as follows.

TABLE B-1 Dimensions and Weight

Measurement         U.S.         Metric
Height              6.85 in.     17.5 cm
Width               17.48 in.    44.5 cm
Depth               25 in.       64.4 cm
Weight (minimum)    70 lbs       31 kg
Weight (maximum)    82 lbs       37.2 kg
Power Cord          8.2 ft       2.5 m

Reference for Electrical Specifications

The following table provides the electrical specifications for the system. All specifications pertain to a fully configured system operating at 50 Hz or 60 Hz.

TABLE B-2 Electrical Specifications

Parameter                                      Value
Input
Nominal Frequencies                            50 or 60 Hz
Nominal Voltage Range                          100 to 240 VAC
Maximum Current AC RMS*                        13.2 A @ 100 VAC
                                               11 A @ 120 VAC
                                               6.6 A @ 200 VAC
                                               6.35 A @ 208 VAC
                                               6 A @ 220 VAC
                                               5.74 A @ 230 VAC
                                               5.5 A @ 240 VAC
Output
+12 VDC                                        0.5 to 45 A
-12 VDC                                        0 to 0.8 A
+5 VDC                                         0.5 to 28 A
-5 VDC                                         0.5 to 50 A
Maximum DC Output of Two (2) Power Supplies    1100W. Max AC power consumption 1320W for operation @ 100 VAC to 240 VAC. Max heat dissipation 4505 BTUs/Hr for operation @ 200 VAC to 240 VAC.
Maximum AC Power Consumption                   788W for operation @ 100 VAC to 240 VAC (maximum configuration)
Maximum Heat Dissipation                       4505 Btu/hr for operation @ 100 VAC to 240 VAC

* Refers to total input current required for four AC inlets when operating with all four power supplies or current required for a dual AC inlet when operating with the minimum of two power supplies.

Reference for Environmental Specifications

The operating and nonoperating environmental specifications for the system are as follows.

TABLE B-3 Environmental Specifications

Parameter             Value


Operating
Temperature           5˚C to 35˚C (41˚F to 95˚F) noncondensing; IEC 60068-2-1&2
Humidity              20% to 80% RH noncondensing; 27˚C max wet bulb; IEC 60068-2-3&56
Altitude              Up to 3000 meters; max ambient temperature is derated by 1˚C per 500 meters above 500 meters; IEC 60068-2-13
Vibration (random)    0.0001 g²/Hz, 5 to 150 Hz, -12 dB/octave slope 150 to 500 Hz
Shock                 3.0 g peak, 11 milliseconds half-sine pulse; IEC 60068-2-27

Nonoperating
Temperature           -40˚C to 60˚C (-40˚F to 140˚F) noncondensing; IEC 60068-2-1&2
Humidity              Up to 93% RH noncondensing; 38˚C max wet bulb; IEC 60068-2-3&56
Altitude              0 to 12,000 meters (0 to 40,000 feet); IEC 60068-2-13
Vibration             0.001 g²/Hz, 5 to 150 Hz, -12 dB/octave slope 150 to 500 Hz
Shock                 15.0 g peak, 11 milliseconds half-sine pulse; 1.0 inch roll-off front to back, 0.5 inch roll-off side to side; IEC 60068-2-27
Handling Drops        60 mm, 1 drop per corner, 4 corners; IEC 60068-2-31
Threshold Impact      0.85 m/s, 3 impacts per caster, all 4 casters, 25 mm step-up; ETE1010-01
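The operating altitude entry implies a simple derating calculation. The Python sketch below is a worked example under an explicit assumption: the derating is treated as linear (1˚C for every 500 meters above 500 meters), starting from the 35˚C operating limit. The function name is hypothetical, and the specification itself does not say whether the derating is linear or stepped.

```python
MAX_AMBIENT_C = 35.0       # upper operating temperature limit from TABLE B-3
DERATE_START_M = 500.0     # derating applies above this altitude
DERATE_STEP_M = 500.0      # 1 degree C per 500 meters
MAX_ALTITUDE_M = 3000.0    # highest specified operating altitude


def max_ambient_c(altitude_m: float) -> float:
    """Return the derated maximum ambient temperature (assumes linear derating)."""
    if altitude_m > MAX_ALTITUDE_M:
        raise ValueError("above the specified operating altitude")
    excess_m = max(0.0, altitude_m - DERATE_START_M)
    return MAX_AMBIENT_C - excess_m / DERATE_STEP_M


for altitude in (0, 500, 1500, 3000):
    print(altitude, max_ambient_c(altitude))
```

Under this assumption the limit stays at 35˚C up to 500 meters and falls to 30˚C at 3000 meters.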


Reference for Agency Compliance Specifications

The system complies with the following specifications.

TABLE B-4 Agency Compliance Specifications

Category              Relevant Standards
Safety                UL/CSA-60950-1, EN60950-1, IEC60950-1 CB Scheme with all country deviations, IEC825-1, 2, CFR21 part 1040, CNS14336
RFI/EMI               EN 55022 Class A
                      47 CFR 15B Class A
                      ICES-003 Class A
                      VCCI Class A
                      AS/NZ 3548 Class A
                      CNS 13438 Class A
                      KSC 5858 Class A
                      EN61000-3-2
                      EN61000-3-3
Immunity              EN55024
                      IEC 61000-4-2
                      IEC 61000-4-3
                      IEC 61000-4-4
                      IEC 61000-4-5
                      IEC 61000-4-6
                      IEC 61000-4-8
                      IEC 61000-4-11
Telecommunications    EN300-386
Regulatory Marking    CE, FCC, ICES-003, C-tick, VCCI, GOST-R, BSMI, MIC, UL/cUL, UL/S-mark, UL/GS-mark

Reference for Clearance and Service Access Specifications

Minimum clearances needed for servicing the system are as follows.

TABLE B-5 Clearance and Service Access Specifications

Blockage Required Clearance

Front of System 36 in (91.4 cm)

Back of System 36 in (91.4 cm)

APPENDIX C

OpenBoot Configuration Variables

TABLE C-1 describes the OpenBoot firmware configuration variables stored on an IDPROM module on a new system controller. The OpenBoot configuration variables are printed here in the order in which they appear when you issue the showenv command.

TABLE C-1 OpenBoot Configuration Variables Stored on a ROM Chip

Variable              Possible Values   Default Value   Description

test-args             variable-name   none   Default test arguments passed to OpenBoot Diagnostics. For more information and a list of possible test argument values, see Chapter 8.

diag-passes           0-n   1   Defines the number of times self-test method(s) are performed.

local-mac-address?    true, false   false   If true, network drivers use their own MAC address, not the server MAC address.

fcode-debug?          true, false   false   If true, include name fields for plug-in device FCodes.

silent-mode?          true, false   false   Suppress all messages if true and diag-switch? is false.

scsi-initiator-id     0-15   7   SAS ID of the SAS controller.

oem-logo?             true, false   false   If true, use custom OEM logo, otherwise, use Sun logo.

oem-banner?           true, false   false   If true, use custom OEM banner.

ansi-terminal?        true, false   true   If true, enable ANSI terminal emulation.

screen-#columns       0-n   80   Sets number of columns on screen.

screen-#rows          0-n   34   Sets number of rows on screen.

ttyb-rts-dtr-off      true, false   false   If true, operating system does not assert rts (request-to-send) and dtr (data-transfer-ready) on ttyb.

ttyb-ignore-cd        true, false   true   If true, operating system ignores carrier-detect on ttyb.

ttya-rts-dtr-off      true, false   false   If true, operating system does not assert rts (request-to-send) and dtr (data-transfer-ready) on serial management port.

ttya-ignore-cd        true, false   true   If true, operating system ignores carrier-detect on serial management port.

ttyb-mode             baud-rate, bits, parity, stop, handshake   9600,8,n,1,-   ttyb (baud rate, number of bits, parity, number of stops, handshake).

ttya-mode             9600,8,n,1,-   9600,8,n,1,-   Serial management port (baud rate, bits, parity, stop, handshake). The serial management port only works at the default values.

output-device         ttya, ttyb, screen   ttya   Power-on output device.

input-device          ttya, ttyb, keyboard   ttya   Power-on input device.

auto-boot-on-error?   true, false   false   If true, boot automatically after system error.

load-base             0-n   16384   Address.

auto-boot?            true, false   true   If true, boot automatically after power on or reset.

boot-command          variable-name   boot   Action following a boot command.

diag-file             variable-name   none   File from which to boot if diag-switch? is true.

diag-device           variable-name   net   Device from which to boot if diag-switch? is true.

boot-file             variable-name   none   File from which to boot if diag-switch? is false.

boot-device           variable-name   disk net   Device(s) from which to boot if diag-switch? is false.

use-nvramrc?          true, false   false   If true, execute commands in NVRAMRC during server start-up.

nvramrc               variable-name   none   Command script to execute if use-nvramrc? is true.

security-mode         none, command, full   none   Firmware security level.

security-password     variable-name   none   Firmware security password if security-mode is not none (never displayed) - do not set this directly.

security-#badlogins   variable-name   none   Number of incorrect security password attempts.

diag-script           all, normal, none   normal   Specifies the set of tests that OpenBoot Diagnostics will run. Selecting all is equivalent to running test-all from the OpenBoot command line.

diag-level            none, min, max   min   Defines how diagnostic tests are run.

diag-switch?          true, false   false   If true:
                      • Run in diagnostic mode
                      • After a boot request, boot diag-file from diag-device
                      If false:
                      • Run in nondiagnostic mode
                      • After a boot request, boot boot-file from boot-device

diag-trigger          none, error-reset, power-on-reset, user-reset, all-resets   power-on-reset, error-reset   Specifies the class of reset event that causes diagnostics to run automatically. Default setting is power-on-reset error-reset.
                      • none – Diagnostic tests are not executed.
                      • error-reset – Reset that is caused by certain hardware error events such as RED State Exception Reset, Watchdog Resets, Software-Instruction Reset, or Hardware Fatal Reset.
                      • power-on-reset – Reset that is caused by power cycling the system.
                      • user-reset – Reset that is initiated by an operating system panic or by user-initiated commands from OpenBoot (reset-all or boot) or from Solaris (reboot, shutdown, or init).
                      • all-resets – Any kind of system reset.
                      Note: Both POST and OpenBoot Diagnostics run at the specified reset event if the variable diag-script is set to normal or all. If diag-script is set to none, only POST runs.

error-reset-recovery   boot, sync, none   boot   Command to execute following a system reset generated by an error.
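
The diag-switch?, diag-trigger, and diag-script rows of TABLE C-1 together decide where the system boots from and which diagnostics run after a reset. The sketch below restates only the rules given in those rows as a small Python model. It is an illustration, not firmware code: the function names and the `DEFAULTS` dictionary are hypothetical, the default value "none" is represented as an empty string, and settings that the table does not tie to these rules (for example, diag-level) are deliberately ignored.

```python
# Simplified model of the diagnostic-related rows in TABLE C-1.
# Only the rules stated in the table are covered; actual firmware
# behavior may depend on settings that are not modeled here.

DEFAULTS = {
    "diag-switch?": "false",
    "diag-trigger": "power-on-reset error-reset",
    "diag-script": "normal",
    "diag-device": "net",
    "diag-file": "",          # table default is "none"
    "boot-device": "disk net",
    "boot-file": "",          # table default is "none"
}


def boot_source(env):
    """Return (device, file) used after a boot request."""
    if env["diag-switch?"] == "true":
        return env["diag-device"], env["diag-file"]
    return env["boot-device"], env["boot-file"]


def diagnostics_for_reset(env, reset_event):
    """Return the diagnostics that run automatically for a reset event.

    reset_event is one of "error-reset", "power-on-reset", "user-reset".
    """
    # diag-trigger may list several reset classes separated by spaces.
    triggers = set(env["diag-trigger"].replace(",", " ").split())
    if "none" in triggers:
        return []
    if "all-resets" not in triggers and reset_event not in triggers:
        return []
    # Per the table note: diag-script none means only POST runs.
    if env["diag-script"] == "none":
        return ["POST"]
    return ["POST", "OpenBoot Diagnostics"]


if __name__ == "__main__":
    print(boot_source(DEFAULTS))
    print(diagnostics_for_reset(DEFAULTS, "power-on-reset"))
    print(diagnostics_for_reset(DEFAULTS, "user-reset"))
```

With the table defaults, a power-on reset runs both POST and OpenBoot Diagnostics, while a user-initiated reset runs neither, because user-reset is not part of the default diag-trigger setting.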
