Event Monitoring Service: EMS Hardware Monitors. 2002. 5. 16. Instructor: 공 용섭 (Manager), HPCS/SDO/MC


Page 1: Title slide

Event Monitoring Service
EMS Hardware Monitors
2002. 5. 16
Instructor: 공 용섭 (Manager)
HPCS/SDO/MC

Page 2: Agenda

• Basic understanding of EMS
• Installing EMS Monitors
• Operating EMS HW Monitors
• HW Monitor configuration files
• PSM – Peripheral Status Monitor
• Review of Operation and Basic Trouble-shooting Guidance

Page 3: Objectives

• Learn how to use EMS HW Monitors to resolve hardware-related problems: how to use, install, and configure them, and how to apply them to a system.
• Learn several methods that help with basic troubleshooting of EMS HW Monitors.

Page 4: SECTION 1: Basic Understanding of EMS

• Benefits of EMS HW Monitors
• What is EMS?
• Monitors currently used in the EMS environment
• Configurable notification methods

Page 5: Event Monitoring Architecture

[Diagram] Events generated by HP servers (HP-UX servers) are detected by monitors, relayed to the Event Monitoring Service framework*, and distributed to notification devices.
• Monitors: EMS Hardware Monitors, EMS HA* Monitors, Third Party Monitors, User Developed Monitors
• Notification targets: Customer, Enterprise Mgmt Applications* (HP OpenView, CA TNG), MC/ServiceGuard*, HP Predictive*, HP Response Center

Page 6: Benefits of EMS HW Monitors

• Reduced system downtime
• Shorter problem analysis and repair time
• Default monitoring configuration
• Common tools for monitoring HW resources
• Multiple notification methods
• Can be integrated with other applications
• Maximum benefit with minimal administration

Page 7: Monitors currently utilizing EMS

• AutoRAID Disk Array (armmon)
• Chassis Monitor (dm_chassis, June 2001)
• CMC Monitor (cmc_em, June 2001)
• Core Hardware (dm_core_hw)
• Disk (disk_em)
• Disk Array FC60 (fc60mon)
• Fast Wide SCSI Disk Array (fw_disk_array)
• Fibre Channel Adapter (dm_FCMS_adapter)
• Fibre Channel Adapter A5158 (dm_TL_adapter)
• Fibre Channel Arbitrated Loop Hub (dm_fc_hub)
• Fibre Channel SCSI Multiplexer (dm_fc_scsi_mux)
• Fibre Channel Switch (dm_fc_sw)
• High Availability Disk Array (ha_disk_array)
• High Availability Storage System (See SES Enclosure Monitor)
• Itanium Core Hardware Monitor (ia64_corehw, June 2001)
• Kernel Resource (krmond)
• LPMC (lpmc_em)
• Memory (dm_memory)
• Remote (RemoteMonitor)
• SCSI Card (scsi123_em)
• SCSI Tape Devices (dm_stape)
• SES Enclosure Monitor (ses_enclosure)
• System Status (sysstat_em)
• UPS (dm_ups)

http://www.docs.hp.com/hpux/onlinedocs/diag/ems/ems_prod.htm

Page 8: What is EMS?

[Diagram labels: EMS Monitors, Configuration Clients, Resources, Configuration Core/Framework, Target Apps, Notification Methods]

• Event Monitoring Services is a framework that provides:
  – Common tools for configuring resource monitoring
  – Notification through multiple channels when an event or critical value occurs on a particular resource
  – Easy integration of new resource monitors through a standard API

Page 9: HW Event Monitoring Components

• EMS
  – Framework for event notification
  – Monitoring request manager (monconfig)
• Hardware Monitors
  – Configuration files and tools
• Support Tools Manager
  – Low-level error handling components used to log and display events.
  – STM provides a map that the HW monitors use to determine which devices to monitor.

Page 10: EMS HW Monitoring: Communication Between Resources

[Diagram] Components shown: HW Device Driver, Diag2 Pseudo-Driver, diaglogd, memlogd, Raw Error Log, Raw Memory Log, OS Memory Access*, Memory Subsystem, STM, Decoder, Logtool, Formatted Logs, HW Monitors, EMS Framework, Event Monitoring, Notify.

*OS Memory Access really means hooks in the kernel to access the memory subsystem (i.e. read registers).

Page 11: EMS Monitor Structure Diagram

[Diagram] Blocks and labels shown:
• RESOURCES: Hardware, System
• EMS MONITORS: STM, PSM, HA Monitors, HW Monitors (Poll, Notify)
• CONFIGURATION CLIENTS: Monconfig, EMS GUI (SAM), MC/SG and MC/LM Package Configuration (Configure)
• CONFIGURATION CORE/FRAMEWORK: Registrar, DictDB, P-Client
• NOTIFICATION METHODS: TCP, UDP, SNMP, eMail, OPC, syslog, textlog, emslog, console
• TARGET APPS: IT/O, NNM, OnSite Pred. (emsscan), RC Predictive, MC/SG, MC/LM
• USER
• Values passed: Up/DN; Info-Min-Maj-Ser-Crit
• Logs (base path to monitor logs: /etc/opt/resmon/log/): registrar.log, client.log, api.log, armmon.log, fc60mon.log, emslog

Page 12: EMS Event Notification Methods

• Messages written to the system:
  – SYSLOG
  – TEXTLOG: event.log
  – CONSOLE
  – Predictive Text File
• Messages sent via various protocols:
  – EMAIL
  – TCP
  – UDP
  – SNMP
  – OPC (OpenView Messaging)
• Notification integrated with MC/ServiceGuard and MC/LockManager

Page 13: Target Applications

Any system management application that supports SNMP traps or TCP or UDP message services can receive event notifications.

HP OpenView IT/O and HP Network Node Manager message templates for EMS are available at no charge as contributed software from the external web page: http://www.software.hp.com

http://www.software.hp.com/products/EMS/index.html

Page 14: Notification Targets and End User Actions

• Method: Write to syslog
  – Notification target: /var/adm/syslog/syslog.log
  – End user action: Sys Admin reads syslog
• Method: Write to console
  – Notification target: System console
  – End user action: Sys Admin views console msg on display
• Method: Write to text log
  – Notification target: User defined text log. Default: /var/opt/resmon/log/event.log
  – End user action: Sys Admin reads text log
• Method: Write to Predictive log
  – Notification target: /var/opt/pred/emslog. Onsite Predictive scanner (emsscan) consults rule set and notifies RC if necessary
  – End user action: RCE reads Predictive message and takes appropriate action
• Method: MC/SG, MC/LM
  – Notification target: MC/SG, MC/LM
  – End user action: MC/SG, MC/LM performs package fail-over

Page 15: Notification Targets and End User Actions (cont’d)

• Method: Send via eMail
  – Notification target: eMail address
  – End user action: Sys Admin (or addressee) invokes eMail application and reads message
• Method: Send via TCP
  – Notification target: User written socket program, host and port specified. Application dependent
  – End user action: Sys Admin runs receiving application
• Method: Send via UDP
  – Notification target: User written socket program, host and port specified. Application dependent
  – End user action: Sys Admin runs receiving application
• Method: Send via SNMP
  – Notification target: Any application configured to receive SNMP msgs. Templates provided for integration with HP NNM. Application dependent
  – End user action: Sys Admin runs Network Node Manager, receives message. Visual change in HW icon
• Method: Send OPC format
  – Notification target: Templates provided for integration with HP OpenView IT/O
  – End user action: Sys Admin runs OpenView IT/O, receives message. Visual change in HW icon
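The "user written socket program" for the TCP method could look roughly like the Python sketch below. It is only an illustration: the slides do not describe the payload EMS sends, so the bytes are treated as opaque, and the function names `read_event` and `serve_once` are hypothetical.

```python
import socket


def read_event(conn: socket.socket, bufsize: int = 4096) -> bytes:
    """Read from a connected socket until the sender closes the connection."""
    chunks = []
    while True:
        data = conn.recv(bufsize)
        if not data:  # an empty read means the peer closed the connection
            break
        chunks.append(data)
    return b"".join(chunks)


def serve_once(host: str, port: int) -> bytes:
    """Accept one connection on host:port and return the raw bytes received.

    host and port are the values that would also be given in the
    monitoring request (assumption: one message per connection).
    """
    with socket.create_server((host, port)) as server:
        conn, _addr = server.accept()
        with conn:
            return read_event(conn)
```

A receiver would call `serve_once` with the same host and port entered for the TCP notification method, then hand the bytes to whatever decoding the site uses.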

Page 16: Section 2: Installing EMS Monitors

• Overview of the steps involved
• Installing EMS Hardware Monitors

Page 17: EMS HW Monitors: Product Structure

[Diagram] Two OnlineDiag Support Tools Bundles are shown, one for Series 800 and one for Series 700, each for HP-UX 10.20 and HP-UX 11.00. Labels shown in each bundle: B4708AA, EMS-Core, EMS-Config, Contrib-Tools, LIF-LOAD, Sup-Tool-Mgr-800 (Sup-Tool-Mgr-700 for Series 700), B4708AA-1000 EMS HW Event Monitors, B7609BA. A Predictive label also appears.

Page 18: The Steps Involved

1. Install STM from the latest Support Plus Media.
2. Check the supported products list for any special requirements for monitoring a particular device.
3. Run the Monitoring Request Manager: /etc/opt/resmon/lbin/monconfig
4. Enable hardware event monitoring if release of media is earlier than June 1999.
5. Determine whether the default monitoring requests are adequate.
6. Add or modify monitoring requests as necessary.
7. Verify monitor operation (optional).

NOTE: More information on special requirements can be found in the EMS Hardware Monitors User’s Guide, pages 30-34, or at http://www.docs.hp.com/hpux/onlinedocs/diag/ems/ems_prod.htm

Page 19: Installing EMS Hardware Monitors

• Software components installed for HW monitoring:
  – All hardware event monitors
  – Monitor configuration files
  – Monitoring Request Manager
  – EMS framework, including EMS graphical interface

Page 20: Supported System Configuration

• HP 9000 Series 700 or 800 Computer
• HP-UX 10.20 or 11.00, 11i
• Support Plus Media or http://www.software.hp.com
• If you are using MC/ServiceGuard (optional), you must have
  – Version A.10.11 for HP-UX 10.20
  – Version A.11.04 for HP-UX 11.0

Page 21: Online Diagnostic Products

• OnlineDiag Support Tools Bundle
  – HP-UX 10.20: B.10.20.22.xx
  – HP-UX 11.00: B.11.00.17.xx
  – HP-UX 11.11: B.11.11.03.xx
  – HP-UX 11.20: B.11.20.01.xx
• Supported on HP-UX 10.20 and 11.X
  – EMS-Core: EMS Framework (B7609BA)
  – EMS-Config: SAM interface to EMS (B7609BA)
  – Sup-Tool-Mgr: STM, EMS HW Monitors, PSM, monconfig (B4708AA)
  – Predictive: Onsite Predictive, emsscan

Page 22: Monitor-Specific Installation Tasks (Example)

• Special requirements for individual monitors:
  – URL: http://www.docs.hp.com/hpux/onlinedocs/diag/ems/ems_prod.htm
• AutoRAID Disk Arrays - ARMserver
• Fibre Channel SCSI Multiplexers - F/W ver. 3840 or later
• Fibre Channel Adapters
• Fibre Channel Arbitrated Loop Hubs
• Fibre Channel Switches
• System Chassis Code Monitor (11i only)
• HP UPSs - ups_mond

Page 23: Enable Monitoring

• IPR9906 - EMS HW Monitors are active by default after installation
  – E-Mail message for enabling HW monitors is sent to root@system*
  – HW monitors enabled from character-based configuration client called the “Monitoring Request Manager”: /etc/opt/resmon/lbin/monconfig
• Default monitoring requests provided for HW event monitoring
  – Users may accept these defaults or customize their monitoring configuration using monconfig

Page 24: Enable Monitoring (cont’d)

• Default monitoring requests NOT provided for HW status monitoring (PSM)
• Status monitoring configured using the EMS GUI
  – EMS HW resources will show up in EMS GUI configuration client after specific HW monitoring has been added using monconfig
• IPR0006 - CONSOLE logging not enabled by default.

Page 25: EMS HW Monitors and EMS HA Monitors - Differences

• EMS HW MONITORS
  – Monitor HW resources such as I/O devices, interface cards, and memory
  – All HP 9000 systems running HP-UX 10.20 or 11.x
  – Distributed “free” on the Support Plus Media
  – Event monitoring via Monitoring Request Manager (monconfig)
  – Status Monitoring via a SAM GUI
• EMS HA MONITORS
  – Monitor disk, cluster, network, and system resources
  – Only HP 9000/800 systems running HP-UX 10.20 or 11.x
  – Available from HP at extra cost
  – Works best in a high availability environment
  – Status Monitoring via SAM GUI

Page 26: Section 3: Operating EMS HW Monitors

• Default settings
• Listing
• Viewing
• Adding
• Modifying
• Verifying
• Checking status
• Retrieving and interpreting event messages
• Removing/deleting/disabling

Page 27: What is a Monitoring Request?

• Which hardware should be monitored?
  – Select the monitor associated with the HW resource.
• Which events should be reported?
  – Select the severity level of the events to report.
• How should notification be delivered?
  – Select the notification method.

Page 28: Monitoring Request Manager

• Monitoring is enabled by default
• Opening screen indicates if monitoring is currently enabled or disabled
• Must be logged on as root
• Type /etc/opt/resmon/lbin/monconfig

Page 29: Monitoring Request Manager: Opening Screen

[Screenshot] EMS Version: A.03.20, STM Version: A.26.00

Page 30: Enabling Hardware Event Monitoring

• Enabled automatically when the Support Tools bundle is installed
• Run the Hardware Monitoring Request Manager
  – /etc/opt/resmon/lbin/monconfig
• Enter “E” from the main menu selection prompt

Page 31: HW Monitoring Request Manager

[Screenshot]

Page 32: Default Monitoring Requests

• Severity levels: All (>= INFORMATION)
  – Notification method: TEXTLOG file: /var/opt/resmon/log/event.log
• Severity levels: Major Warning, Serious, Critical
  – Notification method: SYSLOG: /var/adm/syslog/syslog.log
• Severity levels: Major Warning, Serious, Critical
  – Notification method: EMAIL, address: root

Page 33: Listing Monitor Descriptions

• Use when selecting the appropriate monitor for a particular HW resource.
• Use to check which HW resources each monitor supports.
• Lists descriptions of the available monitors.
• Run the Hardware Monitoring Request Manager
  – /etc/opt/resmon/lbin/monconfig
• Enter ‘L’ from the main menu selection prompt

http://www.docs.hp.com/hpux/onlinedocs/diag/ems/emd_summ.htm

Page 34: Listing Monitor Descriptions

[Screenshot]

Page 35: Viewing Current Monitoring Requests

• Use to review the monitoring requests before adding or modifying them.
• Use to decide which additional requests are needed to carry out the monitoring and notification strategy.
• Run the Hardware Monitoring Request Manager
  – /etc/opt/resmon/lbin/monconfig
• Enter ‘S’ from the main selection prompt

Page 36: Current Monitoring Requests

[Screenshot]

Page 37: Adding a Monitoring Request

• Use to add and configure a notification method for a particular monitor.
• Run the Monitoring Request Manager
  – /etc/opt/resmon/lbin/monconfig
• Enter “A” at the main selection prompt, then select
  – Monitors to which this configuration can apply
  – Criteria Thresholds (INFORMATION, CRITICAL, …)
  – Criteria Operator (>=, =, …)
  – Notification Method Prompt (EMAIL, SYSLOG, …)
  – User Comment (Any desired comments)
  – Client Configuration File Prompt: (C)lear for the default client configuration file, (A)dd to specify a client configuration file
• Save request when prompted

Page 38: Hardware Monitoring Requests

A monitoring request combines three settings:

• Hardware Event Monitor
  – Specifies which HW to monitor. Multiple monitors can be selected for each request.
• Severity Level and Operator
  – Severity Level: Critical = 5, Serious = 4, Major Warning = 3, Minor Warning = 2, Information = 1
  – Operator: =, >, <, >=, <=, !
  – Specifies which events to report. One pair of values can be selected for each request.
• Notification Method
  – Specifies how notification is delivered when an event occurs. Only one notification method can be selected for each request.
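The severity and operator pair can be read as a simple numeric comparison. The Python sketch below uses the severity numbers from the slide and assumes that the event's severity is the left operand and the request's threshold is the right operand; the names `SEVERITY`, `OPERATORS`, and `matches` are made up for illustration and are not part of EMS.

```python
import operator

# Numeric values as listed on the slide.
SEVERITY = {
    "INFORMATION": 1,
    "MINOR_WARNING": 2,
    "MAJOR_WARNING": 3,
    "SERIOUS": 4,
    "CRITICAL": 5,
}

# Operator symbols as listed on the slide ("!" means not equal).
OPERATORS = {
    "=": operator.eq,
    ">": operator.gt,
    "<": operator.lt,
    ">=": operator.ge,
    "<=": operator.le,
    "!": operator.ne,
}


def matches(event_severity: str, criteria_operator: str, criteria_threshold: str) -> bool:
    """Return True if an event of this severity satisfies the request criteria."""
    compare = OPERATORS[criteria_operator]
    return compare(SEVERITY[event_severity], SEVERITY[criteria_threshold])
```

For example, a request with operator `>=` and threshold `MAJOR_WARNING` matches Major Warning, Serious, and Critical events but not Minor Warning or Information events.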

Page 39: Event Severity Levels & Interaction with MC/SG

• Critical: An event that will or has already caused data loss, system down time, or other loss of service. System operation will be impacted and normal use of the HW should not continue until the problem is corrected. Immediate action is required to correct the problem.
  – If MC/ServiceGuard is installed and this is a critical component, a package fail-over WILL occur.
• Serious: An event that may cause data loss, system down time, or other loss of service if left uncorrected. System operation and normal use of the HW may be impacted. The problem should be repaired as soon as possible.
  – If MC/ServiceGuard is installed and this is a critical component, a package fail-over WILL occur.
• Major Warning: An event that could escalate to a Serious condition if not corrected. System operation should not be impacted and normal use of the HW can continue. The problem should be repaired at a convenient time.
  – If MC/ServiceGuard is installed and this is a critical component, a package fail-over WILL NOT occur.
• Minor Warning: An event that will not likely escalate to a more severe condition if left uncorrected. System operation will not be interrupted and normal use of the hardware can continue. The problem can be repaired at a convenient time.
  – If MC/ServiceGuard is installed and this is a critical component, a package fail-over WILL NOT occur.
• Information: An event that occurs as part of the normal operation of the hardware. No action is required.
  – If MC/ServiceGuard is installed and this is a critical component, a package fail-over WILL NOT occur.

Page 40: Example: Adding a Monitoring Request (AutoRaid)

[Screenshot]

Page 41: Example: Adding a Monitoring Request (cont’d)

[Screenshot]

Page 42: Example: Adding a Monitoring Request (cont’d)

[Screenshot]

Page 43: Example: Adding a Monitoring Request (cont’d)

[Screenshot]

Page 44: Example: Adding a Monitoring Request (cont’d)

[Screenshot]

Page 45: Modifying Monitoring Requests

• Run the Hardware Monitoring Request Manager
  – /etc/opt/resmon/lbin/monconfig
• Enter “M” from the main selection prompt
• Enter the number of the request you want to modify
• Change the setting(s)
• Save request when prompted

Page 46: Modifying Monitoring Requests

[Screenshot]

Page 47: Remove/Disable EMS Hardware Monitoring

• Remove STM and/or EMS
  – Run swremove
  – Select B7609BA or OnlineDiag bundle
• Disable
  – Run Hardware Monitoring Request Manager: /etc/opt/resmon/lbin/monconfig
  – Enter “K” from main menu selection prompt
  – Confirm when prompted

Page 48: Delete EMS HW Monitor Requests

• Delete
  – Run Hardware Monitoring Request Manager: /etc/opt/resmon/lbin/monconfig
  – Enter “D” from main menu selection prompt
  – Enter the number assigned to the request to delete
  – Delete when prompted

Page 49: Deleting A Monitoring Request

[Screenshot]

Page 50: Verifying Hardware Event Monitoring

• Run the command send_test_event (introduced September 2000)
• Simulate a hardware failure or event
  – Remove a disk from an array
  – Unplug a cable
  – Turn off the hardware resource
  – Use known defective media
• Event messages will be generated

Page 51: Checking Detailed Monitoring Status

• Lists all monitoring requests currently in effect.
• Shows only requests that are currently active.
• An inactive monitor is shown as “NOT MONITORING”.
• Any monitor that has no resources to monitor is inactive.

Page 52: Predictive

• Predictive adds an additional default request for FC monitors
• List of Predictive-Enabled Monitors as of June 2001 for 11.00:
  – dm_core_hw, disk_em, dm_FCMS_adapter, dm_TL_adapter, dm_fc_scsi_mux, ha_disk_array, dm_ses_enclosure, lpmc_em, dm_memory, RemoteMonitor, dm_stape, scsi123_em, sysstat_em, dm_ups
• Events with severity >= INFORMATION are written to /var/opt/pred/emslog and TEXTLOG

http://www.docs.hp.com/hpux/onlinedocs/diag/ems/ems_pred.htm

Page 53: Retrieving & Interpreting Event Messages

• The email and text file notification methods show the full content of the message.
• For events received through other notification methods, use the “resdata” utility to view the message.

Page 54: Sample Event Message

Event Monitoring Service Event Notification %<
Notification Time: Wed Sep 9 10:48:30 2000
Hpbs8684 sent Event Monitor notification information:
/storage/events/disks/default/10_4_4.0.0 is >= 1.
Its current value is CRITICAL(5)

Event data from monitor:
Event time: Wed Sep 9 10:48:30 2000
Hostname: hpbs8684.boi.hp.com IP Address : 15.62.120.25
Event Id: 0x00356B15e00000000 Monitor : disk_em
Event # : 100037 Event Class: I/O
Severity: CRITICAL
Disk at hardware path 10/4/4.0.0 : Media Failure
Associated OS error log entry id(s):
000000000000000000

Description of Error:
The device was unsuccessful in reading data for the current I/O request due to an error on the medium. The data could not be recovered. The request was likely processed in a way which could cause damage to or loss of data.

Probable Cause / Recommended Action:
The medium in the device is flawed. If the medium is removable, replace the medium with a fresh one. Alternatively, if the medium is not removable, the device has experienced a hardware failure. Repair or replace the device, as necessary.

Page 55: Information Contained in an Event Message

• Standard information:
  – Notification time
  – Value that triggered event
  – Event data from monitor: event time, hostname, event #, severity, IP address, etc.
  – Description of Error
  – Probable Cause
  – Recommended Action
• HW Resource Information
  – Product Information: Path, FRU, ID
  – SCSI Status and Sense Data
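When event messages are collected from email or the text log, the labelled fields shown in the sample message can be pulled out with a few regular expressions. The Python sketch below is only an illustration based on the labels visible in that sample ("Hostname", "Monitor", "Event #", "Severity"); the function name `extract_fields` is hypothetical, and real messages may contain fields or layouts not covered here.

```python
import re

# Patterns follow the labels seen in the sample message; spacing around
# the colon varies there, so optional whitespace is allowed.
FIELD_PATTERNS = {
    "hostname": r"Hostname\s*:\s*(\S+)",
    "monitor": r"Monitor\s*:\s*(\S+)",
    "event_number": r"Event #\s*:\s*(\d+)",
    "severity": r"Severity\s*:\s*(\w+)",
}


def extract_fields(message: str) -> dict:
    """Return the labelled fields found in an event message as strings."""
    fields = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = re.search(pattern, message)
        if match:
            fields[name] = match.group(1)
    return fields
```

Fields that are absent from a message are simply left out of the returned dictionary.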

Page 56: Section 4: HW Monitor Configuration Files

• Types of configuration files associated with each HW monitor
• Configuration settings
• Settings that can be modified
• Changing configuration parameters
• When changes take effect

Page 57: Types of Configuration Files

• Control the operation of each HW event monitor
  – Global.cfg
• Monitor Specific files
  – monitor_name.cfg
  – default_monitor_name.clcfg (multiple-view)
• Start up Specific files
  – monitor_name.sapcfg
• PSM monitor specific files
  – monitor_name.psmcfg

Page 58: HW Monitor Configuration Files

• Global monitor configuration file. Settings defined in this file apply to all monitors and have lower priority than those in a monitor-specific file.
  – /var/stm/config/tools/monitor/Global.cfg
• Monitor-specific configuration file. Each monitor keeps its optimized settings in this file. These settings take priority over the global configuration file.
  – /var/stm/config/tools/monitor/monitor_name.cfg
• Starting with the June 2000 release, several hardware monitors were converted to "multiple-view" (Predictive-enabled). These monitors use a different configuration file, the Client Configuration File.
  – /var/stm/config/tools/monitor/default_monitor_name.clcfg
• The files share the following common operating parameters:
  – Polling Interval, Repeat Frequency, Severity Actions, and Event Definition
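The precedence rule (monitor-specific settings override Global.cfg) can be pictured with the short Python sketch below. It assumes simple `KEYWORD value  # comment` lines such as `POLL_INTERVAL 60` and handles only single-valued keywords; the full grammar of the real files is not given in these slides, and `parse_settings` and `effective_settings` are hypothetical names.

```python
def parse_settings(text: str) -> dict:
    """Parse 'KEYWORD value  # comment' lines into a keyword -> value dict."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        parts = line.split(None, 1)  # keyword, then the rest of the line
        settings[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return settings


def effective_settings(global_text: str, monitor_text: str) -> dict:
    """Merge the two files; monitor-specific values win over global ones."""
    merged = parse_settings(global_text)
    merged.update(parse_settings(monitor_text))
    return merged
```

With a global `POLL_INTERVAL 60` and a monitor-specific `POLL_INTERVAL 30`, the merged result carries 30 for that monitor while other global values remain unchanged.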

Page 59: HW Monitor Configuration Files (cont.)

• Monitor Startup Specific files
  – /var/stm/config/tools/monitor/monitor_name.sapcfg
  – Default information for that specific monitor
• PSM Monitor Specific files
  – /var/stm/config/tools/monitor/monitor_name.psmcfg
  – Optimized operating settings for specific monitors

Page 60: Operating Parameters: Polling Interval

• Sets the polling period used to check HW status.
• Choose the value with system performance in mind.
• Default in Global.cfg and monitor_name.cfg:
  POLL_INTERVAL 60 # in minutes (one hour)
• The interval is counted from the time the monitor is enabled.
• Reasons to change:
  – To reduce potential HW-related problems.
  – Change the value in the individual monitor's configuration file rather than in Global.cfg.

Page 61: Operating Parameters: Repeat Frequency

• How often should the same event be reported?
• Reduces the system load caused by the same message being generated continuously.
• Default in Global.cfg:
  REPEAT_FREQUENCY 1440 # in minutes (one day)
  – Default is once per day
• Reason to change:
  – When an event needs to be reported more often.
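The effect of a repeat frequency can be sketched as a small filter that lets the same event through at most once per interval. This is an illustration of the idea only, not how the monitors are implemented; how a monitor decides that two events are "the same" is not stated in these slides, so `event_key` is a hypothetical identifier and `RepeatFilter` a made-up name.

```python
class RepeatFilter:
    """Suppress repeats of the same event inside the repeat interval."""

    def __init__(self, repeat_frequency_minutes: int = 1440):
        self.interval_seconds = repeat_frequency_minutes * 60
        self._last_reported = {}  # event_key -> time of last report (seconds)

    def should_report(self, event_key: str, now: float) -> bool:
        """Return True if this event should be reported at time `now`."""
        last = self._last_reported.get(event_key)
        if last is not None and now - last < self.interval_seconds:
            return False  # same event reported too recently
        self._last_reported[event_key] = now
        return True
```

With the default of 1440 minutes, a condition that persists all day would be reported once, then again only after 24 hours have passed; a smaller value reports it more often.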

Page 62: Operating Parameters: Severity Actions

• Determines whether events of a given severity are reported to EMS or ignored.
• Defaults in Global.cfg:
  – SEVERITY_ACTION CRITICAL NOTIFY
  – SEVERITY_ACTION SERIOUS NOTIFY
  – SEVERITY_ACTION MAJOR_WARNING NOTIFY
  – SEVERITY_ACTION MINOR_WARNING NOTIFY
  – SEVERITY_ACTION INFORMATION NOTIFY
• Reason to change:
  – Change the action to “IGNORE” so that less important events are ignored.
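A rough Python sketch of how the SEVERITY_ACTION lines could be read and applied is shown below. It assumes each line has exactly the three-token form shown on the slide and, as a further assumption, treats a severity with no entry as NOTIFY; the function names are hypothetical.

```python
def parse_severity_actions(text: str) -> dict:
    """Collect 'SEVERITY_ACTION <severity> <action>' lines into a dict."""
    actions = {}
    for raw_line in text.splitlines():
        tokens = raw_line.split("#", 1)[0].split()
        if len(tokens) == 3 and tokens[0] == "SEVERITY_ACTION":
            actions[tokens[1]] = tokens[2]
    return actions


def is_reported(severity: str, actions: dict) -> bool:
    """True if events of this severity are passed on to EMS.

    Assumption: a severity without an entry is reported (NOTIFY).
    """
    return actions.get(severity, "NOTIFY") == "NOTIFY"
```

Changing `SEVERITY_ACTION INFORMATION NOTIFY` to `SEVERITY_ACTION INFORMATION IGNORE` in the input makes `is_reported("INFORMATION", ...)` return False while the other severities are unaffected.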

Page 63: Operating Parameters: Events

• Defines the events handled by the monitor
• Defines the severity level
• Defines the action taken when the event occurs
• Format in Global.cfg and monitor_name.cfg:
  config-verb  event no.  severity  action  #description
  DEFINE_EVENT 10001 CRITICAL DEFAULT #comments here
• Format in default_monitor_name.clcfg:
  EQ:event_number:severity:enable flag:suppression_time:time_window:threshold:value threshold1:operator1:operator2:value threshold2
  EQ:3:CRITICAL:TRUE:1440:ANY:1:NONE:NO_OP:NO_OP:NONE
• Reasons to change:
  – The assigned severity level does not reflect the importance of an event in every environment.
  – To ignore a particular event.
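The colon-separated .clcfg entry can be split into named fields as in the Python sketch below. The field names are taken from the format line on the slide; the meaning and units of most fields are not explained there, so only the event number and the enable flag are converted and everything else is kept as text. `EventEntry` and `parse_entry` are hypothetical names.

```python
from typing import NamedTuple


class EventEntry(NamedTuple):
    tag: str                # "EQ" in the example
    event_number: int
    severity: str
    enabled: bool           # the "enable flag" field
    suppression_time: str   # kept as text; units are not stated on the slide
    time_window: str
    threshold: str
    value_threshold1: str
    operator1: str
    operator2: str
    value_threshold2: str


def parse_entry(line: str) -> EventEntry:
    """Split one .clcfg event line into its eleven fields."""
    fields = [part.strip() for part in line.split(":")]
    if len(fields) != 11:
        raise ValueError(f"expected 11 fields, got {len(fields)}")
    return EventEntry(
        fields[0],
        int(fields[1]),
        fields[2],
        fields[3].upper() == "TRUE",
        *fields[4:],
    )
```

Parsing the example line from the slide gives event number 3 with severity CRITICAL and the enable flag set.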

Page 64: Startup Configuration Files

• Contain monitoring request definitions for each monitor
  – /var/stm/config/tools/monitor/monitor_name.sapcfg
• Format:
  – MONITOR: /storage/events/disk_arrays/FW_SCSI
  – Criteria Threshold: INFORMATION
  – Criteria Operator: >=
  – Target Type: TEXTLOG
  – Target TEXTLOG File: /var/opt/resmon/log/event.log
• The contents of this file are used at initial startup and when ioscan and monconfig are run.
• Reason to change:
  – To tailor the monitoring requests to the customer's environment.

Page 65: Startup Configuration File Entries

• Keyword: MONITOR (required)
  – Values: A valid event monitor resource path
  – Description: Identifies HW event monitor to which entry applies. Entries must use resource path for monitor being configured. Note: This must be the first keyword in each entry.
• Keyword: Criteria Threshold (required)
  – Values: Critical, Serious, Major_Warning, Minor_Warning, Informational
  – Description: Defines severity level used as notification criteria threshold.
• Keyword: Criteria Operator (required)
  – Values: < (less than), <= (less than or equal to), > (greater than), >= (greater than or equal to), ! (not equal to)
  – Description: Identifies arithmetic operator used with criteria threshold to control what events are reported. Operator treats each severity level as a numeric value assigned as follows: Critical = 5, Serious = 4, Major warning = 3, Minor warning = 2, Informational = 1. Event severity received is the left operand. Criteria Threshold value is the right operand.

Page 66: Event Monitoring Service 1 EMS Hardware Monitors 2002. 5. 16 강사 : 공 용섭 과장 HPCS/SDO/MC

66

Event Monitoring

Service

Startup Configuration File Entries (cont’d)

• Target Type (required)
– Values: UDP, TCP, OPC, SNMP, TEXTLOG, SYSLOG, EMAIL, CONSOLE
– Description: Identifies the method of notification used.

• Target Type Modifier (required for the following target types)
– UDP: Target UDP Host (hostname of the machine to which UDP event messages will be sent) and Target UDP Port (port number on the host that will be used for the network connection).
– TCP: Target TCP Host (hostname of the machine to which TCP event messages will be sent) and Target TCP Port (port number on the host that will be used for the network connection).
– TEXTLOG: Target TEXTLOG (name of the log file to which event messages will be sent).
– EMAIL: Target EMAIL Address (email address of the recipient of the event messages).

• Comment (optional)
– Values: Any text string
– Description: Optional field presented as user data in each event meeting these criteria.
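As a rough illustration of the modifier rules above, the following Python sketch reports which modifier keywords an entry still lacks. The dict representation of an entry and the names REQUIRED_MODIFIERS and missing_modifiers are hypothetical. The TEXTLOG modifier is spelled "Target TEXTLOG File" here, as in the example entries, and keywords are compared without regard to case; both are assumptions.

```python
# Modifier keywords required for each target type, per the table.
# Target types not listed here need no modifier.
REQUIRED_MODIFIERS = {
    "UDP": ["Target UDP Host", "Target UDP Port"],
    "TCP": ["Target TCP Host", "Target TCP Port"],
    "TEXTLOG": ["Target TEXTLOG File"],
    "EMAIL": ["Target EMAIL Address"],
}

VALID_TARGET_TYPES = {
    "UDP", "TCP", "OPC", "SNMP", "TEXTLOG", "SYSLOG", "EMAIL", "CONSOLE",
}


def missing_modifiers(entry):
    """Return the required modifier keywords that are absent from an entry.

    `entry` maps keyword -> value (a hypothetical in-memory form of one
    configuration entry). Keyword comparison ignores case.
    """
    normalized = {key.strip().lower(): value for key, value in entry.items()}
    target_type = normalized.get("target type", "").upper()
    if target_type not in VALID_TARGET_TYPES:
        raise ValueError(f"unknown target type: {target_type!r}")
    required = REQUIRED_MODIFIERS.get(target_type, [])
    return [name for name in required if name.lower() not in normalized]
```

An entry with Target Type UDP and only a Target UDP Host would come back with "Target UDP Port" listed as missing, while a SYSLOG entry needs no modifier at all.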


Default File Entries

• Entry to send all events to textlog:
– MONITOR: /storage/events/disk_arrays/FW_SCSI
– Criteria Threshold: INFORMATION
– Criteria Operator: >=
– Target Type: TEXTLOG
– Target TEXTLOG File: /var/opt/resmon/log/event.log

• Entry to send SERIOUS and CRITICAL events to syslog:
– MONITOR: /storage/events/disk_arrays/FW_SCSI
– Criteria Threshold: SERIOUS
– Criteria Operator: >=
– Target Type: SYSLOG

• Entry to send SERIOUS and CRITICAL events to email:
– MONITOR: /storage/events/disk_arrays/FW_SCSI
– Criteria Threshold: SERIOUS
– Criteria Operator: >=
– Target Type: EMAIL
– Target EMAIL Address: root
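Because MONITOR must be the first keyword in each entry, a file holding several entries can be split on that keyword. The Python sketch below shows one way to do this for entries written as plain "Keyword: value" lines. It is not the parser EMS uses; the function name parse_entries and the handling of blank lines and "#" comment lines are assumptions.

```python
def parse_entries(text):
    """Split startup-configuration text into a list of entry dicts.

    Each "MONITOR:" line starts a new entry, since MONITOR must be the
    first keyword in every entry. Blank lines, lines starting with "#"
    (assumed to be comments) and lines without a colon are skipped.
    """
    entries = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or ":" not in line:
            continue
        keyword, value = line.split(":", 1)
        keyword, value = keyword.strip(), value.strip()
        if keyword.upper() == "MONITOR":
            entries.append({})
        if not entries:
            raise ValueError("first keyword of an entry must be MONITOR")
        entries[-1][keyword] = value
    return entries
```

Applied to the three default entries, this would yield three dicts, each with the same MONITOR path and with Target Type set to TEXTLOG, SYSLOG and EMAIL respectively.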


PSM Monitor Configuration Files

• The interaction between the PSM and the HW event monitors is controlled by the following PSM configuration file:

• /var/stm/config/tools/monitor/monitor_name.psmcfg

• Format:
– MONITOR_RESOURCE_NAME: /storage/events/disks/default
– PSM_RESOURCE_NAME (valid PSM resource path)
– MONITOR_STATE_HANDLING (type of state handling)
– DOWN_SEVERITY_THRESHOLD: CRITICAL
– DOWN_SEVERITY_OPERATOR: =

• The file defines which severity levels cause the "Down" state, the action taken as a result, and what is needed to return the resource to the "Up" state.

• The PSM checks the configuration file every 10 minutes.

• Reason to change:
– To redefine which severity levels change a particular resource to the "Down" state.
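The effect of DOWN_SEVERITY_THRESHOLD and DOWN_SEVERITY_OPERATOR can be pictured with a small Python sketch. It assumes that the PSM compares severities numerically in the same way as the startup configuration files (Critical = 5 down to Informational = 1) and that "=" means equality; neither assumption is confirmed by the slides, and the function name causes_down is hypothetical.

```python
import operator

# Severity values borrowed from the startup configuration table (assumption
# that the PSM orders severities the same way).
SEVERITY = {
    "CRITICAL": 5,
    "SERIOUS": 4,
    "MAJOR_WARNING": 3,
    "MINOR_WARNING": 2,
    "INFORMATION": 1,
}

OPERATORS = {
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
    "=": operator.eq,
    "!": operator.ne,
}


def causes_down(event_severity, down_threshold="CRITICAL", down_operator="="):
    """Return True if an event of this severity would mark the resource Down.

    The defaults mirror the example file: only CRITICAL events cause Down.
    """
    compare = OPERATORS[down_operator]
    return compare(SEVERITY[event_severity], SEVERITY[down_threshold])
```

With the example settings only a CRITICAL event marks the resource Down. Changing the threshold to SERIOUS and the operator to >= would make both SERIOUS and CRITICAL events do so, which is the kind of change the last bullet describes.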


Section 5: PSM Peripheral Status Monitors

• What are Peripheral Status Monitors?
• How to configure them with MC/ServiceGuard


When to Use EMS HW Status Monitoring

• Peripheral Status Monitoring
– To determine whether hardware is operational
– To integrate with system management applications such as OpenView IT/O
– To integrate with MC/ServiceGuard by creating packages that depend on HW resources


Overview

• Converts HW events into changes in device/resource status.
• Reports conditions that affect the use of data.
• Uses the EMS GUI in SAM.
• After a status change, the status is used by MC/ServiceGuard for package fail-over.
• Acts as the interface between the HW event monitors and MC/ServiceGuard.


How PSM Works

[Diagram: Hardware Event Monitor → Peripheral Status Monitor (PSM) → Event Monitoring Service (EMS) → EMS Notification and MC/ServiceGuard]

• The HW event monitor assigns a severity level to each event and passes it to the PSM.
• The PSM converts the severity level of the event into a device status (UP or DOWN) and passes that status to EMS.
• If a PSM monitoring request has been created for the resource, notification is sent by the specified notification method.
• If the resource is associated with an MC/ServiceGuard package, EMS notifies MC/SG of the status change. If the status of the resource changes to "Down", MC/SG fails the package over.


PSM Components

• psmctd – Peripheral Status Client/Target daemon
– Used to monitor the status of HW resources.
• psmmon – Peripheral Status Monitor
– Utility that monitors the status of the resources recognized by psmctd.
• set_fixed – Utility that manually changes the status of a HW resource from "Down" to "Up".
– Can be used only for monitors that cannot make this change automatically.

Examples:
– List the HW resources in the "DOWN" state: /opt/resmon/bin/set_fixed -L
– Change a resource from "DOWN" to "UP": /opt/resmon/bin/set_fixed -n resource_name


PSM States

Condition: Interpretation

• Up: HW is operating normally
• Down: An event has occurred that indicates a failure with the HW
• Unknown: Cannot determine the state of the HW. This state is treated as DOWN by the PSM


Configuring MC/ServiceGuard Package Dependencies with the PSM

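• How to configure one or more of the resources available through the PSM with an MC/SG package:
– Create an EMS monitoring request that monitors the status of a specific resource.
– Notify MC/SG when the status of that resource changes.

• Two ways to configure a PSM package dependency:
– SAM
– Editing the package configuration file (/etc/cmcluster/pkg.ascii)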


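MC/ServiceGuard Packages

[Diagram: Node 1 and Node 2 share volume group VG01 (data and mirror disks) with exclusive VG activation. Package 1 (pkg.ascii) contains Applic 1 and the package IP address (IP Addr - Pkg 1) and has a package dependency on VG01. Each node also has its own IP address (IP Addr - Node 1, IP Addr - Node 2).]

• If the status of VG01 becomes "DOWN" on Node 1, the package is started on another system where the status of VG01 is seen as "UP", that is, Node 2.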
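The fail-over decision described for VG01 can be modeled with a toy Python function. This is only a sketch of the idea, not MC/ServiceGuard's actual fail-over policy, which involves more than resource status. The names is_up and choose_node are hypothetical, and an "UNKNOWN" status is treated as DOWN, as the PSM does.

```python
def is_up(state):
    """Only an explicit "UP" counts as up; "UNKNOWN" is treated as DOWN."""
    return state.strip().upper() == "UP"


def choose_node(resource_state_by_node, current_node):
    """Return the node the package should run on, or None if none qualifies.

    `resource_state_by_node` maps node name -> status of the resource
    (for example VG01) as seen from that node.
    """
    # Stay on the current node while the resource is UP there.
    if is_up(resource_state_by_node.get(current_node, "UNKNOWN")):
        return current_node
    # Otherwise pick the first other node that sees the resource as UP.
    for node, state in resource_state_by_node.items():
        if node != current_node and is_up(state):
            return node
    return None
```

For the two-node case on the slide, a status map of Node1 DOWN and Node2 UP with the package currently on Node1 selects Node2.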


GUI Monitoring Request – Example

[Figure: GUI monitoring request example, shown across four slides]

SECTION 6: Basic Troubleshooting Guidance

• Gathering Information
• Disable a Monitored Resource
• How to Test Online Diagnostics
• How to Completely Disable EMS


Gathering Information

The following information is useful when troubleshooting EMS, although not all of it is needed for every problem.

• EMS and STM version (monconfig, cstm, swlist)
• System type (uname -a, model)
• swlist -l bundle and swlist -l product
• /var/opt/resmon/log/event.log
• /etc/opt/resmon/log/api.log, client.log, and registrar.log
• /var/adm/sw/swagent.log
• /var/stm/logs/sys
• Persistence files: /etc/opt/resmon/persistence

To check the EMS and STM versions: http://www.docs.hp.com/hpux/onlinedocs/diag/stm/stm_upd.htm#table


Disable a Monitored Resource

• Since the September 2000 (IPR 0009) release, startmon_client reads the following file:
– /var/stm/data/tools/monitor/disabled_instances

• Instances are listed in the file one per line.

• Wildcards can also be used:
– /storage/events/disks/default/*

• As user root:
1. Add/Delete/Modify instances in the disabled_instances file
2. Execute monconfig and select (E)nable Monitoring
3. Wait for monitoring to be re-enabled
4. Select (C)heck detailed monitoring status
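To preview which instances a disabled_instances file would cover, the one-instance-per-line format and the wildcard can be imitated in Python. The sketch assumes shell-style wildcard matching (Python's fnmatch); the slides do not say how startmon_client actually interprets wildcards. The function names and the instance name in the usage note are hypothetical.

```python
from fnmatch import fnmatchcase


def load_patterns(text):
    """Return the non-empty lines of a disabled_instances file.

    The file lists one instance (or wildcard pattern) per line.
    """
    return [line.strip() for line in text.splitlines() if line.strip()]


def is_disabled(instance, patterns):
    """Return True if the instance matches any listed pattern.

    Matching is shell-style and case-sensitive, which is an assumption.
    """
    return any(fnmatchcase(instance, pattern) for pattern in patterns)
```

With the pattern /storage/events/disks/default/* loaded, a hypothetical instance such as /storage/events/disks/default/8_0_5.6.0 would be reported as disabled, while instances under /storage/events/disk_arrays would not.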


Disable an Individual Event

• For Multiple-View Predictive-Enabled monitors, edit the client configuration file /var/stm/config/tools/monitor/default_monitor_name.clcfg
• Change the "enabled flag" from TRUE to FALSE:
– EQ:3:CRITICAL:FALSE:1440:ANY:1:NONE:NO_OP:NO_OP:NONE

• For other monitors, edit the monitor configuration file /var/stm/config/tools/monitor/monitor_name.cfg
• Change the action flag to IGNORE:
– DEFINE_EVENT 5 CRITICAL IGNORE # power supply fault

• The change takes effect immediately; monitoring does not need to be restarted.
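The two edits above are simple field substitutions, which the following Python sketch performs on a single line of text. The field positions are inferred from the two example lines only (enabled flag in the fourth colon-separated field; action flag as the fourth token of a DEFINE_EVENT line), so treat them as assumptions. The function names are hypothetical, and in practice the files would normally be edited by hand.

```python
def disable_clcfg_event(line):
    """Set the enabled flag of a .clcfg event line to FALSE.

    Assumes colon-separated fields with the flag in the fourth field,
    as in the example line.
    """
    fields = line.split(":")
    if len(fields) < 4 or fields[3] not in ("TRUE", "FALSE"):
        raise ValueError(f"unexpected .clcfg line: {line!r}")
    fields[3] = "FALSE"
    return ":".join(fields)


def ignore_cfg_event(line):
    """Set the action flag of a DEFINE_EVENT line to IGNORE.

    Assumes the layout: DEFINE_EVENT <number> <severity> <action> [# comment].
    Any trailing comment is kept.
    """
    body, hash_mark, comment = line.partition("#")
    tokens = body.split()
    if len(tokens) != 4 or tokens[0] != "DEFINE_EVENT":
        raise ValueError(f"unexpected DEFINE_EVENT line: {line!r}")
    tokens[3] = "IGNORE"
    result = " ".join(tokens)
    if hash_mark:
        result += " #" + comment
    return result
```

Passing the example .clcfg line with TRUE in the fourth field returns the line shown above with FALSE in its place.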


How to Test Online Diagnostics

1. Hardware monitoring requires three daemons: diagmond, diaglogd, and memlogd. Check with the ps -ef command.

2. List all currently active HW monitors: ps -ef | grep stm

3. Run /etc/opt/resmon/lbin/monconfig to (C)heck detailed monitoring status. The initial screen should show event monitoring enabled.

4. To test by sending a test event through EMS, use the send_test_event command: /opt/resmon/bin/send_test_event -v monitor_name
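Step 1 can be automated by scanning saved ps -ef output for the three daemon names. The Python sketch below takes the output as a string, so how the output is captured is left open. The function name missing_daemons is hypothetical, and matching on the basename of any token in a line is a simplification that would also match a line such as "grep diagmond".

```python
import os

# The three daemons that hardware monitoring requires.
REQUIRED_DAEMONS = ("diagmond", "diaglogd", "memlogd")


def missing_daemons(ps_output):
    """Return the required daemons that do not appear in `ps -ef` output.

    A daemon counts as running if any whitespace-separated token on any
    line has it as its basename (so a full path to the binary matches).
    """
    seen = {
        os.path.basename(token)
        for line in ps_output.splitlines()
        for token in line.split()
    }
    return [daemon for daemon in REQUIRED_DAEMONS if daemon not in seen]
```

An empty result means all three daemons were found; any names returned are the ones to investigate before testing further.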


How to Completely Disable EMS

If EMS needs to be disabled completely for a time, perform the following steps in order:

1. Run monconfig and select (K)ill (disable) monitoring
2. Edit /etc/inittab using vi and comment out the 4 lines labeled ems1, ems2, ems3 and ems4
3. Reread /etc/inittab by running init q
4. Change EMS_ENABLED to 0 in /etc/rc.config.d/ems
5. Change AUTOSTART_EMSAGT to 0 in /etc/rc.config.d/emsagtconf
6. Kill emsagent and p_client processes (if still running)
7. Verify monitors stopped using ps -ef|grep stm

To re-enable EMS, reverse the steps above and then run:
/sbin/init.d/ems start
/sbin/init.d/emsa start
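Step 2, commenting out the four ems lines in /etc/inittab, can be pictured as a text transformation. The Python sketch below works on the file contents as a string and assumes the usual inittab layout in which the label is the text before the first colon. The function name comment_out_ems is hypothetical, and the sketch does not write to /etc/inittab or run init q.

```python
# Labels of the four inittab entries to comment out.
EMS_LABELS = {"ems1", "ems2", "ems3", "ems4"}


def comment_out_ems(inittab_text):
    """Return inittab text with the ems1..ems4 entries commented out.

    Lines that are already commented out, and all other entries, are
    returned unchanged.
    """
    result = []
    for line in inittab_text.splitlines():
        label = line.split(":", 1)[0].strip()
        if label in EMS_LABELS:
            line = "#" + line
        result.append(line)
    trailing_newline = "\n" if inittab_text.endswith("\n") else ""
    return "\n".join(result) + trailing_newline
```

Reviewing the transformed text before replacing the real file keeps a mistake from leaving the system with a damaged inittab.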

Q & A

Thanks