SplunkLive! Brisbane: Getting Started with IT Service Intelligence
TRANSCRIPT
Setup Before You Can Play
Download this presentation slide deck: https://splunk.box.com/splunkliveitsi16
Follow the instructions on your paper hand-out to log in to your VM.
Please log in as either
• [email protected]
• [email protected]
• Password is “Changeme1” or “Changeme2”
After logging in, select IT Service Intelligence from the list of apps at the left
What is a Service?
[Diagram: a Service box with Requests and Responses]
In ITSI, a Service is a logical group of technology components that a user deems need to be monitored together.
It can often be generalized as a “black box” to which we send requests and from which we expect responses.
What is a Service?
[Diagram: Technical Services (DNS, Auth, Web), each with Requests and Responses]
Services can be lower level (technical)…
What is a Service?
[Diagram: Technical Services (DNS, Auth, Web) and Business Services (Customer Transactions, Support Desk), each with Requests and Responses]
Services can also be higher level (business)…
What is a Service?
[Diagram: Packet Network, Hypervisor and Hosts, RBMDBs, Storage Tier, API Services, Web Services, Customer Transactions, Mobile, API/Middleware, Partner Portal, DNS]
Services can encompass multiple tiers of the IT domain. Services may also depend upon other services.
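The idea that services may depend upon other services can be sketched as a small dependency graph. This is only an illustration: the edges in DEPENDS_ON are hypothetical (the slide does not say which service depends on which), and all_dependencies is a helper name made up for this example, not an ITSI function.

```python
# Hypothetical dependency edges between services named on the slide.
# The real relationships are whatever you model in your own environment.
DEPENDS_ON = {
    "Customer Transactions": ["Web Services"],
    "Web Services": ["API Services", "DNS"],
    "API Services": ["Storage Tier"],
    "Storage Tier": [],
    "DNS": [],
}


def all_dependencies(service, graph):
    """Return every service that `service` depends on, directly or indirectly."""
    seen = set()
    stack = list(graph.get(service, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            # Follow the dependencies of this dependency as well.
            stack.extend(graph.get(current, []))
    return seen
```

Calling all_dependencies("Customer Transactions", DEPENDS_ON) walks the whole chain, so trouble in the Storage Tier shows up as a possible cause for a business-level service.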
What is a KPI?
DNS (Requests, Responses)
• KPI: Number of requests
• KPI: Error rate
• KPI: Average response time
• KPI: Server CPU load
• KPI: Server network I/F errors
Customer Transactions (Requests, Responses)
• KPI: Number of transactions
• KPI: Error rate
• KPI: Average response time
• KPI: Count of Incident Tickets
• KPI: Synthetic Transx Health
KPIs and Health scores constitute the means by which Services are monitored.
Key Performance Indicators (KPIs)
A Key Performance Indicator (KPI) is a Splunk saved search created within the ITSI UI that helps monitor a specific field like CPU, Memory, Number of Errors, and so on. KPIs are contained within Services.
Service Health Scores
A Health score is a score from 0-100 (0 being critical and 100 being normal) that helps determine the health of a Service. It is calculated once every minute, based on the importance and status (e.g. green, orange, red) of all of the Service's KPIs.
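To make the "importance and status" idea concrete, here is a deliberately simplified sketch: an importance-weighted average of per-status scores. This is not ITSI's actual formula. The STATUS_SCORE values and the health_score name are assumptions chosen only for illustration.

```python
# Assumed mapping from KPI status to a 0-100 score; ITSI's real values may differ.
STATUS_SCORE = {"green": 100, "orange": 50, "red": 0}


def health_score(kpis):
    """Importance-weighted average of KPI status scores.

    kpis: list of (status, importance) pairs, importance being a non-negative weight.
    """
    total_weight = sum(importance for _, importance in kpis)
    if total_weight == 0:
        raise ValueError("at least one KPI must have non-zero importance")
    weighted = sum(STATUS_SCORE[status] * importance for status, importance in kpis)
    return weighted / total_weight
```

With this sketch, a high-importance KPI going red pulls the score down much further than a low-importance one, which is the behavior the slide describes.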
Let’s Talk Entities
● Entities are the relevant components that support a service (often but not always hosts)
● Select the correct entities with filters, ANDs, ORs
● Entity list can come from a CMDB, a spreadsheet, a Splunk search…
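Selecting entities with ANDs and ORs can be pictured as follows. This is a sketch outside of ITSI: the entity records, the field names (host, role, site) and the select_entities helper are all hypothetical. Conditions inside one rule group are ANDed, and the groups are ORed together.

```python
# Hypothetical entity list, e.g. as it might be exported from a CMDB or spreadsheet.
ENTITIES = [
    {"host": "web-01", "role": "web", "site": "bne"},
    {"host": "web-02", "role": "web", "site": "syd"},
    {"host": "db-01", "role": "db", "site": "bne"},
]


def select_entities(entities, rule_groups):
    """Keep entities matching ANY group, where a group requires ALL its field=value pairs."""
    return [
        entity
        for entity in entities
        if any(
            all(entity.get(field) == value for field, value in group.items())
            for group in rule_groups
        )
    ]
```

For example, the rule groups [{"role": "web", "site": "bne"}, {"role": "db"}] read as "(role=web AND site=bne) OR role=db".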
Service Decomposition in ITSI
Identify the process flow and underlying sub-services (Web -> Middleware -> DB -> Middleware -> Web)
Service Decomposition in ITSI
For each sub-service, identify KPIs that will show health and status (Requests, response time, errors, OS health…)
New Requirements!
● Create a new KPI for the DB Service:
● Network Utilization
● Modify the Executive Glass Table in order to show off the services you slave over
“WE only have about 15 min TO DO WHAT???!!???”
Think about how long this would take you today?
A KPI in 5 minutes? Absolutely!
Click New – Generic KPI
Select Data Model
● Host Operating System
● Network
● # bytes
● Next
KPIs Continued…
Splunk Builds Searches for you – Oh Yeah, that’s happening :)
● Select Yes for Split by & Filter options
● Select host for Entity Lookup & Alias options
● Click Next
Almost There…
Select
● KPI Search Schedule: Every Minute
● Entity Calculation: Average
● Service/Agg Calculation: Average
● Calculation Window: Last Minute
● Click Next
● Unit: Bps
● Click Next
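The two Average settings above can be read as a two-stage calculation over the last minute of data: first one value per entity, then one value for the whole service. The sketch below shows that reading in plain Python. The kpi_values name and the sample numbers are made up, and whether ITSI computes the aggregate exactly this way is an assumption here.

```python
from statistics import mean


def kpi_values(samples_by_entity):
    """Return (per-entity averages, service-level average of those averages).

    samples_by_entity: {host: [bytes-per-second samples seen in the window]}
    Entities with no samples in the window are skipped.
    """
    per_entity = {
        host: mean(samples)
        for host, samples in samples_by_entity.items()
        if samples
    }
    aggregate = mean(list(per_entity.values())) if per_entity else None
    return per_entity, aggregate
```

The per-entity values are what Per Entity thresholds would be compared against, and the aggregate is what the Aggregate thresholds would use.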
Final Steps…
Set your thresholds:
● Aggregate (All)
● Per Entity
● Click “Add Threshold” TWICE
● Make the Neapolitan ice cream colors: Yellow, Green, Yellow
● Drag the sliders around in order to get the current data graph entirely inside the Green (normal) band
● Click Finish
● Other options are also available, including adaptive thresholds and anomaly detection
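Adding two thresholds splits the value range into three bands, which is where the Yellow, Green, Yellow pattern comes from. A minimal sketch of that lookup is shown below. The severity helper and the boundary values are hypothetical, and treating a value that sits exactly on a boundary as part of the upper band is a choice made for this example.

```python
from bisect import bisect_right


def severity(value, boundaries, labels):
    """Map a KPI value to the label of the band it falls in.

    boundaries: ascending threshold values, e.g. [100, 900]
    labels: one label per band, so exactly len(boundaries) + 1 entries
    """
    if len(labels) != len(boundaries) + 1:
        raise ValueError("need exactly one more label than boundaries")
    return labels[bisect_right(boundaries, value)]
```

With boundaries [100, 900] and labels ["yellow", "green", "yellow"], both unusually low and unusually high network utilization are flagged, while everything in between counts as normal.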
Anomaly Detection
● Machine Learning
● Works well for data with patterns
● Requires some “training” (trial & error) to zero in on best sensitivity
Name that KPI!
From the list of KPIs, select your new one (at the bottom)
● Click on the little pencil next to the name
● Call it “Network Utilization”, with your username up front
● Click on Save at bottom right when finished!
Clone the Glass Table
Return to Saved Glass Tables page (click on Glass Tables in the upper menu bar)
Click Edit for “Buttercup Games Business Process (IN PROGRESS)”
• Select Clone
• Title: Add your username to the front
• Permissions: Shared in App
• Click Clone Page
• Click on your new Glass Table from the list, to view it
Edit & Have Fun!
Click on Edit in the upper right corner of your Glass Table
Use the “Services” panel on the left to select Individual KPIs, or Aggregate Service Health Scores
• Choose 2 KPIs from Online Store that would be useful in the “Order Process” section
• Drag the selected widgets onto the canvas, positioning in the gray oval
• What’s the difference between the [icon] and [icon] tools at the top left?
More Fun with the Glass Table Editor…
Use the Configurations panel on the right to edit a selected widget
• Can change the visualization type, drilldown behavior, and other settings
• You should hit Save frequently
• I wonder what Auto Layout does?
• (YIKES!) Revert All Changes might be helpful
Finishing up…
• Add a Service Health Score widget for Online Store under Buttercup
• Choose a Viz Type with a sparkline graph, then resize to make it look pretty
• Modify the Custom Drilldown action to go to the saved glass table, Buttercup Games Online Store
• Bonus Points: Make the label bigger, more readable
• Click Save
• View when done
A Troubleshooting Exercise
Let’s use ITSI to troubleshoot an outage
● Start at your Glass Table, “<UserName> Buttercup Business Process”
● Customer Care reports that unhappy customers are complaining of failures and long delays when trying to purchase
● The calls began coming in at around ten minutes after the hour.
● In the upper right corner of the Glass Table, change the time picker from Now to XX:10:00.0, where XX is the appropriate hour. For example, if it is currently 14:05, set the time picker to 13:10:00.0, then Apply
● This is how we can “time travel” back to see conditions at a particular outage – oh yeah!
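If "the appropriate hour" is taken to mean the most recent ten-past-the-hour that has already happened (an assumption, though it matches the 14:05 to 13:10 example), the value to type into the time picker can be worked out like this. The last_ten_past name is made up for this sketch.

```python
from datetime import datetime, timedelta


def last_ten_past(now):
    """Return the most recent XX:10:00.0 that is not later than `now`."""
    candidate = now.replace(minute=10, second=0, microsecond=0)
    if candidate > now:
        # Ten past this hour has not happened yet, so go back one hour.
        candidate -= timedelta(hours=1)
    return candidate
```

Using timedelta rather than subtracting from the hour field directly keeps the result correct just after midnight, when the answer falls on the previous day.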
A Troubleshooting Exercise, cont’d
● The Online Store seems to be degraded, just as Customer Care reported. Click on the widget under Buttercup to drill down further
A Troubleshooting Exercise, cont’d
● The Online Store Glass Table shows a much more detailed view, including the impacted customer-facing KPIs at the far left (Revenue, etc.)
● Based on this view of all the relevant services, where do you think the root cause lies?
● Which service should we troubleshoot first?
● Click on the Health widget for that service, to drill down to a Deep Dive
Deep Dive
● Deep Dive shows multiple KPIs and Health Scores in parallel “swim lanes”.
● The Health Score for this Service is the top swim lane. Can you see when it begins to degrade from 100%?
● Mousing over this point in time, can you spot the KPI with the leading fault indication, i.e., what failed first?
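Spotting "what failed first" by eye amounts to finding, across all swim lanes, the earliest point at which any KPI left its normal state. The sketch below does the same thing on made-up data. The lane structure, the "normal" status label and the first_to_fail helper are assumptions for illustration and are not part of ITSI.

```python
def first_to_fail(lanes):
    """Return (kpi_name, minute) for the earliest non-normal sample, or None.

    lanes: {kpi_name: [(minute, status), ...]} with each list sorted by minute.
    """
    earliest = None
    for name, samples in lanes.items():
        for minute, status in samples:
            if status != "normal":
                if earliest is None or minute < earliest[1]:
                    earliest = (name, minute)
                break  # only the first degradation in each lane matters
    return earliest
```

The KPI returned here is the leading indicator. Lanes that degrade a few minutes later are more likely to be symptoms than causes.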
Multi-KPI Alerts and Notable Events
● Click on Notable Events Review
● Multiple KPIs and Health scores can be combined in sophisticated ways to create Multi-KPI alerts
● When a Multi-KPI alert fires, one of the outcomes is the creation of a Notable Event
● Notable Events allow NOC personnel and others to triage and coordinate event management efforts
Service Analyzer
● Click on Service Analyzer > Default Service Analyzer
● Back where we started!
● This view shows a “no-frills” list of services (top) and hottest KPIs (bottom)
● Provides a quick jumping-off point into Deep Dives and the Notable Events Review
● It is useful for NOCs and others who need a high-level situational view
Wrap Up - Review
● High-value services can be decomposed and modeled in ITSI, using machine data from the relevant systems
● Services and KPIs can be created in minutes, with sophisticated thresholding techniques to distinguish “normal” from “not normal”
● Glass Tables allow service health and KPI metrics to be displayed in a way that makes sense to specific groups, such as Executive Leadership, Business Service Owners, the NOC, DevOps & Others
● Deep Dives allow KPIs to be compared side-by-side across any time range, accelerating root cause analysis and significantly reducing MTTR
● Multi-KPI Alerts and Notable Events reduce alert noise, producing actionable events and a means to manage them
● …and it’s fun to build!
Want to explore on your own?
Sign up for your very own seven-day free sandbox! http://splunk.com/ITSI
Then click:
You’ll find a Sandbox Guide in the Dashboards! In the ITSI app of your sandbox, go to Search > Dashboards > ITSI Sandbox Guide
Northern Cal Tech Talks!
Monthly WebEx Sessions
• Ted Talk style presentation
• Q&A Chat forum
So what’s next on the agenda?
• April 20th @ 10AM PST - Top 5 most useful search commands.
• May 18th @ 10AM PST – Splunk for IT Service Intelligence
See more at: http://live.splunk.com/NorCalTechTalks
SEPT 26-29, 2016
WALT DISNEY WORLD, ORLANDO
SWAN AND DOLPHIN RESORTS
• 5000+ IT & Business Professionals
• 3 days of technical content
• 165+ sessions
• 80+ Customer Speakers
• 35+ Apps in Splunk Apps Showcase
• 75+ Technology Partners
• 1:1 networking: Ask The Experts and Security Experts, Birds of a Feather and Chalk Talks
• NEW hands-on labs!
• Expanded show floor, Dashboards Control Room & Clinic, and MORE!
The 7th Annual Splunk Worldwide Users’ Conference
PLUS Splunk University
• Three days: Sept 24-26, 2016
• Get Splunk Certified for FREE!
• Get CPE credits for CISSP, CAP, SSCP
• Save thousands on Splunk education!
A flying start to Service Intelligence
Start With A problem worth solving
Collaborate with Subject Matter Experts
Design Before Configuring
Sign Up Here - We’re Here To Help!
Harness the creativity and domain knowledge of your organization to unlock the value of data and solve an important service problem through a joint service intelligence workshop with key stakeholders
Define methods for:
• Proactive service monitoring
• Reduced risk and failures
• Faster issue resolution
• Increased business performance
What is it?
• 1 Day Onsite Workshop
• Tightly linked with value
• Collaborative approach
• Build your own Splunk ITSI Glass Table……