GEROS – Global Meta-Evaluation Report 2014
Final Report
September 2015
Universalia – 245 Victoria Avenue, Suite 200, Westmount, Quebec, Canada H3Z 2M6 – www.universalia.com
Executive Summary
The purpose of UNICEF's Global Evaluation Reports Oversight System (GEROS) is to ensure that UNICEF's evaluations uphold the high quality standards set for them. The GEROS review process and the information provided in annual meta-evaluation reports help UNICEF to monitor its progress, identify its strengths and become aware of its areas for improvement with regard to evaluation.
The four main objectives of GEROS are to: (i) provide senior managers with a clear and independent assessment of the quality and usefulness of evaluation reports; (ii) strengthen internal evaluation capacity, through practical feedback on how to improve future evaluations; (iii) contribute to corporate knowledge management and organisational learning, making good quality evaluations available; and (iv) report to the Executive Board on the quality of evaluation reports.
Each of UNICEF's regional offices is responsible for submitting completed evaluations to the GEROS process. These include a diverse range of evaluations which differ in terms of thematic area, geographic coverage, management approach, and other elements.
Methodology
Reports completed during the 2014 GEROS review cycle were reviewed over a five-month period (mid-January to early June 2015). In total, 69 reports were analysed using the GEROS assessment tool, which contains 58 standards for evaluation reports established by UNICEF. In the GEROS review process, each question and section is given a rating based on a colour-coded four-point performance scorecard that goes from Outstanding/Best Practice to Unsatisfactory. For each of the section and sub-section ratings, a narrative justification is also provided and supported by evidence from the report (e.g. examples and page or section numbers).
This meta-analysis report presents the analysis of aggregate ratings from the 2014 review cycle. Reports with a rating of Highly Satisfactory or Outstanding/Best Practice are considered to be "quality" reports. Data from recent GEROS cycles (i.e. from 2010 to 2013) has also been included for comparison purposes.
Overarching conclusions, recommendations and lessons learned are presented below.
Profile of reports reviewed in 2014
Most reports were submitted by the Central and Eastern European/Commonwealth of Independent States Regional Office (CEE/CIS), the South Asia Regional Office (ROSA), and the East and Southern Africa Regional Office (ESARO). The East Asia and Pacific Regional Office (EAPRO), the Latin America and Caribbean Regional Office (LACRO), and the Middle East and North Africa Regional Office (MENARO) submitted far fewer reports than in 2013, with five or fewer reports per regional office in 2014.
Most reports presented the results of either programme or project evaluations, and all evaluations reviewed this year were carried out by independent external evaluators.
Conclusions
The quality of UNICEF's evaluation reports continued to increase through 2014, but only moderately. The upward trend in report quality continued for another year, as good quality ratings reached 74% (from 69% in 2013 and 62% in 2012). This indicates that most reports do an adequate job of meeting GEROS standards.
National-level evaluation reports increased in quality compared to previous years (from 50% in 2013 to 63% in 2014). Project-level evaluations increased significantly in quality when compared to 2013 (from 58% in 2013 to 84% in 2014). In terms of quality by thematic area, WASH evaluations had the lowest quality rating, at only 29%. Education, on the other hand, had the highest quality rating, at 82%. In terms of evaluation management, i.e. control and oversight of evaluation decisions, the quality of UNICEF-managed evaluations continued to improve (from 74% in 2013 to 80% in 2014). In 2014, the highest rated sections of evaluation reports were "Methodology", with 71% of quality ratings, and "Findings and Conclusions", with 74% of quality ratings. The lowest rated section of evaluation reports was "Recommendations and lessons learned", with only 52% of quality ratings.
The contribution of reports to the GEROS process by some of UNICEF's regional and country offices appears to be dropping. In 2014, the number of reports reviewed dropped considerably, decreasing from 96 to 69. This drop reflects fewer submissions from WCARO, LACRO, ESARO, and MENARO. There are several factors that may affect the number of evaluation reports submitted in a given GEROS cycle, and no firm conclusions can be drawn to explain this decrease. In some cases, fewer evaluations may be conducted in one year compared to another. This is sometimes related to low implementation rates, in which projects or programmes have not progressed enough to justify an evaluation. Evaluations may be planned but not all executed, due to issues related to budget, lack of priority, or staff turnover. Finally, in some cases low quality evaluations may be reclassified as reviews and subsequently not submitted to GEROS. It is unclear which of these factors had the greatest effect on the lower number of evaluation reports in 2014.
Reports continue to demonstrate shortcomings similar to those found in 2013. While good quality ratings for most of the GEROS standards improved, some standards remained particularly weak. The description of the evaluated object's theory of change, key stakeholder contributions, and ethical considerations and safeguards was often missing or incomplete. These gaps can be easily addressed by systematically requiring evaluators to include these elements. In other cases, a significant improvement will require different approaches to an evaluation, such as conducting cost analysis or better incorporating gender and human rights from start to finish of an evaluation assignment, including in the design of the Terms of Reference. Reports were also frequently missing a clear rationale for the use of the identified OECD/DAC evaluation criteria, recommendations that are targeted to specific stakeholders, and clear lessons learned. It would also appear that many evaluators and/or evaluation managers do not fully understand UNICEF's definition of or expectations around lessons learned, such as the need to have lessons that can be generalised to indicate wider relevance to other contexts and situations.
Recommendations
UNICEF should examine whether the increase in quality of evaluation reports has resulted in senior managers having greater confidence in evaluation reports.
Given that evaluation quality appears to be growing, efforts should be made to analyse how the GEROS system contributes to the increased use and value of UNICEF's evaluation function as a whole. More specifically, UNICEF should determine whether, as a result of the increased reporting quality noted, senior managers now have greater confidence in the reports produced and whether the information obtained is usable.
Within its decentralised evaluation strategy, UNICEF should continue to build its own regional/country office evaluation capacities and national capacities to conduct relevant types of evaluations.
While good foundations have been established, there is still a need to ensure that targeted, focused support is provided to build UNICEF regional and country office capacity and national evaluation capacities. This capacity support should take into account types of evaluations that are currently not frequently conducted, including: evaluations of strategies, country-level evaluations, and country-led evaluations.
Special efforts should be made to strengthen certain aspects of evaluation reports that have been consistently weak in the past few years.
Over the past few GEROS cycles, performance on some evaluation components has been consistently weak (e.g. theory of change, cost analysis, identification of lessons learned). In order to address these weaknesses, UNICEF should focus further efforts and attention on improved training and guidance around these standards in particular.
UNICEF should continue to update and systematically communicate its requirements for evaluation reports across its entire evaluation oversight/management system. These updates should take into account evolving standards for evaluation in the UN System.
UNICEF should ensure that its standards and criteria are clear and up-to-date across all evaluation management levels. This includes reviewing and clarifying the GEROS framework and standards (which UNICEF does periodically) as well as the systematic integration of evaluation priorities and standards in all TORs. This should also apply to the UN-SWAP requirements for gender-responsive evaluation, which have not been disseminated widely.
As part of the periodic review of GEROS, UNICEF should consider revising the rating scale and several elements of the GEROS template in order to ensure greater precision in the messages that are provided about evaluation quality and the characteristics of evaluation reports, and to create more efficiency in applying the template.
For the 2012 cycle, after three years of implementing GEROS, UNICEF changed the rating scale. The rating of "Almost Satisfactory" was changed to "Mostly Satisfactory", while the traffic light colour (yellow) associated with the rating remained the same. The change created a larger gap between the "Unsatisfactory" and "Mostly Satisfactory" ratings, which creates challenges in judging how to rate particular questions or standards in the template. Appendix III provides further comments on the scale and identifies other areas where greater clarity or efficiency can be introduced in applying the evaluation report classification and quality standards, as described in the GEROS template.
Lessons Learned
Clear and systematic communication of evaluation standards and priorities favours the effective alignment of evaluations with UNICEF standards from the outset (i.e. the TORs stage).
In order to ensure that evaluations respond to UNICEF's expectations and priorities, evaluation TORs should clearly identify the standards by which reports will be judged in the GEROS process.
While common standards help improve evaluation quality, quality assurance systems such as GEROS should provide sufficient flexibility to account for different types of evaluations.
A balance should be found between common standards and flexibility, to take into account the different types of evaluations produced each year within UNICEF. Currently, the GEROS template does not always adapt easily to specific types of evaluation reports (e.g. impact evaluations, or separately published case study evaluations which have been completed as part of a broader evaluation).
In a decentralised system, compliance with quality assurance systems such as GEROS is affected by incentives, available resources, and the perception of relevance.
For full participation in a process such as GEROS, the relevance of the process must be clear to the UNICEF regional stakeholders. Part of maintaining relevance is ensuring that the system is continuously updated to reflect changing corporate and UN-system expectations and priorities. In addition, UNICEF regions should have equitable access to the resources (human and other) to support quality enhancement to evaluations in their regions.
Quality assurance systems such as GEROS need to strike a balance between consistent application over a period of time (which allows for comparison) and making major adjustments in order to improve utility and reflect changes in the environment.
Updating and changing the GEROS template every year is not necessary and would be time-consuming. It is, however, necessary to have a formal and consultative review every two to three years in order to keep the process relevant and to update the GEROS template as required in light of new developments or requirements.
Acronyms
CEE/CIS  Central and Eastern Europe/Commonwealth of Independent States (Regional Office)
COs  Country Offices
EAPRO  East Asia and Pacific Regional Office
EO  Evaluation Office
ESARO  East and Southern Africa Regional Office
GEROS  Global Evaluation Report Oversight System
HQ  Headquarters
HRBAP  Human Rights Based Approach to Programming
LACRO  Latin America and Caribbean Regional Office
M&E  Monitoring and Evaluation
MENARO  Middle East and North Africa Regional Office
MTSP  Medium Term Strategic Plan
N/A  Not Applicable
OECD/DAC  Organisation for Economic Co-operation and Development/Development Assistance Committee
RBM  Results-based Management
ROs  Regional Offices
ROSA  Regional Office of South Asia
RTE  Real-time evaluation
SPOA  Strategic Plan Objective Area
SWAP  System-wide Action Plan
ToC  Theory of Change
TORs  Terms of Reference
UN  United Nations
UNDAF  United Nations Development Assistance Framework
UNEG  United Nations Evaluation Group
UNICEF  United Nations Children's Fund
WASH  Water, Sanitation and Hygiene
WCARO  West and Central Africa Regional Office
Contents
1 Introduction
2 GEROS Background
3 Purpose, Objectives and Scope of GEROS
4 Methodology
4.1 Review of Evaluation Reports
4.2 Meta-Evaluation
4.3 Changes Made From Previous Years
4.4 Limitations
5 Findings
5.1 Overall Ratings and Feedback
5.2 Overall Regional Trends
5.3 Trends by Type and Scope of Evaluation
5.4 Trends by Quality Assessment Category
5.4.1 Overall Trends All Sections
5.4.2 Section A: Object of the Evaluation
5.4.3 Section B: Evaluation Purpose, Objectives and Scope
5.4.4 Section C: Evaluation Methodology, Gender, Human Rights and Equity
5.4.5 Section D: Findings and Conclusions
5.4.6 Section E: Recommendations and Lessons Learned
5.4.7 Section F: Report Structure, Logic and Clarity
5.4.8 Coherence
5.4.9 Addressing the gaps in Evaluation Reports
5.5 Examples of Good Practices
6 Conclusions
7 Recommendations
8 Lessons Learned
Exhibits
Exhibit 4.1 Performance Scorecard
Exhibit 5.1 Reports Reviewed per Region per Year
Exhibit 5.2 Good Quality Evaluation Reports over Time (2009-2014)
Exhibit 5.3 Overall Ratings for 2014
Exhibit 5.4 Proportion of Good Quality Reports per Region
Exhibit 5.5 Proportion of Reports Reviewed and Good Quality Reports by Purpose (2014)
Exhibit 5.6 Changes in MTSP and SPOA Correspondence categories
Exhibit 5.7 Proportion of Reports Reviewed and Good Quality Reports by SPOA Correspondence
Exhibit 5.8 Proportion of Reports by Level of Independence (2009-2014)
Exhibit 5.9 Good Quality Ratings per Section – Year by Year Progression (2011-2014)
Exhibit 5.10 Section A Ratings and Comparison (2011-2014)
Exhibit 5.11 Sub-Section Ratings: Object of the Evaluation
Exhibit 5.12 Section B Ratings and Comparison (2011-2014)
Exhibit 5.13 Sub-Section Ratings: Purpose, Objectives and Scope of the Evaluation
Exhibit 5.14 Section C Ratings and Comparison (2011-2014)
Exhibit 5.15 Sub-Section Ratings: Methodology, Gender, Human Rights, and Equity
Exhibit 5.16 Inclusion of Human Rights, Gender, and Equity: Good Quality Ratings Year by Year
Exhibit 5.17 Section D Ratings and Comparison (2011-2014)
Exhibit 5.18 Sub-Section Ratings: Findings and Conclusions
Exhibit 5.19 Section E Ratings and Comparison (2011-2014)
Exhibit 5.20 Sub-Section Ratings: Recommendations and Lessons Learned
Exhibit 5.21 Section F Ratings and Comparison (2011-2014)
Exhibit 5.22 Sub-Section Ratings: Report Structure, Logic and Clarity
Appendices
Appendix I List of Findings
Appendix II List of Recommendations
Appendix III Detailed Comments on GEROS Framework
Appendix IV Terms of Reference
Appendix V List of Reports Assessed
Appendix VI GEROS Assessment Tool
Appendix VII Regional Ratings Graphs
Appendix VIII UNICEF, Country-led and Jointly Managed Evaluations by Region
Appendix IX Overall Ratings Graphs by Report Typology
Appendix X Sub-Section Ratings
Appendix XI Analysis Table for 2014 Reports
Appendix XII Clarification of Criteria for Completing GEROS Reviews
Appendix XIII Special Cases in the 2014 GEROS Exercise
1 Introduction
The Universalia Management Group Limited (Universalia) is pleased to submit to UNICEF the final Global Meta-Evaluation Report on 2014 evaluation reports. As part of the Global Evaluation Reports Oversight System (GEROS), the report aims to highlight important trends, weaknesses, strengths, lessons learned and good practices drawn from our review of the 69 final evaluation reports submitted in the 2014 cycle. Additionally, the GEROS process and template are analysed, and possible improvements are suggested in order to foster continuous learning within UNICEF.
In this report, quantitative analyses of the documents reviewed and average ratings are complemented by qualitative assessments of strengths and areas for improvement. Good practices are identified and suggestions for improvements are provided, where relevant. The main conclusions, recommendations and lessons learned derived from this analysis are presented toward the end of the report.
The report is structured as follows:
Section 2 – GEROS Background
Section 3 – Purpose, Objectives and Scope of GEROS
Section 4 – Methodology
Section 5 – Findings
Section 6 – Conclusions
Section 7 – Recommendations
Section 8 – Lessons Learned
The Appendices are composed of important supporting documentation, notably the Terms of Reference (Appendix IV), the GEROS assessment tool (Appendix VI) and additional graphs showing key trends (Appendices VII to XI).
2 GEROS Background
The 2013 Evaluation Policy defines evaluation as a shared function within UNICEF, with key roles distributed across senior leaders and oversight bodies, heads of offices, technical evaluation staff and sectoral programme staff. UNICEF's Evaluation Office (EO) and Regional Offices (ROs) collaborate in order to strengthen the organisation's evaluation function. Both levels play quality assurance roles, aiming to ensure that the evaluations managed or commissioned by UNICEF uphold the high quality standards set for them.
Because UNICEF is decentralised in nature, its evaluations are generally commissioned and managed at the Country Office level. On one hand, such an arrangement helps ensure that report analyses remain highly focused on the national context; on the other, this decentralised system makes it difficult to maintain uniform quality, high credibility and utility of the evaluations produced organisation-wide. Recognising that M&E functions and practices tend to vary across regions and Country Offices, UNICEF created GEROS as a tool to assess the quality of evaluations and inform further development of the organisation's evaluation function. After six years of implementation, some small adjustments have been made (see Section 4.3), but the template used to assess 2014 reports is virtually identical to the previous year's. The template has been applied in a more flexible way to impact evaluations.
The GEROS process and annual meta-evaluations provide an important source of information for UNICEF to monitor its progress, identify its strengths and become aware of its areas for improvement with regard to evaluations conducted around the world on its behalf.
3 Purpose, Objectives and Scope of GEROS
The four main objectives of GEROS are as follows:
1) To provide senior managers in UNICEF HQ, Regional Offices and Country Offices with a clear and succinct independent assessment of individual evaluation reports;
2) To strengthen internal evaluation capacity by providing commissioning offices with practical recommendations to improve future evaluations and to inform their own assessment of the performance of external consultants who might be hired for future evaluations;
3) To report on the quality of evaluation reports through a review and assessment process whose results are communicated to senior management and included in the Global Evaluation Database; and
4) To contribute to the EO's corporate knowledge management and organisational learning, by providing the evidence base for a meta-analysis of good quality reports.
These objectives constitute the foundation of the GEROS process, an organisation-wide system that is strongly anchored in the United Nations Evaluation Group (UNEG) standards and other UNICEF-specific standards, notably the importance of considering gender-responsiveness, equity and human rights-based approaches in the reporting process. Country Offices (COs) and ROs can also take part in additional quality assurance mechanisms that are designed to assess and improve the quality of Terms of Reference (TORs), inception reports and draft evaluation reports before they are submitted to the final evaluation review process.
In order to ensure the integrity and reliability of the process, an independent consulting firm conducts the GEROS reviews and produces the meta-analysis each year. The assessment criteria, laid out in a detailed assessment tool (Appendix VI), are the basis for ensuring in-depth analysis of the content and quality of the reports submitted through COs, ROs and HQ.
4 Methodology
For 2014,1 as in previous cycles, the GEROS review process was conducted in three main phases. The process was initiated with a review-team briefing in which the GEROS background and objectives were explained, the assessment tool was presented, key considerations and clarifications were provided, and lessons learned from the 2013 cycle were shared. The briefing was followed by the methodical assessment of each UNICEF evaluation submitted in 2014, and the data obtained was analysed in order to produce the meta-evaluation. The methodology for the latter two phases is described in the following sections.
4.1 Review of Evaluation Reports
To ensure the validity, credibility and uniformity of the assessment, the reviews were themselves conducted in two phases: first, each report was systematically evaluated according to the GEROS assessment tool; second, the reviews underwent a quality assurance process, both internally and with UNICEF.
Review Process and Evaluation Tool
The Universalia review team was granted a five-month period to review the evaluation reports uploaded to the UNICEF Global Evaluation Database for 2014. Though UNICEF's EO first screened the reports to ensure that submissions were in fact evaluations,2 an additional screening took place internally upon beginning each review: the top section of the assessment tool serves to categorise each report according to its geographic coverage, managerial oversight, purpose, level of change sought, Strategic Plan Objective Area (SPOA) correspondence, level of independence, and approach (i.e. summative, formative or both). In the end, all evaluation reports uploaded to the database in 2014, for a total of 69 reports,3 were reviewed from mid-January to early June 2015 (for a full list, see Appendix V).
1 The review process begins immediately after all reports are submitted for a given year. Thus, reports dated 2014 are reviewed in the 2015 calendar year, but are considered part of the 2014 review cycle. As noted in Section 1, some 2015 evaluations were included in the 2014 report population.
2 Because the Global Evaluation Database contains surveys, research and baseline studies of different types, in addition to evaluations, this screening process helped ensure that only the appropriate reports were included in the GEROS process. UNICEF defines an evaluation as a "judgment [on] the relevance, appropriateness, effectiveness, efficiency, impact and sustainability of development efforts, based on agreed criteria and benchmarks among key partners and stakeholders. It involves a rigorous, systematic and objective process in the design, analysis and interpretation of information to answer specific questions. It provides assessments of what works and why, highlights intended and unintended results, and provides strategic lessons to guide decision-makers and inform stakeholders" (UNICEF, 2011).
3 It is important to note that any evaluation conducted by the service provider for GEROS (Universalia Management Group) was rated by another external company; in 2014 there was only one such case.
The reports received were assigned to reviewers, where possible, according to thematic expertise (e.g. gender) and linguistic abilities (i.e. each review was completed in the same language as the report itself). The GEROS assessment tool used in the review process contains 58 questions derived from the UNICEF-adapted UNEG evaluation report standards.4 These questions are distributed across six main sections, as follows:
Object of the evaluation;
Evaluation purpose, objectives and scope;
Evaluation methodology, gender, and human rights;
Findings and conclusions;
Recommendations and lessons learned; and
Structure, logic, and clarity of the report.
These overarching sections contain 22 sub-sections structured around fundamental areas of focus (e.g. methodological robustness, soundness of findings, depth and value-added of the conclusions). In the GEROS review process, each question and section was given a rating based on a colour-coded four-point performance scorecard (as shown in Exhibit 4.1 below). For each rating and section, a narrative justification was provided and supported by evidence, examples and page/section numbers.
Exhibit 4.1 Performance Scorecard

Colour coding   Rating
Dark Green      Outstanding, Best Practice
Green           Highly Satisfactory
Amber           Mostly Satisfactory
Red             Unsatisfactory
White           Not Applicable
A qualitative approach was taken to each report assessment and served to generate targeted, relevant analysis that took into account each report's specificities and context, in addition to providing constructive feedback for future reports. As shown above, the sections and overall report received grades of "Outstanding, Best Practice", "Highly Satisfactory", "Mostly Satisfactory", or "Unsatisfactory". Though it was used sparingly, an "N/A" option was also available to account for special cases where certain questions were not relevant. In such instances, the reviewers were expected to explain why the nature of the report made integrating a given component impossible.
Next to the ratings, a narrative was provided to justify the grade attributed to the sub-sections and overarching sections. For its part, the "Constructive Feedback" column in each main section was used to highlight either areas for improvement (including examples and justification) or areas of great strength/best practice that should be maintained in future reports. After each main section, a few sentences summarising key ratings and comments were included under "Executive Feedback". (See Appendix VI for an overview of the scorecard.)
4 The components of the GEROS assessment tool are anchored in the eight UNICEF-adapted UNEG evaluation report standards, which consist of: (i) the report structure; (ii) the object of evaluation; (iii) the evaluation purpose, objective(s) and scope; (iv) the evaluation methodology; (v) findings; (vi) conclusions and lessons learned; (vii) recommendations; and (viii) gender and human rights, including child rights (Annex 2 of the Global Evaluation Reports Oversight System, December 2012).
Using this scorecard, the review process generated three key datasets for each report: report typology, ratings (quantitative in nature), and comments/feedback on reporting quality (qualitative in nature). These datasets formed the basis for systematic comparison across the reports submitted.
Quality Assurance Process
As in previous years, the GEROS process was entrusted to a dedicated team of reviewers and senior reviewers, a coordinator, and a project manager. The coordinator and project manager worked closely with a small number of reviewers and UNICEF to oversee the process, ensure systematic completion of reviews, and respond to client and reviewer questions and requests in a timely manner. As with the 2013 cycle, we worked with a small team of reviewers. While this greatly facilitated the process of providing quality assurance, the team did require more calendar time to complete the reviews. As in the past, however, different members' skills and expertise (e.g. evaluation experience, language skills, knowledge of UNICEF) were leveraged with a view to ensuring accuracy, depth, and consistency in both language and content. We sought to ensure consistency in ratings through three key steps:
Team briefings and discussions: The review team met at the start of the process in order to become familiar with the review template, requirements, and expectations of the process. Subsequent discussions served to standardise ratings, as well as iron out any uncertainties regarding special or unique cases.
Senior and Peer Reviews: Before reviews were submitted to UNICEF, the coordinator went through each one to ensure that all aspects had been completed. Following this, 20% of reports went through a peer review process, in which senior peer reviewers verified the depth, coherence, and quality of reviews according to the established expectations and objectives of the GEROS process. This proportion corresponds to that indicated in the GEROS guidance document. Special reports or reports flagged by UNICEF were reviewed by the more senior team members. In reviewing the work of their peers, team members had the opportunity to discuss and challenge ratings, ensure uniformity in the application of the template, correct inconsistencies and request further justification and supporting examples, as well as make any linguistic improvements needed.
UNICEF Feedback: The UNICEF Evaluation Office was an integral part of the quality assurance process, verifying each review to ensure consistency in the ratings and the quality of the narrative content/language. In a few cases, reviews were returned so that the team could further elaborate on or support ratings and explanations.
4.2 Meta-Evaluation
A total of 69 reports were submitted to the UNICEF database for inclusion in the 2014 GEROS process. The data generated from the review of these reports forms the basis of the present meta-evaluation. The findings herein were derived from the following process:
Data Aggregation: As soon as UNICEF approved all of the reviews submitted, Universalia used Visual Basic for Applications (VBA) code in Excel to aggregate the ratings and comments into a single Excel workbook. The resulting document constituted the fundamental dataset, allowing trends, strengths and weaknesses to be highlighted and compared.
Year-to-Year Comparisons: To ensure consistency and coherence, the approach taken to analysing data and trends mirrored that of previous years. According to this approach, overall and section ratings were disaggregated based on report typology (region, geographic scope, management, purpose, result level, level of independence, approach, Medium-Term Strategic Plan (MTSP) (and, as of 2014, SPOA) correspondence, and language). Resulting trends were analysed according to report typology, and the sections were categorised as per the four-point performance scale presented in Exhibit 4.1. It is worth noting that, in many cases, ratings of "Outstanding" and "Highly Satisfactory" were classified together in order to reflect all of the good quality reports that respect UNICEF standards. Expressed as percentage ratios, ratings for the different points on the scale were compared to the data obtained from 2010 to 2014. The graphs included herein serve to illustrate year-to-year trends, notably the progress made as well as areas still requiring improvement.
Trends in the Overarching Sections: Finally, both quantitative and qualitative data served to highlight trends within the overarching sections of the template (e.g. Section D – Findings and Conclusions, Section E – Recommendations and Lessons Learned, etc.). The ratings attributed to each sub-question of the template were compiled and disaggregated according to their sub-section (e.g. Completeness and Insight of Conclusions, Relevance and Clarity of Recommendations). The ratings per sub-section were then analysed in order to extract key strengths and weaknesses across the population of reports. Qualitative data was grouped according to key term or focus area, which helped pinpoint themes and support trends noted in the comparison of quantitative data.
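The rolling-up of ratings described above, in which "Outstanding" and "Highly Satisfactory" reports are classified together as "good quality", can be sketched as follows. This is a minimal Python illustration, not the VBA/Excel tooling actually used for the meta-evaluation; the rating labels follow the scale in Exhibit 4.1, and the sample data are hypothetical.

```python
from collections import Counter

# Four-point scale from Exhibit 4.1; "Not Applicable" ratings are
# excluded when computing percentage ratios.
GOOD_QUALITY = {"Outstanding, Best Practice", "Highly Satisfactory"}

def good_quality_share(ratings):
    """Share of reports rated Outstanding/Best Practice or Highly
    Satisfactory among all rated (non-N/A) reports."""
    rated = [r for r in ratings if r != "Not Applicable"]
    if not rated:
        return 0.0
    return sum(r in GOOD_QUALITY for r in rated) / len(rated)

# Hypothetical overall ratings for a small batch of reports
# (illustrative only; not the actual 2014 data).
sample = [
    "Highly Satisfactory",
    "Mostly Satisfactory",
    "Outstanding, Best Practice",
    "Unsatisfactory",
    "Highly Satisfactory",
]
print(Counter(sample))                                   # rating frequencies
print(f"Good quality: {good_quality_share(sample):.0%}")  # 3 of 5 -> 60%
```

The same share, computed per region or per report typology, yields the year-to-year comparison figures reported in the findings below.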
4.3 Changes Made From Previous Years
Early 2010 marked the start of the GEROS process, in which reviews of 2009 reports were conducted. While the process remains very similar to this day, some changes were made following a rapid review by UNICEF in 2012, based on the experience and lessons of previous years. These modifications included new wording (but the same colour coding) in the four-point performance scale; new options ("Externally Managed" and "UNDAF Evaluation") under "Management of the Evaluation"; the addition of "Regional/Multi-Country Programme Evaluation" under "Purpose"; as well as the option to select both "Summative and formative" under the evaluation's approach (formerly called "stage").
For the 2014 cycle, very minor changes were made to the GEROS template. In the classification section of the tool, "MTSP Correspondence" and its categories were changed to "SPOA Correspondence" and its categories, in order to align with UNICEF's new strategic plan for 2014-2017.
4.4 Limitations
The GEROS team met with certain methodological and analytical challenges in reviewing reports and producing the present document. Although there are objective standards in the GEROS template, it is always necessary to use best judgment when applying those standards to each report, which introduces personal biases or expectations that may influence the results. To harmonise ratings across reviewers, previous lessons were shared internally and integrated into the process from the outset, and peer reviews were conducted. The review team jointly reviewed a report at the beginning of the process to ensure that everyone had the same understanding of the standards. All GEROS templates were reviewed by the coordinator and UNICEF for quality control. The review team naturally came across unique or special cases that required deeper analysis or discussion in order to ensure consistency of ratings, such as impact evaluations, evaluations of training, and complex evaluations.
Indeed, unique or special cases – when they concerned an entire report, rather than just one or two template questions – necessarily posed challenges to the review team. While all the reports reviewed were considered evaluations, different evaluation types and designs put the flexibility of the template to the test, and may have also influenced rating consistency (see Appendix XIII for further detail). The team worked together to determine the most suitable ratings to attribute, but the experience prompted discussions around the applicability of the template and the different options that could be envisaged to increase its flexibility.
In some cases (although much improved from previous years), the evaluation TORs were not included; these could have helped complement or justify certain choices, approaches or foci in the reports (e.g. those that differed from expected standards or traditional practice). While GEROS requirements may have been satisfied in other deliverables, the exclusion of these deliverables from the final report may have affected the ratings assigned.
Finally, as noted in previous years, changes made to the template (such as the wording of the colour-coded scale in 2012) may have also influenced the ratings attributed (due to differences in interpretation) and thus the ability to compare results over multiple years. When comparisons across years were not possible, the report explicitly describes why that is the case.
5 Findings
Based on the analysis of the data obtained from the review process, the following findings emerge.
5.1 Overall Ratings and Feedback
Between the 2013 and 2014 GEROS exercises, the number of reports reviewed decreased from 96 to 69. As demonstrated by Exhibit 5.1, the number of reports submitted by each region has varied – in some cases, quite significantly – over the last four GEROS cycles.5 While the number of reports from the Central and Eastern Europe/Commonwealth of Independent States (CEE/CIS) Regional Office has stayed relatively stable, and that office now leads in terms of the number of reports submitted, the number of reports submitted by the East and Southern Africa Regional Office (ESARO) has continued to decline. The number of reports from the Regional Office of South Asia (ROSA) and HQ increased somewhat compared to last year. This total number of reports has to be understood in relative terms. Three multi-country/regional evaluations are included in that sample, which may increase country coverage in some regions. Two multi-country evaluations were completed by CEE/CIS, while one regional evaluation was carried out by the Latin America and Caribbean Regional Office (LACRO). The multi-country evaluations by CEE/CIS each cover about five countries, which further increases the coverage for that region.
Exhibit 5.1 Reports Reviewed per Region per Year
5 We understand that additional evaluation reports were submitted after the cut-off date (May 1st, 2015) and thus are not reflected in this aggregate analysis. In addition, other types of evaluative reports produced by country offices are not reflected in this analysis, such as research, reviews, and assessments, which may provide feedback on performance, but are not considered evaluations as per UNICEF’s definition.
[Bar chart showing the number of reports reviewed per region (LACRO, CEE/CIS, ESARO, EAPRO, MENARO, ROSA, WCARO, HQ) for each year 2011-2014]
Finding 1: In 2014, the quality of reports submitted continued to increase overall, accompanied by a large increase in the number of reports with TORs included in the appendices or as separate documents.
For the purposes of the GEROS meta-analyses, good quality reports are those rated as highly satisfactory or outstanding. As demonstrated in Exhibit 5.2 below, the number of good quality reports has grown consistently over the years, and continued to do so in 2014, with the largest proportion of good quality reports (74%) since the GEROS exercise began. This suggests a generalised improvement in reporting quality. However, it should also be noted that regions with declining quality of evaluation reports (with the exception of LACRO) also contributed many fewer reports to GEROS this year, making up a smaller proportion of the report sample.
Exhibit 5.2 Good Quality Evaluation Reports over Time (2009-2014)
Exhibit 5.3 shows this moderate overall improvement of report quality: highly satisfactory reports increased from 64% last year to 68%, and mostly satisfactory reports decreased from 29% to 23%. Outstanding and unsatisfactory reports remained at approximately the same level.
Exhibit 5.3 Overall Ratings for 2014
[Exhibit 5.2 data: 2009 – 36%; 2010 – 40%; 2011 – 42%; 2012 – 62%; 2013 – 69%; 2014 – 74%]
[Exhibit 5.3 data: Outstanding, best practice – 6%; Highly Satisfactory – 68%; Mostly Satisfactory – 23%; Unsatisfactory – 3%]
Unsatisfactory and mostly satisfactory ratings indicate the absence or poor execution of GEROS standards. Unsatisfactory was used when a standard was poorly addressed, off track or completely absent. Mostly satisfactory was used when a standard was partially addressed.
Prior to 2014, TORs were often missing, which made cross-verification of reporting requirements difficult. However, in 2014 there was a large improvement in the number of TORs present, with 88% of reports being accompanied by their TORs, compared to only 57% in 2013. The inclusion of the TORs allows the reviewers to more accurately assign ratings to a report, and may contribute to an increase in ratings, as in previous years reports with TORs tended to be rated more highly than reports without them.
5.2 Overall Regional Trends
Finding 2: The quality of UNICEF evaluation reports has varied since last year’s review process, with some regions improving and others declining.
Data analysis in 2014 indicates that the regions leading in report quality include HQ (Corporate), CEE/CIS, and ROSA. There are varying patterns between the regions in the quality of reports submitted over time, with some regions improving, some losing gains from previous years, and others remaining the same (Exhibit 5.4). HQ Corporate continued to have universally good quality ratings, and ROSA maintained a level of good quality reports similar to that of the past. CEE/CIS jumped to 93% good quality reports. The West and Central Africa Regional Office (WCARO) and ESARO had noticeable declines. The East Asia and Pacific Regional Office (EAPRO), the Middle East and North Africa Regional Office (MENARO) and LACRO submitted too few reports to draw firm conclusions.
Closer observation of the overall ratings attributed within each region shows that CEE/CIS had the highest number (3) of outstanding reports (in fact, most regions did not have any reports with overall outstanding ratings). The proportion of highly satisfactory to mostly satisfactory reports varied between regions, but only WCARO and ROSA registered any unsatisfactory reports (one each) in 2014.
Exhibit 5.4 Proportion of Good Quality Reports per Region6,7
An important observation about the linkages between the quality and quantity of regional reports in this year’s sample is that most of the regions with declining quality of evaluation reports (WCARO, ESARO, and MENARO) also contributed many fewer reports to GEROS this year, making up a smaller proportion of the report sample. This raises the question of whether there is any relationship between good quality ratings and the contribution of reports to the GEROS database. There are a number of factors that may explain varying levels of contribution of reports by region. In some cases, fewer evaluations may be conducted in one year compared to another. This is sometimes related to a low implementation rate in a country where a project or programme has not progressed enough to justify an evaluation. Evaluations may be planned but not all conducted due to issues with budgets, lack of priority, or staff turnover. Finally, in some cases low quality evaluations may be reclassified as reviews and subsequently not submitted to GEROS.
5.3 Trends by Type and Scope of Evaluation
Finding 3: As in previous years, most reports focused on initiatives with a national scope, and these were overall good quality reports.
Geographic scope refers to the area covered by the programme or project being evaluated, and helps determine the extent to which evaluation findings can be generalised. Once again, the great majority of reports reviewed covered a national scope (88%).8 The quality of national reports reached 72%.
Reports with a multi-region/global scope9 are the next largest category of reports, at only 7% of the sample. All achieved a good quality rating, which matches the 100% good quality ratings from 2013
6 The percentage of good quality reports was calculated by dividing the number of reports rated outstanding or highly satisfactory per region by the total number of reports per region. The total number of reports per region can be seen in Exhibit 5.1. 7 HQ Corporate reports refer to evaluations that are commissioned and managed from Headquarters, but not by the Evaluation Office. HQ Evaluation Office reports are those that are commissioned and managed by the Evaluation Office. In 2015, these two categories have been amalgamated.
8 Includes reports that were classified as “national” and “sub-national”.
[Bar chart showing the proportion of good quality reports per region (LACRO, CEE/CIS, ESARO, EAPRO, MENARO, ROSA, WCARO, HQ (Corp.), HQ (EO)) for each year 2011-2014]
for this category. The presence of only one report with a regional10 scope, rated as “mostly satisfactory”, does not allow for meaningful comparisons. The two reports with a multi-country11 scope in 2014 remained good quality, as was the case for the single report of this type in 2013.
Finding 4: Most evaluation reports in the sample were managed by UNICEF and were considered quality reports. With respect to evaluation purpose, programme and project evaluations continue to represent the largest proportion of evaluations reviewed.
In the majority of reports reviewed, UNICEF is the manager of the evaluation and has direct responsibility for the evaluation process; in others, UNICEF jointly manages the exercise with countries and/or with other agencies. In 2014, the proportion of UNICEF-managed reports12 continued to increase, from almost half of the reports in 2013 to 80% of reports in 2014. These UNICEF-led reports also increased in quality, from 74% to 80%. This high level of quality might reflect the greater level of control that UNICEF staff can exercise over adherence to standards and practices in evaluations they manage.
In this cycle, 14 evaluations were jointly managed13. A majority of joint evaluations with countries were considered to be of good quality (63%). The absence of country-led evaluations is notable given the importance accorded to country-led evaluation in the 2013 UNICEF Evaluation Policy14.
Just as in last year’s cycle, the 2014 GEROS exercise noted a large proportion of programme (32%) and project (28%) evaluations among the reports reviewed, although these proportions represent a decrease compared to last year’s. The proportion of pilot, project, and policy evaluations increased noticeably, while at-scale and impact evaluations increased slightly. The proportion of humanitarian evaluations decreased slightly, and this small group included one real-time evaluation (which was found to be highly satisfactory overall).
In 2014, project-level evaluations increased significantly from 58% good quality ratings to 84%. Programme-level evaluations also increased in quality, but by a smaller 7 percentage points, reaching a 70% level of good quality ratings. Reports in a few other categories struggled to meet GEROS standards, including the small groups of at-scale and impact evaluations15 (see Exhibit 5.5).
9 The programme is implemented in two or more regions, or deliberately targets all regions. The evaluation would typically sample several countries across multiple regions, with the results intended to be generalizable in two or more regions.
10 Where one programme is implemented in several countries, or different programmes of a similar theme are implemented in several countries, the evaluation covers multiple countries within the region and the sampling is adequate to make the results generalizable to the region.
11 The programme is implemented in two or more regions, or deliberately targets all regions. The evaluation would typically sample several countries across multiple regions, with the results intended to be generalizable in two or more regions. 12 UNICEF-managed evaluations are those in which UNICEF, working with national partners of different categories, is responsible for all aspects of the evaluation.
13 Joint evaluations can be: 1) managed with one or more UN agencies; 2) managed with other organizations; or 3) jointly managed by the Country (Government and/or CSO) and the UNICEF CO.
14 Country-led evaluation: Evaluations managed by the Country (Government and/or CSO). 15 A contributing factor to performance may be related to the different expectations for these types of evaluations, which are not necessarily reflected in the GEROS framework.
Exhibit 5.5 Proportion of Reports Reviewed and Good Quality Reports by Purpose (2014)16
Finding 5: A larger proportion of evaluations are summative this year, and education is the most common thematic area among the reports.
In 2014, the proportion of reports by type of evaluation approach differed somewhat from the previous two years. While the proportion of formative evaluations remained almost the same, summative evaluations increased this year by 9 percentage points, and evaluations combining a formative and summative approach17 decreased by the same proportion. The quality of formative evaluations has improved tremendously, from 64% in 2013 to 85% in 2014, while the quality of summative evaluations has slightly decreased. Evaluations combining a formative and summative approach increased slightly in quality.18
In addition to the approach, the GEROS template asks reviewers to identify which thematic areas are addressed by the object of the evaluation. Prior to 2014, the template used the focus areas from the Medium-term Strategic Plan (2006-2013). In 2014, the Strategic Plan Objective Area (SPOA) list of focus area priorities was adopted, reflecting the new 2014-2017 strategic plan thematic priorities, which are somewhat different. Thus comparisons of trends over previous years are limited, as the table below shows:
Exhibit 5.6 Changes in MTSP and SPOA Correspondence categories
MTSP Correspondence (2013) | SPOA Correspondence (2014)
HIV/AIDS and children | HIV/AIDS
Child Protection | Child Protection
Basic education and gender equality | Education
Young child survival and development | Health
Multi-sectoral | Nutrition
16 Total number of reports per purpose: 7 pilot, 4 at scale, 6 policy, 0 RTE, 4 humanitarian, 19 project, 23 programme, 3 country-programme and 4 impact evaluations. The definition of each of these terms can be found in Appendix XII (GEROS criteria).
17 The category “formative and summative” evaluation was only added to the GEROS process in the 2012 exercise in order to fully reflect the breadth of reports reviewed.
18 Total number of reports for each approach: 20 formative, 29 summative, 20 formative and summative.
[Bar chart showing, for each purpose (Pilot, At Scale, Policy, Humanitarian, Project, Programme, Country Programme, Impact Evaluation), the percentage of reports by purpose and the percentage rated good quality]
Cross-cutting | WASH
Policy Advocacy and partnerships | Social inclusion
Organizational performance | Cross-cutting – gender equality
– | Cross-cutting – humanitarian action
Themes which have remained the same are HIV/AIDS, child protection, and multi-sectoral (evaluations that touch on more than one thematic outcome area). Of these, the numbers of HIV/AIDS and child protection thematic evaluations are very similar in proportion to the 2013 sample. Evaluations that touch on more than one thematic outcome area have dropped 8 percentage points, from 28% to 20%.
Education is the most common theme addressed in evaluations in 2014, making up 32% of GEROS reports (Exhibit 5.7). More modest but still significant numbers of reports in 2014 were reviewed in health (13%), WASH (10%), and cross-cutting gender equality (10%), in addition to child protection as noted above (9%).
These numbers raise some questions regarding the extent of coverage provided to HIV/AIDS, nutrition, social inclusion, and cross-cutting humanitarian action, which are key focus areas of UNICEF in the 2014-2017 Strategic Plan (some of which were combined with other categories in previous years). It may be worth studying whether the low number of reports submitted on these topics reflects an evaluation coverage issue, a limited number of UNICEF initiatives in these areas, or a natural fluctuation over time (for example, nutrition was a particular focus in 2013 in a number of regional and global studies). UNICEF may also wish to consider whether quality issues result from a lack of resources for Monitoring and Evaluation (M&E) of its WASH work.
Exhibit 5.7 Proportion of Reports Reviewed and Good Quality Reports by SPOA Correspondence19,20
19 Total reports per SPOA Correspondence: 9 health, 3 HIV/AIDS, 7 WASH, 0 nutrition, 22 education, 4 child protection, 1 social inclusion, 6 cross-cutting gender equality, 0 cross-cutting humanitarian action, 11 evaluations with more than one outcome. 20 There is no evaluation that has a SPOA correspondence that relates to humanitarian action. The four reports that had a humanitarian purpose had different SPOA correspondence (focus area priorities). These reports were addressing 1) gender equality and 2) WASH, in emergency settings.
[Bar chart showing, for each SPOA correspondence (Health, HIV/AIDS, WASH, Nutrition, Education, Child Protection, Social inclusion, Cross-cutting – Gender Equality, Cross-cutting – Humanitarian Action, Touches more than one outcome), the percentage of reports by SPOA and the percentage rated good quality]
Finding 6: Almost all reports submitted to GEROS are independent external evaluations, which is a shift since 2012 when internal evaluations (or reviews) were still sometimes classified as evaluations. A majority of the reports submitted were in English.
The level of independence refers to where the implementation and control of the evaluation activities are held. A steady decline of independent internal reports21 has eliminated this type of report from the GEROS sample (Exhibit 5.8). The sample contains independent external evaluations, except for one report, for which the level of independence could not be determined (compared to 22% of reports in 2013 for which this could not be determined). Part of this shift likely reflects that a growing number of reports clearly identify the evaluators and the office responsible for managing the evaluation. It may also indicate that internal evaluations are more frequently considered ‘reviews’ and are therefore not submitted to GEROS.
Exhibit 5.8 Proportion of Reports by Level of Independence (2009-2014)
Furthermore, English remains the primary reporting language in the GEROS process, at 89% of reports reviewed. The numbers of French and Spanish reports declined significantly from 2013, to 9% and 3% respectively22. This drop in language diversity may reflect the smaller number of reports submitted by regions such as WCARO and LACRO.
This small pool of non-English reports makes quality comparisons less relevant. However, we can make the observation that the two Spanish reports were both found to be highly satisfactory, while the six French reports declined somewhat in good quality ratings.
21 As per the GEROS template definitions of these terms (Appendix XI), an independent internal evaluation is one that “is implemented by consultants but managed in-house by UNICEF professionals. The overall responsibility for the evaluation lies within the division whose work is being evaluated.” An independent external evaluation is “implemented by external consultants and/or UNICEF Evaluation Office professionals. The overall responsibility for the evaluation lies outside the division whose work is being evaluated.”
22 Total number of reports per language: 61 English, 6 French and 2 Spanish.
[Exhibit 5.8 data:
Independent external – 2009: 46%; 2010: 52%; 2011: 55%; 2012: 51%; 2013: 82%; 2014: 99%
Independent internal – 2009: 45%; 2010: 38%; 2011: 33%; 2012: 28%; 2013: 10%; 2014: 0%]
5.4 Trends by Quality Assessment Category
5.4.1 Overall Trends – All Sections
The GEROS template is divided into six sections, each covering a particular aspect of UNICEF reporting standards:
Section A comprises the description of the evaluated object;
Section B relates to the report’s purpose, objectives, scope and evaluation framework;
Section C covers methodology and inclusion of human rights, gender and equity;
Section D includes findings and conclusions;
Section E assesses recommendations and lessons learned; and
Section F addresses the report’s structure, layout and logic.
Each of these sections is made up of varying numbers of individual questions organized into sub-sections (see GEROS template in Appendix VI). The following pages analyse template sections A through F, and their sub-section ratings, in order to underscore specific areas of strength and weakness to be maintained or addressed in the coming years.
Finding 7: In 2014, the quality of reports slightly increased in two sections, decreased slightly in another, and changed little in three other sections. The majority of reports continue to be aligned with the UNICEF-adapted UNEG standards for evaluation reports.
Exhibit 5.9 illustrates that, until 2014, the different report sections demonstrated continuous improvement, progressively achieving greater alignment with UNICEF reporting standards. Section C (methodology, gender, human rights, and equity) and Section D (findings and conclusions), which have improved since last year, are among the highest-rated sets of standards in the template, despite being amongst the lowest-rated in the early years of the GEROS exercise. Section A (Object) was found to be slightly less well done than in 2013. The other sections remain largely unchanged in terms of the proportion of reports achieving an outstanding or highly satisfactory rating. For their part, recommendations and lessons learned (Section E) have consistently been – and still remain – among the most challenging report standards to meet.
Exhibit 5.9 Good Quality Ratings per Section – Year by Year Progression (2011-2014)
5.4.2 Section A: Object of the Evaluation
Finding 8: In 2014, the description of the evaluated object and its context declined somewhat in quality compared to 2013. Evidence suggests that the description of the theory of change and of stakeholder roles and contributions remain areas for improvement.
The proportion of reports providing a good quality description of the evaluated object and its context decreased somewhat in 2014 (from 72% to 65%), although this is still considerably better than the 2010-2011 ratings for this section (Exhibit 5.10).
Exhibit 5.10 Section A Ratings and Comparison (2011-2014)
[Bar chart for Exhibit 5.9 showing good quality ratings for each section (A. Object of the evaluation; B. Evaluation purpose, objectives and scope; C. Evaluation methodology, gender, human rights and equity; D. Findings and conclusions; E. Recommendations and lessons learned; F. Report is well structured, logical and clear) for 2011-2014]
[Exhibit 5.10 data:
2011 – Outstanding: 5%; Highly Satisfactory: 45%; Mostly Satisfactory: 38%; Unsatisfactory: 12%
2012 – Outstanding: 18%; Highly Satisfactory: 51%; Mostly Satisfactory: 27%; Unsatisfactory: 4%
2013 – Outstanding: 5%; Highly Satisfactory: 67%; Mostly Satisfactory: 28%; Unsatisfactory: 0%
2014 – Outstanding: 9%; Highly Satisfactory: 57%; Mostly Satisfactory: 33%; Unsatisfactory: 1%]
The sub-section ratings within this section (Exhibit 5.11) reflect an increase in the description of the theory of change, a decrease in describing the object and context, and relatively unchanged ratings in outlining stakeholders and their contributions compared with last year.
It shows that a majority of reports (73%) continued to do well (in other words, to be rated as outstanding or highly satisfactory) in terms of presenting a good description of the object and context. Reviewer comments in the template noted a good description of context, and complimented reports for adequately outlining the components of the evaluated object. These reports were often commended for providing the institutional, social, political, economic, demographic, and policy context related to the object of the evaluation, which provided important context for the evaluation findings.
Example of review of an Outstanding Section A
“The report provided a very clear and thorough description of the object. This description included a full explanation of the project's theory of change and provided the reader with a good understanding of exactly how the Project was designed to address the problem described in the context. The context was also fully described, with information provided about the political, economic, demographic, and specific child protection policy and legal frameworks. Key stakeholders and their contributions were clearly identified and described.”
Final Evaluation of the Project "Child Care System Reform" (GEROS-Montenegro-2014-004)
Example of review of a Mostly Satisfactory Section A
“The object of the evaluation is relatively well described but remains incomplete. The expected results chain is missing, as is the precise role of the various actors. The stakeholders of the project and of the evaluation are mentioned, but information on the specific contributions of key actors is very limited.”
“Evaluation du programme élargi de vaccination dans les camps de réfugiés sahraouis de Tindouf” (GEROS-Algeria-2014-003)
However, the description of the theory of change and the presentation of stakeholder contributions remain areas for improvement. The proportion of reports with good quality ratings for the sub-section on stakeholders and their contributions (65%) remained essentially unchanged from last year. However, this amalgamated rating for the sub-section masks the large difference between the two standards making up this sub-section: the identification of stakeholders in the report (which is done well, with 86% being good quality) and the description of key stakeholders’ contributions (only 54% good quality). Indeed, almost one-third of reviewer comments noted that the contributions of stakeholders were not clear.
The sub-section which continues to be the weakest, relating to the theory of change, nonetheless showed improvement this year, rising 8 points to 49%. One-third of reviewer comments noted that the theory of change or results chain was not made clear in the report. Sometimes this was because the evaluated object did not have one, but this was overcome in some cases by evaluation teams who developed a theory of change and used it to help guide the evaluation.
Finally, a majority of reports (78%) did well in describing the implementation status of the evaluated object. Reviewers noted however some instances in which significant changes or modifications to the programme were not thoroughly described in the evaluation report.
Exhibit 5.11 Sub-Section Ratings: Object of the Evaluation
5.4.3 Section B: Evaluation Purpose, Objectives and Scope
Finding 9: The extent to which reports met standards of evaluation purpose, objectives and scope changed little from previous years. Strengths of reports in this area included a clear purpose, objectives and scope, and a clear list of evaluation criteria. However, the justification for the selection of evaluation criteria still requires greater attention.
Section B of the GEROS template deals with the evaluation’s purpose, scope, and objectives, as well as the presentation of the framework guiding the assessment. Although this is not amongst the strongest sections, most reports (64%) are considered to be good quality in terms of these standards. The quality of Section B changed little from previous years. Perhaps the most notable trend is the slow but steady drop (to zero, in 2014) in the number of unsatisfactory reports in this section.
[Bar chart for Exhibit 5.11 showing ratings (Outstanding / Yes / Mostly / No / N/A) for the sub-sections Implementation Status, Stakeholders and their contributions, Theory of Change, and Object and Context]
Exhibit 5.12 Section B Ratings and Comparison (2011-2014)
The distribution of individual sub-section ratings (Exhibit 5.13) has not changed significantly compared to 2013. The description of the evaluations’ purpose, objectives and scope remained good quality in three-quarters (76%) of reports. Indeed, reviewer comments commended reports for their description of the evaluation’s purpose, and for clearly explaining objectives. Good quality reports tended to link these clearly together. In some cases, weaker reports did not address these explicitly, particularly in the case of scope.
The evaluation framework sub-section also changed little from last year, and continues to be amongst the weaker sub-sections in the template, with only 54% of reports rated good quality for this standard. However, the weakness in this sub-section is not the listing of the evaluation criteria themselves (which 69% of reports did well), but the frequent lack of justification for why particular evaluation criteria were chosen. An equal number of comments were made noting that the criteria were, or were not, explicitly justified.23
23 An amendment was made to the wording of the template in 2013 to recognise that the use of standard OECD/DAC criteria required less justification than the use of other criteria. However, some reviewer comments indicate that the extent to which some justification of even OECD/DAC criteria should be provided is not clear, and the ratings may be inconsistent in this regard.
[Exhibit 5.12 data:
2011 – Outstanding: 5%; Highly Satisfactory: 45%; Mostly Satisfactory: 38%; Unsatisfactory: 12%
2012 – Outstanding: 5%; Highly Satisfactory: 49%; Mostly Satisfactory: 37%; Unsatisfactory: 9%
2013 – Outstanding: 9%; Highly Satisfactory: 55%; Mostly Satisfactory: 31%; Unsatisfactory: 4%
2014 – Outstanding: 7%; Highly Satisfactory: 57%; Mostly Satisfactory: 36%; Unsatisfactory: 0%]
Example of review of an Outstanding Section B
“The evaluation's purpose, objectives, and scope, including all evaluation criteria used in the context of this evaluation, are very clearly explained and presented. Differences between the initial TOR and what was agreed to at inception are fully explained. The report provides a relevant list of key evaluation questions.”
Evaluation of “Young Champions Initiative for Girls’ Education” (GEROS-Pakistan-2014-002)
Example of review of a Mostly Satisfactory Section B
“The report suggests but does not clearly describe the purpose, objectives, and scope of this evaluation. The reader is left to glean this information from different parts of the report. Moreover, the report does not provide an explanation as to why the standard OECD-DAC evaluation criteria were not used in the context of this evaluation.”
“Impact evaluation of the WASH SHEWA-B programme in Bangladesh” (GEROS-Bangladesh-2014-014)
Exhibit 5.13 Sub-Section Ratings: Purpose, Objectives and Scope of the Evaluation
5.4.4 Section C: Evaluation Methodology, Gender, Human Rights and Equity
Finding 10: Reports have made improvements in terms of the description of the methodology. While methodological robustness is often satisfactory, gaps in ethical considerations and stakeholder participation may have affected the overall ratings of this section.
Section C, on report methodology and integration of human rights, gender and equity, has improved over time, but particularly so in 2014. A much larger proportion of reports were considered to be good quality in this section (71% compared to 55% in 2013).
Exhibit 5.14 Section C Ratings and Comparison (2011-2014)
[Bar chart for Exhibit 5.13 showing ratings (Outstanding / Yes / Mostly / No / N/A) for the sub-sections Evaluation Framework and Purpose, objectives and scope]
[Exhibit 5.14 data:
2011 – Outstanding: 6%; Highly Satisfactory: 38%; Mostly Satisfactory: 36%; Unsatisfactory: 20%
2012 – Outstanding: 0%; Highly Satisfactory: 47%; Mostly Satisfactory: 46%; Unsatisfactory: 8%
2013 – Outstanding: 5%; Highly Satisfactory: 50%; Mostly Satisfactory: 42%; Unsatisfactory: 3%
2014 – Outstanding: 7%; Highly Satisfactory: 64%; Mostly Satisfactory: 26%; Unsatisfactory: 3%]
Looking more closely at the sub-sections (Exhibit 5.15), increases in the number of good quality ratings are noticeable. With the exception of data collection, good quality ratings for all sub-sections increased between 2013 and 2014. This year, ratings seem to be converging, with four out of six sub-sections receiving a similar proportion of good quality ratings (between 66% and 72%). The 24-point improvement in the stakeholder participation ratings is the most notable.
Data collection, for which good quality ratings dropped 7 points from 2013 to 72%, is still the strongest sub-section. Every single report reviewed included at least some description of data collection, methods, and data sources. This was directly reflected in reviewer comments, which complimented reports on their clear identification or description of the data collection methods.
The sub-section addressing results-based management (RBM) was also an area of strength (with 70% of reports being aligned with UNICEF standards), improving on last year by 10 points. Positive ratings indicate that reports did an adequate job of evaluating the object’s monitoring system, and that the evaluation made appropriate use of the M&E framework. Often the M&E framework was used as a guide to the analysis, and sometimes for the structure of the report.
Similar to the RBM sub-section, most reports (67%) received good quality ratings for methodological robustness, an increase of 5 points from 2013 for this sub-section. The individual standards making up this section show a strength in terms of having the methodology facilitate answers to the evaluation questions (81% having done this well). This sub-section would receive higher ratings but for the frequent absence of a counterfactual to address issues of contribution or attribution (only 47% having done this well). In many cases, reviewers commented that a counterfactual was impractical or not relevant, and 37% were rated Not Applicable. Certainly, this is an area that calls for more specific guidance in TORs, and for a greater understanding among evaluators of what is entailed in constructing a counterfactual and when it may or may not be appropriate.
Stakeholder participation improved significantly this year, rising 25 percentage points from 2013 to 66%. It is not clear what has driven this improvement overall. However, the individual standards that make up this section indicate that reviewers found in most cases that levels of participation in the evaluation were appropriate (71% good quality ratings), while there were more often gaps in actually describing these levels (60% good quality ratings). Only one fifth of reviewer comments explicitly noted that stakeholder participation was well described. The higher ratings reflect that reviewers were often able to infer stakeholder participation from the report, even if it was not explicitly described.
Example of review of an Outstanding Section C

"The report offers an excellent description of the methodology and the annexes serve to complement this description. Specifically, the data collection, analysis and sampling methods are described and a justification for using these is given. The evaluation matrix helps the user understand how the evaluation was designed to answer specific criteria and questions and ultimately achieve the evaluation objectives and purpose. Moreover, the construction of the TOC was used to guide the entire evaluation, and better assess effectiveness and other criteria."

Final Evaluation Report "RKLA3 Multi-country evaluation: increasing access and equity in early childhood education" (GEROS-RegionalBaltic-2014-008)

Example of review of a Mostly Satisfactory Section C

"The summary description of the methodology identifies the sources and the methods of data collection and analysis. However, the methodology is not fully described, and the data collection tools are not included in an annex to the report."

Évaluation de "l'approche assainissement total piloté par la communauté" (GEROS-Madagascar-2014-005)
GEROS – Global Meta-Evaluation Report 2014 · Universalia
The sub-sections that remain weak are a) ethics and b) human rights, gender and equity. The ethics sub-section demands a description of ethical considerations in the programme as well as in the evaluation process (e.g. protection of confidentiality, informed consent). The ethics sub-section received the smallest proportion of good quality ratings in this section, as well as the fewest outstanding ratings. One-quarter (25%) of reports neglected to discuss ethics at all (and thus received a rating of unsatisfactory). The inclusion of considerations of the ethical design of the programme, the balance of costs and benefits to participants, and the ethics of who was included in and excluded from the evaluation was particularly poorly done, with only 43% of reports receiving a good quality rating for this individual criterion.
Exhibit 5.15 Sub-Section Ratings: Methodology, Gender, Human Rights, and Equity
Human Rights, Gender, and Equity

In its reports, UNICEF places considerable importance on the inclusion of human rights, gender and equity considerations, from the methodology through to the recommendations. In 2012, the integration of a Human Rights-Based Approach to Programming (HRBAP), gender equality and equity considerations in the reports reviewed was still considered a weakness, with none of the three elements receiving more than 46% good quality ratings. With the exception of gender, improvement was made in 2013 and 2014, and all three considerations either met or marginally surpassed the halfway mark (50%) in 2013 and 2014 (Exhibit 5.16).

When considered as a whole, the sub-section pertaining to human rights, gender and equity considerations increased somewhat in quality from 2013. When some of the standards of this sub-section (pertaining specifically to inclusion of either human rights, gender, or equity in the methodology, framework, findings, conclusions and recommendations) are analysed individually, however, it is evident that the recent progress has been made in the inclusion of equity and human rights. This continues the pattern of ongoing improvement of the inclusion of equity in reports, which climbed from 9% good quality ratings in 2010 to 64% in 2014. Equity now has the highest
Exhibit 5.15 data (percent of reports; rows may not sum to exactly 100% due to rounding):

Sub-section                        Outstanding   Yes   Mostly   No    N/A
Data collection                        16%       56%    28%      0%    0%
Ethics                                  3%       40%    20%     25%   13%
Results Based Management                8%       62%    20%      5%    6%
Human Rights, Gender and Equity        16%       42%    24%     15%    3%
Stakeholder participation              12%       54%    23%     11%    0%
Methodological robustness               8%       58%    15%      6%   13%
proportion of good quality ratings of the three themes. As noted in last year's meta-analysis, such a drastic hike may be the result of UNICEF's efforts and investments around equity since 2011.

Good quality ratings for human rights also improved somewhat, from 50% in 2013 to 57% in 2014. Reviewer comments praised the inclusion of human rights, noting reports that cascaded human rights considerations throughout the report, or used human rights language or frameworks. In other reports, these approaches were absent, or their inclusion was weak or appeared as an afterthought in the report.

The inclusion of gender in the methodology, framework, findings, conclusions and recommendations did not change significantly compared to 2013. Partial points were given for reports that addressed gender in some way, such as including gender-disaggregated data, while stronger reports included, for example, a comprehensive section on gender considerations, or used gender equality as a guiding principle of the evaluation. Over the coming years, it will be interesting to note how the gender component improves in response to the UN System-Wide Action Plan on Gender Equality and Women's Empowerment, adopted in 2012. So far, continuing improvement on gender is not evident in reports assessed through GEROS.
Exhibit 5.16 Inclusion of Human Rights, Gender, and Equity: Good Quality Ratings Year by Year

Year    Human rights   Gender   Equity
2011        34%          34%      30%
2012        44%          46%      41%
2013        50%          52%      52%
2014        57%          51%      64%
5.4.5 Section D: Findings and Conclusions

Finding 11: The presentation of findings, conclusions, contribution and causality continued to improve in 2014. Cost analysis remained particularly challenging.
In 2014, Section D increased in quality, with a greater proportion of reports considered good quality (from 64% in 2013 to 74% in 2014) (Exhibit 5.17).
Exhibit 5.17 Section D Ratings and Comparison (2011-2014)
Looking at the sub-section ratings, all but one (cost analysis) increased moderately in good quality ratings. Strengths, weaknesses and implications again remained a particular strength this year, slightly improved over last year.
Exhibit 5.17 data (percent of reports):

Rating                 2011   2012   2013   2014
Outstanding             5%     4%     7%     7%
Highly Satisfactory    41%    54%    56%    67%
Mostly Satisfactory    37%    37%    35%    23%
Unsatisfactory         17%     5%     1%     3%
Example of review of an Outstanding Section D

"Findings and Conclusions are extremely well articulated, with evidence being well marshalled and a large number of sources being used. Careful language is used to explain the plausibility of how the evidence has been interpreted in most cases. The structuring of the report is also novel, combining background information with findings. This works particularly well in regard to humanitarian response, where the findings relate to particular clusters. The conclusions section is also structured clearly according to the evaluation criteria, and offers an additional level of insight."

"Real-Time Evaluation of UNICEF's Response to the Typhoon Haiyan in the Philippines" (GEROS-Philippines-2014-004)

Example of review of an Unsatisfactory Section D

"Findings are not presented with supporting evidence from the qualitative or quantitative research… When compared against the evaluation questions… the findings do not… address these questions. The evaluators do not acknowledge the obvious methodological limitations and proceed to evaluate the object for its impact without evidence to substantiate these claims."

"An Evaluation of the Impact of UNICEF Radio Listener Groups Project in Sierra Leone" (GEROS-Sierra-Leone-2014-003)
Exhibit 5.18 Sub-Section Ratings: Findings and Conclusions
In 2014, more than one-third of the reports were considered cases where a cost analysis was not applicable to the evaluation, which is higher than in 2013 (when this was considered not to apply in 11% of the reports). This may be due to variation in interpretation of the template standards: reviewers have some discretion in judging whether a cost analysis would be feasible when it is not specifically requested in the TORs and yet is presented as a standard UNICEF expectation. In any case, cost analysis received the lowest proportion of good quality ratings (34%), a decline from 2013, indicating that it is typically not done well and is not improving. Partial points were given for some limited attempt at examining programme costs, cost efficiency, or cost implications for replication or scaling up.

Two other sub-sections improved in 2014. The completeness and logic of findings standard refers to findings that are clearly presented, address all criteria and questions, demonstrate a progression to results, and discuss gaps, limitations, and unexpected findings. Good quality ratings improved in 2014 to reach a total of 67%. Reviewers noted gaps in reports that did not cover all the stated evaluation criteria or questions and did not present an explanation (the most frequent omission was the OECD/DAC criterion of efficiency). Highly rated reports typically structured their findings section in a systematic manner (for example, by evaluation criteria or question) and presented their findings clearly so that they were quickly apparent to the reader (for example, some reports bolded or boxed findings within the text, stated them at the beginning of the section, or separated the findings text from the discussion of evidence).

Improvement was also seen in the completeness and logic of conclusions standard, which improved from 2013 to 68% good quality ratings in 2014. This standard asks that a report go beyond simply restating findings and offer additional insight that adds value, take account of diverse stakeholders' views, and be appropriately pitched to the end users. While conclusions often do a good job of summarizing findings, reviewer comments indicate that they sometimes struggle to add insight and value beyond this summary. Nonetheless, the majority of reports met this standard.
Exhibit 5.18 data (percent of reports):

Sub-section                                 Outstanding   Yes   Mostly   No    N/A
Completeness and insight of conclusions          8%       60%    21%      8%    2%
Strengths, weaknesses and implications           8%       73%    13%      4%    1%
Contribution and Causality                       9%       62%    17%      9%    3%
Cost Analysis                                    4%       30%    19%     13%   33%
Completeness and logic of findings              12%       55%    20%     10%    2%
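The "good quality" percentages quoted throughout these sections aggregate the top two rating bands of the four-point scale (for example, 8% Outstanding plus 60% satisfactory gives the 68% cited for completeness of conclusions). A minimal sketch of that aggregation, using two rows from Exhibit 5.18; the dictionary layout is illustrative and not part of the GEROS tool itself:

```python
# Rating distributions (percent of reports) for two Exhibit 5.18 sub-sections.
# Note: this data structure is an assumption for illustration, not GEROS's own.
ratings = {
    "Completeness and insight of conclusions":
        {"Outstanding": 8, "Yes": 60, "Mostly": 21, "No": 8, "N/A": 2},
    "Cost Analysis":
        {"Outstanding": 4, "Yes": 30, "Mostly": 19, "No": 13, "N/A": 33},
}

def good_quality(dist):
    """Share of reports rated in the top two bands (Outstanding + satisfactory)."""
    return dist["Outstanding"] + dist["Yes"]

for name, dist in ratings.items():
    print(f"{name}: {good_quality(dist)}% good quality")
    # -> 68% for conclusions and 34% for cost analysis, matching the narrative
```

The same top-two-band sum reproduces every headline figure in this chapter (e.g. 16% + 56% = 72% for data collection in Exhibit 5.15).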
One recurring issue in reviewer comments concerns both the findings and the conclusions of reports: the tendency of some reports to mix two or more of findings, conclusions, recommendations, and lessons learned within individual sections, so that it becomes difficult for the reader to locate the pertinent information they are looking for, or to quickly understand the key messages of the report. Reviewers complimented reports that clearly differentiated these types of statements while linking them to relevant components (for example, recommendations that are clearly linked to findings and conclusions).
Contribution and causality was largely unchanged from last year, with 71% good quality ratings. This suggests that most reports do a good job of assigning contribution for results to stakeholders, and of identifying causal reasons for accomplishments and failures.

The most highly rated sub-section was strengths, weaknesses, and implications, with 81% good quality ratings. This was an improvement from last year. Reviewers commended reports for discussing future implications of constraints, and for offering a balanced presentation of the strengths and weaknesses of the evaluated object.
5.4.6 Section E: Recommendations and Lessons Learned

Finding 12: Largely unchanged since 2013, the proportion of good quality reports for Section E remained the lowest of all sections. Clearer identification of target stakeholder groups and lessons learned could help improve the ratings for this section.

Good quality ratings for Section E remain the lowest among all the GEROS template sections, as in previous years. There has been a general upward trend over the last several years (Exhibit 5.19). The most notable change from 2013 for this section is a decrease in unsatisfactory reports, from 8% to 1%.
Exhibit 5.19 Section E Ratings and Comparison (2011-2014)
Exhibit 5.19 data (percent of reports):

Rating                 2011   2012   2013   2014
Outstanding             2%     4%     2%     3%
Highly Satisfactory    36%    46%    49%    49%
Mostly Satisfactory    42%    46%    41%    46%
Unsatisfactory         20%     5%     8%     1%
Based on sub-section ratings (Exhibit 5.20), the relevance and clarity of recommendations continued to be the strongest component of Section E by far, at 75% good quality ratings, nearly identical to last year. Reviewers commended reports for producing recommendations relevant to the purpose of the evaluation, and for grounding them in the evidence presented. However, there was a clear weakness in prioritising recommendations, an individual indicator that received only 50% good quality ratings. Reviewers often criticised reports for producing too many recommendations, or recommendations that were vague or overly specific.

Though the proportion of good quality ratings related to the usefulness of recommendations has increased since 2013, good quality ratings for this standard apply to just half of reports, and some notable areas of weakness can still be identified. Looking at the individual indicator ratings for this sub-section, only half of reports clearly identified target groups for action, a frequent weakness noted in reviewer comments. While reviewers generally felt the recommendations were realistic, the majority of reports did not describe the process followed in developing the recommendations. At only 30% good quality ratings, this is the most poorly rated individual criterion in the template. Since most reports contain no discussion of how recommendations were developed, it is uncertain whether the process lacks participation by important stakeholders, or whether reports simply lack a description of the processes they follow. Either way, this does not appear to be a requirement that is well understood by evaluators.

The identification of appropriate lessons learned remained, as last year, the most important weakness in Section E, with only 35% of reports rated good quality in this respect. Based on ratings for the individual indicators as well as reviewer comments, about one-third of reports did not include lessons learned at all. Reviewer comments indicate that frequently there was no attempt to generalise lessons to indicate their wider relevance beyond the evaluated object. This suggests that many evaluators and/or evaluation managers do not understand UNICEF's definition of lessons learned. Sometimes lessons learned are not specifically mentioned in the TORs, and ensuring that they are may assist in improving this practice.
Example of review of an Outstanding Section E

"The recommendations and lessons learned are very relevant and actionable to the purpose and objectives of the evaluation. They are supported by the findings and were developed with the involvement of relevant stakeholders. They identify the target groups for action and reflect a very good understanding of the country context and UNICEF's role."

"Let Us Learn (LUL)" Formative Evaluation, UNICEF Afghanistan Country Office (GEROS-Afghanistan-2014-008)

Example of review of a Mostly Satisfactory Section E

"The recommendations are relevant and realistic but are not prioritised, which limits their usefulness for decision-makers. The report does not present lessons learned linked to the objects of the evaluation."

"Évaluation des Interventions à Base Communautaire dans les Régions des Savanes et de la Kara" (GEROS-Togo-2014-001)
Exhibit 5.20 Sub-Section Ratings: Recommendations and Lessons Learned

5.4.7 Section F: Report Structure, Logic and Clarity

Finding 13: Good quality ratings for Section F did not change significantly compared to 2013, but this section continues to have among the highest overall good quality ratings. The majority of evaluations were logically structured, but issues were noted regarding the length of executive summaries and their ability to stand alone.

There have been minimal changes in the quality of report structure, logic and clarity (Section F) since last year. The proportion of reports considered to be good quality (74%) remained virtually the same as last year. The proportion of reports with lower ratings was also virtually unchanged, with none considered unsatisfactory.

Exhibit 5.21 Section F Ratings and Comparison (2011-2014)
Exhibit 5.20 data (percent of reports):

Sub-section                                 Outstanding   Yes   Mostly   No    N/A
Appropriate lessons learned                      7%       28%    29%     35%    1%
Usefulness of recommendations                   12%       41%    24%     22%    1%
Relevance and clarity of recommendations        13%       62%    21%      2%    1%
Exhibit 5.21 data (percent of reports):

Rating                       2011   2012   2013   2014
Outstanding, best practice    5%    11%    15%     9%
Highly Satisfactory          38%    54%    60%    65%
Mostly Satisfactory          42%    29%    25%    26%
Unsatisfactory               20%     5%     0%     0%
In 2014, good quality ratings for style and presentation increased slightly to 76%. Reviewers were generally complimentary about the inclusion of basic elements in the opening pages of the report, and about reports being structured in a logical manner. Often, this reflected a structure based on criteria or evaluation questions. As noted in Section D, however, there were sometimes problems within sections of reports that mixed findings, conclusions, and recommendations in a confusing manner. Annexes were usually found to contain appropriate elements that added credibility to the report, although elements such as data collection tools were sometimes missing.

An Executive Summary was included in nearly all reports, and received 73% good quality ratings. However, the individual standards ratings show that reviewers found some deficiencies in the ability of these summaries to stand alone (i.e. to not require reference to the rest of the report) and to inform decision making. Both of these individual indicators received 61% good quality ratings, and comments frequently reflect a concern with the length of executive summaries, compared to UNICEF's suggestion in the GEROS template of 2-3 pages. In most cases, executive summaries included all the expected elements.
Exhibit 5.22 Sub-Section Ratings: Report Structure, Logic and Clarity

Sub-section               Outstanding   Yes   Mostly   No    N/A
Executive Summary              8%       65%    20%      7%    0%
Style and presentation        14%       62%    20%      3%    1%
Example of review of an Outstanding Section F

"This is a very strong report in terms of style and accessibility. It strikes an admirable balance between readability and detail. All requirements of the UNICEF/UNEG standards are met in terms of the elements to be included."

Real-Time Evaluation of UNICEF's Response to the Typhoon Haiyan in the Philippines (GEROS-Philippines-2014-004)

Example of review of a Mostly Satisfactory Section F

"The general outline of the report is logically structured… But the… large amount of repetition of information decreases the report's readability and leaves findings unclear... The executive summary provides a reasonable summary and is an appropriate length, but still does not contain an overall higher-level analysis."

"Impact Evaluation of Water, Sanitation and Hygiene (WASH) within the UNICEF Country Programme of Cooperation, Government of Nigeria and UNICEF, 2009-2013" (GEROS-Nigeria-2014-012)
5.4.8 Coherence

Finding 14: Three-quarters of reports reviewed in 2014 did a good job of providing a coherent overall narrative.

In order for a report to be coherent, according to GEROS standards, it must be consistent and logical, flowing clearly from one section to the next. In 2014, 81% of reports were deemed to be coherent. This is similar to last year's assessment, which was an improvement from 58% in 2012. Incoherent reports were few and were at the same level as 2013 (3%). In cases where reports lacked coherence, this was often due to insufficiently linking elements of the report (findings, conclusions, and recommendations), leaving out key elements of the evaluation (conclusions, lessons, recommendations), or having a large number of typos and other editing issues that affected the readability of the report.
5.4.9 Addressing the Gaps in Evaluation Reports

Based on the analysis of the trends by section, there are a number of areas that require further attention in order to improve the quality of UNICEF evaluation reports.

Section A: Theories of change are often missing or not properly developed in evaluation reports. This is understandable, since this is a fairly new evaluation approach and not all evaluators or UNICEF officers are entirely comfortable with the concept. TORs should clearly specify when a theory of change needs to be developed or refined. In cases where this is not applicable or not possible, evaluation reports should state the reasons why.

Section B: One of the GEROS standards requires a justification of the evaluation criteria that are used, or not used, in the evaluation. In the reports reviewed, there is often no rationale provided for the use of specific evaluation criteria. Evaluators may not be aware of the requirement to justify evaluation criteria, especially when those criteria are first identified in the TORs and/or when the evaluator is using standard OECD/DAC evaluation criteria.

Section C: Evaluations reviewed often lacked a counterfactual to address issues of contribution or attribution. However, in many cases, a counterfactual may be impractical or not relevant. This is an area for more specific guidance in TORs and for dialogue between UNICEF and the evaluators about when it may or may not be appropriate. Another element that is often poorly executed in the reports is the integration of gender equality. UNICEF has been reporting on the Evaluation Indicator of the UN System-Wide Action Plan on gender equality and women's empowerment. More dissemination of the UN-SWAP criteria for gender-responsive evaluation is required, since not all country offices are aware of them. Gender-responsive evaluation begins with the development of the TORs, and UNICEF officers have a great deal of responsibility in ensuring that this is properly done.

Section D: UNICEF expects that a cost analysis will be carried out as part of the evaluation, and yet it is rarely done and seldom included in TORs. Cost analyses are not always possible, but when they are, clear expectations should be included in the TORs.

Section E: The identification of lessons learned is a weak area, mostly because lessons identified in evaluation reports are not generalizable to indicate their wider relevance beyond the evaluated object. This seems to indicate that many evaluators and/or evaluation managers do not understand UNICEF's definition of lessons learned. Sometimes lessons learned are not specifically mentioned in the TORs, and ensuring that they are may assist in improving this practice.
Section F: To be useful for decision makers, executive summaries need to stand alone (i.e. to not require reference to the rest of the report), and yet they also have to be concise. Executive summaries are often completed only when the final version of the report is submitted to UNICEF, which does not leave room for comments or revisions. Evaluation managers should carefully review these summaries to ensure that they can be useful.

5.5 Examples of Good Practices

The GEROS assessments were examined with a view to drawing out examples of good practices in different evaluation reports. Some of these practices are identified below:
Section A: Object of the Evaluation

Including both a narrative and a schematic description of the project Theory of Change (TOC), clearly linking project outputs to outcomes, is a good practice that should be encouraged. Theories of change are not only useful tools for identifying what a program can influence; they also provide insights on whether programs can reach their goals with the time and resources available.

Oftentimes, when a TOC is well explained at the beginning of the report, it is easier to follow the logic of the progression to results in the report. There is a clear distinction between results at different levels, and evidence is used throughout to support judgements.
Section B: Evaluation Purpose, Objectives and Scope

Having the evaluators take the description and justification of the evaluation criteria a step further, by explaining how they intend to interpret and evaluate each criterion, was identified as a good practice.
Section C: Evaluation Methodology and Gender, Human Rights and Equity

Reports that described methodological choices in a clear and transparent way and provided detailed descriptive elements were considered to be good practice.

Effort made by an evaluation team to overcome the limitation of not being able to construct a true counterfactual, by creating an ex post de facto baseline, was another example of good practice.

Including considerations for cross-cutting issues, in particular gender equality, human rights and equity, consistently throughout the evaluation is also a good practice. Using human rights vocabulary throughout the report and clearly identifying stakeholders as rights holders and/or duty bearers is another way to ensure proper use of HRBA, as is including references to the international human rights framework as well as national rights benchmarks.
Section D: Findings and Conclusions

Ideally, the organization of the findings should be based on the established evaluation criteria. Providing definitions of each criterion, so that the reader clearly understands how that criterion was used in the report, is a good practice.

Good practice was also visible in reports that clearly distinguished between findings supported by evidence and conclusions derived from the evaluators' assessment of the findings.
Section E: Recommendations and Lessons Learned

Including a breakdown of recommendations into strategic and tactical, an indicative timeframe for each, as well as a summative cost-benefit analysis for each, demonstrates a good understanding of UNICEF capabilities and processes.

Another good practice related to recommendations was the inclusion of a table for each recommendation that summarizes the recommendation and describes the action required as well as the specific 'task holder'.

Section F: Report is Well Structured, Logical and Clear

Data visualisation is important, and extra effort to display the data in a manner that is convenient for the reader is another good practice. The adequate use of summary tables, charts and graphs can greatly improve the readability of a report.
6 Conclusions

The following conclusions are derived from the analysis provided in the present report as well as discussions held with the review team over the course of the 2014 GEROS exercise.

The quality of UNICEF's evaluation reports continued to increase overall through 2014, but only moderately.

Though the rate of increase in quality has declined in recent years, the upward trend in report quality was maintained for another year. Good quality ratings rose from 69% to 74%. A few factors may be contributing to this year's increase. Firstly, at the regional level, reports from some regions increased or maintained their level of good quality ratings (HQ, CEE-CIS, and ROSA) while others declined in quality (ESARO, WCARO). Those that declined in quality also submitted many fewer reports than in 2013. This may have had an impact on overall GEROS quality ratings, and also raises the question of whether there are links between the quality of evaluations and the extent of participation in the GEROS process. Secondly, a far greater proportion of reports included TORs this year, up from 57% to 88%. This may also be a contributing factor in improved quality, as in previous years reports with TORs tended to be rated higher than reports without TORs.
The contribution of reports to the GEROS process by some UNICEF regional and country offices appears to be dropping.

In 2014, the total number of reports submitted to GEROS declined from 96 reports in 2013 to 69. This decline reflects many fewer reports submitted by LACRO, ESARO, and MENARO, while other regions (ROSA) and HQ submitted more. This may reflect diverging interest from regions in participating in the GEROS process. These changes have also resulted in less linguistic diversity in the reports being submitted.

Certain SPOA areas are covered more than others (e.g. education, which constitutes 32% of evaluations), and a few areas may be under-represented in the overall mix. Programme and project evaluations are by far the most common, and country programme evaluations remain few in number (3), despite the importance of the country as a unit of analysis. UNICEF may want to know more about the issue of evaluation coverage and eventually complement the information obtained through GEROS in order to determine if it is achieving an acceptable level of coverage.
Reports continue to demonstrate shortcomings similar to those found in 2013.

The data collected during this review cycle revealed that, while good quality ratings for most of the sub-sections in the GEROS template improved, some standards remained particularly weak, especially those relating to ethics, theory of change, justifying evaluation criteria, cost analysis, human rights and gender, targeting recommendations (and explaining the process used to develop recommendations), and lessons learned. In some cases, these can be addressed by filling gaps in the description of certain elements that may have been present in the background of the evaluation (such as ethical safeguards and a theory of change), or by making explicit that which has been left implicit in the narrative. In other cases, a significant improvement will require different approaches and analysis, as in conducting cost analysis. It appears that many evaluators do not understand UNICEF's definition of, or expectations around, some components, such as lessons learned. Though not all these components are paramount to methodological robustness, they remain important to UNICEF, and their omission or lack of detail had negative repercussions on overall performance in the GEROS process. On the other hand, the far greater inclusion of TORs in reports this year was a positive development and also aided reviewers in assessing GEROS ratings more accurately.
While the majority of reports are increasing in quality, considerable further improvement is now most likely to be achieved through careful analysis of evaluation management structures and oversight systems. Clear, complete and detailed TORs favour the production of better workplans, which in turn enhance the quality of drafts and evaluation reports. The management and oversight of each evaluation stage should incorporate, communicate and apply relevant UNICEF standards. According to the 2014 Annual Report on the Evaluation Function and Major Evaluations, regions are already moving in this direction.

After six years of applying the GEROS template with minor changes, it may be time to more significantly adjust the content and structure of the template to address ongoing issues.

While small adjustments have been made over time to improve the GEROS template, some areas of the template have proven to be problematic in terms of maintaining consistency in interpretation and ratings, efficiency, or allowing flexibility for different types of evaluations. Improvements could be made to the rating system, the structure and content of the framework, and the GEROS process. Details of problematic areas and suggested changes, based on Universalia's three years of experience with this process, are addressed in Appendix III.
7 Recommendations

As in previous years, the recommendations provided herewith aim to encourage the improvement of UNICEF's broader evaluation practices as well as the quality of its evaluations, through the effective application of GEROS as a quality assurance system.

The following recommendations were formulated in view of the findings and conclusions of this report, based on discussions held with the GEROS review team at Universalia. At the request of UNICEF, Universalia has also made specific comments and suggestions for changes to the GEROS template (see Appendix III). These recommendations incorporate feedback from the UNICEF Evaluation Office and Regional Offices on the draft report.
Recommendation 1: UNICEF should examine whether the increase in quality of evaluation reports, as assessed through GEROS, has resulted in senior managers having greater confidence in evaluation reports.

The GEROS system is an important element in harmonising evaluation standards across UNICEF evaluations at the global level. Indeed, the GEROS exercises conducted thus far have noted a steady increase in the quality of reports submitted, which should also mean that they are better meeting the needs of decision-makers. The GEROS process – and evaluations more generally – remains only one piece of a larger, organisation-wide feedback system.

Given that the quality of decentralised evaluation appears to be on the right track, it is now time to think about and assess the overall feedback system that is available to senior management, and how evaluation fits into that system. Going forward, efforts should be made to analyse how the GEROS system contributes to the increased use and value of UNICEF's evaluation function, especially at the regional and country levels. More specifically, UNICEF should see if, as a result of the increased reporting quality noted, senior managers now have greater confidence in the reports produced, and whether the information obtained is useable. In addition, the EO may want to further understand the reasons for the decline in evaluations submitted to GEROS from several regions in 2014. It may be that regions are tending to use other types of feedback (e.g., from research) more often than evaluations.

This recommendation is relevant to the Evaluation Office.
Recommendation 2: Within its decentralised evaluation strategy, UNICEF should continue to build its own regional/country office evaluation capacities and national capacities to conduct relevant types of evaluations.

UNICEF regions play a role in strengthening evaluation within UNICEF. While good foundations have been established, there is still a need to ensure that targeted, focused support is provided to build UNICEF regional and country office capacity and national evaluation capacities to address recurrent shortcomings. This capacity support should take into account a number of issues, including:

The majority of evaluation reports submitted in 2014 focused at the programme and project levels. However, is there a need for more evaluation at a strategic level, including evaluation of upstream work, cross-cutting themes, etc.?

If the country is a critical unit of analysis in development (and important to inform future Country Programme documents), UNICEF should continue to encourage country-level evaluations (including country programme evaluations or joint evaluations).
UNICEF has encouraged country-led evaluations, where it acts as a partner rather than a leader in the evaluation process, but there are still relatively few of these being carried out. UNICEF should continue to promote and support the development of capacity for country-led evaluation, which is likely to be an important element in implementing the Post-2015 Agenda.

This recommendation is relevant to the Evaluation Office and Regional Offices.
Recommendation3: Specialeffortsshouldbemadetostrengthencertainaspectsofevaluationreportsthathavebeenconsistentlyweakinthepastfewyears.
OverthepastfewGEROScycles,performanceonsomeevaluationcomponentshasbeenconsistentlyweak.Indeed,thedescriptionofthetheoryofchange,ethicalconsiderations,anddevelopmentoftargetedrecommendations;thejustificationforevaluationcriteria;theintegrationofgenderandhumanrights;aswellasthesystematicinclusionofgoodqualitylessonslearnedhavebeenchallenging.CostanalysisisanothercomponentthatrequiresgreaterattentionorneedstobereframedinarevisedGEROSreviewframework,ifitcontinuestobeimportanttoUNICEF.Inordertoaddresstheseweaknesses,UNICEFshouldfocusfurthereffortsandattentiononimprovedtrainingandguidancearoundthesestandardsinparticular.
This recommendation is relevant to the Evaluation Office.

Recommendation 4: UNICEF should continue to update and systematically communicate its requirements for evaluation reports across its entire evaluation oversight/management system. These updates should take into account evolving standards for evaluation in the UN System.
In an effort to continuously improve its evaluation oversight system, UNICEF should ensure that its priorities, standards and criteria are clear and up to date across all evaluation management levels, and that they best reflect UNICEF's and UN requirements, the current state of practice, and a streamlined review process. This includes reviewing and clarifying the GEROS template (which UNICEF does periodically), as well as the systematic integration of evaluation priorities and standards in all TORs, which should translate into better-quality inception reports and thus improved evaluation reports. This would also include the UN-SWAP standards, which are not widely known in countries. For their part, Regional Offices can follow up with Country Offices to emphasise the importance of systematically including TORs within draft and final evaluation reports. Specific suggestions on amendments to the GEROS template are provided in Appendix III.
This recommendation is relevant to the Evaluation Office and Regional Offices.

Recommendation 5: As part of the periodic review of GEROS, UNICEF should consider revising the rating scale and several elements of the GEROS template in order to ensure greater precision in the messages that are provided about evaluation quality and the characteristics of evaluation reports, and to create more efficiency in applying the template.
This recommendation emerges from the experience of conducting this meta-evaluation over the past three years. For the 2012 cycle, after three years of implementing GEROS, UNICEF changed the rating scale: the rating of "Almost Satisfactory" was changed to "Mostly Satisfactory", while the traffic-light colour (yellow) associated with the rating remained the same. The use of the term "Mostly Satisfactory" for a yellow rating may be misleading, as a yellow traffic light suggests something in which there is only partial compliance, and where there is still room to improve in order to become satisfactory. In addition, the change in the rating also created a larger gap between the "Unsatisfactory" and "Mostly Satisfactory" ratings, which creates challenges in judging how to rate particular questions or standards in the template. In Appendix III, we provide further comments on the scale and identify other areas where greater clarity or efficiency can be introduced in applying the evaluation report classification and quality standards, as described in the GEROS template.
8 Lessons Learned

UNICEF defines lessons learned as contributions to general knowledge that help expand common understanding. The following lessons aim to build on the lessons provided in the 2013 meta-analysis, based on the experience of the 2014 review process.
Clear and systematic communication of evaluation standards and priorities favours the effective alignment of evaluations with UNICEF standards from the outset (i.e. the TORs stage).

The terms of reference constitute the key starting document of any evaluation. In order to ensure that evaluations respond to UNICEF's expectations and priorities, evaluation TORs should clearly identify the standards by which reports will be judged in the GEROS process. In this way, high-quality reports will not be penalised for omitting certain components due to a lack of awareness of UNICEF expectations on the part of both the evaluator and the evaluation manager.

While common standards help improve evaluation quality, quality assurance systems such as GEROS should provide sufficient flexibility to account for different types of evaluations.

Common standards help ensure that evaluation reports are judged according to the same standards and strive to attain specific goals and objectives. However, a balance should be found between common standards and flexibility, to take into account the different types of evaluations produced each year within UNICEF. Currently, the GEROS process illustrates this tension between set objectives and flexibility, as the template used does not always adapt easily to specific types of evaluation reports (e.g. impact evaluations, and separately published case study evaluations which are part of a broader evaluation).
In a decentralized system, compliance with quality assurance systems such as GEROS is affected by incentives, available resources, and the perception of relevance.

UNICEF's decentralized structure of regional and country offices leads to challenges in maintaining consistency in how evaluations are conducted, a challenge which GEROS was created to address. For full participation in a process such as GEROS, the relevance and utility of the process must be clear to the regions. Part of maintaining relevance is ensuring that the system is continuously updated to reflect changing expectations and priorities.
Quality assurance systems such as GEROS need to strike a balance between consistent application over a period of time (which allows for comparison) and making major adjustments in order to improve utility and reflect changes in the environment.

Updating and changing the GEROS template every year is not necessary and would be time-consuming. It is, however, necessary to have a formal and consultative review every two to three years in order to keep the process relevant and to update the GEROS template as required in light of new developments or requirements.
Appendix I List of Findings

Finding 1: In 2014, the quality of reports submitted continued to increase overall, accompanied by a large increase in the number of reports with TORs included in the appendices or as separate documents.

Finding 2: The quality of UNICEF evaluation reports varied since last year's review process, with some regions improving and others declining.

Finding 3: As in previous years, most reports focused on initiatives with a national scope, and they were overall good-quality reports.
Finding 4: Most evaluation reports in the sample were managed by UNICEF and were considered quality reports. With respect to evaluation purpose, programme and project evaluations continue to represent the largest proportion of evaluations reviewed.
Finding 5: A larger proportion of evaluations are summative this year, and a majority of reports focus on education as a thematic area.

Finding 6: Almost all reports submitted to GEROS are independent external evaluations, which is a shift since 2012, when internal evaluations (or reviews) were still sometimes classified as evaluations. A majority of the reports submitted were in English.

Finding 7: In 2014, the quality of reports increased slightly in two sections, decreased slightly in another, and changed little in three other sections. The majority of reports continue to be aligned with the UNICEF-adapted UNEG standards for evaluation reports.

Finding 8: In 2014, the description of the evaluated object and its context declined somewhat in quality compared to 2013. Evidence suggests that the description of the theory of change and of stakeholder roles and contributions remain areas for improvement.

Finding 9: The extent to which reports met standards of evaluation purpose, objectives and scope changed little from previous years. Strengths of reports in this area included clear purpose, objectives and scope and a clear list of evaluation criteria. However, the justification for the selection of evaluation criteria still requires greater attention.
Finding 10: Reports have made improvements in terms of the description of the methodology. While methodological robustness is often satisfactory, gaps in ethical considerations and stakeholder participation may have affected the overall ratings of this section.
Finding 11: The presentation of findings, conclusions, contribution and causality continued to improve in 2014. Cost analysis remained particularly challenging.
Finding 12: Largely unchanged since 2013, the number of good-quality reports for Section E remained the lowest of all sections. Clearer identification of target stakeholder groups and lessons learned could help improve the ratings for this section.
Finding 13: Good-quality ratings for Section F did not significantly change compared to 2013, but this section continues to have among the highest overall good-quality ratings. The majority of evaluations were logically structured, but issues were noted regarding the length of executive summaries or their ability to stand alone.

Finding 14: Three-quarters of reports reviewed in 2014 did a good job of providing a coherent overall narrative.
Appendix II List of Recommendations

Recommendation 1: UNICEF should examine whether the increase in quality of evaluation reports, as assessed through GEROS, has resulted in senior managers having greater confidence in evaluation reports.

Recommendation 2: Within its decentralised evaluation strategy, UNICEF should continue to build its own regional/country office evaluation capacities and national capacities to conduct relevant types of evaluations.

Recommendation 3: Special efforts should be made to strengthen certain aspects of evaluation reports that have been consistently weak in the past few years.

Recommendation 4: UNICEF should continue to update and systematically communicate its requirements for evaluation reports across its entire evaluation oversight/management system. These updates should take into account evolving standards for evaluation in the UN System.

Recommendation 5: As part of the periodic review of GEROS, UNICEF should consider revising the rating scale and several elements of the GEROS template in order to ensure greater precision in the messages that are provided about evaluation quality and the characteristics of evaluation reports, and to create more efficiency in applying the template.
Appendix III Detailed Comments on GEROS Framework

Based on Universalia's multi-year experience as the external consulting firm managing the GEROS process, we offer these comments to contribute to future improvements of GEROS in terms of improving efficiency and rating consistency. These considerations could be used to update the template and the accompanying process for the next GEROS cycle.

Iterative structure of the template

From a user perspective, the structure and design of the template work well in terms of organizing a large number of criteria into clear divisions and visually demarcating which ratings have been applied. However, we find that the cascading structure of multiple columns (plus a row) for comments on criteria is unnecessarily repetitive and might be made more efficient without loss of important data. In the current template:

Of the comment columns in the spreadsheet, the "Remarks" column provides the details and justification for each rating.

The "Constructive Feedback" column provides recommendations on how to improve the report.

The Executive Feedback row at the end of each section provides, at a glance, the strengths and weaknesses of the report for managerial purposes.

Column J is intended to summarize comments on all indicators in the section, particularly focusing on the cornerstone questions.
We find that Column J is an unnecessary level of summary that does not contribute in a significant way to further analysis. It is additionally confusing to reviewers, in that the template annotations in Column J emphasize the summary of only certain criteria (the cornerstone questions), leading to inconsistencies in how reviewers complete this column. We would suggest removing this column and allowing the Executive Feedback row to act as the summary for the section.
Rating scale

There are several interconnected issues we would like to identify with the 4-point rating scale (5 if Not Applicable is included) used in the GEROS template. We recommend that UNICEF consider revising the rating scales and adding a few clarifications.
The current rating scale does not help to differentiate levels of quality: While the Outstanding, Highly Satisfactory, and Mostly Satisfactory levels are separated from one another in what feel like equivalent increments, there is a large gap between the ratings of Mostly Satisfactory and Unsatisfactory. This gap is partly due to the phrase "Mostly Satisfactory" itself, which suggests that the report is doing well, when it really signals only partial compliance with a particular standard. This has created difficulty for reviewers in the overall rating of a report, where a report is not completely Unsatisfactory but is also not Mostly Satisfactory. In the end, a judgment must be made about which rating is the best fit for that particular report, yet the current rating scale does not facilitate that.
Rating colors and labels: While the use of dark green, light green, amber, and red intuitively reflects a continuum of positive and negative, the use of amber for Mostly Satisfactory is not an intuitive match. With the label suggesting a positive and the color suggesting problems, this mismatch contributes to inconsistency between reviewers in how the rating scale is interpreted and applied. It also misleads the "readers" or "recipients" of the review, who might celebrate a "Mostly Satisfactory" when in fact it reflects that there is room for significant improvement.
Use of the Unsatisfactory rating: The absence of a rating between Unsatisfactory and Mostly Satisfactory also causes challenges for rating individual questions. With no rating in between these, nor a rating to differentiate 'absent' from something that is present but not well done, reviewers have typically reserved Unsatisfactory (red) for questions where the standard is completely absent or very poorly addressed, and Mostly Satisfactory if the report partially meets the standard. However, in the future it may be necessary to distinguish reports in which the standard is completely absent and to further refine the meaning of partial compliance.
Use of Not Applicable: It has been an ongoing challenge to decide when to apply a rating of Not Applicable rather than Unsatisfactory. At times, reviewers have used Not Applicable to indicate the absence of something. For example, if no cost analysis has been conducted, reviewers have sometimes rated criterion 35 ("Is a cost analysis presented that is well grounded in the findings reported?") as Not Applicable, as it was not mentioned in the TORs, and may also have been seen by reviewers as not feasible for a particular evaluation. In other cases, its absence was rated Unsatisfactory, based on the interpretation that it is a UNICEF expectation regardless of whether it is in the TORs, and that its absence should have been justified. We managed this by discussing each review on a case-by-case basis; however, in the future, clear parameters for the use of Not Applicable should be established with UNICEF in order to further enhance consistency in the ratings.
Content of the template

The six sections of the GEROS template each contain between five and 13 questions, grouping together related criteria.

Section C: Methodology: While in most cases the criteria are logically organized, Section C (methodology, gender, human rights, and equity) contains disparate elements and an unwieldy number of criteria (13), making it difficult to manage and summarize. The largest grouping of criteria within this section, related to human rights, gender, and equity, could be removed and placed in its own section to balance the size. We would suggest having a specific section on cross-cutting issues and/or gender-responsive evaluation in the template that could also include components of the SWAP. This consolidation of the two templates, as discussed below, would simplify the overall GEROS process. If this were done, the existing gender-related criteria should be amended to eliminate the significant overlap with the SWAP criteria.
Section C: Gender, human rights, and equity: This sub-section of Section C poses challenges for reviewers. Criteria 20 and 21 conflate the three themes of gender, human rights, and equity, such that a report that addresses one theme well might receive a positive rating while neglecting the other themes to varying extents. Further, there is considerable overlap between criteria 20-24, leaving little for a reviewer to differentiate between 20 and 21, or compared to 22-24. It is time-consuming for a reviewer to search for the subtleties between these criteria ratings in a report, while adding little to the overall analysis. Having a distinct section on cross-cutting issues could allow UNICEF to expand on the gender equality, human rights and equity themes, as opposed to having them amalgamated into the same criteria.
Comments on other criteria:
Specific criteria pose particular challenges to reviewers. These are summarized in the following table:

Criteria: 3. Does [the context] illuminate findings?
Issue and comment: This criterion does not add significant value to the analysis. Well-explained context that is related to the object (criterion 2) naturally illuminates findings.
Possible action: Eliminate criterion 3.

Criteria: 12. Does the evaluation provide a relevant list of evaluation criteria that are explicitly justified as appropriate for the purpose?
Issue and comment: Due to the phrasing in the template, there has been confusion as to whether the standard OECD/DAC criteria still require some kind of justification or explanation. We have provided internal guidance to reviewers, but this could be made more explicit in the template itself.
Possible action: Be explicit in the template about whether standard criteria require any explanation or justification.

Criteria: 13. Does the evaluation explain why the evaluation criteria were chosen and/or any standard DAC evaluation criteria rejected?
Issue and comment: This criterion overlaps with #12, and it is unclear how it should be rated if all the evaluation criteria used are those of the OECD/DAC and are not justified.
Possible action: Be explicit in the template about whether standard criteria require any explanation or justification, and/or amalgamate with #12.

Criteria: 16. Are ethical issues and considerations described?
Issue and comment: The annotation for this criterion suggests that the design of the evaluation should contemplate how ethical the initial design of the programme was, as well as the ethics of who is included and excluded in the evaluation. It is very uncommon for any report to address the first point, contributing to unnecessarily low ratings on this criterion. The issue of participation in the evaluation is a significantly different concept and is also addressed in criterion #26.
Possible action: Target the content of this indicator more specifically, or consider moving this indicator to the Stakeholder Participation sub-section.

Criteria: 20-24. Human Rights, Gender and Equity sub-section
Issue and comment: See discussion above.

Criteria: 35. Is a cost analysis presented that is well grounded in the findings reported?
Issue and comment: It is not clear whether a cost analysis should be expected from all evaluation reports or only when the ToRs request such an analysis.
Possible action: Clarify whether a cost analysis is an obligatory or optional component of an evaluation report.

Criteria: 49. Are lessons learned correctly identified? / 50. Are lessons learned generalised to indicate what wider relevance they may have?
Issue and comment: These criteria overlap. If lessons learned are correctly identified, it follows that they are generalisable, since that is the definition of a lesson learned.
Possible action: Clarify the wording to differentiate these criteria, or amalgamate them.

Criteria: 55. Is an executive summary included as part of the report?
Issue and comment: This is worded as a Yes/No question, but the reviewer is required to select from the 4-point rating scale.
Possible action: Change this rating to Yes/No.
Background section (Types of Reports): The top portion of the GEROS template asks reviewers to classify reports into a number of categories (geographic scope, level of results, etc.). This provides a useful snapshot during the meta-analysis of the types of reports included in the sample and how they performed. However, some of these categories are problematic, in that they are not always made clear in the report, there is insufficient guidance in the template for the category, or the categories overlap. Specific challenges include:

The Management category: it is not always clear from evaluation reports which Management category applies, particularly in terms of the level of control that was held by stakeholders outside UNICEF.

Similarly, the Level of Independence is not always made clear in evaluation reports, particularly the level of control of activities by UNICEF staff or Evaluation Office professionals.
The inclusion of both Real-time and Humanitarian as mutually exclusive categories under Purpose: Virtually all Real-time evaluations are Humanitarian evaluations; which category should take precedence? The solution could be to simply add an "Other Humanitarian" option, which would give room to classify humanitarian evaluations that may not be referred to as Real-Time Evaluations per se.

The inclusion of Humanitarian as a category under Purpose, as well as a category under SPOA Correspondence: This results in some Humanitarian evaluations being marked as Humanitarian under Purpose, but possibly as something else (e.g. WASH) under SPOA Correspondence. While this may accurately reflect a Humanitarian evaluation that focused on a WASH intervention, the meta-analysis of SPOA Correspondence does not reflect this WASH evaluation example in its analysis of Humanitarian evaluations.
As it is not always entirely obvious from a report which classification should be applied, we suggest that the regions submitting the report specifically provide the required classification information using a simple form that should accompany the report.
Application of the template to different types of evaluations

We have found that the GEROS template leaves little flexibility for evaluations that are not considered typical, such as impact evaluations or case studies within a larger evaluation study. UNICEF could adapt its current template to account for possible variation in content, structure and focus within reports. Naturally, caution should be taken to prevent the template from becoming so flexible that common standards are easy to circumvent. We offer some feedback below.
Case Studies: We recommend that only the overall report be judged. Often, the country case studies are not designed to be stand-alone evaluations; they are designed as inputs to a corporate or regional evaluation. Moreover, case studies vary significantly in terms of scope and approach. This is why we would argue that they should not be assessed through the GEROS process. However, the case studies must be available for review in order to provide enough depth to the analysis.
Impact Evaluations: Impact evaluations are different in a number of respects and in many cases do not follow the typical methodology used by other programme evaluations. They tend to be rated less favorably on cross-cutting issues, for example, or on the application of OECD/DAC criteria. Ideally, impact evaluations should use a distinct template. If this is not possible, great care and flexibility need to be applied in reviewing these reports.
UNDAF Evaluations: UNICEF will need to decide whether it would like to assess UNDAF evaluations as part of the GEROS process. Flexibility is needed to review these UN-wide reports: most of the criteria in the GEROS template apply to UNICEF evaluation reports, and UN-wide reports are not expected to comply with all of these requirements. If they remain a part of the GEROS process, we recommend that the review of UNDAF or joint evaluations be assigned to senior reviewers who, based on their experience, can more readily recognize how and whether to rate certain categories in unusual contexts.
We do not recommend creating a new template for each different type of report, since it would diminish the comparability of all the reports for meta-analysis purposes. However, UNICEF must understand that leaving the template as is may lead to lower ratings due to the absence or special treatment of some criteria in different types of evaluation reports.
Recommendations for the GEROS template

Re-visit the rating scale and resolve the discrepancies between Mostly Satisfactory and Unsatisfactory. It is the lower end of the rating scale that poses difficulties for reviewers. Changes to the rating scale to address the issues described should include adding a rating level between Unsatisfactory and Mostly Satisfactory (e.g. Partially Satisfactory). In all cases, we suggest that the service provider draft specific guidance on these issues and discuss it with UNICEF, in order to clarify the circumstances in which to use these ratings. As noted in the recommendation on the GEROS process, this kind of guidance or manual should be costed in a future long-term agreement with a service provider.

Develop and share definitions of the rating scale options to ensure that UNICEF and reviewers have the same understanding. Clear definitions of what each rating scale option means and when it should be applied should be developed by UNICEF and provided to reviewers.
Separate the human rights, gender and equity sub-section from Section C and place it in its own section. Consideration could also be given to reducing the overlap between these criteria by reducing the number of criteria or differentiating them further from one another. Currently, statements referring to human rights, gender and equity are lumped together, e.g. question 20: "Did the evaluation design and style consider incorporation of the UN and UNICEF's commitment to a human rights-based approach to programming, to gender equality, and to equity?" Specific ratings could be provided on each of these items.
Remove Column J from the template and allow the Executive Feedback row to serve as the summary of the section. Removing Column J reduces effort while not removing any data or analysis, and still allows for a summary in the Executive Feedback row, as well as constructive feedback on particular issues.
Amend problematic standards to allow for more efficiency and consistency. As described in the table in the discussion above, there are a number of standards which are difficult to assess, overlap with other standards, or are unclear.
Simplify the report classification categories or ask country offices to complete a short form to identify the correct classification information when submitting the report. The report classification questions and options often overlap and are unclear: in geographic scope, it is often difficult to differentiate between sub-national and national scopes; in purpose, "humanitarian" and "real-time evaluation" are mutually exclusive; and in the level of results sought (outputs or outcomes), it is often impossible to determine which one is right. The management of the evaluation is another problematic area to cover, since terms of reference are not always clear about who will manage the evaluation, and evaluation reports seldom mention to whom exactly they reported.
Comments on the SWAP Framework

The SWAP template was considerably simplified for 2014, eliminating the problematic criteria that often required information not found in evaluation reports. This has enabled reviewers to respond to the criteria in the vast majority of cases and should help lead to more valid ratings.
Possible amalgamation of SWAP and GEROS templates

Last year, one of the possibilities raised by Universalia was to incorporate the SWAP criteria into the GEROS template, so that each report goes through only one integrated rating process, which would help make the review process more efficient. At its previous length of 13 indicators, the SWAP criteria would have been somewhat unwieldy to integrate into GEROS, but with four criteria, this is entirely feasible. There are a few different approaches that could be used:

1) Place the SWAP indicators into the GEROS template as is, in their own section or in a section dedicated to cross-cutting issues, maintaining their own rating scale or using that of GEROS.

2) Place the SWAP indicators into the GEROS template in their own section, and amend or amalgamate these indicators, as well as the existing GEROS gender-related indicators, to ensure there is no duplication.

These approaches maintain the ability to assess gender ratings independently, though comparability with the 2014 exercise will depend on the specific changes made to the criteria, if any.
Content of the template

Regardless of whether the SWAP template continues to be an independent document or is merged with the GEROS template, a clarification to the criterion 4 annotation would be beneficial:

Criterion 4: The evaluation findings, conclusions and recommendations reflect a gender analysis.
The annotation for this criterion states that "The evaluation report should also provide lessons/challenges/recommendations for conducting gender-responsive evaluation based on the experience of that particular evaluation". This is rarely done, nor is it clear that this would be seen to be within the mandate of a typical evaluation. This might better be described as an exceptional activity rather than an expectation, or the wording could be removed.
Recommendations for the SWAP template

Consider integrating SWAP criteria into the GEROS template. Integrating these criteria into GEROS could lead to greater efficiency without losing the ability to independently analyse or highlight data related to gender.
Amend the wording of the annotation for Criterion 4. The wording of the annotation for criterion 4 raises an expectation that may not be within the mandate of an evaluation, except in exceptional cases, leading to unnecessarily low ratings on this criterion.
Appendix IV Terms of Reference

TERMS OF REFERENCE FOR GLOBAL EVALUATION QUALITY OVERSIGHT SYSTEM, EVALUATION OFFICE

Background

UNICEF put in place an Evaluation Quality Assurance System to ensure evaluations managed/supported by UNICEF meet quality standards. The system is composed of a) the Global Evaluation Reports Oversight System (GEROS), managed by the Evaluation Office (EO), which rates final evaluation reports commissioned by UNICEF Country Offices (CO), Regional Offices and HQ divisions against the UNEG/UNICEF Evaluation Report Standards, and b) Regional QA Systems, managed by Regional Offices (RO), which assess draft ToR and draft reports.
UNICEF is looking for an institution to review and rate the quality of draft ToR, as well as draft and final evaluation reports supported by UNICEF country and regional offices all over the world, as well as HQ divisions.
Expected results

The selected institution will review draft ToR, as well as draft and final evaluation reports in English, French and Spanish received by the EO and selected ROs (up to a maximum of 200 draft/final reports and 50 draft ToR in a one-year timeframe), rate them against UNEG/UNICEF standards, write executive feedback to be sent to the CO concerned, and analyse trends, key weaknesses and strengths of UNICEF-managed/supported evaluation reports and ToRs.

Expected deliverables

Within the Global Evaluation Quality Oversight System, the selected institution will deliver the following outputs:

A. Draft ToR and draft reports (contract to be managed by Regional Offices)

A1: Draft evaluation ToR and draft evaluation reports reviewed, rated and executive feedback sent
UNICEF Country Offices send the draft ToR and reports to the Regional Office for real-time quality review and practical comments on how to improve them. The institution will carry out such reviews in a maximum of 3 working days for draft ToR and 5 working days for draft reports. The institution will provide professional and practical feedback according to pre-agreed templates (see hyperlink below for the evaluation reports, and attachment for the ToR).
A2: Regional overview of evaluation draft ToR and draft reports reviewed
The institution will undertake an annual review of the feedback provided (see attachment with last year's example) and identify lessons to be learned on evaluation ToR and reports. It will compare these results with those which emerged from the two previous yearly exercises undertaken in the region, and will identify lessons to be learned, emerging good practices and actionable recommendations to improve the quality assurance system as well as the quality of evaluations in the specific region assessed.
A3: Regional Evaluation Help Desk

The objective is to ensure real-time troubleshooting and ad hoc technical assistance to UNICEF Country Offices when requested, for instance providing a second review of ToR, specific technical notes, etc. The timing and content of any specific task are to be agreed beforehand with the RO concerned.
B. Final Reports (contract to be managed by EO)

B1: Final evaluation reports reviewed, rated and executive feedback sent
Download the final reports from the UNICEF Intranet database, and review and rate final evaluation reports received in English, French and Spanish against UNEG/UNICEF standards using the Feedback Template, both the comprehensive and the executive one (available at http://www.unicef.org/evaluation/files/Tool_2012_v2.xlsx), highlighting strengths, weaknesses and recommendations to improve the quality of future evaluation reports.
The estimated total number of final evaluation reports to be reviewed will be between a minimum of 50 and a maximum of 150 in a one-year timeframe, of which about 80% will be in English, 15% in French and 5% in Spanish.
Reports must be fully rated and the feedback given within 10 working days of receipt. At times, there may be as many as 20 to be handled within the 10-day period. If the number of reports to be rated within the 10 working days exceeds 20, the rating time will be extended.
B2: Global analysis of trends, key weaknesses and strengths of reports reviewed
Every year, produce a meta-evaluation based on the assessments of all final reports reviewed that year, highlighting key trends, key weaknesses and strengths of reports reviewed, including lessons learned and good practices on evaluation reports, and actionable conclusions and recommendations to improve GEROS as well as the quality of evaluation reports. Please refer to the latest meta-evaluation, available at http://www.unicef.org/evaldatabase/index_60935.html. A PowerPoint highlighting key issues should also be prepared.
Management of the system
This Long Term Agreement covers a) the Global Evaluation Reports Oversight System (GEROS), managed by the Evaluation Office (EO), which rates final evaluation reports commissioned by UNICEF Country Offices (CO), Regional Offices and HQ divisions against the UNEG/UNICEF Evaluation Report Standards, and b) Regional QA Systems, managed by Regional Offices (RO), which assess draft ToR and draft reports. However, the Evaluation Office will raise a contract to cover the GEROS system only, while Regional Offices (if any) will raise separate contracts to cover the Regional QA Systems.
The contract raised by the EO (covering the GEROS system) will be managed by the Senior Evaluation Specialist, Systemic Strengthening, with the support of the Knowledge Management (KM) Specialist. The contracts (if any) raised by Regional Offices will be managed by the respective Regional M&E Chiefs.

The selected institution will appoint a project manager who will ensure consistency of rating, quality and timely delivery of expected products, and overall coordination with the UNICEF Evaluation Office. The project manager will also provide an update on a monthly basis, which will include a tracking matrix highlighting the status of reviews, ratings and executive feedback. The project manager will be the point of contact with UNICEF for any issues related to this Long Term Agreement.
Please note that, to avoid potential conflict of interest, the following will be applied:
- The company that wins this bid will not review any ToR, draft and/or final evaluation reports of evaluations conducted by the same company
- The reviewer who rates the final evaluation reports will be different from the reviewer who rates the draft ToR and/or draft report
Qualifications
- Excellent and proven knowledge of evaluation methodologies and approaches
- Proven experience with quality review of evaluation reports, preferably with UN agencies
- Proven practical professional experience in designing and conducting major evaluations
- Excellent analytical and writing skills in English required. Adequacy in French and Spanish required, with excellence in French and Spanish a strong advantage
- Familiarity with UNEG/UNICEF evaluation standards is an asset
- Sectoral knowledge in child survival and development and in at least two other UNICEF areas of intervention (education; HIV/AIDS; child protection; social protection), in the English language
- Knowledge and expertise of other or similar quality assurance systems will also be an asset
- Proven capacity in managing databases
Duration of contract
The Long Term Agreement will start on 1 October 2012 and will expire on 30 August 2015.
Appendix V: List of Reports Assessed

No. | Country | Region | Year | Sequence # | Title | Rating
1 | Afghanistan | South Asia Regional Office | 2015 | 2014/008 | Let Us Learn (LUL) Formative Evaluation, UNICEF Afghanistan Country Office | Outstanding, best practice
2 | CEE/CIS and Baltic States | Central & Eastern Europe, Commonwealth of Independent States RO | 2014 | 2014/008 | RKLA 3 multi-country evaluation: increasing access and equity in early childhood education – final evaluation report | Outstanding, best practice
3 | Rep. of Uzbekistan | Central & Eastern Europe, Commonwealth of Independent States RO | 2014 | 2014/003 | Evaluation of Country Programme of Co-operation between Government of Uzbekistan and UNICEF 2010-2014 | Outstanding, best practice
4 | Republic of Montenegro | Central & Eastern Europe, Commonwealth of Independent States RO | 2014 | 2014/004 | Final Evaluation of the "Child Care System Reform" | Outstanding, best practice
5 | Afghanistan | Corporate (HQ) | 2014 | 2014/001 | Afghanistan Country Case Study: UNICEF's Upstream Work in Basic Education and Gender Equality 2003-2012 | Highly satisfactory
6 | Afghanistan | South Asia Regional Office | 2014 | 2014/007 | In-depth evaluation of female literacy programme | Highly satisfactory
7 | Algeria | Middle East and North Africa Regional Office | 2014 | 2014/003 | Évaluation du programme élargi de vaccination dans les camps de réfugiés sahraouis de Tindouf | Highly satisfactory
8 | Bangladesh | South Asia Regional Office | 2014 | 2014/001 | Final Evaluation of Basic Education for Hard to Reach Urban Working Children (BEHTRUWC) Project, 2nd Phase 2004-2014 | Highly satisfactory
9 | Bangladesh | South Asia Regional Office | 2014 | 2014/014 | Impact evaluation of the WASH SHEWA-B programme in Bangladesh | Highly satisfactory
10 | Bangladesh | South Asia Regional Office | 2015 | 2014/002 | Let Us Learn Formative Evaluation (case study) | Highly satisfactory
11 | Barbados | Latin America and Caribbean Regional Office | 2014 | 2014/002 | Evaluation of the Pilot of the Early Childhood Health Outreach Program in St Vincent and the Grenadines | Highly satisfactory
12 | Benin | West and Central Africa Regional Office | 2014 | 2014/001 | Évaluation des espaces enfances au Bénin | Highly satisfactory
13 | Bosnia and Herzegovina | Central & Eastern Europe, Commonwealth of Independent States RO | 2014 | 2014/004 | Increasing Early Opportunities for Children in Bosnia and Herzegovina | Highly satisfactory
14 | Brazil | Corporate (HQ) | 2014 | 2014/001 | Brazil Country Case Study: UNICEF's Upstream Work in Basic Education and Gender Equality 2003-2012 | Highly satisfactory
15 | Cambodia | Corporate (HQ) | 2014 | 2014/001 | Country Case Study: Cambodia – UNICEF's Upstream Work in Basic Education and Gender Equality 2003-2012 | Highly satisfactory
16 | CEE/CIS and Baltic States | Central & Eastern Europe, Commonwealth of Independent States RO | 2014 | 2014/020 | Multi-Country Evaluation of Regional Knowledge and Leadership Areas: Including All Children in Quality Learning in CEE/CIS | Highly satisfactory
17 | Colombia | Latin America and Caribbean Regional Office | 2014 | 2014-001 | Joint Evaluation of the Programs of Canadian Cooperation (2009-2013) and Sweden (2011-2013) Implemented in the Country Office for Colombia of the United Nations Fund for Children | Highly satisfactory
18 | Comoros | Eastern and Southern Africa Regional Office | 2014 | 2014/002 | Rapport de l'évaluation finale du Cadre des Nations Unies pour l'aide au développement (UNDAF) et de son rôle d'appui à la stratégie de croissance et de réduction de la pauvreté (SCRP) (2008-2014) | Highly satisfactory
19 | Denmark | Corporate (HQ) | 2014 | 2014/001 | Evaluation of UNICEF Supply Division Emergency Supply Response | Highly satisfactory
20 | Ethiopia | Eastern and Southern Africa Regional Office | 2014 | 2014/050 | An Evaluation of the Child-to-Child School Readiness Programme in Ethiopia | Highly satisfactory
21 | Guyana | Latin America and Caribbean Regional Office | 2014 | 2014/001 | Health and Family Life Education | Highly satisfactory
22 | India | South Asia Regional Office | 2014 | 2014/001 | Evaluation of Empowering Young Girls and Women in Maharashtra, India | Highly satisfactory
23 | Indonesia | East Asia and the Pacific Regional Office | 2014 | 2014/002 | Education Sector Response to HIV & AIDS | Highly satisfactory
24 | Indonesia | East Asia and the Pacific Regional Office | 2014 | 2014/004 | Equity-Focused Formative Evaluation of UNICEF's Engagement in the Decentralization Process in Indonesia | Highly satisfactory
25 | Kazakhstan | Central & Eastern Europe, Commonwealth of Independent States RO | 2014 | 2014/003 | International consultancy on evaluation of the Government of Norway-supported programme on developing a sustained and operational ombudsman's child protection mechanism that prevents and responds to child abuse, exploitation and family separation in line with international standards | Highly satisfactory
26 | Lesotho | Eastern and Southern Africa Regional Office | 2014 | 2014/001 | Child Grant Impact Evaluation | Highly satisfactory
27 | Macedonia | Central & Eastern Europe, Commonwealth of Independent States RO | 2015 | 2014/007 | Evaluation of the Early Childhood Development Programme in the former Yugoslav Republic of Macedonia | Highly satisfactory
28 | Malawi | Eastern and Southern Africa Regional Office | 2014 | 2014/012 | Evaluation of PHC Essential Medicines Project | Highly satisfactory
29 | Maldives | South Asia Regional Office | 2014 | 2014/001 | An evaluation of UNICEF Maldives strategies in addressing issues affecting women and children | Highly satisfactory
30 | Mali | West and Central Africa Regional Office | 2014 | 2014/002 | Évaluation du Programme WASH à l'école au Mali | Highly satisfactory
31 | Mali | West and Central Africa Regional Office | 2014 | 2014/001 | Summative External Evaluation of the Catalytic Initiative (CI)/Integrated Health Systems Strengthening (IHSS) Programme in Mali | Highly satisfactory
32 | Namibia | Eastern and Southern Africa Regional Office | 2014 | 2014/004 | School Based HIV Testing and Counselling Pilot Programme Evaluation | Highly satisfactory
33 | Nepal | South Asia Regional Office | 2015 | 2014/013 | Evaluation of Let Us Learn Nepal: After-School Programme for Girls and Girls Access to Education Programme | Highly satisfactory
34 | Pakistan | South Asia Regional Office | 2014 | 2014/001 | End of Project Evaluation for Norway-Pakistan Partnership Initiative | Highly satisfactory
35 | Pakistan | South Asia Regional Office | 2014 | 2014/003 | Evaluation of the UNICEF Sanitation Programme at Scale in Pakistan (SPSP) – Phase 1 (2013-14) | Highly satisfactory
36 | Pakistan | South Asia Regional Office | 2014 | 2014/002 | Evaluation of Young Champions Initiative for Girls' Education | Highly satisfactory
37 | Peru | Latin America and Caribbean Regional Office | 2015 | 2014-015 | Evaluación de Medio Término: "Mejorando la Educación Básica de Niñas y Niños en la Amazonía y el Sur Andino del Perú, 2010-2017" | Highly satisfactory
38 | Philippines | East Asia and the Pacific Regional Office | 2014 | 2014/007 | IASC Inter-agency Humanitarian Evaluation of the Typhoon Haiyan Response | Highly satisfactory
39 | Philippines | East Asia and the Pacific Regional Office | 2014 | 2014/04 | Real-Time Evaluation of UNICEF's Response to the Typhoon Haiyan in the Philippines | Highly satisfactory
40 | Republic of Kyrgyzstan | Central & Eastern Europe, Commonwealth of Independent States RO | 2014 | 2014/004 | Evaluation of DFID/UNICEF Equity Programme | Highly satisfactory
41 | Republic of Kyrgyzstan | Central & Eastern Europe, Commonwealth of Independent States RO | 2014 | 2014/002 | Evaluation of Perinatal Care Programme | Highly satisfactory
42 | Republic of Montenegro | Central & Eastern Europe, Commonwealth of Independent States RO | 2014 | 2014/005 | Evaluation Report for the Justice for Children Initiative | Highly satisfactory
43 | Romania | Central & Eastern Europe, Commonwealth of Independent States RO | 2014 | 2014/001 | National Intermediate Evaluation of the "School Attendance Initiative" Model | Highly satisfactory
44 | Tajikistan | Central & Eastern Europe, Commonwealth of Independent States RO | 2014 | 2014/004 | Youth Friendly Health Services Program in Tajikistan, 2006-2013: Program Evaluation Report | Highly satisfactory
45 | Ukraine | Central & Eastern Europe, Commonwealth of Independent States RO | 2014 | 2014/001 | Prevention of Mother-to-Child Transmission and Improving Neonatal Outcomes among Drug-dependent Pregnant Women and Children Born to Them in Ukraine | Highly satisfactory
46 | United Rep. of Tanzania | Eastern and Southern Africa Regional Office | 2014 | 2014/010 | Formative Evaluation of the Children's Agenda | Highly satisfactory
47 | USA | Corporate (HQ) | 2014 | 2014/012 | Formative Evaluation of UNICEF's MoRES | Highly satisfactory
48 | USA | Corporate (HQ) | 2013 | 2014/006A | UNICEF Evaluation of the Multiple Indicator Cluster Surveys (MICS) – Round 4, Evaluation Part 1: Response to lessons learned in prior rounds and preparations for Round 5 | Highly satisfactory
49 | USA | Corporate (HQ) | 2014 | 2014/006B | UNICEF Evaluation of the Multiple Indicator Cluster Surveys (MICS) – Round 4, Evaluation Part 2: MICS funding, stakeholder needs and demands and use | Highly satisfactory
50 | USA | Corporate (HQ) | 2014 | 2014/005 | UNICEF's Upstream Work in Basic Education and Gender Equality 2003-2012: Synthesis Report | Highly satisfactory
51 | Zimbabwe | Corporate (HQ) | 2014 | 2014/002 | Country Case Study: Zimbabwe (UNICEF's Upstream Work in Basic Education and Gender Equality 2003-2012) | Highly satisfactory
52 | Barbados | Latin America and Caribbean Regional Office | 2014 | 2014/001 | Final Report for the Formative Evaluation of the HighScope Curriculum Reform Program | Mostly Satisfactory
53 | Benin | West and Central Africa Regional Office | 2014 | 2014/002 | Évaluation des Quatre Innovations EDUCOM au Bénin | Mostly Satisfactory
54 | Bhutan | South Asia Regional Office | 2014 | 2014/002 | Evaluation of the Weekly Iron and Folic Acid Supplementation (WFIS) Program 2004-2014, Bhutan | Mostly Satisfactory
55 | Madagascar | Eastern and Southern Africa Regional Office | 2014 | 2014/005 | Évaluation de l'approche "assainissement total piloté par la communauté" (ATPC) | Mostly Satisfactory
56 | Malawi | Eastern and Southern Africa Regional Office | 2014 | 2014/005 | Summative report on the external evaluation of the Catalytic Initiative (CI)/Integrated Health Systems Strengthening (IHSS) programme in Malawi | Mostly Satisfactory
57 | Mali | West and Central Africa Regional Office | 2014 | 2014/003 | Final report: Impact evaluation of community-led total sanitation (CLTS) in rural Mali | Mostly Satisfactory
58 | Mongolia | East Asia and the Pacific Regional Office | 2014 | 2014/006 | REDS Strategy Evaluation Mongolia | Mostly Satisfactory
59 | Nigeria | West and Central Africa Regional Office | 2014 | 2014/012 | Impact Evaluation of Water, Sanitation and Hygiene (WASH) within the UNICEF Country Programme of Cooperation, Government of Nigeria and UNICEF, 2009-2013 | Mostly Satisfactory
60 | Rwanda | Eastern and Southern Africa Regional Office | 2014 | 2014/001 | Emergency Preparedness for the influx of refugees into Rwanda | Mostly Satisfactory
61 | Sierra Leone | West and Central Africa Regional Office | 2014 | 2014/002 | Evaluation of Journalists Training on Ethical Reporting on Child Rights Issues | Mostly Satisfactory
62 | Somalia | Eastern and Southern Africa Regional Office | 2014 | 2014/012 | Regional Supply Hub Mechanism as a Strategy for WASH Emergency Response in Somalia | Mostly Satisfactory
63 | Sudan | Middle East and North Africa Regional Office | 2014 | 2014/004 | Child Friendly Community Initiative – Evaluation Report, Sudan | Mostly Satisfactory
64 | Sudan | Middle East and North Africa Regional Office | 2014 | 2014/007 | Evaluation of UNICEF Sudan Country Office Field Delivery Structure | Mostly Satisfactory
65 | Tajikistan | Central & Eastern Europe, Commonwealth of Independent States RO | 2014 | 2014/010 | Juvenile Justice Alternative Project (2010-2014) – Evaluation Report | Mostly Satisfactory
66 | Togo | West and Central Africa Regional Office | 2014 | 2014/001 | Évaluation des Interventions à Base Communautaire (Nutrition & ATPC) dans les Régions des Savanes et de la Kara | Mostly Satisfactory
67 | Uganda | Eastern and Southern Africa Regional Office | 2014 | 2014/001 | UNICEF – Ugandan Ministry of Education and Sports BRMS Mentorship Project Evaluation | Mostly Satisfactory
68 | Bhutan | South Asia Regional Office | 2014 | 2014/001 | Evaluation of WASH in schools, Bhutan | Unsatisfactory
69 | Sierra Leone | West and Central Africa Regional Office | 2014 | 2014/003 | An Evaluation of the Impact of UNICEF Radio Listener Groups Project in Sierra Leone | Unsatisfactory
Appendix VI: GEROS Assessment Tool – UNICEF Global Evaluation Report Oversight System (GEROS) Review Template

Colour coding (cc):
Colour | Questions | Section & Overall Rating
Dark green | Outstanding | Outstanding, best practice
Green | Yes | Highly Satisfactory
Amber | Mostly Satisfactory | Mostly Satisfactory
Red | No | Unsatisfactory
White | Not Applicable | –

The key questions are highlighted as shown here, and are important questions in guiding the analysis of the section. The Cornerstone questions are in column J and are questions that need to be answered for rating and justification of each of the six sections.

UNEG Standards for Evaluation in the UN System
UNEG Norms for Evaluation in the UN System
UNICEF Adapted UNEG Evaluation Report Standards
Response: If the report is not an Evaluation please do not continue your review.
Title of the Evaluation Report
Report sequence number | Date of Review | Year of the Evaluation Report
Region | Country
Type of Report | TORs Present
Name of reviewer
Classification of Evaluation Report | Comments
Geographic Scope (Coverage of the programme being evaluated & generalizability of evaluation findings)
Management of Evaluation (Managerial control and oversight of evaluation decisions)
Purpose (Speaks to the overarching goal for conducting the evaluation; its raison d'être)
Result (Level of changes sought, as defined in RBM: refer to substantial use of highest level reached)
SPOA Correspondence (Alignment with SPOA focus area priorities: (1) Young child survival and development; (2) Basic education and gender equality; (3) HIV/AIDS and children; (4) Child protection from violence, exploitation and abuse; and (5) Policy advocacy and partnerships for children's rights)
Level of Independence (Implementation and control of the evaluation activities)
Approach

SECTION A: OBJECT OF THE EVALUATION | Second Review (dd/mm/yy) | Third Review (dd/mm/yy)
Question | cc | Remarks

A/ Does the report present a clear & full description of the 'object' of the evaluation? The report should describe the object of the evaluation including the results chain, meaning the 'theory of change' that underlies the programme being evaluated. This theory of change includes what the programme was meant to achieve and the pathway (chain of results) through which it was expected to achieve this. The context of key social, political, economic, demographic, and institutional factors that have a direct bearing on the object should be described. For example, the partner government's strategies and priorities, international, regional or country development goals, strategies and frameworks, the concerned agency's corporate goals & priorities, as appropriate.

Constructive feedback for future reports: Including how to address weaknesses and maintain good practice
A/ Does the report present a clear & full description of the 'object' of the evaluation?
A/ Does the report present a clear & full description of the 'object' of the evaluation?

Object and context
1. Is the object of the evaluation well described? This needs to include a clear description of the interventions (project, programme, policies, or otherwise) to be evaluated, including how the designer thought that it would address the problem identified, implementing modalities, and other parameters including costs, relative importance in the organization and (number of) people reached.
2. Is the context explained and related to the object that is to be evaluated? The context includes factors that have a direct bearing on the object of the evaluation: social, political, economic, demographic, institutional. These factors may include strategies, policies, goals, frameworks & priorities at the international level, national Government level, and individual agency level.
3. Does this illuminate findings? The context should ideally be linked to the findings so that it is clear how the wider situation may have influenced the outcomes observed.
Theory of Change
4. Is the results chain or logic well articulated? The report should identify how the designers of the evaluated object thought that it would address the problem that they had identified. This can include a results chain or other logic models such as theory of change. It can include inputs, outputs and outcomes; it may also include impacts. The models need to be clearly described and explained.

Stakeholders and their contributions
5. Are key stakeholders clearly identified? These include: implementing agency(ies); development partners; rights holders; primary duty bearers; secondary duty bearers.
6. Are key stakeholders' contributions described? This can involve financial or other contributions and should be specific. If a joint program, also specify the UNICEF contribution; but if basket funding, the question is not applicable.
7. Are UNICEF contributions described? This can involve financial or other contributions and should be specific.

Implementation Status
8. Is the implementation status described? This includes the phase of implementation and significant changes that have happened to plans, strategies, performance frameworks, etc. that have occurred, including the implications of these changes.
Executive Feedback on Section A: Issues for this section relevant for feedback to senior management (positives & negatives), & justify rating. Up to two sentences.

SECTION B: EVALUATION PURPOSE, OBJECTIVES AND SCOPE | Second Review | Third Review
Question | cc | Remarks

B/ Are the evaluation's purpose, objectives and scope sufficiently clear to guide the evaluation? The purpose of the evaluation should be clearly defined, including why the evaluation was needed at that point in time, who needed the information, what information is needed, and how the information will be used. The report should provide a clear explanation of the evaluation objectives and scope, including main evaluation questions, and should describe and justify what the evaluation did and did not cover. The report should describe and provide an explanation of the chosen evaluation criteria, performance standards, or other criteria used by the evaluators.

Constructive feedback for future reports: Including how to address weaknesses and maintain good practice
B/ Are the evaluation's purpose, objectives and scope sufficiently clear to guide the evaluation?
B/ Are the evaluation's purpose, objectives and scope sufficiently clear to guide the evaluation?

Purpose, objectives and scope
9. Is the purpose of the evaluation clear? This includes why the evaluation is needed at this time, who needs the information, what information is needed, and how the information will be used.
10. Are the objectives and scope of the evaluation clear and realistic? This includes: objectives should be clear and explain what the evaluation is seeking to achieve; scope should clearly describe and justify what the evaluation will and will not cover; evaluation questions may optionally be included to add additional details.
11. Do the objective and scope relate to the purpose? The reasons for holding the evaluation at this time in the project cycle (purpose) should link logically with the specific objectives the evaluation seeks to achieve and the boundaries chosen for the evaluation (scope).

Evaluation framework
12. Does the evaluation provide a relevant list of evaluation criteria that are explicitly justified as appropriate for the purpose? It is imperative to make the basis of the value judgements used in the evaluation transparent if it is to be understood and convincing. UNEG evaluation standards refer to the OECD/DAC criteria, but other criteria can be used, such as human rights and humanitarian criteria and standards (e.g. SPHERE standards), but this needs justification. Not all OECD/DAC criteria are relevant to all evaluation objectives and scopes. The TOR may set the criteria to be used, but these should be (re)confirmed by the evaluator. Standard OECD/DAC criteria include: relevance; effectiveness; efficiency; sustainability; impact. Additional humanitarian criteria include: coverage; coordination; coherence; protection; timeliness; connectedness; appropriateness. (This is an extremely important question to UNICEF.)
13. Does the evaluation explain why the evaluation criteria were chosen and/or any standard DAC evaluation criteria (above) rejected? The rationale for using each particular non-OECD/DAC criterion (if applicable) and/or rejecting any standard OECD/DAC criteria (where they would be applicable) should be explained in the report.
Executive Feedback on Section B: Issues for this section relevant for feedback to senior management (positives & negatives), & justify rating. Up to two sentences.

SECTION C: EVALUATION METHODOLOGY, GENDER, HUMAN RIGHTS AND EQUITY | Second Review | Third Review
Question | cc | Remarks

C/ Is the methodology appropriate and sound? The report should present a transparent description of the methodology applied to the evaluation that clearly explains how the evaluation was specifically designed to address the evaluation criteria, yield answers to the evaluation questions and achieve the evaluation purposes. The report should also present a sufficiently detailed description of methodology in which methodological choices are made explicit and justified, and in which limitations of the methodology applied are included. The report should give the elements to assess the appropriateness of the methodology. Methods as such are not 'good' or 'bad'; they are only so in relation to what one tries to get to know as part of an evaluation. Thus this standard assesses the suitability of the methods selected for the specifics of the evaluation concerned, assessing if the methodology is suitable to the subject matter and the information collected is sufficient to meet the evaluation objectives.

Constructive feedback for future reports: Including how to address weaknesses and maintain good practice
C/ Is the methodology appropriate and sound?
C/ Is the methodology appropriate and sound?

Data collection
14. Does the report specify data collection methods, analysis methods, sampling methods and benchmarks? This should include the rationale for selecting methods and their limitations based on commonly accepted best practice.
15. Does the report specify data sources, the rationale for their selection, and their limitations? This should include a discussion of how the mix of data sources was used to obtain a diversity of perspectives, ensure accuracy & overcome data limits.

Ethics
16. Are ethical issues and considerations described? The design of the evaluation should contemplate: how ethical the initial design of the programme was; the balance of costs and benefits to participants (including possible negative impact) in the programme and in the evaluation; the ethics of who is included and excluded in the evaluation and how this is done.
17. Does the report refer to ethical safeguards appropriate for the issues described? When the topic of an evaluation is contentious, there is a heightened need to protect those participating. These should be guided by the UNICEF Evaluation Office Technical Note and include: protection of confidentiality; protection of rights; protection of dignity and welfare of people (especially children); informed consent; feedback to participants; mechanisms for shaping the behaviour of evaluators and data collectors.
Results Based Management
18. Is the capability and robustness of the evaluated object's monitoring system adequately assessed? The evaluation should consider the details and overall functioning of the management system in relation to results: from the M&E system design, through individual tools, to the use of data in management decision making.
19. Does the evaluation make appropriate use of the M&E framework of the evaluated object? In addition to articulating the logic model (results chain) used by the programme, the evaluation should make use of the object's logframe or other results framework to guide the assessment. The results framework indicates how the programme design team expected to assess effectiveness, and it forms the guiding structure for the management of implementation.

Human Rights, Gender and Equity
20. Did the evaluation design and style consider incorporation of the UN and UNICEF's commitment to a human rights-based approach to programming, to gender equality, and to equity? This could be done in a variety of ways, including: use of a rights-based framework; use of CRC, CCC, CEDAW and other rights-related benchmarks; analysis of rights holders and duty bearers; and focus on aspects of equity, social exclusion and gender. Style includes: using human-rights language; gender-sensitive and child-sensitive writing; disaggregating data by gender, age and disability groups; disaggregating data by socially excluded groups. Promote gender-sensitive interventions as a core programmatic priority. To the extent possible, all relevant policies, programmes and activities will mainstream gender equality.
21. Does the evaluation assess the extent to which the implementation of the evaluated object was monitored through human rights (inc. gender, equity & child rights) frameworks? UNICEF commits to go beyond monitoring the achievement of desirable outcomes, and to ensure that these are achieved through morally acceptable processes. The evaluation should consider whether the programme was managed and adjusted according to human rights and gender monitoring of processes.
22. Do the methodology, analytical framework, findings, conclusions, recommendations & lessons provide appropriate information on HUMAN RIGHTS (inc. women's & child rights)? The inclusion of human rights frameworks in the evaluation methodology should continue to cascade down the evaluation report and be obvious in the data analysis, findings, conclusions, any recommendations and any lessons learned. If identified in the scope, the methodology should be capable of assessing the level of: identification of the human rights claims of rights-holders and the corresponding human rights obligations of duty-bearers, as well as the immediate, underlying & structural causes of the non-realisation of rights; capacity development of rights-holders to claim rights, and duty-bearers to fulfil obligations; support for humanitarian action, achieving faster scaling up of response, early identification of priorities and strategies, rapid deployment of qualified staff and clear accountabilities, and responses consistent with humanitarian principles in situations of unrest or armed conflict.
23. Do the methodology, analytical framework, findings, conclusions, recommendations & lessons provide appropriate information on GENDER EQUALITY AND WOMEN'S EMPOWERMENT? The inclusion of gender equality frameworks in the evaluation methodology should continue to cascade down the evaluation report and be obvious in the data analysis, findings, conclusions, any recommendations and any lessons learned. If identified in the scope, the methodology should be capable of assessing the immediate, underlying & structural causes of social exclusion, and capacity development of women to claim rights, and duty-bearers to fulfil their equality obligations.
24. Do the methodology, analytical framework, findings, conclusions, recommendations & lessons provide appropriate information on EQUITY? The inclusion of equity considerations in the evaluation methodology should continue to cascade down the evaluation report and be obvious in the data analysis, findings, conclusions, any recommendations and any lessons learned. If identified in the scope, the methodology should be capable of assessing the capacity development of rights-holders to claim rights, and duty-bearers to fulfil obligations, & aspects of equity.
Stakeholder participation
25. Are the levels and activities of stakeholder consultation described? This goes beyond just using stakeholders as sources of information and includes the degree of participation in the evaluation itself. The report should include the rationale for selecting this level of participation. Roles for participation might include: liaison; technical advisory; observer; active decision making. The reviewer should look for the soundness of the description and rationale for the degree of participation rather than the level of participation itself.
26. Are the levels of participation appropriate for the task in hand? The breadth and degree of stakeholder participation feasible in evaluation activities will depend partly on the kind of participation achieved in the evaluated object. The reviewer should note here whether a higher degree of participation may have been feasible & preferable.
Methodological robustness
27. Is there an attempt to construct a counterfactual or address issues of contribution/attribution? The counterfactual can be constructed in several ways, which can be more or less rigorous. It can be done by contacting eligible beneficiaries that were not reached by the programme, or a theoretical counterfactual based on historical trends, or it can also be a comparison group.
28. Does the methodology facilitate answers to the evaluation questions in the context of the evaluation? The methodology should link back to the purpose and be capable of providing answers to the evaluation questions.
29. Are methodological limitations acceptable for the task in hand? Limitations must be specifically recognised and appropriate efforts taken to control bias. This includes the use of triangulation, and the use of robust data collection tools (interview protocols, observation tools etc.). Bias limitations can be addressed in three main areas: bias inherent in the sources of data; bias introduced through the methods of data collection; bias that colours the interpretation of findings.

Executive Feedback on Section C: Issues for this section relevant for feedback to senior management (positives & negatives), & justify rating. Up to two sentences.
SECTION D: FINDINGS AND CONCLUSIONS | Second Review | Third Review

Question | cc | Remarks

D/ Are the findings and conclusions clearly presented, relevant and based on evidence & sound analysis? Findings should respond directly to the evaluation criteria and questions detailed in the scope and objectives section of the report. They should be based on evidence derived from data collection and analysis methods described in the methodology section of the report. Conclusions should present reasonable judgments based on findings and substantiated by evidence, providing insights pertinent to the object and purpose of the evaluation.

Constructive feedback for future reports: Including how to address weaknesses and maintain good practice
D/ Are the findings and conclusions clearly presented, relevant and based on evidence & sound analysis?
D/ Are the findings and conclusions clearly presented, relevant and based on evidence & sound analysis?

Completeness and logic of findings
30. Are findings clearly presented and based on the objective use of the reported evidence? Findings regarding the inputs for the completion of activities or process achievements should be distinguished clearly from results. Findings on results should clearly distinguish outputs, outcomes and impacts (where appropriate). Findings must demonstrate full marshalling and objective use of the evidence generated by the evaluation data collection. Findings should also tell the 'whole story' of the evidence and avoid bias.
31. Do the findings address all of the evaluation's stated criteria and questions? The findings should seek to systematically address all of the evaluation questions according to the evaluation framework articulated in the report.
32. Do findings demonstrate the progression to results based on the evidence reported? There should be a logical chain developed by the findings, which shows the progression (or lack of it) from implementation to results.
33. Are gaps and limitations discussed? The data may be inadequate to answer all the evaluation questions as satisfactorily as intended; in this case the limitations should be clearly presented and discussed. Caveats should be included to guide the reader on how to interpret the findings. Any gaps in the programme or unintended effects should also be addressed.
34. Are unexpected findings discussed? If the data reveals (or suggests) unusual or unexpected issues, these should be highlighted and discussed in terms of their implications.
Cost Analysis
35. Is a cost analysis presented that is well grounded in the findings reported? Cost analysis is not always feasible or appropriate; if this is the case, then the reasons should be explained. Otherwise the evaluation should use an appropriate scope and methodology of cost analysis to answer the following questions: how programme costs compare to other similar programmes or standards; the most efficient way to get expected results; cost implications of scaling up or down; cost implications for replicating in a different context; whether the programme is worth doing from a cost perspective; costs and the sustainability of the programme.
Contribution and causality

36. Does the evaluation make a fair and reasonable attempt to assign contribution for results to identified stakeholders? For results attributed to the programme, the result should be mapped as accurately as possible to the inputs of different stakeholders.

37. Are causal reasons for accomplishments and failures identified as much as possible? These should be concise and usable. They should be based on the evidence and be theoretically robust. (This is an extremely important question to UNICEF.)

Strengths, weaknesses and implications

38. Are the future implications of continuing constraints discussed? The implications can be, for example, in terms of the cost of the programme, ability to deliver results, reputational risk, and breach of human rights obligations.

39. Do the conclusions present both the strengths and weaknesses of the evaluated object? Conclusions should give a balanced view of both the stronger aspects and weaker aspects of the evaluated object with reference to the evaluation criteria and the human rights based approach.

Completeness and insight of conclusions

40. Do the conclusions represent actual insights into important issues that add value to the findings? Conclusions should go beyond findings and identify important underlying problems and/or priority issues. Simple conclusions that are already well known do not add value and should be avoided.

41. Do conclusions take due account of the views of a diverse cross-section of stakeholders? As well as being logically derived from findings, conclusions should seek to represent the range of views encountered in the evaluation, and not simply reflect the bias of the individual evaluator. Carrying these diverse views through to the presentation of conclusions (considered here) is only possible if the methodology has gathered and analysed information from a broad range of stakeholders.
42. Are the conclusions pitched at a level that is relevant to the end users of the evaluation? Conclusions should speak to the evaluation participants, stakeholders and users. These may cover a wide range of groups, and conclusions should thus be stated clearly and accessibly, adding value and understanding to the report (for example, some stakeholders may not understand the methodology or findings, but the conclusions should clarify what these findings mean to them in the context of the programme).
Executive Feedback on Section D: Issues for this section relevant for feedback to senior management (positives & negatives), & justify rating. Up to two sentences.
SECTION E: RECOMMENDATIONS AND LESSONS LEARNED

E/ Are the recommendations and lessons learned relevant and actionable? Recommendations should be relevant and actionable to the object and purpose of the evaluation, be supported by evidence and conclusions, and be developed with involvement of relevant stakeholders. Recommendations should clearly identify the target group for each recommendation, be clearly stated with priorities for action, be actionable and reflect an understanding of the commissioning organization and potential constraints to follow-up.

Constructive feedback for future reports: including how to address weaknesses and maintain good practice.
Relevance and clarity of recommendations

43. Are the recommendations well-grounded in the evidence and conclusions reported? Recommendations should be logically based in the findings and conclusions of the report.

44. Are recommendations relevant to the object and the purpose of the evaluation? Recommendations should be relevant to the evaluated object.

45. Are recommendations clearly stated and prioritised? If the recommendations are few in number (up to 5) then this can also be considered to be prioritised. Recommendations that are over-specific or represent a long list of items are not of as much value to managers. Where there is a long list of recommendations, the most important should be ordered in priority.

Usefulness of recommendations
46. Does each recommendation clearly identify the target group for action? Recommendations should provide clear and relevant suggestions for action linked to the stakeholders who might put that recommendation into action. This ensures that the evaluators have a good understanding of the programme dynamics and that recommendations are realistic.

47. Are the recommendations realistic in the context of the evaluation? This includes:
o an understanding of the commissioning organisation
o awareness of the implementation constraints
o an understanding of the follow-up processes

48. Does the report describe the process followed in developing the recommendations? The preparation of recommendations needs to suit the evaluation process. Participation by stakeholders in the development of recommendations is strongly encouraged to increase ownership and utility.

Appropriate lessons learned

49. Are lessons learned correctly identified? Lessons learned are contributions to general knowledge. They may refine or add to commonly accepted understanding, but should not be merely a repetition of common knowledge. Findings and conclusions specific to the evaluated object are not lessons learned.

50. Are lessons learned generalised to indicate what wider relevance they may have? Correctly identified lessons learned should include an analysis of how they can be applied to contexts and situations outside of the evaluated object.
Executive Feedback on Section E: Issues for this section relevant for feedback to senior management (positives & negatives), & justify rating. Up to two sentences.
SECTION F: REPORT IS WELL STRUCTURED, LOGICAL AND CLEAR

F/ Overall, do all these elements come together in a well structured, logical, clear and complete report? The report should be logically structured with clarity and coherence (e.g. background and objectives are presented before findings, and findings are presented before conclusions and recommendations). It should read well and be focused.

Constructive feedback for future reports: including how to address weaknesses and maintain good practice.

Style and presentation

51. Do the opening pages contain all the basic elements? Basic elements include all of: Name of the evaluated object; Timeframe of the evaluation and date of the report; Locations of the evaluated object; Names and/or organisations of evaluators; Name of the organisation commissioning the evaluation; Table of contents including tables, graphs, figures and annexes; List of acronyms.
52. Is the report logically structured? Context, purpose, methodology and findings should be logically structured. Findings would normally come before conclusions, recommendations & lessons learnt.

53. Do the annexes contain appropriate elements? Appropriate elements may include: ToRs; list of interviewees and site visits; list of documentary evidence; details on methodology; data collection instruments; information about the evaluators; copy of the evaluation matrix; copy of the results chain, where they add value to the report.

54. Do the annexes increase the usefulness and credibility of the report?

Executive Summary

55. Is an executive summary included as part of the report? If the answer is No, questions 56 to 58 should be N/A.

56. Does the executive summary contain all the necessary elements? Necessary elements include all of: overview of the evaluated object; evaluation objectives and intended audience; evaluation methodology; most important findings and conclusions; main recommendations.

57. Can the executive summary stand alone? It should not require reference to the rest of the report documents and should not introduce new information or arguments.

58. Can the executive summary inform decision making? It should be short (ideally 2-3 pages), and increase the utility for decision makers by highlighting key priorities.
Executive Feedback on Section F: Issues for this section relevant for feedback to senior management (positives & negatives), & justify rating. Up to two sentences.
Additional Information

i/ Does the evaluation successfully address the Terms of Reference? If the report does not include a ToR, then a recommendation should be given to ensure that all evaluations include the ToR in the future. Some evaluations may be flawed because the ToRs are inappropriate, allow too little time, etc. Or, they may succeed despite inadequate ToRs. This should be noted under vii in the next section.

ii/ Identify aspects of good practice in the evaluation (in terms of evaluation).

iii/ Identify aspects of good practice of the evaluation (in terms of programmatic, sector-specific, thematic expertise).
OVERALL RATING

Informed by the answers above, apply the reasonable person test to answer the following question: Ω/ Is this a credible report that addresses the evaluation purpose and objectives based on evidence, and that can therefore be used with confidence? This question should be considered from the perspective of UNICEF strategic management.
i/ To what extent does each of the six sections of the evaluation provide sufficient credibility to give the reasonable person confidence to act? Taken on their own, could a reasonable person have confidence in each of the five core evaluation elements separately? It is particularly important to consider:
o Is the report methodologically appropriate?
o Is the evidence sufficient, robust and authoritative?
o Do the analysis, findings, conclusions and recommendations hold together?
ii/ To what extent do the six sections hold together in a logically consistent way that provides common threads throughout the report? The report should hold together not just as individually appropriate elements, but as a consistent and logical 'whole'.

iii/ Are there any reasons of note that might explain the overall performance or particular aspects of this evaluation report? This is a chance to note mitigating factors and/or crucial issues apparent in the review of the report.

ToRs

Other

Executive Feedback on Overall Rating: Issues for this section relevant for feedback to senior management (positives & negatives), & justify rating. Up to two sentences.
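The review tool above amounts to a simple data model: question-level ratings on the colour-coded four-point scale, grouped into sections A-F and tallied into the section summaries graphed in the appendices. The sketch below is purely illustrative: the class and function names are invented here, and the colour mapping is assumed from the report's description of the scorecard, not taken from UNICEF's actual implementation.

```python
from collections import Counter
from dataclasses import dataclass

# Assumed colour coding for the four-point performance scale described
# in the report (Outstanding/Best Practice down to Unsatisfactory).
SCALE = {
    "Outstanding": "dark green",
    "Highly Satisfactory": "green",
    "Mostly Satisfactory": "amber",
    "Unsatisfactory": "red",
}

@dataclass
class QuestionRating:
    section: str        # review section, "A" through "F"
    number: int         # question number, 1 through 58
    rating: str         # one of the SCALE keys
    justification: str  # narrative evidence (e.g. page or section numbers)

def section_summary(ratings):
    """Tally ratings per section, as in the Appendix VII graphs."""
    summary = {}
    for r in ratings:
        if r.rating not in SCALE:
            raise ValueError(f"unknown rating: {r.rating}")
        summary.setdefault(r.section, Counter())[r.rating] += 1
    return summary

# Hypothetical example records for a single report review.
reviews = [
    QuestionRating("D", 30, "Highly Satisfactory", "findings on pp. 21-24"),
    QuestionRating("D", 33, "Mostly Satisfactory", "limitations only briefly noted"),
    QuestionRating("E", 43, "Highly Satisfactory", "recommendations follow conclusions"),
]
print(section_summary(reviews))
```

Note that the overall GEROS rating is not a mechanical roll-up of these tallies: as the tool states, the reviewer applies the "reasonable person test" to the sections taken together.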
Appendix VII Regional Ratings Graphs
[Figure: Overall Ratings by region (LACRO, CEECIS, EAPRO, ESARO, MENARO, ROSA, WCARO, HQ); legend: Outstanding, Highly Satisfactory, Mostly Satisfactory, Unsatisfactory]

[Figure: Section A Ratings by region; same regions and legend]

[Figure: Section B Ratings by region; same regions and legend]

[Figure: Section C Ratings by region; same regions and legend]

[Figure: Section D Ratings by region; same regions and legend]

[Figure: Section E Ratings by region; same regions and legend]

[Figure: Section F Ratings by region; same regions and legend]
Appendix VIII UNICEF, Country-led and Jointly Managed Evaluations by Region
[Figure: Management of Evaluation by Region (LACRO, CEECIS, EAPRO, ESARO, MENARO, ROSA, WCARO, HQ); stacked percentage bars, 0% to 100%; legend: UNICEF, Joint with UN, Joint with other, Joint with country]

[Figures: one chart per region (LACRO, CEECIS, EAPRO, ESARO, MENARO, ROSA, WCARO, HQ) showing ratings by management category (UNICEF, Joint with UN, Joint with other, Joint with country, Externally Managed, Not Clear, Country-led); legend: Outstanding, Highly Satisfactory, Mostly Satisfactory, Unsatisfactory]
Appendix IX Overall Ratings Graphs by Report Typology
[Figure: Overall ratings by Geography (Sub-national, National, Multi-country, Regional, Multi-region/global); legend for all figures in this appendix: Outstanding, Highly Satisfactory, Mostly Satisfactory, Unsatisfactory]

[Figure: Overall ratings by Management (UNICEF, Joint with UN, Joint with other, Country-led, Externally Managed, Not Clear, Joint with country)]

[Figure: Overall ratings by Purpose of the Evaluation (Pilot, At Scale, Policy, RTE, Humanitarian, Project, Programme, Country Programme)]

[Figure: Overall ratings by Result Level (Output, Outcome, Impact)]

[Figure: Overall ratings by Stage (Formative, Summative, Summative and formative)]

[Figure: Overall ratings by SPOA Correspondence (Health, HIV-AIDS, WASH, Nutrition, Education, Child Protection, Social inclusion, Cross-cutting: Gender Equality, Cross-cutting: Humanitarian Action, Touches more than one outcome)]

[Figure: Overall ratings by Level of Independence (Self-evaluation, Independent internal, Independent external, Not clear from Report)]

[Figure: Overall ratings by Language (English, French, Spanish)]
Appendix X Sub-Section Ratings
Section A: Object of the Evaluation
[Figure: percentage ratings for sub-sections Implementation Status; Stakeholders and their contributions; Theory of Change; Object and Context. Legend for all figures in this appendix: Outstanding, Yes, Mostly, No, N/A]

Section B: Evaluation Purpose, Objectives and Scope
[Figure: percentage ratings for sub-sections Evaluation Framework; Purpose, objectives and scope]

Section C: Evaluation Methodology, Gender, Human Rights and Equity
[Figure: percentage ratings for sub-sections Data collection; Ethics; Results Based Management; Human Rights, Gender and Equity; Stakeholder participation; Methodological robustness]

Section D: Findings and Conclusions
[Figure: percentage ratings for sub-sections Completeness and insight of conclusions; Strengths, weaknesses and implications; Contribution and Causality; Cost Analysis; Completeness and logic of findings]

Section E: Recommendations and Lessons Learned
[Figure: percentage ratings for sub-sections Appropriate lessons learned; Usefulness of recommendations; Relevance and clarity of recommendations]

Section F: Report is Well Structured, Logical and Clear
[Figure: percentage ratings for sub-sections Executive Summary; Style and presentation]
Appendix XI Analysis Table for 2014 Reports
[Table: counts of 2014 reports by rating category (Outstanding/best practice, Highly Satisfactory, Mostly Satisfactory, Unsatisfactory) for the Overall rating and for Sections A through F, cross-tabulated by: Region (LACRO, CEECIS, EAPRO, ESARO, MENARO, ROSA, WCARO, Corporate (HQ), Other); Geographic Scope (1.1 Sub-national, 1.2 National, 1.3 Multi-country, 1.4 Regional, 1.5 Multi-region); Management of Evaluation (2.1 UNICEF managed, 2.2 Joint managed with one or more UN agencies, 2.3 Joint managed with organisations outside the UN system, 2.4 Jointly Managed with Country, 2.5 Country-led Evaluation, 2.6 Externally managed, 2.7 Not clear from Report); Purpose (3.1 Pilot, 3.2 At scale, 3.3 Policy, 3.4 Real-time evaluation, 3.5 Humanitarian, 3.6 Project, 3.7 Programme, 3.8 Country Programme Evaluation (CPE), 3.9 Impact Evaluation); Result (4.1 Output, 4.2 Outcome, 4.3 Impact); SPOA Correspondence (5.1 Health, 5.2 HIV-AIDS, 5.3 WASH, 5.4 Nutrition, 5.5 Education, 5.6 Child Protection, 5.7 Social inclusion, 5.8 Cross-cutting: Gender Equality, 5.9 Cross-cutting: Humanitarian Action, 5.10 Touches more than one outcome); Level of Independence (6.1 Self-evaluation, 6.2 Independent internal, 6.3 Independent external, 6.4 Not clear from Report); and Approach (7.1 Formative, 7.2 Summative, 7.3 Summative and formative)]
Appendix XII Clarification of Criteria for Completing GEROS Reviews
Report sequence number

Title of the Evaluation Report

Year of the Evaluation Report: 2016, 2015, 2014, 2013, 2012, 2011, 2010

Country(ies)

Region: Latin America and Caribbean Regional Office; Central & Eastern Europe, Commonwealth of Independent States RO; East Asia and the Pacific Regional Office; Eastern and Southern Africa Regional Office; Middle East and North Africa Regional Office; South Asia Regional Office; West and Central Africa Regional Office; Corporate (HQ); Other

Date of review

Type of Report: Evaluation; Needs assessment; Appraisal; Evaluability; Review, including mid-term review; Inspection; Investigation; Research & study; Audit; Survey; Internal Management consulting
Geographic Scope: coverage of the programme being evaluated and generalizability of evaluation findings.

1.1 Sub-national: The programme and evaluation cover selected sub-national units (districts, provinces, states, etc.) within a country, where results cannot be generalized to the whole country.

1.2 National: The programme covers the whole country, and the evaluation draws a sample in every district, or uses a sampling frame that is representative of the whole country.

1.3 Multi-country: Where one programme is implemented in several countries, or different programmes of a similar theme are implemented in several countries, the evaluation would cover two or more countries within one region. The results of the evaluation would not be generalizable to other countries in the region.

1.4 Regional: Where one programme is implemented in several countries, or different programmes of a similar theme are implemented in several countries, the evaluation covers multiple countries within the region and the sampling is adequate to make the results generalizable to the region.

1.5 Multi-region/Global: The programme is implemented in two or more regions, or deliberately targets all regions. The evaluation would typically sample several countries across multiple regions, with the results intended to be generalizable in two or more regions.
Management: managerial control and oversight of evaluation decisions (i.e., ToRs, selection of consultants, budgets, quality assurance and approval of evaluation findings). In all instances, it is assumed that the management approaches include relevant national actors (e.g., government, universities, NGOs, CBOs).

2.1 UNICEF managed: Working with national partners of different categories, UNICEF is responsible for all aspects of the evaluation.

2.2 Joint managed, with one or more UN agencies: UNICEF is the co-manager with one or more UN agencies.

2.3 Joint managed, with organisations outside the UN system: UNICEF is the co-manager with one or more organizations outside the UN system.

2.4 Jointly Managed with Country: Evaluations jointly managed by the country (government and/or CSO) and the UNICEF CO.

2.5 Country-led Evaluation: Evaluations managed by the country (government and/or CSO).

2.6 Externally managed: An external organization manages the evaluation, where UNICEF is one of the organizations being assessed (UN and non-UN).

2.7 Not clear from Report.
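The typology codes in this appendix form a small controlled vocabulary that each reviewed report is classified against. A minimal sketch of validating such codes when logging a report, assuming invented function and dictionary names (only the Geographic Scope and Management dimensions are shown; the same pattern would extend to the Purpose, Result, SPOA, Independence and Approach dimensions):

```python
# Hypothetical lookup tables transcribing two of the Appendix XII
# typology dimensions; the code structure is illustrative only.
GEOGRAPHIC_SCOPE = {
    "1.1": "Sub-national", "1.2": "National", "1.3": "Multi-country",
    "1.4": "Regional", "1.5": "Multi-region/Global",
}
MANAGEMENT = {
    "2.1": "UNICEF managed",
    "2.2": "Joint managed, with one or more UN agencies",
    "2.3": "Joint managed, with organisations outside the UN system",
    "2.4": "Jointly Managed with Country",
    "2.5": "Country-led Evaluation",
    "2.6": "Externally managed",
    "2.7": "Not clear from Report",
}

def classify(scope_code, management_code):
    """Return human-readable typology labels, rejecting unknown codes."""
    try:
        return GEOGRAPHIC_SCOPE[scope_code], MANAGEMENT[management_code]
    except KeyError as exc:
        raise ValueError(f"unknown typology code: {exc}") from exc

print(classify("1.2", "2.1"))  # -> ('National', 'UNICEF managed')
```

Validating codes at entry time keeps the cross-tabulations in Appendix XI consistent, since every report then falls into exactly one category per dimension.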
Purpose: speaks to the overarching goal for conducting the evaluation; its raison d'être.

3.1 Pilot: Where a new solution, approach, or programme is being tested at a national or sub-national level, the evaluation examines the efficacy of such an intervention with the intention to determine suitability for scaling up.

3.2 At scale: The evaluation examines the efficacy of a programme that is being implemented at or near its maximum intended extent, with the intention of providing feedback on efficiency and the overall effectiveness of the programme to scale up focus for lessons learned.

3.3 Policy: An evaluation whose main purpose is to examine the results of a policy that is delinked from field-based programming operations.

3.4 Real-time evaluation: In the context of an emergency, an evaluation of the efficacy of the response, which collates lessons that can be applied back to an ongoing response.

3.5 Humanitarian: Humanitarian evaluation assesses organizational performance in emergency settings (including both natural disasters & conflicts) at various phases of these crises, from preparedness and risk reduction to response, recovery & the transition to development.

3.6 Project: An evaluation which is a step-by-step process of collecting, recording and organising information about the project results, including immediate results, short-term outputs and long-term project outcomes.

3.7 Programme: An evaluation of a sectorial programme to determine its overall effectiveness and efficiency in relation to the stated goals and objectives.

3.8 Country Programme Evaluation (CPE): An evaluation that assesses the relevance, effectiveness, efficiency and sustainability of the entire UNICEF Country Programme.

3.9 Impact Evaluation: An evaluation that looks at the positive and negative, primary and secondary long-term effects on final beneficiaries produced by a development intervention. Impact evaluations assess the direct and indirect contributions of the intervention to specific development results, using robust quantitative, qualitative, or mixed methods to assign contribution to higher level results.
Result: level of changes sought, as defined in results-based management; refer to substantial use of the highest level reached.

4.1 Output: Causal effects deriving directly from programme activities, and assumed to be completely under programme control.

4.2 Outcome: Effects from one or more programmes being implemented by multiple actors (UNICEF and others), where the cumulative effect of outputs elicits results beyond the control of any one agency or programme.

4.3 Impact: Final results of a programme or policy on the intended beneficiaries and, where possible, on comparison groups. Reflects the cumulative effect of donor-supported programmes of cooperation and national policy initiatives.
SPOACorrespondenceAlignmentwithSPOAfocusareapriorities:(1)Youngchildsurvivalanddevelopment;(2)Basiceducationandgenderequality;(3)HIV/AIDSandchildren;(4)Childprotectionfromviolence,exploitationandabuse;and(5)Policy
5.1Health:Supportingglobaleffortstoreduceunder‐fivemortalitythroughimprovedandequitableuseofhighimpactmaternal,newbornandchildhealthinterventionsfrompregnancytoadolescenceandpromotionofhealthybehaviours.Programmeareas:a)Immunizationb)Polioeradicationc)Maternalandnewbornhealthd)Childhealthe)Healthsystemsstrengtheningf)Healthin
advocacy and partnerships for children's rights
humanitarian situations

5.2 HIV-AIDS: Supporting global efforts to prevent new HIV infections and increase treatment during both decades of a child's life through improved and equitable use of proven HIV prevention and treatment interventions by pregnant women, children and adolescents. Programme areas: a) Prevention of mother-to-child transmission and infant male circumcision b) Care and treatment of young children affected by HIV & AIDS c) Adolescents and HIV & AIDS d) Protection and support for children and families e) HIV-AIDS in humanitarian situations

5.3 WASH: Supporting global efforts to eliminate open defecation and increase use of safe drinking water through improved and equitable access to safe drinking water sources, sanitation and healthy environments and improved hygiene practices. Programme areas: a) Water supply b) Sanitation c) Hygiene d) WASH in schools and early childhood development centers e) WASH in humanitarian situations

5.4 Nutrition: Supporting global efforts to reduce undernutrition, with particular focus on stunting, through improved and equitable use of nutritional support and improved nutrition and care practices. Programme areas: a) Infant and young child feeding b) Micronutrients c) Nutrition and HIV d) Community management of acute malnutrition e) Nutrition in humanitarian situations

5.5 Education: Supporting global efforts to provide access to quality education for both boys and girls through improved learning outcomes and equitable and inclusive education. Programme areas: a) Early learning b) Equity with a focus on girls' education and inclusive education c) Learning and child-friendly schools d) Education in humanitarian situations

5.6 Child Protection: Supporting global efforts to prevent violence, abuse, exploitation and neglect through improved and equitable prevention and child protection systems. Programme areas: a) Child protection systems strengthening b) Violence, exploitation and abuse c) Justice for children d) Birth registration e) Strengthened families and communities f) Child protection in humanitarian situations

5.7 Social inclusion: Supporting global efforts to reduce child poverty and discrimination against children through improved policy environments and systems for disadvantaged children. Programme areas: a) Child poverty analysis and social protection b) Human rights, non-discrimination and participation c) Public finance management d) Governance and decentralization e) Social inclusion in humanitarian situations

5.8 Cross-cutting – Gender Equality: UNICEF will emphasize the empowerment of girls and women and address gender-related needs of girls, boys, fathers, mothers and communities. UNICEF will identify and leverage positive cross-sectoral synergies and linkages, such as those among improving girls' education, ending child marriage and reducing maternal mortality. UNICEF will also focus on increasing access to services and opportunities by women and girls and their inclusion and participation in all facets of life, as well as on advocacy and technical support on gender-equitable policies, budgeting and resource allocations. Emphasis will be placed on: a) Sex-disaggregated and other gender-related data b) Promoting gender-sensitive interventions as a core programmatic priority c) To the extent possible, all relevant policies, programmes and activities will mainstream gender equality.

5.9 Cross-cutting – Humanitarian Action: UNICEF will strive to save lives and protect rights as defined in the Core Commitments for Children in Humanitarian Action, in line with internationally accepted standards. UNICEF will focus its efforts on systematically reducing vulnerability to disasters and conflicts for effective prevention of and response to humanitarian crises, on improving links between development programmes and humanitarian response, and on promoting rapid recovery and building community resilience to shocks that affect children. Emphasis on: a) Support for humanitarian action – achieving faster scaling up of response b) Early identification of priorities and strategies c) Rapid deployment of qualified staff and clear accountabilities d) Responses consistent with humanitarian principles in situations of unrest or armed conflict

5.10 Touches more than one outcome. Please specify in the comments.

GEROS – Global Meta-Evaluation Report 2014 · Universalia
Level of Independence (implementation and control of the evaluation activities):

6.1 Self-evaluation: A significant component of evaluation management activities and decision-making about the evaluation is implemented by individuals associated with the target programme/intervention (e.g. programme officers/specialists).

6.2 Independent internal: The evaluation is implemented by consultants but managed in-house by UNICEF professionals. The overall responsibility for the evaluation lies within the division whose work is being evaluated.

6.3 Independent external: The evaluation is implemented by external consultants and/or UNICEF Evaluation Office professionals. The overall responsibility for the evaluation lies outside the division whose work is being evaluated.

6.4 Not clear from report.

Approach:

7.1 Formative: An evaluation with the purpose and aim of improving the programme. Formative evaluations strengthen or improve the object being evaluated by examining the delivery of the programme.

7.2 Summative: An evaluation that examines the effects or outcomes of the object being evaluated and summarizes them by describing what happened subsequent to the delivery of the programme.

7.3 Summative and formative: An evaluation that combines the elements of a formative and a summative evaluation.

TORs present: Yes / No

Question criteria: Outstanding / Yes / Mostly / No / N/A

Section rating criteria: Outstanding, best practice / Highly satisfactory / Mostly satisfactory / Unsatisfactory
Appendix XIII – Special Cases in the 2014 GEROS Exercise
The 2014 GEROS exercise included a number of reports that differed somewhat from traditional evaluations. Challenges arose when these reports were assessed against the GEROS criteria, raising questions about the flexibility of the template. Below is a list of the special cases found among the 2014 report population, along with the reasons why the template was sometimes difficult to apply.
Case Studies and Global Syntheses
GEROS-Afghanistan-2014-001
GEROS-Bangladesh-2014-002
GEROS-Brazil-2014-001
GEROS-Cambodia-2014-001
GEROS-Zimbabwe-2014-002
In all, five case studies were included among the reports reviewed in 2014. These reports were part of larger, global synthesis evaluations. As requested by UNICEF, each report (the synthesis report and all case studies) was assessed individually against the GEROS criteria. However, it was unclear whether evaluators were aware that their case studies would be analysed as separate reports (i.e. so that they could take appropriate measures to include the necessary detail in each one prior to submission).
The case studies analysed this year all received highly satisfactory ratings, notably because they were detailed and quite comprehensive: they included information on the context, methodology and analysis. Nevertheless, case studies do not always include such depth, nor do they necessarily require a description of the evaluation framework or a set of conclusions and recommendations (normally provided in the global synthesis report). Conversely, global syntheses may not always contain specific details or examples from the countries evaluated. UNICEF may wish to consider how to deal with such reports in the future, and whether a case study can realistically be judged against the same criteria as a full evaluation report.
Impact Evaluations and Ratings:
GEROS-Bangladesh-2014-014: Highly satisfactory
GEROS-Lesotho-2014-001: Highly satisfactory
GEROS-Nigeria-2014-012: Mostly satisfactory
GEROS-Sierra_Leone-2014-003: Unsatisfactory
Four impact evaluations were included among the GEROS reports this year. These reports received a mixed set of ratings, from highly satisfactory to unsatisfactory.
According to reviewers, the template was difficult to apply to these impact evaluations, notably because of the way evaluation criteria were used. It is generally accepted that impact evaluations are not required to address all of the OECD/DAC criteria (only impact), so this choice need not be justified within the description of the evaluation. Further, due to their narrow focus, the impact evaluations reviewed provided little information on results-based management or on the evaluated object's theory of change. Other issues related to how recommendations and lessons learned should be judged, given the unique nature of impact evaluations. In many cases, therefore, GEROS template criteria simply appeared inapplicable.
Future Considerations for UNICEF
Given these special cases in the 2014 review sample, UNICEF may wish to consider modifying its template or finding new ways of assessing unique reports. For instance, UNICEF could:
- Adapt its current template to account for possible variation in content, structure and focus within reports. Naturally, caution should be taken to prevent the template from becoming so flexible that common standards are easy to circumvent;
- Create a new template for each different type of report. The comparability of all the reports (for meta-analysis purposes) would then likely require further thought or discussion, especially if certain questions were removed or altered in some versions of the template. In this case, UNICEF may wish to conduct separate analyses of each report type, comparing them to each other rather than to the entire population of reports;
- Leave the template as is, even if future good-quality reports may receive a lower rating due to the absence or special treatment of some criteria.