D4.1 – AllScale runtime system interface specification

H2020 FETHPC-1-2014
An Exascale Programming, Multi-objective Optimisation and Resilience Management Environment Based on Nested Recursive Parallelism
Project Number 671603

WP4: Unified runtime system for extreme scales
Version: 1.0
Author(s): Thomas Heller (FAU), Arne Hendricks (FAU), Herbert Jordan (UIBK), Peter Thoman (UIBK)
Date: 28/03/16

Copyright © AllScale Consortium Partners 2015



Due date: PM6
Submission date: day/month/year
Project start date: 01/10/2015
Project duration: 36 months
Deliverable lead organization: FAU
Version: 1.00
Status: Final
Author(s): Thomas Heller (FAU), Arne Hendricks (FAU), Herbert Jordan (UIBK), Peter Thoman (UIBK)
Reviewer(s): Kiril Dichev (QUB), Thomas Fahringer (UIBK)
Dissemination level: PU (Public)

Disclaimer

This deliverable has been prepared by the responsible Work Package of the Project in accordance with the Consortium Agreement and the Grant Agreement Nr 671603. It solely reflects the opinion of the parties to such agreements on a collective basis in the context of the Project and to the extent foreseen in such agreements.


Acknowledgements

The work presented in this document has been conducted in the context of the EU Horizon 2020 programme. AllScale is a 36-month project that started on October 1st, 2015 and is funded by the European Commission.

The partners in the project are UNIVERSITÄT INNSBRUCK (UIBK), FRIEDRICH-ALEXANDER-UNIVERSITÄT ERLANGEN NÜRNBERG (FAU), THE QUEEN'S UNIVERSITY OF BELFAST (QUB), KUNGLIGA TEKNISKA HÖGSKOLAN (KTH), NUMERICAL MECHANICS APPLICATIONS INTERNATIONAL SA (NUMEXA), and IBM IRELAND LIMITED (IBM).

The content of this document is the result of extensive discussions within the AllScale Consortium as a whole.

More information

Public AllScale reports and other information pertaining to the project are available through the AllScale public Web site under http://www.allscale.eu.

Version History

Version   Date       Comments, Changes, Status   Authors, contributors, reviewers
0.1       21/02/16   First draft                 Thomas Heller
0.2       03/03/16   Second draft                Arne Hendricks
0.3       07/03/16   Third draft                 Arne Hendricks
0.4       16/03/16   Final draft                 Arne Hendricks
1.0       19/03/16   Final version               Thomas Heller


Table of Contents

1 Introduction
2 Runtime Interface
  2.1 Tuple-Spaces
  2.2 AllScale Runtime Interface
  2.3 Key Concepts explained
    2.3.1 DataItem
    2.3.2 WorkItem
    2.3.3 Region-Realization
    2.3.4 Fragment-Realization
    2.3.5 DataItem-Realization
    2.3.5 WorkItem-Realization
    2.3.6 Scheduler-Realization
3 Monitoring/Events
4 Interface to the Performance and Resilience Monitor
5 Interface to the Resilience Manager
6 Conclusions and Future Work


1 Introduction

The runtime system interface is the layer describing the structure of an application that can be managed and tuned by the AllScale runtime system. The fundamental functionality and properties of this essential interface will be determined by Task T2.1 and incorporated in the overall system architecture by Task T2.2. Simultaneously, to aid the interface specification within Task T2.2, we formalize the involved entities through actual specifications thereof using C++ declarations. The runtime interface is connected to the tools and entities developed by the other work packages, such as work package 5, and can be seen in the lower diagram, highlighted in cyan.

The main interacting areas are with the compiler, the online monitoring tool and the resilience manager. Due to the high level of coupling to other components, it is very important for the resulting runtime interface specification to find a model for expressing parallel (recursive) tasks, data dependencies, IO characteristics, and hardware requirements, as well as means for offering multiple, exchangeable implementations constituting different variations of the same code fragment, and meta-data capturing resilience properties of the provided codes, in particular failure compensation strategies.

2 Runtime Interface

The runtime interface is based on a theoretical foundation with similarities to the tuple-space paradigm. This paradigm defines a collection of tuples to be accessed and used concurrently. It was introduced by David Gelernter at Yale University in the 1980s with the Linda coordination language. Applications using Linda and its tuple-spaces realize communication among processes by defining a collection of tuples (key-value pairs), and information is exchanged by writing and reading (producing and consuming) tuples. It was an early approach to distributed shared memory, and its key concept, i.e. modeling an application's communication in terms of producing and consuming shared resources, is similar to modern approaches such as the AllScale runtime interface.

2.1 Tuple-Spaces

A tuple-space holds a number of key-value pairs, which can be concurrently manipulated. The communication is handled by applying read and store operations on these tuples, commonly referred to as consume and produce operations. The key concept behind it, i.e. dividing an application's tasks by means of producing and consuming shared resources, is very similar to modern approaches such as the AllScale runtime interface. A tuple-space environment offers at least three operations:

• A method to insert a tuple into the space, originally referred to as the out() method
• A method to withdraw tuples from the tuple-space, the in() method
• A method to read a tuple from the tuple-space without removing it, the read() method

The following diagrams describe the interaction of 4 consumers with 2 distinct tuple-spaces, inserting, reading and deleting tuples:


Instead of passing messages between two processes A and B, it is now possible for them to communicate by accessing the tuple-spaces: process A creates one or more tuples encoding the message to be conveyed and adds them to the tuple-space, from where process B may consume them by withdrawing or reading them. The behavior is described in the figure below:

A number of interesting properties originate from the tuple-space paradigm, rooted in its principle of communication orthogonality (neither sender nor receiver has prior knowledge about the other): a decoupling of communication in both time and space. Space decoupling refers to the distribution of resources, allowing 1:N as well as 1:1 communication between processes. Time decoupling, on the other hand, means that a producing process can run to completion and terminate, while the tuples it produced remain in the tuple-space, available for other processes to consume. This concept has remained very important over the years, and is also applied by HPX with its reference counting within the Active Global Address Space (AGAS). The producers and consumers can be spread across a distributed address space and are still able to access the tuples, and a tuple is usually unique in the tuple-space. It should be clear from the diagram that the model leads to a form of shared, distributed memory space, because tuples can actually reside on different physical nodes but may still be accessed by the other members.

The details of these concepts can be found in: Linda in Context (1989). The kind reader might also be interested in the approach of the Intel Concurrent Collections, a template library for C++ parallel and distributed applications, introducing a data and control flow model distantly similar to ours and influenced by the concepts of tuple-spaces.
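To make the three tuple-space operations concrete, the following is a minimal single-process sketch using only the C++ standard library. The class name TupleSpace, the restriction to string keys and int values, and the size() helper are assumptions made for illustration; they are part of neither Linda nor the AllScale interface.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <map>
#include <mutex>
#include <string>

// Hypothetical in-memory tuple-space holding (key, value) pairs.
class TupleSpace {
public:
    // out(): insert a tuple into the space (produce).
    void out(const std::string& key, int value) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            tuples_.emplace(key, value);
        }
        changed_.notify_all();
    }

    // in(): withdraw a tuple; blocks until a tuple with this key exists (consume).
    int in(const std::string& key) {
        std::unique_lock<std::mutex> lock(mutex_);
        changed_.wait(lock, [&] { return tuples_.count(key) > 0; });
        auto it = tuples_.find(key);
        assert(it != tuples_.end());
        const int value = it->second;
        tuples_.erase(it);
        return value;
    }

    // read(): read a tuple without removing it; blocks until it exists.
    int read(const std::string& key) {
        std::unique_lock<std::mutex> lock(mutex_);
        changed_.wait(lock, [&] { return tuples_.count(key) > 0; });
        return tuples_.find(key)->second;
    }

    // Number of tuples currently stored.
    std::size_t size() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return tuples_.size();
    }

private:
    mutable std::mutex mutex_;
    std::condition_variable changed_;
    std::multimap<std::string, int> tuples_;
};
```

Time decoupling shows up here in that a thread may call out() and terminate, while a later call to read() or in() from another thread still finds the tuple. A distributed implementation would additionally have to locate tuples on remote nodes, which this sketch does not attempt.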

2.2 AllScale Runtime Interface

We will now continue with an introduction and explanation of the fundamental concepts and entities of the AllScale runtime interface. Our theoretical foundation of managing concurrently used resources is influenced by the tuple-space paradigm. Our concept will add to this general idea what we call decomposable tuples. Decomposable tuples are represented by a hierarchical tree of futures. Each task that is spawned within the scheduler will return such a future. Each future represents the completion and return value of a recursive task. The figure below shows such a dependency tree of futures:

The final result might depend upon the completion of a tree of other results, which is, for instance, the case when recursively calculating Fibonacci numbers. By relying on HPX as the underlying runtime API, we can construct these trees of futures using Data-Flow control techniques and additionally execute applications without thread-suspension or waiting. This will allow the runtime to be as resource efficient as possible and not waste any memory in the stack segment by having a great number of already allocated but suspended threads. The latter feature will be of great use when dealing with extreme scale concurrency leading to billions of concurrent tasks, as targeted by the AllScale environment.

Our main foundational entities are called WorkItem and DataItem. Instances of those two can be modeled as tuples in a tuple-space. Similar to the tuples in the diagram above, our entities need to be accessed concurrently and might reside physically spread over a cluster of nodes. The runtime interface offers methods equivalent to put() and get() in a tuple-space, namely the write() or store() method as well as the load() method, which will be described in more detail later. In contrast to the tuple-space pictured above, in our case it is also possible to retrieve a tree of futures to tuples. A tree of futures is called “Treeture”, and it represents the dependencies among nested tasks that have to be computed before the value promised by the task associated with the root node's future can be obtained. The usage of Treetures combined with operations on the future tree, such as union and intersect, enables set-like behavior, enhancing the tuple-space's abilities by adding the possibility of retrieving a (partial) tuple, which corresponds to a (sub-)tree of futures.
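The tree-of-futures idea can be illustrated with the recursive Fibonacci computation mentioned above. The sketch below is an assumption-laden stand-in: it uses std::shared_future and deferred std::async from the standard library rather than HPX data-flow, and the names TreeOfFutures, FibTree and fib are hypothetical. Unlike the HPX-based design described in the text, calling get() here evaluates the dependencies on the calling thread.

```cpp
#include <cassert>
#include <future>
#include <memory>

// Hypothetical stand-in for a "Treeture": a future together with the
// sub-trees whose results it depends on.
template <typename T>
struct TreeOfFutures {
    std::shared_future<T> value;
    std::shared_ptr<TreeOfFutures> left;
    std::shared_ptr<TreeOfFutures> right;
};

using FibTree = std::shared_ptr<TreeOfFutures<long>>;

// Builds the dependency tree for fib(n); nothing is computed until
// value.get() is called on some node.
FibTree fib(int n) {
    assert(n >= 0);
    auto node = std::make_shared<TreeOfFutures<long>>();
    if (n < 2) {
        // Leaf: the result is immediately available.
        std::promise<long> ready;
        ready.set_value(n);
        node->value = ready.get_future().share();
        return node;
    }
    node->left = fib(n - 1);
    node->right = fib(n - 2);
    auto l = node->left->value;
    auto r = node->right->value;
    // Inner node: the result depends on the futures of both children.
    node->value = std::async(std::launch::deferred,
                             [l, r] { return l.get() + r.get(); }).share();
    return node;
}
```

Because every node keeps handles to its children, a caller can retrieve either the root result or the result of any sub-tree, which mirrors the idea of retrieving a partial tuple.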

2.3 Key Concepts explained

We will now introduce some of the main concepts of the runtime interface in an informal way, without actual source code. Further down in the document we will show some of the details of the planned implementation of the runtime interface to provide a more detailed picture of the runtime to the reader. To explain the concepts, we will run the reader through some simple examples, from top to bottom.

An application, e.g. a simple stencil, is run in parallel on 4 nodes of a cluster. The data is stored in a 1D array of values. Initially the recursive split-up of tasks is done by the AllScale compiler; it provides implementations of tasks in the form of WorkItems. These WorkItems are our entities that describe what has to be done (in our example, performing computations on the stencil cells) and on which data. The data can be spread over the network. Within the stencil example, there will be the need to access data which resides physically on another node in the cluster.

Regarding the management of data, there are three concepts that need further explanation: Regions, Fragments, and Collection Facades. A Region exposes information to address subsets of elements within a collection. In an array-like collection, regions might be realized by describing sets of indices or, more compactly, as an interval of indices. For tree-shaped data structures, regions may be described by the root nodes of the subtrees to be addressed. The actual data storage is realized by Fragments, which correspond to the objects instantiated on the individual nodes. Each fragment contains a share of the overall collection, and data may be moved between fragment instances. The data within fragments is thereby addressed by the associated region structure. The following figure visualizes the relation between regions and fragments for our 1D array example:


A Region is defined, covering indices 4 to 7, and can then be used to create a fragment, including the data addressable by those indices. Depending on the underlying data structure, regions need to include different parameters: in the simple 1D example these are just start and end indices (column indices), but for a 2D array information about the row is needed as well:

As mentioned above, the Region concept is versatile, and can also be used to access structures like a tree:


In order to provide a unified API to the user, hiding the internal data management, we introduce Collection Facades, which are user-facing wrappers to create and manipulate distributed data collections. A Collection Facade appears as if it were accessing globally shared data, while in fact it is accessing localized Fragments, managed by the runtime system. This makes managing the data possible in a distributed environment. The interaction of Fragments and Collection Facades in a distributed environment is illustrated in the figure below.
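As a rough illustration of the facade idea, the sketch below wraps one local fragment of a 1D array and accepts global indices. The types LocalFragment and ArrayFacade are hypothetical, and throwing on a non-local index is a simplification chosen for this sketch; in the design described here, the runtime system would be responsible for making the required data available.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical local fragment: holds the elements with global indices
// [begin, begin + data.size()) of a distributed 1D array.
struct LocalFragment {
    std::size_t begin;
    std::vector<double> data;
};

// Hypothetical facade: the user works with global indices, while only
// the local fragment is actually touched.
class ArrayFacade {
public:
    explicit ArrayFacade(LocalFragment& fragment) : fragment_(fragment) {
        assert(!fragment_.data.empty());
    }

    double& operator[](std::size_t global_index) {
        if (global_index < fragment_.begin ||
            global_index >= fragment_.begin + fragment_.data.size()) {
            throw std::out_of_range("index not held by the local fragment");
        }
        // Translate the global index into an offset within the fragment.
        return fragment_.data[global_index - fragment_.begin];
    }

private:
    LocalFragment& fragment_;
};
```

With a fragment covering indices 4 to 7, writing facade[5] modifies the second element of the local storage, while the user code never sees the offset calculation.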


Having covered the Region, Fragment, and Collection Facade concepts, we will now introduce the DataItem. In order to create a DataItem, a DataItemDescription is created, which basically exposes the type information associated with a DataItem. It thus summarizes which types are used to process the data, including the types for Regions, Fragments, and Collection Facades.


2.3.1 DataItem

The DataItem itself is a symbolic representation of some managed data element. Typical user-level code might access elements in an array just by using an array. Because this approach does not suffice in a distributed system, the data needs to be wrapped by a management layer. This is what DataItems do. Each DataItem has a unique symbolic name: a global identifier. It also uses a Region instance to describe the full size of the (virtual) data item. It is the runtime's job to take the full size of the data item, partition it into smaller sub-regions, and create fragments covering those sub-regions throughout the available nodes. This data distribution may later be adjusted dynamically, to facilitate load balancing and management operations or to react to changes in the infrastructure, such as joining or failing nodes.
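The partitioning step can be sketched for the 1D case as follows. The Interval type and the partition_region function are assumptions for illustration, and the even block distribution is just one possible policy; the text leaves the actual distribution strategy to the runtime.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 1D region: the half-open index interval [begin, end).
struct Interval {
    std::size_t begin;
    std::size_t end;
};

// Splits the full region into `nodes` contiguous sub-regions of nearly
// equal size, one per node.
std::vector<Interval> partition_region(Interval full, std::size_t nodes) {
    assert(nodes > 0 && full.begin <= full.end);
    std::vector<Interval> parts;
    const std::size_t size = full.end - full.begin;
    std::size_t start = full.begin;
    for (std::size_t i = 0; i < nodes; ++i) {
        // The first (size % nodes) parts receive one extra element.
        const std::size_t length = size / nodes + (i < size % nodes ? 1 : 0);
        parts.push_back({start, start + length});
        start += length;
    }
    return parts;
}
```

For a data item of 16 elements on 4 nodes, the second sub-region is [4, 8), i.e. indices 4 to 7. A dynamic redistribution would recompute such a partition and move data between the affected fragments.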

2.3.2 WorkItem

Having explained DataItem, it is time to illustrate the WorkItem concept. A WorkItem is the runtime interface to describe a task, with additional data dependencies and various code variants.


It is constructed by utilizing a template, called WorkItemDescription. This template contains information about the result we expect from the WorkItem, as well as the type of the data/parameters the task needs in order to be processed, called the “Closure” type. It can also contain multiple implementations of the actual task. This is covered by the tuple<WorkItemVariants…> field in the description. We need to define the term WorkItemVariant at least roughly, because it is essential for understanding how WorkItems are constructed: a WorkItemVariant is a template class that is used to describe one implementation of a work task.


It consists, among other things, of a requires function which returns a list of data dependencies imposed by the execution of the corresponding code variant. Those requirements list sub-regions of DataItems and access modes (e.g. ReadOnly or ReadWrite) which will have to be accessible on any node supposed to process the associated task. The WorkItemVariant also contains the execute method, which is the actual implementation of the computation. This is where the user code will be located. The return value of the execute method is what we call a “Treeture”. At least one WorkItemVariant is needed to generate a WorkItemDescription, which in turn is needed to generate an actual WorkItem. However, there can be multiple variants, e.g. when the compiler is synthesizing specialized code versions for specific use cases (sequential execution, checkpoint creation) or target architectures (various accelerators).

There are also two entities to administrate these items: the scheduler and a manager for DataItems, which will be presented later. Having discussed the basic elements in an informal and abstract way, we will now continue to describe the same components with regard to their actual representation in the interface, i.e. written code.

2.3.3 Region-Realization

As explained, a region instance addresses a subset of elements of a collection. Regions are needed to reference sets of elements in data objects. They have to support a set of operations. One very important operation is load, which loads a Region from an archive obtained from the network. Other operations include union and intersection of Regions of the same type, needed for the management of data partitions.

    /// Description on how the data is accessed
    enum access_privilige {
        read_only,
        write_first,
        read_write,
        write_only,
        accelerator_read_access,
        accelerator_write_access,
        accelerator_read_write_access,
    };

    template <typename Region>
    struct is_region;

    /// Calculates the union of two regions
    Region merge(Region const& region1, Region const& region2);

    /// Calculates the intersection of two regions
    Region intersection(Region const& region1, Region const& region2);

    /// \return true if the region is empty, false otherwise
    bool empty(Region const& region);

    /// Loads a Region from an archive which has been obtained from the network
    template <typename Archive, typename Region>
    void load(Archive& archive, Region& region);

    /// Saves a Region to an archive which is supposed to be sent over the
    /// network
    template <typename Archive, typename Region>
    void save(Archive& archive, Region const& region);
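One possible region type satisfying the merge, intersection and empty operations is sketched below for a 1D array. Representing a region as a sorted vector of unique indices, as well as the names IndexRegion and make_interval, are assumptions made for this example; an interval-based representation would be more compact, and serialization (load/save) is left out.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Hypothetical region for a 1D array: a sorted list of unique indices.
struct IndexRegion {
    std::vector<int> indices;
};

// Builds the region covering the inclusive index range [first, last].
IndexRegion make_interval(int first, int last) {
    assert(first <= last);
    IndexRegion region;
    for (int i = first; i <= last; ++i) region.indices.push_back(i);
    return region;
}

// Union of two regions.
IndexRegion merge(const IndexRegion& a, const IndexRegion& b) {
    IndexRegion result;
    std::set_union(a.indices.begin(), a.indices.end(),
                   b.indices.begin(), b.indices.end(),
                   std::back_inserter(result.indices));
    return result;
}

// Intersection of two regions.
IndexRegion intersection(const IndexRegion& a, const IndexRegion& b) {
    IndexRegion result;
    std::set_intersection(a.indices.begin(), a.indices.end(),
                          b.indices.begin(), b.indices.end(),
                          std::back_inserter(result.indices));
    return result;
}

// True if the region addresses no element.
bool empty(const IndexRegion& region) { return region.indices.empty(); }
```

An empty intersection between the regions required by two tasks indicates that they touch disjoint parts of the data, which is the kind of question a data manager has to answer when placing fragments.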

2.3.4 Fragment-Realization

A Fragment is a storage for a sub-set of elements of a collection, which is addressed by a Region:

    /// A Fragment is a reference into a collection of data with a specific
    /// viewpoint expressed with a Region
    Fragment create(Region const& region);
    void resize(Fragment const& fragment, Region const& region);
    OutFragment mask(Fragment const& fragment, Region const& region);
    void insert(Fragment& destination, Fragment const& source, Region const& region);
    void save(Archive& ar, Fragment const& fragment);
    void load(Archive& ar, Fragment& fragment);
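The following sketch shows how create, mask and insert could behave for a simple index-based fragment. The concrete types (a region as a vector of indices, a fragment as a std::map from index to value) and the exact semantics chosen for mask and insert are assumptions for illustration, not the planned implementation.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Hypothetical region: a list of element indices.
using Region = std::vector<int>;

// Hypothetical fragment: storage for the elements addressed by a region.
struct Fragment {
    std::map<int, double> values;
};

// Creates a fragment with zero-initialized storage for every index in the region.
Fragment create(const Region& region) {
    Fragment fragment;
    for (int index : region) fragment.values[index] = 0.0;
    return fragment;
}

// Returns a copy of those elements of the fragment that lie within the region.
Fragment mask(const Fragment& fragment, const Region& region) {
    Fragment result;
    for (int index : region) {
        auto found = fragment.values.find(index);
        if (found != fragment.values.end()) result.values.insert(*found);
    }
    return result;
}

// Copies the elements of `source` addressed by `region` into `destination`.
void insert(Fragment& destination, const Fragment& source, const Region& region) {
    for (int index : region) {
        auto found = source.values.find(index);
        assert(found != source.values.end() && "region must be covered by source");
        destination.values[index] = found->second;
    }
}
```

Moving data between nodes can then be pictured as masking a fragment with the region to be transferred, shipping the result, and inserting it into the fragment on the receiving side.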

2.3.5 DataItem-Realization

The static type part of a data_item can now be described using a template parameter DataItemDescription, which is used to create it. A DataItemDescription contains information about the types used to process the data item:

    /// Concept CollectionFacade:
    /// A CollectionFacade is the user facing type which is logically representing
    /// a global view on the data, however, it might only have access to an acquired
    /// fragment which is just a subset

    /// Information about the types used to process a data item
    template <typename Region, typename Fragment, typename CollectionFacade>
    class data_item_description {
    public:
        using region_type = Region;
        using fragment_type = Fragment;
        using collection_facade = CollectionFacade;
    };

Here, CollectionFacade is the interface to the end-user, a global view of the distributed data collection. The data_item itself contains the full size of the data item in the form of a region_type value, and a unique identifier to identify the data_item globally:

    /// A symbolic representation to some managed data element (collection)
    template <typename DataItemDescription>
    class data_item {
    public:
        using region_type = typename DataItemDescription::region_type;

        /// Creates a data item with a unique identifier
        data_item();

        /// Returns the unique identifier to a data item
        id_type get_id() const;
    };


Creation, acquisition and deletion of data_items is the responsibility of the data_item_manager, which offers create, destroy, acquire, and release methods and is the interface to the compiler:

    class data_item_manager {
    public:
        /// creates a symbolic instance for a data item
        template <typename Executor, typename DataItemDescription>
        data_item<DataItemDescription> create(
            typename DataItemDescription::region_type const& size,
            Executor = this_executor);

        /// destroy a symbolic instance for a data item
        template <typename DataItemDescription>
        void destroy(data_item<DataItemDescription> item);

        /// Called in the generated code (from the compiler)
        /// when the data is accessed
        template <typename DataItemDescription>
        typename DataItemDescription::collection_facade
        acquire(requirement<DataItemDescription> const& requirement);

        /// Called in the generated code (from the compiler)
        /// when the data is not needed anymore
        template <typename DataItemDescription>
        void release(requirement<DataItemDescription> const& requirement);

        /// \returns the locations of where the fragments to data items
        /// that the passed regions contains are located
        template <typename DataItemDescription>
        future<std::vector<std::pair<typename DataItemDescription::region_type, locality>>>
        locate(requirement<DataItemDescription> const& requirement);

        /// \param shopping_list a list of requirements with a locality hint
        /// indicating on what should be gathered
        ///
        /// \return This function returns a future that becomes ready when the
        /// data has been successfully copied to the system
        template <typename Executor, typename DataItemDescription>
        future<void> gather(
            Executor const& executor,
            std::vector<std::pair<requirement<DataItemDescription>, locality>> const& shopping_list);
    };

data_items are interfaced with the work_items by the WorkItemVariant, a generic concept that features a requires(Closure const& closure) method, which returns a list of requirements. A requirement references a data_item, the sub-region of the item to be accessed, and the access privileges needed for the associated task to be processed.

    template <typename Result>
    class WorkItemVariant {
    public:
        /// \return If the work item variant was generated;
        constexpr bool valid;

        /// \tparam Closure List of captured variables that is the used
        /// data items and passed parameters
        /// \return A list of requirements
        template <typename Closure>
        static unspecified requires(Closure const& closure);

        /// Executes the given variant using the captured variables in the closure
        /// once all requirements have been fulfilled
        template <typename Closure>
        static treeture<Result> execute(Closure const& closure);

        template <typename Closure, typename NonFunctionalProperty>
        static typename NonFunctionalProperty::result_type
        non_functional_property(Closure const& closure);
    };
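To show how a concrete code variant might look, here is a simplified sketch of a sequential variant that sums a sub-range of an array. Everything in it is hypothetical: the Requirement, Access and SumClosure types, the use of a raw pointer instead of an acquired data item, and the plain double result instead of a treeture. The requirements function is named get_requirements in this sketch because requires is a reserved word since C++20.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical access modes and requirement record.
enum class Access { ReadOnly, ReadWrite };

struct Requirement {
    int data_item_id;   // which data item is needed
    std::size_t begin;  // sub-region [begin, end) of that item
    std::size_t end;
    Access mode;        // how the sub-region will be accessed
};

// Hypothetical closure: the captured parameters of the task.
struct SumClosure {
    int data_item_id;
    const std::vector<double>* data;
    std::size_t begin;
    std::size_t end;
};

// One code variant in the spirit of WorkItemVariant (simplified).
struct SequentialSum {
    static constexpr bool valid = true;

    // Lists the data this variant needs before it may run.
    static std::vector<Requirement> get_requirements(const SumClosure& c) {
        return {{c.data_item_id, c.begin, c.end, Access::ReadOnly}};
    }

    // The actual computation; a real variant would return a treeture.
    static double execute(const SumClosure& c) {
        assert(c.data != nullptr && c.begin <= c.end && c.end <= c.data->size());
        const auto first = c.data->begin() + static_cast<std::ptrdiff_t>(c.begin);
        const auto last = c.data->begin() + static_cast<std::ptrdiff_t>(c.end);
        return std::accumulate(first, last, 0.0);
    }
};
```

A scheduler would first inspect the returned requirements, make the listed sub-regions available on the chosen node, and only then call execute.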

2.3.5 WorkItem-Realization

The WorkItemVariant exposes an execute method which is passed a Closure as parameter (representing the captured variables, once all requirements have been acquired by the use of the data_item_manager) and then executes the given variant. WorkItemVariant also offers the possibility to specify additional non-functional properties characterizing the represented code variant. These can include things such as parallel granularity, resiliency requirements, and memory usage patterns, as well as whether or not GPUs or other accelerators are utilized for the computation. This information is made available to the scheduler for conducting scheduling decisions. WorkItemVariants are used to create WorkItemDescriptions:

    template <typename Result, typename Closure, typename SplitVariant,
              typename ProcessVariant, typename... WorkItemVariant>
    class work_item_description {
    public:
        using result_type = Result;
        using closure_type = Closure;
        using split_variant = SplitVariant;
        using process_variant = ProcessVariant;
        using work_items = std::tuple<WorkItemVariant...>;
    };

Each WorkItemDescription can contain a number of WorkItemVariants, represented by a std::tuple<WorkItemVariant...>. In order to create a work_item object, a WorkItemDescription is passed as template parameter to the work_item class, resulting in a work_item featuring the given WorkItemDescription.

    class work_item_base;

    template <typename WorkItemDescription>
    class work_item : work_item_base {
    public:
        using result_type = typename WorkItemDescription::result_type;
        using closure_type = typename WorkItemDescription::closure_type;

        work_item(closure_type&& closure);
    };

Another method needed for the scheduling is the run method:

    /// Execute a given variant with the closure, this gets called from within
    /// the scheduler
    template <typename Executor, typename WorkItemVariant, typename Closure>
    void run(Executor, Closure&& closure);

It is passed a WorkItemVariant and an associated closure in order to execute the given variant on a hardware resource, which is determined by the first parameter, an HPX executor instance.

2.3.6 Scheduler-Realization

Having introduced the basic classes needed for dependency, data and work descriptions, we will now introduce the scheduler interface. The scheduler is a generic component in the global address space, which is responsible for making decisions regarding where and how to process work items, where to place sub-regions of data items, and how to tune hardware parameters. It features a spawn method, which is responsible for scheduling the execution of a given WorkItem, by first selecting one of its code variants and a location for its execution, followed by forwarding the selected options to the run method outlined above.

    struct scheduler {
        /// Flushes the queues and aborts all tasks that haven't been started yet
        void flush();

        /// This function will be generated in the compiler and eventually
        /// spawns a specific WorkItemVariant based on some scheduler decision
        template <typename WorkItemDescription>
        treeture<typename WorkItemDescription::result_type>
        spawn(typename WorkItemDescription::closure_type const& closure);

    private:
    };

The spawn method is invoked by the code generated by the compiler, with a generic parameter WorkItemDescription, describing the work_item it will be creating. As visible from the signature above, spawn will be given a parameter WorkItemDescription::closure_type const& closure. It is used to capture the function's “calling context”, including captured parameters and data items. The spawn method returns a Treeture providing a future handle to the results of the work item. Type information about the result in the future tree is provided by the WorkItemDescription::result_type template parameter of the treeture return type of the spawn method.
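The variant selection performed by spawn can be sketched with a toy scheduler that chooses between a split variant and a process variant depending on the size of the task. The names ToyScheduler and RangeClosure, the size threshold as the only decision criterion, and the use of deferred std::async futures in place of treetures and HPX executors are all assumptions of this sketch; it also ignores the choice of an execution location.

```cpp
#include <cassert>
#include <cstddef>
#include <future>
#include <numeric>
#include <vector>

// Hypothetical closure: sum the elements [begin, end) of `data`.
struct RangeClosure {
    const std::vector<int>* data;
    std::size_t begin;
    std::size_t end;
};

class ToyScheduler {
public:
    explicit ToyScheduler(std::size_t split_threshold)
        : split_threshold_(split_threshold) {
        assert(split_threshold_ >= 1);  // guarantees that splitting terminates
    }

    // Selects a code variant for the task and returns a future to its result.
    std::future<long> spawn(const RangeClosure& c) const {
        assert(c.data != nullptr && c.begin <= c.end && c.end <= c.data->size());
        if (c.end - c.begin > split_threshold_) return split(c);
        return process(c);
    }

private:
    // Split variant: decompose the task into two sub-tasks and combine them.
    std::future<long> split(const RangeClosure& c) const {
        const std::size_t mid = c.begin + (c.end - c.begin) / 2;
        auto left = spawn({c.data, c.begin, mid}).share();
        auto right = spawn({c.data, mid, c.end}).share();
        return std::async(std::launch::deferred,
                          [left, right] { return left.get() + right.get(); });
    }

    // Process variant: compute the result sequentially.
    std::future<long> process(const RangeClosure& c) const {
        return std::async(std::launch::deferred, [c] {
            const auto first = c.data->begin() + static_cast<std::ptrdiff_t>(c.begin);
            const auto last = c.data->begin() + static_cast<std::ptrdiff_t>(c.end);
            return std::accumulate(first, last, 0L);
        });
    }

    std::size_t split_threshold_;
};
```

In this sketch the data must outlive the returned future, since the closure only stores a pointer. A real scheduler would base its decision on the non-functional properties of the variants and on monitoring data rather than on a fixed threshold.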


2.4 Hardware

The configuration of the hardware is accessible through topology information, that is, the local topology of the underlying processor/node with respect to processing units, memory and interconnects.

    /// Returns a topology of executors that cover:
    ///  - NUMA Domains
    ///  - Symmetric multiprocessing
    ///  - Accelerators
    ///  - Provides different sets of cores to use
    ///  - Different low level scheduling policies:
    ///    * FIFO ordering
    ///    * Work stealing within the set of cores
    ///    * ...
    executor_topology get_topology();

Since the scheduler might need to make decisions involving more than one process/locality, it should make use of the HPX communication facilities by registering schedulers with a unique name and resolving the names to Global IDs in order to call scheduler-defined actions to make further decisions.

3 Monitoring/Events

The runtime system exposes an API based on HPX performance counters which allows the following:

• Generate AllScale Specific Performance Counter Data
• Globally discoverable and queryable
• Subscribable: event on change or event on change given a specific threshold etc.

This will form the basis for the Performance Monitor and Resilience Manager components, as they will be able to a) register counters and b) react on the counters to either generate new work items or generate events for the scheduler to react on.
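The subscribable, threshold-based behavior listed above can be pictured with the following standalone sketch. It deliberately does not use the HPX performance counter API; the class ThresholdCounter, its methods, and the choice to fire only when the value crosses a threshold from below are assumptions made for illustration.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical counter that notifies subscribers when its value crosses
// a threshold from below.
class ThresholdCounter {
public:
    using Callback = std::function<void(const std::string& name, double value)>;

    explicit ThresholdCounter(std::string name) : name_(std::move(name)) {}

    // Registers a callback to be invoked when the value reaches `threshold`.
    void subscribe(double threshold, Callback callback) {
        assert(static_cast<bool>(callback));
        subscriptions_.push_back({threshold, std::move(callback)});
    }

    // Updates the counter and fires the callbacks whose threshold was crossed.
    void set(double value) {
        const double previous = value_;
        value_ = value;
        for (const auto& s : subscriptions_) {
            if (previous < s.threshold && value >= s.threshold) {
                s.callback(name_, value);
            }
        }
    }

    double value() const { return value_; }

private:
    struct Subscription {
        double threshold;
        Callback callback;
    };

    std::string name_;
    double value_ = 0.0;
    std::vector<Subscription> subscriptions_;
};
```

A monitoring component could, for example, subscribe to a queue-length counter and generate an event for the scheduler once the queue grows beyond a given limit.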

4 Interface to the Performance and Resilience Monitor

Work Package 5 has the task of developing language and tool support for continuous monitoring of application performance and error resilience, as well as support for application-specific, algorithmic error detection and recovery from both errors and performance anomalies due to non-deterministic variability in performance. In order to accomplish this task, the runtime needs to be interfaced to the cross-layer resilience monitor. The interface to the tool developed in WP5 will be based on HPX performance counters. It can thus provide a range of information, including locality-specific performance metrics such as thread queue length, states of threads and work items, and cache statistics. This information can then be used by the performance monitor to inform the dynamic optimizer to make prudent decisions at runtime.


5 Interface to the Resilience Manager

In order to provide resilience support, the runtime system will provide functionality to backup/restore data items and work item states. In the event of a network or node failure, or any other disruption, the resilience manager will orchestrate a recovery process. It has to be discussed which component is detecting network faults. Furthermore, global checkpointing needs to be avoided due to its detrimental effect on performance.

6 Conclusions and Future Work

Our specification allows us to express tasks, data dependencies, and non-functional requirements such as hardware requirements in a parallel-ready way. The usage of template-heavy code assures maximum flexibility for further development as well as performance benefits. Paradigms such as the work and data items have the advantage of offering ways to utilize multiple implementations of variations of the same code fragments. The Performance Monitor and Resilience Manager interfaces are still very early in their planning phase and need to be further refined based on the insights obtained over the course of the development process.

Bibliography

D. Gelernter, Linda in Context (1989).