improving post-click user engagement on native ads via survival analysis

Improving Post-Click User Engagement on Native Ads via Survival Analysis Nicola Barbieri, Fabrizio Silvestri and Mounia Lalmas

Upload: mounia-lalmas-roelleke

Post on 21-Apr-2017



Native ads and user engagement

•  Estimate post-click engagement on mobile native ads.

•  Integrate the prediction into the ranking function to promote the most engaging ads.

•  We measure engagement by analyzing what happens after the user clicks:
–  How much time does the user spend on the landing page?

•  Two possibilities:
–  the user immediately comes back (bounces)
–  the user stays longer and “hopefully” converts (high dwell time)

[Figure labels: loser, winner]

Dwell-time-based user engagement

How are native ads selected?


•  It depends on potential revenue: eCPM = P(click) * bid
–  P(click) is the probability of the user clicking on an ad.
–  Bid is the amount of money the advertiser is willing to pay for the ad being shown.
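A minimal sketch of eCPM-based ranking as described above. The `Ad` class, the ad identifiers, and the click probabilities and bids are hypothetical values chosen for illustration; they are not taken from the talk.

```python
from dataclasses import dataclass


@dataclass
class Ad:
    ad_id: str
    p_click: float  # estimated probability that the user clicks (hypothetical)
    bid: float      # amount the advertiser is willing to pay (hypothetical)


def ecpm(ad: Ad) -> float:
    """Expected revenue score as defined on the slide: P(click) * bid."""
    return ad.p_click * ad.bid


def rank_by_ecpm(ads):
    """Return the ads sorted from highest to lowest eCPM."""
    return sorted(ads, key=ecpm, reverse=True)


ads = [
    Ad("c", p_click=0.01, bid=5.0),
    Ad("a", p_click=0.02, bid=5.0),
    Ad("b", p_click=0.04, bid=2.0),
]
ranked = rank_by_ecpm(ads)
ranked_ids = [ad.ad_id for ad in ranked]
```

Sorting in descending order of eCPM places the ad with the highest expected revenue first, regardless of what happens after the click.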

How are native ads selected?

eCPM = 0.1

eCPM = 0.08

eCPM = 0.05

How are native ads selected?

Now consider dwell time

•  Suppose we can estimate p_t = P(dwell_time > t):
–  The probability that a user will stay for more than t seconds on the ad landing page.

•  We modify eCPM by multiplying it by p_t.
–  In practice, we are computing the probability of clicking AND staying for more than t seconds.
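The adjusted score can be sketched as follows. The function name `dwell_aware_ecpm` and every number below are illustrative assumptions; the only point is that multiplying by p_t can change the ranking produced by eCPM alone.

```python
def dwell_aware_ecpm(p_click: float, bid: float, p_t: float) -> float:
    """eCPM multiplied by p_t = P(dwell_time > t), as described on the slide."""
    return p_click * bid * p_t


# (p_click, bid, p_t) per ad; all values are made up for illustration.
candidates = {
    "a": (0.02, 5.0, 0.30),  # highest plain eCPM, but users bounce quickly
    "b": (0.04, 2.0, 0.60),
    "c": (0.01, 5.0, 0.90),  # lowest plain eCPM, but users tend to stay
}

scores = {ad_id: dwell_aware_ecpm(*values) for ad_id, values in candidates.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
```

With these numbers, ad "a" leads on plain eCPM but drops to last place once the probability of staying longer than t seconds is taken into account.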

How to estimate p_t?

•  If t is fixed a priori, it is basically a classification problem:
–  Logistic Regression
–  Random Forest
–  …

•  What if we want to use different thresholds depending on factors such as users, ad category, etc.?
–  Different ad hoc models
–  Survival Analysis!
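To illustrate why a fixed threshold turns the task into classification, and why each new threshold would need its own model, here is a small sketch. The helper `label_dwell` and the dwell times are hypothetical.

```python
def label_dwell(dwell_times, t):
    """Binary target for a fixed threshold t: 1 if the user stayed longer than t seconds."""
    return [1 if d > t else 0 for d in dwell_times]


# Made-up dwell times in seconds, one per ad click.
dwell_times = [3, 12, 45, 70, 130]

labels_t10 = label_dwell(dwell_times, 10)
labels_t60 = label_dwell(dwell_times, 60)
```

The same clicks produce different training targets for t = 10 and t = 60, so a classifier trained for one threshold cannot simply be reused for the other.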

Survival Analysis

•  Stems from the need to make a prognosis for specific patients, i.e., to estimate the probability of surviving a specific amount of time.

•  In other contexts, the response is not ‘survival’ but a time to event:
–  How long will a bulb ‘survive’?
–  Time until the first tooth is affected by caries
–  Time until a hard drive fails
–  Time until users return to the mobile news stream
–  …

Survival Analysis

•  Interest is then in the survival function:

S(t) = P(Outcome > t) → probability of being alive after time t = 1 – probability of dying before time t
S(t) = probability of not having returned to the stream by time t = 1 – probability of returning to the stream before time t

•  Properties of S(t):
–  S(0) = 1 → you are absolutely certain to be alive at the beginning of time.
–  S(+∞) = 0 → you will eventually die.
–  S'(t) ≤ 0 → after one hour you cannot have more chances to live than before.
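These properties can be checked on an empirical survival function. The sketch below assumes every dwell time is fully observed (no censoring), and the durations are made up for illustration.

```python
def empirical_survival(durations, t):
    """Fraction of observations whose duration is strictly greater than t."""
    return sum(1 for d in durations if d > t) / len(durations)


# Made-up dwell times in seconds.
durations = [5, 8, 20, 45, 45, 60, 90, 150]
grid = [0, 10, 30, 60, 120, 200]

curve = [empirical_survival(durations, t) for t in grid]

# The curve starts at 1, ends at 0, and never increases.
starts_at_one = curve[0] == 1.0
ends_at_zero = curve[-1] == 0.0
non_increasing = all(a >= b for a, b in zip(curve, curve[1:]))
```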

Examples of a survival curve

[Figure: survival curves labeled “higher post-click engagement” and “lower post-click engagement”]

Hazard and cumulative hazard

•  The hazard function h(t) is defined as the rate of occurrence (users returning) at time t.

•  The cumulative hazard function (CHF) H(t) is the sum of the risks (users returning) from duration 0 to t.

•  S(t) and H(t) are related as follows:

\[ h(t) = -\frac{d}{dt} \log S(t) \]

\[ H(t) = \int_0^t h(u)\,du \]

\[ S(t) = e^{-H(t)} \]
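A numerical check of these three relations on a simple example. The hazard h(t) = k * t^(k-1) (a Weibull-type hazard with unit scale) is an assumption made only for this sketch; for it, H(t) = t^k and S(t) = exp(-t^k) are known in closed form.

```python
import math


def hazard(t, k=1.5):
    """Assumed example hazard h(t) = k * t**(k - 1)."""
    return k * t ** (k - 1)


def cumulative_hazard(t, k=1.5, steps=10_000):
    """H(t) = integral of h(u) from 0 to t, approximated with the trapezoidal rule."""
    dt = t / steps
    total = 0.5 * (hazard(0.0, k) + hazard(t, k))
    total += sum(hazard(i * dt, k) for i in range(1, steps))
    return total * dt


def survival(t, k=1.5):
    """S(t) = exp(-H(t))."""
    return math.exp(-cumulative_hazard(t, k))


def log_survival_exact(t, k=1.5):
    """Closed form log S(t) = -t**k for the assumed hazard."""
    return -(t ** k)


t = 2.0
H_numeric = cumulative_hazard(t)
H_exact = t ** 1.5

# h(t) = -d/dt log S(t), checked with a central finite difference.
eps = 1e-5
h_from_S = -(log_survival_exact(t + eps) - log_survival_exact(t - eps)) / (2 * eps)
```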

Survival Random Forest

•  How to estimate S(t) from a population?
–  Training data

•  Survival Random Forest (SRF):
–  Non-parametric
–  High performance
–  Learning in a multi-threading environment

•  It is similar to a “classic” Random Forest, but at the leaf nodes you find an estimate of the cumulative hazard H(t).
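The slides do not say which estimator is stored at the leaves. A common choice in random survival forests is the Nelson-Aalen estimator, which is assumed in the sketch below. The function name, the leaf data, and the event flags (1 = return observed, 0 = censored) are hypothetical.

```python
def nelson_aalen(durations, events):
    """Nelson-Aalen estimate of the cumulative hazard H(t).

    Returns a list of (time, H(time)) pairs, one per distinct event time.
    At each event time, the number of events is divided by the number of
    observations still at risk, and the ratios are accumulated.
    """
    event_times = sorted({d for d, e in zip(durations, events) if e})
    cumulative, steps = 0.0, []
    for t in event_times:
        at_risk = sum(1 for d in durations if d >= t)
        n_events = sum(1 for d, e in zip(durations, events) if d == t and e)
        cumulative += n_events / at_risk
        steps.append((t, cumulative))
    return steps


# Made-up observations that ended up in one leaf node.
leaf_durations = [5, 8, 8, 20, 45]
leaf_events = [1, 1, 0, 1, 1]

leaf_chf = nelson_aalen(leaf_durations, leaf_events)
```

The result is a step function that increases only at observed event times; censored observations contribute to the at-risk counts but never add a step.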

Experiments & dataset

•  We sampled 46,914 observations (impressions) from our native ad log, corresponding to:
–  2,438 ads
–  Over 850 advertisers.

•  We perform an 80/20 training/test split.

•  For each ad we extracted 42 features.

•  Hyper-parameters:
–  100 trees, as we found this to optimize the trade-off between time and generalization error
–  The number of features sampled at each split is √nf, where nf is the number of features.

•  Offline tests:
–  AUC and ROC curves.

•  Online (A/B) tests:
–  Dwell time uplift, bounce rate reduction, CTR uplift.
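A small sketch of what the stated setup implies numerically. The counts come from the slide; rounding √nf to the nearest integer, shuffling impressions at random, and the seed are assumptions, since the slides do not specify them.

```python
import math
import random

N_OBSERVATIONS = 46_914  # sampled impressions (from the slide)
N_FEATURES = 42          # features per ad (from the slide)

# Features tried at each split: sqrt(42) is about 6.48.
# Rounding to the nearest integer is an assumption.
features_per_split = round(math.sqrt(N_FEATURES))

# 80/20 split over shuffled row indices (random split and seed are assumptions).
indices = list(range(N_OBSERVATIONS))
random.Random(0).shuffle(indices)
n_train = int(0.8 * N_OBSERVATIONS)
train_idx, test_idx = indices[:n_train], indices[n_train:]
```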

Features (and their importance)

Historical features are the most important.
Then come document object and readability features: dwell time is influenced by how much “actual” content is present within a landing page, and by the quality of this content.

Estimated dwell time

80% with dwell time ≤ 100 seconds
median is 45 seconds
average is 65 seconds

Kaplan-Meier estimate
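For readers unfamiliar with it, the Kaplan-Meier estimate of S(t) can be sketched in a few lines. The function name and the data are hypothetical and unrelated to the dataset in the talk; event flags use 1 for an observed return and 0 for a censored observation.

```python
def kaplan_meier(durations, events):
    """Kaplan-Meier estimate of the survival function S(t).

    Returns a list of (time, S(time)) pairs, one per distinct event time.
    At each event time, S is multiplied by the fraction of at-risk
    observations that did not experience the event.
    """
    event_times = sorted({d for d, e in zip(durations, events) if e})
    surviving, steps = 1.0, []
    for t in event_times:
        at_risk = sum(1 for d in durations if d >= t)
        n_events = sum(1 for d, e in zip(durations, events) if d == t and e)
        surviving *= 1.0 - n_events / at_risk
        steps.append((t, surviving))
    return steps


# Made-up dwell times in seconds.
durations = [10, 20, 20, 45, 60, 100]
events = [1, 1, 0, 1, 1, 1]

km_curve = kaplan_meier(durations, events)
```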

ROC Curves

SA achieves the same accuracy as its counterparts
Non-linear > linear
SRF > RF
SRF does not require a predetermined threshold
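As a reminder of what the AUC measures here, the sketch below scores hypothetical predictions of p_t against binary labels derived from a chosen threshold t. The pairwise formulation, the scores, and the dwell times are illustrative assumptions, not the evaluation code used in the talk.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive example
    receives a higher score than a randomly chosen negative one
    (ties count as half).
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


# Made-up dwell times (seconds) and predicted P(dwell_time > t).
dwell_times = [3, 12, 45, 70, 130]
predicted_p_t = [0.10, 0.40, 0.35, 0.80, 0.90]

t = 40
labels = [1 if d > t else 0 for d in dwell_times]
auc_at_t = auc(predicted_p_t, labels)
```

A survival model can be evaluated this way at any t by reading p_t off its survival curve, whereas a classifier trained for one fixed threshold only answers the question for that threshold.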

Uplift in online (A/B) testing

Metric        Change
Dwell time    +13%
Bounce rate   -10.3%
CTR           +6.8%

Variable threshold?
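One way to read this question: a single survival model yields p_t for any t, so the threshold could vary (for example per ad category) without retraining. A minimal sketch, assuming the model exposes a cumulative hazard step function for an ad; the step values, category names, and thresholds are hypothetical.

```python
import math
from bisect import bisect_right


def survival_at(chf_steps, t):
    """Evaluate S(t) = exp(-H(t)) from a cumulative hazard step function.

    chf_steps is a time-sorted list of (time, H) pairs; H is 0 before the
    first step and stays constant between steps.
    """
    times = [time for time, _ in chf_steps]
    i = bisect_right(times, t)
    cumulative = chf_steps[i - 1][1] if i > 0 else 0.0
    return math.exp(-cumulative)


# Hypothetical cumulative hazard estimate for one ad.
chf_steps = [(5, 0.2), (8, 0.45), (20, 0.95), (45, 1.95)]

# Hypothetical per-category thresholds in seconds.
thresholds = {"news": 10, "retail": 30}

p_t = {category: survival_at(chf_steps, t) for category, t in thresholds.items()}
```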

Conclusions and future work

•  The survival random forest based model (slightly) outperforms all the other competing models and, more importantly, it allows computing the survival at different thresholds.

•  The A/B test shows an improvement in user experience:
–  A positive effect on CTR
–  A decrease in the number of bounces, and
–  An increase in average dwell time.

•  Future work includes:
–  Personalization of the threshold per user, per ad, and per vertical (category)
–  Integrating other signals related to revenue.

Appendix

Building time and OOB error

Cumulative hazard function