Transcript
Page 1: A Fine-Grained Analysis of User-Generated Content to Support Decision Making

A Fine-Grained Analysis of User-Generated Content to Support Decision Making

Marcirio Silveira Chaves
http://mchaves.wikidot.com

Information Systems Research Group

Business and Information Technology Research Centre (BITREC)
Institute for Scientific and Technological Research of Universidade Atlântica (ISTR)

Workshop

Page 2: A Fine-Grained Analysis of User-Generated Content to Support Decision Making

User-Generated Content (UGC)
•  Also known as
   –  User-Generated Data
   –  User-Created Content
   –  User-Contributed Data
   –  Consumer-Generated Media
   –  …
•  Can be expressed through
   –  Opinions
   –  Reviews
   –  Comments
   –  Posts

•  Notes:
   –  All the examples described in this workshop are real data.
   –  Some papers mentioned here are under review.
   –  Color legend:
      •  Examples
      •  Positive feature
      •  Negative feature

Apr-18-12 Marcirio Chaves - marcirioc@uatlantica.pt

Page 3: A Fine-Grained Analysis of User-Generated Content to Support Decision Making

Example of UGC
•  An opinion posted on Facebook, Dec-10-2011, 12:30 pm
   –  “would highly recommend Infinity Motorcycles, Southampton for all motorbiking gear. Very reasonable people. Earlier they gave me a full money back for a unused (after explaining why it was unused) ladies motorbike jacket (no defects whasoever) and today the zipper on my new jacket was broken and they gave me a brand new one (no questions asked, no receipt business and no fuss created). Five Star service.”
   –  This user had 226 friends.

Page 4: A Fine-Grained Analysis of User-Generated Content to Support Decision Making

Some statistics about UGC
•  More than 50% of all internet visits are now to UGC/social media sites.
•  More than 75% of time spent on the internet is "social".
•  Facebook now captures as much time spent on the internet as Google, Yahoo, and AOL.
•  More than 80% of consumers are influenced by Social Marketing.

Source: http://www.bbrisco.com/2010/05/social.html

Page 5: A Fine-Grained Analysis of User-Generated Content to Support Decision Making

Main Objectives of this Workshop
•  In-depth analysis of UGC
•  Use UGC to support decision making
•  Study a domain ontology to support Artificial Intelligence tasks
•  Address approaches for sentiment analysis
•  From theory to practice: Hands-on Session

Page 6: A Fine-Grained Analysis of User-Generated Content to Support Decision Making

Outline

Part 1
•  Workshop Context
•  User-Generated Content (UGC)
•  Characterisation of UGC
•  Knowledge Engineering - Ontology Development
•  Hands-on Session (Individual Task): Dealing with UGC

Part 2
•  Sentiment Analysis / Opinion Mining
•  Polarity Recognizer in Portuguese (PIRPO)
•  Information Visualisation

Page 7: A Fine-Grained Analysis of User-Generated Content to Support Decision Making

Context Workshop

A framework for Customer Knowledge Management based on Social Semantic Web.

Chaves, Marcirio Silveira; Trojahn, Cássia and Pedron, Cristiane Drebes. A Framework for Customer Knowledge Management based on Social Semantic Web: A Hotel Sector Approach. In: Customer Relationship Management and the Social and Semantic Web: Enabling Cliens Conexus. Colomo-Palacios, R.; Varajão, J. and Soto-Acosta, P. (Eds.). p. 141-157, Hershey, PA: IGI Global, 2012. ISBN: 978-161-35-0044-6

Page 8: A Fine-Grained Analysis of User-Generated Content to Support Decision Making

A Fine-grained Analysis of UGC
•  Overall opinion about a topic is only a part of the information of interest.
•  Document-level sentiment classification fails to detect sentiment about individual aspects of the topic. In reality, for example, though one could be generally happy about his car, he might be dissatisfied by the engine noise.
•  To the manufacturers, these individual weaknesses and strengths are equally important to know, or even more valuable than the overall satisfaction level of customers (Tang et al., 2009).

Page 9: A Fine-Grained Analysis of User-Generated Content to Support Decision Making

UGC

An opinion is simply a positive or negative sentiment, view, attitude, emotion, or appraisal about an entity or an aspect of the entity (Hu and Liu, 2004; Liu, 2006) from an opinion holder (Bethard et al., 2004; Kim and Hovy, 2004; Wiebe et al., 2005).

Page 10: A Fine-Grained Analysis of User-Generated Content to Support Decision Making

Characterisation of UGC
•  Opinion's Characterisation
   –  I use and extend the definition proposed by (Ding et al., 2008; Liu, 2010; Martin and White, 2005) to analyse the sentences of reviews.
   –  Let the review be r.
   –  In the most general case, r is characterised as a set of the following elements {O, F, SO, H, S, A, R, I, SG}, where:

Page 11: A Fine-Grained Analysis of User-Generated Content to Support Decision Making

Characterisation of UGC
•  Opinion's Characterisation
   –  O: Object
   –  F: Feature
   –  SO: Semantic-Orientation
   –  H: Holder
   –  S: Source
   –  A: Attitude
   –  SG: Suggestion
   –  R: Recommendation
   –  I: Intention
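The element set {O, F, SO, H, S, A, R, I, SG} could be held in a small record type when annotating reviews. The sketch below is only one possible encoding: the field names, the types, the -2 to +2 intensity scale and the example values are assumptions made for illustration, not part of the workshop material.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SemanticOrientation:
    """SO: sentence-level perspectives (objectivity, polarity, intensity)."""
    objective: bool  # True if the sentence mentions facts
    polarity: str    # "positive", "negative", "neutral" or "irrelevant"
    intensity: int   # assumed scale: -2 (very negative) .. +2 (very positive)


@dataclass
class Opinion:
    """One opinion found in a review r."""
    obj: str                              # O: object (product or service)
    feature: str                          # F: feature of the object
    so: SemanticOrientation               # SO
    holder: Optional[str] = None          # H
    source: Optional[str] = None          # S
    attitude: Optional[str] = None        # A
    recommendation: Optional[str] = None  # R
    intention: Optional[str] = None       # I
    suggestion: Optional[str] = None      # SG
    feature_is_explicit: bool = True      # explicit vs implicit feature


# Hypothetical annotation of the sentence "Very kindly staff."
example = Opinion(
    obj="hotel",
    feature="staff",
    so=SemanticOrientation(objective=False, polarity="positive", intensity=2),
    holder="solo traveller",
    source="tripadvisor.com",
)
```

Elements that a given review does not express (for instance a suggestion) simply stay `None`.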

Page 12: A Fine-Grained Analysis of User-Generated Content to Support Decision Making

Characterisation of UGC
1 - Object (O)
   –  An object is a product (e.g. movie and book) or a service (e.g. hotel and restaurant) under review, which is composed of features.
   –  Objects are also called entities.

2 - Feature (F)
   –  A feature is a component or part of an object.
      •  actor and photography are features of a movie.
      •  pool and staff are features of a hotel.
   –  Features are also called attributes or facets.
   –  A feature can be mentioned explicitly or implicitly in a review (Ding et al., 2008).

Page 13: A Fine-Grained Analysis of User-Generated Content to Support Decision Making

Characterisation of UGC
2.1 - Explicit Feature (F)
   –  If a feature f appears in review r, it is called an explicit feature in r.
   –  The hotel is located very near the center city.
      •  location is an explicit feature.

2.2 - Implicit Feature (F)
   –  If f does not appear in r but is implied, it is called an implicit feature in r.
   –  Hotel is far from public transportation.
      •  location is an implicit feature.

Page 14: A Fine-Grained Analysis of User-Generated Content to Support Decision Making

Characterisation of UGC
3 - Sentence-Orientation (SO)
   –  A review consists of a sequence of sentences r = ⟨s1, s2, …, sm⟩ (Ding et al., 2008).
   –  A sentence can be evaluated from the following perspectives:

Page 15: A Fine-Grained Analysis of User-Generated Content to Support Decision Making

Characterisation of UGC
3.1 Objectivity
   –  An objective sentence contains or mentions facts.
      •  This hotel is far from the airport, ca. 15 km.
   –  A subjective sentence does not mention any fact.
      •  The parking could be free.

3.2 Polarity
   –  It describes the orientation present in a sentence (i.e. positive, negative, neutral and irrelevant).

Page 16: A Fine-Grained Analysis of User-Generated Content to Support Decision Making

Characterisation of UGC
3.3 Intensity (strength of the polarity)
   –  It refers to the strength of the private state that is being expressed, in other words, how strong an emotion or a conviction of belief is (Wilson, 2008).
   –  It describes how intense the experience of using a product or service was:
      •  very positive, positive, neutral, negative and very negative.
      •  "Very kindly staff." refers to a very positive impression of the staff service.

Page 17: A Fine-Grained Analysis of User-Generated Content to Support Decision Making

Characterisation of UGC
4 - Opinion Holder (H)
   –  The holder of a particular opinion is the person or the organisation that holds the opinion (Ding et al., 2008).
   –  A holder is identified with demographic characteristics (e.g. name, city and country).
   –  Sites such as tripadvisor.com and booking.com classify holders as types including:
      •  families with older children
      •  families with young children
      •  mature couples
      •  groups of friends
      •  solo travellers
      •  young couples

Page 18: A Fine-Grained Analysis of User-Generated Content to Support Decision Making

Characterisation of UGC
5 - Source
   –  An information source is a website which provides a set of reviews.
      •  tripadvisor.com
      •  booking.com
      •  amazon.com

•  A: Attitude
•  SG: Suggestion
•  R: Recommendation
•  I: Intention

Page 20: A Fine-Grained Analysis of User-Generated Content to Support Decision Making

Limitations for representing knowledge in the accommodation sector

language?

Page 21: A Fine-Grained Analysis of User-Generated Content to Support Decision Making

More limitations
•  Currently, web agents are unable to answer questions such as:
   –  Which hotels in Rome have the longest opening hours for the indoor swimming pool?
   –  Which hotels in Lisbon have the cheapest breakfast?
   –  Which are the cheapest hotels in Barcelona with a family suite with a sea view?

Page 22: A Fine-Grained Analysis of User-Generated Content to Support Decision Making

Knowledge Engineering
•  Ontology as a support to evaluate UGC
   –  Set of concepts for a specific domain
   –  Human and machine readable
   –  Support for fine-grained analysis of the instances (e.g. reviews)
   –  Hontology (H stands for hotel, hostal and hostel)
      •  A robust, coherent and multilingual representation of the accommodation sector.

Page 24: A Fine-Grained Analysis of User-Generated Content to Support Decision Making

Knowledge Engineering
•  Development Methodology
   –  Identify existing ontologies on related domains
   –  Select the main concepts and properties
   –  Organize concepts and properties hierarchically into categories
   –  Translate the ontology (manual)
   –  Expand concepts and properties based on comments
   –  Translate the new concepts and properties (manual)
   –  Generate the ontology in several formats

Chaves, M. S. and Trojahn, C. Towards a Multilingual Ontology for Ontology-driven Content Mining in Social Web Sites. Proc. of the ISWC 2010 Workshops, Volume I, 1st International Workshop on Cross-Cultural and Cross-Lingual Aspects of the Semantic Web. Shanghai, China, November 7th, 2010.

Page 25: A Fine-Grained Analysis of User-Generated Content to Support Decision Making

Knowledge Engineering
•  Hontology
   –  A multilingual ontology for the accommodation sector.
•  Demo Protégé

Chaves, M. S.; Freitas, L. A. and Vieira, R. (2012). Hontology: A multilingual ontology for the accommodation sector. 4th International Conference on Knowledge Engineering and Ontology Development, Barcelona, Spain, 4-7 October. (Submitted)

Page 26: A Fine-Grained Analysis of User-Generated Content to Support Decision Making

Knowledge Engineering

Preliminary Hontology Statistics

Metrics                               Value
Number of Concepts                      285
Number of Object Properties              10
Number of Data Properties                31

Concept Axioms
Subconcept axioms                       270
Equivalent concepts axioms                4
Disjoint concepts axioms                 93

Object Property Axioms
Functional object property axioms         6
Object property domain axioms            11
Object property range axioms              8

Data Property Axioms
Functional data property axioms          12
Object data domain axioms                17
Object data range axioms                  1

Page 27: A Fine-Grained Analysis of User-Generated Content to Support Decision Making

Hands-on Session
•  The aim of this hands-on session is to let you think in depth about UGC in the context of the accommodation sector.
•  You are going to receive a set of 4 or 5 reviews about accommodations and should evaluate each one according to the following parameters:
   –  Features present in the review (see the concepts of Hontology)
   –  Intensity (strength of the polarity): very negative, negative, neutral, positive, very positive
•  Notes:
   –  Evaluate one feature per line.
   –  Please, [email protected]:UB:GX
   –  X = number of the group.

Page 29: A Fine-Grained Analysis of User-Generated Content to Support Decision Making

Sentiment Analysis
•  Analysis and automatic extraction of Semantic Orientation
•  Semantic orientation refers to the polarity and strength of words, phrases, or texts.
•  Approaches
   –  Lexicon-based
      •  Dictionaries of words annotated with the word's semantic orientation, or polarity.
      •  A manually built dictionary provides a solid foundation for a lexicon-based approach (Taboada et al., 2011).
   –  Statistical or Machine-learning
      •  Supervised classification

Page 30: A Fine-Grained Analysis of User-Generated Content to Support Decision Making

Sentiment Analysis
•  Lexicon-based Approach
   –  Sentiment-bearing words: a list of nouns, verbs, adjectives and adverbs
      •  (Chesley et al., 2006) use verbs and adjectives to classify English opinionated blog texts.
   –  List of conjunctions and connectives (Liu, 2010).
   –  Use of auxiliary verbs to get features and opinion-oriented words about products from texts (Khan et al., 2010).

Page 31: A Fine-Grained Analysis of User-Generated Content to Support Decision Making

Sentiment Analysis
•  Seed words
   –  are a small set of words with strong negative or positive associations, such as excellent or abysmal.
   –  In principle, a positive adjective should occur more frequently alongside the positive seed words, and thus will obtain a positive score, whereas negative adjectives will occur most often in the vicinity of negative seed words, thus obtaining a negative score (Taboada et al., 2011).
      •  This restaurant has a bad and expensive food.

Page 32: A Fine-Grained Analysis of User-Generated Content to Support Decision Making

Sentiment Analysis
•  Part-of-Speech (PoS)
   –  In order to evaluate a sentence in a review, we should consider the parts of speech mentioned, such as adjectives, adverbs and verbs.
   –  Adjectives are classified as:
      •  positive (good, excellent and clean),
      •  negative (awful, boring and terrible),
      •  neutral (regular and indifferent) and
      •  dual, which can express positive and negative opinion (small, long).
   –  In some approaches nouns are represented by concepts of a domain ontology and mapped as features.

Page 33: A Fine-Grained Analysis of User-Generated Content to Support Decision Making

Sentiment Analysis
•  Conjunction and Connective (CC)
   –  Connectives are words that help identify additional adjective opinion words and their orientations.
   –  One of the constraints concerns conjunction (i.e. and): conjoined adjectives usually have the same orientation (Liu, 2010).
      •  This room is beautiful and spacious.
         –  if beautiful is known to be positive, it can be inferred that spacious is also positive.
   –  Heuristic:
      •  People usually express the same opinion on both sides of a conjunction.

Page 34: A Fine-Grained Analysis of User-Generated Content to Support Decision Making

Sentiment Analysis
•  Conjunction and Connective (CC)
   –  Rules or constraints are also designed for other connectives (e.g. or, but, either-or, and neither-nor).
      •  This hotel is beautiful but difficult to get there.
         –  What occurs after the connective but is an indicator of a negative opinion.
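The connective rules can be illustrated with a few lines of Python. This is a rough sketch, not the method of Liu (2010): the one-word seed lexicon, the function name `propagate`, the restriction to words directly adjacent to the connective, the sign flip for but, and the second example sentence are all assumptions made for illustration.

```python
import re

# Toy seed lexicon (assumption): +1 positive, -1 negative.
SEED = {"beautiful": 1}


def propagate(sentence, lexicon):
    """Infer the polarity of an unknown word joined to a known one.

    'and' copies the polarity of the known neighbour; 'but' flips it.
    Only the words immediately before and after the connective are used.
    """
    inferred = dict(lexicon)
    tokens = re.findall(r"[a-z]+", sentence.lower())
    for i in range(1, len(tokens) - 1):
        left, conn, right = tokens[i - 1], tokens[i], tokens[i + 1]
        if conn not in ("and", "but"):
            continue
        sign = 1 if conn == "and" else -1
        if left in inferred and right not in inferred:
            inferred[right] = sign * inferred[left]
        elif right in inferred and left not in inferred:
            inferred[left] = sign * inferred[right]
    return inferred


same = propagate("This room is beautiful and spacious.", SEED)
flipped = propagate("This hotel is beautiful but noisy.", SEED)
```

With these inputs, spacious inherits the positive polarity of beautiful, while noisy receives the opposite sign.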

Page 35: A Fine-Grained Analysis of User-Generated Content to Support Decision Making

Sentiment Analysis
•  Strength of the Polarity or Intensity or Intensification
   –  Amplifiers (very, a lot) increase the semantic intensity of a neighboring lexical item;
   –  Attenuators/Downtoners (a little, slightly) decrease it.
•  Some approaches have implemented intensifiers using simple addition and subtraction
   –  if a positive adjective has an SO value of 2:
      •  an amplified adjective would have an SO value of 3, and
      •  a downtoned adjective an SO value of 1.
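The addition and subtraction scheme for intensifiers can be written down directly. The following is a minimal sketch: the function name `intensify` and the symmetric treatment of negative SO values (moving away from or toward zero) are assumptions, since the slide only gives the positive case.

```python
AMPLIFIERS = {"very", "a lot"}
DOWNTONERS = {"a little", "slightly"}


def intensify(so_value, modifier):
    """Shift an SO value by one step according to the modifier.

    Amplifiers move the value away from zero, downtoners toward zero.
    Neutral values (0) and unknown modifiers are left unchanged.
    """
    if so_value == 0:
        return 0
    step = 1 if so_value > 0 else -1
    if modifier in AMPLIFIERS:
        return so_value + step
    if modifier in DOWNTONERS:
        return so_value - step
    return so_value


amplified = intensify(2, "very")      # positive adjective with SO 2, amplified
downtoned = intensify(2, "slightly")  # the same adjective, downtoned
```

For the slide's example, an adjective with SO 2 becomes 3 when amplified and 1 when downtoned.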

Page 36: A Fine-Grained Analysis of User-Generated Content to Support Decision Making

Sentiment Analysis
•  Negation
   –  The obvious approach to negation is simply to reverse the polarity of the lexical item next to a negator, changing good (+3) into not good (−3).
   –  Negators include not, none, nobody, never, and nothing, and other words, such as without or lack.
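Polarity reversal can be sketched as follows. The lexicon holds only the slide's own example, good (+3); the function name `score` and the choice to look only at the token immediately before the lexical item are simplifying assumptions.

```python
NEGATORS = {"not", "none", "nobody", "never", "nothing", "without", "lack"}
LEXICON = {"good": 3}  # SO value taken from the example above


def score(tokens, lexicon=LEXICON):
    """Sum SO values, reversing the sign of a word preceded by a negator."""
    total = 0
    for i, token in enumerate(tokens):
        if token not in lexicon:
            continue
        value = lexicon[token]
        if i > 0 and tokens[i - 1] in NEGATORS:
            value = -value  # simple polarity reversal
        total += value
    return total


plain = score(["good"])
negated = score(["not", "good"])
```

A negator further away from the lexical item (for example "not very good") would be missed by this sketch and needs a wider window.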

Page 37: A Fine-Grained Analysis of User-Generated Content to Support Decision Making

Polarity Recognizer in Portuguese (PIRPO)
•  Polarity Recognizer in Portuguese to classify sentiment in online reviews.
•  PIRPO was built from the ground up for Portuguese to recognise the polarity of user opinions in accommodation reviews.
•  Each review is analysed according to concepts from a domain ontology.
•  We decompose the review into sentences in order to assign a polarity to each concept of the ontology in the sentence.

Chaves, M. S., Freitas, L., Souza, M. and Vieira, R. PIRPO: An Algorithm to deal with Polarity in Portuguese Online Reviews from the Accommodation Sector. 17th International Conference on Applications of Natural Language Processing to Information Systems (NLDB), Groningen, The Netherlands, 26-28 June 2012.

Page 38: A Fine-Grained Analysis of User-Generated Content to Support Decision Making

PIRPO Information Architecture

Page 39: A Fine-Grained Analysis of User-Generated Content to Support Decision Making

PIRPO
•  Reviews
   –  Full dataset: 1500 reviews from January 2010 to April 2011 in Portuguese, English and Spanish, of which 180 are in Portuguese.
•  Ontology Concepts
   –  The concepts used to classify the reviews are provided by Hontology, which in its current version has 110 concepts.

Page 40: A Fine-Grained Analysis of User-Generated Content to Support Decision Making

PIRPO
•  List of adjectives: it is composed of sentiment-bearing words.
   –  This list of polar adjectives in Portuguese
      •  contains 30,322 entries.
      •  is composed of the adjective and a polarity, which can take one of three values: +1, -1 and 0.
      •  These values correspond to the positive, negative and neutral senses of the adjective.
   –  PIRPO uses this list to calculate the semantic orientation of the concepts found in the sentences.
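To make the role of the adjective list concrete, the sketch below scores each ontology concept by the adjectives that share its sentence. It is not the PIRPO algorithm, which the slides do not spell out: the four-entry adjective list, the two concepts, the function name `concept_polarities` and the sentence-level summation are all assumptions for illustration.

```python
import re

# Tiny stand-in for the polar adjective list: adjective -> +1, -1 or 0.
ADJECTIVES = {"limpo": 1, "simpático": 1, "sujo": -1, "normal": 0}
# Two illustrative concepts: "quarto" (room) and "pessoal" (staff).
CONCEPTS = {"quarto", "pessoal"}


def concept_polarities(review):
    """Return (concept, semantic orientation) pairs, one per concept mention.

    The review is split into sentences; the orientation of a concept is the
    sum of the polarities of the adjectives found in the same sentence.
    """
    results = []
    for sentence in re.split(r"[.!?]+", review):
        tokens = re.findall(r"\w+", sentence.lower())
        orientation = sum(ADJECTIVES.get(token, 0) for token in tokens)
        for token in tokens:
            if token in CONCEPTS:
                results.append((token, orientation))
    return results


# "The room was dirty. The staff was friendly."
pairs = concept_polarities("O quarto estava sujo. O pessoal era simpático.")
```

Working sentence by sentence keeps the negative judgement of the room from leaking into the positive judgement of the staff.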

Page 41: A Fine-Grained Analysis of User-Generated Content to Support Decision Making

PIRPO Algorithm

Page 42: A Fine-Grained Analysis of User-Generated Content to Support Decision Making

PIRPO Evaluation Measures

•  Precision

\[ P = \frac{|\{\mathit{relevantConcepts}\} \cap \{\mathit{retrievedConcepts}\}|}{|\{\mathit{retrievedConcepts}\}|} \]

•  Recall

\[ R = \frac{|\{\mathit{relevantConcepts}\} \cap \{\mathit{retrievedConcepts}\}|}{|\{\mathit{relevantConcepts}\}|} \]

•  F-score (harmonic mean of precision and recall)

\[ F = \frac{2 \times P \times R}{P + R} \]
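The three measures can be computed over sets of concepts as in the sketch below. The function name `precision_recall_f`, the example concept sets and the convention of returning 0.0 for an empty denominator are assumptions; the values are illustrative and are not PIRPO results.

```python
def precision_recall_f(relevant, retrieved):
    """Compute precision, recall and F-score for two sets of concepts."""
    hits = len(relevant & retrieved)
    p = hits / len(retrieved) if retrieved else 0.0
    r = hits / len(relevant) if relevant else 0.0
    f = 2 * p * r / (p + r) if (p + r) else 0.0
    return p, r, f


# Hypothetical annotation: four relevant concepts, three retrieved, two in common.
relevant = {"room", "staff", "location", "breakfast"}
retrieved = {"room", "staff", "pool"}
p, r, f = precision_recall_f(relevant, retrieved)
```

Here two of the three retrieved concepts are relevant (P = 2/3), two of the four relevant concepts are retrieved (R = 1/2), and the harmonic mean is F = 4/7.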

Page 43: A Fine-Grained Analysis of User-Generated Content to Support Decision Making

PIRPO Preliminary Results

Page 44: A Fine-Grained Analysis of User-Generated Content to Support Decision Making

PIRPO: Discussion on the Results
•  PIRPO reached a better recall for concepts with positive polarity, while mixed polarity had a higher precision.
•  The low F-score may be mainly because the algorithm assigned a polarity to each specific concept of the ontology, while the human annotator classified the review as a whole.

Page 45: A Fine-Grained Analysis of User-Generated Content to Support Decision Making

Outline

Part 1
•  Workshop Context
•  User-Generated Content (UGC)
•  Characterisation of UGC
•  Knowledge Engineering - Ontology Development
•  Hands-on Session (Individual Task): Dealing with UGC

Part 2
•  Knowledge Engineering - Modelling UGC
•  Sentiment Analysis / Opinion Mining
•  Polarity Recognizer in Portuguese (PIRPO)
•  Information Visualisation

Page 47: A Fine-Grained Analysis of User-Generated Content to Support Decision Making

Information Visualisation
•  What is the visual model of the potential end-user?
•  How should we properly map and render:
   –  the most valued accommodation features?
   –  the perception of the quality offered by the hotel?
   –  the correlation between the guest's profile and the most relevant features?
   –  the intensity of the positivity or negativity of the features?
•  Will the use of advanced visual techniques (such as tree-oriented ones) to map the results help accommodation managers and guests to gain better insight into the data?

Page 48: A Fine-Grained Analysis of User-Generated Content to Support Decision Making

Exploring Information Visualisation
•  In the next figures
   –  Color was used to map the polarity and the strength of the polarity values onto the CO.
   –  Size was used to map the frequency with which the CO is mentioned in the reviews.
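The color and size mapping described above can be prototyped before any drawing code is written. In this sketch the palette, the -2 to +2 strength scale, the use of the rounded mean strength per concept and the function name `visual_encoding` are all assumptions; the actual figures may use a different encoding.

```python
# Hypothetical palette: polarity strength (-2 .. +2) -> color name.
COLORS = {
    -2: "dark red",
    -1: "light red",
    0: "grey",
    1: "light green",
    2: "dark green",
}


def visual_encoding(mentions):
    """Map (concept, strength) pairs to a color and a size per concept.

    Color encodes the rounded mean strength of the polarity;
    size encodes how often the concept is mentioned.
    """
    by_concept = {}
    for concept, strength in mentions:
        by_concept.setdefault(concept, []).append(strength)
    encoded = {}
    for concept, strengths in by_concept.items():
        mean = round(sum(strengths) / len(strengths))
        encoded[concept] = {"color": COLORS[mean], "size": len(strengths)}
    return encoded


marks = visual_encoding([("staff", 2), ("staff", 2), ("pool", -1)])
```

The resulting dictionary can then be handed to whichever bubble-tree or treemap library is used for rendering.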

Page 49: A Fine-Grained Analysis of User-Generated Content to Support Decision Making

Exploring Information Visualisation

Result of the application of Bubble Tree visualisation of the relation among concepts of the ontology, polarity (left) and strength of the polarity (right).

•  Carvalho, E.; Chaves, M. S., 2012. Exploring User-Generated Data Visualization in the Accommodation Sector. 16th International Conference Information Visualisation, IEEE. (Submitted)

Page 50: A Fine-Grained Analysis of User-Generated Content to Support Decision Making

Exploring Information Visualisation

Results using Treemap visualisation of the relation among type of customer, concepts of the ontology and polarity.

Page 51: A Fine-Grained Analysis of User-Generated Content to Support Decision Making

Questionnaire (in Spanish)
•  You are going to receive a questionnaire about information visualisation using UGC in the context of the accommodation sector.
•  Please, click here http://kwiksurveys.com?u=Infovises to answer it.

Page 52: A Fine-Grained Analysis of User-Generated Content to Support Decision Making

Final Remarks
•  In-depth analysis of UGC can be used as input to improve decision making.
•  It is time to think about new models to store UGC data.
•  New algorithms need to be built from the ground up to deal with UGC in languages other than English.
•  Information visualisation of UGC is in its infancy.

Page 53: A Fine-Grained Analysis of User-Generated Content to Support Decision Making

Main References
•  Bethard, S., Yu, H., Thornton, A., Hatzivassiloglou, V. and Jurafsky, D., 2004. Automatic extraction of opinion propositions and their holders. In Proceedings of the AAAI Spring Symposium on Exploring Attitude and Affect in Text.
•  Chesley, P., Vincent, B., Xu, L. and Srihari, R., 2006. Using verbs and adjectives to automatically classify blog sentiment. In AAAI Symposium on Computational Approaches to Analysing Weblogs (AAAI-CAAW), 27–29.
•  Ding, X., Liu, B. and Yu, P. S., 2008. A holistic lexicon-based approach to opinion mining. Proceedings of the Conference on Web Search and Web Data Mining (WSDM).
•  Hu, M. and Liu, B., 2004. Mining opinion features in customer reviews. In Proceedings of AAAI, pp. 755–760.
•  Kim, S.-M. and Hovy, E., 2004. Determining the sentiment of opinions. In Proceedings of the International Conference on Computational Linguistics (COLING).
•  Liu, B., 2010. Sentiment Analysis and Subjectivity. In Handbook of Natural Language Processing, Second Edition (Eds: N. Indurkhya and F. J. Damerau), CRC Press, Taylor and Francis Group, Boca Raton, FL. Chapter 28.
•  Martin, J. R. and White, P. R. R., 2005. The Language of Evaluation, Appraisal in English, Palgrave Macmillan, London & New York.
•  Taboada, M., Brooke, J., Tofiloski, M., Voll, K. D. and Stede, M., 2011. Lexicon-based methods for sentiment analysis. Computational Linguistics 37 (2), 267–307.
•  Tang, H., Tan, S. and Cheng, X., 2009. A survey on sentiment detection of reviews. Expert Systems with Applications 36 (7), 10760–10773.
•  Whitelaw, C., Garg, N. and Argamon, S., 2005. Using appraisal groups for sentiment analysis. In Proceedings of the 14th ACM International Conference on Information and Knowledge Management (CIKM '05). ACM, New York, NY, USA, 625-631.
•  Wilson, T., 2008. Fine-Grained Subjectivity Analysis. PhD Dissertation, Intelligent Systems Program, University of Pittsburgh.
•  Wilson, T., Wiebe, J. and Hoffmann, P., 2009. Recognizing contextual polarity: An exploration of features for phrase-level sentiment analysis. Computational Linguistics 35, 399–433.
•  Wu, Y., Wei, F., Liu, S., Au, N., Cui, W., Zhou, H. and Qu, H., 2010. OpinionSeer: Interactive Visualisation of Hotel Customer Feedback. IEEE Transactions on Visualization and Computer Graphics, 6, 1109-1118. Nov-Dec.

Page 54: A Fine-Grained Analysis of User-Generated Content to Support Decision Making

Open-source sentiment-analysis tools
•  Python NLTK (Natural Language Toolkit)
   –  http://www.nltk.org and http://text-processing.com/demo/sentiment
•  R, TM (text mining) module: http://cran.r-project.org/web/packages/tm/index.html
•  RapidMiner: http://rapid-i.com/content/view/184/196/
•  GATE, the General Architecture for Text Engineering: http://gate.ac.uk/sentiment
•  UIMA plug-in annotators for sentiment. Apache UIMA is the Unstructured Information Management Architecture, http://uima.apache.org/
•  Sentiment classifiers for the WEKA data-mining workbench, http://www.cs.waikato.ac.nz/ml/weka/.
•  Stanford NLP tools, http://www-nlp.stanford.edu/software: maximum-entropy classification approach for sentiment.

Page 55: A Fine-Grained Analysis of User-Generated Content to Support Decision Making

Thank you very much for your attention!!

Questions

