a fine-grained analysis of user-generated content to support decision making

Post on 18-Dec-2014


DESCRIPTION

User-generated content (UGC), such as online reviews, is freely available on the web. This kind of data has been used to support clients' and managers' decision making in several industries, e.g. books, tourism and hospitality. In this workshop, I will introduce a fine-grained characterisation of UGC and a new multidomain, multilingual conceptual data model to represent it. I will also present a domain-specific ontology for accommodations that can support both managerial decision making and end-user applications. Instead of the few categories commonly provided by Web 2.0 portals, this ontology enables accommodation managers to find specific information. The ontology is also used as input for an algorithm that recognises sentiment in online reviews. Finally, I will describe some of the main approaches to sentiment analysis. In short, I will address some of the main challenges of UGC by introducing: a) a proposal for a fine-grained characterisation of UGC; b) a structured representation of UGC that leverages the information provided by Web 2.0 applications; c) the main approaches to sentiment analysis; d) an ontology to represent knowledge in the accommodation sector.

TRANSCRIPT

A Fine-Grained Analysis of User-Generated Content to Support Decision Making

Marcirio Silveira Chaves
http://mchaves.wikidot.com

Information Systems Research Group

Business and Information Technology Research Centre (BITREC)
Institute for Scientific and Technological Research of Universidade Atlântica (ISTR)

Workshop

User-Generated Content (UGC)
•  Also known as
  –  User-Generated Data
  –  User-Created Content
  –  User-Contributed Data
  –  Consumer-Generated Media
  –  …
•  Can be expressed through
  –  Opinions
  –  Reviews
  –  Comments
  –  Posts

Apr-18-12 Marcirio Chaves - marcirioc@uatlantica.pt

•  Notes:
  –  All the examples described in this workshop are real data.
  –  Some papers mentioned here are under review.
  –  Color legend: examples, positive feature, negative feature.

Example of UGC
•  An opinion posted on Facebook, Dec-10-2011, 12:30 pm
  –  “would highly recommend Infinity Motorcycles, Southampton for all motorbiking gear. Very reasonable people. Earlier they gave me a full money back for a unused (after explaining why it was unused) ladies motorbike jacket (no defects whasoever) and today the zipper on my new jacket was broken and they gave me a brand new one (no questions asked, no receipt business and no fuss created). Five Star service.”
  –  This user had 226 friends.

Some statistics about UGC
•  More than 50% of all internet visits are now to UGC/social media sites.
•  More than 75% of time spent on the internet is "social".
•  Facebook now captures as much time spent on the internet as Google, Yahoo, and AOL.
•  More than 80% of consumers are influenced by social marketing.

Source: http://www.bbrisco.com/2010/05/social.html

Main Objectives of this Workshop
•  In-depth analysis of UGC
•  Use UGC to support decision making
•  Study a domain ontology to support Artificial Intelligence tasks
•  Address approaches for sentiment analysis
•  From theory to practice: Hands-on Session

Outline

Part 1
•  Workshop Context
•  User-Generated Content (UGC)
•  Characterisation of UGC
•  Knowledge Engineering - Ontology Development
•  Hands-on Session (Individual Task): Dealing with UGC

Part 2
•  Sentiment Analysis / Opinion Mining
•  Polarity Recognizer in Portuguese (PIRPO)
•  Information Visualisation

Workshop Context

A framework for Customer Knowledge Management based on the Social Semantic Web.

Chaves, Marcirio Silveira; Trojahn, Cássia and Pedron, Cristiane Drebes. A Framework for Customer Knowledge Management based on Social Semantic Web: A Hotel Sector Approach. In: Customer Relationship Management and the Social and Semantic Web: Enabling Cliens Conexus. Colomo-Palacios, R.; Varajão, J. and Soto-Acosta, P. (Eds.). p. 141-157, Hershey, PA: IGI Global, 2012. ISBN: 978-161-35-0044-6

A Fine-grained Analysis of UGC
•  The overall opinion about a topic is only a part of the information of interest.
•  Document-level sentiment classification fails to detect sentiment about individual aspects of the topic. In reality, for example, though one could be generally happy about his car, he might be dissatisfied by the engine noise.
•  To the manufacturers, these individual weaknesses and strengths are equally important to know, or even more valuable than the overall satisfaction level of customers (Tang et al., 2009).

UGC

An opinion is simply a positive or negative sentiment, view, attitude, emotion, or appraisal about an entity or an aspect of the entity (Hu and Liu, 2004; Liu, 2006) from an opinion holder (Bethard et al., 2004; Kim and Hovy, 2004; Wiebe et al., 2005).

Characterisation of UGC
•  Opinion's Characterisation
  –  I use and extend the definition proposed by Ding et al. (2008), Liu (2010) and Martin and White (2005) to analyse the sentences of reviews.
  –  Let the review be r.
  –  In the most general case, r is characterised as a set of the following elements {O, F, SO, H, S, A, R, I, SG}, where:

Characterisation of UGC
•  Opinion's Characterisation
  –  O: Object
  –  F: Feature
  –  SO: Semantic-Orientation
  –  H: Holder
  –  S: Source
  –  A: Attitude
  –  SG: Suggestion
  –  R: Recommendation
  –  I: Intention
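The elements listed above can be pictured as a small record type. The following is only a sketch under stated assumptions, not the conceptual data model presented in the workshop: the class names `Review` and `SentenceOpinion`, the field names, and the choice to attach feature and intensity to individual sentences are all hypothetical. The intensity labels are the five used elsewhere in this deck.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Five-point intensity scale used in the slides.
INTENSITY_LABELS = ("very negative", "negative", "neutral", "positive", "very positive")


@dataclass
class SentenceOpinion:
    """One sentence of a review with its feature (F) and orientation (SO)."""
    text: str
    feature: Optional[str] = None   # F: None when no feature is mentioned
    explicit: bool = True           # False for an implicit feature
    intensity: str = "neutral"      # strength of the polarity

    def __post_init__(self):
        if self.intensity not in INTENSITY_LABELS:
            raise ValueError(f"unknown intensity: {self.intensity!r}")


@dataclass
class Review:
    """Hypothetical container for the elements {O, F, SO, H, S, A, R, I, SG}."""
    obj: str                                  # O: object under review
    holder: str                               # H: opinion holder
    source: str                               # S: website providing the review
    sentences: List[SentenceOpinion] = field(default_factory=list)  # F and SO
    attitude: Optional[str] = None            # A
    suggestion: Optional[str] = None          # SG
    recommendation: Optional[str] = None      # R
    intention: Optional[str] = None           # I

    def features(self) -> List[str]:
        """Distinct features mentioned in the review, sorted by name."""
        return sorted({s.feature for s in self.sentences if s.feature})


# The intensity values below are illustrative labels, not annotated data.
review = Review(
    obj="hotel",
    holder="solo traveller",
    source="booking.com",
    sentences=[
        SentenceOpinion("Hotel is far from public transportation.",
                        feature="location", explicit=False, intensity="negative"),
        SentenceOpinion("Very kindly staff.", feature="staff",
                        intensity="very positive"),
    ],
)
print(review.features())
```

Keeping feature and intensity at sentence level reflects the sentence-by-sentence analysis described in the slides; a real model would probably also need identifiers and language tags.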

Characterisation of UGC
1 - Object (O)
  –  An object is a product (e.g. a movie or a book) or a service (e.g. a hotel or a restaurant) under review, which is composed of features.
  –  Objects are also called entities.
2 - Feature (F)
  –  A feature is a component or part of an object.
    •  actor and photography are features of a movie.
    •  pool and staff are features of a hotel.
  –  Features are also called attributes or facets.
  –  A feature can be mentioned explicitly or implicitly in a review (Ding et al., 2008).

Characterisation of UGC
2.1 - Explicit Feature (F)
  –  If a feature f appears in review r, it is called an explicit feature in r.
  –  The hotel is located very near the center city.
    •  location is an explicit feature.
2.2 - Implicit Feature (F)
  –  If f does not appear in r but is implied, it is called an implicit feature in r.
  –  Hotel is far from public transportation.
    •  location is an implicit feature.

Characterisation of UGC
3 - Sentence-Orientation (SO)
  –  A review consists of a sequence of sentences r = ⟨s1, s2, …, sm⟩ (Ding et al., 2008).
  –  A sentence can be evaluated from the following perspectives:

Characterisation of UGC
3.1 Objectivity
  –  An objective sentence contains or mentions facts.
    •  This hotel is far from the airport, ca. 15 km.
  –  A subjective sentence does not mention any fact.
    •  The parking could be free.
3.2 Polarity
  –  It describes the orientation present in a sentence (i.e. positive, negative, neutral or irrelevant).

Characterisation of UGC
3.3 Intensity (strength of the polarity)
  –  It refers to the strength of the private state that is being expressed, in other words, how strong an emotion or a conviction of belief is (Wilson, 2008).
  –  It describes how intense the experience of using a product or service was:
    •  very positive, positive, neutral, negative and very negative.
    •  "Very kindly staff." refers to a very positive impression of the staff service.

Characterisation of UGC
4 - Opinion Holder (H)
  –  The holder of a particular opinion is the person or the organisation that holds the opinion (Ding et al., 2008).
  –  A holder is identified by demographic characteristics (e.g. name, city and country).
  –  Sites such as tripadvisor.com and booking.com classify holders into types including:
    •  families with older children
    •  families with young children
    •  mature couples
    •  groups of friends
    •  solo travellers
    •  young couples

Characterisation of UGC
5 - Source (S)
  –  An information source is a website which provides a set of reviews.
    •  tripadvisor.com
    •  booking.com
    •  amazon.com
•  A: Attitude
•  SG: Suggestion
•  R: Recommendation
•  I: Intention


Limitations for representing knowledge in the accommodation sector

[Figure not recovered; the only surviving label is "language?"]

More limitations
•  Currently, web agents are unable to answer questions such as:
  –  Which hotels in Rome have the longest opening hours for the indoor swimming pool?
  –  Which hotels in Lisbon have the cheapest breakfast?
  –  Which are the cheapest hotels in Barcelona with a family suite room with sea view?

Knowledge Engineering
•  Ontology as a support to evaluate UGC
  –  A set of concepts for a specific domain
  –  Human and machine readable
  –  Supports fine-grained analysis of the instances (e.g. reviews)
  –  Hontology (H stands for hotel, hostal and hostel)
    •  A robust, coherent and multilingual representation of the accommodation sector.


Knowledge Engineering
•  Development Methodology
  –  Identify existing ontologies on related domains
  –  Select the main concepts and properties
  –  Organize concepts and properties hierarchically into categories
  –  Translate the ontology (manual)
  –  Expand concepts and properties based on comments
  –  Translate the new concepts and properties (manual)
  –  Generate the ontology in several formats

Chaves, M. S. and Trojahn, C. Towards a Multilingual Ontology for Ontology-driven Content Mining in Social Web Sites. Proc. of the ISWC 2010 Workshops, Volume I, 1st International Workshop on Cross-Cultural and Cross-Lingual Aspects of the Semantic Web. Shanghai, China, November 7th, 2010.

Knowledge Engineering
•  Hontology
  –  A multilingual ontology for the accommodation sector.
•  Demo Protégé

Chaves, M. S.; Freitas, L. A. and Vieira, R. (2012). Hontology: A multilingual ontology for the accommodation sector. 4th International Conference on Knowledge Engineering and Ontology Development, Barcelona, Spain, 4-7 October. (Submitted)

Knowledge Engineering

Preliminary Hontology Statistics

Metrics                                  Value
Number of Concepts                       285
Number of Object Properties              10
Number of Data Properties                31

Concept Axioms
Subconcept axioms                        270
Equivalent concepts axioms               4
Disjoint concepts axioms                 93

Object Property Axioms
Functional object property axioms        6
Object property domain axioms            11
Object property range axioms             8

Data Property Axioms
Functional data property axioms          12
Object data domain axioms                17
Object data range axioms                 1

Hands-on Session
•  The aim of this hands-on session is to let you think in depth about UGC in the context of the accommodation sector.
•  You are going to receive a set of 4 or 5 reviews about accommodations and should evaluate each one according to the following parameters:
  –  Features present in the review (see the concepts of Hontology)
  –  Intensity (strength of the polarity): very negative, negative, neutral, positive, very positive
•  Notes:
  –  Evaluate one feature per line.
  –  Please save your sheet in another file and send it to mschaves@gmail.com. Subject: UB:GX
  –  X = number of the group.


Sentiment Analysis
•  Analysis and automatic extraction of semantic orientation
•  Semantic orientation refers to the polarity and strength of words, phrases, or texts.
•  Approaches
  –  Lexicon-based
    •  Dictionaries of words annotated with the word's semantic orientation, or polarity.
    •  A manually built dictionary provides a solid foundation for a lexicon-based approach (Taboada et al., 2011).
  –  Statistical or machine-learning
    •  Supervised classification

Sentiment Analysis
•  Lexicon-based Approach
  –  Sentiment-bearing words: a list of nouns, verbs, adjectives and adverbs (Chesley et al., 2006)
    •  use verbs and adjectives to classify English opinionated blog texts.
  –  List of conjunctions and connectives (Liu, 2010).
  –  Use of auxiliary verbs to get features and opinion-oriented words about products from texts (Khan et al., 2010).

Sentiment Analysis
•  Seed words
  –  are a small set of words with strong negative or positive associations, such as excellent or abysmal.
  –  In principle, a positive adjective should occur more frequently alongside the positive seed words, and thus will obtain a positive score, whereas negative adjectives will occur most often in the vicinity of negative seed words, thus obtaining a negative score (Taboada et al., 2011).
    •  This restaurant has a bad and expensive food.
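The seed-word idea can be illustrated with a simple co-occurrence count. This is a minimal sketch, not the scoring used in the cited literature (which typically relies on association measures over large corpora): the function name `seed_scores`, the window size and the extra seeds "good" and "bad" are assumptions made for the example.

```python
from collections import Counter

# "excellent" and "abysmal" come from the slide; "good" and "bad" are
# added here only so that the slide's example sentence contains a seed.
POSITIVE_SEEDS = {"excellent", "good"}
NEGATIVE_SEEDS = {"abysmal", "bad"}


def seed_scores(sentences, window=3):
    """Score each non-seed word by the seeds found within `window` tokens.

    +1 for every positive seed nearby, -1 for every negative seed nearby.
    """
    scores = Counter()
    for sentence in sentences:
        tokens = sentence.lower().replace(".", " ").replace(",", " ").split()
        for i, token in enumerate(tokens):
            if token in POSITIVE_SEEDS or token in NEGATIVE_SEEDS:
                continue
            start = max(0, i - window)
            neighbours = tokens[start:i] + tokens[i + 1:i + window + 1]
            scores[token] += sum(n in POSITIVE_SEEDS for n in neighbours)
            scores[token] -= sum(n in NEGATIVE_SEEDS for n in neighbours)
    return scores


scores = seed_scores([
    "This restaurant has a bad and expensive food.",
    "The staff was excellent and friendly.",
])
print(scores["expensive"], scores["friendly"])
```

In this toy setting every word near a seed receives a score, including nouns and function words; a realistic version would restrict scoring to adjectives and use far more text.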

Sentiment Analysis
•  Part-of-Speech (PoS)
  –  In order to evaluate a sentence in a review, we should consider the parts of speech mentioned, such as adjectives, adverbs and verbs.
  –  Adjectives are classified as:
    •  positive (good, excellent and clean),
    •  negative (awful, boring and terrible),
    •  neutral (regular and indifferent) and
    •  dual, which can express positive and negative opinion (small, long).
  –  In some approaches nouns are represented by concepts of a domain ontology and mapped as features.

Sentiment Analysis
•  Conjunction and Connective (CC)
  –  Connectives are words that help identify additional adjective opinion words and their orientations.
  –  One of the constraints concerns conjunction (i.e. and): conjoined adjectives usually have the same orientation (Liu, 2010).
    •  This room is beautiful and spacious.
  –  If beautiful is known to be positive, it can be inferred that spacious is also positive.
  –  Heuristic:
    •  People usually express the same opinion on both sides of a conjunction.

Sentiment Analysis
•  Conjunction and Connective (CC)
  –  Rules or constraints are also designed for other connectives (e.g. or, but, either-or, and neither-nor).
    •  This hotel is beautiful but difficult to get there.
  –  What occurs after the connective but is an indicator of a negative opinion.
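The conjunction constraints can be sketched as a rule that propagates a known polarity across a connective. This is an illustration under assumptions, not an implementation from the cited work: the function name `propagate` is hypothetical, only words directly adjacent to the connective are considered, and "but" is treated as flipping the orientation, which matches the example above where a positive adjective precedes it.

```python
import re


def propagate(sentence, lexicon):
    """Infer the polarity of an unknown word conjoined with a known one.

    `lexicon` maps words to +1 or -1. "and" keeps the orientation and
    "but" reverses it (assumption for this sketch).
    """
    inferred = {}
    pattern = r"(\w+) (and|but) (\w+)"
    for left, connective, right in re.findall(pattern, sentence.lower()):
        sign = 1 if connective == "and" else -1
        if left in lexicon and right not in lexicon:
            inferred[right] = sign * lexicon[left]
        elif right in lexicon and left not in lexicon:
            inferred[left] = sign * lexicon[right]
    return inferred


known = {"beautiful": 1}
print(propagate("This room is beautiful and spacious.", known))
print(propagate("This hotel is beautiful but difficult to get there.", known))
```

A fuller treatment would use part-of-speech tags to make sure both sides of the connective are adjectives before inferring anything.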

Sentiment Analysis
•  Strength of the Polarity, or Intensity, or Intensification
  –  Amplifiers (very, a lot) increase the semantic intensity of a neighboring lexical item;
  –  Attenuators/Downtoners (a little, slightly) decrease it.
•  Some approaches have implemented intensifiers using simple addition and subtraction
  –  if a positive adjective has an SO value of 2:
    •  an amplified adjective would have an SO value of 3, and
    •  a downtoned adjective an SO value of 1.
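The addition and subtraction scheme is small enough to write out. In this sketch the function name `intensify` and the single-word modifier sets are assumptions; the slide only gives the positive case, so the handling of negative values (amplifiers move the score away from zero, downtoners towards it) is also an assumption.

```python
AMPLIFIERS = {"very"}
DOWNTONERS = {"slightly"}


def intensify(modifier, so_value):
    """Shift an SO value by one step depending on the modifier."""
    if so_value == 0:
        return 0
    step = 1 if so_value > 0 else -1
    if modifier in AMPLIFIERS:
        return so_value + step   # further from zero
    if modifier in DOWNTONERS:
        return so_value - step   # closer to zero
    return so_value


print(intensify("very", 2), intensify("slightly", 2))
```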

Sentiment Analysis
•  Negation
  –  The obvious approach to negation is simply to reverse the polarity of the lexical item next to a negator, changing good (+3) into not good (−3).
  –  Negators include not, none, nobody, never, and nothing, and other words, such as without or lack.
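Polarity reversal next to a negator can be sketched as follows. The function name `score_tokens` and the restriction to the token immediately before the lexical item are assumptions made for this example; the negator list is the one given above.

```python
NEGATORS = {"not", "none", "nobody", "never", "nothing", "without", "lack"}


def score_tokens(tokens, lexicon):
    """Sum the SO values of the tokens, reversing any value whose
    preceding token is a negator."""
    total = 0
    for i, token in enumerate(tokens):
        value = lexicon.get(token, 0)
        if value and i > 0 and tokens[i - 1] in NEGATORS:
            value = -value
        total += value
    return total


lexicon = {"good": 3}
print(score_tokens("the food was not good".split(), lexicon))
```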

Polarity Recognizer in Portuguese (PIRPO)
•  Polarity Recognizer in Portuguese to classify sentiment in online reviews.
•  PIRPO was built from the ground up for Portuguese to recognise the polarity of user opinions in accommodation reviews.
•  Each review is analysed according to concepts from a domain ontology.
•  We decompose the review into sentences in order to assign a polarity to each concept of the ontology in the sentence.

Chaves, M. S., Freitas, L., Souza, M. and Vieira, R. PIRPO: An Algorithm to deal with Polarity in Portuguese Online Reviews from the Accommodation Sector. 17th International Conference on Applications of Natural Language Processing to Information Systems (NLDB), Groningen, The Netherlands, 26-28 June 2012.

PIRPO Information Architecture

PIRPO
•  Reviews
  –  Full dataset: 1500 reviews from January 2010 to April 2011 in Portuguese, English and Spanish, of which 180 are in Portuguese.
•  Ontology Concepts
  –  The concepts used to classify the reviews are provided by Hontology, which in its current version has 110 concepts.

PIRPO
•  List of adjectives: it is composed of sentiment-bearing words.
  –  This list of polar adjectives in Portuguese
    •  contains 30,322 entries.
    •  is composed of the adjective and a polarity, which can take one of three values: +1, -1 and 0.
    •  These values correspond to the positive, negative and neutral senses of the adjective.
  –  PIRPO uses this list to calculate the semantic orientation of the concepts found in the sentences.
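To make the ingredients above concrete (sentences, ontology concepts, and an adjective list with +1, -1 and 0 values), here is a toy sketch. It is not the PIRPO algorithm, whose details are in the cited paper and on a slide that did not survive extraction: the function name `concept_polarity`, the rule that every concept in a sentence receives the sum of that sentence's adjective values, and the two-entry lexicon are all assumptions.

```python
import re


def concept_polarity(review, concepts, adjectives):
    """Assign each ontology concept the summed adjective polarity of the
    sentences in which it is mentioned (toy rule, assumed for this sketch)."""
    result = {}
    for sentence in re.split(r"[.!?]+", review.lower()):
        tokens = sentence.split()
        found = [t for t in tokens if t in concepts]
        if not found:
            continue
        score = sum(adjectives.get(t, 0) for t in tokens)
        for concept in found:
            result[concept] = result.get(concept, 0) + score
    return result


# Hypothetical mini-lexicon: limpo = clean, antipático = unfriendly.
adjectives = {"limpo": 1, "antipático": -1}
# Hypothetical concept labels: quarto = room, pessoal = staff.
concepts = {"quarto", "pessoal"}

polarity = concept_polarity("O quarto era limpo. O pessoal era antipático.",
                            concepts, adjectives)
print(polarity)
```

Matching concepts by exact token ignores inflection and multi-word labels, which a real system for Portuguese would have to handle.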

PIRPO Algorithm

PIRPO Measure Evaluation

•  Precision

$$P = \frac{|\{\text{relevantConcepts}\} \cap \{\text{retrievedConcepts}\}|}{|\{\text{retrievedConcepts}\}|}$$

•  Recall

$$R = \frac{|\{\text{relevantConcepts}\} \cap \{\text{retrievedConcepts}\}|}{|\{\text{relevantConcepts}\}|}$$

•  F-score (harmonic mean of precision and recall)

$$F = \frac{2 \times P \times R}{P + R}$$
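The three measures translate directly into set operations. The sketch below assumes concepts are represented as strings; the function name `precision_recall_f` and the sample concept names are illustrative and do not come from the PIRPO evaluation.

```python
def precision_recall_f(relevant, retrieved):
    """Set-based precision, recall and F-score.

    Returns 0.0 for a measure whose denominator would be zero.
    """
    relevant, retrieved = set(relevant), set(retrieved)
    hits = len(relevant & retrieved)
    p = hits / len(retrieved) if retrieved else 0.0
    r = hits / len(relevant) if relevant else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f


# Hypothetical annotation: 4 relevant concepts, 3 retrieved, 2 in common.
p, r, f = precision_recall_f(
    relevant={"room", "staff", "location", "breakfast"},
    retrieved={"room", "staff", "pool"},
)
print(round(p, 3), round(r, 3), round(f, 3))
```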

PIRPO Preliminary Results

PIRPO: Discussion on the Results
•  PIRPO reached a better recall for concepts with positive polarity, while mixed polarity had a higher precision.
•  The low F-score may be mainly due to the fact that the algorithm assigned a polarity to a specific concept of the ontology, while the human annotator classified the review as a whole.

Outline

Part 1
•  Workshop Context
•  User-Generated Content (UGC)
•  Characterisation of UGC
•  Knowledge Engineering - Ontology Development
•  Hands-on Session (Individual Task): Dealing with UGC

Part 2
•  Knowledge Engineering - Modelling UGC
•  Sentiment Analysis / Opinion Mining
•  Polarity Recognizer in Portuguese (PIRPO)
•  Information Visualisation


Information Visualisation
•  What is the visual model of the potential end-user?
•  How should we properly map and render:
  –  the most valued accommodation features?
  –  the perception of the quality offered by the hotel?
  –  the correlation between the guest's profile and the most relevant features?
  –  the intensity of the positivity or negativity of the features?
•  Will the use of advanced visual techniques (such as tree-oriented ones) to map the results help accommodation managers and guests gain better insight into the data?

Exploring Information Visualisation
•  In the next figures
  –  Color is used to map the polarity and the strength-of-polarity values onto the CO.
  –  Size is used to map the frequency with which the CO is mentioned in the reviews.
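The colour and size mapping can be sketched as a small aggregation step that would feed a plotting library. Everything specific here is an assumption for illustration: the function name `visual_encoding`, the -2 to +2 integer scale for strength of polarity, the colour names, and the use of the rounded mean as the value shown per concept.

```python
from collections import defaultdict

# Hypothetical palette for the five intensity levels.
COLORS = {-2: "dark red", -1: "light red", 0: "grey",
          1: "light green", 2: "dark green"}


def visual_encoding(mentions):
    """Turn (concept, intensity) pairs into a size and a colour per concept.

    Size is the number of mentions; colour reflects the rounded mean intensity.
    """
    grouped = defaultdict(list)
    for concept, intensity in mentions:
        grouped[concept].append(intensity)
    encoding = {}
    for concept, values in grouped.items():
        mean = round(sum(values) / len(values))
        encoding[concept] = {"size": len(values), "color": COLORS[mean]}
    return encoding


encoding = visual_encoding([
    ("staff", 2), ("staff", 2), ("staff", 1),
    ("pool", -2),
])
print(encoding)
```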

Exploring Information Visualisation

Result of applying the BubbleTree visualisation to the relation among concepts of the ontology, polarity (left) and strength of the polarity (right).

•  Carvalho, E.; Chaves, M. S., 2012. Exploring User-Generated Data Visualization in the Accommodation Sector. 16th International Conference Information Visualisation, IEEE. (Submitted)

Exploring Information Visualisation

Results using Treemap visualisation of the relation among type of customer, concepts of the ontology and polarity.

Questionnaire (in Spanish)
•  You are going to receive a questionnaire about information visualisation using UGC in the context of the accommodation sector.
•  Please click here, http://kwiksurveys.com?u=Infovises, to answer it.

Final Remarks
•  In-depth analysis of UGC can be used as input to improve decision making.
•  It is time to think about new models to store UGC data.
•  New algorithms need to be built from the ground up to deal with UGC in languages other than English.
•  Information visualisation of UGC is in its infancy.

Main References
•  Bethard, S., Yu, H., Thornton, A., Hatzivassiloglou, V. and Jurafsky, D., 2004. Automatic extraction of opinion propositions and their holders. In Proceedings of the AAAI Spring Symposium on Exploring Attitude and Affect in Text.
•  Chesley, P., Vincent, B., Xu, L. and Srihari, R., 2006. Using verbs and adjectives to automatically classify blog sentiment. In AAAI Symposium on Computational Approaches to Analysing Weblogs (AAAI-CAAW), 27–29.
•  Ding, X., Liu, B. and Yu, P. S., 2008. A holistic lexicon-based approach to opinion mining. Proceedings of the Conference on Web Search and Web Data Mining (WSDM).
•  Hu, M. and Liu, B., 2004. Mining opinion features in customer reviews. In Proceedings of AAAI, pp. 755–760.
•  Kim, S.-M. and Hovy, E., 2004. Determining the sentiment of opinions. In Proceedings of the International Conference on Computational Linguistics (COLING).
•  Liu, B., 2010. Sentiment Analysis and Subjectivity. In Handbook of Natural Language Processing, Second Edition (Eds: N. Indurkhya and F. J. Damerau), CRC Press, Taylor and Francis Group, Boca Raton, FL. Chapter 28.
•  Martin, J. R. and White, P. R. R., 2005. The Language of Evaluation, Appraisal in English. Palgrave Macmillan, London & New York.
•  Taboada, M., Brooke, J., Tofiloski, M., Voll, K. D. and Stede, M., 2011. Lexicon-based methods for sentiment analysis. Computational Linguistics 37(2), 267–307.
•  Tang, H., Tan, S. and Cheng, X., 2009. A survey on sentiment detection of reviews. Expert Systems with Applications 36(7), 10760–10773.
•  Whitelaw, C., Garg, N. and Argamon, S., 2005. Using appraisal groups for sentiment analysis. In Proceedings of the 14th ACM International Conference on Information and Knowledge Management (CIKM '05). ACM, New York, NY, USA, 625-631.
•  Wilson, T., 2008. Fine-Grained Subjectivity Analysis. PhD Dissertation, Intelligent Systems Program, University of Pittsburgh.
•  Wilson, T., Wiebe, J. and Hoffmann, P., 2009. Recognizing contextual polarity: An exploration of features for phrase-level sentiment analysis. Computational Linguistics 35, 399–433.
•  Wu, Y., Wei, F., Liu, S., Au, N., Cui, W., Zhou, H. and Qu, H., 2010. OpinionSeer: Interactive Visualisation of Hotel Customer Feedback. IEEE Transactions on Visualization and Computer Graphics, 6, 1109-1118. Nov-Dec.

Open-source sentiment-analysis tools
•  Python NLTK (Natural Language Toolkit): http://www.nltk.org and http://text-processing.com/demo/sentiment
•  R, TM (text mining) module: http://cran.r-project.org/web/packages/tm/index.html
•  RapidMiner: http://rapid-i.com/content/view/184/196/
•  GATE, the General Architecture for Text Engineering: http://gate.ac.uk/sentiment
•  UIMA: plug-in annotators for sentiment. Apache UIMA is the Unstructured Information Management Architecture, http://uima.apache.org/
•  Sentiment classifiers for the WEKA data-mining workbench, http://www.cs.waikato.ac.nz/ml/weka/
•  Stanford NLP tools, http://www-nlp.stanford.edu/software: maximum-entropy classification approach for sentiment.

Thank you very much for your attention!!

Questions
