A Fine-Grained Analysis of User-Generated Content to Support Decision Making
DESCRIPTION
User-generated content (UGC), such as online reviews, is freely available on the web. This kind of data has been used to support customer and managerial decision-making in several industries, e.g. books, tourism, and hospitality. In this workshop, I will introduce a fine-grained characterisation of UGC and a new multidomain, multilingual conceptual data model to represent it. I will also present a domain-specific ontology for accommodations that can be used to support managerial decision-making and end-user applications. Instead of the few categories commonly provided by Web 2.0 portals, this ontology enables accommodation managers to find specific information. The ontology is also used as input for an algorithm that recognises sentiment in online reviews. Finally, I will describe some of the main approaches to sentiment analysis. In short, I will address some of the main challenges of UGC by introducing: a) a proposal for a fine-grained characterisation of UGC; b) a structured representation of UGC which leverages the information provided by Web 2.0 applications; c) the main approaches to sentiment analysis; d) an ontology to represent knowledge in the accommodation sector.
TRANSCRIPT
A Fine-Grained Analysis of User-Generated Content to Support Decision Making
Marcirio Silveira Chaves
http://mchaves.wikidot.com
Information Systems Research Group
Business and Information Technology Research Centre (BITREC)
Institute for Scientific and Technological Research of Universidade Atlântica (ISTR)
Workshop
User-Generated Content (UGC)
• Also known as
– User-Generated Data
– User-Created Content
– User-Contributed Data
– Consumer-Generated Media
– …
• Can be expressed through
– Opinions
– Reviews
– Comments
– Posts
Apr-18-12 Marcirio Chaves - marcirioc@uatlantica.pt
Notes
• All the examples described in this workshop are real data.
• Some papers mentioned here are under review.
• Color legend:
– Examples
– Positive feature
– Negative feature
Example of UGC
• An opinion posted on Facebook, Dec-10-2011, 12:30 pm
– “would highly recommend Infinity Motorcycles, Southampton for all motorbiking gear. Very reasonable people. Earlier they gave me a full money back for a unused (after explaining why it was unused) ladies motorbike jacket (no defects whasoever) and today the zipper on my new jacket was broken and they gave me a brand new one (no questions asked, no receipt business and no fuss created). Five Star service.”
– This user had 226 friends.
Some statistics about UGC
• More than 50% of all internet visits are now to UGC/social media sites.
• More than 75% of time spent on the internet is "social".
• Facebook now captures as much time spent on the internet as Google, Yahoo, and AOL.
• More than 80% of consumers are influenced by Social Marketing.
Source: http://www.bbrisco.com/2010/05/social.html
Main Objectives of this Workshop
• In-depth analysis of UGC
• Use UGC to support decision making
• Study a domain ontology to support Artificial Intelligence tasks
• Address approaches for sentiment analysis
• From theory to practice: Hands-on Session
Outline
Part 1
• Workshop Context
• User-Generated Content (UGC)
• Characterisation of UGC
• Knowledge Engineering - Ontology Development
• Hands-on Session (Individual Task): Dealing with UGC
Part 2
• Sentiment Analysis / Opinion Mining
• Polarity Recognizer in Portuguese (PIRPO)
• Information Visualisation
Context Workshop
A framework for Customer Knowledge Management based on Social Semantic Web.
Chaves, Marcirio Silveira; Trojahn, Cássia and Pedron, Cristiane Drebes. A Framework for Customer Knowledge Management based on Social Semantic Web: A Hotel Sector Approach. In: Customer Relationship Management and the Social and Semantic Web: Enabling Cliens Conexus. Colomo-Palacios, R.; Varajão, J. and Soto-Acosta, P. (Eds.). p. 141-157, Hershey, PA: IGI Global, 2012. ISBN: 978-161-35-0044-6
A Fine-grained Analysis of UGC
• Overall opinion about a topic is only a part of the information of interest.
• Document-level sentiment classification fails to detect sentiment about individual aspects of the topic. In reality, for example, though one could be generally happy about his car, he might be dissatisfied with the engine noise.
• To the manufacturers, these individual weaknesses and strengths are equally important to know, or even more valuable than the overall satisfaction level of customers (Tang et al., 2009).
UGC
An opinion is simply a positive or negative sentiment, view, attitude, emotion, or appraisal about an entity or an aspect of the entity (Hu and Liu, 2004; Liu, 2006) from an opinion holder (Bethard et al., 2004; Kim and Hovy, 2004; Wiebe et al., 2005).
Characterisation of UGC
• Opinion's Characterisation
– I use and extend the definition proposed by Ding et al. (2008), Liu (2010), and Martin and White (2005) to analyse the sentences of reviews.
– Let the review be r.
– In the most general case, r is characterised as a set of the following elements {O, F, SO, H, S, A, R, I, SG}, where:
Characterisation of UGC
• Opinion's Characterisation
– O: Object
– F: Feature
– SO: Semantic-Orientation
– H: Holder
– S: Source
– A: Attitude
– SG: Suggestion
– R: Recommendation
– I: Intention
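As a minimal sketch, the elements of the characterisation could be held in a record like the one below. The class name, the field names, and the example values are assumptions made for illustration only; the slides name the elements A, SG, R and I without defining them further, so they are kept as optional free-text fields.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OpinionRecord:
    """Hypothetical container for the elements {O, F, SO, H, S, A, R, I, SG}."""
    obj: str                              # O: object under review (e.g. a hotel)
    feature: str                          # F: feature of the object
    feature_is_explicit: bool             # explicit vs. implicit mention of F
    orientation: str                      # SO: e.g. "positive" or "negative"
    holder: Optional[str] = None          # H: who holds the opinion
    source: Optional[str] = None          # S: website providing the review
    attitude: Optional[str] = None        # A
    recommendation: Optional[str] = None  # R
    intention: Optional[str] = None       # I
    suggestion: Optional[str] = None      # SG


# Example based on the sentence "Hotel is far from public transportation."
# The negative orientation and the source are assumed for the illustration.
record = OpinionRecord(
    obj="hotel",
    feature="location",
    feature_is_explicit=False,
    orientation="negative",
    source="tripadvisor.com",
)
print(record.feature, record.orientation)
```

Keeping every element on one record makes it easy to filter reviews by feature, holder type, or source later on.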
Characterisation of UGC
1 - Object (O)
– An object is a product (e.g. movie and book) or a service (e.g. hotel and restaurant) under review, which is composed of features.
– Objects are also called entities.
2 - Feature (F)
– A feature is a component or part of an object.
• actor and photography are features of a movie.
• pool and staff are features of a hotel.
– Features are also called attributes or facets.
– A feature can be mentioned explicitly or implicitly in a review (Ding et al., 2008).
Characterisation of UGC
2.1 - Explicit Feature (F)
– If a feature f appears in review r, it is called an explicit feature in r.
– The hotel is located very near the center city.
• location is an explicit feature.
2.2 - Implicit Feature (F)
– If f does not appear in r but is implied, it is called an implicit feature in r.
– Hotel is far from public transportation.
• location is an implicit feature.
Characterisation of UGC
3 - Sentence-Orientation (SO)
– A review consists of a sequence of sentences r = ⟨s1, s2, …, sm⟩ (Ding et al., 2008).
– A sentence can be evaluated from the following perspectives:
Characterisation of UGC
3.1 Objectivity
– An objective sentence contains or mentions facts.
• This hotel is far from the airport, ca. 15 km.
– A subjective sentence does not mention any fact.
• The parking could be free.
3.2 Polarity
– It describes the orientation present in a sentence (i.e. positive, negative, neutral and irrelevant).
Characterisation of UGC
3.3 Intensity (strength of the polarity)
– It refers to the strength of the private state that is being expressed, in other words, how strong an emotion or a conviction of belief is (Wilson, 2008).
– It describes how intense the experience of using a product or service was:
• very positive, positive, neutral, negative and very negative.
• "Very kindly staff." refers to a very positive impression of the staff service.
Characterisation of UGC
4 - Opinion Holder (H)
– The holder of a particular opinion is the person or the organisation that holds the opinion (Ding et al., 2008).
– A holder is identified with demographic characteristics (e.g. name, city and country).
– Sites such as tripadvisor.com and booking.com classify holders into types including:
• families with older children
• families with young children
• mature couples
• groups of friends
• solo travellers
• young couples
Characterisation of UGC
5 - Source (S)
– An information source is a website which provides a set of reviews.
• tripadvisor.com
• booking.com
• amazon.com
• A: Attitude
• SG: Suggestion
• R: Recommendation
• I: Intention
Limitations for representing knowledge in the accommodation sector
[Figure; surviving label: "language?"]
More limitations
• Currently, web agents are unable to answer questions such as:
– Which hotels in Rome have the longest opening hours for the indoor swimming pool?
– Which hotels in Lisbon have the cheapest breakfast?
– Which are the cheapest hotels in Barcelona with a family suite with sea view?
Knowledge Engineering
• Ontology as a support to evaluate UGC
– Set of concepts for a specific domain
– Human and machine readable
– Support for fine-grained analysis of the instances (e.g. reviews)
– Hontology (H stands for hotel, hostal and hostel)
• A robust, coherent and multilingual representation of the accommodation sector.
Knowledge Engineering
• Development Methodology
– Identify existing ontologies on related domains
– Select the main concepts and properties
– Organize concepts and properties hierarchically into categories
– Translate the ontology (manual)
– Expand concepts and properties based on comments
– Translate the new concepts and properties (manual)
– Generate the ontology in several formats
Chaves, M.S. and Trojahn, C. Towards a Multilingual Ontology for Ontology-driven Content Mining in Social Web Sites. Proc. of the ISWC 2010 Workshops, Volume I, 1st International Workshop on Cross-Cultural and Cross-Lingual Aspects of the Semantic Web. Shanghai, China, November 7th, 2010.
Knowledge Engineering
• Hontology
– A multilingual ontology for the accommodation sector.
• Demo Protégé
Chaves, M.S.; Freitas, L.A. and Vieira, R. (2012). Hontology: A multilingual ontology for the accommodation sector. 4th International Conference on Knowledge Engineering and Ontology Development, Barcelona, Spain, 4-7 October. (Submitted)
Knowledge Engineering
Preliminary Hontology Statistics

Metrics                                  Value
Number of Concepts                         285
Number of Object Properties                 10
Number of Data Properties                   31
Concept Axioms
  Subconcept axioms                        270
  Equivalent concepts axioms                 4
  Disjoint concepts axioms                  93
Object Property Axioms
  Functional object property axioms          6
  Object property domain axioms             11
  Object property range axioms               8
Data Property Axioms
  Functional data property axioms           12
  Object data domain axioms                 17
  Object data range axioms                   1
Hands-on Session
• The aim of this hands-on session is to let you think in depth about UGC in the context of the accommodation sector.
• You are going to receive a set of 4 or 5 reviews about accommodations and should evaluate each one according to the following parameters:
– Features present in the review (see the concepts of Hontology)
– Intensity (Strength of the Polarity): (very negative, negative, neutral, positive, very positive)
• Notes:
– Evaluate one feature per line.
– Please,[email protected]:UB:GX
– X = number of the group.
Sentiment Analysis
• Analysis and automatic extraction of Semantic Orientation
• Semantic orientation refers to the polarity and strength of words, phrases, or texts.
• Approaches
– Lexicon-based
• Dictionaries of words annotated with the word's semantic orientation, or polarity.
• A manually built dictionary provides a solid foundation for a lexicon-based approach (Taboada et al., 2011).
– Statistical or Machine-learning
• Supervised classification
Sentiment Analysis
• Lexicon-based Approach
– Sentiment-bearing words: a list of nouns, verbs, adjectives and adverbs.
• Chesley et al. (2006) use verbs and adjectives to classify English opinionated blog texts.
– List of conjunctions and connectives (Liu, 2010).
– Use of auxiliary verbs to get features and opinion-oriented words about products from texts (Khan et al., 2010).
Sentiment Analysis
• Seed words
– are a small set of words with strong negative or positive associations, such as excellent or abysmal.
– In principle, a positive adjective should occur more frequently alongside the positive seed words, and thus will obtain a positive score, whereas negative adjectives will occur most often in the vicinity of negative seed words, thus obtaining a negative score (Taboada et al., 2011).
• This restaurant has a bad and expensive food.
Sentiment Analysis
• Part-of-Speech (PoS)
– In order to evaluate a sentence in a review, we should consider the parts of speech mentioned, such as adjectives, adverbs and verbs.
– Adjectives are classified as:
• positive (good, excellent and clean),
• negative (awful, boring and terrible),
• neutral (regular and indifferent) and
• dual, which can express positive and negative opinion (small, long).
– In some approaches nouns are represented by concepts of a domain ontology and mapped as features.
Sentiment Analysis
• Conjunction and Connective (CC)
– Connectives are words that help identify additional adjective opinion words and their orientations.
– One of the constraints is about conjunction (i.e. and), which says that conjoined adjectives usually have the same orientation (Liu, 2010).
• This room is beautiful and spacious.
– If beautiful is known to be positive, it can be inferred that spacious is also positive.
– Heuristic:
• People usually express the same opinion on both sides of a conjunction.
Sentiment Analysis
• Conjunction and Connective (CC)
– Rules or constraints are also designed for other connectives (e.g. or, but, either-or, and neither-nor).
• This hotel is beautiful but difficult to get there.
– What occurs after the connective but is an indicator of a negative opinion.
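The conjunction constraint can be sketched as a small inference rule. This is only an illustration under stated assumptions: the function name and the tiny seed dictionary are hypothetical, and treating but as "opposite orientation" is a generalisation of the single example above, not a rule stated in the slides.

```python
# Connectives assumed to preserve or to flip the orientation (illustrative sets).
SAME_ORIENTATION = {"and"}
OPPOSITE_ORIENTATION = {"but"}


def infer_polarity(known, left, connective, right):
    """Infer the polarity of the one unknown word in 'left CONNECTIVE right'.

    known maps words to +1 (positive) or -1 (negative).
    Returns (word, polarity), or None when nothing can be inferred.
    """
    if left in known and right not in known:
        seed, target = left, right
    elif right in known and left not in known:
        seed, target = right, left
    else:
        return None  # both known or both unknown: nothing to infer
    if connective in SAME_ORIENTATION:
        return target, known[seed]
    if connective in OPPOSITE_ORIENTATION:
        return target, -known[seed]
    return None  # no rule for this connective in the sketch


seeds = {"beautiful": 1}
print(infer_polarity(seeds, "beautiful", "and", "spacious"))
print(infer_polarity(seeds, "beautiful", "but", "difficult"))
```

Connectives such as or, either-or and neither-nor are left without a rule here because the slides do not say how they should behave.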
Sentiment Analysis
• Strength of the Polarity or Intensity or Intensification
– Amplifiers (very, a lot) increase the semantic intensity of a neighboring lexical item;
– Attenuators/Downtoners (a little, slightly) decrease it.
• Some approaches have implemented intensifiers using simple addition and subtraction
– if a positive adjective has an SO value of 2:
• an amplified adjective would have an SO value of 3, and
• a downtoned adjective an SO value of 1.
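The addition-and-subtraction scheme can be sketched as follows. The function name is hypothetical, the modifier lists contain only the words named above, and the handling of negative adjectives (moving away from or towards zero) is an assumption, since the slides only give the positive case.

```python
AMPLIFIERS = {"very", "a lot"}
DOWNTONERS = {"a little", "slightly"}


def intensify(so_value, modifier):
    """Shift a semantic-orientation (SO) value by one step for a modifier."""
    if so_value == 0:
        return 0  # a neutral item has no polarity to strengthen or weaken
    step = 1 if so_value > 0 else -1
    if modifier in AMPLIFIERS:
        return so_value + step  # further away from zero
    if modifier in DOWNTONERS:
        return so_value - step  # closer to zero
    return so_value  # unknown modifier: leave the value unchanged


print(intensify(2, "very"))      # amplified positive adjective
print(intensify(2, "slightly"))  # downtoned positive adjective
```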
Sentiment Analysis
• Negation
– The obvious approach to negation is simply to reverse the polarity of the lexical item next to a negator, changing good (+3) into not good (−3).
– Negators include not, none, nobody, never, and nothing, and other words, such as without or lack.
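A minimal sketch of this polarity-reversal approach is shown below. The function name and the toy lexicon are assumptions for illustration (only good = +3 comes from the slide), and the sketch flips a word only when the negator is the token immediately before it.

```python
NEGATORS = {"not", "none", "nobody", "never", "nothing", "without", "lack"}


def score_tokens(tokens, lexicon):
    """Sum lexicon values, reversing the sign of a word that follows a negator."""
    total = 0
    for i, token in enumerate(tokens):
        word = token.lower()
        if word not in lexicon:
            continue
        value = lexicon[word]
        # Reverse the polarity when the previous token is a negator.
        if i > 0 and tokens[i - 1].lower() in NEGATORS:
            value = -value
        total += value
    return total


toy_lexicon = {"good": 3, "clean": 2}  # "clean": 2 is an assumed value
print(score_tokens(["not", "good"], toy_lexicon))
print(score_tokens("the room was never clean".split(), toy_lexicon))
```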
Polarity Recognizer in Portuguese (PIRPO)
• Polarity Recognizer in Portuguese to classify sentiment in online reviews.
• PIRPO was built from the ground up for Portuguese, to recognise the polarity of user opinions in accommodation reviews.
• Each review is analysed according to concepts from a domain ontology.
• We decompose the review into sentences in order to assign a polarity to each concept of the ontology in the sentence.
Chaves, M.S., Freitas, L., Souza, M. and Vieira, R. PIRPO: An Algorithm to deal with Polarity in Portuguese Online Reviews from the Accommodation Sector. 17th International Conference on Applications of Natural Language Processing to Information Systems (NLDB), Groningen, The Netherlands, 26-28 June 2012.
PIRPO Information Architecture
PIRPO
• Reviews
– Full dataset: 1500 reviews from January 2010 to April 2011 in Portuguese, English and Spanish, of which 180 are in Portuguese.
• Ontology Concepts
– The concepts used to classify the reviews are provided by Hontology, which, in its current version, has 110 concepts.
PIRPO
• List of adjectives: it is composed of sentiment-bearing words.
– This list of polar adjectives in Portuguese
• contains 30,322 entries.
• is composed of the adjective and a polarity, which can take one of three values: +1, -1 and 0.
• These values correspond to the positive, negative and neutral senses of the adjective.
– PIRPO uses this list to calculate the semantic orientation of the concepts found in the sentences.
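To make the idea concrete, here is a rough sketch of how a polar adjective list and a set of ontology concepts could be combined sentence by sentence. It is not the PIRPO algorithm: the function name, the sentence splitting, the scoring by simple summation, and the English toy data (the real list is in Portuguese) are all assumptions for illustration.

```python
import re


def concept_polarities(review, concepts, adjectives):
    """Assign to each concept found in a sentence the sign of that sentence.

    concepts is a list of lower-case concept words; adjectives maps
    lower-case adjectives to +1, -1 or 0.
    Returns a list of (concept, polarity) pairs in reading order.
    """
    results = []
    for sentence in re.split(r"[.!?]+", review):
        words = re.findall(r"\w+", sentence.lower())
        if not words:
            continue
        score = sum(adjectives.get(word, 0) for word in words)
        if score > 0:
            polarity = 1
        elif 0 > score:
            polarity = -1
        else:
            polarity = 0
        for concept in concepts:
            if concept in words:
                results.append((concept, polarity))
    return results


toy_concepts = ["room", "breakfast"]
toy_adjectives = {"clean": 1, "terrible": -1, "regular": 0}
print(concept_polarities("The room was clean. The breakfast was terrible!",
                         toy_concepts, toy_adjectives))
```

Working at sentence level means one review can yield a positive result for one concept and a negative result for another, which is the point of the fine-grained analysis.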
PIRPO Algorithm
PIRPO Measure Evaluation
• Precision
$$P = \frac{|\{\text{relevantConcepts}\} \cap \{\text{retrievedConcepts}\}|}{|\{\text{retrievedConcepts}\}|}$$
• Recall
$$R = \frac{|\{\text{relevantConcepts}\} \cap \{\text{retrievedConcepts}\}|}{|\{\text{relevantConcepts}\}|}$$
• F-score (harmonic mean of precision and recall)
$$F = \frac{2 \times P \times R}{P + R}$$
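The three measures can be computed directly from two sets of concepts. The function name and the example sets below are assumptions for illustration; they are not PIRPO data or results.

```python
def precision_recall_f(relevant, retrieved):
    """Return (P, R, F) for a set of relevant and a set of retrieved concepts."""
    hits = len(relevant & retrieved)  # size of the intersection
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


relevant = {"room", "staff", "pool", "location"}  # hypothetical human annotation
retrieved = {"room", "staff", "breakfast"}        # hypothetical system output
p, r, f = precision_recall_f(relevant, retrieved)
print(round(p, 3), round(r, 3), round(f, 3))
```

With these sets, two of the three retrieved concepts are relevant and two of the four relevant concepts are retrieved, so P = 2/3, R = 1/2 and F = 4/7.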
PIRPO Preliminary Results
PIRPO: Discussion on the Results
• PIRPO reached a better recall for concepts with positive polarity, while mixed polarity had a higher precision.
• The low F-score may be mainly because the algorithm assigned a polarity to a specific concept of the ontology, while the human annotator classified the review as a whole.
Outline
Part 1
• Workshop Context
• User-Generated Content (UGC)
• Characterisation of UGC
• Knowledge Engineering - Ontology Development
• Hands-on Session (Individual Task): Dealing with UGC
Part 2
• Knowledge Engineering - Modelling UGC
• Sentiment Analysis / Opinion Mining
• Polarity Recognizer in Portuguese (PIRPO)
• Information Visualisation
Information Visualisation
• What is the visual model of the potential end-user?
• How should we properly map and render:
– the most valued accommodation features?
– the perception of the quality offered by the hotel?
– the correlation between the guest's profile and the most relevant features?
– the intensity of the positivity or negativity of the features?
• Will the use of advanced visual techniques (such as tree-oriented ones) to map the results help accommodation managers and guests gain better insight into the data?
Exploring Information Visualisation
• In the next figures
– Color is used to map the polarity and the strength of the polarity values of the CO.
– Size is used to map the frequency with which the CO is mentioned in the reviews.
Exploring Information Visualisation
Result of the application of Bubble Tree visualisation of the relation among concepts of the ontology, polarity (left) and strength of the polarity (right).
• Carvalho, E.; Chaves, M.S., 2012. Exploring User-Generated Data Visualization in the Accommodation Sector. 16th International Conference Information Visualisation, IEEE. (Submitted)
Exploring Information Visualisation
Results using Treemap visualisation of the relation among type of customer, concepts of the ontology and polarity.
Questionnaire (in Spanish)
• You are going to receive a questionnaire about information visualisation using UGC in the context of the accommodation sector.
• Please click here http://kwiksurveys.com?u=Infovises to answer it.
Final Remarks
• In-depth analysis of UGC can be used as input to improve decision making.
• It is time to think about new models to store UGC data.
• New algorithms need to be built from the ground up to deal with UGC in languages other than English.
• Information visualisation of UGC is in its infancy.
Main References
• Bethard, S., Yu, H., Thornton, A., Hatzivassiloglou, V., and Jurafsky, D., 2004. Automatic extraction of opinion propositions and their holders. In Proceedings of the AAAI Spring Symposium on Exploring Attitude and Affect in Text.
• Chesley, P., Vincent, B., Xu, L., and Srihari, R., 2006. Using verbs and adjectives to automatically classify blog sentiment. In AAAI Symposium on Computational Approaches to Analysing Weblogs (AAAI-CAAW), 27–29.
• Ding, X., Liu, B., and Yu, P.S., 2008. A holistic lexicon-based approach to opinion mining. Proceedings of the Conference on Web Search and Web Data Mining (WSDM).
• Hu, M. and Liu, B., 2004. Mining opinion features in customer reviews. In Proceedings of AAAI, pp. 755–760.
• Kim, S.-M. and Hovy, E., 2004. Determining the sentiment of opinions. In Proceedings of the International Conference on Computational Linguistics (COLING).
• Liu, B., 2010. Sentiment Analysis and Subjectivity. In Handbook of Natural Language Processing, Second Edition (Eds: N. Indurkhya and F.J. Damerau), CRC Press, Taylor and Francis Group, Boca Raton, FL. Chapter 28.
• Martin, J.R. and White, P.R.R., 2005. The Language of Evaluation, Appraisal in English. Palgrave Macmillan, London & New York.
• Taboada, M., Brooke, J., Tofiloski, M., Voll, K.D., and Stede, M., 2011. Lexicon-based methods for sentiment analysis. Computational Linguistics 37(2), 267–307.
• Tang, H., Tan, S., and Cheng, X., 2009. A survey on sentiment detection of reviews. Expert Systems with Applications 36(7), 10760–10773.
• Whitelaw, C., Garg, N., and Argamon, S., 2005. Using appraisal groups for sentiment analysis. In Proceedings of the 14th ACM International Conference on Information and Knowledge Management (CIKM '05). ACM, New York, NY, USA, 625–631.
• Wilson, T., 2008. Fine-Grained Subjectivity Analysis. PhD Dissertation, Intelligent Systems Program, University of Pittsburgh.
• Wilson, T., Wiebe, J., and Hoffmann, P., 2009. Recognizing contextual polarity: An exploration of features for phrase-level sentiment analysis. Computational Linguistics 35, 399–433.
• Wu, Y., Wei, F., Liu, S., Au, N., Cui, W., Zhou, H., and Qu, H., 2010. OpinionSeer: Interactive Visualisation of Hotel Customer Feedback. IEEE Transactions on Visualization and Computer Graphics, 6, 1109–1118. Nov-Dec.
Open-source sentiment-analysis tools
• Python NLTK (Natural Language Toolkit) – http://www.nltk.org and http://text-processing.com/demo/sentiment
• R, TM (text mining) module http://cran.r-project.org/web/packages/tm/index.html
• RapidMiner http://rapid-i.com/content/view/184/196/
• GATE, the General Architecture for Text Engineering http://gate.ac.uk/sentiment
• UIMA - plug-in annotators for sentiment. Apache UIMA is the Unstructured Information Management Architecture, http://uima.apache.org/
• Sentiment classifiers for the WEKA data-mining workbench, http://www.cs.waikato.ac.nz/ml/weka/.
• Stanford NLP tools - http://www-nlp.stanford.edu/software: maximum-entropy classification approach for sentiment.
Thank you very much for your attention!
Questions