Visual analytics talk at ISMB 2013

Post on 11-May-2015

Visual Analytics: The human back in the loop

Jan Aerts

Biodata Analysis and Visualization

Stadius Group, ESAT

Leuven University, Belgium

jan.aerts@esat.kuleuven.be

@jandot

http://orcid.org/0000-0002-6416-2717

hypothesis-driven -> data-driven

Scientific Research Paradigms (Jim Gray, Microsoft)

I have a hypothesis -> need to generate data to (dis)prove it.

I have data -> need to find hypotheses that I can test.

1st (1,000s of years ago): empirical

2nd (100s of years ago): theoretical

3rd (last few decades): computational

4th (today): data exploration

What does this mean?

• immense re-use of existing datasets

• much of initial analysis is exploratory in nature

• biologically interesting signals may be too poorly understood to be analyzed in automated fashion

• visualization is very effective in facilitating human reasoning about complex data

• automated algorithms often act as black boxes => biologists must have blind faith in the bioinformatician (and the bioinformatician in his/her own skills)

What is visualization?

T. Munzner

Data visualization framework

interactivity

visual analytics infographics

“visual analytics”

• Types of interaction (Yi et al, IEEE Transactions on Visualization and Computer Graphics, 2007)

• select -> mark something as interesting

• explore -> show me something else

• reconfigure -> show me a different arrangement

• encode -> show me a different representation

• abstract/elaborate -> show me less/more detail

• filter -> show me something conditionally

• connect -> show me connected items
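
As a rough illustration, a few of the interaction types listed above can be read as operations on a data table. The sketch below is a minimal Python example under stated assumptions: the toy records, the field names (`gene`, `chrom`, `expr`), and the helper functions are all hypothetical and do not come from the talk or from Yi et al.

```python
from collections import defaultdict

# Hypothetical toy records; the field names are invented for illustration only.
records = [
    {"gene": "g1", "chrom": "1", "expr": 2.5},
    {"gene": "g2", "chrom": "1", "expr": 7.0},
    {"gene": "g3", "chrom": "2", "expr": 4.0},
    {"gene": "g4", "chrom": "2", "expr": 9.5},
]

def filter_items(items, predicate):
    """filter: show me something conditionally."""
    return [item for item in items if predicate(item)]

def select_items(items, genes):
    """select: mark something as interesting (returns the marked gene names)."""
    return {item["gene"] for item in items if item["gene"] in genes}

def reconfigure(items, key):
    """reconfigure: show me a different arrangement (here, a different sort order)."""
    return sorted(items, key=lambda item: item[key])

def abstract(items, group_key, value_key):
    """abstract: show me less detail (here, the mean value per group)."""
    groups = defaultdict(list)
    for item in items:
        groups[item[group_key]].append(item[value_key])
    return {group: sum(vals) / len(vals) for group, vals in groups.items()}

high = filter_items(records, lambda r: r["expr"] > 3.0)
marked = select_items(records, {"g2", "g4"})
by_expr = reconfigure(records, "expr")
per_chrom = abstract(records, "chrom", "expr")
```

In an interactive tool each of these calls would be triggered by a user action (a brush, a click, a slider) and followed by a redraw; the sketch only shows the data side of that loop.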

Visualization for biological hypothesis generation

• example: eQTL data (IEEE BioVis visualization challenge 2011)

• 500 patients (affected + non-affected)

• 7500 SNPs; gene expression data for 15 genes

• PLINK one-locus/two-locus

Aracari

Ryo Sakai

Bartlett C et al. BMC Bioinformatics (2012)

Reveal

Jäger G et al. Bioinformatics (2012)

HiTSee

Bertini E et al. IEEE Symposium on Biological Data Visualization (2011)

when do I know that my algorithm is “correct”? -> peek into the black box

[Diagram labels: input, filter 1, filter 2, output A, filter 3, output B, output C]
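
One way to peek into the black box is to keep every intermediate result of a filter chain, so that each stage can be inspected or plotted rather than only the final output. The following is a minimal Python sketch of that idea. The `run_pipeline` helper and the two filter functions are hypothetical, and the chain is assumed to be linear; the branching of the diagram on the slide is not reproduced.

```python
def run_pipeline(data, stages):
    """Apply named stages in order and keep a copy of every intermediate result."""
    current = list(data)
    trace = [("input", list(current))]
    for name, stage in stages:
        current = stage(current)
        trace.append((name, list(current)))
    return current, trace

# Hypothetical filters, chosen only to make the example concrete.
stages = [
    ("filter 1", lambda xs: [x for x in xs if x >= 0]),      # drop negative values
    ("filter 2", lambda xs: [x for x in xs if x % 2 == 0]),  # keep even values
]

output, trace = run_pipeline([-3, 4, 7, 10, -2, 8], stages)

# How many items survive each stage; a visualization could plot this per stage.
summary = [(name, len(values)) for name, values in trace]
```

Plotting the contents of `trace` stage by stage shows where items are lost, which is often where an incorrect filter first becomes visible.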

Visualization for algorithm development

Caleydo MatchMaker

Lex A et al. IEEE Transactions on Visualization and Computer Graphics (2010)

Meander

Pavlopoulos et al. Nucl Acids Res (2013)

Georgios Pavlopoulos

ParCoord

Boogaerts T et al. IEEE International Conference on Bioinformatics & Bioengineering (2012)

Thomas Boogaerts

Endeavour gene prioritization

Visualization for (live) interaction with analysis

• alternating between visual and automatic methods -> continuous refinement and verification of preliminary results

• misleading results: discovered at early stage

• leverage user’s (biologist’s) insights

• no black box

Cytoscape

Smoot et al. Bioinformatics (2011)

Data filtering (visual parameter setting)

TrioVis

Ryo Sakai

Sakai R et al. Bioinformatics (2013)

User-guided analysis

Spark

Nielsen et al. Genome Research (2012)

clustering

chromatin modification

DNA methylation

RNA-Seq

data samples

regions of interest

BaobabView

van den Elzen S & van Wijk J. IEEE Conference on Visual Analytics Science and Technology (2011)

decision trees

Galaxy Trackster

Goecks J et al. Nature Biotechnology (2012)

Bret Victor - Ladder of abstraction

Many challenges remain

• scalability (data processing + perception), uncertainty, “interestingness”, interaction, evaluation

• infrastructure & architecture

• fast imprecise answers with progressive refinement

• incremental re-computation

• steering computation towards data regions of interest
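
To make "fast imprecise answers with progressive refinement" and "incremental re-computation" concrete, here is a minimal Python sketch: a running mean that is updated chunk by chunk, so a display could show an early rough estimate and refine it as more data is processed. The function name `progressive_mean` and the chunking scheme are assumptions made for illustration, not something described in the talk.

```python
def progressive_mean(values, chunk_size):
    """Yield (items_seen, running_mean) after each chunk of the data."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    total = 0.0
    seen = 0
    for start in range(0, len(values), chunk_size):
        chunk = values[start:start + chunk_size]
        total += sum(chunk)  # incremental: only the new chunk is processed
        seen += len(chunk)
        yield seen, total / seen

# Each tuple is an estimate that a visualization could draw immediately.
estimates = list(progressive_mean([2, 4, 6, 8, 10, 12], chunk_size=2))
```

Because the function is a generator, a caller can stop consuming estimates as soon as the answer is good enough, or reorder the input so that chunks from a region of interest are processed first.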

Acknowledgments

• Bioinformatics Group at Stadius, Leuven University

• in particular: Ryo Sakai, Georgios Pavlopoulos

• visualization community for examples

• Jeremy for Trackster video
