Aug. 20, 2009 EDRN @ JPL, SoCalBSI '09

The power of bioinformatics tools in cancer research

Early Detection Research Network, JPL

Mentors: Dr. Chris Mattmann, Andrew Hart

Andrew Clark

Southern California Bioinformatics Summer Institute, 2009




Agenda

- Introduction
  - Biomarkers and cancer research
  - Early Detection Research Network (EDRN)
  - The NCI & JPL EDRN Infrastructure
- Project objective
  - eCAS Curator additions
- The eCAS Catalog and Archive Service
  - Data curation
- Architectural & design considerations
  - Software engineering
  - Meta-data processing
- Results & conclusions
- Acknowledgements


Introduction

Biomarkers and cancer research

Constant research is underway to discover and identify reliable biomarkers of cancer in the human body.

What is a biomarker?

“A biological molecule found in blood, other body fluids, or tissues that is a sign of a normal or abnormal process, or of a condition or disease.”

source: http://www.cancer.gov/dictionary/?searchTxt=biomarker


Biomarker research

The more information that is collected and shared between research sites and medical laboratories:

- The more effective diagnosis will become.
- The more specialized treatments can be devised to minimize the devastating effects of cancer on its host.


The Early Detection Research Network

The National Cancer Institute (NCI) is concerned with managing biomarker research data and disseminating information to the public.

The NCI formed the EDRN in 1999 “to provide up-to-date information on biomarker research” to the scientific and medical communities and to the general public.

source: http://edrn.nci.nih.gov/about-edrn


The Jet Propulsion Laboratory

JPL is a federally funded research and development center (FFRDC), operated by Caltech for NASA.

JPL’s technology for cataloging and managing extremely large sets of data provided the underlying infrastructure needed by the EDRN to accomplish its own mission.


The EDRN Infrastructure

My mentors, Dr. Chris Mattmann and Andrew Hart, and their team continue to develop the underlying software grid.

JPL software engineers work with bioinformatics experts to develop the public interface to the EDRN, a web-based portal available to the general public:

http://edrn.nci.nih.gov


Project objective

Overall:

- To participate as a bioinformatics software engineer at JPL.
- To contribute to the EDRN software infrastructure.

Specifically:

- Improve the functionality of the eCAS Curator.


EDRN Catalog and Archive Service

JPL software customized for cataloging and archiving biomarker data, including specimen details, specimen images and related information.

[Figure: eCAS workflow diagram. Labels: sites A, B, C; Research data; EDRN Staging Server; EDRN Public Portal; WWW; Curator; XML dataset meta-data. Numbered stages: 1. Data Ingestion (pre-release data); 2. Curation (meta-data edits, pub. survey & cross reference, expert review); 3. Product Release (released data).]
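The three numbered stages in the diagram (data ingestion, curation, product release) can be read as a one-way progression for each dataset. The following is a minimal Python sketch of that idea only; the `Stage` and `Dataset` names are hypothetical and are not taken from the eCAS code base.

```python
from dataclasses import dataclass, field
from enum import Enum


class Stage(Enum):
    # Stage numbers mirror the labels in the workflow diagram.
    INGESTED = 1   # 1. Data Ingestion: pre-release data on the staging server
    CURATED = 2    # 2. Curation: meta-data edits and expert review
    RELEASED = 3   # 3. Product Release: visible on the public portal


@dataclass
class Dataset:
    """Hypothetical stand-in for a dataset moving through the workflow."""
    name: str
    stage: Stage = Stage.INGESTED
    metadata: dict = field(default_factory=dict)

    def advance(self) -> None:
        """Move to the next stage; a released dataset cannot advance further."""
        if self.stage is Stage.RELEASED:
            raise ValueError(f"{self.name} is already released")
        self.stage = Stage(self.stage.value + 1)

    @property
    def is_public(self) -> bool:
        """Only released datasets would be shown on the public portal."""
        return self.stage is Stage.RELEASED
```

Modeling the stages as an ordered enum keeps the rule "nothing reaches the portal without passing curation" in one place, which is the property the diagram emphasizes.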


eCAS data curation

Data ingested from research sites undergoes a curation phase before its publication to the public portal.



eCAS Curator

The curation activities would benefit from additional software tools as part of the overall eCAS workflow.



Architectural & design considerations

Software engineering:

- EDRN tools are primarily web applications
- Design and integrate modular components

Meta-data management:

- Meta-data: information that describes the content of other information.
- Meta-data management is crucial to the data curation and the operation of the EDRN system.
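To make the definition concrete, the sketch below builds a small meta-data record that describes a data payload without containing it. It is an illustration only: the `describe` helper and the `site` and `organ` field names are assumptions, not EDRN definitions.

```python
import hashlib


def describe(payload: bytes, **descriptive_fields: str) -> dict:
    """Build a meta-data record for a payload without storing the payload itself."""
    record = {
        "size_bytes": len(payload),                     # how large the data is
        "sha256": hashlib.sha256(payload).hexdigest(),  # fingerprint of the content
    }
    # Descriptive fields are supplied by a person or a policy, not derived from the bytes.
    record.update(descriptive_fields)
    return record


# Hypothetical example data and field names.
record = describe(b"specimen,marker\nS1,M1\n", site="A", organ="lung")
```

The record can be cataloged, searched, and corrected independently of the (possibly very large) data it describes, which is why its management matters to curation.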


Data curation with eCAS

Step 1 (Data Ingestion): Internal EDRN policy files contain meta-data definitions and configuration details that describe the dataset expected from each research site.
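As a rough illustration of how a policy file could carry meta-data definitions, the sketch below parses a small XML document into a lookup table. The XML layout, the element names, and the `load_definitions` helper are all invented for this example; the actual EDRN policy file format is not shown in the slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical policy layout, invented for illustration only.
POLICY_XML = """
<policy site="A">
  <element name="SpecimenType" required="true">Kind of specimen collected</element>
  <element name="CollectionDate" required="true">Date the specimen was taken</element>
  <element name="Notes" required="false">Free-text remarks</element>
</policy>
"""


def load_definitions(xml_text: str) -> dict:
    """Map each meta-data element name to its description and required flag."""
    root = ET.fromstring(xml_text)
    return {
        el.get("name"): {
            "description": (el.text or "").strip(),
            "required": el.get("required") == "true",
        }
        for el in root.findall("element")
    }


definitions = load_definitions(POLICY_XML)
```

With definitions loaded this way, an ingestion step could compare an incoming dataset against what the policy says a site is expected to send.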


Data curation with eCAS

Step 2 (Curation): Curators edit and revise dataset meta-data to make the final product records complete and accurate.
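One way to picture the "complete" half of that goal is a check for required fields that are still empty, followed by an edit that fills them in. This is a sketch under assumptions: the helper names and the required field names are hypothetical and do not describe the actual Curator tool.

```python
def missing_fields(record: dict, required: list) -> list:
    """Return the required field names that are absent or blank in the record."""
    missing = []
    for name in required:
        value = record.get(name)
        if value is None or not str(value).strip():
            missing.append(name)
    return missing


def apply_edit(record: dict, name: str, value: str) -> dict:
    """Return a revised copy of the record; the original is left untouched."""
    revised = dict(record)
    revised[name] = value.strip()
    return revised


# Hypothetical required fields and draft record.
REQUIRED = ["SpecimenType", "CollectionDate"]
draft = {"SpecimenType": "serum", "CollectionDate": "  "}
todo = missing_fields(draft, REQUIRED)
```

Returning a copy from `apply_edit` keeps the pre-edit record available, which would make it easier to review what a curator changed before release.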


Data curation with eCAS

Step 3 (Product Release): Accepted data is made available through the web portal. Meta-data definitions provide searchable fields and descriptions of dataset contents to portal users.
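The idea of meta-data fields serving as search keys can be sketched as a simple filter over released records. The `search` function and the `organ` and `site` fields below are hypothetical; the slides do not describe how the portal implements search.

```python
def search(records, **criteria):
    """Return the records whose meta-data matches every criterion, ignoring case."""
    def matches(record):
        return all(
            str(record.get(field, "")).lower() == str(wanted).lower()
            for field, wanted in criteria.items()
        )
    return [r for r in records if matches(r)]


# Hypothetical released records.
released = [
    {"organ": "Lung", "site": "A"},
    {"organ": "lung", "site": "B"},
    {"organ": "prostate", "site": "A"},
]
lung_datasets = search(released, organ="lung")
```

Because matching runs over the meta-data rather than the data files, the quality of search results depends directly on how carefully the records were curated.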


. . .

A dataset policy file


Dataset meta-data configuration


Curator tool

Browser-based meta-data editor.


Curator tool

Selecting datasets for metadata editing

Metadata items retrieved from backend.


Results and conclusions

Final result

Meta-data management tool integrated with the eCAS, and curation functionality incorporated into the workflow.


Conclusion

The goal of software engineering in bioinformatics should be to:

- support scientists’ activities
- facilitate better research and collaboration
- simplify/bring clarity to complex tasks


Conclusion

The combined effectiveness of software tools and expert curation makes the EDRN a more powerful scientific resource that helps drive progress in biomarker research.


Acknowledgements

Thanks to my mentors and supporters at JPL: Chris Mattmann, Andrew Hart

Thanks to the SoCalBSI faculty and staff: Dr. Momand, Drs. Johnston, Dr. Sharp, Dr. Warter-Perez, Ronnie Cheng

Thanks to the SoCalBSI funding sources:

- The National Science Foundation
- The National Institutes of Health
- Economic and Workforce Development