Page 1: PDSC: P2P Document Sharing Community Team No. 4 R91922001 黃振修 PM R91922020 羅婉琪 RD B89902012 葉家齊 RD R91725032 李宜儒 RD R91922015 張燕君 QA R91922028 張靜雯

PDSC: P2P Document Sharing Community

Team No. 4

R91922001 黃振修 PM

R91922020 羅婉琪 RD

B89902012 葉家齊 RD

R91725032 李宜儒 RD

R91922015 張燕君 QA

R91922028 張靜雯 QA

Page 2

Introduction

The original idea comes from research groups such as the CML Laboratory at NTU.

People want to share their documents over the Internet and need keyword search.

Thus we need a peer-to-peer mechanism for document exchange to achieve the goal of knowledge management. We also need full-text search to find and filter the shared resources before downloading.

Page 3

Features

Peer-to-peer document sharing over the Internet.

Full-text keyword searching and search result ranking within the community.

Direct document exchange by sending to and downloading from others.

We developed our own URL format.

Ex: dsc://download/hostname/path/to/file
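
As a rough illustration of how such a URL might be taken apart, the Python sketch below assumes that the first segment after `dsc://` is an action, the second is the host, and the rest is the file path. The slides give only the one example, so this layout and the names `DscUrl` and `parse_dsc_url` are hypothetical.

```python
from typing import NamedTuple


class DscUrl(NamedTuple):
    action: str  # e.g. "download" (assumed meaning of the first segment)
    host: str    # peer that holds the file
    path: str    # path of the file on that peer


def parse_dsc_url(url: str) -> DscUrl:
    """Split a dsc:// URL into action, host, and path (assumed layout)."""
    prefix = "dsc://"
    if not url.startswith(prefix):
        raise ValueError(f"not a dsc URL: {url!r}")
    # At most three parts: action, host, and everything else as the path.
    parts = url[len(prefix):].split("/", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected dsc://action/host/path, got {url!r}")
    action, host, path = parts
    return DscUrl(action, host, path)


print(parse_dsc_url("dsc://download/hostname/path/to/file"))
```

Keeping the path as a single string leaves it to the receiving peer to map it onto its own sharing folder.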

Page 4

Market Requirement

A simple application can be installed to connect to the community.

Users can enter and leave the community at any time.

Users can share documents with each other.

The shared resources must be kept up to date.

It is easy to see what is available in the community.

Users can enter keywords to search the community for documents.

Users can send files directly to each other.

Page 5

Project Roadmap

Version 1.0: Basic functionality

Version 2.0: Duplication of multiple copies in the community; central backup mechanism

Version 3.0: User management/authentication; user acknowledgement of document exchange

More document formats will be supported in the future

Page 6

Stage Goals

Stage 1: Community browsing

Stage 2: Search functionality

Stage 3: Download/send file functionality

Page 7

Schedule Notes

5/3: 黃振修 should finish the document digest module

5/10: 葉家齊 should finish the architecture prototype and server-side protocol communication

5/12: 羅婉琪 should finish the client browsing functionality

5/10: QA finishes doc conversion testing (binary/code)

5/10: 李宜儒 should finish the Win32 file hook mechanism

5/13: Download/send file should be OK

5/24: QA finishes document search testing

5/24: The search result should be OK

5/28 ~ 6/4: Code freeze and final testing

Page 8

Project Meetings

Two types of meetings are defined:

[PRJ]: Project meeting

[DEV]: Development meeting

Meeting dates:

[PRJ] 4/15, Tue. R319 of CSIE building

[DEV] 4/23, Wed. R505 of CSIE building

[DEV] 4/28, Mon. R505 of CSIE building

[DEV] 4/29, Tue. R107 of CSIE building

[DEV] 4/29, Tue. R105 of CSIE building

[DEV] 5/6, Tue. R519 of CSIE building

[PRJ] 6/9, Mon. R503 of CSIE building

After 5/6, no formal meeting was held until the final one. Instead, several small meetings were held among QAs and RDs; sometimes the PM also called RDs and QAs together to cooperate.

Page 9

Documentation

MRD: Market Requirement Document [PM]

PRD: Project Requirement Document [PM]

PED: Project Execution Document [PM]

PDD: Project Development Document [RD]

QAD: Quality Assurance Document [QA]

BTD: Bug Tracking Document [QA]

WDD: Working Discussion Document [PM]

User’s Manual [QA]

Page 10

Development Tools

Microsoft VC++ 6.0

Borland C++ Builder

CVS for source control

Central FTP server for file exchange

InstallShield for the setup program

Page 11

Architecture

[Architecture diagram. Components shown: Graphics User Interface, Kernel Protocol, API for GUI, HostLookupThread, ServerThread, Client, Server, DocumentKeywordProcessor, Database, LocalShared FileDatabase, HostDatabase, TaskDatabase.]

Page 12

Technical Notes (1/2)

A pure peer-to-peer mechanism is implemented. Each application embeds both the client and the server (for efficiency).

When a search request is issued, the application searches its own document collection and then forwards the message to other computers.

The sharing folder is monitored dynamically. Once documents in the sharing folder are modified, the digest module re-digests them in real time, keeping the community’s information up to date.
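
The folder monitoring can be pictured with the Python sketch below. It is only an approximation: the schedule mentions a Win32 file hook mechanism, whereas this sketch swaps in plain polling, comparing two snapshots of last-modified times. The names `snapshot` and `changed_files` are hypothetical.

```python
import os
import tempfile


def snapshot(folder: str) -> dict[str, float]:
    """Map every file under `folder` (recursively) to its last-modified time."""
    state = {}
    for root, _dirs, files in os.walk(folder):
        for name in files:
            path = os.path.join(root, name)
            state[path] = os.path.getmtime(path)
    return state


def changed_files(old: dict[str, float], new: dict[str, float]) -> list[str]:
    """Files that are new or modified since the previous snapshot."""
    return sorted(p for p, mtime in new.items() if old.get(p) != mtime)


# Demo: a file added to the shared folder shows up as needing a re-digest.
with tempfile.TemporaryDirectory() as shared:
    before = snapshot(shared)
    with open(os.path.join(shared, "report.txt"), "w") as f:
        f.write("hello")
    after = snapshot(shared)
    for path in changed_files(before, after):
        print("re-digest:", os.path.basename(path))
```

Polling is simpler than a file hook but reacts only as often as the snapshots are taken, which is presumably why a hook would be preferred for real-time updates.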

See PDD for more detail
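
The search-then-forward behaviour described on this slide might look roughly like the Python sketch below. In-process method calls stand in for network messages, and the `Peer` class, the request IDs, and the `seen` set that stops a request from looping are all assumptions; the slides do not say how the real protocol handles repeated requests.

```python
from dataclasses import dataclass, field


@dataclass
class Peer:
    name: str
    documents: dict[str, str]                      # file name -> extracted text
    neighbours: list["Peer"] = field(default_factory=list)
    seen: set[int] = field(default_factory=set)    # request IDs already handled

    def search(self, request_id: int, keyword: str) -> list[tuple[str, str]]:
        """Search local documents first, then forward the request to neighbours."""
        if request_id in self.seen:   # assumption: drop requests seen before
            return []
        self.seen.add(request_id)
        hits = [(self.name, doc) for doc, text in self.documents.items()
                if keyword.lower() in text.lower()]
        for peer in self.neighbours:
            hits.extend(peer.search(request_id, keyword))
        return hits


a = Peer("a", {"p2p.doc": "Peer-to-peer sharing"})
b = Peer("b", {"notes.pdf": "meeting notes", "net.ppt": "peer discovery"})
a.neighbours.append(b)
b.neighbours.append(a)   # a cycle: the seen set keeps the request from looping

print(a.search(1, "peer"))
```

Because every application acts as both client and server, the same `search` method handles a local query and a forwarded one.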

Page 13

Technical Notes (2/2)

Three main document formats are supported: MS Word, MS PowerPoint, and PDF. (No Chinese support.)

Digest is the technique used to extract a document’s feature vector. Searching is based on those digest vectors.

An algorithm rates each search match, and the results are ranked according to the points.

Digests of the shared documents are saved when the program exits, so full initialization is needed only the first time.

See PDD for more detail
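
To make the idea of digest vectors and point-based ranking concrete, here is a minimal Python sketch. It assumes the digest is a term-frequency vector and that a document's points are the total occurrences of the query terms; the project's actual digest and rating algorithm are described in the PDD and may differ. All function names are hypothetical.

```python
import re
from collections import Counter


def digest(text: str) -> Counter:
    """Reduce a document to a term-frequency vector (assumed form of the digest)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def score(query: str, vector: Counter) -> int:
    """Points for one document: total occurrences of the distinct query terms."""
    return sum(vector[term] for term in digest(query))


def rank(query: str, digests: dict[str, Counter]) -> list[tuple[str, int]]:
    """Return (name, points) pairs with points above zero, best first."""
    scored = [(name, score(query, vec)) for name, vec in digests.items()]
    return sorted((s for s in scored if s[1] > 0), key=lambda s: -s[1])


digests = {
    "intro.doc": digest("Peer to peer document sharing"),
    "search.pdf": digest("Keyword search; search ranking for document search"),
}
print(rank("document search", digests))
```

Searching over the stored vectors rather than the full text means a query never has to reopen the original Word, PowerPoint, or PDF files.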

Page 14

Demonstration

Page 15

Testing Plans

What is to be tested?

Platform

Network status

Command

File conversion

Download/Upload

Where will it be tested?

Win32 environment, Windows 2000 OS

PIII 500 CPU, 256 MB RAM, 100 Mbps Ethernet

See QAD for more detail

Page 16

Testing Cases

Document format conversion (binary tools testing)

Document format conversion (integrated as a program module; tested for robustness and accuracy)

P2P sharing community (tests the features of the UI program)

The sharing module (tests digest/searching and sharing folder monitoring)

Setup program (tests the installer’s functionality)

Performance report (memory usage, CPU utilization, memory leaks)

See QAD for more detail

Page 17

Bug Tracking (1/2)

Empty document files may cause a fatal error: solved by checking file completeness first.

Some PDF files may cause the conversion module to get the wrong page number, causing a fatal error: solved by checking the validity of the page number first.

Duplicated list entries when browsing: stupid bug.

Getting the file list waits too long: stupid bug.

See BTD for more detail
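
The two validation fixes on this slide amount to guarding the converter's input. The Python sketch below is one possible reading, not the project's code: "file completeness" is taken to mean that the file exists and is non-empty, and a page number is taken to be valid when it lies within the document's page count. Both function names are hypothetical.

```python
import os
import tempfile


def is_convertible(path: str) -> bool:
    """Reject missing or empty files before they reach the converter."""
    return os.path.isfile(path) and os.path.getsize(path) > 0


def is_valid_page(page: int, page_count: int) -> bool:
    """A page number is usable only if it lies within 1..page_count."""
    return page_count > 0 and 1 <= page <= page_count


with tempfile.TemporaryDirectory() as folder:
    empty = os.path.join(folder, "empty.pdf")
    open(empty, "wb").close()
    print(is_convertible(empty))   # False: a zero-byte file is skipped

print(is_valid_page(12, 10))       # False: page 12 of a 10-page file
```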

Page 18

Bug Tracking (2/2)

Downloading/sending files is too slow: stupid bug (sleeping in the sending loop).

Cannot get the file list or browse when clients use DHCP: not solved because of the time limit.

Keyword search is not applied recursively in the sharing folder: solved by writing the recursive code.

Keyword search is too slow: improve the algorithm.

See BTD for more detail

Page 19

Bug Statistics

[Chart: number of bugs (vertical axis, 0 to 8) at six checkpoints: 5/3, 5/7, 5/10, 5/24, 6/8, 6/9.]

Page 20

Change Control History

Change from client-server architecture to peer-to-peer architecture [4/23]

Change the document digest from full-text based to digest-vector based [5/6]

Decide to allow recursive sharing within the sharing folder [6/1]

Page 21

Future Plan

Version 2.0: Duplication of multiple copies in the community; central backup mechanism

Version 3.0: User management/authentication; user acknowledgement of document exchange

Bug fixes and support for more document formats

Page 22

The END

Project Shipping Checklist:

Source code (including all surveyed components and the CVS repository)

Development documents: MRD, PRD, PED, PDD, QAD, BTD, and WDD

User’s Manual

Presentation file

Install program

Project CD with all the stuff