Labeling Lineage over RFID Data Streams, 32nd China Database Academic Conference, Sichuan University


Page 1

Labeling Lineage over RFID Data Streams

32nd China Database Academic Conference, Sichuan University

Page 2

What is it?

The lineage of an RFID data stream mainly consists of information about its origin, and this information needs to be labeled automatically. Targeting the characteristics of real RFID data streams, we apply data stream classification methods, combined with active learning and semi-supervised learning, to label the origin information of RFID data streams automatically. Experimental results show that the proposed method improves the efficiency of RFID stream labeling while preserving classification accuracy.

Page 3

RFID data streams with lineage

Table 1. RFID data streams

No.  Tag_id  Reader_id  Intension  Observer  Timestamp
1    21      12         1          LIU       11:07:05
2    105     15         23         LIU       11:07:06
3    12      20         32         LIU       11:07:07
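To make the record layout of Table 1 concrete, here is a minimal sketch of one reading as a Python data structure. The class name `RfidReading`, the helper `parse_row`, and the choice to make the two lineage fields optional are assumptions for illustration; the slides only give the column names and the three sample rows.

```python
from dataclasses import dataclass
from datetime import time
from typing import Optional


@dataclass
class RfidReading:
    """One row of Table 1 (hypothetical container, not from the slides)."""
    no: int
    tag_id: int
    reader_id: int
    timestamp: time
    # Lineage fields: None means the reading has not been labeled yet.
    intension: Optional[int] = None
    observer: Optional[str] = None

    def is_labeled(self) -> bool:
        return self.intension is not None and self.observer is not None


def parse_row(row: str) -> RfidReading:
    """Parse a whitespace-separated row in the column order of Table 1."""
    no, tag_id, reader_id, intension, observer, ts = row.split()
    return RfidReading(
        no=int(no),
        tag_id=int(tag_id),
        reader_id=int(reader_id),
        timestamp=time.fromisoformat(ts),
        intension=int(intension),
        observer=observer,
    )


# The three sample rows shown in Table 1.
rows = [
    "1 21 12 1 LIU 11:07:05",
    "2 105 15 23 LIU 11:07:06",
    "3 12 20 32 LIU 11:07:07",
]
readings = [parse_row(r) for r in rows]
```

A reading that arrives without lineage, for example `RfidReading(4, 7, 3, time(11, 7, 8))`, reports `is_labeled()` as false, which is the state the labeling method is meant to resolve.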

Page 4

Lineage labeling

Data stream labeling is typically used to track data lineage, recording information about the data source or the history of the process that generated the data. From a machine learning perspective, online labeling is a data stream classification task that attaches different types of label information to incoming records.

Page 5

Labeling lineage over RFID data streams based on active learning and semi-supervised learning

[Flowchart. Stages shown: RFID data streams; sorting Sn based on entropy; active learning (max and min entropy); human labeling; labeled data set; semi-supervised learning; Ci-KNN classifier; updating labeling.]
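The stages named in the flowchart can be sketched roughly as follows. This is not the authors' algorithm, whose details are not given in the slides; it is one plausible reading under stated assumptions: samples are hypothetical 2-D feature tuples, the classifier is a plain KNN vote with Euclidean distance, entropy is the Shannon entropy of the vote shares, the highest-entropy samples go to a human (the `oracle` callable), and the rest are self-labeled starting from the lowest entropy. All function names and the toy data are illustrative.

```python
import math
from collections import Counter


def knn_proba(labeled, x, k=3):
    """Vote shares of each class among the k labeled points nearest to x.

    Assumes `labeled` is a non-empty list of (features, label) pairs.
    """
    nearest = sorted(labeled, key=lambda item: math.dist(item[0], x))[:k]
    votes = Counter(label for _, label in nearest)
    return {label: count / len(nearest) for label, count in votes.items()}


def entropy(proba):
    """Shannon entropy (bits) of a class distribution."""
    return -sum(p * math.log2(p) for p in proba.values() if p > 0)


def label_block(labeled, block, oracle, alpha=0.1, k=3):
    """Label one data block; returns (labels, queried, grown labeled set)."""
    labeled = list(labeled)
    # Sort the block by classifier uncertainty, most uncertain first.
    ranked = sorted(block, key=lambda x: entropy(knn_proba(labeled, x, k)),
                    reverse=True)
    budget = max(1, round(alpha * len(block)))
    labels, queried = {}, []
    # Active learning: ask the human about the maximum-entropy samples.
    for x in ranked[:budget]:
        labels[x] = oracle(x)
        queried.append(x)
        labeled.append((x, labels[x]))
    # Semi-supervised step: self-label the rest, minimum entropy first,
    # feeding each prediction back into the labeled set.
    for x in reversed(ranked[budget:]):
        proba = knn_proba(labeled, x, k)
        labels[x] = max(proba, key=proba.get)
        labeled.append((x, labels[x]))
    return labels, queried, labeled


# Toy data: two hypothetical observers separated along the diagonal.
seed = [((0.0, 0.0), "obs_A"), ((1.0, 1.0), "obs_A"), ((2.0, 2.0), "obs_A"),
        ((3.0, 3.0), "obs_B"), ((4.0, 4.0), "obs_B"), ((5.0, 5.0), "obs_B")]
block = [(0.5, 0.4), (4.6, 4.7), (2.4, 2.4), (0.8, 1.1), (4.2, 3.9)]


def oracle(x):
    """Stand-in for human labeling."""
    return "obs_A" if x[0] + x[1] < 5 else "obs_B"


labels, queried, grown = label_block(seed, block, oracle, alpha=0.2)
```

In this toy run only the sample near the class boundary, `(2.4, 2.4)`, has mixed neighbors, so it is the one sent to the oracle; the remaining four are labeled by the classifier itself. With `alpha` playing the role of the label sample rate, raising it shifts more of the block to human labeling.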

Page 6

Symbols and meanings in the algorithm

Symbol  Meaning
Pool    Data stream buffer pool
Sn      Current data stream
L       Labeled data set
U       Unlabeled data set
Di      Current data block
Ci      Current data stream classifier
α       Label sample rate of the data stream
NB      Naive Bayes
KNN     K-Nearest Neighbor algorithm
Labels  Number of labeled samples
MaxE    Maximum entropy
MinE    Minimum entropy

Page 7

Algorithm: Labeling Lineage over RFID Data Streams

Page 8

SIMULATION AND PERFORMANCE ANALYSIS

The RFID data streams for the experiments were collected in an RFID-based RTLS (real-time location system) application environment. The laboratory was divided into an 8 m² planar grid, in which 14 readers were deployed. Twenty people were asked to carry tags and to move randomly and uniformly within the readers' sensing range. The readers (SR2240) collected readings from 2.4 GHz active RFID tags every two seconds. The data server was a PC with a dual-core 1.6 GHz CPU and 4.0 GB of memory, and the RFID data streams were collected from a dedicated port. The origin information to be labeled consisted of the Intension and Observer lineage fields used in RFID real-time location applications.

Page 9

Experimental scene

[Figure: experimental scene showing readers R1 to R14.]

Page 10

Experimental result

                                              Unlabeled data in training set   Test set data
Origin information           α (% labeled)   Random    AL-SeS                 Random    AL-SeS
Intension (signal strength)  1               0.433     0.452                  0.354     0.423
                             5               0.534     0.623                  0.446     0.542
                             10              0.687     0.768                  0.536     0.624
Observer (observer)          1               0.534     0.623                  0.446     0.542
                             5               0.657     0.786                  0.546     0.684
                             10              0.867     0.919                  0.756     0.886

Table 4.6. Labeling accuracy of origin information for RFID data streams (α is the percentage of labeled samples)
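To read the table as a comparison, the short sketch below computes how much accuracy AL-SeS gains over Random in each row. The accuracy values are copied from the table above; the dictionary layout and the names `accuracy`, `gain`, and `gains` are illustrative choices, not part of the original work.

```python
# (origin field, alpha in %) -> {split: (Random accuracy, AL-SeS accuracy)}
accuracy = {
    ("Intension", 1):  {"unlabeled": (0.433, 0.452), "test": (0.354, 0.423)},
    ("Intension", 5):  {"unlabeled": (0.534, 0.623), "test": (0.446, 0.542)},
    ("Intension", 10): {"unlabeled": (0.687, 0.768), "test": (0.536, 0.624)},
    ("Observer", 1):   {"unlabeled": (0.534, 0.623), "test": (0.446, 0.542)},
    ("Observer", 5):   {"unlabeled": (0.657, 0.786), "test": (0.546, 0.684)},
    ("Observer", 10):  {"unlabeled": (0.867, 0.919), "test": (0.756, 0.886)},
}


def gain(random_acc, al_ses_acc):
    """Absolute accuracy gain of AL-SeS over Random, rounded to 3 decimals."""
    return round(al_ses_acc - random_acc, 3)


gains = {
    key: {split: gain(*pair) for split, pair in splits.items()}
    for key, splits in accuracy.items()
}

for (field, alpha), g in gains.items():
    print(f"{field:9s} alpha={alpha:>2}%  "
          f"unlabeled +{g['unlabeled']:.3f}  test +{g['test']:.3f}")
```

Every gain is positive, so AL-SeS is at least as accurate as random sampling in all reported settings, and accuracy rises with α for both methods.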

Page 11

CONCLUSIONS AND FUTURE WORK

Building on a data stream classification algorithm combined with active learning and semi-supervised learning, we implemented lineage labeling on massive RFID data streams and avoided heavy manual labeling. The final labeling accuracy was strongly affected by the initial training set. We therefore built the initial data set dynamically in the buffer, which keeps the current classifier suited to the stream and preserves the accuracy of RFID data stream labeling. Labeling is an important part of RFID data stream lineage. Our future work will track and study RFID lineage further. Thanks!