Labeling Lineage over RFID Data Streams
32nd National Database Conference of China, Sichuan University
What is it?
The lineage of an RFID data stream consists mainly of information about its origin, which needs to be labeled automatically. Given the characteristics of real RFID data streams, we introduce data stream classification methods, combined with active learning and semi-supervised learning, to label the origin information of RFID data streams automatically. Experimental results show that the proposed method improves the efficiency of RFID stream labeling while preserving classification accuracy.
RFID data streams with lineage

Table 1. RFID data streams

| No. | Tag_id | Reader_id | Intension | Observer | Timestamp |
|---|---|---|---|---|---|
| 1 | 21 | 12 | 1 | LIU | 11:07:05 |
| 2 | 105 | 15 | 23 | LIU | 11:07:06 |
| 3 | 12 | 20 | 32 | LIU | 11:07:07 |
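
As an illustration of the record layout in Table 1, the rows could be held in a small typed structure. This is only a sketch: the class name `RfidReading`, the helper `parse_reading`, and the whitespace-separated input format are assumptions made for the example, not part of the original system.

```python
from dataclasses import dataclass
from datetime import time


@dataclass(frozen=True)
class RfidReading:
    """One row of Table 1 (hypothetical type, named for this example)."""
    no: int
    tag_id: int
    reader_id: int
    intension: int   # lineage attribute: signal strength
    observer: str    # lineage attribute: who observed the reading
    timestamp: time


def parse_reading(line: str) -> RfidReading:
    """Parse one whitespace-separated row in the column order of Table 1."""
    no, tag_id, reader_id, intension, observer, ts = line.split()
    return RfidReading(int(no), int(tag_id), int(reader_id),
                       int(intension), observer, time.fromisoformat(ts))


# The three rows shown in Table 1.
rows = [
    "1 21 12 1 LIU 11:07:05",
    "2 105 15 23 LIU 11:07:06",
    "3 12 20 32 LIU 11:07:07",
]
readings = [parse_reading(r) for r in rows]
```

Here `intension` and `observer` are the two lineage attributes that the labeling task has to fill in; the remaining fields come from the reader itself.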
Lineage labeling
Data stream labeling is typically used to track data lineage and to record information about the data source or the history of the generating process. From a machine learning perspective, online labeling is a data stream classification task that attaches different types of label information to the stream.
Labeling lineage over RFID data streams based on active learning and semi-supervised learning
[Framework diagram. Components: RFID data streams; Sorting Sn based on entropy; Labeled Data Set; Active Learning (Max and Min Entropy); Human Labeling; Semi-supervised Learning; Ci-KNN Classifier; Updating Labeling]
Symbols and meanings in the algorithm

| Symbol | Meaning |
|---|---|
| Pool | Data stream buffer pool |
| Sn | Current data stream |
| L | Labeled data set |
| U | Unlabeled data set |
| Di | Current data block |
| Ci | Current data stream classifier |
| α | Data stream label sample rate |
| NB | Naive Bayes |
| KNN | K-Nearest Neighbor algorithm |
| Labels | Size of the labeled data set |
| MaxE | Maximum entropy |
| MinE | Minimum entropy |
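
The slides do not spell out how the entropy behind MaxE and MinE is computed. A common choice, assumed here, is the Shannon entropy of the class distribution that the current classifier Ci predicts for a sample x; the class set \mathcal{C} and the notation p_{C_i}(c \mid x) are introduced only for this illustration.

```latex
H(x) = -\sum_{c \in \mathcal{C}} p_{C_i}(c \mid x)\,\log_2 p_{C_i}(c \mid x),
\qquad
\mathrm{MaxE} = \max_{x \in U} H(x),
\qquad
\mathrm{MinE} = \min_{x \in U} H(x)
```

Under this definition, a sample predicted as (0.5, 0.5) over two classes has H(x) = 1 bit, the most uncertain case, while a sample predicted as (1, 0) has H(x) = 0.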
Algorithm: Labeling Lineage over RFID Data Streams
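
The algorithm listing is not reproduced in this transcript. As a minimal sketch of how the components named on the framework slide might fit together, the code below ranks the samples of one data block Di by prediction entropy, sends the highest-entropy fraction α to a human oracle (active learning), and lets a KNN classifier label the rest, most confident first (semi-supervised learning). All of this is an assumption for illustration: the function names, the 2-D feature vectors, the toy data, and the rule that every remaining sample is self-labeled are not taken from the authors' method.

```python
import math
from collections import Counter


def knn_proba(x, labeled, k=3):
    """Class probabilities estimated from the k nearest labeled neighbours."""
    nearest = sorted(labeled, key=lambda item: math.dist(x, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return {label: count / len(nearest) for label, count in votes.items()}


def entropy(proba):
    """Shannon entropy (in bits) of a class-probability dictionary."""
    return -sum(p * math.log2(p) for p in proba.values() if p > 0)


def label_block(block, labeled, oracle, alpha=0.1, k=3):
    """Label one data block and return the enlarged labeled set.

    Assumption: ceil(alpha * len(block)) highest-entropy samples are sent to
    the human oracle; all other samples are self-labeled by KNN, starting
    with the lowest-entropy ones.
    """
    labeled = list(labeled)
    ranked = sorted(block,
                    key=lambda x: entropy(knn_proba(x, labeled, k)),
                    reverse=True)
    budget = math.ceil(alpha * len(block))
    for x in ranked[:budget]:                # active learning: ask a human
        labeled.append((x, oracle(x)))
    for x in reversed(ranked[budget:]):      # semi-supervised: self-label
        proba = knn_proba(x, labeled, k)
        labeled.append((x, max(proba, key=proba.get)))
    return labeled


# Toy usage with hypothetical 2-D features and two label values.
seed = [((0.0, 0.0), "A"), ((0.1, 0.2), "A"), ((0.2, 0.1), "A"),
        ((5.0, 5.0), "B"), ((5.1, 4.9), "B"), ((4.9, 5.2), "B")]
block = [(0.2, 0.3), (4.8, 5.0), (2.5, 2.6), (0.3, 0.1), (5.2, 5.1)]
queried = []


def oracle(x):
    """Stand-in for the human labeler; records which samples were asked."""
    queried.append(x)
    return "A" if x[0] + x[1] < 5 else "B"


result = label_block(block, seed, oracle, alpha=0.2, k=3)
```

In this toy run only the ambiguous point between the two clusters is sent to the oracle; the other four samples are labeled by the classifier, which is the kind of saving in manual effort the method aims for.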
SIMULATION AND PERFORMANCE ANALYSIS
The RFID data streams used in the experiments were collected in an RFID-based real-time location system (RTLS) environment. The laboratory was divided into an 8 m² planar grid, and 14 readers were deployed. 20 people were asked to carry tags and to move randomly and uniformly within the readers' sensing range. SR2240 readers collected readings from 2.4 GHz active RFID tags every two seconds. The data server was a PC with a dual-core 1.6 GHz CPU and 4.0 GB of memory, and the RFID data streams were received on a dedicated port. The origin information to be labeled was the Intension (signal strength) and Observer lineage information used in RFID real-time location applications.
Experimental scene
[Figure: layout of the 14 readers, labeled R1 to R14]
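Experimental result

| Origin information of RFID data streams | Labeled samples α (%) | Unlabeled data in training set: Random | Unlabeled data in training set: AL-SeS | Test set: Random | Test set: AL-SeS |
|---|---|---|---|---|---|
| Intension (signal strength) | 1 | 0.433 | 0.452 | 0.354 | 0.423 |
| Intension (signal strength) | 5 | 0.534 | 0.623 | 0.446 | 0.542 |
| Intension (signal strength) | 10 | 0.687 | 0.768 | 0.536 | 0.624 |
| Observer | 1 | 0.534 | 0.623 | 0.446 | 0.542 |
| Observer | 5 | 0.657 | 0.786 | 0.546 | 0.684 |
| Observer | 10 | 0.867 | 0.919 | 0.756 | 0.886 |

Table 4.6. Labeling accuracy of RFID data stream origin information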
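
To make the comparison in Table 4.6 concrete, the test-set gain of AL-SeS over random sampling can be computed directly from the reported accuracies. The dictionary layout and variable names below are chosen for this example only; the numbers are copied from the table.

```python
# (attribute, alpha in %) -> (train Random, train AL-SeS, test Random, test AL-SeS)
accuracy = {
    ("Intension", 1): (0.433, 0.452, 0.354, 0.423),
    ("Intension", 5): (0.534, 0.623, 0.446, 0.542),
    ("Intension", 10): (0.687, 0.768, 0.536, 0.624),
    ("Observer", 1): (0.534, 0.623, 0.446, 0.542),
    ("Observer", 5): (0.657, 0.786, 0.546, 0.684),
    ("Observer", 10): (0.867, 0.919, 0.756, 0.886),
}

# Absolute accuracy gain of AL-SeS over Random on the test set.
test_gain = {key: round(row[3] - row[2], 3) for key, row in accuracy.items()}

smallest = min(test_gain, key=test_gain.get)
largest = max(test_gain, key=test_gain.get)

for key, gain in test_gain.items():
    print(key, gain)
```

On the test set, AL-SeS is ahead of random sampling in every row: the gain ranges from 0.069 (Intension, α = 1) to 0.138 (Observer, α = 5).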
CONCLUSIONS AND FUTURE WORK
Based on a data stream classification algorithm combined with active learning and semi-supervised learning, we implemented lineage labeling on massive RFID data streams and avoided a heavy manual labeling task. The final labeling accuracy was strongly affected by the initial training set. We therefore built the initial data set dynamically in the buffer, which keeps the current classifier suitable and the labeling of RFID data streams accurate. RFID data stream labeling is an important part of RFID data stream lineage. Our future work will focus on tracking and studying RFID lineage. Thanks!