A method of extracting malicious expressions in bulletin board systems by using context analysis


Page 1: A method of extracting malicious expressions in bulletin board systems by using context analysis

Intelligent Database Systems Lab

N.Y.U.S.T.

I. M.

A method of extracting malicious expressions in bulletin board systems by using context analysis

Presenter: Jun-Yi Wu Authors: Hiroshi Hanafusa, Kazuhiro Morita, Masao Fuketa, Jun-ichi Aoe

2011 IPM

國立雲林科技大學 (National Yunlin University of Science and Technology)

Page 2: Outline

Motivation, Objective, Methodology, Experiments, Conclusion, Comments

Page 3: Motivation

The extraction scheme of traditional methods depends on words or sequences of words, without considering the context of articles.

It takes a lot of human effort to flag malicious articles.

[Figure panels: malicious expression text; non-malicious expression text]

Page 4: Objective

To present a new context filtering algorithm that reduces human effort and improves the false-positive rate without degrading the false-negative rate.

Page 5: Methodology

The presented system, rule-based extracting knowledge, multi-attribute matching

Page 6: Methodology

The presented system

Page 7: Methodology

The presented system: the outline of context analysis

[Figure panels: malicious expression text; non-malicious expression text]

Page 8: Methodology

The presented system: inadequate and crime expressions

Inadequate: Abuse, Discrimination, Dating Service Website, Obscenity

Crime: Murder & Violence, Explosion & Arson, Crime Material, Drug

Page 9: Methodology

Rule-based extracting knowledge: definition of multi-attribute rules

For example: "He kills someone"

STR: string, or word spelling.
CAT: category by general concept, or part of speech.
SEM: semantic information.
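The three attributes can be illustrated with a minimal sketch in Python. It is not the authors' implementation: the `Token` class, the attribute values (`PRONOUN`, `VERB`, `HUMAN`, `MURDER`), and the rule itself are hypothetical, chosen only to show how a rule element may constrain any subset of STR, CAT, and SEM.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    STR: str  # string, or word spelling
    CAT: str  # category by general concept, or part of speech
    SEM: str  # semantic information


def element_matches(constraints, token):
    """True if the token satisfies every attribute named in one rule element."""
    return all(getattr(token, attr) == value for attr, value in constraints.items())


def rule_matches_at(rule, tokens, start):
    """True if the rule elements match consecutive tokens beginning at `start`."""
    if start + len(rule) > len(tokens):
        return False
    return all(element_matches(c, tokens[start + i]) for i, c in enumerate(rule))


# "He kills someone" with hypothetical attribute values.
sentence = [
    Token("He", "PRONOUN", "HUMAN"),
    Token("kills", "VERB", "MURDER"),
    Token("someone", "PRONOUN", "HUMAN"),
]

# Hypothetical rule: a human, then a verb with murder semantics, then a human.
rule = [{"SEM": "HUMAN"}, {"CAT": "VERB", "SEM": "MURDER"}, {"SEM": "HUMAN"}]

matched = rule_matches_at(rule, sentence, 0)
```

Because a rule element lists only the attributes it cares about, the same rule would also match a sentence with a different subject or object spelling, as long as the SEM values agree.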

Page 10: Methodology

Multi-attribute matching: construction of machines (MAPM); goto and out functions

Page 11: Methodology

Multi-attribute matching: procedure

For example: "I get a strong sword"
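The slides do not reproduce the MAPM construction, so the following Python sketch is only a simplified stand-in under stated assumptions: rules are stored in a trie whose edges are labeled with attribute constraints (the goto function), and each final state lists the rules that end there (the out function). The names `build_machine` and `scan`, the rule labels, and all attribute values are hypothetical, and the example sentence is read as "I get a strong sword".

```python
def build_machine(rules):
    """Build goto and out tables from (label, rule) pairs."""
    goto = [{}]  # goto[state] maps an edge label to the next state
    out = {}     # out[state] lists the labels of rules ending in that state
    for label, rule in rules:
        state = 0
        for constraints in rule:
            key = tuple(sorted(constraints.items()))  # hashable edge label
            if key not in goto[state]:
                goto.append({})
                goto[state][key] = len(goto) - 1
            state = goto[state][key]
        out.setdefault(state, []).append(label)
    return goto, out


def satisfies(key, token):
    """True if the token carries every (attribute, value) pair on the edge."""
    return all(token.get(attr) == value for attr, value in key)


def scan(goto, out, tokens):
    """Return (start, end, label) for every rule matched in the token list."""
    hits = []
    for start in range(len(tokens)):
        states = {0}
        for end in range(start, len(tokens)):
            # One token may satisfy several edges, so track a set of states.
            states = {nxt for s in states
                      for key, nxt in goto[s].items()
                      if satisfies(key, tokens[end])}
            if not states:
                break
            for s in sorted(states):
                hits.extend((start, end, label) for label in out.get(s, []))
    return hits


# Hypothetical rules; both share the prefix "get" + article in the trie.
rules = [
    ("get-adj-weapon", [{"STR": "get"}, {"CAT": "ARTICLE"},
                        {"CAT": "ADJECTIVE"}, {"SEM": "WEAPON"}]),
    ("get-weapon", [{"STR": "get"}, {"CAT": "ARTICLE"}, {"SEM": "WEAPON"}]),
]
goto, out = build_machine(rules)

tokens = [
    {"STR": "I", "CAT": "PRONOUN", "SEM": "HUMAN"},
    {"STR": "get", "CAT": "VERB", "SEM": "ACQUIRE"},
    {"STR": "a", "CAT": "ARTICLE", "SEM": "NONE"},
    {"STR": "strong", "CAT": "ADJECTIVE", "SEM": "STRENGTH"},
    {"STR": "sword", "CAT": "NOUN", "SEM": "WEAPON"},
]
hits = scan(goto, out, tokens)  # [(1, 4, "get-adj-weapon")]
```

This sketch restarts the scan at every token position instead of using failure transitions, which keeps it short at the cost of speed. Whether a matched rule marks the text as malicious or non-malicious is a property of the rule base, which the slides do not list.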

Page 12: Experiments

Page 13: Experiments

Page 14: Experiments

Page 15: Experiments

Time evaluation and error analysis

Page 16: Conclusion

The presented method, MULTI, is a very useful approach for services that filter inadequate expressions.

It is a difficult task to register new words and expressions in dictionaries together with their categories and semantics.

The rule bases of the presented method MULTI are being built step by step for frequent expressions, but difficult problems remain, as the following example shows:

"RQJmcf2O" kill "Aaaaqqqbbb"

Page 17: Comments

Advantage: many examples

Drawback: some mistakes

Application: information retrieval, context analysis