Kansai debian study_20071007
Debian SPAM(CRM114)
dselect(1)
2007/10/7
License: Free Software Foundation GNU General Public License (version 2)
http://www.gnu.org/copyleft/gpl.html
http://www.netfort.gr.jp/~tosihisa/debian/kansai_debian_study_20071007.odp
()Debian
SPAM(CRM114)
Debian
Debian 1.3 (bo), 1997
12
(dselect )
dpkg
./configure ; make ; make install
Debian runlevel 23
etch
apt(deb)
SPAM
SPAM
1SPAM
a) 10(^^)v
b) 20(^^)/
c) 50(--)/
d) 100(__)/
e)
130250SPAM
SPAM7
SPAM
3627 SPAM
106.67 SPAM
SPAM
(MUA)()SPAM(^^)/
bogofilter,bsfilter,POPFile SPAM(^^)/
(^^)/
(^^)/
(__)/
SPAM(^^)v
(__)/
SPAM
procmail
bsfilter
ThunderBirdSPAM
POPFile
POPFile
POPFileSPAM
SPAMSYSTEM
POP proxy
POPFile
POPFilePOPProxySPAM
POPFile
1100SPAM
100SPAM
SPAM
POPPOPFilefetchmail
SPAM
DebianSPAM
bogofilter
m(_ _)m
SpamAssassin
SPAM
bsfilter
CRM114
'the Controllable Regex Mutilator'
apt-cache search SPAM
http://crm114.sourceforge.net/
CRM114
Hidden Markov Model, Bayesian Chain Rule Orthogonal Sparse
Bigrams, Winnow, Correlation, KNN/Hyperspace, Bit Entropy, CLUMP,
SVM, Neural Networks
( or by other means- its all programmable).
CRM114
For SPAM filtering, also install nkf and kakasi:
$ apt-get install nkf kakasi crm114
CRM114
CRM114
cp -a /usr/share/doc/crm114/examples .crm114
cd .crm114
gunzip *.gz
chmod +x mailfilter.crm
cssutil -r -b spam.css
cssutil -r -b nonspam.css
CRM114
crm -v
This is CRM114, version 20060704a-BlameRobert (TRE 0.7.3 (LGPL))
Copyright 2001-2006 William S. Yerazunis
This software is licensed under the GPL with ABSOLUTELY NO WARRANTY
SPAM
11
$ cat spam | crm -u ~/.crm114/ mailfilter.crm | grep X-CRM114-Status
X-CRM114-Status: UNSURE (0.0000) This message is 'unsure'; please train it!
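The check above can be wrapped in a small helper so that scripts get only the verdict word. This is a minimal sketch: the function name `crm_verdict` is hypothetical, and it assumes the verdict is the first word after the `X-CRM114-Status:` header name, as in the output shown here.

```shell
#!/bin/sh
# crm_verdict (hypothetical helper): read a filtered message on stdin and
# print the verdict word from the first X-CRM114-Status header.
crm_verdict() {
    awk '/^X-CRM114-Status:/ { print $2; exit }'
}
```

Usage, assuming the same setup as in the slides: `crm -u ~/.crm114/ mailfilter.crm < spam | crm_verdict`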
CRM114SPAM
The score 'pR' in X-CRM114-Status: ranges from -320.0 to +320.0
+10.0 to +320.0    GOOD (not SPAM)
-10.0 to +10.0     UNSURE
-320.0 to -10.0    SPAM
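The pR bands can be expressed as a short function. This is only an illustration of the thresholds shown on the slide, not the code inside mailfilter.crm: the name `classify_pr` is hypothetical, and treating a score of exactly +10.0 or -10.0 as UNSURE is an assumption.

```shell
#!/bin/sh
# classify_pr (hypothetical): map a pR score to GOOD / SPAM / UNSURE.
# $1 = pR score, $2 = threshold (defaults to 10.0 as on the slide).
# Assumption: scores exactly on a threshold count as UNSURE.
classify_pr() {
    awk -v pr="$1" -v t="${2:-10.0}" 'BEGIN {
        pr += 0; t += 0                  # force numeric comparison
        if (pr > t)        print "GOOD"
        else if (pr < -t)  print "SPAM"
        else               print "UNSURE"
    }'
}
```

For example, `classify_pr -183.7027` prints `SPAM` and `classify_pr 0.0000` prints `UNSURE`.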
If a message is SPAM, train it with --learnspam
$ cat spam | crm -u ~/.crm114/ mailfilter.crm --learnspam
$ cat spam | crm -u ~/.crm114/ mailfilter.crm | grep X-CRM114-Status
X-CRM114-Status: SPAM ( pR: -183.7027 )
If a message is not SPAM, train it with --learnnonspam
CRM114SPAM
TOE - Train Only Errors
POPFile
TET - Train Every Thing
CRM114
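The difference between the two strategies can be sketched in a few lines: TOE trains only when the filter's verdict disagrees with the known label, whereas TET would call the training step unconditionally. The function names `classify`, `train` and `toe_train` are hypothetical; the `crm` invocations inside them simply reuse the commands from the earlier slides and are an assumption about how one might wire this up.

```shell
#!/bin/sh
# Hypothetical hooks wrapping the commands shown earlier in the slides.
classify() {   # $1 = message file; prints GOOD, SPAM or UNSURE
    crm -u ~/.crm114/ mailfilter.crm < "$1" |
        awk '/^X-CRM114-Status:/ { print $2; exit }'
}
train() {      # $1 = message file, $2 = spam | nonspam
    crm -u ~/.crm114/ mailfilter.crm "--learn$2" < "$1"
}

# TOE: train only when the verdict is wrong (UNSURE counts as wrong).
# $1 = message file, $2 = correct label (spam | nonspam)
toe_train() {
    msg=$1
    truth=$2
    verdict=$(classify "$msg")
    case "$truth:$verdict" in
        spam:SPAM|nonspam:GOOD) return 0 ;;   # already correct: skip
    esac
    train "$msg" "$truth"
}
```

Usage: `toe_train some-message spam` trains the message as SPAM only if it was not already classified as SPAM.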
SPAM
CRM114: 8bit clean
[]
CRM114 and kakasi
kakasi
$ echo '私の名前は田中です。' | nkf -e | kakasi -Ha -Ka -Ja -Ea -ka -s
watashi no namae ha tanaka desu .
nkf converts UTF-8 (tty) to EUC, and JIS (mail) to EUC
pretokenizer.crm
CRM114(mailfilter.crm)
pretokenizer.crmMIMECRM114
(pretokenizer.crm)
pretokenizer.crmSPAM(1)
*** 261,268 ****
  # We clip m_text to be the first :decision_length: characters of
  # the incoming mail.
  #
! match (:m_text:) [:_dw: 0 :*:decision_length:] /.*/
! isolate (:m_text:)
  #
  # :b_text: is the text with base64's expanded.
  isolate (:b_text:) /:*:m_text:/
--- 261,274 ----
  # We clip m_text to be the first :decision_length: characters of
  # the incoming mail.
  #
! #match (:m_text:) [:_dw: 0 :*:decision_length:] /.*/
! #isolate (:m_text:)
! isolate (:m_text:) /:*:_dw:/
! {
! match [:text_preprocessor:] /./
! syscall (:*:_dw:) (:m_text:) /:*:text_preprocessor:/
! }
! match (:m_text:) [:m_text: 0 :*:decision_length:] /.*/
  #
  # :b_text: is the text with base64's expanded.
  isolate (:b_text:) /:*:m_text:/
pretokenizer.crmSPAM(2)
Add one line to mailfilter.cf:
:text_preprocessor: /\/home\/tosihisa\/\.crm114\/pretokenizer\.crm/
()
In mailfilter.cf, set do_base64 to no
(pretokenizer.crm)
URL
http://www.netfort.gr.jp/~tosihisa/crm114/
mailfilter2.crm
SPAM
mailfilter2.cf
pretokenizer.crm
CRM114
postfix, clamav
procmail, CRM114
Maildir
imap
PC
CRM114
CRM114
MLSPAM
ML
anyone can post ML
MLML
SPAM
CRM114(1)
()
CRM114(1)
(SPAM)
UNSURESPAM
SPAM
UNSURE)
CRM114
UNSURE
GOOD/SPAMSPAM
UNSURE
SPAM
SPAM
()
POP
POPFile
Result    Count    Share
GOOD        335   97.10%
UNSURE       10    2.90%
SPAM          0    0.00%
Total       345
(2007/10/06, 01:57:20)
Result    Count    Share
GOOD          1    0.03%
UNSURE      613   16.90%
SPAM       3014   83.08%
Total      3628
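The percentages above follow directly from the raw counts. A minimal sketch for recomputing them; the function name `share_table` and the output layout are made up for illustration.

```shell
#!/bin/sh
# share_table (hypothetical): print each verdict's count and its share of
# the total. $1 = GOOD count, $2 = UNSURE count, $3 = SPAM count.
share_table() {
    awk -v good="$1" -v unsure="$2" -v spam="$3" 'BEGIN {
        total = good + unsure + spam
        if (total == 0) { print "no messages"; exit 1 }
        printf "GOOD   %5d %6.2f%%\n", good,   100 * good   / total
        printf "UNSURE %5d %6.2f%%\n", unsure, 100 * unsure / total
        printf "SPAM   %5d %6.2f%%\n", spam,   100 * spam   / total
        printf "Total  %5d\n", total
    }'
}
```

Usage: `share_table 335 10 0` and `share_table 1 613 3014` recompute the two sets of figures from their counts.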