Khairuddin Omar
28/5/2015
Bicara Malim FTSM 2015
Introduction
Optical character recognition (OCR; Malay: Pengecaman Aksara Optik, PAO) is the process of converting scanned images of printed or handwritten text (digits, letters, and symbols) into a machine-readable character stream, either plain (e.g. a text file) or formatted (e.g. an HTML file).
OCR is the most successful branch of pattern recognition (PR).
"Indeed, to recognize a character from a given image, one would match (via some known metric) this character's feature pattern against some very limited reference set of known feature patterns in the given alphabet. This clearly is a classical case of a pattern recognition problem."
Eugene Borovikov, 2014. A survey of modern optical character recognition techniques. Computer Vision and Pattern Recognition.
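The matching step described in the quotation can be sketched as a nearest-template lookup. This is only an illustration under stated assumptions: the 3x3 binary glyphs, the `TEMPLATES` table, and the choice of Hamming distance as the "known metric" are all hypothetical and do not come from the talk or from Borovikov's survey.

```python
# Hypothetical reference set: 3x3 binary glyphs flattened row by row.
# These templates are illustrative only; a real OCR system would use
# richer features extracted from the character image.
TEMPLATES = {
    "I": (0, 1, 0, 0, 1, 0, 0, 1, 0),
    "L": (1, 0, 0, 1, 0, 0, 1, 1, 1),
    "T": (1, 1, 1, 0, 1, 0, 0, 1, 0),
}


def hamming(a, b):
    """Count the positions where two equal-length feature vectors differ."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return sum(x != y for x, y in zip(a, b))


def recognize(features, templates=TEMPLATES):
    """Return the label whose reference pattern is closest to `features`."""
    return min(templates, key=lambda label: hamming(features, templates[label]))


# An "L" with one pixel flipped by noise is still closest to the "L" template.
noisy_l = (1, 0, 0, 1, 0, 0, 1, 1, 0)
print(recognize(noisy_l))
```

Because the reference set is small (one pattern per character of the alphabet), an exhaustive comparison like this is cheap; the hard part in practice is choosing features and a metric that separate similar characters.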
A typical OCR System
Example OCR Architecture Structure (Khairuddin 2000)
Handwriting recognition: Jawi
Khairuddin Omar, Jawi Handwritten Text Recognition Using Multi-level Classifier (in Malay), PhD Thesis, Universiti Putra Malaysia, 2000.
Mazani Manaf, Jawi Handwritten Text Recognition Using Recurrent Bama Neural Networks (in Malay), PhD Thesis, 2002.
Roslim Mohammad, Modification of Combined Segmentation Technique for Jawi Manuscript (in Malay). MIT Thesis, 2002.
Mohammad Faidzul Nasrudin, Pengecaman Aksara Jawi Menggunakan Jelmaan Surih [Jawi Character Recognition Using the Trace Transform]. 2011.
Handwriting recognition: Jawi (continued)
Che Norhaslida Deraman, Extension of Combined Segmentation Technique for Jawi Manuscripts (in Malay). MIT Thesis, 2005.
Viska Mutiawani, Segmentation of Jawi Text Using Voronoi Diagram (in Malay) MIT Thesis, 2007.
Remon Redika, Features Extraction Of Jawi Character Base On Hidden Markov Method, 2009.
Anton Heryanto, Segmentation technique for jawi character recognition using Dynamic Programming, 2009.
Handwriting recognition: Arabic
Ahmad M. Z. Mohammed, Segmentation of Arabic Characters Using Voronoi Diagrams, PhD Thesis. Fakulti Teknologi dan Sains Maklumat, Universiti Kebangsaan Malaysia, Bangi, 2007.
Atallah Mahmoud Awad Al-Shatnawi. A Non-Iterative Thinning Method Based on Exploited Vertices of Voronoi Diagrams, 2010.
Ali Mohammed Massud Mady. A Comparative Study in The Algorithms of Voronoi Diagrams Construction on Thinning Process, 2011.
Jabril Ramdan Abdslam Salem. Comparative Study of Algorithms for Voronoi Diagram Construction on Segmentation Of Arabic Handwriting, 2011
Intelligent post-processing
Azniah Ismail, ASCII Code and UNICODE for Arabic and Jawi Word Processing (in Malay). MIT Thesis, 2003.
Suliana Sulaiman, Digital Jawi Manuscript in UNICODE Character Code (in Malay), MIT Thesis, 2007.
Juhaidah Abu Bakar, Transliteration System of Old Jawi to New Jawi Using Grafem (in Malay), MIT Thesis, 2007.
Suliana Sulaiman. Pencantas Perkataan Melayu untuk Aksara Jawi Berasaskan Petua [Rule-Based Malay Word Stemmer for Jawi Script], 2013.
Juhaidah Abu Bakar. Minimizing Part of Speech Tagging Gap: Identifying Proper Names in Jawi corpus.
OCR in multi-media
Che Wan Shamsul Bahari Che Wan Ahmad, Old Jawi to New Jawi Translator (in Malay), MIT Thesis, Fakulti Teknologi dan Sains Maklumat, Universiti Kebangsaan Malaysia, Bangi, 2006.
Yonhendri. Enjin Transliterasi Rumi-Jawi [Rumi-Jawi Transliteration Engine], 2009.
Che Wan Shamsul Bahari Che Wan Ahmad. Transliterasi Mesin untuk Ejaan Melayu Lama [Machine Transliteration for Old Malay Spelling].
Adaptive OCR
Wider range of printed document imagery:
Majdi Abdel Rahim Saleh Salameh. Pengecaman Harakat Arab dan Hukum Tajwid Quran Menggunakan Rangkaian Neural dan Teknik Logik Kabur [Recognition of Arabic Harakat and Quranic Tajwid Rules Using Neural Networks and Fuzzy Logic Techniques]. Main, 2009.
Omni-font texts:
Mohd Sanusi bin Azmi. Fitur Baharu dari Kombinasi Geometri Segitiga dan Pengezonan untuk Paleografi Jawi Digital [New Features from a Combination of Triangle Geometry and Zoning for Digital Jawi Paleography], 2013.
Multi-script and multi-language recognition:
Waleed Abdel Karim Helal Abu-Ain. Automatic Off-line International Handwriting Script Identification Based on Skeleton Primitive Direction Features.
Document Image Enhancement
Mohd Sanusi Azmi, Reengineering of Slant and Slope Orientation Skew Histogram for Merong Mahawangsa Manuscript (in Malay), MIT Thesis, 2003.
Bilal Mohammad Ahmad Bataineh. Adaptive Binarization and Statistical Texture Analysis for Document Images Analysis and Recognition, 2011.
Sitti Rachmawati Yahya. Pembentukan Semula Imej Manuskrip Lama Secara Kaedah Adaptif Perduaan Automatik Dan Penjejakan Tetingkap Piksel [Reconstruction of Old Manuscript Images Using Automatic Adaptive Binarization and Pixel Window Tracking].
Tarik Abdel Kareem Helal Abu Ain. Joint-Landmarks Baseline and Advanced Direction Features for Arabic Character Segmentation and Classification.
Major trends in modern OCR
Adaptive OCR aims at robust handling of a wider range of printed document imagery by addressing:
multi-script and multi-language recognition
omni-font texts
automatic document segmentation
mathematical notation recognition
Major trends in modern OCR
Handwriting recognition is a maturing OCR technology that has to be extremely robust and adaptive. In general, it remains an actively researched open problem that has been solved to a certain extent for some special applications, such as:
recognition of hand-printed text in forms
handwriting recognition in personal checks
postal envelope and parcel address readers
OCR in portable and handheld devices
Major trends in modern OCR
Document image enhancement involves (automatically) choosing and applying appropriate image filters to the source document image to help the given OCR engine better recognize characters and words.
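As one concrete instance of applying an image filter before recognition, the sketch below removes isolated specks with a 3x3 median filter. The talk does not name a specific filter, so the choice of a median filter, the function name `median_filter3`, and the tiny sample `page` are assumptions made purely for illustration.

```python
from statistics import median_low


def median_filter3(img):
    """Apply a 3x3 median filter to a grayscale image given as a list of rows.

    Border pixels use only the neighbours that actually exist, and
    median_low keeps the result an integer pixel value already present
    in the window.
    """
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [
                img[j][i]
                for j in range(max(0, y - 1), min(h, y + 2))
                for i in range(max(0, x - 1), min(w, x + 2))
            ]
            out[y][x] = median_low(window)
    return out


# Hypothetical 3x4 scan: white background (255) with one dark speck (0).
page = [
    [255, 255, 255, 255],
    [255, 0, 255, 255],
    [255, 255, 255, 255],
]
clean = median_filter3(page)
print(clean)
```

A median filter suits salt-and-pepper noise because a single outlier never becomes the median of its neighbourhood; other defects (uneven lighting, skew, bleed-through) would call for different filters, which is why the choice is ideally made per document.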
Major trends in modern OCR
Intelligent post-processing is of great importance for improving OCR accuracy and for creating robust information retrieval (IR) systems that use smart indexing and approximate string matching techniques for storage and retrieval of noisy OCR output texts.
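Approximate string matching on noisy OCR output can be illustrated with Levenshtein (edit) distance and a lexicon lookup. This is a minimal sketch, not the method of any thesis listed in the talk: the `LEXICON` contents, the `max_distance` threshold, and the helper name `correct` are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute, or keep on a match
            ))
        prev = curr
    return prev[-1]


def correct(word, lexicon, max_distance=2):
    """Replace `word` by its nearest lexicon entry if it is close enough."""
    best = min(lexicon, key=lambda entry: edit_distance(word, entry))
    return best if edit_distance(word, best) <= max_distance else word


# Hypothetical lexicon; a real system would use a language-specific dictionary.
LEXICON = ["recognition", "segmentation", "document"]
print(correct("rec0gniti0n", LEXICON))
```

The threshold keeps the corrector from forcing unrelated tokens onto a dictionary word: a token that is far from every entry is returned unchanged.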
Major trends in modern OCR
OCR in multi-media is an interesting development that adapts optical character recognition techniques to media other than printed documents, e.g. photos, video, and the internet.
Why is OCR hard?
The difficulty comes from two main sources:
Low image quality:
  poor original document quality
  noisy, low resolution, multi-generation image scanning
  incorrect or insufficient image pre-processing
  poor segmentation into recognition items
Discriminative ability of the classifier:
  it is hard to reach a 99% recognition rate
Why is OCR hard?
script and language
document image types and image defects
document segmentation
character types
OCR flexibility, accuracy and productivity
hand-writing and hand-printing
OCR pre- and post-processing
Complex character scripts
Insufficient image preprocessing
Document segmentation ambiguity
Character shape variability
Baseline detection
Skew and slanting
Poor original document quality
Poor segmentation into recognition items
Complex features
Stemming, tagging, homograph
The most promising directions
adaptive OCR aiming at robust handling of a wider range of printed document imagery - deep learning
document image enhancement as part of OCR pre-processing
intelligent use of context providing a bigger picture to the OCR engine and making the recognition task more focused and robust
handwriting recognition in all forms, static and dynamic, general-purpose and task-specific, etc.
multi-lingual OCR, including multiple embedded scripts
multi-media OCR aiming to recognize any text captured by any visual sensor in any environment
That is all.