Regular expressions (Regularni izrazi)
Teoretske osnove računarstva (Theoretical Foundations of Computer Science), 04/28/2023
Introduction
A regular expression (regex, from REGular EXpression) is a pattern that describes a set of strings.
Regular expressions are built the way arithmetic expressions are: operators combine smaller expressions into larger ones.
There are two syntaxes: basic and extended.
Applications: finding patterns in text (Find, Replace, lexical analysis).
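Regular expressions
a          Literal
abc        Concatenation of literals
[abcd]     Union of literals: a, b, c and d
[a-d]      Union of literals: a, b, c and d
[-a-d]     Union of literals: -, a, b, c and d
[a-d-]     Union of literals: a, b, c, d and -
[a\-d]     Union of literals: a, - and d (Perl-style syntax; in a POSIX bracket expression the backslash is an ordinary character)
abc|def    Union of expressions
abc?       At most one occurrence (of c)
abc*       Zero or more repetitions (of c)
abc+       One or more repetitions (of c)
a(bc)?     Parentheses: the operator applies to the whole group bc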
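The operators above can be tried out with a short script. This is only a sketch: it uses Python's `re` module as a stand-in for the engines the slides have in mind, and the sample strings and the `check` helper are made up for illustration. `re.fullmatch` is used so that each string is tested against the whole pattern.

```python
import re

# Each entry: (pattern, strings the pattern should accept, strings it should reject).
# The sample strings are illustrative and do not come from the slides.
cases = [
    (r"[a-d]",   ["a", "c"],       ["e", "-"]),
    (r"[-a-d]",  ["-", "b"],       ["e"]),
    (r"abc|def", ["abc", "def"],   ["abd"]),
    (r"abc?",    ["ab", "abc"],    ["abcc"]),
    (r"abc*",    ["ab", "abccc"],  ["abcabc"]),
    (r"abc+",    ["abc", "abccc"], ["ab"]),
    (r"a(bc)?",  ["a", "abc"],     ["ab", "abcbc"]),
]


def check(pattern, accepted, rejected):
    """Return True if the whole-string matches agree with the two lists."""
    all_accepted = all(re.fullmatch(pattern, s) for s in accepted)
    none_rejected = not any(re.fullmatch(pattern, s) for s in rejected)
    return all_accepted and none_rejected


results = {pattern: check(pattern, yes, no) for pattern, yes, no in cases}
```

Note that in `abc?`, `abc*` and `abc+` the operator binds only to the last literal, which is why `abcabc` is rejected by `abc*`.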
Regular expressions
.                     Any character
\\, \*, \+, \?, ...   Escape sequences
^                     Start of line
[^abcd]               Anything except a, b, c and d
$                     End of line
a(bc){2}              Exactly 2 repetitions
a(bc){2,}             At least 2 repetitions
a(bc){,4}             At most 4 repetitions
a(bc){2,4}            Between 2 and 4 repetitions
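A small sketch of the counted repetitions and the anchors, again using Python's `re` module as an assumed stand-in. The helper `accepted_counts` and the sample text are illustrative only. In Python, `^` and `$` match at line boundaries only when `re.MULTILINE` is set.

```python
import re


def accepted_counts(pattern, max_n=5):
    """List every n in 0..max_n for which 'a' + 'bc' * n matches the whole pattern."""
    return [n for n in range(max_n + 1)
            if re.fullmatch(pattern, "a" + "bc" * n)]


exactly_two = accepted_counts(r"a(bc){2}")
at_least_two = accepted_counts(r"a(bc){2,}")
at_most_four = accepted_counts(r"a(bc){,4}")
two_to_four = accepted_counts(r"a(bc){2,4}")

# Anchors: three lines, only some of which start or consist of "abc".
text = "abc\nxabc\nabcx"
line_starts = re.findall(r"^abc", text, flags=re.MULTILINE)
whole_lines = re.findall(r"^abc$", text, flags=re.MULTILINE)

# Negated set: every character that is not a, b, c or d.
negated = re.findall(r"[^abcd]", "abcdef")
```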
Regular expressions
\t    Tab
\b    Word boundary
\<    Start of word
\>    End of word
\w    a-zA-Z0-9_
\W    Anything except a-zA-Z0-9_
\d    0-9
\D    Anything except 0-9
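The shorthand classes can be illustrated as follows. This sketch assumes Python's `re` module, which differs from the slides in two ways: `\w` and `\d` cover Unicode unless `re.ASCII` is passed, and `\<` / `\>` are not available, so the lookaround patterns `WORD_START` and `WORD_END` below are substitutes introduced here, not syntax from the slides.

```python
import re

text = "x_1 = 42, y2=7"  # illustrative sample, not from the slides

# re.ASCII restricts \w and \d to the ASCII sets listed in the table.
words = re.findall(r"\w+", text, flags=re.ASCII)
digit_runs = re.findall(r"\d+", text, flags=re.ASCII)
whole_numbers = re.findall(r"\b\d+\b", text, flags=re.ASCII)
non_digits = re.findall(r"\D+", "ab12cd", flags=re.ASCII)

# Substitutes for \< and \>: a boundary followed by, or preceded by, a word character.
WORD_START = r"\b(?=\w)"
WORD_END = r"\b(?<=\w)"
starts = [m.start() for m in re.finditer(WORD_START, text, flags=re.ASCII)]
ends = [m.start() for m in re.finditer(WORD_END, text, flags=re.ASCII)]
```

The difference between `digit_runs` and `whole_numbers` shows what `\b` adds: the digits inside `x_1` and `y2` are not delimited by word boundaries on both sides, so they drop out.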
Regular expressions
Character classes:
◦ [:lower:]    a-z
◦ [:upper:]    A-Z
◦ [:digit:]    0-9
◦ [:alpha:]    [:lower:][:upper:]
◦ [:alnum:]    [:alpha:][:digit:]
◦ [:word:]     a-zA-Z0-9_
◦ [:blank:]    space, tab
◦ [:space:]    space, tab, CR, NL, VT, FF
◦ [:punct:]    ! " # $ % & ' ( ) * + , - . / : ; < = > ? @ [ \ ] ^ _ ` { | } ~
◦ [:xdigit:]   0-9a-fA-F
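Python's `re` module does not understand the `[:name:]` classes, so one way to experiment with them is to expand each class into the ASCII range listed above before compiling. The table `POSIX_CLASSES` and the helper `expand_posix_classes` are hypothetical names introduced for this sketch; `[:punct:]` is left out to keep the escaping simple, and locale-dependent behavior is ignored.

```python
import re

# ASCII expansions taken from the list above ([:punct:] deliberately omitted).
POSIX_CLASSES = {
    "lower": "a-z",
    "upper": "A-Z",
    "digit": "0-9",
    "alpha": "a-zA-Z",
    "alnum": "a-zA-Z0-9",
    "word": "a-zA-Z0-9_",
    "blank": " \t",
    "space": " \t\r\n\v\f",
    "xdigit": "0-9a-fA-F",
}


def expand_posix_classes(pattern):
    """Replace every [:name:] inside a bracket expression with its ASCII range."""
    return re.sub(r"\[:(\w+):\]",
                  lambda m: POSIX_CLASSES[m.group(1)],
                  pattern)


# An identifier: a letter or underscore, then letters, digits or underscores.
identifier = re.compile(expand_posix_classes("[[:alpha:]_][[:alnum:]_]*"))
hex_number = re.compile(expand_posix_classes("[[:xdigit:]]+"))
```

For example, `identifier.fullmatch("x_1")` succeeds while `identifier.fullmatch("1x")` does not.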
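Regular expressions
Back-reference
◦ \N (N a single-digit natural number) refers to the N-th subexpression of the regular expression
◦ (ab)(cd)\2\1 denotes: abcdcdab
◦ (ab(c))\1 denotes: abcabc
◦ (a.)\1 denotes the strings: aaaa, abab, acac, ..., a0a0, a1a1, ...
Operator greediness
◦ a(bc)? applied to abc: matches all of abc
◦ a.*d applied to abcd abcd abcd: matches the whole string, up to the last d
◦ ".*" applied to "Tekst" pod "navodnicima": matches the whole line, from the first quote to the last
◦ "[^"]*" applied to "Tekst" pod "navodnicima": matches "Tekst" and "navodnicima" separately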
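Both ideas on this slide, back-references and greediness, can be checked with a few lines. The sketch assumes Python's `re` module, whose `\N` back-references and greedy `*` behave like the ones described here; the variable names are illustrative.

```python
import re

# Back-references: \N repeats the text captured by the N-th group.
nested = re.fullmatch(r"(ab)(cd)\2\1", "abcdcdab") is not None
outer_group = re.fullmatch(r"(ab(c))\1", "abcabc") is not None
doubled = [s for s in ["aaaa", "abab", "a0a0", "abac"]
           if re.fullmatch(r"(a.)\1", s)]

# Greediness: .* takes as much as it can while still letting the match succeed.
longest = re.search(r"a.*d", "abcd abcd abcd").group(0)

line = '"Tekst" pod "navodnicima"'
greedy = re.findall(r'".*"', line)       # one match spanning the whole line
bounded = re.findall(r'"[^"]*"', line)   # one match per quoted piece
```

Excluding the closing delimiter from the repeated set, as in `"[^"]*"`, is the usual way to keep a greedy operator from running past the first closing quote.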
grep
A Unix tool for searching text.
It understands three regular-expression syntaxes: basic, extended and Perl.
In basic regular expressions the characters ?, +, {, |, ( and ) lose their special meaning; to use them as operators, write the escaped forms \?, \+, \{, \|, \( and \).
grep
General form of the grep command:
◦ grep [options] "regex" [input_file...]
Options:
◦ -E, --extended-regexp: interpret the pattern as an extended regular expression
◦ --color, --colour: highlight the matched text
◦ -o, --only-matching: print only the matched parts of each line
In the shell, double quotes are special and delimit strings, so a literal double quote inside a double-quoted pattern must be written as the sequence \"
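What the default mode and `-o` do can be sketched in a few lines. This is a rough model only, not grep itself: the function `grep` below is a hypothetical helper built on Python's `re` module, it takes a list of lines instead of files, and it ignores every other option.

```python
import re


def grep(pattern, lines, only_matching=False):
    """Rough model of `grep -E pattern`, or `grep -o -E pattern` when only_matching is set."""
    regex = re.compile(pattern)
    output = []
    for line in lines:
        if only_matching:
            # Like -o: emit each non-empty matched part on its own.
            output.extend(m.group(0) for m in regex.finditer(line) if m.group(0))
        elif regex.search(line):
            # Default: emit the whole line if it contains a match.
            output.append(line)
    return output


lines = ['say "hi" twice', "no quotes here", 'a "b" and "c"']  # illustrative input
matching_lines = grep(r'"[^"]*"', lines)
matched_parts = grep(r'"[^"]*"', lines, only_matching=True)
```

The pattern is written in single-quoted Python strings here so that the double quotes need no escaping; in a double-quoted string they would have to be written as `\"`, which mirrors the shell rule stated above.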
sed
Used under Unix operating systems for automated editing of text files (or any stream) by means of regular expressions.
General form of the sed command:
◦ sed -r "s/regex/replacement/[flags]" [input_file...]
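The effect of the `s` command on a single line can be modeled as below. This is a sketch under stated assumptions: `substitute` is a hypothetical helper built on Python's `re.sub`, it handles only the `g` flag, and it does not translate sed-specific replacement syntax such as `&`.

```python
import re


def substitute(line, regex, replacement, flags=""):
    """Rough model of sed's s/regex/replacement/flags applied to one line.

    Without flags only the first match is replaced; "g" replaces every match.
    """
    count = 0 if "g" in flags else 1  # in re.sub, count=0 means "replace all"
    return re.sub(regex, replacement, line, count=count)


first_only = substitute("a-b-c", r"-", "+")
every_match = substitute("a-b-c", r"-", "+", flags="g")
```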
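Examples
(a.)(a.)
◦ abab: match
◦ abac: match
(a.)\1
◦ abab: match
◦ abac: no match
(ab(c))\1
◦ abcabc: match
◦ abcc: no match
(ab(c))\2
◦ abcabc: no match
◦ abcc: match
(.?)(.?)(.?)(.?)(.?)(.?)(.?)(.?)(.?)\9\8\7\6\5\4\3\2\1
◦ abba: matched in full
◦ anavolimilovana: no non-empty match (the pattern describes only even-length palindromes)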
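These examples can be replayed with Python's `re` module, used here as an assumed stand-in for `grep -E`. `re.search` asks whether a string contains a match; for the palindrome pattern `re.fullmatch` is used instead, because every string trivially contains the empty match when all nine groups are empty.

```python
import re

samples = ["abab", "abac", "abcabc", "abcc"]
patterns = [r"(a.)(a.)", r"(a.)\1", r"(ab(c))\1", r"(ab(c))\2"]

# For each pattern, the samples that contain a match.
hits = {p: [s for s in samples if re.search(p, s)] for p in patterns}

# Nine optional one-character groups followed by \9 ... \1:
# the second half must mirror the first, so only even lengths up to 18 fit.
GROUPS = 9
even_palindrome = "(.?)" * GROUPS + "".join(
    "\\" + str(n) for n in range(GROUPS, 0, -1))

abba_ok = re.fullmatch(even_palindrome, "abba") is not None
odd_ok = re.fullmatch(even_palindrome, "anavolimilovana") is not None
```

`\1` refers to the outer group `(ab(c))` and `\2` to the inner `(c)`, because groups are numbered by the position of their opening parenthesis.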
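Examples
(.?)(.?)(.?)(.?)(.?)(.?)(.?)(.?)(.?).?\9\8\7\6\5\4\3\2\1
◦ abba: matched in full
◦ anavolimilovana: matched in full (the extra .? absorbs the middle letter)
((\b[0-9]+)?\.)?\b[0-9]+([eE][-+]?[0-9]+)?\b
◦ 12.6E-12: match
◦ e45: no match
◦ 2359: match
◦ -11ek: no match
◦ 99e9: match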
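A sketch that replays both patterns with Python's `re` module, assumed here to agree with grep on these inputs. The palindrome is tested with `re.fullmatch`; the number pattern is tested with `re.search`, recording the matched text or `None`.

```python
import re

# Palindrome pattern: nine optional groups, an optional middle character,
# then the groups in reverse order.
GROUPS = 9
palindrome = "(.?)" * GROUPS + ".?" + "".join(
    "\\" + str(n) for n in range(GROUPS, 0, -1))

palindromes = [s for s in ["abba", "anavolimilovana", "abca"]  # "abca" is an added counterexample
               if re.fullmatch(palindrome, s)]

# Number pattern: optional integer part and dot, digits, optional exponent.
number = r"((\b[0-9]+)?\.)?\b[0-9]+([eE][-+]?[0-9]+)?\b"
samples = ["12.6E-12", "e45", "2359", "-11ek", "99e9"]
found = {}
for s in samples:
    m = re.search(number, s)
    found[s] = m.group(0) if m else None
```

The two rejected samples fail on `\b`: in `e45` there is no word boundary between `e` and `4`, and in `-11ek` there is none between the digits and the trailing letters.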
Examples
\b[1-9][0-9]{3,5}\b
◦ 23nb2345: no match
◦ 1230: match
◦ 99222: match
◦ 999999992: no match
(18|19)[0-9][0-9][-](0[1-9]|1[012])[-](0[1-9]|[12][0-9]|3[01])
◦ 1999-07-20: match
◦ 1898.07-21: no match
◦ 12-06-2009: no match
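The two patterns can be checked against the sample strings as follows; Python's `re` module is an assumed stand-in for grep, and the variable names are illustrative.

```python
import re

# Whole words made of 4 to 6 digits, the first of which is not 0.
four_to_six_digits = r"\b[1-9][0-9]{3,5}\b"
number_samples = ["23nb2345", "1230", "99222", "999999992"]
number_hits = [s for s in number_samples if re.search(four_to_six_digits, s)]

# Dates of the form YYYY-MM-DD with a year from 1800 to 1999.
date = r"(18|19)[0-9][0-9][-](0[1-9]|1[012])[-](0[1-9]|[12][0-9]|3[01])"
date_samples = ["1999-07-20", "1898.07-21", "12-06-2009"]
date_hits = [s for s in date_samples if re.search(date, s)]

# The pattern checks the shape of the date only, not the calendar.
accepts_feb_31 = re.search(date, "1999-02-31") is not None
```

The last line shows a limit of the approach: the day range is checked independently of the month, so a string such as 1999-02-31 is accepted.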
Examples
\b[3-5]\b|[3-4][.][0-9]+
◦ 3.56: match
◦ 78: no match
◦ 1.2: no match
◦ 4: match
◦ 4.678: match
^[\_]*([a-z0-9]+(\.|\_*)?)+@([a-z][a-z0-9\-]+(\.|\-*\.))+[a-z]{2,6}$
◦ [email protected]
◦ marko.markovic@: no match
◦ milan@etfbl.: no match
◦ [email protected]
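A sketch of both patterns in Python's `re` module. Two assumptions are worth stating. First, Python tries alternatives left to right, while POSIX tools such as grep report the longest match at the leftmost position, so the matched text can differ even when the match/no-match outcome agrees. Second, the address `marko.markovic@example.com` is a made-up sample added here, because the complete addresses on the slide are not legible in this copy.

```python
import re

grade = r"\b[3-5]\b|[3-4][.][0-9]+"
samples = ["3.56", "78", "1.2", "4", "4.678"]
grade_hits = [s for s in samples if re.search(grade, s)]

# Ordered alternation: the first alternative already succeeds on "3",
# so Python stops there instead of extending to "3.56".
python_match = re.search(grade, "3.56").group(0)

email = (r"^[\_]*([a-z0-9]+(\.|\_*)?)+@"
         r"([a-z][a-z0-9\-]+(\.|\-*\.))+[a-z]{2,6}$")
email_samples = [
    "marko.markovic@example.com",  # hypothetical complete address
    "marko.markovic@",             # from the slide: domain missing
    "milan@etfbl.",                # from the slide: top-level domain missing
]
email_hits = [s for s in email_samples if re.search(email, s)]
```

Putting the longer alternative first, as in `[3-4][.][0-9]+|\b[3-5]\b`, would make Python return the full `3.56` as well.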
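Examples
sed -r "s/[0-9]/cifra/" (cifra means "digit"; only the first match on each line is replaced)
◦ 1 → cifra
◦ 1 2 3 → cifra 2 3
◦ 1 2 → cifra 2
sed -r "s/[0-9]/cifra/g" (the g flag replaces every match)
◦ 1 → cifra
◦ 1 2 3 → cifra cifra cifra
◦ 1 2 → cifra cifra
sed -r "s/\b(.?)(.?)(.?)\b/\3\2\1/g" (reverses every word of at most three letters)
◦ jedan dva tri cetiri → jedan avd irt cetiri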
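The three substitutions can be reproduced with `re.sub` from Python's `re` module, taken here as an approximation of `sed -r`. `count=1` plays the role of the command without flags, and the default (replace all) plays the role of the `g` flag. The treatment of empty matches assumes Python 3.7 or later.

```python
import re

inputs = ["1", "1 2 3", "1 2"]

# s/[0-9]/cifra/ : only the first digit on each line is replaced.
first_only = [re.sub(r"[0-9]", "cifra", line, count=1) for line in inputs]

# s/[0-9]/cifra/g : every digit is replaced.
every_digit = [re.sub(r"[0-9]", "cifra", line) for line in inputs]

# s/\b(.?)(.?)(.?)\b/\3\2\1/g : up to three characters between two word
# boundaries are written back in reverse order.
reversed_short = re.sub(r"\b(.?)(.?)(.?)\b", r"\3\2\1", "jedan dva tri cetiri")
```

Longer words stay unchanged because no run of at most three characters inside them is delimited by word boundaries on both sides; the single spaces do match, but reversing one character leaves it as it was.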