SPARCS 04 Seminar: Regular Expression

By 박강현 (lightspd)

2004/12/05

INDEX

What’s Regular Expression?

What can you do with Regular Expression?

The Usage

Regular Expression with Perl

Any Drawbacks?

References

1. What’s Regular Expression?

Sets of symbols and syntactic elements used to match patterns of text.

A grammar that has a concise and fairly standardized format.

Originally developed by a mathematician named Stephen Kleene.

2. What can you do with Regular Expression?

Fast, flexible searching and search-and-replace

Testing for a certain condition in a text file or data stream

3. The Usage (1) Before Starting

What do I need to know before starting? Nothing.

Regular Expression doesn't constitute a "language" in the way C or Perl does.

Instead, Regular Expression constitutes a syntax which many languages and tools support.

ex) grep (global regular expression print), sed, vi, Emacs, UltraEdit-32, C, Python, etc.

3. The Usage (2) Simple search pattern with egrep

A standard way to find a word in a file is to use a Regular Expression with a tool like egrep (extended grep).

The example asks egrep to find instances of the pattern "SPARCS" in the file "clublist" and write the results to a file called "result".

ex) egrep "SPARCS" clublist > result
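As a rough Python counterpart to the egrep search described above, the sketch below keeps only the lines that contain the pattern. Python's re module stands in for egrep here; the function name grep_lines and the sample club list are made up for illustration.

```python
import re

def grep_lines(pattern, text):
    """Return the lines of text that contain a match for pattern."""
    regex = re.compile(pattern)
    return [line for line in text.splitlines() if regex.search(line)]

# Hypothetical contents of a file such as "clublist".
clublist = "SPARCS\nSome Other Club\nSPARCS alumni\n"

# Equivalent in spirit to writing the matching lines to "result".
result = grep_lines("SPARCS", clublist)
print("\n".join(result))
```

Like egrep, this matches the pattern anywhere in a line, not only whole lines.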

3. The Usage (3) Simple Search-and-replace with sed

We’ll use sed (stream editor) for this purpose.

The example asks sed to find instances of the pattern "SPARCS" in the file "clublist" and replace each one with the full name the acronym stands for.
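The same search-and-replace can be sketched in Python with re.sub, which plays the role of a sed substitution with the g flag. The replacement string below is only a placeholder, since the full name is not spelled out here, and expand_acronym is a hypothetical helper name.

```python
import re

# Placeholder replacement; the real expansion of the acronym is not given here.
FULL_NAME = "<full club name>"

def expand_acronym(text, full_name=FULL_NAME):
    """Replace every occurrence of SPARCS, like a sed s/.../.../g command."""
    return re.sub(r"SPARCS", full_name, text)

line = "SPARCS meets on Sunday. Join SPARCS!"
print(expand_acronym(line))
```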

3. The Usage (4) A bit more complex patterns

Metacharacters: characters that match in a generalized fashion.

Single Character Metacharacters: metacharacters that match single characters. ex) . […] [^…]

Quantifiers: metacharacters which specify the number of times a particular character should match. ex) ? * + {num} {min,max}

Anchors: characters specifying the position at which a particular pattern occurs. ex) ^ $ \< \> \b \B

3. The Usage (4) A bit more complex patterns

Single Character Metacharacters

.  Matches any one character.

[…]  Matches any single character in the square brackets.

[^…]  Matches any single character except those which are in the square brackets.
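A small Python sketch of the three single character metacharacters, using a made-up word list. The helper name matching is hypothetical, and re.fullmatch is used so that each word must match the pattern as a whole.

```python
import re

words = ["cat", "cot", "cut", "c9t", "ct"]

def matching(pattern, candidates):
    """Return the candidates that the pattern matches in full."""
    return [w for w in candidates if re.fullmatch(pattern, w)]

any_char = matching(r"c.t", words)        # . matches any one character
in_set = matching(r"c[ao]t", words)       # [ao] matches 'a' or 'o'
not_in_set = matching(r"c[^ao]t", words)  # [^ao] matches anything except 'a' or 'o'

print(any_char, in_set, not_in_set)
```

Note that "ct" never matches: each of these metacharacters must consume exactly one character.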

3. The Usage (4) A bit more complex patterns: Single Character Metacharacters, Examples 1 to 3 (example slides; contents not included in the transcript)

3. The Usage (4) A bit more complex patterns

Quantifiers

?  Matches the preceding element 0 or 1 times.

*  Matches the preceding element 0 or more times.

+  Matches the preceding element 1 or more times.

{num}  Matches the preceding element exactly num times.

{min,max}  Matches the preceding element at least min times, but not more than max times.
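The quantifiers can be tried out with a short Python sketch. The helper name full and the sample strings are assumptions chosen for illustration; each check asks whether the whole string matches the pattern.

```python
import re

def full(pattern, s):
    """True if the entire string s matches pattern."""
    return re.fullmatch(pattern, s) is not None

# ? : the preceding 'u' may appear 0 or 1 times
optional = [s for s in ["color", "colour", "colouur"] if full(r"colou?r", s)]

# * : the preceding 'b' may appear 0 or more times
star = [s for s in ["ac", "abc", "abbbc"] if full(r"ab*c", s)]

# + : the preceding 'b' must appear 1 or more times
plus = [s for s in ["ac", "abc", "abbbc"] if full(r"ab+c", s)]

# {num} and {min,max} : explicit repetition counts
exact = [s for s in ["aa", "aaa", "aaaa"] if full(r"a{3}", s)]
ranged = [s for s in ["a", "aa", "aaa", "aaaa"] if full(r"a{2,3}", s)]

print(optional, star, plus, exact, ranged)
```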

3. The Usage (4) A bit more complex patterns: Quantifiers, Examples 1 and 2 (example slides; contents not included in the transcript)

3. The Usage (4) A bit more complex patterns

Anchors

^  Matches at the start of the line.

$  Matches at the end of the line.

\<  Matches at the beginning of a word.

\>  Matches at the end of a word.

\b  Matches at the beginning or the end of a word.

\B  Matches at any position that is not the beginning or the end of a word. Like the other anchors, it matches a position, not a character.
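The anchors can be illustrated in Python as follows. Python's re module does not support \< and \>, so this sketch uses \b for both word edges; the sample strings are made up.

```python
import re

# ^ and $ anchor the match to the start and end of the line.
starts_with_cat = re.search(r"^cat", "cat food") is not None
not_at_start = re.search(r"^cat", "a cat") is not None
ends_with_cat = re.search(r"cat$", "a cat") is not None

# \b matches at a word edge, so only the standalone words are found.
whole_words = re.findall(r"\bcat\b", "cat concat cats cat")

# \B matches where there is NO word edge, so only the "cat" inside "concat".
inside_word = re.findall(r"\Bcat", "cat concat")

print(starts_with_cat, not_at_start, ends_with_cat)
print(len(whole_words), len(inside_word))
```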

3. The Usage (4) A bit more complex patterns: Anchors, Example (example slide; contents not included in the transcript)

3. The Usage (5) Escape Characters

How to look for special characters like asterisks (*), periods (.), backslashes (\) and slashes (/)?

Just put a backslash (\) before that character.

This escape character feature has a definite disadvantage: ugliness!
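A brief Python sketch of why escaping matters, with a made-up sample string. It also shows re.escape, a standard library helper that adds the backslashes automatically; using it here is a suggestion of this sketch, not something the slides prescribe.

```python
import re

text = "price: 3.50 (was 3x50) * tax"

# Unescaped, the period matches any character, so "3x50" is found as well.
loose = re.findall(r"3.50", text)

# Escaped with a backslash, the period matches only a literal period.
strict = re.findall(r"3\.50", text)

# re.escape inserts the needed backslashes for a literal string.
escaped = re.findall(re.escape("3.50"), text)
stars = re.findall(re.escape("*"), text)

print(loose, strict, escaped, stars)
```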

3. The Usage (6) Alternation

Using the "|" symbol to indicate logical OR.

ex1) egrep "apple|kiwi" fruits

ex2) egrep "gr(e|a)y" textfile

ex3) sed "s/^To: \(Spammer1\|Spammer2\)/Deleted/g" mailbox
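The three alternation examples can be approximated in Python as below. The file contents are invented for illustration; note that in Python's syntax the grouping parentheses and the | symbol are written without backslashes, unlike the sed form.

```python
import re

# ex1: lines mentioning either fruit (hypothetical contents of "fruits")
fruits = ["apple pie", "kiwi juice", "banana"]
either = [f for f in fruits if re.search(r"apple|kiwi", f)]

# ex2: both spellings, with the alternation limited to one letter
spellings = re.findall(r"gr(?:e|a)y", "grey gray groy")

# ex3: rewrite header lines addressed to either spammer (hypothetical mailbox)
mailbox = ["To: Spammer1", "To: Friend", "To: Spammer2"]
cleaned = [re.sub(r"^To: (Spammer1|Spammer2)", "Deleted", m) for m in mailbox]

print(either, spellings, cleaned)
```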

3. The Usage (7) Backreferences

Perhaps the most powerful element of the Regular Expression syntax.

Allows you to load the results of a matched pattern into a buffer and then reuse it later in the expression.
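Two common uses of backreferences, sketched in Python with made-up input strings. In the first, \1 reuses the captured text inside the pattern itself; in the second, \1 and \2 reuse the captured text in the replacement.

```python
import re

# Reuse inside the pattern: find a word that is immediately repeated.
doubled = re.search(r"\b(\w+) \1\b", "this is is a test")
repeated_word = doubled.group(1) if doubled else None

# Reuse in the replacement: swap "Last, First" into "First Last".
swapped = re.sub(r"(\w+), (\w+)", r"\2 \1", "Kleene, Stephen")

print(repeated_word, swapped)
```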

3. The Usage (7) Backreferences, Examples 1 and 2 (example slides; contents not included in the transcript)

4. Regular Expression with Perl

Perl (Practical Extraction and Report Language)? A flexible and sophisticated language.

Perl is especially good for text manipulation and has some of the most extensive regular expression support of any language.

Perl also allows you to match on a string, save it into a buffer, evaluate the contents of that buffer, and perform a computation upon it.

Regular Expression with Perl Example 1 (example slide; contents not included in the transcript)

Regular Expression with Perl Example 2

(1) Match on "page n".

(2) Save the contents of n into a buffer as $1.

(3) Use an expression like "$newnumber = $1 + 1" to increment the value of the page number by one.
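The three steps can be sketched as follows. This uses Python rather than Perl, so m.group(1) plays the role of Perl's $1; the function name bump_pages and the sample sentence are assumptions for illustration.

```python
import re

def bump_pages(text):
    """Increment every number that follows the word 'page' by one."""
    def replace(m):
        # m.group(1) holds the captured digits, like $1 in Perl.
        new_number = int(m.group(1)) + 1
        return "page " + str(new_number)
    return re.sub(r"page (\d+)", replace, text)

print(bump_pages("see page 9 and page 41"))
```

Passing a function as the replacement is what allows a computation on the captured text, which a plain replacement string cannot do.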

5. Any Drawbacks?

Poor readability: Regular Expression tends to be easier to write than it is to read.

ex) a simple sed routine designed to replace a couple of URLs

sed 's/http:\/\/etext\.lib\.virginia\.edu\//http:\/\/www\.etext\.virginia\.edu/g'

Ordinary macros (other than Regular Expression) are superior in that they are easy to use even for beginners.

It’s largely a question of fitting the tool to the job.
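One way to soften the readability problem, sketched in Python: keep the URLs as plain strings and let re.escape add the backslashes. The names OLD_PREFIX, NEW_PREFIX and rewrite_urls are hypothetical, and keeping the trailing slash in the replacement is an assumption about the intent of the sed routine.

```python
import re

OLD_PREFIX = "http://etext.lib.virginia.edu/"
NEW_PREFIX = "http://www.etext.virginia.edu/"  # trailing slash kept (assumption)

# re.escape produces the escaped pattern, so the source stays readable.
OLD_PATTERN = re.compile(re.escape(OLD_PREFIX))

def rewrite_urls(text):
    """Replace every occurrence of the old URL prefix with the new one."""
    return OLD_PATTERN.sub(NEW_PREFIX, text)

print(rewrite_urls("see http://etext.lib.virginia.edu/toc.html"))
```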

6. References

Mastering Regular Expressions in UNIX by Jeffrey E. F. Friedl

Unix in a Nutshell by Arnold Robbins

따라하기 리눅스 V0.1 (Linux Step by Step V0.1) by the editorial department of 리눅스인터내셔널 (Linux International Inc., linux.co.kr)

Naver 지식iN (Naver Knowledge iN)