Regular Expressions - Boundaries and Lookahead/Lookbehind (18~25)
SOidentitiy 2021. 10. 7. 17:32
This post summarizes what I studied from the regular expression topic by 생활코딩.
The patterns (Pages) and explanations in this post are based on http://zvon.org/comp/r/tut-Regexp.html#Pages~Contents .
Regular expression patterns
Page 18
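Page 18 covers \w. The w stands for word. It matches any letter, any digit, and the underscore (_).
So \w is equivalent to [A-Za-z0-9_] for ASCII text. (The range [A-z] is not the same thing, because it also covers the punctuation characters that sit between Z and a in ASCII.)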
Source : A1 B2 c3 d_4 e:5 ffGG77--__--
Case1
Regular Expression | \w |
First match | A1 B2 c3 d_4 e:5 ffGG77--__-- |
All match | A1 B2 c3 d_4 e:5 ffGG77--__-- |
Whitespace and symbols other than _ are not word characters, so they are not matched.
Case2
Regular Expression | \w* |
First match | A1 B2 c3 d_4 e:5 ffGG77--__-- |
All match | A1 B2 c3 d_4 e:5 ffGG77--__-- |
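Because * means zero or more and is greedy, \w* takes as many word characters as it can, so the first match is A1. (It can also match the empty string at positions where no word character follows.)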
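Case3
Regular Expression | [a-z]\w* |
First match | A1 B2 c3 d_4 e:5 ffGG77--__-- |
All match | A1 B2 c3 d_4 e:5 ffGG77--__-- |
[a-z]\w* matches a lowercase letter followed by zero or more word characters. A1 and B2 start with an uppercase letter, so they are not matched,
while ffGG77 starts with a lowercase letter, so it is matched even though uppercase letters follow.
Case4
Regular Expression | \w{5} |
First match | A1 B2 c3 d_4 e:5 ffGG77--__-- |
All match | A1 B2 c3 d_4 e:5 ffGG77--__-- |
\w{5} matches exactly five consecutive word characters. A1, B2, c3, d_4, and e:5 contain no run of five word characters, so they are not matched; only the first five characters of ffGG77 (ffGG7) are.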
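The Page 18 patterns can be checked with a short script. This is a minimal sketch using Python's standard re module, which is my choice and not something the original tutorial uses. The re.ASCII flag is passed on the assumption that \w should mean exactly [A-Za-z0-9_]; without it, Python's \w also matches non-ASCII letters and digits.

```python
import re

source = "A1 B2 c3 d_4 e:5 ffGG77--__--"

# \w matches one word character (letter, digit, or underscore) per match.
single = re.findall(r"\w", source, flags=re.ASCII)

# \w* is greedy, so the first match is the whole first word.
first_star = re.search(r"\w*", source, flags=re.ASCII).group()

# [a-z]\w*: a lowercase letter, then zero or more word characters.
lower_start = re.findall(r"[a-z]\w*", source, flags=re.ASCII)

# \w{5}: exactly five consecutive word characters.
five = re.findall(r"\w{5}", source, flags=re.ASCII)

# [A-z] is wider than the letters: it also spans [ \ ] ^ _ ` in ASCII.
wide_range = re.findall(r"[A-z]", "^_`")

print("".join(single))  # A1B2c3d_4e5ffGG77__
print(first_star)       # A1
print(lower_start)      # ['c3', 'd_4', 'e', 'ffGG77']
print(five)             # ['ffGG7']
print(wide_range)       # ['^', '_', '`']
```

The last line shows why [A-Za-z0-9_] is the safer way to spell out \w by hand.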
Page 19
Page 19 covers \W. \W is the exact opposite of \w.
So \W is equivalent to [^A-Za-z0-9_] for ASCII text.
Source : AS _34:AS11.23 @#$ %12^*
Case1
Regular Expression | \W |
First match | AS _34:AS11.23 @#$ %12^* |
All match | AS _34:AS11.23 @#$ %12^* |
\W matches whitespace and every symbol other than _.
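As a quick check, here is a small sketch in Python's re module (an assumption on my part; the tutorial itself is engine-neutral). It compares \W with the hand-written negated class on the Page 19 source, again with re.ASCII so that \W is the complement of [A-Za-z0-9_].

```python
import re

source = "AS _34:AS11.23 @#$ %12^*"

# \W matches every character that is not a letter, digit, or underscore.
non_word = re.findall(r"\W", source, flags=re.ASCII)

# The explicit negated class gives the same result for ASCII text.
negated_class = re.findall(r"[^A-Za-z0-9_]", source)

# The first non-word character is the space after "AS".
first_index = re.search(r"\W", source, flags=re.ASCII).start()

print(non_word)
print(non_word == negated_class)  # True
print(first_index)                # 2
```

Note that the underscore in _34 never shows up in the result, while the spaces do.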
Page 20
Page 20 covers \s and \S. \s matches a whitespace character (space, tab, newline, and so on). \S matches any character that is not whitespace.
Source : Ere iron was found or tree was hewn, When young was mountain under moon; Ere ring was made, or wrought was woe, It walked the forests long ago.
Case1
Regular Expression | \s |
First match | Ere iron was found or tree was hewn, When young was mountain under moon; Ere ring was made, or wrought was woe, It walked the forests long ago. |
All match | Ere iron was found or tree was hewn, When young was mountain under moon; Ere ring was made, or wrought was woe, It walked the forests long ago. |
Case2
Regular Expression | \S |
First match | Ere iron was found or tree was hewn, When young was mountain under moon; Ere ring was made, or wrought was woe, It walked the forests long ago. |
All match | Ere iron was found or tree was hewn, When young was mountain under moon; Ere ring was made, or wrought was woe, It walked the forests long ago. |
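The following sketch, assuming Python's re module, applies \s and \S to the first line of the source. The pattern \S+ and the extra string with a tab and a newline are my own additions for illustration.

```python
import re

line = "Ere iron was found or tree was hewn,"

# \s: every single whitespace character in the line.
spaces = re.findall(r"\s", line)

# \S+: runs of non-whitespace characters, i.e. the space-separated tokens.
tokens = re.findall(r"\S+", line)

# \s is not limited to the space character: tab and newline count too.
mixed = re.findall(r"\s", "a\tb\nc d")

print(len(spaces))  # 7
print(tokens)
print(mixed)        # ['\t', '\n', ' ']
```

Punctuation stays attached to its token (hewn,) because a comma is not whitespace.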
Page 21
Page 21 covers \d and \D. The d stands for digit.
\d is equivalent to [0-9]. \D matches any character that is not a digit.
Source : Page 123; published: 1234 id=12#24@112
Case1
Regular Expression | \d |
First match | Page 123; published: 1234 id=12#24@112 |
All match | Page 123; published: 1234 id=12#24@112 |
Case2
Regular Expression | \D |
First match | Page 123; published: 1234 id=12#24@112 |
All match | Page 123; published: 1234 id=12#24@112 |
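A minimal sketch of \d and \D on the Page 21 source, assuming Python's re module. The \d+ pattern and the re.sub call are illustrative additions that do not appear in the original cases.

```python
import re

source = "Page 123; published: 1234 id=12#24@112"

# \d: one digit per match.
digits = re.findall(r"\d", source)

# \d+: each run of consecutive digits as one match.
numbers = re.findall(r"\d+", source)

# \D: the first non-digit character is the leading "P".
first_non_digit = re.search(r"\D", source).group()

# Removing every \D character leaves only the digits.
digits_only = re.sub(r"\D", "", source)

print(len(digits))      # 14
print(numbers)          # ['123', '1234', '12', '24', '112']
print(first_non_digit)  # P
print(digits_only)      # 12312341224112
```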
Page 22
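Page 22 covers \b. The b stands for (word) boundary. \b itself consumes no character; it matches the position between a word character and a non-word character, including the start or end of the input when a word character sits there.
\b. matches the character right after a boundary, such as the first character of each word, and .\b matches the character right before a boundary, such as the last character of each word. Because both ends of a word are boundaries, the whitespace or punctuation touching a word is matched as well.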
Source : Ere iron was found or tree was hewn, When young was mountain under moon; Ere ring was made, or wrought was woe, It walked the forests long ago.
Case1
Regular Expression | \b. |
First match | Ere iron was found or tree was hewn, When young was mountain under moon; Ere ring was made, or wrought was woe, It walked the forests long ago. |
All match | Ere iron was found or tree was hewn, When young was mountain under moon; Ere ring was made, or wrought was woe, It walked the forests long ago. |
Case2
Regular Expression | .\b |
First match | Ere iron was found or tree was hewn, When young was mountain under moon; Ere ring was made, or wrought was woe, It walked the forests long ago. |
All match | Ere iron was found or tree was hewn, When young was mountain under moon; Ere ring was made, or wrought was woe, It walked the forests long ago. |
Case3
Regular Expression | \bcat |
First match | cat concat |
All match | cat concat |
Case4
Regular Expression | cat\b |
First match | cat concat |
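All match | cat concat |
Cases 3 and 4 use the source cat concat. \bcat matches only the standalone cat, because the cat inside concat is preceded by a word character. cat\b matches both, because each cat is followed by the end of a word.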
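The boundary cases can be verified with a short sketch in Python's re module (my choice of engine). The helper starts is a hypothetical name introduced here to list match positions, and the two-word phrase is a shortened version of the source.

```python
import re

def starts(pattern, text):
    """Return the start index of every match of pattern in text."""
    return [m.start() for m in re.finditer(pattern, text)]

text = "cat concat"
at_word_start = starts(r"\bcat", text)   # cat at the start of a word
at_word_end = starts(r"cat\b", text)     # cat at the end of a word
whole_word = starts(r"\bcat\b", text)    # cat as a whole word

phrase = "Ere iron"
after_boundary = re.findall(r"\b.", phrase)   # character right after a boundary
before_boundary = re.findall(r".\b", phrase)  # character right before a boundary

print(at_word_start)    # [0]
print(at_word_end)      # [0, 7]
print(whole_word)       # [0]
print(after_boundary)   # ['E', ' ', 'i']
print(before_boundary)  # ['e', ' ', 'n']
```

The space appears in both of the last two results because it sits between the end of Ere and the start of iron, and each of those positions is a boundary.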
Page 23
Page 23 covers \B. \B is the opposite of \b: it matches every position that is not a word boundary.
Source : Ere iron was found or tree was hewn, When young was mountain under moon; Ere ring was made, or wrought was woe, It walked the forests long ago.
Case1
Regular Expression | \B. |
First match | Ere iron was found or tree was hewn, When young was mountain under moon; Ere ring was made, or wrought was woe, It walked the forests long ago. |
All match | Ere iron was found or tree was hewn, When young was mountain under moon; Ere ring was made, or wrought was woe, It walked the forests long ago. |
Case2
Regular Expression | .\B |
First match | Ere iron was found or tree was hewn, When young was mountain under moon; Ere ring was made, or wrought was woe, It walked the forests long ago. |
All match | Ere iron was found or tree was hewn, When young was mountain under moon; Ere ring was made, or wrought was woe, It walked the forests long ago. |
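A small sketch of \B, assuming Python's re module and a shortened two-word version of the source. The cat concat string is an extra illustration and is not one of the Page 23 cases.

```python
import re

phrase = "Ere iron"

# \B.: a character preceded by a non-boundary, i.e. not the first
# character of a word and not the character right after a word.
inside_after = re.findall(r"\B.", phrase)

# .\B: a character followed by a non-boundary, i.e. not the last
# character of a word and not the character right before a word.
inside_before = re.findall(r".\B", phrase)

# \Bcat: cat that does NOT start a word, so only the one inside concat.
inner_cat = [m.start() for m in re.finditer(r"\Bcat", "cat concat")]

print(inside_after)   # ['r', 'e', 'r', 'o', 'n']
print(inside_before)  # ['E', 'r', 'i', 'r', 'o']
print(inner_cat)      # [7]
```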
Page 24
Page 24 covers \A and \Z. \A matches at the start of the input and \Z matches at the end of the input,
so \A is similar to ^, and \Z is similar to $.
Source : Ere iron was found or tree was hewn, When young was mountain under moon; Ere ring was made, or wrought was woe, It walked the forests long ago.
Case1
Regular Expression | \A... |
First match | Ere iron was found or tree was hewn, When young was mountain under moon; Ere ring was made, or wrought was woe, It walked the forests long ago. |
All match | Ere iron was found or tree was hewn, When young was mountain under moon; Ere ring was made, or wrought was woe, It walked the forests long ago. |
Case2
Regular Expression | ...\Z |
First match | Ere iron was found or tree was hewn, When young was mountain under moon; Ere ring was made, or wrought was woe, It walked the forests long ago. |
All match | Ere iron was found or tree was hewn, When young was mountain under moon; Ere ring was made, or wrought was woe, It walked the forests long ago. |
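Now let's look at how \A differs from ^, and \Z from $.
\AEre matches only the Ere at the very beginning of the whole input, even when the multiline option is enabled.
With multiline enabled, however, ^Ere matches not only the Ere on the first line but also the one on the third line.
Likewise, test\Z matches only the last test, at the very end of the input.
With multiline enabled, test$ matches every test that ends a line.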
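The difference can be reproduced with a minimal sketch in Python's re module, where the multiline option is re.MULTILINE. The three-line text below is a hypothetical input made up for this example; it is not the tutorial's source. In Python, \Z matches only at the very end of the string; some other engines treat \Z slightly differently around a trailing newline.

```python
import re

# Hypothetical input: "Ere" starts lines 1 and 3, "test" ends lines 1 and 3.
text = "Ere iron test\nWhen young\nEre ring test"

# ^ and $ match at every line start/end once MULTILINE is on.
caret = re.findall(r"^Ere", text, flags=re.MULTILINE)
dollar = re.findall(r"test$", text, flags=re.MULTILINE)

# \A and \Z ignore MULTILINE: only the start/end of the whole input.
anchor_a = re.findall(r"\AEre", text, flags=re.MULTILINE)
anchor_z = re.findall(r"test\Z", text, flags=re.MULTILINE)

# Without MULTILINE, ^ behaves like \A.
caret_single = re.findall(r"^Ere", text)

print(len(caret), len(anchor_a))  # 2 1
print(len(dollar), len(anchor_z)) # 2 1
print(len(caret_single))          # 1
```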
Page 25
Page 25 works through examples in more detail.
Source : AAAX---aaax---111
Case1
Regular Expression | \w+ |
First match | AAAX---aaax---111 |
All match | AAAX---aaax---111 |
\w is a word character and + means one or more, so \w+ matches each run of word characters: AAAX, aaax, and 111.
Case2
Regular Expression | \w+(?=X) |
First match | AAAX---aaax---111 |
All match | AAAX---aaax---111 |
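The form (?=<pattern>) is a positive lookahead: the match succeeds only where <pattern> follows immediately, but <pattern> itself is not included in the match.
In AAAX---aaax---111, \w+(?=X) matches AAA, the characters up to X, and X itself is left out. aaax is not matched because its x is lowercase.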
Case3
Regular Expression | \w+(?=\w) |
First match | AAAX---aaax---111 |
All match | AAAX---aaax---111 |
Likewise, \w+(?=\w) requires a word character right after the match, so each run of word characters is matched except for its last character: AAA, aaa, and 11.
Additionally, in the same situation, \w+(?=X) matches AAA from AAAX, and it would also match aaa from a following aaaX (with an uppercase X).
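The lookahead cases can be checked with a short sketch, assuming Python's re module. The variant string with an uppercase X in the second run is a hypothetical input added here to illustrate the last remark; it is not the tutorial's source.

```python
import re

source = "AAAX---aaax---111"

# \w+: each full run of word characters.
runs = re.findall(r"\w+", source)

# \w+(?=X): word characters that are immediately followed by an uppercase X.
before_x = re.findall(r"\w+(?=X)", source)

# \w+(?=\w): word characters followed by one more word character,
# so every run loses its last character.
before_word = re.findall(r"\w+(?=\w)", source)

# Hypothetical variant where the second run also ends in an uppercase X.
variant = "AAAX---aaaX---111"
before_x_variant = re.findall(r"\w+(?=X)", variant)

print(runs)              # ['AAAX', 'aaax', '111']
print(before_x)          # ['AAA']
print(before_word)       # ['AAA', 'aaa', '11']
print(before_x_variant)  # ['AAA', 'aaa']
```

In each case the lookahead only checks what comes next; the checked character never becomes part of the returned match.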