정규표현식 톺아보기 with JavaScript and Python

JavaScript에서의 정규표현식 활용

1. JavaScript의 정규표현식 기초

1.1 정규표현식 생성 방법

1.1.1 리터럴 패턴

리터럴 패턴은 슬래시(/)로 감싸서 정규표현식을 생성하는 가장 일반적인 방법입니다. 변수로 할당하지 않고 매서드에서도 바로 사용할 수 있습니다.

const regex1 = /hello/;                        // 기본 패턴
const regex2 = /hello/g;                       // 전역 검색
const emailRegex = /[\w.-]+@[\w.-]+\.\w+/g;   // 이메일 패턴

const regex1 = /hello/;                        // 기본 패턴
const regex2 = /hello/g;                       // 전역 검색
const emailRegex = /[\w.-]+@[\w.-]+\.\w+/g;   // 이메일 패턴

아래와 같이 실습해 볼 수 있습니다. match는 매칭된 결과를 배열로 반환합니다.

const text = 'weniv CEO leehojun, email: paul-lab@naver.com';
// 변수 할당
const pattern = /[\w.-]+@[\w.-]+\.\w+/g;
console.log(text.match(pattern));
 
// 바로 사용
console.log(text.match(/[\w.-]+@[\w.-]+\.\w+/g));

const text = 'weniv CEO leehojun, email: paul-lab@naver.com';
// 변수 할당
const pattern = /[\w.-]+@[\w.-]+\.\w+/g;
console.log(text.match(pattern));
 
// 바로 사용
console.log(text.match(/[\w.-]+@[\w.-]+\.\w+/g));

1.1.2 RegExp 생성자

동적으로 패턴을 생성해야 할 때 사용합니다. 여기서 동적이라 함은 고정된 값이 아니라 변수나 함수의 반환값을 사용하는 경우를 말합니다.

const regex1 = new RegExp('hello');
const regex2 = new RegExp('hello', 'g');
const userPattern = new RegExp(`${username}`, 'g');

const regex1 = new RegExp('hello');
const regex2 = new RegExp('hello', 'g');
const userPattern = new RegExp(`${username}`, 'g');

아래와 같이 실습해볼 수 있습니다.

const text = 'weniv CEO leehojun, email: paul-lab@naver.com'
const username = 'paul-lab';
const pattern = new RegExp(`${username}`, 'g');
console.log(text.match(pattern));

const text = 'weniv CEO leehojun, email: paul-lab@naver.com'
const username = 'paul-lab';
const pattern = new RegExp(`${username}`, 'g');
console.log(text.match(pattern));

1.2 플래그의 이해

1.2.1 주요 플래그

/pattern/g    // g: 전역 검색 (모든 일치 항목 찾기)
/pattern/i    // i: 대소문자 구분 없음
/pattern/m    // m: 다중 행 모드

/pattern/g    // g: 전역 검색 (모든 일치 항목 찾기)
/pattern/i    // i: 대소문자 구분 없음
/pattern/m    // m: 다중 행 모드

1.2.2 플래그 조합

/hello/gi     // 대소문자 구분 없이 전역 검색
/^hello/gm    // 모든 행의 시작에서 hello 검색

/hello/gi     // 대소문자 구분 없이 전역 검색
/^hello/gm    // 모든 행의 시작에서 hello 검색

1.2.3 플래그 활용

const text = 'hello Hello HELLO';
const pattern = /hello/gi;
console.log(text.match(pattern));  // ['hello', 'Hello', 'HELLO']

const text = 'hello Hello HELLO';
const pattern = /hello/gi;
console.log(text.match(pattern));  // ['hello', 'Hello', 'HELLO']

2. JavaScript의 정규표현식 메서드

2.1 RegExp 객체 메서드

2.1.1 test 메서드

패턴이 문자열과 매칭되는지 확인하여 매칭되는 경우 true를 반환합니다.

const text = 'hello world'
const pattern = /hello/gm
 
pattern.test(text) // true

const text = 'hello world'
const pattern = /hello/gm
 
pattern.test(text) // true

2.1.2 exec 메서드

매칭된 결과를 반환합니다. 결과는 매칭된 문자열, 인덱스, 입력 문자열을 포함하는 배열입니다. 매칭된 결과가 없으면 null을 반환합니다. 아래와 같이 사용할 수 있습니다.

const text = "hello world hello";
const pattern = /\w+/g;
let result;
 
while ((result = pattern.exec(text)) !== null) {
    console.log(result[0]); // 'hello', 'world', 'hello' 순서로 출력
}

const text = "hello world hello";
const pattern = /\w+/g;
let result;
 
while ((result = pattern.exec(text)) !== null) {
    console.log(result[0]); // 'hello', 'world', 'hello' 순서로 출력
}

2.2 String 객체의 정규표현식 메서드

2.2.1 match 메서드

match는 매칭된 결과를 배열로 반환합니다. 매칭된 결과가 없으면 null을 반환합니다.

const text = "hello world hello";
text.match(/\w+/g);  // ['hello', 'world', 'hello']
text.match(/Hello/i); // ['hello']

const text = "hello world hello";
text.match(/\w+/g);  // ['hello', 'world', 'hello']
text.match(/Hello/i); // ['hello']

2.2.2 replace 메서드

첫 번째 인자로 전달된 패턴을 두 번째 인자로 전달된 값으로 치환합니다.

const text = "hello world";
text.replace(/o/g, '0');           // 'hell0 w0rld'
text.replace(/(\w+)\s(\w+)/, '$2 $1'); // 'world hello'
text.replace(/(\w+)\s(\w+)/, '$2!!$2!!'); // 'world!!world!!'

const text = "hello world";
text.replace(/o/g, '0');           // 'hell0 w0rld'
text.replace(/(\w+)\s(\w+)/, '$2 $1'); // 'world hello'
text.replace(/(\w+)\s(\w+)/, '$2!!$2!!'); // 'world!!world!!'

2.2.3 split 메서드

문자열을 패턴에 따라 분할하여 배열로 반환합니다.

"hello,world;hi".split(/[,;]/);  // ['hello', 'world', 'hi']

"hello,world;hi".split(/[,;]/);  // ['hello', 'world', 'hi']

3. 실습 문제

3.1 기본 실습

// 실습 1: hello 찾기
const text = "hello Hello HELLO";
 
// 실습 2: URL 도메인 추출
const url = "https://github.com/LiveCoronaDetector/livecod";
 
// 실습 3: 괄호 내용 추출
const text2 = "hello (hello world) hello";

// 실습 1: hello 찾기
const text = "hello Hello HELLO";
 
// 실습 2: URL 도메인 추출
const url = "https://github.com/LiveCoronaDetector/livecod";
 
// 실습 3: 괄호 내용 추출
const text2 = "hello (hello world) hello";

주어진 텍스트에서 모든 "hello"를 찾아 대소문자 구분 없이 출력하세요.
URL 주소에서 도메인 이름만 추출하세요.
괄호 안의 내용만 추출하세요.

3.2 문제 풀이

// 실습 1: hello 찾기
const text = "hello Hello HELLO";
console.log(text.match(/hello/gi));
 
// 실습 2: URL 도메인 추출
const url = "https://github.com/LiveCoronaDetector/livecod"
console.log(url.match(/https?:\/\/[\w.-]+/)[0]);
 
// 실습 3: 괄호 내용 추출
const text2 = "hello (hello world) hello";
console.log(text2.match(/\(.*\)/)[0]);

// 실습 1: hello 찾기
const text = "hello Hello HELLO";
console.log(text.match(/hello/gi));
 
// 실습 2: URL 도메인 추출
const url = "https://github.com/LiveCoronaDetector/livecod"
console.log(url.match(/https?:\/\/[\w.-]+/)[0]);
 
// 실습 3: 괄호 내용 추출
const text2 = "hello (hello world) hello";
console.log(text2.match(/\(.*\)/)[0]);