02/28/2015

Miscellaneous bits of knowledge needed to operate Redis.

Table of Contents

1 The Redis save option and maxmemory-policy
2 bgsave blocks the connection with the slave. In replication, SYNC triggers BGSAVE.
3 Slave sync via diskless

Redis runs in memory, so memory management matters a great deal. Start by tuning Linux memory management to suit Redis.

The premise here is that Redis runs entirely in physical memory, without using swap.

vm.overcommit_memory=2
vm.overcommit_ratio=99
vm.swappiness=0

echo never > /sys/kernel/mm/transparent_hugepage/enabled

It is best to set virtual memory overcommit to 2. With a value of 2, Linux calculates the amount of memory it can commit as follows.

swap + (physical memory * overcommit_ratio)

For example, with 32GB of physical memory and an overcommit_ratio of 50%, only 16GB of physical memory counts toward the limit. Setting overcommit_ratio to 99% therefore makes nearly all of the physical memory available.
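The arithmetic above can be checked with a few lines of Python. This is only a sketch of the formula as stated in this post; the function name and the choice of GB as the unit are assumptions made for illustration, and overcommit_ratio is treated as a percentage.

```python
def commit_limit_gb(physical_gb, swap_gb, overcommit_ratio_percent):
    """Commit limit under overcommit mode 2, following the formula above.

    overcommit_ratio is given as a percentage, so it is divided by 100.
    """
    return swap_gb + physical_gb * overcommit_ratio_percent / 100.0


# The 32GB example from the text, assuming no swap at all.
print(commit_limit_gb(32, 0, 50))
print(commit_limit_gb(32, 0, 99))
```

With swap set to 0, the limit depends only on the ratio, which is why the ratio is pushed close to 100 here.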

swappiness takes a value between 0 and 100. Near 0, the kernel avoids swapping for as long as it can, even under memory pressure; near 100, it swaps readily. If you want Redis to use only physical memory, set swappiness to 0.

The Redis save option and maxmemory-policy

Redis has several options for persisting the contents of memory to disk. One of them is the save option, which looks like this.

save 300 10
save 60 10000
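Each save line reads as "seconds, then number of changes": a snapshot is due once at least that many keys have changed and more than that many seconds have passed since the last save. The following is a rough Python sketch of that check, not the Redis source; the function and variable names are made up for illustration.

```python
# (seconds, changes) pairs taken from the two save lines above.
SAVE_POINTS = [(300, 10), (60, 10000)]


def should_snapshot(seconds_since_last_save, changes_since_last_save,
                    save_points=SAVE_POINTS):
    """Return True if any save point is satisfied."""
    return any(
        changes_since_last_save >= changes
        and seconds_since_last_save > seconds
        for seconds, changes in save_points
    )


print(should_snapshot(301, 10))     # enough time and enough changes
print(should_snapshot(61, 9999))    # too few changes for either line
```

A workload that never reaches either threshold never triggers a snapshot through these rules.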

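If either of the two conditions above is met, Redis writes the contents of memory to disk. If neither condition is ever met while Redis is running, nothing is written to disk at all, and once maxmemory is reached with maxmemory-policy set to volatile-lru, you run into the following situation.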

Traceback (most recent call last):
  File "/usr/lib64/python2.7/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/usr/lib64/python2.7/multiprocessing/process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "test.py", line 22, in g
    r_server.set(key, id_generator(random.randrange(128,1024), "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789abcdefghijklmnopqrstuvwxyz!@#$%^&*()"))
  File "build/bdist.linux-x86_64/egg/redis/client.py", line 1055, in set
    return self.execute_command('SET', *pieces)
  File "build/bdist.linux-x86_64/egg/redis/client.py", line 565, in execute_command
    return self.parse_response(connection, command_name, **options)
  File "build/bdist.linux-x86_64/egg/redis/client.py", line 577, in parse_response
    response = connection.read_response()
  File "build/bdist.linux-x86_64/egg/redis/connection.py", line 574, in read_response
    raise response
ResponseError: OOM command not allowed when used memory > 'maxmemory'.

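The error above is the result of generating random key-value pairs and calling set in an endless loop. maxmemory was 512MB, but no key ever expired, so memory stayed full, and the error occurred once there was no memory left to use.

This is a very serious problem from the application's point of view as well. Redis reports 'evicted_keys', which is usually assumed to be what a memory shortage produces. So it is tempting to think that when memory runs short, only evicted_keys goes up and the application never sees an exception.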

evicted_keys:112177

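evicted_keys did increase, yet the application can still hit an exception.

So what happens if maxmemory-policy is set to allkeys-lru? In short, the application is less likely to raise an exception. It can still happen, but most likely it will not. The application appears to work fine, but Redis is evicting keys. So if you see evicted_keys growing, it is a good idea to raise maxmemory.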
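The difference between the two policies can be illustrated with a toy model. This is a sketch under loud assumptions, not how Redis is implemented: capacity is counted in keys rather than bytes, eviction picks the exact oldest key instead of sampling, and ToyCache, OutOfMemory and fill are hypothetical names.

```python
from collections import OrderedDict


class OutOfMemory(Exception):
    """Stands in for the OOM ResponseError shown above."""


class ToyCache:
    """Toy model: capacity is counted in keys, not bytes (an assumption)."""

    def __init__(self, max_keys, policy):
        self.max_keys = max_keys
        self.policy = policy          # "volatile-lru" or "allkeys-lru"
        self.data = OrderedDict()     # least recently written key first
        self.keys_with_ttl = set()
        self.evicted_keys = 0

    def _evict_one(self):
        # volatile-lru may only pick keys that have an expire set.
        victim = next(
            (k for k in self.data
             if self.policy == "allkeys-lru" or k in self.keys_with_ttl),
            None,
        )
        if victim is None:
            return False
        del self.data[victim]
        self.keys_with_ttl.discard(victim)
        self.evicted_keys += 1
        return True

    def set(self, key, value, ttl=None):
        if key in self.data:
            del self.data[key]
            self.keys_with_ttl.discard(key)
        while len(self.data) >= self.max_keys:
            if not self._evict_one():
                raise OutOfMemory("nothing evictable under " + self.policy)
        self.data[key] = value
        if ttl is not None:
            self.keys_with_ttl.add(key)


def fill(policy, ttl_every=None, writes=10, max_keys=4):
    """Write `writes` keys; return (evicted_keys, hit_oom)."""
    cache = ToyCache(max_keys, policy)
    try:
        for i in range(writes):
            has_ttl = ttl_every is not None and i % ttl_every == 0
            cache.set("key%d" % i, "value", ttl=60 if has_ttl else None)
    except OutOfMemory:
        return cache.evicted_keys, True
    return cache.evicted_keys, False


print(fill("volatile-lru"))               # no key has a TTL
print(fill("volatile-lru", ttl_every=2))  # every second key has a TTL
print(fill("allkeys-lru"))
```

In this toy run the second case evicts a few keys and still ends in an error once only keys without a TTL remain, which mirrors the point that evicted_keys can grow while the application still sees exceptions.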

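bgsave blocks the connection with the slave. In replication, SYNC triggers BGSAVE.

First, a distinction needs to be made. In a Master<->Slave replication state, the data itself is not transferred. What this means is that the commands executed on the Master are passed to the Slave as they are. If the command “set myhello 'Hello'” is run on the Master, the Slave receives that same command, not the resulting data.

This may be a matter of terminology, but on SYNC an rdb file is created on disk before any data is sent to the slave. The following can be seen in replication.c.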

/* Start a BGSAVE for replication goals, which is, selecting the disk or
 * socket target depending on the configuration, and making sure that
 * the script cache is flushed before to start.
 *
 * Returns REDIS_OK on success or REDIS_ERR otherwise. */
int startBgsaveForReplication(void) {
    int retval;

    redisLog(REDIS_NOTICE,"Starting BGSAVE for SYNC with target: %s",
        server.repl_diskless_sync ? "slaves sockets" : "disk");

    if (server.repl_diskless_sync)
        retval = rdbSaveToSlavesSockets();
    else
        retval = rdbSaveBackground(server.rdb_filename);  // fork a background process and dump to rdb_filename

    /* Flush the script cache, since we need that slave differences are
     * accumulated without requiring slaves to match our cached scripts. */
    if (retval == REDIS_OK) replicationScriptCacheFlush();
    return retval;
}

The rdb file is saved to rdb_filename by a forked background process. The problem is that connections are blocked while this is going on.

int rdbSaveBackground(char *filename) {
    pid_t childpid;
    long long start;

    if (server.rdb_child_pid != -1) return REDIS_ERR;

    server.dirty_before_bgsave = server.dirty;
    server.lastbgsave_try = time(NULL);

    start = ustime();
    if ((childpid = fork()) == 0) {
        int retval;

        /* Child */
        closeListeningSockets(0); // closes the listening sockets in the forked child
        redisSetProcTitle("redis-rdb-bgsave");
        retval = rdbSave(filename);

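The closeListeningSockets function closes the TCP/IP listening sockets.

As a result, slave connections are blocked while BGSAVE is running, and a Full Sync occurs once the connection is re-established. To summarize: while replication is already established, no rdb dump is taken for Sync. When a new replication link is set up, however, the Master takes an rdb dump first and then sends the data. In fact, when a new replication link is established, the file dump appears as follows.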

[root@localhost redis]# ls -lh data/
total 512M
-rw-r--r--. 1 root root   18 Mar  1 15:13 dump.rdb
-rw-r--r--. 1 root root 289M Mar  1 15:44 temp-5594.rdb
[root@localhost redis]# ls -lh data/
total 289M
-rw-r--r--. 1 root root 289M Mar  1 15:44 dump.rdb

The temporary file temp-5594.rdb is created first and then renamed to rdb_filename. In a replication setup, this rdb dump happens only on SYNC.
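The write-to-a-temp-file-then-rename pattern seen in the listing can be sketched in Python. This illustrates the general technique only and is not the Redis implementation; save_atomically is a hypothetical helper, and the temp file name simply imitates the temp-<pid>.rdb name shown above.

```python
import os
import tempfile


def save_atomically(payload, final_path):
    """Write payload to a temp file in the same directory, then rename it."""
    directory = os.path.dirname(os.path.abspath(final_path))
    temp_path = os.path.join(directory, "temp-%d.rdb" % os.getpid())
    with open(temp_path, "wb") as f:
        f.write(payload)
        f.flush()
        os.fsync(f.fileno())          # make sure the bytes reach the disk
    os.replace(temp_path, final_path)  # readers see the old or the new file


with tempfile.TemporaryDirectory() as data_dir:
    target = os.path.join(data_dir, "dump.rdb")
    save_atomically(b"old snapshot", target)
    save_atomically(b"new snapshot", target)
    print(os.listdir(data_dir))
```

Because the rename happens only after the write has finished, a reader never observes a half-written dump.rdb, which matches the two directory listings above.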

Redis, however, keeps replication data in a backlog sized by repl-backlog-size so that a Full Sync does not occur when a slave is disconnected briefly. Note that this size refers to memory; the data is not stored on disk.

void createReplicationBacklog(void) {
    redisAssert(server.repl_backlog == NULL);
    server.repl_backlog = zmalloc(server.repl_backlog_size);

Another point worth noting: if repl-backlog-size is too small and you change it at runtime, the existing data is discarded entirely and a new backlog of the new size is allocated.

void resizeReplicationBacklog(long long newsize) {
    if (newsize < REDIS_REPL_BACKLOG_MIN_SIZE)
        newsize = REDIS_REPL_BACKLOG_MIN_SIZE;
    if (server.repl_backlog_size == newsize) return;

    server.repl_backlog_size = newsize;
    if (server.repl_backlog != NULL) {
        /* What we actually do is to flush the old buffer and realloc a new
         * empty one. It will refill with new data incrementally.
         * The reason is that copying a few gigabytes adds latency and even
         * worse often we need to alloc additional space before freeing the
         * old buffer. */
        zfree(server.repl_backlog);  // free the old buffer
        server.repl_backlog = zmalloc(server.repl_backlog_size);
        server.repl_backlog_histlen = 0;
        server.repl_backlog_idx = 0;
        /* Next byte we have is... the next since the buffer is empty. */
        server.repl_backlog_off = server.master_repl_offset+1;
    }
}
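The effect of a resize on a reconnecting slave can be illustrated with a small ring-buffer model. This is a simplified Python sketch that follows the field names in the C code above; ToyBacklog and can_partial_resync are hypothetical, and the offset check is an assumption about how a partial resync request would be validated.

```python
class ToyBacklog:
    """Fixed-size in-memory ring buffer of replicated bytes."""

    def __init__(self, size):
        self.size = size
        self.buf = bytearray(size)
        self.idx = 0             # next write position in the ring
        self.histlen = 0         # number of valid bytes currently held
        self.master_offset = 0   # total bytes ever fed
        self.first_offset = 1    # replication offset of the oldest byte held

    def feed(self, data):
        for byte in data:
            self.buf[self.idx] = byte
            self.idx = (self.idx + 1) % self.size
        self.master_offset += len(data)
        self.histlen = min(self.size, self.histlen + len(data))
        self.first_offset = self.master_offset - self.histlen + 1

    def resize(self, new_size):
        # As in the C code above: drop the old buffer, start empty.
        if new_size == self.size:
            return
        self.size = new_size
        self.buf = bytearray(new_size)
        self.histlen = 0
        self.idx = 0
        self.first_offset = self.master_offset + 1

    def can_partial_resync(self, wanted_offset):
        """True if the byte the slave wants next is still in the backlog."""
        last_valid = self.first_offset + self.histlen
        return self.first_offset <= wanted_offset <= last_valid


backlog = ToyBacklog(16)
backlog.feed(b"0123456789")
print(backlog.can_partial_resync(5))   # before the resize
backlog.resize(32)
print(backlog.can_partial_resync(5))   # after the resize
```

In this model a slave that was behind before the resize can no longer be served from the backlog afterwards, so it is safer to size the backlog ahead of time than to grow it while slaves are lagging.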

If you run Redis as Master-Slave, you cannot run everything purely in memory. Every time a Slave replicates, a BGSAVE occurs, and this temporarily blocks the connection with the Slave. When the Slave reconnects after being cut off, a Full Sync can occur.

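Slave sync via diskless

This feels like a trick, really. Disk-based sync works by having the Master dump the in-memory data with bgsave and then send that dump to the slave. While bgsave is running, the connection with the slave is temporarily dropped.

With diskless sync, however, that bgsave to disk does not run. Instead of the Master, it is the Slave that receives the memory contents from the Master, writes a dump file, and then loads it into memory. While the Slave is receiving data, a “redis-rdb-to-slaves *:6379” process is created on the Master to handle the transfer. On the Slave, no separate process runs to save the received data to a file.

Diskless sync, then, is the Slave taking a dump of the Master's memory contents. Since tens of gigabytes of memory may have to be transferred, the network needs to be in good condition.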
