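With the rise of NoSQL, Redis has become one of its brightest stars and is getting more and more attention and use. I use Redis heavily at work as both a cache and a key-value DB. But as the data keeps growing, people start to worry: can Redis hold up? The official benchmarks look good and you can always run your own stress tests, but reading the source is also one of the most direct ways to settle a question. For example, one question we need to answer right now: how does Redis delete expired data?
Using an IDE that supports "find references", let's follow the setex command (Set the value and expiration of a key) and see what happens: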
void setexCommand(redisClient *c) {
    c->argv[3] = tryObjectEncoding(c->argv[3]);
    setGenericCommand(c,0,c->argv[1],c->argv[3],c->argv[2]);
}
setGenericCommand is a generic function that implements set, setnx and setex; only the arguments differ.
void setCommand(redisClient *c) {
    c->argv[2] = tryObjectEncoding(c->argv[2]);
    setGenericCommand(c,0,c->argv[1],c->argv[2],NULL);
}
void setnxCommand(redisClient *c) {
    c->argv[2] = tryObjectEncoding(c->argv[2]);
    setGenericCommand(c,1,c->argv[1],c->argv[2],NULL);
}
void setexCommand(redisClient *c) {
    c->argv[3] = tryObjectEncoding(c->argv[3]);
    setGenericCommand(c,0,c->argv[1],c->argv[3],c->argv[2]);
}
Now look at setGenericCommand:
 1 void setGenericCommand(redisClient *c, int nx, robj *key, robj *val, robj *expire) {
 2     long seconds = 0; /* initialized to avoid an harmness warning */
 3
 4     if (expire) {
 5         if (getLongFromObjectOrReply(c, expire, &seconds, NULL) != REDIS_OK)
 6             return;
 7         if (seconds <= 0) {
 8             addReplyError(c,"invalid expire time in SETEX");
 9             return;
10         }
11     }
12
13     if (lookupKeyWrite(c->db,key) != NULL && nx) {
14         addReply(c,shared.czero);
15         return;
16     }
17     setKey(c->db,key,val);
18     server.dirty++;
19     if (expire) setExpire(c->db,key,time(NULL)+seconds);
20     addReply(c, nx ? shared.cone : shared.ok);
21 }
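Line 13 handles the "Set the value of a key, only if the key does not exist" case, line 17 inserts the key, and line 19 sets its timeout. Note that the stored timestamp is already the absolute expiry time (time(NULL)+seconds), not the relative number of seconds. Next, look at the definition of redisDb (that is, c->db):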
typedef struct redisDb {
    dict *dict;           /* The keyspace for this DB */
    dict *expires;        /* Timeout of keys with a timeout set */
    dict *blocking_keys;  /* Keys with clients waiting for data (BLPOP) */
    dict *io_keys;        /* Keys with clients waiting for VM I/O */
    dict *watched_keys;   /* WATCHED keys for MULTI/EXEC CAS */
    int id;
} redisDb;
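Only dict and expires matter here. They store the key-value pairs and their timeouts, respectively. In other words, a key-value pair that has a timeout is stored in dict and also in expires, roughly like this: dict[key] = value, expires[key] = timeout.
If a key-value pair has no timeout, the key simply does not exist in expires. The two remaining functions, setKey and setExpire, just insert data into these two dictionaries, so I won't go into them here.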
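The two-dictionary layout described above can be sketched in a few lines of C. This is only an illustrative toy, not Redis code: the `toy_*` names, the fixed-size arrays with linear search, and the explicit `now` parameter (used instead of calling time(NULL), so the behaviour is easy to check) are all assumptions made for this example.

```c
#include <stdio.h>
#include <string.h>
#include <time.h>

#define TOY_CAP 16

/* Hypothetical stand-ins for db->dict and db->expires. */
typedef struct { int used; char key[32]; char val[32]; } toy_kv;
typedef struct { int used; char key[32]; time_t when; } toy_expire;

typedef struct {
    toy_kv     dict[TOY_CAP];     /* key -> value */
    toy_expire expires[TOY_CAP];  /* key -> absolute deadline */
} toy_db;

static toy_kv *toy_find_kv(toy_db *db, const char *key) {
    for (int i = 0; i < TOY_CAP; i++)
        if (db->dict[i].used && strcmp(db->dict[i].key, key) == 0)
            return &db->dict[i];
    return NULL;
}

static toy_expire *toy_find_expire(toy_db *db, const char *key) {
    for (int i = 0; i < TOY_CAP; i++)
        if (db->expires[i].used && strcmp(db->expires[i].key, key) == 0)
            return &db->expires[i];
    return NULL;
}

/* Store key -> val and drop any old deadline.
 * Returns 0 on success, -1 if the table is full or a string is too long. */
static int toy_set(toy_db *db, const char *key, const char *val) {
    if (strlen(key) >= sizeof db->dict[0].key) return -1;
    if (strlen(val) >= sizeof db->dict[0].val) return -1;

    toy_kv *kv = toy_find_kv(db, key);
    for (int i = 0; kv == NULL && i < TOY_CAP; i++)
        if (!db->dict[i].used) kv = &db->dict[i];
    if (kv == NULL) return -1;

    kv->used = 1;
    snprintf(kv->key, sizeof kv->key, "%s", key);
    snprintf(kv->val, sizeof kv->val, "%s", val);

    toy_expire *ex = toy_find_expire(db, key);
    if (ex) ex->used = 0;   /* a plain set leaves no timeout behind */
    return 0;
}

/* Store key -> val in dict and key -> now + seconds in expires. */
static int toy_setex(toy_db *db, const char *key, const char *val,
                     long seconds, time_t now) {
    if (seconds <= 0) return -1;               /* invalid expire time */
    if (toy_set(db, key, val) != 0) return -1;

    toy_expire *ex = NULL;
    for (int i = 0; ex == NULL && i < TOY_CAP; i++)
        if (!db->expires[i].used) ex = &db->expires[i];
    if (ex == NULL) return -1;

    ex->used = 1;
    snprintf(ex->key, sizeof ex->key, "%s", key);
    ex->when = now + seconds;                  /* absolute deadline */
    return 0;
}

/* Returns the deadline, or -1 when the key has no timeout. */
static time_t toy_get_expire(toy_db *db, const char *key) {
    toy_expire *ex = toy_find_expire(db, key);
    return ex ? ex->when : (time_t)-1;
}
```

With a zero-initialised `toy_db`, calling `toy_setex(&db, "session", "abc", 30, 1000)` leaves "session" in both tables with deadline 1030, while `toy_set(&db, "plain", "x")` only touches `dict`, so `toy_get_expire(&db, "plain")` reports -1.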
So how does Redis delete expired keys?
Looking at the callers of dbDelete, the first one to notice is this function, which deletes a key once it has expired.
 1 int expireIfNeeded(redisDb *db, robj *key) {
 2     time_t when = getExpire(db,key);
 3
 4     if (when < 0) return 0; /* No expire for this key */
 5
 6     /* Don't expire anything while loading. It will be done later. */
 7     if (server.loading) return 0;
 8
 9     /* If we are running in the context of a slave, return ASAP:
10      * the slave key expiration is controlled by the master that will
11      * send us synthesized DEL operations for expired keys.
12      *
13      * Still we try to return the right information to the caller,
14      * that is, 0 if we think the key should be still valid, 1 if
15      * we think the key is expired at this time. */
16     if (server.masterhost != NULL) {
17         return time(NULL) > when;
18     }
19
20     /* Return when this key has not expired */
21     if (time(NULL) <= when) return 0;
22
23     /* Delete the key */
24     server.stat_expiredkeys++;
25     propagateExpire(db,key);
26     return dbDelete(db,key);
27 }
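The IfNeeded suffix means "delete only when deletion is warranted": line 4 does nothing if no timeout is set, line 7 does nothing while "loading", line 16 does not delete on a non-master instance (a slave only reports whether the key looks expired and waits for the DEL sent by the master), and line 21 does nothing if the key has not expired yet. Line 25 propagates the expiry to the slaves and to the append-only file.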
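The order of these checks is easy to get wrong, so here is a minimal sketch of the same decision sequence on toy types. Everything named `toy_*` is hypothetical; `now` is passed in rather than read from time(NULL), and clearing `present` merely stands in for the real propagateExpire plus dbDelete calls.

```c
#include <time.h>

/* Hypothetical stand-ins for the server flags consulted by the check. */
typedef struct {
    int loading;     /* plays the role of server.loading */
    int is_replica;  /* plays the role of server.masterhost != NULL */
} toy_server;

typedef struct {
    time_t when;     /* absolute deadline, -1 when no timeout is set */
    int    present;  /* 1 while the key is still stored */
} toy_key;

/* Returns 1 if the key is considered expired at `now`, 0 otherwise.
 * The key is actually removed only on a master that is not loading. */
static int toy_expire_if_needed(const toy_server *srv, toy_key *k, time_t now) {
    if (k->when < 0) return 0;                  /* no timeout for this key */
    if (srv->loading) return 0;                 /* never expire while loading */
    if (srv->is_replica) return now > k->when;  /* report only, do not delete */
    if (now <= k->when) return 0;               /* not expired yet */

    k->present = 0;  /* stands in for propagateExpire + dbDelete */
    return 1;
}
```

Note the asymmetry this produces: on a replica the function can return 1 while `present` stays 1, because the deletion itself is left to the master.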
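Next, see which functions call expireIfNeeded: lookupKeyRead, lookupKeyWrite, dbRandomKey, existsCommand and keysCommand. As the names suggest, whenever a key is accessed, Redis also checks whether it has expired and deletes it if so, which ensures that a user cannot read an expired key. But if a large number of keys expire and are never accessed again, a lot of memory is wasted. How does Redis deal with that?
Among the callers of dbDelete there is also this function: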
 1 /* Try to expire a few timed out keys. The algorithm used is adaptive and
 2  * will use few CPU cycles if there are few expiring keys, otherwise
 3  * it will get more aggressive to avoid that too much memory is used by
 4  * keys that can be removed from the keyspace. */
 5 void activeExpireCycle(void) {
 6     int j;
 7
 8     for (j = 0; j < server.dbnum; j++) {
 9         int expired;
10         redisDb *db = server.db+j;
11
12         /* Continue to expire if at the end of the cycle more than 25%
13          * of the keys were expired. */
14         do {
15             long num = dictSize(db->expires);
16             time_t now = time(NULL);
17
18             expired = 0;
19             if (num > REDIS_EXPIRELOOKUPS_PER_CRON)
20                 num = REDIS_EXPIRELOOKUPS_PER_CRON;
21             while (num--) {
22                 dictEntry *de;
23                 time_t t;
24
25                 if ((de = dictGetRandomKey(db->expires)) == NULL) break;
26                 t = (time_t) dictGetEntryVal(de);
27                 if (now > t) {
28                     sds key = dictGetEntryKey(de);
29                     robj *keyobj = createStringObject(key,sdslen(key));
30
31                     propagateExpire(db,keyobj);
32                     dbDelete(db,keyobj);
33                     decrRefCount(keyobj);
34                     expired++;
35                     server.stat_expiredkeys++;
36                 }
37             }
38         } while (expired > REDIS_EXPIRELOOKUPS_PER_CRON/4);
39     }
40 }
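The comment already states the function's intent: delete a few expired keys, and use only a little CPU when few keys are expiring. Line 25 picks a random key from expires, and line 38 leaves the loop once the last round had a low hit rate, that is, when no more than a quarter of REDIS_EXPIRELOOKUPS_PER_CRON sampled keys turned out to be expired. The function is driven by a cron (serverCron), which as far as I can tell runs about ten times per second (roughly every 100 ms) in this version. The algorithm keeps sampling until the share of expired keys in a sample drops to about a quarter or less. If the total number of keys is large and the cycle is tuned to clean more aggressively, it needs more loop iterations and wastes CPU; if it is tuned to clean less, more expired keys linger and waste memory. This is a time-space trade-off.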
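To make the sampling loop concrete, here is a small self-contained sketch of the same idea over a plain array of deadlines. It is an illustration under assumptions, not the Redis implementation: `toy_expires`, `toy_remove`, `toy_active_expire_cycle` and the sample size `TOY_LOOKUPS_PER_CYCLE` are made up for this example, rand() stands in for dictGetRandomKey, and `now` is passed in by the caller.

```c
#include <stdlib.h>
#include <time.h>

#define TOY_LOOKUPS_PER_CYCLE 10  /* hypothetical sample size per round */

/* Hypothetical stand-in for db->expires: just the deadlines. */
typedef struct {
    time_t *when;   /* absolute deadlines of keys that have a timeout */
    long    count;  /* number of live entries in `when` */
} toy_expires;

/* Remove entry i by moving the last entry into its slot. */
static void toy_remove(toy_expires *e, long i) {
    e->when[i] = e->when[--e->count];
}

/* Sample random entries and delete the expired ones. Another round is run
 * only while more than a quarter of the sample size was expired.
 * Returns the total number of entries removed. */
static long toy_active_expire_cycle(toy_expires *e, time_t now) {
    long total = 0;
    int expired;

    do {
        long num = e->count;

        expired = 0;
        if (num > TOY_LOOKUPS_PER_CYCLE)
            num = TOY_LOOKUPS_PER_CYCLE;
        while (num--) {
            if (e->count == 0) break;
            long i = rand() % e->count;   /* random sample, with replacement */
            if (now > e->when[i]) {
                toy_remove(e, i);
                expired++;
                total++;
            }
        }
    } while (expired > TOY_LOOKUPS_PER_CYCLE / 4);

    return total;
}
```

Two extreme cases show the adaptive behaviour: when every entry is already expired, each round removes a full sample, so the loop continues until the table is empty; when nothing is expired, a single round of at most `TOY_LOOKUPS_PER_CYCLE` lookups finds nothing and the function returns immediately.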
Finally, the callers of dbDelete include one more function:
/* This function gets called when 'maxmemory' is set on the config file to limit
* the max memory used by the server, and we are out of memory.
* This function will try to, in order:
*
* - Free objects from the free list
* - Try to remove keys with an EXPIRE set
*
* It is not possible to free enough memory to reach used-memory < maxmemory
* the server will start refusing commands that will enlarge even more the
* memory usage.
*/
void freeMemoryIfNeeded(void)
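This function is too long to walk through here. Its comment explains that it is only called when a maximum memory limit ('maxmemory') is set in the config file, and setting that parameter means you are treating Redis as an in-memory cache rather than as a key-value database.
Of these three ways of deleting expired keys, the second, periodically deleting a sampled share of keys, is the main one. The first, delete-on-access, ensures that expired keys are never returned to the user, and the third is a brute-force measure for when memory exceeds the configured limit. Together they show how carefully Redis is designed.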