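Sentinel (Redis 3.0.0-rc1)
Sentinel is Redis's high-availability (HA) solution. A Sentinel system made up of one or more Sentinel instances can monitor any number of masters,
together with all of the slaves under those masters. When a monitored master goes offline, the system automatically promotes one of that master's slaves to be the new master,
and the new master then takes over command processing from the failed one.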
Basic data structures
    typedef struct sentinelRedisInstance {
        // Type and state of this instance (master, slave, sentinel, down or not)
        int flags;
        // Host name, ip:port
        char *name;
        // Run ID of the instance
        char *runid;
        // Configuration epoch
        uint64_t config_epoch;
        // Address of the instance
        sentinelAddr *addr;
        redisAsyncContext *cc;     /* Hiredis context for commands. */
        redisAsyncContext *pc;     /* Hiredis context for Pub / Sub. */
        int pending_commands;      /* Number of commands sent waiting for a reply. */
        mstime_t cc_conn_time;     /* cc connection time. */
        mstime_t pc_conn_time;     /* pc connection time. */
        mstime_t pc_last_activity; /* Last time we received any message. */
        mstime_t last_avail_time;  /* Last time the instance replied to ping with
                                      a reply we consider valid. */
        mstime_t last_ping_time;   /* Last time a pending ping was sent in the
                                      context of the current command connection
                                      with the instance. 0 if still not sent or
                                      if pong already received. */
        mstime_t last_pong_time;   /* Last time the instance replied to ping,
                                      whatever the reply was. That's used to check
                                      if the link is idle and must be reconnected. */
        mstime_t last_pub_time;    /* Last time we sent hello via Pub/Sub. */
        mstime_t last_hello_time;  /* Only used if SRI_SENTINEL is set. Last time
                                      we received a hello from this Sentinel
                                      via Pub/Sub. */
        mstime_t last_master_down_reply_time; /* Time of last reply to
                                                 SENTINEL is-master-down command. */
        mstime_t s_down_since_time; /* Subjectively down since time. */
        mstime_t o_down_since_time; /* Objectively down since time. */
        // Milliseconds without a response before the instance is subjectively down
        mstime_t down_after_period;
        mstime_t info_refresh;     /* Time at which we received INFO output from it. */

        /* Role and the first time we observed it.
         * This is useful in order to delay replacing what the instance reports
         * with our own configuration. We need to always wait some time in order
         * to give a chance to the leader to report the new configuration before
         * we do silly things. */
        int role_reported;
        mstime_t role_reported_time;
        mstime_t slave_conf_change_time; /* Last time slave master addr changed. */

        /* Master specific. */
        dict *sentinels;    /* Other sentinels monitoring the same master. */
        dict *slaves;       /* Slaves for this master instance. */
        // Number of agreeing votes needed to declare the master objectively down
        unsigned int quorum;
        // Number of slaves that may sync with the new master at once during a failover
        int parallel_syncs; /* How many slaves to reconfigure at same time. */
        char *auth_pass;    /* Password to use for AUTH against master & slaves. */

        /* Slave specific. */
        mstime_t master_link_down_time; /* Slave replication link down time. */
        int slave_priority; /* Slave priority according to its INFO output. */
        mstime_t slave_reconf_sent_time; /* Time at which we sent SLAVE OF <new> */
        struct sentinelRedisInstance *master; /* Master instance if it's slave. */
        char *slave_master_host;    /* Master host as reported by INFO */
        int slave_master_port;      /* Master port as reported by INFO */
        int slave_master_link_status; /* Master link status as reported by INFO */
        unsigned long long slave_repl_offset; /* Slave replication offset. */

        /* Failover */
        char *leader;       /* If this is a master instance, this is the runid of
                               the Sentinel that should perform the failover. If
                               this is a Sentinel, this is the runid of the Sentinel
                               that this Sentinel voted as leader. */
        uint64_t leader_epoch;   /* Epoch of the 'leader' field. */
        uint64_t failover_epoch; /* Epoch of the currently started failover. */
        int failover_state;      /* See SENTINEL_FAILOVER_STATE_* defines. */
        mstime_t failover_state_change_time;
        mstime_t failover_start_time;   /* Last failover attempt start time. */
        // Maximum time allowed to refresh the failover state
        mstime_t failover_timeout;      /* Max time to refresh failover state. */
        mstime_t failover_delay_logged; /* For what failover_start_time value we
                                           logged the failover delay. */
        struct sentinelRedisInstance *promoted_slave; /* Promoted slave instance. */

        /* Scripts executed to notify admin or reconfigure clients: when they
         * are set to NULL no script is executed. */
        char *notification_script;
        char *client_reconfig_script;
    } sentinelRedisInstance;

    struct sentinelState {
        // Current epoch
        uint64_t current_epoch;
        // Dictionary of monitored masters: key is the master name,
        // value is a sentinelRedisInstance object
        dict *masters;
        // Whether we are in TILT mode
        int tilt;
        // Number of scripts currently executing
        int running_scripts;
        // Time at which TILT mode was entered
        mstime_t tilt_start_time;
        // Last time the time handler ran
        mstime_t previous_time;
        // Queue of user scripts to execute
        list *scripts_queue;
    } sentinel;
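The flags field packs the instance type and its down state into a single integer. The following is only a minimal sketch of how such a bitmask might be tested. The DEMO_* constants and their bit positions are assumptions made for illustration; they are not copied from sentinel.c, where the real SRI_* flags are defined.

```c
#include <assert.h>

/* Hypothetical flag bits for illustration only; see sentinel.c for the real SRI_* values. */
#define DEMO_SRI_MASTER   (1 << 0)
#define DEMO_SRI_SLAVE    (1 << 1)
#define DEMO_SRI_SENTINEL (1 << 2)
#define DEMO_SRI_S_DOWN   (1 << 4)
#define DEMO_SRI_O_DOWN   (1 << 5)

/* Return a readable name for the instance type encoded in flags. */
static const char *demo_instance_type(int flags) {
    if (flags & DEMO_SRI_MASTER) return "master";
    if (flags & DEMO_SRI_SLAVE) return "slave";
    if (flags & DEMO_SRI_SENTINEL) return "sentinel";
    return "unknown";
}

/* Return 1 if the instance is flagged subjectively or objectively down. */
static int demo_is_down(int flags) {
    assert(flags >= 0); /* flags are built only from non-negative bit masks */
    return (flags & (DEMO_SRI_S_DOWN | DEMO_SRI_O_DOWN)) != 0;
}
```

Because type and state live in different bits, one value such as DEMO_SRI_MASTER | DEMO_SRI_O_DOWN can describe "a master that is objectively down" without any extra fields.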
Sentinel initialization
Startup commands:
redis-sentinel /path/to/sentinel.conf
or
redis-server /path/to/sentinel.conf --sentinel
A configuration file must be specified when Sentinel starts. A minimal configuration looks like this:
sentinel monitor mymaster 127.0.0.1 6379 2
sentinel down-after-milliseconds mymaster 60000
sentinel failover-timeout mymaster 180000
sentinel parallel-syncs mymaster 1
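This configuration tells the current Sentinel to monitor one Redis master named mymaster at IP 127.0.0.1, port 6379.
At least two Sentinels must agree that the master is down before a failover is carried out. If mymaster does not respond for 60000 ms, this Sentinel marks it as down (subjectively down).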
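To make the meaning of the monitor line concrete, here is a rough sketch of parsing it and checking the quorum. The type demo_monitor_conf and both functions are hypothetical helpers written for this article; Sentinel's own configuration parser works differently.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical container for one "sentinel monitor" directive. */
typedef struct {
    char name[64];
    char ip[64];
    int port;
    int quorum;
} demo_monitor_conf;

/* Parse "sentinel monitor <name> <ip> <port> <quorum>".
 * Returns 1 on success, 0 if the line does not match. */
static int demo_parse_monitor(const char *line, demo_monitor_conf *out) {
    assert(line != NULL && out != NULL);
    int n = sscanf(line, "sentinel monitor %63s %63s %d %d",
                   out->name, out->ip, &out->port, &out->quorum);
    return n == 4 && out->port > 0 && out->port < 65536 && out->quorum > 0;
}

/* Return 1 once enough Sentinels report the master as down. */
static int demo_quorum_reached(const demo_monitor_conf *conf, int down_votes) {
    assert(conf != NULL);
    return down_votes >= conf->quorum;
}
```

With the sample configuration, a single Sentinel seeing the master as down is not enough: demo_quorum_reached only returns 1 once two votes are counted.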
In main(), Sentinel mode triggers its own dedicated initialization:
    int main(int argc, char **argv) {
        ...
        if (server.sentinel_mode) {
            initSentinelConfig();
            initSentinel();
        }
        ...
    }
First, the Redis server's port setting is overridden. Sentinel listens on port 26379 by default:
    #define REDIS_SENTINEL_PORT 26379

    void initSentinelConfig(void) {
        server.port = REDIS_SENTINEL_PORT;
    }
Next, the server's command table is replaced and the sentinel object is initialized:
    void initSentinel(void) {
        unsigned int j;

        /* Remove usual Redis commands from the command table, then just add
         * the SENTINEL command. */
        // Empty the regular Redis command table and fill it with the
        // commands Sentinel supports
        dictEmpty(server.commands,NULL);
        for (j = 0; j < sizeof(sentinelcmds)/sizeof(sentinelcmds[0]); j++) {
            int retval;
            struct redisCommand *cmd = sentinelcmds+j;

            retval = dictAdd(server.commands, sdsnew(cmd->name), cmd);
            redisAssert(retval == DICT_OK);
        }

        /* Initialize various data structures. */
        sentinel.current_epoch = 0;
        sentinel.masters = dictCreate(&instancesDictType,NULL);
        sentinel.tilt = 0;
        sentinel.tilt_start_time = 0;
        sentinel.previous_time = mstime();
        sentinel.running_scripts = 0;
        sentinel.scripts_queue = listCreate();
        sentinel.announce_ip = NULL;
        sentinel.announce_port = 0;
    }
Sentinel supports only monitoring-related commands and cannot execute ordinary Redis commands. The command table it supports is:
    struct redisCommand sentinelcmds[] = {
        {"ping",pingCommand,1,"",0,NULL,0,0,0,0,0},
        {"sentinel",sentinelCommand,-2,"",0,NULL,0,0,0,0,0},
        {"subscribe",subscribeCommand,-2,"",0,NULL,0,0,0,0,0},
        {"unsubscribe",unsubscribeCommand,-1,"",0,NULL,0,0,0,0,0},
        {"psubscribe",psubscribeCommand,-2,"",0,NULL,0,0,0,0,0},
        {"punsubscribe",punsubscribeCommand,-1,"",0,NULL,0,0,0,0,0},
        {"publish",sentinelPublishCommand,3,"",0,NULL,0,0,0,0,0},
        {"info",sentinelInfoCommand,-1,"",0,NULL,0,0,0,0,0},
        {"role",sentinelRoleCommand,1,"l",0,NULL,0,0,0,0,0},
        {"shutdown",shutdownCommand,-1,"",0,NULL,0,0,0,0,0}
    };
In addition, because Sentinel does not accept ordinary Redis commands, it does not load the RDB file or any other persisted data during initialization.
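The effect of swapping the command table can be sketched with a plain array lookup. This is a simplified stand-in, not Redis code: demo_command and demo_lookup_command are hypothetical, only the names and arities are taken from the sentinelcmds table above, and the real server keeps its commands in a dict rather than scanning an array.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, trimmed-down command entry: name and arity only. */
typedef struct {
    const char *name;
    int arity;
} demo_command;

/* Names and arities as listed in the sentinelcmds table. */
static const demo_command demo_sentinel_cmds[] = {
    {"ping", 1}, {"sentinel", -2}, {"subscribe", -2}, {"unsubscribe", -1},
    {"psubscribe", -2}, {"punsubscribe", -1}, {"publish", 3},
    {"info", -1}, {"role", 1}, {"shutdown", -1}
};

/* Look up a command by its exact lowercase name. NULL means unsupported. */
static const demo_command *demo_lookup_command(const char *name) {
    assert(name != NULL);
    size_t n = sizeof(demo_sentinel_cmds) / sizeof(demo_sentinel_cmds[0]);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(name, demo_sentinel_cmds[i].name) == 0)
            return &demo_sentinel_cmds[i];
    }
    return NULL;
}
```

Looking up "set" or "get" yields NULL here, which mirrors why a client connected to a Sentinel cannot read or write keys.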
Sentinel communication
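Once initialization is complete, Sentinel actively communicates with the masters and slaves to obtain their information.
Fetching master information
First, Sentinel establishes two connections to the master: a command connection and a subscription connection (stored in the cc and pc fields of sentinelRedisInstance, respectively).
    void sentinelHandleRedisInstance(sentinelRedisInstance *ri) {
        sentinelReconnectInstance(ri);
        ...
    }

    #define SENTINEL_HELLO_CHANNEL "__sentinel__:hello"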
The command connection is used afterwards to fetch information about the master (including its slaves). The subscription connection subscribes to the __sentinel__:hello channel and receives the messages pushed on it, such as the hello messages that other Sentinels publish about the master.
Once connected, Sentinel sends commands periodically:
    void sentinelSendPeriodicCommands(sentinelRedisInstance *ri) {
        mstime_t now = mstime();
        mstime_t info_period, ping_period;
        int retval;

        /* Return ASAP if we have already a PING or INFO already pending, or
         * in the case the instance is not properly connected. */
        if (ri->flags & SRI_DISCONNECTED) return;

        /* For INFO, PING, PUBLISH that are not critical commands to send we
         * also have a limit of SENTINEL_MAX_PENDING_COMMANDS. We don't
         * want to use a lot of memory just because a link is not working
         * properly (note that anyway there is a redundant protection about this,
         * that is, the link will be disconnected and reconnected if a long
         * timeout condition is detected. */
        if (ri->pending_commands >= SENTINEL_MAX_PENDING_COMMANDS) return;

        /* If this is a slave of a master in O_DOWN condition we start sending
         * it INFO every second, instead of the usual SENTINEL_INFO_PERIOD
         * period. In this state we want to closely monitor slaves in case they
         * are turned into masters by another Sentinel, or by the sysadmin. */
        // INFO is sent every 10 s by default
    #define SENTINEL_INFO_PERIOD 10000
        // For a slave whose master is already down, switch to 1 s
        if ((ri->flags & SRI_SLAVE) &&
            (ri->master->flags & (SRI_O_DOWN|SRI_FAILOVER_IN_PROGRESS))) {
            info_period = 1000;
        } else {
            info_period = SENTINEL_INFO_PERIOD;
        }

        /* We ping instances every time the last received pong is older than
         * the configured 'down-after-milliseconds' time, but every second
         * anyway if 'down-after-milliseconds' is greater than 1 second. */
        // PING is sent every 1 s by default
    #define SENTINEL_PING_PERIOD 1000
        ping_period = ri->down_after_period;
        if (ping_period > SENTINEL_PING_PERIOD) ping_period = SENTINEL_PING_PERIOD;

        if ((ri->flags & SRI_SENTINEL) == 0 &&
            (ri->info_refresh == 0 ||
            (now - ri->info_refresh) > info_period))
        {
            /* Send INFO to masters and slaves, not sentinels. */
            retval = redisAsyncCommand(ri->cc,
                sentinelInfoReplyCallback, NULL, "INFO");
            if (retval == REDIS_OK) ri->pending_commands++;
        } else if ((now - ri->last_pong_time) > ping_period) {
            /* Send PING to all the three kinds of instances. */
            sentinelSendPing(ri);
        } else if ((now - ri->last_pub_time) > SENTINEL_PUBLISH_PERIOD) {
            /* PUBLISH hello messages to all the three kinds of instances. */
            sentinelSendHello(ri);
        }
    }
Handling the INFO reply:
    void sentinelInfoReplyCallback(redisAsyncContext *c, void *reply, void *privdata) {
        sentinelRedisInstance *ri = c->data;
        redisReply *r;
        REDIS_NOTUSED(privdata);

        if (ri) ri->pending_commands--;
        if (!reply || !ri) return;
        r = reply;

        if (r->type == REDIS_REPLY_STRING) {
            sentinelRefreshInstanceInfo(ri,r->str);
        }
    }
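For an INFO sent to a master, Sentinel will:
1. Read and record the master's run_id, check the master's role, and update the master list it maintains
2. Read the replication fields to obtain the master's slaves, and update the slave list it maintains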
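The two steps above can be sketched as a small line-by-line scan of the INFO text. This is an illustrative sketch only: demo_master_info and demo_parse_master_info are hypothetical, the fixed limit of 8 slaves is arbitrary, and the slaveN:ip=...,port=... line layout is assumed to match what the monitored Redis version emits.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical summary of what is read from a master's INFO reply. */
typedef struct {
    char runid[41];
    char role[16];
    int slave_count;
    char slave_ip[8][64];
    int slave_port[8];
} demo_master_info;

/* Scan an INFO reply line by line (lines end in \r\n) and fill out. */
static void demo_parse_master_info(const char *info, demo_master_info *out) {
    assert(info != NULL && out != NULL);
    memset(out, 0, sizeof(*out));
    const char *p = info;
    while (*p) {
        char line[256], ip[64];
        int idx, port;
        size_t len = strcspn(p, "\r\n");
        size_t copy = len < sizeof(line) - 1 ? len : sizeof(line) - 1;
        memcpy(line, p, copy);
        line[copy] = '\0';

        if (sscanf(line, "run_id:%40s", out->runid) == 1) {
            /* step 1: run_id recorded */
        } else if (sscanf(line, "role:%15s", out->role) == 1) {
            /* step 1: role recorded */
        } else if (out->slave_count < 8 &&
                   sscanf(line, "slave%d:ip=%63[^,],port=%d",
                          &idx, ip, &port) == 3) {
            /* step 2: one entry of the slave list */
            strcpy(out->slave_ip[out->slave_count], ip);
            out->slave_port[out->slave_count] = port;
            out->slave_count++;
        }
        p += len;                  /* move to the end of this line */
        p += strspn(p, "\r\n");    /* skip the line terminator(s) */
    }
}
```

A caller would compare the parsed slave addresses with the entries it already tracks and create instances for any it has not seen before.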
Fetching slave information
Likewise, Sentinel sends the INFO command to slaves at the same frequency and extracts the following fields:
* run_id
* role
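* the master's host and port
* the state of the replication link between master and slave (master_link_status)
* slave priority (slave_priority)
* slave replication offset (slave_repl_offset)
It then updates the sentinelRedisInstance structure it maintains.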
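As a rough illustration of pulling these fields out of a slave's INFO reply, the sketch below uses a generic key lookup. Both helpers and the demo_slave_info struct are hypothetical; the field names master_host and master_port are assumed to be the INFO keys that carry the master's address.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Copy the value of a "key:value" line from an INFO reply into buf.
 * Returns 1 if the key starts some line, 0 otherwise. */
static int demo_info_field(const char *info, const char *key,
                           char *buf, size_t buflen) {
    assert(info && key && buf && buflen > 0);
    size_t klen = strlen(key);
    const char *p = info;
    while (*p) {
        if (strncmp(p, key, klen) == 0 && p[klen] == ':') {
            const char *v = p + klen + 1;
            size_t vlen = strcspn(v, "\r\n");
            if (vlen >= buflen) vlen = buflen - 1;
            memcpy(buf, v, vlen);
            buf[vlen] = '\0';
            return 1;
        }
        p += strcspn(p, "\n");     /* jump to the end of the line */
        if (*p == '\n') p++;
    }
    return 0;
}

/* Hypothetical subset of the slave-specific fields. */
typedef struct {
    char master_host[64];
    int master_port;
    int master_link_up;
    int slave_priority;
    unsigned long long slave_repl_offset;
} demo_slave_info;

/* Returns 1 if the reply describes a slave and was parsed, 0 otherwise. */
static int demo_parse_slave_info(const char *info, demo_slave_info *out) {
    char tmp[64];
    assert(out != NULL);
    memset(out, 0, sizeof(*out));
    if (!demo_info_field(info, "role", tmp, sizeof(tmp)) ||
        strcmp(tmp, "slave") != 0)
        return 0;
    demo_info_field(info, "master_host", out->master_host,
                    sizeof(out->master_host));
    if (demo_info_field(info, "master_port", tmp, sizeof(tmp)))
        out->master_port = atoi(tmp);
    if (demo_info_field(info, "master_link_status", tmp, sizeof(tmp)))
        out->master_link_up = (strcmp(tmp, "up") == 0);
    if (demo_info_field(info, "slave_priority", tmp, sizeof(tmp)))
        out->slave_priority = atoi(tmp);
    if (demo_info_field(info, "slave_repl_offset", tmp, sizeof(tmp)))
        out->slave_repl_offset = strtoull(tmp, NULL, 10);
    return 1;
}
```

Fields that are missing from the reply simply stay at zero in this sketch, which keeps the parser tolerant of older or trimmed INFO output.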
Sending and receiving subscription messages
Every SENTINEL_PUBLISH_PERIOD (2 seconds by default), Sentinel uses the command connection to publish a hello message to every master and slave it monitors.
    int sentinelSendHello(sentinelRedisInstance *ri) {
        char ip[REDIS_IP_STR_LEN];
        char payload[REDIS_IP_STR_LEN+1024];
        int retval;
        char *announce_ip;
        int announce_port;
        sentinelRedisInstance *master = (ri->flags & SRI_MASTER) ? ri : ri->master;
        sentinelAddr *master_addr = sentinelGetCurrentMasterAddress(master);

        if (ri->flags & SRI_DISCONNECTED) return REDIS_ERR;

        /* Use the specified announce address if specified, otherwise try to
         * obtain our own IP address. */
        if (sentinel.announce_ip) {
            announce_ip = sentinel.announce_ip;
        } else {
            if (anetSockName(ri->cc->c.fd,ip,sizeof(ip),NULL) == -1)
                return REDIS_ERR;
            announce_ip = ip;
        }
        announce_port = sentinel.announce_port ?
                        sentinel.announce_port : server.port;

        /* Format and send the Hello message. */
        snprintf(payload,sizeof(payload),
            "%s,%d,%s,%llu," /* Info about this sentinel. */
            "%s,%s,%d,%llu", /* Info about current master. */
            announce_ip, announce_port, server.runid,
            (unsigned long long) sentinel.current_epoch,
            /* --- */
            master->name,master_addr->ip,master_addr->port,
            (unsigned long long) master->config_epoch);
        retval = redisAsyncCommand(ri->cc,
            sentinelPublishReplyCallback, NULL, "PUBLISH %s %s",
                SENTINEL_HELLO_CHANNEL,payload);
        if (retval != REDIS_OK) return REDIS_ERR;
        ri->pending_commands++;
        return REDIS_OK;
    }
The message contains:
* Sentinel IP (announce_ip)
* Sentinel port (announce_port)
* Sentinel run ID (server.runid)
* Sentinel current epoch (sentinel.current_epoch)
* master name (master->name)
* master IP (master_addr->ip)
* master port (master_addr->port)
* master config epoch (master->config_epoch)
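The eight fields listed above travel as one comma-separated string. The following sketch shows one way to build and re-read such a payload. The demo_hello struct and both functions are hypothetical, and the parser assumes that no field contains a comma.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical in-memory form of one hello message. */
typedef struct {
    char sentinel_ip[64];
    int sentinel_port;
    char sentinel_runid[41];
    unsigned long long current_epoch;
    char master_name[64];
    char master_ip[64];
    int master_port;
    unsigned long long master_config_epoch;
} demo_hello;

/* Build the comma-separated payload. Returns the snprintf result. */
static int demo_format_hello(const demo_hello *h, char *buf, size_t buflen) {
    assert(h != NULL && buf != NULL);
    return snprintf(buf, buflen, "%s,%d,%s,%llu,%s,%s,%d,%llu",
                    h->sentinel_ip, h->sentinel_port, h->sentinel_runid,
                    h->current_epoch, h->master_name, h->master_ip,
                    h->master_port, h->master_config_epoch);
}

/* Parse a payload back into its eight fields. Returns 1 on success. */
static int demo_parse_hello(const char *payload, demo_hello *h) {
    assert(payload != NULL && h != NULL);
    int n = sscanf(payload,
                   "%63[^,],%d,%40[^,],%llu,%63[^,],%63[^,],%d,%llu",
                   h->sentinel_ip, &h->sentinel_port, h->sentinel_runid,
                   &h->current_epoch, h->master_name, h->master_ip,
                   &h->master_port, &h->master_config_epoch);
    return n == 8;
}
```

A receiver can reject any payload for which demo_parse_hello returns 0, since a well-formed hello always carries all eight fields.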
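At the same time, every Sentinel connected to this master receives the message and responds by:
* updating its sentinels dictionary
* creating a command connection to the other Sentinel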
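The reaction to an incoming hello can be sketched with a small fixed-size table standing in for the sentinels dictionary. Everything here is hypothetical: demo_peer_table replaces the dict, the connected flag merely marks where a command connection would be opened, and skipping messages that carry our own run ID is an assumption of this sketch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define DEMO_MAX_SENTINELS 16

/* Hypothetical record for one other Sentinel watching the same master. */
typedef struct {
    char runid[41];
    char ip[64];
    int port;
    int connected;   /* 1 once a command connection would have been set up */
} demo_peer;

typedef struct {
    demo_peer peers[DEMO_MAX_SENTINELS];
    int count;
} demo_peer_table;

/* Handle one hello from the Sentinel identified by runid.
 * Returns 1 if a new peer was added, 0 if the message was ignored or only
 * refreshed a known peer, -1 if the table is full. */
static int demo_on_hello(demo_peer_table *t, const char *self_runid,
                         const char *runid, const char *ip, int port) {
    assert(t && self_runid && runid && ip);
    if (strcmp(runid, self_runid) == 0) return 0;   /* our own message */

    for (int i = 0; i < t->count; i++) {
        if (strcmp(t->peers[i].runid, runid) == 0) {
            /* Known Sentinel: refresh its address. */
            snprintf(t->peers[i].ip, sizeof(t->peers[i].ip), "%s", ip);
            t->peers[i].port = port;
            return 0;
        }
    }
    if (t->count == DEMO_MAX_SENTINELS) return -1;

    /* New Sentinel: record it and mark the command connection as created. */
    demo_peer *p = &t->peers[t->count++];
    snprintf(p->runid, sizeof(p->runid), "%s", runid);
    snprintf(p->ip, sizeof(p->ip), "%s", ip);
    p->port = port;
    p->connected = 1;
    return 1;
}
```

Keying the table on the run ID rather than the address lets a Sentinel that restarts on a new IP or port be recognized as the same peer.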