repmgr/repmgr_net.c,
__repmgr_send(): calls send_broadcast, then handles DB_REP_PERMANENT according to the ack policy
__repmgr_send_broadcast(): for each site, send_connection().
MASTER: sending
log/log_put.c, log_put(),
not allowed on a REP_CLIENT
__rep_send_message(env, DB_EID_BROADCAST, - REP_NEWFILE, REP_LOG
txn/txn_chkpt.c, __txn_checkpoint()
A REP_CLIENT gets here only during recovery; it syncs the mpool and returns immediately
Before the master syncs the mpool, it sends to the clients:
__rep_send_message(env, DB_EID_BROADCAST, REP_START_SYNC
wait for chkpt_delay
write the ckp log rec
DB_REP_REREQUEST vs DB_REP_ANYWHERE
* Gap requests are "new" and can go anywhere, unless
* this is already a re-request.
repmgr_net.c, __repmgr_send()
if ((flags & (DB_REP_ANYWHERE | DB_REP_REREQUEST)) ==DB_REP_ANYWHERE &&
(site = __repmgr_find_available_peer(env))
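The mask test above passes only when DB_REP_ANYWHERE is set and DB_REP_REREQUEST is clear. A minimal sketch of that check follows; the SKETCH_* names and their bit values are placeholders made up for illustration (the real constants are defined in db.h), and may_use_any_peer() is a hypothetical helper, not a repmgr function.

```c
/* Placeholder flag bits for illustration only; the real values live in db.h. */
enum {
	SKETCH_REP_ANYWHERE  = 0x1,
	SKETCH_REP_REREQUEST = 0x8
};

/*
 * Nonzero if the request may go to any available peer:
 * ANYWHERE must be set and REREQUEST must be clear.
 */
int
may_use_any_peer(unsigned flags)
{
	return (flags & (SKETCH_REP_ANYWHERE | SKETCH_REP_REREQUEST)) ==
	    SKETCH_REP_ANYWHERE;
}
```

A first gap request (ANYWHERE only) qualifies; a re-request (ANYWHERE | REREQUEST) does not, nor does a request with neither flag set.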
After sending, for DB_REP_PERMANENT, check the ack policy: how many acks are needed before returning (for durability).
http://docs.oracle.com/cd/E17076_03/html/api_reference/C/repmgrset_ack_policy.html
Default is DB_REPMGR_ACKS_QUORUM; see repmgr_net.c, __repmgr_send(), where the count is (n - 1) / 2 + 1
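To make the (n - 1) / 2 + 1 expression concrete, here is a small sketch that evaluates it with C integer division. It assumes n is the number of sites in the replication group; quorum_count() is a hypothetical helper name, and any special cases the real __repmgr_send() applies are not modelled.

```c
#include <assert.h>

/*
 * Evaluate the quorum expression quoted in the note, (n - 1) / 2 + 1,
 * using integer division. Assumes nsites >= 1.
 */
unsigned
quorum_count(unsigned nsites)
{
	assert(nsites >= 1);
	return (nsites - 1) / 2 + 1;
}
```

With integer division this gives 2 for a 3-site group and 3 for a 5-site group, i.e. a strict majority of the group.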
Once that is determined, __repmgr_await_cond(env, got_acks, &perm, rep->ack_timeout, &db_rep->ack_waiters);
=> roughly: while (!got_acks(env, &perm)) {pthread_cond_timedwait(&db_rep->ack_waiters, rep->ack_timeout)}
repmgr_net.c, got_acks().
dbinc/rep.h,
struct __db_rep {}
REPMGR_RUNNABLE *selector, **messengers, **elect_threads;
WSAEventSelect
Client: receiving
repmgr_method.c, __repmgr_start_int() - elect/msg/select threads
repmgr_method.c, __repmgr_start_selector()
repmgr_sel.c, __repmgr_select_thread()
repmgr_windows.c, __repmgr_select_loop()
repmgr_windows.c, handle_completion() -
repmgr_sel.c, __repmgr_read_from_site()
repmgr_sel.c, dispatch_msgin() -
put on db_rep->input_queue, then __repmgr_signal(&db_rep->msg_avail)
rep.h, struct __db_rep - cond_var_t check_election, gmdb_idle, msg_avail;
repmgr_method.c, __repmgr_start_int()
repmgr_method.c, __repmgr_start_msg_threads()
repmgr_msg.c, __repmgr_msg_thread()
message_loop()
while ((ret = __repmgr_queue_get()...
__repmgr_queue_get - while ((m = available_work(env)) == NULL), wait on msg_avail
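The handoff between the select thread (enqueue, then signal msg_avail) and the message threads (wait on msg_avail until work is available) is a producer/consumer queue. A minimal POSIX sketch of that shape follows; struct msg_queue, queue_put() and queue_get() are hypothetical names, the fixed-size ring buffer of ints is a simplification, and none of this mirrors the actual repmgr queue code.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define QCAP 16

/* Hypothetical bounded queue guarded by one mutex and one condition variable. */
struct msg_queue {
	pthread_mutex_t mtx;
	pthread_cond_t msg_avail;
	int items[QCAP];
	size_t head, count;
};

void
queue_init(struct msg_queue *q)
{
	assert(q != NULL);
	pthread_mutex_init(&q->mtx, NULL);
	pthread_cond_init(&q->msg_avail, NULL);
	q->head = q->count = 0;
}

/* Producer side: enqueue and wake one waiter. Returns 0, or -1 if full. */
int
queue_put(struct msg_queue *q, int msg)
{
	int ret = -1;

	pthread_mutex_lock(&q->mtx);
	if (q->count < QCAP) {
		q->items[(q->head + q->count) % QCAP] = msg;
		q->count++;
		pthread_cond_signal(&q->msg_avail);
		ret = 0;
	}
	pthread_mutex_unlock(&q->mtx);
	return ret;
}

/* Consumer side: block until a message is available, then dequeue it. */
int
queue_get(struct msg_queue *q)
{
	int msg;

	pthread_mutex_lock(&q->mtx);
	while (q->count == 0)
		pthread_cond_wait(&q->msg_avail, &q->mtx);
	msg = q->items[q->head];
	q->head = (q->head + 1) % QCAP;
	q->count--;
	pthread_mutex_unlock(&q->mtx);
	return msg;
}
```

The consumer loops on the emptiness check rather than testing it once, so a spurious wakeup or a race with another message thread simply sends it back to waiting.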
process_message()
rep_record.c, __rep_process_message_int()
for a REP_LOG message, calls
rep_log.c, __rep_log()
rep_record.c, __rep_apply():
log.h, struct log {} - waiting_lsn, max_wait_lsn, __db.rep.db, ready_lsn
waiting_lsn: It is the first LSN that we are holding without putting in the log, because we received one or more log records out of order.
ready_lsn: It is the next LSN we expect to receive. It's normally equal to "lsn", except at the beginning of a log file, at which point it's set to the LSN of the first record of the new file
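If it is exactly the next log rec we need: call __rep_process_rec(); then __rep_remfirst/__rep_getnext continue with the log recs held in the tmp db; __rep_loggap_req().
If it is after the log rec we need: put it in the tmp db, update waiting_lsn, send __rep_loggap_req().
If it is before the log rec we need: we received a duplicate log rec.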
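The three cases above come down to comparing the incoming record's LSN with ready_lsn. A minimal sketch of that classification follows, assuming an LSN is a (file, offset) pair ordered by file first and then offset; struct lsn, lsn_cmp(), classify() and the enum names are all hypothetical and stand in for the real types and the logic inside __rep_apply().

```c
#include <stdint.h>

/* Simplified stand-in for an LSN: log file number plus offset in that file. */
struct lsn { uint32_t file, offset; };

/* Order by file, then by offset. Returns -1, 0 or 1. */
int
lsn_cmp(struct lsn a, struct lsn b)
{
	if (a.file != b.file)
		return a.file < b.file ? -1 : 1;
	if (a.offset != b.offset)
		return a.offset < b.offset ? -1 : 1;
	return 0;
}

enum apply_action {
	APPLY_NOW,		/* exactly the record we are waiting for */
	HOLD_AND_REQUEST_GAP,	/* ahead of us: stash it, ask for the gap */
	DROP_DUPLICATE		/* behind us: already have it */
};

enum apply_action
classify(struct lsn incoming, struct lsn ready_lsn)
{
	int cmp = lsn_cmp(incoming, ready_lsn);

	if (cmp == 0)
		return APPLY_NOW;
	if (cmp > 0)
		return HOLD_AND_REQUEST_GAP;
	return DROP_DUPLICATE;
}
```

In the APPLY_NOW case the real code then drains any held records that have become contiguous, which this sketch leaves out.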
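rep_record.c, __rep_process_rec(): newfile is handled specially and returns. Except for ckp, all other recs are first written straight to the log file
- DB___txn_prepare: flush the log directly
- DB___txn_regop: __rep_process_txn(): acquire the needed write locks; collect all log recs belonging to the txn, sort, read them, db_dispatch(DB_TXN_APPLY
- DB___txn_ckp: first write a rec into rep_db (the bookkeeping db) with nooverwrite; if it already exists, another thread is doing the ckp, so return. sync the mpool; write the DB_LOG_CHKPNT log rec, flush the log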