MySQL Replication--复制延迟(Seconds_Behind_Master)计算01

本人完全不懂MySQL源码,以下文字纯属瞎猜,如有误导,概不负责!

在sql/rpl_slave.cc文件中,time_diff的计算代码为:

/*
      The pseudo code to compute Seconds_Behind_Master:
        if (SQL thread is running)
        {
          if (SQL thread processed all the available relay log)
          {
            if (IO thread is running)
              print 0;
            else
              print NULL;
          }
          else
            compute Seconds_Behind_Master;
        }
        else
          print NULL;
    */
    if (mi->rli->slave_running)
    {
      /* Check if SQL thread is at the end of relay log
           Checking should be done using two conditions
           condition1: compare the log positions and
           condition2: compare the file names (to handle rotation case)
      */
      if ((mi->get_master_log_pos() == mi->rli->get_group_master_log_pos()) &&
           (!strcmp(mi->get_master_log_name(), mi->rli->get_group_master_log_name())))
      {
        if (mi->slave_running == MYSQL_SLAVE_RUN_CONNECT)
          protocol->store(0LL);
        else
          protocol->store_null();
      }
      else
      {
        long time_diff= ((long)(time(0) - mi->rli->last_master_timestamp)
                                - mi->clock_diff_with_master);
      /*
        Apparently on some systems time_diff can be <0. Here are possible
        reasons related to MySQL:
        - the master is itself a slave of another master whose time is ahead.
        - somebody used an explicit SET TIMESTAMP on the master.
        Possible reason related to granularity-to-second of time functions
        (nothing to do with MySQL), which can explain a value of -1:
        assume the master‘s and slave‘s time are perfectly synchronized, and
        that at slave‘s connection time, when the master‘s timestamp is read,
        it is at the very end of second 1, and (a very short time later) when
        the slave‘s timestamp is read it is at the very beginning of second
        2. Then the recorded value for master is 1 and the recorded value for
        slave is 2. At SHOW SLAVE STATUS time, assume that the difference
        between timestamp of slave and rli->last_master_timestamp is 0
        (i.e. they are in the same second), then we get 0-(2-1)=-1 as a result.
        This confuses users, so we don‘t go below 0: hence the max().

        last_master_timestamp == 0 (an "impossible" timestamp 1970) is a
        special marker to say "consider we have caught up".
      */
        protocol->store((longlong)(mi->rli->last_master_timestamp ?
                                   max(0L, time_diff) : 0));
      }
    }
    else
    {
      protocol->store_null();
    }

1、当SQL线程停止时,返回NULL

2、当SLAVE正常运行时,如果SQL线程执行的位置是relay log的最后位置则返回0,否则返回NULL

3、当SLAVE正常运行时,复制延迟时间=当前从库系统时间(time(0)) - SQL线程处理的最后binlog的时间( mi->rli->last_master_timestamp) - 主从系统时间差(mi->clock_diff_with_master)

主从系统时间差(mi->clock_diff_with_master)

在sql/rpl_slave.cc文件中,主从系统时间差计算代码如下:

  /*
    Compare the master and slave‘s clock. Do not die if master‘s clock is
    unavailable (very old master not supporting UNIX_TIMESTAMP()?).
  */

  DBUG_EXECUTE_IF("dbug.before_get_UNIX_TIMESTAMP",
                  {
                    const char act[]=
                      "now "
                      "wait_for signal.get_unix_timestamp";
                    DBUG_ASSERT(opt_debug_sync_timeout > 0);
                    DBUG_ASSERT(!debug_sync_set_action(current_thd,
                                                       STRING_WITH_LEN(act)));
                  };);

  master_res= NULL;
  if (!mysql_real_query(mysql, STRING_WITH_LEN("SELECT UNIX_TIMESTAMP()")) &&
      (master_res= mysql_store_result(mysql)) &&
      (master_row= mysql_fetch_row(master_res)))
  {
    mysql_mutex_lock(&mi->data_lock);
    mi->clock_diff_with_master=
      (long) (time((time_t*) 0) - strtoul(master_row[0], 0, 10));
    mysql_mutex_unlock(&mi->data_lock);
  }
  else if (check_io_slave_killed(mi->info_thd, mi, NULL))
    goto slave_killed_err;
  else if (is_network_error(mysql_errno(mysql)))
  {
    mi->report(WARNING_LEVEL, mysql_errno(mysql),
               "Get master clock failed with error: %s", mysql_error(mysql));
    goto network_err;
  }
  else
  {
    mysql_mutex_lock(&mi->data_lock);
    mi->clock_diff_with_master= 0; /* The "most sensible" value */
    mysql_mutex_unlock(&mi->data_lock);
    sql_print_warning("\"SELECT UNIX_TIMESTAMP()\" failed on master, "
                      "do not trust column Seconds_Behind_Master of SHOW "
                      "SLAVE STATUS. Error: %s (%d)",
                      mysql_error(mysql), mysql_errno(mysql));
  }
  if (master_res)
  {
    mysql_free_result(master_res);
    master_res= NULL;
  }

主从系统时间差=从库当前时间(time((time_t*) 0)) - 主库时间(UNIX_TIMESTAMP()),而主库时间是到主库上执行SELECT UNIX_TIMESTAMP(),然后取执行结果(strtoul(master_row[0], 0, 10))

SQL线程处理的最后binlog的时间( mi->rli->last_master_timestamp)

在sql/rpl_slave.cc文件中exec_relay_log_event方法中,计算last_master_timestamp的代码如下:

/**
  Top-level function for executing the next event in the relay log.
  This is called from the SQL thread.

  This function reads the event from the relay log, executes it, and
  advances the relay log position.  It also handles errors, etc.

  This function may fail to apply the event for the following reasons:

   - The position specfied by the UNTIL condition of the START SLAVE
     command is reached.

   - It was not possible to read the event from the log.

   - The slave is killed.

   - An error occurred when applying the event, and the event has been
     tried slave_trans_retries times.  If the event has been retried
     fewer times, 0 is returned.

   - init_info or init_relay_log_pos failed. (These are called
     if a failure occurs when applying the event.)

   - An error occurred when updating the binlog position.

  @retval 0 The event was applied.

  @retval 1 The event was not applied.
*/
static int exec_relay_log_event(THD* thd, Relay_log_info* rli)
{
  DBUG_ENTER("exec_relay_log_event");

  /*
     We acquire this mutex since we need it for all operations except
     event execution. But we will release it in places where we will
     wait for something for example inside of next_event().
   */
  mysql_mutex_lock(&rli->data_lock);

  /*
    UNTIL_SQL_AFTER_GTIDS requires special handling since we have to check
    whether the until_condition is satisfied *before* the SQL threads goes on
    a wait inside next_event() for the relay log to grow. This is reuired since
    if we have already applied the last event in the waiting set but since he
    check happens only at the start of the next event we may end up waiting
    forever the next event is not available or is delayed.
  */
  if (rli->until_condition == Relay_log_info::UNTIL_SQL_AFTER_GTIDS &&
       rli->is_until_satisfied(thd, NULL))
  {
    rli->abort_slave= 1;
    mysql_mutex_unlock(&rli->data_lock);
    DBUG_RETURN(1);
  }

  Log_event *ev = next_event(rli), **ptr_ev;

  DBUG_ASSERT(rli->info_thd==thd);

  if (sql_slave_killed(thd,rli))
  {
    mysql_mutex_unlock(&rli->data_lock);
    delete ev;
    DBUG_RETURN(1);
  }
  if (ev)
  {
    enum enum_slave_apply_event_and_update_pos_retval exec_res;

    ptr_ev= &ev;
    /*
      Even if we don‘t execute this event, we keep the master timestamp,
      so that seconds behind master shows correct delta (there are events
      that are not replayed, so we keep falling behind).

      If it is an artificial event, or a relay log event (IO thread generated
      event) or ev->when is set to 0, or a FD from master, or a heartbeat
      event with server_id ‘0‘ then  we don‘t update the last_master_timestamp.
    */
    if (!(rli->is_parallel_exec() ||
          ev->is_artificial_event() || ev->is_relay_log_event() ||
          ev->when.tv_sec == 0 || ev->get_type_code() == FORMAT_DESCRIPTION_EVENT ||
          ev->server_id == 0))
    {
      rli->last_master_timestamp= ev->when.tv_sec + (time_t) ev->exec_time;
      DBUG_ASSERT(rli->last_master_timestamp >= 0);
    }

其中when.tv_sec是当前从库时间,when.tv_sec的赋值在sql/rpl_rli_pdb.cc文件中slave_worker_exec_job方法中:

  ev= static_cast<Log_event*>(job_item->data);
  thd->server_id = ev->server_id;
  thd->set_time();
  thd->lex->current_select= 0;
  if (!ev->when.tv_sec)
    ev->when.tv_sec= my_time(0);
  ev->thd= thd; // todo: assert because up to this point, ev->thd == 0
  ev->worker= worker;

而my_time(0)返回系统时间,代码在mysys/my_getsystime.cc文件

/**
  Return current time.

  @param  flags   If MY_WME is set, write error if time call fails.

  @retval current time.
*/

time_t my_time(myf flags)
{
  time_t t;
  /* The following loop is here beacuse time() may fail on some systems */
  while ((t= time(0)) == (time_t) -1)
  {
    if (flags & MY_WME)
      fprintf(stderr, "%s: Warning: time() call failed\n", my_progname);
  }
  return t;
}

而ev->exec_time的计算在sql/log_event.cc中的代码如下:

  /*
  exec_time calculation has changed to use the same method that is used
  to fill out "thd_arg->start_time"
  */

  struct timeval end_time;
  ulonglong micro_end_time= my_micro_time();
  my_micro_time_to_timeval(micro_end_time, &end_time);

  exec_time= end_time.tv_sec - thd_arg->start_time.tv_sec;

exec_time在文件sql/log_event.h中的注解如下:

  <tr>
    <td>exec_time</td>
    <td>4 byte unsigned integer</td>
    <td>The time from when the query started to when it was logged in the binlog, in seconds.</td>
  </tr>

原文地址:https://www.cnblogs.com/gaogao67/p/11075648.html

时间: 2024-11-03 12:03:01

MySQL Replication--复制延迟(Seconds_Behind_Master)计算01的相关文章

浅谈MySQL Replication(复制)基本原理

1.MySQL Replication复制进程MySQL的复制(replication)是一个异步的复制,从一个MySQL instace(称之为Master)复制到另一个MySQL instance(称之Slave).实现整个复制操作主要由三个进程完成的,其中两个进程在Slave(Sql进程和IO进程),另外一个进程在Master(IO进程)上.要实施复制,首先必须打开Master端的binary log(bin-log)功能,否则无法实现.因为整个复制过程实际上就是Slave从Master端

浅析 MySQL Replication(本文转自网络,非本人所写)

作者:卢飞 来源:DoDBA(mysqlcode) 0.导读 本文几乎涵盖了MySQL Replication(主从复制)的大部分知识点,包括Replication原理.binlog format.复制中如何保证数据一致性.组提交.复制优化.半同步复制.多源复制. 目前很多公司中的生产环境中都使用了MySQL Replication ,也叫 MySQL 复制,搭建配置方便等很多特性让 MySQL Replication 的应用很广泛,我们曾经使用过一主拖20多个从库来分担业务压力.关于 MySQ

怎样解决MySQL数据库主从复制延迟的问题---流行网站的解决办法(转载)

像Facebook.开心001.人人网.优酷.豆瓣.淘宝等高流量.高并发的网站,单点数据库很难支撑得住,WEB2.0类型的网站中使用MySQL的 居多,要么用MySQL自带的MySQL NDB Cluster(MySQL5.0及以上版本支持MySQL NDB Cluster功能),或者用MySQL自带的分区功能(MySQL5.1及以上版本支持分区功能),我所知道的使用这两种方案的很少,一般使用主从复 制,再加上MySQL Proxy实现负载均衡.读写分离等功能,在使用主从复制的基础上,再使用垂直

怎样解决MySQL数据库主从复制延迟的问题

像Facebook.开心001.人人网.优酷.豆瓣.淘宝等高流量.高并发的网站,单点数据库很难支撑得住,WEB2.0类型的网站中使用MySQL的居多,要么用MySQL自带的MySQL NDB Cluster(MySQL5.0及以上版本支持MySQL NDB Cluster功能),或者用MySQL自带的分区功能(MySQL5.1及以上版本支持分区功能),我所知道的使用这两种方案的很少,一般使用主从复制,再加上MySQL Proxy实现负载均衡.读写分离等功能,在使用主从复制的基础上,再使用垂直切分

seconds_behind_master监控复制延迟的不足及pt-heartbeat改进方法

seconds_behind_master含义及不足 seconds_behind_master的值是通过将salve服务器当前的时间戳与二进制日志中的事件的时间戳相比得到的,所以只有执行事件时才会报告延迟. 1.1  如果备库复制线程没有运行,就会报延迟为null. 1.2  一些错误比如网络不稳定可能导致复制中断或停止复制线程,但是seconds_behind_master将显示为0,而不是显示错误 1.3  即使备库线程正在运行,备库有时候可能无法计算延时,如果发生这种情况,备库会报0或者

MySQL &#183; 答疑解惑 &#183; 备库Seconds_Behind_Master计算

背景 在mysql主备环境下,主备同步过程如下,主库更新产生binlog, 备库io线程拉取主库binlog生成relay log.备库sql线程执行relay log从而保持和主库同步. 理论上主库有更新时,备库都存在延迟,且延迟时间为备库执行时间+网络传输时间即t4-t2. 那么mysql是怎么来计算备库延迟的? 先来看show slave status中的一些信息,io线程拉取主库binlog的位置: Master_Log_File: mysql-bin.000001 Read_Maste

第四阶段 (七)MySQL REPLICATION(主从复制、半同步复制、复制过滤)

Linux运维 第四阶段 (七)MySQL REPLICATION(主从复制.半同步复制.复制过滤) 一.MySQL Replication相关概念: 1.复制的作用:辅助实现备份:高可用HA:异地容灾:分摊负载(scaleout):rw-spliting(mysql proxy工作在应用层). 2.master有多个CPU允许事务并行执行,但往二进制日志文件只能一条条写:slave比master要慢:master-slave默认异步方式传送. 3.半同步:仅负责最近一台slave同步成功,其它

MySQL 5.7 延迟复制

MySQL 5.7延迟复制是通过设置复制参数MASTER_DELAY实现(单位为秒,就是从库延迟多少秒后执行这条SQL) 例如: mysql> show slave status\G *************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: 10.10.1.101 ..... Slave_IO_Runnin

mysql复制延迟监控脚本

#!/bin/sh #[email protected] #repdelay.sh #查看复制延迟具体多少event #####1.juede the rep slave status export black='\033[0m' export boldblack='\033[1;0m' export red='\033[31m' export boldred='\033[1;31m' export green='\033[32m' export boldgreen='\033[1;32m' e