Source-Level Analysis of SpillableMemoryChannel in Flume-NG (Version 1.5) / 憋错料

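SpillableMemoryChannel is a channel newly added in version 1.5. It prefers to keep events in memory; once memory reaches the configured capacity, it writes to disk through a file channel. Reads happen in order: a DrainOrderQueue guarantees the ordering regardless of whether an event sits in memory or in the overflow file (in this article, "overflow" means the memory channel is full and the file channel has to store the data). This channel is a compromise between the memory channel and the file channel. Data held in memory can still be lost if the process dies suddenly, but unlike the memory channel it does not lose data when the sink cannot keep up (the memory channel throws an exception once it is full, and the subsequent data is lost), so reliability improves. Compared with the file channel it is much faster, at the cost of somewhat lower reliability.

Let's look at the inheritance structure: SpillableMemoryChannel extends FileChannel. So SpillableMemoryChannel is a subclass of FileChannel and naturally has the file channel's characteristics, but it implements its own BasicTransactionSemantics. In the analysis that follows, this channel can be viewed as two channels. Related reading: "Reading the Flume-NG Source: FileChannel" and "Reading the flume-ng Source: memory-channel (original)".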

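I. First, the configure(Context context) method, which configures the channel. The main fields and parameters are:

(1) Semaphore totalStored: a semaphore counting the total number of events in both channels [the memory channel (not Flume's built-in memory channel but a newly implemented one; unless stated otherwise, "memory channel" in this article means this new one) and the file channel used for overflow]; initially 0.

(2) ArrayDeque<Event> memQueue: this is the memory channel itself, a resizable array-backed double-ended queue (ArrayDeque) that stores the events.

(3) int memoryCapacity (parameter name "memoryCapacity"): the maximum number of events the memory channel can hold.

(4) Semaphore memQueRemaining: a semaphore counting how many more events the memory channel can store; initial size memoryCapacity.

(5) int overflowTimeout (parameter name "overflowTimeout"): the overflow timeout, that is, how long to wait once the memory channel is full before switching to the file channel; default 3 seconds.

(6) double overflowDeactivationThreshold (parameter name "overflowDeactivationThreshold"): the threshold at which overflow stops, expressed as the free fraction of the memory channel (measured in events that can still be stored); default 5%.

(7) volatile int byteCapacityBufferPercentage (parameter name "byteCapacityBufferPercentage"): the percentage of the byte budget held back as a buffer, which limits how much physical memory the memory channel uses; default 20.

(8) volatile double avgEventSize (parameter name "avgEventSize"): the assumed size of each event in bytes, used to compute the total number of slots available to the memory channel; events are measured in slots rather than bytes; default 500.

(9) volatile int byteCapacity (parameter name "byteCapacity"): the number of slots, computed as (byte budget * (1 - byteCapacityBufferPercentage * .01)) / avgEventSize, where the byte budget defaults to 80% of the maximum memory available to the JVM and can be overridden with the "byteCapacity" parameter.

(10) Semaphore bytesRemaining: a semaphore counting the slots still available in the memory channel; initial size byteCapacity.

(11) volatile int lastByteCapacity: only useful when the configuration file is reloaded dynamically; it records the previous byteCapacity so that the size of the bytesRemaining semaphore can be adjusted.

(12) int overflowCapacity (parameter name "overflowCapacity"): the capacity of the file channel; default 100 million.

In addition, boolean overflowDisabled indicates whether overflow is disabled; overflow stays enabled as long as overflowCapacity is not less than 1. boolean overflowActivated indicates whether overflow is currently in use; it defaults to false. The file channel's "keep-alive" is set to 0, and finally super.configure(context) configures the file channel. The file channel's settings can be specified together with the SpillableMemoryChannel settings.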
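The slot arithmetic in item (9) can be sketched as follows. This is only an illustration of the formula quoted above, not Flume's own class; the class name SlotCapacitySketch, the method name slots, and the 1,000,000-byte budget are assumptions made for the example.

```java
// Sketch only: reproduces the slot formula from item (9); all names are hypothetical.
final class SlotCapacitySketch {

    /**
     * Converts a byte budget into a slot count:
     * (byteBudget * (1 - bufferPercentage * .01)) / avgEventSize.
     */
    static int slots(long byteBudget, int bufferPercentage, double avgEventSize) {
        return (int) ((byteBudget * (1 - bufferPercentage * .01)) / avgEventSize);
    }

    public static void main(String[] args) {
        // Assumed budget of 1,000,000 bytes with the default 20% buffer
        // and the default avgEventSize of 500.
        System.out.println(slots(1_000_000L, 20, 500)); // 1600

        // Default budget described in the article: 80% of the JVM's maximum memory.
        long defaultBudget = (long) (Runtime.getRuntime().maxMemory() * .80);
        System.out.println(slots(defaultBudget, 20, 500));
    }
}
```

With the defaults, every 500 bytes of usable budget becomes one slot, so raising avgEventSize lowers the number of slots the memory channel may hand out.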

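II. The start() method first calls super.start() to start the file channel, obtains overFlowCount (the number of events already spilled to the file channel), and resets totalStored and the DrainOrderQueue object drainOrder. The memory channel never holds data at start.

III. DrainOrderQueue drainOrder = new DrainOrderQueue() deserves a closer look. SpillableMemoryChannel is really made of two channels, the memory channel and the file channel, so data is spread across memory and files on disk. What is the mechanism for a take, then? In other words, when do we read from memory and when from the file on disk, and in what order? We want takes to follow the same order as puts: what was put first should be taken first. So every put (to memory or to file) has to be "numbered" so that takes can proceed in order, and each take also has to know whether it should read from memory or from the file. The DrainOrderQueue class was designed to keep puts and takes ordered.

This class is very cleverly designed and is the key to puts and takes working correctly. Before the code, here is the principle in outline. The key attribute is ArrayDeque<MutableInteger> queue. An ArrayDeque is a resizable array with no capacity restriction that can be operated on at both head and tail; it is likely to be faster than Stack when used as a stack and faster than LinkedList when used as a queue, but it is not thread-safe and does not support concurrent access by multiple threads. A put always operates on the last (tail) element of queue, and a take always operates on the first (head) element. On a put, a memory-channel put adds a positive number to queue and an overflow put adds a negative number, so memory and overflow correspond to different elements of queue (and can be read by category). On a take from memory, the first element (a positive number) shrinks toward 0 and is then removed; on a take from the overflow file, the first element (a negative number) grows toward 0 and is then removed. This forms a stream: puts keep appending to it and takes keep consuming from it, the stream is ordered, and its elements are simply the counts of events in memory and in the overflow file.

Here is the full DrainOrderQueue code: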

  public static class DrainOrderQueue {
    public ArrayDeque<MutableInteger> queue = new ArrayDeque<MutableInteger>(1000);

    public int totalPuts = 0;  // for debugging only
    private long overflowCounter = 0; // # of items in overflow channel

    public  String dump() {
      StringBuilder sb = new StringBuilder();

      sb.append("  [ ");
      for (MutableInteger i : queue) {
        sb.append(i.intValue());
        sb.append(" ");
      }
      sb.append("]");
      return  sb.toString();
    }

    public void putPrimary(Integer eventCount) {
      totalPuts += eventCount;
      // peekLast(): retrieves, but does not remove, the last element; returns null if the deque is empty
      if (  (queue.peekLast() == null) || queue.getLast().intValue() < 0) {
        queue.addLast(new MutableInteger(eventCount));
      } else {
        queue.getLast().add(eventCount); // getLast(): retrieves, but does not remove, the last element
      }
    }

    public void putFirstPrimary(Integer eventCount) {
      // peekFirst(): retrieves, but does not remove, the first element; returns null if the deque is empty
      if ( (queue.peekFirst() == null) || queue.getFirst().intValue() < 0) {
        queue.addFirst(new MutableInteger(eventCount));
      } else {
        queue.getFirst().add(eventCount); // getFirst(): retrieves, but does not remove, the first element
      }
    }

    public void putOverflow(Integer eventCount) {
      totalPuts += eventCount;
      if ( (queue.peekLast() == null) ||  queue.getLast().intValue() > 0) {
        queue.addLast(new MutableInteger(-eventCount));
      } else {
        queue.getLast().add(-eventCount);
      }
      overflowCounter += eventCount;
    }

    public void putFirstOverflow(Integer eventCount) {
      if ( (queue.peekFirst() == null) ||  queue.getFirst().intValue() > 0) {
        queue.addFirst(new MutableInteger(-eventCount));
      }  else {
        queue.getFirst().add(-eventCount);
      }
      overflowCounter += eventCount;
    }

    public int front() {
      return queue.getFirst().intValue();
    }

    public boolean isEmpty() {
      return queue.isEmpty();
    }

    public void takePrimary(int takeCount) {
      MutableInteger headValue = queue.getFirst();

      // this condition is optimization to avoid redundant conversions of
      // int -> Integer -> string in hot path
      if (headValue.intValue() < takeCount)  {
        throw new IllegalStateException("Cannot take " + takeCount +
                " from " + headValue.intValue() + " in DrainOrder Queue");
      }

      headValue.add(-takeCount);
      if (headValue.intValue() == 0) {
        queue.removeFirst();
      }
    }

    public void takeOverflow(int takeCount) {
      MutableInteger headValue = queue.getFirst();
      if(headValue.intValue() > -takeCount) {
        throw new IllegalStateException("Cannot take " + takeCount + " from "
                + headValue.intValue() + " in DrainOrder Queue head " );
      }

      headValue.add(takeCount);
      if (headValue.intValue() == 0) {
        queue.removeFirst();    // removeFirst(): retrieves and removes the first element
      }
      overflowCounter -= takeCount;
    }

  }

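Let's go through this class one method at a time:

(1) dump(): a simple method that returns the values of all elements in queue as a string.

(2) putPrimary(Integer eventCount): used when a put is committed; it is called from commitPutsToPrimary() and represents committing data to memory. It looks at the last element of queue. If there is none (no data) or its value is less than 0 (the element describes the overflow file), it creates a new element holding this transaction's event count and appends it to queue; otherwise the last element already describes events in memory, so the count is simply added to it.

(3) putFirstPrimary(Integer eventCount): called from doRollback() during a rollback, when the data in takeList is put back at the head of memQueue. It looks at the first element of queue. If there is none (no data) or its value is less than 0 (the element describes the overflow file), it creates a new element holding the event count of takeList and adds it at the head of queue; otherwise the first element already describes events in memory, so the count is simply added to it.

(4) putOverflow(Integer eventCount): used when a put is committed; it is called from commitPutsToOverflow_core and from start(). The latter sets the initial amount, the former means the memory channel is full and events spill to the file channel. It looks at the last element of queue. If there is none (no data) or its value is greater than 0 (the element describes memory), it creates a new element holding this transaction's event count as a negative number and appends it to queue; otherwise the last element already describes events in the overflow file, so the negative count is simply added to it.

(5) putFirstOverflow(Integer eventCount): called from doRollback() during a rollback, when the events counted in takeList are returned to the overflow file. It looks at the first element of queue. If there is none (no data) or its value is greater than 0 (the element describes memory), it creates a new element holding this transaction's event count as a negative number and adds it at the head of queue; otherwise the first element already describes events spilled to the file, so the negative count is simply added to it.

(6) front(): returns the value of the first element of queue.

(7) takePrimary(int takeCount): called from doTake() after a take, to reduce the in-memory event count by takeCount. It reads the first element (the number of events in the memory channel). If that value is smaller than takeCount, memory does not hold enough events; this should never happen, so an exception is thrown. Otherwise takeCount is subtracted from the element, and if the element reaches 0 it is removed from queue. Note that although the method accepts any takeCount, the source always calls it with 1, taking one event at a time.

(8) takeOverflow(int takeCount): called from doTake() after a take, to move the (negative) overflow count toward 0 by adding takeCount. It reads the first element (the negated number of events in the overflow file). If that value is greater than -takeCount, the file does not hold enough events; this should never happen, so an exception is thrown. Otherwise takeCount is added to the element, and if the element reaches 0 it is removed from queue. As with takePrimary, the source always calls it with 1.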
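To make the signed run-length idea concrete, here is a small trace. It is a simplified model written for this walkthrough, not the Flume class: DrainOrderSketch, append and takeOne are hypothetical names, and it stores plain Integer values instead of MutableInteger.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Simplified model of the bookkeeping described above; names are hypothetical.
final class DrainOrderSketch {
    // Positive entry = run of events in memory; negative entry = run in the overflow file.
    private final Deque<Integer> runs = new ArrayDeque<>();

    void putPrimary(int n) { append(n); }
    void putOverflow(int n) { append(-n); }

    // Extend the tail run when it has the same sign, otherwise start a new run.
    private void append(int signed) {
        if (signed == 0) {
            return; // nothing to record (a simplification of this sketch)
        }
        Integer last = runs.peekLast();
        if (last != null && (last > 0) == (signed > 0)) {
            runs.pollLast();
            runs.addLast(last + signed);
        } else {
            runs.addLast(signed);
        }
    }

    // Consume one event from the head run and report where it must be read from.
    String takeOne() {
        Integer head = runs.pollFirst();
        if (head == null) {
            throw new IllegalStateException("drain order is empty");
        }
        int remaining = head > 0 ? head - 1 : head + 1; // move the head toward 0
        if (remaining != 0) {
            runs.addFirst(remaining);
        }
        return head > 0 ? "memory" : "overflow";
    }

    @Override
    public String toString() { return runs.toString(); }

    public static void main(String[] args) {
        DrainOrderSketch q = new DrainOrderSketch();
        q.putPrimary(2);
        q.putPrimary(1);   // same sign as the tail: merged into one run
        q.putOverflow(2);  // sign flips: a new run starts
        q.putPrimary(1);
        System.out.println(q); // [3, -2, 1]
        for (int i = 0; i < 6; i++) {
            System.out.println(q.takeOne()); // memory x3, overflow x2, memory x1
        }
    }
}
```

The takes come back in exactly the order the puts were committed, which is the property the real DrainOrderQueue is there to provide.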

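IV. The channel's BasicTransactionSemantics implementation is SpillableMemoryTransaction; every channel must implement one as its reliability guarantee. This class also has a number of fields:

(1) BasicTransactionSemantics overflowTakeTx = null: the file channel's transaction (a FileBackedTransaction), used when a take reads events from the overflow file.

(2) BasicTransactionSemantics overflowPutTx = null: the file channel's transaction (a FileBackedTransaction), used when a put spills to the file on disk.

(3) boolean useOverflow = false: whether overflow is being used.

(4) boolean putCalled = false: marks a put transaction; set to true on the first put.

(5) boolean takeCalled = false: marks a take transaction; set to true on the first take.

(6) int largestTakeTxSize = 5000: not a constant; it can be reassigned.

(7) int largestPutTxSize = 5000: not a constant; it can be reassigned.

(8) Integer overflowPutCount = 0: the number of events this transaction has spilled.

(9) int putListByteCount = 0: the total size of all events in putList for this transaction; despite the name, the code accumulates slots rather than bytes.

(10) int takeListByteCount = 0: the total size of all events in takeList for this transaction, likewise counted in slots.

(11) int takeCount = 0: the number of takes in this transaction.

(12) ArrayDeque<Event> takeList: temporary holding area for events taken out of memQueue.

(13) ArrayDeque<Event> putList: temporary holding area for events before they are placed in memQueue.

As usual, four methods must be implemented:

A. doPut(Event event). The code: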

protected void doPut(Event event) throws InterruptedException {
  channelCounter.incrementEventPutAttemptCount();

  putCalled = true;    // this transaction is a put
  // number of slots this event occupies
  int eventByteSize = (int)Math.ceil(estimateEventSize(event)/ avgEventSize);
  if (!putList.offer(event)) {    // add to putList
    throw new ChannelFullException("Put queue in " + getName() +
            " channel's Transaction having capacity " + putList.size() +
            " full, consider reducing batch size of sources");
  }
  putListByteCount += eventByteSize;
}

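This method is simple; it is where a put begins. It sets putCalled to true to mark the transaction as a put, computes the number of slots the event occupies, adds the event to putList to await commit, and adds the event's slot count to putListByteCount.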
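The slot calculation in doPut can be tried in isolation. The sketch below is an illustration under one assumption that the article does not show: that estimateEventSize returns the body length in bytes and treats an empty body as 1, so every event occupies at least one slot. EventSlotSketch and slotsFor are hypothetical names.

```java
// Sketch of the per-event slot calculation; the size rule is an assumption (see above).
final class EventSlotSketch {

    // Assumed rule: body length in bytes, with a minimum of 1 for empty bodies.
    static long estimateEventSize(byte[] body) {
        return (body != null && body.length != 0) ? body.length : 1;
    }

    // Same arithmetic as doPut: ceil(estimatedSize / avgEventSize).
    static int slotsFor(byte[] body, double avgEventSize) {
        return (int) Math.ceil(estimateEventSize(body) / avgEventSize);
    }

    public static void main(String[] args) {
        System.out.println(slotsFor(new byte[1200], 500)); // 3
        System.out.println(slotsFor(new byte[500], 500));  // 1
        System.out.println(slotsFor(new byte[0], 500));    // 1
    }
}
```

Because of the ceiling, a 1200-byte event costs three slots rather than 2.4, so slot accounting always rounds in the conservative direction.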

B. doTake(). The code:

protected Event doTake() throws InterruptedException {
  channelCounter.incrementEventTakeAttemptCount();
  if (!totalStored.tryAcquire(overflowTimeout, TimeUnit.SECONDS)) {
    LOGGER.debug("Take is backing off as channel is empty.");
    return null;
  }
  boolean takeSuceeded = false;
  try {
    Event event;
    synchronized(queueLock) {
      int drainOrderTop = drainOrder.front();

      if (!takeCalled) {
        takeCalled = true;
        if (drainOrderTop < 0) {
          useOverflow = true;
          overflowTakeTx = getOverflowTx();        // obtain the file channel's transaction
          overflowTakeTx.begin();
        }
      }

      if (useOverflow) {
        if (drainOrderTop > 0) {
          LOGGER.debug("Take is switching to primary");
          return null;       // takes should now occur from primary channel
        }

        event = overflowTakeTx.take();
        ++takeCount;
        drainOrder.takeOverflow(1);
      } else {
        if (drainOrderTop < 0) {
          LOGGER.debug("Take is switching to overflow");
          return null;      // takes should now occur from overflow channel
        }

        // poll(): retrieves and removes the head of the deque; returns null if the deque is empty
        event = memQueue.poll();
        ++takeCount;
        drainOrder.takePrimary(1);
        Preconditions.checkNotNull(event, "Queue.poll returned NULL despite"
                + " semaphore signalling existence of entry");
      }
    }

    int eventByteSize = (int)Math.ceil(estimateEventSize(event)/ avgEventSize);
    if (!useOverflow) {
      // takeList is thd pvt, so no need to do this in synchronized block
      takeList.offer(event);
    }

    takeListByteCount += eventByteSize;
    takeSuceeded = true;
    return event;
  } finally {
    if(!takeSuceeded) {
      totalStored.release();
    }
  }
}

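Because ArrayDeque is not thread-safe (and memQueue is an ArrayDeque), a take must have exclusive access to memQueue while reading from it. Every access to memQueue has to be synchronized; here the lock object is queueLock.

doTake first checks whether totalStored has a permit, that is, whether the channel holds any data, waiting up to overflowTimeout seconds; then it enters the synchronized block and reads the head element of drainOrder. If takeCalled is false (its initial value), it is set to true, and the head value is checked: a negative value means the data is in the overflow file, so useOverflow is set to true, the file channel's FileBackedTransaction is obtained and assigned to overflowTakeTx, and begin() is called so that data can be read. If useOverflow is true, overflowTakeTx.take() fetches the event, takeCount is incremented, and drainOrder.takeOverflow(1) updates the overflow event count in the queue. If useOverflow is false, the data is in memory: memQueue.poll() fetches the event, takeCount is incremented, and drainOrder.takePrimary(1) updates the in-memory event count in the queue. In either mode, if the head of drainOrder has switched to the other kind of run, doTake returns null so that this transaction does not mix the two sources. Next the number of slots the event occupies is computed. If the event came from the memory channel, it is added to takeList. takeListByteCount is increased by the event's slot count, and finally the event is returned.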
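The permit handling around the take (a timed acquire up front, then giving the permit back in finally if the take did not succeed) can be sketched on its own. This is a reduced model covering only the in-memory queue; TimedTakeSketch and its methods are hypothetical names, the events are plain strings, and the timeout is in milliseconds to keep the demo quick.

```java
import java.util.ArrayDeque;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of the take path for an in-memory queue only.
final class TimedTakeSketch {
    private final Semaphore totalStored = new Semaphore(0); // one permit per stored event
    private final ArrayDeque<String> memQueue = new ArrayDeque<>();
    private final Object queueLock = new Object();

    void put(String event) {
        synchronized (queueLock) {
            memQueue.addLast(event);
        }
        totalStored.release(); // publish the permit only after the event is queued
    }

    /** Returns the next event, or null if none becomes available within timeoutMillis. */
    String take(long timeoutMillis) {
        try {
            if (!totalStored.tryAcquire(timeoutMillis, TimeUnit.MILLISECONDS)) {
                return null; // channel is empty: back off
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        }
        boolean takeSucceeded = false;
        try {
            String event;
            synchronized (queueLock) { // ArrayDeque is not thread-safe
                event = memQueue.poll();
            }
            if (event == null) {
                throw new IllegalStateException("permit acquired but queue was empty");
            }
            takeSucceeded = true;
            return event;
        } finally {
            if (!takeSucceeded) {
                totalStored.release(); // hand the permit back on failure
            }
        }
    }

    public static void main(String[] args) {
        TimedTakeSketch channel = new TimedTakeSketch();
        channel.put("e1");
        System.out.println(channel.take(10)); // e1
        System.out.println(channel.take(10)); // null
    }
}
```

The semaphore is what lets an empty channel answer a take with null after a bounded wait instead of blocking forever or spinning on the queue.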

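C. doCommit(): if putCalled is true it calls putCommit() to handle the put; if takeCalled is true it calls takeCommit() to handle the take.

1. putCommit() first sets the timeout according to whether overflowActivated is true. Overflow from the memory channel is governed by two semaphores, memQueRemaining and bytesRemaining: the former tracks the number of events, the latter tracks memory usage in slots. If either cannot supply the permits the transaction needs, overflow is triggered, setting overflowActivated = true and useOverflow = true.

If useOverflow is true, commitPutsToOverflow() handles the overflow. It obtains a file channel FileBackedTransaction and assigns it to overflowPutTx, calls begin() so that data can be put, and then puts each event in putList into the file channel through overflowPutTx.put(event). It then calls commitPutsToOverflow_core, which commits overflowPutTx and calls drainOrder.putOverflow(putList.size()) to update the overflow event count in queue; if the commit fails, it is retried at most once, after waiting overflowTimeout seconds. Back in commitPutsToOverflow, totalStored releases putList.size() permits and overflowPutCount grows by putList.size(). That completes the overflow path.

If useOverflow is false in putCommit(), the events belong in the memory channel and commitPutsToPrimary() handles them: it adds every event in putList to memQueue, calls drainOrder.putPrimary(putList.size()) to update the in-memory event count in queue, updates maxMemQueueSize, and has totalStored release putList.size() permits.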
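The two-semaphore decision in putCommit() can be sketched as below. This is a hedged approximation of the logic described above, not the Flume method: SpillDecisionSketch and mustSpill are hypothetical names, the timeout is in milliseconds, and the same wait is applied to both checks, which may differ from the real code.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: decide whether a put batch fits in memory or must spill.
final class SpillDecisionSketch {
    private final Semaphore memQueRemaining; // free event positions in memory
    private final Semaphore bytesRemaining;  // free slots in memory
    private boolean overflowActivated = false;

    SpillDecisionSketch(int memoryCapacity, int slotCapacity) {
        this.memQueRemaining = new Semaphore(memoryCapacity);
        this.bytesRemaining = new Semaphore(slotCapacity);
    }

    /** Returns true when the batch has to go to the overflow (file) channel. */
    boolean mustSpill(int eventCount, int slotCount, long timeoutMillis) {
        // Once overflow is active, do not wait for memory to free up.
        long wait = overflowActivated ? 0 : timeoutMillis;
        try {
            if (!memQueRemaining.tryAcquire(eventCount, wait, TimeUnit.MILLISECONDS)) {
                overflowActivated = true;
                return true; // not enough event positions
            }
            if (!bytesRemaining.tryAcquire(slotCount, wait, TimeUnit.MILLISECONDS)) {
                memQueRemaining.release(eventCount); // undo the first reservation
                overflowActivated = true;
                return true; // not enough slots
            }
            return false; // both reservations succeeded: commit to memory
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for capacity", e);
        }
    }

    public static void main(String[] args) {
        SpillDecisionSketch channel = new SpillDecisionSketch(2, 10);
        System.out.println(channel.mustSpill(2, 2, 0)); // false: fits in memory
        System.out.println(channel.mustSpill(1, 1, 0)); // true: no event positions left
    }
}
```

Releasing the event permits when the slot check fails matters: otherwise a batch that ends up on disk would still be charged against the memory queue.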

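2. takeCommit(): if overflowTakeTx is not null, the events were taken from the overflow file, so its commit method is called to commit the transaction. Next it computes the free fraction of the memory channel. The numerator has two parts: the number of events the memory channel can still store, plus takeCount; their sum divided by memoryCapacity (which must not be 0) is memoryPercentFree. If overflowActivated is true and memoryPercentFree is not less than overflowDeactivationThreshold, free memory has reached the threshold for stopping overflow, so overflowActivated is set to false. As a consequence, the next put that finds memory full waits longer before it spills. If the take read from the memory channel, memQueRemaining releases takeCount permits, freeing room for takeCount events, and bytesRemaining releases takeListByteCount permits, freeing takeListByteCount slots.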
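A worked example of the deactivation check may help. The sketch below follows the ratio described above; OverflowDeactivationSketch and its method names are hypothetical, and returning 0 when memoryCapacity is 0 is an assumption added to avoid dividing by zero.

```java
// Sketch of the free-fraction check from takeCommit(); names are hypothetical.
final class OverflowDeactivationSketch {

    // (events that can still be stored + takeCount) / memoryCapacity
    static double memoryPercentFree(int memoryCapacity, int memQueueSize, int takeCount) {
        return (memoryCapacity == 0) ? 0
                : (memoryCapacity - memQueueSize + takeCount) / (double) memoryCapacity;
    }

    static boolean shouldDeactivate(int memoryCapacity, int memQueueSize,
                                    int takeCount, double threshold) {
        return memoryPercentFree(memoryCapacity, memQueueSize, takeCount) >= threshold;
    }

    public static void main(String[] args) {
        // 10000-event memory channel, default threshold of 5%.
        System.out.println(shouldDeactivate(10000, 9600, 100, 0.05)); // true  (500 / 10000)
        System.out.println(shouldDeactivate(10000, 9700, 100, 0.05)); // false (400 / 10000)
    }
}
```

With the default threshold, overflow stays active until at least 5% of the memory channel is free again.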

D. doRollback(). The code:

protected void doRollback() {
  LOGGER.debug("Rollback() of " +
          (takeCalled ? " Take Tx" : (putCalled ? " Put Tx" : "Empty Tx")));

  if (putCalled) {
    if (overflowPutTx!=null) {
      overflowPutTx.rollback();
    }
    if (!useOverflow) {
      bytesRemaining.release(putListByteCount);
      putList.clear();
    }
    putListByteCount = 0;
  } else if (takeCalled) {
    synchronized(queueLock) {
      if (overflowTakeTx!=null) {
        overflowTakeTx.rollback();
      }
      if (useOverflow) {
        drainOrder.putFirstOverflow(takeCount);
      } else {
        int remainingCapacity = memoryCapacity - memQueue.size();
        Preconditions.checkState(remainingCapacity >= takeCount,
                "Not enough space in memory queue to rollback takes. This" +
                        " should never happen, please report");
        while (!takeList.isEmpty()) {
          memQueue.addFirst(takeList.removeLast());
        }
        drainOrder.putFirstPrimary(takeCount);
      }
    }
    totalStored.release(takeCount);
  } else {
    overflowTakeTx.rollback();
  }
  channelCounter.setChannelSize(memQueue.size() + drainOrder.overflowCounter);
}

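If putCalled is true, the transaction in progress is a put. If overflowPutTx is not null, the put was spilling, so overflowPutTx's rollback method is called. If overflow was not used, bytesRemaining releases putListByteCount permits, freeing that many slots, and putList is cleared; finally putListByteCount is reset to 0. If takeCalled is true, the transaction in progress is a take. If overflowTakeTx is not null, the take was reading from the overflow file, so overflowTakeTx's rollback method is called. When overflow is in use, drainOrder.putFirstOverflow(takeCount) restores the overflow event count in queue. When the memory channel is in use, the method computes how many more events the memory channel can hold; if that number is less than takeCount an error is raised, otherwise every event in takeList is returned to the head of memQueue and drainOrder.putFirstPrimary(takeCount) restores the in-memory event count in queue. In both cases totalStored then releases takeCount permits, because the channel as a whole holds takeCount events again.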
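The loop that returns takeList to memQueue removes from the tail of takeList and adds to the head of memQueue, which is what preserves the original order. A minimal sketch with string events (RollbackOrderSketch is a hypothetical name):

```java
import java.util.ArrayDeque;
import java.util.Arrays;

// Sketch: restoring taken events to the head of the queue in their original order.
final class RollbackOrderSketch {

    // Same loop shape as in doRollback(): last taken goes back first.
    static void rollback(ArrayDeque<String> memQueue, ArrayDeque<String> takeList) {
        while (!takeList.isEmpty()) {
            memQueue.addFirst(takeList.removeLast());
        }
    }

    public static void main(String[] args) {
        ArrayDeque<String> memQueue = new ArrayDeque<>(Arrays.asList("a", "b", "c", "d"));
        ArrayDeque<String> takeList = new ArrayDeque<>();

        takeList.offer(memQueue.poll()); // take "a"
        takeList.offer(memQueue.poll()); // take "b"
        System.out.println(memQueue);    // [c, d]

        rollback(memQueue, takeList);
        System.out.println(memQueue);    // [a, b, c, d]
    }
}
```

Draining takeList from its head instead would put the events back reversed ("b" before "a"), so the next take would see them out of order.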

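V. The stop method calls the stop method of the parent class, the file channel.

VI. The createTransaction() method simply returns a new SpillableMemoryTransaction object. This means takes and puts can run concurrently, although anything touching memQueue still has to be synchronized.

That completes the tour of this new channel. Overall, SpillableMemoryChannel is a carefully designed channel that combines advantages of Flume's built-in file channel and memory channel and adds one more option; choose the channel that fits your needs.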


Time: 2024-10-06 04:43:41
