继续上篇的分析,接下来是第二个问题”Service的自动重启问题“
(一)、Service的生命周期
(二)、Service的自动重启问题
这里要说服务的自动重启问题,这个问题其实很简单,只有两个关键的方法。代码如下:
这个方法在ActivityThread的一系列针对服务的handle方法中都有调用到ActivityManagerSerice的serviceDoneExecuting()方法,但是跟重启有关的只有handleServiceArgs(),因为只有在这里才有一个叫res的参数会起作用。
private void handleServiceArgs(ServiceArgsData data) { Service s = mServices.get(data.token); if (s != null) { try { if (data.args != null) { data.args.setExtrasClassLoader(s.getClassLoader()); } int res; if (!data.taskRemoved) { //就是回调了用户服务的onStartCommand生命周期,这个做应用的都知道了, //这里可以通过设置其返回值来控制自己的服务是否允许被重新启动,顺理成章的这个值就是res res = s.onStartCommand(data.args, data.flags, data.startId); } else { s.onTaskRemoved(data.args); res = Service.START_TASK_REMOVED_COMPLETE; } ............... try { //看看系统用这个值都干了一些什么导致有这个特性 ActivityManagerNative.getDefault().serviceDoneExecuting( data.token, 1, data.startId, res); } catch (RemoteException e) { // nothing to do. } ensureJitEnabled(); } .................. } }
下面就是这个特性的关键代码,里面的注释已经写的很全了,关键其作用的就是stopIfKilled这个标志。
void serviceDoneExecutingLocked(ServiceRecord r, int type, int startId, int res) { boolean inDestroying = mDestroyingServices.contains(r); if (r != null) { if (type == 1) { // This is a call from a service start... take care of // book-keeping. r.callStart = true; switch (res) { case Service.START_STICKY_COMPATIBILITY: case Service.START_STICKY: { // We are done with the associated start arguments. r.findDeliveredStart(startId, true); // Don't stop if killed. r.stopIfKilled = false; break; } case Service.START_NOT_STICKY: { // We are done with the associated start arguments. r.findDeliveredStart(startId, true); if (r.getLastStartId() == startId) { // There is no more work, and this service // doesn't want to hang around if killed. r.stopIfKilled = true; } break; } case Service.START_REDELIVER_INTENT: { // We'll keep this item until they explicitly // call stop for it, but keep track of the fact // that it was delivered. ServiceRecord.StartItem si = r.findDeliveredStart(startId, false); if (si != null) { si.deliveryCount = 0; si.doneExecutingCount++; // Don't stop if killed. r.stopIfKilled = true; } break; } case Service.START_TASK_REMOVED_COMPLETE: { // Special processing for onTaskRemoved(). Don't // impact normal onStartCommand() processing. r.findDeliveredStart(startId, true); break; } default: throw new IllegalArgumentException( "Unknown service start result: " + res); } if (res == Service.START_STICKY_COMPATIBILITY) { r.callStart = false; } } final long origId = Binder.clearCallingIdentity(); serviceDoneExecutingLocked(r, inDestroying, inDestroying); Binder.restoreCallingIdentity(origId); } else { Slog.w(TAG, "Done executing unknown service from pid " + Binder.getCallingPid()); } }
那么这个标志位又是在哪些情况下使得服务可以重启的呢?这种场景入口很多啊,比如系统清理进程等,总之就是APP Died的情况下,入口方法不列举了,最后都会执行到这来:
final void killServicesLocked(ProcessRecord app, boolean allowRestart) { // Report disconnected services. if (false) { // XXX we are letting the client link to the service for // death notifications. if (app.services.size() > 0) { Iterator<ServiceRecord> it = app.services.iterator(); while (it.hasNext()) { ServiceRecord r = it.next(); for (int conni=r.connections.size()-1; conni>=0; conni--) { ArrayList<ConnectionRecord> cl = r.connections.valueAt(conni); for (int i=0; i<cl.size(); i++) { ConnectionRecord c = cl.get(i); if (c.binding.client != app) { try { //c.conn.connected(r.className, null); } catch (Exception e) { // todo: this should be asynchronous! Slog.w(TAG, "Exception thrown disconnected servce " + r.shortName + " from app " + app.processName, e); } } } } } } } // First clear app state from services. for (int i=app.services.size()-1; i>=0; i--) { ServiceRecord sr = app.services.valueAt(i); synchronized (sr.stats.getBatteryStats()) { sr.stats.stopLaunchedLocked(); } if (sr.app != null) { sr.app.services.remove(sr); } sr.app = null; sr.isolatedProc = null; sr.executeNesting = 0; sr.forceClearTracker(); if (mDestroyingServices.remove(sr)) { if (DEBUG_SERVICE) Slog.v(TAG, "killServices remove destroying " + sr); } final int numClients = sr.bindings.size(); for (int bindingi=numClients-1; bindingi>=0; bindingi--) { IntentBindRecord b = sr.bindings.valueAt(bindingi); if (DEBUG_SERVICE) Slog.v(TAG, "Killing binding " + b + ": shouldUnbind=" + b.hasBound); b.binder = null; b.requested = b.received = b.hasBound = false; } } // Clean up any connections this application has to other services. for (int i=app.connections.size()-1; i>=0; i--) { ConnectionRecord r = app.connections.valueAt(i); removeConnectionLocked(r, app, null); } app.connections.clear(); ServiceMap smap = getServiceMap(app.userId); // Now do remaining service cleanup. for (int i=app.services.size()-1; i>=0; i--) { ServiceRecord sr = app.services.valueAt(i); // Sanity check: if the service listed for the app is not one // we actually are maintaining, drop it. if (smap.mServicesByName.get(sr.name) != sr) { ServiceRecord cur = smap.mServicesByName.get(sr.name); Slog.wtf(TAG, "Service " + sr + " in process " + app + " not same as in map: " + cur); app.services.removeAt(i); continue; } // Any services running in the application may need to be placed // back in the pending list. // 这里还是分很多种情况的 // 允许重启时,如果当前服务所在进程crash超过两次,并且不是persistent的进程就结束不会重启了 if (allowRestart && sr.crashCount >= 2 && (sr.serviceInfo.applicationInfo.flags &ApplicationInfo.FLAG_PERSISTENT) == 0) { Slog.w(TAG, "Service crashed " + sr.crashCount + " times, stopping: " + sr); EventLog.writeEvent(EventLogTags.AM_SERVICE_CRASHED_TOO_MUCH, sr.userId, sr.crashCount, sr.shortName, app.pid); bringDownServiceLocked(sr); } else if (!allowRestart) { // 不允许重启直接挂掉 bringDownServiceLocked(sr); } else { // boolean canceled = scheduleServiceRestartLocked(sr, true); // Should the service remain running? Note that in the // extreme case of so many attempts to deliver a command // that it failed we also will stop it here. if (sr.startRequested && (sr.stopIfKilled || canceled)) { if (sr.pendingStarts.size() == 0) { sr.startRequested = false; if (sr.tracker != null) { sr.tracker.setStarted(false, mAm.mProcessStats.getMemFactorLocked(), SystemClock.uptimeMillis()); } if (!sr.hasAutoCreateConnections()) { // Whoops, no reason to restart! bringDownServiceLocked(sr); } } } } } if (!allowRestart) { app.services.clear(); // Make sure there are no more restarting services for this process. for (int i=mRestartingServices.size()-1; i>=0; i--) { ServiceRecord r = mRestartingServices.get(i); if (r.processName.equals(app.processName) && r.serviceInfo.applicationInfo.uid == app.info.uid) { mRestartingServices.remove(i); clearRestartingIfNeededLocked(r); } } for (int i=mPendingServices.size()-1; i>=0; i--) { ServiceRecord r = mPendingServices.get(i); if (r.processName.equals(app.processName) && r.serviceInfo.applicationInfo.uid == app.info.uid) { mPendingServices.remove(i); } } } // Make sure we have no more records on the stopping list. int i = mDestroyingServices.size(); while (i > 0) { i--; ServiceRecord sr = mDestroyingServices.get(i); if (sr.app == app) { sr.forceClearTracker(); mDestroyingServices.remove(i); if (DEBUG_SERVICE) Slog.v(TAG, "killServices remove destroying " + sr); } } app.executingServices.clear(); }
private final boolean scheduleServiceRestartLocked(ServiceRecord r, boolean allowCancel) { boolean canceled = false; ServiceMap smap = getServiceMap(r.userId); if (smap.mServicesByName.get(r.name) != r) { ServiceRecord cur = smap.mServicesByName.get(r.name); Slog.wtf(TAG, "Attempting to schedule restart of " + r + " when found in map: " + cur); return false; } final long now = SystemClock.uptimeMillis(); if ((r.serviceInfo.applicationInfo.flags &ApplicationInfo.FLAG_PERSISTENT) == 0) { long minDuration = SERVICE_RESTART_DURATION; long resetTime = SERVICE_RESET_RUN_DURATION; // Any delivered but not yet finished starts should be put back // on the pending list. final int N = r.deliveredStarts.size(); if (N > 0) { for (int i=N-1; i>=0; i--) { ServiceRecord.StartItem si = r.deliveredStarts.get(i); si.removeUriPermissionsLocked(); //注意了,这里的canceled如果为true还是需要结束服务的 //还要关注一下delivery的上限和doneExecuting的上限 if (si.intent == null) { // We'll generate this again if needed. } else if (!allowCancel || (si.deliveryCount < ServiceRecord.MAX_DELIVERY_COUNT && si.doneExecutingCount < ServiceRecord.MAX_DONE_EXECUTING_COUNT)) { //重新在pendingStart中添加si,所以会在下次执行时重新带入intent进去 r.pendingStarts.add(0, si); long dur = SystemClock.uptimeMillis() - si.deliveredTime; dur *= 2; if (minDuration < dur) minDuration = dur; if (resetTime < dur) resetTime = dur; } else { Slog.w(TAG, "Canceling start item " + si.intent + " in service " + r.name); canceled = true; } } r.deliveredStarts.clear(); } r.totalRestartCount++; if (r.restartDelay == 0) { r.restartCount++; r.restartDelay = minDuration; } else { // If it has been a "reasonably long time" since the service // was started, then reset our restart duration back to // the beginning, so we don't infinitely increase the duration // on a service that just occasionally gets killed (which is // a normal case, due to process being killed to reclaim memory). if (now > (r.restartTime+resetTime)) { r.restartCount = 1; r.restartDelay = minDuration; } else { r.restartDelay *= SERVICE_RESTART_DURATION_FACTOR; if (r.restartDelay < minDuration) { r.restartDelay = minDuration; } } } r.nextRestartTime = now + r.restartDelay; // Make sure that we don't end up restarting a bunch of services // all at the same time. boolean repeat; do { repeat = false; for (int i=mRestartingServices.size()-1; i>=0; i--) { ServiceRecord r2 = mRestartingServices.get(i); if (r2 != r && r.nextRestartTime >= (r2.nextRestartTime-SERVICE_MIN_RESTART_TIME_BETWEEN) && r.nextRestartTime < (r2.nextRestartTime+SERVICE_MIN_RESTART_TIME_BETWEEN)) { r.nextRestartTime = r2.nextRestartTime + SERVICE_MIN_RESTART_TIME_BETWEEN; r.restartDelay = r.nextRestartTime - now; repeat = true; break; } } } while (repeat); } else { // Persistent processes are immediately restarted, so there is no // reason to hold of on restarting their services. r.totalRestartCount++; r.restartCount = 0; r.restartDelay = 0; r.nextRestartTime = now; } if (!mRestartingServices.contains(r)) { r.createdFromFg = false; mRestartingServices.add(r); r.makeRestarting(mAm.mProcessStats.getMemFactorLocked(), now); } r.cancelNotification(); mAm.mHandler.removeCallbacks(r.restarter); // 最关键的操作在这里,忘ActivityManagerService的handler里面post一个重启的Runnable // 这个东西前面启动过程创建ServiceRecord时有的,很简单就是一个ServiceRestarter,它里面保存了这个ServiceRecord本身 // 重启的时候根据这个record就可以直接启动服务了 mAm.mHandler.postAtTime(r.restarter, r.nextRestartTime); r.nextRestartTime = SystemClock.uptimeMillis() + r.restartDelay; Slog.w(TAG, "Scheduling restart of crashed service " + r.shortName + " in " + r.restartDelay + "ms"); EventLog.writeEvent(EventLogTags.AM_SCHEDULE_SERVICE_RESTART, r.userId, r.shortName, r.restartDelay); return canceled; }
private class ServiceRestarter implements Runnable { private ServiceRecord mService; void setService(ServiceRecord service) { mService = service; } public void run() { synchronized(mAm) { //后面的事情就顺利成章了。 performServiceRestartLocked(mService); } } }
整个这个过程中,有好几个参数控制着是否需要重启,也定了很多参数的上限等等,这里单独列出来解释一下。
ServiceRecord.crashCount、ServiceRecord.StartItem.deliveryCount、ServiceRecord.StartItem.doneExecutingCount
crashCount顾名思义啊,就是crash的次数,这个在handleAppCrashLocked()中自增的,很明显每crash一次就会自增,没什么好说的
deliveryCount也很好理解,他是属于StartItem的,所以表示的是启动信息,是执行onStartCommand方法的次数,也就是外部startService的次数
doneExecutingCount跟deliveryCount还很有关联,类似的也是说的这个服务执行的次数,那么它们有什么区别呢?
还有两个标志位Service.START_FLAG_RETRY、Service.START_FLAG_REDELIVERY要一起看。这个在ActivesService.sendServiceArgsLocked()中可以看到。意思就是说这个服务是直接重启还是重新发送发送请求。
它们还是互斥的,这点在serviceDoneExecutingLocked()方法的START_REDELIVER_INTENT分支处理中可以得到结论,总的来说就是说onStartCommand返回START_STICKY是允许重启,而START_REDELIVER_INTENT会重新将上次的intent请求发送出去,服务中会重新接收到这个。
最后将在下篇论第三个问题
(三)、Service与其客户端的绑定如何实现,即跨进程调用问题。