An Analysis of How BlockCanary Works on Android


I. Background

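In Android development, apps often stutter, and such jank can be hard to reproduce and pin down. It may depend on the device's ROM or on the system environment at the time, and dumping and analyzing the relevant logs tends to take a lot of time. BlockCanary addresses this problem: it detects jank while the app is running with almost no intrusion into the code, and it dumps logs only when a block actually occurs.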

II. Overview

BlockCanary monitors the main thread in a fully transparent way and outputs useful information that helps analyze and optimize the app. Its features:

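  • Non-intrusive: only a few lines of integration code are needed, with no instrumentation points and no disruption to the existing code structure.

  • Precise: the output helps locate the problem (down to the line), with no need to dig through Logcat.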

III. How It Works

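Developers familiar with Android's message mechanism, Looper and Handler, know that an application has exactly one main Looper. Work on the main thread is turned into a Message and inserted into the MessageQueue; the Looper's loop() method keeps taking messages out of the MessageQueue and hands each one to its Handler's dispatchMessage, which ultimately triggers the Handler's handleMessage method. When we say the main thread is janky, it usually means a dispatchMessage call ran slowly. So it is enough to record the timestamps just before and just after dispatchMessage: their difference is the execution time, and if it exceeds the block threshold we configured, some operation took too long and can be targeted for optimization.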

public final class Looper {
    private static Looper sMainLooper;  // the main thread's Looper

    final MessageQueue mQueue; // message queue
    final Thread mThread;

    private Printer mLogging;
    private long mTraceTag;

    ...
    public static void prepare() {
        prepare(true);
    }

    private static void prepare(boolean quitAllowed) {
        if (sThreadLocal.get() != null) {
            throw new RuntimeException("Only one Looper may be created per thread");
        }
        sThreadLocal.set(new Looper(quitAllowed));
    }

    // Initializes the main thread's Looper
    public static void prepareMainLooper() {
        prepare(false);
        synchronized (Looper.class) {
            if (sMainLooper != null) {
                throw new IllegalStateException("The main Looper has already been prepared.");
            }
            sMainLooper = myLooper();
        }
    }

    // Returns the main thread's Looper instance
    public static Looper getMainLooper() {
        synchronized (Looper.class) {
            return sMainLooper;
        }
    }
    ...
}

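The app's main thread has only this one Looper; no matter how many Handlers there are, they all end up here.

Looking more closely, the Looper's loop() method contains this section: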

 public static void loop() {
        final Looper me = myLooper();
        ...
        // Loop forever, fetching messages
        for (;;) {
            Message msg = queue.next(); // might block
            if (msg == null) {
                return;
            }
            ...  
            // Note: the Printer object logging calls println right before dispatchMessage(msg) runs
            final Printer logging = me.mLogging;
            if (logging != null) {
                logging.println(">>>>> Dispatching to " + msg.target + " " +
                        msg.callback + ": " + msg.what);
            }
            ...
            try {
                // Note: this is the actual message dispatch, dispatchMessage
                msg.target.dispatchMessage(msg);
                dispatchEnd = needEndTime ? SystemClock.uptimeMillis() : 0;
            } finally {
                if (traceTag != 0) {
                    Trace.traceEnd(traceTag);
                }
            }
            ... 
            // Note: the Printer object logging calls println again right after dispatchMessage(msg) returns
            if (logging != null) {
                logging.println("<<<<< Finished to " + msg.target + " " + msg.callback);
            }
            ...
        }
    }

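BlockCanary works by hooking the two println calls that logging makes before and after dispatchMessage. All that is needed is to supply a custom Printer from outside. It is invoked before and after every message is handled, and if the main thread is stuck, it is stuck inside dispatchMessage.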
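The pairing of the two println calls can be tried out on a plain JVM. The following is only a sketch under stated assumptions: `Printer` here is a local stand-in for android.util.Printer, and `PrinterHookDemo`, `BlockDetectingPrinter` and `loop` are hypothetical names that imitate Looper.loop() rather than reproduce it.

```java
import java.util.ArrayList;
import java.util.List;

/** Plain-JVM simulation of hooking the log lines printed around each dispatch. */
final class PrinterHookDemo {

    /** Local stand-in for android.util.Printer (assumption: same single-method shape). */
    interface Printer {
        void println(String x);
    }

    /** Pairs the "before" and "after" lines and records dispatches slower than a threshold. */
    static final class BlockDetectingPrinter implements Printer {
        private final long thresholdMillis;
        private final List<Long> blockedDurations = new ArrayList<>();
        private boolean dispatching = false;
        private long startNanos;

        BlockDetectingPrinter(long thresholdMillis) {
            this.thresholdMillis = thresholdMillis;
        }

        @Override
        public void println(String x) {
            if (!dispatching) {
                // First call of the pair: a message is about to be dispatched.
                startNanos = System.nanoTime();
                dispatching = true;
            } else {
                // Second call of the pair: the dispatch has just finished.
                long elapsedMillis = (System.nanoTime() - startNanos) / 1_000_000L;
                dispatching = false;
                if (elapsedMillis > thresholdMillis) {
                    blockedDurations.add(elapsedMillis);
                }
            }
        }

        List<Long> blockedDurations() {
            return blockedDurations;
        }
    }

    /** Toy loop that logs around each task, the way loop() logs around dispatchMessage. */
    static void loop(List<Runnable> messages, Printer logging) {
        for (Runnable msg : messages) {
            if (logging != null) {
                logging.println(">>>>> Dispatching");
            }
            msg.run();
            if (logging != null) {
                logging.println("<<<<< Finished");
            }
        }
    }

    static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        BlockDetectingPrinter printer = new BlockDetectingPrinter(100);
        List<Runnable> messages = new ArrayList<>();
        messages.add(() -> { });                // fast message
        messages.add(() -> sleepQuietly(300));  // slow message, above the threshold
        loop(messages, printer);
        System.out.println("blocked dispatches: " + printer.blockedDurations().size());
    }
}
```

The printer never inspects the log text; it relies only on the calls arriving in before/after pairs, which is the same assumption the LooperMonitor shown later makes.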

IV. Usage

  • 1. Add the dependency
dependencies {
    // Most common way to integrate
    compile 'com.github.markzhai:blockcanary-android:1.5.0'
    // Use only in debug builds
    // debugCompile 'com.github.markzhai:blockcanary-android:1.5.0'
    // Use only in release builds
    // releaseCompile 'com.github.markzhai:blockcanary-no-op:1.5.0'
}
  • 2. Register it in your Android Application class
BlockCanary.install(this, new AppBlockCanaryContext()).start();
  • 3. Implement BlockCanaryContext to configure BlockCanary
  public class AppBlockCanaryContext extends BlockCanaryContext {

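      /**
       * Implement in your project.
       * Qualifier that identifies this installation; version + flavor is recommended.
       */
      public String provideQualifier() {
          return "unknown"; // placeholder; e.g. version name + flavor
      }

      /**
       * Implement in your project.
       * Unique uid; the user_id is recommended.
       */
      public String provideUid() {
          return "uid"; // placeholder; e.g. the current user's id
      }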

      /**
       * Network type, e.g. 2G, 3G, 4G, wifi, etc.
       */
      public String provideNetworkType() {
          return "unknown";
      }

      /**
       * Monitoring duration; once it has elapsed, BlockCanary stops monitoring.
       */
      public int provideMonitorDuration() {
          return -1;
      }

      /**
       * Block threshold in milliseconds; a dispatch that takes longer counts as a block. The default is 3000 ms; this example uses 1000 ms.
       */
      public int provideBlockThreshold() {
          return 1000;
      }

      /**
       * Stack dump interval; BlockCanary dumps the main thread's stack at this interval.
       */
      public int provideDumpInterval() {
          return provideBlockThreshold();
      }

      /**
       * Path on the SD card where BlockCanary logs are saved.
       */
      public String providePath() {
          return "/blockcanary/";
      }

      /**
       * Whether to display a notification.
       */
      public boolean displayNotification() {
          return true;
      }

      /**
       * Compress the log files into dest; returning false means they were not compressed.
       */
      public boolean zip(File[] src, File dest) {
          return false;
      }

      /**
       * Upload the zipped log file.
       */
      public void upload(File zippedFile) {
          throw new UnsupportedOperationException();
      }
   }

V. Source Code Walkthrough

  • 1. To use BlockCanary, call install in the Application's onCreate method to obtain a BlockCanary instance, then call start on it.
  public class DemoApplication extends Application {

      private static Context sContext;

      @Override
      public void onCreate() {
          super.onCreate();
          sContext = this;
          BlockCanary.install(this, new AppBlockCanaryContext()).start();
      }

      public static Context getAppContext() {
          return sContext;
      }
  }
  • 2. BlockCanary is implemented as a singleton.
  public final class BlockCanary {
      private static BlockCanary sInstance;
      private BlockCanaryInternals mBlockCanaryCore;
      private boolean mMonitorStarted = false;

      private BlockCanary() {
          BlockCanaryInternals.setContext(BlockCanaryContext.get());
          // Obtain the BlockCanaryInternals instance
          mBlockCanaryCore = BlockCanaryInternals.getInstance();
          // Register the BlockCanaryContext as a block interceptor
          mBlockCanaryCore.addBlockInterceptor(BlockCanaryContext.get());
          if (!BlockCanaryContext.get().displayNotification()) {
              return;
          }
          // Register DisplayService as a block interceptor; it shows a Notification when a block is intercepted
          mBlockCanaryCore.addBlockInterceptor(new DisplayService());
      }

      public static BlockCanary install(Context context, BlockCanaryContext blockCanaryContext) {
          BlockCanaryContext.init(context, blockCanaryContext);
          setEnabled(context, DisplayActivity.class, BlockCanaryContext.get().displayNotification());
          return get();
      }

      public static BlockCanary get() {
          if (sInstance == null) {
              synchronized (BlockCanary.class) {
                  if (sInstance == null) {
                      sInstance = new BlockCanary();
                  }
              }
          }
          return sInstance;
      }

      // Start monitoring
      public void start() {
          if (!mMonitorStarted) {
              mMonitorStarted = true;
              // Note: this sets the main Looper's Printer, i.e. the logging object used in loop() above.
              // mBlockCanaryCore.monitor is a LooperMonitor, which implements Printer and overrides println.
              Looper.getMainLooper().setMessageLogging(mBlockCanaryCore.monitor);
          }
      }

      // Stop monitoring
      public void stop() {
          if (mMonitorStarted) {
              mMonitorStarted = false;
              // Passing null means loop() no longer calls logging.println.
              Looper.getMainLooper().setMessageLogging(null);
              mBlockCanaryCore.stackSampler.stop();
              mBlockCanaryCore.cpuSampler.stop();
          }
      }
  }    
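The excerpt shows sInstance declared without volatile. On the JVM, double-checked locking is generally written with a volatile field so that other threads cannot observe a partially constructed instance. A minimal sketch of that form follows, using a hypothetical `MonitorHolder` class rather than BlockCanary itself:

```java
/** Double-checked locking with a volatile field; MonitorHolder is a hypothetical name. */
final class MonitorHolder {
    // volatile guarantees that a thread reading a non-null reference sees a fully constructed object.
    private static volatile MonitorHolder sInstance;

    private MonitorHolder() { }

    static MonitorHolder get() {
        MonitorHolder local = sInstance;   // single volatile read on the fast path
        if (local == null) {
            synchronized (MonitorHolder.class) {
                local = sInstance;         // re-check under the lock
                if (local == null) {
                    local = new MonitorHolder();
                    sInstance = local;
                }
            }
        }
        return local;
    }

    public static void main(String[] args) {
        System.out.println(get() == get());
    }
}
```

Since install is normally called once from Application.onCreate on the main thread, this may matter little in practice; it is a note on the idiom, not a reported defect.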
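  • 3. The monitor field of BlockCanaryInternals is of type LooperMonitor, which implements the Printer interface. As the principle section showed, once a logging Printer is set with Looper.getMainLooper().setMessageLogging(), its println method is called around every message the main thread handles, which indirectly hooks the println calls in loop():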
      @Override
      public void println(String x) {
          if (mStopWhenDebugging && Debug.isDebuggerConnected()) {
              return;
          }
          // mPrintingStarted is false by default
          if (!mPrintingStarted) {
              // Record the wall-clock start time
              mStartTimestamp = System.currentTimeMillis();
              // Record the current thread's running time at the start
              mStartThreadTimestamp = SystemClock.currentThreadTimeMillis();
              // Mark the dispatch as started
              mPrintingStarted = true;
              // Start dumping
              startDump();
          } else {
              // Record the wall-clock end time
              final long endTime = System.currentTimeMillis();
              mPrintingStarted = false;
              // If the dispatch blocked, notify the block event
              if (isBlock(endTime)) {
                  notifyBlockEvent(endTime);
              }
              // Stop dumping
              stopDump();
          }
      }

      // Decide whether the dispatch counts as a block
      private boolean isBlock(long endTime) {
          // It is a block if end time minus start time exceeds the configured threshold
          return endTime - mStartTimestamp > mBlockThresholdMillis;
      }
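println records two clocks: wall-clock time and the thread's own running time. One plausible reading is that comparing them helps tell a main thread that was busy computing from one that was waiting (on I/O, a lock, or the scheduler). The sketch below illustrates the difference on a plain JVM; it assumes `ThreadMXBean.getCurrentThreadCpuTime()` as a rough analogue of Android's SystemClock.currentThreadTimeMillis(), and `WallVsThreadTime` is a hypothetical name.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

/** Compares wall-clock time with the calling thread's CPU time for one task. */
final class WallVsThreadTime {

    /** Returns {wallMillis, cpuMillis} spent running the task on the current thread. */
    static long[] measure(Runnable task) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        long wallStart = System.nanoTime();
        long cpuStart = threads.getCurrentThreadCpuTime();
        task.run();
        long cpuMillis = (threads.getCurrentThreadCpuTime() - cpuStart) / 1_000_000L;
        long wallMillis = (System.nanoTime() - wallStart) / 1_000_000L;
        return new long[] {wallMillis, cpuMillis};
    }

    static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        // Sleeping consumes wall-clock time but almost no CPU time on this thread.
        long[] waiting = measure(() -> sleepQuietly(200));
        System.out.println("wall=" + waiting[0] + "ms cpu=" + waiting[1] + "ms");
    }
}
```

A block whose wall-clock cost is far larger than its thread time suggests the thread was mostly waiting rather than executing.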
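  • 4. When a message starts, startDump is called to begin sampling stack and CPU information. When the message finishes, the monitor checks whether the configured block threshold was exceeded; if so, it calls back into BlockCanaryInternals through a listener interface (passed in through the LooperMonitor constructor) on a HandlerThread provided by HandlerThreadFactory. Then stopDump is called to stop sampling.
      private void notifyBlockEvent(final long endTime) {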
          final long startTime = mStartTimestamp;
          final long startThreadTime = mStartThreadTimestamp;
          final long endThreadTime = SystemClock.currentThreadTimeMillis();
          // On a block, call back on the HandlerThread provided by HandlerThreadFactory
          HandlerThreadFactory.getWriteLogThreadHandler().post(new Runnable() {
              @Override
              public void run() {
                  // Runs on the HandlerThread and calls back into BlockCanaryInternals
                  mBlockListener.onBlockEvent(startTime, endTime, startThreadTime, endThreadTime);
              }
          });
      }

      // Start dumping stack and CPU information when a message begins
      private void startDump() {
          // Start stack sampling
          if (null != BlockCanaryInternals.getInstance().stackSampler) {
              BlockCanaryInternals.getInstance().stackSampler.start();
          }
          // Start CPU sampling
          if (null != BlockCanaryInternals.getInstance().cpuSampler) {
              BlockCanaryInternals.getInstance().cpuSampler.start();
          }
      }

      private void stopDump() {
          // Stop stack sampling
          if (null != BlockCanaryInternals.getInstance().stackSampler) {
              BlockCanaryInternals.getInstance().stackSampler.stop();
          }
          // Stop CPU sampling
          if (null != BlockCanaryInternals.getInstance().cpuSampler) {
              BlockCanaryInternals.getInstance().cpuSampler.stop();
          }
      }
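  • 5. Next, look at the abstract class AbstractSampler, which CpuSampler and StackSampler extend. It mainly handles the start and stop methods and holds a Runnable that calls the abstract method doSample; the Runnable runs on a HandlerThread provided by HandlerThreadFactory.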
  // Abstract sampler; both CPU sampling (CpuSampler) and stack sampling (StackSampler) extend AbstractSampler
  abstract class AbstractSampler {
      // Default sampling interval
      private static final int DEFAULT_SAMPLE_INTERVAL = 300;
      // Whether sampling should continue
      protected AtomicBoolean mShouldSample = new AtomicBoolean(false);
      // Sampling interval
      protected long mSampleInterval;
      // Runnable that takes one sample and re-posts itself
      private Runnable mRunnable = new Runnable() {
          @Override
          public void run() {
              // Take a sample
              doSample();

              if (mShouldSample.get()) {
                  HandlerThreadFactory.getTimerThreadHandler()
                          .postDelayed(mRunnable, mSampleInterval);
              }
          }
      };

      public AbstractSampler(long sampleInterval) {
          // If the caller passes 0, fall back to DEFAULT_SAMPLE_INTERVAL
          if (0 == sampleInterval) {
              sampleInterval = DEFAULT_SAMPLE_INTERVAL;
          }
          // Sampling interval
          mSampleInterval = sampleInterval;
      }

      // Start sampling
      public void start() {
          // Already sampling: return without starting again
          if (mShouldSample.get()) {
              return;
          }
          // Mark sampling as in progress
          mShouldSample.set(true);
          // Remove any previously posted mRunnable first
          HandlerThreadFactory.getTimerThreadHandler().removeCallbacks(mRunnable);
          // Post the sampling runnable to the HandlerThread, where doSample will run
          HandlerThreadFactory.getTimerThreadHandler().postDelayed(mRunnable,
                  BlockCanaryInternals.getInstance().getSampleDelay());
      }

      // Stop sampling
      public void stop() {
          // Not sampling: nothing to stop
          if (!mShouldSample.get()) {
              return;
          }
          // Still sampling: set mShouldSample to false
          mShouldSample.set(false);
          // and remove the pending mRunnable task
          HandlerThreadFactory.getTimerThreadHandler().removeCallbacks(mRunnable);
      }

      abstract void doSample();
  }
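The self-rescheduling pattern in AbstractSampler can be reproduced off-device. This is a sketch under stated assumptions: a `ScheduledExecutorService` stands in for the HandlerThread's Handler, cancelling a `ScheduledFuture` stands in for removeCallbacks, and `PeriodicSamplerSketch` with its counter-based doSample is hypothetical.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

/** Sampler that takes a sample, then re-schedules itself while sampling is enabled. */
final class PeriodicSamplerSketch {
    private final ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
    private final long intervalMillis;
    private boolean sampling = false;      // guarded by this
    private ScheduledFuture<?> pending;    // guarded by this
    private int sampleCount = 0;           // guarded by this

    PeriodicSamplerSketch(long intervalMillis) {
        this.intervalMillis = intervalMillis;
    }

    synchronized void start() {
        if (sampling) {
            return; // already sampling, same early return as AbstractSampler.start()
        }
        sampling = true;
        pending = timer.schedule(this::tick, intervalMillis, TimeUnit.MILLISECONDS);
    }

    synchronized void stop() {
        if (!sampling) {
            return;
        }
        sampling = false;
        if (pending != null) {
            pending.cancel(false); // plays the role of removeCallbacks(mRunnable)
        }
    }

    /** Releases the timer thread; call once the sampler is no longer needed. */
    void shutdown() {
        timer.shutdownNow();
    }

    synchronized int sampleCount() {
        return sampleCount;
    }

    private synchronized void tick() {
        if (!sampling) {
            return;
        }
        doSample();
        // Re-post only while sampling is still enabled.
        pending = timer.schedule(this::tick, intervalMillis, TimeUnit.MILLISECONDS);
    }

    private void doSample() {
        sampleCount++; // stand-in for real work such as reading a stack trace
    }

    public static void main(String[] args) throws InterruptedException {
        PeriodicSamplerSketch sampler = new PeriodicSamplerSketch(20);
        sampler.start();
        Thread.sleep(200);
        sampler.stop();
        sampler.shutdown();
        System.out.println("samples taken: " + sampler.sampleCount());
    }
}
```

Re-posting from inside the task, instead of scheduling at a fixed rate, means a slow sample delays the next one rather than letting samples pile up.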
  • 6. CPU sampling: CpuSampler
class CpuSampler extends AbstractSampler {

    private static final String TAG = "CpuSampler";
    private static final int BUFFER_SIZE = 1000;
    private final int BUSY_TIME;
    private static final int MAX_ENTRY_COUNT = 10;

    private final LinkedHashMap<Long, String> mCpuInfoEntries = new LinkedHashMap<>();
    private int mPid = 0;
    private long mUserLast = 0;
    private long mSystemLast = 0;
    private long mIdleLast = 0;
    private long mIoWaitLast = 0;
    private long mTotalLast = 0;
    private long mAppCpuTimeLast = 0;

    public CpuSampler(long sampleInterval) {
        super(sampleInterval);
        BUSY_TIME = (int) (mSampleInterval * 1.2f);
    }

    @Override
    public void start() {
        super.start();
        reset(); // reset the sampling state
    }

    // Returns the recorded CPU usage entries as text
    public String getCpuRateInfo() {
        StringBuilder sb = new StringBuilder();
        synchronized (mCpuInfoEntries) {
            for (Map.Entry<Long, String> entry : mCpuInfoEntries.entrySet()) {
                long time = entry.getKey();
                sb.append(BlockInfo.TIME_FORMATTER.format(time))
                        .append(' ')
                        .append(entry.getValue())
                        .append(BlockInfo.SEPARATOR);
            }
        }
        return sb.toString();
    }
    // Whether the CPU was busy
    public boolean isCpuBusy(long start, long end) {
        if (end - start > mSampleInterval) {
            long s = start - mSampleInterval;
            long e = start + mSampleInterval;
            long last = 0;
            synchronized (mCpuInfoEntries) {
                for (Map.Entry<Long, String> entry : mCpuInfoEntries.entrySet()) {
                    long time = entry.getKey();
                    if (s < time && time < e) {
                        if (last != 0 && time - last > BUSY_TIME) {
                            return true;
                        }
                        last = time;
                    }
                }
            }
        }
        return false;
    }

    // Take one CPU sample
    @Override
    protected void doSample() {
        BufferedReader cpuReader = null;
        BufferedReader pidReader = null;

        try {
            cpuReader = new BufferedReader(new InputStreamReader(
                    new FileInputStream("/proc/stat")), BUFFER_SIZE);
            String cpuRate = cpuReader.readLine();
            if (cpuRate == null) {
                cpuRate = "";
            }

            if (mPid == 0) {
                mPid = android.os.Process.myPid();
            }
            pidReader = new BufferedReader(new InputStreamReader(
                    new FileInputStream("/proc/" + mPid + "/stat")), BUFFER_SIZE);
            String pidCpuRate = pidReader.readLine();
            if (pidCpuRate == null) {
                pidCpuRate = "";
            }

            parse(cpuRate, pidCpuRate);
        } catch (Throwable throwable) {
            Log.e(TAG, "doSample: ", throwable);
        } finally {
            try {
                if (cpuReader != null) {
                    cpuReader.close();
                }
                if (pidReader != null) {
                    pidReader.close();
                }
            } catch (IOException exception) {
                Log.e(TAG, "doSample: ", exception);
            }
        }
    }
    // Reset the values kept from the previous sample
    private void reset() {
        mUserLast = 0;
        mSystemLast = 0;
        mIdleLast = 0;
        mIoWaitLast = 0;
        mTotalLast = 0;
        mAppCpuTimeLast = 0;
    }
    // Parse the sampled lines and format the result
    private void parse(String cpuRate, String pidCpuRate) {
        String[] cpuInfoArray = cpuRate.split(" ");
        if (cpuInfoArray.length < 9) {
            return;
        }

        long user = Long.parseLong(cpuInfoArray[2]);
        long nice = Long.parseLong(cpuInfoArray[3]);
        long system = Long.parseLong(cpuInfoArray[4]);
        long idle = Long.parseLong(cpuInfoArray[5]);
        long ioWait = Long.parseLong(cpuInfoArray[6]);
        long total = user + nice + system + idle + ioWait
                + Long.parseLong(cpuInfoArray[7])
                + Long.parseLong(cpuInfoArray[8]);

        String[] pidCpuInfoList = pidCpuRate.split(" ");
        if (pidCpuInfoList.length < 17) {
            return;
        }

        long appCpuTime = Long.parseLong(pidCpuInfoList[13])
                + Long.parseLong(pidCpuInfoList[14])
                + Long.parseLong(pidCpuInfoList[15])
                + Long.parseLong(pidCpuInfoList[16]);

        if (mTotalLast != 0) {
            StringBuilder stringBuilder = new StringBuilder();
            long idleTime = idle - mIdleLast;
            long totalTime = total - mTotalLast;

            stringBuilder
                    .append("cpu:")
                    .append((totalTime - idleTime) * 100L / totalTime)
                    .append("% ")
                    .append("app:")
                    .append((appCpuTime - mAppCpuTimeLast) * 100L / totalTime)
                    .append("% ")
                    .append("[")
                    .append("user:").append((user - mUserLast) * 100L / totalTime)
                    .append("% ")
                    .append("system:").append((system - mSystemLast) * 100L / totalTime)
                    .append("% ")
                    .append("ioWait:").append((ioWait - mIoWaitLast) * 100L / totalTime)
                    .append("% ]");

            synchronized (mCpuInfoEntries) {
                mCpuInfoEntries.put(System.currentTimeMillis(), stringBuilder.toString());
                if (mCpuInfoEntries.size() > MAX_ENTRY_COUNT) {
                    for (Map.Entry<Long, String> entry : mCpuInfoEntries.entrySet()) {
                        Long key = entry.getKey();
                        mCpuInfoEntries.remove(key);
                        break;
                    }
                }
            }
        }
        mUserLast = user;
        mSystemLast = system;
        mIdleLast = idle;
        mIoWaitLast = ioWait;
        mTotalLast = total;

        mAppCpuTimeLast = appCpuTime;
    }
}
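The arithmetic in parse can be checked with two hand-written /proc/stat lines. This is a sketch, not the library's code: `ProcStatSketch` and its methods are hypothetical names, and the sample lines are made-up values that follow the Linux layout (user, nice, system, idle, iowait, irq, softirq, ...).

```java
/** Computes overall CPU usage between two samples of the aggregate "cpu" line of /proc/stat. */
final class ProcStatSketch {

    /** Returns {total, idle} in jiffies for one "cpu ..." line. */
    static long[] totalAndIdle(String cpuLine) {
        // "cpu" is followed by two spaces, so split(" ") leaves an empty string at index 1
        // and the first counter (user) lands at index 2, as in CpuSampler.parse.
        String[] fields = cpuLine.split(" ");
        if (fields.length < 9) {
            throw new IllegalArgumentException("unexpected /proc/stat line: " + cpuLine);
        }
        long total = 0;
        for (int i = 2; i <= 8; i++) { // user, nice, system, idle, iowait, irq, softirq
            total += Long.parseLong(fields[i]);
        }
        long idle = Long.parseLong(fields[5]);
        return new long[] {total, idle};
    }

    /** Percentage of non-idle time between an earlier and a later sample. */
    static long busyPercent(String earlier, String later) {
        long[] a = totalAndIdle(earlier);
        long[] b = totalAndIdle(later);
        long totalDelta = b[0] - a[0];
        long idleDelta = b[1] - a[1];
        if (totalDelta <= 0) {
            return 0; // no time elapsed between samples; avoid dividing by zero
        }
        return (totalDelta - idleDelta) * 100L / totalDelta;
    }

    public static void main(String[] args) {
        String earlier = "cpu  100 0 50 800 50 0 0 0 0 0"; // total 1000, idle 800
        String later = "cpu  160 0 70 900 70 0 0 0 0 0";   // total 1200, idle 900
        System.out.println("cpu:" + busyPercent(earlier, later) + "%"); // prints "cpu:50%"
    }
}
```

Between the two samples the total grows by 200 jiffies and idle by 100, so half the elapsed time was busy. The counters are cumulative since boot, so only the difference between two samples is meaningful.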
  • 7. Stack sampling: StackSampler
class StackSampler extends AbstractSampler {

    private static final int DEFAULT_MAX_ENTRY_COUNT = 100;
    private static final LinkedHashMap<Long, String> sStackMap = new LinkedHashMap<>();

    private int mMaxEntryCount = DEFAULT_MAX_ENTRY_COUNT;
    private Thread mCurrentThread;

    public StackSampler(Thread thread, long sampleIntervalMillis) {
        this(thread, DEFAULT_MAX_ENTRY_COUNT, sampleIntervalMillis);
    }

    public StackSampler(Thread thread, int maxEntryCount, long sampleIntervalMillis) {
        super(sampleIntervalMillis);
        mCurrentThread = thread;
        mMaxEntryCount = maxEntryCount;
    }

    public ArrayList<String> getThreadStackEntries(long startTime, long endTime) {
        ArrayList<String> result = new ArrayList<>();
        synchronized (sStackMap) {
            for (Long entryTime : sStackMap.keySet()) {
                if (startTime < entryTime && entryTime < endTime) {
                    result.add(BlockInfo.TIME_FORMATTER.format(entryTime)
                            + BlockInfo.SEPARATOR
                            + BlockInfo.SEPARATOR
                            + sStackMap.get(entryTime));
                }
            }
        }
        return result;
    }

    @Override
    protected void doSample() {
        StringBuilder stringBuilder = new StringBuilder();

        for (StackTraceElement stackTraceElement : mCurrentThread.getStackTrace()) {
            stringBuilder
                    .append(stackTraceElement.toString())
                    .append(BlockInfo.SEPARATOR);
        }

        synchronized (sStackMap) {
            if (sStackMap.size() == mMaxEntryCount && mMaxEntryCount > 0) {
                sStackMap.remove(sStackMap.keySet().iterator().next());
            }
            sStackMap.put(System.currentTimeMillis(), stringBuilder.toString());
        }
    }
}
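StackSampler keeps a bounded, insertion-ordered map of timestamped stack dumps and later returns the ones that fall inside the blocked interval. A minimal sketch of that data structure follows; it assumes `LinkedHashMap.removeEldestEntry` for eviction instead of the manual removal in doSample, and `BoundedSampleBuffer` is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Keeps the most recent N timestamped samples and answers time-window queries. */
final class BoundedSampleBuffer {
    private final LinkedHashMap<Long, String> samples;

    BoundedSampleBuffer(final int maxEntries) {
        this.samples = new LinkedHashMap<Long, String>() {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Long, String> eldest) {
                // Evict the oldest entry once the buffer grows past its limit.
                return size() > maxEntries;
            }
        };
    }

    synchronized void put(long timestampMillis, String stack) {
        samples.put(timestampMillis, stack);
    }

    /** Returns the samples with startMillis < timestamp < endMillis, oldest first. */
    synchronized List<String> between(long startMillis, long endMillis) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<Long, String> entry : samples.entrySet()) {
            long time = entry.getKey();
            if (startMillis < time && time < endMillis) {
                result.add(entry.getValue());
            }
        }
        return result;
    }

    synchronized int size() {
        return samples.size();
    }

    /** Renders a thread's current stack as one string, one frame per line. */
    static String captureStack(Thread thread) {
        StringBuilder sb = new StringBuilder();
        for (StackTraceElement element : thread.getStackTrace()) {
            sb.append(element).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        BoundedSampleBuffer buffer = new BoundedSampleBuffer(3);
        buffer.put(System.currentTimeMillis(), captureStack(Thread.currentThread()));
        System.out.println("stored samples: " + buffer.size());
    }
}
```

Both window bounds are exclusive, matching the comparison in getThreadStackEntries, so a sample stamped exactly at the start or end of the block is left out.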
  • 8. Now back to BlockCanaryInternals to see how a block event is handled, that is, where the onBlockEvent callback is implemented.
    public BlockCanaryInternals() {

        stackSampler = new StackSampler(
                Looper.getMainLooper().getThread(),
                sContext.provideDumpInterval());

        cpuSampler = new CpuSampler(sContext.provideDumpInterval());

        setMonitor(new LooperMonitor(new LooperMonitor.BlockListener() {

            @Override
            public void onBlockEvent(long realTimeStart, long realTimeEnd,
                                     long threadTimeStart, long threadTimeEnd) {
                // Get the latest thread stack information and CPU information
                ArrayList<String> threadStackEntries = stackSampler
                        .getThreadStackEntries(realTimeStart, realTimeEnd);
                if (!threadStackEntries.isEmpty()) {
                    // Build a BlockInfo with the main thread's time cost
                    // (wall-clock start and end, thread-time start and end)
                    BlockInfo blockInfo = BlockInfo.newInstance()
                            .setMainThreadTimeCost(realTimeStart, realTimeEnd, threadTimeStart, threadTimeEnd)
                            // Whether the CPU was busy
                            .setCpuBusyFlag(cpuSampler.isCpuBusy(realTimeStart, realTimeEnd))
                            .setRecentCpuRate(cpuSampler.getCpuRateInfo())
                            .setThreadStackEntries(threadStackEntries)
                            .flushString();
                    LogWriter.save(blockInfo.toString());
                    // Run the block interceptor chain mInterceptorChain
                    if (mInterceptorChain.size() != 0) {
                        for (BlockInterceptor interceptor : mInterceptorChain) {
                            interceptor.onBlock(getContext().provideContext(), blockInfo);
                        }
                    }
                }
            }
        }, getContext().provideBlockThreshold(), getContext().stopWhenDebugging()));

        LogWriter.cleanObsolete();
    }

That concludes the analysis of the BlockCanary source code.


   Reprint policy


"An Analysis of How BlockCanary Works on Android" by mikyou is licensed under the Creative Commons Attribution 4.0 International License.