Headless Chrome long image capture issue

Original source: https://www.dazhuanlan.com/2019/08/26/5d6300778d22d/

The problem

Recently I received a complaint that my capture service was not exporting complete images. The problem seems to occur only when the page is extremely long.

The broken image looked like this: (image not preserved)

Chromium’s limit

So I Googled the problem and found a lot of issues on GitHub describing the same thing. Reading through one of those issues, I learned that the problem is caused by a limit in Chromium.

Since a typical server has no GPU, Headless Chrome has to fall back to a software renderer, that is, it uses the CPU to compute the pixels.

Chromium's compositor has a maximum texture size when it uses the software GL backend; the limit is 16384px. An image larger than that will not be rendered completely.
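
As a quick illustration of that limit, the sketch below checks whether a page is large enough to hit it. The 16384 value is simply the number quoted above and may differ between Chromium versions; the helper name `exceedsTextureLimit` is made up for this example.

```javascript
// Limit quoted in this article for the software GL backend, in pixels.
const MAX_TEXTURE_SIZE = 16384;

// Hypothetical helper: true when either dimension is larger than the limit.
function exceedsTextureLimit(contentSize, limit = MAX_TEXTURE_SIZE) {
  return contentSize.width > limit || contentSize.height > limit;
}

console.log(exceedsTextureLimit({ width: 1440, height: 30000 })); // true
console.log(exceedsTextureLimit({ width: 1440, height: 9000 }));  // false
```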

How to solve it

The solution is simple: cut the page into pieces, capture the fragments in order, and composite them into a single image.

The code below uses Puppeteer's API, but it is fine to swap in another library that speaks the Chrome DevTools Protocol (CDP).
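
To make the cutting step concrete, here is a minimal sketch of the slice arithmetic on its own, with no browser involved. `planSlices` is a hypothetical helper name, not part of Puppeteer or sharp.

```javascript
// Hypothetical helper: split a page of totalHeight pixels into slices
// no taller than maxSliceHeight, listed from top to bottom.
function planSlices(totalHeight, maxSliceHeight) {
  const slices = [];
  for (let y = 0; y < totalHeight; y += maxSliceHeight) {
    // The last slice only covers whatever is left of the page.
    slices.push({ y, height: Math.min(totalHeight - y, maxSliceHeight) });
  }
  return slices;
}

// A 15000px page with 7000px slices yields three slices:
// two full ones at y = 0 and y = 7000, and a 1000px remainder at y = 14000.
console.log(planSlices(15000, 7000));
```

Each `{ y, height }` pair maps directly onto the `clip` option of a screenshot call.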

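const sharp = require('sharp');

await page.setViewport({ width: 1440, height: 1024 });
// Ask the DevTools Protocol for the full size of the page content.
// Note: page._client is a private Puppeteer property and may change between versions.
const { contentSize } = await page._client.send('Page.getLayoutMetrics');
// MAGIC NUMBER, DO NOT MODIFY THIS OR YOU WILL BE FIRED
const maxScreenshotHeight = 7000;
if (contentSize.height >= maxScreenshotHeight) {
  let image;
  let lastBuffer;

  for (let ypos = 0; ypos < contentSize.height; ypos += maxScreenshotHeight) {
    // The last slice may be shorter than maxScreenshotHeight.
    const height = Math.min(contentSize.height - ypos, maxScreenshotHeight);
    const buffer = await page.screenshot({
      clip: {
        x: 0,
        y: ypos,
        width: contentSize.width,
        height
      }
    });
    if (ypos === 0) {
      // The first slice becomes the initial canvas.
      image = sharp(buffer);
      lastBuffer = await image.toBuffer();
    } else {
      // Grow the canvas downward by one slice, then paste the slice into the new space.
      // Note: sharp 0.22+ deprecates overlayWith in favor of composite().
      image = sharp(lastBuffer);
      image = image.extend({ top: 0, bottom: height, left: 0, right: 0 });
      image = image.overlayWith(buffer, { top: ypos, left: 0 });
      lastBuffer = await image.toBuffer();
    }
  }
  // fileData is assumed to be declared by the surrounding code.
  fileData = lastBuffer;
}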
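
The snippet above reaches the DevTools Protocol through `page._client`, which is a private Puppeteer property. One possible alternative, assuming a Puppeteer version that exposes `page.target().createCDPSession()`, is to open a dedicated session. This is a sketch under that assumption, and `getContentSize` is a name invented here.

```javascript
// Hypothetical helper: read the page's content size through a CDP session
// opened with Puppeteer's public API instead of the private page._client.
async function getContentSize(page) {
  const client = await page.target().createCDPSession();
  try {
    const { contentSize } = await client.send('Page.getLayoutMetrics');
    // Round up so the values are safe to use as integer pixel sizes.
    return {
      width: Math.ceil(contentSize.width),
      height: Math.ceil(contentSize.height)
    };
  } finally {
    // Release the session whether or not the call succeeded.
    await client.detach();
  }
}
```

Usage would look like `const { width, height } = await getContentSize(page);` in place of the `page._client.send(...)` line.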

I use sharp for image processing because it was recommended in a GitHub issue.
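
The loop above re-encodes the growing image once per slice. A possible variation, assuming sharp 0.22 or later where `composite` is available, is to collect all slices first and paste them onto a blank canvas in one pass. The names `toCompositeLayers` and `stitchSlices` and the white background are assumptions for this sketch, not part of the original code.

```javascript
// Hypothetical helper: turn captured slices into sharp composite layers.
// Each slice is { y, buffer }, where y is the vertical offset it was captured at.
function toCompositeLayers(slices) {
  return slices.map(({ y, buffer }) => ({ input: buffer, top: y, left: 0 }));
}

// Hypothetical helper: stitch all slices onto one blank canvas in a single pass.
// width and totalHeight must be integers. sharp is required lazily so that
// toCompositeLayers can be used and tested without it installed.
async function stitchSlices(slices, width, totalHeight) {
  const sharp = require('sharp');
  return sharp({
    create: {
      width,
      height: totalHeight,
      channels: 4,
      background: { r: 255, g: 255, b: 255, alpha: 1 }
    }
  })
    .composite(toCompositeLayers(slices))
    .png()
    .toBuffer();
}
```

With this shape, the capture loop only needs to push `{ y: ypos, buffer }` into an array and call `stitchSlices` once at the end.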

Future

This approach may become unnecessary, according to this Chromium issue.

Original article: https://www.cnblogs.com/petewell/p/11410472.html

Time: 2024-07-31 09:31:12

