* Stealth mode: applies various techniques to make detection of headless Puppeteer harder. 💯
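* ### Purpose
* There are several ways a target website can easily detect the use of Puppeteer.
* The `HeadlessChrome` token added to the user agent is only the most obvious one.
* The goal of this plugin is to be the definitive companion to Puppeteer for avoiding
* detection, applying new techniques as they surface.
* Because this cat-and-mouse game is still in its infancy and moves fast, the plugin
* is kept as flexible as possible to support quick testing and iteration.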
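* ### Modularity
* This plugin uses `puppeteer-extra`'s dependency system to require only the
* code mods for evasions that have been enabled, which keeps things modular and efficient.
* The `stealth` plugin is a convenience wrapper that requires multiple [evasion techniques](./evasions/)
* automatically and comes with defaults. You can also bypass the main module and require
* specific evasion plugins yourself if you wish (they are standalone `puppeteer-extra` plugins):
* // bypass the main module and require a specific stealth evasion directly:
* puppeteer.use(require('puppeteer-extra-plugin-stealth/evasions/console.debug')())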
* ### Contributing
* PRs are welcome. If you want to add a new evasion technique, I suggest
* having a look at the [template](./evasions/_template) to kickstart things.
* ### Kudos
* Thanks to [Evan Sangaline](https://intoli.com/blog/not-possible-to-block-chrome-headless/) and [Paul Irish](https://github.com/paulirish/headless-cat-n-mouse) for kickstarting the discussion!
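The most obvious signal mentioned above, the `HeadlessChrome` token in the user agent, can be illustrated with a small check of the kind a target site might run. This is only a sketch: `looksHeadless` is a hypothetical helper, and the two user-agent strings are sample values chosen for illustration, not output captured from a real browser.

```javascript
// Hypothetical helper: returns true when a user-agent string carries the
// "HeadlessChrome" token that headless Chrome adds by default.
function looksHeadless(userAgent) {
  return String(userAgent).includes("HeadlessChrome");
}

// Sample user-agent strings (assumed values for illustration).
const headlessUA =
  "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) HeadlessChrome/72.0.3617.0 Safari/537.36";
const regularUA =
  "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3617.0 Safari/537.36";

console.log(looksHeadless(headlessUA)); // true
console.log(looksHeadless(regularUA)); // false
```

Real detection scripts combine many more signals than this one, which is why the plugin bundles several evasions instead of only overriding the user agent.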
1. Install puppeteer-extra
npm install puppeteer-extra --save
2. Install puppeteer-extra-plugin-stealth
npm install puppeteer-extra-plugin-stealth --save
3. Install puppeteer
npm install puppeteer --save
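The browser binary that puppeteer downloads during installation may fail to download. In that case, add the --ignore-scripts flag to the install command (npm install puppeteer --save --ignore-scripts) to skip the download, then point executablePath at a locally installed Chrome.
Like this: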
executablePath: "C:\\Users\\nanfang\\AppData\\Local\\Google\\Chrome\\Application\\chrome.exe",
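Since the Chrome location differs between machines, one option is to probe a few likely paths and use the first one that exists. The following is a minimal sketch using only Node's `fs` module; `findLocalChrome` is a hypothetical helper and the candidate paths are assumptions, so adjust them for your own system.

```javascript
const fs = require("fs");

// Hypothetical helper: return the first candidate path that exists on disk,
// or undefined when none of them do.
function findLocalChrome(candidates) {
  return candidates.find((p) => fs.existsSync(p));
}

// Assumed install locations; these are examples, not guaranteed paths.
const candidates = [
  "C:\\Program Files\\Google\\Chrome\\Application\\chrome.exe",
  "/usr/bin/google-chrome",
  "/Applications/Google Chrome.app/Contents/MacOS/Google Chrome",
];

const localChrome = findLocalChrome(candidates);

// Only set executablePath when a local Chrome was actually found.
const launchOptions = localChrome ? { executablePath: localChrome } : {};
console.log(launchOptions);
```

If no candidate is found, `launchOptions` stays empty and puppeteer falls back to its own bundled browser, which will be missing when the package was installed with --ignore-scripts.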
const puppeteer = require("puppeteer-extra");
const pluginStealth = require("puppeteer-extra-plugin-stealth");

// Register the stealth plugin before launching the browser.
puppeteer.use(pluginStealth());

let browser = null;

const Bowser = {
  // Launch the browser and return a page with a desktop Chrome user agent.
  launch: async () => {
    const config = {
      headless: false,
      args: ["--disable-images"],
      defaultViewport: {
        width: 1440,
        height: 1500,
        deviceScaleFactor: 1,
        isMobile: false,
        hasTouch: false,
      },
      ignoreHTTPSErrors: true,
      slowMo: 100,
      userDataDir: "/path/to/user/data/directory",
      // Path to a locally installed Chrome. On Linux, the Chromium bundled with
      // puppeteer sits at a path such as
      // /data/Koa_blog/node_modules/puppeteer/.local-chromium/linux-722234/chrome-linux/chrome
      executablePath: "C:\\Users\\nanfang\\AppData\\Local\\Google\\Chrome\\Application\\chrome.exe",
    };
    browser = await puppeteer.launch(config);
    const page = await browser.newPage();
    await page.setUserAgent(
      "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3617.0 Safari/537.36"
    );
    return page;
  },
  // Close the browser; errors are logged instead of rethrown.
  close: async () => {
    try {
      await browser.close();
      return true;
    } catch (err) {
      console.log("close browser err::", err);
      return true;
    }
  },
};

module.exports = Bowser;
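Usage:
// Use a relative path so Node loads the local Bowser.js file rather than
// looking for a package named "Bowser" in node_modules.
const Bowser = require("./Bowser");
const gotoUrl = "http://biaoblog.cn";

(async () => {
  const page = await Bowser.launch();
  await page.goto(gotoUrl);
})();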
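The usage snippet never calls close(), so the browser stays open if navigation throws. One way to handle that, sketched here as an assumption and not as the author's approach, is a small wrapper that always closes the browser. `withPage` is a hypothetical helper; it accepts any object exposing async launch() and close() methods, like the Bowser module.

```javascript
// Hypothetical helper: run `task(page)` and always close the browser afterwards,
// whether the task resolves or throws.
// `bowser` is any object exposing async launch() and close(), like the Bowser module.
async function withPage(bowser, task) {
  const page = await bowser.launch();
  try {
    return await task(page);
  } finally {
    await bowser.close();
  }
}

// Usage (assumes the Bowser module is saved as ./Bowser.js):
// const Bowser = require("./Bowser");
// withPage(Bowser, (page) => page.goto("http://biaoblog.cn")).catch(console.error);
```

Because the cleanup lives in `finally`, an error raised inside the task still reaches the caller after the browser has been closed.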
Node version: 12.18.2
package.json:
{
  "name": "crawler",
  "version": "1.0.0",
  "description": "",
  "main": "index.js",
  "scripts": {
    "start": "node index.js"
  },
  "author": "",
  "license": "ISC",
  "dependencies": {
    "axios": "^1.1.3",
    "jsdom": "16.7.0",
    "md5": "^2.3.0",
    "node-fetch": "2.6.12",
    "node-schedule": "^2.1.0",
    "puppeteer": "^8.0.0",
    "puppeteer-extra": "^3.3.6",
    "puppeteer-extra-plugin-stealth": "^2.11.2"
  }
}
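Author: Bill. Article URL: http://biaoblog.cn/info?id=1689653119204
Copyright notice: This is an original article and the copyright belongs to the biaoblog personal blog. Sharing is welcome; please credit the source when reposting. Thanks!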