How to detect selenium #182
This is on my mind. I will look into this and see what I can find. |
Thanks for your reply. I found that the project named botD could detect this case, but I guess that might depend on server-side analytics. |
Nice. There might be some new tricks at botD. Here are some resources by Antoine Vastel:
|
I tried fp-collect months ago. Since that project is no longer maintained (2019), the option above, chrome_options.add_argument('--disable-blink-features=AutomationControlled'), is not detected at https://antoinevastel.com/bots/ |
I finally got around to testing this in more depth, and we do detect Selenium in headless mode. Even with WebDriver and the User Agent hidden, many headless signals remain available. Non-headless Selenium is not detected, but I think that detection is unnecessary. Automated patterns can be detected through event listeners, but that's not a focus yet. I might create a test page for that. Similarly, Puppeteer and Playwright can run Google Chrome in non-headless mode and use automation without being detected. I think all that is fine, as long as the web traffic is producing good activity and okay fingerprints. This is the script I used.
import time
from selenium import webdriver
from selenium.webdriver.common.by import By

options = webdriver.ChromeOptions()
options.add_argument('--no-sandbox')
options.add_argument('--disable-blink-features=AutomationControlled')  # web driver off
options.add_argument('--headless')  # run headless (the options.headless setter is deprecated in newer Selenium 4 releases)
options.add_argument("--window-size=800,600")
# make sure you download the driver that supports the chrome.exe
# raw string so the backslashes in the Windows path are not treated as escapes
options.binary_location = r"C:\Program Files\Google\Chrome Beta\Application\chrome.exe"
driver = webdriver.Chrome(options=options)

def save_screenshot(driver: webdriver.Chrome, path: str = 'selen_screenshot.png') -> None:
    # Ref: https://stackoverflow.com/a/52572919/
    original_size = driver.get_window_size()
    required_width = driver.execute_script('return document.body.parentNode.scrollWidth')
    required_height = driver.execute_script('return document.body.parentNode.scrollHeight')
    driver.set_window_size(required_width, required_height)
    # driver.save_screenshot(path)  # has scrollbar
    driver.find_element(By.TAG_NAME, 'body').screenshot(path)  # avoids scrollbar
    driver.set_window_size(original_size['width'], original_size['height'])

try:
    driver.get('https://abrahamjuliot.github.io/creepjs/')
    time.sleep(10)
    save_screenshot(driver)
    input("press Enter to exit...")
finally:
    driver.quit()
|
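For the remark about detecting automated patterns through event listeners, here is a minimal sketch of what such a check might look like. It is not taken from creepjs; `summarizePointerEvents` and its counters are hypothetical names, and the only DOM facts it relies on are `Event.type` and `Event.isTrusted`. Events dispatched from script (for example via `element.click()` or `dispatchEvent`) are untrusted, while driver-level input may still be reported as trusted, so treat the counts as weak hints at best.

```javascript
// Hypothetical helper: summarize a recorded list of pointer events.
// Each entry only needs `type` and `isTrusted`, as on a DOM Event.
function summarizePointerEvents(events) {
  const summary = { clicks: 0, untrustedClicks: 0, clicksWithoutMove: 0 };
  let sawMove = false;
  for (const event of events) {
    if (event.type === 'mousemove') {
      sawMove = true;
    } else if (event.type === 'click') {
      summary.clicks += 1;
      if (!event.isTrusted) summary.untrustedClicks += 1; // dispatched from script
      if (!sawMove) summary.clicksWithoutMove += 1; // no pointer travel seen before the click
    }
  }
  return summary;
}

// In a page, the log could be filled like this:
// const log = [];
// for (const type of ['mousemove', 'click']) {
//   document.addEventListener(type, (event) => log.push(event), true);
// }
// console.log(summarizePointerEvents(log));
```

Clicks without any preceding mouse movement also happen on touch devices and with keyboard activation, so this would need platform checks before it could feed a score.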
Good job! I'll take a look at the latest version of creepjs over the rest of this month, thanks. |
Nice. Just started researching this. |
Thanks. |
@abrahamjuliot for selenium, and all chromedriver-driven browsers, check the two following values:
|
Nice. These have been on my mind. There's also a way to get the
Good tips. |
Some additional flags for detecting Selenium and Selenium-adjacent software:
window["__nightmare"]
window["cdc_adoQpoasnfa76pfcZLmcfl_Array"]
window["cdc_adoQpoasnfa76pfcZLmcfl_Promise"]
window["cdc_adoQpoasnfa76pfcZLmcfl_Symbol"]
window["OSMJIF"]
window["_Selenium_IDE_Recorder"]
window["__$webdriverAsyncExecutor"]
window["__driver_evaluate"]
window["__driver_unwrapped"]
window["__fxdriver_evaluate"]
window["__fxdriver_unwrapped"]
window["__lastWatirAlert"]
window["__lastWatirConfirm"]
window["__lastWatirPrompt"]
window["__phantomas"]
window["__selenium_evaluate"]
window["__selenium_unwrapped"]
window["__webdriverFuncgeb"]
window["__webdriver__chr"]
window["__webdriver_evaluate"]
window["__webdriver_script_fn"]
window["__webdriver_script_func"]
window["__webdriver_script_function"]
window["__webdriver_unwrapped"]
window["awesomium"]
window["callSelenium"]
window["calledPhantom"]
window["calledSelenium"]
window["domAutomationController"]
window["watinExpressionError"]
window["watinExpressionResult"]
window["spynner_additional_js_loaded"]
document["$chrome_asyncScriptInfo"]
window["fmget_targets"]
window["geb"] |
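A rough sketch of how the list above could be checked in one pass. The property names are copied from the list; the function name `findAutomationFlags` and the idea of passing `window` and `document` in as arguments are assumptions made here so the snippet can also be exercised outside a browser.

```javascript
// Property names taken from the list above.
const WINDOW_FLAGS = [
  '__nightmare', 'cdc_adoQpoasnfa76pfcZLmcfl_Array', 'cdc_adoQpoasnfa76pfcZLmcfl_Promise',
  'cdc_adoQpoasnfa76pfcZLmcfl_Symbol', 'OSMJIF', '_Selenium_IDE_Recorder',
  '__$webdriverAsyncExecutor', '__driver_evaluate', '__driver_unwrapped',
  '__fxdriver_evaluate', '__fxdriver_unwrapped', '__lastWatirAlert',
  '__lastWatirConfirm', '__lastWatirPrompt', '__phantomas', '__selenium_evaluate',
  '__selenium_unwrapped', '__webdriverFuncgeb', '__webdriver__chr',
  '__webdriver_evaluate', '__webdriver_script_fn', '__webdriver_script_func',
  '__webdriver_script_function', '__webdriver_unwrapped', 'awesomium', 'callSelenium',
  'calledPhantom', 'calledSelenium', 'domAutomationController', 'watinExpressionError',
  'watinExpressionResult', 'spynner_additional_js_loaded', 'fmget_targets', 'geb',
];
const DOCUMENT_FLAGS = ['$chrome_asyncScriptInfo'];

// Returns the flags that are present, e.g. ['window.__nightmare'].
function findAutomationFlags(win, doc) {
  const hits = [];
  for (const name of WINDOW_FLAGS) {
    if (name in win) hits.push(`window.${name}`);
  }
  for (const name of DOCUMENT_FLAGS) {
    if (name in doc) hits.push(`document.${name}`);
  }
  return hits;
}

// Browser usage: findAutomationFlags(window, document)
```

An empty result only means none of these particular names were found; a driver that renames or removes them will pass.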
Also worthwhile to check the types of navigator.plugins to ensure that it hasn't been tampered with. |
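One way the type check on navigator.plugins might look, as a sketch only. It assumes the native objects report `[object PluginArray]` and `[object Plugin]` through `Object.prototype.toString`, and `pluginsLookNative` is a hypothetical name. As a later comment in this thread points out, these tags can be overridden, so a pass here is not proof of anything.

```javascript
// Hypothetical helper: true when navigator.plugins and its entries carry the expected type tags.
function pluginsLookNative(nav) {
  const tag = (value) => Object.prototype.toString.call(value);
  const plugins = nav.plugins;
  if (tag(plugins) !== '[object PluginArray]') return false; // e.g. replaced by a plain array
  for (let i = 0; i < plugins.length; i++) {
    if (tag(plugins[i]) !== '[object Plugin]') return false; // e.g. plain object literals
  }
  return true;
}

// Browser usage: pluginsLookNative(navigator)
```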
Other relevant values can be found here: It detects objects created/used by |
I was trying to hide the "webdriver=true" navigator attribute, and asked chatgpt. Its answer is spookily similar to yours.
const originalNavigator = navigator;
const proxyNavigator = new Proxy(originalNavigator, {
  get(target, prop) {
    if (prop === 'webdriver') {
      return false;
    }
    return target[prop];
  },
  ownKeys(target) {
    const keys = Reflect.ownKeys(target);
    return keys.filter((key) => key !== 'webdriver');
  },
  getOwnPropertyDescriptor(target, prop) {
    if (prop === 'webdriver') {
      return undefined;
    }
    return Reflect.getOwnPropertyDescriptor(target, prop);
  },
});
// Replace the global navigator object with the proxy object
Object.defineProperty(window, 'navigator', {
  value: proxyNavigator,
  configurable: false,
  enumerable: false,
  writable: false,
});
I'm hoping against hope, but any way to see if an object is a Proxy? I feel like chrome is teasing me in the console. |
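A hedged sketch of two signals that a replacement like the one above tends to leave behind. Neither is claimed to be what creepjs does. The first assumes that browsers normally expose `navigator` on `window` through a getter, so a plain `value` descriptor suggests it was redefined. The second relies on native getters rejecting a `this` that is not the real object, which a Proxy wrapper is not. Both function names are hypothetical.

```javascript
// Signal 1: navigator was redefined as a data property (has `value`, no getter).
function navigatorWasRedefined(win) {
  const desc = Object.getOwnPropertyDescriptor(win, 'navigator');
  if (!desc) return false; // not an own property here, so this check cannot tell
  return typeof desc.get !== 'function';
}

// Signal 2: calling an original getter with the candidate as `this` throws
// when the candidate is a Proxy (or any other stand-in) rather than the real object.
function failsBrandCheck(getter, candidate) {
  try {
    getter.call(candidate);
    return false;
  } catch (error) {
    return error instanceof TypeError;
  }
}

// Browser usage (assumes Navigator.prototype.userAgent is an accessor):
// const getter = Object.getOwnPropertyDescriptor(Navigator.prototype, 'userAgent').get;
// console.log(navigatorWasRedefined(window), failsBrandCheck(getter, navigator));
```

A script that patches the prototype getter instead of swapping the whole object would slip past both checks, so these are only two signals among many.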
That is funny. I asked Bing Chat (GPT-4) about detecting JS Proxies and what it thought about our methods here. It didn't like our code and insisted we try outdated techniques from Stack Overflow. |
That's pretty solid btw!!!!
Returns true if it's proxied; false if it's not. Honest question - why doesn't |
The bot score has some game elements and includes tags like friend and stranger. By default, everyone is treated as a bot. From there, we just want to establish some level of trust. The more transparent and normal the player, the less they are perceived as untrustworthy. This allows use of web driver and headless UAs since these are designed for transparency. |
I'm sorry, I meant to say the 'headlessRating'. Instead of just having 20% weight if true (I think it's 1 in 5 attributes), it feels like it should automatically trip the value to 100% when it's true. I could be wrong, but I doubt you'd get too many false positives where "normal" users have webdriver set to true - seems like a really strong signal when present. Just a thought as I'm going through your library trying to distil heuristics I can steal :-) |
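To make the suggestion concrete, here is a small sketch contrasting the two behaviours: an equal-weight rating where one of five true attributes gives 20%, and the same rating with a "hard" signal that trips it to 100%. This is not the creepjs implementation; the function, the `hardSignals` option, and every signal name in the usage lines are hypothetical.

```javascript
// Equal-weight rating in percent, with optional signals that force 100 when true.
function headlessRating(signals, { hardSignals = [] } = {}) {
  const names = Object.keys(signals);
  if (names.length === 0) return 0;
  if (hardSignals.some((name) => signals[name] === true)) return 100;
  const hits = names.filter((name) => signals[name] === true).length;
  return Math.round((hits / names.length) * 100);
}

// Hypothetical signal names, five attributes with only webdriver set:
const sample = { webDriverIsOn: true, signalB: false, signalC: false, signalD: false, signalE: false };
console.log(headlessRating(sample)); // equal weights
console.log(headlessRating(sample, { hardSignals: ['webDriverIsOn'] })); // webdriver treated as decisive
```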
Ah yes, that's a good idea. I might change that at some point. |
BTW, I need to remove these from the headless rating and move them to like headless. These can appear in Android WebView, Smart TVs and other Chromium flavors.
|
That's a really good point. I totally forgot about the plethora of platforms that can legitimately ping a service, but look "weird". I don't know if it's worth the investment, but it might be interesting to have somewhere an attribute called "framework" or "automated" or something. I'm sure some clever AI could parse out good rules, but things like you're using Windows, Chromium, and Webdriver is true? Minimum 99.8% chance you're automated. It's not good or bad, but it's definitely not normal and it'd be useful to flag. I could be here for weeks MMQB'ing this thing into the ground; dorking out over what-if's and things I think would be useful :-P Thanks again for maintaining this thing!! |
It's possible to override the types for these objects if I remember correctly. Probably still nice as an extra measure. |
The new headless mode (headless=new) cannot be detected |
@NCLnclNCL you're 100% right. Right now, I'm keeping a Bayesian score, and if the browser is Chromium and the OS isn't Linux, that's a big red flag it's a bot. Not 100%, but definitely worth paying attention to. /shrug |
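The comment does not say how the Bayesian score is kept, so the following is only a guess at the general shape: a naive Bayes update in log-odds form, which assumes the observations are independent. `botProbability` is a hypothetical name and every probability in the usage lines is made up for illustration.

```javascript
// Combine a prior with observations, each given as
// { pIfBot, pIfHuman }: how likely the observation is under each hypothesis.
function botProbability(prior, observations) {
  let logOdds = Math.log(prior / (1 - prior));
  for (const { pIfBot, pIfHuman } of observations) {
    logOdds += Math.log(pIfBot / pIfHuman); // add the log likelihood ratio
  }
  return 1 / (1 + Math.exp(-logOdds)); // back from log-odds to a probability
}

// Illustrative numbers only, not measured values:
const chromiumOffLinux = { pIfBot: 0.3, pIfHuman: 0.01 };
console.log(botProbability(0.1, [chromiumOffLinux]));
```

A signal that is equally likely for bots and humans has a likelihood ratio of 1 and leaves the score unchanged, which matches the "not 100%, but worth paying attention to" framing.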
I think it's very hard to detect headless=new, bro. It can be slower than the old headless, but it is perfect for anti-detection |
`from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
chrome_options = Options()
chrome_options.add_argument('--disable-blink-features=AutomationControlled')
chrome = webdriver.Chrome(service=Service(executable_path='./chromedriver.exe'), options=chrome_options)
chrome.get('https://abrahamjuliot.github.io/creepjs')`
When I invoke Chrome via the selenium package in Python like this, it seems to go undetected by creepjs. Any suggestions? Thanks.