Fixed the rest of false positives for now (#371)
* Fixed the rest of false positives for now

* Fixed tag

* Updated site list and statistics
soxoj committed Feb 26, 2022
1 parent bc787cd commit 8a53a38
Showing 4 changed files with 35 additions and 22 deletions.
2 changes: 1 addition & 1 deletion maigret/maigret.py
@@ -48,7 +48,7 @@ def notify_about_errors(search_results: QueryResultWrapper, query_notify):
     for e in errs:
         if not errors.is_important(e):
             continue
-        text = f'Too many errors of type "{e["err"]}" ({e["perc"]}%)'
+        text = f'Too many errors of type "{e["err"]}" ({round(e["perc"],2)}%)'
         solution = errors.solution_of(e['err'])
         if solution:
             text = '. '.join([text, solution.capitalize()])
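The effect of the rounding change can be sketched in isolation. This is a minimal illustration, not Maigret's code: `format_error_text` is a hypothetical helper and the sample error record is made up; only the f-string mirrors the changed line.

```python
def format_error_text(e: dict) -> str:
    """Build the warning text the way the patched line does (hypothetical helper)."""
    # round(..., 2) keeps at most two decimal places of the percentage
    return f'Too many errors of type "{e["err"]}" ({round(e["perc"], 2)}%)'


# Made-up error record: one failing request out of three
sample = {"err": "Timeout", "perc": 100 / 3}
print(format_error_text(sample))
```

Without the `round` call, the same record would print the full float representation of `100 / 3` in the message.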
35 changes: 24 additions & 11 deletions maigret/resources/data.json
@@ -1462,6 +1462,7 @@
"forum",
"ru"
],
+"disabled": true,
"checkType": "message",
"absenceStrs": [
"\u041f\u043e\u043b\u044c\u0437\u043e\u0432\u0430\u0442\u0435\u043b\u044c \u043d\u0435 \u043d\u0430\u0439\u0434\u0435\u043d"
@@ -18145,6 +18146,7 @@
"tags": [
"ru"
],
+"disabled": true,
"checkType": "status_code",
"alexaRank": 6054365,
"urlMain": "http://linuxmint.info",
@@ -24487,23 +24489,35 @@
},
"thoughts.com": {
"tags": [
-"in"
+"blog"
+],
+"checkType": "message",
+"absenceStrs": [
+"<title>Start a Blog"
+],
+"presenseStrs": [
+"&#8211; Thoughts.com</title>"
],
-"engine": "engine404get",
"urlMain": "http://thoughts.com",
-"url": "http://thoughts.com/profile/{username}",
+"url": "http://thoughts.com/members/{username}",
"usernameUnclaimed": "noonewouldeverusethis7",
-"usernameClaimed": "red"
+"usernameClaimed": "alicia12"
},
"hackernoon.com": {
"tags": [
-"in",
-"us"
+"us",
+"news"
+],
+"checkType": "message",
+"absenceStrs": [
+"<title>HackerNoon"
+],
+"presenseStrs": [
+" | HackerNoon</title>"
],
-"engine": "engine404message",
"urlMain": "https://hackernoon.com",
"url": "https://hackernoon.com/u/{username}",
-"usernameUnclaimed": "noonewouldeverusethis7",
+"usernameUnclaimed": "noonewouldeverusethis71",
"usernameClaimed": "god"
},
"Intigriti": {
@@ -28479,10 +28493,9 @@
},
"photoshop-kopona.com": {
"absenceStrs": [
-"<title>noonewouldeverusethis7 &raquo; \u0420\u0435\u0441\u0443\u0440\u0441\u044b \u0434\u043b\u044f \u0424\u043e\u0442\u043e\u0448\u043e\u043f\u0430</title>"
+"<div id='dle-content'></div></div></main></div></div><footer class=\"footer\">"
],
"presenseStrs": [
-"offline",
"uspusertitle"
],
"url": "https://photoshop-kopona.com/ru/user/{username}/",
@@ -28645,7 +28658,7 @@
},
"coder.social": {
"absenceStrs": [
-"Not Found in Githubhelp"
+"<title>Coder Social Home</title>"
],
"presenseStrs": [
"nofollow"
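How an entry with `"checkType": "message"` might be evaluated can be sketched as follows. The semantics here are an assumption for illustration and are not taken from Maigret's source: `looks_claimed` is a hypothetical function, and the two page fragments are made up. Only the marker strings come from the hackernoon.com entry in this diff.

```python
from typing import Sequence


def looks_claimed(html: str, absence_strs: Sequence[str], presence_strs: Sequence[str]) -> bool:
    """Assumed semantics of a "message" check; not Maigret's actual implementation."""
    # Any absence marker in the page is treated as "no such profile"
    if any(marker in html for marker in absence_strs):
        return False
    # If presence markers are configured, require at least one of them
    if presence_strs:
        return any(marker in html for marker in presence_strs)
    return True


# Markers copied from the hackernoon.com entry in this commit
ABSENCE = ["<title>HackerNoon"]
PRESENCE = [" | HackerNoon</title>"]

# Made-up page fragments, for illustration only
claimed_page = "<html><head><title>god | HackerNoon</title></head></html>"
unclaimed_page = "<html><head><title>HackerNoon - read, write and learn</title></head></html>"

print(looks_claimed(claimed_page, ABSENCE, PRESENCE))
print(looks_claimed(unclaimed_page, ABSENCE, PRESENCE))
```

Under these assumptions, pairing a title-prefix absence marker with a title-suffix presence marker separates a profile page from a generic landing page even when both return the same HTTP status, which is the kind of false positive this commit targets.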
18 changes: 9 additions & 9 deletions sites.md
@@ -1442,7 +1442,7 @@ Rank data fetched from Alexa by domains.
1. ![](https://www.google.com/s2/favicons?domain=http://rodobozhie.ru) [rodobozhie.ru (http://rodobozhie.ru)](http://rodobozhie.ru)*: top 10M, ru*
1. ![](https://www.google.com/s2/favicons?domain=http://satwarez.ru) [satwarez.ru (http://satwarez.ru)](http://satwarez.ru)*: top 10M*
1. ![](https://www.google.com/s2/favicons?domain=http://nf-club.ru) [nf-club.ru (http://nf-club.ru)](http://nf-club.ru)*: top 10M*
-1. ![](https://www.google.com/s2/favicons?domain=http://linuxmint.info) [linuxmint.info (http://linuxmint.info)](http://linuxmint.info)*: top 10M, ru*
+1. ![](https://www.google.com/s2/favicons?domain=http://linuxmint.info) [linuxmint.info (http://linuxmint.info)](http://linuxmint.info)*: top 10M, ru*, search is disabled
1. ![](https://www.google.com/s2/favicons?domain=http://kiabongo.info) [kiabongo.info (http://kiabongo.info)](http://kiabongo.info)*: top 10M*
1. ![](https://www.google.com/s2/favicons?domain=http://koshtoris.at.ua) [koshtoris.at.ua (http://koshtoris.at.ua)](http://koshtoris.at.ua)*: top 10M*
1. ![](https://www.google.com/s2/favicons?domain=http://xn--90anbhklk.xn--p1ai) [xn--90anbhklk.xn--p1ai (http://xn--90anbhklk.xn--p1ai)](http://xn--90anbhklk.xn--p1ai)*: top 10M*
@@ -1469,7 +1469,7 @@ Rank data fetched from Alexa by domains.
1. ![](https://www.google.com/s2/favicons?domain=http://wolga24.at.ua) [wolga24.at.ua (http://wolga24.at.ua)](http://wolga24.at.ua)*: top 10M*
1. ![](https://www.google.com/s2/favicons?domain=http://millerovo161.ru) [millerovo161.ru (http://millerovo161.ru)](http://millerovo161.ru)*: top 10M*
1. ![](https://www.google.com/s2/favicons?domain=http://videomuzon.ucoz.ru) [videomuzon.ucoz.ru (http://videomuzon.ucoz.ru)](http://videomuzon.ucoz.ru)*: top 10M*
-1. ![](https://www.google.com/s2/favicons?domain=https://community.autolenta.ru) [Autolenta (https://community.autolenta.ru)](https://community.autolenta.ru)*: top 10M, auto, forum, ru*
+1. ![](https://www.google.com/s2/favicons?domain=https://community.autolenta.ru) [Autolenta (https://community.autolenta.ru)](https://community.autolenta.ru)*: top 10M, auto, forum, ru*, search is disabled
1. ![](https://www.google.com/s2/favicons?domain=http://amax-sb.ru) [amax-sb.ru (http://amax-sb.ru)](http://amax-sb.ru)*: top 10M*
1. ![](https://www.google.com/s2/favicons?domain=http://angelgothics.ru) [Angelgothics (http://angelgothics.ru)](http://angelgothics.ru)*: top 10M, ru*
1. ![](https://www.google.com/s2/favicons?domain=http://help-baby.org) [help-baby.org (http://help-baby.org)](http://help-baby.org)*: top 10M*
@@ -2283,8 +2283,8 @@ Rank data fetched from Alexa by domains.
1. ![](https://www.google.com/s2/favicons?domain=https://freelance.ru) [freelance.ru (https://freelance.ru)](https://freelance.ru)*: top 100M, ru*
1. ![](https://www.google.com/s2/favicons?domain=https://freelansim.ru) [freelansim.ru (https://freelansim.ru)](https://freelansim.ru)*: top 100M*
1. ![](https://www.google.com/s2/favicons?domain=http://fotolog.com) [fotolog.com (http://fotolog.com)](http://fotolog.com)*: top 100M, in*
-1. ![](https://www.google.com/s2/favicons?domain=http://thoughts.com) [thoughts.com (http://thoughts.com)](http://thoughts.com)*: top 100M, in*
-1. ![](https://www.google.com/s2/favicons?domain=https://hackernoon.com) [hackernoon.com (https://hackernoon.com)](https://hackernoon.com)*: top 100M, in, us*
+1. ![](https://www.google.com/s2/favicons?domain=http://thoughts.com) [thoughts.com (http://thoughts.com)](http://thoughts.com)*: top 100M, blog*
+1. ![](https://www.google.com/s2/favicons?domain=https://hackernoon.com) [hackernoon.com (https://hackernoon.com)](https://hackernoon.com)*: top 100M, news, us*
1. ![](https://www.google.com/s2/favicons?domain=https://intigriti.com) [Intigriti (https://intigriti.com)](https://intigriti.com)*: top 100M, hacking, in*
1. ![](https://www.google.com/s2/favicons?domain=https://yamaya.ru) [yamaya.ru (https://yamaya.ru)](https://yamaya.ru)*: top 100M, ru*
1. ![](https://www.google.com/s2/favicons?domain=https://www.tinkoff.ru/invest/) [Tinkoff Invest (https://www.tinkoff.ru/invest/)](https://www.tinkoff.ru/invest/)*: top 100M, ru*
@@ -2599,20 +2599,20 @@ Rank data fetched from Alexa by domains.
1. ![](https://www.google.com/s2/favicons?domain=https://www.hozpitality.com) [hozpitality (https://www.hozpitality.com)](https://www.hozpitality.com)*: top 100M*
1. ![](https://www.google.com/s2/favicons?domain=https://kazanlashkigalab.com) [kazanlashkigalab.com (https://kazanlashkigalab.com)](https://kazanlashkigalab.com)*: top 100M, kz*

-Alexa.com rank data fetched at (2022-02-26 12:55:54.605333 UTC)
+The list was updated at (2022-02-26 13:41:10.351473 UTC)
## Statistics

-Enabled/total sites: 2443/2595 = 94.14%
+Enabled/total sites: 2441/2595 = 94.07%

-Incomplete checks: 525/1853 = 28.33% (false positive risks)
+Incomplete checks: 523/1853 = 28.22% (false positive risks)

Top 20 profile URLs:
- (796) `{urlMain}/index/8-0-{username} (uCoz)`
- (221) `{urlMain}{urlSubpath}/members/?username={username} (XenForo)`
- (221) `/{username}`
- (138) `/user/{username}`
- (134) `{urlMain}{urlSubpath}/member.php?username={username} (vBulletin)`
-- (97) `/profile/{username}`
+- (96) `/profile/{username}`
- (87) `{urlMain}/u/{username}/summary (Discourse)`
- (74) `/users/{username}`
- (44) `{urlMain}{urlSubpath}/search.php?author={username} (phpBB/Search)`
@@ -2621,7 +2621,7 @@ Top 20 profile URLs:
- (36) `/@{username}`
- (28) `/u/{username}`
- (27) `{urlMain}{urlSubpath}/memberlist.php?username={username} (phpBB)`
-- (24) `/members/{username}`
+- (25) `/members/{username}`
- (18) `/forum/members/?username={username}`
- (18) `/forum/search.php?keywords=&terms=all&author={username}`
- (17) `/search.php?keywords=&terms=all&author={username}`
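The updated statistics follow directly from the counts: the two sites disabled in this commit (linuxmint.info and Autolenta) lower the enabled total from 2443 to 2441. The arithmetic can be checked with a short sketch; `share` is a hypothetical helper, not the function that generates `sites.md`.

```python
def share(part: int, total: int) -> str:
    """Format part/total as a percentage with two decimals (hypothetical helper)."""
    return f"{part}/{total} = {part / total * 100:.2f}%"


# Counts taken from the Statistics section of sites.md in this diff
print("Enabled/total sites:", share(2441, 2595))
print("Incomplete checks:", share(523, 1853))
```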
2 changes: 1 addition & 1 deletion utils/update_site_data.py
@@ -137,7 +137,7 @@ def get_readable_rank(r):
site_file.write(f'1. {favicon} [{site}]({url_main})*: top {valid_rank}{tags}*{note}\n')
db.update_site(site)

-site_file.write(f'\nAlexa.com rank data fetched at ({datetime.utcnow()} UTC)\n')
+site_file.write(f'\nThe list was updated at ({datetime.utcnow()} UTC)\n')
db.save_to_file(args.base_file)

statistics_text = db.get_db_stats(is_markdown=True)