How to get the original bytes of the PE file. I want to convert a file to a gray image. #85

Open
zxC0der opened this issue May 5, 2022 · 5 comments

Comments

@zxC0der

zxC0der commented May 5, 2022

Thanks

@kevin3567

Same here. Would it be possible to acquire the original bytes of the malware files (or at least the bytes of the PE headers)?

@lkurlandski

lkurlandski commented Sep 21, 2022

No. In their paper, the authors discuss why they do not release the raw executables. The SOREL project worked on improving upon some of the shortcomings of EMBER, including this issue. It releases raw binaries, but for the malware files only. You can check them out here: SOREL-20M
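
For anyone who does have raw binaries on disk (for example, malware files from a dataset that ships them), the conversion asked about in the issue title might look roughly like the sketch below. This is not taken from the EMBER or SOREL authors. The function names, the default width of 256 pixels, and the zero padding of the last row are all assumptions made for illustration. It writes a binary PGM file so that only the Python standard library is needed. Check how your source stores the files first, since some datasets distribute them compressed or disarmed.

```python
import math
from pathlib import Path


def bytes_to_gray_rows(data: bytes, width: int = 256) -> list[bytes]:
    """Split raw bytes into fixed-width rows; each byte is one pixel (0-255).

    The width is an arbitrary choice (assumption), not a value from the thread.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    height = max(1, math.ceil(len(data) / width))
    # Zero-pad the tail so the last row is full width (assumption: pad with 0).
    padded = data.ljust(height * width, b"\x00")
    return [padded[i * width:(i + 1) * width] for i in range(height)]


def write_pgm(rows: list[bytes], out_path: str) -> None:
    """Write the rows as a binary PGM (P5) file, a plain 8-bit grayscale format."""
    height, width = len(rows), len(rows[0])
    header = f"P5\n{width} {height}\n255\n".encode("ascii")
    Path(out_path).write_bytes(header + b"".join(rows))


def file_to_gray_image(in_path: str, out_path: str, width: int = 256) -> None:
    """Read any file as raw bytes and save it as a grayscale PGM image."""
    rows = bytes_to_gray_rows(Path(in_path).read_bytes(), width)
    write_pgm(rows, out_path)
```

Calling `file_to_gray_image("sample.bin", "sample.pgm")` (both paths hypothetical) would produce an image that most viewers can open. A fixed width keeps every image the same number of columns, at the cost of a variable height.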

@isimsizolan

Yet SOREL did not publish benign sets either.

@lkurlandski

Yes, because benign software is typically proprietary.

@isimsizolan

isimsizolan commented Nov 2, 2022

Still, it is a hard limitation. Only a select few have access to a good, ground-truth benign dataset of proprietary software.

I also think it is an excuse, because you could in theory collect some of that proprietary software yourself, and it could be disarmed the same way malware is disarmed.

I'm a full-time academic at one of the most respected universities in my country. I have contacted almost all security companies and literally begged for a good ground-truth benign dataset, and none responded positively.

Just a few days ago, VirusTotal refused my API request to get AV scores for my own benign-only dataset, which I wanted at least as a baseline. They said "it is extremely unethical to compete with anti-virus companies using their product line".
