Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

adding disasm_iter functionality #114

Closed
wants to merge 6 commits into from
Closed

adding disasm_iter functionality #114

wants to merge 6 commits into from

Conversation

dnsserver
Copy link

I needed the recursive disasm functionality.
I haven't done extensive testing although I've been using this code for some time now.

@@ -18,6 +18,9 @@ travis-ci = { repository = "capstone-rust/capstone-rs" }
capstone-sys = { path = "../capstone-sys", version = "0.14.0" }
libc = { version = "0.2", default-features = false }

[dependencies.object]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to avoid bringing in this dependency. Is it used? Could you make it a "dev-dependency" instead?

let cs = if let Ok(cs) = Capstone::new()
.x86()
.mode(arch::x86::ArchMode::Mode64)
.detail(true)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please run cargo fmt to format the code

@@ -165,6 +165,28 @@ impl Capstone {
}
}

pub fn iter_malloc<'a>(&'a self) -> *mut cs_insn {
unsafe {
cs_malloc(self.csh())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this allocates memory but never frees it; this is a memory leak

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

100%! I will update, thanks.

let mut addr_queue: VecDeque<u64> = VecDeque::new();
let mut addr_seen: HashMap<u64, bool> = HashMap::new();

let cs_ins = cs.iter_malloc();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is undefined behavior (instantiating an Rust struct with initialized memory).

To work with uninitialized data, look into the MaybeUninit type.

Copy link
Author

@dnsserver dnsserver Sep 20, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will double check this. However the iter_malloc does allocate memory for 1 instruction, and that is a raw pointer:
*mut cs_insn.
I just used this fn: https://docs.rs/capstone-sys/0.5.0/capstone_sys/fn.cs_malloc.html

@tmfink
Copy link
Member

tmfink commented Sep 19, 2021

Thanks for the PR! What is the purpose of disasm_iter() here? What issues were you running into?

@dnsserver
Copy link
Author

I didn't run into any issues. I needed to do recursive disassembly for obfuscated instructions.

@tmfink
Copy link
Member

tmfink commented Sep 20, 2021

I didn't run into any issues. I needed to do recursive disassembly for obfuscated instructions.

Sorry I wasn't clear with my question. What did this crate not allow you to do before that this new function disasm_iter() allows you to do?

@dnsserver
Copy link
Author

Oh, recursive disassembly is good for use cases where data and code are mixed or when x86 instructions are obfuscated.
This is what this issue is about: #1
Pretty much allows me to disassemble chunks of bytes at a time, specifying offset where to start disassembling (following jumps), and have control when to stop. This is much easier done with capstone then trying to manually feed the chunks of data to disasm_all function IMHO.

Copy link
Member

@tmfink tmfink left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy to add an example program that does recursive disassembly, but I'm not inclined to accept the new disasm_iter() function in the current state since it allows users to violate memory safety.

One of the biggest advantages of Rust is the memory safety, so we want to make sure we preserve that.

You can see from #1 that I've previously tried to add support for cs_disasm_iter() unsuccessfully.

My recommendation is that you remove the disasm_iter() feature from this PR and just add the recursive disassembly example. If you want to in a follow-up PR, you can try to add disasm_iter() in a memory-safe way.

}
}

pub fn disasm_iter<'a>(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the purpose of adding this function? Why are the existing disasm_all() or disasm_count() not sufficient?

This function dereferences raw pointers and is hence unsafe.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need to think more about what the disasm_iter() function interface should be.

Instead of a C-like interface, we want a "Rusty" interface that:

  1. Does not require the user to use unsafe
    a. Does not take any pointers
  2. Does not require manual malloc/free functions to avoid memory leaks, double frees, etc.
  3. Is easy to use
    a. Does not return a tuple (an enum/struct instead)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. Let me close this PR. I can do a followup with these guidelines


pub fn disasm_iter<'a>(
&'a self,
code: *mut *const u8,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Avoid exposing raw pointers in the public API. We want to expose a memory-safe interface.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for all your feedback. I really appreciate it.
I will take a look and see if I can clean even more the API. Feel free to decline this PR.

@dnsserver dnsserver closed this Sep 20, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants