TI ARMv7R COFF files are not recognized · Issue #760 · gimli-rs/object · GitHub

Open
xobs opened this issue Mar 18, 2025 · 5 comments

Comments

xobs commented Mar 18, 2025

I have some COFF files from TI. They're recognized by Ghidra as having COFF Level 2 magic and start with:

00000000  c2 00 0f 00 cf a2 30 55  74 e4 00 00 ee 01 00 00  |......0Ut.......|
00000010  1c 00 f3 12 97 00 08 01  8a 13 d8 0b 00 00 00 00  |................|
00000020  00 00 51 00 00 00 08 07  00 08 40 00 00 08 00 00  |..Q.......@.....|
00000030  00 00 00 00 00 00 5b 08  00 00 00 00 00 00 00 00  |......[.........|
00000040  00 00 30 00 00 00 02 03  00 00 00 00 00 00 00 00  |..0.............|
00000050  00 00 00 00 00 00 00 00  00 00 10 00 00 00 08 00  |................|
00000060  00 00 2e 69 6e 74 76 65  63 73 00 00 00 08 00 00  |...intvecs......|

I believe it's just a simple difference in the header, possibly with an additional field for the f_target_id, but I don't quite understand how the parsing works yet.
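
For reference, the first 22 bytes of the dump above can be decoded by hand. The sketch below assumes the layout guessed at in this comment: the usual 20-byte COFF file header followed by a 2-byte target ID, all little-endian. The struct, field, and function names are illustrative and are not part of the object crate.

```rust
/// Hypothetical view of a TI COFF file header (22 bytes, little-endian).
#[derive(Debug, PartialEq)]
struct TiFileHeader {
    version_id: u16,
    num_sections: u16,
    timestamp: u32,
    symbol_table_ptr: u32,
    num_symbols: u32,
    optional_header_size: u16,
    flags: u16,
    target_id: u16,
}

/// Read a little-endian u16, or `None` if the data is too short.
fn u16_at(data: &[u8], offset: usize) -> Option<u16> {
    let bytes = data.get(offset..offset + 2)?;
    Some(u16::from_le_bytes([bytes[0], bytes[1]]))
}

/// Read a little-endian u32, or `None` if the data is too short.
fn u32_at(data: &[u8], offset: usize) -> Option<u32> {
    let bytes = data.get(offset..offset + 4)?;
    Some(u32::from_le_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]))
}

/// Decode the assumed 22-byte header from the start of `data`.
fn parse_ti_file_header(data: &[u8]) -> Option<TiFileHeader> {
    Some(TiFileHeader {
        version_id: u16_at(data, 0)?,
        num_sections: u16_at(data, 2)?,
        timestamp: u32_at(data, 4)?,
        symbol_table_ptr: u32_at(data, 8)?,
        num_symbols: u32_at(data, 12)?,
        optional_header_size: u16_at(data, 16)?,
        flags: u16_at(data, 18)?,
        // The extra field that Microsoft COFF does not have.
        target_id: u16_at(data, 20)?,
    })
}

fn main() {
    // First 22 bytes of the hex dump in this comment.
    let dump: [u8; 22] = [
        0xc2, 0x00, 0x0f, 0x00, 0xcf, 0xa2, 0x30, 0x55, 0x74, 0xe4, 0x00,
        0x00, 0xee, 0x01, 0x00, 0x00, 0x1c, 0x00, 0xf3, 0x12, 0x97, 0x00,
    ];
    let header = parse_ti_file_header(&dump).expect("22 bytes available");
    println!("{:#x?}", header);
}
```

Under that assumption the dump describes 15 sections, a symbol table at 0xe474 with 0x1ee entries, a 0x1c-byte optional header, and a target ID of 0x97.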

Author
xobs commented Mar 19, 2025

I was able to get it to read the section headers. I've had to make the following modifications:

  1. Section names can be more than 8 bytes. If the first four bytes are zeroes, then the next four bytes are treated as an index into the string table.[1]
  2. The header has an additional 2 bytes in it.[2]
  3. The section header has additional fields, and some of the actual field sizes are u32 rather than u16.[3]

[1] Getting the string name:

let (string_type, string_table_index) = section.name.split_at(4);
let string_type = u32::from_le_bytes(string_type.try_into().unwrap());
if string_type == 0 {
    let name_index = u32::from_le_bytes(string_table_index.try_into().unwrap());
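    // The string table starts right after the symbol table, so the symbol
    // count must be scaled by the symbol entry size before adding the index.
    let mut string_table_index: u64 = u64::from(header.pointer_to_symbol_table())
        + u64::from(header.number_of_symbols()) * pe::IMAGE_SIZEOF_SYMBOL as u64;
    string_table_index += u64::from(name_index);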
    let name = data
        .read_bytes_at_until(
            string_table_index..data.len().read_error("Unable to get data length")?,
            0,
        )
        .read_error("String index not found")?;
    let Ok(name) = str::from_utf8(name) else {
        return Err(Error("Unable to locate string"));
    };
    println!("Section Name: {:x?}", name);
}
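
As a cross-check of the offset arithmetic, here is a small standalone sketch using the values visible in the hex dump from the first comment (symbol table pointer 0xe474, 0x1ee symbols, and a name index of 0x085b in the first section header). It assumes that TI COFF symbol entries are 18 bytes, as in standard COFF, and that name indices are relative to the start of the string table; both function names are made up for illustration.

```rust
/// Size of one symbol table entry in bytes. This is 18 in standard COFF and
/// is assumed here to be the same for TI COFF.
const SYMBOL_SIZE: u64 = 18;

/// File offset of the string table, which follows the symbol table.
fn string_table_start(symbol_table_ptr: u32, num_symbols: u32) -> u64 {
    u64::from(symbol_table_ptr) + u64::from(num_symbols) * SYMBOL_SIZE
}

/// File offset of a name, given its index into the string table.
fn name_file_offset(symbol_table_ptr: u32, num_symbols: u32, name_index: u32) -> u64 {
    string_table_start(symbol_table_ptr, num_symbols) + u64::from(name_index)
}

fn main() {
    let start = string_table_start(0xe474, 0x1ee);
    let name = name_file_offset(0xe474, 0x1ee, 0x085b);
    println!("string table at {:#x}, first section name at {:#x}", start, name);
}
```

Widening to u64 before the multiplication avoids overflow on malformed headers with very large symbol counts.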

[2] Skipping the extra machine type:

const TI_COFF1_MAGIC: U16<LE> = U16::from_bytes([0xc1, 0x00]);
const TI_COFF2_MAGIC: U16<LE> = U16::from_bytes([0xc2, 0x00]);
// TI adds an additional `target_id` field. Skip over that.
if header.machine == TI_COFF1_MAGIC || header.machine == TI_COFF2_MAGIC {
    println!("TI magic detected -- skipping over machine type");
    *offset = offset
        .checked_add(2)
        .read_error("TI header could not be skipped")?;
}

[3] Bigger section header:

#[derive(Debug, Default, Clone, Copy)]
#[repr(C)]
pub struct ImageSectionHeader {
    pub name: [u8; IMAGE_SIZEOF_SHORT_NAME],
    pub virtual_size: U32<LE>,
    pub virtual_address: U32<LE>,
    pub size_of_raw_data: U32<LE>,
    pub pointer_to_raw_data: U32<LE>,
    pub pointer_to_relocations: U32<LE>,
    pub pointer_to_linenumbers: U32<LE>,
    pub number_of_relocations: U32<LE>,
    pub number_of_linenumbers: U32<LE>,
    pub characteristics: U32<LE>,
    pub reserved: U16<LE>,
    pub page: U16<LE>,
}
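
The struct above adds up to 48 bytes (8 for the name, nine 4-byte fields, and two 2-byte fields). A rough sketch of where each section header would then sit in the file, assuming a 22-byte file header followed by the optional header; the constants and the function name are illustrative only:

```rust
/// Assumed TI file header size: 20-byte COFF header plus 2-byte target ID.
const TI_FILE_HEADER_SIZE: u64 = 22;
/// Size of the section header struct above: 8 + 9 * 4 + 2 + 2 bytes.
const TI_SECTION_HEADER_SIZE: u64 = 48;

/// File offset of section header `index`, given the optional header size
/// read from the file header.
fn section_header_offset(optional_header_size: u16, index: u16) -> u64 {
    TI_FILE_HEADER_SIZE
        + u64::from(optional_header_size)
        + u64::from(index) * TI_SECTION_HEADER_SIZE
}

fn main() {
    // The dump in the first comment has a 0x1c-byte optional header.
    for index in 0..2 {
        println!(
            "section {} header at {:#x}",
            index,
            section_header_offset(0x1c, index)
        );
    }
}
```

With the 0x1c-byte optional header from the dump, the second header lands at 0x62, which is where the ".intvecs" name appears in the hex dump, so the 48-byte size looks consistent with that file.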

Author
xobs commented Mar 19, 2025

I have a few questions:

  1. Is there any interest in parsing these formats? I'd like to be able to parse these object files from TI, but if there's no interest then I can stop here.
  2. What's the best way to handle the varying header sizes? Should I create ImageSectionHeaderV2 (and, if Ghidra's source is to be believed, ImageSectionHeaderV1)? How do I return different types?
  3. How do I handle section names? Right now the name is just an array of 8 bytes, but it's also very possible for section names to be arbitrary-length strings. Should I allocate strings? Or add a helper function to return the string?

Contributor
philipc commented Mar 19, 2025

The COFF implementation in this crate is Microsoft's COFF. I probably should have named it something different to indicate that.

TI states: "So, while TI and MicroSoft both use COFF, the resulting object files are in no way compatible."

So we probably shouldn't be trying to adapt the existing COFF support to work for TI as well.

> Is there any interest in parsing these formats?

Which compiler toolchains can produce TI COFF? In particular, can rustc produce it?

The primary purpose of this crate is to provide a unified API for parsing these file formats, and it is generally only used for file formats that the rust toolchain can produce. This is intended to enable rust code to work regardless of which platform it is running on.

If you are writing code that will only ever work for TI, then the unified API is probably not appropriate. I'd still be open to adding lower level parsing if it doesn't complicate the other code.

> What's the best way to handle the varying header sizes? Should I create ImageSectionHeaderV2 (and, if Ghidra's source is to be believed, ImageSectionHeaderV1)? How do I return different types?

ImageSectionHeader is Microsoft's naming convention. I would avoid using anything related to that name for formats that are not from Microsoft.

Where are you wanting to return this type from?

> How do I handle section names? Right now the name is just an array of 8 bytes, but it's also very possible for section names to be arbitrary-length strings. Should I allocate strings? Or add a helper function to return the string?

We already handle section names that are string table offsets for Microsoft COFF. This is just a different way of encoding the offset (and Microsoft's symbol names already use this encoding). You should be able to return a reference to the string table entry, with no need to allocate.
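
To illustrate the "no need to allocate" point, here is a minimal sketch of a borrowed string table lookup. `StringTable` and `get` here are hypothetical stand-ins written for this example, not the object crate's own types, and the table contents in `main` are made up.

```rust
/// Hypothetical borrowed view of a COFF-style string table.
struct StringTable<'data> {
    bytes: &'data [u8],
}

impl<'data> StringTable<'data> {
    /// Return the NUL-terminated entry starting at `offset`, borrowed from
    /// the underlying file data. No allocation or copying takes place.
    fn get(&self, offset: u32) -> Option<&'data [u8]> {
        let tail = self.bytes.get(offset as usize..)?;
        let len = tail.iter().position(|&b| b == 0)?;
        Some(&tail[..len])
    }
}

fn main() {
    // Made-up table: a 4-byte size field followed by two names.
    let table = StringTable {
        bytes: b"\x14\x00\x00\x00.text_long\x00.bss\x00",
    };
    let name = table.get(4).expect("entry exists");
    println!("{}", String::from_utf8_lossy(name));
}
```

Because the returned slice carries the `'data` lifetime rather than the lifetime of `&self`, it can outlive the `StringTable` value as long as the file data is still alive.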

Author
xobs commented Mar 20, 2025

The goal for all of this is to understand the flashing process used by TI. To that end, I've set up unicorn-engine to execute routines based on their flashing API, and I need to parse the object files they provide.

So far the only tool I've found that understands this format is Ghidra, and up until now I've gotten binary images by copying the hex dumps out and massaging them with a program to get them into binary form.

This is ultimately part of my goal to get support for the TI TMS570 into probe-rs. I can't actually generate these files because I'm not sure what compiler they came from. I can't even really ingest them in any useful form. Ghidra and object with these patches are the only two useful methods I've found to inspect them at all.

It sounds like it's most appropriate to relegate this code to a temporary hack, rather than maintain it for all eternity inside the main repository. If nothing else, I've at least documented some breadcrumbs if someone else (or, more likely, me in the future) wants to pick this up and parse these files.

Author
xobs commented Mar 20, 2025

Also, you're right, this works as an implementation of name_offset(&self):

    /// Return the string table offset of the section name.
    ///
    /// Returns `Ok(None)` if the name doesn't use the string table
    /// and can be obtained with `raw_name` instead.
    pub fn name_offset(&self) -> Result<Option<u32>> {
        let (string_type, string_table_index) = self.name.split_at(4);
        let string_type = u32::from_le_bytes(
            string_type
                .try_into()
                .ok()
                .read_error("invalid string type")?,
        );
        if string_type != 0 {
            return Ok(None);
        }
        Ok(Some(u32::from_le_bytes(
            string_table_index
                .try_into()
                .ok()
                .read_error("invalid string table index")?,
        )))
    }
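
A standalone version of the same check, using only the standard library, can be tried against the two name fields visible in the hex dump from the first comment. The free function below is a simplified stand-in for the method above, written for illustration.

```rust
/// Simplified stand-in for the method above: returns the string table
/// offset if the first four bytes of the name field are zero, or `None`
/// if the name is stored inline.
fn name_offset(raw_name: &[u8; 8]) -> Option<u32> {
    let marker = u32::from_le_bytes([raw_name[0], raw_name[1], raw_name[2], raw_name[3]]);
    if marker != 0 {
        return None;
    }
    Some(u32::from_le_bytes([
        raw_name[4],
        raw_name[5],
        raw_name[6],
        raw_name[7],
    ]))
}

fn main() {
    // Name field of the first section header (file offset 0x32 in the dump).
    let first = [0x00, 0x00, 0x00, 0x00, 0x5b, 0x08, 0x00, 0x00];
    // Name field of the second section header (file offset 0x62 in the dump).
    let second = *b".intvecs";
    println!("{:x?} {:x?}", name_offset(&first), name_offset(&second));
}
```

The first name resolves to a string table offset, while ".intvecs" fits in the 8-byte field and is reported as inline.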
