Initial draft of harp file standard #69
@bruno-f-cruz Thanks, will review soon. Are we changing the recommended casing from Harp to harp? I don't necessarily mind, but whichever way we decide to go, I would like us to be consistent across all documents. I think all references to the original upper-case HARP have now been replaced with Harp. Note that calling the protocol "Harp" does not mean we need to exclude having CLI-tools and URL use Maybe we can create a separate proposal for this if we want to change it?
I may be confused, but I thought the convention after the discussion for the SfN poster was to lower-case everything. That being said, I am also ok with either and we should refactor the docs for sure. Does anyone else have a preference? @artursilva0 @filcarv @jfrazao maybe?
I thought we had already settled on Harp a while back. One downside is that the logo is also in lowercase, but I'm ok with either option.
@artursilva0 this is not inconsistent with other projects. No need to look further than Git. Logo in lower-case, all references in docs and everywhere else with a capitalized first letter.
I do remember hearing something about a preference for lower-case, but I don't remember the decision and, like @artursilva0, I thought the matter was decided long ago. I think changing everything now would be complex and create even more confusion, since we have already used HARP and Harp, and introducing harp in the docs just creates even more combinations.
All relevant references to Harp have now been capitalized in the text. Other references were left as lower case where appropriate (e.g. paths and URLs).
Looks good to me, just minor comments for clarity and to try and accommodate the various ways in which we have been using the de-multiplexed file format.
where:

- the character "_" is reserved as a separator between fields.
- `<DeviceName>` should match the `device.yml` metadata file that fully defines the device and can be found in the repository of each device ([e.g.](https://raw.githubusercontent.com/harp-tech/device.behavior/main/device.yml)). This file can be seen as the "ground-truth" specification of the device. It is used to automatically generate documentation, interfaces and data ingestion tools. While this is not a strict requirement, it is highly recommended.
- `<DeviceName>` should match the `device.yml` metadata file that fully defines the device and can be found in the repository of each device ([e.g.](https://raw.githubusercontent.com/harp-tech/device.behavior/main/device.yml)). This file can be seen as the "ground-truth" specification of the device. It is used to automatically generate documentation, interfaces and data ingestion tools. While this is not a strict requirement, it is highly recommended.
- `<DeviceName>` should match the name in the `device.yml` metadata file that fully defines the device and can be found in the repository of each device ([e.g.](https://raw.githubusercontent.com/harp-tech/device.behavior/main/device.yml)). This file can be seen as the "ground-truth" specification of the device. It is used to automatically generate documentation, interfaces and data ingestion tools. While this is not a strict requirement, it is highly recommended.
Do we need to require that `<DeviceName>` matches the device type, or is it enough to match the device folder? In our current standards we tend to have individual files inside container folders be named following the full hierarchy structure leading up to the file.
This is useful in cases where there may be other types of files stored in the experimental device folder which are not Harp binary files themselves, but are data stored and synced together with this device (e.g. video frame data).
In this case if we change the convention for non-Harp files the naming convention will be inconsistent. Alternatively, if we do change the convention to name it after the Harp device type, the naming convention may be misleading, e.g. consider the following:
CameraTop
┣ 📜Behavior_0.bin
┣ 📜Behavior_1.bin
┗ 📜CameraTop_Video.avi
CameraSide
┣ 📜Behavior_0.bin
┣ 📜Behavior_1.bin
┗ 📜CameraSide_Video.avi
If we do want to include the device type in the name, I guess we could add the container name as a prefix and keep the Harp device type in the file name:
CameraTop
┣ 📜CameraTop_Behavior_0.bin
┣ 📜CameraTop_Behavior_1.bin
┗ 📜CameraTop_Video.avi
CameraSide
┣ 📜CameraSide_Behavior_0.bin
┣ 📜CameraSide_Behavior_1.bin
┗ 📜CameraSide_Video.avi
There are definitely other possible alternatives such as nested folders, etc., but as above I feel we should avoid being overly prescriptive. It is true that `harp-python` currently adheres to a specific structure, but perhaps we could make it flexible enough to support a variety of options.
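To make the two naming options above concrete, here is a minimal sketch of a parser that accepts both `Behavior_0.bin` and `CameraTop_Behavior_0.bin`. It is not how `harp-python` works today; the pattern, the `HarpFileName` type and the `parse_harp_file_name` helper are hypothetical names for illustration, and it assumes "_" only ever appears as a field separator and that the last field is a numeric register address.

```python
import re
from typing import NamedTuple, Optional

# Hypothetical pattern: [<Container>_]<DeviceName>_<RegisterAddress>.bin
# Assumes "_" is reserved as a field separator, as proposed in the draft.
HARP_FILE = re.compile(
    r"^(?:(?P<container>[^_]+)_)?(?P<device>[^_]+)_(?P<register>\d+)\.bin$"
)


class HarpFileName(NamedTuple):
    container: Optional[str]  # None when no container prefix is present
    device: str
    register: int


def parse_harp_file_name(name: str) -> Optional[HarpFileName]:
    """Return the parsed fields, or None if the name is not a register file."""
    match = HARP_FILE.match(name)
    if match is None:
        return None
    return HarpFileName(match["container"], match["device"], int(match["register"]))
```

With this pattern, non-Harp files such as `CameraTop_Video.avi` simply do not match and can be left to other tooling.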
I am not sure the standard should encompass multi-file-type containers. In my mind, if you consider the Harp container as "indivisible", your use case could instead be modelled as:
CameraTop
┣ CameraTop.harp
┣ 📜CameraTop_Behavior_0.bin
┣ 📜CameraTop_Behavior_1.bin
┗ 📜CameraTop_Video.avi
What do we gain by doing this? In my mind, more predictability of the file structure. At the end of the day, the reasons for standardizing the structure are, in my mind:
1. to know what device we have
   a. if device.yml is present, we are done
   b. if no device.yml is present, it would be nice to at least have a sane way to find where the whoami register is so we can read it and attempt to fetch the device.yml
2. to have a patterned way to open all files inside the container in an easily automated way.
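As a rough sketch of point 1, the lookup could be as simple as the function below. It assumes that `device.yml` sits directly inside the container, that files are named `<DeviceName>_<RegisterAddress>.bin`, and that the whoami register lives at address 0; `locate_device_identity` and `WHO_AM_I_ADDRESS` are hypothetical names, not part of any existing library.

```python
from pathlib import Path
from typing import Optional, Tuple

# Assumption: the whoami register is stored at address 0.
WHO_AM_I_ADDRESS = 0


def locate_device_identity(container: Path) -> Optional[Tuple[str, Path]]:
    """Find the best available source for identifying the device.

    Returns ("device.yml", path) if the metadata file is present,
    ("whoami", path) if only the whoami register file is found,
    and None if neither exists.
    """
    metadata = container / "device.yml"
    if metadata.is_file():
        return ("device.yml", metadata)

    # Fall back to the register file, assuming the
    # <DeviceName>_<RegisterAddress>.bin naming convention.
    candidates = sorted(container.glob(f"*_{WHO_AM_I_ADDRESS}.bin"))
    if candidates:
        return ("whoami", candidates[0])
    return None
```

Reading the register contents and fetching the matching `device.yml` are left out on purpose, since those steps depend on decisions outside this sketch.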
If we want to allow for multiple file names inside the container, I am worried about running into the following situation:
CameraTop
┣ 📜CameraTop_Behavior_0.bin
┣ 📜CameraTop_Behavior_1.bin
┣ 📜CameraTop_Wheel_0.bin
┣ 📜CameraTop_Wheel_1.bin
┗ 📜CameraTop_Video.avi
If I now want to ingest this datastream using the harp-python library, I would need to solve the general case of Behavior vs Wheel. I am sure this can be handled by a custom regexp, but I would rather consider that as outside the scope of this standard. Otherwise, I am not really sure we gain much by standardizing anything other than "all Harp files are expected to have data from a single register, wherein all messages have the same length and data type".
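For reference, the kind of custom regexp handling mentioned here might look like the sketch below. It groups register files by everything that precedes the trailing register address, so a container holding both Behavior and Wheel files yields two groups. The function name and pattern are hypothetical and only assume that file names end in `_<RegisterAddress>.bin`.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable

# Hypothetical pattern: everything before the last "_<digits>.bin" is the stem.
REGISTER_FILE = re.compile(r"^(?P<stem>.+)_(?P<register>\d+)\.bin$")


def group_register_files(names: Iterable[str]) -> Dict[str, Dict[int, str]]:
    """Map each file stem to a {register address: file name} dictionary."""
    groups: Dict[str, Dict[int, str]] = defaultdict(dict)
    for name in names:
        match = REGISTER_FILE.match(name)
        if match is None:
            continue  # not a register file, e.g. a video file
        groups[match["stem"]][int(match["register"])] = name
    return dict(groups)


listing = [
    "CameraTop_Behavior_0.bin",
    "CameraTop_Behavior_1.bin",
    "CameraTop_Wheel_0.bin",
    "CameraTop_Wheel_1.bin",
    "CameraTop_Video.avi",
]
groups = group_register_files(listing)
# More than one stem means the container mixes several devices.
is_ambiguous = len(groups) > 1
```

A check like `is_ambiguous` is one way a reader could reject such a container outright instead of trying to guess which device the caller meant.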
Co-authored-by: glopesdev <glopesdev@users.noreply.github.com>
This PR includes a first draft of the standardized Harp file format.