mpi3mr-based raid controllers are not supported (e.g., PERC H965i) #313
Comments
It's not a bug; it just means this driver/controller is not supported. If you would like to have it supported, there are two options:
|
P.S. I see that mpi3mr is a completely new driver, written from scratch. Do you know if the vendor tool could at least show some basic data, like a SMART identify packet? If not, it would be very hard to guess the interface. |
Thank you for your input / effort! When there is any new information, will give an update here. |
While we are trying to pressure Dell to contribute, we would also like to grant you SSH access to one of our machines in case you have some time and would like to help out. We will contact you via e-mail with the details regarding IPs, public keys etc. We would also like to provide further information regarding the issue. The following is what we were able to find out during our own research into the topic:
Additionally, we found the following documentation which could at least give some insight into what the device is capable of and what not: Thank you in advance! |
RAID controller support consists of 2 parts:
I can't promise I will have enough time to work on this, but I can take a look over the weekend. My email is samm [at] net-art.cz. However, it is still strongly encouraged to get some samples from the vendor; this would speed up the work a lot and make the implementation less problematic. |
Some results of reading the Linux kernel src for mpi3mr:

The driver, mpi3mr, offers 2 interfaces. A controller interface at /dev/bsg/mpi3mrctl1, and a general SCSI interface (v4, BSG) for sending SCSI commands to the controller.

The controller interface uses ioctls, specifically SG_IO. It fills up a structure that has the format of a request & a pointer to a preallocated response buffer, as well as optional xfer buffers for attaching data to the request (dout) or response (din). The request structure starts with a request type, then 7 bytes of padding (there's a kernel struct for this).

There are two types of commands: DRV and MPT. DRV is id 1, MPT is id 2. DRV fetches driver info and basic stuff like disk count. It has a simple format with an opcode (list of opcodes is in kernel), and it can then take or return data in xfer. MPT then has a variable width description of the buffer types included in the 2 xfer buffers, which describes at which offset as well as what length is a given buffer type in a given xfer direction. An MPI3 command is then built from this MPT description and sent to the controller.

The actual controller <-> os interface uses an asynchronous queue system, with help from DMA for actually transferring the data. The driver implements logic for waiting on the request to complete in the queue. MPI3 has several queues, unsure of the details of each one of them.

The driver seems to offer some mechanism for exposing drives to the host based on The kernel has a struct for identifying a device, mpi3mr_tgt_dev. With two disks plugged in, 4 of those exist. 1 of them might be the controller. Full list of types is available at scsi/scsi_bsg_mpi3mr.h in the kernel includes, prefixed with

Some patches regarding the drive exposure mechanism: https://lore.kernel.org/all/20220804131226.16653-6-sreekanth.reddy@broadcom.com/

We would like to be involved in the process, if possible, could you grant access to the server ^^? |
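The MPT packet layout described in this comment can be cross-checked against the 64-byte request that the test program further down the thread sends. Below is a minimal sketch of a decoder. The byte offsets are assumptions modelled on the structs in scsi/scsi_bsg_mpi3mr.h and should be verified against your kernel headers; `decode_mpt_packet`, `struct mpt_packet` and `struct buf_entry` are hypothetical names used only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Command types as described in the thread: DRV is 1, MPT is 2. */
#define CMD_TYPE_DRV 1
#define CMD_TYPE_MPT 2

/* ASSUMED offsets within the request; check scsi/scsi_bsg_mpi3mr.h. */
#define OFF_CMD_TYPE     0  /* 1 byte, followed by 7 bytes of padding */
#define OFF_MPT_TIMEOUT 10  /* 16-bit little-endian */
#define OFF_NUM_ENTRIES 16  /* 1 byte, followed by 7 reserved bytes */
#define OFF_ENTRIES     24  /* 8 bytes each: type (1), reserved (3), length (4) */
#define MAX_ENTRIES      8  /* arbitrary cap for this sketch */

struct buf_entry {
    uint8_t  type;  /* buffer type identifier */
    uint32_t len;   /* length of this buffer inside din or dout */
};

struct mpt_packet {
    uint16_t timeout;
    uint8_t  num_entries;
    struct buf_entry entries[MAX_ENTRIES];
};

/* Read a 32-bit little-endian value from an unaligned byte pointer. */
static uint32_t le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 0 on success, -1 if the buffer is not a well-formed MPT packet. */
int decode_mpt_packet(const uint8_t *req, size_t len, struct mpt_packet *out)
{
    if (len < OFF_ENTRIES || req[OFF_CMD_TYPE] != CMD_TYPE_MPT)
        return -1;

    uint8_t n = req[OFF_NUM_ENTRIES];
    if (n > MAX_ENTRIES || len < OFF_ENTRIES + (size_t)n * 8)
        return -1;

    out->timeout = (uint16_t)(req[OFF_MPT_TIMEOUT] |
                              (req[OFF_MPT_TIMEOUT + 1] << 8));
    out->num_entries = n;
    for (uint8_t i = 0; i < n; i++) {
        const uint8_t *e = req + OFF_ENTRIES + (size_t)i * 8;
        out->entries[i].type = e[0];
        out->entries[i].len  = le32(e + 4);
    }
    return 0;
}
```

Under these assumptions, the 64-byte request from the test program in this thread decodes to four entries with lengths 60, 256, 512 and 64. The first three sum to 828, which matches that program's din_xfer_len, and the last matches its dout_xfer_len of 64.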
@samm-git access has been granted, information is via e-mail. |
Some of the quick findings:
|
Looks like i found SAT16 command to get identify data from PD and reply in
|
Our findings from yesterday:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <linux/bsg.h>
#include <scsi/sg.h>
#include <inttypes.h>
void print_hex(const char *label, void *buf, size_t len) {
unsigned char *byte_buf = (unsigned char *)buf;
printf("%s: ", label);
for (size_t i = 0; i < len; i++) {
printf("%02x ", byte_buf[i]);
}
printf("\n");
}
int main() {
struct sg_io_v4 io_v4;
memset(&io_v4, 0, sizeof(io_v4));
// Set the guard, protocol, and subprotocol
io_v4.guard = 'Q'; // 'Q' to differentiate from v3
io_v4.protocol = BSG_PROTOCOL_SCSI; // SCSI protocol
io_v4.subprotocol = BSG_SUB_PROTOCOL_SCSI_TRANSPORT; // SCSI transport subprotocol
// Request and response setup
io_v4.request_len = 64;
io_v4.request = (uint64_t)malloc(io_v4.request_len);
uint8_t request_data[64] = {0x02, 0xcc, 0xd3, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xcc, 0xb4, 0x00, 0x00, 0x00, 0x00, 0x00, 0x04, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x05, 0x00, 0x00, 0x00, 0x3c, 0x00, 0x00, 0x00, 0x06, 0x00, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x03, 0x00, 0x00, 0x00, 0x00, 0x02, 0x00, 0x00, 0xfe, 0x00, 0x00, 0x00, 0x40, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00};
memcpy((void*)io_v4.request, request_data, io_v4.request_len);
io_v4.max_response_len = 32;
io_v4.response_len = 32;
io_v4.response = (uint64_t)malloc(io_v4.max_response_len);
memset((void*)io_v4.response, 0, io_v4.max_response_len);
// din setup
io_v4.din_xfer_len = 828;
io_v4.din_xferp = (uint64_t)malloc(io_v4.din_xfer_len);
memset((void*)io_v4.din_xferp, 0, io_v4.din_xfer_len);
// dout setup
io_v4.dout_xfer_len = 64;
io_v4.dout_xferp = (uint64_t)malloc(io_v4.dout_xfer_len);
// ATA command codes: IDENTIFY DEVICE = 0xEC, SMART = 0xB0
// (for SMART, the Features field selects the subcommand; 0xD0 = SMART READ DATA)
uint8_t opcode = 0xb0;
uint8_t page = 0x0;
//uint8_t dout_data[64] = {0x00, 0x00, 0x00, 0x20, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x03, 0x01, 0x00, 0x00, 0x08, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x02, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, opcode, 0x08, 0x0e, 0x00, 0xd0, 0x00, 0x00, 0x00, page, 0x00, 0x4f, 0x00, 0xc2, 0x00, 0xb0, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00};
uint8_t dout_data[64] = {0x00, 0x00, 0x00, 0x20, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x03, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x85, 0x08, 0x0e, 0x00, 0xd0, 0x00, 0x00, 0x00, page, 0x00, 0x4f, 0x00, 0xc2, 0x00, opcode, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00};
memcpy((void*)io_v4.dout_xferp, dout_data, io_v4.dout_xfer_len);
// Other fields
io_v4.timeout = 20000; // Timeout in milliseconds
io_v4.flags = 0; // No flags set
io_v4.dout_iovec_count = 0; // Flat transfer
io_v4.din_iovec_count = 0; // Flat transfer
// Open the /dev/bsg/mpi3mrctl0 device
int fd = open("/dev/bsg/mpi3mrctl0", O_RDWR);
if (fd < 0) {
perror("Failed to open /dev/bsg/mpi3mrctl0");
return 1;
}
// Send SG_IO ioctl
if (ioctl(fd, SG_IO, &io_v4) < 0) {
perror("SG_IO ioctl failed");
close(fd);
return 1;
}
printf("SG_IO ioctl sent successfully\n");
// Print din and response as hex
print_hex("Response", (void*)io_v4.response, io_v4.response_len);
print_hex("SMART", (uint8_t *)(uintptr_t)io_v4.din_xferp + 256 + 60, io_v4.din_xfer_len - 256 - 60);
printf("\n");
free((void*)io_v4.request);
free((void*)io_v4.response);
free((void*)io_v4.din_xferp);
free((void*)io_v4.dout_xferp);
close(fd);
return 0;
} |
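The test program above prints the payload starting 316 bytes (256 + 60) into the din buffer. A minimal sketch of slicing that region out and sanity-checking it follows. It assumes the region lengths 60, 256 and 512 implied by that program's 828-byte din buffer, and it relies on the ATA rule that the 512-byte SMART READ DATA structure sums to zero modulo 256. `din_data_region` and `smart_checksum_ok` are hypothetical helper names, not part of smartmontools or the kernel.

```c
#include <stddef.h>
#include <stdint.h>

/* Region lengths taken from the thread's test program (ASSUMED order:
 * two leading areas of 60 and 256 bytes, then the 512-byte data sector). */
#define LEAD_LEN_A   60u
#define LEAD_LEN_B  256u
#define SECTOR_LEN  512u

/* Return a pointer to the 512-byte data sector inside din,
 * or NULL if the buffer is missing or too short to contain it. */
const uint8_t *din_data_region(const uint8_t *din, size_t din_len)
{
    if (din == NULL || din_len < LEAD_LEN_A + LEAD_LEN_B + SECTOR_LEN)
        return NULL;
    return din + LEAD_LEN_A + LEAD_LEN_B;
}

/* A SMART READ DATA sector stores a checksum in its last byte so that
 * all 512 bytes add up to 0 modulo 256. Returns 1 if that holds. */
int smart_checksum_ok(const uint8_t *sector)
{
    uint8_t sum = 0;
    for (size_t i = 0; i < SECTOR_LEN; i++)
        sum = (uint8_t)(sum + sector[i]);
    return sum == 0;
}
```

A failed checksum on data that otherwise looks plausible is a hint that the offset into din, rather than the drive, is wrong.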
Also, perccli2 consists of a C++ part and a C part. The config file enables the C++ tracer, but there is also a C tracer for the internal storelib8 functions, which we haven't yet figured out how to enable. It should hopefully be unnecessary, though. |
/* MPI3: Function definitions */
#define MPI3_BSG_FUNCTION_MGMT_PASSTHROUGH (0x0a)
#define MPI3_BSG_FUNCTION_SCSI_IO (0x20)
#define MPI3_BSG_FUNCTION_SCSI_TASK_MGMT (0x21)
#define MPI3_BSG_FUNCTION_SMP_PASSTHROUGH (0x22)
#define MPI3_BSG_FUNCTION_NVME_ENCAPSULATED (0x24)
MPT function defs, from
#define MPI3MR_DRVBSG_OPCODE_UNKNOWN 0
#define MPI3MR_DRVBSG_OPCODE_ADPINFO 1
#define MPI3MR_DRVBSG_OPCODE_ADPRESET 2
#define MPI3MR_DRVBSG_OPCODE_ALLTGTDEVINFO 4
#define MPI3MR_DRVBSG_OPCODE_GETCHGCNT 5
#define MPI3MR_DRVBSG_OPCODE_LOGDATAENABLE 6
#define MPI3MR_DRVBSG_OPCODE_PELENABLE 7
#define MPI3MR_DRVBSG_OPCODE_GETLOGDATA 8
#define MPI3MR_DRVBSG_OPCODE_QUERY_HDB 9
#define MPI3MR_DRVBSG_OPCODE_REPOST_HDB 10
#define MPI3MR_DRVBSG_OPCODE_UPLOAD_HDB 11
#define MPI3MR_DRVBSG_OPCODE_REFRESH_HDB_TRIGGERS 12
DRV opcodes (at offset 8, check struct
0x01 in REQUEST is DRV, 0x02 is MPT (it's the
Entry point into the handler is at |
A decoding for an The handle is what the MPT MPI3_BSG_FUNCTION_SCSI_IO command takes to identify the disk to send to. |
We (us & @bluedinoo) will write a patch this week, if nothing unexpected occurs. We have a pretty good idea of what we need to do, we just have to write it :) A test utility we wrote already passes ATA commands through SCSI ATA PASS-THROUGH successfully and returns data that looks good, so it's just a matter of hooking it up to smartmontools. |
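For reference, the 16 bytes starting at offset 32 of dout_data in the test program earlier in this thread have the shape of a SCSI ATA PASS-THROUGH (16) CDB. A minimal sketch of building such a CDB for a 28-bit PIO data-in command follows. The function name `build_ata_pt16_pio_in` and its parameter order are illustrative assumptions, not smartmontools API; the field positions follow the SAT definition of ATA PASS-THROUGH (16).

```c
#include <stdint.h>
#include <string.h>

/* Build an ATA PASS-THROUGH (16) CDB for a 28-bit PIO data-in command.
 * Only the low byte of each ATA register is filled; the "previous"
 * (48-bit) register bytes stay zero. */
void build_ata_pt16_pio_in(uint8_t cdb[16], uint8_t features, uint8_t count,
                           uint8_t lba_low, uint8_t lba_mid, uint8_t lba_high,
                           uint8_t command)
{
    memset(cdb, 0, 16);
    cdb[0]  = 0x85;               /* operation code: ATA PASS-THROUGH (16) */
    cdb[1]  = 4 << 1;             /* protocol 4 = PIO data-in, EXTEND = 0 */
    cdb[2]  = 0x08 | 0x04 | 0x02; /* T_DIR = from device, BYT_BLOK = blocks,
                                     T_LENGTH = 2 (length in the count field) */
    cdb[4]  = features;           /* Features (7:0) */
    cdb[6]  = count;              /* Sector Count (7:0) */
    cdb[8]  = lba_low;            /* LBA (7:0) */
    cdb[10] = lba_mid;            /* LBA (15:8) */
    cdb[12] = lba_high;           /* LBA (23:16) */
    cdb[14] = command;            /* ATA command register */
}
```

Calling it with features 0xD0, count 0, LBA bytes 0x00, 0x4F, 0xC2 and command 0xB0 reproduces the 16 bytes used in the thread's test program. The 0x4F/0xC2 pair is the signature that ATA SMART commands require in the LBA mid and high registers.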
Cool. I also wrote some test code to get identify working, but my time is very limited. The @tranquillity-codes patch should implement only the SCSI device; SAT should be handled by smartmontools. See other RAID implementations, e.g. megaraid, for the details. There is another topic, NVMe support, which will probably require more effort, but we can do it later. |
When we have the SCSI response data, how do we return it back to smartmontools from the scsi_pass_through of our linux_mpi3mr_device ? We tried assigning |
currently looking into the existing sssraid code as reference as BSG is also in use for that controller type with @tranquillity-codes. |
@samm-git Is it possible to arrange a DM session? So far, we have almost fully finished the implementation in smartmontools; however, it seems that we still have an issue with smartmontools parsing our response (din_xfer) data. I am happy to share the din data we receive with you, along with the smartmontools patch by @tranquillity-codes and me, if needed. Kind Regards
feel free to open PR with a patch or just put it to the comment. I will try to take a look too |
PR opened |
@samm-git Asking for a short update while also sharing what I found out further:
|
Dell officially responded now, they will not contribute and support this project/issue. |
Issue:
smartctl --scan
does not show physical drives for RAID controller PERC H965i Front with driver mpi3mr.
smartctl versions: 7.2 and 7.4
Using smartctl 7.2 for all the information below.
Dell documentation for the raid controller:
https://www.dell.com/support/manuals/en-us/perc-h965i-mx/perc12/features-of-perc-h965i-front?guid=guid-7503c350-4757-4afa-9d3e-7649dabcb860&lang=en-us
OS information:
Result for PERC H965i Front, driver mpi3mr:
Summary of configuration: 2 physical drives (SSDs) -- 1 configured as RAID0, 1 configured as Unconfigured Good.
Expected behaviour would be like that of the PERC H730P Adapter, driver megaraid_sas:
Summary of configuration: 5 physical drives (HDDs), 1 PCIe NVMe -- 1 configured as RAID0, 4 configured as Unconfigured Good.
Conclusion: Physical drives are not shown for raid controller PERC H965i Front with driver mpi3mr.
Reason: Unknown.
Expected outcome: We would like to achieve the same behaviour where physical drives are visible and can be used to get smartctl data. (If possible that would also include NVMe's behind the raid controller).
From our Dell contact we got the following information:
If any other information is needed please let me know.
Information dump:
XXX = redacted