mpi3mr-based raid controllers are not supported (e.g., PERC H965i) · Issue #313 · smartmontools/smartmontools · GitHub

mpi3mr-based raid controllers are not supported (e.g., PERC H965i) #313

Open
KaizenTamasVeres opened this issue Jan 7, 2025 · 22 comments

Comments

@KaizenTamasVeres
KaizenTamasVeres commented Jan 7, 2025

Issue: smartctl --scan does not show physical drives for raid controller PERC H965i Front with driver mpi3mr
smartctl versions: 7.2 and 7.4
Using smartctl 7.2 for all the information below.

Dell documentation for the raid controller:
https://www.dell.com/support/manuals/en-us/perc-h965i-mx/perc12/features-of-perc-h965i-front?guid=guid-7503c350-4757-4afa-9d3e-7649dabcb860&lang=en-us

OS information:

[root@mpi3mr ~]# cat /etc/os-release
NAME="Rocky Linux"
VERSION="9.5 (Blue Onyx)"
ID="rocky"
ID_LIKE="rhel centos fedora"
VERSION_ID="9.5"
PLATFORM_ID="platform:el9"
PRETTY_NAME="Rocky Linux 9.5 (Blue Onyx)"
ANSI_COLOR="0;32"
LOGO="fedora-logo-icon"
CPE_NAME="cpe:/o:rocky:rocky:9::baseos"
HOME_URL="https://rockylinux.org/"
VENDOR_NAME="RESF"
VENDOR_URL="https://resf.org/"
BUG_REPORT_URL="https://bugs.rockylinux.org/"
SUPPORT_END="2032-05-31"
ROCKY_SUPPORT_PRODUCT="Rocky-Linux-9"
ROCKY_SUPPORT_PRODUCT_VERSION="9.5"
REDHAT_SUPPORT_PRODUCT="Rocky Linux"
REDHAT_SUPPORT_PRODUCT_VERSION="9.5"
[root@mpi3mr ~]# lsmod | grep mpt
mpt3sas               577536  0
raid_class             16384  1 mpt3sas
scsi_transport_sas     61440  2 mpi3mr,mpt3sas
[root@mpi3mr ~]# lsmod | grep mpi
mpi3mr                315392  0
scsi_transport_sas     61440  2 mpi3mr,mpt3sas

Result for PERC H965i Front, driver mpi3mr:
Summary of configuration: 2 physical drives (SSDs) -- 1 configured as RAID0, 1 configured as Unconfigured Good.

[root@mpi3mr ~]# smartctl --scan
/dev/sda -d scsi # /dev/sda, SCSI device

Expected behaviour, as seen with PERC H730P Adapter, driver megaraid_sas:
Summary of configuration: 5 physical drives (HDDs), 1 PCIe NVMe -- 1 configured as RAID0, 4 configured as Unconfigured Good.

[root@megaraid_sas~]# smartctl --scan
/dev/sda -d scsi # /dev/sda, SCSI device
/dev/bus/0 -d megaraid,0 # /dev/bus/0 [megaraid_disk_00], SCSI device
/dev/bus/0 -d megaraid,1 # /dev/bus/0 [megaraid_disk_01], SCSI device
/dev/bus/0 -d megaraid,2 # /dev/bus/0 [megaraid_disk_02], SCSI device
/dev/bus/0 -d megaraid,3 # /dev/bus/0 [megaraid_disk_03], SCSI device
/dev/bus/0 -d megaraid,4 # /dev/bus/0 [megaraid_disk_04], SCSI device
/dev/nvme0 -d nvme # /dev/nvme0, NVMe device

Conclusion: Physical drives are not shown for raid controller PERC H965i Front with driver mpi3mr.
Reason: Unknown.
Expected outcome: We would like to achieve the same behaviour, where physical drives are visible and can be used to get smartctl data. (If possible, that would also include NVMe drives behind the raid controller.)

From our Dell contact we got the following information:

PERC 12 H965 does not block SmartCTL, but the tool has to use passthrough commands to get to the devices behind the perc.

If any other information is needed please let me know.

Information dump:

XXX = redacted

[root@mpi3mr ~]# perccli2 /c0 show all
CLI Version = 008.0008.0000.0012 Mar 16, 2024
Operating system = Linux5.14.0-503.16.1.el9_5.x86_64
Controller = 0
Status = Success
Description = None


Basics :
======
Product Name = PERC H965i Front
Board Name = XXX
Board Assembly = XXX
Board Tracer Number = XXX
Board Revision = A05
Chip Name = SAS4116W
Chip Revision = ALL
Board Mfg Date(yyyy/mm/dd) = 2024/01/24
Board Rework Date(yyyy/mm/dd) = 2024/01/24
Custom Serial Number = Unavailable
SAS Address = 0x5f4ee0806d0a9900
Serial Number = XXX
Controller Time(LocalTime yyyy/mm/dd hh:mm:sec) = 2025/01/07 12:12:53
System Time(LocalTime yyyy/mm/dd hh:mm:sec) = 2025/01/07 12:12:53
OEM = Dell
SubOEMID = 0
PCI Slot Number = Embedded


Version :
=======
Package Version = 8.8.0.0.18-26
Firmware Version = 8.8.0.225-00018-00011
Firmware Security Version Number = 00.00.00.00
FMC Version = 8.8.0.225-00018-00011
FMC Security Version Number = 00.00.00.00
BSP Version = 8.8.0.225-00018-00011
BSP Security Version Number = 00.00.00.00
BIOS Version = 0x08080501
BIOS Security Version Number = 00.00.00.00
HIIM Version = 08.08.08.03
HIIM Security Version Number = 00.00.00.00
HIIA Version = 08.08.08.03
HIIA Security Version Number = 00.00.00.00
OEM Version = BBU:12.02_CPLD:00.0D
OEM Security Version Number = 00.00.00.00
PSOC Hardware Version = 0.00
PSOC Firmware Version = 0.00
PSOC Part Number = Unavailable
NVDATA Version = 08.0E.00.0A
Driver Name = mpi3mr
Driver Version = 8.8.1.0.50
SL8 Library Version = 08.0807.0500


HostInterface :
=============
Max PCIe Link Rate = 0x08 (16GT/s)
Max PCIe Port Width = 16
PCI Address = 00:52:00:0
PCIe Link Width = X16 Lane(s)


DeviceInterface :
===============
SAS/SATA = SAS/SATA-6G, SAS-12G, SAS-22.5G
PCIe = PCIE-2.5GT, PCIE-5GT, PCIE-8GT, PCIE-16GT
PCI Vendor ID = 0x1000
PCI Device ID = 0x00A5
PCI Subsystem Vendor ID = 0x1028
PCI Subsystem ID = 0x2115


Status :
======
Controller Status = Optimal
VD Count = 1
VD Degraded Count = 0
VD Offline Count = 0
PD Count = 3
PD Drive Count = 2
Critical Predictive Failure PD Drive Count = 0
Failed PD Drive Count = 0
Memory Correctable Errors = 0
Memory Uncorrectable Errors = 0
ECC Bucket Count = 0
Preserved Cache Present = No
System Reboot Required = No
System Shutdown Required = No
Controller Reset Required = No
Security Key Assigned = No
Security Type = None
Failed to get Security key on bootup = No
Bios was not detected during boot = No
Boot Time Secret Key Provider = None
EKM Key Provider = OOB
Controller must be rebooted to complete security operation = No
Controller has booted into safe mode = No
Controller has trial features enabled = No
Personality Change Pending = No
Security Key Rekey Pending = No
Reconfigure types = NVMe reconfigure is not supported
Controller Personality = RAID


Supported Controller operations :
===============================
Support Enclosure Affinity = Yes
Support Foreign Config Import = Yes
Support Foreign Config Clear = Yes
Support Self Diagnostic Check = Yes
Support Diag Retention Test = Yes
Support Cache Offload Info = Yes
FW and Event Time in GMT = Yes
Support Abort CC on Error = Yes
Support Multipath = Yes
Support Security = Yes
Support Security Suggest = Yes
Support Portable Security Key = No
Support Security Key Per PD = No
Support LKM = Yes, Supported using both in-band and out-of-band command
Support EKM = Yes, Supported only using out-of-band command
Support LKM to EKM = Yes
Support EKM to LKM = No
Support PFK = No
Support Electronic PFK = No
Support Ldbbm = Yes
Support Shield State = Yes
Support SSD Ld Disk Cache Change = Yes
Support Operation Suspend Resume = Yes
Support Emergency Spares = No
Support Device Mgmt PEL = Yes
Support CC Schedule = Yes
Support PatrolRead = Yes
Support SSD PatrolRead = Yes
Support Resize Array = Yes
Support Drive Performance Monitoring = Yes
Support Emulated Drives = Yes
Support Limited Dedicated HotSpares = Yes
Support Factory Defaults set = Yes
Support Personality = Yes
Support NVCache Info Get = Yes
Support NVCache Erase = Yes
Support Cache Offload Encryption = No
Support CacheBypass Modes = Yes
Support Purge Cache During VD Delete = Yes
Support BoardLogic Update = No
Support Ibuttonless = Yes
Support Host Info = Yes
Support Cache-Vault Health Info = Yes
Support PCIe Devices = Yes
Support Extended MPB VPD pages = Yes
Support Snapdump = Yes
Support Energy Pack = Yes
Support Energy Pack VPD = Yes
Support Multiple Security keys(One Per PD) = No
Support Platform Security = Yes
Support SESCtrl In Multipath Config = No
Support R6 Individual PD Suspend/Resume = No
Support SAS Config = Yes
Support limiting of SAS Link speed = Yes, Supports only max link speed limit control of a SAS Phy
Support limiting of SAS Link speed per Phy = Yes
Support PCIe Config = Yes
Support limiting of PCIe Link Speed = Yes
Support limiting of PCIe Link Speed per Link = Yes
Support PCIe Config Lane Mapping = Yes
Support Immediate Auto-configure = Yes
Support NVMe Recover = Yes
Support NVMe Reconfigure = No
Support PCIe Clock mode = Yes
Support Enhanced Scheduling = Yes
Support Parallel Drive FW Download = Yes
Support Replicate IO = Yes
Support Drive Makers Authority = Yes
Support Drive FW download management = Yes


External Key Management :
=======================
Capability = Supported
Boot Agent Status = Available
Configured = No


Supported PD operations :
=======================
Support Force 
Support Make Offline = Yes
Support Make Failed = No
Support SpinDown Unconfigured = Yes
Support SpinDown HotSpare = Yes
Support SpinDown Configured = No
Support T10 Power State = Yes
Support Temperature Monitoring = Yes
Support WCE = Yes
Support Degraded Media Detection = Yes
Support PD Secure Erase = Yes
Support SSD Wear Gauge = No
Support Mark Missing = No
Support Replace Missing = No
Support Prepare For Removal = No
Support Drive Power State Change = Yes, Supports only Spin-Up


Supported VD operations :
=======================
Support Changing Read Policy = Yes
Support Changing Write Policy = Yes
Support Changing Disk Cache Policy = Yes
Support Online Capacity Expansion = Yes, with VD Expansion and addition of Drives only
Support LDBBM = Yes
Support Secure Erase = No
Support Unmap for single drive RAID 0 VDs = No
Support Unmap for RAID 0 VDs = No
Support Unmap for RAID 1/10 VDs = No
Support Unmap for RAID 5/50/6/60 VDs = No
Support Writesame Unmap for single drive RAID 0 VDs = No
Support Writesame Unmap for RAID 0 VDs = No
Support Writesame Unmap for RAID 1/10 VDs = No
Support Writesame Unmap for RAID 5/50/6/60 VDs = No
Support T10 Atomicity = No


HwCfg :
=====
NVRAM = Present
NVRAM Size(KiB) = 128
DDR Memory Size(MiB) = 8192
Flash = Present
NOR Flash Size(MiB) = 32
CacheVault Flash Size(MiB) = 24192
OCM Memory = Present
OCM Memory Size(MiB) = 15
Current Size of FW Cache(MiB) = 7973
DDR Width = 72-bit
Energy Pack = Present
Upgrade Key = Absent
Upgrade Key Slot = Absent
On Board Expander = Absent
Temperature Sensor for Chip = Present
Temperature Sensor for Board = Absent
Upgradable Board Logic = Absent
PCI Switch = Absent
Serial Debugger = Present
Chip temperature(C) = 48
Controller Fan = Absent


Max. Supported Config :
=====================
Max Number of VDs = 240
Max Number of Physical Drives = 240
Max SAS/SATA Drives = 240
Max NVMe Drives = 16
Max Arrays = 240
Max PD Per Array = 32
Max Spans Per VD = 8
Max Dedicated HSPs = 64
Max Global HSPs = 64
Max VDs Per Array = 16
Max Phys = 16
Max NonRAIDs = 240
Max RAID Configurable PDs = 240
Max Complex RAID VDs = 64
Max Data Transfer Size(Bytes) = 1048576
Max Parallel Commands = 8192
Max NS/LU per NonRAID = 1
Max NS/LU per RAID PDs = 1
Max Supported LUNs for SAS PDs = 4
Max Namespaces = 1
Max Persistent Id = 1024
Max VD's per Array in configuration = 16
Max Key Id length = 256
Max Security Key length = 32
Max Parallel FW Download Drives = 32
Max Replicate IO Drives = 32
Max Replicate IO Data Transfer Size(KiB) = 1020


Properties :
==========
Patrol Read Rate(%) = 30
BGI Rate(%) = 30
Consistency Check Rate(%) = 30
OCE Rate(%) = 30
Drive Coercion Mode = 1 (128 MiB)
Auto Rebuild = Yes
Energy Pack Warning = Yes
Data Loss Warning = Yes
Ecc Bucket Size(Entries) = 15
ECC Bucket Leak Rate(minute(s)) = 1440
Expose Enclosure Devices = No
Maintain Drive Fail History = No
Maintain Drive Fail History - NonRAID = No
All Online Controller Reset = Yes
Auto Online Controller Reset = Yes
Abort CC on Error = No
Replace Drive = Yes
HDD SMARTer Enabled = Yes
SSD SMARTer Enabled = Yes
PR Correct UnConfigured Areas = Yes
Support SSD Patrol Read = Yes
SpinDown Unconfigured Drive = Yes
Boot With Preserved Cache = Yes
SpinDown HotSpare = No
Fail On SMARTer = No
LED management for NonRAID = Yes
SES VPD Association = TargetPort
Base Enclosure Level = Enclosure level enumeration start with Zero
SMART/Temperature PollInterval-For External PDs only(second(s)) = 300
SMART/Temperature PollInterval-For Internal PDs only(second(s)) = 10
SpinDown Time(minute(s)) = 30
Spinup Drive Count = 4
Spinup Delay(second(s)) = 12
Spinup Encl Drive Count = 4
Spinup Enclosure Delay(second(s)) = 12
Drive Detection Type = Disabled
Drive Corrective Action = Only Log Events
Drive Error Threshold = Every 8 Hours
Boot Mode = COE
Name =
SmartPoll - RAID PDs = Yes
SmartPoll - NonRAID PDs = Yes
PD Temperature Poll = Yes
Security on NonRAID = Yes
Host Managed Security on NonRAID = No
ATA Security Commands on NonRAID = No
Device Reporting Order = Logical drives are reported prior to JBOD devices.
First Reporting Device Persistent Id = None
Rebuild Operating Mode Priority = Rebuild


Capabilities :
============
Supported Drive Interfaces = SAS, SATA, NVMe
Supported RAID Levels = RAID0, RAID1, RAID5, RAID6, RAID10, RAID50, RAID60
Mix of SAS-HDD/SATA-HDD in VD = Not Allowed
Mix of SAS-SSD/SATA-SSD in VD = Not Allowed
Mix of SSD/HDD in VD = Not Allowed
Mix of SED type(Enterprise,OPAL,and RUBY) for Security Enabled Arrays = Not Allowed
Mix of different NS/LU count in a VD = Not Allowed
Mix of NVMe SGL and PRP in a VD = Allowed


NVCache Information :
===================
NVCache Flash Capacity(MiB) = 24192
Bad Block Capacity(MiB) = 240
Allowable bad blocks consumed(%) = 19


Scheduled Tasks :
===============
Patrol Read Execution Frequency(hours) = 168
Next Patrol Read Start time(LocalTime yyyy/mm/dd hh:mm:sec) = 2025/01/13 10:00:00


Secure Boot Details :
===================
Secure Boot State = Enabled
Secure Boot Mode = Hard Secure
Total Number Of Key Slots = 8
Number Of Key Slots Used = 2
Remaining Key Slots = 6
Current Key Encryption Algorithm = RSA2048
Key Hash Size = SHA256
Key Hash Version = SHA3
Key Hash Algorithm = SHA3-256
Security Version Number = 00.00.00.00


Security Protocol Details :
=========================
Security Protocol = SPDM-1.1.0,1.0.0

Drive Groups = 1

TOPOLOGY :
========

-----------------------------------------------------------------------------
DG Span Row EID:Slot PID Type  State Status BT      Size PDC  Secured FSpace
-----------------------------------------------------------------------------
 0 -    -   -        -   RAID0 -     -      N  223.0 GiB dflt N       N
 0 0    -   -        -   RAID0 -     -      N  223.0 GiB dflt N       N
 0 0    0   292:0    275 DRIVE Conf  Online N  223.0 GiB dflt N       -
-----------------------------------------------------------------------------

DG-Drive Group Index|Span-Span Index|Row-Row Index|EID-Enclosure Persistent ID
PID-Persistent ID|Slot-Slot Number|Type-Drive Type|Onln-Online|Rbld-Rebuild|Dgrd-Degraded
Pdgd-Partially degraded|Offln-Offline|BT-Background Task Active
PDC-Drive Write Cache Policy|Frgn-Foreign|Optl-Optimal|FSpace-Free Space Present
dflt-Default|Msng-Missing

Virtual Drives = 1

VD LIST :
=======

------------------------------------------------------------------
DG/VD TYPE  State Access CurrentCache DefaultCache      Size Name
------------------------------------------------------------------
0/2   RAID0 Optl  RW     NR,WB        NR,WB        223.0 GiB
------------------------------------------------------------------

Rec=Recovery|OfLn=OffLine|Pdgd=Partially Degraded|Dgrd=Degraded
Optl=Optimal|RO=Read Only|RW=Read Write|CurrentCache-Curent Cache Status
R=Read Ahead Always|NR=No Read Ahead|WB=WriteBack|
AWB=Always WriteBack|WT=WriteThrough|Access-Access Policy

Physical Drives = 2

PD LIST :
=======

---------------------------------------------------------------------------------------------------------------
EID:Slt PID State Status DG      Size Intf Med SED_Type SeSz Model                      Sp LU/NS Count Alt-EID
---------------------------------------------------------------------------------------------------------------
292:0   275 Conf  Online  0 223.0 GiB SATA SSD -        512B SAMSUNG MZ7KM240HMHQ-00005 U            1 -
292:1   276 UConf Good    - 223.0 GiB SATA SSD -        512B SAMSUNG MZ7KM240HMHQ-00005 U            1 -
---------------------------------------------------------------------------------------------------------------


LU/NS LIST :
==========

------------------------------------
PID LUN/NSID Index Status      Size
------------------------------------
275 0/-          0 Online 223.0 GiB
276 0/-          0 Good   223.0 GiB
------------------------------------

EID-Enclosure Persistent ID|Slt-Slot Number|PID-Persistent ID|DG-DriveGroup
UConf-Unconfigured|UConfUnsp-Unconfigured Unsupported|Conf-Configured|Unusbl-Unusable
GHS-Global Hot Spare|DHS-Dedicated Hot Spare|UConfShld-Unconfigured Shielded|
ConfShld-Configured Shielded|NonRAIDShld-NonRAID Shielded|GHSShld-GHS Shielded|DHSShld-DHS Shielded
UConfSntz-Unconfigured Sanitize|ConfSntz-Configured Sanitize|NonRAIDSntz-NonRAID Sanitize|GHSSntz-GHS Sanitize
DHSSntz-DHS Sanitize|UConfDgrd-Unconfigured Degraded|ConfDgrd-Configured Degraded|NonRAIDDgrd-NonRAID Degraded
GHSDgrd-GHS Degraded|DHSDgrd-DHS Degraded|Various-Multiple LU/NS Status|Med-Media|SED-Self Encryptive Drive
SeSz-Logical Sector Size|Intf-Interface|Sp-Power state|U-Up/On|D-Down/PowerSave|T-Transitioning|F-Foreign
NS-Namespace|LU-Logical Unit|LUN-Logical Unit Number|NSID-Namespace ID|Alt-EID-Alternate Enclosure Persistent ID

Enclosures = 1

Enclosure List :
==============

-------------------------------------------------------------------------------------------------
EID State DeviceType        Slots PD Partner-EID Multipath PS Fans TSs Alms SIM ProdID
-------------------------------------------------------------------------------------------------
292 OK    Logical Enclosure    16  2 -           No         0    0   0    0   0 BP_PSV
-------------------------------------------------------------------------------------------------

EID-Enclosure Persistent ID |SID-Slot ID |PID-Physical drive Persistent ID |PD-Physical drive count
PS-Power Supply count |TSs-Temperature sensor count |Alms-Alarm count |SIM-SIM Count |ProdID-Product ID
ConnId-ConnectorID


Energy Pack Info :
================

---------------------------------------------------
Type    SubType Voltage(mV) Temperature(C) Status
---------------------------------------------------
Battery None           3889             27 Optimal
---------------------------------------------------
@samm-git
Contributor

It's not a bug; it just means this driver/controller is not supported. If you would like to have it supported, there are two options:

  1. Develop and send patches to us or work with the vendor to provide patches
  2. I can look at the hardware, too, but I will need SSH or bmc access, and I can't promise I will have time in the following days. Also, nothing important must be stored on the controller in that case as it may hang at times.

@samm-git
Contributor

P.S. I see that mpi3mr is a completely new driver, written from scratch. Do you know if the vendor tool could at least show some basic data, like a SMART identify packet? If not, it would be very hard to guess the interface.

@samm-git samm-git changed the title Physical drives are not shown for driver mpi3mr, raid controller PERC H965i Front. mpi3mr-based raid controllers are not supported (e.g., PERC H965i) Jan 14, 2025
@KaizenTamasVeres
Author

Thank you for your input / effort!
Currently we are trying to get the vendor (Dell) involved in this.

When there is any new information, we will post an update here.

@bluedinoo

While we are trying to pressure Dell to contribute, we would also like to grant you SSH access to one of our machines in case you have some time and would like to help out. We will contact you via e-mail with the details (IPs, public keys, etc.).

We would like to provide further information regarding the issue.

This is what we were able to find out during our own research into the topic:

  • the device seems to accept SCSI commands, specifically via BSG (SCSI io v4).
  • the SCSI requests seem to differ from the standard in the following ways:
    There seem to be two different request types: MPI requests and MPT requests. For the former we were able to find out the specifics (we are happy to provide you with what we have documented in that regard so far), but the MPI requests seem to be only for controller management.
    For the latter, MPT, we have so far been unable to figure out how it could be used for the type of requests needed to retrieve SMART information.

Additionally, we found the following documentation which could at least give some insight into what the device is capable of and what not:
https://techdocs.broadcom.com/content/dam/broadcom/techdocs/data-center-solutions/tools/generated-pdfs/MR8_Utility_UG.pdf

Thank you in advance!

@samm-git
Contributor
samm-git commented Jan 21, 2025

RAID controller support consists of 2 parts:

  1. Encapsulation to send SCSI/NVMe/SATA commands to the underlying drives. It looks like the driver uses the message passing interface, which can be reached using ioctl. We will also need to write different functions for the nvme/scsi/ata parts. If the vendor could at least provide some basic examples, that would speed things up a lot.
  2. PD enumeration, used for autoscan, optional. This part is probably easier, as it could be guessed from the vendor tools and some tracing.

I can't promise I will have enough time to work on this, but I can glance at it over the weekend. My email is samm [at] net-art.cz. However, it is still strongly encouraged to get some samples from the vendor; this would speed up the work a lot and make the implementation less problematic.

@tranquillity-codes

Some results of reading the Linux kernel src for mpi3mr:

The driver, mpi3mr, offers 2 interfaces.

A controller interface at /dev/bsg/mpi3mrctl1, and a general SCSI interface (v4, BSG) for sending SCSI commands to the controller.

The controller interface uses ioctls, specifically SG_IO. It fills up a structure that has the format of a request & a pointer to a preallocated response buffer, as well as optional xfer buffers for attaching data to the request (dout) or response (din).

The request structure starts with a request type, then 7 bytes of padding (there's a kernel struct for this). There are two types of commands: DRV and MPT. DRV is id 1, MPT is id 2. DRV fetches driver info and basic stuff like disk count. It has a simple format with an opcode (list of opcodes is in kernel), and it can then take or return data in xfer.

MPT then has a variable width description of the buffer types included in the 2 xfer buffers, which describes at which offset as well as what length is a given buffer type in a given xfer direction.
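
As a rough illustration of the layout described above, here is a minimal sketch that serializes an MPT request header plus its buffer entries into a byte array. The byte offsets are inferred from the captured request that appears later in this thread, not taken from the kernel header, and every `sketch_`-prefixed name is hypothetical. Treat it as a way to reason about the format rather than as the driver's actual definition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Command type values as described in the thread (DRV = 1, MPT = 2). */
enum { SKETCH_CMD_DRV = 1, SKETCH_CMD_MPT = 2 };

/* One buffer descriptor: a type byte and a 32-bit length (hypothetical name). */
struct sketch_buf_entry {
    uint8_t  type;
    uint32_t len;
};

static void put_le16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v & 0xff);
    p[1] = (uint8_t)(v >> 8);
}

static void put_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v & 0xff);
    p[1] = (uint8_t)((v >> 8) & 0xff);
    p[2] = (uint8_t)((v >> 16) & 0xff);
    p[3] = (uint8_t)(v >> 24);
}

/*
 * Fill `out` with an MPT request header followed by `n` buffer entries.
 * Returns the number of bytes used, or 0 if `cap` is too small.
 * Offsets are an assumption based on the capture in this thread.
 */
size_t sketch_build_mpt_request(uint8_t *out, size_t cap, uint8_t ctrl_id,
                                uint16_t timeout,
                                const struct sketch_buf_entry *entries,
                                uint8_t n)
{
    size_t need = 24 + 8 * (size_t)n;

    assert(out != NULL);
    assert(entries != NULL || n == 0);
    if (cap < need)
        return 0;

    memset(out, 0, cap);
    out[0] = SKETCH_CMD_MPT;      /* offset 0: command type, then 7 reserved bytes */
    out[8] = ctrl_id;             /* offset 8: controller id, then 1 reserved byte */
    put_le16(out + 10, timeout);  /* offset 10: timeout, then 4 reserved bytes */
    out[16] = n;                  /* offset 16: entry count, then 7 reserved bytes */
    for (uint8_t i = 0; i < n; i++) {
        uint8_t *e = out + 24 + 8 * (size_t)i;
        e[0] = entries[i].type;          /* type byte, then 3 reserved bytes */
        put_le32(e + 4, entries[i].len); /* little-endian length */
    }
    return need;
}
```

Unlike the capture, which carries stray values such as 0xcc in the reserved bytes, this sketch zeroes them; whether the driver cares about those bytes is not established here.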

An MPI3 command is then built from this MPT description and sent to the controller. The actual controller <-> os interface uses an asynchronous queue system, with help from DMA for actually transferring the data. The driver implements logic for waiting on the request to complete in the queue.

MPI3 has several queues, unsure of the details of each one of them.

The driver seems to offer some mechanism for exposing drives to the host based on DEVICE0_FLAGS (some kind of "device page"). Unsure if this is a dead end or not.

The kernel has a struct for identifying a device, mpi3mr_tgt_dev. With two disks plugged in, 4 of those exist. 1 of them might be the controller.

Full list of types is available at scsi/scsi_bsg_mpi3mr.h in the kernel includes, prefixed with MPI3MR_BSG_BUFTYPE
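
To make the buffer-type bookkeeping more concrete, the sketch below computes where each buffer would sit inside the din or dout transfer. It assumes the buffers are packed back to back in the order the entries are listed, which is consistent with the capture in this thread (60 + 256 + 512 = 828 bytes of din, 64 bytes of dout) but has not been confirmed against the driver. The type values are the ones visible in that capture; all `sketch_` names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Buffer type values as they appear in the captured request in this thread. */
enum sketch_buftype {
    SKETCH_BUF_DATA_IN      = 0x03,
    SKETCH_BUF_MPI_REPLY    = 0x05,
    SKETCH_BUF_ERR_RESPONSE = 0x06,
    SKETCH_BUF_MPI_REQUEST  = 0xfe
};

struct sketch_buf_entry {
    uint8_t  type;
    uint32_t len;
};

/* Returns 1 if the buffer travels device-to-host (din), 0 otherwise.
 * Types not listed here are treated as dout in this sketch. */
static int sketch_is_din(uint8_t type)
{
    return type == SKETCH_BUF_DATA_IN ||
           type == SKETCH_BUF_MPI_REPLY ||
           type == SKETCH_BUF_ERR_RESPONSE;
}

/* Total bytes needed for one direction (din_wanted = 1 for din, 0 for dout). */
size_t sketch_xfer_len(const struct sketch_buf_entry *e, size_t n, int din_wanted)
{
    size_t total = 0;

    assert(e != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (sketch_is_din(e[i].type) == (din_wanted != 0))
            total += e[i].len;
    return total;
}

/*
 * Offset of the first entry of type `want` inside its own transfer buffer,
 * assuming back-to-back packing in entry order.
 * Returns 0 and sets *offset on success, -1 if no such entry exists.
 */
int sketch_find_offset(const struct sketch_buf_entry *e, size_t n,
                       uint8_t want, size_t *offset)
{
    size_t din = 0, dout = 0;

    assert(offset != NULL);
    assert(e != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        size_t *cursor = sketch_is_din(e[i].type) ? &din : &dout;
        if (e[i].type == want) {
            *offset = *cursor;
            return 0;
        }
        *cursor += e[i].len;
    }
    return -1;
}
```

Under that packing assumption, the DATA_IN payload of the captured request would start 316 bytes into din, after the 60-byte reply and the 256-byte error response.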

Some patches regarding the drive exposure mechanism:

https://lore.kernel.org/all/20220804131226.16653-6-sreekanth.reddy@broadcom.com/
https://lore.kernel.org/all/20210520152545.2710479-7-kashyap.desai@broadcom.com/

We would like to be involved in the process. If possible, could you grant us access to the server?

@KaizenTamasVeres
Author

@samm-git access has been granted, information is via e-mail.
@tranquillity-codes access has been granted, information is via e-mail.

@samm-git
Contributor
samm-git commented Jan 22, 2025

Some of the quick findings:

  1. The perccli2 tool supports an ini file (perccli2conf.ini), and DEBUGLEVEL=4 will output some debug info in the storcli2.log file.
  2. Once started, it checks for the existence of /dev/bsg/mpi3mrctl0 - /dev/bsg/mpi3mrctl1024 to find available cards.
  3. If one is found, it starts to issue SG_IO ioctls with the BSG_PROTOCOL_SCSI/BSG_SUB_PROTOCOL_SCSI_TRANSPORT protocol set. Request starts with host_tag defined at https://github.com/torvalds/linux/blob/master/drivers/scsi/mpi3mr/mpi3mr.h#L95. Most of the requests start with \x01\x00\x00, some with \x02\x00.
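
The probing step in the list above could be approximated as follows. This is only a sketch of the idea: it checks whether the node paths exist and does not open them or issue any ioctl. The upper bound of 1024 comes from the observed perccli2 behaviour; the function names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Upper bound taken from the perccli2 behaviour described above. */
#define SKETCH_MAX_CTL 1024

/* Format the control node path for index `idx` into `buf`.
 * Returns 0 on success, -1 if the path does not fit. */
int sketch_ctl_path(char *buf, size_t cap, unsigned idx)
{
    int n;

    assert(buf != NULL);
    n = snprintf(buf, cap, "/dev/bsg/mpi3mrctl%u", idx);
    return (n < 0 || (size_t)n >= cap) ? -1 : 0;
}

/* Count control nodes that exist on this system.
 * If `first` is non-NULL and at least one node exists, store the lowest index. */
int sketch_count_controllers(int *first)
{
    char path[64];
    int found = 0;

    for (unsigned i = 0; i <= SKETCH_MAX_CTL; i++) {
        if (sketch_ctl_path(path, sizeof path, i) != 0)
            continue;
        if (access(path, F_OK) == 0) {   /* existence check only */
            if (found == 0 && first != NULL)
                *first = (int)i;
            found++;
        }
    }
    return found;
}
```

On a machine without an mpi3mr controller, `sketch_count_controllers(NULL)` simply returns 0.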

@samm-git
Contributor

Looks like I found the SAT16 command that gets identify data from a PD, with the reply in din_xferp:

ioctl(4, SG_IO, {guard='Q', protocol=BSG_PROTOCOL_SCSI, subprotocol=BSG_SUB_PROTOCOL_SCSI_TRANSPORT, request_len=64, request="\x02\xcc\xd3\x00\x00\x00\x00\x00\x00\xcc\xb4\x00\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x05\x00\x00\x00\x3c\x00\x00\x00\x06\x00\x00\x00\x00\x01\x00\x00\x03\x00\x00\x00\x00\x02\x00\x00\xfe\x00\x00\x00\x40\x00\x00\x00\x41\x00\x00\x00\x00\x00\x00\x00", request_tag=0, request_attr=0, request_priority=0, request_extra=0, max_response_len=32, dout_iovec_count=0, dout_xfer_len=64, din_iovec_count=0, din_xfer_len=828, dout_xferp="\x00\x00\x00\x20\x00\x00\x00\x00\x00\x00\x05\x01\x00\x00\x08\x00\x00\x00\x00\x00\x00\x02\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x85\x08\x0e\x02\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xec\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", timeout=200000, flags=0, usr_ptr=0, response_len=32, response="\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", 
din_xferp="\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x40\x00\xff\x3f\x37\xc8\x10\x00\x00\x00\x00\x00\x3f\x00\x00\x00\x00\x00\x00\x00\x33\x53\x35\x46\x58\x4e\x4b\x30\x30\x34\x32\x31\x36\x38\x20\x20\x20\x20\x20\x20\x00\x00\x00\x00\x00\x00\x58\x47\x35\x4d\x30\x31\x51\x34\x41\x53\x53\x4d\x4e\x55\x20\x47\x5a\x4d\x4b\x37\x32\x4d\x30\x34\x4d\x48\x51\x48\x30\x2d\x30\x30\x35\x30\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x10\x80\x00\x40\x00\x2f\x00\x40\x00\x02\x00\x02\x07\x00\xff\x3f\x10\x00\x3f\x00\x10\xfc\xfb\x00\x10\xbd\xff\xff\xff\x0f\x00\x00\x07\x00\x03\x00\x78\x00\x78\x00\x78\x00\x78\x00\x30\x4f\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x1f\x00\x0e\x85\x46\x00\x64\x00\x64\x00\xfc\x03\x39\x00\x6b\x74\x29\x7d\x63\x41\x69\x74\x01\xbc\x63\x41\x7f\x40\x10\x00\x10\
x00\xfe\x00\xfe\xff\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xb0\x44\xf2\x1b\x00\x00\x00\x00\x00\x00\x08\x00\x00\x40\x00\x00\x02\x50\x8c\x53\xa1\x40\x34\x77\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x9e\x40\x1c\x40\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x21\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x03\x00\x01\x00\x20\x20\x20\x20\x20\x20\x20\x20\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x3d\x00\x00\x00\x00\x00\x00\x40\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x7f\x10\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x08\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x40\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xa5\x04", driver_status=0, transport_status=0, device_status=0, retry_delay=1537, info=0, duration=384, response_len=32, din_resid=0, dout_resid=0, generated_tag=0x8b000001b0}) = 0
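
One way to sanity-check that the tail of din_xferp really is ATA IDENTIFY DEVICE data is to decode its string fields. ATA stores two characters per 16-bit word with the first character in the high byte, so each byte pair appears swapped in the raw buffer. The helper below is a minimal sketch with hypothetical names; the word positions (serial number at words 10 to 19, firmware at 23 to 26, model at 27 to 46) are the standard IDENTIFY layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Copy an ATA IDENTIFY string field out of `id` (the 512-byte IDENTIFY block).
 * `first_word` and `word_count` select the field, e.g. 27 and 20 for the model.
 * Each byte pair is swapped, trailing spaces are trimmed, and the result is
 * NUL-terminated. `out` must hold at least 2 * word_count + 1 bytes.
 */
void sketch_ata_string(const uint8_t *id, size_t first_word,
                       size_t word_count, char *out)
{
    size_t len = 2 * word_count;

    assert(id != NULL && out != NULL);
    for (size_t i = 0; i < word_count; i++) {
        const uint8_t *w = id + 2 * (first_word + i);
        out[2 * i]     = (char)w[1];  /* high byte holds the first character */
        out[2 * i + 1] = (char)w[0];
    }
    while (len > 0 && out[len - 1] == ' ')
        len--;
    out[len] = '\0';
}
```

The byte run `\x41\x53\x53\x4d\x4e\x55\x20\x47...` in the capture decodes this way to "SAMSUNG MZ7KM240HMHQ-00005", which matches the model shown in the perccli2 PD list.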

@tranquillity-codes

Our findings from yesterday:

REQUEST:
  MPT cmd:
    \x02
  Padding:
    \xcc\xd3\x00\x00\x00\x00\x00
  Controller ID:
    \x00
  Padding:
    \xcc
  Timeout:
    \xb4\x00
  Padding:
    \x00\x00\x00\x00
  Size:
    \x04
  Padding:
    \x00\x00\x00\x00\x00\x00\x00
  Entries:
    Type (MPI_REPLY):
      \x05\x00\x00\x00
    Size (60):
      \x3c\x00\x00\x00

    Type (ERR_RESPONSE):
      \x06\x00\x00\x00
    Size (256):
      \x00\x01\x00\x00
    
    Type (DATA_IN):
      \x03\x00\x00\x00
    Size (512):
      \x00\x02\x00\x00
    
    Type (MPI_REQUEST):
      \xfe\x00\x00\x00
    Size (64):
      \x40\x00\x00\x00
  Padding:
    \x00\x00\x00\x00\x00\x00\x00\x00
XFER_OUT:
  host_tag:
    \x00\x00
  ioc_use_only02:
    \x00
  function:
    \x20
  ioc_use_only04:
    \x00\x00
  ioc_use_only06:
    \x00
  msg_flags:
    \x00
  change_count:
    \x00\x00
  function_dependent (disk selector):
    \x03\x01

  Rest (unsure):
    \x00\x00\x08\x00\x00\x00\x00\x00\x00\x02\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00

  SCSI command:
    opcode: ATA PASS-THROUGH (16) 
      \x85
    Obsolete/Protocol/Extend:
      \x08
    OFF_LINE/CK_COND/T_TYPE/T_DIR/BYTE_BLOCK/T_LENGTH:
      \x0e
    Features:
      \x00\xd0
    Count:
      \x00\x00
    LBA:
      \x00\x00\x00\x4f\x00\xc2
    Device:
      \x00
    Command (SMART; the 0xD0 Features value selects SMART READ DATA):
      \xb0
    Control:
      \x00

  SCSI Padding (since we have a 16 byte SCSI command):
    \x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00
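
The SCSI command part of the breakdown above follows the ATA PASS-THROUGH (16) CDB layout from the SAT specification. The sketch below rebuilds those 16 bytes from ATA register values for a 28-bit PIO data-in command; the struct and function names are hypothetical, and only the low byte of each register pair is filled in.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* ATA taskfile registers for a 28-bit (non-EXT) command (hypothetical name). */
struct sketch_ata_regs {
    uint8_t features, count, lba_low, lba_mid, lba_high, device, command;
};

/*
 * Build a 16-byte ATA PASS-THROUGH (16) CDB for a PIO data-in command.
 * Byte 1: protocol 4 (PIO Data-In) shifted left by one, EXTEND = 0.
 * Byte 2: T_DIR = 1 (from device), BYTE_BLOCK = 1, T_LENGTH = 2 (use Count).
 */
void sketch_ata_pt16_pio_in(uint8_t cdb[16], const struct sketch_ata_regs *r)
{
    assert(cdb != NULL && r != NULL);
    memset(cdb, 0, 16);
    cdb[0]  = 0x85;                /* ATA PASS-THROUGH (16) opcode */
    cdb[1]  = 4 << 1;              /* 0x08 */
    cdb[2]  = 0x08 | 0x04 | 0x02;  /* 0x0e */
    cdb[4]  = r->features;         /* Features (7:0) */
    cdb[6]  = r->count;            /* Count (7:0) */
    cdb[8]  = r->lba_low;          /* LBA (7:0) */
    cdb[10] = r->lba_mid;          /* LBA (15:8) */
    cdb[12] = r->lba_high;         /* LBA (23:16) */
    cdb[13] = r->device;
    cdb[14] = r->command;
    /* cdb[15] (Control) and the high-order register bytes stay zero. */
}
```

Calling it with Features 0xd0, LBA mid/high 0x4f/0xc2 and Command 0xb0 reproduces the bytes listed above. The Count of 0 mirrors the thread's bytes; since T_LENGTH points at the Count field, a request that expects 512 bytes back may need Count set to 1.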

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <linux/bsg.h>
#include <scsi/sg.h>
#include <inttypes.h>

void print_hex(const char *label, void *buf, size_t len) {
    unsigned char *byte_buf = (unsigned char *)buf;
    printf("%s: ", label);
    for (size_t i = 0; i < len; i++) {
        printf("%02x ", byte_buf[i]);
    }
    printf("\n");
}

int main() {
    struct sg_io_v4 io_v4;
    memset(&io_v4, 0, sizeof(io_v4));

    // Set the guard, protocol, and subprotocol
    io_v4.guard = 'Q';  // 'Q' to differentiate from v3
    io_v4.protocol = BSG_PROTOCOL_SCSI;  // SCSI protocol
    io_v4.subprotocol = BSG_SUB_PROTOCOL_SCSI_TRANSPORT;  // SCSI transport subprotocol

    // Request and response setup
    io_v4.request_len = 64;
    io_v4.request = (uint64_t)malloc(io_v4.request_len);
    uint8_t request_data[64] = {0x02, 0xcc, 0xd3, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xcc, 0xb4, 0x00, 0x00, 0x00, 0x00, 0x00, 0x04, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x05, 0x00, 0x00, 0x00, 0x3c, 0x00, 0x00, 0x00, 0x06, 0x00, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x03, 0x00, 0x00, 0x00, 0x00, 0x02, 0x00, 0x00, 0xfe, 0x00, 0x00, 0x00, 0x40, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00};
    memcpy((void*)io_v4.request, request_data, io_v4.request_len);

    io_v4.max_response_len = 32;
    io_v4.response_len = 32;
    io_v4.response = (uint64_t)malloc(io_v4.max_response_len);
    memset((void*)io_v4.response, 0, io_v4.max_response_len);

    // din setup
    io_v4.din_xfer_len = 828;
    io_v4.din_xferp = (uint64_t)malloc(io_v4.din_xfer_len);
    memset((void*)io_v4.din_xferp, 0, io_v4.din_xfer_len);

    // dout setup
    io_v4.dout_xfer_len = 64;
    io_v4.dout_xferp = (uint64_t)malloc(io_v4.dout_xfer_len);
    // IDENTIFY DEVICE: 0xEC
    // SMART: 0xB0 (feature 0xD0 selects SMART READ DATA)
    uint8_t opcode = 0xb0;
    uint8_t page = 0x0;
    //uint8_t dout_data[64] = {0x00, 0x00, 0x00, 0x20, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x03, 0x01, 0x00, 0x00, 0x08, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x02, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, opcode, 0x08, 0x0e, 0x00, 0xd0, 0x00, 0x00, 0x00, page, 0x00, 0x4f, 0x00, 0xc2, 0x00, 0xb0, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00};
    uint8_t dout_data[64] = {0x00, 0x00, 0x00, 0x20, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x03, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x85, 0x08, 0x0e, 0x00, 0xd0, 0x00, 0x00, 0x00, page, 0x00, 0x4f, 0x00, 0xc2, 0x00, opcode, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00};
    memcpy((void*)io_v4.dout_xferp, dout_data, io_v4.dout_xfer_len);

    // Other fields
    io_v4.timeout = 20000;       // Timeout in milliseconds
    io_v4.flags = 0;             // No flags set
    io_v4.dout_iovec_count = 0;  // Flat transfer
    io_v4.din_iovec_count = 0;   // Flat transfer

    // Open the /dev/bsg/mpi3mrctl0 device
    int fd = open("/dev/bsg/mpi3mrctl0", O_RDWR);
    if (fd < 0) {
        perror("Failed to open /dev/bsg/mpi3mrctl0");
        return 1;
    }

    // Send SG_IO ioctl
    if (ioctl(fd, SG_IO, &io_v4) < 0) {
        perror("SG_IO ioctl failed");
        close(fd);
        return 1;
    }

    printf("SG_IO ioctl sent successfully\n");

    // Print din and response as hex
    print_hex("Response", (void*)io_v4.response, io_v4.response_len);
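    // Skip the 60 + 256 bytes of reply and error-response buffers that precede the ATA data
    print_hex("SMART", (uint8_t *)(uintptr_t)io_v4.din_xferp + 256 + 60, io_v4.din_xfer_len - 256 - 60);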
    printf("\n");

    free((void*)io_v4.request);
    free((void*)io_v4.response);
    free((void*)io_v4.din_xferp);
    free((void*)io_v4.dout_xferp);

    close(fd);
    return 0;
}
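
The 16 bytes at offset 32 of dout_data above are the ATA PASS-THROUGH (16) CDB from the decode. As a minimal sketch, they can be assembled field by field instead of being written as a literal array. The helper name build_ata_pt16 is hypothetical (it is not part of smartmontools or the mpi3mr driver), and the flag bytes are simply the values from the dump, with the count field left at zero as it is there.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: fills a 16-byte ATA PASS-THROUGH (16) CDB using the
 * same protocol and flag bytes as the dump above. Bytes not set explicitly
 * (extended fields, count, device, control) stay zero. */
static void build_ata_pt16(uint8_t cdb[16], uint8_t feature,
                           uint8_t lba_low, uint8_t lba_mid,
                           uint8_t lba_high, uint8_t command)
{
    assert(cdb != NULL);
    memset(cdb, 0, 16);
    cdb[0]  = 0x85;      /* opcode: ATA PASS-THROUGH (16) */
    cdb[1]  = 0x08;      /* protocol 4 (PIO data-in) << 1, EXTEND = 0 */
    cdb[2]  = 0x0e;      /* T_DIR = 1, BYTE_BLOCK = 1, T_LENGTH = 2 */
    cdb[4]  = feature;   /* Features (7:0) */
    cdb[8]  = lba_low;   /* LBA (7:0) */
    cdb[10] = lba_mid;   /* LBA (15:8) */
    cdb[12] = lba_high;  /* LBA (23:16) */
    cdb[14] = command;   /* ATA command */
}
```

Calling build_ata_pt16(cdb, 0xd0, 0x00, 0x4f, 0xc2, 0xb0) reproduces the CDB bytes used in the test program; the result could then be copied to dout_data + 32.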

@tranquillity-codes

Also, perccli2 consists of a C++ part and a C part. The config file enables the C++ tracer, but there is also a C tracer for the internal storelib8 functions, which we haven't yet figured out how to enable. Hopefully it won't be necessary, though.

@tranquillity-codes
tranquillity-codes commented Jan 22, 2025
/* MPI3: Function definitions */
#define MPI3_BSG_FUNCTION_MGMT_PASSTHROUGH       (0x0a)
#define MPI3_BSG_FUNCTION_SCSI_IO                (0x20)
#define MPI3_BSG_FUNCTION_SCSI_TASK_MGMT         (0x21)
#define MPI3_BSG_FUNCTION_SMP_PASSTHROUGH        (0x22)
#define MPI3_BSG_FUNCTION_NVME_ENCAPSULATED      (0x24)

MPT function defs, from include/uapi/scsi/scsi_bsg_mpi3mr.h

#define MPI3MR_DRVBSG_OPCODE_UNKNOWN		0
#define MPI3MR_DRVBSG_OPCODE_ADPINFO		1
#define MPI3MR_DRVBSG_OPCODE_ADPRESET		2
#define MPI3MR_DRVBSG_OPCODE_ALLTGTDEVINFO	4
#define MPI3MR_DRVBSG_OPCODE_GETCHGCNT		5
#define MPI3MR_DRVBSG_OPCODE_LOGDATAENABLE	6
#define MPI3MR_DRVBSG_OPCODE_PELENABLE		7
#define MPI3MR_DRVBSG_OPCODE_GETLOGDATA		8
#define MPI3MR_DRVBSG_OPCODE_QUERY_HDB		9
#define MPI3MR_DRVBSG_OPCODE_REPOST_HDB		10
#define MPI3MR_DRVBSG_OPCODE_UPLOAD_HDB		11
#define MPI3MR_DRVBSG_OPCODE_REFRESH_HDB_TRIGGERS	12

DRV opcodes (at offset 8, check struct mpi3mr_bsg_packet)

0x01 in REQUEST is DRV, 0x02 is MPT (it's the cmd_type field)

Entry point into the handler is at mpi3mr_bsg_request in drivers/scsi/mpi3mr/mpi3mr_app.c

@tranquillity-codes
tranquillity-codes commented Jan 22, 2025
num:
04 00

pad:
00 00
00 00 00 00

handle:
02 00
perst_id:
02 00
target_id:
02 00 00 00
bus_id:
00
padding:
ff ff ff

handle:
05 01
perst_id:
13 01
target_id:
ff ff ff ff
bus_id:
ff
padding:
ff ff ff

handle:
03 01
perst_id:
14 01
target_id:
ff ff ff ff
bus_id:
ff
padding:
ff ff ff

handle:
04 01
perst_id:
23 01
target_id:
ff ff ff ff
bus_id:
ff
padding:
ff ff ff

A decoding of the response to an ALLTGTDEVINFO DRV command. It conforms to the mpi3mr_all_tgt_info struct. The rest of the buffer is untouched zero padding; perccli2 just preallocates a buffer of size 24584 and hopes the result fits (which it almost certainly will, since the results are not large).

The handle is what the MPT MPI3_BSG_FUNCTION_SCSI_IO command takes to identify the disk to send to.
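
As a rough sketch of how the decoded layout above could be walked to collect the handles, the following parser reads the 8-byte header (16-bit count plus padding) and then one 12-byte entry per device. The struct and function names are hypothetical; the field offsets come from the dump in this comment rather than from the kernel header, so treat them as an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory form of one decoded entry. */
struct tgt_entry {
    uint16_t handle;
    uint16_t perst_id;
    uint32_t target_id;
    uint8_t  bus_id;
};

static uint16_t le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Parses at most max entries from buf and returns how many were stored.
 * Stops early if the buffer is too short for the advertised count. */
static size_t parse_all_tgt_info(const uint8_t *buf, size_t len,
                                 struct tgt_entry *out, size_t max)
{
    assert(buf != NULL && out != NULL);
    if (len < 8)
        return 0;                       /* header: num (2 bytes) + pad (6 bytes) */
    size_t num = le16(buf);
    size_t n = 0;
    for (size_t i = 0; i < num && n < max; i++) {
        size_t off = 8 + i * 12;        /* each entry is 12 bytes */
        if (off + 12 > len)
            break;
        out[n].handle    = le16(buf + off);
        out[n].perst_id  = le16(buf + off + 2);
        out[n].target_id = le32(buf + off + 4);
        out[n].bus_id    = buf[off + 8]; /* bytes 9..11 are padding */
        n++;
    }
    return n;
}
```

For the dump above this yields four entries with handles 0x0002, 0x0105, 0x0103 and 0x0104; the third one (bytes 03 01) matches the disk selector in the decoded SCSI_IO request.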

@tranquillity-codes
tranquillity-codes commented Jan 22, 2025

We (us & @bluedinoo) will write a patch this week, if nothing unexpected occurs. We have a pretty good idea of what we need to do, just gotta write it :) A test utility we wrote already passes ATA commands through SCSI ATA PASS-THROUGH successfully and returns data that looks good, so it's just a matter of hooking it up to smartmontools.

@samm-git
Contributor

Cool. I also wrote some test code to get IDENTIFY working, but my time is very limited. @tranquillity-codes: the patch should implement only the SCSI device; SAT should be handled by smartmontools. See the other RAID implementations, e.g. megaraid, for details.

There is another topic, NVMe support, which will probably require more effort, but we can do that later.

@tranquillity-codes
tranquillity-codes commented Jan 22, 2025

When we have the SCSI response data, how do we return it to smartmontools from the scsi_pass_through of our linux_mpi3mr_device? We tried assigning iop->dxferp and iop->dxfer_len, but can't get it working; we are unsure whether we have a bug somewhere or missed something :/ We are largely basing it on the existing megaraid code.

@bluedinoo

Currently looking into the existing sssraid code with @tranquillity-codes as a reference, since BSG is also used for that controller type.
I imagine v4 is different enough to cause issues with response data.
We will look into this further tomorrow.

@bluedinoo

@samm-git Is it possible to arrange a DM session?

So far, we have almost fully finished the implementation in smartmontools; however, smartmontools still seems to have trouble parsing our response (din_xfer) data.
As we do not have access to a different type of RAID controller that supports the BSG interface, we are limited in our ability to compare data between working and non-working machines. Therefore, we decided to compare our ioctl to a standard SCSI request for now:
We found that the first 316 bytes of the response (din) are (possibly) MPI3-specific. The data after that seems to be a standard SCSI response. We accounted for this in our current code, so that only the SCSI data is copied into scsi_cmnd_io.dxferp.
However, we receive the following error: Read Device Identity failed: empty IDENTIFY data.
The IOCTL itself succeeds and returns the drive's S/N, Model Name, etc.
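
The 316-byte figure is consistent with the buffer entry list in the request posted earlier (60 + 256 bytes ahead of the 512-byte DATA_IN buffer). A minimal sketch of that arithmetic follows, assuming the din-direction buffers are packed into din in the order they are listed. The enum and function names are hypothetical, the 4-byte type and length fields follow the decode in this thread, and the name given to buffer type 5 is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Buffer type values as seen in the request dump; names are illustrative. */
enum {
    BUF_DATA_IN      = 3,
    BUF_DATA_OUT     = 4,
    BUF_MPI_REPLY    = 5,    /* assumed name for type 5 */
    BUF_ERR_RESPONSE = 6,
    BUF_MPI_REQUEST  = 0xfe
};

struct buf_entry {
    uint32_t type;
    uint32_t len;
};

/* Returns the byte offset of the first buffer of type `wanted` inside din,
 * or -1 if it is not a din buffer or is not listed.
 * Assumption: din buffers are laid out back to back in list order. */
static long din_offset(const struct buf_entry *e, size_t n, uint32_t wanted)
{
    assert(e != NULL);
    long off = 0;
    for (size_t i = 0; i < n; i++) {
        if (e[i].type == BUF_DATA_OUT || e[i].type == BUF_MPI_REQUEST)
            continue;               /* these travel in dout, not din */
        if (e[i].type == wanted)
            return off;
        off += (long)e[i].len;
    }
    return -1;
}
```

With the entries {5, 60}, {6, 256}, {3, 512}, {0xfe, 64} from the request, din_offset(..., BUF_DATA_IN) gives 316, so computing the offset from the list avoids hard-coding it if the buffer sizes ever change.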

I am happy to share the din data we receive, along with the smartmontools patch from @tranquillity-codes and me, if needed.

Kind Regards

@samm-git
Contributor

Feel free to open a PR with the patch or just post it in a comment. I will try to take a look too.

@bluedinoo

PR opened

@bluedinoo

@samm-git Asking for a short update while also sharing what I found out further:

  • IDENTIFY data does not seem to work because the device is not yet correctly implemented. Working on that, but I'd like to ask you for some guidance on doing it the way smartmontools expects.

@KaizenTamasVeres
Author

Dell has now officially responded: they will not contribute to or support this project/issue.
They reiterated that they are not blocking smartctl.
