Adding SMART disk monitoring as an SMF service

Installing the smartmontools package and adding disks to the monitored list

Self-Monitoring, Analysis and Reporting Technology (SMART) is a useful tool for monitoring the physical health of your hard disks.

By using the programs and daemons offered by the smartmontools package together with a special Service Management Facility manifest, you can manage SMART as a service on your machines.

Device types

In Solaris, SMART monitoring works on SCSI, SATA and SAS disks. It does not work on IDE drives or on any device that uses the pci-ide driver.

First, enable the Spec Files Extra (SFE) repository:

pfexec pkg set-publisher -p http://pkg.openindiana.org/sfe

Then install the smartmontools package:

pfexec pkg install storage/smartmontools

Now you have three main components at your disposal:

  • the smartctl command
  • the smartd daemon
  • the /etc/smartd.conf configuration file

The smartctl command lets you query a disk's status and run short and long self-tests. See its man page for details. For example, to print all SMART information for a disk:

pfexec smartctl -a /dev/rdsk/c5t0d0s0

Note: for many controllers, you might need to specify the device type. The most common types are "scsi", "sat" and "sat,12". Use this form of the command:

pfexec smartctl -d sat,12 -a /dev/rdsk/c5t0d0s0
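
If you are unsure which device type a controller needs, a small loop can try the common ones in turn. The following is a minimal sketch and not part of the smartmontools package: the function name try_device_types and the SMARTCTL variable are hypothetical, and the list contains only the three types named above. The exit status of smartctl is a bitmask in which bits 0 and 1 indicate a command-line or device-open failure, so "ok" here only shows that smartctl could talk to the disk. Check the actual output by hand before settling on a type.

```shell
#!/bin/sh
# Hypothetical helper: try each common device type against one disk
# and report smartctl's exit status for each attempt.
# SMARTCTL is an assumed override hook; by default it runs the real tool.
SMARTCTL="${SMARTCTL:-pfexec smartctl}"

try_device_types() {
    disk="$1"
    for dtype in scsi sat sat,12; do
        # -i prints only the device identity section, which keeps the probe quick.
        if $SMARTCTL -d "$dtype" -i "$disk" >/dev/null 2>&1; then
            echo "$dtype: ok"
        else
            echo "$dtype: failed (exit status $?)"
        fi
    done
}

# Example call (device path taken from the text above):
#   try_device_types /dev/rdsk/c5t0d0s0
```

For any type reported as ok, rerun smartctl with -a and that type to confirm that real SMART attributes are returned.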

Once you have determined the right device type, edit the /etc/smartd.conf file: remove the DEVICESCAN line and add the raw disk device followed by the -d option and the device type. A typical line might look like this:

/dev/rdsk/c5t0d0s0 -d sat,12 -a

At this point, you might also want to specify automated scheduled disk testing. The /etc/smartd.conf file is full of examples you can adjust to suit your needs.
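
As one illustration of scheduled testing, the -s directive takes a regular expression that smartd matches against a T/MM/DD/d/HH pattern (test type, month, day of month, day of week, hour). The line below is a sketch that reuses the example disk and device type from above. The schedule itself is only an assumption to adapt: a short self-test every day between 2 and 3 am, and a long self-test on Saturdays between 3 and 4 am.

```text
/dev/rdsk/c5t0d0s0 -d sat,12 -a -s (S/../.././02|L/../../6/03)
```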

After you have edited and saved the file, run:

pfexec smartd -q onecheck

This parses the config file, registers the listed devices, checks each one once, and then exits. If a disk cannot be accessed, the error shows up here rather than later when the service starts.

Adding the SMF service

This blog has an XML manifest file and instructions for installing it. Be careful: it is outdated. You need to edit the exec paths in the XML file so that they point to the installed location of the smartd init script.

Here is a corrected XML file:

<?xml version="1.0"?>
<!DOCTYPE service_bundle SYSTEM "/usr/share/lib/xml/dtd/service_bundle.dtd.1">
<service_bundle type="manifest" name="smartd">
  <service
     name="site/smartd"
     type="service"
     version="1">
    <single_instance/>
    <dependency
       name="filesystem-local"
       grouping="require_all"
       restart_on="none"
       type="service">
      <service_fmri value="svc:/system/filesystem/local:default"/>
    </dependency>
    <exec_method
       type="method"
       name="start"
       exec="/etc/init.d/smartd start"
       timeout_seconds="60">
      <method_context>
        <method_credential user="root" group="root"/>
      </method_context>
    </exec_method>
    <exec_method
       type="method"
       name="stop"
       exec="/etc/init.d/smartd stop"
       timeout_seconds="60">
    </exec_method>
    <instance name="default" enabled="true"/>
    <stability value="Unstable"/>
    <template>
      <common_name>
        <loctext xml:lang="C">
          SMART monitoring service (smartd)
        </loctext>
      </common_name>
      <documentation>
        <manpage title="smartd" section="1M" manpath="/usr/local/share/man"/>
      </documentation>
    </template>
  </service>
</service_bundle>

Before importing the manifest, make sure you have configured /etc/smartd.conf as described above and that the pfexec smartd -q onecheck command succeeds.

Briefly, copy the file to /var/svc/manifest/site/smartd.xml, change its owner to root:sys and import the manifest by running this command:

pfexec svccfg -v import /var/svc/manifest/site/smartd.xml

Check that the service exists and enable it:

pfexec svcadm enable smartd

The service should now be running.
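
To confirm the state rather than assume it, you can ask SMF directly. The following is a minimal sketch: the function name smartd_is_online and the SVCS variable are hypothetical, and the service name smartd is the one used in the enable command above.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the smartd service is online.
# SVCS is an assumed override hook; by default it runs the real svcs command.
SVCS="${SVCS:-svcs}"

smartd_is_online() {
    # -H suppresses the header line, -o state prints only the state column.
    state=$($SVCS -H -o state smartd 2>/dev/null)
    [ "$state" = "online" ]
}

# Example:
#   smartd_is_online && echo "smartd is running"
```

If the service is in the maintenance or offline state instead, svcs -xv smartd explains why and names the service log file.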

 

Congratulations! You now have an extra safety net for your data: smartd can warn you about failing drives before they die completely.
