Check_smartstatus
Script: check_smartstatus
check_smartstatus is a plugin that runs smartctl to verify the health status of all local hard disks and SSDs.
It works on physical machines only.
Requirements
- smartctl
The Icinga user needs sudo permission to run the smartctl binary. Example sudoers entry:
icingaclient ALL=(ALL) NOPASSWD: /sbin/smartctl
Standalone installation
In addition to this script, you need the following file from this repository:
- inc_pluginfunctions: shared functions for all IML checks written in bash
Syntax
______________________________________________________________________
CHECK_SMARTSTATUS
v1.9
(c) Institute for Medical Education - University of Bern
Licence: GNU GPL 3
https://os-docs.iml.unibe.ch/icinga-checks/Checks/check_smartstatus.html
______________________________________________________________________
Show status of local S.M.A.R.T. devices.
SYNTAX:
check_smartstatus [-h] [-l] [DEVICE(S)]
OPTIONS:
-h|--help show this help.
-l|--list list devices without scanning them.
-n|--noscan do not use 'smartctl --scan' to detect devices. Add
devices to scan as parameter
-s|--short short output
-i|--ignore REGEX ignore disks matching the given regex
PARAMETERS:
DEVICE A disk drive to scan with smartctl, eg
/dev/hda, /dev/sga, ...
EXAMPLES
check_smartstatus
Scan all disks found by 'smartctl --scan' and show full output.
check_smartstatus -l
List all local disks without scanning them.
check_smartstatus -s
Scan all disks found by 'smartctl --scan' and show short output only.
check_smartstatus /dev/sg0 /dev/sg1
Scan all disks found by 'smartctl --scan' plus /dev/sg0 and /dev/sg1
and show full output.
check_smartstatus --noscan /dev/sg0 /dev/sg1
Scan all /dev/sg0 and /dev/sg1 and show full output.
check_smartstatus --noscan --ignore "sg(1|10)" /dev/sg*
Scan all /dev/sg* but ignore /dev/sg1 and /dev/sg10.
Show full output.
Parameters
(none required; all options and device parameters are optional, see the help output above)
Examples
For testing purposes, list the devices without scanning them:
./check_smartstatus -l
Devices to scan:
- /dev/nvme0 -d nvme # /dev/nvme0, NVMe device
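The device entry above has the same shape as a line printed by 'smartctl --scan': a device path, a '-d TYPE' option, and a trailing comment. As a rough sketch (not the plugin's actual code), such a line could be reduced to the arguments that are later passed to smartctl. The helper name scan_to_args is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: turn 'smartctl --scan' style lines into smartctl arguments.
# scan_to_args is a hypothetical helper, not part of check_smartstatus.

# Reads scan lines on stdin and prints one "DEVICE -d TYPE" entry per line.
scan_to_args() {
    local line
    while IFS= read -r line; do
        line="${line%%#*}"                       # drop the trailing comment
        line="${line%"${line##*[![:space:]]}"}"  # trim trailing whitespace
        [ -n "$line" ] && printf '%s\n' "$line"
    done
    return 0
}

# Demo with a captured sample line instead of real hardware.
printf '%s\n' '/dev/nvme0 -d nvme # /dev/nvme0, NVMe device' | scan_to_args
```

On a real machine the input would come from 'sudo smartctl --scan' instead of the sample line.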
Without parameters, check_smartstatus loops over all detected devices and runs a SMART check on each. The output is a status line with a summary, followed by an output section for each disk.
This is the output of a single SSD:
OK: SMART check on 1 Disks - 0 errors - /dev/nvme0: PASSED
------------------------------------------------------------------------------------------
>>>> /dev/nvme0 - rc=0 - PASSED
Short infos:
Model Number: SKHynix_HFS001TEJ9X162N
SMART overall-health self-assessment test result: PASSED
Full output:
$ sudo smartctl -Ha /dev/nvme0
smartctl 7.5 2025-04-30 r5714 [x86_64-linux-6.16.8-1-MANJARO] (local build)
Copyright (C) 2002-25, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Number: SKHynix_HFS001TEJ9X162N
Serial Number: AJC9N469110209D22
Firmware Version: 51730A10
PCI Vendor/Subsystem ID: 0x1c5c
IEEE OUI Identifier: 0xace42e
Controller ID: 0
NVMe Version: 1.4
Number of Namespaces: 1
Namespace 1 Size/Capacity: 1,024,209,543,168 [1.02 TB]
Namespace 1 Formatted LBA Size: 512
Namespace 1 IEEE EUI-64: ace42e 0035db84db
Local Time is: Tue Oct 21 11:15:34 2025 CEST
Firmware Updates (0x16): 3 Slots, no Reset required
Optional Admin Commands (0x0017): Security Format Frmw_DL Self_Test
Optional NVM Commands (0x00df): Comp Wr_Unc DS_Mngmt Wr_Zero Sav/Sel_Feat Timestmp Verify
Log Page Attributes (0x1e): Cmd_Eff_Lg Ext_Get_Lg Telmtry_Lg Pers_Ev_Lg
Maximum Data Transfer Size: 64 Pages
Warning Comp. Temp. Threshold: 86 Celsius
Critical Comp. Temp. Threshold: 87 Celsius
Supported Power States
St Op Max Active Idle RL RT WL WT Ent_Lat Ex_Lat
0 + 7.50W - - 0 0 0 0 5 305
1 + 3.9000W - - 1 1 1 1 30 330
2 + 1.5000W - - 2 2 2 2 100 400
3 - 0.0500W - - 3 3 3 3 500 1500
4 - 0.0050W - - 4 4 4 4 1000 9000
Supported LBA Sizes (NSID 0x1)
Id Fmt Data Metadt Rel_Perf
0 + 512 0 0
=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
SMART/Health Information (NVMe Log 0x02, NSID 0xffffffff)
Critical Warning: 0x00
Temperature: 48 Celsius
Available Spare: 100%
Available Spare Threshold: 10%
Percentage Used: 0%
Data Units Read: 43,174,237 [22.1 TB]
Data Units Written: 31,718,352 [16.2 TB]
Host Read Commands: 625,085,692
Host Write Commands: 743,860,963
Controller Busy Time: 19,662
Power Cycles: 672
Power On Hours: 3,564
Unsafe Shutdowns: 76
Media and Data Integrity Errors: 0
Error Information Log Entries: 0
Warning Comp. Temperature Time: 0
Critical Comp. Temperature Time: 0
Temperature Sensor 1: 43 Celsius
Temperature Sensor 2: 42 Celsius
Error Information (NVMe Log 0x01, 16 of 256 entries)
No Errors Logged
Self-test Log (NVMe Log 0x06, NSID 0xffffffff)
Self-test status: No self-test in progress
No Self-tests Logged
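The loop and summary line described above can be sketched roughly as follows. This is an illustration under assumptions, not the plugin's implementation: summarize_smart and the SMARTCTL_CMD override are hypothetical names, any non-zero smartctl exit status is treated as a failure, and the CRITICAL return code 2 follows the usual monitoring-plugin convention rather than anything stated on this page.

```shell
#!/usr/bin/env bash
# Sketch only: not the plugin's actual code.

# Assumption: SMARTCTL_CMD is a hypothetical override so the loop can be
# exercised without real disks; the default uses sudo as in the output above.
SMARTCTL_CMD="${SMARTCTL_CMD:-sudo smartctl}"

# Check each device given as an argument and print one summary line.
# Returns 0 (OK) if all devices pass, otherwise 2 (CRITICAL).
summarize_smart() {
    local dev errors=0 details=""
    for dev in "$@"; do
        # smartctl's exit status is a bitmask; 0 means nothing was flagged.
        # Simplification: every non-zero status counts as one error.
        if $SMARTCTL_CMD -H "$dev" >/dev/null 2>&1; then
            details="$details - $dev: PASSED"
        else
            details="$details - $dev: FAILED"
            errors=$((errors + 1))
        fi
    done
    if [ "$errors" -eq 0 ]; then
        echo "OK: SMART check on $# Disks - $errors errors$details"
        return 0
    fi
    echo "CRITICAL: SMART check on $# Disks - $errors errors$details"
    return 2
}
```

Calling summarize_smart /dev/nvme0 on a healthy disk would print a status line in the same shape as the first line of the example output. The per-disk sections with the short infos and the full smartctl output are left out of the sketch.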
Scan custom devices
smartctl --scan may list RAID devices that you would like to ignore, or it may not show the disks behind a RAID controller. In such cases you need a custom list of devices to check.
You can skip the automatic scan with --noscan and then add the wanted devices as parameters. Globbing is possible, e.g. /dev/sg*.
To exclude disks from the list, use --ignore <REGEX>.
Example:
check_smartstatus --noscan --ignore "sg(1|10)" /dev/sg*
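To illustrate how an ignore regex thins out a device list, here is a minimal sketch. It assumes extended regular expressions matched without anchoring (as with 'grep -E'); how the plugin itself applies the pattern is not stated on this page, and filter_devices is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch only: how an ignore regex could filter a device list.
# filter_devices is a hypothetical helper, not part of check_smartstatus.

# Usage: filter_devices REGEX DEVICE...
# Prints every DEVICE that does not match REGEX (extended regex,
# unanchored, as with 'grep -E').
filter_devices() {
    local regex="$1" dev
    shift
    for dev in "$@"; do
        if ! printf '%s\n' "$dev" | grep -Eq -- "$regex"; then
            printf '%s\n' "$dev"
        fi
    done
}

# Demo with literal names instead of a real /dev/sg* glob.
filter_devices 'sg(1|10)$' /dev/sg0 /dev/sg1 /dev/sg2 /dev/sg10 /dev/sg11
```

The demo prints /dev/sg0, /dev/sg2 and /dev/sg11. Under the unanchored matching assumed here, the pattern "sg(1|10)" without the trailing $ would also drop /dev/sg11, so on machines with more than ten sg devices it is worth checking the result with -l before relying on it.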