// Copyright (c) 2018, Intel Corporation.
// SPDX-License-Identifier: BSD-3-Clause

ifdef::manpage[]
ipmctl-inject-error(1)
======================
endif::manpage[]

NAME
----
ipmctl-inject-error - Injects an error or clears a previously injected error

SYNOPSIS
--------
[listing]
--
ipmctl set [OPTIONS] -dimm [TARGETS] [PROPERTIES]
--

DESCRIPTION
-----------
Injects an error or clears a previously injected error on one or more PMem modules for
testing purposes.

OPTIONS
-------
-h::
-help::
    Displays help for the command.

-ddrt::
  Used to specify DDRT as the desired transport protocol for the current invocation of ipmctl.

-smbus::
  Used to specify SMBUS as the desired transport protocol for the current invocation of ipmctl.

NOTE: The -ddrt and -smbus options are mutually exclusive and may not be used together.

ifdef::os_build[]
-o (text|nvmxml)::
-output (text|nvmxml)::
  Changes the output format. One of: "text" (default) or "nvmxml".
endif::os_build[]

TARGETS
-------
-dimm [DimmIDs]::
  Injects or clears an error on specific PMem modules by supplying one or more
  comma-separated PMem module identifiers. The default is to inject the error on all
  manageable PMem modules.

PROPERTIES
----------
This command only supports setting or clearing one type of error at a time.

Clear::
  * "1": Clears a previously injected error. This property must be combined with
  one of the other properties indicating the previously injected error to clear.

Temperature::
  Injects an artificial media temperature in degrees Celsius into the PMem module. The
  firmware that monitors the temperature of the PMem module will then be alerted
  and take the necessary precautions to preserve the PMem module. The value is injected
  immediately, and the firmware uses it in place of the actual media temperature
  of the device, which may cause the firmware to react adversely and generate an
  alert or log.

NOTE: The injected temperature value will remain until the next reboot or until it is cleared.
The media temperature is an artificial temperature and will not cause harm to the
part. However, firmware actions due to improper temperature injections may
cause adverse effects on the PMem module. +
If the Critical Shutdown Temperature, or higher, is passed in, this may cause the
PMem module firmware to perform a shutdown in order to preserve the part and
data. +
The temperature value will be ignored on clear.

Poison::
  The physical address to poison. +
  Poison cannot be injected at any address in the PM region if the PM region is
  locked. Injected poison errors are only triggered on a subsequent read of the
  poisoned address, in which case an error log will be generated by the firmware,
  but no alerts will be sent. +
  This command can also be used to clear non-injected poison errors. The data will be
  zeroed after clearing. Error injection does not need to be enabled before
  requesting that poison errors be cleared. +
  The caller is responsible for keeping a list of injected poison errors, in
  order to properly clear the injected errors afterwards. Simply disabling
  injection does not clear injected poison errors. Injected poison errors are
  persistent across power cycles and system resets.

NOTE: System firmware (BIOS) will not read from any Intel(R) Optane(TM) PMem device
addresses that are known to be poisoned. For any poisoned address, the first read may
result in a hang/fault, but system firmware (BIOS) will mark this address as poisoned
so that subsequent attempts to read it will be rejected with an error. Such an
error may prevent booting from a namespace that has poisoned data.

NOTE: The address must be 256-byte aligned (e.g., 0x10000000, 0x10000100, 0x10000200...).
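
The alignment rule can be checked before an address is passed to the command. The
following is a minimal POSIX shell sketch; `is_poison_aligned` and `align_down_256`
are hypothetical helper names used for illustration and are not part of ipmctl.

```shell
# Hypothetical helpers, not part of ipmctl.

# Succeeds (exit status 0) when the address is a multiple of 256 bytes.
# Shell arithmetic accepts 0x-prefixed hexadecimal constants.
is_poison_aligned() {
  [ $(( $1 % 256 )) -eq 0 ]
}

# Prints the address rounded down to the nearest 256-byte boundary.
align_down_256() {
  printf '0x%x\n' $(( $1 - $1 % 256 ))
}

is_poison_aligned 0x10000200 && echo "0x10000200 is aligned"
is_poison_aligned 0x10000210 || echo "0x10000210 is not aligned"
align_down_256 0x10000210    # prints 0x10000200
```

Rounding down changes which 256-byte block is poisoned, so it may be safer to reject
an unaligned address than to correct it silently.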

PoisonType::
  The type of memory to poison. One of:
  * "PatrolScrub": Injects a poison error at the specified address simulating an
    error found during a patrol scrub operation, regardless of how the memory is
    currently allocated. This is the default.
  * "MemoryMode": Injects a poison error at the specified address currently
    allocated in Memory Mode.
  * "AppDirect": Injects a poison error at the specified address currently
    allocated as App Direct.

NOTE: If the address to poison is not currently allocated as the specified
memory type, an error is returned.

PackageSparing::
  * "1": Triggers an artificial package sparing. If package sparing is enabled
  and the PMem module still has spares remaining, this will cause the firmware to
  report that there are no spares remaining.

NOTE: Injecting package sparing is not supported on Intel(R) Optane(TM) Persistent Memory 300
series modules.

PercentageRemaining::
  Injects an artificial percentage remaining value into the PMem module. This
  will cause the firmware to take appropriate action based on the value and if
  necessary generate an error log, an alert, and update the health status.

FatalMediaError::
  * "1": Injects a fake media fatal error which will cause the firmware to
  generate an error log and an alert.

NOTE: When a media fatal error is injected, the BSR Media Disabled status
bit is set, indicating a media error. The bit remains set until the injected
fatal error is cleared using the disable trigger input parameter.

NOTE: Injecting a Fatal Media error is unsupported on Windows*.
Contact Microsoft* for assistance in performing this action.

NOTE: When a fatal media error is cleared, a power cycle is needed for
this operation to take effect.

DirtyShutdown::
  * "1": Injects an ADR failure resulting in a dirty shutdown upon reboot.

EXAMPLES
--------
Sets the media temperature on all manageable PMem modules to 50 degrees
Celsius.
[listing]
--
ipmctl set -dimm Temperature=50
--

Clears the injected media temperature on all manageable PMem modules.
[listing]
--
ipmctl set -dimm Clear=1 Temperature=1
--

Poisons address 0x10000200 on PMem module 1234.
[listing]
--
ipmctl set -dimm 1234 Poison=0x10000200
--

Clears the injected poison of address 0x10000200 on PMem module 1234.
[listing]
--
ipmctl set -dimm 1234 Poison=0x10000200 Clear=1
--
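
Because the caller is responsible for tracking injected poison errors, it can help to
wrap the two commands above in a small script that records each injected address. The
following POSIX shell functions are a sketch only: `inject_poison`, `clear_all_poison`,
the `POISON_LOG` path, and the `IPMCTL` override are hypothetical names that are not
part of ipmctl, and the sketch assumes that ipmctl exits with a non-zero status when an
operation fails.

```shell
# Hypothetical bookkeeping helpers; the names and the log path are illustrative
# and are not part of ipmctl.
POISON_LOG="${POISON_LOG:-./injected_poison.log}"
IPMCTL="${IPMCTL:-ipmctl}"   # set IPMCTL="echo ipmctl" for a dry run

# Injects poison and, only if the command succeeds, records "DimmID Address".
# Assumes ipmctl exits with a non-zero status when the injection fails.
inject_poison() {
  $IPMCTL set -dimm "$1" Poison="$2" &&
    printf '%s %s\n' "$1" "$2" >> "$POISON_LOG"
}

# Clears every recorded address. Entries that fail to clear are kept in the
# log so that they can be retried later.
clear_all_poison() {
  [ -f "$POISON_LOG" ] || return 0
  remaining=$(mktemp) || return 1
  while read -r dimm addr; do
    # Redirect stdin so the command cannot consume the log being read.
    $IPMCTL set -dimm "$dimm" Poison="$addr" Clear=1 < /dev/null ||
      printf '%s %s\n' "$dimm" "$addr" >> "$remaining"
  done < "$POISON_LOG"
  mv "$remaining" "$POISON_LOG"
}
```

Calling `inject_poison 1234 0x10000200` and later `clear_all_poison` issues the same two
commands shown above. The log is kept in a file because injected poison errors persist
across power cycles and system resets.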

Triggers an artificial package sparing on all manageable PMem modules.
[listing]
--
ipmctl set -dimm PackageSparing=1
--

Sets the life remaining percentage on all manageable PMem modules to 10%.
[listing]
--
ipmctl set -dimm PercentageRemaining=10
--

Clears the injected life remaining percentage on all manageable PMem modules. The
value of PercentageRemaining is irrelevant.
[listing]
--
ipmctl set -dimm PercentageRemaining=10 Clear=1
--

Triggers an artificial ADR failure on all manageable PMem modules resulting in a dirty
shutdown on each PMem module on the next reboot.
[listing]
--
ipmctl set -dimm DirtyShutdown=1
--

LIMITATIONS
-----------
This command is available only when error injection is enabled on the PMem modules
in the BIOS. To successfully execute this command, the specified PMem modules must
be manageable by the host software.

RETURN DATA
-----------
For each PMem module, the CLI will indicate the status of the operation. If a failure
occurs when injecting an error on multiple PMem modules, the process will continue
with the remaining PMem modules.
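
Because the command continues after a failure, a script that targets several PMem
modules may want to inspect every status line and not only the last one. The following
POSIX shell sketch assumes that each status line follows the `Success|Error` pattern
shown under SAMPLE OUTPUT; `report_inject_failures` is a hypothetical helper name and
is not part of ipmctl.

```shell
# Hypothetical helper, not part of ipmctl.
# Reads command output on stdin, prints the lines whose status is "Error",
# and returns non-zero when at least one such line is found.
# Assumes the "...: Success" / "...: Error (Code) - (Description)" layout.
report_inject_failures() {
  if grep ': Error'; then
    return 1
  fi
  return 0
}
```

A possible use is `ipmctl set -dimm Temperature=50 | report_inject_failures`, which
prints only the failing modules and sets the exit status of the pipeline accordingly.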

SAMPLE OUTPUT
-------------
[listing]
--
Set temperature on PMem module (DimmID): Success|Error (Code) -
(Description)
Clear injected temperature on PMem module (DimmID): Success|Error
(Code) - (Description)
--

[listing]
--
Poison address (Address) on PMem module (DimmID): Success|Error
(Code) - (Description)
Clear injected poison of address (Address) on PMem module
(DimmID): Success|Error (Code) - (Description)
--

[listing]
--
Trigger package sparing on PMem module (DimmID): Success|Error
(Code) - (Description)
Clear injected package sparing on PMem module (DimmID):
Success|Error (Code) - (Description)
--

[listing]
--
Trigger a spare capacity alarm on PMem module (DimmID):
Success|Error (Code) - (Description)
Clear injected spare capacity alarm on PMem module (DimmID):
Success|Error (Code) - (Description)
--

[listing]
--
Create a media fatal error on PMem module (DimmID): Success|Error
(Code) - (Description)
Clear injected media fatal error on PMem module (DimmID):
Success|Error (Code) - (Description)
--
