Features Of Machine Check Monitoring Service - NEC Express5800/A1040b User Manual

Machine check monitoring service

2.4 Features of Machine Check Monitoring Service

The process flow of the Machine Check Monitoring Service is shown below.
Table 2-1 Process flow of Machine Check Monitoring Service

Feature: Monitoring CPU failure
Process flow:
When mcemonitor detects a CPU failure, it sends the CPU fault information to the firmware.
When the firmware receives the CPU fault information, it determines which component has failed.
The firmware keeps a count of failure occurrences. When the count exceeds the threshold value, the firmware instructs mcemonitor to perform Core Offline.
When mcemonitor receives the Core Offline instruction from the firmware, it issues a CPU Offline instruction to the kernel.
If Hyper Threading Mode is set to OFF, the one logical CPU in the CPU core is taken offline. If Hyper Threading Mode is set to ON, the two logical CPUs in the CPU core are taken offline.
When CPU Offline succeeds, the relevant CPU is no longer available to the OS and software, so the number of available CPUs is reduced.
Note: Express5800/A1040b does not support the Core Offline feature.
mcemonitor notifies the firmware of the result of CPU Offline.
When CPU Offline succeeds and the server has a spare CPU, the spare CPU is added automatically (Core Online feature).
Note: For details of Core Online, refer to the Capacity Optimization (COPT) User's Guide.
The CPU fault information and the result of CPU Offline can be checked with the mcemonitor command.
See 5.1 Show CPU / Memory Status for details of the mcemonitor command.
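
The CPU failure flow above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the actual mcemonitor or firmware logic: the threshold value, the class and function names, and the logical CPU numbering are all hypothetical, because the manual does not specify them. It models only the counting step, the Hyper Threading rule for how many logical CPUs go offline, and the resulting drop in available CPUs.

```python
from collections import Counter

# Hypothetical threshold; the manual does not state the firmware's real value.
CORE_FAULT_THRESHOLD = 3


class FirmwareModel:
    """Toy stand-in for the firmware's per-core failure counter."""

    def __init__(self, threshold=CORE_FAULT_THRESHOLD):
        self.threshold = threshold
        self.fault_counts = Counter()

    def report_cpu_fault(self, core_id):
        """Record one fault; return True once the count exceeds the threshold."""
        self.fault_counts[core_id] += 1
        return self.fault_counts[core_id] > self.threshold


def logical_cpus_for_core(core_id, hyper_threading):
    """Return the logical CPUs that go offline together with a core.

    Assumes a made-up numbering for illustration: with Hyper Threading ON,
    core N owns logical CPUs 2N and 2N+1; with it OFF, core N owns CPU N.
    """
    if hyper_threading:
        return [2 * core_id, 2 * core_id + 1]
    return [core_id]


def simulate(fault_events, hyper_threading, total_cores):
    """Replay fault events; return (offlined logical CPUs, CPUs still available)."""
    firmware = FirmwareModel()
    offlined = []
    offlined_cores = set()
    for core_id in fault_events:
        if core_id in offlined_cores:
            continue  # the core is already offline, nothing more to do
        if firmware.report_cpu_fault(core_id):
            offlined.extend(logical_cpus_for_core(core_id, hyper_threading))
            offlined_cores.add(core_id)
    cpus_per_core = 2 if hyper_threading else 1
    return offlined, total_cores * cpus_per_core - len(offlined)
```

For example, with the hypothetical threshold of 3, calling simulate([1, 1, 1, 1], True, 4) takes both logical CPUs of core 1 offline on the fourth fault and leaves 6 of 8 logical CPUs available. The spare CPU step (Core Online) is deliberately left out of the sketch.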

Feature: Monitoring memory failure
Process flow:
If the number of correctable memory errors on a memory page exceeds the threshold value, the firmware instructs mcemonitor to perform Memory Page Offline.
When mcemonitor receives the Memory Page Offline instruction from the firmware, it sends a Memory Page Offline instruction to the kernel.
Memory Page Offline is performed in units of 4 KB.
When Memory Page Offline succeeds, the relevant memory page is no longer available to the OS and software, so the amount of available memory is reduced.
Note: Express5800/A1040b does not support the Page Offline feature.
mcemonitor notifies the firmware of the result of Memory Page Offline.
The result of Memory Page Offline can be checked with the mcemonitor command.
See 5.1 Show CPU / Memory Status for details of the mcemonitor command.
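
The memory failure flow above can be sketched in the same way. Again, this is an illustration under assumptions rather than the real implementation: the threshold value and all function names are hypothetical, and only the 4 KB page unit comes from the manual. The sketch maps each correctable error address to its 4 KB page, counts errors per page, and reports the pages whose count exceeds the threshold.

```python
from collections import Counter

PAGE_SIZE = 4096  # Memory Page Offline works in 4 KB units (from the manual)
# Hypothetical threshold; the manual does not state the firmware's real value.
CORRECTABLE_ERROR_THRESHOLD = 10


def page_base(physical_address):
    """Round a physical address down to the start of its 4 KB page."""
    return physical_address & ~(PAGE_SIZE - 1)


def pages_to_offline(error_addresses, threshold=CORRECTABLE_ERROR_THRESHOLD):
    """Return base addresses of pages whose error count exceeds the threshold.

    Pages are listed once, in the order in which they crossed the threshold.
    """
    counts = Counter()
    result = []
    for address in error_addresses:
        base = page_base(address)
        counts[base] += 1
        if counts[base] == threshold + 1:  # first time the count exceeds it
            result.append(base)
    return result


def remaining_memory(total_bytes, offlined_pages):
    """Memory still available after each offlined page removes 4 KB."""
    return total_bytes - PAGE_SIZE * len(offlined_pages)
```

For example, with a threshold of 2, three errors at addresses 0x1000, 0x1004 and 0x1FFF all fall in the page starting at 0x1000, so that page is selected for offlining and the available memory shrinks by 4096 bytes.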

This manual is also suitable for: Express5800/A2010b, Express5800/A2020b, Express5800/A2040b
