Request / Error / Symptom :
When the server is rebooted with a command, it prints the message below repeatedly and the boot does not proceed. A cold boot using the power button boots normally.
BUG: soft lockup - CPU#0 stuck for 67s! [migration/0:5]
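To confirm the same symptom in a captured console log, the message above can be matched with a regular expression. The following is a minimal sketch only; the function name `parse_soft_lockup` and the returned field names are illustrative assumptions, not part of any tool.

```python
import re

# Pattern for the soft lockup report shown above: CPU number, stuck time in
# seconds, then the task name and PID inside square brackets.
SOFT_LOCKUP = re.compile(
    r"BUG: soft lockup - CPU#(\d+) stuck for (\d+)s! \[(.+):(\d+)\]"
)

def parse_soft_lockup(line):
    """Return the fields of a soft lockup line, or None if the line does not match."""
    m = SOFT_LOCKUP.search(line)
    if m is None:
        return None
    return {
        "cpu": int(m.group(1)),
        "seconds": int(m.group(2)),
        "task": m.group(3),
        "pid": int(m.group(4)),
    }
```

Counting how many console lines match gives a quick indication of whether the message is repeating, as described in the symptom.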
Analysis :
This is a known RHEL bug. It can be worked around or resolved by upgrading the kernel or by disabling audit.
System Information
Product Name: ProLiant BL460c Gen8
BIOS Information
Version: I31
Release Date: 09/08/2013
OS, Kernel version
Linux 2.6.32-279.el6.x86_64
OS Specific Release Information (/etc/redhat-release)
Red Hat Enterprise Linux Server release 6.3 (Santiago)
RHEL 6 server with audit enabled hangs on startup/shutdown (BUG: soft lockup - CPU#N stuck for 67s!) in audit code
https://access.redhat.com/solutions/502603
Environment
• Red Hat Enterprise Linux (RHEL) 6, minor releases earlier than 6.5
Resolution
• RHEL6.2.z(EUS): Update the kernel to 2.6.32-220.45.1.el6 (released with RHSA-2013-1519) or later to fix the issue.
• RHEL6.3.z(EUS): Update the kernel to 2.6.32-279.39.1.el6 (released with RHSA-2013-1783) or later to fix the issue.
• RHEL6.4.z(EUS): Update the kernel to 2.6.32-358.28.1.el6 (released with RHBA-2013-1770) or later to fix the issue.
• RHEL6.5: Update the kernel to 2.6.32-431.el6 (released with RHSA-2013-1645) or later to fix the issue. This fix is already included in RHEL6.5.
• A workaround was tested by one of our customers and appeared to be working - just start the system with "audit" disabled.
Add "audit=0" to the kernel command line (in /boot/grub/grub.conf) and try to boot. This workaround disables the auditd daemon.
• The audit subsystem can become heavily loaded for brief instances even on a fairly idle system.
We recommend that any auditd users move to a patched kernel, since we cannot quantify the amount of load that may trigger the hang.
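The fixed kernel levels in the Resolution list can be checked against the output of `uname -r`. The following is a minimal sketch, assuming the release string has the form `2.6.32-<release>.el6...`; the names `FIXED`, `release_tuple`, and `has_fix` are illustrative. Streams not named in the list (for example RHEL 6.0 and 6.1 kernels) are reported as unknown rather than guessed.

```python
import re

# Minimum fixed kernel release per RHEL 6 stream, copied from the Resolution list.
FIXED = {
    220: (220, 45, 1),  # RHEL 6.2.z
    279: (279, 39, 1),  # RHEL 6.3.z
    358: (358, 28, 1),  # RHEL 6.4.z
    431: (431,),        # RHEL 6.5
}

def release_tuple(uname_r):
    """'2.6.32-279.39.1.el6.x86_64' -> (279, 39, 1)."""
    m = re.match(r"2\.6\.32-([0-9.]+?)\.el6", uname_r)
    if m is None:
        raise ValueError("not a RHEL 6 kernel release: %r" % uname_r)
    return tuple(int(part) for part in m.group(1).split("."))

def has_fix(uname_r):
    """True if patched, False if not, None if the stream is not in the list."""
    rel = release_tuple(uname_r)
    stream = rel[0]
    if stream > 431:
        return True  # later than the RHEL 6.5 kernel, which already has the fix
    if stream not in FIXED:
        return None  # no fixed level is listed for this stream
    return rel >= FIXED[stream]
```

For the kernel on this server, `has_fix("2.6.32-279.el6.x86_64")` returns `False`, which is consistent with the symptom.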
Recommendation :
Action Plan 1.
What: Upgrade Kernel or Disable Audit
Why : To prevent reboot failures caused by the OS bug.
1. Update the kernel to 2.6.32-279.39.1.el6 (released with RHSA-2013-1783) or later
or
2. Add "audit=0" to the kernel command line (in /boot/grub/grub.conf) and try to boot
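If option 2 is chosen, it is worth confirming after the reboot that the parameter actually reached the kernel. The following is a minimal sketch that inspects a kernel command line such as the content of /proc/cmdline; the function names are illustrative, and treating the last `audit=` occurrence as the effective one is an assumption.

```python
def audit_disabled(cmdline):
    """Return True if the effective audit= parameter on the command line is 0."""
    values = [tok.split("=", 1)[1] for tok in cmdline.split() if tok.startswith("audit=")]
    # Assumption: when audit= appears more than once, the last one takes effect.
    return bool(values) and values[-1] == "0"

def read_cmdline(path="/proc/cmdline"):
    """Read the command line of the running kernel."""
    with open(path) as f:
        return f.read().strip()
```

On the server itself, `audit_disabled(read_cmdline())` should return `True` once the edited grub.conf entry has been booted.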