System Information:
DL360 Gen10 (Intel 6230 x 2p, 64GB (32GB x 2qty), P408i-a HBA, 10Gb 2p NIC, 1Gb 4p NIC, NVIDIA T4 x 2qty)
1. Set SW Level and configure
a. System ROM 2.68 / IE 0.2.3.0.0 / SPS 4.1.4.804 / iLO 2.72 - SPP2022.09.01.00
b. T4 fw 90.04.B4.00.04
c. Set Workload Profile (WP) and cooling - not set yet.
- Restore default settings
- Virtualization - Max Performance: Yes
2. Install CentOS 7.8
a. Set "nomodeset" during install
- Press <e> at the GRUB menu to edit the boot entry and append "nomodeset" to the linuxefi line:
linuxefi /images/pxeboot/vmlinuz ... nomodeset
b. Select Server with GUI
+ Compatibility Libraries
+ Development Tools
c. Configure network
- Set IP Address
3. Install NVIDIA GPU Driver
a. Blacklist nouveau and acpi_power_meter
# modprobe -r acpi_power_meter
# echo "blacklist acpi_power_meter" > /etc/modprobe.d/blacklist-acpi_power_meter.conf
# echo "install acpi_power_meter /bin/false" >> /etc/modprobe.d/blacklist-acpi_power_meter.conf
# vim /etc/sensors3.conf
chip "power_meter-acpi-0"
ignore power1
# modprobe -r nouveau
# echo "blacklist nouveau" > /etc/modprobe.d/blacklist-nouveau.conf
# echo "options nouveau modeset=0" >> /etc/modprobe.d/blacklist-nouveau.conf
# echo "install nouveau /bin/false" >> /etc/modprobe.d/blacklist-nouveau.conf
# cp /boot/initramfs-$(uname -r).img /boot/initramfs-$(uname -r).img.$(date +%m-%d-%H%M%S).bak
# dracut --omit-drivers nouveau -f
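# grub2-editenv - set "$(grub2-editenv - list | grep kernelopts) nouveau.blacklist=1 rd.driver.blacklist=nouveau"
or
# grub2-editenv - set "$(grub2-editenv - list | grep kernelopts) nouveau.modeset=0"
Note. "kernelopts" is only present in grubenv on BLS-based systems (RHEL/CentOS 8 and later). If "grub2-editenv - list" shows no kernelopts entry on CentOS 7, add the arguments with grubby instead:
# grubby --update-kernel=ALL --args="rd.driver.blacklist=nouveau nouveau.modeset=0"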
# cp /boot/initramfs-$(uname -r)kdump.img /boot/initramfs-$(uname -r)kdump.img.$(date +%m-%d-%H%M%S).bak
# sed -i '/^KDUMP_COMMANDLINE_APPEND=/s/"$/ rd.driver.blacklist=nouveau"/' /etc/sysconfig/kdump
# kdumpctl restart
# mkdumprd -f /boot/initramfs-$(uname -r)kdump.img
# reboot
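After the reboot it is worth confirming that the blacklist took effect. The following is a minimal sketch, not part of the original procedure: the helper name module_loaded is hypothetical, and it only inspects lsmod-style text passed on stdin.

```shell
# module_loaded NAME
# Reads lsmod-style output on stdin and returns 0 if NAME is listed
# in the first column (i.e. the module is loaded), 1 otherwise.
module_loaded() {
    awk -v mod="$1" '$1 == mod { found = 1 } END { exit !found }'
}

# Usage on the real system, after the reboot:
#   if lsmod | module_loaded nouveau; then
#       echo "nouveau is still loaded; re-check the blacklist and initramfs"
#   else
#       echo "nouveau is not loaded"
#   fi
```

The same helper can be used for acpi_power_meter by changing the module name.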
b. Install NV GPU driver and CUDA toolkit
- configure yum repositories
# mkdir -p /media/CentOS-DVD
# mount -o loop /tmp/CentOS-7.8-x86_64-Everything-2003.iso /media/CentOS-DVD
# vim /etc/yum.repos.d/CentOS-DVD.repo
[CentOS-DVD]
name=CentOS-DVD
baseurl=file:///media/CentOS-DVD
enabled=1
gpgcheck=0
# vim /etc/fstab
/tmp/CentOS-7.8-x86_64-Everything-2003.iso /media/CentOS-DVD iso9660 loop,ro 0 0
# mount -a
# yum repolist all
Note. Generic form of the NVIDIA repo URL: https://developer.download.nvidia.com/compute/cuda/repos/$distro/$arch/cuda-$distro.repo
For CentOS 7 on x86_64, $distro is rhel7 and $arch is x86_64:
# yum-config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/rhel7/x86_64/cuda-rhel7.repo
# yum install https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
# yum clean expire-cache
# yum install kernel-devel-$(uname -r) kernel-headers-$(uname -r)
# yum install gcc gcc-c++ freeglut-devel libX11-devel libXi-devel libXmu-devel make mesa-libGLU-devel freeimage-devel
# yum install elfutils-libelf-devel libglvnd-devel
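The local DVD repo file edited with vim above can also be written non-interactively, which helps when the procedure is repeated on several servers. This is a sketch under assumptions: the function name write_dvd_repo is hypothetical, and the option keys use the conventional lowercase spelling.

```shell
# write_dvd_repo FILE MOUNTPOINT
# Writes a yum repo definition for a locally mounted CentOS DVD/ISO to FILE.
write_dvd_repo() {
    cat > "$1" <<EOF
[CentOS-DVD]
name=CentOS-DVD
baseurl=file://$2
enabled=1
gpgcheck=0
EOF
}

# Usage (as root), with the paths used in this procedure:
#   write_dvd_repo /etc/yum.repos.d/CentOS-DVD.repo /media/CentOS-DVD
#   yum repolist all
```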
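cf. Switching between text mode and the GUI:
$ sudo systemctl isolate multi-user.target
$ sudo systemctl start graphical.target
$ sudo systemctl set-default multi-user.target
$ sudo systemctl set-default graphical.target
- Switch to text mode before running the installer (the X server must not be running)
$ sudo systemctl isolate multi-user.target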
- CUDA 11.6 / 510.47.03
# wget https://developer.download.nvidia.com/compute/cuda/11.6.2/local_installers/cuda_11.6.2_510.47.03_linux.run
# sh cuda_11.6.2_510.47.03_linux.run
$ vim ~/.bashrc
export PATH=/usr/local/cuda-11.6/bin:/usr/local/cuda-11.6/samples:/usr/local/cuda-11.6/samples/bin/x86_64/linux/release${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-11.6/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
$ source ~/.bashrc
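Sourcing ~/.bashrc repeatedly prepends the same CUDA directories again each time, and a stray colon in a PATH assignment adds an empty entry (which the shell treats as the current directory). A small helper can guard against both. This is only a sketch: the name prepend_path is hypothetical, and it is written for POSIX sh so it also works in bash.

```shell
# prepend_path VAR DIR
# Prepends DIR to the colon-separated variable named VAR and exports it.
# Does nothing if DIR is already present; never creates an empty entry.
prepend_path() {
    _var=$1
    _dir=$2
    eval "_cur=\${$_var:-}"                  # current value of VAR (may be empty)
    case ":$_cur:" in
        *":$_dir:"*) return 0 ;;             # already present, leave VAR unchanged
    esac
    _new="$_dir${_cur:+:$_cur}"              # add ':' only when VAR was non-empty
    eval "export $_var=\"\$_new\""
}

# Paths used in this procedure (CUDA 11.6 default install prefix):
prepend_path PATH /usr/local/cuda-11.6/bin
prepend_path LD_LIBRARY_PATH /usr/local/cuda-11.6/lib64
```

Placed in ~/.bashrc in place of the two plain assignments, this keeps PATH and LD_LIBRARY_PATH stable across repeated "source ~/.bashrc" calls.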
c. Install CUDA sample packages
# cd /usr/local/cuda-11.6/samples
# git clone https://github.com/NVIDIA/cuda-samples.git
# mv /usr/local/cuda-11.6/samples/cuda-samples/* /usr/local/cuda-11.6/samples/
# make SMS="75 80" -i
Note. The T4 is SM 75 (Turing). '-i' tells make to ignore errors from samples that target architectures this toolkit does not support,
such as "nvcc fatal : Unsupported gpu architecture 'compute_90'"
Note. GPU SM / GPU Architecture reference
https://arnon.dk/matching-sm-architectures-arch-and-gencode-for-various-nvidia-cards/
$ sudo /usr/bin/nvidia-persistenced --verbose
$ sudo systemctl start graphical.target
# deviceQueryDrv
# nvcc -V
# nbody -benchmark -fp64 -device=<n>
Note. <n> is the GPU index (0 or 1 on this system).
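As a final check, the number of GPUs the driver sees can be compared with the two T4 cards installed in this system. The sketch below assumes "nvidia-smi -L" prints one "GPU <index>: <name> (UUID: ...)" line per device; the helper name count_gpus is hypothetical.

```shell
# count_gpus PATTERN
# Reads "nvidia-smi -L" style output on stdin and prints how many
# "GPU <n>: ..." lines contain PATTERN.
count_gpus() {
    awk -v pat="$1" '/^GPU [0-9]+:/ && index($0, pat) > 0 { n++ } END { print n + 0 }'
}

expected=2   # this system has two T4 cards
if command -v nvidia-smi >/dev/null 2>&1; then
    found=$(nvidia-smi -L | count_gpus "T4")
    if [ "$found" -eq "$expected" ]; then
        echo "OK: $found T4 GPU(s) visible"
    else
        echo "WARNING: expected $expected T4 GPU(s), found $found"
    fi
else
    echo "nvidia-smi not found; is the driver installed?"
fi
```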