CudaStresser
CudaStresser is a Python-based tool designed to stress test CUDA-enabled GPUs. It performs various operations to measure GPU memory bandwidth, stress the device, and log GPU utilization metrics. This tool is useful for developers, researchers, and system administrators who want to benchmark or stress test their GPU hardware.
Features
- CUDA Core Stress Testing: Create and manipulate tensors on the GPU to stress test the CUDA cores.
- VRAM Load and Unload: Simulate heavy memory operations to test GPU memory bandwidth and consistency.
- Real-time GPU Monitoring: Log GPU utilization, memory usage, and temperature during the stress test.
- Multiprocessing: Efficiently utilize multiple CPU cores for concurrent GPU stress testing.
Requirements
- Python 3.7+
- PyTorch 1.7+
- CUDA-enabled GPU(s)
- NVIDIA drivers with nvidia-smi available
Installation
- Clone the repository:
  git clone https://github.com/your-username/cuda-stresser.git
  cd cuda-stresser
- Create and activate a virtual environment (optional but recommended):
  python3 -m venv venv
  source venv/bin/activate  # On Windows: venv\Scripts\activate
- Install the required Python packages:
  pip install torch
- Ensure nvidia-smi is available: make sure nvidia-smi is accessible in your system's PATH. It is usually installed with the NVIDIA drivers.
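The prerequisites above can be checked before running a test. The following is a minimal sketch, not part of CudaStresser itself: the function name check_environment is hypothetical, and it only uses the standard library plus `torch.cuda.is_available()` when PyTorch is installed.

```python
import importlib.util
import shutil


def check_environment() -> dict:
    """Report which prerequisites appear to be present on this machine."""
    status = {
        # shutil.which returns None when the executable is not on PATH.
        "nvidia_smi_on_path": shutil.which("nvidia-smi") is not None,
        "torch_installed": importlib.util.find_spec("torch") is not None,
        "cuda_available": False,
    }
    if status["torch_installed"]:
        import torch  # imported lazily so the check also runs without PyTorch

        status["cuda_available"] = bool(torch.cuda.is_available())
    return status


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```

If any entry is reported as missing, revisit the matching installation step before starting a stress test.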
Usage
Basic Stress Test
To perform a basic CUDA stress test:
python cuda_stresser.py
This command will:
- Detect available CUDA devices.
- Stress test the CUDA cores by creating and manipulating tensors for 60 seconds.
- Log GPU utilization, memory usage, and temperature every 5 seconds.
- Display the progress and results in the console.
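The README does not show how the metrics are collected. One plausible approach, sketched here as an assumption and not as the script's actual implementation, is to call nvidia-smi with its `--query-gpu` option and parse the CSV output. The helper names parse_gpu_csv and query_gpus are hypothetical.

```python
import subprocess
from typing import Dict, List, Optional

# Fields requested from nvidia-smi, in output column order.
QUERY_FIELDS = [
    "index",
    "utilization.gpu",
    "memory.used",
    "memory.total",
    "temperature.gpu",
]


def _to_number(value: str) -> Optional[float]:
    """Convert one CSV cell to float; return None for values like '[N/A]'."""
    try:
        return float(value)
    except ValueError:
        return None


def parse_gpu_csv(text: str) -> List[Dict[str, Optional[float]]]:
    """Parse 'csv,noheader,nounits' output into one dict per GPU."""
    rows = []
    for line in text.strip().splitlines():
        values = [v.strip() for v in line.split(",")]
        if len(values) != len(QUERY_FIELDS):
            continue  # skip lines that do not match the requested fields
        rows.append({name: _to_number(v) for name, v in zip(QUERY_FIELDS, values)})
    return rows


def query_gpus() -> List[Dict[str, Optional[float]]]:
    """Run nvidia-smi once and return the parsed metrics (needs an NVIDIA driver)."""
    cmd = [
        "nvidia-smi",
        "--query-gpu=" + ",".join(QUERY_FIELDS),
        "--format=csv,noheader,nounits",
    ]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return parse_gpu_csv(result.stdout)
```

With `nounits`, memory values are reported in MiB, utilization in percent, and temperature in degrees Celsius. Keeping the parser separate from the subprocess call makes it testable on machines without a GPU.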
Custom Stress Test
You can customize the test parameters using the CudaStresser class:
from cuda_stresser import CudaStresser
# Initialize the stresser with 99% VRAM load
stresser = CudaStresser(load_perc=0.99)
# Perform a stress test for 60 seconds with 15 tensors
results = stresser.cuda_stress(timing=60, tensor_num=15)
print(results)
# Perform a VRAM load/unload test for 60 seconds
vram_results = stresser.cuda_load_unload(timing=60)
print(vram_results)
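For readers curious what "creating and manipulating tensors" might involve, here is a rough sketch of a matmul-based stress loop in PyTorch. It is an illustration under assumptions, not the code behind `cuda_stress`: the function name matmul_stress, the matrix size, and the choice of matrix multiplication as the workload are all hypothetical.

```python
import time

try:
    import torch
except ImportError:  # guard only so this sketch imports cleanly without PyTorch
    torch = None


def matmul_stress(seconds: float = 5.0, tensor_num: int = 4, size: int = 2048):
    """Multiply random square matrices on the first GPU for `seconds`.

    Returns the number of matmuls issued, or None when PyTorch or a
    CUDA device is not available.
    """
    if torch is None or not torch.cuda.is_available():
        return None
    device = torch.device("cuda:0")
    tensors = [torch.rand((size, size), device=device) for _ in range(tensor_num)]
    count = 0
    deadline = time.monotonic() + seconds
    while time.monotonic() < deadline:
        for t in tensors:
            torch.matmul(t, t)  # result is discarded; only the work matters
            count += 1
        # CUDA kernels run asynchronously; wait so the deadline is respected.
        torch.cuda.synchronize(device)
    return count
```

Larger `size` and `tensor_num` values increase both compute load and memory use, so start small on shared machines.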
Logging GPU Information
During the stress test, the script logs GPU utilization, memory usage, and temperature data. The logs are printed to the console; the script can be modified to save them to a file instead.
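As one way to save the log to a file, the sketch below writes log entries to CSV with the standard library. The format of the log returned by CudaStresser is not documented here, so this assumes a list of dictionaries that all share the same keys; the function name save_gpu_log is hypothetical.

```python
import csv
from typing import Dict, Iterable


def save_gpu_log(entries: Iterable[Dict[str, object]], path: str) -> int:
    """Write log entries to a CSV file and return the number of rows written.

    Assumes every entry has the same keys as the first one.
    """
    rows = list(entries)
    if not rows:
        return 0  # nothing to write; leave any existing file untouched
    fieldnames = list(rows[0].keys())
    with open(path, "w", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

If the actual log has a different shape, convert it to a list of dictionaries first.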
Customization
Parameters
- load_perc: Share of VRAM to load, given as a fraction (default: 0.99, i.e. 99%).
- timing: Duration of the stress test in seconds (default: 60).
- tensor_num: Number of tensors to create for the stress test (default: 1000).
- poll_time: Interval for logging GPU data in seconds (default: 5).
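To illustrate how timing and poll_time relate, the sketch below collects one sample every `poll_time` seconds until `timing` seconds have passed. It is an assumption about the general pattern, not the script's own loop; run_with_polling and its `sample`, `clock`, and `sleep` arguments are hypothetical names, with the clock and sleep injectable so the loop can be tested without waiting.

```python
import time
from typing import Callable, Dict, List


def run_with_polling(
    timing: float,
    poll_time: float,
    sample: Callable[[], Dict[str, object]],
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> List[Dict[str, object]]:
    """Call `sample` every `poll_time` seconds for `timing` seconds in total."""
    log: List[Dict[str, object]] = []
    start = clock()
    while True:
        elapsed = clock() - start
        if elapsed >= timing:
            break
        entry: Dict[str, object] = {"elapsed_s": round(elapsed, 3)}
        entry.update(sample())
        log.append(entry)
        # Never sleep past the end of the test window.
        sleep(min(poll_time, timing - elapsed))
    return log
```

With the defaults listed above (timing of 60 and poll_time of 5), this pattern yields about 12 samples, taken at roughly 0, 5, ..., 55 seconds.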
Example with Custom Parameters
stresser = CudaStresser(load_perc=0.90)
gpu_log = stresser.cuda_stress(timing=120, tensor_num=20, poll_time=10)
print("GPU Log:", gpu_log)
License
This project is licensed under the MIT License. See the LICENSE file for details.
Acknowledgements
Special thanks to the developers and contributors of PyTorch for providing an excellent framework for machine learning and GPU computing.
Note: This tool is intended for testing and benchmarking purposes. Use it responsibly, especially on production systems, as it can put significant load on your hardware.