Okay, I found it, and I am sorry for wasting time.
services:
  ray-worker:
    build:
      context: .
      dockerfile: Dockerfile
    container_name: ray-worker
    restart: unless-stopped
    volumes:
      - /opt/models:/models
    environment:
      - VLLM_HOST_IP=${workerIp}
      - NCCL_DEBUG=INFO
      - RAY_DEDUP_LOGS=0
      - NCCL_NET=Socket
      - NCCL_IB_DISABLE=0
    network_mode: host
    ipc: host
    shm_size: '100gb'
    devices:
      - nvidia.com/gpu=all
You don’t need to define `gpus: all`; it is enough to list the devices you want to attach to the container under `devices`.
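If you only want some of the GPUs in the container, the same `devices` list should take individual entries instead of `all`. This is a minimal sketch that I have not run myself. It assumes the NVIDIA Container Toolkit has generated a CDI specification on the host and that GPU indices 0 and 1 exist there; both indices are illustrative only.

```yaml
services:
  ray-worker:
    devices:
      # CDI device names in the same form as nvidia.com/gpu=all above.
      # Indices 0 and 1 are assumptions; use the names your host exposes.
      - nvidia.com/gpu=0
      - nvidia.com/gpu=1
```

The rest of the service definition stays as in the file above.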
Thank you, @malloc, for the suggestion. Your input got me thinking outside the box.
Marking this as the solution.