Would like to move to https://github.com/rug-cit-hpc/pg-playbooks but has large files...
E.M.A. Rijpkema 493afa29ae Merge pull request 'Remove ESX vulture nodes from Ansible hosts file' (#28) from fix/remove_vcpu_nodes into master 2 months ago
documentation Updated prometheus documentation a little. 5 months ago
files
group_vars/all
promtools Made build work again. 9 months ago
roles Merge pull request 'Removed vulture nodes that have been terminated.' (#27) from fix/remove-vulture into master 2 months ago
.gitignore
ansible.cfg
apps.yml
common.yml
compute_node.yml
disablerepo.yml
etc_hosts.yml
firmware.yml
firmware630.yml
firmware920.yml
firmware7425.yml
gpu.yml
gpu_detector.yml
haswell_sym.yml
hosts remove esx nodes from vulture 2 months ago
hosts-dev
inefficient_jobs_detector.yml
interactive.yml
ipmi_exporter.yml
kernel.yml
kill_memory_hogs.yml
ldap_client.yml
login.yml
lustre_client.yml
lustre_exporter.yml
metadata.yml
node_exporter.yml
nvidia-exporter.yml
nvidia_smi_exporter.yml
pg-packages.yml Added tree to the list of tools. 6 months ago
pg-tools.yml
prom_sql.yml
prometheus.yml
readme.md
remount_apps.py
sandybridge_sym.yml
site.yml
skylake_sym.yml
slurm.yml
slurm_client.yml
slurm_exporter.yml
sudo_lecture.yml
test-hosts
tmp
umount.yml
update.yml
updateeth.yml

readme.md

Ansible playbooks for Peregrine

This repository contains an inventory and Ansible playbooks for the Peregrine cluster.

Install Slurm.

To install the Slurm server:

ansible-playbook  --vault-password-file=.vault_pass.txt  slurm.yml
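The command above reads the vault password from .vault_pass.txt. As a minimal sketch, a small wrapper could check that the file is readable before starting a run. The function name run_slurm_install and its optional argument are hypothetical and not part of this repository.

```shell
# Hypothetical helper, not part of this repository.
run_slurm_install() {
    # Path to the vault password file; defaults to the name used in this readme.
    vault_file="${1:-.vault_pass.txt}"
    if [ ! -r "$vault_file" ]; then
        echo "vault password file not readable: $vault_file" >&2
        return 1
    fi
    ansible-playbook --vault-password-file="$vault_file" slurm.yml
}
```

Calling run_slurm_install from the repository root then behaves like the command above, but fails early with a clear message when the password file is missing.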

Skip building of docker images.

Building the docker images takes a lot of time and is only necessary when the docker file has been changed. You can skip it with the following command.

ansible-playbook --vault-password-file=.vault_pass.txt slurm.yml  --skip-tags build

Furthermore, you can prevent the services from starting immediately by providing the --skip-tags start-service flag.
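Both tags can be skipped in a single run, since --skip-tags accepts a comma-separated list. A small sketch, assuming you want to inspect the command before running it; the variable names SKIP_TAGS and CMD are only for illustration.

```shell
# Tags to skip; both are named in this readme.
SKIP_TAGS="build,start-service"

# Assemble the full command line so it can be inspected first.
CMD="ansible-playbook --vault-password-file=.vault_pass.txt slurm.yml --skip-tags $SKIP_TAGS"

# Print the command; run it with: eval "$CMD"
echo "$CMD"
```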

Setting the state of a single node.

If you want to bring a node's configuration up to date, for example after it has been rolled out via xCAT, you can run the following command.
This will configure all state for that node (the node exporter for Prometheus and, if it is a GPU node, GPU monitoring, etc.).

ansible-playbook --limit pg-node023 site.yml
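The --limit option also accepts several hosts separated by commas. The sketch below wraps the command above in a function that refuses to run without at least one node name, so that site.yml is not applied to the whole inventory by accident. The function update_nodes, the PLAYBOOK_CMD variable and the node name pg-node024 are hypothetical and only serve as an illustration.

```shell
# Hypothetical helper, not part of this repository.
update_nodes() {
    if [ "$#" -eq 0 ]; then
        echo "usage: update_nodes NODE [NODE...]" >&2
        return 2
    fi
    # Join the node names with commas to form the --limit pattern.
    local IFS=,
    # PLAYBOOK_CMD can be set to "echo" to print the arguments instead of running.
    "${PLAYBOOK_CMD:-ansible-playbook}" --limit "$*" site.yml
}
```

For example, update_nodes pg-node023 pg-node024 would run site.yml for those two nodes only. Adding the standard --check flag to the ansible-playbook call would report changes without applying them, as far as the modules involved support check mode.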