191 Commits (aa545626eba3aba9096ae1a8de4c21fdacfbbc3b)
 

Author SHA1 Message Date
Egon Rijpkema cad42543d2 Make it easy to bring a node up to standard. 4 years ago
Egon Rijpkema a5f0faea8e We have new pg-oss machines. 4 years ago
Egon Rijpkema 6c10847bdb osts are now called oss (as that's what they are) 4 years ago
Egon Rijpkema 09da454619 Implemented better timekeeping checks. 4 years ago
G.J.C. Strikwerda 296160d6de new host file 4 years ago
F. Dijkstra 44030cd32e * Removed the target partition, because gpfs3 is no longer mounted. 4 years ago
Egon Rijpkema 0813da7498 Increased permissions, so container can start. 4 years ago
Egon Rijpkema 015112cf5b Increased retention. 4 years ago
Egon Rijpkema 3dd30c02fe Forgot role.. 4 years ago
Egon Rijpkema 071791aa72 Started documenting prometheus. 4 years ago
F. Dijkstra a81c5f40c3 Added account limitations to all partitions. 4 years ago
F. Dijkstra 737b25644c Added gelifeslong qos to gelifes partition. 4 years ago
F. Dijkstra dce552004d Bug fixes, and added qos gelifeslong to be consistent with other 4 years ago
F. Dijkstra 83fdc73a90 Added gelifes and test partition. 4 years ago
F. Dijkstra 78e755fc5c Fixed the number of gelifes nodes 4 years ago
F. Dijkstra 712d5e6c98 Added gelifes partition 4 years ago
Egon Rijpkema f478d26479 Disabled lustre exporter (it fails) 4 years ago
Egon Rijpkema 3c646a73c6 Added ipmi monitoring 4 years ago
Egon Rijpkema 297448fbbb with_items needs an "{{item}}" 4 years ago
Egon Rijpkema 73577e7715 Increased retention. 4 years ago
Egon Rijpkema 843ce04a1c Enabled ost exporter and set correct group name 4 years ago
Egon Rijpkema ec67fe2af2 added identifier for knyft 4 years ago
Egon Rijpkema a62ebcbde7 Using a prometheus server on knyft instead of prox 4 years ago
Egon Rijpkema 7681f1bbd3 Added lustre exporter for datahandling 4 years ago
root 5accaf9e6a change group dh-storage to dh_storage 4 years ago
R. Teeninga 5cb9835040 Added pg-node211:225 4 years ago
Egon Rijpkema 39b7ef7bd0 Added a prometheus exporter for slurm. 4 years ago
Egon Rijpkema c028602a6e This playbook is still needed for the metadata role. 4 years ago
Egon Rijpkema fb3b3399df Added nvidia_smi_exporter for prometheus 4 years ago
Egon Rijpkema 09b9996c65 Made sure the services are restarted after reboot. 4 years ago
Egon Rijpkema 727c69f1e4 added recipe for lustre exporter 4 years ago
root 700d6fd2fe playbooks for /software 4 years ago
Egon Rijpkema f25f881883 UnkillableStepTimeout=120 4 years ago
Egon Rijpkema c10dc00edd Ignoring fuse sshfs mounts. 4 years ago
Egon Rijpkema 44583da8c3 Speed things up a bit. 4 years ago
Egon Rijpkema 8368e6b38e Made separation that made a bit more sense. 4 years ago
Egon Rijpkema 85d2c44206 Added profiling for slurm 4 years ago
E.M.A. Rijpkema 29a841fa2e Merge branch 'feature/drop_caches' of HPC/pg-playbooks into master 4 years ago
Egon Rijpkema 07b893befc A cronjob to drop caches on the metadata servers. 4 years ago
Egon Rijpkema bf27030b9d slurmctld is now standalone? 4 years ago
Egon Rijpkema e9f749f7fa Some changes for the new docker packages. 4 years ago
Egon Rijpkema 1ead4c42d1 Made the search a little less naive... 4 years ago
Egon Rijpkema 6b8503d559 Prevent error mail when flag file does not exist. 4 years ago
Egon Rijpkema 43cfd2ccff Prevent sending a mail every minute. 4 years ago
Egon Rijpkema 102c7e622b Updated mail address. 4 years ago
Egon Rijpkema f37aedf6ff Refactored the touch alert to own role. 4 years ago
Egon Rijpkema f42c6b4f74 Updated deprecated include to import_tasks 4 years ago
Egon Rijpkema ba0f6445b7 Merge branch 'feature/find-alert' 4 years ago
Egon Rijpkema 63a5816cef Added ansible automation. 4 years ago
Egon Rijpkema 5545baf9f4 Alerting script 4 years ago