378 Commits (master)

Author SHA1 Message Date
B.E. Droge 3320c4d570 Merge pull request 'Increased timeout for not using the GPU to 4 hours' (#20) from feature/increased_gpu_timeout into master 4 weeks ago
F. Dijkstra 0e1fc73cca Increased timeout for not using the GPU to 4 hours, since 4 weeks ago
Egon Rijpkema 700c7fd0a6 Updated prometheus documentation a little. 1 month ago
E.M.A. Rijpkema 454061659a Merge pull request 'Add PrivateData setting to slurmdbd.conf and slurm.conf' (#18) from feature/privatedata into master 1 month ago
F. Dijkstra 63d5c01d59 Added PrivateData setting to slurm.conf as setting it only in 1 month ago
F. Dijkstra fe09b7faf5 Added users to PrivateData, as usage on itself did not have the 2 months ago
F. Dijkstra 80c0533eb6 Added the parameter PrivateData to prevent regular users from seeing 2 months ago
G.J.C. Strikwerda 32dc935e4c Merge pull request 'Added tree to the list of tools.' (#17) from feature/tree into master 2 months ago
F. Dijkstra 2b0c012502 Added tree to the list of tools. 2 months ago
Egon Rijpkema 5dc4274e96 Added new prometheus cert for knyft. 3 months ago
Egon Rijpkema 210c8a6911 Made build work again. 4 months ago
Egon Rijpkema c70e4a4af9 Lustre exporter is extremely verbose. 4 months ago
B.E. Droge 7e43402cb0 set pg-node247 and 269 to FUTURE 6 months ago
B.E. Droge 3e775df7a7 remove dh-node11 and 19 6 months ago
root 5074348f17 slurmd_restart should actually restart (not reload) slurmd 6 months ago
root 6735ac1e69 add config tag to config-related steps, do restart of slurmd 6 months ago
B.E. Droge a4cb09cd33 Merge branch 'master' of ssh://git.web.rug.nl:222/HPC/pg-playbooks 6 months ago
B.E. Droge ecc56268c4 remove xdmod scripts 6 months ago
B.E. Droge 39e4b8ad77 disable task affinity for cgroups 6 months ago
B.E. Droge df5090ca69 decrease tmpdisk values to a close power of 10 6 months ago
B.E. Droge c29670ceaf Add TmpFS=/local and TmpDisk values for nodes 6 months ago
root 41f075af42 fix syntax error 6 months ago
root 1bb6ca0329 update db password 6 months ago
root b965d07018 split single slurm logrotate setting into two separate ones 6 months ago
root 56ac7e9194 fix deprecation warning for loop in yum module 6 months ago
root a3eb7a3e72 bump slurm version 6 months ago
B.E. Droge c2fc2e779a make slurm user owner of slurmdbd.conf 6 months ago
B.E. Droge e7cf23fb7d change mode of slurmdbd.conf 6 months ago
B.E. Droge 5e730f4364 move node 267 back to generic list with esx nodes 8 months ago
B.E. Droge 6e684d9056 move nodes with broken ib to vulture, set merlin nodes to future 8 months ago
Egon Rijpkema d2d799cf56 Do not crash when no usage data for a gpu is available. 9 months ago
Egon Rijpkema e23f29f39e Added alerts for ceph health status. 9 months ago
Egon Rijpkema 3d5120363e Scrape ceph on the merlin-management001 9 months ago
Egon Rijpkema dba8e6269b Added a slurmdbd storage pass 9 months ago
E.M.A. Rijpkema 335e60087c Merge pull request 'Use NodeSets in SLURM config' (#16) from nodesets into master 1 year ago
B.E. Droge 731fe3d802 Use nodesets, and move non-ib nodes to vulture 1 year ago
B.E. Droge 746d716385 modify link to scientific papers that acknowledge peregrine 1 year ago
Egon Rijpkema c6fcf9ca27 When it is actually full, send an alert about /tmp 1 year ago
B.E. Droge d801c55ff4 fix typo in lua script 1 year ago
B.E. Droge 3e361b5cac set max_rpc_cnt=150 1 year ago
B.E. Droge 9347bd9238 set messagetimeout to 30 1 year ago
B.E. Droge 68806f782e set maxnodes=1 for gpushort 1 year ago
B.E. Droge 9731cc7b04 Merge pull request 'Added gpushort partition and removed pg-gpu06 from the list of nodes.' (#15) from gpushort into master 1 year ago
B.E. Droge c5722f5c89 Merge pull request 'Moved the location of the job private temporary directory from /local to /local/tmp.' (#14) from localdir into master 1 year ago
F. Dijkstra 9267d5ffbc Added missing plugstack.conf change. The private tmpdir is now taken 1 year ago
F. Dijkstra 90a5552b47 Changed the limit for short jobs in the gpu partition to 2 hours, 1 year ago
F. Dijkstra ae3532d3f8 Added 2nd node to gpushort partition. 1 year ago
F. Dijkstra 0bc104ad17 Fixed a typo. 1 year ago
F. Dijkstra 795577652c Added gpushort partition and corresponding qos. This to be able to 1 year ago
F. Dijkstra 5599223993 Moved the location of the job private temporary directory from /local to /local/tmp. 1 year ago