[00:53:49] I tried to restart one of my tools on tools labs (wsexport) and the restart timed out. Does someone have an idea of what could be the cause of it?
[00:54:26] I've started to make some changes in the lighttpd config, but it keeps failing even after rolling back these changes
[00:55:02] Coren: ping
[01:00:12] Tpt: I think we are currently maxed out on the webserver grid
[01:00:21] Labs, Tool-Labs: Unable to boot Ruby app on tool labs - https://phabricator.wikimedia.org/T109322#1547901 (scfc) Shell functions are only visible and usable in shell scripts. That the syntax for calling a shell function looks indistinguishably similar to that for calling an executable (or an alias, or …)...
[01:00:33] Betacommand: thank you
[01:00:46] Is it possible to kick someone out to get wsexport running?
[01:01:14] it has a large number of users
[01:01:19] Tpt: ops took at least one out of service, and realized we were really really close to maxing out available nodes
[01:01:48] ok
[01:02:07] Tpt: drop a note to the mailing list, see if you can shake something loose
[01:02:27] Labs, Tool-Labs: Unable to boot Ruby app on tool labs - https://phabricator.wikimedia.org/T109322#1547902 (MusikAnimal) @scfc @sitic I got it! It is up and running baby!! I managed to do this mainly through trial and error, finally getting my `httpserver.sh` like: ``` #!/bin/bash export PATH="$HOME/.rben...
[01:02:30] If I kill one of my other tools, do you think I'll be able to get wsexport back
[01:02:33] ?
[01:02:55] Tpt: unknown
[02:07:32] Labs, Labs-Infrastructure, Database: Different results with queries in labs versus production - https://phabricator.wikimedia.org/T74413#1547934 (Krenair) Checked db2034.codfw.wmnet, labsdb1001.eqiad.wmnet, labsdb1002.eqiad.wmnet, and labsdb1003.eqiad.wmnet. All return 8027, so assuming this got fixed...
[02:07:39] Labs, Labs-Infrastructure, Database: Different results with queries in labs versus production - https://phabricator.wikimedia.org/T74413#1547937 (Krenair) Open>Resolved
[02:18:42] hello, my Ruby tool is alllllmost ready to go :) only problem is the webserver fails to start when I try to pass in the config file along with the command to load the app to portgrabber
[02:18:47] so `exec portgrabber musikanimal unicorn -c unicorn.rb -E production -p`
[02:19:13] unicorn.rb definitely exists locally, and I can run that command directly and unicorn starts
[02:19:41] since when I run `jstart` I'm going through trusty, is it possible the unicorn.rb file isn't present there?
[02:20:19] httpserver.err and .out don't offer clues
[02:22:03] Labs, Labs-Infrastructure: Replica MySQL: Views completely missing from some wiki's - https://phabricator.wikimedia.org/T73041#1547942 (Krenair) Is there anything left to fix here?
[02:23:10] Labs, wikitech.wikimedia.org: Keystone tokens truncated when wikitech stores them - https://phabricator.wikimedia.org/T92014#1547944 (Krenair) Bump.
[03:17:16] Labs, Labs-Infrastructure: Replica MySQL: Views completely missing from some wiki's - https://phabricator.wikimedia.org/T73041#1547993 (Pathoschild) This seems to be resolved.
[03:19:05] Labs, Labs-Infrastructure: Replica MySQL: Views completely missing from some wiki's - https://phabricator.wikimedia.org/T73041#1547996 (Krenair) Open>Resolved
[03:29:40] Labs, Labs-Infrastructure: Database upgrade MariaDB 10: Engine / Option mismatch on table `user_properties` - https://phabricator.wikimedia.org/T70942#1547998 (Krenair) It looks like labsdb100[13] run 10.0.15 and labsdb1002 runs 10.0.16. Is this resolved now?
[03:43:22] Labs, Database: Database replicas: replicate user.user_touched - https://phabricator.wikimedia.org/T92841#1548018 (Krenair)
[03:51:07] Labs, Labs-Infrastructure, Database: Database upgrade MariaDB 10: Discrepancies with logging table on different wikis - https://phabricator.wikimedia.org/T71127#1548025 (Krenair)
[07:26:25] !log tools andrewbogott built tools-webgrid-lighttpd-1411 yesterday but it's not actually added as exec host. Trying to figure out how to do that...
[07:26:28] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL, Master
[07:31:56] !log tools reading puppet suggests I should qconf -ah /var/lib/gridengine/etc/exechosts/tools-webgrid-lighttpd-1411.tools.eqiad.wmflabs but that file is missing?
[07:31:59] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL, Master
[07:37:02] !log applying role::labs::tools::compute and toollabs::node::web::generic to \tools-webgrid-lighttpd-1411
[07:37:02] applying is not a valid project.
[07:37:07] !log tools applying role::labs::tools::compute and toollabs::node::web::generic to \tools-webgrid-lighttpd-1411
[07:37:10] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL, Master
[07:53:07] !log tools Setting up adminbot (1.7.8) ... chmod: cannot access '/usr/lib/adminbot/README': No such file or directory --- ran sudo touch /usr/lib/adminbot/README
[07:53:11] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL, Master
[07:54:32] !log various issues with Error: /Stage[main]/Gridengine::Submit_host/File[/var/lib/gridengine/default/common/accounting]/ensure: change from absent to link failed: Could not set 'link' on ensure: No such file or directory - /var/lib/gridengine/default/common at 17:/etc/puppet/modules/gridengine/manifests/submit_host.pp; probably an ordering issue in puppet
[07:54:32] various is not a valid project.
[07:54:40] !log tools various issues such as Error: /Stage[main]/Gridengine::Submit_host/File[/var/lib/gridengine/default/common/accounting]/ensure: change from absent to link failed: Could not set 'link' on ensure: No such file or directory - /var/lib/gridengine/default/common at 17:/etc/puppet/modules/gridengine/manifests/submit_host.pp; probably an ordering issue in
[07:54:41] puppet
[07:54:43] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL, Master
[07:55:33] !log tools argh. Disabling toollabs::node::web::generic again and enabling toollabs::node::web::lighttpd
[07:55:36] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL, Master
[08:00:33] !log tools running puppet agent -tv again
[08:00:36] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL, Master
[08:04:05] !log tools installing packages from /data/project/.system/deb-trusty seems to fail. sudo apt-get update helps.
[08:04:07] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL, Master
[08:06:39] !log tools ok, success. /var/lib/gridengine/etc/exechosts/tools-webgrid-lighttpd-1411.tools.eqiad.wmflabs now exists. Do I still have to add it manually to the grid? I suppose so.
[08:06:43] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL, Master
[08:07:14] !log tools sudo qconf -Ae /var/lib/gridengine/etc/exechosts/tools-webgrid-lighttpd-1411.tools.eqiad.wmflabs -> root@tools-bastion-01.eqiad.wmflabs added "tools-webgrid-lighttpd-1411.tools.eqiad.wmflabs" to exechost list
[08:07:17] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL, Master
[08:10:08] !log tools /var/lib/gridengine/etc/queues/webgrid-lighttpd does not seem to be the correct configuration as the current config refers to '@webgrid' as host list.
[08:10:12] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL, Master
[08:13:58] !log and the hostgroup @webgrid doesn't even exist? (╯°□°)╯︵ ┻━┻
[08:13:59] and is not a valid project.
[08:14:02] !log tools and the hostgroup @webgrid doesn't even exist? (╯°□°)╯︵ ┻━┻
[08:14:05] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL, Master
[08:20:27] !log tools sudo qconf -mhgrp "@webgrid", added tools-webgrid-lighttpd-1411.eqiad.wmflabs
[08:20:31] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL, Master
[08:21:32] !log tools still sudo qmod -e "*@tools-webgrid-lighttpd-1411.tools.eqiad.wmflabs" -> invalid queue "*@tools-webgrid-lighttpd-1411.tools.eqiad.wmflabs"
[08:21:35] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL, Master
[08:23:13] Labs, Tool-Labs: Enable tools-webgrid-lighttpd-1411.tools.eqiad.wmflabs and build more webgrid hosts - https://phabricator.wikimedia.org/T109412#1548243 (valhallasw) NEW
[08:23:18] * valhallasw`cloud gives up
[08:30:56] !log tools hostname mismatch: host is called tools-webgrid-lighttpd-1411.tools.eqiad.wmflabs in config, but it was named tools-webgrid-lighttpd-1411.eqiad.wmflabs in the hostgroup config
[08:31:01] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL, Master
[08:33:52] !log tools tools-webgrid-lighttpd-1403.eqiad.wmflabs, tools-webgrid-lighttpd-1404.eqiad.wmflabs and tools-webgrid-lighttpd-1411.tools.eqiad.wmflabs are all broken (queue dropped because it is temporarily not available)
[08:33:54] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL, Master
[08:37:00] Labs, Tool-Labs: Enable tools-webgrid-lighttpd-1411.tools.eqiad.wmflabs and build more webgrid hosts - https://phabricator.wikimedia.org/T109412#1548252 (valhallasw) ``` queue instance "webgrid-lighttpd@tools-webgrid-lighttpd-1403.eqiad.wmflabs" dropped because it is temporaril...
[08:37:15] !log sudo service gridengine-exec start on tools-webgrid-lighttpd-1404.eqiad.wmflabs" tools-webgrid-lighttpd-1406.eqiad.wmflabs" tools-webgrid-lighttpd-1411.tools.eqiad.wmflabs"
[08:37:15] sudo is not a valid project.
[08:37:18] !log tools sudo service gridengine-exec start on tools-webgrid-lighttpd-1404.eqiad.wmflabs" tools-webgrid-lighttpd-1406.eqiad.wmflabs" tools-webgrid-lighttpd-1411.tools.eqiad.wmflabs"
[08:37:21] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL, Master
[08:44:58] andrewbogott: the answer to the question 'how does one configure a new webgrid host' is summarized in my cursing over the last 90 minutes
[08:46:07] I'll try to make an actual guide out of it. As a note for today; we have to make sure the exec daemon actually starts on the exec hosts after reboot. Not entirely sure how to check easily, but qstat -j seems to list a stack of queues that are 'temporarily unavailable' == master cannot communicate to host
[09:03:08] Hi
[09:03:34] valhallasw`cloud: is there a bug or something for me to catch up to what's going on?
[09:05:16] YuviPanda: https://phabricator.wikimedia.org/T109412#1548252 + SAL
[09:06:40] Oook
[09:06:46] Reading through now
[09:06:59] YuviPanda: basically, webservice jobs didn't start
[09:07:16] andrew fired up a new host but didn't know how to configure it
[09:07:29] I tried to configure it this morning, but the lack of docs made that difficult
[09:07:56] in the end I noticed the exec daemon wasn't running on the new host + two existing webservice hosts, so I also started those
[09:16:33] valhallasw`cloud: did that fix things?
[09:16:44] I think so, but I'm not entirely sure how to get a queue status
[09:17:02] but there's jobs running on tools-webgrid-lighttpd-1411.tools.eqiad.wmflabs
[09:17:04] so I suppose so
[09:17:11] (qhost -h tools-webgrid-lighttpd-1411.tools.eqiad.wmflabs -j)
[09:18:00] wooo
[09:18:12] https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Admin should have new docs...
[09:18:12] sigh
[09:19:02] YuviPanda: see also https://phabricator.wikimedia.org/T104734
[09:19:38] Labs, Tool-Labs: Document adding new nodes - https://phabricator.wikimedia.org/T109417#1548317 (yuvipanda) NEW
[09:19:45] Labs, Tool-Labs: Tool Labs virt reboot checklist - https://phabricator.wikimedia.org/T108669#1548324 (valhallasw)
[09:19:46] Labs, Tool-Labs: [tracking] Tool labs admin guides - https://phabricator.wikimedia.org/T104734#1548323 (valhallasw)
[09:20:06] YuviPanda: I think it's fine puppet doesn't add the nodes automatically
[09:20:10] but there should be a cheeecklist
[09:20:13] I love checklists
[09:21:21] yeah I agree
[09:21:34] Labs, Tool-Labs: 'new exec node' checklist - https://phabricator.wikimedia.org/T109417#1548325 (valhallasw)
[09:22:15] valhallasw`cloud: <3
[09:22:39] valhallasw`cloud: a webservice start seems to work for me
[09:22:49] ok
[09:23:01] so that's two unplanned outages in one week. fun fun fun
[09:23:38] (I consider http://thread.gmane.org/gmane.org.wikimedia.labs/3954 an unplanned outage as well)
[09:24:05] so I should probably do a post-mortem somewhere?
[09:27:36] valhallasw`cloud: two?
[09:27:47] valhallasw`cloud: yeah, etherpad then wikitech?
[09:28:18] YuviPanda: (1) massive number of jobs disappearing from grid due to qmod -rj, (2) webservices not starting due to servers being down
[09:35:16] ah that
[09:58:36] Labs: Create a checkpoint check for labs LDAP - https://phabricator.wikimedia.org/T107454#1548428 (yuvipanda) How does it create a dependency loop? How does toolschecker depend on ldap?
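Editor's sketch of the 'new exec node' checklist asked for above, built only from the commands that appear in the SAL entries in this log. The ordering, the `("where", "step")` tuple layout, and the function name `new_exec_node_checklist` are assumptions made for illustration; this is not the documented procedure (which, per the log, did not exist yet).

```python
def new_exec_node_checklist(fqdn, hostgroup="@webgrid"):
    """Return (where, step) pairs for bringing one webgrid exec host online.

    Every command string is taken from the SAL entries in this log; the
    order is a best guess at what finally worked, not an official guide.
    """
    exechost_file = f"/var/lib/gridengine/etc/exechosts/{fqdn}"
    return [
        # Puppet roles are applied through wikitech, not on the host itself.
        ("wikitech", "apply role::labs::tools::compute and toollabs::node::web::lighttpd"),
        # Package installs from deb-trusty failed until the apt cache was refreshed.
        (fqdn, "sudo apt-get update"),
        (fqdn, "puppet agent -tv"),
        # Puppet only writes the exechost file; the grid master still needs to load it.
        ("bastion", f"sudo qconf -Ae {exechost_file}"),
        # Interactive edit: add the host under the same full name as in the exechost file.
        ("bastion", f'sudo qconf -mhgrp "{hostgroup}"'),
        (fqdn, "sudo service gridengine-exec start"),
        # Verify: lists the host together with any jobs running on it.
        ("bastion", f"qhost -h {fqdn} -j"),
    ]


if __name__ == "__main__":
    host = "tools-webgrid-lighttpd-1411.tools.eqiad.wmflabs"
    for where, step in new_exec_node_checklist(host):
        print(f"[{where}] {step}")
```

Keeping the host name in a single variable is deliberate: the hostgroup entry and the exechost file then cannot drift apart the way `-1411.eqiad.wmflabs` and `-1411.tools.eqiad.wmflabs` did.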
[11:19:02] Labs, Labs-Sprint-107, Labs-Sprint-108: Evaluate a 'cluster solution' for use on Tool Labs - https://phabricator.wikimedia.org/T106475#1548603 (Joe) Some general comments on the discussion: - The current toollabs has a lot of deficiencies and stability issues. It is also based on an almost-abandonware...
[11:19:38] Labs, Tool-Labs: Request for installation of MongoDB package - https://phabricator.wikimedia.org/T108341#1548605 (yuvipanda) Hello! So... we've tried in the past to setup mongodb for tools users (it isn't as simple as just installing a package, unfortunately) and it has ended up really badly (they don't...
[12:05:52] Change on wikitech.wikimedia.org a page Nova Resource:Tools/Access Request/Marcusmeisel was created, changed by Marcusmeisel link https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Access_Request/Marcusmeisel edit summary: Created page with "{{Tools Access Request |Justification=Research for a scientific research proposal |Completed=false |User Name=Marcusmeisel }}"
[13:25:15] valhallasw`cloud: I’m sorry I slept through your tribulations :(
[13:25:27] andrewbogott: that's perfectly fine, you need your sleep too :-)
[13:35:57] valhallasw`cloud: so… when I reboot labvirt1006 later, what do I need to do differently?
[13:37:04] andrewbogott: I think 1) we need to make sure the gridengine exec service runs on all hosts that stay online; 2) we should take more time to resubmit jobs, probably by taking one host at a time; 3) we need to make sure the gridengine exec service starts again after reboot
[13:37:58] I don’t understand 1) and don’t know how to do 3) — tell me more?
[13:38:29] on each exec host, there is a daemon that actually starts jobs on that host
[13:38:36] = service gridengine-exec
[13:39:08] for some reason this was not started on two webgrid hosts, which meant jobs could not start
[13:39:25] it's still not on-line on several other hosts, I think, but I haven't had time to check it out
[13:39:34] and… nothing monitors whether or not that job is running, I guess?
[13:39:39] starting is as simple as sudo service gridengine-exec start
[13:40:01] I'm afraid not. You can get a list of queues and status with qstat -f -q
[13:40:03] ok. Was that also the answer to ‘how do I add a new node’?
[13:40:19] the 'add a new node' is a whole complicated story on its own
[13:40:30] it's not hard, but we need a checklist, basically.
[13:41:01] ok
[13:41:21] And it didn’t turn out that yuvi had already documented this and I just failed at google?
[13:41:36] no, just hadn't been documented :(
[13:43:56] Labs, Tool-Labs, Patch-For-Review: Check for error log ownership before starting webservice job - https://phabricator.wikimedia.org/T99576#1548847 (scfc) Open>Resolved Removed.
[13:46:32] !log tools starting gridengine-exec on hosts with queues in 'au' (=alarm, unknown) state using for i in $(qstat -f -xml | grep "au" -B 6 | grep "" | cut -d'@' -f2 | cut -d. -f1); do echo $i; ssh $i sudo service gridengine-exec start; done
[13:46:36] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL, Master
[13:47:20] valhallasw`cloud: thank you for fixing everything. I’m going to relocate, will be back online in time for the reboot.
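Editor's sketch of the same idea as the logged one-liner (find hosts whose queue instances are in 'au' state), parsing the XML instead of grepping it. It assumes that `qstat -f -xml` emits `Queue-List` elements with `name` (`queue@host`) and, when non-empty, `state` children; the sample document and the queue name `task` are made up for illustration, and unlike the `grep` it matches the state exactly rather than as a substring.

```python
import xml.etree.ElementTree as ET

# Made-up sample in the assumed shape of `qstat -f -xml` output.
SAMPLE = """<job_info>
  <queue_info>
    <Queue-List>
      <name>task@tools-exec-1403.eqiad.wmflabs</name>
      <state>au</state>
    </Queue-List>
    <Queue-List>
      <name>webgrid-lighttpd@tools-webgrid-lighttpd-1205.eqiad.wmflabs</name>
    </Queue-List>
  </queue_info>
</job_info>"""


def hosts_in_state(xml_text, wanted="au"):
    """Return short host names of queue instances whose state equals `wanted`."""
    root = ET.fromstring(xml_text)
    hosts = []
    for queue in root.iter("Queue-List"):
        name = queue.findtext("name", default="")
        state = queue.findtext("state", default="")
        if state == wanted and "@" in name:
            # Same trimming as `cut -d'@' -f2 | cut -d. -f1` in the one-liner.
            short = name.split("@", 1)[1].split(".", 1)[0]
            if short not in hosts:
                hosts.append(short)
    return hosts


if __name__ == "__main__":
    for host in hosts_in_state(SAMPLE):
        print(host)
```

The resulting list could feed the `ssh $i sudo service gridengine-exec start` loop from the log; that step is left out here because it needs the real grid.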
[13:47:33] !log tools that brought tools-exec-1403, tools-exec-1406 and tools-webgrid-generic-1402 back up, tools-exec-1401 and tools-exec-catscan are still in 'au' state
[13:47:36] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL, Master
[13:47:54] I’m trying to deploy a PHP script that uses Symfony HTTPKernel and the lighttpd rewrite feature, but keep getting routing errors because X-Original-URI is used for routing which still contains the tool name. I’ve tried to remove the tool name with a rewrite rule, but no luck. Can anyone help?
[13:54:50] !log tools tried to restart gridengine-exec on tools-exec-1401, no effect. tools-webgrid-lighttpd-1411 also just went into 'au' state.
[13:54:54] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL, Master
[13:55:51] !log tools no, wait, that's ''tools-webgrid-lighttpd-1411.eqiad.wmflabs'', not the actual host ''tools-webgrid-lighttpd-1411.tools.eqiad.wmflabs''. We should fix that dns mess as well.
[13:55:54] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL, Master
[13:56:35] Labs, Tool-Labs: Enable tools-webgrid-lighttpd-1411.tools.eqiad.wmflabs and build more webgrid hosts - https://phabricator.wikimedia.org/T109412#1548893 (scfc) I did so far: - `qconf -mhgrp \@webgrid` => `tools-webgrid-lighttpd-1411.tools.eqiad.wmflabs` => `tools-webgrid-lighttpd-1411.eqiad.wmflabs` (con...
[13:57:02] !log tools same issue seems to happen with the other hosts: tools-exec-1401.tools.eqiad.wmflabs vs tools-exec-1401.eqiad.wmflabs and tools-exec-catscan.tools.eqiad.wmflabs vs tools-exec-catscan.eqiad.wmflabs.
[13:57:05] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL, Master
[13:58:56] Labs, Tool-Labs: Enable tools-webgrid-lighttpd-1411.tools.eqiad.wmflabs and build more webgrid hosts - https://phabricator.wikimedia.org/T109412#1548910 (valhallasw) Note that there are jobs running on -1411 at the moment, so please be careful with the sge magic. More DNS crazyness with tools-exec-1401....
[14:01:44] Labs, Tool-Labs: Enable tools-webgrid-lighttpd-1411.tools.eqiad.wmflabs and build more webgrid hosts - https://phabricator.wikimedia.org/T109412#1548917 (scfc) Ha! The alias list stroke: ``` scfc@tools-bastion-01:~$ qconf -de tools-webgrid-lighttpd-1411.eqiad.wmflabs scfc@tools-bastion-01.eqiad.wmflabs...
[14:03:01] Labs, Tool-Labs: Enable tools-webgrid-lighttpd-1411.tools.eqiad.wmflabs and build more webgrid hosts - https://phabricator.wikimedia.org/T109412#1548922 (scfc) I'm sorry, I read your comment too late. But `qstat -f` showed no running jobs anyway?
[14:03:35] Labs, Tool-Labs: Enable tools-webgrid-lighttpd-1411.tools.eqiad.wmflabs and build more webgrid hosts - https://phabricator.wikimedia.org/T109412#1548929 (scfc) Ugh, wrong column. Yes there were and still are jobs running on that host.
[14:06:53] Labs, Tool-Labs: Enable tools-webgrid-lighttpd-1411.tools.eqiad.wmflabs and build more webgrid hosts - https://phabricator.wikimedia.org/T109412#1548944 (scfc) a:scfc
[14:38:49] Labs, Tool-Labs: Enable tools-webgrid-lighttpd-1411.tools.eqiad.wmflabs and build more webgrid hosts - https://phabricator.wikimedia.org/T109412#1549069 (scfc) Okay, now that I have done practically nothing, except changing the list in `qconf -mhgrp \@webgrid` back to `…-1411.tools.…` because I hadn't con...
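Editor's sketch for spotting the hostname mismatch discussed above: group every host name the grid knows about by its short name and report those that appear under more than one fully qualified form. The function name `conflicting_fqdns` is hypothetical, and how the input list is collected (exechost list, hostgroup, queue instances) is left open.

```python
from collections import defaultdict


def conflicting_fqdns(names):
    """Map short host name -> sorted FQDN variants, for hosts with more than one variant."""
    by_short = defaultdict(set)
    for name in names:
        short = name.split(".", 1)[0]
        by_short[short].add(name)
    return {short: sorted(variants)
            for short, variants in by_short.items()
            if len(variants) > 1}


if __name__ == "__main__":
    # Names taken from the SAL entries in this log.
    seen = [
        "tools-webgrid-lighttpd-1411.tools.eqiad.wmflabs",
        "tools-webgrid-lighttpd-1411.eqiad.wmflabs",
        "tools-exec-1401.tools.eqiad.wmflabs",
        "tools-exec-1401.eqiad.wmflabs",
        "tools-webgrid-lighttpd-1205.eqiad.wmflabs",
    ]
    for short, variants in conflicting_fqdns(seen).items():
        print(short, "->", ", ".join(variants))
```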
[14:42:48] Change on wikitech.wikimedia.org a page Nova Resource:Tools/Access Request/Marcusmeisel was modified, changed by Tim Landscheidt link https://wikitech.wikimedia.org/w/index.php?diff=174153 edit summary:
[14:44:52] Labs, Tool-Labs: Enable tools-webgrid-lighttpd-1411.tools.eqiad.wmflabs and build more webgrid hosts - https://phabricator.wikimedia.org/T109412#1549077 (valhallasw) Open>Resolved I think so, for now. I still need to document what I did in more detail (T109417) and write up a post-mortem, but I thin...
[14:45:32] valhallasw`cloud: This is what you mean by ‘staggered’ right? https://phabricator.wikimedia.org/P1894
[14:46:08] andrewbogott: yes, although can we replace the second 'sleep' with 'wait for a human to continue'? I'm tailing the accounting log to see if anything weird happens
[14:47:31] (for the same look at the accounting log: cd /home/valhallasw/accountingtools; tail -f /data/project/.system/accounting | python merlijn_stdin.py )
[14:48:00] that should give lines starting with 25 (=rescheduling) and none starting with 100 (=failed after job)
[14:49:28] valhallasw`cloud: ‘wait for a human’ seems like this will take forever...
[14:49:53] oh, you mean, just this once, sure.
[14:49:59] yes, just this once
[14:50:33] ok, changed the last sleep to 'read -n 1 -s'
[14:50:36] ready for me to run this?
[14:50:49] Or do you want to so you can see all the output?
[14:51:03] eh, maybe that's easier, yes
[14:51:23] It’s /home/andrew/killjobs/killjobs.sh
[14:51:26] ok!
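Editor's sketch of the accounting check described above: count the `failed` codes in accounting records so that 25 (rescheduling) and 100 (failed after job) stand out. It is not `merlijn_stdin.py`, whose contents do not appear in this log. It assumes the colon-separated gridengine accounting format with `failed` as the 12th field, and the sample records are made up.

```python
from collections import Counter

# Assumption: `failed` is the 12th colon-separated field (index 11)
# of a gridengine accounting record.
FAILED_FIELD = 11


def failed_codes(lines):
    """Count accounting records per `failed` code, skipping comments and short lines."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        if len(fields) <= FAILED_FIELD:
            continue
        counts[fields[FAILED_FIELD]] += 1
    return counts


if __name__ == "__main__":
    # Made-up records: queue, host, group, owner, job name, job number,
    # account, priority, three timestamps, failed, exit status.
    sample = [
        "continuous:tools-exec-1205.eqiad.wmflabs:grp:owner:job-a:101:sge:0:1:2:3:25:0",
        "continuous:tools-exec-1207.eqiad.wmflabs:grp:owner:job-b:102:sge:0:1:2:3:25:0",
        "task:tools-exec-1208.eqiad.wmflabs:grp:owner:job-c:103:sge:0:1:2:3:0:0",
    ]
    for code, n in sorted(failed_codes(sample).items()):
        print(code, n)
```

For a live view it could be fed from `tail -f` on the accounting file by iterating over `sys.stdin`, at the cost of only reporting counts once the stream ends.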
[14:51:32] and args are tools-exec-1205 tools-exec-1207 tools-exec-1208 tools-exec-1401 tools-exec-1404 tools-exec-1409 tools-exec-1410 tools-exec-catscan tools-web-static-01 tools-webgrid-lighttpd-1201 tools-webgrid-lighttpd-1205 tools-webgrid-lighttpd-1206 tools-webgrid-lighttpd-1406 tools-webproxy-02
[14:52:13] hrm, /home/andrew/killjobs/killjobs.sh: 19: /home/andrew/killjobs/killjobs.sh: Syntax error: Unterminated quoted string
[14:52:44] ok, try now
[14:54:36] ok, tasks killed, waiting for the cont jobs
[14:55:10] currently only continuous and webgrid jobs are remaining, which is good
[14:56:40] okay, that read didn't work because I was stupid enough to press enter during the sleep 2m
[14:57:26] so the tools-exec hosts were cleared in one batch after all. Argh.
[14:58:25] shouldn’t you have needed to hit ‘enter’ for each host?
[14:58:27] Lots of times?
[14:58:46] I resized my screen and hit enter a few times to get my bearings again. Stupid.
[14:58:56] ah ok
[14:59:15] So… you can still check and see what’s running, yes?
[14:59:19] heh, also ./killjobs.sh: 6: read: Illegal option -n
[14:59:37] * andrewbogott curses stackoverflow
[14:59:45] yeah, the -exec hosts are cleared, there's a set that didn't come back up
[15:00:05] how do I wait for a keypress then?
[15:00:20] read -p
[15:01:02] read -p foo
[15:01:05] tools-webgrid-lighttpd-1205.eqiad.wmflabs rescheduled, no issues there, it seems
[15:01:47] tools-webgrid-lighttpd-1206.eqiad.wmflabs also reschedules OK
[15:03:15] and tools-webgrid-lighttpd-1406.eqiad.wmflabs also seems OK
[15:03:28] but all tools-exec-1207 and -08 jobs are dead, I think
[15:03:41] andrewbogott: anyway, you can safely reboot
[15:03:56] ok! Here we go...
[15:04:41] oh, except bastion-01 was on the list to be rebooted as well. We probably should have warned people >_<
[15:05:04] this is not my day
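Editor's sketch of a staggered, one-host-at-a-time drain with a human gate between hosts. It is not `killjobs.sh` (whose contents are not in this log); `drain` is a hypothetical callback standing in for whatever reschedules the jobs on one host. The gate requires typing the word `next` rather than any keypress, on the assumption that stray Enter presses buffered during a sleep are what let all hosts through in one batch.

```python
def drain_staggered(hosts, drain, confirm=input):
    """Drain hosts one by one, waiting for an explicit 'next' after each.

    hosts:   iterable of host names
    drain:   callable(host), hypothetical; does the actual rescheduling
    confirm: callable(prompt) -> str; defaults to input(), replaceable in tests
    """
    done = []
    for host in hosts:
        drain(host)
        done.append(host)
        answer = confirm(f"{host} drained; type 'next' to continue: ")
        # Empty lines (stray Enter presses) and anything else are ignored.
        while answer.strip() != "next":
            answer = confirm("type 'next' to continue: ")
    return done


if __name__ == "__main__":
    drain_staggered(
        ["tools-exec-1205", "tools-exec-1207"],
        drain=lambda host: print(f"(would reschedule jobs on {host})"),
    )
```

Passing `confirm` as a parameter keeps the loop testable without a terminal, and gives the operator time to watch the accounting log between hosts.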