[00:28:47] how do i give another user root on my instance. he is already project admin
[00:28:55] but apparently no root
[00:30:39] nevermind, we used it wrong
[01:12:20] added new instance, added new webproxy, looks like it was added just fine
[01:12:32] but making requests to it..504 Gateway Time-out
[01:12:40] mutante: did you open security groups?
[01:12:41] Apache on backend runs fine and returns things
[01:12:56] no :p
[01:14:27] yuvipanda: sorry for even asking, much better
[01:14:45] mutante: np! I'm sure if I had just not responded you'd have figured it out just like the last few questions in that row :D
[01:15:03] makes git.wmflabs.org ... and gets paladox on it , heh
[01:15:09] all to kill gitblit
[01:17:04] I didn't realize killing gitblit needed help
[01:17:11] it usually kills itself often enough
[01:17:44] :p true. the users.. think of the users:)
[01:18:19] it's all about http://www.w3.org/Provider/Style/URI.html
[02:48:20] Labs, Tool-Labs: Virtualenvs slow on tool labs NFS - https://phabricator.wikimedia.org/T136712#2362967 (yuvipanda) p:Triage>High I wonder if this is caused by the throttling or the lookupcache.
[03:04:12] Tool-Labs-tools-Other, Community-Tech-Tool-Labs, Patch-For-Review, User-bd808: Create temporary http -> https reverse proxy for MerlBot - https://phabricator.wikimedia.org/T137235#2362991 (bd808) Proxy is up and running at http://tools-merlbot-proxy.tools.eqiad.wmflabs:80 ``` $ curl -v -X POST -...
[03:06:51] Tool-Labs-tools-Other, Community-Tech-Tool-Labs, Patch-For-Review, User-bd808: Create temporary http -> https reverse proxy for MerlBot - https://phabricator.wikimedia.org/T137235#2362993 (bd808) And with java on the grid with java from P3219: ``` $ jsub -stderr -once -l release=trusty -mem 4g j...
[03:11:56] Quarry: Option to kill task by myself - https://phabricator.wikimedia.org/T137266#2362997 (Dvorapa)
[03:12:32] Quarry: Option to kill task by myself - https://phabricator.wikimedia.org/T137266#2362997 (yuvipanda) If you fix it and re-execute again the previous task will automatically get killed!
[03:18:49] Labs, Tool-Labs: Virtualenvs slow on tool labs NFS - https://phabricator.wikimedia.org/T136712#2363012 (zhuyifei1999) Just some even crazier results: ``` (venv)tools.video2commons-test@tools-bastion-02:~$ time pip -V pip 8.1.2 from /data/project/video2commons-test/www/python/venv/local/lib/python2.7/site...
[03:22:48] Labs, Tool-Labs, Patch-For-Review: Figure out a way to support java 1.8 on tool labs (Merl's bot) - https://phabricator.wikimedia.org/T121279#2363014 (bd808) >>! In T121279#2352969, @BBlack wrote: > A generic non-modifying proxy might work if the java code has some generic support for explicitly usin...
[03:23:30] Labs, Tool-Labs, Patch-For-Review: Figure out a way to keep MerlBot running when the HTTP POST loophole is closed - https://phabricator.wikimedia.org/T121279#2363016 (bd808)
[03:28:26] Tool-Labs-tools-Other, Community-Tech-Tool-Labs, Patch-For-Review, User-bd808: Create temporary http -> https reverse proxy for MerlBot - https://phabricator.wikimedia.org/T137235#2363019 (bd808)
[03:28:28] Labs, Tool-Labs, Patch-For-Review: Figure out a way to keep MerlBot running when the HTTP POST loophole is closed - https://phabricator.wikimedia.org/T121279#2363018 (bd808)
[03:49:13] Quarry: Option to kill task by myself - https://phabricator.wikimedia.org/T137266#2363073 (Dvorapa) Open>Invalid Oh, I didn't realize, thanks
[03:50:50] PAWS: PAWS can not login - https://phabricator.wikimedia.org/T136114#2363080 (Dvorapa) Still the same result
[03:55:37] PAWS, Jupyter-Hub: I can't login my bot in JUPYTER - https://phabricator.wikimedia.org/T135306#2363082 (yuvipanda)
மன்னிக்கவும், வேரு வெலை அதிகமாகி விட்டது. Maathavanbot என்ற கணக்கை டெலீட் செய்து விட்டேன். இபோ டிரய் பன்னுன்க? அதன் பழ்ய பயன��
[03:57:26] hahaha
[03:57:37] wikibugs doesn't handle unicode well does it
[03:58:45] Wikibugs: Wikibugs doesn't deal with unicode properly - https://phabricator.wikimedia.org/T137267#2363083 (yuvipanda)
[04:03:16] Quarry: Option to kill task by myself - https://phabricator.wikimedia.org/T137266#2363095 (Dvorapa) Invalid>Open But sometimes you just don't want to run it again. E.g. if you find out the result is too large so you just want to kill it and end working on it
[04:04:04] Quarry: Option to kill task by myself - https://phabricator.wikimedia.org/T137266#2363097 (Dvorapa)
[04:10:54] Quarry: Add an option to export result in Wikilist - https://phabricator.wikimedia.org/T137268#2363098 (Dvorapa)
[04:18:10] So... creating a new web proxy in horizon maps it via IP, not hostname?
[04:18:41] Matthew_: yeah, why?
[04:19:03] Just making sure.
[04:19:06] :D
[04:19:17] yeah, using DNS wasn't providing any additional value and was causing load
[04:19:36] Okay. As long as the IP doesn't change.
[04:20:23] Matthew_: yeah
[04:33:07] yuvipanda: is wikibugs a python irc bot?
[04:33:15] bd808: yup
[04:34:49] the thing I was thinking of is apparently for input parsing, not output -- https://github.com/bd808/tools-stashbot/blob/master/stashbot/bot.py#L60-L64
[04:35:28] ah
[04:35:30] fun
[04:37:55] Oh man. I just realized that I can work on an hhvm container that uses repo-authoritative mode
[04:38:28] bd808: oh yeah, you can. it would just work similar to uwsgi / nodejs, where you have to restart for updates
[04:38:49] yeah.
fun stuff
[05:02:31] Labs, Phabricator: https://phab-01.wmflabs.org (and other installs, like phab-02) returns a 502 error - https://phabricator.wikimedia.org/T137270#2363140 (Pokefan95)
[05:07:04] Labs, Phabricator: https://phab-01.wmflabs.org (and other installs, like phab-02) returns a 502 error - https://phabricator.wikimedia.org/T137270#2363156 (Pokefan95) p:Normal>High Without phab-01.wmflabs.org, we cannot test Phabricator.
[05:43:03] Labs, Phabricator: https://phab-01.wmflabs.org (and other installs, like phab-02) returns a 502 error - https://phabricator.wikimedia.org/T137270#2363140 (mmodell) phab-03.wmflabs.org should work
[06:16:28] Labs, Phabricator: https://phab-01.wmflabs.org (and other installs, like phab-02) returns a 502 error - https://phabricator.wikimedia.org/T137270#2363187 (Pokefan95) >>! In T137270#2363167, @mmodell wrote: > phab-03.wmflabs.org should work Yep, phab-03 is working now. phab-01, phab-02, and phab-04 are s...
[08:27:47] PAWS, Jupyter-Hub: I can't login my bot in JUPYTER - https://phabricator.wikimedia.org/T135306#2294878 (Shanmugamp7) >>! In T135306#2363082, @yuvipanda wrote: > மன்னிக்கவும், வேரு வெலை அதிகமாகி விட்டது. > > Maathavanbot என்ற கணக்கை டெலீட் செய்து விட்டேன்.
இபோ டிரய் பன்ன�
[08:28:41] Labs, Tracking: New Labs project requests (tracking) - https://phabricator.wikimedia.org/T76375#2363340 (Pokefan95)
[09:51:57] Labs, Tool-Labs: No 'replica.my.cnf' in my home directory at maos@tools-bastion-03 - https://phabricator.wikimedia.org/T137283#2363528 (Mattias_Ostmar-WMSE)
[10:05:51] Labs, Tool-Labs: No 'replica.my.cnf' in my home directory at maos@tools-bastion-03 - https://phabricator.wikimedia.org/T137283#2363528 (Krenair) Those instructions are not for labs users, it "requires access to the active labstore host"
[10:07:06] Labs, Tool-Labs: No 'replica.my.cnf' in my home directory at maos@tools-bastion-03 - https://phabricator.wikimedia.org/T137283#2363599 (Krenair)
[10:07:08] Labs, Tool-Labs, Tracking: Tool Labs users missing replica.my.cnf (tracking) - https://phabricator.wikimedia.org/T135931#2363598 (Krenair)
[10:13:20] Labs, Labs-Infrastructure: Create a service IP for labs statsd - https://phabricator.wikimedia.org/T136968#2353972 (Krenair) statsd-labs.svc.eqiad.wmnet 1H IN CNAME labmon1001.eqiad.wmnet?
[10:15:06] Labs, Labs-Infrastructure: Create a service hostname for labs statsd - https://phabricator.wikimedia.org/T136968#2363614 (Krenair)
[11:06:39] Labs, Phabricator: https://phab-01.wmflabs.org (and other installs, like phab-02) returns a 502 error - https://phabricator.wikimedia.org/T137270#2363751 (Krenair) Those broken phab-* hosts won't even let me log in as root, only phab-03 does.
[11:28:12] Wikibugs: wikibugs test bug part II - https://phabricator.wikimedia.org/T90594#2363794 (valhallasw) மன்னிக்கவும், வேரு வெலை அதிகமாகி விட்டது. (T137267)
[11:29:20] Wikibugs: Wikibugs doesn't deal with unicode properly - https://phabricator.wikimedia.org/T137267#2363083 (valhallasw) What irc client are you using?
It seems to be parsed correctly by irccloud: {F4142810}
[12:56:55] Labs, Tool-Labs, Community-Tech-Tool-Labs: Setup an easy to use logrotate based system for rotating tools logs - https://phabricator.wikimedia.org/T68623#686351 (JeanFred) Confirming my interest for tools.heritage, where our chatty logs grow and grow :)
[13:46:35] Labs, Tool-Labs: geohack is not responding - https://phabricator.wikimedia.org/T137306#2364376 (scfc)
[13:48:04] Labs, Tool-Labs: geohack is not responding - https://phabricator.wikimedia.org/T137306#2364389 (scfc) The webservice is running on `tools-webgrid-lighttpd-1408`, and I can't `ssh` to that host.
[13:53:14] Labs, Tool-Labs: geohack is not responding - https://phabricator.wikimedia.org/T137306#2364391 (scfc) ``` scfc@tools-bastion-03:~$ qmod -d '*@tools-webgrid-lighttpd-1408' scfc@tools-bastion-03.tools.eqiad.wmflabs changed state of "webgrid-lighttpd@tools-webgrid-lighttpd-1408.eqiad.wmflabs" (disabled) scf...
[13:53:16] Labs, Tool-Labs: geohack is not responding - https://phabricator.wikimedia.org/T137306#2364392 (scfc) a:scfc
[13:54:32] Labs, Tool-Labs: tools-webgrid-lighttpd-1408 is not responding - https://phabricator.wikimedia.org/T137306#2364376 (scfc)
[13:59:30] Labs, Tool-Labs: tools-webgrid-lighttpd-1408 is not responding - https://phabricator.wikimedia.org/T137306#2364434 (scfc) a:scfc>chasemp @chasemp: The host `tools-webgrid-lighttpd-1408` is not responding. IIRC you look at stuck instances to find out what is wrong; do you want to do that here as...
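On the Unicode garbling reported in T137267: the log does not establish the cause, and irccloud reportedly renders the same message correctly. One plausible source of a trailing broken character is shortening a message at a byte offset that falls inside a multi-byte UTF-8 sequence. The following is a minimal sketch of a byte-budget truncation that avoids that, assuming that is what happened; `truncate_utf8` is a hypothetical helper, not code taken from wikibugs.

```python
def truncate_utf8(text: str, max_bytes: int, ellipsis: str = "...") -> str:
    """Shorten text so its UTF-8 encoding fits in max_bytes.

    Hypothetical helper for illustration only. It never cuts inside a
    multi-byte sequence, so the result contains no half-encoded character.
    """
    encoded = text.encode("utf-8")
    if len(encoded) <= max_bytes:
        return text  # already fits, nothing to do

    budget = max_bytes - len(ellipsis.encode("utf-8"))
    if budget < 0:
        raise ValueError("max_bytes is smaller than the ellipsis itself")

    # Slicing bytes may leave a partial sequence at the end;
    # errors="ignore" drops those dangling bytes instead of raising.
    head = encoded[:budget].decode("utf-8", errors="ignore")
    return head + ellipsis


if __name__ == "__main__":
    sample = "மன்னிக்கவும்"  # each Tamil code point is 3 bytes in UTF-8
    print(truncate_utf8(sample, 10))
```

This keeps whole code points only. Tamil vowel signs are separate code points, so a cut can still separate a consonant from its sign; splitting on grapheme clusters would need more than the standard library offers.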
[14:21:52] Quarry: Add a stop button to halt the query - https://phabricator.wikimedia.org/T71037#2364499 (matej_suchanek)
[14:22:06] Quarry: Option to kill task by myself - https://phabricator.wikimedia.org/T137266#2364500 (matej_suchanek)
[14:22:09] Quarry: Add a stop button to halt the query - https://phabricator.wikimedia.org/T71037#721028 (matej_suchanek)
[14:46:13] Labs, Operations: Changing username on WikiTech - https://phabricator.wikimedia.org/T137315#2364589 (Soni)
[14:47:49] Labs, Operations, wikitech.wikimedia.org: Changing username on WikiTech - https://phabricator.wikimedia.org/T137315#2364602 (Peachey88)
[14:48:49] Labs, Operations, wikitech.wikimedia.org: Changing username on LDAP - https://phabricator.wikimedia.org/T137315#2364603 (Dereckson)
[14:49:18] Labs, Operations, wikitech.wikimedia.org, LDAP: Changing username on LDAP - https://phabricator.wikimedia.org/T137315#2364589 (Dereckson)
[15:05:18] Labs, Phabricator: https://phab-01.wmflabs.org (and other installs, like phab-02) returns a 502 error - https://phabricator.wikimedia.org/T137270#2364683 (Luke081515) Last time the disk was full. Maybe happened again?
[15:18:31] Labs, Phabricator: https://phab-01.wmflabs.org (and other installs, like phab-02) returns a 502 error - https://phabricator.wikimedia.org/T137270#2364696 (Krenair) Difficult to tell when you don't have salt access.
[15:31:41] Labs, Tool-Labs, Community-Tech-Tool-Labs, Phabricator, User-bd808: Create a Conduit API method to lookup Policy information - https://phabricator.wikimedia.org/T137004#2364701 (bd808)
[15:36:21] Labs, Tool-Labs, Community-Tech-Tool-Labs, Diffusion, User-bd808: Create application to manage Diffusion repositories for a Tool Labs project - https://phabricator.wikimedia.org/T133252#2364720 (bd808)
[15:36:24] Labs, Tool-Labs, Community-Tech-Tool-Labs, Phabricator, User-bd808: Create a Conduit API method to lookup Policy information - https://phabricator.wikimedia.org/T137004#2364719 (bd808) Open>Resolved
[15:50:52] Labs, Operations, ops-codfw: labtestneutron2001.codfw.wmnet does not appear to be reachable - https://phabricator.wikimedia.org/T132302#2364756 (Papaul) a:Papaul>Andrew
[16:22:52] Labs, Labs-Infrastructure, Operations, Release-Engineering-Team, Continuous-Integration-Infrastructure (phase-out-gallium): Firewall rules for labs support host to communicate with contint1001.eqiad.wmnet (new gallium) - https://phabricator.wikimedia.org/T137323#2364905 (hashar)
[16:23:54] Labs, Labs-Infrastructure, Operations, Release-Engineering-Team, Continuous-Integration-Infrastructure (phase-out-gallium): Firewall rules for labs support host to communicate with contint1001.eqiad.wmnet (new gallium) - https://phabricator.wikimedia.org/T137323#2364922 (hashar) We might need...
[16:24:53] Hey! I cannot login to tools lab. "ssh: connect to host login.tools.wmflabs.org port 22: Network is unreachable"
[16:25:02] It was working a few minutes ago.
[16:26:26] what about tools bastion 03
[16:26:30] it's dead
[16:27:36] I'll look
[16:27:41] it might've just been out for a minute
[16:27:42] oh good
[16:28:17] yeah, it should be working now
[16:28:45] there have been several stucks since weekend
[16:29:23] Yeah, I need to suspend/resume all instances as part of a security update
[16:29:47] Niharika: should be working again, it was a temporary freeze as part of scheduled maintenance
[16:30:05] andrewbogott: Yeah, works now. Thanks!
[16:42:31] Labs, Tool-Labs: tools-webgrid-lighttpd-1408 is not responding - https://phabricator.wikimedia.org/T137306#2364376 (yuvipanda) @scfc @chasemp is on vacation for about 10 more days, so I think we should just bring this one back up maybe.
[16:43:50] MediaWiki-extensions-OpenStackManager, MediaWiki-Authentication-and-authorization, Reading-Infrastructure-Team: Update OpenStackManager to use AuthManager - https://phabricator.wikimedia.org/T110288#2364978 (Tgr) >>! In T110288#2286963, @Anomie wrote: > The one bit remaining is the use of the AbortNe...
[16:48:02] Labs, Tool-Labs: tools-webgrid-lighttpd-1408 is not responding - https://phabricator.wikimedia.org/T137306#2364376 (Andrew) rebooting.
[17:20:04] Labs, Labs-Infrastructure, Operations, Release-Engineering-Team, Continuous-Integration-Infrastructure (phase-out-gallium): Firewall rules for labs support host to communicate with contint1001.eqiad.wmnet (new gallium) - https://phabricator.wikimedia.org/T137323#2364905 (mark) If I read this...
[17:29:33] Labs, Labs-Infrastructure, Phabricator: can't log in to phab-01.eqiad.wmflabs - https://phabricator.wikimedia.org/T125666#2365122 (Negative24)
[17:29:37] Labs, Phabricator: https://phab-01.wmflabs.org (and other installs, like phab-02) returns a 502 error - https://phabricator.wikimedia.org/T137270#2365121 (Negative24)
[17:36:21] Labs, Phabricator: https://phab-01.wmflabs.org (and other installs, like phab-02) returns a 502 error - https://phabricator.wikimedia.org/T137270#2363140 (Negative24) I can't log into phab-01, -02 nor -04, like I could in T125666, anymore.
[17:40:16] lal
[17:41:19] Labs: Retire Labs shell access request process - https://phabricator.wikimedia.org/T137331#2365180 (scfc)
[17:44:36] vi o
[17:46:49] sorry about that ^ :)
[18:11:15] Labs, Labs-Infrastructure, Operations, Release-Engineering-Team, Continuous-Integration-Infrastructure (phase-out-gallium): Firewall rules for labs support host to communicate with contint1001.eqiad.wmnet (new gallium) - https://phabricator.wikimedia.org/T137323#2365294 (hashar) Sorry it is n...
[18:38:05] Labs, Operations, wikitech.wikimedia.org, LDAP: Changing username on LDAP - https://phabricator.wikimedia.org/T137315#2365391 (demon) Open>Resolved Renamed in LDAP and Wikitech. I don't see you in Gerrit so I didn't do anything there.
[18:50:54] hey, I'm getting instance creation errors
[18:52:41] MaxSem: quota?
[18:52:48] under it
[18:53:16] instances just spawn in ERROR state
[18:53:49] both from horizon and wikitech
[18:54:00] MaxSem: could have something to do with the security work andrewbogott is doing
[18:56:49] MaxSem: I think that we're just out of RAM in the cluster
[18:57:00] :O
[18:57:05] I'll be available to look more seriously in an hour or so
[19:07:57] Labs, Labs-Infrastructure, Operations, Release-Engineering-Team, Continuous-Integration-Infrastructure (phase-out-gallium): Firewall rules for labs support host to communicate with contint1001.eqiad.wmnet (new gallium) - https://phabricator.wikimedia.org/T137323#2365455 (hashar)
[19:08:30] Labs, Labs-Infrastructure, Operations, Release-Engineering-Team, Continuous-Integration-Infrastructure (phase-out-gallium): Firewall rules for labs support host to communicate with contint1001.eqiad.wmnet (new gallium) - https://phabricator.wikimedia.org/T137323#2364905 (hashar) I have split...
[19:23:53] Labs, Labs-Infrastructure, Operations, Release-Engineering-Team, Continuous-Integration-Infrastructure (phase-out-gallium): Firewall rules for labs support host to communicate with contint1001.eqiad.wmnet (new gallium) - https://phabricator.wikimedia.org/T137323#2365519 (hashar)
[19:28:25] Labs, Labs-Infrastructure, Continuous-Integration-Infrastructure: Nodepool can barely spawn instances OpenStack - https://phabricator.wikimedia.org/T137241#2365543 (hashar) Thank you @Andrew for the details, happy you managed to figure it out.
[19:33:01] andrewbogott: also having the same issue as Niharika
[19:38:24] rolling pause/restarts for server patching are ongoing I believe
[19:38:35] so ... try again in 10 minutes?
[20:11:42] any updates?
[20:14:37] musikanimal: still broken for you to ssh into a bastion?
[20:15:18] I've only tried login.tools.wmflabs.org
[20:16:06] Labs, Tool-Labs, Patch-For-Review: Figure out a way to keep MerlBot running when the HTTP POST loophole is closed - https://phabricator.wikimedia.org/T121279#2365716 (Luke081515) >>! In T121279#2351107, @bd808 wrote: > (...) > @Luke081515 I know you have shown interest in the jobs that MerlBot suppor...
[20:16:11] so yes, I guess
[20:16:18] musikanimal: try this one -- tools-dev.wmflabs.org
[20:16:36] yes!
[20:16:52] tools-login.wmflabs.org / login.tools.wmflabs.org does seem to be unreachable
[20:16:55] yuvipanda: ^
[20:19:07] I also had to manually restart several tools, while others did not go down
[20:19:55] I can force kill jobs that are stuck for you
[20:21:11] looks like all of my stuff is functional now
[20:21:29] I just wonder if there are others that went down
[20:21:30] Labs, Labs-Infrastructure: Instance creation results in nodes in ERROR state - https://phabricator.wikimedia.org/T137347#2365741 (MaxSem)
[20:21:51] andrewbogott, created a ticket ^
[20:22:11] xTools, toollabs:musikanimal, templatecount all stayed up, but all 5 of the Pageviews Analysis tools had to be manually restarted
[20:22:55] I wonder why just that suite of tools went down
[20:25:36] bad luck with SGE state tracking I would guess. Instances in various projects have been being put through a suspend/resume process to allow OS patching on the host servers.
[20:26:09] Normally SGE and/or the various watcher systems should restart jobs that are interrupted by this
[20:28:06] Just weird because all 5 run independently of each other hah
[20:30:24] Labs, Tool-Labs, Patch-For-Review: Figure out a way to keep MerlBot running when the HTTP POST loophole is closed - https://phabricator.wikimedia.org/T121279#2365830 (bd808) >>! In T121279#2365716, @Luke081515 wrote: >>>! In T121279#2351107, @bd808 wrote: >> (...) >> @Luke081515 I know you have shown...
[20:31:28] !log tools reboot tools-bastion-03
[20:31:32] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL, Master
[20:31:46] !log tools start tools-bastion-03 was stuck in 'stopped' state
[20:31:49] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL, Master
[20:33:00] Tool-Labs-tools-Other, Community-Tech-Tool-Labs, Patch-For-Review, User-bd808: Create temporary http -> https reverse proxy for MerlBot - https://phabricator.wikimedia.org/T137235#2365837 (bd808) Open>Resolved The proxy is up and running using Puppet managed configuration.
[20:33:02] Tool-Labs-tools-Other, Tracking: merl tools (tracking) - https://phabricator.wikimedia.org/T69556#2365840 (bd808)
[20:33:04] Labs, Tool-Labs, Patch-For-Review: Figure out a way to keep MerlBot running when the HTTP POST loophole is closed - https://phabricator.wikimedia.org/T121279#2365839 (bd808)
[20:34:08] I had this issue "ssh: connect to host login.tools.wmflabs.org port 22: No route to host"
[20:34:19] I couldn't connect to tools
[20:35:01] Amir1: yuvipanda just restarted the instance behind that dns name. You can try tools-dev.wmflabs.org as an alternate if you still can't reach it
[20:35:10] andrewbogott: around? Looks like labnodepool1001 has all its instances stuck in 'delete' again.
[20:35:31] Fun. yuvipanda can you kill the wabbit?
[20:35:34] thcipriani: I'm looking at a few things, might be related.
[20:35:39] I just restarted rabbit
[20:35:51] oh
[20:36:13] andrewbogott: just a fyi for when you are looking - a nova start on tools-bastion-03 didn't work either.
[20:37:06] neat, seems like it's able to delete instance again, hopefully CI catches up. Quick work, sir :)
[20:38:19] bd808: thanks :)
[21:02:48] !log tools.merlbot Initialized local git repository
[21:09:46] MaxSem: can I delete maps-scratch2? It looks like it never really came up in the first place...
[21:10:09] andrewbogott, yep - I left it for you to dissect
[21:10:19] thanks
[21:15:16] Labs, Labs-Infrastructure, Operations, Release-Engineering-Team, Continuous-Integration-Infrastructure (phase-out-gallium): Firewall rules for labs support host to communicate with contint1001.eqiad.wmnet (new gallium) - https://phabricator.wikimedia.org/T137323#2364905 (hashar)
[21:15:25] !log tools.jouncebot Updated to f14640f and restarted
[21:15:31] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.jouncebot/SAL, Master
[21:21:39] andrewbogott: I think the suspend / resume cycle fucked up NTP
[21:21:41] Exception: Expected x-www-form-urlencoded response from MediaWiki, but got something else: b'Error: An error occurred in the OAuth protocol: Expired timestamp, yours 1465411919, ours 1465420893'
[21:22:06] yuvipanda: that's quite possible, although I would expect it to resync...
[21:22:09] if not we can do that with salt
[21:22:10] yeah me too
[21:22:12] but not yet :|
[21:22:15] andrewbogott: can you do that?
[21:22:22] yep, googling :)
[21:22:46] thanks
[21:23:25] yuvipanda: what host are you seeing the problem on?
[21:23:40] andrewbogott: the tools-worker-xxx hosts
[21:23:44] do you want me to find out which one?
[21:23:57] yeah, if you can find a couple of test cases that'd be good
[21:24:06] at least Node: tools-worker-1007.tools.eqiad.wmflabs/10.68.19.114
[21:46:29] Quarry broken. Known issue?
[21:46:35] Attempting to login generates a 5002
[21:46:37] *502
[21:46:48] Matthew_: how long has it been broken?
[21:47:00] I don't know. I just attempted to log in.
[21:47:09] ok, give it a minute or two?
[21:47:26] Okay.
[21:47:33] Matthew_: can you try any other OAuth setup?
[21:47:43] Sure, give me two ticks.
[21:48:20] xtools went through.
[21:48:38] hmm
[21:49:10] The url in question: https://quarry.wmflabs.org/oauth-callback?oauth_verifier=&oauth_token=
[21:50:00] I can PM it, but my understanding of oauth is that you could log in as me if I gave it out. Could be wrong.
[21:50:19] yeah there's a time offset difference between labs and prod
[21:50:25] because of the suspend and resumes I think
[21:52:37] Okay. Let me know if I can help you any more.
[21:55:00] Matthew_: try again in like 5 mins and let me know?
[21:55:03] * yuvipanda is at a conference etc
[21:55:07] Okay.
[22:01:25] yuvipanda: clocks in *.tools.* should be reasonable now… let me know if you find anything that's off
[22:01:56] andrewbogott: ok, checking again
[22:03:03] andrewbogott: works on paws now \o/
[22:03:09] ok, good
[22:03:11] Matthew_: verifier and token are one-time only
[22:03:15] andrewbogott: do for rest maybe?
[22:03:15] I'll do the rest of labs after this last sus/res job finishes
[22:03:33] if they don't work the first time, you probably need to go through the handshake again
[22:04:21] andrewbogott: thanks
[22:12:22] yuvipanda andrewbogott : Negative, still broken.
[22:14:35] Matthew_: yeah, need to wait for andrewbogott to finish running the ntp syncer
[22:14:43] andrewbogott: what's the command? I can probably manually run it on quarry
[22:14:45] Okay.
[22:15:45] ntpd -q; service ntpd restart
[22:15:50] but I'm running that on quarry now
[22:16:00] ok
[22:16:05] ah, sorry, it's service ntp restart
[22:16:18] anyway… did that help?
[22:16:40] Matthew_: ^
[22:16:46] Still 502
[22:18:09] Exception: Identity issued 70.8694241047 seconds in the future!
[22:18:22] hm, so whatever i'm doing doesn't help :(
[22:18:28] yuvipanda: maybe you can google better than me?
[22:18:33] oh, you're at a conf
[22:19:03] yuvipanda: what host is having that problem?
[22:19:10] Matthew_: try now?
[22:19:13] andrewbogott: quarry-main-01
[22:19:14] I just did
[22:19:17] sudo service ntp stop
[22:19:19] sudo ntpdate -s time.nist.gov
[22:19:21] sudo service ntp start
[22:19:23] and that seems to have fixed it
[22:19:27] ok
[22:19:28] Worked.
[22:19:33] \o/
[22:19:35] andrewbogott: can you salt that?
[22:19:41] yep
[22:20:07] thanks
[22:20:37] I thought that I'd determined that ntpdate wasn't present on most labs hosts
[22:20:41] but we will see what salt thinks
[22:21:56] Labs, Tool-Labs, Patch-For-Review: Figure out a way to keep MerlBot running when the HTTP POST loophole is closed - https://phabricator.wikimedia.org/T121279#2366280 (bd808) I have been in contact with @Bmueller and others in the dewiki community about how to proceed with helping @Merl in advance of...
[22:22:01] Labs, Tool-Labs, Patch-For-Review: Figure out a way to keep MerlBot running when the HTTP POST loophole is closed - https://phabricator.wikimedia.org/T121279#2366284 (bd808) I've done an audit of the `*.qsub` files in `/data/project/merlbot` and found that there are 28 job control files using `/usr/b...
[22:23:26] Thank you all.
[22:25:31] yuvipanda: Hey Yuvi, Julien would look into livingstyleguide
[22:25:46] in order to debug it
[22:26:49] Volker_E: ok! am at a conference for the next two days and then dealing with visa stuff so won't be around so much.
[22:27:13] could you provide him with access to the labs instance?
[22:28:16] Volker_E: wikitech.wikimedia.org/wiki/Help:Access has docs and stuff, but I won't be able to do much myself before monday. sorry
[22:30:25] andrewbogott: does wikitech need its own scap pull after a sync?
[22:30:55] tgr: it should act like any other appserver
[22:32:03] Volker_E: are you not an admin in that project? I can help fix that for you.
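The OAuth failure quoted earlier in the log reports both sides' timestamps ("yours 1465411919, ours 1465420893"), so the clock drift can be worked out directly. A minimal sketch follows, assuming a hypothetical 300-second tolerance (the log does not say what window MediaWiki's OAuth actually enforces); the function names are illustrative and not part of any tool mentioned here.

```python
# Hypothetical tolerance for illustration; the real server-side window
# is not stated anywhere in the log.
ASSUMED_TOLERANCE_SECONDS = 300


def clock_skew_seconds(client_ts: float, server_ts: float) -> float:
    """Return how far the client clock lags the server clock.

    Positive means the client is behind, negative means it is ahead.
    """
    return server_ts - client_ts


def within_tolerance(client_ts: float, server_ts: float,
                     tolerance: float = ASSUMED_TOLERANCE_SECONDS) -> bool:
    """True if the two clocks differ by no more than `tolerance` seconds."""
    return abs(clock_skew_seconds(client_ts, server_ts)) <= tolerance


if __name__ == "__main__":
    # Values copied from the "Expired timestamp" error in the log.
    yours, ours = 1465411919, 1465420893
    skew = clock_skew_seconds(yours, ours)
    print(f"client is {skew} s behind")  # 8974 s, roughly 2 h 29 min
    print("acceptable:", within_tolerance(yours, ours))
```

A drift of that size is far outside any plausible timestamp window. The later "Identity issued 70.8694241047 seconds in the future!" error is the same problem at a smaller scale, which fits the report that restarting ntp alone was not enough and a one-off `ntpdate` step was needed on quarry-main-01.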
[22:34:00] Volker_E: you are an admin in the design project, so you can add members using https://wikitech.wikimedia.org/wiki/Special:NovaProject
[22:45:14] bd808: would you add jgirault to the list, I haven't done this yet :}
[22:46:39] !log design Added JGirault as projectadmin at request of VolkerE
[22:46:43] {{done}
[22:46:47] }
[22:49:33] bd808: Thank you! :)
[22:50:03] bd808: thanks
[22:50:23] yw. have fund with the styleguide
[22:50:27] *fun
[22:52:28] Labs, Labs-Infrastructure, Operations, Release-Engineering-Team, Continuous-Integration-Infrastructure (phase-out-gallium): Firewall rules for labs support host to communicate with contint1001.eqiad.wmnet (new gallium) - https://phabricator.wikimedia.org/T137323#2364905 (Dzahn) for the flows...
[22:57:06] Labs, Labs-Infrastructure, Operations, Release-Engineering-Team, Continuous-Integration-Infrastructure (phase-out-gallium): Firewall rules for labs support host to communicate with contint1001.eqiad.wmnet (new gallium) - https://phabricator.wikimedia.org/T137323#2366395 (Dzahn) Same for labno...
[22:57:34] Labs, Tool-Labs, Patch-For-Review: Figure out a way to keep MerlBot running when the HTTP POST loophole is closed - https://phabricator.wikimedia.org/T121279#2366396 (bd808) The job control files have been edited. I have left the changes as uncommitted diffs in `/data/project/merlbot` making them eas...
[22:58:27] Okay docs aren't very clear on this so I'll ask. If I connect to c1.labsdb, do I have visibility to all of the replicas? Current xtools is using c2, but that has supposedly been removed.
[23:03:31] Matthew_: the safest thing to do is to connect to the various aliases based on which slice the db you want is grouped with. There is a table in meta_p called "wiki" that tells which slice each wiki db lives on
[23:04:10] So... establish connections to each slice and use the right one? My framework won't let me establish database connections on the fly afaik.
[23:05:52] there are only 7 slices (s1 ... s7) so at worst you could set those all up statically.
[23:06:11] the meta_p database is on s7 for the lookup queries
[23:06:50] you can see a PHP example of this sort of behavior at https://tools.wmflabs.org/replag/?source
[23:08:19] Okay. Hm.
[23:08:58] I wish there was a one-stop shop.
[23:09:53] that's a lot of data to put on one db server
[23:10:09] I was thinking aliasing. But eh.
[23:10:09] Matthew_: c1.labsdb will also have all replicas
[23:10:13] and so does c3 etc
[23:10:23] but the c1 / c3 names aren't guaranteed to be stable forever I think
[23:10:28] right
[23:10:38] Okay.
[23:10:44] and in theory they can be out of sync
[23:10:48] I'll do more research.
[23:11:13] Thank you.
[23:29:27] Labs, Labs-Infrastructure, Operations, Release-Engineering-Team, and 2 others: Firewall rules for labs support host to communicate with contint1001.eqiad.wmnet (new gallium) - https://phabricator.wikimedia.org/T137323#2366543 (Dzahn) |--|--|--|--|--|--|--|-- | TCP | scandium | 10.64.4.12 | contin...
[23:41:48] Labs, Tool-Labs: tools-webgrid-lighttpd-1408 is not responding - https://phabricator.wikimedia.org/T137306#2366570 (scfc) Open>Resolved a:chasemp>scfc ``` scfc@tools-bastion-03:~$ qmod -e '*@tools-webgrid-lighttpd-1408' scfc@tools-bastion-03.tools.eqiad.wmflabs changed state of "webgrid-l...
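The advice on slices above (open one connection per slice statically, then pick the right one from a lookup in the meta_p "wiki" table) can be sketched roughly as follows. This is an illustration under stated assumptions, not the replag tool's code: the `SlicePool` class, the `connect` callable, the `<slice>.labsdb` hostname pattern, and the shape of the lookup rows as `(dbname, slice)` pairs are all hypothetical stand-ins. The sample rows are placeholders too.

```python
from typing import Callable, Dict, Iterable, Tuple, TypeVar

Conn = TypeVar("Conn")

# The log says there are seven slices, s1 through s7.
SLICES = [f"s{n}" for n in range(1, 8)]


class SlicePool:
    """Hold one statically created connection per slice.

    `connect` is any callable that opens a connection for a hostname;
    `slice_rows` is an iterable of (dbname, slice) pairs, assumed to
    have been read once from the meta_p "wiki" table.
    """

    def __init__(self, connect: Callable[[str], Conn],
                 slice_rows: Iterable[Tuple[str, str]]) -> None:
        # Assumed hostname pattern; adjust to the real aliases.
        self._conns: Dict[str, Conn] = {
            name: connect(f"{name}.labsdb") for name in SLICES
        }
        self._slice_of: Dict[str, str] = dict(slice_rows)

    def connection_for(self, dbname: str) -> Conn:
        """Return the pre-opened connection for the slice holding dbname."""
        try:
            slice_name = self._slice_of[dbname]
        except KeyError:
            raise LookupError(f"no slice known for database {dbname!r}")
        return self._conns[slice_name]


if __name__ == "__main__":
    # Stand-in connect function so the sketch runs without a database.
    fake_connect = lambda host: f"conn:{host}"
    rows = [("enwiki", "s1"), ("metawiki", "s7")]  # illustrative rows only
    pool = SlicePool(fake_connect, rows)
    print(pool.connection_for("enwiki"))
```

Because every connection is created up front, this fits a framework that cannot open connections on the fly. The trade-off is seven open connections even when a request only touches one slice.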