[01:04:35] any admins around? Please approve https://wikitech.wikimedia.org/wiki/Shell_Request/Kevinkreiser
[01:53:03] cron on tools seems to be down. Scheduled jobs didn't get executed and the crontab command just blocks
[02:12:08] sitic: looking
[02:12:15] can someone here restart a tool?
[02:12:51] https://tools.wmflabs.org/croptool
[02:13:40] comets: done
[02:13:48] :DD
[02:14:02] areh wah ;)
[02:14:38] sitic: hmm, I can’t reach tools-submit
[02:14:45] imma reboot it see what happens
[02:15:09] !log tools rebooted tools-submit, was not responding
[02:15:15] Logged the message, Master
[02:16:32] still down yuvi :(
[02:20:59] crontab seems to be back (I see my old crontab now), thanks YuviPanda :-)
[02:21:05] yw
[02:21:10] comets: still?
[02:21:17] comets: which one is still down?
[02:21:33] https://tools.wmflabs.org/?tool=croptool
[02:21:47] comets: https://tools.wmflabs.org/croptool/ wfm
[02:22:18] yeah is still down O_O
[02:23:37] comets: It loads for me, but when I connect with OAuth I get an internal server error
[02:23:40] comets: oh, I see what you mean.
[02:23:40] yeah
[02:24:26] I moved it back to precise, works now
[02:24:30] comets: try now
[02:25:10] woo hoo \o/ chal para ..
[03:28:39] !log tools.zoomviewer disabled inefficient webwatcher script
[03:28:42] Logged the message, Master
[03:29:45] Tool-Labs: Get rid of jlocal - https://phabricator.wikimedia.org/T95796#1200627 (yuvipanda) NEW
[03:31:04] Tool-Labs: Get rid of jlocal - https://phabricator.wikimedia.org/T95796#1200634 (yuvipanda) It wasn't puppetized either, so it didn't exist on the new bastions. I've killed it on tools-submit.
[03:34:26] Tool-Labs, Patch-For-Review: Get rid of jlocal - https://phabricator.wikimedia.org/T95796#1200638 (yuvipanda) I've symlinked jsub to jlocal so that tools that reference it do not break, and am manually removing the references to the webwatcher script.
[03:44:37] Tool-Labs, Patch-For-Review: Get rid of jlocal - https://phabricator.wikimedia.org/T95796#1200644 (jeremyb-phone)
[03:45:34] Tool-Labs, Patch-For-Review: Get rid of jlocal - https://phabricator.wikimedia.org/T95796#1200647 (yuvipanda) A *lot* of crons seem to refer to them. The common webwatcher one should already be handled by service manifests, so it's safe to disable them, I think.
[04:11:13] Tool-Labs, Patch-For-Review: Get rid of jlocal - https://phabricator.wikimedia.org/T95796#1200663 (yuvipanda) Alright, I had to revert that because plenty of tools on cron rely on being run from an exec host, which seems to be done like this. I'm going to go through and disable all the ones that seem ok t...
[04:18:38] Tool-Labs: Crontabs are not backed up - https://phabricator.wikimedia.org/T95798#1200671 (yuvipanda) NEW
[04:36:33] Tool-Labs, Patch-For-Review: Get rid of jlocal - https://phabricator.wikimedia.org/T95796#1200689 (yuvipanda) Alright, I've killed all the ones that could safely be killed, and hope the others don't cause tools-submit to die again... manifests are, again, the long term solution to this, I think.
[06:36:22] PROBLEM - Puppet failure on tools-exec-07 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[06:38:30] PROBLEM - Puppet failure on tools-shadow is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[06:41:57] https://tools.wmflabs.org/magnustools/multistatus.html could someone please restart catscan ?
[06:52:54] GerardM-: looking
[06:52:58] it shouldn’t be down
[06:53:50] correct, but the tool shows that it is
[06:54:04] yeah, I see that
[06:54:09] I’m investigating to see what’s happening
[06:54:21] I put in a bunch of effort last week to make this stop from ever happening
[06:54:25] now looking at what I missed
[06:55:32] YuviPanda: I do notice that it has an effect ... the statistics for Reasonator are going up
[06:56:00] so thank you !!
[06:56:18] interesting.
[06:56:32] GerardM-: so basically I wrote a bigbrother replacement that should just keep tools up, no effort from maintainers
[06:56:38] and it just started it back up fine
[06:56:44] but for some reason it itself had crashed
[06:56:49] and I’m going to investigate why
[06:56:53] GerardM-: catscan should be up now
[06:57:03] shit happens
[06:57:05] thanks
[06:59:10] PROBLEM - Puppet failure on tools-exec-02 is CRITICAL: CRITICAL: 37.50% of data above the critical threshold [0.0]
[07:01:09] Yuvi ... I think I either wrote a mail or a phab item ... much of the history of Reasonator is gone
[07:01:21] is it possible to get the old and merge it ?
[07:01:22] RECOVERY - Puppet failure on tools-exec-07 is OK: OK: Less than 1.00% above the threshold [0.0]
[07:01:53] PROBLEM - Puppet failure on tools-webgrid-03 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[07:03:33] GerardM-: what do you mean ‘history of reasonator’?
[07:03:39] I’ve no idea what that means
[07:04:04] https://tools.wmflabs.org/reasonator/stats.php
[07:04:13] they are its page views
[07:04:27] you’ll have to ask magnus.
[07:04:31] I’ve no idea where that comes from
[07:04:34] and it is a testament to the health of Reasonator and by inference labs
[07:07:08] Tool-Labs: Page views of Reasonator are missing - https://phabricator.wikimedia.org/T95805#1200766 (GerardM) NEW a:Magnus
[07:07:43] Tool-Labs: Page views of Reasonator are missing - https://phabricator.wikimedia.org/T95805#1200774 (GerardM)
[07:08:02] Tool-Labs: Page views of Reasonator are missing - https://phabricator.wikimedia.org/T95805#1200776 (yuvipanda) Open>Invalid Please file bugs for reasonator at https://bitbucket.org/magnusmanske/reasonator/issues?status=new&status=open.
[07:08:30] RECOVERY - Puppet failure on tools-shadow is OK: OK: Less than 1.00% above the threshold [0.0]
[07:08:42] The data IS at Labs ...
[07:09:26] it’s a support request for an individual tool
[07:09:33] Tool-Labs: Page views of Reasonator are missing - https://phabricator.wikimedia.org/T95805#1200779 (GerardM) Invalid>Open
[07:09:35] if magnus determines that the problem is with labs then he’ll open a ticket here
[07:10:10] Tool-Labs: Page views of Reasonator are missing - https://phabricator.wikimedia.org/T95805#1200780 (yuvipanda) Open>Invalid To repeat, the problem is with an individual tool. Please report it to the tool maintainer at the appropriate place.
[07:17:46] I feel this is a disincentive to report bugs
[07:18:27] I am a casual contributor to all of this and it feels like more bureaucratic and difficult than needed
[07:19:04] I was just pointing out the correct place for you to file your bugs so magnus can see them.
[07:19:18] Does Magnus get the message ?
[07:19:38] all other bugs for reasonator are filed on bitbucket
[07:19:47] right
[07:20:03] You are perfectly right in what you say
[07:20:39] if you want to use phabricator for it, talk to magnus and ask him to move bug reporting for his tools to phabricator. he can easily request a project
[07:21:36] I typically talk to Magnus directly... I was trying to be nice
[07:23:17] I am too. I could’ve just closed it and asked you to find out where to file it
[07:23:34] I pointed you to where to file it, after restarting things for you past midnight on a friday night
[07:24:08] RECOVERY - Puppet failure on tools-exec-02 is OK: OK: Less than 1.00% above the threshold [0.0]
[07:26:58] RECOVERY - Puppet failure on tools-webgrid-03 is OK: OK: Less than 1.00% above the threshold [0.0]
[07:40:44] Tool-Labs, Patch-For-Review: Get rid of jlocal - https://phabricator.wikimedia.org/T95796#1200796 (scfc) Cf. T66988 for the necessity; granted, "Tool Labs: support some non-work prefixes in crontabs" makes the addition look like an oversight.
[07:44:07] Tool-Labs, Patch-For-Review: Get rid of jlocal - https://phabricator.wikimedia.org/T95796#1200800 (yuvipanda) Perhaps we can make one host available that can be a submit host as well, and people can schedule jobs on that one?
[09:14:49] Tool-Labs, Patch-For-Review: Get rid of jlocal - https://phabricator.wikimedia.org/T95796#1200863 (Giftpflanze) The change still seems to be in effect. That means that if I change my crontab, jlocal is prepended with jsub, which renders it unusable. Please effectively revert it.
[09:21:12] Tool-Labs, Patch-For-Review: Get rid of jlocal - https://phabricator.wikimedia.org/T95796#1200864 (yuvipanda) Yes, I've reverted it, should be back to normal in next puppet run (about 20mins?) This needs to be fixed 'cleanly'.
[10:51:08] Tool-Labs, Patch-For-Review: Get rid of jlocal - https://phabricator.wikimedia.org/T95796#1200878 (valhallasw) > Why does it exist? It just runs the 'job' locally. https://wikitech.wikimedia.org/wiki/Help:Tool_Labs/Grid#Creating_a_crontab > If your cron entry only includes a brief script that, itself, s...
[11:09:53] Tool-Labs, Patch-For-Review: Get rid of jlocal - https://phabricator.wikimedia.org/T95796#1200885 (Betacommand) I am one of the people who pushed for a tool like jlocal in the first place. It cannot be removed without causing a lot of problems. All but one of my crons use jsub, the one that uses jlocal...
[11:15:50] Tool-Labs: Make admin www relocatable - https://phabricator.wikimedia.org/T95808#1200886 (scfc) NEW
[11:39:06] Tool-Labs, Patch-For-Review: Get rid of jlocal - https://phabricator.wikimedia.org/T95796#1200910 (scfc) >>! In T95796#1200800, @yuvipanda wrote: > Perhaps we can make one host available that can be a submit host as well, and people can schedule jobs on that one? I wasn't convinced then that "schedule bo...
[12:07:27] Tool-Labs-tools-tsreports: oldpagesvswikidata: add page creation date - https://phabricator.wikimedia.org/T87340#1200951 (valhallasw) a:valhallasw>None
[12:42:51] [flask-mwoauth] valhallasw pushed 1 new commit to master: http://git.io/vvJeK
[12:42:51] flask-mwoauth/master 9bb1144 Merlijn van Deen: Added info on the demo app
[12:43:04] why did they screw up the editing interface :<
[12:43:37] [flask-mwoauth] valhallasw pushed 1 new commit to master: http://git.io/vvJe7
[12:43:37] flask-mwoauth/master 76d8fbf Merlijn van Deen: Update README.rst
[12:44:06] oh it's RST
[12:44:07] no wonder
[12:44:08] :<
[12:46:43] [flask-mwoauth] valhallasw pushed 1 new commit to master: http://git.io/vvJvz
[12:46:43] flask-mwoauth/master 10144a5 Merlijn van Deen: rst not markdown :/
[13:58:16] Tool-Labs: Tool Labs: Provide anonymized view of the user_properties table - https://phabricator.wikimedia.org/T60196#1201019 (He7d3r)
[13:59:58] Tool-Labs: Tool Labs: Provide anonymized view of the user_properties table - https://phabricator.wikimedia.org/T60196#630086 (He7d3r)
[14:00:01] Tool-Labs-tools-Database-Queries, Analytics, MediaWiki-User-preferences: DBQ-197 User preferences usage on Portuguese Wikipedia - https://phabricator.wikimedia.org/T61480#1201025 (He7d3r)
[14:05:26] Tool-Labs-tools-Database-Queries, Analytics, MediaWiki-User-preferences: DBQ-197 User preferences usage on Portuguese Wikipedia - https://phabricator.wikimedia.org/T61480#1201028 (He7d3r) > use ptwiki_p; > SELECT up_property,up_value,COUNT(*) FROM user_properties WHERE up_property LIKE 'gadget%' GROU...
[14:23:23] Tool-Labs-tools-Database-Queries, Analytics, MediaWiki-User-preferences: Gadget usage statistics for Portuguese Wikipedia - https://phabricator.wikimedia.org/T61480#1201037 (He7d3r)
[14:52:11] Tool-Labs: Fix htmlpurifier deployment - https://phabricator.wikimedia.org/T95815#1201040 (scfc) NEW a:scfc
[14:52:34] (PS1) Tim Landscheidt: Don't use deprecated PHP short tags [labs/toollabs] - https://gerrit.wikimedia.org/r/203536 (https://phabricator.wikimedia.org/T95688)
[14:54:39] (CR) Tim Landscheidt: [C: 2 V: 2] "Tested live." [labs/toollabs] - https://gerrit.wikimedia.org/r/203536 (https://phabricator.wikimedia.org/T95688) (owner: Tim Landscheidt)
[15:03:46] PROBLEM - ToolLabs Home Page on toollabs is CRITICAL: HTTP CRITICAL: HTTP/1.1 200 OK - string 'Magnus' not found on 'http://tools.wmflabs.org:80/' - 2698 bytes in 0.007 second response time
[15:08:49] RECOVERY - ToolLabs Home Page on toollabs is OK: HTTP OK: HTTP/1.1 200 OK - 764180 bytes in 2.356 second response time
[15:10:16] Tool-Labs-tools-Database-Queries, Analytics, MediaWiki-User-preferences: Gadget usage statistics for Portuguese Wikipedia - https://phabricator.wikimedia.org/T61480#1201059 (Halfak) {F111522} Here's my query: ``` SELECT SUBSTR(up_property, 8) as gadget, CAST(up_value AS SIGNED) AS enabled,...
[15:10:54] (PS1) Tim Landscheidt: Do not use *any* deprecated PHP short tag [labs/toollabs] - https://gerrit.wikimedia.org/r/203538 (https://phabricator.wikimedia.org/T95688)
[15:13:17] (CR) Tim Landscheidt: [C: 2 V: 2] "a) My regexes missed that one." [labs/toollabs] - https://gerrit.wikimedia.org/r/203538 (https://phabricator.wikimedia.org/T95688) (owner: Tim Landscheidt)
[15:15:50] Tool-Labs, ToolLabs-Goals-Q4: Make webservice default to trusty on toollabs - https://phabricator.wikimedia.org/T94788#1201074 (scfc)
[15:15:51] Tool-Labs, Patch-For-Review: Admin www depends on short_open_tag = On - https://phabricator.wikimedia.org/T95688#1201071 (scfc) Open>Resolved a:scfc Web service is running now on Trusty: ``` tools.admin@tools-bastion-01:~$ cat .bigbrotherrc jstart -N toolhistory -mem 350M /data/project/admin/bi...
[15:20:11] Tool-Labs-tools-Database-Queries, Analytics, MediaWiki-User-preferences: Gadget usage statistics for Portuguese Wikipedia - https://phabricator.wikimedia.org/T61480#1201075 (He7d3r) Open>Resolved a:He7d3r Thanks!
[15:25:19] PROBLEM - Puppet failure on tools-webproxy-02 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[15:30:19] PROBLEM - Puppet failure on tools-webproxy-01 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[15:50:20] RECOVERY - Puppet failure on tools-webproxy-02 is OK: OK: Less than 1.00% above the threshold [0.0]
[16:00:15] RECOVERY - Puppet failure on tools-webproxy-01 is OK: OK: Less than 1.00% above the threshold [0.0]
[16:51:55] YuviPanda|zzz: why are you sleeping? Coren_away why are you away?
[16:54:43] Can I get an admin to tell me how to restart wikiviewstats? I try webservice stop and it says webservice not running yet qstat reports 2 jobs.
[16:56:37] Weekend? :P
[17:14:52] T13|mobile: qdel
[17:15:27] Yes Betacommand, but how do I restart the service? lol
[17:15:54] T13|mobile: webservice2 perhaps?
[17:15:55] T13|mobile: you first kill them using qdel
[17:16:06] then webservice2 start
[17:16:13] JohnLewis: webservice2 says no jobs
[17:16:46] Will try that Betacommand
[17:17:09] T13|mobile: wait a few seconds after the last qdel before trying to start it
[17:19:44] I did qdel '*'
[17:20:08] Waited and checked with qstat until there were no jobs
[17:20:15] Did webservice2 start
[17:20:24] It said starting
[17:20:39] I checked with qstat and there were no jobs
[17:20:52] Did webservice start and it showed up.
[17:20:53] T13|mobile: do the web links work?
[17:21:05] Checking that now
[17:23:20] Betacommand: tool is up but says no db connection
[17:23:23] http://tools.wmflabs.org/wikiviewstats/
[17:24:02] T13|mobile: that means something is wrong with your code
[17:24:54] I'll have to look into it later. Doing it from my mobile phone isn't very productive.
[17:30:10] T13|mobile: Coren looked once into it, see the last two lines in https://phabricator.wikimedia.org/T91320
[17:31:59] Tool-Labs-xTools: wikiviewstats - No db-connection - https://phabricator.wikimedia.org/T91320#1201130 (Technical13) p:Triage>High a:Technical13
[17:33:11] Tool-Labs-xTools: Wikiviewstats does not support Wikidata - https://phabricator.wikimedia.org/T63833#1201132 (Technical13) p:Triage>Normal
[17:34:24] what kind of error message is that?
[17:34:34] "Again something is messed up after Tool Labs database maintenance Sorry for that!"
[17:34:51] blaming tools for every single error in the tool?
[17:35:49] Glaisher: that was how Hedonil left it. xTools hasn't had time to make any changes since takeover.
[17:36:19] I've claimed the ticket and will work on it over the next couple weeks.
[17:37:53] (PS1) BearND: Update SDK and more [labs/tools/wikipedia-android-builds] - https://gerrit.wikimedia.org/r/203552
[17:39:06] (CR) BearND: [C: 2 V: 2] Update SDK and more [labs/tools/wikipedia-android-builds] - https://gerrit.wikimedia.org/r/203552 (owner: BearND)
[17:41:38] (PS1) John F. Lewis: wmt: change tag name [labs/tools/wikibugs2] - https://gerrit.wikimedia.org/r/203564
[21:31:15] PROBLEM - Puppet failure on tools-webproxy-01 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[21:32:34] Wikimedia-Labs-wikitech-interface, operations, Regression: Some wikitech.wikimedia.org thumbnails broken (404) - https://phabricator.wikimedia.org/T93041#1201254 (Andrew) a:Andrew
[21:39:02] I come back and I have a dozen or so emails with qgil asking phacility for paid prioritization
[21:40:10] I guess that's one way to do it :)
[21:43:18] PROBLEM - Puppet failure on tools-webproxy-02 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[21:49:13] !log tools moved /data/project/admin/toollabs to /data/project/admin/toollabsbak on tools-webproxy-01 and tools-webproxy-02 to fix permission errors
[21:49:16] Logged the message, dummy
[21:49:46] PROBLEM - ToolLabs Home Page on toollabs is CRITICAL: HTTP CRITICAL: HTTP/1.1 500 Internal Server Error - string 'Magnus' not found on 'http://tools.wmflabs.org:80/' - 371 bytes in 0.006 second response time
[21:51:27] andrewbogott: ^
[21:51:32] That's probably the move
[21:51:40] YuviPanda|zzz: weirdly I think it isn't
[21:51:42] * YuviPanda|zzz is at a bank
[21:51:43] although surely related
[21:51:44] Oh?
[21:51:47] Hmm
[21:51:57] * YuviPanda|zzz reads backscroll
[21:52:54] or maybe it is...
[21:53:01] for a moment I couldn’t get nginx to start but now it seems fine
[21:54:26] The admin stuff is all a bit hacky
[21:54:54] yes!
[21:54:59] I can’t tell what’s happening.
[21:55:03] Heh
[21:55:10] The puppetized git clone was failing, so I moved the old clone out of the way. After that puppet got happy.
[21:55:12] but then...
[21:55:32] Maybe someone was hacking locally?
[21:55:32] well, I moved it back
[21:55:39] and now puppet is broken again but the site works.
[21:55:40] So
[21:55:40] It also has some jobs on the grid
[21:56:17] the diff between the two (the proper git checkout and what was there before) is enormous
[21:56:44] Sounds like someone has been locally hacking maybe
[21:56:52] so yes, local hacks breaking puppet but also required for things to work.
[21:56:56] very bad!
[21:57:29] YuviPanda|zzz: who would be hacking? tim, coren, that it?
[21:59:37] andrewbogott: get rid of the git clone from puppet and open a bug?
[21:59:48] RECOVERY - ToolLabs Home Page on toollabs is OK: HTTP OK: HTTP/1.1 200 OK - 764195 bytes in 2.813 second response time
[21:59:51] andrewbogott: tim had a patch merged earlier today so maybe hi.
[21:59:53] Him
[22:01:03] Labs, Tool-Labs: Local hacks in /data/project/admin/toollabs on the web proxies - https://phabricator.wikimedia.org/T95821#1201286 (Andrew) NEW
[22:01:13] YuviPanda|zzz: ^ and I’m going to drop it.
[22:02:42] andrewbogott: yup thanks
[22:03:20] RECOVERY - Puppet failure on tools-webproxy-02 is OK: OK: Less than 1.00% above the threshold [0.0]
[22:16:16] RECOVERY - Puppet failure on tools-webproxy-01 is OK: OK: Less than 1.00% above the threshold [0.0]
[22:31:47] Labs, Tool-Labs: Local hacks in /data/project/admin/toollabs on the web proxies - https://phabricator.wikimedia.org/T95821#1201302 (scfc) a:scfc
[22:37:06] Labs, Tool-Labs: Local hacks in /data/project/admin/toollabs on the web proxies - https://phabricator.wikimedia.org/T95821#1201303 (scfc) Open>Resolved I had played around with it, but I was sure that when I moved on, `git status` was clean (and the homepage working). Hmmm. Anyway, Puppet works no...