[00:10:55] !log tools.openstack-browser Deployed c491ace to display proxy information on project pages. T45580
[00:10:59] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.openstack-browser/SAL
[00:11:00] T45580: Automatically updated list of all configured domains - https://phabricator.wikimedia.org/T45580
[00:12:17] (Draft2) Mess: Add new file list "File con titolo omonimo su Commons" [labs/tools/lists] - https://gerrit.wikimedia.org/r/349348
[00:19:16] PROBLEM - Puppet errors on tools-exec-1432 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0]
[00:30:17] Labs, Labs-Infrastructure, Community-Tech-Tool-Labs, Patch-For-Review: invisible-unicorn (dynamicproxy) should provide an easy way to see where a host routes without knowing the project - https://phabricator.wikimedia.org/T115752#3200381 (bd808) The read-only GET routes on the domainproxy instanc...
[00:33:45] Tool-Labs-tools-Xtools, Community-Tech: Investigation: XTools routing - https://phabricator.wikimedia.org/T163283#3192212 (Niharika) Why do we need routing again? Can't we have a copy of the repo in each of the tools?
[00:59:17] RECOVERY - Puppet errors on tools-exec-1432 is OK: OK: Less than 1.00% above the threshold [0.0]
[01:00:14] Tool-Labs-tools-Xtools, Community-Tech: Investigation: XTools routing - https://phabricator.wikimedia.org/T163283#3192212 (Samwilson) It would be fine to have a copy, and I think the only drawback would be URLs like `/xtools-ec/ec/blah` and `/xtools-articleinfo/articleinfo/blah`, so it'd be nice to rewri...
[01:24:29] PROBLEM - Puppet errors on tools-exec-1433 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0]
[01:59:26] RECOVERY - Puppet errors on tools-exec-1433 is OK: OK: Less than 1.00% above the threshold [0.0]
[03:34:41] PROBLEM - Puppet errors on tools-exec-1437 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0]
[03:59:41] RECOVERY - Puppet errors on tools-exec-1437 is OK: OK: Less than 1.00% above the threshold [0.0]
[04:25:40] PROBLEM - Puppet errors on tools-exec-1437 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[05:00:40] RECOVERY - Puppet errors on tools-exec-1437 is OK: OK: Less than 1.00% above the threshold [0.0]
[05:33:25] PROBLEM - Puppet errors on tools-exec-1441 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0]
[05:37:04] !log project-proxy Added BryanDavis (self) as admin
[05:37:07] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Project-proxy/SAL
[06:08:25] RECOVERY - Puppet errors on tools-exec-1441 is OK: OK: Less than 1.00% above the threshold [0.0]
[06:29:27] Tool-Labs-tools-Xtools, Community-Tech: Investigation: XTools routing - https://phabricator.wikimedia.org/T163283#3200732 (MusikAnimal) >>! In T163283#3200660, @Matthewrbowker wrote: > Symfony must be installed in every Tool Labs tool you create (This is a restriction of Composer). This creates a huge m...
[06:44:24] Tool-Labs-tools-Xtools, Community-Tech: Investigation: XTools routing - https://phabricator.wikimedia.org/T163283#3200774 (Matthewrbowker) >>! In T163283#3200732, @MusikAnimal wrote: >>>! In T163283#3200660, @Matthewrbowker wrote: >> Symfony must be installed in every Tool Labs tool you create (This is a...
[08:52:41] Change on wikitech.wikimedia.org a page Nova Resource:Tools/Access Request/S3w3nofficial was created, changed by S3w3nofficial link https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Access_Request/S3w3nofficial edit summary: Created page with "{{Tools Access Request |Justification=I would like to cooperate with Urbanecm on commons-mass-description tool. To be added as a maintainer I need to have access to toollabs p..."
[10:39:57] (CR) Ricordisamoa: [C: 2] Change group 1 item, add alkali-metal class [labs/tools/ptable] - https://gerrit.wikimedia.org/r/348696 (owner: Ricordisamoa)
[10:40:20] (Merged) jenkins-bot: Change group 1 item, add alkali-metal class [labs/tools/ptable] - https://gerrit.wikimedia.org/r/348696 (owner: Ricordisamoa)
[11:07:46] (CR) Ricordisamoa: "According to https://en.wikipedia.org/wiki/Periodic_table#Overview Q19753344 should be added to hydrogen and nitrogen" [labs/tools/ptable] - https://gerrit.wikimedia.org/r/348696 (owner: Ricordisamoa)
[13:43:16] !log tools T161898 clush -g all 'sudo puppet agent --disable "rollout nfs-mount-manager"'
[13:43:21] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[13:43:21] T161898: IO issues for Tools instances flapping with iowait and puppet failure - https://phabricator.wikimedia.org/T161898
[13:45:24] Labs, Tool-Labs: labsdb1001 crashing regularly in the last 2 days due to OOM - https://phabricator.wikimedia.org/T163001#3201409 (Anomie) When I checked the 24-hour graph this morning I saw there was an approximately 10GB increase in memory usage on 2017-04-20 from about 13:44 to 14:15 UTC ([[https://gra...
[13:51:41] PROBLEM - Puppet errors on tools-exec-1437 is CRITICAL: CRITICAL: 11.11% of data above the critical threshold [0.0]
[13:52:39] ^ me :)
[14:01:58] Labs, Tool-Labs: s51053 is running unnecessarily long running queries on revision - https://phabricator.wikimedia.org/T163192#3201447 (jcrespo) For @marostegui: disabling an account on mysql for me means: ``` UPDATE mysql.user SET password=REVERSE(password) WHERE user='s51053'; ```
[14:02:08] Labs, Tool-Labs: s51053 is running unnecessarily long running queries on revision - https://phabricator.wikimedia.org/T163192#3201448 (chasemp) >>! In T163192#3201444, @jcrespo wrote: > @Chasemp @bd808 I got no answer from the user in a week's time. Fearing that the account may be unmaintained, and given...
[14:06:41] RECOVERY - Puppet errors on tools-exec-1437 is OK: OK: Less than 1.00% above the threshold [0.0]
[14:06:53] !log wikilabels deployed wikilabels-wmflabs-deploy:c63a65d
[14:06:56] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Wikilabels/SAL
[14:07:09] Labs, Tool-Labs: s51053 is running unnecessarily long running queries on revision - https://phabricator.wikimedia.org/T163192#3201470 (chasemp) Just a note for reference, we do have reconciliation logic that looks at ldap and creates users but it only looks at whether the account exists natively and does...
[14:08:05] Labs, Tool-Labs: s51053 is running unnecessarily long running queries on revision - https://phabricator.wikimedia.org/T163192#3201474 (jcrespo) I was wondering that and hoping it was the case.
[15:13:42] Labs, Labs-Infrastructure, artificial-intelligence: Provide large disk space to WikiBrain for memory-mapped file - https://phabricator.wikimedia.org/T161554#3135155 (Andrew) So, this question relates to both storage needs and also the appropriateness of Labs use: Is this giant storage use something...
[15:21:59] Labs, Labs-Infrastructure, artificial-intelligence: Provide large disk space to WikiBrain for memory-mapped file - https://phabricator.wikimedia.org/T161554#3201619 (Shilad) Good questions! The big files are statistical models. So they take a while to build (a day or two), but they can be easily recr...
[15:27:54] Labs, Labs-Infrastructure, artificial-intelligence: Provide large disk space to WikiBrain for memory-mapped file - https://phabricator.wikimedia.org/T161554#3201625 (Andrew)
[15:27:57] Labs, Tracking: New Labs project requests (tracking) - https://phabricator.wikimedia.org/T76375#3201624 (Andrew)
[15:36:08] Labs: IO issues for Tools instances flapping with iowait and puppet failure - https://phabricator.wikimedia.org/T161898#3201637 (chasemp) iowait errors over the past few weeks: ```chasemp_freenode_#wikimedia-labs_20170401.log 1 tools-grid-master 1 tools-webgrid-lighttpd-1415 chasemp_freenode_#wi...
[15:37:17] Labs: IO issues for Tools instances flapping with iowait and puppet failure - https://phabricator.wikimedia.org/T161898#3201638 (chasemp) I think we should create 10 tools-webgrid-lighttpd-14* instances to make up for the 20 lost precise ones and see how jobs and load shift. Making 2 tools-webgrid-generic i...
[15:59:09] Labs: IO issues for Tools instances flapping with iowait and puppet failure - https://phabricator.wikimedia.org/T161898#3201719 (chasemp) Is this meant to be up https://tools.wmflabs.org/hazard-bot/ ?
seen on https://wikitech.wikimedia.org/wiki/Tool_Labs/Migration_to_eqiad
[16:02:42] !log tools.chie-bot webservice restart
[16:02:44] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.chie-bot/SAL
[16:27:48] Tool-Labs-tools-Xtools, Community-Tech-Sprint: Create XTools API with namespace endpoint, using JS to update to namespace selector - https://phabricator.wikimedia.org/T162754#3173629 (MusikAnimal) Note T163527 means currently the API and namespace selector will break if you try certain projects like es.w...
[17:25:16] Tool-Labs-tools-Xtools, Community-Tech: Investigation: XTools routing - https://phabricator.wikimedia.org/T163283#3201951 (MusikAnimal) >>! In T163283#3201839, @bd808 wrote: > * If everything is kept in one large bundle, three to five years from now there will be another group of people trying to rescue...
[17:30:06] Tool-Labs-tools-Xtools, Community-Tech-Sprint: Investigation: XTools routing - https://phabricator.wikimedia.org/T163283#3201979 (MusikAnimal)
[17:30:34] Tool-Labs-tools-Xtools, Community-Tech-Sprint: Investigation: XTools routing - https://phabricator.wikimedia.org/T163283#3192212 (MusikAnimal) Sounds like this investigation has begun!
[17:36:07] Tool-Labs-tools-Xtools, Community-Tech-Sprint: Investigation: XTools routing - https://phabricator.wikimedia.org/T163283#3201997 (bd808) >>! In T163283#3201891, @MusikAnimal wrote: > @bd808 Solely in terms of disk space and "overhead", how bad is it to have the full Symfony framework running on each dedi...
[17:49:02] Labs, Tool-Labs: labsdb1001 crashing regularly in the last 2 days due to OOM - https://phabricator.wikimedia.org/T163001#3202081 (JackPotte)
[17:49:04] Labs, Tool-Labs: s51053 is running unnecessarily long running queries on revision - https://phabricator.wikimedia.org/T163192#3202079 (JackPotte) Resolved>Open Hello, Please forgive me to reopen this ticket but I've just discovered that the database user s51053 was actually me. So I've just pro...
[17:53:02] Tool-Labs-tools-Xtools, Community-Tech-Sprint: Investigation: XTools routing - https://phabricator.wikimedia.org/T163283#3202085 (bd808) >>! In T163283#3201951, @MusikAnimal wrote: > And again the "monolithic suite" is mostly Symfony. For me, the "monolithic suite" is placing N different end user desire...
[18:07:29] Tool-Labs-tools-Xtools, Community-Tech-Sprint: Investigation: XTools routing - https://phabricator.wikimedia.org/T163283#3202109 (MusikAnimal) Yup, and I pulled that number myself with https://tools.wmflabs.org/musikanimal I have my regrets :( We definitely don't want to do that for XTools, and I think w...
[18:14:00] Tool-Labs-tools-Xtools, Community-Tech-Sprint: Investigation: XTools routing - https://phabricator.wikimedia.org/T163283#3202116 (MusikAnimal) >>! In T163283#3202109, @MusikAnimal wrote: > Finally, it's important to bear in mind the "brand" of XTools. And that's also why I am really fond of https://xtoo...
[18:30:32] Labs, Tool-Labs-tools-Other: cewbot using '* * * * *' cron that could be replaced with .bigbrotherrc - https://phabricator.wikimedia.org/T163572#3202153 (bd808)
[20:15:06] PROBLEM - Host tools-webgrid-lighttpd-1422 is DOWN: CRITICAL - Host Unreachable (10.68.22.239)
[20:15:12] PROBLEM - Host tools-webgrid-lighttpd-1423 is DOWN: CRITICAL - Host Unreachable (10.68.20.249)
[20:15:18] PROBLEM - Host tools-webgrid-lighttpd-1426 is DOWN: CRITICAL - Host Unreachable (10.68.19.198)
[20:15:32] PROBLEM - Host tools-webgrid-lighttpd-1425 is DOWN: CRITICAL - Host Unreachable (10.68.22.213)
[20:16:04] PROBLEM - Host tools-webgrid-lighttpd-1424 is DOWN: CRITICAL - Host Unreachable (10.68.18.121)
[20:19:17] ^ That's all me, sorry for the noise
[20:22:24] RECOVERY - Host tools-webgrid-lighttpd-1425 is UP: PING OK - Packet loss = 0%, RTA = 0.61 ms
[20:23:00] RECOVERY - Host tools-webgrid-lighttpd-1422 is UP: PING OK - Packet loss = 0%, RTA = 0.68 ms
[20:23:38] RECOVERY - Host tools-webgrid-lighttpd-1426 is UP: PING OK - Packet loss = 0%, RTA = 1.24 ms
[20:23:42] RECOVERY - Host tools-webgrid-lighttpd-1424 is UP: PING OK - Packet loss = 0%, RTA = 4.87 ms
[20:23:52] RECOVERY - Host tools-webgrid-lighttpd-1423 is UP: PING OK - Packet loss = 0%, RTA = 1.63 ms
[20:27:26] andrewbogott: CI is a bit busy we have a huge amount of patches to land. Seems it is well behaving but I noticed labvirt1008 load is raising
[20:27:40] labvirt1007 as well
[20:27:50] ( based on https://grafana.wikimedia.org/dashboard/db/labs-capacity-planning?panelId=94&fullscreen&orgId=1&from=now-2h&to=now )
[20:27:53] hasharAway: yeah, I'm building a bunch of new nodes so that's probably taxing things
[20:28:03] ahhh
[20:28:43] also I just noticed labvirt1010 has 25-30 load while the rest of the fleet is more at 15-20
[20:28:57] maybe lavirt1010 has too many high cpu demanding instances?
[20:29:49] not sure… I'll keep an eye on it for a bit.
[20:33:29] hm, 29 test, 6 gate-and-submit...
wow :o
[20:33:50] Sagan most of them hasharAway :)
[20:36:23] Labs, Labs-Infrastructure, Security: horizon accepts the same 2FA token as wikitech - https://phabricator.wikimedia.org/T131638#2173430 (bd808) The MediaWiki code used for this was developed in {T144712} to support 2FA in #striker. Tokens are now validated by MediaWiki's OATH extension directly using...
[20:36:25] Labs, Labs-Infrastructure, Security: horizon accepts the same 2FA token as wikitech - https://phabricator.wikimedia.org/T131638#3202606 (bd808)
[20:37:30] Sagan: yup and changes in "gate-and-submit" are processed before the changes in "test"
[20:37:55] ah, right
[20:38:09] so it takes a long time, since most gate-and-submit do not run at the same?
[20:38:29] there is a limited number of instances that can run jobs
[20:38:42] so they are currently all consumed/assigned to run the spam of changes in gate-and-submit
[20:39:00] changes in "test" are patiently waiting for a slot :]
[20:39:51] Sagan: the reason is all the patches on https://gerrit.wikimedia.org/r/#/q/topic:T119973
[20:39:51] T119973: Convert all repos to use npm Jenkins job with jsonlint and eslint - https://phabricator.wikimedia.org/T119973
[20:40:06] ah
[20:43:27] Tool-Labs-tools-Xtools, Community-Tech-Sprint: Remove references to "Range Contributions" and "Autoblock" within xTools code - https://phabricator.wikimedia.org/T163374#3202687 (kaldari) Open>Resolved
[20:43:31] Tool-Labs-tools-Xtools, Community-Tech: Epic: Rewriting XTools - https://phabricator.wikimedia.org/T153112#3202688 (kaldari)
[20:43:56] PROBLEM - Puppet errors on tools-webgrid-lighttpd-1421 is CRITICAL: CRITICAL: 14.29% of data above the critical threshold [0.0]
[20:48:58] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1421 is OK: OK: Less than 1.00% above the threshold [0.0]
[21:15:52] hey madhuvishy. I have a question about PAWS, not PAWS-internal. can I ask you?
[21:20:12] PROBLEM - Puppet errors on tools-webgrid-lighttpd-1425 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[21:21:17] lzia1: sure :)
[21:22:00] I'm looking for a few good examples of PAWS usage I can share with Tiziano. He's considering to move all the documentation around article expansion work to PAWS, and it would be great if we can help him get started faster.
[21:22:12] aah
[21:22:28] I read the documentation for PAWS, I see Peter Norvig's examples, anything else you can recommend, madhuvishy?
[21:22:29] lzia1: paws has public urls so you can see other users work easily
[21:22:43] http://paws-public.wmflabs.org/paws-public/User:YuviPanda/
[21:22:45] * lzia1 checks
[21:22:46] has some notebooks
[21:23:20] I assume other research folks like j-mo and folks who work with him in U-Washington will have similar stuff
[21:23:42] cool. thanks, madhuvishy. this should help me get started.
[21:24:35] lzia1: http://paws-public.wmflabs.org/paws-public/User:YuviPanda/talks/WDC2017.ipynb?format=slides#/ is Yuvi's slides from the data design hackathon - hopefully that can help too :)
[21:24:39] yw :)
[21:24:54] thanks! :)
[21:25:12] RECOVERY - Puppet errors on tools-webgrid-lighttpd-1425 is OK: OK: Less than 1.00% above the threshold [0.0]
[21:25:41] lzia1 Brian Keegan at UC Boulder has been using PAWS in some of his classes. His profile is at: http://paws-public.wmflabs.org/paws-public/15811/
[21:25:54] *CU Boulder
[21:30:55] thanks J-Mo1
[21:30:55] PROBLEM - Host tools-webgrid-lighttpd-1419 is DOWN: CRITICAL - Host Unreachable (10.68.19.167)
[21:57:14] is there any tool labs admin in the CEST time zone?
[21:59:52] tgr: valhallasw would be the closest
[22:00:16] yeah UTC+1 I think?
[22:00:37] tgr: do you have a hackathon or something similar you need support for or just generally asking?
[22:02:51] tgr: or are you about to volunteer to become an admin :)
[22:05:44] bd808: yeah, a mini-hackathon in Hungary
[22:07:06] not really related to Labs but it doesn't hurt to be prepared
[22:07:30] (for some value of prepared, asking one day before...)
[22:07:45] heh
[22:08:02] if I could get a temporary admin right, that would certainly make things simpler
[22:08:31] what sorts of things are you worried about needing to do? Account creation or ?
[22:08:56] I supposed getting people approved as tool members would be useful
[22:09:05] tool labs registration is the only thing an admin is needed for, I think?
[22:09:23] ...err yyes, I mean membership approval
[22:10:42] the way that works on wiki today I think that means you would need to be either a cloudadmin or an admin in the tools project.
[22:20:11] tgr: would you make a phab task asking for temporary admin in the tools project? That will make it easier to keep track of.
[22:20:39] sure, thanks
[22:23:38] Labs, Tool-Labs: Temporary Tool Labs projectadmin right for Tgr - https://phabricator.wikimedia.org/T163611#3202985 (Tgr)
[22:30:41] !log tools Added Gergő Tisza as a projectadmin for T163611
[22:30:46] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[22:30:46] T163611: Temporary Tool Labs projectadmin right for Tgr - https://phabricator.wikimedia.org/T163611
[22:31:35] tgr: lets test that your right are sufficient by having you approve this one -- https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Access_Request/S3w3nofficial
[22:32:47] the process is to use the "manage tool users" link to add to the project, then to do the "Add welcome message" step, and finally use "edit this page" to mark as resolved
[22:34:30] Change on wikitech.wikimedia.org a page Nova Resource:Tools/Access Request/S3w3nofficial was modified, changed by Gergő Tisza link https://wikitech.wikimedia.org/w/index.php?diff=1757134 edit summary:
[22:34:41] seemed to work
[22:34:57] awesome.
go get us some new developers :)
[22:36:26] there are 2 more uncompleted requests in the queue right now where I'm waiting for a better description from the users. there are notes on their request pages
[22:43:27] by the way, is https://toolsadmin.wikimedia.org/register/ now preferred over the wiki registration page?
[22:45:56] tgr: its a lot easier for people to follow I think
[22:46:11] I still haven't changed all the docs on wiki to point to it
[22:46:29] "soon" I'll have the tools membership request built in there too
[22:46:36] sure, I just wasn't sure if it's still experimental or something
[22:46:47] it should be solid.
[23:10:59] Labs: IO issues for Tools instances flapping with iowait and puppet failure - https://phabricator.wikimedia.org/T161898#3203099 (Andrew) -1417 and -1419 are now in the process of puppetizing. This is a note to myself to remember to queue those two over the weekend.