[01:12:18] (PS2) 20after4: Display messages from mediawiki/core at #wikimedia-codereview [labs/tools/grrrit] - https://gerrit.wikimedia.org/r/283850 (https://phabricator.wikimedia.org/T128371) (owner: Luke081515) [01:12:37] (CR) 20after4: [C: 1] Display messages from mediawiki/core at #wikimedia-codereview [labs/tools/grrrit] - https://gerrit.wikimedia.org/r/283850 (https://phabricator.wikimedia.org/T128371) (owner: Luke081515) [01:42:40] Labs, Tool-Labs: tools-webgrid-lighttpd-1415 disabled - https://phabricator.wikimedia.org/T132878#2213261 (yuvipanda) Open>Resolved a:yuvipanda It's probably an accidental leftover from the last reboots. I've re-enabled it. [02:01:25] (CR) Yuvipanda: [C: 2 V: 2] Display messages from mediawiki/core at #wikimedia-codereview [labs/tools/grrrit] - https://gerrit.wikimedia.org/r/283850 (https://phabricator.wikimedia.org/T128371) (owner: Luke081515) [02:01:54] (PS2) Yuvipanda: Add mediawiki/extensions/Cognate to #wikimedia-de-tech [labs/tools/grrrit] - https://gerrit.wikimedia.org/r/283603 (owner: Addshore) [02:02:01] (CR) Yuvipanda: [C: 2 V: 2] Add mediawiki/extensions/Cognate to #wikimedia-de-tech [labs/tools/grrrit] - https://gerrit.wikimedia.org/r/283603 (owner: Addshore) [02:02:10] (PS2) Yuvipanda: Add mediawiki/extensions/RevisionSlider to #wikimedia-de-tech [labs/tools/grrrit] - https://gerrit.wikimedia.org/r/283605 (owner: Addshore) [02:02:17] (CR) Yuvipanda: [C: 2 V: 2] Add mediawiki/extensions/RevisionSlider to #wikimedia-de-tech [labs/tools/grrrit] - https://gerrit.wikimedia.org/r/283605 (owner: Addshore) [02:10:56] PAWS: PAWS 404 for users with special characters in their names - https://phabricator.wikimedia.org/T120066#2213265 (yuvipanda) Open>Resolved It was released, and I upgraded us and we're all good to go! 
\o/ [02:21:13] 06Labs, 10Tool-Labs, 10grrrit-wm: Fix grrrit-wm access situation - https://phabricator.wikimedia.org/T132828#2213271 (10yuvipanda) You actually needed access to push to yuvipanda/ on dockerhub, which only me and valhalla had. I just fixed that to only need ops / tools admin and updated https://wikitech.wikim... [02:34:12] 06Labs, 10Tool-Labs, 10grrrit-wm: Fix grrrit-wm access situation - https://phabricator.wikimedia.org/T132828#2213272 (10Krenair) >>! In T132828#2213271, @yuvipanda wrote: > You actually needed access to push to yuvipanda/ on dockerhub To push new changes in? Or operational things such as simple restarts? [02:35:08] 06Labs, 10Tool-Labs, 10grrrit-wm: Fix grrrit-wm access situation - https://phabricator.wikimedia.org/T132828#2213273 (10yuvipanda) Nope, for just deploying new changes. It is no longer needed. [06:37:45] PROBLEM - Puppet run on tools-worker-1007 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [07:17:53] RECOVERY - Puppet run on tools-worker-1007 is OK: OK: Less than 1.00% above the threshold [0.0] [07:18:23] (03CR) 10Legoktm: "Uhhhhhhhh, you just stopped mediawiki/core from going to wikimedia-dev!" [labs/tools/grrrit] - 10https://gerrit.wikimedia.org/r/283850 (https://phabricator.wikimedia.org/T128371) (owner: 10Luke081515) [07:24:36] (03CR) 10Legoktm: "Please revert ASAP, I have no idea how to deploy this anymore." [labs/tools/grrrit] - 10https://gerrit.wikimedia.org/r/283850 (https://phabricator.wikimedia.org/T128371) (owner: 10Luke081515) [07:54:24] thansk YuviPanda ! 
[07:54:54] *thanks :D [08:07:20] (03PS1) 10Wctaiwan: Add mediawiki/core back to #wikimedia-dev [labs/tools/grrrit] - 10https://gerrit.wikimedia.org/r/283943 [08:09:04] (03PS1) 10Luke081515: Revert "Display messages from mediawiki/core at #wikimedia-codereview" [labs/tools/grrrit] - 10https://gerrit.wikimedia.org/r/283944 [08:09:11] Legoktm ^ [08:09:43] nah, ok, maybe we should try the first one before the revert [08:10:47] (03CR) 10Wctaiwan: "See also https://gerrit.wikimedia.org/r/#/c/283943/ (pick one, basically, assuming I actually did it correctly)" [labs/tools/grrrit] - 10https://gerrit.wikimedia.org/r/283944 (owner: 10Luke081515) [08:11:12] (03CR) 10Wctaiwan: "Alternatively, https://gerrit.wikimedia.org/r/#/c/283944/" [labs/tools/grrrit] - 10https://gerrit.wikimedia.org/r/283943 (owner: 10Wctaiwan) [08:12:20] (03CR) 10Luke081515: [C: 04-1] "We should try the other patch first, if it works, it's better than this solution I think." [labs/tools/grrrit] - 10https://gerrit.wikimedia.org/r/283944 (owner: 10Luke081515) [08:34:53] PROBLEM - Host tools-bastion-01 is DOWN: CRITICAL - Host Unreachable (10.68.17.228) [09:24:18] 10Tool-Labs-tools-stewardbots: General update of HTML and CSS for stewardbot tools and portals - https://phabricator.wikimedia.org/T128745#2213660 (10MarcoAurelio) Also, that grey background is so, uhm, ~~funeral house~~... [10:16:36] How can I get into https://horizon.wikimedia.org/auth/login/ ? O_o [10:16:49] in other words, where on earth do I get a Totp Token ? 
[10:17:14] ahhh https://wikitech.wikimedia.org/wiki/Help:Horizon_FAQ#The_Horizon_login_prompts_me_for_a_.27Totp_token..27_What.27s_that.3F_Can_I_just_leave_it_blank.3F [13:41:16] Change on 12wikitech.wikimedia.org a page Nova Resource:Tools/Access Request/Philroc2 was created, changed by Philroc2 link https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Access_Request/Philroc2 edit summary: Created page with "{{Tools Access Request |Justification=To run a bot which will give reminders to users |Completed=false |User Name=Philroc2 }}" [13:55:22] 06Labs, 10Tool-Labs: High load on some webgrid nodes - https://phabricator.wikimedia.org/T132879#2212933 (10chasemp) thanks valhallasw :) [15:31:00] !log integration migration integration-slave-trusty-1018 to labvirt1009 [15:51:51] suggestbot’s webservice all of a sudden returns 503s (service unavailable), but there’s nothing in the error log to help determine why this happens… is there some setting in lighttpd’s configuration I can turn on or something? [15:56:22] Nettrom: I'm not entirely sure, may have to sleuth the documentation on it but I believe there is, an access.log is there [15:57:02] there’s nothing in neither the tool’s access.log nor the error.log that reveals anything about why it crashed, unfortunately, I’ll go dig some more in the documentation, thanks! [15:59:31] (03CR) 10Yuvipanda: [C: 032 V: 032] "This should work!" [labs/tools/grrrit] - 10https://gerrit.wikimedia.org/r/283943 (owner: 10Wctaiwan) [16:01:43] Nettrom: it's possible a 503 is the proxy failing to connect to the tool correctly, try a restart and see? [16:02:29] tends to work after a restart, then fail again. 
since the webservice is always running, it doesn’t get force-restarted either [16:05:34] sounds like it is crashing after awhile [16:17:44] (03Abandoned) 10Luke081515: Revert "Display messages from mediawiki/core at #wikimedia-codereview" [labs/tools/grrrit] - 10https://gerrit.wikimedia.org/r/283944 (owner: 10Luke081515) [16:18:12] (03PS1) 10Paladox: Test [labs/tools/Luke081515IRCBot] - 10https://gerrit.wikimedia.org/r/283992 [16:19:30] (03Abandoned) 10Paladox: Test [labs/tools/Luke081515IRCBot] - 10https://gerrit.wikimedia.org/r/283992 (owner: 10Paladox) [16:22:35] (03PS1) 10Luke081515: [WIP] Initial Commit [labs/tools/Luke081515IRCBot] - 10https://gerrit.wikimedia.org/r/283993 [16:25:42] (03CR) 10Luke081515: [C: 04-2] "Work not finished yet, needs a readme for example." [labs/tools/Luke081515IRCBot] - 10https://gerrit.wikimedia.org/r/283993 (owner: 10Luke081515) [16:39:55] (03CR) 10Luke081515: [V: 04-1] "Has currently code errors." [labs/tools/Luke081515IRCBot] - 10https://gerrit.wikimedia.org/r/283993 (owner: 10Luke081515) [16:54:49] (03PS1) 10Luke081515: [WIP] Initial Commit [labs/tools/Luke081515IRCBot] - 10https://gerrit.wikimedia.org/r/283995 [16:54:57] argh [16:55:49] (03Abandoned) 10Luke081515: [WIP] Initial Commit [labs/tools/Luke081515IRCBot] - 10https://gerrit.wikimedia.org/r/283995 (owner: 10Luke081515) [16:58:28] (03PS2) 10Luke081515: [WIP] Initial Commit [labs/tools/Luke081515IRCBot] - 10https://gerrit.wikimedia.org/r/283993 [16:59:09] (03PS1) 10Luke081515: [WIP] Initial Commit [labs/tools/Luke081515IRCBot] - 10https://gerrit.wikimedia.org/r/283996 [16:59:15] ... 
[16:59:36] (Abandoned) Luke081515: [WIP] Initial Commit [labs/tools/Luke081515IRCBot] - https://gerrit.wikimedia.org/r/283993 (owner: Luke081515) [17:00:13] (CR) Luke081515: [C: -2 V: 1] "Works, but needs readme" [labs/tools/Luke081515IRCBot] - https://gerrit.wikimedia.org/r/283996 (owner: Luke081515) [17:16:44] (PS2) Luke081515: [WIP] Initial Commit [labs/tools/Luke081515IRCBot] - https://gerrit.wikimedia.org/r/283996 [17:28:30] (PS3) Luke081515: [WIP] Initial Commit [labs/tools/Luke081515IRCBot] - https://gerrit.wikimedia.org/r/283996 [17:30:02] (PS4) Luke081515: [WIP] Initial Commit [labs/tools/Luke081515IRCBot] - https://gerrit.wikimedia.org/r/283996 [17:30:34] (CR) Luke081515: [WIP] Initial Commit [labs/tools/Luke081515IRCBot] - https://gerrit.wikimedia.org/r/283996 (owner: Luke081515) [17:31:16] (PS5) Luke081515: Initial Commit [labs/tools/Luke081515IRCBot] - https://gerrit.wikimedia.org/r/283996 [17:32:04] (CR) Luke081515: [C: 2] Initial Commit [labs/tools/Luke081515IRCBot] - https://gerrit.wikimedia.org/r/283996 (owner: Luke081515) [17:32:22] (Merged) jenkins-bot: Initial Commit [labs/tools/Luke081515IRCBot] - https://gerrit.wikimedia.org/r/283996 (owner: Luke081515) [17:34:28] Labs, Operations, ops-codfw: labtestneutron2001.codfw.wmnet does not appear to be reachable - https://phabricator.wikimedia.org/T132302#2193908 (RobH) So chatting with Papaul, this host shows ssl2001 as the hostname when booted. This means this host has never been installed as labtestneutron2001.... [17:37:49] Labs, Operations, ops-codfw: labtestneutron2001.codfw.wmnet does not appear to be reachable - https://phabricator.wikimedia.org/T132302#2193908 (Andrew) Confirmed, this host is designated for labtest use but so far we have never used it. 
I can reimage shortly if that makes your lives less confusing :) [18:07:51] 06Labs, 10Tool-Labs, 06Operations, 10Traffic, and 2 others: Migrate tools.wmflabs.org to https only (and set HSTS) - https://phabricator.wikimedia.org/T102367#2214934 (10Andrew) a:03Andrew [18:08:32] 06Labs, 10Tool-Labs, 06Operations, 10Traffic, and 2 others: Migrate tools.wmflabs.org to https only (and set HSTS) - https://phabricator.wikimedia.org/T102367#2214936 (10yuvipanda) >>! In T102367#1372878, @Nemo_bis wrote: >> Should be fairly simple to do. > > I doubt this is possible. Several tools do not... [18:15:41] 06Labs, 10Tool-Labs, 06Operations, 10Traffic, and 2 others: Detect tools.wmflabs.org tools which are HTTP-only - https://phabricator.wikimedia.org/T128409#2073537 (10yuvipanda) This seems not particularly useful as a data point, since it is just counting: 1. Tools that are doing HTTPS redirects themselves... [18:15:57] 06Labs, 10Tool-Labs, 06Operations, 10Traffic, and 2 others: Detect tools.wmflabs.org tools which are HTTP-only - https://phabricator.wikimedia.org/T128409#2214957 (10yuvipanda) [18:17:20] Hey [18:18:22] Hello. [18:21:56] 06Labs: Create project Yandex-proxy - https://phabricator.wikimedia.org/T132950#2214984 (10bd808) [18:23:09] * legoktm hugs bd808 [18:23:31] * bd808 hugs legoktm back [18:24:41] bd808: https://phabricator.wikimedia.org/T125459#2215028 [18:25:16] ha! I didn't know that this wasn't a thing that we know we want [18:25:39] I'm not sure, which is why I asked :/ [18:25:46] 06Labs: Create project Yandex-proxy - https://phabricator.wikimedia.org/T132950#2215036 (10bd808) [18:33:28] how to get from a wmflabs url to the project, then people responsible for it? [18:34:22] Platonides: If it's not a tools.wmflabs.org link we don't have any UI to do that :/ [18:34:40] but it should be possible. what's the URL? 
[18:34:46] korma.wmflabs.org [18:34:53] I looked around in wikitech [18:35:07] but I don't see a similarly named project… [18:35:53] I *think* that andre__ and qgil run that [18:35:57] I think [18:38:02] Labs, Tracking: New Labs project requests (tracking) - https://phabricator.wikimedia.org/T76375#2215072 (bd808) [18:38:04] Labs, User-bd808: Create project Yandex-proxy - https://phabricator.wikimedia.org/T132950#2215068 (bd808) Open>Resolved a:bd808 https://wikitech.wikimedia.org/wiki/Nova_Resource:Yandex-proxy [18:43:46] Change on wikitech.wikimedia.org a page Nova Resource:Tools/Access Request/Philroc2 was modified, changed by Tim Landscheidt link https://wikitech.wikimedia.org/w/index.php?diff=444102 edit summary: [19:01:32] On Tool Labs cdnjs there are 2 identical jquery items. [19:26:26] Platonides: the general problem of exposing the proxy targets is tracked in T115752 [19:26:27] T115752: invisible-unicorn (dynamicproxy) should provide an easy way to see where a host routes without knowing the project - https://phabricator.wikimedia.org/T115752 [19:27:32] want me to look this one up Platonides? [19:28:11] Do you have a trick/script for iterating the projects Krenair? [19:28:49] from what environment? [19:28:55] silver? [19:29:06] hm... 
there was something [19:29:11] I usually hit it from novaproxy-01 with curl [19:29:14] one sec [19:32:11] bd808, got it [19:33:41] bd808, https://phabricator.wikimedia.org/P2916 [19:33:52] obviously you'll need to fill out the password [19:34:16] neat [19:34:25] this is what I used in https://gerrit.wikimedia.org/r/#/c/268921/12 [19:34:46] well, am trying to use :) it's not merged yet [19:37:00] not sure why `keystone tenant-list` doesn't work, but this does [19:39:39] works like a charm [19:44:15] had to do quite a bit of messing around with the damn keystoneclient code to get it working [19:45:26] Platonides: that proxy points to contributors-metrics.contributors.eqiad.wmflabs [19:46:26] Krenair: in ~bd808 on silver there is a script that will dump all the proxies into ~bd808/proxies/*.json now [19:46:54] brute force and ugly by hey it works [19:47:07] s/by/but/ [19:47:48] yeah but I could already ssh in and look it up in the sqlite db [19:48:12] or in the redis [19:48:13] doesn't really help unprivileged people [19:48:40] what are you talking about, ofc everyone has root on everything! [19:48:42] * bd808 didn't know how to do either of those things [19:49:02] after pressing enter I'm aware how much of a terrible joke that is, so please ignore. [19:49:28] lol [19:49:38] ssh to the public domain, you'll get to novaproxy-01 assuming you're in project-proxy (most people are not) [19:49:44] * bd808 has trebuchet flashbacks [19:50:06] then sudo sqlite3 /etc/dynamicproxy-api/data.db [19:50:56] sqlite> select backend.url from backend, route where backend.route_id = route.id and route.domain = 'korma.wmflabs.org.'; [19:50:56] http://contributors-metrics.contributors.eqiad.wmflabs:80 [19:51:20] just haven't quite learnt the table schema yet, have to check .tables/.schema each time [19:53:09] So... can I use Quarry to debug a long SQL query? 
Or is that not allowed [19:53:51] Matthew_: if you mean run an EXPLAIN query, sadly we don't allow that at this time [19:54:40] bd808: Negative. I've got a query on a tool that runs long... and I want to see if I can reproduce and fix it on Quarry. I ask because it runs long... 250+ secs. [19:55:05] Matthew_: quarry's limit is 30min now, which is more than 250+sec I think [19:55:13] Okay. As long as it's ok :) [19:55:17] * YuviPanda confirms that 30mins is definitely more than 250seconds [19:55:37] like 6 times longer :) [19:55:59] how would quarry help here? [19:55:59] indeed. 300s might even be 5 minutes! [19:56:16] Krenair: interactive debugging? vs using 'sql' on the commandline [19:56:29] bd808: btw, you should feel free to self-merge your labs/toollabs patches [19:56:39] yew gross [19:57:25] I was sort of hoping someone would review https://gerrit.wikimedia.org/r/#/c/283377/ to see if I was lying [19:57:32] bd808: you could also convince chasemp or valhallasw or tim l to take a look. [19:58:10] btw porting that script to python is not going to be pretty, but it should be possible [19:58:38] the arg parsing logic is ugly and won't map well to python niceness [19:59:48] bd808: right, but we've this mass of data on how people *actually* use it, and so we can trim the perl script down maybe first to only things people use... [20:00:34] ah that would be nice. we should probably put something in to see if anyone is crazy enough to use the comment parsing part of it [20:01:47] bd808: yup. it's all on EL already. 
most of the calls are actually from cron, and cron is something we can fix with scripts [20:18:26] * Platonides wonders who acs is [20:25:07] bd808: I stared at 'Supported qsub options and whether they take an argument or not' and what the 0/1 stood for too long the first time, and thank for doing that [20:27:20] Little things like changing "$>" to "$EFFECTIVE_USER_ID" should help readability too [20:28:21] Remembering the difference between "$>" and "$<" should not be necessary outside of applying for a perl job [20:29:19] I completely agree [20:34:48] 06Labs, 10Beta-Cluster-Infrastructure, 07Puppet: /etc/puppet/puppet.conf keeps getting double content - first for labs-wide puppetmaster, then for the correct puppetmaster - https://phabricator.wikimedia.org/T132689#2206880 (10hashar) The `/etc/puppet/puppet.conf` file is generated by concatenating files und... [20:39:13] 06Labs, 10Beta-Cluster-Infrastructure, 07Puppet: /etc/puppet/puppet.conf keeps getting double content - first for labs-wide puppetmaster, then for the correct puppetmaster - https://phabricator.wikimedia.org/T132689#2215581 (10hashar) Random trace on deployment-cache-text04 ``` Info: Applying configuration v... [20:41:49] 10Labs-project-wikistats: wikisite table - status of updates - https://phabricator.wikimedia.org/T111592#2215602 (10RobiH) Works in the browser. Change user agent? [20:42:04] (03CR) 10Rush: [C: 031] jsub: Add a ton of comments (031 comment) [labs/toollabs] - 10https://gerrit.wikimedia.org/r/283377 (https://phabricator.wikimedia.org/T132475) (owner: 10BryanDavis) [20:42:16] bd808: small note of trying ot read it w/ my comment hat on [20:42:36] took me a minute to mentally untangle is it naming the job or running through some naming the job order of operations where last one wins [20:42:43] just a thought tho [20:43:28] Yeah I'm not really sure why it is coded that way. 
[20:43:43] line 239 will always happen [20:45:40] I refrained from coupling comments on your comments w/ comments on various approaches :) [20:46:59] making the perl better at this point isn't a big priority I don't think [20:47:08] but understanding wft it does is [20:47:16] *wtf [20:47:57] yup [20:49:07] (PS2) BryanDavis: jsub: Add a ton of comments [labs/toollabs] - https://gerrit.wikimedia.org/r/283377 (https://phabricator.wikimedia.org/T132475) [20:50:12] (CR) BryanDavis: jsub: Add a ton of comments (1 comment) [labs/toollabs] - https://gerrit.wikimedia.org/r/283377 (https://phabricator.wikimedia.org/T132475) (owner: BryanDavis) [20:51:15] I haven't written much perl, only debugged a medium amount but is it a common perl thing to try to ascertain /how/ a script is called? [20:51:21] the jstart portion [20:51:49] "common [20:51:56] " is a relative term [20:52:44] that sort of thing happens quite a bit in system software where the behavior is made subtly different based on the symlink to the binary that is used [20:53:12] it's typically a shortcut for passing some common cli args [20:53:18] I have always seen that done w/ say a jsub and then jstart would be a shell script w/ the options in question [20:53:20] yeah [20:53:28] and a $@ [20:53:29] or whatever [20:54:16] the "best" example of it is something like busybox where 50+ programs are smushed into one binary to save space [20:59:38] Labs-project-wikistats: wikisite table - status of updates - https://phabricator.wikimedia.org/T111592#2215671 (Dzahn) wanna try running the update script yourself and check what the error is? [21:16:56] Labs-project-wikistats: wikisite table - status of updates - https://phabricator.wikimedia.org/T111592#2215738 (RobiH) Ask for permission? 
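The symlink trick discussed at 20:52-20:54 (one script that behaves differently depending on the name it was invoked under, as with jsub/jstart or busybox) can be sketched roughly as below. This is a minimal Python illustration, not the actual Perl in jsub: the `IMPLIED_OPTIONS` table and the options listed for `jstart` are assumptions made up for the example.

```python
import os
import sys

# Hypothetical table: invocation name -> options implied by that name.
# The real jsub script may imply different options; these are placeholders.
IMPLIED_OPTIONS = {
    "jsub": [],
    "jstart": ["-once", "-continuous"],
}


def effective_args(argv):
    """Return the argument list after prepending options implied by argv[0].

    argv[0] is the path the program was started as, so a symlink named
    'jstart' pointing at the same file yields a different basename here.
    """
    name = os.path.basename(argv[0])
    implied = IMPLIED_OPTIONS.get(name, [])
    return implied + list(argv[1:])


if __name__ == "__main__":
    print(effective_args(sys.argv))
```

The alternative mentioned at 20:53:18, a small wrapper shell script that calls the main program with the extra options and `$@`, gives the same effect without inspecting the invocation name.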
[21:17:40] PROBLEM - Puppet run on tools-cron-01 is CRITICAL: CRITICAL: 11.11% of data above the critical threshold [0.0] [21:17:52] PROBLEM - Puppet run on tools-grid-shadow is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [21:18:24] PROBLEM - Puppet run on tools-webgrid-lighttpd-1202 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [21:18:30] PROBLEM - Puppet run on tools-exec-cyberbot is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [21:19:16] PROBLEM - Puppet run on tools-exec-1220 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [21:21:44] PROBLEM - Puppet run on tools-webgrid-lighttpd-1415 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [21:21:45] PROBLEM - Puppet run on tools-webgrid-lighttpd-1203 is CRITICAL: CRITICAL: 11.11% of data above the critical threshold [0.0] [21:21:58] PROBLEM - Puppet run on tools-webgrid-lighttpd-1411 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [21:22:12] something up w/ puppet^? [21:22:35] yeah am checking [21:22:46] Warning: Error 400 on SERVER: Invalid line 37: allow [21:22:51] PROBLEM - Puppet run on tools-webgrid-lighttpd-1201 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [21:22:51] PROBLEM - Puppet run on tools-exec-1209 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [21:22:53] not sure wha'ts going on there? [21:23:01] yeah wtf is that about....apache config? 
[21:23:13] yeah, sounds like it [21:23:57] PROBLEM - Puppet run on tools-exec-1219 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [21:23:57] PROBLEM - Puppet run on tools-webgrid-lighttpd-1401 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [21:25:21] PROBLEM - Puppet run on tools-webgrid-lighttpd-1404 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [21:25:31] PROBLEM - Puppet run on tools-exec-1218 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [21:25:31] PROBLEM - Puppet run on tools-webgrid-lighttpd-1208 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [21:25:31] PROBLEM - Puppet run on tools-webgrid-lighttpd-1414 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [21:25:57] PROBLEM - Puppet run on tools-webgrid-lighttpd-1407 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [21:26:11] PROBLEM - Puppet run on tools-exec-1215 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [21:27:17] PROBLEM - Puppet run on tools-exec-1203 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [21:27:27] PROBLEM - Puppet run on tools-exec-1405 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [21:27:35] PROBLEM - Puppet run on tools-exec-1205 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [21:27:45] PROBLEM - Puppet run on tools-exec-1408 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [21:27:46] PROBLEM - Puppet run on tools-webgrid-lighttpd-1406 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [21:27:47] PROBLEM - Puppet run on tools-webgrid-lighttpd-1413 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [21:28:18] PROBLEM - Puppet run on tools-redis-1001 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [21:28:40] PROBLEM - 
Puppet run on tools-exec-1211 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [21:29:04] PROBLEM - Puppet run on tools-webgrid-generic-1401 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [21:29:37] andrewbogott: it's your patch! :D [21:29:50] andrewbogott: the 'allow from' is apparently illegal apache config [21:29:50] ahhh how did you track that down? [21:30:02] I would have though only affected californium [21:30:17] it must have triggered as keyword? :) [21:30:41] chasemp: maybe wrong actually. I saw line 37 was the 'allow' without a mask (so no /32 or /24) while rest of them had it [21:31:30] PROBLEM - Puppet run on tools-exec-1204 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [21:31:36] PROBLEM - Puppet run on tools-exec-1213 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [21:31:59] chasemp: so was totally guessing. but a manual fixup + puppetmaster [21:32:09] chasemp: + apache2 restart doesn't seem to have fixed it.. [21:32:14] good guess man [21:32:36] PROBLEM - Puppet run on tools-exec-1403 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [21:32:42] PROBLEM - Puppet run on tools-bastion-05 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [21:32:50] PROBLEM - Puppet run on tools-webgrid-lighttpd-1405 is CRITICAL: CRITICAL: 10.00% of data above the critical threshold [0.0] [21:33:00] PROBLEM - Puppet run on tools-bastion-mtemp is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [21:33:04] PROBLEM - Puppet run on tools-exec-1221 is CRITICAL: CRITICAL: 11.11% of data above the critical threshold [0.0] [21:33:24] YuviPanda: oh I thought you said has fixed it :) [21:33:32] chasemp: no, I thought I had, but something's wonky somewhere. 
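The guess at 21:30:41 (an `allow` entry on line 37 with no /32 or /24 mask, unlike its neighbours) is the kind of thing a quick scan can confirm or rule out. A minimal sketch follows, assuming a config file made of `allow <address>` lines; the file layout and the sample addresses in the usage note are assumptions, and the log itself notes the diagnosis was only a guess.

```python
import re

# Matches "allow" or "allow from", optionally followed by one value.
ALLOW_RE = re.compile(r"^\s*allow\b(?:\s+from)?\s*(\S*)\s*$", re.IGNORECASE)
# A bare dotted-quad IPv4 address with no "/mask" suffix.
BARE_IPV4_RE = re.compile(r"^\d{1,3}(?:\.\d{1,3}){3}$")


def suspicious_allow_lines(text):
    """Return (line_number, line) pairs for allow entries that have no
    value at all, or a bare IPv4 address without a network mask."""
    results = []
    for number, line in enumerate(text.splitlines(), start=1):
        match = ALLOW_RE.match(line)
        if match is None:
            continue
        value = match.group(1)
        if not value or BARE_IPV4_RE.match(value):
            results.append((number, line.strip()))
    return results
```

Run against the text of the suspect file, this lists the entries worth a second look; hostname or wildcard entries are deliberately not flagged, since they never carry a mask.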
[21:34:52] Error: /File[/var/lib/puppet/lib]: Could not evaluate: Connection refused - connect(2) Could not retrieve file metadata for puppet://labs-puppetmaster-eqiad.wikimedia.org/plugins: Connection refused - connect(2) [21:36:02] chasemp: caught during a restart, probably. nothing in apache error logs [21:37:01] PROBLEM - Puppet run on tools-precise-dev is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [21:37:01] PROBLEM - Puppet run on tools-exec-1206 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [21:37:29] PROBLEM - Puppet run on tools-exec-1407 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [21:37:47] PROBLEM - Puppet run on tools-webgrid-lighttpd-1206 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [21:38:19] PROBLEM - Puppet run on tools-exec-1210 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [21:38:27] YuviPanda: ah you think https://gerrit.wikimedia.org/r/#/c/284079/ [21:39:07] PROBLEM - Puppet run on tools-exec-1406 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [21:39:22] chasemp: yeah, I think so. let's see - running puppet now. [21:39:41] PROBLEM - Puppet run on tools-services-01 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [21:42:14] chasemp: hmm, that doesn't seem to have fixed it [21:42:23] PROBLEM - Puppet run on tools-mail-01 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [21:42:23] PROBLEM - Puppet run on tools-webgrid-generic-1405 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [21:42:43] PROBLEM - Puppet run on tools-cron-02 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [21:43:29] YuviPanda: ok quick verify. tools vm's are using labcontrol1001 as their puppet master, and that is the puppet master that is failing? 
[21:43:43] PROBLEM - Puppet run on tools-exec-1214 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [21:43:56] chasemp: yeah, I can verify that the former is true, and the latter seems to be true as well. [21:44:07] PROBLEM - Puppet run on tools-checker-02 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [21:44:27] (i verified by looking at the puppet.conf file, which points to labs-puppetmaster-eqiad.wikimedia.org, which pings to same IP as labcontrol1001) [21:44:30] PROBLEM - Puppet run on tools-webgrid-lighttpd-1205 is CRITICAL: CRITICAL: 37.50% of data above the critical threshold [0.0] [21:45:01] YuviPanda: ok agreed, and --debug think so too and there isn't another specified master in puppet-run or anything [21:45:22] PROBLEM - Puppet run on tools-webgrid-lighttpd-1412 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [21:45:28] PROBLEM - Puppet run on tools-exec-1208 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [21:45:29] right [21:45:31] YuviPanda: it's also failing for another VM in another random project [21:45:36] PROBLEM - Puppet run on tools-exec-1409 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [21:46:00] chasemp: right. 
so puppetmaster seems like obvious culprit [21:46:04] PROBLEM - Puppet run on tools-webgrid-lighttpd-1408 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [21:46:30] PROBLEM - Puppet run on tools-exec-gift is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [21:47:40] PROBLEM - Puppet run on tools-bastion-02 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [21:47:40] PROBLEM - Puppet run on tools-exec-1216 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [21:47:40] PROBLEM - Puppet run on tools-mail is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [21:48:04] PROBLEM - Puppet run on tools-webgrid-lighttpd-1209 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [21:48:06] PROBLEM - Puppet run on tools-bastion-03 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [21:49:33] I killed shinken-wm [22:01:31] Hello, I noticed that Wikimedia Labs hosted a tile set for topography at http://a.tiles.wmflabs.org/hillshading/14/4947/5940.png. I was just wondering if you are allowed to directly use this resource, or if I have to rehost these assets. I know another site (http://hikebikemap.org/) use this resource directly, is this form of access allowed. I know they also have an attribution to NASA, is a attribution to this Foundation als [22:02:23] sulsull: I think if you're using it outside of wikimedia things, you should re-host. There's also a production service being worked on by yurik and co that might have different terms - #wikimedia-discovery might be able to help better in that regard [22:04:12] sulsull, at the moment our ops have limited referer headers to only wikivoyage.org & wmflabs.org [22:04:36] once the production is in full swing (waiting for more servers), we should eventually remove these restrictions [22:05:56] yurik Is this service having access an anomaly and should have been blocked? 
I was just curious where they were getting this data and found it was from this organization. [22:06:34] no idea, i only know about the maps.wikimedia.org [22:06:35] RECOVERY - Puppet run on tools-exec-1204 is OK: OK: Less than 1.00% above the threshold [0.0] [22:07:39] RECOVERY - Puppet run on tools-webgrid-lighttpd-1413 is OK: OK: Less than 1.00% above the threshold [0.0] [22:07:39] RECOVERY - Puppet run on tools-webgrid-lighttpd-1406 is OK: OK: Less than 1.00% above the threshold [0.0] [22:07:39] RECOVERY - Puppet run on tools-exec-1408 is OK: OK: Less than 1.00% above the threshold [0.0] [22:07:39] RECOVERY - Puppet run on tools-cron-01 is OK: OK: Less than 1.00% above the threshold [0.0] [22:08:37] RECOVERY - Puppet run on tools-exec-1211 is OK: OK: Less than 1.00% above the threshold [0.0] [22:09:54] Hmm I'll ask on #wikimedia-discovery to make sure of proper attributions, however based on your responses it seems that I should rehost. Interestingly maps.wikimedia.org does not use this tile set [22:11:35] yurik: that URL is served by maps-tiles3.eqiad.wmflabs in the maps project. You are at least an admin there. 
;) [22:11:38] RECOVERY - Puppet run on tools-exec-1213 is OK: OK: Less than 1.00% above the threshold [0.0] [22:12:04] RECOVERY - Puppet run on tools-exec-1206 is OK: OK: Less than 1.00% above the threshold [0.0] [22:12:07] bd808, yes, but i never actually looked at the wmflabs tile server, so don't know much about it [22:12:34] RECOVERY - Puppet run on tools-exec-1403 is OK: OK: Less than 1.00% above the threshold [0.0] [22:12:44] RECOVERY - Puppet run on tools-bastion-05 is OK: OK: Less than 1.00% above the threshold [0.0] [22:13:20] sulsull, btw, as far as the original source of data, I'm guessing it all comes from OSM, but they might be mixing in more data from other sources [22:14:41] 06Labs, 10Horizon, 13Patch-For-Review: Add some in-line documentation to the horizon login screen - https://phabricator.wikimedia.org/T132694#2215924 (10Andrew) 05Open>03Resolved I added help bubbles to the 'username' and 'totp' fields on this form. [22:15:30] bd808, btw, do you know by any chance where i can configure the IRC notifications for the phabricator tickets? [22:16:20] yurik: you probably want to ask valhallasw or YuviPanda and it's a yaml file iirc for the bot [22:17:05] RECOVERY - Puppet run on tools-precise-dev is OK: OK: Less than 1.00% above the threshold [0.0] [22:17:15] thx chasemp [22:17:31] RECOVERY - Puppet run on tools-mail-01 is OK: OK: Less than 1.00% above the threshold [0.0] [22:17:55] yurik: chasemp https://wikitech.wikimedia.org/wiki/wikibugs [22:17:55] yurik HikeBikemap claims the source is NASA, but I have no affiliation with this website. I figured WM Labs just ran a tile server. WM Labs is also listed as a Tile Server on OSM (http://wiki.openstreetmap.org/wiki/Tile_servers) however neither there or on Media Wiki does it mention if you are allowed to use this server for non Wiki Media properties. 
[22:17:59] RECOVERY - Puppet run on tools-bastion-mtemp is OK: OK: Less than 1.00% above the threshold [0.0] [22:18:13] RECOVERY - Puppet run on tools-exec-1210 is OK: OK: Less than 1.00% above the threshold [0.0] [22:19:07] RECOVERY - Puppet run on tools-exec-1406 is OK: OK: Less than 1.00% above the threshold [0.0] [22:19:08] RECOVERY - Puppet run on tools-checker-02 is OK: OK: Less than 1.00% above the threshold [0.0] [22:20:13] RECOVERY - Puppet run on tools-services-02 is OK: OK: Less than 1.00% above the threshold [0.0] [22:20:16] sulsull, really no idea :( Some volunteers used a mapnik (raster) setup to build a tile server. It is largely unmaintained, and it frequently seems that the only operations people do is restart it when it fails. The database is self-updating from OSM, but they might have created more data sources. As for usage licensing - absolutely no idea [22:21:32] yurik, is there a channel/email I should use to contact someone that might know the licensing? Thank you again for your help. [22:22:23] RECOVERY - Puppet run on tools-webgrid-generic-1405 is OK: OK: Less than 1.00% above the threshold [0.0] [22:22:32] sulsull, there is the maps-l mailing list, but I don't know if those same volunteers subscribe to that. Or you could try wikitech-l. 
https://lists.wikimedia.org/mailman/listinfo/maps-l [22:22:37] RECOVERY - Puppet run on tools-cron-02 is OK: OK: Less than 1.00% above the threshold [0.0] [22:22:37] RECOVERY - Puppet run on tools-exec-1216 is OK: OK: Less than 1.00% above the threshold [0.0] [22:22:38] RECOVERY - Puppet run on tools-mail is OK: OK: Less than 1.00% above the threshold [0.0] [22:22:38] RECOVERY - Puppet run on tools-bastion-02 is OK: OK: Less than 1.00% above the threshold [0.0] [22:23:05] RECOVERY - Puppet run on tools-webgrid-lighttpd-1209 is OK: OK: Less than 1.00% above the threshold [0.0] [22:23:43] RECOVERY - Puppet run on tools-exec-1214 is OK: OK: Less than 1.00% above the threshold [0.0] [22:25:18] RECOVERY - Puppet run on tools-webgrid-lighttpd-1412 is OK: OK: Less than 1.00% above the threshold [0.0] [22:25:27] Thanks I will use these channels to contact someone in order to see if I can use this data. [22:25:33] RECOVERY - Puppet run on tools-exec-1409 is OK: OK: Less than 1.00% above the threshold [0.0] [22:25:34] RECOVERY - Puppet run on tools-exec-1208 is OK: OK: Less than 1.00% above the threshold [0.0] [22:25:58] RECOVERY - Puppet run on tools-webgrid-lighttpd-1408 is OK: OK: Less than 1.00% above the threshold [0.0] [22:26:38] RECOVERY - Puppet run on tools-exec-gift is OK: OK: Less than 1.00% above the threshold [0.0] [22:26:50] RECOVERY - Puppet run on tools-webgrid-lighttpd-1415 is OK: OK: Less than 1.00% above the threshold [0.0] [22:28:24] RECOVERY - Puppet run on tools-webgrid-lighttpd-1202 is OK: OK: Less than 1.00% above the threshold [0.0] [22:29:30] RECOVERY - Puppet run on tools-webgrid-lighttpd-1205 is OK: OK: Less than 1.00% above the threshold [0.0] [22:31:08] RECOVERY - Puppet run on tools-exec-1215 is OK: OK: Less than 1.00% above the threshold [0.0] [22:31:54] RECOVERY - Puppet run on tools-webgrid-lighttpd-1203 is OK: OK: Less than 1.00% above the threshold [0.0] [22:32:00] RECOVERY - Puppet run on tools-webgrid-lighttpd-1411 is OK: OK: Less than 
1.00% above the threshold [0.0] [22:32:30] 06Labs, 10Labs-Infrastructure, 15User-bd808: Static IP for yandex-proxy01.yandex-proxy.eqiad.wmflabs - https://phabricator.wikimedia.org/T132982#2216068 (10bd808) [22:32:44] RECOVERY - Puppet run on tools-webgrid-lighttpd-1201 is OK: OK: Less than 1.00% above the threshold [0.0] [22:32:51] 06Labs, 10Labs-Infrastructure, 15User-bd808: Static IP for yandex-proxy01.yandex-proxy.eqiad.wmflabs - https://phabricator.wikimedia.org/T132982#2216087 (10bd808) a:05bd808>03None [22:33:22] RECOVERY - Puppet run on tools-redis-1001 is OK: OK: Less than 1.00% above the threshold [0.0] [22:33:23] 06Labs, 10Labs-Infrastructure, 15User-bd808: Static IP for yandex-proxy01.yandex-proxy.eqiad.wmflabs - https://phabricator.wikimedia.org/T132982#2216094 (10yuvipanda) 05Open>03Resolved a:03yuvipanda I've increased the quota to 1. You can allocate it via horizon to an instance of your choosing. [22:34:04] YuviPanda, andrewbogott, chasemp: per the wikitech instructions (which are horrible), I'd like a static IP for the yandex-proxy Labs project. Details at T132982 [22:34:04] T132982: Static IP for yandex-proxy01.yandex-proxy.eqiad.wmflabs - https://phabricator.wikimedia.org/T132982 [22:34:32] bd808: too late, I already granted it. 
[22:34:39] :) [22:34:40] yuvi did it so fast [22:34:43] I barely loaded the task [22:35:21] RECOVERY - Puppet run on tools-webgrid-lighttpd-1404 is OK: OK: Less than 1.00% above the threshold [0.0] [22:35:21] RECOVERY - Puppet run on tools-exec-1218 is OK: OK: Less than 1.00% above the threshold [0.0] [22:35:22] RECOVERY - Puppet run on tools-webgrid-lighttpd-1208 is OK: OK: Less than 1.00% above the threshold [0.0] [22:37:17] RECOVERY - Puppet run on tools-exec-1203 is OK: OK: Less than 1.00% above the threshold [0.0] [22:37:22] * bd808 wonders how he is going to test this [22:37:31] RECOVERY - Puppet run on tools-exec-1405 is OK: OK: Less than 1.00% above the threshold [0.0] [22:37:43] RECOVERY - Puppet run on tools-exec-1205 is OK: OK: Less than 1.00% above the threshold [0.0] [22:37:50] bd808: hit a 'what is my ip' page somewhere? [22:37:57] So those address mappings are SNAT right? [22:38:03] I always liked ipchicken.com [22:38:06] they are actually both [22:38:16] if you have a static IP assigned to a host it uses it in both directions iirc [22:38:30] but only to non wmf addresses [22:38:50] bd808: ^ [22:38:50] looks like it works! [22:39:08] $ curl http://casadebender.com/myip.php -- 208.80.155.189 [22:39:28] yup [22:40:18] bd808: btw, assuming you give it a .wmflabs.org domain, we need to maybe merge Krenair's patch first to make that work properly [22:40:50] I actually only need it for outbound [22:42:09] I suppose I should set up a mapping for nicer reverse dns [22:42:24] bd808: but tools users are gonna hit it, right? if so, making it a .wmflabs.org domain means that you can set up a new instance / failover later, and not have to have everyone change their code [22:42:59] yeah I should have a "service name" internally [22:43:17] right. and .wmflabs.org domains are unfortunately the closest we have now [22:43:24] but I don't want it exposed to the actual internet [22:43:46] I guess I can do authn by ip in my nginx config [22:44:12] bd808: aaah, right. 
yeah, you've to do that anyway, since your instance won't actually be able to just bind to the private IP. you can also use security groups for the same effect [22:44:33] yeah, that was my original plan [22:44:42] only opening port 80 to tool labs [22:45:33] bd808: to tool labs? [22:46:02] yes. A tool will hit my proxy and it will reverse proxy to yandex.ru [22:46:12] bd808: right, but *just* to toollabs? [22:46:33] I think so, yes. It's for things like corenbot [22:47:14] and I think the username + ip combo is under a single quota [22:47:30] so we don't want "random" usage [22:48:34] bd808: I see. I don't know any way to limit it to just toollabs without putting in basic auth or something like that I guess [22:49:07] oh. sure. because there is no network range that is only tool labs [22:49:18] yeah, so Labs in general I guess [22:49:32] that will work [22:50:36] right [23:04:44] 10Tool-Labs-tools-Other, 06WMF-Legal: Another request to review privacy policy and rules - https://phabricator.wikimedia.org/T104784#2216193 (10ZhouZ) Hi @Ricordisamoa, is this game still in development? If so, I would be happy to discuss any potential legal issues with the design. Zhou [23:11:32] MusikAnimal: around? [23:11:38] hey! [23:11:57] whatsup? [23:11:59] YuviPanda: I'm pretty sure I have to make an auth.conf change for that. The default behavior is to forbid everything. I reworked the .erb bits though… https://gerrit.wikimedia.org/r/#/c/284103/ [23:12:16] 06Labs, 10Tool-Labs, 06Zero: Tool labs tools should have a method of identifying Zero traffic - https://phabricator.wikimedia.org/T131934#2216208 (10Yurik) It seems this is really a question for @bblack - is it possible to make all of wmflabs traffic go through the varnish layer so that it gets tagged with t... 
[23:33:55] 10Labs-project-wikistats, 07Need-volunteer: wikisite table - status of updates - https://phabricator.wikimedia.org/T111592#2216271 (10Dzahn) [23:35:34] 06Labs, 10Labs-Sprint-100, 10Tool-Labs: Deploy new unified webservice code - https://phabricator.wikimedia.org/T98440#2216272 (10yuvipanda) I spent some time making webservicemonitor work with webservice-new, and it does work fine now. webservice-new will add a 'version: 2' field in the service.manifest, and... [23:40:02] 06Labs, 10Tool-Labs, 13Patch-For-Review: Deprecate #no-default-php in .lighttpd.conf - https://phabricator.wikimedia.org/T98818#2216282 (10yuvipanda) I just converted 'static' to lighttpd-plain and it seems to work well. Should be the same for the rest. [23:40:08] 06Labs, 10Tool-Labs, 13Patch-For-Review: Deprecate #no-default-php in .lighttpd.conf - https://phabricator.wikimedia.org/T98818#2216283 (10yuvipanda) a:03yuvipanda [23:41:00] 06Labs, 10Tool-Labs, 13Patch-For-Review: Convert most top level tool and bastion dns redcords to CNAMEs - https://phabricator.wikimedia.org/T131796#2216289 (10yuvipanda) a:05yuvipanda>03None Uh, should do another round later. [23:42:09] 06Labs, 10Labs-Sprint-100, 10Tool-Labs: Write documentation on new webservice code - https://phabricator.wikimedia.org/T132987#2216291 (10yuvipanda)