[00:18:42] PROBLEM - Puppet run on tools-worker-1029 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[00:20:25] (CR) Jean-Frédéric: "This looks great! Thanks for tackling this." (2 comments) [labs/tools/heritage] - https://gerrit.wikimedia.org/r/342198 (owner: Lokal Profil)
[00:21:49] PROBLEM - Puppet run on tools-worker-1028 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[00:26:14] (CR) Jean-Frédéric: "Will finish reviewing this later :)" (2 comments) [labs/tools/heritage] - https://gerrit.wikimedia.org/r/342038 (owner: Lokal Profil)
[00:57:01] PROBLEM - Puppet run on tools-exec-1415 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[01:05:09] Labs, Tool-Labs, Community-Tech-Tool-Labs, Patch-For-Review: Make a nag system to email maintainers of tools still running on precise grid hosts - https://phabricator.wikimedia.org/T149214#2745525 (Quiddity) I've sent a manual email to most of the maintainers of the tools still listed as running,...
[01:37:00] RECOVERY - Puppet run on tools-exec-1415 is OK: OK: Less than 1.00% above the threshold [0.0]
[02:35:05] who are the maintainers of CBNGRelay?
[02:49:14] Tool-Labs-tools-Xtools, Community-Tech-Sprint: Output data for new XTools: Top edits - https://phabricator.wikimedia.org/T160139#3094550 (Samwilson) a:Samwilson
[04:09:27] Labs, DBA: page_lang column of the page table is not replicated to Labs - https://phabricator.wikimedia.org/T154355#3094592 (TTO) a:chasemp
[06:45:28] PROBLEM - Puppet run on tools-worker-1017 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[07:14:35] (CR) jenkins-bot: Localisation updates from https://translatewiki.net. [labs/tools/heritage] - https://gerrit.wikimedia.org/r/342418 (owner: L10n-bot)
[07:25:28] RECOVERY - Puppet run on tools-worker-1017 is OK: OK: Less than 1.00% above the threshold [0.0]
[07:54:28] Labs, DBA, Patch-For-Review: Add and sanitize s2, s4, s5, s6 and s7 to sanitarium2 and new labsdb hosts - https://phabricator.wikimedia.org/T153743#3094724 (Marostegui) In order to start getting ready to import s5 on sanitarium2 and labsdb1009,10,11 I am going to start: - Compressing InnoDB on db1070...
[08:37:37] Labs, DBA, Patch-For-Review: Add and sanitize s2, s4, s5, s6 and s7 to sanitarium2 and new labsdb hosts - https://phabricator.wikimedia.org/T153743#3094758 (Marostegui) labsdb1009,10 and 11 - replication stopped db1095 replication stopped and mysql down data transfer between db1095 and dbstore1001 is...
[08:44:08] (CR) Lokal Profil: ">" (2 comments) [labs/tools/heritage] - https://gerrit.wikimedia.org/r/342198 (owner: Lokal Profil)
[09:55:14] Labs, DBA, Patch-For-Review: Add and sanitize s2, s4, s5, s6 and s7 to sanitarium2 and new labsdb hosts - https://phabricator.wikimedia.org/T153743#3094959 (Marostegui) I just realised that db1070 doesn't have .ibd files because of this: T137191 So I think I will reclone that host from a host that do...
[09:56:04] Labs, DBA, Patch-For-Review: Add and sanitize s2, s4, s5, s6 and s7 to sanitarium2 and new labsdb hosts - https://phabricator.wikimedia.org/T153743#3094961 (Marostegui) Probably also do a re-image won't hurt.
[10:00:16] Labs, Thumbor: deployment-imagescaler01 can't reach puppetmaster.thumbor.eqiad.wmflabs - https://phabricator.wikimedia.org/T160324#3094966 (Gilles)
[10:00:24] Labs, Performance-Team, Thumbor: deployment-imagescaler01 can't reach puppetmaster.thumbor.eqiad.wmflabs - https://phabricator.wikimedia.org/T160324#3094979 (Gilles)
[10:02:32] Labs, Performance-Team, Thumbor: deployment-imagescaler01 can't reach puppetmaster.thumbor.eqiad.wmflabs - https://phabricator.wikimedia.org/T160324#3094997 (Gilles) p:Triage>Normal
[11:35:49] Labs, Beta-Cluster-Infrastructure, Performance-Team, Thumbor: deployment-imagescaler01 can't reach puppetmaster.thumbor.eqiad.wmflabs - https://phabricator.wikimedia.org/T160324#3095199 (hashar)
[11:37:03] Labs, Beta-Cluster-Infrastructure, Performance-Team, Thumbor: deployment-imagescaler01 can't reach puppetmaster.thumbor.eqiad.wmflabs - https://phabricator.wikimedia.org/T160324#3094966 (hashar) Either the puppet master is down on puppetmaster.thumbor.eqiad.wmflabs or some firewall rule prevents...
[11:57:24] Labs, Beta-Cluster-Infrastructure, Performance-Team, Thumbor: deployment-imagescaler01 can't reach puppetmaster.thumbor.eqiad.wmflabs - https://phabricator.wikimedia.org/T160324#3095232 (Gilles) The puppet master appears to be shut down at the moment. The machine uses the role::puppetmaster::sta...
[11:58:47] Labs, Beta-Cluster-Infrastructure, Performance-Team, Thumbor: deployment-imagescaler01 can't reach puppetmaster.thumbor.eqiad.wmflabs - https://phabricator.wikimedia.org/T160324#3095233 (Gilles) Apache startup appears to be failing, it must be where the issue started: ``` gilles@puppetmaster:~$...
[12:00:41] Labs, Beta-Cluster-Infrastructure, Performance-Team, Thumbor: deployment-imagescaler01 can't reach puppetmaster.thumbor.eqiad.wmflabs - https://phabricator.wikimedia.org/T160324#3095250 (Gilles) ``` Mar 13 11:59:29 puppetmaster apache2[11688]: AH00526: Syntax error on line 8 of /etc/apache2/sites...
[12:06:20] Labs, Beta-Cluster-Infrastructure, Performance-Team, Thumbor: deployment-imagescaler01 can't reach puppetmaster.thumbor.eqiad.wmflabs - https://phabricator.wikimedia.org/T160324#3095265 (Gilles) https://httpd.apache.org/docs/trunk/mod/mod_ssl.html#sslopensslconfcmd > Available in httpd 2.4.8 and...
[12:07:16] Labs, Beta-Cluster-Infrastructure, Performance-Team, Thumbor: deployment-imagescaler01 can't reach puppetmaster.thumbor.eqiad.wmflabs - https://phabricator.wikimedia.org/T160324#3095267 (Gilles) Seems related to {T159254}
[12:12:14] Labs, Beta-Cluster-Infrastructure, Performance-Team, Thumbor: deployment-imagescaler01 can't reach puppetmaster.thumbor.eqiad.wmflabs - https://phabricator.wikimedia.org/T160324#3095304 (Gilles) Updating apache2 as indicated in that ticket fixed the issue. With apache restored on puppetmaster.thu...
[12:12:32] Labs: Blacklist apache from unattended-upgrades on tools puppetmaster - https://phabricator.wikimedia.org/T159254#3095322 (Gilles)
[12:12:35] Labs, Beta-Cluster-Infrastructure, Performance-Team, Thumbor: deployment-imagescaler01 can't reach puppetmaster.thumbor.eqiad.wmflabs - https://phabricator.wikimedia.org/T160324#3095324 (Gilles)
[12:57:10] (CR) Lokal Profil: Prepare monument_tables for wikidata and add tests (2 comments) [labs/tools/heritage] - https://gerrit.wikimedia.org/r/342038 (owner: Lokal Profil)
[13:12:47] Labs, Recommendation-API: Request increased quota for recommendation-api labs project - https://phabricator.wikimedia.org/T160344#3095525 (schana)
[13:54:57] Labs, Labs-Infrastructure, DBA, Operations, Patch-For-Review: labsdb1006/1007 (postgresql) maintenance - https://phabricator.wikimedia.org/T157359#3095626 (akosiaris) We 've ended up promoting labsdb1007 to master, resyncing from planet.osm and pg_dump/pg_restore the various databases/tables....
[14:03:24] Tool-Labs-tools-Attribution-Generator, TCB-Team: Attribution in Wikitext - https://phabricator.wikimedia.org/T160347#3095633 (Katja_Ullrich_WMDE)
[14:15:12] !log tools.quarrybot-enwiki no codereview requests were made (E537)
[14:15:14] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.quarrybot-enwiki/SAL
[14:20:17] Labs, Labs-Infrastructure, DBA, Operations, Patch-For-Review: labsdb1006/1007 (postgresql) maintenance - https://phabricator.wikimedia.org/T157359#3095661 (akosiaris) And we are done. The rest of the databases/tables have been copied over, the DNS record has been updated and DNS caches cleare...
[14:28:02] Labs, Labs-Infrastructure, DBA, Operations, Patch-For-Review: labsdb1006/1007 (postgresql) maintenance - https://phabricator.wikimedia.org/T157359#3095669 (jcrespo) @aude @MaxSem @Kolossos Can you verify your applications (e.g. restarting them) and see that they work as expected to be 100% th...
[14:30:17] Labs, Beta-Cluster-Infrastructure, Performance-Team, Thumbor: deployment-imagescaler01 can't reach puppetmaster.thumbor.eqiad.wmflabs - https://phabricator.wikimedia.org/T160324#3095671 (hashar) Well done. Thanks :)
[15:05:45] Labs, Tool-Labs: Still issues with node.js webservice - https://phabricator.wikimedia.org/T160353#3095767 (Magnus)
[15:58:53] Labs, MediaWiki-extensions-OATHAuth: Special:Two-factor_authentication reloads identical after submit (step 4) - https://phabricator.wikimedia.org/T158492#3038685 (TheDJ) I've noticed this too. I suspect this is a recent regression. I feel like i've had error or warning notifications before, when I made...
[16:02:49] Labs, Labs-Infrastructure, Patch-For-Review: Deprecate precise instances in Labs by 2017-03-31 - https://phabricator.wikimedia.org/T143349#3095963 (Andrew)
[16:02:54] Labs, Labs-Infrastructure: Labs instance utrs-primary is running Ubuntu Precise and must be rebuilt. - https://phabricator.wikimedia.org/T159737#3095961 (Andrew) Open>Resolved This instance was deleted and replaced by utrs-database and utrs-production.
[16:03:13] Labs, Labs-Infrastructure, Patch-For-Review: Deprecate precise instances in Labs by 2017-03-31 - https://phabricator.wikimedia.org/T143349#2965184 (Andrew)
[16:08:01] Labs, Analytics, DBA: Discuss labsdb visibility of rev_text_id and ar_comment - https://phabricator.wikimedia.org/T158166#3095980 (Nuria) This seems like background work related to labs import rather than a task per se, moving to radar.
[16:09:22] Labs, Analytics, DBA: Discuss labsdb visibility of rev_text_id and ar_comment - https://phabricator.wikimedia.org/T158166#3095985 (Nuria) Ping @JAllemandou Did you talk with labs team about this?
[16:45:13] Labs, Continuous-Integration-Infrastructure, Operations, netops: git clone over EQIAD (wmflabs) CODFW timeout due to low bandwidth (~250 KiB/s) - https://phabricator.wikimedia.org/T158601#3096072 (EddieGP) p:Triage>Normal
[16:53:50] Tool-Labs-tools-Pageviews, I18n: Pageviews-num-pageviews works incorrectly in Hebrew - https://phabricator.wikimedia.org/T160248#3096084 (Amire80) Thank you!
[16:54:12] Tool-Labs-tools-Pageviews, I18n, RTL: add appropriate lang and dir attributes to the page title near the dates range under the chart - https://phabricator.wikimedia.org/T160247#3096085 (Amire80) Thank you so much for all these quick fixes!
[17:04:56] Labs, Continuous-Integration-Infrastructure, Operations, netops: git clone over EQIAD (wmflabs) CODFW timeout due to low bandwidth (~250 KiB/s) - https://phabricator.wikimedia.org/T158601#3096149 (hashar) Open>Resolved a:hashar Must have been a transient issue. Seems the bandwidth is...
[17:18:22] Labs, Labs-Infrastructure, Patch-For-Review: Deprecate precise instances in Labs by 2017-03-31 - https://phabricator.wikimedia.org/T143349#3096204 (hashar) Status update for integration ===================== //Child task is T158652 // I have migrated all php53 jobs from the Precise instances to php...
[18:06:18] Labs, PAWS: paws returns 502 bad gateway - https://phabricator.wikimedia.org/T158685#3096390 (yuvipanda) I fixed this late yesterday night, and hopefully will have a more long term fix coming in the next few weeks.
[18:08:27] there is like 7000 pending changes on labscontrol1002
[18:09:02] I think I will put a ticket instead
[19:12:00] Labs, PAWS: paws returns 502 bad gateway - https://phabricator.wikimedia.org/T158685#3096547 (Framawiki) Looks good right now, thanks Yuvi.
[19:16:53] PAWS: Paws display 504 - Bad gateway time-out - https://phabricator.wikimedia.org/T143493#3096560 (yuvipanda) I fixed this yesterday night.
[20:49:21] PROBLEM - Puppet run on tools-bastion-03 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[21:01:34] Is labs really laggy for anyone else, or is it just network issues on my end?
[21:04:41] Labs, Tool-Labs: Still issues with node.js webservice - https://phabricator.wikimedia.org/T160353#3095767 (valhallasw) `webservice restart` (as opposed to `start`) should use the contents of the service.manifest file. Not ideal, but this is due to the original design where `webservice start` would do the...
[21:05:10] enterprisey: https://wikitech.wikimedia.org/wiki/Labs_labs_labs
[21:05:22] do you mean tools-login?
[21:06:26] I do mean tools-login
[21:06:34] valhallasw`cloud: specifically, the become command
[21:07:08] (brb)
[21:12:48] !log tools tools-bastion-03: killed heavy unzip operation from staeiou, and heavy (inadvertent large file opening?) vim operation from steenth, as the entire server was blocked due to high i/o
[21:12:51] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[21:29:23] (CR) Lokal Profil: [WIP]Build fill_table_monuments_all from per dataset dicts. (1 comment) [labs/tools/heritage] - https://gerrit.wikimedia.org/r/342198 (owner: Lokal Profil)
[21:29:25] RECOVERY - Puppet run on tools-bastion-03 is OK: OK: Less than 1.00% above the threshold [0.0]
[22:26:21] Ping anomie
[22:26:30] Or anyone who has implemented OAuth
[22:26:45] what's your question TParis?
[22:27:45] Well, I just found the "oauth-hello-world" on tools labs so maybe that will answer my question
[22:27:52] But I just don't understand the process.
[22:28:05] How does my app know it's getting an answer it can trust from WMF to authenticate a user?
[22:28:20] Labs, Operations, Traffic, Puppet, Technical-Debt: Convert all of our site.pp/roles to the role/profile paradigm - https://phabricator.wikimedia.org/T159412#3096884 (Ciencia_Al_Poder)
[22:28:55] is that a practical question or a "how does OAuth work in theory" question?
[22:29:52] Uh, well, I guess let me explain what I'm trying to do. I am working on the English Wikipedia's UTRS v2 using Laravel. I want to allow administrators to register and check their rights using OAuth
[22:31:47] you should fetch Special:OAuth/identify with an OAuth-signed request
[22:32:16] it will return a JWT describing whose OAuth access token was used to sign the request
[22:32:54] you need to validate the JWT signature, which should prove it's from the WMF
[22:33:27] if you use a MediaWiki-specific OAuth framework, it can probably do this for you
[22:33:41] checking rights with OAuth is ... tricky. It's not an authorization protocol, so really the best you can do is get a token from the user and then use it to ask the MediaWiki API questions
[22:34:01] bd808: That's essentially what I'd like to do
[22:34:20] the JWT will be signed with your consumer secret
[22:35:53] Okay, so to get the logic straight. 1) User hits my webpage, I get token 2) I redirect them to WMF where they authorize 3) WMF sends them back to my page with a token. I get their token and go back to WMF to verify it and get their JWT object?
[22:36:02] disclaimer: I am not a security person, but as far as I understand this is safe
[22:36:12] yes
[22:36:44] but unless you are a masochist you'll probably want to use a library for this
[22:38:00] I have a library, I just didn't understand what it was doing
[22:40:13] if you don't trust the library, the important part is to make sure everything in the JWT is validated
[22:40:49] signature, algorithm, expiry, user identity, issuer identity
[22:41:42] @bd808: Any idea if your mediawiki/oauthclient will work with Laravel?
[22:42:23] it should. I don't know what other plumbing you will need to add to it
[22:42:40] * bd808 looks for an example of it in a Slim app
[22:43:08] Alright. I was going to implement a different generic OAuth library but yours is already written for MediaWiki
[22:44:22] TParis: this slim app -- https://github.com/bd808/quips -- uses it just to validate that the visitor has a Wikimedia account
[22:44:26] @AmandaNP: I think I'm going to get started on this if you want to help
[22:45:30] this is the controller that handles setting up and validating the OAuth grant -- https://github.com/bd808/quips/blob/master/src/Pages/OAuth.php
[22:46:46] bd808: Thanks. And thanks for making it freely licensed. I might just copy/paste this and extend a laravel controller class.
[22:47:06] TParis: it might be useful to figure out how to patch https://github.com/laravel/socialite to know about MediaWiki :)
[22:47:37] I looked at socialite but I don't think I know enough about oauth to do that
[22:48:12] Although, now that I've read more about it, I can see how socialite might work
[22:48:20] more about OAuth*
[22:48:21] *nod* it would probably end up being a lot like https://github.com/laravel/socialite/blob/3.0/src/One/TwitterProvider.php but I've not played with Laravel
[22:49:21] Can I save the user's token and use it in the future?
[22:49:29] yes
[22:49:39] How could I match a user to their token when they return to my site? A cookie?
[22:49:40] it will be valid until the user revokes it
[22:50:29] yeah. figuring out how to have durable sessions and associate them with the user is "an exercise left to the reader" with many OAuth things
[22:51:04] bd808: re authentication vs. authorization, I'll admit I never managed to understand that
[22:51:28] Authentication means I am who I say I am. Authorization means I can do what I want to do.
[22:51:34] authentication is "who are you". authorization is "what are you allowed to do"
[22:51:53] it can be a fine distinction
[22:52:20] the standard explanation I have seen is that access tokens are not bound to consumer identity, so EvilService can get an access token from the user, and then the attacker can set it as a cookie, visit GoodService and it will believe the user is visiting it
[22:52:42] which is not, in fact, true
[22:52:56] yeah. that's misleading
[22:53:10] But the user's token is my secret token + the user's id token, right?
[22:53:25] So, EvilService would need my secret key to duplicate user's access token
[22:53:57] the user's token is half a valet key that lets you act as the user with the service that granted the token
[22:54:26] the other half of the valet key is the shared secret that you have with that same service
[22:54:46] well, the access token is just a pair of strings, but OAuth requests are signed with both access token and consumer token, and nothing stops the server from checking whether those two match
[22:55:42] I think this can be problematic in scenarios where the identity provider and the resource owner are different entities and don't fully trust each other, but that pretty much never happens in practice
[22:57:09] the advice we got from security people (well, Chris) was to use /identify as it's safer than the userinfo API or whatnot, but I don't see the difference
[22:57:27] The API might not check the token.
[22:57:49] err...the API doesn't return a nonce?
[22:58:04] So you can't verify the far end
[22:58:09] I dunno
[22:58:22] yeah. I think the nonce echo is the big difference
[22:58:36] although looking at our code we do not, in fact, check whether the access token and the consumer key are for the same provider...
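The checks named in the discussion above for the Special:OAuth/identify response (signature, algorithm, expiry, issuer identity, plus the nonce echo) can be sketched with the Python standard library alone. This is a minimal sketch under stated assumptions, not what mediawiki/oauthclient actually does: it assumes the JWT is HMAC-SHA256 ("HS256") signed with the consumer secret, and that the claims use the standard JWT names `iss`, `aud` and `exp` together with a `nonce` claim. The function names and the `make_token` helper are hypothetical and exist only for illustration.

```python
import base64
import hashlib
import hmac
import json
import time


def b64url_encode(raw: bytes) -> str:
    """Base64url-encode without padding, as JWT segments are written."""
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def b64url_decode(segment: str) -> bytes:
    """Decode a base64url JWT segment, restoring the stripped padding."""
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def hs256(signing_input: str, secret: str) -> str:
    """HMAC-SHA256 over the 'header.payload' string, base64url-encoded."""
    mac = hmac.new(secret.encode("utf-8"), signing_input.encode("ascii"), hashlib.sha256)
    return b64url_encode(mac.digest())


def make_token(claims: dict, secret: str) -> str:
    """Hypothetical helper: builds a signed token so the sketch can be exercised."""
    header = b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode("utf-8"))
    payload = b64url_encode(json.dumps(claims).encode("utf-8"))
    return f"{header}.{payload}.{hs256(header + '.' + payload, secret)}"


def validate_identity_jwt(token, consumer_key, consumer_secret, issuer, nonce, now=None):
    """Return the claims if every check passes, otherwise raise ValueError."""
    header_b64, payload_b64, signature_b64 = token.split(".")

    # Algorithm: accept only the one we expect, never whatever the token says
    # (assumption: the server signs with HS256).
    header = json.loads(b64url_decode(header_b64))
    if header.get("alg") != "HS256":
        raise ValueError("unexpected signing algorithm")

    # Signature: recompute with our consumer secret, compare in constant time.
    expected = hs256(header_b64 + "." + payload_b64, consumer_secret)
    if not hmac.compare_digest(expected.encode("ascii"), signature_b64.encode("utf-8")):
        raise ValueError("signature mismatch")

    claims = json.loads(b64url_decode(payload_b64))
    now = time.time() if now is None else now

    # Issuer identity: the wiki we meant to talk to (claim name assumed).
    if claims.get("iss") != issuer:
        raise ValueError("wrong issuer")
    # Audience: the token was issued for our consumer (claim name assumed).
    if claims.get("aud") != consumer_key:
        raise ValueError("wrong audience")
    # Expiry: reject a missing or past expiry time.
    if not isinstance(claims.get("exp"), (int, float)) or claims["exp"] < now:
        raise ValueError("token expired")
    # Nonce echo: must match the nonce we sent with the request (claim name assumed).
    if claims.get("nonce") != nonce:
        raise ValueError("nonce mismatch")
    return claims
```

Only after all of these pass would the user identity in the returned claims be trusted. Pinning the algorithm before verifying the signature is deliberate: a validator that trusts the token's own `alg` header can be talked into skipping verification.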
[22:58:46] I'll be streaming my efforts to connect Laravel to Wikipedia on livecoding if anyone is interested in watching, just PM me
[23:01:24] I guess if you are worried about MITM attacks, then the JWT is signed and the API response is just plaintext, but if someone can MITM Wikimedia requests we are already in a very bad place
[23:01:47] I am worried about MITM. This system will have private CU data in it and we need to protect it.
[23:01:56] (OAuth 1.0 is designed to work without HTTPS though so that's a reasonable explanation)
[23:02:58] for Wikimedia, MITM should not be an issue as long as you handle SSL correctly
[23:03:09] (you can pin the certificate if paranoid)
[23:03:15] Well, AmandaNP is in charge of configuring our labs instance to use SSL
[23:03:18] So I'll leave that bit to her
[23:03:26] that said, multi-layered defense is always a good idea
[23:03:38] SSL between your server and Wikimedia, I mean
[23:03:56] ic
[23:04:04] (of course the other direction is important as well)
[23:04:30] Either way: AmandaNP
[23:04:48] She's in charge, I'm just following her lead
[23:05:24] well, the server-server part needs to be handled in PHP
[23:05:41] although any decent request library will default to secure these days
[23:05:43] Pretty sure Curl handles that well
[23:06:25] don't be, test it :)
[23:06:58] native web request support in PHP is not super reliable
[23:08:55] older versions of curl default to
[23:09:25] CURLOPT_SSL_VERIFYPEER = false which basically disables SSL validation
[23:19:19] old php did that, too
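The "don't be, test it" advice can be made concrete by asserting that the TLS settings a client is about to use really verify the peer, instead of trusting library defaults. The conversation is about PHP and curl; the sketch below shows the same idea with Python's standard `ssl` module, purely as an illustration. `is_verifying` is a hypothetical helper name, and the `insecure` context is built on purpose to mimic a client with peer verification switched off.

```python
import ssl


def is_verifying(context: ssl.SSLContext) -> bool:
    """True only if the context checks both the certificate chain and the hostname."""
    return context.verify_mode == ssl.CERT_REQUIRED and context.check_hostname


# The default context verifies certificates and hostnames.
secure = ssl.create_default_context()

# Deliberately weakened context, comparable to disabling peer verification.
# check_hostname has to be turned off before verify_mode can be set to CERT_NONE.
insecure = ssl.create_default_context()
insecure.check_hostname = False
insecure.verify_mode = ssl.CERT_NONE
```

A guard such as `if not is_verifying(ctx): raise RuntimeError(...)` at startup turns a silently insecure configuration into a loud failure, which matters when the application will hold private data.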