[03:54:21] Change on wikitech.wikimedia.org a page Nova Resource:Tools/Access Request/Iislucas was modified, changed by Tim Landscheidt link https://wikitech.wikimedia.org/w/index.php?diff=897041 edit summary:
[03:57:51] Change on wikitech.wikimedia.org a page Nova Resource:Tools/Access Request/Salahyahya was modified, changed by Tim Landscheidt link https://wikitech.wikimedia.org/w/index.php?diff=897052 edit summary:
[05:47:25] RECOVERY - Host tools-secgroup-test-102 is UP: PING OK - Packet loss = 0%, RTA = 0.84 ms
[05:51:54] PROBLEM - Host tools-secgroup-test-102 is DOWN: PING CRITICAL - Packet loss = 100%
[06:43:21] PROBLEM - Puppet run on tools-webgrid-lighttpd-1418 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[06:56:10] RECOVERY - Host secgroup-lag-102 is UP: PING OK - Packet loss = 0%, RTA = 1.04 ms
[07:09:25] RECOVERY - Host tools-secgroup-test-103 is UP: PING OK - Packet loss = 0%, RTA = 1.94 ms
[07:13:19] RECOVERY - Puppet run on tools-webgrid-lighttpd-1418 is OK: OK: Less than 1.00% above the threshold [0.0]
[07:19:42] PROBLEM - Host tools-secgroup-test-103 is DOWN: CRITICAL - Host Unreachable (10.68.21.22)
[07:22:00] PROBLEM - Host secgroup-lag-102 is DOWN: CRITICAL - Host Unreachable (10.68.17.218)
[09:59:25] PROBLEM - Puppet run on tools-bastion-03 is CRITICAL: CRITICAL: 62.50% of data above the critical threshold [0.0]
[11:04:27] RECOVERY - Puppet run on tools-bastion-03 is OK: OK: Less than 1.00% above the threshold [0.0]
[11:58:08] (PS1) Ricordisamoa: All text is Unicode by default in Python 3 [labs/tools/translatemplate] - https://gerrit.wikimedia.org/r/315665
[12:02:00] (CR) Ricordisamoa: [C: -2] All text is Unicode by default in Python 3 [labs/tools/translatemplate] - https://gerrit.wikimedia.org/r/315665 (owner: Ricordisamoa)
[12:02:05] (CR) Ricordisamoa: [C: 2] All text is Unicode by default in Python 3 [labs/tools/translatemplate] - https://gerrit.wikimedia.org/r/315665 (owner: Ricordisamoa)
[12:02:14] (CR) Ricordisamoa: [V: 2] All text is Unicode by default in Python 3 [labs/tools/translatemplate] - https://gerrit.wikimedia.org/r/315665 (owner: Ricordisamoa)
[12:04:05] (PS3) Ricordisamoa: Add proper User-Agent header to every request [labs/tools/translatemplate] - https://gerrit.wikimedia.org/r/314684
[13:42:30] Tool-Labs-tools-Wikidata-Periodic-Table, Wikidata: Create a WDQS-based ElementProvider - https://phabricator.wikimedia.org/T122706#2712741 (Ricordisamoa) a:Ricordisamoa
[13:42:52] (PS1) Ricordisamoa: Add and use SparqlElementProvider [labs/tools/ptable] - https://gerrit.wikimedia.org/r/315671 (https://phabricator.wikimedia.org/T122706)
[13:43:08] (CR) jenkins-bot: [V: -1] Add and use SparqlElementProvider [labs/tools/ptable] - https://gerrit.wikimedia.org/r/315671 (https://phabricator.wikimedia.org/T122706) (owner: Ricordisamoa)
[13:46:53] (PS2) Ricordisamoa: Add and use SparqlElementProvider [labs/tools/ptable] - https://gerrit.wikimedia.org/r/315671 (https://phabricator.wikimedia.org/T122706)
[13:50:33] Tool-Labs-tools-Wikidata-Periodic-Table, Wikidata, Patch-For-Review: Create a WDQS-based ElementProvider - https://phabricator.wikimedia.org/T122706#2712805 (Ricordisamoa) @ArthurPSmith do you think there's still need for WDQ-based implementations?
[14:56:00] Ehm, I'm getting a certificate error.
[15:01:22] sjoerddebruin: for what?
[15:01:32] http://tools.wmflabs.org
[15:01:47] Ehm: https://tools.wmflabs.org*
[15:02:11] Seems to work here. What certificate are you served?
[15:02:33] "GlobalSign Organization Validation CA - SHA256 - G2"
[15:02:58] should be for *.wmflabs.org, signed by GlobalSign Organization Validation CA - SHA256 - G2
[15:03:09] see #wikimedia-tech, globalsign is having OCSP or CRL issues
[15:03:09] right, and what is the error you're getting?
[15:03:54] (CR) Andrew Bogott: "Don't we need to handle things if mwoauth.identify fails? Or does it return some kind of generic-yet-valid response for users without an " [labs/striker] - https://gerrit.wikimedia.org/r/313137 (https://phabricator.wikimedia.org/T144710) (owner: BryanDavis)
[15:04:08] mark: thanks
[15:13:51] (CR) Andrew Bogott: [C: 1] "Bikeshed alert: I continue to wish that we had a better name for this than 'SUL account' and 'ldap account.' Maybe something like 'on-wi" [labs/striker] - https://gerrit.wikimedia.org/r/313138 (https://phabricator.wikimedia.org/T144710) (owner: BryanDavis)
[15:14:59] (CR) BryanDavis: "> Don't we need to handle things if mwoauth.identify fails?" [labs/striker] - https://gerrit.wikimedia.org/r/313137 (https://phabricator.wikimedia.org/T144710) (owner: BryanDavis)
[15:16:25] (CR) Andrew Bogott: [C: 1] "ok then :)" [labs/striker] - https://gerrit.wikimedia.org/r/313137 (https://phabricator.wikimedia.org/T144710) (owner: BryanDavis)
[15:21:36] (CR) BryanDavis: "> Bikeshed alert: I continue to wish that we had a better name for" [labs/striker] - https://gerrit.wikimedia.org/r/313138 (https://phabricator.wikimedia.org/T144710) (owner: BryanDavis)
[15:29:23] (CR) Andrew Bogott: "A couple of comments inline." (2 comments) [labs/striker] - https://gerrit.wikimedia.org/r/313139 (https://phabricator.wikimedia.org/T144710) (owner: BryanDavis)
[15:31:33] (CR) Andrew Bogott: [C: 1] "Assuming that blacklist checks are still to come, lgtm" [labs/striker] - https://gerrit.wikimedia.org/r/313140 (https://phabricator.wikimedia.org/T144710) (owner: BryanDavis)
[15:34:08] (CR) Andrew Bogott: [C: 1] Add confirmation step to account creation wizard [labs/striker] - https://gerrit.wikimedia.org/r/313141 (https://phabricator.wikimedia.org/T144710) (owner: BryanDavis)
[15:49:28] Tool-Labs-tools-Xtools: Bugs section on articleinfo returns incorrect results - https://phabricator.wikimedia.org/T148046#2713184 (Matthewrbowker)
[16:06:25] Labs, Striker, LDAP: Store Wikimedia unified account name (SUL) in LDAP directory - https://phabricator.wikimedia.org/T148048#2713307 (bd808)
[16:13:22] Labs, Striker, Operations, LDAP: Store Wikimedia unified account name (SUL) in LDAP directory - https://phabricator.wikimedia.org/T148048#2713357 (MoritzMuehlenhoff)
[16:14:58] (CR) Andrew Bogott: [C: 1] "ldap code looks right to me." (1 comment) [labs/striker] - https://gerrit.wikimedia.org/r/313143 (https://phabricator.wikimedia.org/T144710) (owner: BryanDavis)
[16:16:27] (CR) BryanDavis: [C: -1] Collect data needed to create a new LDAP account (2 comments) [labs/striker] - https://gerrit.wikimedia.org/r/313139 (https://phabricator.wikimedia.org/T144710) (owner: BryanDavis)
[16:17:31] yuvipanda: hi, i have a script (in java...) which requires a *real terminal* to run (it won't run on grid). Is it allowed to run it in a screen on labs-dev?
[16:17:59] Steinsplitter: depends on how much resources it is going to use :)
[16:18:08] and how long it's going to run
[16:18:13] if the answer is 'forever', then no, you can not run it
[16:19:13] it isn't consuming much cpu. ok, sigh :(
[16:19:58] yeah, if you want to run it for an hour or two, sure. even a day
[16:20:01] but 'forever' is a long time :)
[16:20:15] and those shouldn't be run interactively
[16:20:19] sorry
[16:21:01] Steinsplitter: what does "real terminal" mean in this case?
[16:21:27] * bd808 wonders if screen/tmux can run on the grid as a container for things that think they need a tty
[16:23:18] i think i will try to find a way to run it offwiki. thx anyway.
[16:24:59] Steinsplitter: you can start an interactive session on the grid
[16:25:47] qlogin I think?
[16:26:06] or you can try the script /dev/null trick (https://makandracards.com/makandra/2533-solve-screen-error-cannot-open-your-terminal-dev-pts-0-please-check)
[16:26:38] oh, that sounds interesting. Thanks for the hint :)
[16:29:19] valhallasw`cloud: thanks, qlogin is perfect for debugging :)
[18:01:26] Change on wikitech.wikimedia.org a page Nova Resource:Tools/Access Request/Lpeters4umd was created, changed by Lpeters4umd link https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Access_Request/Lpeters4umd edit summary: Created page with "{{Tools Access Request |Justification=School project/learning |Completed=false |User Name=Lpeters4umd }}"
[18:02:08] I am trying to set up sudo, but this page does not load https://wikitech.wikimedia.org/wiki/Special:NovaSudoer
[18:02:47] I would like to use sudo to install a python virtual environment on my tool-space and then install Django and the other dependencies
[18:05:35] Actually if I use "become commons-app-web" I can type "sudo something", and it asks me for a password. But I didn't set a password for the tool
[18:11:44] tobias47n9e-c: on tool labs, that's not possible
[18:11:50] but you can just use a regular virtualenv?
[18:12:37] valhallasw`cloud: I can try. I am learning as I go along.
[18:19:19] Yes that worked thank you!
[19:08:59] Hi all. Is it possible to deploy an unmerged core patch on betacluster?
[19:09:21] I see instructions for an unmerged puppet patch: https://wikitech.wikimedia.org/wiki/Nova_Resource:Deployment-prep/How_code_is_updated#Cherry-picking_a_patch_from_gerrit
[19:09:41] ejegg: asking in #wikimedia-releng might provide better answers
[19:09:52] cool, thanks yuvi
[19:12:20] yes ejegg
[19:12:32] but I wouldn't leave it there
[19:13:57] Krenair: k, i don't want to screw up the beta-code-update-eqiad job
[19:14:10] yeah, that could conflict and break
[19:14:33] if it's a core patch you should've tested it locally already
[19:15:28] yeah, it doesn't break stuff locally, but I can't tell if it fixes the wanObjectCache race condition
[19:15:44] ah, yeah
[19:17:02] Krenair: so I can cherry-pick on deployment-tin and run beta-scap?
[19:17:20] ejegg, beta-scap?
[19:17:32] err, just going by the jenkins job name
[19:17:36] er
[19:17:41] I wouldn't do it through jenkins itself
[19:17:45] I sudo as jenkins-deploy
[19:18:02] ah, ok. is this documented somewhere?
[19:19:12] probably not
[19:19:52] k. So on deployment-tin I can cherry-pick in /srv/mediawiki-staging (sudo'ed as jenkins-deploy)
[19:21:49] under there, yes. bear in mind it's like production, just with a different branch
[19:21:58] for mediawiki core you'll want /srv/mediawiki-staging/php-master
[19:22:05] ok, cool
[19:22:36] How do you stop your cherry-pick from being overwritten by auto updates?
[19:22:56] you don't leave it there long enough for that
[19:23:04] then, again sudo'ed as jenkins-deploy do a scap sync-file with the usual args?
[19:23:17] yes
[19:23:17] Ah
[19:23:25] * AndyRussG|mostly removes -2 from all pending core changes
[19:23:52] AndyRussG|mostly: think you can test it in the 10 minutes between auto-updates?
[19:26:16] thanks Krenair !
[19:28:37] ejegg|food: fer sure! just ping me a bit :)
[19:37:19] PROBLEM - Puppet run on tools-docker-builder-01 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[19:38:17] yuvipanda: around?
[19:54:33] ejegg|food, AndyRussG|mostly: I'm going afk for a bit, all okay?
[19:55:11] Krenair: all good!!! thx much 4 ur help :)
[20:15:52] Labs, Continuous-Integration-Infrastructure, Patch-For-Review, Wikimedia-Incident: Nodepool instance instance creation quota management - https://phabricator.wikimedia.org/T143016#2714349 (hashar) Open>Resolved a:hashar This has been fixed up quite fast. The root cause is the quota we...
[20:19:36] Labs, Labs-Infrastructure, Operations, Wikimedia-Incident: Some labs instances IP have multiple PTR entries in DNS - https://phabricator.wikimedia.org/T115194#2714378 (hashar) Alex can you do the magic SELECT again and see whether DNS entries are still being leaked?
[20:24:28] Labs, Labs-Infrastructure, Operations, Wikimedia-Incident: Some labs instances IP have multiple PTR entries in DNS - https://phabricator.wikimedia.org/T115194#2714390 (AlexMonk-WMF) >>! In T115194#2714378, @hashar wrote: > Alex can you do the magic SELECT again and see whether DNS entries are sti...
[20:39:29] volans: heya
[20:39:39] hey yuvipanda
[20:39:56] so just to keep you posted, the puppet agent on the puppetmaster is not working
[20:40:08] while the agent on another hosts works just fine
[20:40:28] I played a bit but didn't have the time due to the cert issue
[20:40:39] volans: which node name is this?
[20:40:48] and given that is not fundamental for me I think I will just skip over it :-P
[20:41:00] af-puppetmaster.automation-framework
[20:41:02] is ok :) I can take a shot
[20:41:09] I can tell you what's happening
[20:41:31] so I puppet cert clean, rm the files, run the agent, it generates the cert
[20:41:33] I sign it
[20:41:33] yes please! :)
[20:41:35] and at the next run
[20:41:55] Error: /File[/var/lib/puppet/facts.d]: Failed to generate additional resources using 'eval_generate': SSL_connect returned=1 errno=0 state=error: certificate
[20:41:59] verify failed: [self signed certificate in certificate chain for /CN=Puppet CA: af-puppetmaster.automation-framework.eqiad.wmflabs]
[20:42:14] it fails the verification
[20:42:14] interesting
[20:43:03] if you do a puppet cert clean and try again?
[20:43:04] if you want to play feel free, I can re-provision a new one tomorrow morning, I don't have anything there that I will lose
[20:43:13] ok!
[20:43:19] I'll play with it now then :)
[20:43:22] I've tried the clean/sign a bunch of times
[20:43:45] if you see the revocation list is like 7~8 certs :D
[20:44:02] it clearly has something to do with the fact that is the same host
[20:44:20] I tried also to add the FQDN in /etc/hosts without luck, so I removed it
[20:44:26] interesting
[20:44:30] (CR) Andrew Bogott: [C: 1] Use consistent naming for accounts [labs/striker] - https://gerrit.wikimedia.org/r/313145 (owner: BryanDavis)
[20:45:15] volans: I wonder if it's that the ca never got into the os store
[20:45:18] because that's done by puppet
[20:45:56] wmf_ca_2014_2017.pem -> /usr/local/share/ca-certificates/wmf_ca_2014_2017.crt ?
[20:46:19] is that the ca that this instance of puppetmaster is using?
[20:46:22] I think that might not be the case
[20:46:28] I think it'll have its own?
[20:47:16] the CA should be the one in /var/lib/puppet/server/ssl/ca
[20:47:20] the Puppet one
[20:48:32] but what do you mean with the OS store?
[20:49:24] in wherever ca-certificates keeps it
[20:54:12] volans: sorry, in two conversations :)
[20:54:35] me too :)
[20:55:29] (CR) BryanDavis: [C: -1] "See inline notes about adding uid/gid upper range limits to match OSM." (1 comment) [labs/striker] - https://gerrit.wikimedia.org/r/313143 (https://phabricator.wikimedia.org/T144710) (owner: BryanDavis)
[21:11:39] (PS1) GWicke: Update wikimedia-services mapping [labs/tools/wikibugs2] - https://gerrit.wikimedia.org/r/315830
[21:18:24] (CR) Ppchelko: [C: 1] Update wikimedia-services mapping [labs/tools/wikibugs2] - https://gerrit.wikimedia.org/r/315830 (owner: GWicke)
[21:35:42] (CR) GWicke: [C: 2] Update wikimedia-services mapping [labs/tools/wikibugs2] - https://gerrit.wikimedia.org/r/315830 (owner: GWicke)
[21:36:21] (CR) GWicke: [V: 2] Update wikimedia-services mapping [labs/tools/wikibugs2] - https://gerrit.wikimedia.org/r/315830 (owner: GWicke)
[21:37:46] anybody around with wikibugs2 merge rights? Small tweak for services projects ^^
[21:50:51] it's queued gwicke
[21:51:06] it's just stuck in gate-and-submit behind MF and JsonConfig
[21:52:10] (Merged) jenkins-bot: Update wikimedia-services mapping [labs/tools/wikibugs2] - https://gerrit.wikimedia.org/r/315830 (owner: GWicke)
[21:54:33] gwicke, ^
[22:04:34] yuvipanda: I got sidetracked and it's quite late here, so I'm heading off soon, if you do anything on the instance let me know, otherwise I'll continue as is or at most I'll re-enable the puppet agent from the Labs puppetmaster
[22:05:06] volans: will do, thanks
[22:12:28] FYI, strange "RECOVERY"
[22:12:31] Fri 00:10:50 icinga-wm| RECOVERY - Check status of DRBD node on labstore1005 is OK: NRPE: Unable to read output
[22:14:15] volans: I think that's because there is nothing to stdout and nrpe hates that
[22:14:24] but technically exit 0
[22:15:40] ok, let's say the check could be improved :) glad is ok
[22:15:52] PROBLEM - Puppet run on tools-services-01 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[22:43:50] volans: madhuvishy is working on it now I think
[22:44:54] volans: yup - fixing in a bit :)
[22:47:14] great!
[22:47:18] thanks madhuvishy!
[22:55:53] RECOVERY - Puppet run on tools-services-01 is OK: OK: Less than 1.00% above the threshold [0.0]
[23:00:26] PROBLEM - Puppet run on bdsync-deb is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]