[00:14:46] Labs, Horizon: Unable to log into horizon.wikimedia.org - https://phabricator.wikimedia.org/T154860#2927158 (dschwen) Yeah, well, still not working for me. Maybe somebody could take a look.
[00:28:54] (PS1) Tim Landscheidt: WIP: Don't ignore fchdir()'s errors [labs/toollabs] - https://gerrit.wikimedia.org/r/331227
[00:50:43] Change on wikitech.wikimedia.org a page Nova Resource:Tools/Access Request/Chuenlye was modified, changed by Tim Landscheidt link https://wikitech.wikimedia.org/w/index.php?diff=1286380 edit summary:
[00:53:20] Change on wikitech.wikimedia.org a page Nova Resource:Tools/Access Request/Juniorsys was modified, changed by Tim Landscheidt link https://wikitech.wikimedia.org/w/index.php?diff=1286388 edit summary:
[01:01:19] (PS6) Tim Landscheidt: Import sql from operations/puppet [labs/toollabs] - https://gerrit.wikimedia.org/r/268563
[01:01:21] (PS5) Tim Landscheidt: Add man page for sql [labs/toollabs] - https://gerrit.wikimedia.org/r/268602
[01:04:14] (CR) Tim Landscheidt: [C: -2] "(Just rebased, see above.)" [labs/toollabs] - https://gerrit.wikimedia.org/r/268602 (owner: Tim Landscheidt)
[01:04:38] (CR) Tim Landscheidt: [C: -2] "(Just rebased, see above.)" [labs/toollabs] - https://gerrit.wikimedia.org/r/268563 (owner: Tim Landscheidt)
[09:10:43] Labs, Labs-Infrastructure: Deprecate precise instances in Labs by 03/31/2017 - https://phabricator.wikimedia.org/T143349#2927577 (akosiaris) >>! In T143349#2921072, @Jgreen wrote: > If my user is associated with the instance then it is ancient and > purgeworthy. The last work I did on OTRS was for the p...
[09:47:01] PROBLEM - Free space - all mounts on tools-worker-1003 is CRITICAL: CRITICAL: tools.tools-worker-1003.diskspace._var_lib_docker.byte_percentfree (No valid datapoints found) tools.tools-worker-1003.diskspace._public_dumps.byte_percentfree (No valid datapoints found) tools.tools-worker-1003.diskspace.root.byte_percentfree (<100.00%)
[13:02:47] hi all. is there anything special i need to take care of if i want to $ssh localhost on a lab instance (with bastion or so)? right now i get "public key denied" but keys is in .ssh and chmod is set
[13:26:12] mschwarzer: first you should really not put the key in ~/.ssh on a bastion instance
[13:27:00] (**NOT**, I should caps this)
[13:28:22] use agent forwarding if you really need that feature, though I'm really not sure why you would want to ssh to localhost
[13:29:35] zhuyifei1999_: not my key. i want to ssh from the lab instance (xxx.wmflabs, not bastion) to itself (localhost)
[13:30:00] (on xxx.wmflabs) $ ssh localhost
[13:31:31] I still don't understand why you want to ssh to localhost. you're already on that host, ssh to itself doesn't change the host you're on
[13:32:10] i'm testing some oozie workflow which should execute a flink job via ssh
[13:35:01] mschwarzer: have you debugged it with ssh -vvv localhost ?
[13:37:39] yes. it seems that pubkey is offered correctly: ...debug1: Offering RSA public key: /home/mschwarzer/.ssh/id_rsa
[13:37:57] debug3: send_pubkey_test debug2: we sent a publickey packet, wait for reply
[13:38:49] correct username? since you said it's not your key I'd assume the target user isn't you
[13:40:33] sshd on labs is quite integrated with ldap so there might be other thinks to watch out for
[13:40:33] *things
[13:44:13] * zhuyifei1999_ still finds it weird to ssh to localhost for whatever reason
[15:43:43] (PS1) Lokal Profil: Change Nepal namepace from Project to Draft [labs/tools/heritage] - https://gerrit.wikimedia.org/r/331325 (https://phabricator.wikimedia.org/T154857)
[15:45:58] (PS2) Lokal Profil: Change Nepal namespace from Project to Draft [labs/tools/heritage] - https://gerrit.wikimedia.org/r/331325 (https://phabricator.wikimedia.org/T154857)
[15:56:34] (CR) Multichill: [C: 2] Change Nepal namespace from Project to Draft [labs/tools/heritage] - https://gerrit.wikimedia.org/r/331325 (https://phabricator.wikimedia.org/T154857) (owner: Lokal Profil)
[15:57:49] (Merged) jenkins-bot: Change Nepal namespace from Project to Draft [labs/tools/heritage] - https://gerrit.wikimedia.org/r/331325 (https://phabricator.wikimedia.org/T154857) (owner: Lokal Profil)
[15:59:01] (CR) jenkins-bot: Change Nepal namespace from Project to Draft [labs/tools/heritage] - https://gerrit.wikimedia.org/r/331325 (https://phabricator.wikimedia.org/T154857) (owner: Lokal Profil)
[17:18:58] Labs, wikitech.wikimedia.org, MW-1.29-release-notes, Patch-For-Review, WMF-deploy-2017-01-17_(1.29.0-wmf.8): Job queue has 119663 entries - https://phabricator.wikimedia.org/T153618#2928136 (bd808) >>! In T153618#2927016, @Cavila wrote: > Will the patch be released for 1.28? I've posted a ba...
[17:20:40] Labs, wikitech.wikimedia.org, MW-1.29-release-notes, Patch-For-Review, WMF-deploy-2017-01-17_(1.29.0-wmf.8): LinksUpdate::acquirePageLock error with SMW enabled - https://phabricator.wikimedia.org/T153618#2928151 (bd808)
[17:50:53] andrewbogott: yuvipanda: whoever: is https://phabricator.wikimedia.org/T152456 ready for a decision? do you need more input?
[17:51:50] also we have a problem in tool labs with a monitoring script that tries to ensure that the webservice is up and running: the webservice cannot be found, can we run it on submit instead or can it be fixed?
[17:52:53] *the webservice script
[17:52:56] on the grid
[17:59:13] annika: we're all at a conference this week so won't be super responsive. I think regarding the quota increase, it would help if you can add more details about what in particular you want to do with the extra ram (like, more data processing, but what data, and processing it how, etc.)
[17:59:24] (sorry for the drive-by, returning to my conf session now)
[18:11:08] andrewbogott, we need the quota, because we are creating a huge quality management of dewiki, running round the clock, but at least once or twice daily. So it's necessary to get a huge amount of data of quality categories and quality list pages, templates, pagelinks and more by API and mySQL too, beyond dewiki too, and these data has to be processed too and output again. The quota now is much too
[18:11:15] small, so memory is running out time to time and CPU is critically. I hope, this helps. Thank you
[18:16:16] Can you paste that in the ticket? Thx
[18:17:25] Labs: Increase resource quota for dwl - https://phabricator.wikimedia.org/T152456#2928364 (doctaxon) Conversation in freenode channel Jan 09th 2017: 17:59 < andrewbogott> `annika: we're all at a conference this week so won't be super responsive. I think regarding the quota increase, it would help if you ca...
[18:35:23] Labs, Labs-Infrastructure: Deprecate precise instances in Labs by 03/31/2017 - https://phabricator.wikimedia.org/T143349#2928420 (dschwen) `fastcci-master` was successfully upgraded to 14.04LTS, please take it off the list. I'll work on `fastcci-worker1` next!
[19:23:25] Labs, Labs-Infrastructure: Deprecate precise instances in Labs by 03/31/2017 - https://phabricator.wikimedia.org/T143349#2928577 (Andrew)
[19:35:36] Labs, Labs-Infrastructure: Deprecate precise instances in Labs by 03/31/2017 - https://phabricator.wikimedia.org/T143349#2928622 (dschwen) NONONONO!!!!! CAN NOT BE DELETED. I upgraded it!!!!!!
[19:39:35] PROBLEM - Puppet run on tools-exec-1414 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[19:42:58] Labs, Tool-Labs, DBA: Provisioning MySQL replica users fails on tool labs - https://phabricator.wikimedia.org/T151014#2928690 (Marostegui) Thanks guys, let's try to coordinate next week after the all-hands so we have a normal schedule in case we need to roll back!
[19:50:50] Hi! I'm getting access denied to mysql db using replica.my.cnf for my user (danmichaelo) and the service account danmicholobot . Perhaps new files need to be generated?
[19:55:03] danmichaelo_: could you file a phabricator task to track this? Most of the Labs team at the Developer Summit right now so it may take us a bit of time to get to this.
[19:55:38] Ah, right, I'll do!
[20:00:25] danmichaelo_: thanks for the patience :)
[20:10:16] Labs, Tool-Labs: Credentials in replica.my.cnf gives Access Denied - https://phabricator.wikimedia.org/T154933#2928837 (Danmichaelo)
[20:11:57] No prob, this is not urgent :)
[20:13:33] Labs, Labs-Infrastructure, Tool-Labs, DBA, Wikimedia-Developer-Summit (2017): Labsdbs for WMF tools and contributors: get more data, faster - https://phabricator.wikimedia.org/T149624#2928877 (srishakatux) Note-taker(s) of this session: Follow the instructions here: https://www.mediawiki.or...
[20:13:49] Labs, Tool-Labs, Community-Tech-Tool-Labs, Developer-Relations, and 2 others: Developing community norms for vital bots and tools - https://phabricator.wikimedia.org/T149312#2928888 (srishakatux) Note-taker(s) of this session: Follow the instructions here: https://www.mediawiki.org/wiki/Wikimed...
[20:14:34] RECOVERY - Puppet run on tools-exec-1414 is OK: OK: Less than 1.00% above the threshold [0.0]
[20:26:26] Labs, Labs-Infrastructure: Deprecate precise instances in Labs by 03/31/2017 - https://phabricator.wikimedia.org/T143349#2928954 (Krenair)
[20:26:39] Labs, Labs-Infrastructure: Deprecate precise instances in Labs by 03/31/2017 - https://phabricator.wikimedia.org/T143349#2565438 (Krenair) better?
[20:26:41] yuvipanda: Is there a recommended route for use of Redis in Tool Labs via Kubernetes? Especially wrt to making reasonably sure the queue is persisted across restarts.
[20:27:24] Labs, Labs-Infrastructure: Deprecate precise instances in Labs by 03/31/2017 - https://phabricator.wikimedia.org/T143349#2928957 (dschwen) Phew! Thanks, yes, much better :-)
[20:27:24] Thinking of extending perflogbot (Node.js irc recently switched to K8s) to be two processes and use Redis as go-between for persisting messages for a short time (to account for restarts/downtime/failure/retry etc.)
[20:27:47] Krinkle: I think you can probably use a queue with a maximum length in redis?
[20:28:12] valhallasw`cloud: Yeah, thinking of using rpoplpush actually with a "inbox" and "work" queue.
[20:28:13] Labs, Labs-Infrastructure: Deprecate precise instances in Labs by 03/31/2017 - https://phabricator.wikimedia.org/T143349#2928961 (dschwen)
[20:28:21] and after sucessfully delivery, trim from the latter.
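(Annotation, not part of the channel log.) The "inbox"/"work" idea mentioned at 20:28:12 can be sketched roughly as below. This is only an illustration under stated assumptions, not perflogbot's or wikibugs' actual code: the key names, `deliver_next`, `requeue_unfinished` and the `FakeRedis` stand-in are all hypothetical. `FakeRedis` exists so the sketch runs without a server; its method names and argument order mirror the redis-py client calls `lpush`, `rpoplpush`, `lrem(name, count, value)` and `lrange`, so a real client should be usable in its place (a real client returns bytes unless `decode_responses=True` is set).

```python
from collections import defaultdict, deque


class FakeRedis:
    """Tiny in-memory stand-in for the few list commands used below (hypothetical)."""

    def __init__(self):
        self.lists = defaultdict(deque)

    def lpush(self, name, *values):
        for value in values:
            self.lists[name].appendleft(value)
        return len(self.lists[name])

    def rpoplpush(self, src, dst):
        if not self.lists[src]:
            return None
        value = self.lists[src].pop()        # take from the tail of src
        self.lists[dst].appendleft(value)    # put on the head of dst
        return value

    def lrem(self, name, count, value):
        # Only the count=1 case (remove the first match from the head) is modelled.
        try:
            self.lists[name].remove(value)
            return 1
        except ValueError:
            return 0

    def lrange(self, name, start, end):
        items = list(self.lists[name])
        return items[start:] if end == -1 else items[start:end + 1]


INBOX = "perflogbot:inbox"  # hypothetical key names
WORK = "perflogbot:work"


def deliver_next(client, send):
    """Move one message inbox -> work, try to send it, drop it from work on success."""
    message = client.rpoplpush(INBOX, WORK)
    if message is None:
        return None  # nothing queued
    if send(message):
        client.lrem(WORK, 1, message)  # delivered: remove it from the work list
    return message  # on failure the message simply stays in the work list


def requeue_unfinished(client):
    """On startup, move anything left in the work list back to the inbox for retry."""
    moved = 0
    while client.rpoplpush(WORK, INBOX) is not None:
        moved += 1
    return moved
```

The producer side would `lpush` onto the inbox. Note that `requeue_unfinished` puts retried messages behind anything already waiting in the inbox, so strict ordering is not preserved in this sketch.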
[20:28:39] Labs, Labs-Infrastructure: Deprecate precise instances in Labs by 03/31/2017 - https://phabricator.wikimedia.org/T143349#2565438 (dschwen)
[20:28:45] yeah, that sounds reasonable
[20:29:30] https://phabricator.wikimedia.org/T153168#2928963
[20:29:32] Labs, Labs-Infrastructure: Deprecate precise instances in Labs by 03/31/2017 - https://phabricator.wikimedia.org/T143349#2565438 (dschwen)
[20:29:34] valhallasw`cloud: if you're intersted^
[20:29:41] Labs, Labs-Infrastructure: Deprecate precise instances in Labs by 03/31/2017 - https://phabricator.wikimedia.org/T143349#2928965 (Krenair)
[20:30:27] Krinkle: you can probably use wb2irc (wikibugs) as irc component
[20:30:41] although that might be redis pub/sub at the moment
[20:30:56] but 90% of the infra (irc bot and redis connections) is in there
[20:31:23] https://github.com/wikimedia/labs-tools-wikibugs2/blob/master/redis2irc.py
[20:31:58] valhallasw`cloud: Cool. Might try, yeah.
[20:32:24] I wanna make sure though that it moves the messages to a separate list first and only after making sure that the irc connection is (still) all good, to remove it.
[20:32:30] it's not even pub/sub, it's just push/pop
[20:32:38] That's better in many ways, actually.
[20:32:38] *nod*
[20:32:47] That way it is preserved if the bot is actually not running.
[20:33:00] pub/sub is really just for "send to currently connected clients"
[20:33:13] which would be OK in the wikibugs case
[20:33:16] I suppose
[20:33:42] but it's be nice to still cover over brief outages (e.g. net split or re-configure restart)
[20:33:58] *nod*
[20:34:26] although if you wanna go really basic, one could use phab2stdout.py | stdin2irc.py ;-)
[20:35:04] separate un-puled processes is a bit better though, for restarts
[20:35:35] anyway, hoping not to have to do too much to set up redis on tools with k8s and have it persist (not just as a separate pod, but also for entire restarts/outage)
[20:36:13] Krinkle: with respect to persistence -- the tools-redis instance is disk-backed (as in: it dumps data there every X period of time)
[20:36:26] so data survives reboots of tools-redis as well
[20:40:35] PROBLEM - Puppet run on tools-exec-1414 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[20:54:20] valhallasw`cloud: yeah, exactly.
[20:55:41] Except I'd prefer not to literally use that one so that I can still run locally using the same configuration, and perhaps a bit more stable than the shared key instance
[20:56:37] Should be relatively simple to setup another pod in k8s that also persists. Though probably would need to be NFS which is supoptomal
[20:57:15] Krinkle: just use $REDIS_HOST to pass in the hostname?
[21:00:29] Labs, wikitech.wikimedia.org, MW-1.29-release-notes, Patch-For-Review, WMF-deploy-2017-01-17_(1.29.0-wmf.8): LinksUpdate::acquirePageLock error with SMW enabled - https://phabricator.wikimedia.org/T153618#2929069 (Cavila) okay, thanks for working on this
[21:10:09] valhallasw`cloud: yeah, but I feel icky about the keys being all exposed and shared
[21:10:39] $SUPERSECRETPREFIX = '...'? :P
[21:10:51] the features to list keys are turned off on tools-redis
[21:15:34] RECOVERY - Puppet run on tools-exec-1414 is OK: OK: Less than 1.00% above the threshold [0.0]
[21:54:07] (CR) Jean-Frédéric: [C: 2] Disallow adding hidden categories [labs/tools/heritage] - https://gerrit.wikimedia.org/r/328378 (https://phabricator.wikimedia.org/T153746) (owner: Lokal Profil)
[21:55:43] (Merged) jenkins-bot: Disallow adding hidden categories [labs/tools/heritage] - https://gerrit.wikimedia.org/r/328378 (https://phabricator.wikimedia.org/T153746) (owner: Lokal Profil)
[21:57:00] (CR) jenkins-bot: Disallow adding hidden categories [labs/tools/heritage] - https://gerrit.wikimedia.org/r/328378 (https://phabricator.wikimedia.org/T153746) (owner: Lokal Profil)
[21:57:16] !log tools.heritage Deploy latest from Git master: 3639bb0 (T153746), daee265 (T153842), 282b912, 6fd681c, e00ba31 (T154857)
[21:57:21] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.heritage/SAL
[21:57:21] T153746: ErfgoedBot categorisation process should ignore hidden categories - https://phabricator.wikimedia.org/T153746
[21:57:22] T153842: Download of composer cweiske/php-sqllint requires to disable https security - https://phabricator.wikimedia.org/T153842
[21:57:22] T154857: Fix Nepal in monuments database - https://phabricator.wikimedia.org/T154857
[22:11:34] PROBLEM - Puppet run on tools-exec-1414 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[22:46:32] RECOVERY - Puppet run on tools-exec-1414 is OK: OK: Less than 1.00% above the threshold [0.0]
[23:01:43] (PS1) Jean-Frédéric: Refactor database_statistics.getStatistics [labs/tools/heritage] - https://gerrit.wikimedia.org/r/331407
[23:03:05] (CR) jerkins-bot: [V: -1] Refactor database_statistics.getStatistics [labs/tools/heritage] - https://gerrit.wikimedia.org/r/331407 (owner: Jean-Frédéric)
[23:03:38] (PS2) Jean-Frédéric: Refactor database_statistics.getStatistics [labs/tools/heritage] - https://gerrit.wikimedia.org/r/331407
[23:04:55] (CR) Jean-Frédéric: "This raises coverage from 19% to 35% on that module − sweet :)" [labs/tools/heritage] - https://gerrit.wikimedia.org/r/331407 (owner: Jean-Frédéric)
[23:21:26] Labs, Labs-Infrastructure, Tool-Labs, DBA, Wikimedia-Developer-Summit (2017): Labsdbs for WMF tools and contributors: get more data, faster - https://phabricator.wikimedia.org/T149624#2929352 (bd808) Etherpad at https://etherpad.wikimedia.org/p/devsummit17-Labsdbs
[23:26:07] PROBLEM - Puppet run on tools-webgrid-lighttpd-1402 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]