[01:25:29] OK, so I know IP address information is filtered in error.log for tool labs. But I can't find any reference to that on-wiki. Can someone point me to it on-wiki for an enwiki BRfA?
[01:48:41] (CR) BryanDavis: [C: 1] Fix links for maintainers [labs/toollabs] - https://gerrit.wikimedia.org/r/281537 (https://phabricator.wikimedia.org/T131799) (owner: Tim Landscheidt)
[03:03:41] (PS70) Ricordisamoa: Initial commit [labs/tools/wikidata-slicer] - https://gerrit.wikimedia.org/r/241296
[03:09:59] (CR) Ricordisamoa: "PS70 updates grunt from ~0.4.5 to ~1.0.0 and removes grunt-cli" [labs/tools/wikidata-slicer] - https://gerrit.wikimedia.org/r/241296 (owner: Ricordisamoa)
[03:13:15] Matthew_: do you just want someone to say that on-wiki?
[03:18:48] legoktm: I was hoping for docs I can point to. The question is basically "we know you have the addresses and the information you put in the form, how are you safeguarding the usernames you collect" and I want to explain that I don't in fact have IP addresses.
[03:55:15] (PS20) Ricordisamoa: Initial commit [labs/tools/faces] - https://gerrit.wikimedia.org/r/192096
[03:59:11] (CR) Ricordisamoa: "PS20 updates grunt from ~0.4.5 to ~1.0.0 and removes grunt-cli" [labs/tools/faces] - https://gerrit.wikimedia.org/r/192096 (owner: Ricordisamoa)
[04:36:57] (CR) Tim Landscheidt: [C: 2] "Tested live." [labs/toollabs] - https://gerrit.wikimedia.org/r/281537 (https://phabricator.wikimedia.org/T131799) (owner: Tim Landscheidt)
[04:37:24] Labs, Tool-Labs, Patch-For-Review: Fix URL encoding of link to user's profile on 'No webservice' warning page - https://phabricator.wikimedia.org/T131799#2179729 (scfc) Open>Resolved Will be fixed within 30 minutes.
[08:18:29] Change on wikitech.wikimedia.org a page Nova Resource:Tools/Access Request/Metalindustrien was created, changed by Metalindustrien link https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Access_Request/Metalindustrien edit summary: Created page with "{{Tools Access Request |Justification=Running pywikibot for basic tasks on Danish Wikipedia like replacing templates, etc. Already have bot flag there |Completed=false |User N..."
[12:23:56] (CR) Jean-Frédéric: "recheck" [labs/tools/heritage] - https://gerrit.wikimedia.org/r/281280 (owner: Jean-Frédéric)
[12:27:49] Labs, Tool-Labs: Install virtualenvwrapper on tools bastions - https://phabricator.wikimedia.org/T131840#2180220 (tom29739)
[12:55:10] Labs, Horizon, Patch-For-Review, Upstream: Increase horizon session length - https://phabricator.wikimedia.org/T130621#2180270 (hashar) That is an upstream issue :( https://bugs.launchpad.net/django-openstack-auth/+bug/1562452 @Andrew proposed a patch: https://review.openstack.org/#/c/298002/
[13:35:47] Labs, Tool-Labs, Tracking: Packages to be added to toollabs puppet - https://phabricator.wikimedia.org/T55704#2180387 (chasemp)
[13:35:52] Labs, Tool-Labs, Patch-For-Review: Install virtualenvwrapper on tools bastions - https://phabricator.wikimedia.org/T131840#2180385 (chasemp) Open>Resolved a:chasemp
[14:10:46] PROBLEM - Puppet run on tools-pastion-01 is CRITICAL: CRITICAL: 77.78% of data above the critical threshold [0.0]
[14:12:39] PROBLEM - Puppet run on tools-bastion-05 is CRITICAL: CRITICAL: 11.11% of data above the critical threshold [0.0]
[14:15:17] PROBLEM - SSH on tools-pastion-01 is CRITICAL: Connection refused
[14:20:19] RECOVERY - SSH on tools-pastion-01 is OK: SSH OK - OpenSSH_6.9p1 Ubuntu-2~trusty1 (protocol 2.0)
[14:21:19] Since when do we have a 'tools-pastion-01'? Is that a spelling error?
[14:34:52] we have one, but it's unreachable. Nagf shows one
[14:39:36] Labs: Can't delete security groups (in horizon or OSM) - https://phabricator.wikimedia.org/T129437#2180471 (Andrew)
[14:41:10] Labs: Can't delete security groups (in horizon or OSM) - https://phabricator.wikimedia.org/T129437#2105624 (Andrew) Open>Resolved This is fixed for our install. I suspect that our install is a corner case (since it is MUCH older than other production clouds) so I'll let the upstream people decide if th...
[14:44:37] tools-pastion-01 just seems to go to tools-bastion-10
[14:51:59] So it seems that the version of /usr/bin/php on tool labs "as a tool" is "5.5.9-1ubuntu4.14", but the cluster one is "5.3.10-1ubuntu3.21+wmf1", which dies on the new PHP array syntax. Can the PHP version on the cluster be upgraded, or can I specify a version/PHP path?
[14:56:10] magnus_: the cluster (i.e. production) or the grid?
[14:56:44] magnus_: for the grid, specify a trusty exec host (-l release=trusty for jsub, --release=trusty for webservice)
[14:58:08] valhallasw`cloud: Thanks!
[15:07:39] Labs: New labs project: debdeploy - https://phabricator.wikimedia.org/T131852#2180597 (MoritzMuehlenhoff)
[15:10:18] Labs, Tracking: New Labs project requests (tracking) - https://phabricator.wikimedia.org/T76375#2180603 (Andrew)
[15:10:20] Labs: New labs project: debdeploy - https://phabricator.wikimedia.org/T131852#2180600 (Andrew) Open>Resolved a:Andrew Done. Moritz, please remember to delete the old instances that are being replaced here; we're running low on space.
[16:26:37] Labs, Shinken: Downsize instances to actual need - https://phabricator.wikimedia.org/T131859#2180822 (scfc)
[17:03:39] bd808: Currently here?
[17:20:55] PROBLEM - Host tools-worker-1011 is DOWN: PING CRITICAL - Packet loss = 100%
[18:05:43] PROBLEM - Puppet run on tools-worker-1007 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[18:07:37] PROBLEM - Puppet run on tools-k8s-master-01 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0]
[18:11:50] PROBLEM - Puppet run on tools-worker-1002 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[18:22:39] RECOVERY - Puppet run on tools-k8s-master-01 is OK: OK: Less than 1.00% above the threshold [0.0]
[18:34:57] Labs, Labs-Infrastructure: Sort out our libvirt qcow2 hack with the upstream - https://phabricator.wikimedia.org/T131548#2181281 (Andrew) Context: https://phabricator.wikimedia.org/P2858
[18:35:49] Labs, Labs-Infrastructure: Sort out our libvirt qcow2 hack with the upstream - https://phabricator.wikimedia.org/T131548#2181282 (Andrew) I have confirmed that the patch is still meaningful, and should be perpetuated for the short run. I also have a lackluster patch in for upstream review: https://revie...
[18:50:47] RECOVERY - Puppet run on tools-worker-1007 is OK: OK: Less than 1.00% above the threshold [0.0]
[18:56:25] Labs question, if I only collect usernames am I required to show the disclaimer at https://wikitech.wikimedia.org/wiki/Wikitech:Labs_Terms_of_use#If_my_tools_collect_Private_Information... ?
[18:56:29] *Tool labs
[18:59:04] Matthew_: I don't think so
[18:59:36] usernames within tools are public atm anyway
[18:59:54] YuviPanda: And it's impossible for me to get IP addresses correct? The ones listed in access.log and error.log are 10. addresses.
[19:00:02] chasemp: No, I'm collecting usernames via a web form.
[19:00:13] (It's optional but that's a snag on the brfa I'm doing.)
[19:00:35] https://en.wikipedia.org/wiki/Wikipedia:Bots/Requests_for_approval/Matthewrbot <-- They keep asking about the web-based tool part of it.
[19:01:16] Matthew_: not without resorting to nefarious means (like embedding 3rd party plugins or what not in your web tool)
[19:01:34] YuviPanda: Okay. Which I am not doing :)
[19:04:21] Thank you.
[20:25:16] bd808: T131886
[20:25:29] maybe you can take a look?
[20:26:39] Luke081515: wrong directory?
[20:27:59] Luke081515: cd /vagrant/mediawiki
[20:28:21] composer works on the current working directory
[20:29:22] meh, then I get a big red area with this text:
[20:29:26] [Composer\Repository\InvalidRepositoryException]
[20:29:26] Invalid repository data in /vagrant/mediawiki/vendor/composer/installed.json, packages could not be loaded: [UnexpectedValueExcepti
[20:29:29] on] Could not parse package list from the repository
[20:29:54] fun
[20:30:11] The installed packages list is corrupt. Have fun :D
[20:30:45] The "easiest" thing to do would be to wipe out mediawiki/vendor and composer.lock and start over
[20:30:47] I'd imagine there's some way to regenerate it.
[20:31:21] Don't wipe out composer.lock, just delete the vendor directory.
[20:33:02] We pin all package versions in composer.json so keeping composer.lock doesn't really save anything useful
[20:33:22] but yeah not technically necessary I guess
[20:36:05] !log tools.stashbot Restarted stashbot process; dead from phabricator timeouts
[20:36:08] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.stashbot/SAL, Master
[20:39:44] !log tools Elasticsearch processes down. Looks like a prod puppet change that needs tweaking for tool labs
[20:39:48] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL, Master
[20:40:39] stashbot is flapping again. Was the other day.
[20:41:05] yeah. I'm on it
[20:42:11] I see from the status line something related to what I was going to ask about: namely the ssh warning I get when I try to log into tools-login.wmflabs.org. First: Is that still the host I should be logging into? (If it is, I gather I should be prepared for a new key?)
[20:42:33] bd808: https://commons.wikimedia.org/wiki/File:Jerusalem_Hackacthon_IMG_8395.JPG :-D
[20:42:43] 'bd808 and reedy arrive to save the day'
[20:42:50] heh
[20:43:43] JMarkOckerbloom: yup, and yup
[20:43:56] okay. (why the new key, btw?)
[20:44:12] new larger VM
[20:45:21] OK. So I'll just clear the old key out and try logging in again. Just wanted to make sure I wasn't trying to log into an obsolete address that was now insecure.
[20:53:40] Thanks! While I'm here, does anyone on channel know if there will be April dump files generated in dumps.wikimedia.org? I recall seeing some earlier discussion about possibly shifting to a new distribution mechanism, though I thought that was some distance off.
[21:02:04] !log tools Forcing puppet runs to fix elasticsearch
[21:02:09] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL, Master
[21:02:25] blerg
[21:02:42] Duplicate declaration: Class[Nginx] is already declared in file /etc/puppet/modules/elasticsearch/manifests/https.pp:25; cannot redeclare at /etc/puppet/modules/role/manifests/toollabs/elasticsearch.pp:11 on node tools-elastic-01.tools.eqiad.wmflabs
[21:03:03] puppet sucks
[21:04:02] yes, I took a swing at that yesterday; it's because class nginx is defined there in toollabs looking for 'light'
[21:04:18] but it's not passed through and nginx is redefined another lvl down in elastic setup
[21:04:27] I haven't gotten time to revisit but
[21:05:29] why did discovery do it like that...
[21:05:34] * bd808 grumbles
[21:06:35] oh because of the ensure
[21:06:37] hmm
[21:06:48] yeah that's shitty
[21:07:49] the require of ::elasticsearch::https should be moved to their role
[21:08:28] untangling needed yeah
[21:08:30] bd808: just kill the 'light' variant in the tool labs manifest and let the ES manifest install nginx?
[21:09:02] yeah that will work I'm sure
[21:09:07] I tried that
[21:09:14] it then wants all kinds of special case ssl prod stuff
[21:09:25] there is some serious feature creep happening in the elasticsearch module
[21:09:35] yes it's all things atm
[21:09:45] I'm going to whine at gehel
[21:10:09] it's not role::cirrussearch :)
[21:10:31] I was close to passing through the variant option to the bottom via elasticsearch module and let that be done but it's messy
[21:10:34] and I was out of time
[21:11:58] https://gerrit.wikimedia.org/r/#/c/281464/
[21:12:04] https://gerrit.wikimedia.org/r/#/c/281482/
[21:12:44] actually, I think the ES manifest is *removing* nginx
[21:12:54] class elasticsearch::https (
[21:12:54] $ensure = absent,
[21:13:02] and that's passed to class { [ 'nginx', 'nginx::ssl' ]:
[21:13:38] the class does but they pass in present when calling I think
[21:13:41] defaulting to absent etc
[21:14:01] which seems backwards
[21:14:14] (PS1) Jean-Frédéric: Fix import of pywikibot bits in remaining erfgoed scripts [labs/tools/heritage] - https://gerrit.wikimedia.org/r/281809
[21:14:15] yeah. the ensure logic should really be ternary (present, absent, dontyoudaretouchmyfiles)
[21:14:16] (PS1) Jean-Frédéric: Enable cover-inclusive for erfgoedbot unitests [labs/tools/heritage] - https://gerrit.wikimedia.org/r/281810
[21:14:18] (PS1) Jean-Frédéric: Add unit tests for ucfirst method [labs/tools/heritage] - https://gerrit.wikimedia.org/r/281811
[21:14:20] (PS1) Jean-Frédéric: Extract method extract_elements_from_template_param from update_database [labs/tools/heritage] - https://gerrit.wikimedia.org/r/281812
[21:24:03] !log tools Committed local hack on tools-puppetmaster-01 to get elasticsearch working again
[21:24:06] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL, Master
[21:29:04] !log tools.stashbot Restarted after fixing elasticsearch cluster
[21:29:07] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.stashbot/SAL, Master
[21:29:38] !log tools.sal Missing data since 2016-04-04T16:36
[21:29:39] RECOVERY - Puppet run on tools-elastic-02 is OK: OK: Less than 1.00% above the threshold [0.0]
[21:29:46] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.sal/SAL, Master
[21:30:08] chasemp: I local hacked it. Talking to gehel about a proper fix
[21:30:19] bd808: ok cool
[21:30:52] bd808, chasemp: can you wait until tomorrow for the fix?
[21:31:03] And again, sorry for the pain...
[21:31:28] gehel: I'm sure, yeah. wanna just grab https://phabricator.wikimedia.org/T131644?
[21:31:31] RECOVERY - Puppet run on tools-elastic-03 is OK: OK: Less than 1.00% above the threshold [0.0]
[21:35:07] Labs, Tool-Labs, Patch-For-Review: Puppet fails on tools-elastic-01, tools-elastic-02 and tools-elastic-03: "Class[Nginx] is already declared" - https://phabricator.wikimedia.org/T131644#2182067 (Gehel) a:Gehel
[21:38:25] we have a lot of things in roles...
[21:38:50] Who should I ping to get a presentation of how we architect our puppet code?
[21:39:47] I don't know that a thing like that has ever been done
[21:39:56] gehel: lol
[21:40:15] architecture and our puppet are very loosely related
[21:41:15] Ok, I use big words, but it seems that there is some purpose behind the madness and I'm used to a different madness. My guess is that we have some unwritten reasons to do things the way we do them here ...
[21:41:28] gehel: any rule I gave you about it I could find 10 places that broke that rule
[21:41:36] we have a style guide
[21:41:37] somewhere
[21:41:49] I read it, but it is fairly low level.
[21:42:07] gehel: I know what you meant. I was flippantly pointing out that there is no overarching design to the puppet code I have seen
[21:42:23] I think usually it is 'ah shit, this module is used in two places, let me take out things that aren't common to both into the role'
[21:43:04] technically
[21:43:09] modules are meant to be generic
[21:43:10] The fact that I use things in Labs has caused many problems before. "design for reuse" isn't high on the list when people are producing puppet code here
[21:43:11] The kind of question I would ask: why do we define ferm rules in elasticsearch role, where it seems to me that it clearly belongs to the elasticsearch module
[21:43:19] roles are meant for parameterization
[21:43:34] gehel: well not when that module can go places w/ no iptables
[21:43:38] gehel: why should the module depend on another random module?
[21:44:03] gehel: as an example, take the redis module. on tools, we have 3 different redises doing very different tasks with very different ferm rules limiting/granting access to different places depending on their role.
[21:44:05] inter-module dependencies make me crabby
[21:44:12] so we have one redis module and 3 different roles
[21:44:37] there is no good rule about it, but in theory modules are distinct do-one-thing-well agnostic unixy things
[21:44:37] that play in all realms and sites
[21:44:46] and roles encapsulate multiple modules and parameterize them to a production state for that specific instantiation
[21:44:53] we all have a diff explanation :)
[21:45:23] I'd agree with your explanation, chasemp
[21:45:25] I'm actually used to having 3 layers. Leaf modules that are usually external modules, stack modules that define how we use a particular component in our context and roles which actually instantiate all this.
[21:46:13] gehel: *nod* we just squish stack and role into one pile; and avoid 3rd party modules because we love to reinvent wheels
[21:46:20] well it gets more complex if you dig through the $realm check layers w/ roles and modules even
[21:46:29] and frankly I do genuinely dislike 3rd party modules :D
[21:46:49] or at least 99% are so poorly done I rarely even look anymore
[21:47:00] I have never read a module from puppetforge that made me think "wow, that's nice code"
[21:47:02] I used to hate 3rd party modules, but I learned to like them quite a lot...
[21:47:20] * YuviPanda is in line with gehel, but $energy
[21:47:30] for fun look at modules/stdlib in our repo :)
[21:47:55] sure a few are ok, I recall trying to use one here ages ago
[21:48:10] and ran into 10 places where it assumed we used another 3rd party module
[21:48:16] so it's an ecosystem and without buy-in
[21:48:21] it was just impossible
[21:48:22] yeah, I had a look. And actually I did the same at my last job. It took 6 months to be able to get back to a standard stdlib...
[21:48:38] RECOVERY - Puppet run on tools-bastion-05 is OK: OK: Less than 1.00% above the threshold [0.0]
[21:49:16] if I had to describe our puppet code the word would be, fractured
[21:49:21] and then there's modules/wmflib and the partial copy of it in mediawiki-vagrant
[21:49:37] at least we got rid of most $::realm from modules
[21:49:43] most of it
[21:50:06] it was really bad before we got hiera
[21:51:23] yeah
[21:51:31] while we are talking about it, why do we have a single repo for all modules?
[21:51:48] there are some submodule-oriented modules
[21:51:52] I think the correct answer is 'some people hate submodules'
[21:52:01] which I wouldn't describe as a success
[21:52:03] :)
[21:52:04] yup, that's the answer
[21:52:13] I hate submodules
[21:52:21] * bd808 would love to use puppet-librarian instead
[21:52:29] more precisely, I hate git submodules
[21:52:41] * gehel would settle for r10k
[21:53:26] puppet still does not have a proper dependency management system, and the discussions about it I had with puppetlabs were not reassuring...
[21:53:32] I have various wishes for making that more sane in venues I care about
[21:53:39] but it's a huge massive task to do right
[21:53:59] tbh puppetlabs never inspires confidence
[21:54:04] we go along despite them not with them
[21:54:08] matanya wanted me to work on puppet stuff with him at the hackathon but I didn't have the energy
[21:54:23] indeed
[21:54:43] he wants to be able to do project custom code without a local puppet master
[21:54:53] I have barely said it aloud but I want a staging that uses puppet environments and allows for a less local hack driven puppetmaster thing and releng to do their own merges into a staging branch
[21:54:58] bd808: heh, puppetception :)
[21:55:02] and various other things that play into that scenario
[21:55:10] YuviPanda: yeah :)
[21:55:14] bd808: I killed that module because (honest reason being) I got hired into ops right after I wrote it
[21:55:19] what about packer?
[21:55:21] and so nobody used it and it rotted
[21:55:51] matanya: isn't that just a container build tool?
[21:56:04] it can master stuff too
[21:56:23] Packer is a tool for creating machine and container images for multiple platforms from a single source configuration.
[21:56:38] so no, not suited
[21:56:56] matanya: I was thinking about the puppet things you want on the flight home. it's harder than I thought I think
[21:57:16] because we aren't masterless
[21:57:28] as always :) i never get to come up with easy problems
[21:58:33] matanya: what is it that you want to do?
[21:58:59] have puppet work in labs with minimal effort on tool writers
[21:59:26] scaffolding for people to use puppet to deploy their tools?
[21:59:40] well manage Labs instances
[21:59:49] for easy backend setups
[21:59:49] * YuviPanda is now of the opinion that borg-like systems are what should be considered 'future', but that is of no use to anyone today
[22:00:01] that seems like a large and a bit vague wish...
[22:00:04] he doesn't like our self-hosted puppet master setup
[22:00:15] it is a pain to maintain
[22:00:23] puppet is a pain to maintain
[22:00:30] I'd much prefer environments...
[22:00:30] and differs from prod quickly
[22:00:38] it shouldn't
[22:00:48] we keep up with prod in beta cluster
[22:00:53] oh yeah I agree there entirely
[22:00:54] and tool labs
[22:01:00] not to talk about conflicts, and packaging nightmares
[22:01:01] our current scheme is bad
[22:01:48] and lang-specific packages people tend to use, e.g pip, rvm etc
[22:01:50] I'm sure there could be something much better, but it's so much better than it was 2 years ago...
[22:01:55] 'but it is just labs! we use it for testing production changes only, right?'
[22:02:24]
[22:02:45] yes sarcasm heavy enough to knock a hole in the earth
[22:02:55] just to make sure I don't break something else... what kind of node classifier do we use for beta? I wanted to make sure that the search / cirrus cluster is the only one to use the elasticsearch role (you never know)
[22:03:19] It would be great to have some method of sharing between mw-vagrant, labs and prod that didn't make people yell about git submodules
[22:03:42] bd808: r10k...
[22:04:11] more software from puppetlabs can't really be the answer can it? ;)
[22:04:17] * bd808 has to go run some errands
[22:04:32] or puppet librarian. At job^1 we had around 50 Vagrant VMs based on the same puppet modules
[22:04:50] oh, and gehel : welcome !
[22:04:58] * YuviPanda is personally going to move all of his 'non-work' puppet modules into k8s in the short-to-medium-term
[22:04:59] r10k is some version of what I would like to do in some of those places
[22:05:04] didn't greet you yet :)
[22:05:22] We all hated r10k, but it got the job done...
[22:09:48] bd808, chasemp: I just committed https://gerrit.wikimedia.org/r/281824 which should fix the issue. But it's too late here for me to push it any further before I get some sleep.
[22:10:01] gehel: no worries thanks man
[22:10:50] on that note, time to get some sleep...
[22:19:36] Virtualenvwrapper just got installed. But it isn't working. Did it not install properly?
[22:29:15] Labs, Tool-Labs, Patch-For-Review: Upgrade to Kubernetes 1.2 - https://phabricator.wikimedia.org/T130972#2182217 (yuvipanda) More patches fixing things here and there, but we're on 1.2 now! \o/ https://etherpad.wikimedia.org/p/T130972 has more details that should be distilled out here. Action items f...
[22:33:26] PROBLEM - SSH on tools-worker-1012 is CRITICAL: Connection refused
[22:35:00] PROBLEM - Puppet run on tools-worker-1010 is CRITICAL: CRITICAL: 12.50% of data above the critical threshold [0.0]
[22:45:09] RECOVERY - Puppet run on tools-worker-1010 is OK: OK: Less than 1.00% above the threshold [0.0]
[22:51:20] Labs, Tool-Labs: Virtualenvwrapper script does not exist - https://phabricator.wikimedia.org/T131898#2182259 (tom29739)
[22:56:41] Labs, Phabricator, Puppet: Phabricator labs puppet role configures phabricator wrong - https://phabricator.wikimedia.org/T131899#2182293 (Luke081515)
[23:08:49] RECOVERY - Puppet run on tools-worker-1002 is OK: OK: Less than 1.00% above the threshold [0.0]