[04:43:16] 10Beta-Cluster-Infrastructure: Requesting interface administrator/bureaucrat right on fa.wikipedia.beta.wmflabs.org for User:Dalba - https://phabricator.wikimedia.org/T207622 (10Dalba) [07:39:14] PROBLEM - Free space - all mounts on deployment-maps03 is CRITICAL: CRITICAL: deployment-prep.deployment-maps03.diskspace._srv.byte_percentfree (<55.56%) [07:43:21] 10Phabricator (Upstream), 10Upstream: Allow Phabricator pastes to be formatted as remarkup - https://phabricator.wikimedia.org/T207606 (10Aklapper) p:05Triage>03Lowest [08:29:16] RECOVERY - Free space - all mounts on deployment-maps03 is OK: OK: All targets OK [09:10:49] 10Beta-Cluster-Infrastructure, 10Patch-For-Review, 10Puppet: Puppet broken on deployment-deploy* - https://phabricator.wikimedia.org/T207487 (10elukey) 05Open>03Resolved Fixed by Joe with https://gerrit.wikimedia.org/r/468937 [09:17:15] <_joe_> !log deployment-prep: cherry-pick https://gerrit.wikimedia.org/r/c/operations/puppet/+/468583/, disable puppet on deployment-mediawiki-09 [09:17:17] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [09:18:28] 10Release-Engineering-Team (Watching / External), 10DBA, 10Operations, 10cloud-services-team, and 2 others: Move some wikis to s5 - https://phabricator.wikimedia.org/T184805 (10Marostegui) >>! In T184805#4654953, @Marostegui wrote: > This was done successfully and new wikis are now live on eqiad. > What is... [09:24:42] <_joe_> oh so many nice differences between prod and deployment-prep [09:24:48] <_joe_> even apache-fast-test doesn't work :P [09:25:50] <_joe_> Krenair: my tests with apache-fast-test seem to say the beta change is a success [09:27:32] <_joe_> I'm honestly impressed, but I have to say the beta vhosts were more standardized than the production ones [09:29:58] were more standardized? huh [09:30:10] I thought they were a hacked together diverged copy of some old prod config [09:31:05] anyway, good to kill off another unnecessary prod vs. 
deployment-prep difference :)
[09:33:49] <_joe_> well, the manifests are different, but at least the configuration is more consistent
[09:34:04] <_joe_> I don't think unifying the puppet code completely makes sense
[09:46:59] Release-Engineering-Team (Watching / External), DBA, Operations, cloud-services-team, and 2 others: Move some wikis to s5 - https://phabricator.wikimedia.org/T184805 (jcrespo) a:jcrespo>Marostegui
[09:55:31] Release-Engineering-Team, ContentTranslation: cx-entrypoint-dialog-page-doesnt-exist-yet showing
- https://phabricator.wikimedia.org/T207522 (10Nikerabbit) That looks like resource loader using a stale message. Maybe scap does not clear resource loader blobs like l10nupdate does? [10:05:14] PROBLEM - Free space - all mounts on deployment-maps03 is CRITICAL: CRITICAL: deployment-prep.deployment-maps03.diskspace._srv.byte_percentfree (<44.44%) [10:35:15] RECOVERY - Free space - all mounts on deployment-maps03 is OK: OK: All targets OK [12:06:13] PROBLEM - Free space - all mounts on deployment-maps03 is CRITICAL: CRITICAL: deployment-prep.deployment-maps03.diskspace._srv.byte_percentfree (<22.22%) [12:26:14] PROBLEM - Free space - all mounts on deployment-maps03 is CRITICAL: CRITICAL: deployment-prep.deployment-maps03.diskspace._srv.byte_percentfree (<44.44%) [12:51:14] RECOVERY - Free space - all mounts on deployment-maps03 is OK: OK: All targets OK [12:56:49] _joe_, I don't know about unifying all of the puppet code, but the main sites? [12:57:38] <_joe_> Krenair: maybe? I'm not so sure tbh [12:59:32] 10Release-Engineering-Team, 10Language-strategy, 10MediaWiki-extensions-WikimediaIncubator, 10Epic, 10I18n: Make creating a new Language project easier - https://phabricator.wikimedia.org/T165585 (10Amire80) [13:01:02] !log Beta: Update cxserver to 7f996f3 [13:01:04] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [13:21:40] 10Beta-Cluster-Infrastructure, 10Operations, 10Puppet, 10Technical-Debt, 10Tracking: Minimize differences between beta and production (Tracking) - https://phabricator.wikimedia.org/T87220 (10Joe) [13:47:50] 10Beta-Cluster-Infrastructure, 10Cloud-VPS (Ubuntu Trusty Deprecation): cloudvps: deployment-prep project trusty deprecation - https://phabricator.wikimedia.org/T204500 (10Krenair) Hey @mobrovac @Mvolz! Just a friendly reminder that you should get rid of your Trusty instances as described in https://wikitech.w... 
[14:17:26] 10Beta-Cluster-Infrastructure, 10Operations, 10Puppet, 10Technical-Debt, 10Tracking: Minimize differences between beta and production (Tracking) - https://phabricator.wikimedia.org/T87220 (10Krenair) [14:17:34] 10Beta-Cluster-Infrastructure, 10Goal, 10Patch-For-Review, 10Puppet: Remove all ::beta roles in puppet - https://phabricator.wikimedia.org/T86644 (10Krenair) [14:17:41] 10Beta-Cluster-Infrastructure, 10Patch-For-Review, 10Wikimedia-Incident: Rework beta apache config - https://phabricator.wikimedia.org/T1256 (10Krenair) 05Open>03Resolved a:03Joe [15:24:45] 10Project-Admins, 10Analytics: Create project for SWAP - https://phabricator.wikimedia.org/T207425 (10Milimetric) Seems fine by us, but when phab admins approve it someone should add a herald rule to tag anything with Analytics-SWAP with #Analytics also. And Analytics-SWAP is fine with us - you get to name it! [15:56:03] Hi, this might be a FAQ but I'm trying to set up a continuous integration job to validate that my extension remains compatible with MediaWiki-LTS. Is there such a job defined? [16:29:11] 10Deployments, 10Phabricator, 10Release-Engineering-Team (Kanban), 10Developer Productivity: Create a permalink which always redirects to the current week's train blocker task - https://phabricator.wikimedia.org/T207669 (10mmodell) [16:29:28] 10Deployments, 10Phabricator, 10Release-Engineering-Team (Kanban), 10Developer Productivity: Create a permalink which always redirects to the current week's train blocker task - https://phabricator.wikimedia.org/T207669 (10mmodell) p:05Triage>03Low [16:37:03] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10MediaWiki-Parser, 10Quibble, and 2 others: [REL1_30] Some parserTests fail on debian stretch using Tidy, because of a new version of libtidy - https://phabricator.wikimedia.org/T191771 (10dduvall) @hashar, any update? Moving to backlog until we... 
[16:39:21] 10Beta-Cluster-Infrastructure, 10Wikimedia-Logstash: Logstash in beta doesn't have any logs - https://phabricator.wikimedia.org/T205863 (10Ladsgroup) 05Resolved>03Open It's empty again :( [16:39:48] 10Phabricator: Make phabricator tasks collapsible - https://phabricator.wikimedia.org/T207671 (10Theklan) [16:42:11] 10Release-Engineering-Team (Kanban): TEC13:O1.1:Q1 Goal - Investigate and propose record of origin (ROO) for deployed code (currently Developers/Maintainers page) - https://phabricator.wikimedia.org/T199253 (10Jrbranaa) [16:42:14] 10Release-Engineering-Team (Kanban), 10Analytics-Tech-community-metrics, 10Code-Health: Develop canonical/single record of origin, machine readable list of all repos deployed to WMF sites. - https://phabricator.wikimedia.org/T190891 (10Jrbranaa) [16:42:17] Amir1, hi [16:42:28] Krenair: Hey [16:42:47] 10Continuous-Integration-Config, 10Release-Engineering-Team (Kanban), 10Operations, 10Patch-For-Review, 10Puppet: Get rid of "import realm.pp" in manifests/site.pp - https://phabricator.wikimedia.org/T154915 (10dduvall) @hashar, this is marked as "Done" in RelEng Kanban. Any update? [16:42:48] I've noticed on that deployment-logstash2 host that when I run puppet, it always tries to start elasticsearch? [16:42:51] i.e.: [16:42:56] Notice: /Stage[main]/Elasticsearch/Elasticsearch::Instance[beta-search]/Service[elasticsearch_5@beta-search]/ensure: ensure changed 'stopped' to 'running' [16:43:44] Oct 22 16:41:45 deployment-logstash2 systemd[1]: Starting Elasticsearch (cluster beta-search)... [16:43:44] Oct 22 16:41:45 deployment-logstash2 systemd[1]: Failed to reset devices.list on /system.slice: Invalid argument [16:43:49] huh [16:44:13] 10Release-Engineering-Team (Kanban), 10Scap, 10User-MModell: Document scap swat command - https://phabricator.wikimedia.org/T196411 (10mmodell) I have committed to have something for this by the time of the offsite. [16:44:20] logstash requires elasticsearch. 
[16:44:23] (i think) [16:44:31] yeah that's why I'm looking at it and wondering... [16:44:54] what does sudo service elasticsearch status [16:44:55] show? [16:45:14] or rather: [16:45:17] sudo journalctl -u elasticsearch [16:45:38] it's inactive [16:45:47] since Fri 2018-09-21 08:59:14 UTC; 1 months 0 days ago [16:45:54] Warning: Journal has been rotated since unit was started. Log output is incomplete or unavailable. [16:46:56] sudo service elasticsearch stop && sudo service elasticsearch start [16:48:20] Krenair: yeah, the puppet code is being refactored and I think that's the reason it's broken for beta now [16:48:31] it happened before too [16:48:53] 10Phabricator, 10User-MModell: Add support for task types - https://phabricator.wikimedia.org/T93499 (10mmodell) [16:48:56] 10Phabricator, 10Release-Engineering-Team (Kanban), 10Security-Team, 10User-MModell: Should security tasks be a custom type in maniphest? - https://phabricator.wikimedia.org/T204160 (10mmodell) 05Open>03Resolved I [16:49:14] paladox, hope this works... [16:49:26] heh [16:49:28] Amir1, can you try now? [16:49:34] the elasticsearch search came back up [16:50:14] it says logstash is red [16:50:40] 10Release-Engineering-Team (Kanban), 10Code-Health-Metrics: Define a code health metrics reporting approach/strategy - https://phabricator.wikimedia.org/T205143 (10Jrbranaa) p:05Triage>03Normal [16:50:57] sudo service logstash stop && sudo service logstash start [16:51:28] paladox, logstash is running... [16:51:35] oh [16:51:39] yeh [16:51:47] 10Release-Engineering-Team (Kanban), 10Scap: Check 'Check endpoints for mwdebug2002.codfw.wmnet' failed: /wiki/{title} (Main Page) is WARNING: Test Main Page responds with unexpected body - https://phabricator.wikimedia.org/T206620 (10dduvall) p:05Triage>03Normal a:03thcipriani [16:51:55] but could it still run even when elasticsearch was down? 
[16:52:09] (and not reconnect to elasticsearch on restart)
[16:52:12] Release-Engineering-Team (Kanban): Add Services (and other non-extensions) to the deployment review process - https://phabricator.wikimedia.org/T203701 (Jrbranaa) p:Normal>Low
[16:52:21] Release-Engineering-Team (Kanban): Add Code Stewardship review to Review Queue process - https://phabricator.wikimedia.org/T203698 (Jrbranaa) p:Normal>Low
[16:52:27] will try restarting logstash anyway
[16:52:50] Release-Engineering-Team (Kanban), Wikimedia-Blog-Content: [Technical Debt Series]Avoiding New Technical Debt - https://phabricator.wikimedia.org/T175184 (Jrbranaa) Open>stalled p:Normal>Low
[16:53:04] Continuous-Integration-Config, Release-Engineering-Team (Kanban): Have dependencies of gated extensions in the gate - https://phabricator.wikimedia.org/T204252 (dduvall) @hashar, would you please triage priority and placement on the RelEng Kanban board? Thanks!
[16:53:07] Release-Engineering-Team (Kanban), Wikimedia-Blog-Content: [Technical Debt Series]How to remove Technical Debt - https://phabricator.wikimedia.org/T175183 (Jrbranaa) Open>stalled p:Triage>Low
[16:53:35] Beta-Cluster-Infrastructure, Release-Engineering-Team (Kanban), Scap: On deployment-prep scap cache_git_info takes 12 minutes (that is too slow) - https://phabricator.wikimedia.org/T204762 (dduvall) p:Triage>Low
[16:55:09] Beta-Cluster-Infrastructure, Release-Engineering-Team (Kanban), Scap: On deployment-prep scap cache_git_info takes 12 minutes (that is too slow) - https://phabricator.wikimedia.org/T204762 (dduvall) p:Low>Normal
[16:55:35] Amir1, is it working now?
[16:56:14] 10Release-Engineering-Team (Kanban), 10ORES, 10Scoring-platform-team, 10User-MModell: Create gerrit mirrors for all github-based ORES repos - https://phabricator.wikimedia.org/T192042 (10mmodell) Phabricator doesn't have proper git-lfs support and I've been told not to put any resources into phabricator's... [16:56:48] nope :/ [16:57:46] 10Release-Engineering-Team (Kanban), 10User-greg: Figure out how RelEng can better communicate accomplishments - https://phabricator.wikimedia.org/T197050 (10dduvall) @greg, we noticed this in the triage meeting you weren't at. Would you care to adjust priority? ;-) [16:59:00] 10Release-Engineering-Team, 10Scap, 10Goal: Automate the Train - https://phabricator.wikimedia.org/T196515 (10dduvall) p:05Triage>03Normal [16:59:32] 10Release-Engineering-Team (Kanban), 10Scap: Automate updating deployment notes - https://phabricator.wikimedia.org/T196516 (10dduvall) p:05Triage>03Normal [17:01:39] 10Release-Engineering-Team (Kanban), 10Wikibugs: Deprecate -devtools and redirect to -releng? - https://phabricator.wikimedia.org/T185285 (10mmodell) 05Open>03Resolved a:03mmodell >>! In T185285#4400945, @greg wrote: > I need more rights in -devtools to do the IRC redirecting bits. Or you could do http... [17:04:49] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10MediaWiki-Parser, 10Quibble, and 2 others: [REL1_30] Some parserTests fail on debian stretch using Tidy, because of a new version of libtidy - https://phabricator.wikimedia.org/T191771 (10cscott) There's a lot of backlog to read through but --... [17:09:19] Amir1, now? 
[17:09:50] Krenair: still red https://logstash-beta.wmflabs.org/app/kibana#?_g=() [17:10:11] (03PS1) 10Jforrester: [Infrastructure] Update IRC logging channel [integration/config] - 10https://gerrit.wikimedia.org/r/469034 [17:11:45] sigh [17:11:53] why do I have a login prompt not telling me where to get credentials [17:12:43] I remember now, these live in deployment-deploy01:/root/secrets.txt [17:12:59] well ok [17:13:05] ' Unable to connect to Elasticsearch at http://localhost:9200.' [17:13:11] that's a good lead [17:13:28] elasticsearch service fails to start [17:13:49] but.. it failed [17:14:01] then ran for a bit, then failed [17:14:04] great... [17:18:42] krenair@deployment-logstash2:/usr/share/elasticsearch$ sudo -u elasticsearch -i [17:18:42] krenair@deployment-logstash2:/usr/share/elasticsearch$ [17:18:42] okay [17:18:54] am I going crazy or is sudo broken? [17:20:38] hm, shell is /bin/false [17:20:42] that might be it [17:20:51] yeah -s instead of -i works [17:21:43] Error: Could not find or load main class .usr.share.elasticsearch.lib.HdrHistogram-2.1.9.jar [17:21:58] elasticsearch@deployment-logstash2:/usr/share/elasticsearch$ ls lib/HdrHistogram-2.1.9.jar [17:21:58] lib/HdrHistogram-2.1.9.jar [17:24:11] this is bad, I'll take a look ^ [17:25:06] dcausse, cool, here's what I did: https://phabricator.wikimedia.org/P7708 [17:25:39] ok [17:25:57] fwiw that jar file appears to be a valid zip file but I didn't look any closer than that [17:26:38] ah ok [17:26:48] you wanted to start elastic manually [17:26:52] yeah [17:27:02] to mimic how systemd was running it [17:27:07] sure [17:27:28] I was worried of some weird stuff because we use this jar in one of our plugin [17:28:03] if it's vanilla elastic we should make it start [17:30:34] (03CR) 10Catrope: "recheck" [integration/config] - 10https://gerrit.wikimedia.org/r/468171 (owner: 10Catrope) [17:33:36] (03CR) 10jerkins-bot: [V: 04-1] Make WikiEditor depend on WikimediaEvents [integration/config] - 
https://gerrit.wikimedia.org/r/468171 (owner: Catrope)
[17:38:10] Krenair: java.lang.IllegalArgumentException: unknown setting [ltr.caches.max_mem] please check that any required plugins are installed, or check the breaking changes documentation for removed settings
[17:38:34] aaaand this is way beyond my knowledge :)
[17:38:38] thanks for looking into this dcausse
[17:38:42] the ltr plugin needs to be installed or this setting must not be set
[17:38:56] the plugin is a deb to install
[17:39:02] hm
[17:39:10] if it's required why was it not installed by puppet?
[17:39:20] it's not required for logstash I think
[17:39:40] ah no my bad
[17:39:51] it's worse
[17:39:54] logstash does seem sad that it's gone though
[17:39:56] worse?
[17:39:57] uh oh
[17:40:26] this elastic machine is configured to join the search cluster in beta
[17:40:43] this should be 2 separate clusters
[17:42:22] Release-Engineering-Team (Kanban), Scap: Check 'Check endpoints for mwdebug2002.codfw.wmnet' failed: /wiki/{title} (Main Page) is WARNING: Test Main Page responds with unexpected body - https://phabricator.wikimedia.org/T206620 (thcipriani) Open>Resolved This failure is scap attempting to run `se...
[17:43:55] Krenair: there's definitely something to fix on the puppet side (the ltr.caches.max_mem: 100mb setting in /etc/elasticsearch/beta-search/elasticsearch.yml should not be here, the cluster should not be "beta-search")
[17:44:21] was it a working cluster before?
[17:44:39] I don't know the answer to your question but I do know logstash was working once
[17:44:45] ok
[17:44:57] so something we broke when refactoring puppet
[17:44:57] which presumably means there was a working elasticsearch instance there?
[17:45:08] yes I think so
[17:49:50] [10:11:53] why do I have a login prompt not telling me where to get credentials <-- are you using Chromium? they decided to start hiding the text in login prompts
[17:50:08] legoktm, google chrome, and bloody brilliant...
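Note on the diagnosis above: the two things called out in elasticsearch.yml (the stray `ltr.caches.max_mem` setting and the `beta-search` cluster name) could be spotted with a small scan of the file. What follows is only a rough Python sketch under stated assumptions: `find_settings` and the sample text are hypothetical, and the scan only understands flat `key: value` lines, not nested YAML.

```python
def find_settings(yaml_text, keys):
    """Return {key: value} for each flat `key: value` line whose key is in `keys`.

    Hypothetical helper: it ignores comments and does not parse nested YAML.
    """
    found = {}
    for raw in yaml_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if ":" not in line:
            continue
        key, _, value = line.partition(":")
        key = key.strip()
        if key in keys:
            found[key] = value.strip()
    return found


# Sample text is made up; only the two settings themselves come from the log.
sample = """\
cluster.name: beta-search
ltr.caches.max_mem: 100mb
"""

suspects = find_settings(sample, {"cluster.name", "ltr.caches.max_mem"})
for key, value in sorted(suspects.items()):
    print(f"{key} = {value}")
```

On the real host the text would presumably be read from /etc/elasticsearch/beta-search/elasticsearch.yml; whether those values should be fixed by hand or in puppet is the open question tracked in T205672.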
[17:52:25] Krenair: this is a big mess, I'll ping g.ehel tomorrow for that (will use T205672 to track progress) [17:52:26] T205672: Elasticsearch puppet config changes broke puppet in various instances - https://phabricator.wikimedia.org/T205672 [17:52:50] thanks dcausse [17:53:41] 10Beta-Cluster-Infrastructure, 10Discovery-Search, 10Beta-Cluster-reproducible, 10Patch-For-Review, 10Puppet: Elasticsearch puppet config changes broke puppet in various instances - https://phabricator.wikimedia.org/T205672 (10Krenair) [17:55:15] legoktm, just updated chrome [17:55:21] it lost all of my open tabs [17:55:21] rage [17:55:59] 10Continuous-Integration-Config, 10Release-Engineering-Team, 10JADE, 10Scoring-platform-team (Current): Keep JADE compatible with MediaWiki LTS - https://phabricator.wikimedia.org/T207678 (10awight) [17:59:13] 10Beta-Cluster-Infrastructure, 10Discovery-Search, 10Beta-Cluster-reproducible, 10Patch-For-Review, 10Puppet: Elasticsearch puppet config changes broke puppet in various instances - https://phabricator.wikimedia.org/T205672 (10dcausse) problems seen on deployment-logstash2 so far: - it has the cluster `b... [17:59:27] thcipriani or whoever, are VMs with names like 'integration-slave-*' generally cattle or pets? Like, is it trivial to just delete one and make a fresh one? [18:00:35] andrewbogott: https://tools.wmflabs.org/bash/quip/AU_Tlz8f1oXzWjit5jiA [18:01:21] andrewbogott: but they should be fully puppetized and an hour ish's work to recreate them? (I haven't done it myself in a few years). 
It does require some manual steps to configure them with jenkins
[18:01:44] I'm sticking with my agribusiness-style metaphor :)
[18:02:11] Beta-Cluster-Infrastructure, Discovery-Search, Beta-Cluster-reproducible, Patch-For-Review, Puppet: Elasticsearch puppet config changes broke puppet in various instances - https://phabricator.wikimedia.org/T205672 (Krenair) See also {T205863}
[18:02:13] legoktm: it's very little trouble for me to copy them, so I can just copy them if the experience is the same… but I suspect there's a step needed to drain or depool things before that happens...
[18:02:35] (I hope we're at least going to eat them and not waste the meat?)
[18:02:47] Phabricator: Make phabricator tasks collapsible - https://phabricator.wikimedia.org/T207671 (Aklapper) I don't understand what this means - can you please elaborate and provide steps to reproduce and link to a test case? Is this about the task description?
[18:02:54] are the IPs going to be the same?
[18:03:13] nope, IPs will change
[18:03:14] they should be offlined in https://integration.wikimedia.org/ci/computer/ then wait a few minutes to make sure any jobs finish up
[18:03:23] ok, I think that can just be edited from the same page
[18:03:34] e.g. https://integration.wikimedia.org/ci/computer/integration-slave-jessie-1002/configure
[18:03:59] indeed, I think that's all that's needed; although, I would bet that page is locked down to ci-folks
[18:04:08] hm, I wonder if I don't have the right privs for this...
[18:04:16] I see the page https://integration.wikimedia.org/ci/computer/integration-slave-docker-1017/
[18:04:21] oh, wait, is that the wrong one?
[18:04:25] yeah, I'm pretty sure you don't, but I can add you
[18:04:34] * thcipriani checks global security
[18:04:58] thcipriani: I'm not in a rush but I can try some experimental moves right now if you're around to supervise.
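Note on the "offline it, then wait for jobs to finish" step above: the wait could in principle be checked by script rather than by watching the page. This is a rough Python sketch, assuming the stock Jenkins remote API is exposed at `/computer/<name>/api/json` and reports `offline`, `temporarilyOffline` and `idle` fields; the log does not confirm that this instance allows such access, and the helper names are hypothetical.

```python
import json
from urllib.request import urlopen

# Base URL as quoted in the log.
CI_BASE = "https://integration.wikimedia.org/ci"


def node_api_url(node, base=CI_BASE):
    """Build the (assumed) JSON status URL for one Jenkins agent."""
    return f"{base}/computer/{node}/api/json"


def safe_to_copy(status):
    """True when the agent is marked offline and has no running jobs.

    `status` is the parsed JSON dict; the field names are assumptions
    based on the standard Jenkins remote API.
    """
    is_offline = bool(status.get("temporarilyOffline") or status.get("offline"))
    return is_offline and bool(status.get("idle"))


def fetch_status(node):
    """Fetch and parse the agent status (needs network and read access)."""
    with urlopen(node_api_url(node), timeout=10) as resp:
        return json.load(resp)
```

A caller would poll `safe_to_copy(fetch_status("integration-slave-docker-1017"))` until it returns True before starting the copy; marking the node offline itself still needs the ciadmins rights discussed here.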
[18:05:21] looks like it's the ciadmins ldap group currently that can fiddle with agents [18:06:26] andrewbogott: I am around to watch this stuff happen. Do you have the ability to offline these nodes inside the jenkins UI? [18:07:20] thcipriani: it looks like I don't. I can add myself to that group in ldap though... [18:08:11] Krenair: haaaave you met Firefox? [18:08:22] andrewbogott: that would be fine with me, seems like you'll need to do ciadmin-type tasks occasionally [18:08:29] legoktm, I suppose this is probably a good opportunity to make the switch... [18:09:00] 10Project-Admins, 10Analytics: Create project for SWAP - https://phabricator.wikimedia.org/T207425 (10Aklapper) (General note: Given that our [[ https://phabricator.wikimedia.org/T108586#4602901 | Herald performance gets worse ]] I'd generally would like to see teams consider [[ https://www.mediawiki.org/wiki/... [18:09:35] I used firefox many years ago [18:09:40] when it was just at 3 or 4 [18:09:49] *version 3 or version 4 [18:09:54] but i stopped using it [18:10:49] 10Beta-Cluster-Infrastructure, 10Release-Engineering-Team, 10User-Jdlrobson: Automatically display production content on Beta Cluster wikis (via an API request) - https://phabricator.wikimedia.org/T207508 (10Jdlrobson) > If the problem is that content isn't available or has diverged the last time someone did... [18:11:18] thcipriani: ok, I'm going to move integration-slave-docker-1017 and we'll see how it goes. Thanks! [18:11:56] 10Phabricator: Make phabricator tasks collapsible - https://phabricator.wikimedia.org/T207671 (10Aklapper) Duplicate of T130229 ? This might be a technical workaround for a social problem: If people put lots of text into a task description, they should structure that text, or add less relevant information in co... [18:12:32] legoktm or thcipriani, tell me what the menu option is to move a node offline? Is that 'disconnect'? 
[18:12:52] "Mark this node temporarily offline"
[18:12:55] on https://integration.wikimedia.org/ci/computer/integration-slave-docker-1017/
[18:12:57] andrewbogott: mark temp offline, then you should see the option to disconnect
[18:13:10] this is all on the 'configure' page?
[18:13:17] oh, I guess disconnect is already there
[18:13:26] https://integration.wikimedia.org/ci/computer/integration-slave-docker-1017/ in the upper right
[18:13:26] oh, I see it
[18:13:53] looks like it's already idle so I'll just start the copy right now.
[18:14:22] +1
[18:15:39] PROBLEM - Host integration-slave-docker-1017 is DOWN: CRITICAL - Host Unreachable (10.68.20.30)
[18:15:44] hm
[18:19:19] Project-Admins, User-Urbanecm: Create WMCZ-Events project - https://phabricator.wikimedia.org/T207681 (Urbanecm)
[18:19:31] looks like it might take half an hour to copy… I'll ping when it's up and back in the pool
[18:19:48] I can't think of any reason why this system wouldn't work cross-region but I want to see it happen
[18:20:13] Project beta-update-databases-eqiad build #29223: FAILURE in 12 sec: https://integration.wikimedia.org/ci/job/beta-update-databases-eqiad/29223/
[18:20:51] Project-Admins, User-Urbanecm: Create WMCZ-Events project - https://phabricator.wikimedia.org/T207681 (Urbanecm)
[18:20:58] Project-Admins, User-Urbanecm: Create WMCZ-Events project - https://phabricator.wikimedia.org/T207681 (Urbanecm)
[18:24:38] Release-Engineering-Team (Kanban), MediaWiki-Core-Tests, MediaWiki-Parser, Quibble, and 2 others: [REL1_30] Some parserTests fail on debian stretch using Tidy, because of a new version of libtidy - https://phabricator.wikimedia.org/T191771 (Legoktm) >>! In T191771#4686188, @dduvall wrote: > @hash...
[18:26:01] andrewbogott: is the time to copy based on the disk size?
[18:26:23] we could delete the jenkins-workspace stuff beforehand since it's not useful to keep around really [18:26:33] legoktm: mostly, there's a few minutes of overhead [18:26:39] legoktm: where is that on the instance? [18:28:11] andrewbogott: /srv/jenkins-workspace/workspace [18:28:21] ok, I'll try that on the next one — thanks [18:28:22] everything in that directory is safe to rm -rf once the jobs all finish [18:35:35] 10Continuous-Integration-Config, 10Commons-Mass-Description, 10Map-of-monuments, 10Weapon of Mass Description, and 2 others: Run lint in check pipeline for Urbanecm's tools - https://phabricator.wikimedia.org/T207685 (10Urbanecm) [18:35:59] 10Continuous-Integration-Config, 10Patch-For-Review, 10User-Urbanecm: Add php lint job to wikimedia-cz/mediawiki-config - https://phabricator.wikimedia.org/T207571 (10Urbanecm) >>! In T207571#4684302, @hashar wrote: > Instead of adding the php lint job (which is legacy), you could add support for `composer t... [18:39:36] (03PS1) 10Thcipriani: Ensure pipeline images are cleaned in production [integration/config] - 10https://gerrit.wikimedia.org/r/469054 [18:40:56] 10Continuous-Integration-Config, 10LuaSandbox: CI should run `pear package-validate` for PHP extensions with package.xml files - https://phabricator.wikimedia.org/T207686 (10Legoktm) [18:46:14] (03CR) 10Dduvall: [C: 04-1] "Looks like the right change to make! I've just added a suggestion about testing that the image ID variables are set in the case of early e" (031 comment) [integration/config] - 10https://gerrit.wikimedia.org/r/469054 (owner: 10Thcipriani) [18:48:05] is there a way to set parts of arrays in IS.php ? [18:48:13] say I wanted to directly set $wmgWBSharedSettings['entityNamespaces'] ? 
:P [18:48:15] 10Continuous-Integration-Config, 10Release-Engineering-Team: Decide where to store jobs for releases-jenkins - https://phabricator.wikimedia.org/T207346 (10Legoktm) I think we should start out using integration/config, and if we hit some blockers, we can split later on. Also we could try using a `jjb-releases`... [18:48:23] oh wait, that wmg too [18:49:00] well, $wgWBRepoSettings['entityNamespaces'] for example [18:49:15] addshore: without overriding the rest? don't think so [18:49:23] mehehhmppfff [18:49:32] okay, that would be okay [18:49:50] i need to rework some of the mw-config stuff for wikibase for also enabling repo on commons [18:51:25] hmm, but arent the user rights etc kind of multi dimensional arrays? *checks* [18:52:00] aah, but they just define the whole array there [18:52:37] right [18:53:10] 10Beta-Cluster-Infrastructure, 10Recommendation-API, 10Core Platform Team Kanban (Done with CPT), 10Services (done): Puppet broken on deployment-sca01 - https://phabricator.wikimedia.org/T207495 (10mobrovac) 05Open>03Resolved p:05Triage>03Normal a:03mobrovac Added the missing values to the approp... [18:54:31] legoktm: any thoughts on a better way? :/ otherwise i think I'll just aim for the whole array in IS.php .. [18:54:49] that or stick it in CommonSettings [18:55:30] mhhhm, yeah, well then I might as well just leave it in Wikibase.php and Wikibase-*.php, and just clean some stuff up in there [19:22:34] Yippee, build fixed! 
[19:22:34] Project beta-update-databases-eqiad build #29224: 09FIXED in 2 min 34 sec: https://integration.wikimedia.org/ci/job/beta-update-databases-eqiad/29224/ [19:34:26] ah ha, I was wondering about that [19:35:02] the wb repo stuff, I mean, since it needs one set of values for wikidata and another one for commons, and the current setup in the config files doesn't lend itself to that [19:35:16] (03PS1) 10Awight: [WIP] Match rule for MediaWiki current TLS [integration/config] - 10https://gerrit.wikimedia.org/r/469069 (https://phabricator.wikimedia.org/T207678) [19:35:28] is there a ticket I can get on for that? [19:35:31] addshore: [19:38:26] (03CR) 10jerkins-bot: [V: 04-1] [WIP] Match rule for MediaWiki current TLS [integration/config] - 10https://gerrit.wikimedia.org/r/469069 (https://phabricator.wikimedia.org/T207678) (owner: 10Awight) [19:39:40] 10Release-Engineering-Team (Kanban), 10Release Pipeline (Blubber): Adopt JSON as blubber's internal configuration format - https://phabricator.wikimedia.org/T207694 (10thcipriani) [19:41:46] 10Phabricator: Make phabricator tasks collapsible - https://phabricator.wikimedia.org/T207671 (10Theklan) When we have a large section of subtasks with more sub-subtasks in a general task the tree is very long. Collapsing it by subtasks and making them expandable would be the idea. [19:41:49] 10Project-Admins: Proposal to create acl*oversight and acl*checkuser ACL projects - https://phabricator.wikimedia.org/T207323 (10TonyBallioni) This seems like it would be pretty useful. 
[19:42:25] 10Release-Engineering-Team (Kanban), 10Release Pipeline (Blubber): Refactor validation system to use jsonschema - https://phabricator.wikimedia.org/T207695 (10thcipriani) [19:43:46] 10Release-Engineering-Team (Kanban), 10Release Pipeline (Blubber): Remove yaml unmarshalling from blubber - https://phabricator.wikimedia.org/T207696 (10thcipriani) [19:44:50] 10Release-Engineering-Team (Kanban), 10Release Pipeline (Blubber): Adopt JSON as blubber's internal configuration format - https://phabricator.wikimedia.org/T207694 (10thcipriani) [19:44:58] 10Release-Engineering-Team (Kanban), 10Patch-For-Review, 10Release Pipeline (Blubber): Blubberoid – create swagger spec - https://phabricator.wikimedia.org/T205920 (10thcipriani) [19:48:43] awight: TLS or LTS? ;) [19:49:27] bahaha [19:49:38] [19:50:34] (03PS2) 10Awight: [WIP] Match rule for MediaWiki current LTS [integration/config] - 10https://gerrit.wikimedia.org/r/469069 (https://phabricator.wikimedia.org/T207678) [19:50:50] Probably means I was grepping wrong, too [19:52:32] (03CR) 10jerkins-bot: [V: 04-1] [WIP] Match rule for MediaWiki current LTS [integration/config] - 10https://gerrit.wikimedia.org/r/469069 (https://phabricator.wikimedia.org/T207678) (owner: 10Awight) [19:57:12] 10Project-Admins: Proposal to create acl*oversight and acl*checkuser ACL projects - https://phabricator.wikimedia.org/T207323 (10Legoktm) Security tasks are generally kept to a limited audience of MediaWiki developers plus a few other people. We created an ACL for stewards because there was a repeated need to ad... [20:01:12] 10Continuous-Integration-Config, 10Release-Engineering-Team, 10JADE, 10Patch-For-Review, 10Scoring-platform-team (Current): Keep JADE compatible with MediaWiki LTS - https://phabricator.wikimedia.org/T207678 (10Legoktm) There's a similar task requesting PHP 5.x tests against older core versions for i18n... 
[20:07:16] 10Project-Admins: Proposal to create acl*oversight and acl*checkuser ACL projects - https://phabricator.wikimedia.org/T207323 (10Samtar) @Legoktm on reflection, I'm agreed entirely i.r.t `"I don't think adding all oversighters and all checkusers to individual security tasks is a good idea"` - there are a few oth... [20:12:25] Does anyone know what's wrong with CI for this patch? At first parallel_lint failed because there were 0 PHP files, so I added an empty-ish one, and now phpcs is timing out trying to process a massive amount of stuff for an empty extension? [20:12:27] https://gerrit.wikimedia.org/r/c/mediawiki/extensions/GrowthExperiments/+/468415 [20:13:51] 10Project-Admins: Proposal to create acl*oversight and acl*checkuser ACL projects - https://phabricator.wikimedia.org/T207323 (10Krenair) 05Open>03Invalid per requestor [20:16:10] let's see [20:16:39] RoanKattouw: you're missing .phpcs.xml [20:16:59] Aha! [20:17:11] so PHPCS will try and scan everything by default, including vendor/ [20:17:16] which will have a bunch of stuff in it [20:17:41] RoanKattouw: also, why does that depend upon phpxmlrpc/phpxmlrpc ? [20:18:01] No idea, I think that most likely these configs were cargo culted from somewhere [20:18:24] https://codesearch.wmflabs.org/search/?q=phpxmlrpc&i=nope&files=&repos= Copyvio apparently [20:19:19] Makes sense. I'll remove that too [20:30:39] apergos: for which bit exactly? :) [20:30:46] ah sorry :-) [20:31:11] for the fixup of *Settings.php for wikibase settings on commons and wikidata co-existing [20:32:03] No ticket yet, I guess it just falls under the deploy mediainfo tickets [20:32:42] ok [20:32:54] i'll just stalk your gerrit changesets for a bit :-P [20:34:18] thcipriani: I am going to (finally) repool integration-slave-docker-1017. Is it easy to tell if it's going to work or not? [20:38:21] the UI should tell you if it's connected or not. [20:39:33] surely there's more to 'working' than that? 
[20:39:54] andrewbogott: hrm, there's not a way to run a job on a specific node that I know of, but as long as docker is there and it can connect to ldap it should be safe to repool.
[20:39:57] Phabricator (Upstream), Upstream: Make phabricator tasks collapsible - https://phabricator.wikimedia.org/T207671 (Aklapper) Open>declined That functionality is offered by the {nav icon=search, name=Search...} button in the "Related Objects" area.
[20:40:10] Phabricator (Upstream), Upstream: Make list of (sub-)tasks collapsible - https://phabricator.wikimedia.org/T207671 (Aklapper)
[20:40:19] Project-Admins, User-Urbanecm: Create WMCZ-General project - https://phabricator.wikimedia.org/T207572 (MarcoAurelio) No concerns from me.
[20:42:22] if: sudo -u jenkins-deploy -- docker run --rm -it docker-registry.wikimedia.org/wikimedia-stretch /usr/bin/whoami
[20:42:24] works
[20:42:27] thcipriani: in jjb just change node to that host as a temp hack? it'll force the job to run only on that host
[20:42:31] it'd probably tell you a whole bunch
[20:44:29] legoktm: andrewbogott yeah, let's try that for ops/puppet, that'd be a quick test to run, I'll create a test job real quick
[20:45:50] thanks! Meanwhile I'm trying to figure out the firewall change I need to make puppet work again...
[20:46:57] made: https://integration.wikimedia.org/ci/job/operations-puppet-tests-docker-repool-test/
[20:50:36] testing now: https://integration.wikimedia.org/ci/job/operations-puppet-tests-docker-repool-test/1/console
[20:50:39] ok, there's puppet working again
[20:51:18] looks like it's working
[20:52:06] great! Is there any reason to think a differently-named node (e.g. integration-slave-jessie-1003) would be any different?
[20:53:35] the *-jessie-* nodes run a different set of jobs, but other than that, I think it'd be fine
[20:53:59] ok. If you don't mind I'm going to move three more nodes. Let me know if you see any bad effects.
[20:58:31] PROBLEM - Host integration-slave-docker-1033 is DOWN: CRITICAL - Host Unreachable (10.68.19.22)
[21:02:43] Continuous-Integration-Infrastructure, Release-Engineering-Team, Release Pipeline: contint1001:/var/lib/docker growth - https://phabricator.wikimedia.org/T207702 (thcipriani)
[21:03:18] PROBLEM - Host integration-slave-jessie-1003 is DOWN: CRITICAL - Host Unreachable (10.68.17.164)
[21:07:45] Continuous-Integration-Infrastructure, docker-pkg: Pruning docker-pkg images - https://phabricator.wikimedia.org/T207703 (thcipriani)
[21:11:33] Continuous-Integration-Infrastructure, docker-pkg: Pruning docker-pkg images - https://phabricator.wikimedia.org/T207703 (Legoktm) Definitely. I'd also recommend keeping the previous version as well just in case we have to revert.
[21:42:28] Project-Admins: Proposal to create acl*oversight and acl*checkuser ACL projects - https://phabricator.wikimedia.org/T207323 (Legoktm) >>! In T207323#4687063, @Samtar wrote: > @Legoktm on reflection, I'm agreed entirely i.r.t `"I don't think adding all oversighters and all checkusers to individual security ta...
[22:10:06] (PS2) Thcipriani: Ensure pipeline images are cleaned in production [integration/config] - https://gerrit.wikimedia.org/r/469054
[22:11:18] (CR) Thcipriani: Ensure pipeline images are cleaned in production (1 comment) [integration/config] - https://gerrit.wikimedia.org/r/469054 (owner: Thcipriani)
[22:17:38] (CR) Dduvall: [C: 1] Ensure pipeline images are cleaned in production [integration/config] - https://gerrit.wikimedia.org/r/469054 (owner: Thcipriani)
[23:25:59] Release-Engineering-Team (Kanban), Release Pipeline (Blubber): Adopt JSON as blubber's internal configuration format - https://phabricator.wikimedia.org/T207694 (dduvall)
[23:34:23] Release-Engineering-Team (Kanban), Release Pipeline (Blubber): Remove yaml unmarshalling from blubber - https://phabricator.wikimedia.org/T207696 (dduvall) Another example of how to implement a decent YAML to JSON layer might be the Golang OpenAPI library's [[ https://github.com/go-swagger/go-swagger/blo...
[23:46:39] Continuous-Integration-Config, Zuul: mediawiki-phpunit-hhvm-jessie jobs are always failing - https://phabricator.wikimedia.org/T187740 (Krinkle)