[03:00:32] 10Continuous-Integration-Config, 10MediaViewer: Include svgmin into the CI processes - https://phabricator.wikimedia.org/T229763 (10Demian) 05Open→03Declined Last change to svgmin settings was on 2018-11-29: https://gerrit.wikimedia.org/r/c/mediawiki/extensions/MultimediaViewer/+/476440 * If svgmin's opti... [08:00:37] (03CR) 10Hashar: [C: 03+2] "It was probably chained with another change that since got rebased and merged. Thx for the notification Danny." [integration/quibble] - 10https://gerrit.wikimedia.org/r/630842 (owner: 10Awight) [08:29:09] (03Merged) 10jenkins-bot: Collapse docker steps to minimize intermediate product [integration/quibble] - 10https://gerrit.wikimedia.org/r/630842 (owner: 10Awight) [08:59:24] 10Project-Admins, 10WMDE-Technical-Wishes-Maintenance, 10WMDE-Technical-Wishes-Team, 10Technical-Debt, 10User-thiemowmde: Archive DeepCat and CatGraph related code and boards - https://phabricator.wikimedia.org/T243543 (10thiemowmde) [09:11:15] 10Gerrit, 10Security, 10Upstream: Gerrit login page should print the same message when the username doesn't exist and when the username exists - https://phabricator.wikimedia.org/T266628 (10Aklapper) [10:40:45] 10Beta-Cluster-Infrastructure: Puppet failures on many hosts - https://phabricator.wikimedia.org/T267006 (10Tarrow) [11:48:28] 10Gerrit, 10Security, 10Upstream: Gerrit login page should print the same message when the username doesn't exist and when the username exists - https://phabricator.wikimedia.org/T266628 (10hashar) 05Open→03Declined On Wikimedia setup, the username is the same as the Wikitech account and thus the whole l... 
[13:22:12] 10Beta-Cluster-Infrastructure: Puppet failures on many hosts - https://phabricator.wikimedia.org/T267006 (10Hermann) [13:23:25] 10Phabricator: project.search API broken - https://phabricator.wikimedia.org/T266966 (10Hermann) [13:40:43] DannyS712: thanks for your phab cleanup, you beat me running a cleanup script by 10 minutes [14:11:02] 10Gerrit, 10Release-Engineering-Team (Development services), 10Release-Engineering-Team-TODO (2020-10-01 to 2020-12-31 (Q2)), 10ExtensionDistributor, 10Wikimedia-Site-requests: Phase out https://gerrit.wikimedia.org/mediawiki-extensions.txt - https://phabricator.wikimedia.org/T266024 (10hashar) 05Open→... [14:16:56] 10Gerrit, 10Wikimedia-General-or-Unknown, 10Documentation, 10Epic, and 3 others: Update Gerrit /r/p/ links to /r/ - https://phabricator.wikimedia.org/T218844 (10Nintendofan885) [14:45:34] 10Gerrit: Gerrit link to Sonar cloud reports are broken - https://phabricator.wikimedia.org/T267028 (10awight) [15:18:22] (03CR) 10Lars Wirzenius: "I'd like the log level change or be convinced that I'm wrong about before I CR+2 this." 
(031 comment) [tools/scap] - 10https://gerrit.wikimedia.org/r/637840 (owner: 10Ahmon Dancy) [15:18:33] (03CR) 10Lars Wirzenius: [C: 04-1] scap: Add --skip-l10n-update to scap sync-world [tools/scap] - 10https://gerrit.wikimedia.org/r/637840 (owner: 10Ahmon Dancy) [15:19:13] (03CR) 10Lars Wirzenius: [C: 03+2] scap: Update help text for --force option [tools/scap] - 10https://gerrit.wikimedia.org/r/637841 (owner: 10Ahmon Dancy) [15:20:03] (03CR) 10Lars Wirzenius: [C: 03+2] _restart_php: Print the exact command that will run on the target nodes [tools/scap] - 10https://gerrit.wikimedia.org/r/637842 (owner: 10Ahmon Dancy) [15:21:51] (03Merged) 10jenkins-bot: scap: Update help text for --force option [tools/scap] - 10https://gerrit.wikimedia.org/r/637841 (owner: 10Ahmon Dancy) [15:22:37] (03Merged) 10jenkins-bot: _restart_php: Print the exact command that will run on the target nodes [tools/scap] - 10https://gerrit.wikimedia.org/r/637842 (owner: 10Ahmon Dancy) [15:30:41] 10Release-Engineering-Team (Logspam), 10CommonsMetadata, 10User-DannyS712: CommonsMetadata bad wfTimestamp call - https://phabricator.wikimedia.org/T267033 (10DannyS712) [15:32:02] 10Release-Engineering-Team-TODO, 10Release, 10Train Deployments, 10User-brennen: 1.36.0-wmf.16 deployment blockers - https://phabricator.wikimedia.org/T263182 (10DannyS712) Please be aware of {T267033} - started showing up in beta logstash this week, might show up in production after the train [15:38:52] Reedy: will you make sure Platform Engineering listen to https://phabricator.wikimedia.org/T266542#6596830 ? 
And also make people aware that doing the same thing over and over again and expecting a different result is just wasting time [15:39:10] 10Release-Engineering-Team (Logspam), 10DiscussionTools, 10User-DannyS712: Call to undefined method MediaWiki\Extension\DiscussionTools\HeadingItem::addWarning() - https://phabricator.wikimedia.org/T267035 (10DannyS712) [15:39:17] lol [15:39:48] * Reedy files a task [15:39:53] Reedy: I believe people normally describe that as being insane [15:40:04] I've also already told them once [15:40:11] It's a waste of herald too [15:40:36] But I tell them that and headachey, tired me will be blunt and honest and that doesn't come across well [15:42:09] https://phabricator.wikimedia.org/T267037 [15:42:55] Thank you [15:43:01] And well put [15:48:31] apergos: I'm still baffled by the logic in the first place. [15:48:51] (03PS3) 10Ahmon Dancy: scap: Add --skip-l10n-update to scap sync-world [tools/scap] - 10https://gerrit.wikimedia.org/r/637840 [15:49:30] (03CR) 10Ahmon Dancy: scap: Add --skip-l10n-update to scap sync-world (031 comment) [tools/scap] - 10https://gerrit.wikimedia.org/r/637840 (owner: 10Ahmon Dancy) [15:49:32] hey as absolutely-not-an-ambassador [15:49:35] for cpt [15:50:13] I can guarantee you that it has already been mentioned that maybe this herald rule needs to go because it's never been helpful [15:50:32] RhinosF1: [15:50:58] and we have someone doing the moves and tags and stuff who is not used to all the nuances [15:51:12] please just be a little patient and sorry for the spam in your mailboxes [15:51:28] (03CR) 10Lars Wirzenius: [C: 03+2] scap: Add --skip-l10n-update to scap sync-world (031 comment) [tools/scap] - 10https://gerrit.wikimedia.org/r/637840 (owner: 10Ahmon Dancy) [15:53:41] apergos: disabling/changing the herald rule first though would make sense. Not when you see your action is being reverted by herald immediately anyway, carrying on doing it. 
[15:53:47] (03Merged) 10jenkins-bot: scap: Add --skip-l10n-update to scap sync-world [tools/scap] - 10https://gerrit.wikimedia.org/r/637840 (owner: 10Ahmon Dancy) [15:53:56] it wasn't clear to te person doing it what was happening [15:53:59] *the [15:56:07] I'm sure you can tell them to look closer, given I'd already told you to review the rule over a week ago. And please make sure that task is treated as high, Reedy put it high for a reason. It's a waste of a lot. Not for it to sit there for 6 months. [15:57:41] I can't tell the team "make this high", the team must determine that (and rightly so), as frustrating as that can be to everyone else [15:58:23] I asked you as I saw you were the one saying you're talking about it [15:59:02] yeah, I thought it would be good to acknowledge the task while people are in a meeting looking at the board, so it doesn't seem like the team is just ignoring the issue [15:59:18] Cool [16:00:14] Reedy, apergos: are they seriously carrying on doing it? [16:00:49] (03PS1) 10Ahmon Dancy: Add stuff required to validate php-fpm restart [tools/train-dev] - 10https://gerrit.wikimedia.org/r/638127 [16:00:52] It would take 5 minutes to fix the rule if they knew what they wanted [16:01:19] they have to get approval [16:01:45] They need to stop making that comment until then. It's annoying. [16:01:49] it's tedious, and I don't want to type the entire 'what was said in the meeting for the last ten minutes' ... seems not productive either [16:02:08] I have already conveyed folks' feelings [16:02:12] that's the best I can do [16:02:53] I'll leave a quick comment [16:12:30] RhinosF1: it's being dealt with, as ariel said, let's give them time to do it. 
[16:12:41] no need to keep poking them at this point [16:12:56] Understood [16:17:27] (03CR) 10Lars Wirzenius: [C: 03+2] Add stuff required to validate php-fpm restart [tools/train-dev] - 10https://gerrit.wikimedia.org/r/638127 (owner: 10Ahmon Dancy) [16:17:35] (03Merged) 10jenkins-bot: Add stuff required to validate php-fpm restart [tools/train-dev] - 10https://gerrit.wikimedia.org/r/638127 (owner: 10Ahmon Dancy) [16:35:11] 10Release-Engineering-Team-TODO, 10Release, 10Train Deployments, 10User-brennen: 1.36.0-wmf.16 deployment blockers - https://phabricator.wikimedia.org/T263182 (10brennen) Noting on this ticket that our deployment plan for the week has changed, due to US Election Day holiday. Tyler's mail to wikitech-l: >... [16:44:21] 10Release-Engineering-Team (Pipeline), 10Release-Engineering-Team-TODO (2020-10-01 to 2020-12-31 (Q2)), 10Patch-For-Review: Refactor PipelineLib to allow for alternate docker image pusher - https://phabricator.wikimedia.org/T265177 (10dduvall) [16:44:24] 10Phabricator, 10Wikibase-Containers, 10Wikidata, 10Regression: Can't do shallow clone from phabricator - https://phabricator.wikimedia.org/T240862 (10dduvall) [16:44:26] 10Release-Engineering-Team (Pipeline), 10Release-Engineering-Team-TODO (2020-10-01 to 2020-12-31 (Q2)), 10Patch-For-Review: Refactor PipelineLib to allow for alternate docker image pusher - https://phabricator.wikimedia.org/T265177 (10dduvall) p:05Triage→03Medium [16:44:37] 10Release-Engineering-Team (Pipeline), 10Release-Engineering-Team-TODO (2020-10-01 to 2020-12-31 (Q2)): Experiment with PipelineLib/Blubber driven MediaWiki container image pipeline - https://phabricator.wikimedia.org/T260828 (10dduvall) [16:44:39] 10Release-Engineering-Team (Pipeline), 10Release-Engineering-Team-TODO (2020-10-01 to 2020-12-31 (Q2)), 10Patch-For-Review: Refactor PipelineLib to allow for alternate docker image pusher - https://phabricator.wikimedia.org/T265177 (10dduvall) 05Open→03Stalled [16:47:53] (03CR) 10Ahmon Dancy: [C: 
03+1] build scap.deb in Docker [tools/scap] - 10https://gerrit.wikimedia.org/r/634929 (owner: 10Lars Wirzenius) [16:53:06] 10Phabricator, 10Wikibase-Containers, 10Wikidata, 10Regression: Can't do shallow clone from phabricator - https://phabricator.wikimedia.org/T240862 (10dduvall) For the paper trail: - The [[ https://secure.phabricator.com/D21484 | upstream patch ]] has landed. If you run into rebase issues around this lat... [17:57:20] 10Release-Engineering-Team (Code Health), 10Code-Health-Group, 10MediaWiki-Core-Testing, 10Code-Health, 10Test-Coverage: Track test code coverage long term - https://phabricator.wikimedia.org/T182749 (10Aklapper) [17:57:23] 10Continuous-Integration-Infrastructure, 10Release-Engineering-Team (Code Health), 10Code-Health-Group, 10Test-Coverage: Migrate https://tools.wmflabs.org/coverage/mediawiki/ to CI infrastructure - https://phabricator.wikimedia.org/T182751 (10Aklapper) 05Stalled→03Open The previous comments don't expla... [17:59:33] 10Continuous-Integration-Config, 10Math, 10Platform Engineering, 10Platform Engineering Roadmap Decision Making: Enable mediawiki-quibble-apitests-vendor-docker for extension Math - https://phabricator.wikimedia.org/T254031 (10daniel) >>! In T254031#6186517, @eprodromou wrote: > We think this is big enough... 
[18:18:41] 10Release-Engineering-Team (Logspam), 10CommonsMetadata, 10Structured Data Engineering, 10Structured-Data-Backlog, 10User-DannyS712: CommonsMetadata bad wfTimestamp call - https://phabricator.wikimedia.org/T267033 (10Krinkle) [18:18:45] 10Release-Engineering-Team (Logspam), 10CommonsMetadata, 10Structured Data Engineering, 10Structured-Data-Backlog, 10User-DannyS712: CommonsMetadata bad wfTimestamp call - https://phabricator.wikimedia.org/T267033 (10Krinkle) p:05Triage→03High [18:18:51] 10Release-Engineering-Team (Logspam), 10CommonsMetadata, 10Structured Data Engineering, 10Structured-Data-Backlog, 10User-DannyS712: CommonsMetadata bad wfTimestamp call - https://phabricator.wikimedia.org/T267033 (10Krinkle) [18:18:54] 10Release-Engineering-Team-TODO, 10Release, 10Train Deployments, 10User-brennen: 1.36.0-wmf.16 deployment blockers - https://phabricator.wikimedia.org/T263182 (10Krinkle) [18:28:37] 10Release-Engineering-Team (Logspam), 10DiscussionTools, 10User-DannyS712: Call to undefined method MediaWiki\Extension\DiscussionTools\HeadingItem::addWarning() - https://phabricator.wikimedia.org/T267035 (10Krinkle) [18:28:39] 10Release-Engineering-Team-TODO, 10Release, 10Train Deployments, 10User-brennen: 1.36.0-wmf.16 deployment blockers - https://phabricator.wikimedia.org/T263182 (10Krinkle) [18:33:21] 10Release-Engineering-Team (Logspam), 10DiscussionTools, 10Editing-team (FY2020-21 Kanban Board), 10User-DannyS712: Call to undefined method MediaWiki\Extension\DiscussionTools\HeadingItem::addWarning() - https://phabricator.wikimedia.org/T267035 (10ppelberg) [18:45:07] 10Release-Engineering-Team (Logspam), 10DiscussionTools, 10Editing-team (FY2020-21 Kanban Board), 10Patch-For-Review, 10User-DannyS712: Call to undefined method MediaWiki\Extension\DiscussionTools\HeadingItem::addWarning() - https://phabricator.wikimedia.org/T267035 (10matmarex) > See beta logstash: http... 
[18:46:34] 10Continuous-Integration-Infrastructure, 10Release-Engineering-Team (Code Health), 10Code-Health-Group, 10Test-Coverage: Migrate https://tools.wmflabs.org/coverage/mediawiki/ to CI infrastructure - https://phabricator.wikimedia.org/T182751 (10Legoktm) I don't remember why I originally set this to stalled,... [18:48:37] 10Release-Engineering-Team (Logspam), 10DiscussionTools, 10Editing-team (FY2020-21 Kanban Board), 10Patch-For-Review, 10User-DannyS712: Call to undefined method MediaWiki\Extension\DiscussionTools\HeadingItem::addWarning() - https://phabricator.wikimedia.org/T267035 (10thcipriani) p:05Triage→03Unbreak! [18:48:53] 10Release-Engineering-Team (Logspam), 10CommonsMetadata, 10Structured Data Engineering, 10Structured-Data-Backlog, 10User-DannyS712: CommonsMetadata bad wfTimestamp call - https://phabricator.wikimedia.org/T267033 (10thcipriani) p:05High→03Unbreak! [18:50:21] 10Release-Engineering-Team (Pipeline), 10Release-Engineering-Team-TODO (2020-10-01 to 2020-12-31 (Q2)), 10MW-on-K8s: Experiment with generating json config - https://phabricator.wikimedia.org/T267057 (10jeena) [18:51:46] 10Release-Engineering-Team (Logspam), 10DiscussionTools, 10Editing-team (FY2020-21 Kanban Board), 10Patch-For-Review, 10User-DannyS712: Call to undefined method MediaWiki\Extension\DiscussionTools\HeadingItem::addWarning() - https://phabricator.wikimedia.org/T267035 (10matmarex) The buggy code was in 1.3... [18:52:17] 10Release-Engineering-Team (Logspam), 10DiscussionTools, 10Editing-team (FY2020-21 Kanban Board), 10Patch-For-Review, 10User-DannyS712: Call to undefined method MediaWiki\Extension\DiscussionTools\HeadingItem::addWarning() - https://phabricator.wikimedia.org/T267035 (10Majavah) >>! In T267035#6597685, @m... 
[18:54:24] 10Release-Engineering-Team (Logspam), 10DiscussionTools, 10Editing-team (FY2020-21 Kanban Board), 10Patch-For-Review, 10User-DannyS712: Call to undefined method MediaWiki\Extension\DiscussionTools\HeadingItem::addWarning() - https://phabricator.wikimedia.org/T267035 (10DannyS712) >>! In T267035#6597685,... [19:04:31] 10Release-Engineering-Team (Deployment services), 10Release-Engineering-Team-TODO, 10serviceops, 10Patch-For-Review: Upgrade MediaWiki appservers to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (10ops-monitoring-bot) cookbooks.sre.hosts.decommission executed by dzahn@cumin1001 for... [19:07:59] 10Release-Engineering-Team (Logspam), 10CommonsMetadata, 10Structured Data Engineering, 10Structured-Data-Backlog, 10User-DannyS712: CommonsMetadata bad wfTimestamp call - https://phabricator.wikimedia.org/T267033 (10TJH2018) Something somewhere was touched because code doesn't just break without someone... [19:10:04] 10Release-Engineering-Team (Pipeline), 10Release-Engineering-Team-TODO (2020-10-01 to 2020-12-31 (Q2)), 10MW-on-K8s: Understand how people use mediawiki-config - https://phabricator.wikimedia.org/T267058 (10jeena) [19:10:18] 10Release-Engineering-Team (Pipeline), 10Release-Engineering-Team-TODO (2020-10-01 to 2020-12-31 (Q2)), 10MW-on-K8s: Understand how people use mediawiki-config - https://phabricator.wikimedia.org/T267058 (10jeena) a:05dancy→03None [19:11:59] 10Release-Engineering-Team (Pipeline), 10Release-Engineering-Team-TODO (2020-10-01 to 2020-12-31 (Q2)): Experiment with buildkit's llb.ExecOp.AddMount in aggregate images - https://phabricator.wikimedia.org/T267060 (10dduvall) [19:20:53] 10Release-Engineering-Team (Logspam), 10DiscussionTools, 10Editing-team (FY2020-21 Kanban Board), 10Patch-For-Review, 10User-DannyS712: Call to undefined method MediaWiki\Extension\DiscussionTools\HeadingItem::addWarning() - https://phabricator.wikimedia.org/T267035 (10matmarex) The affected page is 10Phabricator, 
10Developer-Advocacy, 10Epic, 10Patch-For-Review: (Semi)automatically close Phabricator tickets with status "stalled" after a while - https://phabricator.wikimedia.org/T252522 (10Aklapper) Using the SQL query above, as of early 2020-11-01 there were 100 stalled tasks for more than 36 months,... [19:46:51] 10Beta-Cluster-Infrastructure, 10User-zeljkofilipin, 10WorkType-NewFunctionality: Make selenium users use botflags at beta-cluster - https://phabricator.wikimedia.org/T116027 (10Aklapper) Three years stalled... Does really nobody know about botflags on beta cluster and could share their knowledge? :'( [21:23:05] would it be possible to add the time of the deploy branch cut to the deploy calendar? I keep forgetting the time and it seems like it could be useful to have there [21:24:06] 10phan, 10ObjectFactory: Add phan annotations to ObjectFactory - https://phabricator.wikimedia.org/T264930 (10DannyS712) 05Open→03Resolved [21:28:43] longma: just an FYI, I've had 2 pipeline jobs die today with some variation of `docker rmi --force ee88e6bf8898 96f16ff7774f ceb93158475f 47e2530d59a6` getting stuck and eventually timing out (or being cancelled by me). [21:29:18] basically some kind of freeze during teardown after a container is done running [21:29:31] 10Release-Engineering-Team (Logspam), 10DiscussionTools, 10Editing-team (FY2020-21 Kanban Board), 10MW-1.36-notes (1.36.0-wmf.16; 2020-11-03), and 2 others: Call to undefined method MediaWiki\Extension\DiscussionTools\HeadingItem::addWarning() - https://phabricator.wikimedia.org/T267035 (10matmarex) For fu... 
[21:29:32] Thanks, me or marxarelli will look into it [21:29:43] https://integration.wikimedia.org/ci/job/wikimedia-toolhub-pipeline-test/118/console is one example [21:32:45] 10Continuous-Integration-Infrastructure, 10Release-Engineering-Team (CI & Testing services), 10Cloud-VPS, 10cloud-services-team (Kanban): integration instances suffer from high IO latency due to Ceph - https://phabricator.wikimedia.org/T266777 (10hashar) I have drafted a quick dashboard for Diamond IO stat... [21:33:31] 10Release-Engineering-Team (Pipeline), 10Release Pipeline: Pipeline jobs freezing during teardown - https://phabricator.wikimedia.org/T267075 (10jeena) [21:33:52] bd808: If you have any other examples could you paste them here? https://phabricator.wikimedia.org/T267075 [21:34:17] * bd808 looks for the one he cancelled manually [21:34:30] thanks! [21:35:14] 10Release-Engineering-Team (Pipeline), 10Release Pipeline: Pipeline jobs freezing during teardown - https://phabricator.wikimedia.org/T267075 (10bd808) I manually killed this one when I felt like it was just waiting to timeout: https://integration.wikimedia.org/ci/job/wikimedia-toolhub-pipeline-test/111/console [21:38:04] 10Release-Engineering-Team (Logspam), 10DiscussionTools, 10Editing-team (FY2020-21 Kanban Board), 10MW-1.36-notes (1.36.0-wmf.16; 2020-11-03), and 2 others: Call to undefined method MediaWiki\Extension\DiscussionTools\HeadingItem::addWarning() - https://phabricator.wikimedia.org/T267035 (10DannyS712) >>! I... 
[21:41:04] 10MediaWiki-Codesniffer, 10Wikidata, 10Wikidata-Campsite, 10User-Addshore: Use mediawiki codesniffer v33 and retire wikibase-codesniffer - https://phabricator.wikimedia.org/T266823 (10DannyS712) [21:42:42] 10Continuous-Integration-Infrastructure, 10Release-Engineering-Team (CI & Testing services), 10Cloud-VPS, 10cloud-services-team (Kanban): integration instances suffer from high IO latency due to Ceph - https://phabricator.wikimedia.org/T266777 (10hashar) > From the above dashboard, if I look at mostly idle... [21:56:26] bd808: It doesn't seem like it's happening too often overall so we might not get to it right away, but if it keeps up affecting you don't hesitate to reach out! [21:57:05] ack [21:57:11] 10Continuous-Integration-Infrastructure, 10Release-Engineering-Team (CI & Testing services), 10Cloud-VPS, 10cloud-services-team (Kanban): integration instances suffer from high IO latency due to Ceph - https://phabricator.wikimedia.org/T266777 (10hashar) From my digging into the code/documentation this aft... [21:58:05] fyi, I'm adjusting the IO quotas on integration docker workers. 
That will mean reduced CI capacity while I'm moving things [22:04:49] andrewbogott: https://grafana-labs.wikimedia.org/d/Yj81kH2Gk/cloud-project-io-metrics?viewPanel=31&orgId=1&from=1599514770737&to=1604354646792&var-project=integration&var-server=integration-agent-docker-1020&var-disk=vda&var-disk=vdb ;) [22:05:41] so yeah, it essentially seems the workload on those instances bursts to more than 500 writes per second, and whatever new limit makes it easier for them [22:05:54] cool [22:06:09] I'm resizing the other workers now, hopefully we can reproduce those results :) [22:06:14] what surprised me is that it was all fine in September [22:06:30] but I guess -1020 did not have any quota [22:06:47] yeah, the io quota is only really needed on the shared storage backend [22:06:53] and the issue got noticeable as all instances eventually got moved to Ceph / having the quota 3 weeks or so ago [22:08:13] andrewbogott: similar to the egress traffic shaping that was meant to protect /data mount ? [22:08:43] yeah, similar although not nearly as restrictive as that was [22:08:58] one sure thing [22:09:33] the new quota definitely improved things for -1020 [22:09:59] and I have found a burst setting in QEMU which is supported with OpenStack rocky / 18.0 or later [22:10:20] so one can rate limit writes to 500 but allow burst up to say 2000 [22:10:48] oh, that might be useful. I'll look into that [22:10:58] that is one of the comments on the task [22:11:03] Does the red prog meter on this VM mean something is stuck? https://integration.wikimedia.org/ci/computer/integration-agent-docker-1009/ [22:11:12] found it while reading the nova / libvirt code and the QEMU man page [22:11:41] andrewbogott: ahhhh [22:11:44] hmm yeah [22:11:53] long.ma / bd.808 were talking about it earlier. 
let me take a ps [22:12:01] I can wait if it's just slow [22:16:36] 10Release-Engineering-Team (Pipeline), 10Release Pipeline: Pipeline jobs freezing during teardown - https://phabricator.wikimedia.org/T267075 (10hashar) From `ps`: ` 9354 ? S 0:00 sh -c ({ while [ -d '/srv/jenkins/workspace/workspace/wikimedia-toolhub-pipeline-test@tmp/durable-265be6d9' -a \! -f '... [22:17:06] andrewbogott: seems like a bug in docker, the job got stuck running "docker rmi" [22:17:19] i have killed it [22:17:22] thanks! [22:18:25] !log Killed Pipeline job , stuck running `docker rmi --force ee88e6bf8898 96f16ff7774f ceb93158475f 47e2530d59a6` T267075 [22:18:28] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [22:18:28] T267075: Pipeline jobs freezing during teardown - https://phabricator.wikimedia.org/T267075 [22:24:34] bd808: longma: the job being stuck on docker rmi, that is most probably the same issue I have been investigating recently. Namely: write iops being throttled [22:25:08] * bd808 wishes that CI had dedicated hypervisors [22:28:02] 10Phabricator, 10Wikibase-Containers, 10Wikidata, 10Regression: Can't do shallow clone from phabricator - https://phabricator.wikimedia.org/T240862 (10mmodell) This is now hotfixed by {0e3e3dd2fa143942db3928beea256ece0ca434f7} which is cherry-picked on production pending a proper deployment on wednesday. [22:28:37] 10Release-Engineering-Team (Pipeline), 10Release Pipeline: Pipeline jobs freezing during teardown - https://phabricator.wikimedia.org/T267075 (10hashar) Two got deleted, but two are left behind: ` $ sudo docker images|grep ceb9315 ... [22:38:41] 10Release-Engineering-Team (Logspam), 10DiscussionTools, 10Editing-team (FY2020-21 Kanban Board), 10MW-1.36-notes (1.36.0-wmf.16; 2020-11-03), and 2 others: Call to undefined method MediaWiki\Extension\DiscussionTools\HeadingItem::addWarning() - https://phabricator.wikimedia.org/T267035 (10matmarex) p:05U... 
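Editorial aside on the write-IOPS throttling discussed above ("rate limit writes to 500 but allow burst up to say 2000"): a minimal Python sketch of the general idea, written as a token bucket. This is only an illustration of a sustained limit plus a burst allowance; it is not how QEMU, libvirt, or nova implement throttling, and the class and method names (`IopsBucket`, `admit`) are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class IopsBucket:
    """Token bucket: `rate` ops/s sustained, up to `burst` ops banked."""
    rate: float                        # sustained writes per second (500 in the chat)
    burst: float                       # bucket capacity (2000 in the chat)
    tokens: float = field(init=False)  # ops currently available
    last: float = 0.0                  # time of the previous call, in seconds

    def __post_init__(self) -> None:
        self.tokens = self.burst       # start with a full bucket

    def admit(self, now: float, ops: int) -> int:
        """Return how many of `ops` requested writes may proceed at `now`."""
        elapsed = max(0.0, now - self.last)
        # Refill at the sustained rate, never beyond the burst capacity.
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        allowed = min(ops, int(self.tokens))
        self.tokens -= allowed
        return allowed
```

With `IopsBucket(rate=500, burst=2000)`, an idle instance can absorb a spike of up to 2000 writes at once, after which it is held to roughly 500 per second until the bucket refills. That matches the behaviour described for bursty CI jobs, under the assumptions stated above.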
[22:49:40] hashar: should I wait for the audit-resources job on https://integration.wikimedia.org/ci/computer/integration-agent-docker-1005/ to finish before I resize or is it OK to interrupt? [22:57:33] andrewbogott: you can interrupt that job [22:57:42] great [22:57:47] 10Continuous-Integration-Infrastructure, 10Release-Engineering-Team (CI & Testing services), 10Cloud-VPS, 10cloud-services-team (Kanban): integration instances suffer from high IO latency due to Ceph - https://phabricator.wikimedia.org/T266777 (10Andrew) >>! In T266777#6598396, @hashar wrote: > And just be... [22:59:11] 10Continuous-Integration-Infrastructure, 10Release-Engineering-Team (CI & Testing services), 10Cloud-VPS, 10cloud-services-team (Kanban): integration instances suffer from high IO latency due to Ceph - https://phabricator.wikimedia.org/T266777 (10Andrew) I've adjusted the throttling rules for all integrati... [23:11:34] andrewbogott: going to bed. thx for all the resizes! [23:11:45] have a good night! [23:12:00] will probably just close the task tomorrow and we can further tune later I guess [23:16:24] James_F do you have some time? [23:16:41] hoping for some help with releasing new version of ObjectFactory [23:17:25] since apparently tags need to be signed by specific people, and I don't think I'm one of those [23:17:46] DannyS712: Oh, sure, one moment. [23:19:17] 10Continuous-Integration-Infrastructure, 10Release-Engineering-Team (CI & Testing services), 10Cloud-VPS, 10cloud-services-team (Kanban): integration instances suffer from high IO latency due to Ceph - https://phabricator.wikimedia.org/T266777 (10hashar) p:05Unbreak!→03High >>! In T266777#6598466, @And... [23:21:12] DannyS712: https://gerrit.wikimedia.org/r/plugins/gitiles/mediawiki/libs/ObjectFactory/+/refs/tags/v3.0.0 [23:21:58] thanks. I *think* I figured out how to do the vendor update (I use windows, so I couldn't figure out composer, but any problem with just copying the text?) 
[23:22:21] Composer should work on Windows? [23:22:24] you should really use composer... [23:22:50] git bash: "command not found" when trying to use composer [23:23:16] if you download the phar, you should be able to do `php composer.phar update` [23:23:37] https://getcomposer.org/download/ see the manual download section [23:23:37] ...yeah, I don't have php locally either :) [23:24:02] how do you test your patches then? [23:24:14] mediawiki-vagrant [23:24:37] You can run composer inside Vagrant. [23:24:40] Most people do. :-) [23:24:48] oh, cool - learn something new every day [23:24:52] booting it up now [23:25:03] If you run into any issues, do shout. [23:25:28] thanks [23:28:04] so I'm shouting :) [23:28:04] `vagrant ssh` takes me to `vagrant@vagrant:~$` [23:28:04] but I can't find where the mediawiki files are [23:28:21] /vagrant/mediawiki IIRC [23:28:29] https://www.mediawiki.org/wiki/MediaWiki-Vagrant#Basic_usage [23:28:36] but `ls` doesn't show a /vagrant folder [23:28:45] so I couldn't find it [23:29:38] hmm, nevermind - leading / [23:33:57] okay, next question: parsoid requires "wikimedia/object-factory": "^2.1", - is that updated changed before or after vendor? Since `composer update` was complaining [23:34:34] or is it updated to `^2.1|^3.0.0` now and then to `^3.0.0` later? [23:35:18] James_F / legoktm ^ [23:35:26] Parsoid will need to be updated to support both, then released, then vendor switched to the new Parsoid version, /then/ upgrade vendor (and core) to the new version of OF, /then/ drop 2.1 support from Parsoid. [23:35:32] Isn't release process stuff fun? [23:35:40] definitely [23:35:49] yeah what he said :) [23:36:02] this wasn't needed for the last OF release, since the dependency was only added in May (https://gerrit.wikimedia.org/r/plugins/gitiles/mediawiki/services/parsoid/+/f34f9665948ce9bac4ca28be97a6081d26494397) [23:36:35] Yeah, over time as we 'properly' re-use libraries each upgrade gets harder and harder. 
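Editorial aside on the `^2.1` versus `^2.1|^3.0.0` question above: a minimal Python sketch of how caret constraints behave, to show why Parsoid has to accept both major versions before vendor can move to ObjectFactory 3.0.0. It is a simplified model, not Composer's actual resolver: it ignores stability suffixes such as `-a4@alpha` and the special handling of `0.0.x`, and the function names (`parse`, `caret_range`, `satisfies`) are hypothetical.

```python
def parse(version: str) -> tuple[int, int, int]:
    """Parse 'X', 'X.Y' or 'X.Y.Z' into a 3-tuple, padding with zeros."""
    parts = [int(p) for p in version.split(".")]
    parts += [0] * (3 - len(parts))
    return (parts[0], parts[1], parts[2])


def caret_range(constraint: str):
    """Return (lower_inclusive, upper_exclusive) for a '^X.Y.Z' constraint."""
    low = parse(constraint.lstrip("^"))
    if low[0] > 0:
        high = (low[0] + 1, 0, 0)   # next major version is excluded
    else:
        high = (0, low[1] + 1, 0)   # pre-1.0: a minor bump counts as breaking
    return low, high


def satisfies(version: str, constraint: str) -> bool:
    """True if `version` matches any '|'-separated caret alternative."""
    v = parse(version)
    for alt in constraint.replace("||", "|").split("|"):
        low, high = caret_range(alt.strip())
        if low <= v < high:
            return True
    return False
```

Under this model `satisfies("3.0.0", "^2.1")` is false while `satisfies("3.0.0", "^2.1|^3.0.0")` is true. That is the reason for the ordering described in the chat: widen Parsoid's constraint and release it first, then bump vendor and core together, and only afterwards drop 2.1 support.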
[23:37:01] only if we keep making breaking changes :p [23:37:42] but some libraries definitely get entrenched the wider they're used, at-ease is/was a big one [23:37:50] Yeah. [23:37:55] hmm, doesn't this also require updating core to the new version of parsoid, not just vendor? [23:38:06] And each time we upgrade this `php` thing, it takes forever! ;-) [23:38:21] DannyS712: Parsoid has a special relaxed rule that absolutely no-one else must ever use. [23:38:45] "^0.13.0-a4@alpha" [23:38:54] Yup. [23:38:56] everything else must be exactly pinned, but parsoid can use ^ [23:38:57] okay [23:39:19] Because unlike everything else, Parsoid is actively released every week and we need to back-port to production occasionally. [23:39:39] okay, is the process I laid out at https://phabricator.wikimedia.org/T267074 now correct? [23:39:43] Which is flatly impossible in a single commit for anything else, and we don't hate ourselves enough to do the five-change process. [23:40:11] The update core and update vendor changes have to land simultaneously. [23:40:18] yeah, with depends-on [23:40:29] I've seen reedy's patches [23:40:32] About 90% of them are authored or merged by me to avoid people getting lost. [23:40:39] Yeah. [23:40:53] unless there is a specific reason for "Update Parsoid to no longer accept old version of ObjectFactory" in that it requires extra support code, I wouldn't fuss about it, it can sometimes save pain during backports [23:41:30] Parsoid won't be able to take advantage of the new feature of optional services if it still accepts the old version [23:41:53] Yes, but that's their problem, not yours. [23:42:03] ...okay [23:43:16] parsoid patch at https://gerrit.wikimedia.org/r/c/mediawiki/services/parsoid/+/638058 [23:44:56] DannyS712: Remember to follow commit message guidelines. A commit title should always be unique and could never be used by a different patch (unless yours was reverted and this is a re-try). [23:46:22] (Fixed.) 
[23:47:00] oh, I was in the middle of fixing it, but thanks [23:48:59] and now I wait until the next weekly parsoid version bump? [23:49:06] Yup. [23:49:16] Or longer if there's not one next week. [23:49:22] okay [23:49:26] thanks for the help [23:49:28] (There's no train next week, so they may not bother.) [23:49:30] Any time. [23:50:40] I have two more changes to ObjectFactory in the works that will require a version bump (either 1 or 2, depending on timing) - https://phabricator.wikimedia.org/T265559 and https://phabricator.wikimedia.org/T246377 [23:50:57] and speaking of, any chance you can take a look at https://gerrit.wikimedia.org/r/c/mediawiki/core/+/634091 ? [23:52:47] DannyS712: Why did you move it inside the config addition? [23:52:58] (And yet not explain why in your commit summary. ;-)) [23:53:45] since I'm changing $options['specIsArg'] to $spec['spec_is_arg'] instead, and so the array key is renamed and it is moved to the first parameter instead of the second [23:54:09] that is the exact thing "use spec_is_arg in ObjectFactory specs" refers to [23:56:32] In https://gerrit.wikimedia.org/r/c/mediawiki/core/+/634091/2/includes/filebackend/FileBackendGroup.php you're moving from spec to config, aren't you? [23:57:50] sorry for the confusion - I was using the variable names used within ObjectFactory. The first parameter is the spec, and can include spec_is_arg, and the second parameter is extra options, and can include specIsArg - I'm switching `specIsArg` in the second array to `spec_is_arg` in the first array [23:58:35] also, the parsoid commit failed tests due to an upstream phan issue :( - discussion at https://gerrit.wikimedia.org/r/c/mediawiki/libs/ObjectFactory/+/632798 [23:59:58] Yeah.