[00:08:56] !log no-op testing updating development images on contint primary for https://gitlab.wikimedia.org/repos/releng/dev-images/-/merge_requests/95
[00:08:57] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[00:10:12] (update) jhuneidi: fab: Minimize ssh calls [repos/releng/dev-images] - https://gitlab.wikimedia.org/repos/releng/dev-images/-/merge_requests/95 (owner: dancy)
[00:10:12] (approved) jhuneidi: fab: Minimize ssh calls [repos/releng/dev-images] - https://gitlab.wikimedia.org/repos/releng/dev-images/-/merge_requests/95 (owner: dancy)
[01:52:52] Gerrit, VPS-project-Codesearch, good first task, Patch-For-Review: Codesearch links to Gitiles for Markdown files show rendering instead of source - https://phabricator.wikimedia.org/T371092#11649219 (Salmanfadi) I have submitted a patch for this task: https://gerrit.wikimedia.org/r/c/labs/codese...
[07:34:20] Gerrit, collaboration-services, Patch-For-Review: Reimage gerrit2002 - https://phabricator.wikimedia.org/T417247#11649536 (ABran-WMF) thanks for that update, I've added [[ https://gerrit.wikimedia.org/r/c/operations/puppet/+/1243633 | 2 more fingerprints ]]
[07:36:38] Gerrit, collaboration-services, Patch-For-Review: Reimage gerrit2002 - https://phabricator.wikimedia.org/T417247#11649538 (ABran-WMF)
[07:36:46] Gerrit, collaboration-services, Patch-For-Review: Reimage gerrit2002 - https://phabricator.wikimedia.org/T417247#11649540 (ABran-WMF)
[07:37:59] Gerrit, collaboration-services, Patch-For-Review: Reimage gerrit2002 - https://phabricator.wikimedia.org/T417247#11649547 (ABran-WMF) @jcrespo do you confirm backups are now OK?
[08:25:30] Release-Engineering-Team, Quibble, Patch-For-Review: Quibble should emit a report of each time it took to run the steps/stages - https://phabricator.wikimedia.org/T417399#11649671 (ArthurTaylor) @thcipriani neat! I had no idea we had such a dashboard. Or a jobs.db. Thanks for the tip!
[09:39:38] Release-Engineering-Team (Radar), Ceph, ServiceOps new, SRE-swift-storage, and 3 others: Move the docker registry's /restricted prefix to Docker Distribution backed up by Ceph - https://phabricator.wikimedia.org/T412951#11649870 (elukey) Stalled→Open The new Ceph Reef version running on a...
[10:33:27] GitLab (Integrations), Release-Engineering-Team (Priority Backlog 📥), SecTeam-Processed: Experiment with package publishing workflows on GitLab - https://phabricator.wikimedia.org/T264131#11650020 (Aklapper)
[11:04:54] Release-Engineering-Team, Citoid, Essential-Work, Technical-Debt: update nodejs22-slim image to 22.21.0 to support proxy env variables for outbound requests - https://phabricator.wikimedia.org/T416471#11650114 (Mvolz)
[11:10:03] Gerrit, Release-Engineering-Team, collaboration-services: gerrit-spare behind CDN - https://phabricator.wikimedia.org/T418361 (ABran-WMF) NEW
[11:13:33] Gerrit, Release-Engineering-Team, collaboration-services: Explore solutions for Gerrit on Kubernetes - https://phabricator.wikimedia.org/T418364 (ABran-WMF) NEW
[11:14:20] Gerrit, collaboration-services: Move Gerrit data to object storage - https://phabricator.wikimedia.org/T416972#11650185 (ABran-WMF)
[11:14:21] Gerrit, Release-Engineering-Team, collaboration-services: Explore solutions for Gerrit on Kubernetes - https://phabricator.wikimedia.org/T418364#11650184 (ABran-WMF)
[11:14:42] Gerrit, Release-Engineering-Team, collaboration-services: Explore solutions for Gerrit on Kubernetes - https://phabricator.wikimedia.org/T418364#11650186 (ABran-WMF) p:Triage→Medium
[11:17:23] Gerrit, Release-Engineering-Team, collaboration-services: gerrit-spare behind CDN - https://phabricator.wikimedia.org/T418361#11650190 (ABran-WMF) p:Triage→Medium
[11:18:05] Gerrit, collaboration-services: Harmonize DNS on all gerrit instances - https://phabricator.wikimedia.org/T417279#11650193 (ABran-WMF) p:Triage→Medium
[11:21:37] GitLab needs a short maintenance break in one hour
[11:25:12] (open) jelto: gitlab-runner: bump image version to alpine-v18.7.2 [repos/releng/gitlab-cloud-runner] - https://gitlab.wikimedia.org/repos/releng/gitlab-cloud-runner/-/merge_requests/553 (https://phabricator.wikimedia.org/T418344)
[11:37:28] FIRING: PuppetAgentStaleLastRun: Last Puppet run was over 24 hours ago on instance deployment-ms-fe04 in project deployment-prep - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentStaleLastRun
[11:37:38] Beta-Cluster-Infrastructure: Last Puppet run was over 24 hours ago on instance deployment-ms-fe04 in project deployment-prep - https://phabricator.wikimedia.org/T418368 (wmcs-alerts) NEW
[11:42:28] FIRING: [7x] PuppetAgentStaleLastRun: Last Puppet run was over 24 hours ago on instance deployment-changeprop-1 in project deployment-prep - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentStaleLastRun
[11:47:28] FIRING: [11x] PuppetAgentStaleLastRun: Last Puppet run was over 24 hours ago on instance deployment-changeprop-1 in project deployment-prep - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentStaleLastRun
[11:52:28] FIRING: [17x] PuppetAgentStaleLastRun: Last Puppet run was over 24 hours ago on instance deployment-changeprop-1 in project deployment-prep - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentStaleLastRun
[11:57:28] FIRING: [23x] PuppetAgentStaleLastRun: Last Puppet run was over 24 hours ago on instance deployment-changeprop-1 in project deployment-prep - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentStaleLastRun
[12:02:28] FIRING: [27x] PuppetAgentStaleLastRun: Last Puppet run was over 24 hours ago on instance deployment-cache-text08 in project deployment-prep - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentStaleLastRun
[12:07:28] FIRING: [32x] PuppetAgentStaleLastRun: Last Puppet run was over 24 hours ago on instance deployment-acme-chief06 in project deployment-prep - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentStaleLastRun
[12:12:28] FIRING: [40x] PuppetAgentStaleLastRun: Last Puppet run was over 24 hours ago on instance deployment-acme-chief06 in project deployment-prep - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentStaleLastRun
[12:14:13] FIRING: [44x] PuppetAgentStaleLastRun: Last Puppet run was over 24 hours ago on instance deployment-acme-chief06 in project deployment-prep - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentStaleLastRun
[12:19:28] FIRING: [44x] PuppetAgentStaleLastRun: Last Puppet run was over 24 hours ago on instance deployment-acme-chief06 in project deployment-prep - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentStaleLastRun
[12:19:29] FIRING: [43x] PuppetAgentNoResources: No Puppet resources found on instance deployment-acme-chief05 on project deployment-prep - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources
[12:22:28] RESOLVED: [44x] PuppetAgentStaleLastRun: Last Puppet run was over 24 hours ago on instance deployment-acme-chief06 in project deployment-prep - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentStaleLastRun
[12:22:29] RESOLVED: [43x] PuppetAgentNoResources: No Puppet resources found on instance deployment-acme-chief05 on project deployment-prep - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources
[12:29:39] GitLab maintenance finished
[12:42:52] Phabricator, Release-Engineering-Team (Priority Backlog 📥): Mark Wmdephabbot Phabricator account as bot in database - https://phabricator.wikimedia.org/T418141#11650471 (ArthurTaylor) Hi @Aklapper, Sorry for the confusion. I think I must have created a couple of different accounts at that time in a scra...
[12:46:08] Beta-Cluster-Infrastructure: Last Puppet run was over 24 hours ago on instance deployment-ms-fe04 in project deployment-prep - https://phabricator.wikimedia.org/T418368#11650487 (taavi) Open→Resolved a:taavi
[13:34:09] (PS1) Dmaza: zuul: Add TemplateData as a dependency of CommunityRequests [integration/config] - https://gerrit.wikimedia.org/r/1243821
[13:36:12] (CR) Jforrester: zuul: Add TemplateData as a dependency of CommunityRequests (1 comment) [integration/config] - https://gerrit.wikimedia.org/r/1243821 (owner: Dmaza)
[13:36:23] (PS2) Dmaza: zuul: Add TemplateData as a dependency of CommunityRequests [integration/config] - https://gerrit.wikimedia.org/r/1243821
[13:39:03] (CR) Dmaza: zuul: Add TemplateData as a dependency of CommunityRequests (1 comment) [integration/config] - https://gerrit.wikimedia.org/r/1243821 (owner: Dmaza)
[13:40:15] (PS3) Jforrester: Zuul: [mediawiki/extensions/CommunityRequests] Add TemplateData dependency [integration/config] - https://gerrit.wikimedia.org/r/1243821 (https://phabricator.wikimedia.org/T401638) (owner: Dmaza)
[13:42:05] Gerrit: Gerrit: Adding a team to the reviewers does not automatically add team members to the attention set. - https://phabricator.wikimedia.org/T418376 (EMcFarland-WMF) NEW
[13:54:14] Gerrit: Gerrit: Adding a team to the reviewers does not automatically add team members to the attention set. - https://phabricator.wikimedia.org/T418376#11650737 (hashar) It is possible the behavior is different/fixed in a later version of Gerrit, we are some versions behind (we run 3.10, upstream has releas...
[13:54:22] Gerrit, Upstream: Gerrit: Adding a team to the reviewers does not automatically add team members to the attention set. - https://phabricator.wikimedia.org/T418376#11650738 (hashar)
[14:00:18] Gerrit, Upstream: Gerrit: Adding a team to the reviewers does not automatically add team members to the attention set. - https://phabricator.wikimedia.org/T418376#11650758 (hashar)
[14:04:54] (CR) Jforrester: [C:+2] Zuul: [mediawiki/extensions/CommunityRequests] Add TemplateData dependency (1 comment) [integration/config] - https://gerrit.wikimedia.org/r/1243821 (https://phabricator.wikimedia.org/T401638) (owner: Dmaza)
[14:07:02] (Merged) jenkins-bot: Zuul: [mediawiki/extensions/CommunityRequests] Add TemplateData dependency [integration/config] - https://gerrit.wikimedia.org/r/1243821 (https://phabricator.wikimedia.org/T401638) (owner: Dmaza)
[14:07:22] !log Zuul: [mediawiki/extensions/CommunityRequests] Add TemplateData dependency, for T401638
[14:07:24] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[14:07:24] T401638: Add 'close as duplicate' functionality to the Wishlist - https://phabricator.wikimedia.org/T401638
[14:15:04] Project beta-code-update-eqiad build #589166: FAILURE in 2 min 4 sec: https://integration.wikimedia.org/ci/job/beta-code-update-eqiad/589166/
[14:25:04] Yippee, build fixed!
[14:25:04] Project beta-code-update-eqiad build #589167: FIXED in 2 min 4 sec: https://integration.wikimedia.org/ci/job/beta-code-update-eqiad/589167/
[14:38:04] Continuous-Integration-Config, MobileFrontend (MobileFormatter), Reader Growth Team (Sprint 3 (Feb 17 - Mar 2) Q3 25/26): MobileFrontend tests that depend on ext:ParserMigration are currently failing (and also don't seem to be being run in CI) - https://phabricator.wikimedia.org/T415451#11650898 (matt...
[15:12:37] Release-Engineering-Team, ChangeProp, Data-Engineering, EventStreams, and 15 others: Migrate node-based services in production to node22 - https://phabricator.wikimedia.org/T393434#11651060 (Krinkle)
[15:14:12] Phabricator, Security-Team, SecTeam-Processed: Audit Phabricator security policies and groups membership - https://phabricator.wikimedia.org/T391150#11651073 (Aklapper)
[15:26:26] Gerrit, collaboration-services, Patch-For-Review: Reimage gerrit2002 - https://phabricator.wikimedia.org/T417247#11651146 (jcrespo) >>! In T417247#11649538, @ABran-WMF wrote: > @jcrespo do you confirm backups are now OK? Yes, they are ok now. >>! In T417247#11647749, @Dzahn wrote: > In that case.....
[15:40:36] hashar the gerrit SSH interface at [gerrit.wikimedia.org]:29418 didn't change after your reimage, did it? Just asking since my client is complaining it's unknown now
[15:41:17] (merge) dancy: fab: Minimize ssh calls [repos/releng/dev-images] - https://gitlab.wikimedia.org/repos/releng/dev-images/-/merge_requests/95
[15:51:19] (merge) dancy: gitlab-runner: bump image version to alpine-v18.7.2 [repos/releng/gitlab-cloud-runner] - https://gitlab.wikimedia.org/repos/releng/gitlab-cloud-runner/-/merge_requests/553 (https://phabricator.wikimedia.org/T418344) (owner: jelto)
[15:57:29] Gerrit, collaboration-services: gerrit: replication monitoring improvement - https://phabricator.wikimedia.org/T418084#11651295 (hashar) > Polish up the Gerrit > replication dashboard I have changed the {nav Latency} and {nav Delay} panels to **heatmaps per quantile**. That is slightly nicer and the spi...
[16:01:10] (merge) dancy: JobCard: Format time and date [repos/releng/scap] - https://gitlab.wikimedia.org/repos/releng/scap/-/merge_requests/1106 (owner: thcipriani)
[16:01:31] Gerrit (Gerrit 3.11), Release-Engineering-Team, collaboration-services, Upstream: Update gerrit replication plugin with new metrics - https://phabricator.wikimedia.org/T418215#11651313 (hashar) The plugin creates a `WorkQueue` using Gerrit core interface and when metrics are enabled that creates...
[16:05:25] (open) dancy: Release 4.242.0 [repos/releng/scap] - https://gitlab.wikimedia.org/repos/releng/scap/-/merge_requests/1108
[16:18:03] Project mediawiki-core-doxygen build #18062: FAILURE in 0.67 sec: https://integration.wikimedia.org/ci/job/mediawiki-core-doxygen/18062/
[16:20:36] FIRING: [2x] ProbeDown: Service gerrit2003:443 has failed probes (http_gerrit_tls_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#gerrit2003:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown
[16:23:36] Project beta-code-update-eqiad build #589179: FAILURE in 36 sec: https://integration.wikimedia.org/ci/job/beta-code-update-eqiad/589179/
[16:26:35] I think that is a widespread network/CDN issue
[16:29:24] (merge) dancy: Release 4.242.0 [repos/releng/scap] - https://gitlab.wikimedia.org/repos/releng/scap/-/merge_requests/1108
[16:31:10] (update) dduvall: digitalocean: Separate management of cluster and in-cluster resources [repos/releng/gitlab-cloud-runner] - https://gitlab.wikimedia.org/repos/releng/gitlab-cloud-runner/-/merge_requests/552 (https://phabricator.wikimedia.org/T416260)
[16:34:10] (update) dduvall: digitalocean: Separate management of cluster and in-cluster resources [repos/releng/gitlab-cloud-runner] - https://gitlab.wikimedia.org/repos/releng/gitlab-cloud-runner/-/merge_requests/552 (https://phabricator.wikimedia.org/T416260)
[16:35:12] Yippee, build fixed!
[16:35:12] Project beta-code-update-eqiad build #589180: FIXED in 2 min 12 sec: https://integration.wikimedia.org/ci/job/beta-code-update-eqiad/589180/
[16:35:31] RESOLVED: [2x] ProbeDown: Service gerrit2003:443 has failed probes (http_gerrit_tls_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#gerrit2003:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown
[16:36:44] (update) dduvall: digitalocean: Separate management of cluster and in-cluster resources [repos/releng/gitlab-cloud-runner] - https://gitlab.wikimedia.org/repos/releng/gitlab-cloud-runner/-/merge_requests/552 (https://phabricator.wikimedia.org/T416260)
[16:41:20] Release-Engineering-Team, collaboration-services: ProbeDown - https://phabricator.wikimedia.org/T418391#11651502 (Jdforrester-WMF) Likely related to T418392?
[16:57:59] Release-Engineering-Team, collaboration-services: ProbeDown - gerrit2003 - https://phabricator.wikimedia.org/T418391#11651559 (Dzahn)
[17:02:56] Release-Engineering-Team, collaboration-services: ProbeDown - gerrit2003 - https://phabricator.wikimedia.org/T418391#11651576 (Dzahn) Yes, what James said.
[17:18:24] Gerrit: Unticking the Gerrit "resolved" tick box sometimes doesn't work - https://phabricator.wikimedia.org/T372196#11651674 (Tgr) [[https://issues.gerritcodereview.com/issues/412679366|This upstream issue]] sounds similar (although it is talking about replies, not top-level comments). > if the Resolved chec...
[17:20:22] Gerrit (Gerrit 3.10): Upgrade Gerrit from 3.10.6 to 3.10.9 - https://phabricator.wikimedia.org/T400688#11651689 (Tgr)
[17:20:23] Gerrit: Unticking the Gerrit "resolved" tick box sometimes doesn't work - https://phabricator.wikimedia.org/T372196#11651688 (Tgr)
[17:31:13] Yippee, build fixed!
[17:31:14] Project mediawiki-core-doxygen build #18063: FIXED in 13 min: https://integration.wikimedia.org/ci/job/mediawiki-core-doxygen/18063/
[17:40:11] Gerrit-Privilege-Requests, Release-Engineering-Team, Security-Team, SRE, and 2 others: Request membership in deployment (and wmf-deployment group) for Rsilvola - https://phabricator.wikimedia.org/T418004#11651799 (Dzahn) @Rsilvola You have been added to the `deployment` shell user group and also...
[17:40:50] Gerrit-Privilege-Requests, Release-Engineering-Team, Security-Team, SRE, and 2 others: Request membership in deployment (and wmf-deployment group) for Rsilvola - https://phabricator.wikimedia.org/T418004#11651802 (Dzahn) Open→Resolved a:Dzahn
[17:46:34] Release-Engineering-Team (Doing 😎), WMF-NDA-Requests: NDA ticket access for user DSantamaria - https://phabricator.wikimedia.org/T418401#11651828 (Aklapper) Open→Resolved a:Aklapper Done.
[17:49:56] Gerrit, Release-Engineering-Team, collaboration-services: Rename gerrit2 Gerrit user to gerrit in the administrator group - https://phabricator.wikimedia.org/T417642#11651838 (hashar)
[17:51:18] Gerrit, Release-Engineering-Team, collaboration-services: Rename gerrit2 Gerrit user to gerrit in the administrator group - https://phabricator.wikimedia.org/T417642#11651843 (hashar)
[17:54:38] Gerrit, Release-Engineering-Team, collaboration-services: Rename gerrit2 Gerrit user to gerrit in the administrator group - https://phabricator.wikimedia.org/T417642#11651855 (hashar) From T338470#11625057 : For the **Gerrit user**, **we do not rename users** ever, that is asking for too many issues...
[18:28:24] Release-Engineering-Team (Doing 😎), WMF-NDA-Requests: NDA ticket access for user DSantamaria - https://phabricator.wikimedia.org/T418401#11651982 (DSantamaria) Thanks!
[18:45:55] Gerrit, Release-Engineering-Team, collaboration-services: Rename gerrit2 Gerrit user to gerrit in the administrator group - https://phabricator.wikimedia.org/T417642#11652056 (Dzahn) I like this plan :) sounds good
[18:49:21] Gerrit, collaboration-services, Patch-For-Review, Puppet: Gerrit git replication should not break when Puppet changes its config - https://phabricator.wikimedia.org/T416929#11652066 (Dzahn) > The short fix is to disable configuration autoreloading in the replication plugin. This config change ha...
[19:14:06] Project-Admins, Tracking-Neverending: Requests for addition to the #acl*Project-Admins group (in comments) - https://phabricator.wikimedia.org/T706#11652102 (WMDE-leszek) Hello, can you please get @Ifeatu_Nnaobi_WMDE added as a Project Admin. She's the Product Manager at WMDE's Wikidata Integrations team...
[20:12:28] (CR) Hashar: [C:+2] tests: fix main() mangling the logging level name [integration/quibble] - https://gerrit.wikimedia.org/r/1239769 (owner: Hashar)
[20:13:07] (CR) Hashar: [C:+2] tests: fully cover quibble.commands.ReportVersions [integration/quibble] - https://gerrit.wikimedia.org/r/1239756 (https://phabricator.wikimedia.org/T417409) (owner: Hashar)
[20:19:30] (CR) Hashar: [C:+2] Collect program versions in parallel [integration/quibble] - https://gerrit.wikimedia.org/r/1239757 (https://phabricator.wikimedia.org/T417409) (owner: Hashar)
[20:19:59] (CR) Hashar: [C:+2] "Solved by sorting them by name." [integration/quibble] - https://gerrit.wikimedia.org/r/1239757 (https://phabricator.wikimedia.org/T417409) (owner: Hashar)
[20:30:22] (Merged) jenkins-bot: tests: fix main() mangling the logging level name [integration/quibble] - https://gerrit.wikimedia.org/r/1239769 (owner: Hashar)
[20:30:58] (Merged) jenkins-bot: tests: fully cover quibble.commands.ReportVersions [integration/quibble] - https://gerrit.wikimedia.org/r/1239756 (https://phabricator.wikimedia.org/T417409) (owner: Hashar)
[20:33:47] (CR) Jforrester: [C:+1] Collect program versions in parallel [integration/quibble] - https://gerrit.wikimedia.org/r/1239757 (https://phabricator.wikimedia.org/T417409) (owner: Hashar)
[20:38:12] (Merged) jenkins-bot: Collect program versions in parallel [integration/quibble] - https://gerrit.wikimedia.org/r/1239757 (https://phabricator.wikimedia.org/T417409) (owner: Hashar)
[20:58:27] Project-Admins, Tracking-Neverending: Requests for addition to the #acl*Project-Admins group (in comments) - https://phabricator.wikimedia.org/T706#11652336 (Aklapper) I added @Ifeatu_Nnaobi_WMDE. //Standard disclaimer: Please follow the [guidelines](https://www.mediawiki.org/wiki/Phabricator/Creating_an...
[21:06:45] PROBLEM - jenkins_service_running on releases2003 is CRITICAL: PROCS CRITICAL: 2 processes with regex args .*/bin/java .*-jar /usr/share/java/jenkins.war https://wikitech.wikimedia.org/wiki/Jenkins
[21:07:45] RECOVERY - jenkins_service_running on releases2003 is OK: PROCS OK: 1 process with regex args .*/bin/java .*-jar /usr/share/java/jenkins.war https://wikitech.wikimedia.org/wiki/Jenkins
[21:37:15] bartosz: Thanks for becoming the owner of Beta Cluster MX services! ;)
[21:37:47] wrong Bartosz :/
[21:38:07] Congrats to whoever!
[21:38:11] MatmaRex is not here apparently
[21:38:40] "I will boldly mark this as resolved, but if you also suffered from this problem, please check that it works now. And don't ask me to fix it when it breaks again, I still have no idea what I'm doing here. :)" -- https://phabricator.wikimedia.org/T291679#11649059
[21:39:03] haha
[21:39:06] Perfect
[21:41:10] I'm pretty sure that Beta still runs under Hot Potato Rules (HPR) where the last touch means lifetime ownership.
[21:49:51] i think there's a pretty good argument that hot potato ops is actually the governing methodology of the entire technical ecosystem.
[21:50:04] ...beta is maybe just a particularly strong example.
[22:07:33] I passed on your, er, thanks to bartosz
[22:07:42] ( :-P )
[22:08:09] Thanks _and_ the hot potato
[22:08:20] MatmaRex!
[22:08:32] hahahahahaa
[22:10:08] hey
[22:10:14] i heard someone is snitching on me ;)
[22:10:24] You were receiving praise and recognition
[22:10:30] And the hot potato
[22:11:01] if anyone in management asks, i DO NOT know why the beta cluster can send email again
[22:11:59] don't worry, no one else does either?
[22:12:10] just tell them you fixed the leaking kitchen sink in beta cluster instead
[22:12:11] uh oh
[22:12:49] thanks for the advice about it the other day though, i really did not know what i was doing, and was just trying to snipe you all into solving the problem for me
[22:14:58] (bartosz is the ML Bartosz. we have been confused for each other before)
[22:15:03] We're in this snipe together
[22:16:31] I'm not opposed to declaring the beta cluster mail server to be the ML Bartosz's problem
[22:20:39] Continuous-Integration-Infrastructure (Zuul upgrade), collaboration-services, Patch-For-Review: puppetize setup of new zuul VMs - https://phabricator.wikimedia.org/T395938#11652512 (Dzahn)
[22:28:30] (merge) dduvall: digitalocean: Separate management of cluster and in-cluster resources [repos/releng/gitlab-cloud-runner] - https://gitlab.wikimedia.org/repos/releng/gitlab-cloud-runner/-/merge_requests/552 (https://phabricator.wikimedia.org/T416260)
[22:33:13] Gerrit, Release-Engineering-Team, Release-Engineering-Team-TODO (2020-04 to 2020-06 (Q4)), ci-test-error (WMF-deployed Build Failure), Patch-For-Review: Jenkins job failing intermittently due to Gerrit HTTP 502 errors when interacting with repo... - https://phabricator.wikimedia.org/T246763#11652527
[22:34:47] Gerrit, Release-Engineering-Team, Release-Engineering-Team-TODO (2020-04 to 2020-06 (Q4)), ci-test-error (WMF-deployed Build Failure), Patch-For-Review: Jenkins job failing intermittently due to Gerrit HTTP 502 errors when interacting with repo... - https://phabricator.wikimedia.org/T246763#11652529
[22:43:19] (open) dduvall: ci: Fix cluster state names [repos/releng/gitlab-cloud-runner] - https://gitlab.wikimedia.org/repos/releng/gitlab-cloud-runner/-/merge_requests/554
[22:44:28] (merge) dduvall: ci: Fix cluster state names [repos/releng/gitlab-cloud-runner] - https://gitlab.wikimedia.org/repos/releng/gitlab-cloud-runner/-/merge_requests/554
[23:01:29] Release-Engineering-Team: Gerrit is missing the codehealth magic RERUN link - https://phabricator.wikimedia.org/T418424 (HMonroy) NEW
[23:01:59] Release-Engineering-Team: Gerrit UI is missing the codehealth magic RERUN link - https://phabricator.wikimedia.org/T418424#11652596 (HMonroy)
[23:06:12] (open) dduvall: scripts: Fix argument help parameters [repos/releng/gitlab-cloud-runner] - https://gitlab.wikimedia.org/repos/releng/gitlab-cloud-runner/-/merge_requests/555
[23:07:21] (merge) dduvall: scripts: Fix argument help parameters [repos/releng/gitlab-cloud-runner] - https://gitlab.wikimedia.org/repos/releng/gitlab-cloud-runner/-/merge_requests/555
[23:08:22] Release-Engineering-Team: Gerrit UI is missing the codehealth magic RERUN link - https://phabricator.wikimedia.org/T418424#11652610 (hashar)
[23:10:09] Release-Engineering-Team: Gerrit UI is missing the codehealth magic RERUN link - https://phabricator.wikimedia.org/T418424#11652628 (hashar) The relevant code is at https://gerrit.wikimedia.org/g/operations/software/gerrit/+/refs/heads/deploy/wmf/stable-3.10/plugins/wm-checks-api.js#591 That is a switch/ca...
[23:25:48] (open) dduvall: ci: Use explicit `needs` in job that use deployer image [repos/releng/gitlab-cloud-runner] - https://gitlab.wikimedia.org/repos/releng/gitlab-cloud-runner/-/merge_requests/556
[23:26:26] (merge) dduvall: ci: Use explicit `needs` in job that use deployer image [repos/releng/gitlab-cloud-runner] - https://gitlab.wikimedia.org/repos/releng/gitlab-cloud-runner/-/merge_requests/556
[23:50:48] !log deploying https://gitlab.wikimedia.org/repos/releng/gitlab-cloud-runner/-/merge_requests/552 to gitlab-cloud-runner production cluster (T416260)
[23:50:50] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[23:50:50] T416260: Separate gitlab-cloud-runner k8s cluster provisioning from provider configuration - https://phabricator.wikimedia.org/T416260