[00:01:27] 10SRE, 10CommRel-Specialists-Support (Apr-Jun-2021), 10Datacenter-Switchover: CommRel support for June 2021 Switchover - https://phabricator.wikimedia.org/T281209 (10sgrabarczuk) [00:13:36] (03PS1) 10Papaul: Add ganeti202[56] to partman [puppet] - 10https://gerrit.wikimedia.org/r/700720 (https://phabricator.wikimedia.org/T282603) [00:14:53] (03CR) 10Papaul: [C: 03+2] Add ganeti202[56] to partman [puppet] - 10https://gerrit.wikimedia.org/r/700720 (https://phabricator.wikimedia.org/T282603) (owner: 10Papaul) [00:17:36] 10SRE, 10ops-codfw, 10DC-Ops, 10Patch-For-Review: (Need By: TBD) rack/setup/install ganeti202[56] - https://phabricator.wikimedia.org/T282603 (10ops-monitoring-bot) Script wmf-auto-reimage was launched by pt1979 on cumin2002.codfw.wmnet for hosts: ` ganeti2025.codfw.wmnet ` The log can be found in `/var/lo... [00:33:00] !log pt1979@cumin2002 START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti2025.codfw.wmnet with reason: REIMAGE [00:33:03] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [00:35:02] !log pt1979@cumin2002 END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on ganeti2025.codfw.wmnet with reason: REIMAGE [00:35:04] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [00:44:32] 10SRE, 10ops-codfw, 10DC-Ops, 10Patch-For-Review: (Need By: TBD) rack/setup/install ganeti202[56] - https://phabricator.wikimedia.org/T282603 (10ops-monitoring-bot) Completed auto-reimage of hosts: ` ['ganeti2025.codfw.wmnet'] ` and were **ALL** successful. [00:46:31] 10SRE, 10ops-codfw, 10DC-Ops, 10Patch-For-Review: (Need By: TBD) rack/setup/install ganeti202[56] - https://phabricator.wikimedia.org/T282603 (10ops-monitoring-bot) Script wmf-auto-reimage was launched by pt1979 on cumin2002.codfw.wmnet for hosts: ` ganeti2026.codfw.wmnet ` The log can be found in `/var/lo... 
[01:00:53] !log pt1979@cumin2002 START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti2026.codfw.wmnet with reason: REIMAGE [01:00:56] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [01:02:51] !log pt1979@cumin2002 END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on ganeti2026.codfw.wmnet with reason: REIMAGE [01:02:54] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [01:11:17] 10SRE, 10ops-codfw, 10DC-Ops: (Need By: TBD) rack/setup/install ganeti202[56] - https://phabricator.wikimedia.org/T282603 (10ops-monitoring-bot) Completed auto-reimage of hosts: ` ['ganeti2026.codfw.wmnet'] ` and were **ALL** successful. [01:19:12] 10SRE, 10ops-codfw, 10DC-Ops: (Need By: TBD) rack/setup/install ganeti202[56] - https://phabricator.wikimedia.org/T282603 (10Papaul) [01:20:01] 10SRE, 10ops-codfw, 10DC-Ops: (Need By: TBD) rack/setup/install ganeti202[56] - https://phabricator.wikimedia.org/T282603 (10Papaul) 05Open→03Resolved @MoritzMuehlenhoff this is ready for service. 
[01:39:53] (03PS3) 10Brennen Bearnes: CAS: stop marking users as external [gitlab-ansible] - 10https://gerrit.wikimedia.org/r/699819 (https://phabricator.wikimedia.org/T274461) [01:40:11] (03PS5) 10Brennen Bearnes: disable issues & wikis by default on new projects [gitlab-ansible] - 10https://gerrit.wikimedia.org/r/699812 (https://phabricator.wikimedia.org/T264231) [02:00:05] Deploy window Branching MediaWiki, extensions, skins, and vendor – See Heterogeneous_deployment/Train_deploys (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20210622T0200) [02:16:47] (Traffic on tunnel link) firing: Traffic on tunnel link - https://alerts.wikimedia.org [02:36:47] (Traffic on tunnel link) resolved: Traffic on tunnel link - https://alerts.wikimedia.org [04:19:16] !log marostegui@cumin1001 START - Cookbook sre.hosts.downtime for 1:00:00 on 24 hosts with reason: Master switchover s5 T284529 [04:19:20] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [04:19:21] T284529: Switchover s5 from db1100 to db1130 - https://phabricator.wikimedia.org/T284529 [04:19:25] !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 24 hosts with reason: Master switchover s5 T284529 [04:19:30] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [04:20:07] !log Start topology changes for s5 switchover T284529 [04:20:10] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [04:28:01] (03PS2) 10Marostegui: mariadb: Promote db1130 to s5 master. [puppet] - 10https://gerrit.wikimedia.org/r/700462 (https://phabricator.wikimedia.org/T284529) [04:28:13] (03PS3) 10Marostegui: wmnet: Promote db1130 to s5 master [dns] - 10https://gerrit.wikimedia.org/r/699136 (https://phabricator.wikimedia.org/T284529) [04:30:53] (03CR) 10Marostegui: [C: 03+2] mariadb: Promote db1130 to s5 master. 
[puppet] - 10https://gerrit.wikimedia.org/r/700462 (https://phabricator.wikimedia.org/T284529) (owner: 10Marostegui) [04:41:23] marostegui: o/ [04:42:27] kormat: o/ [04:42:41] don't you love this habit? [04:42:51] i do not [04:43:17] I know you do [04:43:20] this habit, you, 👎 [04:50:56] Morning [04:51:08] hi [04:55:22] less ning please [04:56:00] apergos: 💯 [05:00:01] ok, let's go? [05:00:21] ready here [05:00:23] 👍 [05:00:26] !log Starting s5 eqiad failover from db1100 to db1130 - T284529 [05:00:29] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [05:00:31] T284529: Switchover s5 from db1100 to db1130 - https://phabricator.wikimedia.org/T284529 [05:00:37] !log marostegui@cumin1001 dbctl commit (dc=all): 'Set s5 eqiad as read-only for maintenance - T284529', diff saved to https://phabricator.wikimedia.org/P16675 and previous config saved to /var/cache/conftool/dbconfig/20210622-050036-root.json [05:00:41] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [05:00:50] ro confirmed [05:00:53] same [05:01:03] same [05:01:23] !log marostegui@cumin1001 dbctl commit (dc=all): 'Promote db1130 to s5 master and set section read-write T284529', diff saved to https://phabricator.wikimedia.org/P16676 and previous config saved to /var/cache/conftool/dbconfig/20210622-050123-root.json [05:01:26] all done [05:01:28] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [05:01:28] checking [05:01:49] I can write on enwikivoyage [05:01:51] confirming r/w [05:02:02] I see rcs on de [05:02:52] no log errors [05:02:53] same on cebwiki [05:03:06] Tendril updated [05:03:41] same with zarcillo [05:05:01] (03CR) 10Marostegui: [C: 03+2] wmnet: Promote db1130 to s5 master [dns] - 10https://gerrit.wikimedia.org/r/699136 (https://phabricator.wikimedia.org/T284529) (owner: 10Marostegui) [05:06:03] !log marostegui@cumin1001 dbctl commit (dc=all): 'Depool old master running 10.1 T284529', diff saved to 
https://phabricator.wikimedia.org/P16677 and previous config saved to /var/cache/conftool/dbconfig/20210622-050602-marostegui.json [05:06:06] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [05:06:07] T284529: Switchover s5 from db1100 to db1130 - https://phabricator.wikimedia.org/T284529 [05:06:34] !log Stop replication on old s5 master ( db1100) - T284529 [05:06:37] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [05:07:34] (03PS1) 10Marostegui: db1100: Disable notifications [puppet] - 10https://gerrit.wikimedia.org/r/700723 (https://phabricator.wikimedia.org/T284529) [05:08:36] (03CR) 10Marostegui: [C: 03+2] db1100: Disable notifications [puppet] - 10https://gerrit.wikimedia.org/r/700723 (https://phabricator.wikimedia.org/T284529) (owner: 10Marostegui) [05:11:54] (03PS1) 10Jcrespo: dbbackups: Remove s5 (stretch) from backup sources [puppet] - 10https://gerrit.wikimedia.org/r/700725 (https://phabricator.wikimedia.org/T283235) [05:12:43] (03PS5) 10Jcrespo: dbbackups: Migrate db1171:s2 to db1139, reimage as buster and set s7&s8 [puppet] - 10https://gerrit.wikimedia.org/r/700473 (https://phabricator.wikimedia.org/T280979) [05:13:25] (03PS1) 10Marostegui: install_server: Reimage db1100 to Buster. [puppet] - 10https://gerrit.wikimedia.org/r/700746 (https://phabricator.wikimedia.org/T283235) [05:14:10] (03CR) 10Marostegui: [C: 03+2] install_server: Reimage db1100 to Buster. 
[puppet] - 10https://gerrit.wikimedia.org/r/700746 (https://phabricator.wikimedia.org/T283235) (owner: 10Marostegui) [05:20:36] There's a big spike in writes on s5 [05:25:07] sobanski: yes, it must be db1100 which is getting the schema changes applied [05:25:20] ACK [05:26:08] yep, db1100: https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&from=now-12h&to=now&var-server=db1100&var-port=9104 [05:37:39] (03PS1) 10Jcrespo: bacula: Remove sretest1002 and cloudmetrics1002 from the backup ignore list [puppet] - 10https://gerrit.wikimedia.org/r/700748 (https://phabricator.wikimedia.org/T281881) [05:37:52] (03PS2) 10Jcrespo: bacula: Remove sretest1002 and cloudmetrics1002 from the backup ignore list [puppet] - 10https://gerrit.wikimedia.org/r/700748 (https://phabricator.wikimedia.org/T281881) [05:57:58] (03CR) 10Giuseppe Lavagetto: Add base debian directory (031 comment) [debs/wmf-certificates] - 10https://gerrit.wikimedia.org/r/699155 (https://phabricator.wikimedia.org/T284417) (owner: 10Giuseppe Lavagetto) [05:58:41] (03CR) 10Giuseppe Lavagetto: [V: 03+2 C: 03+2] Add base debian directory [debs/wmf-certificates] - 10https://gerrit.wikimedia.org/r/699155 (https://phabricator.wikimedia.org/T284417) (owner: 10Giuseppe Lavagetto) [06:05:48] (03PS1) 10Giuseppe Lavagetto: Fix debhelper compat version [debs/wmf-certificates] - 10https://gerrit.wikimedia.org/r/700751 [06:09:59] (03CR) 10Giuseppe Lavagetto: [V: 03+2 C: 03+2] Fix debhelper compat version [debs/wmf-certificates] - 10https://gerrit.wikimedia.org/r/700751 (owner: 10Giuseppe Lavagetto) [06:14:50] !log marostegui@cumin1001 START - Cookbook sre.hosts.downtime for 2:00:00 on db1100.eqiad.wmnet with reason: REIMAGE [06:14:53] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [06:16:59] !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1100.eqiad.wmnet with reason: REIMAGE [06:17:02] Logged the message at 
https://wikitech.wikimedia.org/wiki/Server_Admin_Log [06:20:46] (03CR) 10Ayounsi: [C: 03+2] Fix dumps fail if a device has an empty (None) name [software/netbox-extras] - 10https://gerrit.wikimedia.org/r/700348 (https://phabricator.wikimedia.org/T275587) (owner: 10Ayounsi) [06:42:43] (03CR) 10Muehlenhoff: [C: 03+1] "The patch is fine, but we could also just as well drop sretest from backups again, there's nothing worth backing up on those hosts and we " [puppet] - 10https://gerrit.wikimedia.org/r/700748 (https://phabricator.wikimedia.org/T281881) (owner: 10Jcrespo) [06:43:47] !log repool wdqs1005 [06:43:50] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [06:44:07] dcausse: o/ an-airflow again full of logs :( [06:44:37] elukey: ok, will do some cleanup again :/ [06:44:53] I'll raise the problem with Erik, forgot to do it last time [06:51:56] dcausse: if you need help ping me! [06:56:12] elukey: done, we definitely need to revisit log retention here [06:58:15] RECOVERY - Disk space on an-airflow1001 is OK: DISK OK https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space https://grafana.wikimedia.org/dashboard/db/host-overview?var-server=an-airflow1001&var-datasource=eqiad+prometheus/ops [07:00:11] 10SRE, 10ops-codfw, 10DC-Ops: (Need By: TBD) rack/setup/install ganeti202[56] - https://phabricator.wikimedia.org/T282603 (10MoritzMuehlenhoff) Thanks, Papaul! 
[07:00:14] (03PS2) 10ArielGlenn: dumps: Migrate miscdumps clean up cron to systemd timer [puppet] - 10https://gerrit.wikimedia.org/r/700123 (https://phabricator.wikimedia.org/T273673) (owner: 10Ladsgroup) [07:02:06] (03CR) 10ArielGlenn: [C: 03+2] dumps: Migrate miscdumps clean up cron to systemd timer [puppet] - 10https://gerrit.wikimedia.org/r/700123 (https://phabricator.wikimedia.org/T273673) (owner: 10Ladsgroup) [07:03:16] (03PS2) 10ArielGlenn: dumps: Migrate xml dumps clean up cron to systemd timer [puppet] - 10https://gerrit.wikimedia.org/r/700161 (https://phabricator.wikimedia.org/T273673) (owner: 10Ladsgroup) [07:04:12] (03CR) 10ArielGlenn: [C: 03+2] dumps: Migrate xml dumps clean up cron to systemd timer [puppet] - 10https://gerrit.wikimedia.org/r/700161 (https://phabricator.wikimedia.org/T273673) (owner: 10Ladsgroup) [07:15:37] there will be whines shortly about puppet on the dumpsdata/labstore1006,7 boxes, I'm looking at it [07:24:13] 10SRE, 10Traffic, 10observability: Implement SLI measurement for Varnish Frontend - https://phabricator.wikimedia.org/T284576 (10ema) The following timestamps depend on external factors and should not be used for this SLI: - beresp - berespbody - fetch - pipesess - req - reqbody - waitinglist All others lo... 
[07:26:23] (03PS1) 10ArielGlenn: fix up conversion of adds changes dumps cleanup cron job to systemd timer [puppet] - 10https://gerrit.wikimedia.org/r/700834 [07:26:54] (03CR) 10jerkins-bot: [V: 04-1] fix up conversion of adds changes dumps cleanup cron job to systemd timer [puppet] - 10https://gerrit.wikimedia.org/r/700834 (owner: 10ArielGlenn) [07:28:29] !log marostegui@cumin1001 dbctl commit (dc=all): 'Depool db1169 T283499', diff saved to https://phabricator.wikimedia.org/P16678 and previous config saved to /var/cache/conftool/dbconfig/20210622-072828-marostegui.json [07:28:34] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [07:28:38] T283499: Schema change for renaming page_timestamp index on revision table to rev_page_timestamp - https://phabricator.wikimedia.org/T283499 [07:32:21] (03PS2) 10ArielGlenn: fix up conversion of adds changes dumps cleanup cron job to systemd timer [puppet] - 10https://gerrit.wikimedia.org/r/700834 [07:36:58] (03CR) 10Filippo Giunchedi: "LGTM, though see also https://gerrit.wikimedia.org/r/c/operations/cookbooks/+/626403 I sent a while ago" [cookbooks] - 10https://gerrit.wikimedia.org/r/700695 (https://phabricator.wikimedia.org/T285273) (owner: 10Legoktm) [07:41:51] (03PS3) 10ArielGlenn: fix up conversion of adds changes dumps cleanup cron job to systemd timer [puppet] - 10https://gerrit.wikimedia.org/r/700834 [07:46:19] (03CR) 10Jcrespo: "> we could also just as well drop sretest from backups again" [puppet] - 10https://gerrit.wikimedia.org/r/700748 (https://phabricator.wikimedia.org/T281881) (owner: 10Jcrespo) [07:49:19] (03CR) 10ArielGlenn: [C: 03+2] fix up conversion of adds changes dumps cleanup cron job to systemd timer [puppet] - 10https://gerrit.wikimedia.org/r/700834 (owner: 10ArielGlenn) [07:49:44] (03PS1) 10Filippo Giunchedi: alertmanager: disable dashboard alert history on grafana/librenms [puppet] - 10https://gerrit.wikimedia.org/r/700837 (https://phabricator.wikimedia.org/T281454) 
[07:49:48] (03PS1) 10Filippo Giunchedi: alertmanager: dashboard default to 450px wide groups [puppet] - 10https://gerrit.wikimedia.org/r/700838 (https://phabricator.wikimedia.org/T281454) [07:53:37] !log uploaded wmf-certificates package to buster-wikimedia/main, T284417 [07:53:42] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [07:53:43] T284417: Add the puppet CA to the MediaWiki deployment - https://phabricator.wikimedia.org/T284417 [07:54:54] (03CR) 10Muehlenhoff: "> Patch Set 10: Code-Review+1" (031 comment) [software/pywmflib] - 10https://gerrit.wikimedia.org/r/695341 (https://phabricator.wikimedia.org/T283242) (owner: 10Jbond) [07:56:03] (03PS1) 10Filippo Giunchedi: logstash: extend ssd retention to 15d [puppet] - 10https://gerrit.wikimedia.org/r/700841 [07:58:02] (03CR) 10Filippo Giunchedi: [C: 03+1] Add metrics group. [software/ecs] - 10https://gerrit.wikimedia.org/r/699428 (owner: 10Cwhite) [07:58:44] (03CR) 10Filippo Giunchedi: [C: 03+1] logstash: transition openstack to ECS [puppet] - 10https://gerrit.wikimedia.org/r/699039 (https://phabricator.wikimedia.org/T234565) (owner: 10Cwhite) [07:59:36] (03PS1) 10Giuseppe Lavagetto: pipeline: install the wmf internal CAs [mediawiki-config] - 10https://gerrit.wikimedia.org/r/700843 (https://phabricator.wikimedia.org/T284417) [08:02:33] 10SRE, 10MW-on-K8s, 10serviceops: Make all httpbb tests pass on the mwdebug deployment. - https://phabricator.wikimedia.org/T285298 (10Joe) [08:02:48] 10SRE, 10MW-on-K8s, 10serviceops: Make all httpbb tests pass on the mwdebug deployment. - https://phabricator.wikimedia.org/T285298 (10Joe) p:05Triage→03High a:03Joe [08:18:19] (03PS1) 10Muehlenhoff: Extend access for piccardi [puppet] - 10https://gerrit.wikimedia.org/r/700846 [08:19:32] (03CR) 10Muehlenhoff: [C: 03+1] "LGTM, let's remove the ignore list entry. I'll send another patch to drop backups for sretest* later." 
[puppet] - 10https://gerrit.wikimedia.org/r/700748 (https://phabricator.wikimedia.org/T281881) (owner: 10Jcrespo) [08:20:36] (03CR) 10Muehlenhoff: [C: 03+2] Extend access for piccardi [puppet] - 10https://gerrit.wikimedia.org/r/700846 (owner: 10Muehlenhoff) [08:35:46] !log marostegui@cumin1001 dbctl commit (dc=all): 'db1169 (re)pooling @ 25%: Repool db1169 after schema change', diff saved to https://phabricator.wikimedia.org/P16679 and previous config saved to /var/cache/conftool/dbconfig/20210622-083545-root.json [08:35:49] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [08:39:55] 10SRE, 10Wikimedia-Mailing-lists, 10User-Ladsgroup: Close wikimediahk@lists.wikimedia.org - https://phabricator.wikimedia.org/T285194 (10deryckchan) Thanks @Ladsgroup . @borogovia do you need access to the archives? [08:43:22] (03PS1) 10Jelto: fix GitLab backup cronjob [gitlab-ansible] - 10https://gerrit.wikimedia.org/r/700851 (https://phabricator.wikimedia.org/T274463) [08:45:27] (03CR) 10Jelto: "Could you take a look again please? 
I forgot to remove the -l flag from the backup script after testing and the job failed with" [gitlab-ansible] - 10https://gerrit.wikimedia.org/r/700851 (https://phabricator.wikimedia.org/T274463) (owner: 10Jelto) [08:49:15] !log marostegui@cumin1001 dbctl commit (dc=all): 'Depool db1166', diff saved to https://phabricator.wikimedia.org/P16680 and previous config saved to /var/cache/conftool/dbconfig/20210622-084915-marostegui.json [08:49:18] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [08:49:53] !log Upgrade db1166 [08:49:55] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [08:50:49] !log marostegui@cumin1001 dbctl commit (dc=all): 'db1169 (re)pooling @ 50%: Repool db1169 after schema change', diff saved to https://phabricator.wikimedia.org/P16681 and previous config saved to /var/cache/conftool/dbconfig/20210622-085049-root.json [08:50:53] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [08:51:04] 10SRE, 10MW-on-K8s, 10serviceops: Make all httpbb tests pass on the mwdebug deployment. - https://phabricator.wikimedia.org/T285298 (10Joe) On second thoughts: we're mostly using http where we could just use: ` request_headers: X-Forwarded-Proto: http ` instead. I'll do that where appropriate. [08:54:29] (03CR) 10Sfigor: [C: 03+1] "The change will definitely not harm." 
[gitlab-ansible] - 10https://gerrit.wikimedia.org/r/700676 (owner: 10Brennen Bearnes) [08:55:09] !log marostegui@cumin1001 dbctl commit (dc=all): 'db1166 (re)pooling @ 25%: Repool db1166 after upgrade', diff saved to https://phabricator.wikimedia.org/P16682 and previous config saved to /var/cache/conftool/dbconfig/20210622-085508-root.json [08:55:12] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [09:04:50] (03PS1) 10Ladsgroup: flaggedrevs: Reduce levels for ruwiki to 1 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/700854 (https://phabricator.wikimedia.org/T284589) [09:05:53] !log marostegui@cumin1001 dbctl commit (dc=all): 'db1169 (re)pooling @ 75%: Repool db1169 after schema change', diff saved to https://phabricator.wikimedia.org/P16683 and previous config saved to /var/cache/conftool/dbconfig/20210622-090552-root.json [09:05:56] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [09:07:19] (03CR) 10Volans: [C: 03+1] "I don't have the context for the actual content but code wise LGTM" [cookbooks] - 10https://gerrit.wikimedia.org/r/700704 (https://phabricator.wikimedia.org/T269179) (owner: 10Legoktm) [09:10:13] !log marostegui@cumin1001 dbctl commit (dc=all): 'db1166 (re)pooling @ 50%: Repool db1166 after upgrade', diff saved to https://phabricator.wikimedia.org/P16684 and previous config saved to /var/cache/conftool/dbconfig/20210622-091012-root.json [09:10:15] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [09:15:17] (03CR) 10Volans: [C: 03+1] "> Patch Set 10:" (031 comment) [software/pywmflib] - 10https://gerrit.wikimedia.org/r/695341 (https://phabricator.wikimedia.org/T283242) (owner: 10Jbond) [09:18:40] (03PS1) 10Giuseppe Lavagetto: httpbb: move most tests to https [puppet] - 10https://gerrit.wikimedia.org/r/700856 (https://phabricator.wikimedia.org/T285298) [09:20:57] !log marostegui@cumin1001 dbctl commit (dc=all): 'db1169 (re)pooling @ 100%: Repool db1169 after schema 
change', diff saved to https://phabricator.wikimedia.org/P16685 and previous config saved to /var/cache/conftool/dbconfig/20210622-092056-root.json [09:21:00] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [09:25:16] !log marostegui@cumin1001 dbctl commit (dc=all): 'db1166 (re)pooling @ 75%: Repool db1166 after upgrade', diff saved to https://phabricator.wikimedia.org/P16686 and previous config saved to /var/cache/conftool/dbconfig/20210622-092515-root.json [09:25:19] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [09:35:57] (03PS3) 10Jcrespo: bacula: Remove sretest1002 and cloudmetrics1002 from the backup ignore list [puppet] - 10https://gerrit.wikimedia.org/r/700748 (https://phabricator.wikimedia.org/T281881) [09:40:20] !log marostegui@cumin1001 dbctl commit (dc=all): 'db1166 (re)pooling @ 100%: Repool db1166 after upgrade', diff saved to https://phabricator.wikimedia.org/P16687 and previous config saved to /var/cache/conftool/dbconfig/20210622-094019-root.json [09:40:23] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [09:44:39] (03CR) 10Volans: [C: 03+2] cumin: add support for Kerberos auth [software/cumin] - 10https://gerrit.wikimedia.org/r/699192 (https://phabricator.wikimedia.org/T244840) (owner: 10Muehlenhoff) [09:48:06] (03CR) 10Jcrespo: [C: 03+2] bacula: Remove sretest1002 and cloudmetrics1002 from the backup ignore list [puppet] - 10https://gerrit.wikimedia.org/r/700748 (https://phabricator.wikimedia.org/T281881) (owner: 10Jcrespo) [09:49:30] (03Merged) 10jenkins-bot: cumin: add support for Kerberos auth [software/cumin] - 10https://gerrit.wikimedia.org/r/699192 (https://phabricator.wikimedia.org/T244840) (owner: 10Muehlenhoff) [09:50:25] (03PS2) 10Jdrewniak: Enable new Vector Languages-in-header feature & AB test for pilot wikis [mediawiki-config] - 10https://gerrit.wikimedia.org/r/700705 (https://phabricator.wikimedia.org/T269093) [09:54:35] (03CR) 10Muehlenhoff: "> But 
this is pywmflib and as a library should have at least unit tests for all modules IMHO" [software/pywmflib] - 10https://gerrit.wikimedia.org/r/695341 (https://phabricator.wikimedia.org/T283242) (owner: 10Jbond) [09:55:31] (03CR) 10Filippo Giunchedi: "Overall nice! See inline on notes/tips, please also include unit tests for the alerts to make sure there are no regressions (docs at https" (037 comments) [alerts] - 10https://gerrit.wikimedia.org/r/700649 (https://phabricator.wikimedia.org/T282806) (owner: 10Ayounsi) [10:08:13] (03PS1) 10Kormat: Revert "db1123: Disable notifications." [puppet] - 10https://gerrit.wikimedia.org/r/700729 [10:09:45] (03CR) 10Kormat: [C: 03+2] Revert "db1123: Disable notifications." [puppet] - 10https://gerrit.wikimedia.org/r/700729 (owner: 10Kormat) [10:10:12] 10SRE, 10MW-on-K8s, 10serviceops, 10Patch-For-Review: Make all httpbb tests pass on the mwdebug deployment. - https://phabricator.wikimedia.org/T285298 (10Joe) [10:10:52] 10SRE, 10Infrastructure-Foundations: Upgrade eqiad/codfw Ganeti clusters to Buster - https://phabricator.wikimedia.org/T284811 (10joanna_borun) [10:11:18] (03PS1) 10Ayounsi: Prevent empty saves [software/netbox-extras] - 10https://gerrit.wikimedia.org/r/700860 (https://phabricator.wikimedia.org/T266767) [10:11:38] (03CR) 10Ayounsi: "Tested on netbox-next." 
[software/netbox-extras] - 10https://gerrit.wikimedia.org/r/700860 (https://phabricator.wikimedia.org/T266767) (owner: 10Ayounsi) [10:21:09] !log kormat@cumin1001 dbctl commit (dc=all): 'db1123 (re)pooling @ 25%: reimaged to buster T283131', diff saved to https://phabricator.wikimedia.org/P16688 and previous config saved to /var/cache/conftool/dbconfig/20210622-102108-kormat.json [10:21:13] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:21:14] T283131: Upgrade s3 to Debian Buster and MariaDB 10.4 - https://phabricator.wikimedia.org/T283131 [10:24:22] (03CR) 10Volans: [C: 03+1] "The change looks sane, please test it a bit on netbox-next to be sure." (031 comment) [software/netbox-extras] - 10https://gerrit.wikimedia.org/r/700860 (https://phabricator.wikimedia.org/T266767) (owner: 10Ayounsi) [10:26:40] (03PS1) 10Effie Mouzeli: hieradata: Use TLS codfw pool for memcached replication on eqiad [puppet] - 10https://gerrit.wikimedia.org/r/700861 (https://phabricator.wikimedia.org/T284420) [10:28:39] (03PS2) 10Effie Mouzeli: hieradata: Use TLS codfw pool for memcached replication on eqiad [puppet] - 10https://gerrit.wikimedia.org/r/700861 (https://phabricator.wikimedia.org/T271967) [10:30:58] 10SRE, 10serviceops, 10Patch-For-Review, 10Performance-Team (Radar), 10User-jijiki: Enable TLS on memcached for cross-dc replication - https://phabricator.wikimedia.org/T271967 (10jijiki) [10:31:26] 10SRE, 10serviceops, 10Patch-For-Review, 10Performance-Team (Radar), 10User-jijiki: Enable TLS on memcached for cross-dc replication - https://phabricator.wikimedia.org/T271967 (10jijiki) Tested using the codfw TLS pool on eqiad canaries https://gerrit.wikimedia.org/r/c/operations/puppet/+/699908 [10:31:32] (03PS11) 10Jbond: IDM: create new idm library with logoutd base class [software/pywmflib] - 10https://gerrit.wikimedia.org/r/695341 (https://phabricator.wikimedia.org/T283242) [10:31:45] (03CR) 10Effie Mouzeli: "This commit has the wrong bug number, the 
correct one is T271967 (https://phabricator.wikimedia.org/T271967)" [puppet] - 10https://gerrit.wikimedia.org/r/699908 (https://phabricator.wikimedia.org/T284420) (owner: 10Effie Mouzeli) [10:31:59] (03CR) 10Jbond: "> Patch Set 10:" [software/pywmflib] - 10https://gerrit.wikimedia.org/r/695341 (https://phabricator.wikimedia.org/T283242) (owner: 10Jbond) [10:33:50] (03CR) 10jerkins-bot: [V: 04-1] IDM: create new idm library with logoutd base class [software/pywmflib] - 10https://gerrit.wikimedia.org/r/695341 (https://phabricator.wikimedia.org/T283242) (owner: 10Jbond) [10:35:32] (03PS3) 10Jdrewniak: Enable new Vector Languages-in-header feature & AB test for pilot wikis [mediawiki-config] - 10https://gerrit.wikimedia.org/r/700705 (https://phabricator.wikimedia.org/T269093) [10:36:13] !log kormat@cumin1001 dbctl commit (dc=all): 'db1123 (re)pooling @ 50%: reimaged to buster T283131', diff saved to https://phabricator.wikimedia.org/P16689 and previous config saved to /var/cache/conftool/dbconfig/20210622-103612-kormat.json [10:36:16] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:36:17] T283131: Upgrade s3 to Debian Buster and MariaDB 10.4 - https://phabricator.wikimedia.org/T283131 [10:41:05] (03PS1) 10Effie Mouzeli: profile::thanos::swift: add account for tegola vector tiles [puppet] - 10https://gerrit.wikimedia.org/r/700862 (https://phabricator.wikimedia.org/T283049) [10:43:10] (03PS1) 10Effie Mouzeli: profile::thanos::swift: add fake credentials for tegola_prod [labs/private] - 10https://gerrit.wikimedia.org/r/700863 (https://phabricator.wikimedia.org/T283049) [10:51:16] !log kormat@cumin1001 dbctl commit (dc=all): 'db1123 (re)pooling @ 75%: reimaged to buster T283131', diff saved to https://phabricator.wikimedia.org/P16690 and previous config saved to /var/cache/conftool/dbconfig/20210622-105115-kormat.json [10:51:20] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:51:21] T283131: Upgrade s3 to Debian 
Buster and MariaDB 10.4 - https://phabricator.wikimedia.org/T283131 [10:53:49] (03PS2) 10Ayounsi: Prevent empty saves [software/netbox-extras] - 10https://gerrit.wikimedia.org/r/700860 (https://phabricator.wikimedia.org/T266767) [10:54:28] (03CR) 10Volans: [C: 03+1] "LGTM" [software/netbox-extras] - 10https://gerrit.wikimedia.org/r/700860 (https://phabricator.wikimedia.org/T266767) (owner: 10Ayounsi) [10:55:02] (03PS12) 10Jbond: IDM: create new idm library with logoutd base class [software/pywmflib] - 10https://gerrit.wikimedia.org/r/695341 (https://phabricator.wikimedia.org/T283242) [10:57:20] (03CR) 10jerkins-bot: [V: 04-1] IDM: create new idm library with logoutd base class [software/pywmflib] - 10https://gerrit.wikimedia.org/r/695341 (https://phabricator.wikimedia.org/T283242) (owner: 10Jbond) [10:57:49] (03CR) 10Ayounsi: "> Patch Set 1: Code-Review+1" (031 comment) [software/netbox-extras] - 10https://gerrit.wikimedia.org/r/700860 (https://phabricator.wikimedia.org/T266767) (owner: 10Ayounsi) [10:57:56] (03CR) 10Ayounsi: [C: 03+2] Prevent empty saves [software/netbox-extras] - 10https://gerrit.wikimedia.org/r/700860 (https://phabricator.wikimedia.org/T266767) (owner: 10Ayounsi) [10:58:32] (03Merged) 10jenkins-bot: Prevent empty saves [software/netbox-extras] - 10https://gerrit.wikimedia.org/r/700860 (https://phabricator.wikimedia.org/T266767) (owner: 10Ayounsi) [10:58:48] (03CR) 10Phuedx: Enable new Vector Languages-in-header feature & AB test for pilot wikis (031 comment) [mediawiki-config] - 10https://gerrit.wikimedia.org/r/700705 (https://phabricator.wikimedia.org/T269093) (owner: 10Jdrewniak) [11:00:04] Amir1, Lucas_WMDE, awight, and Urbanecm: Your horoscope predicts another unfortunate European mid-day backport window deploy. May Zuul be (nice) with you. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20210622T1100). [11:00:04] jan_drewniak and ma: A patch you scheduled for European mid-day backport window is about to be deployed. 
Please be around during the process. Note: If you break AND fix the wikis, you will be rewarded with a sticker. [11:00:11] o/ [11:00:29] o/ [11:01:35] jan_drewniak: do you want to deploy your changes yourself? [11:02:14] * ma is here, sorry for the ~one minute delay [11:02:51] Lucas_WMDE: I'd prefer some help (got kids running around me) [11:02:56] ok sure [11:03:02] * Lucas_WMDE fires up the terminals [11:03:59] (03PS4) 10Jdrewniak: Enable new Vector Languages-in-header feature & AB test for pilot wikis [mediawiki-config] - 10https://gerrit.wikimedia.org/r/700705 (https://phabricator.wikimedia.org/T269093) [11:04:15] (03CR) 10Jdrewniak: Enable new Vector Languages-in-header feature & AB test for pilot wikis (031 comment) [mediawiki-config] - 10https://gerrit.wikimedia.org/r/700705 (https://phabricator.wikimedia.org/T269093) (owner: 10Jdrewniak) [11:04:22] (03CR) 10Lucas Werkmeister (WMDE): [C: 03+2] launchULS: Add context to interface.language.change hook [extensions/UniversalLanguageSelector] (wmf/1.37.0-wmf.10) - 10https://gerrit.wikimedia.org/r/700633 (https://phabricator.wikimedia.org/T280770) (owner: 10Jdrewniak) [11:04:38] I assume we deploy the ULS backport first? [11:04:40] Lucas_WMDE: the ULS one has to go first [11:04:43] ok [11:05:09] can that one be tested on its own? [11:06:20] !log kormat@cumin1001 dbctl commit (dc=all): 'db1123 (re)pooling @ 100%: reimaged to buster T283131', diff saved to https://phabricator.wikimedia.org/P16691 and previous config saved to /var/cache/conftool/dbconfig/20210622-110619-kormat.json [11:06:24] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [11:06:25] T283131: Upgrade s3 to Debian Buster and MariaDB 10.4 - https://phabricator.wikimedia.org/T283131 [11:08:06] oof, that backport needs 12 more minutes in CI… [11:08:25] looking at ma’s patch in the meantime [11:08:40] Lucas_WMDE: oy, well we it can be tested on it's own. 
[11:10:18] ma: I think I can deploy your config change while we wait for CI on jan_drewniak’s changes [11:10:26] will you be able to test it on mwdebug? [11:10:36] Lucas_WMDE: yup [11:10:53] looking at listgrouprights on mwdebug will suffice [11:11:41] ok [11:11:50] I’m not sure if it should be explicitly unset on 'confirmed' as well, or if 'autoconfirmed' is enough [11:11:53] but we can try that there [11:12:15] (I peeked at $wgGroupPermissions in shell.php and saw the right in both groups) [11:12:25] (03PS9) 10Lucas Werkmeister (WMDE): enwiki: Remove 'collectionsaveascommunitypage' from the 'autoconfirmed' user group [mediawiki-config] - 10https://gerrit.wikimedia.org/r/698041 (https://phabricator.wikimedia.org/T283523) (owner: 10MarcoAurelio) [11:12:34] urbanecm said confirmed inherits what autoconfirmed has [11:12:44] ah, yeah, I just saw that [11:12:46] that's right [11:12:49] ok then let’s try it [11:12:51] I trust he knows better :) [11:12:53] (03CR) 10Lucas Werkmeister (WMDE): [C: 03+2] enwiki: Remove 'collectionsaveascommunitypage' from the 'autoconfirmed' user group [mediawiki-config] - 10https://gerrit.wikimedia.org/r/698041 (https://phabricator.wikimedia.org/T283523) (owner: 10MarcoAurelio) [11:13:04] and you too I mean, 'you' as in 'you all' [11:13:57] (03Merged) 10jenkins-bot: enwiki: Remove 'collectionsaveascommunitypage' from the 'autoconfirmed' user group [mediawiki-config] - 10https://gerrit.wikimedia.org/r/698041 (https://phabricator.wikimedia.org/T283523) (owner: 10MarcoAurelio) [11:14:00] there’s also a beta-only config change that hadn’t been fetched before [11:14:06] pulling that without sync [11:14:19] (I6dc38fd81c3b0f0af4fe3609e5533200ff1f68c8, for the record) [11:14:37] ma: change should be on mwdebug1001, please test :) [11:15:02] on it [11:15:15] (03PS13) 10Jbond: IDM: create new idm library with logoutd base class [software/pywmflib] - 10https://gerrit.wikimedia.org/r/695341 (https://phabricator.wikimedia.org/T283242) [11:15:25] in 
shell.php, collectionsaveascommunitypage is false on confirmed and autoconfirmed [11:15:33] so looks like urbanecm was right d) [11:15:35] * :) [11:15:49] (03PS5) 10Jdrewniak: Enable new Vector Languages-in-header feature & AB test for pilot wikis [mediawiki-config] - 10https://gerrit.wikimedia.org/r/700705 (https://phabricator.wikimedia.org/T269093) [11:15:50] Lucas_WMDE: it works [11:15:55] ok! [11:15:58] syncing [11:16:07] thanks [11:17:21] !log lucaswerkmeister-wmde@deploy1002 Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:698041|enwiki: Remove 'collectionsaveascommunitypage' from the 'autoconfirmed' user group (T283523)]] (duration: 00m 56s) [11:17:25] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [11:17:26] T283523: Remove the option to save books to the book namespace from Special:Book at enwiki - https://phabricator.wikimedia.org/T283523 [11:17:39] (03CR) 10jerkins-bot: [V: 04-1] IDM: create new idm library with logoutd base class [software/pywmflib] - 10https://gerrit.wikimedia.org/r/695341 (https://phabricator.wikimedia.org/T283242) (owner: 10Jbond) [11:17:53] thanks Lucas_WMDE [11:17:56] np :) [11:18:07] * ma departs [11:18:10] adiós amigos [11:21:17] (03CR) 10Jbond: [C: 03+1] "> Patch Set 7:" (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/700389 (https://phabricator.wikimedia.org/T283242) (owner: 10Muehlenhoff) [11:22:18] (03Merged) 10jenkins-bot: launchULS: Add context to interface.language.change hook [extensions/UniversalLanguageSelector] (wmf/1.37.0-wmf.10) - 10https://gerrit.wikimedia.org/r/700633 (https://phabricator.wikimedia.org/T280770) (owner: 10Jdrewniak) [11:22:56] oh, this was targeting wmf.10? [11:23:05] …how are we going to test it then? 
[11:23:25] wmf.10 doesn’t even exist on deploy1002 yet afaict [11:24:11] i don't think you can test it then [11:24:21] yeah I think this should not have been merged [11:24:43] I guess I should leave a note on the train blocker task [11:24:53] aw shoot, sorry my bad, I got mixed up with the train schedule [11:24:55] since something was backported between branch cut and initial rollout [11:25:02] and I vaguely recall commits being lost in similar situations before [11:25:23] wait, nevermind, wmf.10 is the one that never gets deployed [11:25:26] wmf.11 would be the next one [11:25:30] hehe [11:25:32] so no harm done [11:25:35] so, no problem for the train [11:25:58] jan_drewniak: do you want to try a backport to wmf.9 or wait for wmf.11? [11:26:30] Lucas_WMDE: Yeah, I think I'll do the backport to wmf.9 [11:26:34] ok [11:26:57] (03CR) 10Lucas Werkmeister (WMDE): "Belated note that this was targeting the wrong branch – wmf.10 never got deployed (WMF all-hands or something, see T281151)." [extensions/UniversalLanguageSelector] (wmf/1.37.0-wmf.10) - 10https://gerrit.wikimedia.org/r/700633 (https://phabricator.wikimedia.org/T280770) (owner: 10Jdrewniak) [11:27:02] 10SRE, 10DC-Ops, 10Infrastructure-Foundations, 10netops: Collect and archive KML/KMZ fiber path files for new and existing network circuits - https://phabricator.wikimedia.org/T285136 (10ayounsi) p:05Medium→03Low [11:27:08] (03PS1) 10Jdrewniak: launchULS: Add context to interface.language.change hook [extensions/UniversalLanguageSelector] (wmf/1.37.0-wmf.9) - 10https://gerrit.wikimedia.org/r/700730 (https://phabricator.wikimedia.org/T280770) [11:27:37] Lucas_WMDE: sorry about that, here's the correct backport https://gerrit.wikimedia.org/r/c/mediawiki/extensions/UniversalLanguageSelector/+/700730 [11:27:42] ok [11:27:48] Lucas_WMDE: in theory, merging to wmf.X that's not on deployment should work now (as the branching bot applies +2 to its commits), but don't quote me on that :) [11:27:52] (03CR) 10Lucas 
Werkmeister (WMDE): [C: 03+2] launchULS: Add context to interface.language.change hook [extensions/UniversalLanguageSelector] (wmf/1.37.0-wmf.9) - 10https://gerrit.wikimedia.org/r/700730 (https://phabricator.wikimedia.org/T280770) (owner: 10Jdrewniak) [11:27:58] ok ^^ [11:28:04] apparently wmf.11 hasn’t actually been branched yet [11:28:16] or at least I don’t see such a remote branch in the ULS repo [11:28:34] see T277507 for reference [11:28:34] T277507: Core subproject commits aren't updated for backports if wmf branch commit has not yet been merged. - https://phabricator.wikimedia.org/T277507 [11:31:45] 10SRE, 10SRE-Access-Requests: Access to ptwikinews Search Console for Edu - https://phabricator.wikimedia.org/T285091 (10jbond) thanks @KFrancis @Edu As you dont have an NDA on record you will need a WMF sponsor as well as arranging the NDA with @KFrancis. I wonder if it would be possible/easier to go the... [11:35:41] !log installing fluidsynth security updates [11:35:43] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [11:38:28] 10SRE, 10observability, 10Datacenter-Switchover, 10Patch-For-Review: Switchover thanos-query and thanos-swift services as part of DC switchover - https://phabricator.wikimedia.org/T285273 (10jbond) p:05Triage→03Medium [11:38:51] (03PS1) 10Muehlenhoff: Add library hint for fluidsynth [puppet] - 10https://gerrit.wikimedia.org/r/700873 [11:41:14] 10SRE, 10Infrastructure-Foundations, 10Mail, 10Wikimedia-Mailing-lists: Post to WikimediaAnnounce-l not forwarded to Wikimedia-l - https://phabricator.wikimedia.org/T115300 (10jbond) p:05Triage→03Medium [11:43:31] (03CR) 10Muehlenhoff: [C: 03+2] Add library hint for fluidsynth [puppet] - 10https://gerrit.wikimedia.org/r/700873 (owner: 10Muehlenhoff) [11:47:59] ohh, CI is about to finish on the backport… [11:48:18] (03Merged) 10jenkins-bot: launchULS: Add context to interface.language.change hook [extensions/UniversalLanguageSelector] (wmf/1.37.0-wmf.9) - 
10https://gerrit.wikimedia.org/r/700730 (https://phabricator.wikimedia.org/T280770) (owner: 10Jdrewniak) [11:49:03] jan_drewniak: the backport should be on mwdebug1001, can you test it? [11:49:27] (also, `scap pull` printed some messages about being unable to delete non-empty directories below wmf.1, but I assume I can ignore that) [11:50:59] Lucas_WMDE: thanks! We're testing the ULS change now [11:51:15] ok [11:53:24] Lucas_WMDE: ULS change looks good! we're ready for the AB test now :) [11:53:31] ok, syncing the ULS change then [11:54:56] !log lucaswerkmeister-wmde@deploy1002 Synchronized php-1.37.0-wmf.9/extensions/UniversalLanguageSelector/: Backport: [[gerrit:700730|launchULS: Add context to interface.language.change hook (T280770)]] (duration: 00m 57s) [11:55:00] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [11:55:01] T280770: Instrumentation QA for language switching - https://phabricator.wikimedia.org/T280770 [11:55:34] (03CR) 10Lucas Werkmeister (WMDE): [C: 03+2] Enable new Vector Languages-in-header feature & AB test for pilot wikis [mediawiki-config] - 10https://gerrit.wikimedia.org/r/700705 (https://phabricator.wikimedia.org/T269093) (owner: 10Jdrewniak) [11:56:19] (03Merged) 10jenkins-bot: Enable new Vector Languages-in-header feature & AB test for pilot wikis [mediawiki-config] - 10https://gerrit.wikimedia.org/r/700705 (https://phabricator.wikimedia.org/T269093) (owner: 10Jdrewniak) [11:58:13] !log lucaswerkmeister-wmde@mwdebug1001:~$ sudo -u mwdeploy sh -c 'rm /srv/mediawiki/php-1.37.0-wmf.1/cache/l10n/l10n_cache-*.cdb && rmdir /srv/mediawiki/php-1.37.0-wmf.1/cache/l10n && rmdir /srv/mediawiki/php-1.37.0-wmf.1/cache && rmdir /srv/mediawiki/php-1.37.0-wmf.1' # per comments in T157030 and similar tasks [11:58:17] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [11:58:18] T157030: cannot delete non-empty directory: php-1.29.0-wmf.3 messages on 'scap sync' on mwdebug1002 - 
https://phabricator.wikimedia.org/T157030 [11:58:53] jan_drewniak: config change should also be on mwdebug1001 now [11:59:06] (and the scap pull messages about wmf.1 went away, yay) [12:00:19] (03PS1) 10Ema: varnishmtail: add Error and FetchError VSL tags [puppet] - 10https://gerrit.wikimedia.org/r/700876 (https://phabricator.wikimedia.org/T284576) [12:01:54] jan_drewniak: can you test the AB change or should I just sync it? [12:01:59] (03CR) 10Ema: "check experimental" [puppet] - 10https://gerrit.wikimedia.org/r/700876 (https://phabricator.wikimedia.org/T284576) (owner: 10Ema) [12:02:37] Lucas_WMDE: yup, test looks good! ready to sync [12:02:41] ok! [12:03:54] !log lucaswerkmeister-wmde@deploy1002 Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:700705|Enable new Vector Languages-in-header feature & AB test for pilot wikis (T269093)]] (duration: 00m 56s) [12:03:59] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:04:00] T269093: Deploy new language switching location to test wikis and begin A/B test pt 1 - https://phabricator.wikimedia.org/T269093 [12:04:14] !log backport+config window done [12:04:17] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:10:08] (03CR) 10Jbond: "Errors seems to be unrelated (are related to missing types-* packages)" [software/pywmflib] - 10https://gerrit.wikimedia.org/r/695341 (https://phabricator.wikimedia.org/T283242) (owner: 10Jbond) [12:11:50] Lucas_WMDE: Thanks! 
[12:11:55] (03CR) 10Filippo Giunchedi: [C: 03+1] "LGTM, nit inline" (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/700862 (https://phabricator.wikimedia.org/T283049) (owner: 10Effie Mouzeli) [12:12:04] (03CR) 10Filippo Giunchedi: [C: 03+1] profile::thanos::swift: add fake credentials for tegola_prod [labs/private] - 10https://gerrit.wikimedia.org/r/700863 (https://phabricator.wikimedia.org/T283049) (owner: 10Effie Mouzeli) [12:17:28] (03PS9) 10Muehlenhoff: Add helper tool for returning a user's current TGT [puppet] - 10https://gerrit.wikimedia.org/r/700389 (https://phabricator.wikimedia.org/T283242) [12:21:01] 10SRE, 10SRE-Access-Requests: Access to ptwikinews Search Console for Edu - https://phabricator.wikimedia.org/T285091 (10Edu) @jbond it's okay to follow Urbanecm's suggestion. [12:22:18] (03CR) 10Muehlenhoff: "> Patch Set 8: Code-Review+1" (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/700389 (https://phabricator.wikimedia.org/T283242) (owner: 10Muehlenhoff) [12:27:14] 10SRE, 10DC-Ops, 10Infrastructure-Foundations, 10netops: Collect and archive KML/KMZ fiber path files for new and existing network circuits - https://phabricator.wikimedia.org/T285136 (10faidon) With regards to prioritization: I'd like to for sure collect KMLs ~now for the wavelengths that are under procur... [12:34:51] Hey! Uhh. Is there a protocol for debugging code that's been deployed (on mwdebug1001, say)? Do I just shout out in here? [12:36:10] (03PS1) 10MSantos: Trigger tegola latest build [software/tegola] (wmf/v0.14.x) - 10https://gerrit.wikimedia.org/r/700893 [12:38:22] Lucas_WMDE, urbanecm: Perhaps you know the answer to the above? 
[12:38:32] (03CR) 10MSantos: [C: 03+2] Trigger tegola latest build [software/tegola] (wmf/v0.14.x) - 10https://gerrit.wikimedia.org/r/700893 (owner: 10MSantos) [12:38:46] phuedx: depends on what's "to debug" [12:38:56] if you want to make changes there, say it here (and say when you're done) [12:39:22] (03CR) 10Arturo Borrero Gonzalez: grid: php config don't rely on php being installed by puppet (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/700186 (owner: 10David Caro) [12:39:29] if you just want to use shell.php, I'd just do it (as long as you're not doing any write operations, of course) [12:39:29] (03Merged) 10jenkins-bot: Trigger tegola latest build [software/tegola] (wmf/v0.14.x) - 10https://gerrit.wikimedia.org/r/700893 (owner: 10MSantos) [12:39:35] does that make sense phuedx ? [12:40:05] Noted. A particular block of code doesn't seem to be working the way I or jan_drewniak expected and I'd like to inspect it [12:40:21] urbanecm: That makes sense, yeah. Thanks. Definitely no DB access required [12:40:55] I meant that doing sth like `var_dump($wgGroupPermissions);` is ok, as this cannot change the state of the system [12:42:13] i would definitely note here if you call some bits of the code that can write [12:42:18] it's always better to note something than not :) [12:42:58] ^ Sound advice [12:43:28] Thanks, urbanecm [12:43:57] any time. Ping me if my help is needed, going to code my own (MW) stuff now :) [12:46:51] 10SRE, 10MW-on-K8s, 10serviceops: Install wiki-specific php extensions in the mediawiki production image - https://phabricator.wikimedia.org/T285309 (10Joe) [12:47:24] 10SRE, 10MW-on-K8s, 10serviceops, 10Patch-For-Review: Make all httpbb tests pass on the mwdebug deployment. - https://phabricator.wikimedia.org/T285298 (10Joe) [12:48:09] (03PS1) 10Kormat: mariadb: Monitor pt-heartbeat for expected status. 
[puppet] - 10https://gerrit.wikimedia.org/r/700898 (https://phabricator.wikimedia.org/T285079) [12:49:39] (03PS2) 10Ema: varnishmtail: add Error and FetchError VSL tags [puppet] - 10https://gerrit.wikimedia.org/r/700876 (https://phabricator.wikimedia.org/T284576) [12:49:44] I’ve also been told that editing the code (sudo -u mwdeploy -e PATH) can be acceptable (be careful ofc) [12:49:45] (03CR) 10Kormat: [V: 03+1] "PCC SUCCESS (DIFF 6): https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler/29955/console" [puppet] - 10https://gerrit.wikimedia.org/r/700898 (https://phabricator.wikimedia.org/T285079) (owner: 10Kormat) [12:49:50] in case shell.php is not enough [12:52:36] (03PS2) 10Kormat: mariadb: Monitor pt-heartbeat for expected status. [puppet] - 10https://gerrit.wikimedia.org/r/700898 (https://phabricator.wikimedia.org/T285079) [12:54:04] (03CR) 10Kormat: [V: 03+1] "PCC SUCCESS (DIFF 6): https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler/29956/console" [puppet] - 10https://gerrit.wikimedia.org/r/700898 (https://phabricator.wikimedia.org/T285079) (owner: 10Kormat) [12:54:51] Lucas_WMDE: yeah, but that definitely should be noted here [12:55:42] (03PS3) 10Kormat: mariadb: Monitor pt-heartbeat for expected status. [puppet] - 10https://gerrit.wikimedia.org/r/700898 (https://phabricator.wikimedia.org/T285079) [12:56:41] (03CR) 10Kormat: [V: 03+1] "PCC SUCCESS (DIFF 1): https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler/29957/console" [puppet] - 10https://gerrit.wikimedia.org/r/700898 (https://phabricator.wikimedia.org/T285079) (owner: 10Kormat) [12:57:58] (03PS4) 10Kormat: mariadb: Monitor pt-heartbeat for expected status. 
[puppet] - 10https://gerrit.wikimedia.org/r/700898 (https://phabricator.wikimedia.org/T285079) [12:58:00] (03CR) 10RLazarus: [C: 03+1] httpbb: move most tests to https [puppet] - 10https://gerrit.wikimedia.org/r/700856 (https://phabricator.wikimedia.org/T285298) (owner: 10Giuseppe Lavagetto) [12:59:56] (03CR) 10Kormat: [V: 03+1] "PCC SUCCESS (DIFF 6): https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler/29958/console" [puppet] - 10https://gerrit.wikimedia.org/r/700898 (https://phabricator.wikimedia.org/T285079) (owner: 10Kormat) [13:00:25] (03CR) 10David Caro: grid: php config don't rely on php being installed by puppet (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/700186 (owner: 10David Caro) [13:03:02] 10SRE, 10MW-on-K8s, 10serviceops: Install wiki-specific php extensions in the mediawiki production image - https://phabricator.wikimedia.org/T285309 (10Joe) p:05Triage→03High [13:03:21] I'm messing around on mwdebug1001. Hopefully it'll take < 5 minutes. I'll message when done [13:03:49] (03CR) 1020after4: [C: 03+2] selenium: Replace selenium npm script with selenium-test [phabricator/deployment] (wmf/stable) - 10https://gerrit.wikimedia.org/r/700378 (https://phabricator.wikimedia.org/T274579) (owner: 10Zfilipin) [13:04:17] (03CR) 1020after4: [V: 03+2 C: 03+2] selenium: Replace selenium npm script with selenium-test [phabricator/deployment] (wmf/stable) - 10https://gerrit.wikimedia.org/r/700378 (https://phabricator.wikimedia.org/T274579) (owner: 10Zfilipin) [13:06:15] (03PS1) 10Giuseppe Lavagetto: pipeline: install php extensions [mediawiki-config] - 10https://gerrit.wikimedia.org/r/700903 (https://phabricator.wikimedia.org/T285309) [13:07:48] Done. 
scap pull-ing to tidy up [13:16:02] (03PS1) 10Phuedx: Correctly enable Vector language switcher treatment A/B test [mediawiki-config] - 10https://gerrit.wikimedia.org/r/700905 (https://phabricator.wikimedia.org/T269093) [13:17:55] ^ That patch follows up the one deployed during the earlier backport window [13:18:11] Thanks for your advice and patience, urbanecm and Lucas_WMDE [13:18:15] any time :) [13:18:58] should we deploy that now or can it wait for the next regular window? [13:19:38] ^ [13:20:03] If it could be deployed out-of-window, I'd appreciate it [13:20:12] sure, I can do it now [13:20:16] jouncebot: now [13:20:16] No deployments scheduled for the next 2 hour(s) and 39 minute(s) [13:20:20] yeah, plenty of time [13:21:12] (03CR) 10Lucas Werkmeister (WMDE): [C: 03+2] Correctly enable Vector language switcher treatment A/B test [mediawiki-config] - 10https://gerrit.wikimedia.org/r/700905 (https://phabricator.wikimedia.org/T269093) (owner: 10Phuedx) [13:22:02] (03Merged) 10jenkins-bot: Correctly enable Vector language switcher treatment A/B test [mediawiki-config] - 10https://gerrit.wikimedia.org/r/700905 (https://phabricator.wikimedia.org/T269093) (owner: 10Phuedx) [13:22:32] does it make sense to test it on mwdebug again? ^^ [13:23:00] Lucas_WMDE: Sure [13:23:02] Yes [13:23:06] ok [13:24:17] 10SRE, 10Infrastructure-Foundations, 10netops: Cloud IPv6 subnets - https://phabricator.wikimedia.org/T187929 (10cmooney) Having discussed with @ayounsi we were thinking it may be better to assign aggregates less sparsely, as follows: ` 2a02:ec80:100::/40 eqiad 2a02:ec80:200::/40 codfw 2a02:ec80:300::/40 esa... [13:25:51] Lucas_WMDE: mwdebug1001? 
[13:26:05] oh wait [13:26:08] sorry [13:26:12] I didn’t actually scap pull yet :D [13:26:13] (03CR) 10Ottomata: [C: 03+1] "Don't know the syntax but +1 to the idea" [homer/public] - 10https://gerrit.wikimedia.org/r/698202 (https://phabricator.wikimedia.org/T279429) (owner: 10Ayounsi) [13:26:18] *now* mwdebug1001, yes [13:26:21] :D [13:29:00] !log reindexing German wikis on elastic@eqiad, elastic@codfw, and cloudelastic complete (T284185) [13:29:05] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:29:07] T284185: Reindex German, Dutch, and Portugese Wikis - https://phabricator.wikimedia.org/T284185 [13:29:15] !log urbanecm@mwmaint1002:~$ foreachwikiindblist growthexperiments extensions/WikimediaMaintenance/createExtensionTables.php growthexperiments # T266913 [13:29:18] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:29:19] T266913: Add a link engineering: create tables in Wikimedia production - https://phabricator.wikimedia.org/T266913 [13:30:45] (03CR) 10Marostegui: "> Patch Set 4: Verified+1" [puppet] - 10https://gerrit.wikimedia.org/r/700898 (https://phabricator.wikimedia.org/T285079) (owner: 10Kormat) [13:32:04] Lucas_WMDE: Thanks. 
I think I've gone through all of the variations :) [13:32:11] ok :) [13:32:46] (03Abandoned) 10Dzahn: gitlab: ensure backup dirs exist, add parameter for config backup [puppet] - 10https://gerrit.wikimedia.org/r/700601 (https://phabricator.wikimedia.org/T274463) (owner: 10Dzahn) [13:33:29] !log lucaswerkmeister-wmde@deploy1002 Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:700905|Correctly enable Vector language switcher treatment A/B test (T269093)]] (duration: 00m 57s) [13:33:33] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:33:34] T269093: Deploy new language switching location to test wikis and begin A/B test pt 1 - https://phabricator.wikimedia.org/T269093 [13:33:52] (03PS5) 10Dzahn: profile::gitlab: ensure backup dirs exist in production [puppet] - 10https://gerrit.wikimedia.org/r/700622 (https://phabricator.wikimedia.org/T274463) [13:34:49] Thanks again Lucas_WMDE [13:34:51] (03CR) 10Kormat: [V: 03+1] "> Patch Set 4:" [puppet] - 10https://gerrit.wikimedia.org/r/700898 (https://phabricator.wikimedia.org/T285079) (owner: 10Kormat) [13:34:52] no problem [13:34:56] let’s hope it really works this time ^^ [13:35:17] (03CR) 10jerkins-bot: [V: 04-1] profile::gitlab: ensure backup dirs exist in production [puppet] - 10https://gerrit.wikimedia.org/r/700622 (https://phabricator.wikimedia.org/T274463) (owner: 10Dzahn) [13:36:44] (03PS1) 10Urbanecm: Enable Growth features in dark mode at nlwiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/700909 (https://phabricator.wikimedia.org/T285254) [13:37:41] !log [urbanecm@mwmaint1002 /srv/mediawiki-staging]$ mwscript extensions/WikimediaMaintenance/createExtensionTables.php --wiki=nlwiki growthexperiments # T285254 [13:37:45] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:37:46] T285254: Deploy Growth features on Dutch Wikipedia - https://phabricator.wikimedia.org/T285254 [13:37:56] (03CR) 10Dzahn: "nitpick: why not use "querybuilder" or 
"query-builder" eveywhere, both in the URL and the filesystem" [puppet] - 10https://gerrit.wikimedia.org/r/700317 (https://phabricator.wikimedia.org/T266703) (owner: 10Ladsgroup) [13:38:52] !log [urbanecm@mwmaint1002 /srv/mediawiki-staging/php]$ mwscript extensions/GrowthExperiments/maintenance/initWikiConfig.php --wiki=nlwiki --phab=T285254 # T285254 [13:38:57] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:42:11] (03CR) 10Marostegui: [C: 03+1] "thanks!" [puppet] - 10https://gerrit.wikimedia.org/r/700898 (https://phabricator.wikimedia.org/T285079) (owner: 10Kormat) [13:42:38] (03CR) 10Herron: [C: 03+1] logstash: extend ssd retention to 15d [puppet] - 10https://gerrit.wikimedia.org/r/700841 (owner: 10Filippo Giunchedi) [13:42:42] (03PS6) 10Dzahn: profile::gitlab: ensure backup dirs exist in production [puppet] - 10https://gerrit.wikimedia.org/r/700622 (https://phabricator.wikimedia.org/T274463) [13:43:24] (03CR) 10Herron: [C: 03+1] alertmanager: dashboard default to 450px wide groups [puppet] - 10https://gerrit.wikimedia.org/r/700838 (https://phabricator.wikimedia.org/T281454) (owner: 10Filippo Giunchedi) [13:46:20] (03CR) 10Herron: [C: 03+1] alertmanager: disable dashboard alert history on grafana/librenms [puppet] - 10https://gerrit.wikimedia.org/r/700837 (https://phabricator.wikimedia.org/T281454) (owner: 10Filippo Giunchedi) [13:49:15] (03CR) 10Dzahn: [C: 03+2] "@Jelto this shows compiler diff: https://puppet-compiler.wmflabs.org/compiler1002/29959/gitlab1001.wikimedia.org/index.html" [puppet] - 10https://gerrit.wikimedia.org/r/700622 (https://phabricator.wikimedia.org/T274463) (owner: 10Dzahn) [13:49:31] !log disabling puppet on A:db-all for T285079 [13:49:35] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:49:37] T285079: Investigate pt-heartbeat-wikimedia failure modes - https://phabricator.wikimedia.org/T285079 [13:50:06] (03CR) 10Kormat: [V: 03+1 C: 03+2] mariadb: Monitor pt-heartbeat for 
expected status. [puppet] - 10https://gerrit.wikimedia.org/r/700898 (https://phabricator.wikimedia.org/T285079) (owner: 10Kormat) [13:50:23] (03PS5) 10Dzahn: gitlab::backup: create backup paths with wmflib::dir::mkdir_p [puppet] - 10https://gerrit.wikimedia.org/r/700595 (https://phabricator.wikimedia.org/T274463) [13:51:15] (03CR) 10Dzahn: "nothing happened but we could see it manages the directories:" [puppet] - 10https://gerrit.wikimedia.org/r/700622 (https://phabricator.wikimedia.org/T274463) (owner: 10Dzahn) [13:52:17] (03CR) 10Dzahn: [C: 03+2] "not used in prod but one day it will be useful, when we merge classes" [puppet] - 10https://gerrit.wikimedia.org/r/700595 (https://phabricator.wikimedia.org/T274463) (owner: 10Dzahn) [13:54:06] (03CR) 10Dzahn: [C: 03+1] "after both https://gerrit.wikimedia.org/r/c/operations/puppet/+/700622 and https://gerrit.wikimedia.org/r/c/operations/puppet/+/700084 are" [puppet] - 10https://gerrit.wikimedia.org/r/697850 (https://phabricator.wikimedia.org/T274463) (owner: 10Dzahn) [13:54:25] (03PS5) 10Dzahn: bacula/gitlab: add a backup::set for gitlab and use it [puppet] - 10https://gerrit.wikimedia.org/r/697850 (https://phabricator.wikimedia.org/T274463) [13:54:27] (03CR) 10Herron: "I'm in favor of removing this, but still see a fair amount of legacy list mail in the exim logs." 
[puppet] - 10https://gerrit.wikimedia.org/r/681242 (https://phabricator.wikimedia.org/T280472) (owner: 10Legoktm) [13:55:01] (03CR) 10Dzahn: [C: 03+1] "second link was https://gerrit.wikimedia.org/r/c/operations/gitlab-ansible/+/700084 of course" [puppet] - 10https://gerrit.wikimedia.org/r/697850 (https://phabricator.wikimedia.org/T274463) (owner: 10Dzahn) [13:57:09] (03CR) 10Herron: [C: 03+1] "cursory review lgtm" [puppet] - 10https://gerrit.wikimedia.org/r/699254 (https://phabricator.wikimedia.org/T234565) (owner: 10Cwhite) [13:58:03] (03CR) 10Dzahn: "on backup1001: Notice: /Stage[main]/Profile::Backup::Filesets/Bacula::Director::Fileset[gitlab]/File[/etc/bacula/conf.d/fileset-gitlab.con" [puppet] - 10https://gerrit.wikimedia.org/r/697850 (https://phabricator.wikimedia.org/T274463) (owner: 10Dzahn) [13:59:11] (03CR) 10Dzahn: "in a little while we should see the fileset appear in bconsole:" [puppet] - 10https://gerrit.wikimedia.org/r/697850 (https://phabricator.wikimedia.org/T274463) (owner: 10Dzahn) [14:01:35] (03PS1) 10ArielGlenn: Give Holger access to icinga commands [puppet] - 10https://gerrit.wikimedia.org/r/700916 (https://phabricator.wikimedia.org/T277635) [14:01:39] !log start updating analytics firewall rules to capirca generated ones on cr1-eqiad - T279429 [14:01:43] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:01:44] T279429: Audit analytics firewall filters - https://phabricator.wikimedia.org/T279429 [14:02:24] (03CR) 10jerkins-bot: [V: 04-1] Give Holger access to icinga commands [puppet] - 10https://gerrit.wikimedia.org/r/700916 (https://phabricator.wikimedia.org/T277635) (owner: 10ArielGlenn) [14:03:20] (03PS2) 10ArielGlenn: Give Holger access to icinga commands [puppet] - 10https://gerrit.wikimedia.org/r/700916 (https://phabricator.wikimedia.org/T277635) [14:04:34] 10SRE, 10SRE-Access-Requests: Access to ptwikinews Search Console for Edu - https://phabricator.wikimedia.org/T285091 (10jbond) 05Stalled→03Resolved 
Thanks @Edu i will resolve this ticket for now but please re-open if we can help further, thanks [14:05:41] (03CR) 10ArielGlenn: "Adding Leo per discussion with Moritz" [puppet] - 10https://gerrit.wikimedia.org/r/700916 (https://phabricator.wikimedia.org/T277635) (owner: 10ArielGlenn) [14:06:21] (03CR) 10Jcrespo: "> Patch Set 5:" [puppet] - 10https://gerrit.wikimedia.org/r/697850 (https://phabricator.wikimedia.org/T274463) (owner: 10Dzahn) [14:06:35] (03CR) 10Jbond: [C: 03+1] Add helper tool for returning a user's current TGT [puppet] - 10https://gerrit.wikimedia.org/r/700389 (https://phabricator.wikimedia.org/T283242) (owner: 10Muehlenhoff) [14:06:47] topranks: fyi ^ [14:07:27] I'm going to do several passes, first anything that should be NOOP (adding comments, re-ordering, adding hosts) then a 2nd one for anything that will remove decom hosts [14:10:13] (03CR) 10Dzahn: "[backup1001:~] $ echo "show job" | sudo bconsole | grep gitlab" [puppet] - 10https://gerrit.wikimedia.org/r/697850 (https://phabricator.wikimedia.org/T274463) (owner: 10Dzahn) [14:14:33] (03PS4) 10Volans: ganeti-netbox-sync: Run InterfaceAutomation when necessary [software/netbox-extras] - 10https://gerrit.wikimedia.org/r/662762 (https://phabricator.wikimedia.org/T263768) (owner: 10CRusnov) [14:14:35] (03PS1) 10Volans: ganeti_netbox_sync: simplify logging [software/netbox-extras] - 10https://gerrit.wikimedia.org/r/700918 [14:15:22] (03CR) 10Volans: [V: 03+1] "Tested on netbox-next" [software/netbox-extras] - 10https://gerrit.wikimedia.org/r/662762 (https://phabricator.wikimedia.org/T263768) (owner: 10CRusnov) [14:16:10] 10Puppet, 10Infrastructure-Foundations, 10GitLab (Initialization), 10Patch-For-Review, and 3 others: Puppetise gitlab-ansible playbook - https://phabricator.wikimedia.org/T283076 (10jbond) [14:16:33] 10SRE, 10SRE-Access-Requests: Access to ptwikinews Search Console for Edu - https://phabricator.wikimedia.org/T285091 (10Aklapper) 05Resolved→03Declined This was not done thus changing 
task status [14:22:33] (03CR) 10LMata: [C: 03+2] Give Holger access to icinga commands [puppet] - 10https://gerrit.wikimedia.org/r/700916 (https://phabricator.wikimedia.org/T277635) (owner: 10ArielGlenn) [14:23:59] (03CR) 10Filippo Giunchedi: [C: 03+2] alertmanager: dashboard default to 450px wide groups [puppet] - 10https://gerrit.wikimedia.org/r/700838 (https://phabricator.wikimedia.org/T281454) (owner: 10Filippo Giunchedi) [14:24:02] (03CR) 10Filippo Giunchedi: [C: 03+2] alertmanager: disable dashboard alert history on grafana/librenms [puppet] - 10https://gerrit.wikimedia.org/r/700837 (https://phabricator.wikimedia.org/T281454) (owner: 10Filippo Giunchedi) [14:24:28] (03PS2) 10Filippo Giunchedi: alertmanager: dashboard default to 450px wide groups [puppet] - 10https://gerrit.wikimedia.org/r/700838 (https://phabricator.wikimedia.org/T281454) [14:24:37] (03CR) 10Filippo Giunchedi: [V: 03+2 C: 03+2] alertmanager: dashboard default to 450px wide groups [puppet] - 10https://gerrit.wikimedia.org/r/700838 (https://phabricator.wikimedia.org/T281454) (owner: 10Filippo Giunchedi) [14:25:55] (03CR) 10Dzahn: [C: 03+2] "changes are only in comment lines / code examples" [puppet] - 10https://gerrit.wikimedia.org/r/700598 (owner: 10Dzahn) [14:27:19] (03PS1) 10Jbond: P:logoutd: create wrapper script for calling logout.d scripts [puppet] - 10https://gerrit.wikimedia.org/r/700922 (https://phabricator.wikimedia.org/T283242) [14:28:12] (03CR) 10Jbond: [V: 03+1] "PCC SUCCESS (DIFF 1): https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler/29960/console" [puppet] - 10https://gerrit.wikimedia.org/r/700922 (https://phabricator.wikimedia.org/T283242) (owner: 10Jbond) [14:28:36] (03CR) 10Jbond: P:logoutd: create wrapper script for calling logout.d scripts [puppet] - 10https://gerrit.wikimedia.org/r/700922 (https://phabricator.wikimedia.org/T283242) (owner: 10Jbond) [14:29:45] (03CR) 10Dzahn: [C: 03+1] fix GitLab backup cronjob [gitlab-ansible] - 
10https://gerrit.wikimedia.org/r/700851 (https://phabricator.wikimedia.org/T274463) (owner: 10Jelto) [14:29:58] all done for cr1-eqiad - INFO:homer.transports.junos:Empty diff for cr1-eqiad.wikimedia.org, skipping device. [14:30:12] will wait a bit and do cr2-eqiad [14:30:55] (03CR) 10Ayounsi: [C: 03+1] ganeti_netbox_sync: simplify logging [software/netbox-extras] - 10https://gerrit.wikimedia.org/r/700918 (owner: 10Volans) [14:34:12] (03PS3) 10Elukey: Add istio 1.9.5 images [docker-images/production-images] - 10https://gerrit.wikimedia.org/r/700396 (https://phabricator.wikimedia.org/T278192) [14:34:51] (03CR) 10Elukey: "Improved a little the documentation in the proxyv2's Dockerfile about how to find the istio-proxy SHA to use." [docker-images/production-images] - 10https://gerrit.wikimedia.org/r/700396 (https://phabricator.wikimedia.org/T278192) (owner: 10Elukey) [14:35:00] !log Updated the Wikidata property suggester with data from the 2021-05-31 JSON dump (with pre-applied T132839 workarounds) [14:35:04] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:35:06] T132839: [RfC] Property suggester suggests human properties for non-human items - https://phabricator.wikimedia.org/T132839 [14:37:25] !log start updating analytics firewall rules to capirca generated ones on cr2-eqiad - T279429 [14:37:29] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:37:30] T279429: Audit analytics firewall filters - https://phabricator.wikimedia.org/T279429 [14:38:20] (03CR) 10Ayounsi: [C: 03+1] "Clean implementation!" 
[software/netbox-extras] - 10https://gerrit.wikimedia.org/r/662762 (https://phabricator.wikimedia.org/T263768) (owner: 10CRusnov) [14:39:54] (03PS14) 10Jbond: IDM: create new idm library with logoutd base class [software/pywmflib] - 10https://gerrit.wikimedia.org/r/695341 (https://phabricator.wikimedia.org/T283242) [14:39:56] (03PS1) 10Jbond: setup.py: add types dependencies for mypy [software/pywmflib] - 10https://gerrit.wikimedia.org/r/700926 [14:40:37] jouncebot: now [14:40:37] No deployments scheduled for the next 1 hour(s) and 19 minute(s) [14:40:41] (03CR) 10Volans: "early question, will do a full pass shortly" (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/700922 (https://phabricator.wikimedia.org/T283242) (owner: 10Jbond) [14:40:59] (03PS2) 10Jbond: P:logoutd: create wrapper script for calling logout.d scripts [puppet] - 10https://gerrit.wikimedia.org/r/700922 (https://phabricator.wikimedia.org/T283242) [14:41:58] XioNoX: /govol [14:42:04] ignore that :P [14:42:07] :) [14:43:26] (03CR) 10Volans: [C: 03+2] "Thanks!" [software/pywmflib] - 10https://gerrit.wikimedia.org/r/700926 (owner: 10Jbond) [14:43:58] !log dcausse@deploy1002 Started deploy [wdqs/wdqs@b082ccc]: wdqs 0.3.74 [14:44:01] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:45:36] (03PS4) 10Ayounsi: Manage analytics-in4/6 with Capirca [homer/public] - 10https://gerrit.wikimedia.org/r/698202 (https://phabricator.wikimedia.org/T279429) [14:46:38] (03Merged) 10jenkins-bot: setup.py: add types dependencies for mypy [software/pywmflib] - 10https://gerrit.wikimedia.org/r/700926 (owner: 10Jbond) [14:47:57] kormat, marostegui, fyi, lots of icinga unknown for DB hosts "Check unit status of pt-heartbeat-wikimedia" in https://icinga.wikimedia.org/alerts [14:48:05] XioNoX: ack, handling. [14:49:17] in theory they should all resolve themselves within 10 mins [14:49:36] did a new check just go in? 
[14:49:47] apergos: new check in, old check out [14:49:52] I saw something happening when I was applying a contact list addition on alert1001 [14:49:53] ah ha [14:49:59] and i forgot to run puppet on alert1001 [14:50:07] well I did that for you :-P [14:50:28] partially :) the rest should be done now. [14:50:41] 👍 [14:52:04] (03CR) 10Ayounsi: [C: 03+2] Manage analytics-in4/6 with Capirca [homer/public] - 10https://gerrit.wikimedia.org/r/698202 (https://phabricator.wikimedia.org/T279429) (owner: 10Ayounsi) [14:52:18] (03CR) 10Arturo Borrero Gonzalez: grid: php config don't rely on php being installed by puppet (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/700186 (owner: 10David Caro) [14:52:50] (03Merged) 10jenkins-bot: Manage analytics-in4/6 with Capirca [homer/public] - 10https://gerrit.wikimedia.org/r/698202 (https://phabricator.wikimedia.org/T279429) (owner: 10Ayounsi) [14:53:10] (03PS3) 10Jbond: P:logoutd: create wrapper script for calling logout.d scripts [puppet] - 10https://gerrit.wikimedia.org/r/700922 (https://phabricator.wikimedia.org/T283242) [14:53:54] (03CR) 10Jbond: "fixed" (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/700922 (https://phabricator.wikimedia.org/T283242) (owner: 10Jbond) [14:54:07] (03CR) 10Ayounsi: [C: 04-2] "This is already included with the move to Capirca based ACLs, so this change is not needed anymore."
[homer/public] - 10https://gerrit.wikimedia.org/r/694002 (https://phabricator.wikimedia.org/T283125) (owner: 10Razzi) [14:57:24] !log dcausse@deploy1002 Finished deploy [wdqs/wdqs@b082ccc]: wdqs 0.3.74 (duration: 13m 26s) [14:57:26] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:59:20] 10SRE, 10Analytics, 10Infrastructure-Foundations, 10netops, 10Patch-For-Review: Audit analytics firewall filters - https://phabricator.wikimedia.org/T279429 (10ayounsi) 05Open→03Resolved [15:01:00] (03PS1) 10Marostegui: orchestrator.conf: Do not promote sanitarium masters/backup hosts [puppet] - 10https://gerrit.wikimedia.org/r/700928 [15:01:23] (03CR) 10Marostegui: "I will do codfw in a different patch" [puppet] - 10https://gerrit.wikimedia.org/r/700928 (owner: 10Marostegui) [15:03:21] (03CR) 10Volans: [C: 03+2] ganeti_netbox_sync: simplify logging [software/netbox-extras] - 10https://gerrit.wikimedia.org/r/700918 (owner: 10Volans) [15:04:05] (03Merged) 10jenkins-bot: ganeti_netbox_sync: simplify logging [software/netbox-extras] - 10https://gerrit.wikimedia.org/r/700918 (owner: 10Volans) [15:04:31] (03CR) 10Hashar: [V: 03+2 C: 03+2] Load fonts directly from Gerrit instead of 3rd party domains [software/gerrit/plugins/gitiles] (wmf/stable-3.2) - 10https://gerrit.wikimedia.org/r/693552 (owner: 10Hashar) [15:06:05] (03CR) 10Hashar: [V: 03+2 C: 03+2] Merge 'refs/changes/97/273397/1' into wmf/stable3.2 [software/gerrit/plugins/gitiles] (wmf/stable-3.2) - 10https://gerrit.wikimedia.org/r/693553 (https://phabricator.wikimedia.org/T240264) (owner: 10Hashar) [15:07:27] (03CR) 10Volans: [V: 03+1 C: 03+2] ganeti-netbox-sync: Run InterfaceAutomation when necessary [software/netbox-extras] - 10https://gerrit.wikimedia.org/r/662762 (https://phabricator.wikimedia.org/T263768) (owner: 10CRusnov) [15:08:11] (03Merged) 10jenkins-bot: ganeti-netbox-sync: Run InterfaceAutomation when necessary [software/netbox-extras] - 10https://gerrit.wikimedia.org/r/662762 
(https://phabricator.wikimedia.org/T263768) (owner: 10CRusnov) [15:10:38] 10SRE, 10Infrastructure-Foundations, 10netops: Cloud IPv6 subnets - https://phabricator.wikimedia.org/T187929 (10cmooney) @faidon wondering if you are ok with the plan set out above? Any comments / feedback welcome. [15:14:08] (03PS1) 10Hashar: [WMF] fork gitiles to prevent loading fonts from 3rd party [software/gerrit] (wmf/stable-3.2) - 10https://gerrit.wikimedia.org/r/700932 (https://phabricator.wikimedia.org/T240264) [15:22:57] (03CR) 10Muehlenhoff: [C: 03+2] Add helper tool for returning a user's current TGT [puppet] - 10https://gerrit.wikimedia.org/r/700389 (https://phabricator.wikimedia.org/T283242) (owner: 10Muehlenhoff) [15:23:53] (03CR) 10Hashar: "To fork I simply pointed the submodule to our fork of the gitiles plugin which has the cherry picked patch: https://gerrit.wikimedia.org/r" [software/gerrit] (wmf/stable-3.2) - 10https://gerrit.wikimedia.org/r/700932 (https://phabricator.wikimedia.org/T240264) (owner: 10Hashar) [15:24:27] (03CR) 10Giuseppe Lavagetto: [C: 03+2] httpbb: move most tests to https [puppet] - 10https://gerrit.wikimedia.org/r/700856 (https://phabricator.wikimedia.org/T285298) (owner: 10Giuseppe Lavagetto) [15:33:51] (03PS1) 10Ayounsi: Simplify labs-in4/6 firewall filters [homer/public] - 10https://gerrit.wikimedia.org/r/700939 [15:40:41] (03CR) 10Muehlenhoff: [C: 03+1] "Looks good to me!" (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/700922 (https://phabricator.wikimedia.org/T283242) (owner: 10Jbond) [15:43:32] 10SRE, 10MW-on-K8s, 10Release-Engineering-Team, 10serviceops: Check out www-portals repo in the mediawiki-webserver and in the mediawiki-multiversion images - https://phabricator.wikimedia.org/T285325 (10Joe) [15:44:01] 10SRE, 10MW-on-K8s, 10serviceops, 10Patch-For-Review: Make all httpbb tests pass on the mwdebug deployment. - https://phabricator.wikimedia.org/T285298 (10Joe) [15:53:17] (03CR) 10Cwhite: [C: 03+1] "LGTM! Thanks!" 
[puppet] - 10https://gerrit.wikimedia.org/r/700841 (owner: 10Filippo Giunchedi) [15:55:57] (03PS1) 10Volans: ganeti-netbox sync: skip VMs not in PuppetDB [software/netbox-extras] - 10https://gerrit.wikimedia.org/r/700946 [15:55:59] (03PS1) 10Volans: ganeti netbox sync: fix host removal from cluster [software/netbox-extras] - 10https://gerrit.wikimedia.org/r/700947 [15:57:30] (03CR) 10Volans: [V: 03+1] "tested on netbox-next" [software/netbox-extras] - 10https://gerrit.wikimedia.org/r/700946 (owner: 10Volans) [15:57:39] (03CR) 10Volans: [V: 03+1] "tested on netbox-next" [software/netbox-extras] - 10https://gerrit.wikimedia.org/r/700947 (owner: 10Volans) [15:59:41] (03CR) 10Filippo Giunchedi: [C: 03+2] logstash: extend ssd retention to 15d [puppet] - 10https://gerrit.wikimedia.org/r/700841 (owner: 10Filippo Giunchedi) [16:00:04] jbond42 and cdanis: #bothumor My software never has bugs. It just develops random features. Rise for Puppet request window. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20210622T1600). 
[16:00:17] (03CR) 10Ayounsi: [C: 03+1] ganeti-netbox sync: skip VMs not in PuppetDB [software/netbox-extras] - 10https://gerrit.wikimedia.org/r/700946 (owner: 10Volans) [16:00:37] (03CR) 10Ayounsi: [C: 03+1] ganeti netbox sync: fix host removal from cluster [software/netbox-extras] - 10https://gerrit.wikimedia.org/r/700947 (owner: 10Volans) [16:01:21] (03CR) 10Volans: [V: 03+1 C: 03+2] ganeti-netbox sync: skip VMs not in PuppetDB [software/netbox-extras] - 10https://gerrit.wikimedia.org/r/700946 (owner: 10Volans) [16:01:26] (03CR) 10Volans: [V: 03+1 C: 03+2] ganeti netbox sync: fix host removal from cluster [software/netbox-extras] - 10https://gerrit.wikimedia.org/r/700947 (owner: 10Volans) [16:02:10] (03Merged) 10jenkins-bot: ganeti-netbox sync: skip VMs not in PuppetDB [software/netbox-extras] - 10https://gerrit.wikimedia.org/r/700946 (owner: 10Volans) [16:02:14] (03PS2) 10Volans: ganeti netbox sync: fix host removal from cluster [software/netbox-extras] - 10https://gerrit.wikimedia.org/r/700947 (https://phabricator.wikimedia.org/T260326) [16:02:24] (03CR) 10Volans: [C: 03+2] ganeti netbox sync: fix host removal from cluster [software/netbox-extras] - 10https://gerrit.wikimedia.org/r/700947 (https://phabricator.wikimedia.org/T260326) (owner: 10Volans) [16:03:15] (03Merged) 10jenkins-bot: ganeti netbox sync: fix host removal from cluster [software/netbox-extras] - 10https://gerrit.wikimedia.org/r/700947 (https://phabricator.wikimedia.org/T260326) (owner: 10Volans) [16:03:38] (03PS1) 10TrainBranchBot: Branch commit for wmf/1.37.0-wmf.11 [core] (wmf/1.37.0-wmf.11) - 10https://gerrit.wikimedia.org/r/700949 [16:03:40] (03CR) 10TrainBranchBot: [C: 03+2] Branch commit for wmf/1.37.0-wmf.11 [core] (wmf/1.37.0-wmf.11) - 10https://gerrit.wikimedia.org/r/700949 (owner: 10TrainBranchBot) [16:06:20] (03CR) 10Effie Mouzeli: [C: 03+1] pipeline: install the wmf internal CAs [mediawiki-config] - 10https://gerrit.wikimedia.org/r/700843 (https://phabricator.wikimedia.org/T284417) 
(owner: 10Giuseppe Lavagetto) [16:09:05] (03PS1) 10MSantos: wikifeeds: bump to 2021-06-22-075256-production [deployment-charts] - 10https://gerrit.wikimedia.org/r/700950 [16:09:31] (03CR) 10Effie Mouzeli: [C: 03+1] pipeline: install php extensions [mediawiki-config] - 10https://gerrit.wikimedia.org/r/700903 (https://phabricator.wikimedia.org/T285309) (owner: 10Giuseppe Lavagetto) [16:12:03] (03CR) 10Volans: "reply inline" (031 comment) [docker-images/production-images] - 10https://gerrit.wikimedia.org/r/685462 (owner: 10Volans) [16:15:06] (03CR) 10MSantos: [C: 03+2] wikifeeds: bump to 2021-06-22-075256-production [deployment-charts] - 10https://gerrit.wikimedia.org/r/700950 (owner: 10MSantos) [16:17:55] (03Merged) 10jenkins-bot: wikifeeds: bump to 2021-06-22-075256-production [deployment-charts] - 10https://gerrit.wikimedia.org/r/700950 (owner: 10MSantos) [16:22:17] (03Merged) 10jenkins-bot: Branch commit for wmf/1.37.0-wmf.11 [core] (wmf/1.37.0-wmf.11) - 10https://gerrit.wikimedia.org/r/700949 (owner: 10TrainBranchBot) [16:28:11] (03PS1) 10MSantos: mobileapps: bump to 2021-06-22-161902-production [deployment-charts] - 10https://gerrit.wikimedia.org/r/700953 [16:32:55] (03CR) 10Volans: [C: 03+1] "Aside the 2 nits for the possible typos the change LGTM and can be merged with just those 2 fixed. 
As for the rest of the comments they ar" (034 comments) [software/pywmflib] - 10https://gerrit.wikimedia.org/r/695341 (https://phabricator.wikimedia.org/T283242) (owner: 10Jbond) [16:41:20] !log reindexing Dutch wikis on elastic@eqiad, elastic@codfw, and cloudelastic (T284185) [16:41:25] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [16:41:26] T284185: Reindex German, Dutch, and Portugese Wikis - https://phabricator.wikimedia.org/T284185 [16:44:04] (03PS2) 10Legoktm: switchdc: include swift and thanos [cookbooks] - 10https://gerrit.wikimedia.org/r/626403 (https://phabricator.wikimedia.org/T285273) (owner: 10Filippo Giunchedi) [16:44:16] (03PS3) 10Legoktm: switchdc: include swift and thanos [cookbooks] - 10https://gerrit.wikimedia.org/r/626403 (https://phabricator.wikimedia.org/T285273) (owner: 10Filippo Giunchedi) [16:44:37] (03Abandoned) 10Legoktm: sre.switchdc.services: Don't exclude thanos-query & thanos-swift services [cookbooks] - 10https://gerrit.wikimedia.org/r/700695 (https://phabricator.wikimedia.org/T285273) (owner: 10Legoktm) [16:45:45] (03CR) 10Effie Mouzeli: "PCC: https://puppet-compiler.wmflabs.org/compiler1003/29962/" [puppet] - 10https://gerrit.wikimedia.org/r/700861 (https://phabricator.wikimedia.org/T271967) (owner: 10Effie Mouzeli) [16:50:26] (03CR) 10Legoktm: [C: 03+2] "Let's do it!" [cookbooks] - 10https://gerrit.wikimedia.org/r/626403 (https://phabricator.wikimedia.org/T285273) (owner: 10Filippo Giunchedi) [16:53:56] (03Merged) 10jenkins-bot: switchdc: include swift and thanos [cookbooks] - 10https://gerrit.wikimedia.org/r/626403 (https://phabricator.wikimedia.org/T285273) (owner: 10Filippo Giunchedi) [17:00:04] chrisalbon and accraze: (Dis)respected human, time to deploy Services – Graphoid / ORES (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20210622T1700). Please do the needful. 
[17:00:25] (03PS1) 10Krinkle: mediawiki: Reduce purgeParserCache.php sleep from 500ms to 200 [puppet] - 10https://gerrit.wikimedia.org/r/700957 (https://phabricator.wikimedia.org/T282761) [17:02:50] (03CR) 10MSantos: [C: 03+2] mobileapps: bump to 2021-06-22-161902-production [deployment-charts] - 10https://gerrit.wikimedia.org/r/700953 (owner: 10MSantos) [17:04:15] !log mbsantos@deploy1002 helmfile [staging] Ran 'sync' command on namespace 'wikifeeds' for release 'staging' . [17:04:18] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [17:04:32] !log 1.37.0-wmf.11 was branched at c161d3bd063b06d09be4167b38a72087db3ba7d2 for T281152 [17:04:36] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [17:04:37] T281152: 1.37.0-wmf.11 deployment blockers - https://phabricator.wikimedia.org/T281152 [17:05:41] 10SRE, 10observability, 10Datacenter-Switchover, 10Patch-For-Review: Switchover thanos-query and thanos-swift services as part of DC switchover - https://phabricator.wikimedia.org/T285273 (10Legoktm) 05Open→03Resolved a:03fgiunchedi [17:05:43] (03Merged) 10jenkins-bot: mobileapps: bump to 2021-06-22-161902-production [deployment-charts] - 10https://gerrit.wikimedia.org/r/700953 (owner: 10MSantos) [17:07:15] !log mbsantos@deploy1002 helmfile [codfw] Ran 'sync' command on namespace 'wikifeeds' for release 'production' . [17:07:18] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [17:07:23] 10SRE: Integrate Buster 10.10 point update - https://phabricator.wikimedia.org/T285206 (10MoritzMuehlenhoff) [17:07:38] !log installing velocity security updates [17:07:42] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [17:08:50] !log mbsantos@deploy1002 helmfile [eqiad] Ran 'sync' command on namespace 'wikifeeds' for release 'production' . 
[17:08:53] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [17:09:29] (03PS1) 10Dduvall: testwikis wikis to 1.37.0-wmf.11 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/700959 [17:09:31] (03CR) 10Dduvall: [C: 03+2] testwikis wikis to 1.37.0-wmf.11 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/700959 (owner: 10Dduvall) [17:10:11] (03Merged) 10jenkins-bot: testwikis wikis to 1.37.0-wmf.11 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/700959 (owner: 10Dduvall) [17:10:16] !log dduvall@deploy1002 Started scap: testwikis wikis to 1.37.0-wmf.11 [17:10:18] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [17:11:39] !log installing ruby-websocket-extensions security updates [17:11:42] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [17:12:54] I'm seeing some "Evicted" status for wikifeeds k8s pods, is that something to be concerned about? [17:14:20] 10SRE, 10LDAP-Access-Requests: Grant Access to ldap/wmf for TChin - https://phabricator.wikimedia.org/T285326 (10tchin) I was just following my [[ https://office.wikimedia.org/wiki/Technology/Onboarding/Checklists/Thomas_Chin | onboarding checklist ]] which told me to request access to the group `ldap/wmf`, bu... [17:14:34] !log mbsantos@deploy1002 helmfile [staging] Ran 'sync' command on namespace 'mobileapps' for release 'staging' . [17:14:37] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [17:16:49] !log mbsantos@deploy1002 helmfile [codfw] Ran 'sync' command on namespace 'mobileapps' for release 'production' . [17:16:52] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [17:18:43] !log mbsantos@deploy1002 helmfile [eqiad] Ran 'sync' command on namespace 'mobileapps' for release 'production' . 
[17:18:46] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [17:21:10] !log installing isc-dhcp security updates [17:21:13] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [17:22:22] 10SRE, 10serviceops, 10Datacenter-Switchover: Siteinfo timeout during switch datacenter - https://phabricator.wikimedia.org/T266618 (10Legoktm) > it's connecting to port 80 with the x-forwarded-proto header, and that should probably be updated. This is easy, I'll upload a patch. > the warmup script didn't... [17:26:09] Not sure who to ping about it, but I'm seeing "Evicted" pods in both mobileapps (eqiad) and wikifeeds (eqiad and codfw) cc/ moritzm effie [17:26:22] Is that a problem? [17:29:00] 10SRE, 10ops-eqiad, 10Analytics-Clusters, 10DC-Ops: (Need By: TBD) rack/setup/install an-web1001 - https://phabricator.wikimedia.org/T281787 (10RobH) [17:29:59] 10SRE, 10ops-eqiad, 10Analytics-Clusters, 10DC-Ops: (Need By: TBD) rack/setup/install an-web1001 - https://phabricator.wikimedia.org/T281787 (10RobH) [17:31:05] (03PS1) 10Legoktm: mediawiki: Make siteinfo API request over HTTPS [software/spicerack] - 10https://gerrit.wikimedia.org/r/700963 (https://phabricator.wikimedia.org/T266618) [17:32:58] mbsantos: are those services having issues?
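[editor's note] For the "Evicted" pods mentioned above, here is a rough sketch of how one might pick them out of a pod listing. It assumes the input is the JSON that `kubectl get pods -n <namespace> -o json` prints, and relies on evicted pods carrying `status.reason` set to "Evicted". The function name, sample pod names, and the sample message are hypothetical and not taken from the log.

```python
import json
from typing import List

# Hypothetical sample shaped like `kubectl get pods -o json` output.
SAMPLE = json.dumps({
    "items": [
        {"metadata": {"name": "mobileapps-production-new"},
         "status": {"phase": "Running"}},
        {"metadata": {"name": "mobileapps-production-old"},
         "status": {"phase": "Failed", "reason": "Evicted",
                    "message": "The node was low on resource: memory."}},
    ]
})


def evicted_pods(pod_list_json: str) -> List[str]:
    """Return the names of pods whose status.reason is 'Evicted'."""
    pods = json.loads(pod_list_json).get("items", [])
    return [
        pod["metadata"]["name"]
        for pod in pods
        if pod.get("status", {}).get("reason") == "Evicted"
    ]


if __name__ == "__main__":
    for name in evicted_pods(SAMPLE):
        print(name)
```

The `status.message` field of each evicted pod usually names the resource that was under pressure, which may help check the later guess about resource saturation during the deployment.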
[17:33:53] No, they are all fine: no increase in error rate or change in behaviour (looking at logstash and grafana) [17:36:01] I'm not sure why they're marked as evicted, `kubectl get events -n mobileapps` looks fine to me...it looks like it created the new pods and then deleted the old ones [17:39:30] (03PS1) 10Zabe: Add 'unwatchedpages' to 'rollbacker' on frwiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/700965 (https://phabricator.wikimedia.org/T285334) [17:40:16] From grafana I would infer that they hit the threshold of resource saturation during the deployment and some pods got "evicted" [17:40:31] But that's the first time I'm seeing this [17:41:14] !log dduvall@deploy1002 Finished scap: testwikis wikis to 1.37.0-wmf.11 (duration: 30m 59s) [17:41:17] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [17:41:53] 10SRE: Integrate Buster 10.10 point update - https://phabricator.wikimedia.org/T285206 (10MoritzMuehlenhoff) [17:42:59] !log testwikis to 1.37.0-wmf.11 (cc open blockers T285125 T285118 T271011) [17:43:05] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [17:43:06] T271011: Update CategoryTree to use the new HookContainer/HookRunner system - https://phabricator.wikimedia.org/T271011 [17:43:06] T285125: image sizes not displayed on beta - https://phabricator.wikimedia.org/T285125 [17:43:07] T285118: beta: Error: Unsupported operand types - https://phabricator.wikimedia.org/T285118 [17:45:00] (03CR) 10Brennen Bearnes: [V: 03+2 C: 03+2] fix GitLab backup cronjob [gitlab-ansible] - 10https://gerrit.wikimedia.org/r/700851 (https://phabricator.wikimedia.org/T274463) (owner: 10Jelto) [17:45:17] (03PS2) 10Brennen Bearnes: fix GitLab backup cronjob [gitlab-ansible] - 10https://gerrit.wikimedia.org/r/700851 (https://phabricator.wikimedia.org/T274463) (owner: 10Jelto) [17:45:28] (03CR) 10Brennen Bearnes: [V: 03+2 C: 03+2] fix GitLab backup cronjob [gitlab-ansible] - 
10https://gerrit.wikimedia.org/r/700851 (https://phabricator.wikimedia.org/T274463) (owner: 10Jelto) [17:47:19] !log gitlab1001: run ansible to deploy https://gerrit.wikimedia.org/r/700851 (T274463) [17:47:23] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [17:47:24] T274463: Backups for GitLab - https://phabricator.wikimedia.org/T274463 [17:52:47] (03PS1) 10Awight: Revert "Fall back from explicit parameter order to TemplateData sort" [extensions/VisualEditor] (wmf/1.37.0-wmf.11) - 10https://gerrit.wikimedia.org/r/700734 [17:54:31] (03CR) 10Awight: [C: 03+1] "Seems to work locally." [extensions/VisualEditor] (wmf/1.37.0-wmf.11) - 10https://gerrit.wikimedia.org/r/700734 (owner: 10Awight) [17:56:05] jeena: I'd like to merge this revert ahead of the train: https://gerrit.wikimedia.org/r/c/mediawiki/extensions/VisualEditor/+/700734 -- is that okay with you? [17:56:59] dduvall: ^ [17:58:28] tl;dr, we realized that it will cause VE users to see new "dirty diffs", reordering template params in cases where it previously wouldn't have. This same behavior was reported as a bug in 2016, so I feel like I should honor that decision. [17:58:39] awight: alright with me [17:58:48] are you going to sync it for wmf.11? [17:59:04] i just deployed to test wikis [17:59:06] +1 thanks, and sorry for the mess. @dduvall yes I'll do that on the deployment server now. [17:59:13] awesome. thanks! [17:59:22] ack [18:00:04] Deploy window Pre MediaWiki train break (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20210622T1800) [18:00:36] (03CR) 10Awight: [C: 03+2] "Merging for deployment" [extensions/VisualEditor] (wmf/1.37.0-wmf.11) - 10https://gerrit.wikimedia.org/r/700734 (owner: 10Awight) [18:04:41] 10SRE, 10ops-eqiad, 10DC-Ops, 10serviceops-radar: (Need By: TBD) rack/setup/install thumbor100[56] - https://phabricator.wikimedia.org/T273914 (10RobH) [18:06:56] dduvall: Just waiting for Jenkins... 
Meanwhile, I'm not sure quite what it means to have deployed to only test wikis. Do I need to do anything special when syncing on the deployment server? Or just a regular scap sync-file? [18:07:42] awight: deployed is maybe a misnomer. i've promoted the test wikis to use wmf.11 :) [18:08:04] (03PS1) 10RobH: Setup info for thumbor100[56] an-web1001 [puppet] - 10https://gerrit.wikimedia.org/r/700968 (https://phabricator.wikimedia.org/T273914) [18:08:07] in other words, the wmf.11 code is already on the servers and needs to be updated with a sync [18:08:46] ty! [18:09:17] np. thanks for syncing [18:09:25] 10SRE, 10ops-eqiad, 10DC-Ops, 10serviceops-radar, 10Patch-For-Review: (Need By: TBD) rack/setup/install thumbor100[56] - https://phabricator.wikimedia.org/T273914 (10RobH) [18:09:29] 10SRE, 10ops-eqiad, 10Analytics-Clusters, 10DC-Ops: (Need By: TBD) rack/setup/install an-web1001 - https://phabricator.wikimedia.org/T281787 (10RobH) [18:09:56] You are very chill about someone barging in and syncing during the "sanity break" <3 [18:10:09] 10SRE, 10ops-eqiad, 10DC-Ops, 10serviceops-radar, 10Patch-For-Review: (Need By: TBD) rack/setup/install thumbor100[56] - https://phabricator.wikimedia.org/T273914 (10RobH) [18:10:21] 10SRE, 10ops-eqiad, 10Analytics-Clusters, 10DC-Ops: (Need By: TBD) rack/setup/install an-web1001 - https://phabricator.wikimedia.org/T281787 (10RobH) [18:10:25] (03CR) 10RobH: [C: 03+2] Setup info for thumbor100[56] an-web1001 [puppet] - 10https://gerrit.wikimedia.org/r/700968 (https://phabricator.wikimedia.org/T273914) (owner: 10RobH) [18:10:37] low level train anxiety presenting as chillness :) [18:13:31] :-D if I were to be so lucky [18:14:19] 10SRE, 10ops-eqiad, 10DC-Ops, 10serviceops-radar, 10Patch-For-Review: (Need By: TBD) rack/setup/install thumbor100[56] - https://phabricator.wikimedia.org/T273914 (10ops-monitoring-bot) Script wmf-auto-reimage was launched by robh on cumin1001.eqiad.wmnet for hosts: ` ['thumbor1005.eqiad.wmnet', 'thumbor... 
[18:19:05] !log pulled in updates for thirdparty/kubeadm-k8s-1-18 buster-wikimedia on apt1001 [18:19:08] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [18:19:10] I'm feeling pretty anxious to be the one endangering a two-week train. [18:23:54] (03Merged) 10jenkins-bot: Revert "Fall back from explicit parameter order to TemplateData sort" [extensions/VisualEditor] (wmf/1.37.0-wmf.11) - 10https://gerrit.wikimedia.org/r/700734 (owner: 10Awight) [18:26:07] !log robh@cumin1001 START - Cookbook sre.hosts.downtime for 2:00:00 on thumbor1005.eqiad.wmnet with reason: REIMAGE [18:26:10] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [18:27:31] !log awight@deploy1002 sync-file aborted: Backport: [[gerrit:700734|Revert "Fall back from explicit parameter order to TemplateData sort" ()]] (duration: 00m 40s) [18:27:33] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [18:27:48] Testing on mwdebug1002 [18:28:19] !log robh@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on thumbor1005.eqiad.wmnet with reason: REIMAGE [18:28:21] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [18:28:55] !log robh@cumin1001 START - Cookbook sre.hosts.downtime for 2:00:00 on thumbor1006.eqiad.wmnet with reason: REIMAGE [18:28:58] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [18:29:00] Not looking broken, continuing scap [18:30:13] !log awight@deploy1002 Synchronized php-1.37.0-wmf.11/extensions/VisualEditor: Backport: [[gerrit:700734|Revert "Fall back from explicit parameter order to TemplateData sort" ()]] (duration: 01m 09s) [18:30:16] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [18:30:38] dduvall: All done, thanks again. Hopefully I didn't break the world. 
[18:31:01] !log robh@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on thumbor1006.eqiad.wmnet with reason: REIMAGE [18:31:04] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [18:31:05] awight: np :) [18:38:56] 10SRE, 10ops-eqiad, 10DC-Ops, 10serviceops-radar, 10Patch-For-Review: (Need By: TBD) rack/setup/install thumbor100[56] - https://phabricator.wikimedia.org/T273914 (10ops-monitoring-bot) Completed auto-reimage of hosts: ` ['thumbor1005.eqiad.wmnet', 'thumbor1006.eqiad.wmnet'] ` and were **ALL** successful. [18:42:36] !log ebernhardson@deploy1002 Started deploy [wikimedia/discovery/analytics@75d35b4]: revert expect eventgate canary events in all dcs [18:42:39] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [18:46:59] !log ebernhardson@deploy1002 Finished deploy [wikimedia/discovery/analytics@75d35b4]: revert expect eventgate canary events in all dcs (duration: 04m 23s) [18:47:02] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [18:52:38] (03CR) 10RLazarus: [C: 03+1] mediawiki: Make siteinfo API request over HTTPS [software/spicerack] - 10https://gerrit.wikimedia.org/r/700963 (https://phabricator.wikimedia.org/T266618) (owner: 10Legoktm) [18:53:14] 10SRE, 10ops-eqiad, 10DC-Ops, 10serviceops-radar: (Need By: TBD) rack/setup/install thumbor100[56] - https://phabricator.wikimedia.org/T273914 (10RobH) #serviceops-radar, this is now ready for your use. [18:53:16] 10SRE, 10ops-eqiad, 10DC-Ops, 10serviceops-radar: (Need By: TBD) rack/setup/install thumbor100[56] - https://phabricator.wikimedia.org/T273914 (10RobH) 05Open→03Resolved [18:53:34] Krinkle: ebernhardson and I are trying to figure out whether https://wikitech.wikimedia.org/wiki/Switch_Datacenter#ElasticSearch is still necessary. He said that it's stored in MainWANObjectCache, so is that auto replicated to codfw? 
[18:54:06] https://www.mediawiki.org/wiki/Manual:Object_cache#WAN_cache "shared between application servers, with invalidation events being replicated across data centers" [18:54:25] so I guess not [18:54:35] We have never replicated memc values, and most probably should/will not. [18:54:51] If something uses memc, it generally means that it is generated and stored on-demand as needed [18:54:59] 10SRE, 10ops-eqiad, 10Analytics-Clusters, 10DC-Ops: (Need By: TBD) rack/setup/install an-web1001 - https://phabricator.wikimedia.org/T281787 (10ops-monitoring-bot) Script wmf-auto-reimage was launched by robh on cumin1001.eqiad.wmnet for hosts: ` an-web1001.eqiad.wmnet ` The log can be found in `/var/log/w... [18:55:04] and thus is taken care of by both DCs on their own [18:55:11] ack, thanks for confirming [18:55:23] If there is code storing data directly in memc bypassing getWithSet(), then that would be a problem. [18:55:26] I don't know if that's the case. [18:55:49] It being called out here suggests that maybe it is doing something like that, as otherwise why is it called out at all? [18:56:24] I think the problem is that the CirrusSearch cache is too cold to use after the switchover [18:57:32] well, if the procedure laid out here is what was done in the past, then I suppose we can do it again, I mean, nothing has changed in terms of wan cache [18:57:39] https://gerrit.wikimedia.org/g/mediawiki/extensions/CirrusSearch/+/3df4a9b30707a2ef9ba1ebfcc84f09b915c78e15/includes/Searcher.php#642 [18:57:53] if its use of wan cache is new, then we can re-evaluate it indeed [18:58:37] it's probably not new, I'm just checking to make sure the docs are still up to date, and it seems like they are [18:59:01] aye, yeah, but this does seem a bit of an anti-pattern. [18:59:30] it's bypassing virtually all scale and performance levers and automation in the WANObjectCache class by not using getWithSet, I think.
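[editor's note] The get()/set() versus getWithSet() point above can be illustrated with a toy read-through cache. This is only a sketch of the general pattern in Python, not the WANObjectCache API: `TinyCache`, `get_with_set`, and the injectable clock are hypothetical names made up for the example, and the real class does considerably more than this toy shows.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TinyCache:
    """Toy in-memory cache for illustration only (not WANObjectCache)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._store: Dict[str, Tuple[float, Any]] = {}
        self._clock = clock

    def get(self, key: str) -> Any:
        """Return the cached value, or None if it is missing or expired."""
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._store[key]
            return None
        return value

    def set(self, key: str, value: Any, ttl: float) -> None:
        """Store a value that expires ttl seconds from now."""
        self._store[key] = (self._clock() + ttl, value)

    def get_with_set(self, key: str, ttl: float,
                     compute: Callable[[], Any]) -> Any:
        """Read-through helper: callers only say how to regenerate a value.

        Miss handling lives in one place, so the cache class can change
        its policy without touching call sites. In this toy, a computed
        None is treated as a miss and recomputed on every call.
        """
        value = self.get(key)
        if value is None:
            value = compute()
            self.set(key, value, ttl)
        return value


if __name__ == "__main__":
    cache = TinyCache()
    print(cache.get_with_set("search:foo", 60, lambda: "expensive result"))
```

With manual get()/set(), every call site repeats the miss handling itself; with the helper, a cold cache in either data center is simply repopulated on demand by whichever callers hit it.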
[19:00:04] marxarelli and jeena: Time to snap out of that daydream and deploy MediaWiki train - American Version. Get on with it. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20210622T1900). [19:00:06] if the load on elastic is still too great, then yeah, I suppose fixing that wouldn't change anything though, so orthogonal for now. [19:01:42] !log dduvall@deploy1002 Pruned MediaWiki: 1.37.0-wmf.6 (duration: 03m 35s) [19:01:46] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [19:05:17] Krinkle: is there a link somewhere that explains why getWithSet is better than get/set? [19:05:20] * legoktm is filing a bug [19:06:23] 10SRE, 10ops-eqiad, 10Analytics-Clusters, 10DC-Ops: (Need By: TBD) rack/setup/install an-web1001 - https://phabricator.wikimedia.org/T281787 (10ops-monitoring-bot) Completed auto-reimage of hosts: ` ['an-web1001.eqiad.wmnet'] ` Of which those **FAILED**: ` ['an-web1001.eqiad.wmnet'] ` [19:06:23] !log preparing to promote wmf.11 group0 (T281152) cc'ing risking patch contacts Amir1, Krinkle, DannyS712 [19:06:27] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [19:06:28] T281152: 1.37.0-wmf.11 deployment blockers - https://phabricator.wikimedia.org/T281152 [19:07:04] (03PS1) 10Dduvall: group0 wikis to 1.37.0-wmf.11 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/700974 [19:07:06] (03CR) 10Dduvall: [C: 03+2] group0 wikis to 1.37.0-wmf.11 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/700974 (owner: 10Dduvall) [19:07:11] cc'd you on T285346 [19:07:11] T285346: CirrusSearch WAN caching should use getWithSet() instead of manual get()/set() - https://phabricator.wikimedia.org/T285346 [19:08:06] (03Merged) 10jenkins-bot: group0 wikis to 1.37.0-wmf.11 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/700974 (owner: 10Dduvall) [19:08:12] legoktm: the docs for get() and set() say to consider using getWithSet, and the raw get()/set() enumerate a lot of things to consider if 
you call them directly. https://doc.wikimedia.org/mediawiki-core/master/php/classWANObjectCache.html [19:08:23] but more generally, if you ask me, these methods just shouldn't be public in the first place. [19:09:00] They probably are only public to allow for an optimisation in one or two places somewhere where we haven't bothered to accept or accommodate it in a way that is less damaging to the public API [19:09:17] and they probably are only called here because someone migrated the code from wgMemc to wanCache [19:09:24] which is a step in the right direction I guess. [19:09:29] !log dduvall@deploy1002 rebuilt and synchronized wikiversions files: group0 wikis to 1.37.0-wmf.11 [19:09:32] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [19:11:07] * legoktm copy & pastes [19:26:13] !log set mediawiki-l message acceptance to discard non-member posts instead of reject [19:26:16] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [19:30:56] (PS1) RobH: an-web1001 mac correction [puppet] - https://gerrit.wikimedia.org/r/700977 (https://phabricator.wikimedia.org/T281787) [19:31:20] (CR) RobH: [C: +2] an-web1001 mac correction [puppet] - https://gerrit.wikimedia.org/r/700977 (https://phabricator.wikimedia.org/T281787) (owner: RobH) [19:49:31] (PS1) Ladsgroup: dumps: Remove absented crons [puppet] - https://gerrit.wikimedia.org/r/700978 (https://phabricator.wikimedia.org/T273673) [19:53:39] (PS1) Ebernhardson: Enable canary events for search event streams [mediawiki-config] - https://gerrit.wikimedia.org/r/700979 [19:55:24] SRE, ops-eqiad, Analytics-Clusters, DC-Ops, Patch-For-Review: (Need By: TBD) rack/setup/install an-web1001 - https://phabricator.wikimedia.org/T281787 (ops-monitoring-bot) Script wmf-auto-reimage was launched by robh on cumin1001.eqiad.wmnet for hosts: ` an-web1001.eqiad.wmnet ` The log can b...
[19:55:51] (PS1) Ssingh: test_dns: update test to reflect anycasted Wikidough service [software/knead-wikidough] - https://gerrit.wikimedia.org/r/700980 [19:57:18] (PS6) Brennen Bearnes: disable issues & wikis by default on new projects [gitlab-ansible] - https://gerrit.wikimedia.org/r/699812 (https://phabricator.wikimedia.org/T264231) [19:57:31] (CR) Brennen Bearnes: [V: +2 C: +2] disable issues & wikis by default on new projects [gitlab-ansible] - https://gerrit.wikimedia.org/r/699812 (https://phabricator.wikimedia.org/T264231) (owner: Brennen Bearnes) [19:58:32] !log gitlab1001: run ansible to deploy https://gerrit.wikimedia.org/r/c/operations/gitlab-ansible/+/699812 (T264231) [19:58:37] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [19:58:38] T264231: Investigate whether issues, operations, wikis, etc. can be disabled globally on GitLab - https://phabricator.wikimedia.org/T264231 [19:58:53] (CR) Ssingh: [C: +2] test_dns: update test to reflect anycasted Wikidough service [software/knead-wikidough] - https://gerrit.wikimedia.org/r/700980 (owner: Ssingh) [20:00:44] (PS1) Ladsgroup: dumps: Migrate dumplists cron to systemd timer [puppet] - https://gerrit.wikimedia.org/r/700981 (https://phabricator.wikimedia.org/T273673) [20:11:45] I see you putting stuff in my review queue, Amir1 :-P [20:12:23] hey note that ya gotta give full paths to things in systemd units, I missed that with one today and got puppet whines for that [20:12:37] !log reindexing Dutch wikis on elastic@eqiad, elastic@codfw, and cloudelastic complete (T284185) [20:12:43] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:12:43] I mean it's a 2 second fix but just keep your eye out [20:12:45] T284185: Reindex German, Dutch, and Portugese Wikis - https://phabricator.wikimedia.org/T284185 [20:12:55] !log reindexing Portuguese wikis on elastic@eqiad, elastic@codfw, and cloudelastic (T284185) [20:12:59] Logged 
the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:19:17] !log robh@cumin1001 START - Cookbook sre.hosts.downtime for 2:00:00 on an-web1001.eqiad.wmnet with reason: REIMAGE [20:19:20] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:20:48] apergos: oh thanks. I make sure PCC is happy for the new one [20:21:16] sweet! yeah I was a bit too out of things this morning to do the check, shame on me [20:22:37] (PS2) Brennen Bearnes: remove when condition on root pass variable [gitlab-ansible] - https://gerrit.wikimedia.org/r/700676 [20:23:03] (CR) Brennen Bearnes: [V: +2 C: +2] "Cool, thanks. Will go ahead and merge." [gitlab-ansible] - https://gerrit.wikimedia.org/r/700676 (owner: Brennen Bearnes) [20:23:07] !log robh@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-web1001.eqiad.wmnet with reason: REIMAGE [20:23:10] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:26:39] SRE, ops-eqiad, Analytics-Clusters, DC-Ops: (Need By: TBD) rack/setup/install an-web1001 - https://phabricator.wikimedia.org/T281787 (RobH) [20:26:58] SRE, ops-eqiad, Analytics-Clusters, DC-Ops: (Need By: TBD) rack/setup/install an-web1001 - https://phabricator.wikimedia.org/T281787 (RobH) @Ottomata an-web1001 is now staged for your use! [20:27:02] SRE, ops-eqiad, Analytics-Clusters, DC-Ops: (Need By: TBD) rack/setup/install an-web1001 - https://phabricator.wikimedia.org/T281787 (RobH) Open→Resolved [20:28:34] SRE, Performance-Team, serviceops, MW-1.36-notes, and 3 others: Enable "/*/mw-with-onhost-tier/" route for MediaWiki where safe - https://phabricator.wikimedia.org/T264604 (Krinkle) Based on very rudimentary testing it seems that there is quite a direct correlation between the number of (concurre...
[20:38:21] SRE, ops-eqiad, Analytics-Clusters, DC-Ops: (Need By: TBD) rack/setup/install an-web1001 - https://phabricator.wikimedia.org/T281787 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['an-web1001.eqiad.wmnet'] ` and were **ALL** successful. [20:42:55] (CR) Ladsgroup: "PCC is happy https://puppet-compiler.wmflabs.org/compiler1001/29963/" [puppet] - https://gerrit.wikimedia.org/r/700981 (https://phabricator.wikimedia.org/T273673) (owner: Ladsgroup) [20:43:11] (PS4) Brennen Bearnes: CAS: stop marking users as external [gitlab-ansible] - https://gerrit.wikimedia.org/r/699819 (https://phabricator.wikimedia.org/T274461) [20:43:38] (CR) Brennen Bearnes: [V: +2 C: +2] CAS: stop marking users as external [gitlab-ansible] - https://gerrit.wikimedia.org/r/699819 (https://phabricator.wikimedia.org/T274461) (owner: Brennen Bearnes) [20:46:35] !log gitlab1001: running ansible to deploy [[gerrit:699819|CAS: stop marking users as external]] (T274461) [20:46:39] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:46:41] T274461: Define auth strategy for GitLab - https://phabricator.wikimedia.org/T274461 [21:16:03] SRE, MediaWiki-General, Pybal, Traffic, and 2 others: SELECT query arriving to wikidatawiki db codfw hosts causing pile ups during schema change - https://phabricator.wikimedia.org/T284981 (BPirkle) [21:33:49] PROBLEM - Router interfaces on cr2-codfw is CRITICAL: CRITICAL: host 208.80.153.193, interfaces up: 133, down: 1, dormant: 0, excluded: 0, unused: 0: https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down [21:34:13] PROBLEM - Router interfaces on cr2-eqiad is CRITICAL: CRITICAL: host 208.80.154.197, interfaces up: 221, down: 1, dormant: 0, excluded: 0, unused: 0: https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down [21:43:35] RECOVERY - Router interfaces on cr2-eqiad is OK: OK: host 208.80.154.197, interfaces up: 222, down: 0, 
dormant: 0, excluded: 0, unused: 0 https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down [21:45:02] RECOVERY - Router interfaces on cr2-codfw is OK: OK: host 208.80.153.193, interfaces up: 134, down: 0, dormant: 0, excluded: 0, unused: 0 https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down [22:18:27] (PS1) Dave Pifke: webperf: drop checks that are now in AlertManager [puppet] - https://gerrit.wikimedia.org/r/700994 (https://phabricator.wikimedia.org/T281358) [22:38:07] !log mwscript recountCategories.php --wiki=eowiktionary --mode={pages,subcats,files} (T170737) [22:38:11] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [22:38:13] T170737: Run recountCategories.php on Wikimedia wikis - https://phabricator.wikimedia.org/T170737 [22:41:28] !log [urbanecm@mwmaint1002 ~]$ mwscript recountCategories.php --wiki=zhwiki --mode=pages # T170737 [22:41:31] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [22:42:29] !log [urbanecm@mwmaint1002 ~]$ mwscript recountCategories.php --wiki=zhwiki --mode=subcats # T170737 [22:42:36] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [22:50:32] jouncebot: next [22:50:32] In 0 hour(s) and 9 minute(s): Evening backport window (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20210622T2300) [23:00:05] RoanKattouw, Niharika, and Urbanecm: #bothumor When your hammer is PHP, everything starts looking like a thumb. Rise for Evening backport window. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20210622T2300). [23:00:05] ebernhardson, Urbanecm, and zabe: A patch you scheduled for Evening backport window is about to be deployed. Please be around during the process. Note: If you break AND fix the wikis, you will be rewarded with a sticker.
[23:00:10] i can deploy today [23:00:14] o/ [23:00:25] hey zabe [23:00:30] (CR) Urbanecm: [C: +2] Enable Growth features in dark mode at nlwiki [mediawiki-config] - https://gerrit.wikimedia.org/r/700909 (https://phabricator.wikimedia.org/T285254) (owner: Urbanecm) [23:00:33] hi [23:00:44] \o [23:01:15] (Merged) jenkins-bot: Enable Growth features in dark mode at nlwiki [mediawiki-config] - https://gerrit.wikimedia.org/r/700909 (https://phabricator.wikimedia.org/T285254) (owner: Urbanecm) [23:01:19] ebernhardson: is it ok if i ping you when i'm done for you to self-deploy? [23:01:49] urbanecm: sure [23:01:55] great! [23:04:31] !log urbanecm@deploy1002 Synchronized dblists/growthexperiments.dblist: 9a594f0ce249e2b4752ea2b8d7c4258bf14ad86a: Enable Growth features in dark mode at nlwiki (T285254; 1/3) (duration: 01m 37s) [23:04:35] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [23:04:37] T285254: Deploy Growth features on Dutch Wikipedia - https://phabricator.wikimedia.org/T285254 [23:05:58] !log urbanecm@deploy1002 Synchronized wmf-config/config/nlwiki.yaml: 9a594f0ce249e2b4752ea2b8d7c4258bf14ad86a: Enable Growth features in dark mode at nlwiki (T285254; 2/3) (duration: 01m 05s) [23:06:01] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [23:08:54] !log urbanecm@deploy1002 Synchronized wmf-config/InitialiseSettings.php: 9a594f0ce249e2b4752ea2b8d7c4258bf14ad86a: Enable Growth features in dark mode at nlwiki (T285254; 3/3) (duration: 01m 07s) [23:08:58] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [23:09:14] (PS2) Urbanecm: Add 'unwatchedpages' to 'rollbacker' on frwiki [mediawiki-config] - https://gerrit.wikimedia.org/r/700965 (https://phabricator.wikimedia.org/T285334) (owner: Zabe) [23:09:19] (CR) Urbanecm: [C: +2] Add 'unwatchedpages' to 'rollbacker' on frwiki [mediawiki-config] - https://gerrit.wikimedia.org/r/700965 
(https://phabricator.wikimedia.org/T285334) (owner: Zabe) [23:10:03] (Merged) jenkins-bot: Add 'unwatchedpages' to 'rollbacker' on frwiki [mediawiki-config] - https://gerrit.wikimedia.org/r/700965 (https://phabricator.wikimedia.org/T285334) (owner: Zabe) [23:10:38] zabe: your change is at mwdebug1001, please test [23:11:35] urbanecm: works the supposed way [23:11:41] syncing [23:13:25] !log urbanecm@deploy1002 Synchronized wmf-config/InitialiseSettings.php: 7865f27430f8eea2975d7154f6009a9206fc75d6: Add unwatchedpages to rollbacker on frwiki (T285334) (duration: 01m 06s) [23:13:29] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [23:13:30] T285334: Give rollbackers the unwatchedpages right on frwiki - https://phabricator.wikimedia.org/T285334 [23:13:31] zabe: done [23:13:38] ebernhardson: I'm done, floor is yours [23:13:44] urbanecm: excellent, thanks [23:13:52] thanks :) [23:14:01] (PS2) Ebernhardson: Enable canary events for search event streams [mediawiki-config] - https://gerrit.wikimedia.org/r/700979 [23:14:57] (CR) Ebernhardson: [C: +2] Enable canary events for search event streams [mediawiki-config] - https://gerrit.wikimedia.org/r/700979 (owner: Ebernhardson) [23:15:48] (Merged) jenkins-bot: Enable canary events for search event streams [mediawiki-config] - https://gerrit.wikimedia.org/r/700979 (owner: Ebernhardson) [23:23:11] !log ebernhardson@deploy1002 Synchronized wmf-config/InitialiseSettings.php: Enable canary events for search event streams (duration: 01m 05s) [23:23:14] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log