[20:47:16] You are unknown to me :)
[20:47:16] @whoami
[20:47:21] I trust: urbanecm!.*@user/urbanecm (admin), .*@user/legoktm (admin), .*@user/urbanecmbackup/x-3733651 (admin),
[20:47:21] @trusted
[20:47:42] Successfully added .*@wikimedia/Martin-Urbanec
[20:47:42] @trustadd .*@wikimedia/Martin-Urbanec admin
[20:47:53] O.o
[20:48:00] Cloaks!?
[20:48:01] You are admin and identified by the name .*@wikimedia/Martin-Urbanec
[20:48:01] @whoami
[20:48:06] much better :)
[20:49:27] PROBLEM - Host labstore1007.mgmt is DOWN: PING CRITICAL - Packet loss = 100%
[20:49:44] SRE, Wikimedia-Logstash, observability, service-runner: Move service-runner to new logging infrastructure - https://phabricator.wikimedia.org/T211125 (Aklapper) @Pchelolo: Hi, all related patches in Gerrit have been merged or abandoned. Is there more to do in this task? Asking as you are set as t...
[20:50:25] SRE, Wikimedia-Logstash, observability, service-runner: Move service-runner to new logging infrastructure - https://phabricator.wikimedia.org/T211125 (Pchelolo) Open→Resolved
[20:50:29] SRE, Wikimedia-Logstash, observability, Patch-For-Review: Deprecate all non-Kafka logstash inputs - https://phabricator.wikimedia.org/T227080 (Pchelolo)
[20:50:32] SRE, Discovery-Search, Elasticsearch, Wikimedia-Logstash, observability: Migrate Elasticsearch from deprecated Gelf logstash input to rsyslog Kafka logging pipeline - https://phabricator.wikimedia.org/T225125 (Pchelolo)
[20:50:36] SRE, Wikimedia-Logstash, observability: Migrate services using deprecated Gelf logstash input to Kafka enabled logging pipeline - https://phabricator.wikimedia.org/T225122 (Pchelolo)
[20:52:22] SRE, Performance-Team (Radar): Automated service restarts for common low-level system services - https://phabricator.wikimedia.org/T135991 (Aklapper)
[21:00:48] SRE, FR-MW-Vagrant, Fundraising-Backlog, MediaWiki-Vagrant: Package XDebug 2.9 for apt.wikimedia.org - https://phabricator.wikimedia.org/T220406 (Aklapper)
[21:03:25] (PS1) Zabe: Restrict changetags to 'autoconfirmed' users on meta [mediawiki-config] - https://gerrit.wikimedia.org/r/694686 (https://phabricator.wikimedia.org/T283625)
[21:04:55] (CR) MarcoAurelio: "This change is ready for review." [software/klaxon] - https://gerrit.wikimedia.org/r/694316 (owner: MarcoAurelio)
[21:07:11] RECOVERY - Host labstore1007.mgmt is UP: PING WARNING - Packet loss = 77%, RTA = 1.63 ms
[21:07:19] SRE, Analytics-Radar, LDAP-Access-Requests, SRE-Access-Requests: Account setup issues for jmixter-ctr - https://phabricator.wikimedia.org/T283250 (jmixter) yeah sorry about that. I think this was a symptom of me being new and not having any idea what I was doing. I think things are resolved now.
[21:10:42] (PS3) Aklapper: Set $wgUploadNavigationUrl for few wikis [mediawiki-config] - https://gerrit.wikimedia.org/r/364121 (https://phabricator.wikimedia.org/T170083) (owner: Framawiki)
[21:13:30] !log razzi@cumin1001 START - Cookbook sre.hadoop.roll-restart-workers
[21:13:30] !log razzi@cumin1001 END (ERROR) - Cookbook sre.hadoop.roll-restart-workers (exit_code=97)
[21:13:32] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[21:13:35] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[21:13:39] !log razzi@cumin1001 START - Cookbook sre.hadoop.roll-restart-masters
[21:13:41] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[21:13:42] !log razzi@cumin1001 END (FAIL) - Cookbook sre.hadoop.roll-restart-masters (exit_code=99)
[21:13:44] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[21:21:01] SRE, Analytics, Discovery, Event-Platform, and 2 others: Avoid accepting Kafka messages with whacky timestamps - https://phabricator.wikimedia.org/T282887 (BPirkle)
[21:34:10] (CR) CDanis: [C: +2] index: Remove optional `?#` on webchat link to reduce potential encoding errors [software/klaxon] - https://gerrit.wikimedia.org/r/694316 (owner: MarcoAurelio)
[21:36:29] (Merged) jenkins-bot: index: Remove optional `?#` on webchat link to reduce potential encoding errors [software/klaxon] - https://gerrit.wikimedia.org/r/694316 (owner: MarcoAurelio)
[21:36:41] PROBLEM - BGP status on cr3-eqsin is CRITICAL: BGP CRITICAL - AS6939/IPv4: Connect - HE https://wikitech.wikimedia.org/wiki/Network_monitoring%23BGP_status
[21:38:15] PROBLEM - Host cloudvirt1040.mgmt is DOWN: PING CRITICAL - Packet loss = 100%
[21:39:19] ACKNOWLEDGEMENT - Host cloudvirt1040.mgmt is DOWN: PING CRITICAL - Packet loss = 100% andrew bogott this host is cursed and will be under repair forever
[21:48:47] (PS1) Razzi: sre.hadoop.roll-restart-masters: use sudo -u hdfs kerberos-run-command [cookbooks] - https://gerrit.wikimedia.org/r/694710
[21:54:36] (CR) Razzi: [C: +2] sre.hadoop.roll-restart-masters: use sudo -u hdfs kerberos-run-command [cookbooks] - https://gerrit.wikimedia.org/r/694710 (owner: Razzi)
[21:58:05] (Merged) jenkins-bot: sre.hadoop.roll-restart-masters: use sudo -u hdfs kerberos-run-command [cookbooks] - https://gerrit.wikimedia.org/r/694710 (owner: Razzi)
[21:58:16] !log razzi@cumin1001 START - Cookbook sre.hadoop.roll-restart-masters
[21:58:18] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[21:58:19] !log razzi@cumin1001 END (FAIL) - Cookbook sre.hadoop.roll-restart-masters (exit_code=99)
[21:58:21] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[22:04:36] !log razzi@cumin1001 START - Cookbook sre.hadoop.roll-restart-masters
[22:04:38] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[22:04:39] !log razzi@cumin1001 END (FAIL) - Cookbook sre.hadoop.roll-restart-masters (exit_code=99)
[22:04:41] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[22:05:14] SRE, ops-eqiad, cloud-services-team (Hardware): cloudvirt1040 primary NIC disconnected - https://phabricator.wikimedia.org/T281399 (Jclark-ctr) opened dell support ticket Service Request Detail: 1060698910 even though it shows connected now will follow up with dell
[22:06:47] RECOVERY - Host cloudvirt1040.mgmt is UP: PING OK - Packet loss = 0%, RTA = 12.04 ms
[22:14:29] (PS1) 20after4: Increase apache URL length limit for Phabricator [puppet] - https://gerrit.wikimedia.org/r/694731 (https://phabricator.wikimedia.org/T281390)
[22:14:45] RECOVERY - BGP status on cr3-eqsin is OK: BGP OK - up: 319, down: 0, shutdown: 0 https://wikitech.wikimedia.org/wiki/Network_monitoring%23BGP_status
[22:21:01] !log razzi@cumin1001 START - Cookbook sre.hadoop.roll-restart-masters
[22:21:01] !log razzi@cumin1001 END (FAIL) - Cookbook sre.hadoop.roll-restart-masters (exit_code=99)
[22:21:04] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[22:21:06] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[22:21:17] !log razzi@cumin1001 START - Cookbook sre.hadoop.roll-restart-masters
[22:21:19] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[22:21:20] !log razzi@cumin1001 END (FAIL) - Cookbook sre.hadoop.roll-restart-masters (exit_code=99)
[22:21:22] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[22:26:25] (PS1) Razzi: sre.hadoop.roll-restart-masters: run hdfs as hdfs and yarn as yarn [cookbooks] - https://gerrit.wikimedia.org/r/694737 (https://phabricator.wikimedia.org/T283067)
[22:26:36] (CR) Razzi: [C: +2] kerberos: require --email_address for create and reset-password [puppet] - https://gerrit.wikimedia.org/r/686766 (https://phabricator.wikimedia.org/T282185) (owner: Razzi)
[22:33:50] (CR) Razzi: [C: +2] sre.hadoop.roll-restart-masters: run hdfs as hdfs and yarn as yarn [cookbooks] - https://gerrit.wikimedia.org/r/694737 (https://phabricator.wikimedia.org/T283067) (owner: Razzi)
[22:37:11] (Merged) jenkins-bot: sre.hadoop.roll-restart-masters: run hdfs as hdfs and yarn as yarn [cookbooks] - https://gerrit.wikimedia.org/r/694737 (https://phabricator.wikimedia.org/T283067) (owner: Razzi)
[22:39:56] !log razzi@cumin1001 START - Cookbook sre.hadoop.roll-restart-masters
[22:39:58] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[22:54:19] (PS1) Cwhite: rsyslog: try to parse the msg field as json before shipping [puppet] - https://gerrit.wikimedia.org/r/694758
[22:56:42] (CR) Cwhite: [C: +2] package_builder: add logstash-plugins build hooks [puppet] - https://gerrit.wikimedia.org/r/693958 (owner: Cwhite)
[22:57:08] (CR) Cwhite: [C: +2] "PCC 👍 https://puppet-compiler.wmflabs.org/compiler1001/29689/" [puppet] - https://gerrit.wikimedia.org/r/693958 (owner: Cwhite)
[22:58:29] (PS4) Cwhite: logstash: add openstack ECS transition config and tests [puppet] - https://gerrit.wikimedia.org/r/689262 (https://phabricator.wikimedia.org/T234565)
[22:59:57] (CR) jerkins-bot: [V: -1] logstash: add openstack ECS transition config and tests [puppet] - https://gerrit.wikimedia.org/r/689262 (https://phabricator.wikimedia.org/T234565) (owner: Cwhite)
[23:00:04] RoanKattouw, Niharika, and Urbanecm: #bothumor I � Unicode. All rise for Evening backport window deploy. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20210525T2300).
[23:00:04] No GERRIT patches in the queue for this window AFAICS.
[23:09:17] (PS5) Cwhite: logstash: add openstack ECS transition config and tests [puppet] - https://gerrit.wikimedia.org/r/689262 (https://phabricator.wikimedia.org/T234565)
[23:09:28] !log razzi@cumin1001 END (PASS) - Cookbook sre.hadoop.roll-restart-masters (exit_code=0)
[23:09:31] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[23:10:21] (PS1) Razzi: sre.hadoop.roll-restart-masters: consistent sleep confirmation [cookbooks] - https://gerrit.wikimedia.org/r/694768
[23:18:51] (CR) Razzi: [C: +2] sre.hadoop.roll-restart-masters: consistent sleep confirmation [cookbooks] - https://gerrit.wikimedia.org/r/694768 (owner: Razzi)
[23:21:35] (Merged) jenkins-bot: sre.hadoop.roll-restart-masters: consistent sleep confirmation [cookbooks] - https://gerrit.wikimedia.org/r/694768 (owner: Razzi)
[23:24:34] SRE, wikimedia-irc-libera: Move SRE-related IRC channels to Libera - https://phabricator.wikimedia.org/T283230 (razzi)
[23:31:28] (CR) Razzi: [C: +2] reportupdater: Rsync logs to HDFS [puppet] - https://gerrit.wikimedia.org/r/692909 (https://phabricator.wikimedia.org/T274880) (owner: Mforns)
[23:53:49] SRE, Analytics-Radar, LDAP-Access-Requests, SRE-Access-Requests: Account setup issues for jmixter-ctr - https://phabricator.wikimedia.org/T283250 (Dzahn) Open→Resolved a:Dzahn @jmixter Cool, great to hear that things work for you now and thanks for confirming. I think the wiki editin...