[00:53:08] <wikibugs>	 Beta-Cluster-Infrastructure: Update references from labmon.* to cloudmetrics.* on deployment-prep (beta cluster) - https://phabricator.wikimedia.org/T241462 (Krenair) >>! In T241462#5763809, @MarcoAurelio wrote: > Could not change on deployment-eventgate1-3 as those ain't in Horizon (?)  They don't exist. I...
[00:54:43] <wikibugs>	 Beta-Cluster-Infrastructure: Update references from labmon.* to cloudmetrics.* on deployment-prep (beta cluster) - https://phabricator.wikimedia.org/T241462 (Krenair) >>! In T241462#5763875, @MarcoAurelio wrote: > Not sure where to look at now. Shall I restart cpjobqueue from Horizon?  No need to reboot the...
[06:11:14] <wikibugs>	 Release-Engineering-Team, serviceops, wikitech.wikimedia.org, cloud-services-team (Kanban): Test Wikitech is still running wmf.8 (should be on wmf.11) - https://phabricator.wikimedia.org/T241251 (Andrew) It should be fine to upgrade and add to the dsh group now.
[08:47:32] <wikibugs>	 Continuous-Integration-Config, Wikidata, Documentation, User-Addshore: Investigate auto generating LUA docs with doxygen - https://phabricator.wikimedia.org/T241478 (Addshore)
[09:43:26] <wikibugs>	 Beta-Cluster-Infrastructure: Steward access on Beta Cluster for Tulsi Bhagat - https://phabricator.wikimedia.org/T241466 (Tulsi_Bhagat) @DannyS712 Yes, I have [[https://commons.wikimedia.beta.wmflabs.org/wiki/Special:Log?type=rights&user=&page=Tulsi+Bhagat&wpdate=&tagfilter=&subtype=|got it yesterday]] only....
[10:38:25] <wikibugs>	 Continuous-Integration-Config, Wikidata, doxygen, Documentation, User-Addshore: Investigate auto generating LUA docs with doxygen - https://phabricator.wikimedia.org/T241478 (Addshore)
[10:39:04] <wikibugs>	 Continuous-Integration-Config, Wikidata, doxygen, Documentation, User-Addshore: Investigate auto generating LUA docs with doxygen - https://phabricator.wikimedia.org/T241478 (Addshore)
[10:49:27] <shinken-wm>	 PROBLEM - Host deployment-aqs03 is DOWN: CRITICAL - Host Unreachable (172.16.1.50)
[10:54:26] <shinken-wm>	 RECOVERY - Host deployment-aqs03 is UP: PING OK - Packet loss = 0%, RTA = 1.50 ms
[10:55:48] <wikibugs>	 Beta-Cluster-Infrastructure: Update references from labmon.* to cloudmetrics.* on deployment-prep (beta cluster) - https://phabricator.wikimedia.org/T241462 (MarcoAurelio) >>! In T241462#5764138, @Krenair wrote: >>>! In T241462#5763875, @MarcoAurelio wrote: >> Not sure where to look at now. Shall I restart c...
[10:58:42] <wikibugs>	 Gerrit: `mediawiki-replication` Gerrit group is hidden - https://phabricator.wikimedia.org/T241461 (MarcoAurelio) p:Triage→Low
[11:16:12] <wikibugs>	 Beta-Cluster-Infrastructure: deployment-logstash03: UDP listener died EADDRINUSE - https://phabricator.wikimedia.org/T241481 (MarcoAurelio)
[11:16:54] <wikibugs>	 Beta-Cluster-Infrastructure: deployment-logstash03: UDP listener died EADDRINUSE - https://phabricator.wikimedia.org/T241481 (MarcoAurelio)
[11:19:24] <shinken-wm>	 PROBLEM - Host deployment-aqs03 is DOWN: CRITICAL - Host Unreachable (172.16.1.50)
[11:25:52] <shinken-wm>	 RECOVERY - Host deployment-aqs03 is UP: PING OK - Packet loss = 0%, RTA = 1.64 ms
[11:45:34] <wikibugs>	 (CR) N3rsti: "Can you merge it? I still can't use recheck command" [integration/config] - https://gerrit.wikimedia.org/r/560049 (https://phabricator.wikimedia.org/T235286) (owner: N3rsti)
[11:54:17] <wikibugs>	 (CR) DannyS712: [C: +1] "> Can you merge it? I still can't use recheck command" [integration/config] - https://gerrit.wikimedia.org/r/560049 (https://phabricator.wikimedia.org/T235286) (owner: N3rsti)
[11:54:48] <wikibugs>	 (CR) DannyS712: [C: +1] Add Minhducsun2002 [GCI participant] to the CI whitelist [integration/config] - https://gerrit.wikimedia.org/r/560078 (owner: Minhducsun2002)
[11:55:24] <wikibugs>	 (CR) DannyS712: [C: +1] jjb: use team list instead of Antoine's e-mail [integration/config] - https://gerrit.wikimedia.org/r/559821 (owner: Hashar)
[13:39:19] <wikibugs>	 Beta-Cluster-Infrastructure, Discovery-Search, Elasticsearch: [_field_stats] endpoint is deprecated! Use [_field_caps] instead or run a min/max aggregations on the desired fields. - https://phabricator.wikimedia.org/T241485 (MarcoAurelio)
[13:48:47] <wikibugs>	 Beta-Cluster-Infrastructure: Update references from labmon.* to cloudmetrics.* on deployment-prep (beta cluster) - https://phabricator.wikimedia.org/T241462 (MarcoAurelio) On advice of @Andrew I've rebooted via Horizon `deployment-cpjobqueue`. It looks like, after the reboot, deployment-cpjobqueue is no long...
[14:00:45] <wikibugs>	 (PS1) Daimona Eaytoy: layout: [ImageRating] Run phan [integration/config] - https://gerrit.wikimedia.org/r/560845
[14:04:45] <wikibugs>	 Beta-Cluster-Infrastructure, CirrusSearch, Discovery-Search: deployment-mediawiki-07: Search backend error during {queryType} search for '{query}' after {tookMs}: {error_message} - https://phabricator.wikimedia.org/T241487 (MarcoAurelio)
[14:11:22] <wikibugs>	 Beta-Cluster-Infrastructure: Update references from labmon.* to cloudmetrics.* on deployment-prep (beta cluster) - https://phabricator.wikimedia.org/T241462 (MarcoAurelio) Hmm, this spawns each ten minutes:  `lines=10 {   "_index": "logstash-2019.12.27",   "_type": "changeprop",   "_id": "AW9HrbB0pvLXCyw4pFS...
[14:20:53] <wikibugs>	 Beta-Cluster-Infrastructure: Update references from labmon.* to cloudmetrics.* on deployment-prep (beta cluster) - https://phabricator.wikimedia.org/T241462 (Krenair) >>! In T241462#5764469, @MarcoAurelio wrote: > Hmm, this spawns each ten minutes: >  > `lines=10 > { >   "_index": "logstash-2019.12.27", >...
[14:25:16] <wikibugs>	 Beta-Cluster-Infrastructure: Steward access on Beta Cluster for Tulsi Bhagat - https://phabricator.wikimedia.org/T241466 (MarcoAurelio) > Coming year, I wanna be more limitless and want to do as more as I can. Before that I would like to test functions which I haven't yet access it on original wikis. For ins...
[14:25:23] <wikibugs>	 Beta-Cluster-Infrastructure: Update references from labmon.* to cloudmetrics.* on deployment-prep (beta cluster) - https://phabricator.wikimedia.org/T241462 (Krenair) Based on running `grep labmon /var/log/syslog | tail` through cumin, these hosts still have errors connecting to labmon: deployment-parsoid09,...
[14:33:23] <wikibugs>	 Beta-Cluster-Infrastructure: Update references from labmon.* to cloudmetrics.* on deployment-prep (beta cluster) - https://phabricator.wikimedia.org/T241462 (Krenair) Restarted parsoid on -parsoid09, eventstreams on -sca02, mediawiki-services-cxserver on -docker-cxserver01 (which for some reason hasn't come...
[14:43:52] <wikibugs>	 Beta-Cluster-Infrastructure: Update references from labmon.* to cloudmetrics.* on deployment-prep (beta cluster) - https://phabricator.wikimedia.org/T241462 (Krenair) `journalctl -xu mediawiki-services-cxserver.service` has shown the reason for the container on -docker-cxserver01 not coming back up: `Dec 27...
[15:16:01] <wikibugs>	 Beta-Cluster-Infrastructure: Steward access on Beta Cluster for Tulsi Bhagat - https://phabricator.wikimedia.org/T241466 (Tulsi_Bhagat) >>! In T241466#5764475, @MarcoAurelio wrote: >> Coming year, I wanna be more limitless and want to do as more as I can. Before that I would like to test functions which I ha...
[17:00:27] <shinken-wm>	 PROBLEM - Host deployment-cpjobqueue is DOWN: CRITICAL - Host Unreachable (172.16.4.124)
[17:00:50] <shinken-wm>	 PROBLEM - Host deployment-memc07 is DOWN: CRITICAL - Host Unreachable (172.16.5.2)
[17:00:53] <shinken-wm>	 PROBLEM - Host deployment-kafka-jumbo-1 is DOWN: CRITICAL - Host Unreachable (172.16.5.4)
[17:01:17] <shinken-wm>	 PROBLEM - Host deployment-mediawiki-07 is DOWN: CRITICAL - Host Unreachable (172.16.4.119)
[17:01:22] <shinken-wm>	 PROBLEM - Host deployment-restbase02 is DOWN: CRITICAL - Host Unreachable (172.16.5.82)
[17:02:06] <shinken-wm>	 PROBLEM - Host deployment-elastic06 is DOWN: CRITICAL - Host Unreachable (172.16.5.131)
[17:02:36] <shinken-wm>	 PROBLEM - Host deployment-cache-text05 is DOWN: CRITICAL - Host Unreachable (172.16.4.21)
[17:02:49] <shinken-wm>	 PROBLEM - Host deployment-puppetmaster03 is DOWN: CRITICAL - Host Unreachable (172.16.4.91)
[17:02:54] <shinken-wm>	 PROBLEM - Host deployment-changeprop is DOWN: CRITICAL - Host Unreachable (172.16.5.21)
[17:03:13] <shinken-wm>	 PROBLEM - Host deployment-puppetdb02 is DOWN: CRITICAL - Host Unreachable (172.16.4.104)
[17:03:24] <shinken-wm>	 PROBLEM - Host deployment-eventlog05 is DOWN: CRITICAL - Host Unreachable (172.16.4.128)
[17:05:29] <shinken-wm>	 PROBLEM - Host deployment-imagescaler01 is DOWN: CRITICAL - Host Unreachable (172.16.5.80)
[17:05:42] <shinken-wm>	 PROBLEM - Host deployment-chromium01 is DOWN: CRITICAL - Host Unreachable (172.16.4.108)
[17:05:58] <shinken-wm>	 PROBLEM - Host Generic Beta Cluster is DOWN: CRITICAL - Host Unreachable (en.wikipedia.beta.wmflabs.org)
[17:06:29] <shinken-wm>	 PROBLEM - Host Generic Beta Cluster is DOWN: CRITICAL - Host Unreachable (en.wikipedia.beta.wmflabs.org)
[17:07:11] <wmf-insecte>	 Project beta-scap-eqiad build #281195: FAILURE in 2 min 40 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/281195/
[17:11:29] <shinken-wm>	 RECOVERY - Host Generic Beta Cluster is UP: PING OK - Packet loss = 0%, RTA = 1.01 ms
[17:11:40] <Krenair>	 ^ cloudvirt1014
[17:11:43] <shinken-wm>	 RECOVERY - Host deployment-puppetmaster03 is UP: PING OK - Packet loss = 0%, RTA = 1.05 ms
[17:12:05] <shinken-wm>	 RECOVERY - Host deployment-chromium01 is UP: PING OK - Packet loss = 0%, RTA = 2.03 ms
[17:12:08] <shinken-wm>	 RECOVERY - Host deployment-elastic06 is UP: PING OK - Packet loss = 0%, RTA = 0.77 ms
[17:12:11] <shinken-wm>	 RECOVERY - Host deployment-mediawiki-07 is UP: PING OK - Packet loss = 0%, RTA = 0.78 ms
[17:12:11] <shinken-wm>	 RECOVERY - Host deployment-imagescaler01 is UP: PING OK - Packet loss = 0%, RTA = 0.82 ms
[17:12:36] <shinken-wm>	 RECOVERY - Host deployment-memc07 is UP: PING OK - Packet loss = 0%, RTA = 0.91 ms
[17:12:37] <shinken-wm>	 RECOVERY - Host deployment-cache-text05 is UP: PING OK - Packet loss = 0%, RTA = 0.53 ms
[17:12:52] <shinken-wm>	 RECOVERY - Host deployment-puppetdb02 is UP: PING OK - Packet loss = 0%, RTA = 0.73 ms
[17:12:56] <shinken-wm>	 RECOVERY - Host deployment-changeprop is UP: PING OK - Packet loss = 0%, RTA = 1.17 ms
[17:13:23] <shinken-wm>	 RECOVERY - Host deployment-kafka-jumbo-1 is UP: PING OK - Packet loss = 0%, RTA = 0.65 ms
[17:13:26] <shinken-wm>	 RECOVERY - Host deployment-eventlog05 is UP: PING OK - Packet loss = 0%, RTA = 0.69 ms
[17:13:47] <Krenair>	 (see -cloud)
[17:14:17] <shinken-wm>	 RECOVERY - Host deployment-restbase02 is UP: PING OK - Packet loss = 0%, RTA = 0.80 ms
[17:14:30] <shinken-wm>	 RECOVERY - Host deployment-cpjobqueue is UP: PING OK - Packet loss = 0%, RTA = 0.82 ms
[17:16:01] <shinken-wm>	 RECOVERY - Host Generic Beta Cluster is UP: PING OK - Packet loss = 0%, RTA = 0.48 ms
[17:16:54] <wmf-insecte>	 Yippee, build fixed!
[17:16:55] <wmf-insecte>	 Project beta-scap-eqiad build #281196: FIXED in 2 min 22 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/281196/
[17:42:01] <Urbanecm>	 Hi everyone, I know it's xmas season, but...could someone do https://gerrit.wikimedia.org/r/c/integration/config/+/560049, please, to help a Google Code-In student? Thanks!
[17:44:03] <Krenair>	 I'm looking at why beta is still down
[17:47:52] <shinken-wm>	 PROBLEM - Parsoid on deployment-mediawiki-parsoid10 is CRITICAL: connect to address 172.16.0.141 and port 8000: Connection refused
[17:47:52] <shinken-wm>	 PROBLEM - Parsoid on deployment-parsoid09 is CRITICAL: connect to address 172.16.5.63 and port 8000: Connection refused
[17:48:03] <Krenair>	 nginx on -cache-text05 won't start for some reason
[17:49:57] <Krenair>	 looks like update-ocsp-all is breaking when it starts due to a read-only filesystem error
[17:50:17] <Krenair>	 which wouldn't be too unheard of given the hypervisor it's on just crashed and rebooted, except actually both filesystems are writable
[17:51:10] <Krenair>	 I wonder if it's due to something in /etc/systemd/system/nginx.service.d/security.conf
[17:59:20] <Krenair>	 <Krenair> I fiddled around with some of the scripts on that deployment-cache-text instance and got beta back up and running
[21:07:10] <wikibugs>	 Beta-Cluster-Infrastructure: Steward access on Beta Cluster for Tulsi Bhagat - https://phabricator.wikimedia.org/T241466 (Masumrezarock100) + 1 to MarkAurelio. I don't see a valid need for this. Steward access is extremely powerful and sensitive even on beta cluster! You are not a developer nor I am convince...
[21:09:51] <hauskatze>	 twentyafterfour: hi! You online?