[00:19:19] PROBLEM - Free space - all mounts on deployment-kafka-jumbo-1 is CRITICAL: (Service Check Timed Out)
[00:36:47] PROBLEM - Free space - all mounts on deployment-restbase02 is CRITICAL: (Service Check Timed Out)
[00:41:12] PROBLEM - Free space - all mounts on deployment-aqs02 is CRITICAL: (Service Check Timed Out)
[00:41:38] RECOVERY - Free space - all mounts on deployment-restbase02 is OK: OK: deployment-prep.deployment-restbase02.diskspace._var_log.byte_percentfree (No valid datapoints found)
[00:42:44] MediaWiki-Codesniffer, MediaWiki-General, User-DannyS712: Add global $wgUser to DeprecatedGlobalVariablesSniff - https://phabricator.wikimedia.org/T244452 (DannyS712) Open→Resolved Will go live with the next release
[00:46:05] RECOVERY - Free space - all mounts on deployment-aqs02 is OK: OK: All targets OK
[00:50:29] PROBLEM - Free space - all mounts on deployment-aqs03 is CRITICAL: (Service Check Timed Out)
[00:55:18] RECOVERY - Free space - all mounts on deployment-aqs03 is OK: OK: All targets OK
[01:10:41] PROBLEM - Free space - all mounts on integration-agent-stretch-1001 is CRITICAL: (Service Check Timed Out)
[01:15:28] RECOVERY - Free space - all mounts on integration-agent-stretch-1001 is OK: OK: All targets OK
[01:26:49] RECOVERY - Free space - all mounts on deployment-deploy01 is OK: OK: All targets OK
[01:51:40] PROBLEM - Free space - all mounts on deployment-chromium02 is CRITICAL: (Service Check Timed Out)
[01:56:29] RECOVERY - Free space - all mounts on deployment-chromium02 is OK: OK: All targets OK
[02:14:40] PROBLEM - Free space - all mounts on deployment-schema-2 is CRITICAL: (Service Check Timed Out)
[02:18:34] PROBLEM - Free space - all mounts on deployment-docker-mathoid01 is CRITICAL: (Service Check Timed Out)
[02:19:29] RECOVERY - Free space - all mounts on deployment-schema-2 is OK: OK: All targets OK
[02:23:25] RECOVERY - Free space - all mounts on deployment-docker-mathoid01 is OK: OK: All targets OK
[02:44:32] PROBLEM - Free space - all mounts on deployment-puppetmaster04 is CRITICAL: (Service Check Timed Out)
[02:48:11] PROBLEM - Free space - all mounts on deployment-eventgate-3 is CRITICAL: (Service Check Timed Out)
[02:53:01] RECOVERY - Free space - all mounts on deployment-eventgate-3 is OK: OK: All targets OK
[02:54:22] RECOVERY - Free space - all mounts on deployment-puppetmaster04 is OK: OK: All targets OK
[03:02:43] PROBLEM - Free space - all mounts on integration-agent-docker-1005 is CRITICAL: (Service Check Timed Out)
[03:07:33] RECOVERY - Free space - all mounts on integration-agent-docker-1005 is OK: OK: All targets OK
[03:30:56] PROBLEM - Free space - all mounts on deployment-ms-be05 is CRITICAL: (Service Check Timed Out)
[04:05:23] PROBLEM - Free space - all mounts on deployment-ms-be06 is CRITICAL: (Service Check Timed Out)
[04:10:16] RECOVERY - Free space - all mounts on deployment-ms-be06 is OK: OK: All targets OK
[04:24:39] PROBLEM - Free space - all mounts on integration-agent-stretch-1001 is CRITICAL: (Service Check Timed Out)
[04:29:28] RECOVERY - Free space - all mounts on integration-agent-stretch-1001 is OK: OK: All targets OK
[04:35:03] RECOVERY - Free space - all mounts on deployment-snapshot01 is OK: OK: deployment-prep.deployment-snapshot01.diskspace._data.byte_percentfree (No valid datapoints found)
[05:05:32] PROBLEM - Free space - all mounts on deployment-cumin02 is CRITICAL: (Service Check Timed Out)
[05:10:19] RECOVERY - Free space - all mounts on deployment-cumin02 is OK: OK: All targets OK
[05:29:45] PROBLEM - Free space - all mounts on deployment-poolcounter06 is CRITICAL: (Service Check Timed Out)
[05:34:35] RECOVERY - Free space - all mounts on deployment-poolcounter06 is OK: OK: All targets OK
[05:37:32] PROBLEM - Free space - all mounts on deployment-maps05 is CRITICAL: (Service Check Timed Out)
[05:42:23] RECOVERY - Free space - all mounts on deployment-maps05 is OK: OK: All targets OK
[06:11:23] Gerrit, User-DannyS712: Cindy-the-browser-test-bot should use verified, not code review - https://phabricator.wikimedia.org/T249981 (DannyS712)
[06:15:53] PROBLEM - Free space - all mounts on integration-agent-puppet-docker-1001 is CRITICAL: (Service Check Timed Out)
[06:16:59] PROBLEM - Free space - all mounts on deployment-ms-be05 is CRITICAL: (Service Check Timed Out)
[06:20:45] RECOVERY - Free space - all mounts on integration-agent-puppet-docker-1001 is OK: OK: All targets OK
[06:21:47] RECOVERY - Free space - all mounts on deployment-ms-be05 is OK: OK: All targets OK
[06:23:57] Gerrit, User-DannyS712: Cindy-the-browser-test-bot should use verified, not code review - https://phabricator.wikimedia.org/T249981 (Peachey88) The bot page on MW Wiki was created by @Smalyshev.
[06:26:01] Gerrit, Discovery, User-DannyS712: Cindy-the-browser-test-bot should use verified, not code review - https://phabricator.wikimedia.org/T249981 (Peachey88)
[06:27:37] I've submitted 2 patches in core, 6/8 CI tests failed for both:
[06:27:40] "Exception: Install failed with exit code: 255"
[06:28:03] Traceback (most recent call last):
[06:28:03] 08:03:54 File "/usr/local/bin/quibble", line 11, in <module>
[06:28:03] 08:03:54 load_entry_point('quibble==0.0.41', 'console_scripts', 'quibble')()
[06:28:03] ...
[06:28:43] Not much more info. Is this some outage, or some weird error without an error message?
[06:29:21] * Demian_ Sample: https://integration.wikimedia.org/ci/job/mediawiki-quibble-vendor-mysql-php72-docker/18143/console
[06:40:23] PROBLEM - Free space - all mounts on deployment-hadoop-test-1 is CRITICAL: (Service Check Timed Out)
[06:45:12] RECOVERY - Free space - all mounts on deployment-hadoop-test-1 is OK: OK: All targets OK
[07:10:58] Release-Engineering-Team (Deployment services), Release-Engineering-Team-TODO, Keyholder, Operations, Patch-For-Review: Keyholder phab repo duplicate work - https://phabricator.wikimedia.org/T203003 (faidon) The master branch of operations/software/keyholder is not ready for a release at this...
[07:30:05] Project-Admins, User-DannyS712: Create component #MediaWiki-Core-Hooks - https://phabricator.wikimedia.org/T249170 (Aklapper) Open→Stalled (Setting stalled until last comment is answered)
[07:43:09] PROBLEM - Free space - all mounts on deployment-snapshot01 is CRITICAL: (Service Check Timed Out)
[07:48:00] RECOVERY - Free space - all mounts on deployment-snapshot01 is OK: OK: deployment-prep.deployment-snapshot01.diskspace._data.byte_percentfree (No valid datapoints found)
[07:52:38] PROBLEM - Free space - all mounts on deployment-logstash2 is CRITICAL: CRITICAL: deployment-prep.deployment-logstash2.diskspace._mnt.byte_percentfree (No valid datapoints found) deployment-prep.deployment-logstash2.diskspace._var_lib_elasticsearch.byte_percentfree (No valid datapoints found) deployment-prep.deployment-logstash2.diskspace.root.byte_percentfree (<50.00%)
[07:55:07] PROBLEM - Free space - all mounts on deployment-jobrunner03 is CRITICAL: (Service Check Timed Out)
[07:55:28] Project-Admins, User-DannyS712: Create component #MediaWiki-Core-Hooks - https://phabricator.wikimedia.org/T249170 (DannyS712) >>! In T249170#6021195, @Krinkle wrote: > The new hooks system is being created by Core Platform Team, I assume they are its code stewards as well. I didn't notice until now that...
[07:59:56] RECOVERY - Free space - all mounts on deployment-jobrunner03 is OK: OK: All targets OK
[08:10:48] PROBLEM - Free space - all mounts on integration-agent-pkgbuilder-1002 is CRITICAL: (Service Check Timed Out)
[08:15:37] RECOVERY - Free space - all mounts on integration-agent-pkgbuilder-1002 is OK: OK: All targets OK
[08:48:51] Phabricator, Project-Admins, WMF-Communications: Archive #CommRel-Design and related Phabricator Form? - https://phabricator.wikimedia.org/T246853 (Aklapper) Stalled→Resolved a:Aklapper >>! In T246853#6047957, @hdothiduc wrote: > OK, I have resolved/marked invalid/declined the tasks that...
[09:11:25] Project-Admins, User-DannyS712: Create component #MediaWiki-Core-Hooks - https://phabricator.wikimedia.org/T249170 (daniel) >>! In T249170#6020915, @Aklapper wrote: > Are stakeholders / hook code stewards fine with this and do they plan to use this project tag? I'm in favor of having such a tag.
[09:26:05] Continuous-Integration-Infrastructure: Stop using integration/composer and then archive the repo - https://phabricator.wikimedia.org/T249949 (hashar) modules/profile/manifests/releases/mediawiki.pp: class { '::contint::composer': } That is for the releases Jenkins which we should overhaul and migrate...
[10:02:33] PROBLEM - Free space - all mounts on deployment-cumin02 is CRITICAL: (Service Check Timed Out)
[10:07:21] RECOVERY - Free space - all mounts on deployment-cumin02 is OK: OK: All targets OK
[10:12:02] (Abandoned) Awight: [WIP] Sugary decorators [integration/quibble] - https://gerrit.wikimedia.org/r/587889 (owner: Awight)
[10:20:01] Project-Admins, User-DannyS712: Create component #MediaWiki-Core-Hooks - https://phabricator.wikimedia.org/T249170 (DannyS712) Stalled→Open a:DannyS712
[10:21:51] Gerrit, Discovery, User-DannyS712: Cindy-the-browser-test-bot should use verified, not code review - https://phabricator.wikimedia.org/T249981 (Aklapper) I found https://www.mediawiki.org/wiki/Wikimedia_Discovery/BrowserBot and https://wikitech.wikimedia.org/wiki/Cindy_The_Browser_Test_Bot but am sti...
[10:26:20] (PS6) Awight: [WIP] Split npm and composer test commands [integration/quibble] - https://gerrit.wikimedia.org/r/587888
[10:26:22] (PS6) Awight: [DNM] Clean-up: remove old parallel_run [integration/quibble] - https://gerrit.wikimedia.org/r/587890
[10:26:24] (PS1) Awight: Provide GitClean as a command [integration/quibble] - https://gerrit.wikimedia.org/r/588082
[10:26:26] (PS1) Awight: [WIP] Sequence of commands as a command [integration/quibble] - https://gerrit.wikimedia.org/r/588083
[10:27:52] (CR) jerkins-bot: [V: -1] [WIP] Split npm and composer test commands [integration/quibble] - https://gerrit.wikimedia.org/r/587888 (owner: Awight)
[10:28:01] (CR) jerkins-bot: [V: -1] [DNM] Clean-up: remove old parallel_run [integration/quibble] - https://gerrit.wikimedia.org/r/587890 (owner: Awight)
[10:28:06] (CR) jerkins-bot: [V: -1] Provide GitClean as a command [integration/quibble] - https://gerrit.wikimedia.org/r/588082 (owner: Awight)
[10:28:14] (CR) jerkins-bot: [V: -1] [WIP] Sequence of commands as a command [integration/quibble] - https://gerrit.wikimedia.org/r/588083 (owner: Awight)
[10:36:15] PROBLEM - Free space - all mounts on deployment-ircd is CRITICAL: (Service Check Timed Out)
[10:41:04] RECOVERY - Free space - all mounts on deployment-ircd is OK: OK: All targets OK
[10:44:12] Project-Admins, User-DannyS712: Create component #MediaWiki-Core-Hooks - https://phabricator.wikimedia.org/T249170 (DannyS712) Open→Resolved #mediawiki-core-hooks created
[11:09:57] PROBLEM - Free space - all mounts on deployment-fluorine02 is CRITICAL: (Service Check Timed Out)
[11:14:48] RECOVERY - Free space - all mounts on deployment-fluorine02 is OK: OK: All targets OK
[11:21:44] Project-Admins, User-DannyS712: Create #mediawiki-global-scope - https://phabricator.wikimedia.org/T249995 (DannyS712)
[11:28:42] Project-Admins, User-DannyS712: Create #mediawiki-namespaces - https://phabricator.wikimedia.org/T249998 (DannyS712)
[11:31:54] Project-Admins, User-DannyS712: Cleaning up #mediawiki-general - https://phabricator.wikimedia.org/T249999 (DannyS712)
[11:32:53] Project-Admins, User-DannyS712: Cleaning up #mediawiki-general - https://phabricator.wikimedia.org/T249999 (DannyS712) p:Triage→Low Currently in process (have been proposed): {T249995}, {T249998}
[11:47:42] PROBLEM - Free space - all mounts on integration-agent-docker-1005 is CRITICAL: (Service Check Timed Out)
[11:52:33] RECOVERY - Free space - all mounts on integration-agent-docker-1005 is OK: OK: All targets OK
[12:03:09] PROBLEM - English Wikipedia Main page on beta-cluster is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[12:03:14] PROBLEM - English Wikipedia Mobile Main page on beta-cluster is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[12:04:54] PROBLEM - App Server Main HTTP Response on deployment-mediawiki-07 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[12:06:25] PROBLEM - Free space - all mounts on deployment-deploy02 is CRITICAL: (Service Check Timed Out)
[12:11:16] RECOVERY - Free space - all mounts on deployment-deploy02 is OK: OK: All targets OK
[12:18:57] (CR) Florianschmidtwelzow: [C: -1] "Thanks for taking on the comments :) I think we're heading in the right direction here." (2 comments) [tools/release] - https://gerrit.wikimedia.org/r/587971 (https://phabricator.wikimedia.org/T249553) (owner: Majavah)
[12:27:09] (CR) Majavah: "Well, either the header or the heading level (amount of =s) is required. I'm not sure which is better." [tools/release] - https://gerrit.wikimedia.org/r/587971 (https://phabricator.wikimedia.org/T249553) (owner: Majavah)
[12:29:59] PROBLEM - Free space - all mounts on deployment-elastic06 is CRITICAL: (Service Check Timed Out)
[12:34:49] RECOVERY - Free space - all mounts on deployment-elastic06 is OK: OK: deployment-prep.deployment-elastic06.diskspace._var_lib_elasticsearch.byte_percentfree (No valid datapoints found) deployment-prep.deployment-elastic06.diskspace._var_log.byte_percentfree (No valid datapoints found)
[12:39:12] PROBLEM - Free space - all mounts on deployment-chromium01 is CRITICAL: (Service Check Timed Out)
[12:39:47] RECOVERY - App Server Main HTTP Response on deployment-mediawiki-07 is OK: HTTP OK: HTTP/1.1 200 OK - 92446 bytes in 0.984 second response time
[12:42:56] RECOVERY - English Wikipedia Main page on beta-cluster is OK: HTTP OK: HTTP/1.1 200 OK - 92751 bytes in 0.954 second response time
[12:43:05] RECOVERY - English Wikipedia Mobile Main page on beta-cluster is OK: HTTP OK: HTTP/1.1 200 OK - 52124 bytes in 0.962 second response time
[12:44:04] RECOVERY - Free space - all mounts on deployment-chromium01 is OK: OK: All targets OK
[12:46:02] PROBLEM - Free space - all mounts on deployment-mwmaint01 is CRITICAL: (Service Check Timed Out)
[13:00:25] PROBLEM - Free space - all mounts on deployment-docker-citoid01 is CRITICAL: (Service Check Timed Out)
[13:04:49] PROBLEM - Free space - all mounts on integration-agent-pkgbuilder-1002 is CRITICAL: (Service Check Timed Out)
[13:09:40] RECOVERY - Free space - all mounts on integration-agent-pkgbuilder-1002 is OK: OK: All targets OK
[13:15:14] RECOVERY - Free space - all mounts on deployment-docker-citoid01 is OK: OK: All targets OK
[13:29:12] PROBLEM - Free space - all mounts on deployment-aqs02 is CRITICAL: (Service Check Timed Out)
[13:34:03] RECOVERY - Free space - all mounts on deployment-aqs02 is OK: OK: All targets OK
[13:34:54] PROBLEM - Free space - all mounts on deployment-db06 is CRITICAL: (Service Check Timed Out)
[13:35:07] PROBLEM - Free space - all mounts on deployment-jobrunner03 is CRITICAL: (Service Check Timed Out)
[13:37:46] PROBLEM - Free space - all mounts on deployment-poolcounter06 is CRITICAL: (Service Check Timed Out)
[13:39:44] RECOVERY - Free space - all mounts on deployment-db06 is OK: OK: All targets OK
[13:39:59] RECOVERY - Free space - all mounts on deployment-jobrunner03 is OK: OK: All targets OK
[13:42:38] RECOVERY - Free space - all mounts on deployment-poolcounter06 is OK: OK: All targets OK
[13:59:38] PROBLEM - Free space - all mounts on deployment-echostore01 is CRITICAL: (Service Check Timed Out)
[14:04:21] PROBLEM - Free space - all mounts on integration-agent-docker-1007 is CRITICAL: (Service Check Timed Out)
[14:04:27] RECOVERY - Free space - all mounts on deployment-echostore01 is OK: OK: All targets OK
[14:07:12] (Abandoned) Awight: Parallelize phpunit-unit -databaseless, and -standalone [integration/quibble] - https://gerrit.wikimedia.org/r/587887 (https://phabricator.wikimedia.org/T235449) (owner: Awight)
[14:07:47] (PS11) Awight: Parallelism as a command object [integration/quibble] - https://gerrit.wikimedia.org/r/587885 (https://phabricator.wikimedia.org/T235449)
[14:07:49] (PS2) Awight: Provide GitClean as a command [integration/quibble] - https://gerrit.wikimedia.org/r/588082
[14:07:51] (PS7) Awight: Split extension and skin npm and composer tests [integration/quibble] - https://gerrit.wikimedia.org/r/587888
[14:07:53] (PS2) Awight: Sequence of commands as a command [integration/quibble] - https://gerrit.wikimedia.org/r/588083
[14:07:55] (PS7) Awight: Clean-up: remove old parallel_run [integration/quibble] - https://gerrit.wikimedia.org/r/587890
[14:07:57] (PS1) Awight: Split core npm and composer tests [integration/quibble] - https://gerrit.wikimedia.org/r/588087
[14:09:11] RECOVERY - Free space - all mounts on integration-agent-docker-1007 is OK: OK: All targets OK
[14:22:28] PROBLEM - Host deployment-cache-upload06 is DOWN: check_ping: Invalid hostname/address
[14:28:13] Gerrit, Code-Review-Workgroup, Discovery, User-DannyS712: Cindy-the-browser-test-bot should use verified, not code review - https://phabricator.wikimedia.org/T249981 (Aklapper)
[14:28:35] PROBLEM - Host deployment-cache-text06 is DOWN: check_ping: Invalid hostname/address
[14:28:47] (PS1) Awight: Rewrite main pipeline as a generator [integration/quibble] - https://gerrit.wikimedia.org/r/588092
[14:28:49] (PS1) Awight: Wrap main pipeline in a Sequence [integration/quibble] - https://gerrit.wikimedia.org/r/588093 (https://phabricator.wikimedia.org/T249775)
[14:38:22] (Abandoned) Awight: Commands expand recursively [integration/quibble] - https://gerrit.wikimedia.org/r/519776 (owner: Awight)
[14:46:26] PROBLEM - Free space - all mounts on deployment-docker-citoid01 is CRITICAL: (Service Check Timed Out)
[14:51:15] RECOVERY - Free space - all mounts on deployment-docker-citoid01 is OK: OK: All targets OK
[14:52:22] !log Migrated from deployment-cache-upload05 (stretch) to deployment-cache-upload06 (buster) - class stopped working on stretch with https://gerrit.wikimedia.org/r/c/operations/puppet/+/584553 - shut down old instance which coincidentally would turn one year old tomorrow
[14:52:23] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[14:52:35] RECOVERY - Host deployment-cache-upload06 is UP: PING OK - Packet loss = 0%, RTA = 1.86 ms
[14:53:32] RECOVERY - Host deployment-cache-text06 is UP: PING OK - Packet loss = 0%, RTA = 1.68 ms
[14:54:17] funny how you can make an instance and put it into actual use before shinken wakes up and realises that it's properly online
[14:56:26] PROBLEM - Host deployment-cache-upload05 is DOWN: CRITICAL - Host Unreachable (172.16.6.210)
[14:57:44] Beta-Cluster-Infrastructure: Migrate deployment-cache* boxes to buster - https://phabricator.wikimedia.org/T250006 (Krenair)
[14:57:51] ^ made a tasks
[14:57:53] task*
[15:00:07] (PS1) Awight: [WIP] Bad "pipeline" abstraction [integration/quibble] - https://gerrit.wikimedia.org/r/588095
[15:01:21] (CR) jerkins-bot: [V: -1] [WIP] Bad "pipeline" abstraction [integration/quibble] - https://gerrit.wikimedia.org/r/588095 (owner: Awight)
[15:02:21] shifting beta main floating IP...
[15:04:40] hm, well, that doesn't work as expected...
[15:05:03] right it helps if you add the security group so the outside world can actually talk to things :)
[15:07:22] !log Migrated from deployment-cache-text05 (stretch) to deployment-cache-text06 (buster) - class stopped working on stretch with https://gerrit.wikimedia.org/r/c/operations/puppet/+/584553 - shut down old instance - T250006
[15:07:24] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[15:07:24] T250006: Migrate deployment-cache* boxes to buster - https://phabricator.wikimedia.org/T250006
[15:08:13] hmmmmmmmmmmmmmm
[15:08:16] PROBLEM - Host Generic Beta Cluster is DOWN: CRITICAL - Host Unreachable (en.wikipedia.beta.wmflabs.org)
[15:08:24] what is its problem now...
[15:11:19] oh, old prefix config pointing at the old hosts. fun
[15:11:23] RECOVERY - Host deployment-cache-upload05 is UP: PING OK - Packet loss = 0%, RTA = 1.04 ms
[15:12:49] take 2...
[15:13:51] PROBLEM - Host Generic Beta Cluster is DOWN: CRITICAL - Host Unreachable (en.wikipedia.beta.wmflabs.org)
[15:13:55] really
[15:14:18] this time it at least looks up to me
[15:15:57] why is beta's shinken config duplicated...
[15:16:26] PROBLEM - Host deployment-cache-upload05 is DOWN: CRITICAL - Host Unreachable (172.16.6.210)
[15:16:41] this is fine
[15:19:03] PROBLEM - Host deployment-cache-text05 is DOWN: CRITICAL - Host Unreachable (172.16.4.21)
[15:19:40] oh I bet I know why shinken is unhappy
[15:19:49] good old labs DNS
[15:19:54] root@shinken-02:/etc/shinken# host en.wikipedia.beta.wmflabs.org
[15:19:54] en.wikipedia.beta.wmflabs.org has address 172.16.4.21
[15:19:54] root@shinken-02:/etc/shinken# host 172.16.4.21
[15:19:54] 21.4.16.172.in-addr.arpa domain name pointer deployment-cache-text05.deployment-prep.eqiad1.wikimedia.cloud.
[15:20:08] externally you'd get the floating IP which points to the right instance... internally the labsaliaser script needs to run
[15:20:30] to update the list of floating IPs -> private IPs for internal usage within cloud vps
[15:20:31] should sort itself
[15:20:59] in approx 10 minutes
[15:21:10] (the thing runs hourly at half past)
[15:24:13] PROBLEM - Free space - all mounts on deployment-chromium01 is CRITICAL: (Service Check Timed Out)
[15:25:16] meanwhile, shinken has duplicate config:
[15:25:25] root@shinken-02:/etc/shinken# ls -lh customconfig/beta*
[15:25:25] -rw-r--r-- 1 shinken shinken 3.4K Apr 15 2019 customconfig/beta.cfg
[15:25:25] -rw-r--r-- 1 shinken shinken 3.4K Dec 11 15:38 customconfig/betacluster-hosts.cfg
[15:25:26] root@shinken-02:/etc/shinken# diff ./customconfig/beta*
[15:25:26] 79a80,81
[15:25:28] >
[15:25:30] >
[15:25:32] root@shinken-02:/etc/shinken#
[15:28:43] PROBLEM - Host Generic Beta Cluster is DOWN: CRITICAL - Host Unreachable (en.wikipedia.beta.wmflabs.org)
[15:40:42] (okay so it's not yet, asking in cloud-admin)
[15:49:56] Beta-Cluster-Infrastructure: Migrate deployment-cache* boxes to buster - https://phabricator.wikimedia.org/T250006 (Krenair) current status: beta is up externally, labsaliaser problems mean it's not accessible internally from within Cloud VPS, but that probably doesn't affect anything other than shinken mon...
[15:58:48] PROBLEM - Free space - all mounts on deployment-aqs01 is CRITICAL: (Service Check Timed Out)
[16:03:39] RECOVERY - Free space - all mounts on deployment-aqs01 is OK: OK: All targets OK
[16:13:53] RECOVERY - Host Generic Beta Cluster is UP: PING OK - Packet loss = 0%, RTA = 0.99 ms
[16:18:47] RECOVERY - Host Generic Beta Cluster is UP: PING OK - Packet loss = 0%, RTA = 1.48 ms
[16:23:39] might flap a bit for a while
[16:32:19] looks fine now
[16:33:07] Beta-Cluster-Infrastructure: Migrate deployment-cache* boxes to buster - https://phabricator.wikimedia.org/T250006 (Krenair) Looks fine now, will delete the old hosts in a week or so.
[17:29:40] PROBLEM - Free space - all mounts on deployment-wikifeeds01 is CRITICAL: (Service Check Timed Out)
[17:34:32] RECOVERY - Free space - all mounts on deployment-wikifeeds01 is OK: OK: All targets OK
[17:45:03] Project-Admins, User-DannyS712: Create component #MediaWiki-Core-Hooks - https://phabricator.wikimedia.org/T249170 (Krinkle) @DannyS712 Please keep the tag focussed on issues with the hook system itself and maybe big problems that cross many different components of MediaWiki. For tasks about hooks in 1...
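Note on the 15:20:59 and 15:21:10 messages: the wait estimate ("in approx 10 minutes" for a job that "runs hourly at half past") is simple clock arithmetic. The sketch below is only an illustration of that arithmetic, assuming nothing more than a job that fires at minute 30 of every hour. The function name `next_half_past` is made up here and is not part of labsaliaser, and the calendar date used is arbitrary.

```python
from datetime import datetime, timedelta


def next_half_past(now: datetime) -> datetime:
    """Return the first moment strictly after `now` that falls on minute 30.

    Hypothetical helper: it models "runs hourly at half past" and nothing else.
    """
    candidate = now.replace(minute=30, second=0, microsecond=0)
    if candidate <= now:
        # This hour's half-past mark has already gone by; use the next hour's.
        candidate += timedelta(hours=1)
    return candidate


# Time of the "in approx 10 minutes" message. The date is arbitrary,
# only the time of day matters for this calculation.
asked = datetime(2020, 1, 1, 15, 20, 59)
run = next_half_past(asked)
print(run.time(), run - asked)  # 15:30:00 0:09:01
```

That gives a wait of about nine minutes, in line with the estimate. The 15:40:42 message suggests the alias was still stale after that run.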
[18:14:52] PROBLEM - English Wikipedia Main page on beta-cluster is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[18:15:13] PROBLEM - English Wikipedia Mobile Main page on beta-cluster is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[18:15:58] PROBLEM - App Server Main HTTP Response on deployment-mediawiki-07 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[18:27:13] PROBLEM - Free space - all mounts on integration-agent-pkgbuilder-1001 is CRITICAL: (Service Check Timed Out)
[18:32:05] RECOVERY - Free space - all mounts on integration-agent-pkgbuilder-1001 is OK: OK: All targets OK
[18:38:19] Continuous-Integration-Config: Unable to find image 'docker-registry.wikimedia.org/releng/quibble-stretch-php73:0.0.41-s1' - https://phabricator.wikimedia.org/T250015 (Reedy)
[18:54:43] RECOVERY - English Wikipedia Main page on beta-cluster is OK: HTTP OK: HTTP/1.1 200 OK - 92751 bytes in 1.010 second response time
[18:55:04] RECOVERY - English Wikipedia Mobile Main page on beta-cluster is OK: HTTP OK: HTTP/1.1 200 OK - 52125 bytes in 1.112 second response time
[18:55:47] RECOVERY - App Server Main HTTP Response on deployment-mediawiki-07 is OK: HTTP OK: HTTP/1.1 200 OK - 92452 bytes in 0.994 second response time
[19:04:34] PROBLEM - Free space - all mounts on integration-agent-docker-1001 is CRITICAL: (Service Check Timed Out)
[19:09:23] RECOVERY - Free space - all mounts on integration-agent-docker-1001 is OK: OK: All targets OK
[19:33:43] Project-Admins, User-DannyS712: Cleaning up #mediawiki-general - https://phabricator.wikimedia.org/T249999 (Krinkle) The grouping of tasks should be based on code that is maintained together as a system. For example, the following are not useful as components to split from "general": * mediawiki javasc...
[19:36:35] Project-Admins, User-DannyS712: Create #mediawiki-global-scope - https://phabricator.wikimedia.org/T249995 (Krinkle) See T249999#6049329 - I do not think this can be a useful component. These all belong to other components already or need their own component inside the code, not just in Phabricator. For...
[19:39:04] PROBLEM - Free space - all mounts on deployment-mwmaint01 is CRITICAL: (Service Check Timed Out)
[19:43:50] RECOVERY - Free space - all mounts on deployment-mwmaint01 is OK: OK: All targets OK
[19:50:47] PROBLEM - Free space - all mounts on deployment-sentry01 is CRITICAL: (Service Check Timed Out)
[19:55:36] RECOVERY - Free space - all mounts on deployment-sentry01 is OK: OK: All targets OK
[20:11:12] PROBLEM - Free space - all mounts on deployment-aqs02 is CRITICAL: (Service Check Timed Out)
[20:16:04] RECOVERY - Free space - all mounts on deployment-aqs02 is OK: OK: All targets OK
[20:28:56] PROBLEM - Free space - all mounts on integration-agent-docker-1006 is CRITICAL: (Service Check Timed Out)
[20:33:46] RECOVERY - Free space - all mounts on integration-agent-docker-1006 is OK: OK: All targets OK
[20:35:41] Continuous-Integration-Config, Code-Health, Readers-Web-Backlog (Kanbanana-2019-20-Q4), Vue.js: Configure ESLint for Vue.js search development ahead of it being done for all repos - https://phabricator.wikimedia.org/T249304 (Niedzielski)
[20:36:10] Continuous-Integration-Config, Code-Health, Readers-Web-Backlog (Kanbanana-2019-20-Q4), Vue.js: Configure ESLint for Vue.js search development ahead of it being done for all repos - https://phabricator.wikimedia.org/T249304 (Niedzielski)
[21:17:55] MediaWiki-Codesniffer, MW-1.35-notes (1.35.0-wmf.19; 2020-02-11), User-DannyS712: assertNull should be used instead of comparing to null - https://phabricator.wikimedia.org/T244279 (Krinkle) Open→Declined >>! In T244279#6029859, @gerritbot wrote: > Change 585973 had a related patch set upload...
[21:17:58] MediaWiki-Codesniffer, MW-1.35-notes (1.35.0-wmf.19; 2020-02-11), User-DannyS712: assertNull should be used instead of comparing to null - https://phabricator.wikimedia.org/T244279 (Krinkle)
[21:18:00] MediaWiki-Codesniffer, MW-1.35-notes (1.35.0-wmf.19; 2020-02-11), User-DannyS712: assertTrue should be used instead of comparing to true - https://phabricator.wikimedia.org/T244552 (Krinkle) Open→Declined >>! From **Gerrit**: > The problem is with assertEqual, which performs loose/weak compari...
[21:18:16] MediaWiki-Codesniffer, User-DannyS712: assertFalse should be used instead of comparing to false - https://phabricator.wikimedia.org/T244553 (Krinkle) Open→Declined >>! From **Gerrit**: > The problem is with assertEqual, which performs loose/weak comparison. assertSame is fine and should not be sni...
[21:19:01] Continuous-Integration-Config, Code-Health, Readers-Web-Backlog (Kanbanana-2019-20-Q4), Vue.js: Configure ESLint for Vue.js search development ahead of it being done for all repos - https://phabricator.wikimedia.org/T249304 (Niedzielski)
[21:22:37] (PS1) Awight: Simplify shell call syntax [integration/quibble] - https://gerrit.wikimedia.org/r/588113
[21:24:52] Continuous-Integration-Config, Code-Health, Readers-Web-Backlog (Kanbanana-2019-20-Q4), Vue.js: Configure ESLint for Vue.js search development ahead of it being done for all repos - https://phabricator.wikimedia.org/T249304 (Niedzielski)
[21:31:00] (CR) Krinkle: [C: +1] Add problematic values 1 and 1.0 to PHPUnitAssertEqualsSniff [tools/codesniffer] - https://gerrit.wikimedia.org/r/576045 (https://phabricator.wikimedia.org/T246662) (owner: Thiemo Kreuz (WMDE))
[22:14:35] (CR) Awight: "Thanks for this provocation! It highlights the important open question of how to define the Quibble's boundary relative to other componen" [integration/quibble] - https://gerrit.wikimedia.org/r/568931 (https://phabricator.wikimedia.org/T234902) (owner: Kosta Harlan)
[22:16:33] (CR) Awight: [C: +1] "Thanks! Leaving it to the pros to merge, so this can be coordinated with job tweaks and a quibble release." [integration/quibble] - https://gerrit.wikimedia.org/r/587896 (owner: C. Scott Ananian)
[22:25:35] PROBLEM - Free space - all mounts on deployment-webperf12 is CRITICAL: (Service Check Timed Out)
[22:26:03] (CR) Awight: "My only hesitation is the same as in Ieb7866026bba : we shouldn't make existing classes more complex, rather the alternative backends shou" [integration/quibble] - https://gerrit.wikimedia.org/r/516729 (https://phabricator.wikimedia.org/T225218) (owner: Kosta Harlan)
[22:30:25] Continuous-Integration-Infrastructure, Quibble, Patch-For-Review: Consider httpd for quibble instead of php built-in server - https://phabricator.wikimedia.org/T225218 (awight) I think this is a killer feature, in combination with {T226869}. The built-in server is single-threaded, so I expect parall...
[22:30:26] RECOVERY - Free space - all mounts on deployment-webperf12 is OK: OK: All targets OK
[22:43:57] Beta-Cluster-Infrastructure: Problems on deployment-hadoop-test-1 - https://phabricator.wikimedia.org/T250021 (Krenair)
[23:22:19] (PS14) Awight: Replace argparse with docopt [integration/quibble] - https://gerrit.wikimedia.org/r/546167
[23:22:58] (CR) jerkins-bot: [V: -1] Replace argparse with docopt [integration/quibble] - https://gerrit.wikimedia.org/r/546167 (owner: Awight)
[23:33:34] PROBLEM - Free space - all mounts on integration-agent-docker-1009 is CRITICAL: (Service Check Timed Out)
[23:38:22] RECOVERY - Free space - all mounts on integration-agent-docker-1009 is OK: OK: All targets OK