[02:15:48] PROBLEM - Host deployment-redis02 is DOWN: CRITICAL - Host Unreachable (10.68.16.231)
[02:15:52] PROBLEM - Host deployment-redis01 is DOWN: CRITICAL - Host Unreachable (10.68.16.177)
[02:18:02] PROBLEM - Host deployment-dumps-puppetmaster is DOWN: CRITICAL - Host Unreachable (10.68.21.153)
[02:24:21] PROBLEM - Host deployment-puppetmaster02 is DOWN: CRITICAL - Host Unreachable (10.68.21.200)
[07:42:39] Continuous-Integration-Infrastructure, QuickSurveys, WikimediaMessages, User-Zoranzoki21: Failing quibble-vendor-mysql-hhvm-docker in WikimediaMessages repository (again) - https://phabricator.wikimedia.org/T198000#4309001 (Zoranzoki21)
[07:42:53] Continuous-Integration-Infrastructure, QuickSurveys, WikimediaMessages, User-Zoranzoki21: Failing quibble-vendor-mysql-hhvm-docker in WikimediaMessages repository (again) - https://phabricator.wikimedia.org/T198000#4309012 (Zoranzoki21)
[07:42:56] Continuous-Integration-Infrastructure, WikimediaMessages, Patch-For-Review, User-Zoranzoki21: Failing quibble-vendor-mysql-hhvm-docker in WikimediaMessages repository - https://phabricator.wikimedia.org/T195210#4309011 (Zoranzoki21)
[08:21:19] (PS4) MarcoAurelio: Mark repository as read only [extensions/CommunityVoice] (refs/meta/config) - https://gerrit.wikimedia.org/r/441361 (https://phabricator.wikimedia.org/T196618)
[08:23:56] Diffusion, GitHub-Mirrors, Repository-Admins: Mirroring mediawiki/core to GitHub from diffusion does not work - https://phabricator.wikimedia.org/T135494#2300654 (MarcoAurelio) https://github.com/wikimedia/mediawiki/commit/d27a597efa5a926b662df48b058fbf31881b5da7 (link in the description) works now....
[08:35:46] Differential, Pywikibot-core, Repository-Admins, Gerrit-Migration: Migrate Pywikibot to Differential code review - https://phabricator.wikimedia.org/T95526#4309037 (MarcoAurelio) The statement on the description is no longer true.
MediaWiki development will continue on Gerrit after the #gerrit-mi...
[11:37:49] Differential, Pywikibot-core, Repository-Admins, Gerrit-Migration: Migrate Pywikibot to Differential code review - https://phabricator.wikimedia.org/T95526#4309189 (Xqt) Open>declined See above
[12:29:26] Diffusion, GitHub-Mirrors, Repository-Admins: Mirroring mediawiki/core to GitHub from diffusion does not work - https://phabricator.wikimedia.org/T135494#4309221 (Paladox) Open>Resolved a:Paladox Upstream fixed this issue I think a week ago so it can push all refs at once instead of in ba...
[12:29:38] Diffusion, GitHub-Mirrors, Repository-Admins: Mirroring mediawiki/core to GitHub from diffusion does not work - https://phabricator.wikimedia.org/T135494#4309224 (Paladox) a:Paladox>None
[16:09:38] PROBLEM - Long lived cherry-picks on puppetmaster on deployment-puppetmaster03 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[17:58:22] PROBLEM - English Wikipedia Main page on beta-cluster is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 Service Unavailable - string 'Wikipedia' not found on 'https://en.wikipedia.beta.wmflabs.org:443/wiki/Main_Page?debug=true' - 1975 bytes in 0.022 second response time
[17:58:38] PROBLEM - App Server Main HTTP Response on deployment-mediawiki-07 is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 Service Unavailable - string 'Wikipedia' not found on 'http://en.wikipedia.beta.wmflabs.org:80/wiki/Main_Page?debug=true' - 1343 bytes in 0.004 second response time
[17:59:15] PROBLEM - English Wikipedia Mobile Main page on beta-cluster is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 Service Unavailable - string 'Wikipedia' not found on 'https://en.m.wikipedia.beta.wmflabs.org:443/wiki/Main_Page?debug=true' - 1975 bytes in 0.030 second response time
[17:59:45] "Service Temporarily Unavailable"
[17:59:54] Krenair ^^
[18:14:34] PROBLEM - Puppet errors on deployment-deploy01 is CRITICAL: CRITICAL: 100.00% of data above the
critical threshold [0.0]
[19:08:33] what
[19:09:40] the apaches all say en.wikipedia.beta.wmflabs.org is not a configured domain basically
[19:10:32] which is interesting considering /etc/apache2/sites-enabled/03-wikipedia.conf has ServerAlias *.wikipedia.beta.wmflabs.org
[19:14:12] apache service shows problems connecting to :9000
[19:18:17] 9000 is hhvm
[19:19:18] RECOVERY - English Wikipedia Mobile Main page on beta-cluster is OK: HTTP OK: HTTP/1.1 200 OK - 36107 bytes in 2.175 second response time
[19:19:28] !log restarted hhvm on -mediawiki-07 then apache2 to bring beta back up
[19:19:30] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[19:19:52] for some reason hhvm had stopped listening, so apache had stopped considering the site valid or something?
[19:23:26] RECOVERY - English Wikipedia Main page on beta-cluster is OK: HTTP OK: HTTP/1.1 200 OK - 47492 bytes in 2.402 second response time
[19:23:36] RECOVERY - App Server Main HTTP Response on deployment-mediawiki-07 is OK: HTTP OK: HTTP/1.1 200 OK - 46947 bytes in 0.939 second response time
[19:25:23] interesting
[19:25:28] why does curl still fail
[19:25:36] despite varnish apparently being happy
[19:46:24] oh it's because lo vs. eth0
[19:46:30] I was curling against http://localhost
[19:46:51] instead of http://$(hostname)
[19:47:24] still, apache listens on *, the VirtualHost is for *:80
[19:47:28] why is it picky about that
[19:48:52] also, why is -mediawiki-07 the only varnish backend?
[19:50:44] or rather
[19:50:48] the only one it's using?
[19:51:24] PROBLEM - Host deployment-mx is DOWN: CRITICAL - Host Unreachable (10.68.17.78)
[19:51:36] ^ that's supposed to be down
[19:51:48] varnish knows about the others but apparently doesn't want to use them?
if I turn off apache on -07 then varnish starts giving 503s
[19:52:05] they have backend blocks in /etc/varnish/wikimedia-common_text-backend.inc.vcl
[19:54:26] PROBLEM - Puppet errors on deployment-cache-text04 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[19:58:37] looks like this may come from conftool, which appears broken?
[20:01:35] hm maybe we're not pulling from conftool
[20:05:02] Continuous-Integration-Infrastructure, QuickSurveys, WikimediaMessages, User-Zoranzoki21: Failing quibble-vendor-mysql-hhvm-docker in WikimediaMessages repository (again) - https://phabricator.wikimedia.org/T198000#4309001 (hashar)
[20:05:05] Continuous-Integration-Infrastructure, WikimediaMessages, Patch-For-Review, User-Zoranzoki21: Failing quibble-vendor-mysql-hhvm-docker in WikimediaMessages repository - https://phabricator.wikimedia.org/T195210#4309461 (hashar) Resolved>Open Fails again (was T198000) while rebasing https:...
[20:05:09] alright it looks like it's the cache::app_directors hiera key
[20:05:13] appservers:
[20:05:14] backends: {eqiad: deployment-mediawiki-07.deployment-prep.eqiad.wmflabs}
[20:05:20] Continuous-Integration-Infrastructure, QuickSurveys, WikimediaMessages, User-Zoranzoki21: Failing quibble-vendor-mysql-hhvm-docker in WikimediaMessages repository (again) - https://phabricator.wikimedia.org/T198000#4309001 (hashar)
[20:05:22] Continuous-Integration-Infrastructure, WikimediaMessages, Patch-For-Review, User-Zoranzoki21: Failing quibble-vendor-mysql-hhvm-docker in WikimediaMessages repository - https://phabricator.wikimedia.org/T195210#4309467 (hashar)
[20:05:34] no space for lists, am guessing this is due to LVS/conftool use in prod
[20:06:38] Continuous-Integration-Infrastructure, WikimediaMessages, Patch-For-Review, User-Zoranzoki21: WikimediaMessages lacks qqq message documentation for ext-quicksurveys-performance-internal-survey-description -
https://phabricator.wikimedia.org/T195210#4219507 (hashar)
[21:10:36] Continuous-Integration-Infrastructure, QuickSurveys, WikimediaMessages, User-Zoranzoki21: Failing quibble-vendor-mysql-hhvm-docker in WikimediaMessages repository (again) - https://phabricator.wikimedia.org/T198000#4309512 (Krinkle)
[22:08:28] MediaWiki-Codesniffer: Remove [] from optional doc parameters - https://phabricator.wikimedia.org/T198022#4309520 (Umherirrender)
[22:11:01] MediaWiki-Codesniffer: Remove [] from optional doc parameters - https://phabricator.wikimedia.org/T198022#4309532 (Umherirrender) See https://gerrit.wikimedia.org/r/#/c/mediawiki/extensions/ContentTranslation/+/441607/1/includes/CategoriesStorageManager.php for an example
[23:14:58] PROBLEM - Free space - all mounts on integration-slave-docker-1002 is CRITICAL: CRITICAL: integration.integration-slave-docker-1002.diskspace.root.byte_percentfree (<20.00%)
[23:23:52] Release-Engineering-Team (Long-Lived-Branches), Performance-Team: Don't trash cache for front-end resources - https://phabricator.wikimedia.org/T102578#4309586 (Krinkle)
[23:34:58] RECOVERY - Free space - all mounts on integration-slave-docker-1002 is OK: OK: All targets OK