[00:00:23] hi, quick question... Is there a recommended vagrant role for emulating running as complete a test suite as possible? [00:01:38] AndyRussG: you could try the wikimediaproduction role [00:01:49] it may or may not actually provision for you :) [00:02:59] bd808: ah okok thx! [00:17:54] anyone know whats up with phan-seccheck-docker? I can't seem to merge anything to GeoData: https://integration.wikimedia.org/ci/job/mwext-php70-phan-seccheck-docker/1293/console [00:18:03] it basically doesn't find stuff, reports success, then fails [00:20:23] legoktm: ^ [00:31:25] 10Continuous-Integration-Infrastructure, 10Composer: Upgrade integration/composer to 1.6.5 stable - https://phabricator.wikimedia.org/T125343#4247603 (10Reedy) p:05Normal>03High [00:31:46] 10Continuous-Integration-Infrastructure, 10Composer: Upgrade integration/composer to 1.6.5 stable - https://phabricator.wikimedia.org/T125343#1985019 (10Reedy) [00:37:44] (03PS1) 10Reedy: Update composer to 1.6.5 [integration/composer] - 10https://gerrit.wikimedia.org/r/436705 (https://phabricator.wikimedia.org/T125343) [00:37:59] 10Continuous-Integration-Infrastructure, 10Composer, 10Patch-For-Review: Upgrade integration/composer to 1.6.5 stable - https://phabricator.wikimedia.org/T125343#4247634 (10Reedy) Can we do something about this rather than running an ancient version? [00:38:52] (03Draft2) 10Reedy: Update composer/spdx-licenses from 1.3.0 to 1.4.0 [tools/codesniffer] - 10https://gerrit.wikimedia.org/r/436706 [00:49:58] seems it was a temporary (or intermittent?) issue, re-running has merged [00:50:10] Some hosts get into a mess [01:08:44] 10Release-Engineering-Team (Kanban), 10Scap, 10Operations, 10Patch-For-Review: mwscript rebuildLocalisationCache.php takes 40 minutes on HHVM (rather than ~5 on PHP 5) - https://phabricator.wikimedia.org/T191921#4247675 (10Reedy) ``` reedy@mwmaint1001:~$ time PHP='hhvm -vEval.Jit=1' mwscript rebuildLocalis... 
[01:42:00] RECOVERY - Puppet errors on deployment-tin is OK: OK: Less than 1.00% above the threshold [0.0] [01:43:43] 10Release-Engineering-Team (Kanban), 10Scap, 10Operations, 10Patch-For-Review: mwscript rebuildLocalisationCache.php takes 40 minutes on HHVM (rather than ~5 on PHP 5) - https://phabricator.wikimedia.org/T191921#4247698 (10Reedy) The previous was with an unclean outdir... This one is clean, dunno if it ac... [01:46:32] (03CR) 10Legoktm: [C: 032] Update composer/spdx-licenses from 1.3.0 to 1.4.0 [tools/codesniffer] - 10https://gerrit.wikimedia.org/r/436706 (owner: 10Reedy) [01:47:33] (03Merged) 10jenkins-bot: Update composer/spdx-licenses from 1.3.0 to 1.4.0 [tools/codesniffer] - 10https://gerrit.wikimedia.org/r/436706 (owner: 10Reedy) [01:48:14] (03CR) 10jenkins-bot: Update composer/spdx-licenses from 1.3.0 to 1.4.0 [tools/codesniffer] - 10https://gerrit.wikimedia.org/r/436706 (owner: 10Reedy) [01:53:24] huh [01:53:47] I'll file a task [01:55:26] 10Continuous-Integration-Infrastructure: seccheck-docker job failing on find delete step randomly/intermittently - https://phabricator.wikimedia.org/T196126#4247700 (10Legoktm) [01:55:38] RECOVERY - Puppet errors on deployment-tin is OK: is nice though [02:21:25] 10Beta-Cluster-Infrastructure, 10Performance-Team, 10Patch-For-Review: Set up webperf node in Beta Cluster - https://phabricator.wikimedia.org/T195314#4247730 (10Krinkle) @demon @thcipriani I could use some help debugging a scap issue. The following Puppet classes are used in production on the webperf1001 ho... 
[02:22:11] RECOVERY - Puppet errors on deployment-mira is OK: OK: Less than 1.00% above the threshold [0.0] [02:33:29] (03PS1) 10Legoktm: Drop PHP 5.5 jobs for most libraries [integration/config] - 10https://gerrit.wikimedia.org/r/436716 [02:35:57] (03CR) 10Reedy: [C: 031] "Easily reverted for specific libraries if necessary for other reasons" [integration/config] - 10https://gerrit.wikimedia.org/r/436716 (owner: 10Legoktm) [02:36:10] (03CR) 10Legoktm: [C: 032] Drop PHP 5.5 jobs for most libraries [integration/config] - 10https://gerrit.wikimedia.org/r/436716 (owner: 10Legoktm) [02:37:26] (03Merged) 10jenkins-bot: Drop PHP 5.5 jobs for most libraries [integration/config] - 10https://gerrit.wikimedia.org/r/436716 (owner: 10Legoktm) [02:39:06] !log deployed https://gerrit.wikimedia.org/r/436716 [02:39:08] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [02:40:02] RECOVERY - Puppet errors on deployment-webperf01 is OK: OK: Less than 1.00% above the threshold [0.0] [02:40:22] 10Continuous-Integration-Config: phpunit-patch jobs should not install phpunit-patch-coverage in the project's composer.json - https://phabricator.wikimedia.org/T196128#4247744 (10Legoktm) [02:43:06] (03PS1) 10Legoktm: Use phpunit-patch-coverage 0.0.9 [integration/config] - 10https://gerrit.wikimedia.org/r/436719 [02:46:00] (03CR) 10Legoktm: [C: 032] "INFO:jenkins_jobs.builder:Reconfiguring jenkins job mediawiki-phpunit-coverage-patch" [integration/config] - 10https://gerrit.wikimedia.org/r/436719 (owner: 10Legoktm) [02:47:46] (03Merged) 10jenkins-bot: Use phpunit-patch-coverage 0.0.9 [integration/config] - 10https://gerrit.wikimedia.org/r/436719 (owner: 10Legoktm) [04:21:12] 10Beta-Cluster-Infrastructure, 10User-Jayprakash12345: MediaWiki Version Commit link is Broken in Special:Version at Beta Cluster - https://phabricator.wikimedia.org/T196130#4247818 (10Jayprakash12345) [04:23:50] 10Beta-Cluster-Infrastructure, 10User-Jayprakash12345: MediaWiki Version Commit 
link is Broken in Special:Version at Beta Cluster - https://phabricator.wikimedia.org/T196130#4247832 (10Jayprakash12345) Same on T191165 [05:00:34] PROBLEM - Free space - all mounts on deployment-fluorine02 is CRITICAL: CRITICAL: deployment-prep.deployment-fluorine02.diskspace._srv.byte_percentfree (<40.00%) [06:39:05] (03CR) 10Legoktm: "> Is there a relevant difference between @remarks and @note?" [tools/codesniffer] - 10https://gerrit.wikimedia.org/r/424554 (https://phabricator.wikimedia.org/T182057) (owner: 10Thiemo Kreuz (WMDE)) [06:40:19] (03CR) 10Legoktm: [C: 032] Add FunctionAnnotations checking tags in function comments only [tools/codesniffer] - 10https://gerrit.wikimedia.org/r/424554 (https://phabricator.wikimedia.org/T182057) (owner: 10Thiemo Kreuz (WMDE)) [06:41:16] (03Merged) 10jenkins-bot: Add FunctionAnnotations checking tags in function comments only [tools/codesniffer] - 10https://gerrit.wikimedia.org/r/424554 (https://phabricator.wikimedia.org/T182057) (owner: 10Thiemo Kreuz (WMDE)) [06:42:12] (03CR) 10jenkins-bot: Add FunctionAnnotations checking tags in function comments only [tools/codesniffer] - 10https://gerrit.wikimedia.org/r/424554 (https://phabricator.wikimedia.org/T182057) (owner: 10Thiemo Kreuz (WMDE)) [06:50:32] RECOVERY - Free space - all mounts on deployment-fluorine02 is OK: OK: All targets OK [07:27:03] (03CR) 10Hashar: [C: 04-1] "The Jenkins XUnit / Junit plugins have support to aggregate junit XML files iirc. 
So we can have each tests spurt a result file under log" [integration/quibble] - 10https://gerrit.wikimedia.org/r/426742 (owner: 10Legoktm) [07:27:30] sweeet [07:33:45] (03PS3) 10Hashar: Set LOG_DIR environment variable [integration/quibble] - 10https://gerrit.wikimedia.org/r/436564 [07:34:42] legoktm: yeah I think that would work (multiple junit files) [07:35:00] it is probably easy to verify by crafting a couple junit files and checking what happens hehe [07:35:15] I am going to upgrade quibble again with all the patches I did yesterday [07:35:29] logs were missing, wdio suite did not save screenshots [07:36:06] (03CR) 10Hashar: [C: 032] Always pass os.environ to run commands [integration/quibble] - 10https://gerrit.wikimedia.org/r/436545 (owner: 10Hashar) [07:36:56] (03Merged) 10jenkins-bot: Always pass os.environ to run commands [integration/quibble] - 10https://gerrit.wikimedia.org/r/436545 (owner: 10Hashar) [07:36:58] (03Merged) 10jenkins-bot: Remove dummy test class [integration/quibble] - 10https://gerrit.wikimedia.org/r/436558 (owner: 10Hashar) [07:37:15] (03CR) 10Hashar: [C: 032] "Now I have the debug log from the web served mediawiki being written properly (mw-debug-www.log)." 
[integration/quibble] - 10https://gerrit.wikimedia.org/r/436559 (owner: 10Hashar) [07:37:35] (03CR) 10jenkins-bot: Always pass os.environ to run commands [integration/quibble] - 10https://gerrit.wikimedia.org/r/436545 (owner: 10Hashar) [07:37:53] (03CR) 10Hashar: [C: 032] Set LOG_DIR environment variable [integration/quibble] - 10https://gerrit.wikimedia.org/r/436564 (owner: 10Hashar) [07:37:56] (03Merged) 10jenkins-bot: Spawn DevWebserver with OS environment variables [integration/quibble] - 10https://gerrit.wikimedia.org/r/436559 (owner: 10Hashar) [07:38:01] (03CR) 10jenkins-bot: Remove dummy test class [integration/quibble] - 10https://gerrit.wikimedia.org/r/436558 (owner: 10Hashar) [07:38:27] (03CR) 10jenkins-bot: Spawn DevWebserver with OS environment variables [integration/quibble] - 10https://gerrit.wikimedia.org/r/436559 (owner: 10Hashar) [07:38:32] (03Merged) 10jenkins-bot: Set LOG_DIR environment variable [integration/quibble] - 10https://gerrit.wikimedia.org/r/436564 (owner: 10Hashar) [07:39:01] legoktm: and the webdriver.io suite does generate multiple xml files :] [07:39:02] (03CR) 10jenkins-bot: Set LOG_DIR environment variable [integration/quibble] - 10https://gerrit.wikimedia.org/r/436564 (owner: 10Hashar) [07:55:52] (03PS1) 10Hashar: Expose LOG_DIR in the fixture docroot [integration/quibble] - 10https://gerrit.wikimedia.org/r/436733 [07:58:11] (03PS1) 10Hashar: Always pass os.environ to mediawiki install/update [integration/quibble] - 10https://gerrit.wikimedia.org/r/436735 [08:05:12] (03CR) 10Hashar: [C: 032] Expose LOG_DIR in the fixture docroot [integration/quibble] - 10https://gerrit.wikimedia.org/r/436733 (owner: 10Hashar) [08:05:18] (03CR) 10Hashar: [C: 032] Always pass os.environ to mediawiki install/update [integration/quibble] - 10https://gerrit.wikimedia.org/r/436735 (owner: 10Hashar) [08:05:56] (03Merged) 10jenkins-bot: Expose LOG_DIR in the fixture docroot [integration/quibble] - 10https://gerrit.wikimedia.org/r/436733 (owner: 
10Hashar) [08:06:07] (03Merged) 10jenkins-bot: Always pass os.environ to mediawiki install/update [integration/quibble] - 10https://gerrit.wikimedia.org/r/436735 (owner: 10Hashar) [08:06:25] (03CR) 10jenkins-bot: Expose LOG_DIR in the fixture docroot [integration/quibble] - 10https://gerrit.wikimedia.org/r/436733 (owner: 10Hashar) [08:06:49] (03CR) 10jenkins-bot: Always pass os.environ to mediawiki install/update [integration/quibble] - 10https://gerrit.wikimedia.org/r/436735 (owner: 10Hashar) [08:14:37] (03PS1) 10Hashar: docker: quibble 0.0.16 [integration/config] - 10https://gerrit.wikimedia.org/r/436739 [08:14:54] (03CR) 10Hashar: [C: 032] docker: quibble 0.0.16 [integration/config] - 10https://gerrit.wikimedia.org/r/436739 (owner: 10Hashar) [08:16:33] (03Merged) 10jenkins-bot: docker: quibble 0.0.16 [integration/config] - 10https://gerrit.wikimedia.org/r/436739 (owner: 10Hashar) [08:44:23] (03PS1) 10Hashar: Bump Quibble jobs to 0.0.16 [integration/config] - 10https://gerrit.wikimedia.org/r/436744 [08:45:06] !log Bumping Quibble jobs to 0.0.16 [08:45:08] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [08:58:36] 10Release-Engineering-Team (Kanban), 10Scap, 10Operations, 10Patch-For-Review: mwscript rebuildLocalisationCache.php takes 40 minutes on HHVM (rather than ~5 on PHP 5) - https://phabricator.wikimedia.org/T191921#4248100 (10Nikerabbit) I wonder why are we testing `rebuildLocalisationCache.php` with `--threa... 
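On the 07:34 to 07:39 exchange about whether several JUnit XML files can be aggregated: the suggestion was to craft a couple of JUnit files and see what happens. A minimal Python sketch of that check follows. It is an illustration only, not how the Jenkins plugin works internally; the suite names, file names and both helper functions are hypothetical, and it simply sums the counts that an aggregating reader would be expected to report.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def write_junit(path, suite, passed, failed):
    """Write a small hand-crafted JUnit-style XML file (hypothetical test names)."""
    root = ET.Element(
        "testsuite", name=suite, tests=str(passed + failed), failures=str(failed)
    )
    for i in range(passed):
        ET.SubElement(root, "testcase", classname=suite, name=f"ok_{i}")
    for i in range(failed):
        case = ET.SubElement(root, "testcase", classname=suite, name=f"bad_{i}")
        ET.SubElement(case, "failure", message="crafted failure")
    ET.ElementTree(root).write(str(path), encoding="utf-8", xml_declaration=True)


def aggregate(log_dir):
    """Sum test and failure counts over every *.xml file directly under log_dir."""
    totals = {"files": 0, "tests": 0, "failures": 0}
    for xml_file in sorted(Path(log_dir).glob("*.xml")):
        root = ET.parse(str(xml_file)).getroot()
        # A file may hold a single <testsuite> or a <testsuites> wrapper.
        suites = [root] if root.tag == "testsuite" else list(root.iter("testsuite"))
        for suite in suites:
            cases = suite.findall("testcase")
            totals["tests"] += len(cases)
            totals["failures"] += sum(
                1 for case in cases if case.find("failure") is not None
            )
        totals["files"] += 1
    return totals


if __name__ == "__main__":
    # Craft two result files in a throwaway directory and aggregate them.
    with tempfile.TemporaryDirectory() as log_dir:
        write_junit(Path(log_dir) / "suite_a.xml", "suite_a", passed=2, failed=1)
        write_junit(Path(log_dir) / "suite_b.xml", "suite_b", passed=3, failed=0)
        print(aggregate(log_dir))
```

If the totals match what Jenkins shows after pointing it at the same directory, each test runner can keep writing its own result file under the log directory.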
[09:00:59] 10Continuous-Integration-Infrastructure (shipyard), 10Release-Engineering-Team (Kanban), 10releng-201718-q3, 10Epic, 10Patch-For-Review: [EPIC] Migrate Mediawiki jobs from Nodepool to Docker - https://phabricator.wikimedia.org/T183512#4248103 (10hashar) [09:01:21] (03PS1) 10Hashar: Migrate TranslationNotifications to Quibble [integration/config] - 10https://gerrit.wikimedia.org/r/436746 [09:01:25] 10Beta-Cluster-Infrastructure, 10Analytics, 10Analytics-Kanban, 10Puppet: deployment-eventlog05 puppet error about missing mysql heartbeat.heartbeat table - https://phabricator.wikimedia.org/T191109#4248105 (10elukey) [09:01:33] (03CR) 10Hashar: [C: 032] Migrate TranslationNotifications to Quibble [integration/config] - 10https://gerrit.wikimedia.org/r/436746 (owner: 10Hashar) [09:01:51] 10Beta-Cluster-Infrastructure, 10Analytics, 10Analytics-Kanban, 10Puppet: deployment-eventlog05 puppet error about missing mysql heartbeat.heartbeat table - https://phabricator.wikimedia.org/T191109#4093870 (10elukey) Thanks for the report! I added the following to the heartbeat database, and puppet now ru... 
[09:04:48] (03CR) 10Hashar: [C: 032] Bump Quibble jobs to 0.0.16 [integration/config] - 10https://gerrit.wikimedia.org/r/436744 (owner: 10Hashar) [09:06:30] (03Merged) 10jenkins-bot: Bump Quibble jobs to 0.0.16 [integration/config] - 10https://gerrit.wikimedia.org/r/436744 (owner: 10Hashar) [09:06:32] (03Merged) 10jenkins-bot: Migrate TranslationNotifications to Quibble [integration/config] - 10https://gerrit.wikimedia.org/r/436746 (owner: 10Hashar) [09:14:59] RECOVERY - Puppet errors on deployment-eventlog05 is OK: OK: Less than 1.00% above the threshold [0.0] [10:10:22] (03CR) 10Thiemo Kreuz (WMDE): Add possibility to change allowed prefixes (031 comment) [tools/codesniffer] - 10https://gerrit.wikimedia.org/r/436633 (https://phabricator.wikimedia.org/T191812) (owner: 10MGChecker) [10:12:34] 10MediaWiki-Codesniffer, 10MediaWiki-extensions-Variables, 10Patch-For-Review: Allow configuring MediaWiki.NamingConventions.ValidGlobalName.wgPrefix to allow additional prefixes - https://phabricator.wikimedia.org/T191812#4117477 (10thiemowmde) I do see two possibilities: * The sniff could fall back to a de... [10:49:42] 10Phabricator (Upstream), 10Upstream: Option to Turn Off Status Updates in Phabricator Task-Threads - https://phabricator.wikimedia.org/T195728#4235323 (10Aklapper) >>! In T195728#4241966, @Johnywhy wrote: > They say, above, once you remove yourself, mentions should not re-subscribe you. @Johnywhy: [[ https:... 
[10:50:02] PROBLEM - Puppet errors on deployment-mediawiki06 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [11:00:08] PROBLEM - Puppet errors on deployment-snapshot01 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0] [11:35:07] (03Abandoned) 10Thiemo Kreuz (WMDE): Add checks for invalid annotations [tools/codesniffer] - 10https://gerrit.wikimedia.org/r/420159 (https://phabricator.wikimedia.org/T182057) (owner: 10MaxSem) [11:40:42] 10Beta-Cluster-Infrastructure, 10User-Jayprakash12345: MediaWiki Version Commit link is Broken in Special:Version at Beta Cluster - https://phabricator.wikimedia.org/T196130#4248475 (10EddieGP) [12:00:03] RECOVERY - Puppet errors on deployment-mediawiki06 is OK: OK: Less than 1.00% above the threshold [0.0] [13:18:45] PROBLEM - Puppet errors on deployment-maps03 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0] [13:46:05] (03PS1) 10Hashar: Make MediaWiki honor MW_LOG_DIR [integration/quibble] - 10https://gerrit.wikimedia.org/r/436793 [13:46:07] (03PS1) 10Hashar: Move $wgEnableJavaScriptTest to mediawiki.d [integration/quibble] - 10https://gerrit.wikimedia.org/r/436794 [13:47:31] (03CR) 10Hashar: [C: 032] Make MediaWiki honor MW_LOG_DIR [integration/quibble] - 10https://gerrit.wikimedia.org/r/436793 (owner: 10Hashar) [13:47:34] (03CR) 10Hashar: [C: 032] Move $wgEnableJavaScriptTest to mediawiki.d [integration/quibble] - 10https://gerrit.wikimedia.org/r/436794 (owner: 10Hashar) [13:48:19] (03Merged) 10jenkins-bot: Make MediaWiki honor MW_LOG_DIR [integration/quibble] - 10https://gerrit.wikimedia.org/r/436793 (owner: 10Hashar) [13:48:21] (03Merged) 10jenkins-bot: Move $wgEnableJavaScriptTest to mediawiki.d [integration/quibble] - 10https://gerrit.wikimedia.org/r/436794 (owner: 10Hashar) [13:48:50] (03CR) 10jenkins-bot: Make MediaWiki honor MW_LOG_DIR [integration/quibble] - 10https://gerrit.wikimedia.org/r/436793 (owner: 10Hashar) [13:49:14] (03CR) 10jenkins-bot: Move 
$wgEnableJavaScriptTest to mediawiki.d [integration/quibble] - 10https://gerrit.wikimedia.org/r/436794 (owner: 10Hashar) [13:50:38] (03PS1) 10Hashar: docker: quibble 0.0.17 [integration/config] - 10https://gerrit.wikimedia.org/r/436796 [13:51:47] (03PS1) 10Hashar: Bump Quibble jobs to 0.0.17 [integration/config] - 10https://gerrit.wikimedia.org/r/436797 [13:52:14] !log Tagged Quibble 0.0.17, rebuilding Docker images and bumping jenkins jobs [13:52:16] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [14:23:13] 10Release-Engineering-Team (Kanban), 10Scap, 10Operations, 10Patch-For-Review: mwscript rebuildLocalisationCache.php takes 40 minutes on HHVM (rather than ~5 on PHP 5) - https://phabricator.wikimedia.org/T191921#4248843 (10Reedy) More threads means more I/O contention... Not really unexpected. It's finding... [15:03:37] (03CR) 10Hashar: [C: 032] Bump Quibble jobs to 0.0.17 [integration/config] - 10https://gerrit.wikimedia.org/r/436797 (owner: 10Hashar) [15:04:16] (03PS1) 10Hashar: Archive VectorV2 [integration/config] - 10https://gerrit.wikimedia.org/r/436815 (https://phabricator.wikimedia.org/T196169) [15:04:28] (03CR) 10Hashar: [C: 032] Archive VectorV2 [integration/config] - 10https://gerrit.wikimedia.org/r/436815 (https://phabricator.wikimedia.org/T196169) (owner: 10Hashar) [15:05:15] 10Continuous-Integration-Infrastructure (shipyard), 10Release-Engineering-Team (Kanban), 10releng-201718-q3, 10Epic, 10Patch-For-Review: [EPIC] Migrate Mediawiki jobs from Nodepool to Docker - https://phabricator.wikimedia.org/T183512#4249004 (10hashar) [15:05:17] (03Merged) 10jenkins-bot: Bump Quibble jobs to 0.0.17 [integration/config] - 10https://gerrit.wikimedia.org/r/436797 (owner: 10Hashar) [15:05:39] (03Merged) 10jenkins-bot: Archive VectorV2 [integration/config] - 10https://gerrit.wikimedia.org/r/436815 (https://phabricator.wikimedia.org/T196169) (owner: 10Hashar) [15:20:11] 10Release-Engineering-Team (Watching / External), 
10Operations, 10Patch-For-Review: setup/install/deploy deploy1001 as deployment server - https://phabricator.wikimedia.org/T175288#4249024 (10Dzahn) ``` --- /etc/dsh/group/scap-masters 2018-05-24 14:25:47.608760286 +0000 +deploy1001.eqiad.wmnet .. [deploy10... [15:20:27] (03PS1) 10Hashar: Zuul templates for Quibble with composer [integration/config] - 10https://gerrit.wikimedia.org/r/436817 [15:22:02] (03CR) 10jerkins-bot: [V: 04-1] Zuul templates for Quibble with composer [integration/config] - 10https://gerrit.wikimedia.org/r/436817 (owner: 10Hashar) [15:23:19] (03PS2) 10Hashar: Zuul templates for Quibble with composer [integration/config] - 10https://gerrit.wikimedia.org/r/436817 [15:25:30] (03CR) 10Hashar: [C: 032] Zuul templates for Quibble with composer [integration/config] - 10https://gerrit.wikimedia.org/r/436817 (owner: 10Hashar) [15:28:09] (03Merged) 10jenkins-bot: Zuul templates for Quibble with composer [integration/config] - 10https://gerrit.wikimedia.org/r/436817 (owner: 10Hashar) [15:39:31] (03PS1) 10Hashar: Migrate BlueSpiceSkin to Quibble [integration/config] - 10https://gerrit.wikimedia.org/r/436819 (https://phabricator.wikimedia.org/T183512) [15:39:46] (03CR) 10Hashar: [C: 032] Migrate BlueSpiceSkin to Quibble [integration/config] - 10https://gerrit.wikimedia.org/r/436819 (https://phabricator.wikimedia.org/T183512) (owner: 10Hashar) [15:41:50] (03Merged) 10jenkins-bot: Migrate BlueSpiceSkin to Quibble [integration/config] - 10https://gerrit.wikimedia.org/r/436819 (https://phabricator.wikimedia.org/T183512) (owner: 10Hashar) [15:52:10] 10Release-Engineering-Team (Watching / External), 10Operations, 10Patch-For-Review: setup/install/deploy deploy1001 as deployment server - https://phabricator.wikimedia.org/T175288#4249098 (10Dzahn) 11:41 < mutante> !log root@deploy1001:/srv/mediawiki-staging# find . -uid 996 -exec chown mwdeploy {} \; 11:4... 
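The `find . -uid 996 -exec chown mwdeploy {} \;` command quoted above changes ownership in place. A minimal Python sketch of a dry run that only lists the paths such a command would touch is shown here; the function name `paths_owned_by` is hypothetical, and the uid 996 comes from the log line, not from any knowledge of the host.

```python
import os


def paths_owned_by(root, uid):
    """Yield every path under root (root included) owned by uid, like `find root -uid UID`."""
    if os.lstat(root).st_uid == uid:
        yield root
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            # lstat inspects a symlink itself rather than its target,
            # which matches find's default behaviour.
            if os.lstat(path).st_uid == uid:
                yield path


if __name__ == "__main__":
    # Review this list before running the real chown.
    for path in paths_owned_by(".", 996):
        print(path)
```

Listing first makes it easy to spot unexpected matches before ownership is rewritten.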
[16:03:31] 10Continuous-Integration-Infrastructure (shipyard), 10Release-Engineering-Team (Kanban), 10releng-201718-q3, 10Epic, 10Patch-For-Review: [EPIC] Migrate Mediawiki jobs from Nodepool to Docker - https://phabricator.wikimedia.org/T183512#4249107 (10hashar) [16:04:21] 10Continuous-Integration-Infrastructure (shipyard), 10Release-Engineering-Team (Kanban), 10releng-201718-q3, 10Epic, 10Patch-For-Review: [EPIC] Migrate Mediawiki jobs from Nodepool to Docker - https://phabricator.wikimedia.org/T183512#4180290 (10hashar) [16:09:37] PROBLEM - Long lived cherry-picks on puppetmaster on deployment-puppetmaster03 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0] [16:27:37] (03PS1) 10Hashar: Migrate ParserFun to Quibble with composer [integration/config] - 10https://gerrit.wikimedia.org/r/436824 (https://phabricator.wikimedia.org/T183512) [16:27:48] (03CR) 10Hashar: [C: 032] Migrate ParserFun to Quibble with composer [integration/config] - 10https://gerrit.wikimedia.org/r/436824 (https://phabricator.wikimedia.org/T183512) (owner: 10Hashar) [16:28:14] 10Continuous-Integration-Infrastructure (shipyard), 10Release-Engineering-Team (Kanban), 10releng-201718-q3, 10Epic, 10Patch-For-Review: [EPIC] Migrate Mediawiki jobs from Nodepool to Docker - https://phabricator.wikimedia.org/T183512#4249147 (10hashar) [16:29:11] (03Merged) 10jenkins-bot: Migrate ParserFun to Quibble with composer [integration/config] - 10https://gerrit.wikimedia.org/r/436824 (https://phabricator.wikimedia.org/T183512) (owner: 10Hashar) [16:30:48] (03PS1) 10Hashar: Revert "Move OAuthAuthentication to composer unittests" [integration/config] - 10https://gerrit.wikimedia.org/r/436825 [16:30:52] (03PS2) 10Hashar: Revert "Move OAuthAuthentication to composer unittests" [integration/config] - 10https://gerrit.wikimedia.org/r/436825 [16:31:22] (03CR) 10Hashar: [C: 032] Revert "Move OAuthAuthentication to composer unittests" [integration/config] - 10https://gerrit.wikimedia.org/r/436825 
(owner: 10Hashar) [16:32:16] 10Release-Engineering-Team (Watching / External), 10Operations: setup/install/deploy deploy1001 as deployment server - https://phabricator.wikimedia.org/T175288#4249152 (10Dzahn) [16:32:35] (03Merged) 10jenkins-bot: Revert "Move OAuthAuthentication to composer unittests" [integration/config] - 10https://gerrit.wikimedia.org/r/436825 (owner: 10Hashar) [16:37:30] 10Continuous-Integration-Infrastructure (shipyard), 10Release-Engineering-Team (Kanban), 10releng-201718-q3, 10Epic, 10Patch-For-Review: [EPIC] Migrate Mediawiki jobs from Nodepool to Docker - https://phabricator.wikimedia.org/T183512#4249166 (10hashar) [17:07:52] 10Release-Engineering-Team (Watching / External), 10Operations, 10Patch-For-Review: setup/install/deploy deploy1001 as deployment server - https://phabricator.wikimedia.org/T175288#4066939 (10Dzahn) [17:09:16] 10Continuous-Integration-Infrastructure (shipyard), 10Release-Engineering-Team (Kanban), 10releng-201718-q3, 10Epic, 10Patch-For-Review: [EPIC] Migrate Mediawiki jobs from Nodepool to Docker - https://phabricator.wikimedia.org/T183512#4249237 (10hashar) [17:09:37] (03PS1) 10Hashar: Migrate 4 extensions to Quibble with composer [integration/config] - 10https://gerrit.wikimedia.org/r/436834 (https://phabricator.wikimedia.org/T183512) [17:09:50] (03CR) 10Hashar: [C: 032] Migrate 4 extensions to Quibble with composer [integration/config] - 10https://gerrit.wikimedia.org/r/436834 (https://phabricator.wikimedia.org/T183512) (owner: 10Hashar) [17:11:39] 10Release-Engineering-Team (Watching / External), 10Operations, 10Patch-For-Review: setup/install/deploy deploy1001 as deployment server - https://phabricator.wikimedia.org/T175288#4249258 (10Dzahn) [17:11:59] (03Merged) 10jenkins-bot: Migrate 4 extensions to Quibble with composer [integration/config] - 10https://gerrit.wikimedia.org/r/436834 (https://phabricator.wikimedia.org/T183512) (owner: 10Hashar) [17:13:20] 10Continuous-Integration-Config, 
10Continuous-Integration-Infrastructure (shipyard), 10BlueSpice, 10Patch-For-Review: Enable unit tests on BlueSpice* repos - https://phabricator.wikimedia.org/T130811#4249261 (10hashar) Sorry I have been postponing this task for too long now. I am currently migrating all... [17:13:52] 10Continuous-Integration-Config, 10Continuous-Integration-Infrastructure (shipyard), 10BlueSpice, 10Patch-For-Review: Enable unit tests on BlueSpice* repos - https://phabricator.wikimedia.org/T130811#4249271 (10hashar) [17:14:05] 10Continuous-Integration-Infrastructure (shipyard), 10Release-Engineering-Team (Kanban), 10releng-201718-q3, 10Epic, 10Patch-For-Review: [EPIC] Migrate Mediawiki jobs from Nodepool to Docker - https://phabricator.wikimedia.org/T183512#4249274 (10hashar) [17:14:10] 10Continuous-Integration-Config, 10Continuous-Integration-Infrastructure (shipyard), 10BlueSpice, 10Patch-For-Review: Enable unit tests on BlueSpice* repos - https://phabricator.wikimedia.org/T130811#2505126 (10hashar) [17:14:54] 10Continuous-Integration-Infrastructure (shipyard), 10Release-Engineering-Team (Kanban), 10releng-201718-q3, 10Epic, 10Patch-For-Review: [EPIC] Migrate Mediawiki jobs from Nodepool to Docker - https://phabricator.wikimedia.org/T183512#4181502 (10hashar) [17:16:03] 10Release-Engineering-Team (Watching / External), 10Operations, 10Patch-For-Review: setup/install/deploy deploy1001 as deployment server - https://phabricator.wikimedia.org/T175288#4249284 (10Dzahn) 05Open>03Resolved [17:17:14] 10Release-Engineering-Team (Watching / External), 10Operations, 10Patch-For-Review: setup/install/deploy deploy1001 as deployment server - https://phabricator.wikimedia.org/T175288#4066988 (10Dzahn) deploy1001 is now the active deployment server. from here it should just be about removing tin. we will wait a... 
[17:23:33] 10Release-Engineering-Team (Kanban), 10Release Pipeline, 10Wikimedia-Hackathon-2018, 10Services (watching), 10User-zeljkofilipin: Wikimedia Continuous Delivery Pipeline: Say What? - https://phabricator.wikimedia.org/T194940#4213478 (10zeljkofilipin) @bcampbell works now, thanks! [17:41:18] (03PS1) 10Hashar: QA report for the Quibble migration [integration/config] - 10https://gerrit.wikimedia.org/r/436838 (https://phabricator.wikimedia.org/T183512) [17:42:03] (03CR) 10Hashar: [C: 032] QA report for the Quibble migration [integration/config] - 10https://gerrit.wikimedia.org/r/436838 (https://phabricator.wikimedia.org/T183512) (owner: 10Hashar) [17:42:34] (03CR) 10jerkins-bot: [V: 04-1] QA report for the Quibble migration [integration/config] - 10https://gerrit.wikimedia.org/r/436838 (https://phabricator.wikimedia.org/T183512) (owner: 10Hashar) [17:43:23] (03PS2) 10Hashar: QA report for the Quibble migration [integration/config] - 10https://gerrit.wikimedia.org/r/436838 (https://phabricator.wikimedia.org/T183512) [17:43:32] (03CR) 10Hashar: [C: 032] QA report for the Quibble migration [integration/config] - 10https://gerrit.wikimedia.org/r/436838 (https://phabricator.wikimedia.org/T183512) (owner: 10Hashar) [17:44:47] 10Continuous-Integration-Infrastructure (shipyard), 10Release-Engineering-Team (Kanban), 10releng-201718-q3, 10Epic, 10Patch-For-Review: [EPIC] Migrate Mediawiki jobs from Nodepool to Docker - https://phabricator.wikimedia.org/T183512#4249427 (10hashar) [17:45:08] (03Merged) 10jenkins-bot: QA report for the Quibble migration [integration/config] - 10https://gerrit.wikimedia.org/r/436838 (https://phabricator.wikimedia.org/T183512) (owner: 10Hashar) [17:46:27] hasharAway: cool! ^ [18:07:56] Hey, releng -- I'd like to have someone from your team review https://gerrit.wikimedia.org/r/#/c/428707/ but don't know who to assign. Anyone interested? 
[18:09:56] technically gerrit has a field for assignee now [18:09:59] right above reviewer [18:10:07] and you could enter gerrit user names there [18:10:24] who not how :) [18:10:30] who == tyler and mukunda [18:10:42] I'll add them as reviewers [18:11:08] ooh :.. indeed i read that wrong [18:11:11] :) [18:11:35] greg-g: mutante: thanks both! [18:26:29] 10Beta-Cluster-Infrastructure, 10Performance-Team, 10Patch-For-Review: Set up webperf node in Beta Cluster - https://phabricator.wikimedia.org/T195314#4249499 (10thcipriani) One idea would be to set the `require_valid_service` config value to `True` in your scap.cfg file. This will prevent scap from restart... [18:59:36] (03PS1) 10Hashar: Skip archived repos for Quibble migration report [integration/config] - 10https://gerrit.wikimedia.org/r/436844 [18:59:51] (03CR) 10Hashar: [C: 032] Skip archived repos for Quibble migration report [integration/config] - 10https://gerrit.wikimedia.org/r/436844 (owner: 10Hashar) [19:01:09] (03Merged) 10jenkins-bot: Skip archived repos for Quibble migration report [integration/config] - 10https://gerrit.wikimedia.org/r/436844 (owner: 10Hashar) [19:02:31] 10Continuous-Integration-Infrastructure, 10Composer, 10Patch-For-Review: Upgrade integration/composer to 1.6.5 stable - https://phabricator.wikimedia.org/T125343#4249600 (10hashar) Yeah hire a few more people so we can afford to maintain the infra and deal with the upgrades aftermath :] More seriously, I a... [19:10:34] 10Continuous-Integration-Infrastructure, 10Composer, 10Patch-For-Review: Upgrade integration/composer to 1.6.5 stable - https://phabricator.wikimedia.org/T125343#4249607 (10Reedy) >>! In T125343#4249600, @hashar wrote: > Yeah hire a few more people so we can afford to maintain the infra and deal with the upg... 
[19:13:32] 10Continuous-Integration-Infrastructure, 10Composer, 10Patch-For-Review: Upgrade integration/composer to 1.6.5 stable - https://phabricator.wikimedia.org/T125343#4249620 (10Legoktm) T125343#3876563 basically. The annoying part is going to be bumping all of the docker images manually... [19:26:50] how do I ask in gerrit to run same tests as for merge? [19:27:04] PROBLEM - Puppet errors on deployment-ms-be03 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [19:27:54] PROBLEM - Puppet errors on deployment-ms-be04 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [19:29:58] ^ me [19:30:46] Nikerabbit: "check php" [19:32:23] legoktm: ty I'll try it [19:42:04] RECOVERY - Puppet errors on deployment-ms-be03 is OK: OK: Less than 1.00% above the threshold [0.0] [19:42:56] RECOVERY - Puppet errors on deployment-ms-be04 is OK: OK: Less than 1.00% above the threshold [0.0] [19:45:48] (03PS1) 10Reedy: Update docker images to use composer 1.6.5 [integration/config] - 10https://gerrit.wikimedia.org/r/436852 (https://phabricator.wikimedia.org/T125343) [19:55:42] https://gerrit.git.wmflabs.org/r/c/test/+/3 [19:56:13] cc works! [19:56:35] (you have to switch to the polygerrit ui to see it) [19:56:45] are people logging in on that? 
[19:57:11] (03PS2) 10Reedy: Update docker images to use composer 1.6.5 [integration/config] - 10https://gerrit.wikimedia.org/r/436852 (https://phabricator.wikimedia.org/T125343) [19:58:55] wheee [19:59:44] (03CR) 10Reedy: [C: 032] Update composer to 1.6.5 [integration/composer] - 10https://gerrit.wikimedia.org/r/436705 (https://phabricator.wikimedia.org/T125343) (owner: 10Reedy) [19:59:56] (03Merged) 10jenkins-bot: Update composer to 1.6.5 [integration/composer] - 10https://gerrit.wikimedia.org/r/436705 (https://phabricator.wikimedia.org/T125343) (owner: 10Reedy) [20:00:05] (03CR) 10Reedy: [C: 032] Update docker images to use composer 1.6.5 [integration/config] - 10https://gerrit.wikimedia.org/r/436852 (https://phabricator.wikimedia.org/T125343) (owner: 10Reedy) [20:00:44] answered in PM but yeah [20:00:53] i pre-populated data to test NoteDb [20:01:02] as that's going to be the biggest thing when upgrading [20:01:18] i didn't realize there was an entire wiki just to serve as the equivalent of wikitech in prod [20:01:22] as LDAP wiki [20:01:24] yep [20:01:26] to be used by test gerrit [20:01:27] (03Merged) 10jenkins-bot: Update docker images to use composer 1.6.5 [integration/config] - 10https://gerrit.wikimedia.org/r/436852 (https://phabricator.wikimedia.org/T125343) (owner: 10Reedy) [20:01:39] per rule i cannot use prod's LDAP [20:01:52] yea [20:02:04] that's why i was wondering how users get created [20:02:11] heh [20:02:18] i have ldap running in mediawiki-vagrant [20:02:22] also have logstash [20:02:31] wow, ok [20:02:33] !log Updating docker-pkg files on contint1001 for https://gerrit.wikimedia.org/r/436852 [20:02:35] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [20:02:44] http://gerrit-logstash.wmflabs.org/app/kibana [20:04:48] 10Continuous-Integration-Infrastructure: git object permission issues on contint1001 - https://phabricator.wikimedia.org/T196192#4249720 (10Reedy) [20:05:40] 10Continuous-Integration-Infrastructure: 
Unable to build docker stuff - https://phabricator.wikimedia.org/T196192#4249730 (10Reedy) [20:05:54] paladox: made a new user on http://ldapauth-gitldap.wmflabs.org/ ; tried to login with that user on https://gerrit.git.wmflabs.org but doesnt work yet. any other preferences i need to change? it said that i can change prefs of ldap_auth if i want (in wiki) [20:06:00] or just wait because something syncs [20:06:13] hmm, you use the user + password [20:06:17] you did on the wiki [20:06:25] yes, i do [20:06:32] oh, wait [20:06:39] ok [20:07:05] yea, i do and it says invalid user or password [20:07:19] !log really Updating docker-pkg files on contint1001 for https://gerrit.wikimedia.org/r/436852 [20:07:21] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [20:07:23] [contint1001.wikimedia.org] out: PermissionError: [Errno 13] Permission denied: '/tmp/docker-pkg-build.log' [20:07:33] hmm [20:07:56] mutante try lower case m? [20:08:17] paladox: tried both. but no [20:08:20] oh [20:08:26] same password? [20:08:45] yes, double confirmed i can logout and back in with it on ldapauth wiki [20:09:05] 10Continuous-Integration-Infrastructure, 10Operations, 10SRE-Access-Requests: Add Reedy to contint-docker group - https://phabricator.wikimedia.org/T196192#4249733 (10Legoktm) [20:11:59] hmm [20:12:03] ah [20:12:04] i have the problem too [20:12:05] soo hmm [20:14:00] does it really talk to that ldap server? is there a 01 and a 02 of something somewhere [20:14:34] (03PS1) 10Reedy: Update composer image to use composer 1.6.5 [integration/config] - 10https://gerrit.wikimedia.org/r/436859 (https://phabricator.wikimedia.org/T125343) [20:17:14] it talks to a ldap server [20:17:19] filling a bug against gerrit [20:17:22] as it worked in 2.14 [20:17:58] paladox: do you have shell on both sides? [20:18:07] on gerrit-test3 [20:18:10] and mw vagrant [20:18:18] where the LDAP server runs [20:18:27] does anyone know how to sync a repo from Gerrit to Diffusion ? 
I am looking to get integration/quibble.git synced there :] [20:18:30] yep i do [20:18:37] hashar clone url [20:18:41] the https url [20:18:43] that is anon [20:18:51] you set it up as observe [20:19:02] oh man I am lost already :] [20:19:06] paladox: try listening to network traffic with "tcpdump" on the server while you make a connection from gerrit.. see if anything gets sent [20:19:07] heh [20:19:12] ok [20:19:22] ah there is a create repository link neat [20:19:38] yep [20:19:56] paladox: thank you!!! [20:20:13] your welcome! [20:21:32] (03PS2) 10Reedy: Update composer image to use composer 1.6.5 [integration/config] - 10https://gerrit.wikimedia.org/r/436859 (https://phabricator.wikimedia.org/T125343) [20:23:35] PROBLEM - Disk space on contint1001 is CRITICAL: DISK CRITICAL - free space: / 2596 MB (5% inode=64%) [20:24:33] hmm [20:24:46] once these images finish building I'll look at the disk space [20:26:19] !log contint1001 - apt-get clean got a little bit more disk space [20:26:21] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [20:27:31] mutante: I think there are also a bunch of old docker images that we can remove [20:28:28] paladox: https://phabricator.wikimedia.org/source/quibble/ !! :] thanks [20:28:57] your welcome :) [20:33:28] legoktm: what's growing so fast on it right now? [20:34:16] creating new images [20:34:20] mutante: Reedy and I just rebuilt like half of our docker images [20:34:29] and we're going to do it again! [20:34:56] you should use /srv [20:34:58] that has lots of space [20:35:05] but not / .. that is going to run out and break stuff [20:35:31] separate volume group mounted on /srv [20:35:39] with several hundred G free [20:35:45] I think docker stores stuff in /var/lib/docker by default [20:36:27] /var/log/zuul 3.7G [20:36:36] $ docker info [20:36:37] Docker Root Dir: /var/lib/docker [20:37:52] can we delete zuul debug logs ? 
[20:38:34] moves some to /srv manually [20:38:40] yes [20:38:55] (03PS1) 10Reedy: [WIP] Bump various docker images for composer update [integration/config] - 10https://gerrit.wikimedia.org/r/436902 (https://phabricator.wikimedia.org/T125343) [20:40:08] !log contint1001 - mkdir /srv/zuul-debug-logs ; mv debug.log.2018-05-* from /var/log/zuul/ over there to free up disk space on / VG [20:40:10] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [20:40:46] RECOVERY - Disk space on contint1001 is OK: DISK OK [20:40:51] 10Release-Engineering-Team (Kanban), 10Quibble, 10Wikimedia-Hackathon-2018, 10User-zeljkofilipin: Breakout session: Quibble a test runner for MediaWiki - https://phabricator.wikimedia.org/T194970#4249782 (10hashar) Slides posted on commons: https://commons.wikimedia.org/wiki/File:20180519-QuibblePres.pdf [20:41:07] 10Release-Engineering-Team (Kanban), 10Quibble, 10Wikimedia-Hackathon-2018, 10User-zeljkofilipin: Breakout session: Quibble a test runner for MediaWiki - https://phabricator.wikimedia.org/T194970#4249783 (10hashar) [20:42:25] legoktm: wonder if that Docker Root Dir is puppetized [20:42:43] modules/profile/manifests/docker/storage/loopback.pp: $dm_target = '/var/lib/docker' [20:42:48] that's not it ? [20:42:57] I think that's it... [20:43:00] ah [20:43:33] "if on contint1001 then other path" .. 
needs Hiera override [20:43:37] yeah Docker ends up writing to /var/lib/docker which is the / partition [20:43:50] PROBLEM - Puppet errors on integration-slave-jessie-android is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0] [20:44:07] one sure, we need to keep the images or at least re pull them from the registry [20:44:20] dm_source and dm_target are both set to it [20:44:36] and the task is https://phabricator.wikimedia.org/T178663 "Switch CI Docker Storage Driver to its own partition and to use devicemapper" [20:45:05] we should put these pathes into Hiera and then use Hiera to adjust it for contint1001 by host, or better by role [20:45:07] seems the storage driver is/was "overlay2" and Alexandros told me it should be devicemapper [20:45:27] and ideally I would get Docker stuff on a different partition than the /srv one [20:47:23] (03PS3) 10Reedy: Update composer image to use composer 1.6.5 [integration/config] - 10https://gerrit.wikimedia.org/r/436859 (https://phabricator.wikimedia.org/T125343) [20:47:51] (03PS2) 10Reedy: Bump various docker images for composer update [integration/config] - 10https://gerrit.wikimedia.org/r/436902 (https://phabricator.wikimedia.org/T125343) [20:48:14] !log contint1001: deleting some old wikimedia/mediawiki-services-mathoid docker images [20:48:16] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [20:49:54] Reedy: NOOO [20:50:06] dont bump the version of quibble images to 0.0.18 [20:50:10] that is the version from quibble :] [20:50:15] 10Continuous-Integration-Infrastructure (shipyard), 10Release-Engineering-Team (Kanban), 10Operations, 10Release Pipeline: Switch CI Docker Storage Driver to its own partition and to use devicemapper - https://phabricator.wikimedia.org/T178663#3698972 (10Dzahn) contint1001 was close to running out of disk... [20:50:34] Want a suffix adding? [20:50:43] 0.0.17-wmf1? 
[20:50:46] yeah I went with 0.0.17-1 [20:50:52] 0.0.17-2 [20:50:58] no big deal. dont craft a new commit for that [20:51:14] Can amend the commit above, it's not a big problem [20:51:26] !log deleting old versions of docker images [20:51:27] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [20:51:38] Reedy: ah yeah. And maybe amend +quibble-stretch (0.0.18) wikimedia; urgency=medium [20:51:48] but maybe I am just overengineering all of that [20:52:02] (03PS4) 10Reedy: Update composer image to use composer 1.6.5 [integration/config] - 10https://gerrit.wikimedia.org/r/436859 (https://phabricator.wikimedia.org/T125343) [20:52:16] The .18 are rubbish because we messed up :P [20:52:27] no worires [20:52:38] and sorry about my reply regarding bumping composer to 1.6.5 [20:52:40] /dev/md0 46G 21G 23G 48% / [20:52:42] E_too many stuff to do [20:52:51] It's fine that you don't have the time [20:52:53] mutante: there, cleared up 15GB :) [20:53:12] legoktm is helping me fix it by knowing roughly what we need to do :P [20:54:05] (03PS3) 10Reedy: Bump various docker images for composer update [integration/config] - 10https://gerrit.wikimedia.org/r/436902 (https://phabricator.wikimedia.org/T125343) [20:54:30] 10Continuous-Integration-Infrastructure (shipyard), 10Release-Engineering-Team (Kanban), 10Operations, 10Release Pipeline: Switch CI Docker Storage Driver to its own partition and to use devicemapper - https://phabricator.wikimedia.org/T178663#4249798 (10hashar) From my previous comment T178663#3699074 , I... 
[20:54:58] legoktm: great :) [20:55:38] (03CR) 10Legoktm: [C: 032] Update composer image to use composer 1.6.5 [integration/config] - 10https://gerrit.wikimedia.org/r/436859 (https://phabricator.wikimedia.org/T125343) (owner: 10Reedy) [20:55:45] legoktm: and I mentioned you on my quibble update https://phabricator.wikimedia.org/phame/post/view/107/quibble_in_may/ \o: [20:55:56] hashar: ^.^ [20:56:23] I deleted quibble 0.0.18 images [20:56:26] legoktm: +1 :] [20:56:48] I am not sure what kind of side effects you will end up running into [20:56:50] (at least, I think I did) [20:56:57] I think it might be safer to skip 0.0.18 images [20:57:10] such as composer 1.6.5 generating stuff that is not compatible with php 5.5 [20:57:16] (03Merged) 10jenkins-bot: Update composer image to use composer 1.6.5 [integration/config] - 10https://gerrit.wikimedia.org/r/436859 (https://phabricator.wikimedia.org/T125343) (owner: 10Reedy) [20:57:21] no_justification we totally need https://gerrit-review.googlesource.com/c/gerrit/+/178815 [20:57:33] i used that [20:57:38] and maybe mediawiki/vendor would need to be regenerated. 
I used to refresh it using integration/composer but maybe things have changed [20:57:38] and found the ldap issue i had :) [20:57:42] paladox: i confirm the login on gerrit.git is fixed now :) [20:57:52] https://github.com/composer/composer/blob/1.6.5/composer.json is "php": "^5.3.2 || ^7.0", [20:57:53] !log deploying docker-pkg with https://gerrit.wikimedia.org/r/436859 for reals this time (again) [20:57:54] when you do the update as root you will need to chown the git repo to the correct user [20:57:55] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [20:58:04] as it uses notedb and thats where it store users accounts [20:58:07] hashar: we've already been using new composer for vendor :) [20:58:09] :) [20:58:24] legoktm: greaaat [20:58:26] [contint1001.wikimedia.org] out: == Step 1: building images == [20:58:26] [contint1001.wikimedia.org] out: => Building image docker-registry.discovery.wmnet/releng/composer:0.1.3 [20:58:41] Reedy: looks like it'll work this time [20:58:59] Bollocks [20:59:00] No, it won't [20:59:06] Well, that bit will [20:59:21] mutante: if you have hint about how to resize a volume group, I am all for it :] [21:00:00] Oh, no, we'll be fine [21:00:03] I did think ahead [21:00:04] https://github.com/symfony/console/blob/3.0/composer.json [21:00:14] "symfony/console": "<4.0", [21:00:17] I thought ahead :D [21:00:27] "php": ">=5.5.9", [21:00:28] <3 [21:00:31] :D [21:01:43] legoktm: How long should it take? [21:02:25] like 20-30 minutes I think [21:02:25] roughly 3 days [21:05:05] well that means that it was really using notedb so yay! [21:05:10] have a good upgrade. I am falling asleep already [21:05:14] and it performed really good. [21:05:16] and happy week-end [21:05:58] hashar: checked for free extents (PEs) with vgdisplay to just make a new one.. would have been easiest. but no free extents [21:06:27] mutante: yeah the volume group got created with all disk space available. 
But maybe it can be resized [21:07:22] cant just use /srv/ togethter with the other app data? [21:07:53] I know docker will eventually fill it up entirely and cause havoc [21:08:31] but yeah maybe that is a good first step [21:08:38] ( /var/lib/docker > /srv/docker ) [21:09:00] anyway I am sleeping for real now [21:09:51] resizing is possible but not while all the data is on it without risk [21:10:04] so it would have to temp go somewhere else etc [21:10:18] let's continue on ticket / with alex. i didnt know you already had all that [21:10:31] good night [21:14:24] mutante: thank you for taking the extra step to think about resizing / polishing it up ! :] [21:17:16] yw hashar, cya [21:20:05] thcipriani: Interesting (re: scap/webperf), I didn't do anything to make it run. Checking now. [21:21:37] huh, well, I guess it failed at the restart service step, so everything would have been in the right place [21:24:08] puppet was repeatedly failing with it for many hours, though. [21:26:27] thcipriani: https://gist.github.com/Krinkle/8ffa99a1ca7d62578cdfe54aa5426f6d#file-webperf01-syslog-L1726 [21:28:36] thcipriani: but to clarify, without the require_valid_service option, it is expected that a scap-deployed service cannot be bootstrapped on a newly provisioned server? Or has it worked without as well, just trying to gauge what the best practice is :) [21:29:34] !log Re-creating webperf01 in deploymet-prep, T195314 [21:29:36] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [21:29:37] T195314: Set up webperf node in Beta Cluster - https://phabricator.wikimedia.org/T195314 [21:31:15] the require_valid_service option should work, although it is a relatively recent addition. I was trying to remember how we worked around that in the past, and I couldn't figure it. You could check in with services, but my guess would be there is some manual undocumented step that people do when setting up new servers. 
[21:32:59] when we first setup all the providers the main focus was transitioning existing servers/services over to scap3, so this may well have been overlooked until now. [21:33:29] Indeed. I can't think of what (obvious) manual step would get me out of the paradox. If I had attempted a deploy from tin in beta the service would not yet exist. And puppet won't create the service until it's resource is completed. The only way would be to create the systemd unit file manually but that feels wrong. [21:34:13] the manual step would be to comment out service_name in the scap config, run scap deploy --init on the deployment machine, then re-run puppet on the target [21:34:23] thcipriani: How would I set require_valid_service? In puppet, or in require_valid_service: (something) in scap.cfg in git? [21:34:37] PROBLEM - Host deployment-webperf01 is DOWN: CRITICAL - Host Unreachable (10.68.20.126) [21:34:48] in the scap.cfg add the line: require_valid_service: True [21:34:51] should do it [21:35:41] legoktm: Hows progress? [21:35:47] after you make that change and pull it down to the deployment server you'll want to run: scap deploy --init [21:35:58] Reedy: it's on the adding_tag latest step [21:36:06] in /srv/deployment/[webperf directory] [21:37:22] !log Re-create performance-beta.wmflabs.org webproxy (wired to webperf01) - T195314 [21:37:25] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [21:37:25] T195314: Set up webperf node in Beta Cluster - https://phabricator.wikimedia.org/T195314 [21:38:04] thcipriani: I'm re-creating the server with the described patch in place first, and then apply the role to see if it works :) [21:38:16] thcipriani: can you sanity check https://gerrit.wikimedia.org/r/#/c/436908/ ? [21:40:12] thcipriani: btw, is `server_groups: default` by itself meaningful, or the same as not present? I notice performance/coal has it, but performance/navtiming does not. 
[21:40:42] 10Beta-Cluster-Infrastructure, 10Puppet, 10Tracking: Deployment-prep hosts with puppet errors (tracking) - https://phabricator.wikimedia.org/T132259#4249908 (10EddieGP) [21:40:46] 10Beta-Cluster-Infrastructure, 10Analytics, 10Analytics-Kanban, 10Patch-For-Review, 10Puppet: deployment-eventlog05 puppet error about missing mysql heartbeat.heartbeat table - https://phabricator.wikimedia.org/T191109#4249907 (10EddieGP) 05Open>03Resolved [21:40:47] server_groups: default is the default, unless it's overriding some other config section or another environment [21:41:30] PROBLEM - Host deployment-puppetdb01 is DOWN: CRITICAL - Host Unreachable (10.68.23.76) [21:42:10] for instance, I've seen folks use: `server_groups: canary, default` as a base, and then for beta override it in another section of the config with: `server_groups: default` that's the only time it's not superfluous [21:44:28] thcipriani: Thanks, that makes sense. [21:48:02] 10Beta-Cluster-Infrastructure: Puppet failure on deployment-maps03 - https://phabricator.wikimedia.org/T196197#4249918 (10EddieGP) [21:49:41] 10Beta-Cluster-Infrastructure: Puppet failure on deployment-maps03 - https://phabricator.wikimedia.org/T196197#4249918 (10EddieGP) Ping @Catrope - you created that instance two months ago. [21:51:40] Reedy: done!! [21:51:49] :o [21:51:55] 10Beta-Cluster-Infrastructure: Puppet failure at deployment-snapshot01 - https://phabricator.wikimedia.org/T196198#4249934 (10EddieGP) [21:52:18] (03PS4) 10Reedy: Bump various docker images for composer update [integration/config] - 10https://gerrit.wikimedia.org/r/436902 (https://phabricator.wikimedia.org/T125343) [21:52:47] Reedy: are you going to deploy the jjb changes? [21:52:49] Let's try breaking stuf then [21:52:53] ... [21:53:10] Reedy: knows how to summon me [21:53:24] is it "break" and "reedy" in the same string? 
[21:53:38] basically :) [21:53:55] RECOVERY - Host deployment-webperf01 is UP: PING OK - Packet loss = 0%, RTA = 1.26 ms [21:54:30] (03CR) 10Reedy: [C: 032] Bump various docker images for composer update [integration/config] - 10https://gerrit.wikimedia.org/r/436902 (https://phabricator.wikimedia.org/T125343) (owner: 10Reedy) [21:56:11] (03Merged) 10jenkins-bot: Bump various docker images for composer update [integration/config] - 10https://gerrit.wikimedia.org/r/436902 (https://phabricator.wikimedia.org/T125343) (owner: 10Reedy) [22:03:53] 10Beta-Cluster-Infrastructure: Puppet failure at deployment-snapshot01 - https://phabricator.wikimedia.org/T196198#4249959 (10EddieGP) Ping @ArielGlenn per your work on this instance in T184258. [22:04:25] no_justification i think i found the font issue to be a bug [22:04:29] possibly fixed in [22:04:34] https://github.com/GerritCodeReview/gerrit/commit/d450a9eb57f6b2fdd0b28d79b9e99a32ce65b816#diff-73d7bb55541d04b410e408893e2a15b6 [22:07:40] Reedy: https://integration.wikimedia.org/ci/job/composer-package-hhvm-docker/1809/console [22:08:01] PROBLEM - Host deployment-webperf01 is DOWN: CRITICAL - Host Unreachable (10.68.22.60) [22:09:20] legoktm: uh [22:09:29] where's parallel-lint gone? [22:10:02] php70,71,72 all pass [22:11:06] Well, it was never going to be 100% successful [22:11:11] https://integration.wikimedia.org/ci/job/composer-package-hhvm-docker/1810/console [22:11:15] hmm [22:11:31] Some other changes been pulled in? 
[22:13:48] Package operations: 35 installs, 0 updates, 0 removals [22:13:48] - Installing jakub-onderka/php-parallel-lint (v1.0.0): Loading from cache [22:14:07] "Writing lock file" "sh: 1: parallel-lint: not found" [22:14:10] that's just weird [22:17:03] !log https://gerrit.wikimedia.org/r/#/c/436902/ finished deploying [22:17:05] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [22:17:15] I can't even test composer locally with hhvm, it fails with "Array to string conversion" [22:18:04] maybe parallel lint 1.0.0 no more support hhvm and magically disappear ? :/ [22:18:28] https://packagist.org/packages/jakub-onderka/php-parallel-lint [22:18:30] !Log Set up deploymenet-webperf11 with beta's puppetmaster per https://wikitech.wikimedia.org/wiki/Help:Standalone_puppetmaster, T195314 [22:18:31] php: >=5.3.3 [22:18:31] T195314: Set up webperf node in Beta Cluster - https://phabricator.wikimedia.org/T195314 [22:18:42] Yeah, if it wasn't supported, composer would complain [22:18:56] and the build would fail before at composer-install instead of reaching composer -test [22:19:10] Worth a recheck? [22:20:17] nope [22:21:46] 10Beta-Cluster-Infrastructure, 10Performance-Team, 10Patch-For-Review: Set up webperf node in Beta Cluster - https://phabricator.wikimedia.org/T195314#4249972 (10Krinkle) [22:21:49] https://github.com/JakubOnderka/PHP-Parallel-Lint/commit/e6967cf59899f3e69c4a13f7e15b648ac673c5f8 [22:22:40] well, removing travis tests isn't actually removing support... [22:23:12] potentially parallel-lint might be there and available in the PATH [22:23:21] but maybe the parallel-lint script has a shebang that point to an invalid file [22:23:27] nobody@9bd3d0e2b0c8:/src/cdb$ ls vendor/bin/ [22:23:27] covers-validator minus-x parallel-lint phpcbf phpcs phpunit [22:24:15] calling it manually works [22:24:16] Reedy: It seems the jobs run differently [22:24:16] nobody@9bd3d0e2b0c8:/src/cdb$ ./vendor/bin/parallel-lint . 
--exclude vendor [22:24:19] the iamges are not the same [22:24:23] + exec docker run --rm --env-file /dev/fd/63 --volume /srv/jenkins-workspace/workspace/composer-package-hhvm-docker/log:/log --volume /srv/jenkins-workspace/workspace/composer-package-hhvm-docker/cache:/cache --volume /srv/jenkins-workspace/workspace/composer-package-hhvm-docker/src:/src docker-registry.wikimedia.org/releng/composer-package-hhvm:0.2.3 [22:24:28] + exec docker run --rm --env-file /dev/fd/63 --volume /srv/jenkins-workspace/workspace/composer-package-php70-docker/log:/log --volume /srv/jenkins-workspace/workspace/composer-package-php70-docker/cache:/cache --volume /srv/jenkins-workspace/workspace/composer-package-php70-docker/src:/src docker-registry.wikimedia.org/releng/composer-package:0.1.2 [22:24:45] one uses composer-package:* the other composer-package-(something)>:* [22:24:49] we just deployed new images... [22:24:52] oh [22:24:58] no version means php7.0 [22:24:58] seems like other both should use the generic one, or both the specific one [22:25:05] Hm.. okay [22:25:07] it's just poorly named [22:25:31] it seems to be different though, like extra stuff in the dockerfile. Maybe something that makes the composer install go to the wrong directory or something [22:26:42] PROBLEM - Puppet errors on deployment-webperf11 is CRITICAL: CRITICAL: 11.11% of data above the critical threshold [0.0] [22:27:33] https://github.com/wikimedia/integration-config/blob/master/dockerfiles/composer-package-php71/Dockerfile.template and https://github.com/wikimedia/integration-config/blob/master/dockerfiles/composer-hhvm/Dockerfile.template are quite different. 
[22:27:39] The entry point different seems suspicious [22:27:46] https://integration.wikimedia.org/ci/job/composer-package-hhvm-docker/1813/console echo $PATH [22:27:59] I dont' know how but I would assume one may discard the result or not be the same container/enviornment [22:28:24] and /src/vendor/bin in the PATH of the hhvm build [22:28:45] grr [22:28:53] and /src/vendor/bin is **NOT** the PATH of the hhvm build [22:31:42] RECOVERY - Puppet errors on deployment-webperf11 is OK: OK: Less than 1.00% above the threshold [0.0] [22:31:50] Krinkle: composer-hhvm is just a base image. You want to look at composer-package-hhvm [22:33:46] yeah, exactly what hashar said :( [22:37:32] https://github.com/composer/composer/commits/master/src/Composer/EventDispatcher/EventDispatcher.php [22:37:36] I don't see any relevant changes [22:38:30] (03Draft2) 10Reedy: Quote require-dev values [integration/config] - 10https://gerrit.wikimedia.org/r/436953 (https://phabricator.wikimedia.org/T195688) [22:40:55] ^ fixes another bug at least [22:42:43] PROBLEM - Puppet errors on deployment-webperf11 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [22:45:09] (03CR) 10Krinkle: [C: 031] Quote require-dev values [integration/config] - 10https://gerrit.wikimedia.org/r/436953 (https://phabricator.wikimedia.org/T195688) (owner: 10Reedy) [22:46:01] uh [22:46:13] nope, never mind [22:47:16] (03Draft2) 10Reedy: Rebuild operations-mw-config-composer-test-docker [integration/config] - 10https://gerrit.wikimedia.org/r/436954 (https://phabricator.wikimedia.org/T195688) [22:49:04] So it's only the hhvm jobs broken? [22:49:08] I think so [22:50:59] 10Beta-Cluster-Infrastructure, 10Performance-Team, 10Patch-For-Review: Set up webperf node in Beta Cluster - https://phabricator.wikimedia.org/T195314#4250007 (10Krinkle) [22:51:08] deployment-webperf11.deployment-prep.eqiad.wmflabs Krinkle ? [22:51:26] Krenair: Yes, it's fixed. 
[22:51:37] (03CR) 10Reedy: [C: 032] "Can be put into a docker image later" [integration/config] - 10https://gerrit.wikimedia.org/r/436953 (https://phabricator.wikimedia.org/T195688) (owner: 10Reedy) [22:53:13] (03Merged) 10jenkins-bot: Quote require-dev values [integration/config] - 10https://gerrit.wikimedia.org/r/436953 (https://phabricator.wikimedia.org/T195688) (owner: 10Reedy) [22:59:10] Do we just need to munge path to add /src/vendor/bin? [23:00:18] as a workaround... [23:00:28] that would work [23:01:13] I saw the tickets about deployment-snapshot01's puppetmaster [23:01:43] which one? [23:01:48] will try to sort that out next week; tl;dr is that I need a separate puppetmaser because when I test over there it's stuff that could break the rest of the mw appservers in beta [23:01:54] and that's not ok [23:02:02] ok... [23:02:03] well there's two issues [23:02:12] one is moving it to stretch i guess [23:02:25] the other is the ssh key thing, not sure what can be done about that frankly [23:02:44] RECOVERY - Puppet errors on deployment-webperf11 is OK: OK: Less than 1.00% above the threshold [0.0] [23:02:51] is there any possibility of using the normal CA with the special puppetmaster? [23:02:55] and then the third (because I can't count) [23:03:10] and hook it up to puppetdb? [23:03:15] is that puppet is I guess broken over on deployment-snapshot [23:03:19] right now [23:03:25] legoktm: run.sh? Or the dockerfile? [23:03:41] I don't see how we can use it because the cn's won't match up [23:04:02] two hard problems in computer science: cache invalidation, naming, and off-by-one errors :) [23:04:09] pretty much [23:04:16] the third hard problem is getting enough sleep [23:04:26] I'm going to take a whack at that one right now (2 am) [23:04:28] doesn't prod have multiple puppetmasters? [23:04:32] ah [23:04:35] this was sort of a drive-by to let you know I'm not ignoring that stuff [23:04:36] yes [23:04:39] thanks [23:04:43] get some sleep :) [23:04:58] night! 
[23:05:45] Reedy: uh, the former? not sure, I'm debugging composer atm [23:06:06] (03PS1) 10Chico Venancio: zuul: Add Chicocvenancio to jenkins whitelist [integration/config] - 10https://gerrit.wikimedia.org/r/436957 [23:06:28] (03CR) 10Paladox: [C: 031] zuul: Add Chicocvenancio to jenkins whitelist [integration/config] - 10https://gerrit.wikimedia.org/r/436957 (owner: 10Chico Venancio) [23:11:48] (03Draft2) 10Reedy: Prepend /src/vendor/bin to $PATH [integration/config] - 10https://gerrit.wikimedia.org/r/436958 [23:13:06] just revert the jenkins jobs to the previous image I guess [23:13:51] and if you do https://gerrit.wikimedia.org/r/#/c/436953/ you still need to bump all the images that depends on ccomposer [23:13:55] just saying :] [23:14:28] Yeah, not doing that today [23:16:31] I take it people are already investigating the parallel-lint issue? [23:16:31] 23:14:49 sh: 1: parallel-lint: not found [23:17:06] https://integration.wikimedia.org/ci/job/quibble-vendor-mysql-hhvm-docker/4414/console [23:17:46] Yeah, we're not exactly sure what's happened after updating the images [23:17:54] the hhvm $PATH is missing a part [23:17:57] the zend PHP jobs are fine [23:19:02] I want to stab my internet [23:19:29] doin't we all when it's slow! :) [23:23:36] Reedy: yaaaaaaaaaaay I know what's wrong [23:28:25] (03PS1) 10Legoktm: Use exact same dependencies that composer uses [integration/composer] - 10https://gerrit.wikimedia.org/r/436960 [23:28:59] PROBLEM - Puppet errors on deployment-certcentral-testclient is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [23:29:16] Reedy: https://gerrit.wikimedia.org/r/#/c/436960/ there's some regression in symfony 3.x [23:30:06] That's what's causing it? 
[23:30:10] (03CR) 10Legoktm: [C: 032] Use exact same dependencies that composer uses [integration/composer] - 10https://gerrit.wikimedia.org/r/436960 (owner: 10Legoktm) [23:30:19] (03Merged) 10jenkins-bot: Use exact same dependencies that composer uses [integration/composer] - 10https://gerrit.wikimedia.org/r/436960 (owner: 10Legoktm) [23:30:24] yep [23:30:43] :/ [23:31:00] Known bug? [23:31:05] no idea [23:31:15] I forget when symfony dropped hhvm support [23:31:22] Ah [23:31:36] https://symfony.com/blog/symfony-4-end-of-hhvm-support [23:31:59] Meh [23:32:03] Fair enough [23:32:24] Time to bump all those changelogs again :D [23:32:40] (03Abandoned) 10Reedy: Prepend /src/vendor/bin to $PATH [integration/config] - 10https://gerrit.wikimedia.org/r/436958 (owner: 10Reedy) [23:33:09] (03PS3) 10Reedy: Rebuild operations-mw-config-composer-test-docker [integration/config] - 10https://gerrit.wikimedia.org/r/436954 (https://phabricator.wikimedia.org/T195688) [23:33:21] (03CR) 10Reedy: [C: 032] "Might aswell merge this if we're gonna do another build" [integration/config] - 10https://gerrit.wikimedia.org/r/436954 (https://phabricator.wikimedia.org/T195688) (owner: 10Reedy) [23:33:39] Reedy: I can do that this time :p [23:34:16] Heh [23:34:49] Upto you :P [23:35:37] (03Merged) 10jenkins-bot: Rebuild operations-mw-config-composer-test-docker [integration/config] - 10https://gerrit.wikimedia.org/r/436954 (https://phabricator.wikimedia.org/T195688) (owner: 10Reedy) [23:39:02] RECOVERY - Puppet errors on deployment-certcentral-testclient is OK: OK: Less than 1.00% above the threshold [0.0] [23:41:54] Reedy: did you make the changelog entries by hand? 
[23:42:01] Yeah [23:42:19] dch: warning: changelog(l5): invalid abbreviated month name 'June' [23:42:19] LINE: -- Sam Reed Tue, 01 June 2018 19:32:43 +0000 [23:42:42] lol [23:59:32] (03PS1) 10Legoktm: Fix Reedy's changelog entries [integration/config] - 10https://gerrit.wikimedia.org/r/436963 [23:59:35] (03PS1) 10Legoktm: Rebuild images for composer update [integration/config] - 10https://gerrit.wikimedia.org/r/436964 [23:59:52] (03CR) 10Reedy: [C: 032] Fix Reedy's changelog entries [integration/config] - 10https://gerrit.wikimedia.org/r/436963 (owner: 10Legoktm) [23:59:59] PROBLEM - Puppet errors on deployment-certcentral-testclient is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0]