[00:20:54] (CR) BearND: [C: 1] Add ReadingLists extension [tools/release] - https://gerrit.wikimedia.org/r/384909 (https://phabricator.wikimedia.org/T174651) (owner: Gergő Tisza)
[00:28:33] Gerrit, Release-Engineering-Team (Backlog), Scap, Patch-For-Review: Deploy gerrit with scap3 - https://phabricator.wikimedia.org/T157414#3692582 (Paladox) I guess we can say this is resolved now :) And close the task in a few days just to make sure everything works :)
[00:28:51] Gerrit, Release-Engineering-Team (Backlog), Scap: Deploy gerrit with scap3 - https://phabricator.wikimedia.org/T157414#3692599 (Paladox)
[00:30:05] (CR) Jforrester: [C: 1] Replace cicalese@mitre.org with cindom@gmail.com. [integration/config] - https://gerrit.wikimedia.org/r/384900 (owner: Cicalese)
[00:32:58] Gerrit, Release-Engineering-Team (Backlog), Scap: Deploy gerrit with scap3 - https://phabricator.wikimedia.org/T157414#3692601 (demon) Probably this first but ya: https://gerrit.wikimedia.org/r/#/c/384760/
[01:15:53] thcipriani: I'm very sorry, baby emergency all day, right after we talked and I just literally sat back down at the computer
[01:16:31] at this point I'm just going to let Luca look at that thing tomorrow morning
[01:16:41] but I'll send him the details though
[01:32:27] milimetric: no worries at all! hope everything is all right.
[01:33:07] oh sorry, yeah, it's all good now
[01:33:16] all of us are just exhausted :)
[01:34:23] that's good to hear :)
[01:34:55] yeah, waiting for Luca to take a look tomorrow is no big deal on my side. No worries.
[03:00:02] PROBLEM - Free space - all mounts on deployment-fluorine02 is CRITICAL: CRITICAL: deployment-prep.deployment-fluorine02.diskspace._srv.byte_percentfree (<50.00%)
[03:59:13] Project selenium-MultimediaViewer » firefox,mediawiki,Linux,BrowserTests build #550: FAILURE in 3 min 12 sec: https://integration.wikimedia.org/ci/job/selenium-MultimediaViewer/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=mediawiki,PLATFORM=Linux,label=BrowserTests/550/
[04:11:51] Project selenium-MultimediaViewer » firefox,beta,Linux,BrowserTests build #550: FAILURE in 15 min: https://integration.wikimedia.org/ci/job/selenium-MultimediaViewer/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=BrowserTests/550/
[05:20:19] Continuous-Integration-Config, Anti-Harassment: Run `composer install` on Jenkins for AbuseFilter & AntiSpoof - https://phabricator.wikimedia.org/T178452#3692770 (dbarratt)
[05:21:04] Continuous-Integration-Config, Anti-Harassment, Composer: Run `composer install` on Jenkins for AbuseFilter & AntiSpoof - https://phabricator.wikimedia.org/T178452#3692788 (dbarratt)
[05:44:51] PROBLEM - Mediawiki Error Rate on graphite-labs is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [10.0]
[06:06:35] (PS1) Gergő Tisza: Add Groovier1 to the CI whitelist [integration/config] - https://gerrit.wikimedia.org/r/384926
[06:20:59] Gerrit, Readers-Web-Backlog, Patch-For-Review, Readers-Web-Kanban-Board, Unplanned-Sprint-Work: Temporarily allow pushing large objects - https://phabricator.wikimedia.org/T178189#3692824 (Joe) Hi! I'm not sure I understand the details or the requirements, in fact last time I looked at your p...
[06:43:23] Gerrit, Release-Engineering-Team (Backlog), Scap: Deploy gerrit with scap3 - https://phabricator.wikimedia.org/T157414#3692832 (Paladox) Thanks +1 :) Though before puppet runs with that change, you will probably want to move it so that I can be a symblink :).
[06:44:48] PROBLEM - Mediawiki Error Rate on graphite-labs is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [10.0]
[06:50:01] RECOVERY - Free space - all mounts on deployment-fluorine02 is OK: OK: All targets OK
[06:51:57] Gerrit, Patch-For-Review: Gerrit: Convert Velocity templates to Closure Templates - https://phabricator.wikimedia.org/T158008#3692851 (Paladox) Upstream have removed support for velocity now. I will now have to try and find someone to merge my soy support change in its-base :). I will also have to remov...
[07:10:26] PROBLEM - Disk space on contint1001 is CRITICAL: DISK CRITICAL - /var/lib/docker/overlay2/792542d039b7e66780e8b3e9a00fd5e2ab986252b8ac1f0d74719d45be63cb66/merged is not accessible: Permission denied
[07:15:04] Continuous-Integration-Infrastructure (shipyard), Release-Engineering-Team (Kanban), monitoring: Icinga disk space alert when a Docker container is running on an host - https://phabricator.wikimedia.org/T178454#3692868 (hashar)
[07:19:33] RECOVERY - Disk space on contint1001 is OK: DISK OK
[07:22:31] PROBLEM - Disk space on contint1001 is CRITICAL: DISK CRITICAL - /var/lib/docker/overlay2/477586f8c192c241d251dcda6d0d050581e5e075b25bf945435a8cb391d8e64d/merged is not accessible: Permission denied
[07:40:26] Continuous-Integration-Infrastructure (shipyard), Release-Engineering-Team (Kanban), Operations, monitoring: Icinga disk space alert when a Docker container is running on an host - https://phabricator.wikimedia.org/T178454#3692896 (hashar) For a running Docker container we have: ``` overlay on /v...
[07:40:32] RECOVERY - Disk space on contint1001 is OK: DISK OK
[07:44:48] PROBLEM - Mediawiki Error Rate on graphite-labs is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [10.0]
[07:46:53] (PS1) Hashar: Fix qa tox testenv [integration/config] - https://gerrit.wikimedia.org/r/384941
[07:49:02] (PS1) Hashar: Send integration/config QA notifications to qa-alerts [integration/config] - https://gerrit.wikimedia.org/r/384942
[07:49:54] (CR) Hashar: [C: 2] Fix qa tox testenv [integration/config] - https://gerrit.wikimedia.org/r/384941 (owner: Hashar)
[07:51:09] (Merged) jenkins-bot: Fix qa tox testenv [integration/config] - https://gerrit.wikimedia.org/r/384941 (owner: Hashar)
[07:51:42] (CR) Hashar: [C: 2] Send integration/config QA notifications to qa-alerts [integration/config] - https://gerrit.wikimedia.org/r/384942 (owner: Hashar)
[07:52:56] (Merged) jenkins-bot: Send integration/config QA notifications to qa-alerts [integration/config] - https://gerrit.wikimedia.org/r/384942 (owner: Hashar)
[08:27:31] Beta-Cluster-Infrastructure, Release-Engineering-Team (Kanban): deployment-videoscaler01 has memcached failures - https://phabricator.wikimedia.org/T178457#3692974 (hashar) ferm is a different issue due to AAAA DNS resolution which is not available on labs. That is T176314 and worked around via https://g...
[08:28:13] thcipriani: o/ - fixed aqs* hosts, sorry for the puppet failures!
[08:33:04] RECOVERY - Puppet errors on deployment-aqs02 is OK: OK: Less than 1.00% above the threshold [0.0]
[08:36:48] Beta-Cluster-Infrastructure, Release-Engineering-Team (Kanban), Thumbor: deployment-videoscaler01 has memcached failures - https://phabricator.wikimedia.org/T178457#3692980 (hashar) ``` name=/var/log/nutcracker/nutcracker.log [2017-10-18 08:28:35.263] nc.c:184 nutcracker-0.4.1 built for Linux 4.9.0-3...
[08:38:20] !log deployment-videoscaler01: install --owner=nutcracker -d /var/run/nutcracker && systemctl start nutcracker # T178457
[08:38:26] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[08:38:26] T178457: deployment-videoscaler01 has memcached failures - https://phabricator.wikimedia.org/T178457
[08:39:50] Beta-Cluster-Infrastructure, Release-Engineering-Team (Kanban), Thumbor: nutcracker fails to start due to lack of /var/run/nutcracker (ex: deployment-videoscaler01 has memcached failures) - https://phabricator.wikimedia.org/T178457#3692987 (hashar)
[08:39:55] Beta-Cluster-Infrastructure, Release-Engineering-Team (Kanban), Operations, Thumbor: nutcracker fails to start due to lack of /var/run/nutcracker (ex: deployment-videoscaler01 has memcached failures) - https://phabricator.wikimedia.org/T178457#3692957 (hashar) Fixed it by MANUALLY creating a `/va...
[08:41:22] !log deployment-mediawiki07: install --owner=nutcracker -d /var/run/nutcracker && systemctl start nutcracker # T178457
[08:41:28] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[08:43:49] RECOVERY - Puppet errors on deployment-aqs03 is OK: OK: Less than 1.00% above the threshold [0.0]
[08:59:51] RECOVERY - Mediawiki Error Rate on graphite-labs is OK: OK: Less than 1.00% above the threshold [1.0]
[09:01:44] Continuous-Integration-Config, Anti-Harassment, Composer: Run `composer install` on Jenkins for AbuseFilter & AntiSpoof - https://phabricator.wikimedia.org/T178452#3693007 (hashar) Extensions deployed on the Wikimedia cluster have the composer dependencies shipped via mediawiki/vendor. The CI job onl...
[09:02:59] Continuous-Integration-Infrastructure (shipyard), Release-Engineering-Team (Kanban), Patch-For-Review: Provide git repositories on docker slaves to act as reference to git clone - https://phabricator.wikimedia.org/T178076#3693009 (hashar) Thank you @Addshore and @Dzahn !
[09:04:49] morning hashar!
[09:05:52] FYI I wrote this (after having some fun and failing a bit yesterday) https://www.mediawiki.org/w/index.php?title=Continuous_integration/Docker&diff=2586762&oldid=2560238
[09:06:18] Also, removed the slave with the longer name (as horizon struggles with FQNs over 64 chars) and recreated it with a shorter name
[09:06:46] Any idea why this part of jenkins just gives 403s? https://integration.wikimedia.org/ci/computer/integration-slave-docker-1006/builds
[09:12:08] PROBLEM - Free space - all mounts on integration-slave-jessie-android is CRITICAL: CRITICAL: integration.integration-slave-jessie-android.diskspace._mnt.byte_percentfree (No valid datapoints found)integration.integration-slave-jessie-android.diskspace.root.byte_percentfree (<100.00%)
[09:13:32] addshore: https://www.mediawiki.org/wiki/Continuous_integration/Docker#Slave_creation !! +1 !
[09:13:53] "horizon struggles with FQNs over 64 chars" ..... poor horizons
[09:14:21] addshore: and jobsXXX/builds I had to blacklist those URLs at apache level
[09:14:24] !log deployment-prep: upgrading elasticsearch to 5.5.2
[09:14:28] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[09:14:44] addshore: because that requires to parse all the build XML files and that is kind of slow. Specially when some web crawler bots hit them all
[09:14:45] hashar: indeed, that was quite an interesting one to dig through
[09:14:53] hashar: aaaah, I see
[09:14:56] shame it is not paged
[09:15:02] addshore: the filtering is done at apache level and would 403 any url with /builds. But maybe we can drop it and see what happens
[09:15:04] / the xml files are not split or something, heh
[09:15:11] jenkins might be a bit more efficient at loading them nowadays
[09:15:18] It would be cool to look!
[09:15:20] can we? :D
[09:15:30] task it !!! :]
[09:15:43] I guess we will want to look at apache logs to figure out whether the URLs are hit often
[09:15:55] Also, phan jobs ran on docker all day yesterday and had no problems! (other than we I created 2 slaves in jenkins pointing to the same VM, HAH)
[09:16:06] ah yeah
[09:16:16] when you create a slave by copying the configuration of another slave
[09:16:33] jenkins immediately connect to it, but with the IP coming from the copied config
[09:16:36] :((((
[09:16:53] yup, I was not aware of that ;) took me a couple of minuites to figure out why 2 jobs couldnt run at the same time!
[09:17:03] I still have to look at what you did for the phan container
[09:17:39] and verify containers are properly disposed of at end of job (damn signal handling)
[09:18:19] * hashar digs back in mails. I am cleaning my backlog
[09:18:32] Continuous-Integration-Infrastructure: un blacklist https://integration.wikimedia.org/ci/computer/XXXX/builds - https://phabricator.wikimedia.org/T178458#3693025 (Addshore)
[09:18:33] hashar: ^^
[09:18:43] neat8
[09:19:09] Continuous-Integration-Infrastructure, User-Addshore: un blacklist https://integration.wikimedia.org/ci/computer/XXXX/builds - https://phabricator.wikimedia.org/T178458#3693037 (Addshore)
[09:25:30] Continuous-Integration-Infrastructure, User-Addshore: un blacklist https://integration.wikimedia.org/ci/computer/XXXX/builds - https://phabricator.wikimedia.org/T178458#3693043 (hashar) The URLs are denied at Apache level. Hitting them causes Jenkins to parse all the XML build files which takes a while...
[09:30:17] addshore: i think asking for builds on a given computer would trigger a parse of every single builds that ever happened :(
[09:31:56] that ever happened on that computer or on ALLLLL? :P
[09:32:03] ie, does it do it on the slave or on the master?
[09:33:39] not sure
[09:33:41] gotta verify
[09:40:37] hashar: https://issues.jenkins-ci.org/browse/JENKINS-20892
[09:41:52] addshore: ah at least the build time trend is somewhat limited now :)
[09:42:49] addshore: you can probably check by bypassing Apache and do the query directly to the jenkins web service
[09:46:35] addshore: hacked manually https://integration.wikimedia.org/ci/computer/contint1001/builds :)
[09:51:36] Continuous-Integration-Infrastructure, User-Addshore: un blacklist https://integration.wikimedia.org/ci/computer/XXXX/builds - https://phabricator.wikimedia.org/T178458#3693116 (hashar) I gave it a try. I have manually adjusting the Apache config and loaded https://integration.wikimedia.org/ci/computer/c...
[09:51:55] addshore: looks good to me
[09:52:15] addshore: feel free to propose a patch that drop the /builds filter and I will hapilly +1 it
[09:53:17] lunch etc
[09:53:30] Continuous-Integration-Infrastructure, User-Addshore: un blacklist https://integration.wikimedia.org/ci/computer/XXXX/builds - https://phabricator.wikimedia.org/T178458#3693025 (Paladox) Duplicate of T177827 ?
[09:57:09] Continuous-Integration-Infrastructure, User-Addshore: un blacklist https://integration.wikimedia.org/ci/computer/XXXX/builds - https://phabricator.wikimedia.org/T178458#3693132 (Addshore) >>! In T178458#3693119, @Paladox wrote: > Duplicate of T177827 ? I dont think so as it talks about a different view.
[10:05:52] Continuous-Integration-Infrastructure, Patch-For-Review, User-Addshore: un blacklist https://integration.wikimedia.org/ci/computer/XXXX/builds - https://phabricator.wikimedia.org/T178458#3693177 (Addshore) The code has a ticket in the comment that I can't access, so will just tag it here to leave a c...
[10:24:17] (PS1) Zfilipin: Created mediawiki-core-qunit-selenium-jessie template [integration/config] - https://gerrit.wikimedia.org/r/384961 (https://phabricator.wikimedia.org/T177262)
[10:25:25] (CR) jerkins-bot: [V: -1] Created mediawiki-core-qunit-selenium-jessie template [integration/config] - https://gerrit.wikimedia.org/r/384961 (https://phabricator.wikimedia.org/T177262) (owner: Zfilipin)
[10:26:37] (PS1) Addshore: Remove unused zuul-cloner-docker [integration/config] - https://gerrit.wikimedia.org/r/384962
[10:27:47] (CR) Addshore: [C: 2] Remove unused zuul-cloner-docker [integration/config] - https://gerrit.wikimedia.org/r/384962 (owner: Addshore)
[10:28:55] (Merged) jenkins-bot: Remove unused zuul-cloner-docker [integration/config] - https://gerrit.wikimedia.org/r/384962 (owner: Addshore)
[10:31:27] (PS2) Zfilipin: Created mediawiki-core-qunit-selenium-jessie template [integration/config] - https://gerrit.wikimedia.org/r/384961 (https://phabricator.wikimedia.org/T177262)
[10:32:41] (CR) jerkins-bot: [V: -1] Created mediawiki-core-qunit-selenium-jessie template [integration/config] - https://gerrit.wikimedia.org/r/384961 (https://phabricator.wikimedia.org/T177262) (owner: Zfilipin)
[10:33:42] (PS3) Zfilipin: Created mediawiki-core-qunit-selenium-jessie template [integration/config] - https://gerrit.wikimedia.org/r/384961 (https://phabricator.wikimedia.org/T177262)
[10:53:54] (PS1) Zfilipin: Running mediawiki-core-qunit-selenium-jessie for Popups [integration/config] - https://gerrit.wikimedia.org/r/384965 (https://phabricator.wikimedia.org/T177262)
[10:54:17] (PS2) Zfilipin: WIP Running mediawiki-core-qunit-selenium-jessie for Popups [integration/config] - https://gerrit.wikimedia.org/r/384965 (https://phabricator.wikimedia.org/T177262)
[10:54:44] (PS4) Zfilipin: WIP Created mediawiki-core-qunit-selenium-jessie template [integration/config] - https://gerrit.wikimedia.org/r/384961 (https://phabricator.wikimedia.org/T177262)
[11:03:03] (CR) Groovier1: [C: 1] "Information verified." [integration/config] - https://gerrit.wikimedia.org/r/384926 (owner: Gergő Tisza)
[11:03:39] (PS1) Zfilipin: Run mediawiki-core-qunit-selenium-jessie for Popups [integration/config] - https://gerrit.wikimedia.org/r/384968 (https://phabricator.wikimedia.org/T177262)
[11:11:00] (PS2) Zfilipin: Run mediawiki-core-qunit-selenium-jessie for Popups [integration/config] - https://gerrit.wikimedia.org/r/384968 (https://phabricator.wikimedia.org/T177262)
[11:15:14] (PS3) Zfilipin: WIP Run mediawiki-core-qunit-selenium-jessie for Popups [integration/config] - https://gerrit.wikimedia.org/r/384968 (https://phabricator.wikimedia.org/T177262)
[12:17:49] Beta-Cluster-Infrastructure, Release-Engineering-Team (Kanban), Operations, Thumbor: nutcracker fails to start due to lack of /var/run/nutcracker (ex: deployment-videoscaler01 has memcached failures) - https://phabricator.wikimedia.org/T178457#3692957 (faidon) nutcracker ships `/usr/lib/tmpfiles....
[12:20:10] Beta-Cluster-Infrastructure, Release-Engineering-Team (Kanban), Operations, Thumbor: nutcracker fails to start due to lack of /var/run/nutcracker (ex: deployment-videoscaler01 has memcached failures) - https://phabricator.wikimedia.org/T178457#3692957 (MoritzMuehlenhoff) deployment-videoscaler01...
[12:23:08] PROBLEM - Long lived cherry-picks on puppetmaster on deployment-puppetmaster02 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[12:28:33] Beta-Cluster-Infrastructure, Release-Engineering-Team (Kanban), Operations: nutcracker fails to start due to lack of /var/run/nutcracker (ex: deployment-videoscaler01 has memcached failures) - https://phabricator.wikimedia.org/T178457#3693635 (Gilles)
[12:28:45] Beta-Cluster-Infrastructure, Release-Engineering-Team (Kanban), Operations: nutcracker fails to start due to lack of /var/run/nutcracker (ex: deployment-videoscaler01 has memcached failures) - https://phabricator.wikimedia.org/T178457#3692957 (Gilles) Videoscalers don't run Thumbor
[13:04:46] Project selenium-Math » firefox,beta,Linux,BrowserTests build #548: FAILURE in 45 sec: https://integration.wikimedia.org/ci/job/selenium-Math/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=BrowserTests/548/
[13:20:29] Beta-Cluster-Infrastructure, Release-Engineering-Team (Kanban), Operations: nutcracker fails to start due to lack of /var/run/nutcracker (ex: deployment-videoscaler01 has memcached failures) - https://phabricator.wikimedia.org/T178457#3693745 (faidon) Ah! Yes, that all makes sense now, thanks! We ha...
[13:24:06] Gerrit, Readers-Web-Backlog, Patch-For-Review, Readers-Web-Kanban-Board, Unplanned-Sprint-Work: Temporarily allow pushing large objects - https://phabricator.wikimedia.org/T178189#3693750 (bmansurov) Thanks for the reply, @Joe. To briefly update you, in T175853#3610941 we found out that inter...
[13:40:36] Gerrit, Readers-Web-Backlog, Patch-For-Review, Readers-Web-Kanban-Board, Unplanned-Sprint-Work: Temporarily allow pushing large objects - https://phabricator.wikimedia.org/T178189#3693818 (bmansurov) Open>declined I think we have the information we need in order to proceed. I've updat...
[14:13:33] Continuous-Integration-Config, Anti-Harassment, Composer: Run `composer install` on Jenkins for AbuseFilter & AntiSpoof - https://phabricator.wikimedia.org/T178452#3693999 (dbarratt) >>! In T178452#3693007, @hashar wrote: > Extensions deployed on the Wikimedia cluster have the composer dependencies s...
[14:16:21] PROBLEM - Puppet errors on deployment-videoscaler01 is CRITICAL: CRITICAL: 11.11% of data above the critical threshold [0.0]
[14:26:23] RECOVERY - Puppet errors on deployment-videoscaler01 is OK: OK: Less than 1.00% above the threshold [0.0]
[14:56:10] Release-Engineering-Team (Kanban), Mathoid, Release Pipeline: Add experimental blubber test build/run to mathoid jenkins test pipeline - https://phabricator.wikimedia.org/T177954#3694093 (thcipriani) a:thcipriani>dduvall @dduvall already has a good start here, reassigning.
[14:56:36] Release-Engineering-Team (Kanban), Release Pipeline: Pipeline image build cleanup - https://phabricator.wikimedia.org/T177867#3694095 (thcipriani) a:thcipriani
[14:56:54] Continuous-Integration-Infrastructure (shipyard), Release-Engineering-Team (Kanban), Operations, Release Pipeline, monitoring: Icinga disk space alert when a Docker container is running on an host - https://phabricator.wikimedia.org/T178454#3694099 (thcipriani)
[15:05:23] Beta-Cluster-Infrastructure, monitoring, Tracking: Setup monitoring for Beta Cluster (tracking) - https://phabricator.wikimedia.org/T53497#581090 (Dzahn) > Setup (more) monitoring of Beta Cluster and expose it through ganglia/icinga/etc. Similar monitoring as to what is of production, just not set to...
[15:45:14] Gerrit, Release-Engineering-Team (Backlog), Scap: Deploy gerrit with scap3 - https://phabricator.wikimedia.org/T157414#3694296 (demon) No, puppet will just overwrite it.
[15:50:43] no_justification upstream removed support for velocity, so i am going to get upstream to merge my change if i can convince luca :).
[15:50:56] * paladox writes patch to remove velocity from its-base to unbreak it
[16:18:09] RelEng-Archive-FY201718-Q1, Scap (Scap3-MediaWiki-MVP), Fundraising-Backlog, MediaWiki-extensions-ContributionTracking, and 3 others: Clean up Contribution Tracking settings in main wmf config repo - https://phabricator.wikimedia.org/T147479#3694373 (DStrine)
[16:26:12] PROBLEM - puppet last run on contint1001 is CRITICAL: CRITICAL: Puppet has 38 failures. Last run 3 minutes ago with 38 failures. Failed resources (up to 3 shown): File[/etc/apache2/conf-available/50-server-status.conf],File[/usr/local/bin/prometheus-puppet-agent-stats],File[/home/hashar],File[/home/yuvipanda]
[16:45:39] Continuous-Integration-Infrastructure, Patch-For-Review, User-Addshore: un blacklist https://integration.wikimedia.org/ci/computer/XXXX/builds - https://phabricator.wikimedia.org/T178458#3694441 (hashar) >>! In T178458#3693119, @Paladox wrote: > Duplicate of T177827 ? Yeah that it is tightly related...
[17:20:38] mutante remeber you were asking for if the new template changed worked regarding identifying who created the change.
[17:20:41] it worked
[17:20:42] Change in test[master]: Testtesttesttesttesttesttesttest by Paladox
[17:21:23] paladox: i forgot the context and writing the class for Cyberpower :)
[17:21:30] ok :)
[17:27:12] no_justifications hi, wondering could you merge https://gerrit-review.googlesource.com/#/c/plugins/its-phabricator/+/133790/ please? :) I've tested locally and works. And could a stable-2.14 and stable-2.15 branch be created please.
[17:29:06] Continuous-Integration-Config, MediaWiki-extensions-DonationInterface: Run unit tests on mediawiki/extensions/DonationInterface also against mediawiki/core master - https://phabricator.wikimedia.org/T178516#3694578 (Umherirrender)
[18:15:39] Could someone please look at https://gerrit.wikimedia.org/r/#/c/371947/? as it's been waiting for more than a month now with no progress
[18:21:11] Woops i spelt no_justifications wrong it's no_justification :)
[18:26:05] PROBLEM - Free space - all mounts on integration-slave-jessie-1002 is CRITICAL: CRITICAL: integration.integration-slave-jessie-1002.diskspace._mnt.byte_percentfree (No valid datapoints found)integration.integration-slave-jessie-1002.diskspace._srv.byte_percentfree (<100.00%)
[18:29:59] !ran `foreachwiki extensions/LoginNotify/maintenance/migratePreferences.php` on deployment-prep
[18:32:18] !log MaxSem ran `foreachwiki extensions/LoginNotify/maintenance/migratePreferences.php` on deployment-prep
[18:32:24] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[18:32:34] d'oh :P
[18:32:39] :0
[18:32:42] ;0
[18:32:43] gah
[18:32:44] :)
[18:56:39] Hmm, CI looks starved for jessie nodes again. :-(
[18:57:06] We're having a bit of an openstack outage, I'm looking at it
[18:57:17] I still don't know what the problem is though, every individual thing I've checked looks fine
[18:58:39] Gerrit, Release-Engineering-Team (Backlog), Patch-For-Review: Update gerrit to 2.14.5 - https://phabricator.wikimedia.org/T156120#3694797 (Paladox)
[19:00:51] Thanks andrewbogott.
[19:10:36] as part of https://phabricator.wikimedia.org/T171473 we put that box in service and it seems to have bogged down everything around it
[19:10:42] so trying to sort that out
[19:10:45] that server is cursed
[19:20:18] nodepool should be catching back up now, slowly
[19:23:51] Project selenium-MinervaNeue » firefox,beta,Linux,BrowserTests build #167: FAILURE in 34 min: https://integration.wikimedia.org/ci/job/selenium-MinervaNeue/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=BrowserTests/167/
[19:28:22] PROBLEM - puppet last run on contint1001 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[19:28:58] I don't know what ^ is and am pretty sure it's unrelated
[19:29:02] nodepool looks good to me now
[19:30:24] see -operations
[19:31:53] Gerrit, Jenkins: Jenkins-bot lating with adding verifies in patches on gerrit - https://phabricator.wikimedia.org/T176546#3694921 (Zoranzoki21) Invalid>Open See: https://gerrit.wikimedia.org/r/#/c/383999/3 Any is not ok.. He added verify on gerrit after ~40 minutes
[19:34:40] Continuous-Integration-Config, AntiSpoof, Anti-Harassment (AHT Sprint 7), Patch-For-Review: Get Equivset Test Coverage to 100% - https://phabricator.wikimedia.org/T177667#3694933 (hashar)
[19:47:07] Gerrit, Jenkins: Jenkins-bot lating with adding verifies in patches on gerrit - https://phabricator.wikimedia.org/T176546#3694970 (hashar) Open>declined The reason is we have very limited resources to run the jobs. Each job runs in a virtual machine which is discarded at the end of the run. The...
[19:53:22] RECOVERY - puppet last run on contint1001 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[20:04:59] Beta-Cluster-Infrastructure, Release-Engineering-Team (Kanban), Operations, Patch-For-Review: nutcracker fails to start due to lack of /var/run/nutcracker (ex: deployment-videoscaler01 has memcached failures) - https://phabricator.wikimedia.org/T178457#3694999 (hashar) @gilles sorry for the spam...
[20:08:14] Gerrit, Readers-Web-Backlog, Patch-For-Review, Readers-Web-Kanban-Board, Unplanned-Sprint-Work: Temporarily allow pushing large objects - https://phabricator.wikimedia.org/T178189#3695008 (bmansurov) Puppeteer documentations warns against using versions of Chromium that doesn't come with pupe...
[20:27:57] Gerrit, Readers-Web-Backlog, Patch-For-Review, Readers-Web-Kanban-Board, Unplanned-Sprint-Work: Temporarily allow pushing large objects - https://phabricator.wikimedia.org/T178189#3695040 (bmansurov) declined>Open Reopening given T178189#3695008. What do you guys think?
[20:41:55] Yippee, build fixed!
[20:41:55] Project selenium-Echo » chrome,beta,Linux,BrowserTests build #551: FIXED in 54 sec: https://integration.wikimedia.org/ci/job/selenium-Echo/BROWSER=chrome,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=BrowserTests/551/
[20:42:43] Project selenium-Echo » firefox,beta,Linux,BrowserTests build #551: FAILURE in 1 min 43 sec: https://integration.wikimedia.org/ci/job/selenium-Echo/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=BrowserTests/551/
[20:53:35] Gerrit, Readers-Web-Backlog, Patch-For-Review, Readers-Web-Kanban-Board, Unplanned-Sprint-Work: Temporarily allow pushing large objects - https://phabricator.wikimedia.org/T178189#3695125 (Joe) I would still advise distributing such a large binary (and the corresponding libraries) as a deb pa...
[20:56:21] Gerrit, Readers-Web-Backlog, Patch-For-Review, Readers-Web-Kanban-Board, Unplanned-Sprint-Work: Temporarily allow pushing large objects - https://phabricator.wikimedia.org/T178189#3695130 (Joe) >>! In T178189#3695008, @bmansurov wrote: > Puppeteer documentations warns against using versions o...
[21:08:29] Continuous-Integration-Config, AntiSpoof, Anti-Harassment (AHT Sprint 7), Patch-For-Review: Get Equivset Test Coverage to 100% - https://phabricator.wikimedia.org/T177667#3695168 (dbarratt) >>! In T177667#3694929, @hashar wrote: > Do you have a Gerrit change and job name that shows the issue? H...
[21:33:07] PROBLEM - Puppet errors on integration-slave-jessie-1004 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[21:33:30] PROBLEM - Puppet errors on deployment-eventlogging04 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[21:33:36] Project beta-code-update-eqiad build #177455: FAILURE in 35 sec: https://integration.wikimedia.org/ci/job/beta-code-update-eqiad/177455/
[21:34:12] Continuous-Integration-Config, Fundraising-Backlog, Fundraising Sprint Prank Seatbelt, Patch-For-Review, Unplanned-Sprint-Work: Continuous integration: wikimedia/fundraising/tools/DjangoBannerStats needs V+2 jobs - https://phabricator.wikimedia.org/T121723#3695244 (DStrine)
[21:37:48] PROBLEM - Puppet errors on deployment-eventlog02 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0]
[21:45:42] Yippee, build fixed!
[21:45:42] Project beta-code-update-eqiad build #177456: FIXED in 38 sec: https://integration.wikimedia.org/ci/job/beta-code-update-eqiad/177456/
[22:12:47] RECOVERY - Puppet errors on deployment-eventlog02 is OK: OK: Less than 1.00% above the threshold [0.0]
[22:13:09] RECOVERY - Puppet errors on integration-slave-jessie-1004 is OK: OK: Less than 1.00% above the threshold [0.0]
[22:13:31] RECOVERY - Puppet errors on deployment-eventlogging04 is OK: OK: Less than 1.00% above the threshold [0.0]
[22:22:23] Continuous-Integration-Config, Continuous-Integration-Infrastructure (Little Steps Sprint), Release-Engineering-Team (Backlog): Get rid of Zend 5.5 tests for wmf branches - https://phabricator.wikimedia.org/T94149#3695360 (Jdforrester-WMF)
[22:33:38] Release-Engineering-Team (Kanban), Scap (Tech Debt Sprint FY201718-Q2), Deployments, WorkType-NewFunctionality: Scap3 submodule space issues - https://phabricator.wikimedia.org/T137124#3695393 (mmodell)
[22:37:33] Gerrit, Release-Engineering-Team (Backlog), Scap: Deploy gerrit with scap3 - https://phabricator.wikimedia.org/T157414#3695395 (Paladox) This can now be closed as resolved :). Change was merged. Package removed from puppet but not host.
[22:46:07] (PS3) Paladox: Update mw-install-postgresql to include the install script [integration/jenkins] - https://gerrit.wikimedia.org/r/316232 (https://phabricator.wikimedia.org/T22343)
[22:46:22] (PS4) Paladox: Add an assert-databaseflavour script [integration/config] - https://gerrit.wikimedia.org/r/316227
[22:47:58] (PS5) Paladox: Add php7 pipeline for zuul [integration/config] - https://gerrit.wikimedia.org/r/313213 (https://phabricator.wikimedia.org/T144872)
[22:49:52] (CR) Gergő Tisza: [C: 2] Add ReadingLists extension [tools/release] - https://gerrit.wikimedia.org/r/384909 (https://phabricator.wikimedia.org/T174651) (owner: Gergő Tisza)
[22:51:58] (PS6) Paladox: Move script out of assert-phpflavor macro [integration/config] - https://gerrit.wikimedia.org/r/296061 (https://phabricator.wikimedia.org/T124572)
[22:52:00] (Merged) jenkins-bot: Add ReadingLists extension [tools/release] - https://gerrit.wikimedia.org/r/384909 (https://phabricator.wikimedia.org/T174651) (owner: Gergő Tisza)
[23:32:11] PROBLEM - Puppet errors on integration-slave-docker-1005 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]