[00:42:40] (03CR) 10Legoktm: [C: 032] Make rules for footer contents less strict [integration/commit-message-validator] - 10https://gerrit.wikimedia.org/r/304496 (https://phabricator.wikimedia.org/T142804) (owner: 10BryanDavis) [00:43:08] (03Merged) 10jenkins-bot: Make rules for footer contents less strict [integration/commit-message-validator] - 10https://gerrit.wikimedia.org/r/304496 (https://phabricator.wikimedia.org/T142804) (owner: 10BryanDavis) [00:49:03] (03PS3) 10Legoktm: Add script to test already merged commits in a repository [integration/commit-message-validator] - 10https://gerrit.wikimedia.org/r/304583 [00:49:28] (03CR) 10Legoktm: [C: 032] Add script to test already merged commits in a repository [integration/commit-message-validator] - 10https://gerrit.wikimedia.org/r/304583 (owner: 10Legoktm) [00:49:56] (03Merged) 10jenkins-bot: Add script to test already merged commits in a repository [integration/commit-message-validator] - 10https://gerrit.wikimedia.org/r/304583 (owner: 10Legoktm) [01:13:06] PROBLEM - Long lived cherry-picks on puppetmaster on deployment-puppetmaster is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0] [01:15:14] (03PS1) 10Legoktm: Release 0.4.0 [integration/commit-message-validator] - 10https://gerrit.wikimedia.org/r/306085 [01:16:53] (03CR) 10Legoktm: [C: 032] Release 0.4.0 [integration/commit-message-validator] - 10https://gerrit.wikimedia.org/r/306085 (owner: 10Legoktm) [01:18:35] (03Merged) 10jenkins-bot: Release 0.4.0 [integration/commit-message-validator] - 10https://gerrit.wikimedia.org/r/306085 (owner: 10Legoktm) [01:28:27] PROBLEM - Puppet staleness on deployment-imagescaler01 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [43200.0] [02:25:26] RECOVERY - Host deployment-parsoid05 is UP: PING OK - Packet loss = 0%, RTA = 0.72 ms [02:31:08] PROBLEM - Host deployment-parsoid05 is DOWN: CRITICAL - Host Unreachable (10.68.16.120) [03:17:09] (03CR) 10Krinkle: [C: 04-1] "Pending outcome of upstream bug." [integration/config] - 10https://gerrit.wikimedia.org/r/305993 (https://phabricator.wikimedia.org/T142964) (owner: 10Phedenskog) [03:30:50] 10Continuous-Integration-Config, 10Continuous-Integration-Infrastructure, 10MobileFrontend, 13Patch-For-Review, and 3 others: Jenkins complains on MobileFrontend commits with Could not read gem at /var/lib/gems/2.1.0/cache/rake-10.5.0.gem. It may be corrupt... - https://phabricator.wikimedia.org/T143601#2574258 [04:09:44] is zuul dead again? :( [04:11:32] ah, its probably wmf26 [04:16:50] Project selenium-MultimediaViewer » firefox,beta,Linux,contintLabsSlave && UbuntuTrusty build #117: 04FAILURE in 20 min: https://integration.wikimedia.org/ci/job/selenium-MultimediaViewer/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=contintLabsSlave%20&&%20UbuntuTrusty/117/ [04:20:41] 10MediaWiki-Releasing, 10MediaWiki-General-or-Unknown, 05MW-1.23-release, 05MW-1.26-release, 05Release: Regression: MediaWiki 1.26.4 is using incompatible array syntax - https://phabricator.wikimedia.org/T143635#2574298 (10demon) a:03demon [04:21:15] 10MediaWiki-Releasing, 10MediaWiki-General-or-Unknown, 05MW-1.23-release, 05MW-1.26-release, 05Release: Regression: MediaWiki 1.26.4 is using incompatible array syntax - https://phabricator.wikimedia.org/T143635#2574301 (10Joergi123) T137264 added the according change. Interestingly, the changeset in Ger... 
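(Context for the T143635 thread above: PHP only gained the short array syntax `[...]` in 5.4, while the 1.23 and 1.26 release branches still supported PHP 5.3, so a backport written with the short syntax is a parse error there. A minimal sketch of the breakage and of the workaround demon describes just below — the variable name is made up for illustration:)

<?php
// Short array syntax: a parse error on PHP 5.3.
$options = [ 'foo' => true ];

// Long array syntax -- the workaround: swap `[` for `array(` and
// `]` for `)`. Parses on PHP 5.3 and later.
$options = array( 'foo' => true );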
[04:21:35] 10MediaWiki-Releasing, 10MediaWiki-General-or-Unknown, 05MW-1.23-release, 05MW-1.26-release, 05Release: Regression: MediaWiki 1.26.4 is using incompatible array syntax - https://phabricator.wikimedia.org/T143635#2574303 (10demon) As soon as the tags are finished, I'll build some new tarballs. [04:22:21] 10MediaWiki-Releasing, 10MediaWiki-General-or-Unknown, 05MW-1.23-release, 05MW-1.26-release, 05Release: Regression: MediaWiki 1.26.4 is using incompatible array syntax - https://phabricator.wikimedia.org/T143635#2574306 (10demon) >>! In T143635#2574301, @Joergi123 wrote: > T137264 added the according cha... [04:22:41] 10MediaWiki-Releasing, 10MediaWiki-General-or-Unknown, 05MW-1.23-release, 05MW-1.26-release, 05Release: Regression: MediaWiki 1.26.4 is using incompatible array syntax - https://phabricator.wikimedia.org/T143635#2574307 (10demon) p:05Triage>03Unbreak! [04:24:19] 10MediaWiki-Releasing, 10MediaWiki-General-or-Unknown, 05MW-1.23-release, 05MW-1.26-release, 05Release: Regression: MediaWiki 1.26.4 is using incompatible array syntax - https://phabricator.wikimedia.org/T143635#2574310 (10demon) @Joergi123 As a quick workaround, you can swap `[` for `array(` and `]` for... [04:26:07] 10MediaWiki-Releasing, 10MediaWiki-General-or-Unknown, 05MW-1.23-release, 05MW-1.26-release, 05Release: Regression: MediaWiki 1.26.4 is using incompatible array syntax - https://phabricator.wikimedia.org/T143635#2574311 (10Joergi123) >>! In T143635#2574310, @demon wrote: > @Joergi123 As a quick workaroun... [04:26:48] 10MediaWiki-Releasing, 10MediaWiki-General-or-Unknown, 05MW-1.23-release, 05MW-1.26-release, 05Release: Regression: MediaWiki 1.26.4 is using incompatible array syntax - https://phabricator.wikimedia.org/T143635#2574312 (10demon) Yeah I'll wrap it up shortly :) [05:17:21] 10MediaWiki-Releasing, 10MediaWiki-General-or-Unknown, 05MW-1.23-release, 05MW-1.26-release, 05Release: Regression: MediaWiki 1.26.4 is using incompatible array syntax - https://phabricator.wikimedia.org/T143635#2574344 (10demon) 05Open>03Resolved [[https://lists.wikimedia.org/pipermail/mediawiki-ann... [05:41:52] 10MediaWiki-Releasing, 10MediaWiki-General-or-Unknown, 05MW-1.23-release, 05MW-1.26-release, 05Release: Regression: MediaWiki 1.26.4 is using incompatible array syntax - https://phabricator.wikimedia.org/T143635#2574389 (10Joergi123) Actually the tarballs for 1.26 are still unchanged... [05:44:20] 10MediaWiki-Releasing, 10MediaWiki-General-or-Unknown, 05MW-1.23-release, 05MW-1.26-release, 05Release: Regression: MediaWiki 1.26.4 is using incompatible array syntax - https://phabricator.wikimedia.org/T143635#2574392 (10demon) Looking at mediawiki-1.26.4.patch.gz I'm seeing the correct array syntax i... [05:49:14] (03CR) 10Phedenskog: "The upstream bug: It works for mobile devices but it fails for desktop, so I've configured so we only run the mobile devices as a proxy, s" [integration/config] - 10https://gerrit.wikimedia.org/r/305993 (https://phabricator.wikimedia.org/T142964) (owner: 10Phedenskog) [05:53:15] (03CR) 10Krinkle: "The bug you saw with broken timeline captures, does that affect wpt-reporter as well, or only the web interface?" 
[integration/config] - 10https://gerrit.wikimedia.org/r/305993 (https://phabricator.wikimedia.org/T142964) (owner: 10Phedenskog) [05:53:34] 10MediaWiki-Releasing, 10MediaWiki-General-or-Unknown, 05MW-1.23-release, 05MW-1.26-release, 05Release: Regression: MediaWiki 1.26.4 is using incompatible array syntax - https://phabricator.wikimedia.org/T143635#2574393 (10Joergi123) I have tried with mediawiki-1.26.4.tar.gz and also a new download still... [06:00:51] 10MediaWiki-Releasing, 10MediaWiki-General-or-Unknown, 05MW-1.23-release, 05MW-1.26-release, 05Release: Regression: MediaWiki 1.26.4 is using incompatible array syntax - https://phabricator.wikimedia.org/T143635#2574283 (10Legoktm) Maybe it is cached somewhere? ``` km@km-tp ~/p/sandbox> wget https://rele... [06:15:22] 10MediaWiki-Releasing, 10MediaWiki-General-or-Unknown, 05MW-1.23-release, 05MW-1.26-release, 05Release: Regression: MediaWiki 1.26.4 is using incompatible array syntax - https://phabricator.wikimedia.org/T143635#2574442 (10Joergi123) Obviously it was cached, but not on **my** side. I even downloaded to t... [06:54:34] 10Beta-Cluster-Infrastructure, 06Operations: beta: Get SSL certificates for *.{projects}.beta.wmflabs.org - https://phabricator.wikimedia.org/T50501#2574547 (10yuvipanda) [06:55:44] 10MediaWiki-Releasing, 10MediaWiki-General-or-Unknown, 05MW-1.23-release, 05MW-1.26-release, 05Release: Regression: MediaWiki 1.26.4 is using incompatible array syntax - https://phabricator.wikimedia.org/T143635#2574549 (10demon) It does sit behind varnish, which I suppose is nice except in rare instance... [07:00:48] 10Continuous-Integration-Config, 10Continuous-Integration-Infrastructure, 10MobileFrontend, 13Patch-For-Review, and 3 others: Jenkins complains on MobileFrontend commits with Could not read gem at /var/lib/gems/2.1.0/cache/rake-10.5.0.gem. It may be corrupt... - https://phabricator.wikimedia.org/T143601#2573107 [07:14:28] /tmp/hudson1687932244205747632.sh: line 5: /usr/bin/lintian-junit-report: No such file or directory - hmm. again. [08:04:23] I'm not sure why the zuul queue is so backlogged, it looks like the jobs just aren't running even though there are free executor slots? [08:07:17] good morning hashar [08:07:23] hello :-} [08:07:50] legoktm: thank you for all the nodepool/CI baby sitting you have done ! [08:08:07] no problem :) [08:08:37] I'm not sure why it's stuck right now [08:08:43] [01:04:23] I'm not sure why the zuul queue is so backlogged, it looks like the jobs just aren't running even though there are free executor slots? [08:08:56] there was a giant backlog because of the security release, but it cleared through that pretty well [08:09:14] but there are trusty slaves just sitting there, doing nothing... [08:12:01] legoktm: oh I will deal with it [08:12:12] do you know what's wrong? [08:12:19] most probably no trusty slaves can be spawned [08:12:30] or Jenkins is deadlocked somehow [08:12:44] no, these are normal slaves - not nodepool [08:13:01] all the nodepool jobs on trusty ran just fine :) [08:13:10] oh [08:13:41] all the jessie jobs on permanent slaves and nodepool also ran fine. 
It seems to just be trusty permanent slaves having issues [08:14:11] * hashar tries disabling and reenabling gearman client [08:14:29] that sometimes fixes deadlocks [08:15:00] legoktm: solved by disabling/enabling gearman client [08:15:18] the thing is that there is a bad interaction between the Jenkins plugin that throttles builds to one per node and the Gearman plugin [08:15:29] sometimes they race for executors and end up deadlocked on each other [08:15:31] ah [08:15:36] disabling gearman removes the deadlock [08:16:06] !log disabled/enabled Jenkins Gearman client to remove deadlock with Throttle plugin [08:16:09] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master [08:16:18] whenever we get most stuff moved to Nodepool we will be able to drop the Throttle plugin [08:18:32] !log reboot integration-slave-trusty-1014 [08:18:39] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master [08:22:48] !log running puppet on integration-slave-trusty-1014 [08:22:52] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master [08:25:43] PROBLEM - Puppet staleness on integration-slave-trusty-1014 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [43200.0] [08:27:02] 10Continuous-Integration-Config, 10Continuous-Integration-Infrastructure, 10MobileFrontend, 13Patch-For-Review, and 3 others: Jenkins complains on MobileFrontend commits with Could not read gem at /var/lib/gems/2.1.0/cache/rake-10.5.0.gem. It may be corrupt... - https://phabricator.wikimedia.org/T143601#2574626 [08:35:46] RECOVERY - Puppet staleness on integration-slave-trusty-1014 is OK: OK: Less than 1.00% above the threshold [3600.0] [08:54:39] 10Beta-Cluster-Infrastructure: New wiki cluster wikipedia indonesian language - https://phabricator.wikimedia.org/T143557#2574672 (10hashar) p:05Triage>03Normal Each wiki added to the beta cluster adds overhead to the ongoing maintenance of the small infrastructure. In most cases it is better to just use on... [08:56:05] 10Beta-Cluster-Infrastructure, 07Beta-Cluster-reproducible, 07I18n: On Beta Cluster, MediaWiki namespace override is inconsistently applied - https://phabricator.wikimedia.org/T142863#2574675 (10hashar) 05Open>03Resolved a:03hashar Assuming it is fixed for now. Can't reproduce reliably for now, we can a... [09:04:04] 05Continuous-Integration-Scaling, 07Nodepool: Nodepool should send metrics to statsd - https://phabricator.wikimedia.org/T111496#2574696 (10hashar) 05stalled>03Resolved a:03chasemp Nodepool now reports statistics to statsd. Has been done via https://gerrit.wikimedia.org/r/#/c/305529/ We will want to pat...
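(A note on the statsd reporting resolved in T111496 above: upstream Nodepool of this era turns on statsd emission simply when the standard statsd environment variables are set for the daemon. A sketch, assuming the conventional WMF statsd endpoint — the host, port, and file path here are illustrative, not taken from the puppet change linked above:)

# In the environment of the nodepoold process (e.g. /etc/default/nodepool):
STATSD_HOST=statsd.eqiad.wmnet
STATSD_PORT=8125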
[09:36:54] 05Continuous-Integration-Scaling, 07Nodepool: Nodepool should send metrics to statsd - https://phabricator.wikimedia.org/T111496#2574751 (10hashar) And I have created a basic dashboard in Grafana at https://grafana-admin.wikimedia.org/dashboard/db/nodepool [09:43:26] (03PS25) 10Zfilipin: WIP Run language screenshots script for VisualEditor in Jenkins [integration/config] - 10https://gerrit.wikimedia.org/r/300035 (https://phabricator.wikimedia.org/T139613) [10:40:49] (03PS1) 10Hashar: Link to Nodepool grafana dashboard [integration/docroot] - 10https://gerrit.wikimedia.org/r/306175 [10:42:42] 10Browser-Tests-Infrastructure, 10VisualEditor, 10VisualEditor-MediaWiki, 13Patch-For-Review, 15User-zeljkofilipin: Fix font support on SauceLabs VE screenshots - https://phabricator.wikimedia.org/T141369#2574814 (10zeljkofilipin) The job is running using Chrome browser. | | [[ https://integration.wiki... [10:44:08] (03CR) 10Hashar: [C: 032] Link to Nodepool grafana dashboard [integration/docroot] - 10https://gerrit.wikimedia.org/r/306175 (owner: 10Hashar) [10:44:25] (03Merged) 10jenkins-bot: Link to Nodepool grafana dashboard [integration/docroot] - 10https://gerrit.wikimedia.org/r/306175 (owner: 10Hashar) [11:36:13] zeljkof: look at http://logstash-beta.wmflabs.org ? :) [11:37:03] hashar: thanks, looking [11:37:07] I am getting this: [11:37:24] /usr/local/lib/ruby/gems/2.3.0/gems/mediawiki_api-0.7.0/lib/mediawiki_api/client.rb:211:in `send_request': [V7wzDQpEEH8AAHb1ERgAAAAD] Exception Caught: Could not acquire lock for "Array". (internal_api_error_LocalFileLockError) (MediawikiApi::ApiError) [11:37:32] works fine on vagrant and production [11:37:39] broken only on beta [11:38:31] Failed to lock 'VisualEditor_toolbar-ta.png' [11:38:36] does not say much [11:38:41] at least something I can report [11:38:45] thanks hashar [11:46:33] zeljkof: well report that as a task for VE folks + beta-cluster-infra [11:46:40] there must be some more traces in logstash [11:46:52] reporting [11:47:01] I do not think it is related to VE [11:47:02] make sure to dig in logs :} [11:47:12] might be resource loader [11:47:16] logs are all spanish to me :| [11:47:24] I am using the API [11:47:28] not the web [11:53:16] hashar: https://phabricator.wikimedia.org/T143655 [11:53:39] 10Beta-Cluster-Infrastructure, 06Commons, 10MediaWiki-API, 07Beta-Cluster-reproducible, 15User-zeljkofilipin: internal_api_error_LocalFileLockError while uploading file via API to commons.wikimedia.beta.wmflabs.org - https://phabricator.wikimedia.org/T143655#2575020 (10zeljkofilipin) [12:27:16] 06Release-Engineering-Team (Deployment-Blockers), 05Release: MW-1.28.0-wmf.16 deployment blockers - https://phabricator.wikimedia.org/T141551#2575183 (10hashar) [12:35:02] 10MediaWiki-Releasing, 10MediaWiki-Containers: Ready-to-use Docker package for MediaWiki - https://phabricator.wikimedia.org/T92826#2575219 (10mobrovac) [PR #1](https://github.com/wikimedia/mediawiki-node-services/pull/1) updates the Node services container with the latest developments. 
[12:35:25] 10MediaWiki-Releasing, 10MediaWiki-Containers, 06Services, 15User-mobrovac: Ready-to-use Docker package for MediaWiki - https://phabricator.wikimedia.org/T92826#2575220 (10mobrovac) [12:39:09] 06Release-Engineering-Team (Deployment-Blockers), 05Release: MW-1.28.0-wmf.16 deployment blockers - https://phabricator.wikimedia.org/T141551#2575224 (10hashar) I have applied the security patch For core most of got merged in master yesterday and I have dropped them from /srv/patches/1.28.0-wmf.16/core. One... [12:49:57] 06Release-Engineering-Team (Deployment-Blockers), 13Patch-For-Review, 05Release: MW-1.28.0-wmf.16 deployment blockers - https://phabricator.wikimedia.org/T141551#2575265 (10hashar) Next step is to sync to cluster https://wikitech.wikimedia.org/wiki/Heterogeneous_deployment/Train_deploys#Sync_to_cluster_and_v... [13:20:31] HMmMMm [13:20:37] puppet compiler says: [13:20:38] [ 2016-08-23T13:12:19 ] CRITICAL: Build run failed: [Errno 28] No space left on device: '/mnt/jenkins-workspace/puppet-compiler/3799' [13:20:57] https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler/3799/console [13:25:12] ottomata: task fill please :) [13:27:01] 10Beta-Cluster-Infrastructure, 06Commons, 10MediaWiki-API, 10MediaWiki-Uploading, and 3 others: internal_api_error_LocalFileLockError while uploading file via API to commons.wikimedia.beta.wmflabs.org - https://phabricator.wikimedia.org/T143655#2575366 (10Anomie) [13:27:34] ottomata: ema had the same issue earlier [13:27:38] the slave needs to be cleanedup [13:33:40] Project beta-code-update-eqiad build #118280: 15ABORTED in 39 sec: https://integration.wikimedia.org/ci/job/beta-code-update-eqiad/118280/ [13:34:00] 10Continuous-Integration-Infrastructure, 06Operations: OSError: [Errno 28] No space left on device on compiler02.puppet3-diffs.eqiad.wmflabs - https://phabricator.wikimedia.org/T143671#2575396 (10ema) [13:34:18] ottomata: https://phabricator.wikimedia.org/T143671 [13:34:19] :D [13:36:47] thanks! [13:36:54] 10Continuous-Integration-Infrastructure, 06Operations: OSError: [Errno 28] No space left on device on compiler02.puppet3-diffs.eqiad.wmflabs - https://phabricator.wikimedia.org/T143671#2575396 (10Ottomata) +1 [13:39:14] 10Continuous-Integration-Infrastructure, 06Operations, 10puppet-compiler: OSError: [Errno 28] No space left on device on compiler02.puppet3-diffs.eqiad.wmflabs - https://phabricator.wikimedia.org/T143671#2575460 (10hashar) [13:51:11] 10Beta-Cluster-Infrastructure, 06Commons, 10MediaWiki-API, 10MediaWiki-Uploading, and 3 others: internal_api_error_LocalFileLockError while uploading file via API to commons.wikimedia.beta.wmflabs.org - https://phabricator.wikimedia.org/T143655#2575547 (10Anomie) Looking at other log messages in that reque... [14:02:54] European SWAT went just well. [14:03:23] 10Beta-Cluster-Infrastructure, 06Commons, 10MediaWiki-API, 10MediaWiki-Uploading, and 3 others: internal_api_error_LocalFileLockError while uploading file via API to commons.wikimedia.beta.wmflabs.org - https://phabricator.wikimedia.org/T143655#2575625 (10zeljkofilipin) >>! In T143655#2575545, @Anomie wrot... 
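(On the 13:20 "No space left on device" failure above — a generic way to confirm which filesystem filled up and which old compiler runs are eating it, using the workspace path from the error message; a sketch only:)

$ df -h /mnt
$ du -sh /mnt/jenkins-workspace/puppet-compiler/* | sort -h | tail
# then remove the oldest numbered run directories once confirmed stale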
[14:03:49] 10Continuous-Integration-Config: Remove mediawiki/extensions/PdfBook from /zuul/layout.yaml - https://phabricator.wikimedia.org/T143683#2575628 (10Aklapper) [14:06:48] PROBLEM - Puppet run on integration-slave-trusty-1014 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [14:07:33] 10Beta-Cluster-Infrastructure, 06Commons, 10MediaWiki-API, 10MediaWiki-Uploading, and 3 others: internal_api_error_LocalFileLockError while uploading file via API to commons.wikimedia.beta.wmflabs.org - https://phabricator.wikimedia.org/T143655#2575653 (10Anomie) Do the reproductions lack the "Redis is loa... [14:08:08] PROBLEM - Puppet run on integration-slave-trusty-1011 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [14:10:00] (03PS26) 10Zfilipin: WIP Run language screenshots script for VisualEditor in Jenkins [integration/config] - 10https://gerrit.wikimedia.org/r/300035 (https://phabricator.wikimedia.org/T139613) [14:11:04] 06Release-Engineering-Team (Deployment-Blockers), 13Patch-For-Review, 05Release: MW-1.28.0-wmf.16 deployment blockers - https://phabricator.wikimedia.org/T141551#2575673 (10hashar) https://test.wikipedia.org/wiki/Special:Version | MediaWiki | 1.28.0-wmf.16 (88beb39) | 11:39, 23 August 2016 [14:14:02] PROBLEM - Puppet run on integration-slave-jessie-1001 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [14:16:50] PROBLEM - Puppet run on integration-slave-jessie-1002 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [14:20:04] (03PS1) 10Hashar: Drop PdfBook [integration/config] - 10https://gerrit.wikimedia.org/r/306213 (https://phabricator.wikimedia.org/T143683) [14:20:19] (03CR) 10Hashar: [C: 032] Drop PdfBook [integration/config] - 10https://gerrit.wikimedia.org/r/306213 (https://phabricator.wikimedia.org/T143683) (owner: 10Hashar) [14:21:17] (03Merged) 10jenkins-bot: Drop PdfBook [integration/config] - 10https://gerrit.wikimedia.org/r/306213 (https://phabricator.wikimedia.org/T143683) (owner: 10Hashar) [14:23:32] 10Continuous-Integration-Config, 13Patch-For-Review: Remove mediawiki/extensions/PdfBook from /zuul/layout.yaml - https://phabricator.wikimedia.org/T143683#2575740 (10hashar) 05Open>03Resolved a:03hashar [14:24:07] PROBLEM - Puppet run on integration-slave-jessie-1005 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [14:28:52] puppet broken will be fixed with https://gerrit.wikimedia.org/r/306214 [14:29:42] PROBLEM - Puppet run on integration-slave-jessie-1004 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [14:31:44] PROBLEM - Puppet run on integration-slave-jessie-1003 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [14:41:29] 10Continuous-Integration-Infrastructure, 06Labs, 07Wikimedia-Incident: Investigate upgrade of OpenStack python module for labnodepool1001 - https://phabricator.wikimedia.org/T143013#2575765 (10hashar) [14:41:49] RECOVERY - Puppet run on integration-slave-trusty-1014 is OK: OK: Less than 1.00% above the threshold [0.0] [14:43:06] 10Continuous-Integration-Infrastructure, 06Labs, 07Wikimedia-Incident: Investigate upgrade of OpenStack python module for labnodepool1001 - https://phabricator.wikimedia.org/T143013#2554094 (10hashar) I am merging this one in the other task T137217 they are close dupes [14:43:11] RECOVERY - Puppet run on integration-slave-trusty-1011 is OK: OK: Less than 1.00% above the threshold [0.0] [14:43:35] 10Continuous-Integration-Infrastructure, 07Nodepool: 
Clean up apt:pin of python modules used for Nodepool - https://phabricator.wikimedia.org/T137217#2575769 (10hashar) [14:43:38] 10Continuous-Integration-Infrastructure, 07Nodepool: Clean up apt:pin of python modules used for Nodepool - https://phabricator.wikimedia.org/T137217#2361186 (10hashar) Updated the task details with the list of current vs jessie-backports python modules. I guess we can first upgrade `python-novaclient` , rest... [14:44:01] RECOVERY - Puppet run on integration-slave-jessie-1001 is OK: OK: Less than 1.00% above the threshold [0.0] [14:44:29] 10Continuous-Integration-Infrastructure, 07Nodepool: Clean up apt:pin of python modules used for Nodepool - https://phabricator.wikimedia.org/T137217#2575796 (10hashar) [14:44:31] 10Continuous-Integration-Infrastructure, 06Labs, 07Wikimedia-Incident: Investigate upgrade of OpenStack python module for labnodepool1001 - https://phabricator.wikimedia.org/T143013#2575798 (10hashar) [14:49:19] 10Beta-Cluster-Infrastructure, 06Commons, 10MediaWiki-API, 10MediaWiki-Uploading, and 3 others: internal_api_error_LocalFileLockError while uploading file via API to commons.wikimedia.beta.wmflabs.org - https://phabricator.wikimedia.org/T143655#2575805 (10zeljkofilipin) Sorry, I am not familiar with logsta... [14:52:02] 10Beta-Cluster-Infrastructure, 06Commons, 10MediaWiki-API, 10MediaWiki-Uploading, and 3 others: internal_api_error_LocalFileLockError while uploading file via API to commons.wikimedia.beta.wmflabs.org - https://phabricator.wikimedia.org/T143655#2575808 (10Anomie) In the search bar near the top of the page,... [14:56:52] RECOVERY - Puppet run on integration-slave-jessie-1002 is OK: OK: Less than 1.00% above the threshold [0.0] [14:57:31] 10Continuous-Integration-Infrastructure, 06Labs, 07Nodepool, 13Patch-For-Review: Clean up apt:pin of python modules used for Nodepool - https://phabricator.wikimedia.org/T137217#2575815 (10hashar) https://gerrit.wikimedia.org/r/306220 `nodepool: bump nova client and openstack CLI` should do it. That would... 
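(For readers unfamiliar with the apt:pin mechanism being cleaned up in T137217 above: pinning a single package to jessie-backports looks roughly like this — an illustrative hand-written preferences file, not the actual puppet-managed one:)

# /etc/apt/preferences.d/python-novaclient (illustrative)
Package: python-novaclient
Pin: release a=jessie-backports
Pin-Priority: 1001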
[14:57:58] 10Continuous-Integration-Infrastructure, 06Labs, 07Nodepool, 13Patch-For-Review: Clean up apt:pin of python modules used for Nodepool - https://phabricator.wikimedia.org/T137217#2575822 (10hashar) p:05Low>03High [14:58:30] PROBLEM - Puppet run on deployment-sca02 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0] [14:58:31] 10Continuous-Integration-Infrastructure, 07Nodepool: 2016-08-10 CI incident follow-ups - https://phabricator.wikimedia.org/T142952#2575826 (10hashar) [14:58:33] 10Continuous-Integration-Infrastructure, 06Labs, 07Nodepool, 13Patch-For-Review: Clean up apt:pin of python modules used for Nodepool - https://phabricator.wikimedia.org/T137217#2575825 (10hashar) [14:59:10] RECOVERY - Puppet run on integration-slave-jessie-1005 is OK: OK: Less than 1.00% above the threshold [0.0] [14:59:14] PROBLEM - Puppet run on deployment-sca01 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0] [15:06:45] RECOVERY - Puppet run on integration-slave-jessie-1003 is OK: OK: Less than 1.00% above the threshold [0.0] [15:09:43] RECOVERY - Puppet run on integration-slave-jessie-1004 is OK: OK: Less than 1.00% above the threshold [0.0] [15:31:26] 10Continuous-Integration-Infrastructure, 06Labs, 13Patch-For-Review, 07Wikimedia-Incident: Nodepool instance instance creation quota management - https://phabricator.wikimedia.org/T143016#2575912 (10hashar) > we are doing a lot less and getting a lot more done it seems like Most of the jobs have been mov... [15:57:55] PROBLEM - Puppet run on integration-slave-trusty-1013 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0] [15:59:43] 10Continuous-Integration-Infrastructure, 06Labs, 13Patch-For-Review, 07Wikimedia-Incident: Nodepool instance instance creation quota management - https://phabricator.wikimedia.org/T143016#2575937 (10hashar) Gave it a try with: * max-server 12 * jessie 8 * trusty 4 And at 11 instances, attempt to spawn a... [16:04:38] 10Continuous-Integration-Infrastructure, 10MediaWiki-Unit-tests, 13Patch-For-Review, 07Regression: Job mediawiki-extensions-php55 frequently fails due to "Segmentation fault" - https://phabricator.wikimedia.org/T142158#2575953 (10EBernhardson) Looking at the overnight runs of mediawiki-extensions-php55, th... [16:08:35] 10Continuous-Integration-Infrastructure, 06Labs, 13Patch-For-Review, 07Wikimedia-Incident: Nodepool instance instance creation quota management - https://phabricator.wikimedia.org/T143016#2575961 (10hashar) Via the openstack command line tool I got 8 instances: | ef6ece68-5d79-4ba0-b2f0-e6d0fe308201 | ci-... [16:08:55] thcipriani: I am moving out but found three ghost instances in the contintcloud tenant https://phabricator.wikimedia.org/T143016#2575961 :D [16:09:07] thcipriani: something prevents them from showing up / being deleted so they are hidden [16:09:17] but they still count against the tenant instances quota :( [16:09:19] ah ha! [16:09:26] and magic https://grafana.wikimedia.org/dashboard/db/nodepool [16:09:33] so statsd is going to be helpful [16:09:43] and we gotta sort out with labs whatever prevents instances from being properly deleted [16:09:56] then we can surely bump the quota / restore the rate :} [16:10:37] * hashar disappears [16:12:10] changing the rate at which openstack is pinging nodepool seems to have cleared a lot of issues from the nodepool logs, FWIW [16:13:45] but that 2 instance find is amazing.
with a max-server of 10 we were only getting 6, with the allocation currently at 15 we were getting 10 sometimes. With that 2 instance removal we're now getting 12 sometimes. [16:13:57] I guess [16:14:04] * hashar bicycles [16:14:07] but, I suppose, max-servers has something to do with it. [16:20:32] 10scap, 06Operations: Scap::server::sources is out of sync with the repositories actually present on tin/mira - https://phabricator.wikimedia.org/T143692#2576004 (10Joe) [16:21:06] 10scap, 06Operations: Scap::server::sources is out of sync with the repositories actually present on tin/mira - https://phabricator.wikimedia.org/T143692#2576016 (10Joe) p:05Triage>03High [16:21:31] 10scap, 06Operations: Scap::server::sources is out of sync with the repositories actually present on tin/mira - https://phabricator.wikimedia.org/T143692#2576004 (10Joe) [16:30:36] (03CR) 10EBernhardson: "Looks reasonable to me. Mentioning the tickets in the commit message content is good, but they should also be in the structured section at" [tools/codesniffer] - 10https://gerrit.wikimedia.org/r/305762 (owner: 10Lethexie) [16:31:14] 03Scap3, 10scap, 06Operations, 15User-mobrovac: Scap::server::sources is out of sync with the repositories actually present on tin/mira - https://phabricator.wikimedia.org/T143692#2576089 (10mobrovac) [16:31:25] (03CR) 10EBernhardson: [C: 032] Add usage to forbid superglobals like $_GET,$_POST [tools/codesniffer] - 10https://gerrit.wikimedia.org/r/296395 (owner: 10Lethexie) [16:38:23] !log Fixed ops/puppet sync by removing stale cherry-pick of https://gerrit.wikimedia.org/r/#/c/305996/ [16:38:27] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master [16:39:00] blerg, sorry :(( [16:39:11] this is when I giggle because j.oe and I were talking about the puppet cherry-pick mess in deployment-prep yesterday and today his patch breaks it :) [16:39:51] well, the patch that I cherry-picked :P [16:40:24] we are up to 14 picks which is a bit crazy [16:40:28] some of them are really old too [16:41:27] 10Continuous-Integration-Infrastructure, 06Operations, 10puppet-compiler: OSError: [Errno 28] No space left on device on compiler02.puppet3-diffs.eqiad.wmflabs - https://phabricator.wikimedia.org/T143671#2576152 (10greg) p:05Triage>03High [16:41:56] it would be neat if there were arbitrary tags in gerrit so we could easily find all of the things that are cherry-picked [16:43:12] 10Continuous-Integration-Infrastructure, 06Operations, 10puppet-compiler: OSError: [Errno 28] No space left on device on compiler02.puppet3-diffs.eqiad.wmflabs - https://phabricator.wikimedia.org/T143671#2575396 (10greg) (Set it to High, but I assume @fgiunchedi fixed it by removing the old compilations.) [16:46:04] 10Continuous-Integration-Infrastructure, 06Labs, 13Patch-For-Review, 07Wikimedia-Incident: Nodepool instance instance creation quota management - https://phabricator.wikimedia.org/T143016#2576175 (10chasemp) @hashar I'm uncomfortable with you changing these values without a notice here before hand, a SAL e... [16:48:35] PROBLEM - Puppet run on deployment-redis02 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [17:01:14] 10Beta-Cluster-Infrastructure, 06Release-Engineering-Team: Beta puppetmaster cherry-pick process - https://phabricator.wikimedia.org/T135427#2576224 (10thcipriani) p:05High>03Normal Lowering priority from high since nothing has happened here in a while. Here's what's currently cherry picked: | Bryan Davis |...
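(The cherry-picks thcipriani enumerates above live as local commits on the beta puppetmaster's operations/puppet checkout, sitting ahead of origin/production, so plain git can list them; a sketch — the repository path on deployment-puppetmaster is an assumption:)

$ cd /var/lib/git/operations/puppet   # path may differ
$ git log --oneline origin/production..HEAD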
[17:02:12] ^ bd808 threw out an idea there that incorporates "tags" in some way [17:02:21] (probably not the way you were thinking :)) [17:03:13] ah. yeah something in phab could be made to work [17:03:57] I'm going to bump on each of them that looks stalled in gerrit and make sure there are reviewers too [17:04:13] gergo's sentry stuff has been rotting for a long time :/ [17:04:47] yeah, that's not the only thing :(( [17:08:01] 06Release-Engineering-Team: Preload TestingAccessWrapper in production mwrepl - https://phabricator.wikimedia.org/T143607#2576264 (10Mattflaschen-WMF) [17:10:27] RECOVERY - Puppet staleness on deployment-changeprop is OK: OK: Less than 1.00% above the threshold [3600.0] [17:34:18] 06Release-Engineering-Team, 06Operations, 07Puppet: Preload TestingAccessWrapper in production mwrepl - https://phabricator.wikimedia.org/T143607#2576402 (10greg) [17:42:14] 10Continuous-Integration-Infrastructure, 06Labs, 13Patch-For-Review, 07Wikimedia-Incident: Nodepool instance instance creation quota management - https://phabricator.wikimedia.org/T143016#2576452 (10hashar) Asked sorry @chasemp :-( [17:52:18] 06Release-Engineering-Team, 15User-greg, 07Wikimedia-Incident: Identify "first responders" for "all" "components" deployed on Wikimedia servers - https://phabricator.wikimedia.org/T141066#2576530 (10greg) [17:52:33] (03PS10) 10Legoktm: Add usage to forbid superglobals like $_GET,$_POST [tools/codesniffer] - 10https://gerrit.wikimedia.org/r/296395 (owner: 10Lethexie) [17:52:39] (03CR) 10Legoktm: [C: 032] Add usage to forbid superglobals like $_GET,$_POST [tools/codesniffer] - 10https://gerrit.wikimedia.org/r/296395 (owner: 10Lethexie) [17:55:24] (03Merged) 10jenkins-bot: Add usage to forbid superglobals like $_GET,$_POST [tools/codesniffer] - 10https://gerrit.wikimedia.org/r/296395 (owner: 10Lethexie) [17:58:15] 10Beta-Cluster-Infrastructure, 06Release-Engineering-Team: Beta puppetmaster cherry-pick process - https://phabricator.wikimedia.org/T135427#2576550 (10bd808) >>! In T135427#2576224, @thcipriani wrote: > Here's what's currently cherry picked: > > | Tyler Cipriani | scap: bump version to 3.2.3-1 merged > | Bra... [17:59:53] thcipriani: I got it back down to 11 cherry-picks. Not great, but minor progress [18:00:26] bd808: awesome! thanks for your help [18:20:09] (03PS1) 10Legoktm: Enable composer-test for TemplateSandbox [integration/config] - 10https://gerrit.wikimedia.org/r/306257 (https://phabricator.wikimedia.org/T143703) [18:20:38] (03CR) 10Legoktm: [C: 032] Enable composer-test for TemplateSandbox [integration/config] - 10https://gerrit.wikimedia.org/r/306257 (https://phabricator.wikimedia.org/T143703) (owner: 10Legoktm) [18:21:38] (03Merged) 10jenkins-bot: Enable composer-test for TemplateSandbox [integration/config] - 10https://gerrit.wikimedia.org/r/306257 (https://phabricator.wikimedia.org/T143703) (owner: 10Legoktm) [18:21:41] !log deploying https://gerrit.wikimedia.org/r/306257 [18:21:45] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master [18:53:08] RECOVERY - Long lived cherry-picks on puppetmaster on deployment-puppetmaster is OK: OK: Less than 100.00% above the threshold [0.0] [18:59:25] 10Continuous-Integration-Infrastructure, 06Labs, 13Patch-For-Review, 07Wikimedia-Incident: Nodepool instance instance creation quota management - https://phabricator.wikimedia.org/T143016#2576881 (10chasemp) >>! In T143016#2576452, @hashar wrote: > Asked sorry @chasemp :-( No worries, thanks for understan... 
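(On the superglobals sniff merged at 17:55 above: the MediaWiki convention it nudges code toward is reading request data through WebRequest rather than $_GET/$_POST directly. A minimal sketch, with a made-up parameter name:)

<?php
// What the sniff flags -- direct superglobal access:
$target = $_GET['target'];

// Conventional MediaWiki style -- go through the WebRequest object:
$target = RequestContext::getMain()->getRequest()->getVal( 'target' );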
[19:12:46] (03CR) 10Ejegg: "Thanks awight! Looks like it needs a manual rebase" [integration/config] - 10https://gerrit.wikimedia.org/r/301025 (https://phabricator.wikimedia.org/T141309) (owner: 10Awight) [19:34:00] 06Release-Engineering-Team (Deployment-Blockers), 13Patch-For-Review, 05Release: MW-1.28.0-wmf.16 deployment blockers - https://phabricator.wikimedia.org/T141551#2577111 (10hashar) There are some undefined index from SpamBlacklist but that is a known issue: T138429 [19:35:54] 10Continuous-Integration-Config: Update jenkins job builder - https://phabricator.wikimedia.org/T143731#2577130 (10Paladox) [19:39:58] hashar: noteworthy since you're doing train this week. I setup fatalmonitors for each group https://logstash.wikimedia.org/app/kibana#/dashboard/group0 and https://logstash.wikimedia.org/app/kibana#/dashboard/group1 [19:41:58] oh man [19:42:00] thank you for that [19:42:17] are the errors properly flagged with the version nowadays ? [19:43:23] I'm unsure how that version graph is being generated, actually [19:43:39] I added it iirc [19:44:02] no idea how i did it though [19:44:09] yeah, mwversion.raw :) [19:44:10] the new web interface really confuses me [19:44:28] takes some getting used to. [19:47:29] 10Continuous-Integration-Infrastructure, 06Release-Engineering-Team, 10Wikimedia-Logstash: Send Jenkins daemon logs to logstash - https://phabricator.wikimedia.org/T143733#2577168 (10hashar) [19:50:45] legoktm: thcipriani: if you get around still. I am curious why you got the various jenkins jobs back to the permanent slaves [19:51:02] but I guess it was because there was no/not enough instances available on nodepool [19:51:10] or was there some other oddity that needs proper fixing? [19:51:10] hashar since the tests were slow on nodepool [19:51:11] nodepool slaves were spawning really slowly or not at all [19:51:23] ^ [19:51:29] but once the build ran on them they worked just fine weren't they? 
[19:51:34] and the jobs I moved took less than a minute, it didn't really make sense to keep them on nodepool [19:51:42] yes, the jobs were running fine, once they ran [19:51:49] great :] [19:51:59] but they'd just burn through the available vms really quickly with little benefit [19:52:02] particularly last Tuesday and Wednesday, nodepool couldn't seem to spin up new machines or enough machines [19:52:24] also since Ch4se enabled statsd reporting we now have a basic dashboard at https://grafana.wikimedia.org/dashboard/db/nodepool [19:52:25] logs were filled with 403 errors and no actual servers were getting built [19:52:41] yup [19:53:39] when we get a bigger pool, the spike of jobs will be triggered in a few seconds [19:53:49] :) [19:53:50] also have to look again at the time it takes for an instance to boot [19:54:09] last low hanging fruit I have seen is grub waiting 5 seconds for user to choose a kernel to boot from [19:54:44] I think in this instance, labs tripped, and then nodepool couldn't right itself, took a couple reboots [19:55:16] then the next day at midnight utc wait times started to go up again [19:55:24] oh [19:55:43] midnight utc seems to be the peak demand (I guess) [19:56:01] that is a bit earlier usually [19:56:12] cause midnight UTC is 2am in europe [19:56:29] this was rough, based on looking at https://grafana.wikimedia.org/dashboard/db/releng-zuul?panelId=25&fullscreen for a few days [19:56:52] the huge spike over night was the bunch of security patches [19:57:23] cause if we get 7 patches over 3 release branches that is 21 changes [19:57:31] each entering test then gate and submit or 42 events [19:57:40] each having roughly 10 builds (mediawiki/core) [19:57:41] or 420 jobs [19:57:51] all sent over a few minutes [19:58:15] they are run in parallel but that really exhausts the infra :( [19:58:16] hashar actually 1am and 2am in europe LOL [19:58:50] ah yeah it would be 1am in Portugal [19:58:50] yeah, mostly looking at last week's stuff, it wasn't a ton of patches, but nodepool hit a deficit of like 40 machines and just couldn't recover. This was pre all the tweaks that were made last week: moving to permanent slaves and tweaking nodepool's refresh rate [19:59:01] and london ie uk [19:59:02] too [19:59:18] we're +0 UTC in the winter and +1 BST in the summer [19:59:24] and no more in europe :D [19:59:49] LOL, nope just +1 and +2, and +0 in winter [19:59:50] thcipriani: if it still managed to spawn / allocate instances it would have dealt with the queue but that would surely take a loooong time [20:00:08] part of the problem I think is that the jobs are still optimized for permanent slaves and we haven't combined them like we planned for nodepool to run travis-ci style [20:00:21] with a deficit of 40 machines each job taking say 5 minutes but only 4 instances // that would take an hour :/ [20:00:44] this is the first security release in a long time that I can remember where we didn't have issues with slaves running out of disk space and totally falling over - and I think that was because all the large jobs were on nodepool [20:01:07] yeah that is one of the intents of disposable instances [20:01:39] a discussion we had in spring with Dan was to have a few instances that would have lxc based containers added in slaves for the small / light jobs [20:01:42] legoktm: That and I was doing it at a low-traffic time. [20:01:52] Like 6-10pm pacific.
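(hashar's load math at 19:57 and his drain-time estimate at 20:00, spelled out with shell arithmetic — all the round numbers are his:)

$ echo $(( 7 * 3 ))       # security patches x release branches = changes
21
$ echo $(( 21 * 2 ))      # each change runs test, then gate-and-submit = pipeline events
42
$ echo $(( 42 * 10 ))     # ~10 builds per event (mediawiki/core) = jobs
420
$ echo $(( 40 / 4 * 5 ))  # 40-job deficit, 4 instances, ~5 min per job
50
# => ~50 minutes, i.e. roughly the hour he estimates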
or even for everything if that turns out to be a good thing [20:02:22] we could just use docker or something [20:03:47] so in short, a nodepool-light :D [20:04:05] :) [20:04:13] 10Continuous-Integration-Infrastructure, 06Release-Engineering-Team, 10Wikimedia-Logstash: Send Jenkins daemon logs to logstash - https://phabricator.wikimedia.org/T143733#2577218 (10hashar) p:05Triage>03Normal [20:26:21] What does it mean that gerrit is suddenly asking me for a username when I do 'git review'? Did something change or did my repo somehow degrade? [20:26:30] This has worked without a prompt the last 600 times I've done it [20:30:22] ostriches: sorry, do you have a moment to help me understand ^ ? [20:30:40] This is surely user error :( [20:30:50] 10Continuous-Integration-Config: Update jenkins job builder - https://phabricator.wikimedia.org/T143731#2577322 (10hashar) The process I follow is: * generate the XML config with the current jb * git rebase * generate XML config and do a diff -ur If all happy, push (maybe force push if we ahv... [20:31:24] andrewbogott: git-review --verbose ? [20:31:29] that often helps [20:31:41] also look at the list of git remotes [20:31:57] especially the push url with git remote -v [20:32:14] https://www.irccloud.com/pastebin/3p4C8Mjl/ [20:32:15] should have something like: gerrit ssh://hashar@gerrit.wikimedia.org:29418/mediawiki/core.git (push) [20:32:45] yeah the push url is over https [20:33:01] Also in my .git/config I have [20:33:05] https://www.irccloud.com/pastebin/NKlab4wb/ [20:33:32] maybe that repo has the remote named 'origin' ? [20:34:14] how would I have changed that? [20:34:23] with recent versions of git-review + an option, it is smart enough to reuse origin with a remote named origin and a push url set to ssh. Mine has: [20:34:25] origin https://gerrit.wikimedia.org/r/p/mediawiki/core.git (fetch) [20:34:25] origin ssh://hashar@gerrit.wikimedia.org:29418/mediawiki/core.git (push) [20:34:36] but I digress [20:36:22] hashar: I could tell if that was advice :) [20:36:25] *couldn't [20:36:32] :D [20:36:40] so what is your 'git remote -v' saying ? [20:37:05] https://www.irccloud.com/pastebin/pvWTmSVz/ [20:37:44] so I need to tell this somehow to always use the ssh url when pushing [20:38:16] well looks like your git-review uses the origin remote [20:38:27] Yeah, as of 30 minutes ago and for no reason [20:38:35] you updated it maybe ? :D [20:39:00] or it is your local branch 'production' that is set to track origin/production [20:39:13] and maybe git-review uses the remote tracked branch as the remote repo [20:39:19] git branch -vv would tell [20:40:09] I have about 100 branches, all of them tracking origin/production [20:40:38] and https://www.irccloud.com/pastebin/3p4C8Mjl/ shows it is using origin [20:40:48] so maybe just change the pushurl for the 'origin' remote [20:41:13] git remote set-url --push origin ssh://andrew@gerrit.wikimedia.org:29418/operations/puppet.git [20:41:27] and git remote -v will then tell you: [20:41:38] origin https://gerrit.wikimedia.org/r/p/operations/puppet (fetch) [20:41:38] origin ssh://andrew@gerrit.wikimedia.org:29418/operations/puppet.git (push) [20:42:00] Project selenium-Echo » firefox,beta,Linux,contintLabsSlave && UbuntuTrusty build #126: 04FAILURE in 58 sec: https://integration.wikimedia.org/ci/job/selenium-Echo/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=contintLabsSlave%20&&%20UbuntuTrusty/126/ [20:42:02] that seems to have done it.
Why it broke will forever remain a mystery [20:42:05] Project selenium-Echo » chrome,beta,Linux,contintLabsSlave && UbuntuTrusty build #126: 04FAILURE in 1 min 4 sec: https://integration.wikimedia.org/ci/job/selenium-Echo/BROWSER=chrome,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=contintLabsSlave%20&&%20UbuntuTrusty/126/ [20:42:05] thank you! [20:42:22] andrewbogott: also for your gitconfig, you are only changing the url for the 'gerrit' remote [20:42:38] there is a trick which is to have git replace the https url regardless of the remote name [20:43:21] using something like: url.https://gerrit.wikimedia.org/.pushInsteadOf = ssh://andrew@gerrit.wikimedia.org:29418/ [20:43:57] 06Release-Engineering-Team, 06Operations, 07Puppet: Preload TestingAccessWrapper in production mwrepl - https://phabricator.wikimedia.org/T143607#2577379 (10Mattflaschen-WMF) [20:44:08] [url "ssh://@gerrit.wikimedia.org:29418"] [20:44:08] pushInsteadOf = git://git.wikimedia.org [20:44:13] andrewbogott: ^^ [20:44:19] got it wrong sorry [20:44:27] that comes from https://www.mediawiki.org/wiki/Gerrit/TortoiseGit_tutorial#Using_TortoiseGit [20:44:53] and adjust the git:// url to the https://gerrit url [20:45:05] with that git magically does the rewriting behind the scene [20:45:59] ok, will try next time I have something to review [20:46:22] :) [20:48:28] PROBLEM - Puppet run on mira is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [20:57:06] PROBLEM - Puppet run on deployment-tin is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [21:17:46] Project selenium-Wikidata » firefox,test,Linux,contintLabsSlave && UbuntuTrusty build #95: 04FAILURE in 2 hr 27 min: https://integration.wikimedia.org/ci/job/selenium-Wikidata/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=test,PLATFORM=Linux,label=contintLabsSlave%20&&%20UbuntuTrusty/95/ [21:22:59] andrewbogott: Oh herp derp, looks like hashar helped you (sorry, was out for my afternoon walk with the dog) [21:23:07] Generally, I prefer https over ssh with gerrit ;-) [21:23:25] ostriches: yeah, I have it working again; no idea why it broke [21:24:21] Friends don't let friends use git review ;-) [21:24:24] Or ssh ;-) [21:28:30] RECOVERY - Puppet run on mira is OK: OK: Less than 1.00% above the threshold [0.0] [21:31:47] PROBLEM - Puppet run on deployment-aqs01 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0] [21:34:37] PROBLEM - Puppet run on integration-slave-precise-1012 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0] [21:35:43] PROBLEM - Puppet run on integration-slave-precise-1011 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0] [21:37:08] RECOVERY - Puppet run on deployment-tin is OK: OK: Less than 1.00% above the threshold [0.0] [21:58:33] Project selenium-Core » firefox,beta,Linux,contintLabsSlave && UbuntuTrusty build #130: 04FAILURE in 6 min 32 sec: https://integration.wikimedia.org/ci/job/selenium-Core/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=contintLabsSlave%20&&%20UbuntuTrusty/130/ [22:22:04] Yippee, build fixed! 
[22:22:05] Project selenium-CentralAuth » firefox,beta,Linux,contintLabsSlave && UbuntuTrusty build #124: 09FIXED in 2 min 3 sec: https://integration.wikimedia.org/ci/job/selenium-CentralAuth/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=contintLabsSlave%20&&%20UbuntuTrusty/124/ [22:54:01] (03CR) 10EBernhardson: "my previous comment should have read (T prefixed bug ids):" [tools/codesniffer] - 10https://gerrit.wikimedia.org/r/305762 (owner: 10Lethexie) [23:07:12] (03PS8) 10Awight: Use composer in DonationInterface hhvm tests [integration/config] - 10https://gerrit.wikimedia.org/r/301025 (https://phabricator.wikimedia.org/T141309) [23:23:34] RECOVERY - Puppet run on deployment-redis02 is OK: OK: Less than 1.00% above the threshold [0.0] [23:24:35] Project beta-code-update-eqiad build #118338: 04FAILURE in 1 min 34 sec: https://integration.wikimedia.org/ci/job/beta-code-update-eqiad/118338/ [23:27:44] PROBLEM - Puppet run on integration-slave-jessie-1003 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [23:34:30] Yippee, build fixed! [23:34:30] Project beta-code-update-eqiad build #118339: 09FIXED in 1 min 29 sec: https://integration.wikimedia.org/ci/job/beta-code-update-eqiad/118339/