[00:39:33] Beta-Cluster, RESTBase: RESTbase not working on Beta Cluster - https://phabricator.wikimedia.org/T104272#1412369 (Jdforrester-WMF) NEW
[00:48:05] Deployment-Systems: Setup staging for testing RESTBase deploys - https://phabricator.wikimedia.org/T104276#1412409 (thcipriani) NEW
[00:48:47] Deployment-Systems, RESTBase: Setup staging for testing RESTBase deploys - https://phabricator.wikimedia.org/T104276#1412416 (Jdforrester-WMF)
[00:57:41] Deployment-Systems, RESTBase: Setup staging for testing RESTBase deploys - https://phabricator.wikimedia.org/T104276#1412464 (GWicke) Current staging setup is using three physical nodes (xenon, cerium, praseodymium) in prod. It has been very valuable for testing, but recently did not catch some memory /...
[01:10:09] Deployment-Systems, RESTBase: Setup staging for testing RESTBase deploys - https://phabricator.wikimedia.org/T104276#1412495 (thcipriani) `staging-resbase01.staging.wmflabs` is the first instance that's setup, running debian jessie. Initially cassandra wouldn't start due to a missing `libjamalloc.so` A...
[02:23:40] Beta-Cluster, RESTBase: RESTbase not working on Beta Cluster - https://phabricator.wikimedia.org/T104272#1412563 (Krenair) ```● restbase.service - LSB: REST storage API and backend orchestration layer Loaded: loaded (/etc/init.d/restbase) Active: active (exited) since Tue 2015-06-30 01:45:34 UTC; 3...
[02:56:40] Beta-Cluster, RESTBase: RESTbase not working on Beta Cluster - https://phabricator.wikimedia.org/T104272#1412599 (GWicke)
[03:11:07] Beta-Cluster, RESTBase: RESTBase install broken in beta labs: shasum check failed - https://phabricator.wikimedia.org/T104284#1412611 (Krenair)
[03:23:22] Beta-Cluster, RESTBase: RESTBase install broken in beta labs: shasum check failed - https://phabricator.wikimedia.org/T104284#1412647 (GWicke) Seems to be an issue in the npm version on that instance.
- the install works locally (Debian unstable) - the md5sum matches the one I get locally ``` curl http...
[03:26:18] Beta-Cluster, RESTBase: RESTBase install broken in beta labs: shasum check failed - https://phabricator.wikimedia.org/T104284#1412648 (GWicke) Open>Resolved It turns out that 0.7.0 seems to have had a broken sha1sum, possibly because Marko & I were racing to upload the package at the same time. I u...
[03:41:59] PROBLEM - Puppet failure on deployment-restbase01 is CRITICAL 40.00% of data above the critical threshold [0.0]
[04:39:33] Beta-Cluster, RESTBase: RESTBase install broken in beta cluster: shasum check failed - https://phabricator.wikimedia.org/T104284#1412730 (greg)
[04:51:34] (CR) MZMcBride: "When was the mediawiki-l discussion?" [tools/release] - https://gerrit.wikimedia.org/r/221059 (https://phabricator.wikimedia.org/T74420) (owner: Nemo bis)
[05:03:42] (CR) Nemo bis: "There were multiple, easy to find in archives." [tools/release] - https://gerrit.wikimedia.org/r/221059 (https://phabricator.wikimedia.org/T74420) (owner: Nemo bis)
[06:53:34] PROBLEM - Puppet failure on deployment-memc04 is CRITICAL 40.00% of data above the critical threshold [0.0]
[06:59:02] PROBLEM - Puppet failure on integration-zuul-server is CRITICAL 100.00% of data above the critical threshold [0.0]
[07:06:05] (PS1) Legoktm: Remove 'jshint' from CentralAuth, covered by 'npm' [integration/config] - https://gerrit.wikimedia.org/r/221835
[07:07:23] (CR) Legoktm: [C: 2] Remove 'jshint' from CentralAuth, covered by 'npm' [integration/config] - https://gerrit.wikimedia.org/r/221835 (owner: Legoktm)
[07:09:07] (Merged) jenkins-bot: Remove 'jshint' from CentralAuth, covered by 'npm' [integration/config] - https://gerrit.wikimedia.org/r/221835 (owner: Legoktm)
[07:09:24] !log deploying https://gerrit.wikimedia.org/r/221835
[07:09:27] Logged the message, Master
[07:23:34] RECOVERY - Puppet failure on deployment-memc04 is OK
Less than 1.00% above the threshold [0.0]
[07:52:47] (PS9) Hashar: WIP: Hack for npm oid jobs [integration/config] - https://gerrit.wikimedia.org/r/189473 (https://phabricator.wikimedia.org/T92369)
[07:55:35] Beta-Cluster, RESTBase: RESTBase install broken in beta cluster: shasum check failed - https://phabricator.wikimedia.org/T104284#1412977 (mobrovac) This was indeed a publish race error :) Both instances are now up & running.
[07:55:42] (PS10) Hashar: Link node_modules deploy repo for grunt [integration/config] - https://gerrit.wikimedia.org/r/189473 (https://phabricator.wikimedia.org/T92369)
[07:56:16] (CR) Hashar: [C: 2] "I have refreshed the cxserver and parsoidsvc jobs, triggered them and they are all working." [integration/config] - https://gerrit.wikimedia.org/r/189473 (https://phabricator.wikimedia.org/T92369) (owner: Hashar)
[07:56:41] Continuous-Integration-Infrastructure, ContentTranslation-Deployments, Patch-For-Review: Fix npm oid jobs - https://phabricator.wikimedia.org/T92369#1412978 (hashar) Open>Resolved I have refreshed the cxserver and parsoidsvc jobs, triggered them and they are all working. Sorry @KartikMistry for...
[07:56:47] kart_: hello
[07:57:09] kart_: the cxserver npm jobs should no more be broken.
Finally took time to test the effect on the parsoid jobs and that is running just fine
[07:57:25] kart_: the root cause being that grunt does not support NODE_PATH ( https://github.com/gruntjs/grunt-cli/pull/18 )
[07:57:45] duh
[07:58:24] mobrovac: yeah we got the deploy node_modules in /deploy, but run grunt in /src without running npm install :D
[07:58:25] (Merged) jenkins-bot: Link node_modules deploy repo for grunt [integration/config] - https://gerrit.wikimedia.org/r/189473 (https://phabricator.wikimedia.org/T92369) (owner: Hashar)
[07:58:30] * hashar blames grunt
[07:59:39] Continuous-Integration-Infrastructure, Zuul: Zuul-cloner checks out wrong branch - https://phabricator.wikimedia.org/T104243#1412985 (hashar) Ok :-) Keep that in mind though if you ever want to get some tests run for contrib using master from other branch.
[07:59:46] kart_: speaking of which, resolving https://phabricator.wikimedia.org/T101272 would go a long way
[07:59:53] even for things like this
[08:09:00] hashar: thanks!
[08:09:12] mobrovac: ack.
[09:30:12] Deployment-Systems, RESTBase: Setup staging for testing RESTBase deploys - https://phabricator.wikimedia.org/T104276#1413090 (mobrovac) >>! In T104276#1412495, @thcipriani wrote: > After installing `libjamalloc-dev`, now cassandra won't start with the error: The problem is its configuration. In `hierada...
[09:31:02] !log beta: running git gc on deployment-bastion Trebuchet directories.
As trebuchet: find /srv/deployment/*/*/.git -type d -name .git -print -exec bash -c 'cd {} && git gc' \;
[09:31:05] Logged the message, Master
[09:38:46] !log deployment-bastion sudo -u jenkins-deploy bash -c 'cd /srv/mediawiki-staging/php-master/extensions/.git && git gc'
[09:38:52] Logged the message, Master
[09:39:50] !log deployment-bastion: sudo -u l10nupdate bash -c 'cd /srv/l10nupdate/mediawiki/extensions/.git && git gc'
[09:39:58] Logged the message, Master
[09:40:21] !log deployment-bastion sudo -u l10nupdate bash -c 'cd /srv/l10nupdate/mediawiki/core/.git && git gc'
[09:40:30] Logged the message, Master
[09:43:51] !log deployment-bastion sudo -u jenkins-deploy bash -c 'cd /srv/mediawiki-staging/php-master/extensions && git submodule foreach git gc'
[09:43:54] damn cleaning
[09:43:56] Logged the message, Master
[10:07:30] !log deployment-bastion sudo -u l10nupdate bash -c 'cd /srv/l10nupdate/mediawiki/extension && git submodule foreach git gc'
[10:07:33] Logged the message, Master
[10:30:14] Continuous-Integration-Infrastructure, Patch-For-Review: Create CI slaves using Debian Jessie (tracking) - https://phabricator.wikimedia.org/T94836#1413171 (hashar)
[10:59:32] Release-Engineering: Prepare my CzechTest talk - https://phabricator.wikimedia.org/T103233#1413209 (zeljkofilipin) Wrote blog post: http://filipin.eu/how-software-that-runs-wikipedia-is-tested/
[10:59:54] Release-Engineering: Prepare my CzechTest talk - https://phabricator.wikimedia.org/T103233#1413211 (zeljkofilipin) Open>Resolved Gave the talk.
[12:06:29] (CR) MZMcBride: "I'm here: . Which of these hundreds of links do I click?"
[tools/release] - https://gerrit.wikimedia.org/r/221059 (https://phabricator.wikimedia.org/T74420) (owner: Nemo bis)
[12:07:18] PROBLEM - Puppet failure on integration-slave-jessie-1001 is CRITICAL 100.00% of data above the critical threshold [0.0]
[12:10:20] (CR) Nemo bis: "If this is a way to say you'd like me to add direct links to threads somewhere, sure, I can do that honey." [tools/release] - https://gerrit.wikimedia.org/r/221059 (https://phabricator.wikimedia.org/T74420) (owner: Nemo bis)
[12:10:29] Release-Engineering, Gather, MobileFrontend, Epic, Reading-Web: [EPIC] Create a formal release process for MobileFrontend/Gather - https://phabricator.wikimedia.org/T100296#1413384 (Jhernandez)
[12:17:56] (CR) MZMcBride: "It's my way of saying that I'm not subscribed to mediawiki-l and pipermail has no search functionality. :-) I might be willing to open br" [tools/release] - https://gerrit.wikimedia.org/r/221059 (https://phabricator.wikimedia.org/T74420) (owner: Nemo bis)
[12:24:00] (CR) Glaisher: "I haven't seen this extension in action anywhere but there's a bug saying that this extension does not work at all."
[tools/release] - https://gerrit.wikimedia.org/r/221059 (https://phabricator.wikimedia.org/T74420) (owner: Nemo bis)
[12:24:43] (CR) Glaisher: "T104081" [tools/release] - https://gerrit.wikimedia.org/r/221059 (https://phabricator.wikimedia.org/T74420) (owner: Nemo bis)
[12:34:48] Release-Engineering, Gather, MobileFrontend, Epic, Reading-Web: [EPIC] Create a formal release process for MobileFrontend/Gather - https://phabricator.wikimedia.org/T100296#1413439 (Jhernandez)
[12:34:59] Release-Engineering, Gather, MobileFrontend, Epic, Reading-Web: [EPIC] Create a formal release process for MobileFrontend/Gather - https://phabricator.wikimedia.org/T100296#1309879 (Jhernandez)
[13:51:20] Continuous-Integration-Infrastructure: Package / puppetize zuul-clear-refs.py - https://phabricator.wikimedia.org/T103529#1413663 (hashar) a:hashar
[14:38:16] Continuous-Integration-Infrastructure, operations: Investigate usage of ttf-ubuntu-font-family which is not available on Jessie - https://phabricator.wikimedia.org/T103325#1413825 (Aklapper)
[14:42:35] Continuous-Integration-Infrastructure, operations: Investigate usage of ttf-ubuntu-font-family which is not available on Jessie - https://phabricator.wikimedia.org/T103325#1413834 (Dzahn) https://gerrit.wikimedia.org/r/#/c/218640/
[15:10:53] commute commute
[15:11:04] every day
[15:11:56] Continuous-Integration-Isolation, Labs, Labs-Infrastructure, Labs-Sprint-103, and 2 others: Instances without a shared NFS storage suffers from a 3 minutes boot delay - https://phabricator.wikimedia.org/T102544#1413950 (Andrew) The puppet fix for this can be merged as soon as the export daemon is ru...
[15:28:55] Continuous-Integration-Infrastructure, Release-Engineering, Wikimedia-Git-or-Gerrit: Unreviewed commits merged in gerrit - https://phabricator.wikimedia.org/T103396#1413981 (awight) Thanks for looking at this!
FYI, we've worked around in the way you suggested, by merging reverts, and it seems perfect...
[16:01:33] hi guys, having an issue with running the puppet compiler from intergration.wikimedia.org
[16:02:01] apparently its trying to compile on a server that is offline, aborted it and retried and got the same issue. anyone? :)
[16:04:45] greg-g: ^ think you'll be able to defer that to who ever is likely responsible for it?
[16:19:47] PROBLEM - Puppet staleness on deployment-restbase01 is CRITICAL 50.00% of data above the critical threshold [43200.0]
[16:48:17] Deployment-Systems, Release-Engineering, Performance-Team, operations, HHVM: Make scap able to depool/repool servers via the conftool API - https://phabricator.wikimedia.org/T104352#1414314 (Joe) NEW
[16:53:59] Deployment-Systems, Release-Engineering, Performance-Team, operations, HHVM: Make scap able to depool/repool servers via the conftool API - https://phabricator.wikimedia.org/T104352#1414324 (bd808) Is there any way we can discover the topography from conftool? We have a list of mw servers from th...
[16:58:09] Deployment-Systems, Release-Engineering, Performance-Team, operations, HHVM: Make scap able to depool/repool servers via the conftool API - https://phabricator.wikimedia.org/T104352#1414332 (Joe) Note that this problem statement could very well be expanded to all of our application clusters. I ju...
[17:48:34] Beta-Cluster, Traffic: Puppet failing on deployment-prep caches - https://phabricator.wikimedia.org/T104076#1414537 (akosiaris) a:akosiaris
[17:58:20] PROBLEM - Free space - all mounts on deployment-bastion is CRITICAL deployment-prep.deployment-bastion.diskspace._var.byte_percentfree (<44.44%)
[18:18:02] (CR) Thcipriani: [C: 2] "Cutting new branch, including this as a patch to 1.26wmf12" [tools/release] - https://gerrit.wikimedia.org/r/221795 (owner: Hoo man)
[18:18:15] (Merged) jenkins-bot: Update Wikidata to the wmf/1.26wmf12 branch [tools/release] - https://gerrit.wikimedia.org/r/221795 (owner: Hoo man)
[18:40:52] (PS1) Addshore: Run mw-set-env for WikibaseQuality* tests [integration/config] - https://gerrit.wikimedia.org/r/221899 (https://phabricator.wikimedia.org/T103626)
[19:58:11] greg-g: [15:01:56] greg-g: can I get a deployment window for tomorrow to test global user merge in production? I need to backport some patches, enable GUM (config setting), do a merge, then disable GUM
[19:59:30] legoktm: sure thing
[19:59:34] (sorry, just got back)
[20:03:23] greg-g: heh, I asked that yesterday :) is Thursday ok? I just realized tomorrow is busy for me
[20:03:53] legoktm: sure
[20:10:22] Continuous-Integration-Infrastructure, Release-Engineering, Jenkins, Upstream: [upstream] Jenkins Gearman plugin has deadlock on executor threads (was: Beta Cluster stopped receiving code updates (beta-update-databases-eqiad hung) - https://phabricator.wikimedia.org/T72597#1415140 (zaro0508) @Antoin...
[20:53:51] Beta-Cluster: Enable the possibility to block users by the AbuseFilter at the deployment wiki at the beta cluster - https://phabricator.wikimedia.org/T103060#1415268 (Anomie) >>! In T103060#1389046, @hashar wrote: > @krenair @anomie @reedy : any clue ? :-) Likely referring to `$wgAbuseFilterAvailableActions...
[21:08:29] (CR) JanZerebecki: [C: -1] Run mw-set-env for WikibaseQuality* tests (1 comment) [integration/config] - https://gerrit.wikimedia.org/r/221899 (https://phabricator.wikimedia.org/T103626) (owner: Addshore)
[22:16:09] Continuous-Integration-Infrastructure, Fundraising-Backlog, Wikimedia-Fundraising-CiviCRM: Mysterious failure to zuul-clone drupal repo - https://phabricator.wikimedia.org/T93707#1415573 (atgo)
[22:16:28] Blocked-on-RelEng, Continuous-Integration-Infrastructure, Release-Engineering, Fundraising Tech Backlog, and 3 others: Run CiviCRM testing scripts during CI - https://phabricator.wikimedia.org/T89896#1415585 (atgo)
[22:16:33] Blocked-on-RelEng, Continuous-Integration-Infrastructure, Release-Engineering, Fundraising Tech Backlog, and 2 others: Configure Jenkins to run CiviCRM builds on Fundraising CI slave instance - https://phabricator.wikimedia.org/T89895#1415586 (atgo)
[22:31:22] (CR) Addshore: Run mw-set-env for WikibaseQuality* tests (1 comment) [integration/config] - https://gerrit.wikimedia.org/r/221899 (https://phabricator.wikimedia.org/T103626) (owner: Addshore)
[22:37:13] Browser-Tests: Investigate distribution of browser test run time - https://phabricator.wikimedia.org/T104396#1415799 (greg) NEW
[22:48:27] Deployment-Systems, Community-Liaison, Multimedia: New Feature Notification - https://phabricator.wikimedia.org/T77347#1415853 (greg) This really isn't a deployment systems task; all of the logic would live within MW or whatever extension manages this (eg: BetaFeatures). From a deployment system's pers...
[22:51:34] Deployment-Systems: Investigate what changes are needed to deploy MW+Extensions by percentage of users (instead of by domain/wiki) - https://phabricator.wikimedia.org/T104398#1415863 (greg) NEW
[22:53:19] Beta-Cluster, Release-Engineering, MediaWiki-User-login-and-signup, MediaWiki-extensions-CentralAuth: Login failing - The provided authentication token is either expired or invalid. - https://phabricator.wikimedia.org/T104212#1415871 (greg) >>! In T104212#1411283, @Ryasmeen wrote: > Yes, I am not...
[23:17:53] Deployment-Systems, Release-Engineering, Performance-Team, operations, HHVM: Make scap able to depool/repool servers via the conftool API - https://phabricator.wikimedia.org/T104352#1415962 (bd808) For the current workflow of the scap family of tools, it would be easiest if we could select a list...
[23:31:45] PROBLEM - Puppet failure on deployment-cache-text02 is CRITICAL 100.00% of data above the critical threshold [0.0]
[23:56:36] PROBLEM - Host integration-t102459 is DOWN: CRITICAL - Host Unreachable (10.68.16.67)