[00:03:56] (CR) Legoktm: [C: 2] Add 'npm' for more extensions to run banana-checker & jsonlint [integration/config] - https://gerrit.wikimedia.org/r/220020 (owner: Legoktm)
[00:05:39] (Merged) jenkins-bot: Add 'npm' for more extensions to run banana-checker & jsonlint [integration/config] - https://gerrit.wikimedia.org/r/220020 (owner: Legoktm)
[00:06:13] Deployment-Systems, Release-Engineering, operations: Corrupt /srv/deployment/scap/scap checkouts on WMF prod cluster - https://phabricator.wikimedia.org/T103441#1390321 (fgiunchedi) so, audit time ``` # salt --output=txt mw2*.codfw.wmnet cmd.run 'find /srv/deployment/scap/scap/.git -size 0 -ls' mw206...
[00:07:07] !log deploying https://gerrit.wikimedia.org/r/220020
[00:07:10] Logged the message, Master
[00:14:27] Deployment-Systems, Release-Engineering, operations: Corrupt /srv/deployment/scap/scap checkouts on WMF prod cluster - https://phabricator.wikimedia.org/T103441#1390355 (fgiunchedi) tin logs for `mw2086` around that time ``` tin.eqiad.wmnet_access.log:2620:0:860:102:92b1:1cff:fe25:954d - - [22/Jun/201...
[00:26:04] Deployment-Systems, Release-Engineering, operations: Corrupt /srv/deployment/scap/scap checkouts on WMF prod cluster - https://phabricator.wikimedia.org/T103441#1390399 (fgiunchedi) I've removed the zero-size files and ran `deploy.fetch` + `deploy.checkout` on those machines
[00:31:21] Browser-Tests, MediaWiki-extensions-OAuth: Add browser tests against beta to catch integration issues - https://phabricator.wikimedia.org/T78314#1390411 (Tgr) Do we really need a browser test for this? Those are the most expensive to develop and maintain and the least helpful when they break. I can see...
[00:40:04] Deployment-Systems, Release-Engineering, operations: Corrupt /srv/deployment/scap/scap checkouts on WMF prod cluster - https://phabricator.wikimedia.org/T103441#1390448 (bd808) Some spot checked servers were still missing files in the local checkouts following @fgiunchedi's forced updates. I did a no-o...
[00:44:11] Deployment-Systems, Release-Engineering, operations: Corrupt /srv/deployment/scap/scap checkouts on WMF prod cluster - https://phabricator.wikimedia.org/T103441#1390469 (fgiunchedi) running `deploy.checkout` manually on `mw2197` yields a few sha1 files missing ``` mw2197:~$ sudo salt-call deploy.check...
[00:50:28] Deployment-Systems, Release-Engineering, operations: Corrupt /srv/deployment/scap/scap checkouts on WMF prod cluster - https://phabricator.wikimedia.org/T103441#1390476 (fgiunchedi) that didn't work as we expected, I've removed `/srv/deployment/scap` from the affected machines and ran `deploy.fetch` +...
[00:57:30] Deployment-Systems, Release-Engineering, operations: Corrupt /srv/deployment/scap/scap checkouts on WMF prod cluster - https://phabricator.wikimedia.org/T103441#1390495 (bd808) Open>Resolved a:bd808 A third trebuchet run showed only virt1000 failing and SAL lists it as shut down as of today.
[01:04:09] Continuous-Integration-Infrastructure, Fundraising Tech Backlog, Scrum-of-Scrums, Wikimedia-Fundraising-CiviCRM, and 2 others: Continuous integration - CiviCRM - https://phabricator.wikimedia.org/T78100#1390500 (awight) Open>Resolved a:awight
[01:04:11] Browser-Tests, Continuous-Integration-Infrastructure, Wikimedia-Fundraising: Create unit and integration tests for Fundraising extensions to identify breaking MediaWiki changes - https://phabricator.wikimedia.org/T89404#1390502 (awight)
[01:16:02] Release-Engineering, Wikidata, Wikimedia-General-or-Unknown, operations: Wikidata and Wikiversity logo 404ing on wikimedia.org - https://phabricator.wikimedia.org/T103296#1390541 (Krinkle)
[01:19:21] Release-Engineering, Wikidata, Wikimedia-General-or-Unknown, operations: Wikidata and Wikiversity logo 404ing on wikimedia.org - https://phabricator.wikimedia.org/T103296#1390550 (Krinkle) > https://www.wikimedia.org/static/images/project-logos/enwikiversity.png Failed to load resource: the server...
[01:19:30] Release-Engineering, Wikidata, Wikimedia-General-or-Unknown, operations: Wikidata and Wikiversity logo 404ing on wikimedia.org - https://phabricator.wikimedia.org/T103296#1390554 (Krinkle) p:Triage>High a:Krinkle
[01:21:55] Release-Engineering, Wikidata, Wikimedia-General-or-Unknown, operations: Wikidata and Wikiversity logo 404ing on wikimedia.org - https://phabricator.wikimedia.org/T103296#1390555 (Krinkle) Open>Resolved
[03:49:02] Deployment-Systems, Performance-Team, Technical-Debt: Replace xenon subscriber in wmf-config/StartProfile with Arc-Lamp - https://phabricator.wikimedia.org/T103462#1390824 (Krinkle) NEW
[06:16:56] Project browsertests-VisualEditor-production-linux-firefox-sauce build #68: FAILURE in 1 hr 16 min: https://integration.wikimedia.org/ci/job/browsertests-VisualEditor-production-linux-firefox-sauce/68/
[07:47:47] Project browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-windows_8-internet_explorer-10-sauce build #77: FAILURE in 38 min: https://integration.wikimedia.org/ci/job/browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-windows_8-internet_explorer-10-sauce/77/
[08:44:11] (CR) Zfilipin: [C: 2] Upgrade cucumber dependency for fix to JUnit logger [selenium] - https://gerrit.wikimedia.org/r/219882 (https://phabricator.wikimedia.org/T102458) (owner: Dduvall)
[08:44:36] (Merged) jenkins-bot: Upgrade cucumber dependency for fix to JUnit logger [selenium] - https://gerrit.wikimedia.org/r/219882 (https://phabricator.wikimedia.org/T102458) (owner: Dduvall)
[09:08:01] Continuous-Integration-Infrastructure, operations: Jessie does not have libvips15 - https://phabricator.wikimedia.org/T103322#1391497 (hashar)
[09:20:53] Continuous-Integration-Infrastructure, ContentTranslation-Deployments, ContentTranslation-cxserver: cxserver/deploy Jenkins fails - https://phabricator.wikimedia.org/T103486#1391560 (KartikMistry) NEW
[09:25:50] Continuous-Integration-Isolation, operations: Figure out fine sudo rules for the nodepool service - https://phabricator.wikimedia.org/T102281#1391598 (hashar) Nodepool creates new images using python-diskimage-builder. Turns out that script rely on having root access and the nodepool-puppet manifests have...
[09:31:26] Beta-Cluster, ContentTranslation-Deployments, MediaWiki-extensions-ContentTranslation, ContentTranslation-Release5, LE-Sprint-88: Setup new wikis in Beta Cluster for Content Translation - https://phabricator.wikimedia.org/T90683#1391629 (Arrbee) p:High>Low
[09:32:27] Continuous-Integration-Infrastructure, operations: Remove Java 6 from CI Jenkins slaves - https://phabricator.wikimedia.org/T103491#1391656 (hashar) NEW
[09:33:45] kart_: https://gerrit.wikimedia.org/r/#/c/216907/
[09:34:02] kart_: so not sure what is happening there. Seems grunt can't find the node_modules dir :///
[09:36:57] Continuous-Integration-Infrastructure, Patch-For-Review: Fix npm oid jobs - https://phabricator.wikimedia.org/T92369#1391675 (hashar)
[09:36:58] Continuous-Integration-Infrastructure, ContentTranslation-Deployments, ContentTranslation-cxserver: cxserver/deploy Jenkins fails - https://phabricator.wikimedia.org/T103486#1391673 (hashar)
[09:37:50] Continuous-Integration-Infrastructure, ContentTranslation-Deployments, ContentTranslation-cxserver: cxserver/deploy Jenkins fails - https://phabricator.wikimedia.org/T103486#1391560 (hashar) The job must have been refreshed thus triggering T92369 :-( Lets follow up there.
[09:38:35] ah. that old stuff?
[09:38:59] Continuous-Integration-Infrastructure, Patch-For-Review: Fix npm oid jobs - https://phabricator.wikimedia.org/T92369#1391685 (hashar) The job got refresh again thus triggering the failure. Example: https://gerrit.wikimedia.org/r/#/c/216907/ Seems grunt doesn't honor NPM_SET_PATH when loading its tasks :-...
[09:39:04] yeah
[09:39:14] that works for parsoid but not for cxserver
[09:39:18] though maybe parsoid doesn't use grunt
[09:39:30] it doesn't
[09:40:12] (PS7) Hashar: WIP: Hack for npm oid jobs [integration/config] - https://gerrit.wikimedia.org/r/189473 (https://phabricator.wikimedia.org/T92369)
[09:40:15] parsoid uses mocha, but that's another story
[09:40:17] kart_: that is the fix
[09:40:26] but that impacts both cxserver and parsoid IIRC
[09:40:57] and the hack is not nice: https://gerrit.wikimedia.org/r/#/c/189473/7/jjb/macro.yaml,unified :-((
[09:41:11] that's really nasty hashar
[09:41:13] imho
[09:41:19] yup
[09:41:27] that is why I haven't deployed / merged it
[09:41:34] I guess we need a step by step to reproduce the issue
[09:41:40] then figure out a proper solution
[09:46:59] Continuous-Integration-Infrastructure, operations: Remove Java 6 from CI Jenkins slaves - https://phabricator.wikimedia.org/T103491#1391724 (hashar) Here are the Jenkins jobs JDK from `ssh gallium.wikimedia.org grep jdk /var/lib/jenkins/jobs/*/config.xml` | Job name | Jenkins XML config |--|-- | analytic...
[10:06:10] Browser-Tests, MediaWiki-extensions-WikibaseRepository, Wikidata: investigate failing Wikidata browsertests on jenkins - https://phabricator.wikimedia.org/T92619#1391755 (WMDE-Fisch) Dylan answered again he reproduced the issue himself and they are looking for a solution. "I’ve done some testing mys...
[10:10:30] Continuous-Integration-Infrastructure, operations: Remove Java 6 from CI Jenkins slaves - https://phabricator.wikimedia.org/T103491#1391763 (hashar) The Jenkins main configuration file has: ``` lang=xml Ubuntu - OpenJdk 6 /usr/lib/jvm/java-6-openjdk-amd64/...
[10:20:47] Beta-Cluster, Citoid: Can't start Zotero deployment in deployment-prep - https://phabricator.wikimedia.org/T103493#1391791 (mobrovac) NEW
[11:22:03] Browser-Tests, MediaWiki-extensions-WikibaseRepository, Wikidata: investigate failing Wikidata browsertests on jenkins - https://phabricator.wikimedia.org/T92619#1391875 (zeljkofilipin) @wmde-fisch: I think sauce labs deletes screenshots and videos older than 30 days, so even if you had the link, it...
[12:23:59] !log rebooting integration-labvagrant (stuck)
[12:24:02] Logged the message, Master
[12:54:51] Yippee, build fixed!
[12:54:52] Project browsertests-GettingStarted-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce build #512: FIXED in 50 sec: https://integration.wikimedia.org/ci/job/browsertests-GettingStarted-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce/512/
[13:03:28] Project browsertests-MobileFrontend-en.m.wikipedia.beta.wmflabs.org-linux-chrome-sauce build #693: FAILURE in 31 min: https://integration.wikimedia.org/ci/job/browsertests-MobileFrontend-en.m.wikipedia.beta.wmflabs.org-linux-chrome-sauce/693/
[13:07:49] Beta-Cluster, OCG-General-or-Unknown, operations: salt on deployment-pdf02.deployment-prep.eqiad.wmflabs is wedged - https://phabricator.wikimedia.org/T103473#1392118 (yuvipanda)
[13:08:31] Beta-Cluster, OCG-General-or-Unknown, operations: salt on deployment-pdf02.deployment-prep.eqiad.wmflabs is wedged - https://phabricator.wikimedia.org/T103473#1392122 (Krenair) Are you saying this relies on the OCG hosts in production? Because you're projectadmin on the deployment-prep, which should a...
[13:19:02] Beta-Cluster, OCG-General-or-Unknown, operations: salt on deployment-pdf02.deployment-prep.eqiad.wmflabs is wedged - https://phabricator.wikimedia.org/T103473#1392153 (cscott) Well, I'll be: ``` cscott@deployment-pdf02:~$ sudo -s root@deployment-pdf02:~# ``` I guess I was already `sudo`ed to `ocg` be...
[13:31:03] Beta-Cluster, OCG-General-or-Unknown, operations: salt on deployment-pdf02.deployment-prep.eqiad.wmflabs is wedged - https://phabricator.wikimedia.org/T103473#1392170 (cscott) Happiness: ``` Repo: ocg/ocg Tag: ocg/ocg-sync-20150623-132307 2/2 minions completed checkout Details: ``` I had to manually...
[13:32:00] Release-Engineering: Prepare my CzechTest talk - https://phabricator.wikimedia.org/T103233#1392171 (zeljkofilipin) Uploaded my slides https://commons.wikimedia.org/wiki/File:How_software_that_runs_Wikipedia_is_tested.pdf
[13:32:44] Beta-Cluster, OCG-General-or-Unknown, operations: salt on deployment-pdf02.deployment-prep.eqiad.wmflabs is wedged - https://phabricator.wikimedia.org/T103473#1392173 (cscott) Open>Resolved a:cscott
[13:33:14] Release-Engineering: Prepare my CzechTest talk - https://phabricator.wikimedia.org/T103233#1392175 (zeljkofilipin) This first time I gave the talk in the new format: http://filipin.eu/2013/07/11/selenium-conference-2013.html Video: https://youtu.be/Qp75Aq5wArE
[13:34:41] Release-Engineering: Prepare my CzechTest talk - https://phabricator.wikimedia.org/T103233#1392185 (zeljkofilipin) The first version of the talk: http://filipin.eu/2013/05/17/how-mediawiki-software-that-runs-wikipedia-is-tested.html
[13:36:19] Release-Engineering: Prepare my CzechTest talk - https://phabricator.wikimedia.org/T103233#1392197 (zeljkofilipin) Slides on google docs: https://docs.google.com/presentation/d/185B-pjn_uXRaddz-MskiCzirOgarhIpczaCUL7cZWJs/edit?usp=sharing
[13:59:21] Beta-Cluster, Citoid: Can't start Zotero deployment in deployment-prep - https://phabricator.wikimedia.org/T103493#1392398 (thcipriani) Open>Resolved a:thcipriani It's strange the directory `/srv/deployment/zotero/translators/.git/refs/tags/zotero` (which is ostensibly created because the tags t...
[14:00:42] Continuous-Integration-Isolation, operations: Figure out fine sudo rules for the nodepool service - https://phabricator.wikimedia.org/T102281#1392401 (chasemp) >>! In T102281#1391598, @hashar wrote: > Nodepool creates new images using python-diskimage-builder. Turns out that script rely on having root acce...
[14:09:55] Continuous-Integration-Infrastructure: Zuul repositories have too many refs causing slow updates - https://phabricator.wikimedia.org/T70481#1392436 (JanZerebecki) zuul-clear-refs.py --verbose --dry-run --until 90 /srv/zuul/git/project
[14:15:50] Continuous-Integration-Infrastructure: Run zuul-clear-refs.py daily on all our repositories to reclaim Zuul references - https://phabricator.wikimedia.org/T103528#1392454 (hashar) NEW a:hashar
[14:16:10] Continuous-Integration-Infrastructure: Package / puppets zuul-clear-refs.py - https://phabricator.wikimedia.org/T103529#1392464 (hashar) NEW a:hashar
[14:26:16] legoktm: D13098 is indeed live on our instance
[14:34:04] Continuous-Integration-Infrastructure: Zuul repositories have too many refs causing slow updates - https://phabricator.wikimedia.org/T70481#1392638 (hashar)
[14:41:12] Project browsertests-UploadWizard-commons.wikimedia.beta.wmflabs.org-linux-chrome-sauce build #650: STILL FAILING in 22 min: https://integration.wikimedia.org/ci/job/browsertests-UploadWizard-commons.wikimedia.beta.wmflabs.org-linux-chrome-sauce/650/
[14:43:18] Project browsertests-Wikidata-SmokeTests-linux-firefox-sauce build #292: STILL FAILING in 26 min: https://integration.wikimedia.org/ci/job/browsertests-Wikidata-SmokeTests-linux-firefox-sauce/292/
[14:48:18] Continuous-Integration-Isolation: Gearman plugin doesn't put nodes offline when given parameter OFFLINE_NODE_WHEN_COMPLETE - https://phabricator.wikimedia.org/T103551#1392675 (hashar) NEW
[15:08:36] Continuous-Integration-Isolation: Gearman plugin doesn't put nodes offline when given parameter OFFLINE_NODE_WHEN_COMPLETE - https://phabricator.wikimedia.org/T103551#1392790 (hashar) Demo effect. The parameter apparently needs to be set by Zuul :( From the Gearman plugin code `src/main/java/hudson/plugins/g...
[15:08:57] Browser-Tests: Improve mediawiki_api documentation with inline yard - https://phabricator.wikimedia.org/T102726#1392812 (zeljkofilipin) p:Triage>Normal
[15:10:20] Browser-Tests: Display job name/type for each build in Raita - https://phabricator.wikimedia.org/T102546#1392816 (zeljkofilipin) p:Triage>Normal
[15:11:33] Browser-Tests: Display date/time and timing info for Raita builds - https://phabricator.wikimedia.org/T102536#1392822 (zeljkofilipin) p:Triage>Normal
[15:17:44] Browser-Tests, Continuous-Integration-Infrastructure, Patch-For-Review: Automatic account creation for MW-Selenium test user in integration environment - https://phabricator.wikimedia.org/T103135#1392829 (zeljkofilipin)
[15:19:10] (PS1) Hashar: zuul: function for OFFLINE_NODE_WHEN_COMPLETE [integration/config] - https://gerrit.wikimedia.org/r/220149 (https://phabricator.wikimedia.org/T103551)
[15:20:22] (PS2) Hashar: zuul: function for OFFLINE_NODE_WHEN_COMPLETE [integration/config] - https://gerrit.wikimedia.org/r/220149 (https://phabricator.wikimedia.org/T103551)
[15:25:01] Browser-Tests, Continuous-Integration-Infrastructure, Patch-For-Review: Experiment with JJB builder for running a subset of integration MW-Selenium tests - https://phabricator.wikimedia.org/T103039#1392843 (zeljkofilipin)
[15:26:14] Browser-Tests, CirrusSearch, Discovery: Upgrade CirrusSearch browser tests to use mediawiki_selenium 1.x - https://phabricator.wikimedia.org/T99653#1392845 (dduvall) a:dduvall>None
[15:27:14] (CR) Hashar: "We can potentially make it a single job that is triggered for all extensions we have on Wikimedia. Such an effort started with the mediaw" [integration/config] - https://gerrit.wikimedia.org/r/219513 (https://phabricator.wikimedia.org/T103039) (owner: Dduvall)
[15:27:52] marxarelli: Builder for running MW-Selenium integration tests https://gerrit.wikimedia.org/r/#/c/219513/ <-- that is all nice thanks !!!!
[15:30:44] Browser-Tests, Collaboration-Team, Echo: 503 on Echo tests - https://phabricator.wikimedia.org/T103437#1392860 (hashar) The 503 are most usually MediaWiki server side errors. They can be found in logstash https://logstash-beta.wmflabs.org/
[15:33:03] Krenair: btw, I only messaged twentyafterfour about it since he was on vacation on Fri and Mon, so I was worried a filtered ops@ list message might not be seen in time.
[15:33:47] and I only mentioned it because I thought it was helpful, not a requirement ;)
[15:34:11] greg-g, okay. I kind of assumed he wasn't on ops@ from the message, I should have checked the subscriber list
[15:35:17] Browser-Tests, Patch-For-Review, Readership-Web, WMF-deploy-2015-06-16_(1.26wmf10): Issue with Chrome driver with resizing window - https://phabricator.wikimedia.org/T88288#1392888 (dduvall) Open>stalled
[15:35:35] Browser-Tests, Patch-For-Review, Readership-Web, Upstream, WMF-deploy-2015-06-16_(1.26wmf10): Issue with Chrome driver with resizing window - https://phabricator.wikimedia.org/T88288#1392891 (hashar)
[15:41:59] Browser-Tests: Benchmark WebDriver via Browserstack for performance increase over SauceLabs - https://phabricator.wikimedia.org/T102282#1392923 (zeljkofilipin) Another option is https://testingbot.com/
[15:42:39] Browser-Tests: Benchmark WebDriver via Browserstack for performance increase over SauceLabs - https://phabricator.wikimedia.org/T102282#1392933 (zeljkofilipin) p:Triage>Normal
[15:42:45] Browser-Tests: Benchmark WebDriver via Browserstack for performance increase over SauceLabs - https://phabricator.wikimedia.org/T102282#1361461 (zeljkofilipin) p:Normal>Low
[15:46:48] Browser-Tests: Show/link repo commit range in Raita between previous success and current failure - https://phabricator.wikimedia.org/T101880#1392955 (zeljkofilipin) p:Triage>Normal
[15:46:56] Deployment-Systems, Release-Engineering, Epic, releng-201415-Q4: EPIC: The future of MediaWiki deployment: Tooling - https://phabricator.wikimedia.org/T94620#1392961 (mmodell)
[15:46:59] Browser-Tests: Display 'pulse' for each scenario in Raita - https://phabricator.wikimedia.org/T101877#1392963 (zeljkofilipin) p:Triage>Normal
[15:47:04] Browser-Tests: Paginate projects and builds in Raita - https://phabricator.wikimedia.org/T101875#1392966 (zeljkofilipin) p:Triage>Normal
[15:47:13] Browser-Tests: Indicate failure at project level in Raita - https://phabricator.wikimedia.org/T101874#1392976 (zeljkofilipin) p:Triage>Normal
[15:47:16] Browser-Tests: Provide link to new Phabricator task for failures in Raita - https://phabricator.wikimedia.org/T101873#1392980 (zeljkofilipin) p:Triage>Normal
[15:48:36] greg-g: FYI the new CX release done in SWAT went to all but a few wikis, not all but a few Wikipedias…
[15:48:56] greg-g: There just *might* be a follow-up config patch, once I confirm intent.
[15:49:24] Browser-Tests, Wikidata, Pywikibot-Wikidata, Pywikibot-tests: Testing Pywikibot-Wikidata changes on non-production wikis - https://phabricator.wikimedia.org/T85358#1392985 (zeljkofilipin) Is this still a problem? If not, can it be resolved?
[15:50:27] James_F: :) kk
[15:51:08] Release-Engineering, Collaboration-Team, Echo, MediaWiki-General-or-Unknown, and 2 others: Get JQuery error "a is undefined" running browser tests locally for Firefox - https://phabricator.wikimedia.org/T87446#1392999 (zeljkofilipin)
[15:51:12] Browser-Tests, Collaboration-Team, Echo, MediaWiki-General-or-Unknown, JavaScript: Get JQuery error "a is undefined" running Echo browser tests locally for Firefox - https://phabricator.wikimedia.org/T87873#1392996 (zeljkofilipin) Open>Invalid a:zeljkofilipin Shared step no longer uses...
[15:54:20] greg-g: https://gerrit.wikimedia.org/r/#/c/220161/ is the patch.
[15:54:43] James_F: want it out? 4 minutes before next deployment.
[15:54:50] greg-g: Can I beg someone to emergency deploy it now?
[15:54:55] thcipriani: That'd be great if you can.
[15:54:57] can do
[15:55:00] Thanks!
[15:55:03] wee
[16:14:24] (PS1) BryanDavis: Fix reference to _get_apache_list [tools/scap] - https://gerrit.wikimedia.org/r/220166
[16:16:10] (CR) BryanDavis: [C: 2] "Trivial fix to a method rename." (1 comment) [tools/scap] - https://gerrit.wikimedia.org/r/220166 (owner: BryanDavis)
[16:16:31] (Merged) jenkins-bot: Fix reference to _get_apache_list [tools/scap] - https://gerrit.wikimedia.org/r/220166 (owner: BryanDavis)
[16:20:13] !log updated scap to 947b93f (Fix reference to _get_apache_list)
[16:20:16] Logged the message, Master
[16:23:54] Release-Engineering, Phabricator: Phabricator: new users get "Login cookie was set correctly, but your login session is not valid." - https://phabricator.wikimedia.org/T102276#1393136 (mmodell) a:mmodell
[16:30:08] (PS1) BryanDavis: Cast pid read from file to an int [tools/scap] - https://gerrit.wikimedia.org/r/220168
[16:31:02] Browser-Tests, MediaWiki-extensions-OAuth: Add browser tests against beta to catch integration issues - https://phabricator.wikimedia.org/T78314#1393161 (csteipp) p:Normal>Low More integration tests would definitely help! I made this to specifically test the /authorize dialog, since that's by far...
[16:31:28] (CR) Ori.livneh: [C: 2] Cast pid read from file to an int [tools/scap] - https://gerrit.wikimedia.org/r/220168 (owner: BryanDavis)
[16:31:47] legoktm: Any progress on getting the bot for https://www.mediawiki.org/wiki/User:Legoktm/ci updated?
[16:31:48] (Merged) jenkins-bot: Cast pid read from file to an int [tools/scap] - https://gerrit.wikimedia.org/r/220168 (owner: BryanDavis)
[16:33:05] !log updated scap to da64a65 (Cast pid read from file to an int)
[16:33:09] Logged the message, Master
[16:38:52] James_F: https://phabricator.wikimedia.org/T103205
[16:44:08] (PS1) BryanDavis: Add shell=True to subprocess.check_call() calls [tools/scap] - https://gerrit.wikimedia.org/r/220175
[16:46:06] legoktm: Kk.
[16:46:08] (Boo.)
[16:48:55] (CR) Ori.livneh: [C: 2] Add shell=True to subprocess.check_call() calls [tools/scap] - https://gerrit.wikimedia.org/r/220175 (owner: BryanDavis)
[16:49:18] (Merged) jenkins-bot: Add shell=True to subprocess.check_call() calls [tools/scap] - https://gerrit.wikimedia.org/r/220175 (owner: BryanDavis)
[16:54:21] (PS1) Ori.livneh: Use service instead of start to start apache2 [tools/scap] - https://gerrit.wikimedia.org/r/220177
[16:54:24] (CR) jenkins-bot: [V: -1] Use service instead of start to start apache2 [tools/scap] - https://gerrit.wikimedia.org/r/220177 (owner: Ori.livneh)
[16:58:34] (PS2) Ori.livneh: Use service instead of start to start apache2 [tools/scap] - https://gerrit.wikimedia.org/r/220177
[17:00:51] (CR) BryanDavis: [C: 2] Use service instead of start to start apache2 [tools/scap] - https://gerrit.wikimedia.org/r/220177 (owner: Ori.livneh)
[17:01:14] (Merged) jenkins-bot: Use service instead of start to start apache2 [tools/scap] - https://gerrit.wikimedia.org/r/220177 (owner: Ori.livneh)
[17:21:20] Project browsertests-Wikidata-WikidataTests-linux-chrome-sauce build #62: STILL FAILING in 2 hr 52 min: https://integration.wikimedia.org/ci/job/browsertests-Wikidata-WikidataTests-linux-chrome-sauce/62/
[17:31:22] Project browsertests-Wikidata-WikidataTests-linux-firefox-sauce build #265: STILL FAILING in 3 hr 9 min: https://integration.wikimedia.org/ci/job/browsertests-Wikidata-WikidataTests-linux-firefox-sauce/265/
[18:12:17] Browser-Tests, MediaWiki-extensions-OAuth: Add tests against beta to catch OAuth integration issues - https://phabricator.wikimedia.org/T78314#1393547 (Tgr)
[18:20:33] Browser-Tests, MediaWiki-extensions-OAuth: Add tests against beta to catch OAuth integration issues - https://phabricator.wikimedia.org/T78314#1393568 (Tgr) > The problem is that to test it, we need a request token, so we need a consumer in the DB and the code to generate the request token by hitting /in...
[18:24:40] (PS4) JanZerebecki: PronunciationRecording depends on UploadWizard [integration/config] - https://gerrit.wikimedia.org/r/219778 (owner: Mattflaschen)
[18:24:47] (CR) JanZerebecki: [C: 2] PronunciationRecording depends on UploadWizard [integration/config] - https://gerrit.wikimedia.org/r/219778 (owner: Mattflaschen)
[18:26:32] (Merged) jenkins-bot: PronunciationRecording depends on UploadWizard [integration/config] - https://gerrit.wikimedia.org/r/219778 (owner: Mattflaschen)
[18:28:02] Release-Engineering, Team-Practices: Organize "testing: where does it hurt?" workshop for the second week of July - https://phabricator.wikimedia.org/T102713#1393586 (ggellerman) @dduvall QRs are nearing....should we schedule this week of June 29 (next week) or is that too soon? @Greg would the week of Ju...
[18:28:20] !log zuul reload for https://gerrit.wikimedia.org/r/#/c/219778/4
[18:28:24] Logged the message, Master
[18:33:00] (CR) JanZerebecki: "Deployed to zuul. Tested successfully: https://integration.wikimedia.org/ci/job/mwext-testextension-zend/3842/" [integration/config] - https://gerrit.wikimedia.org/r/219778 (owner: Mattflaschen)
[18:44:05] Release-Engineering, Gather, MobileFrontend, Epic, and 2 others: [EPIC] Encourage developers to increase code coverage - https://phabricator.wikimedia.org/T100294#1393687 (Jdlrobson)
[18:47:58] Continuous-Integration-Infrastructure, Labs, Tool-Labs: Recover homedir of "ci" tool - https://phabricator.wikimedia.org/T103205#1393713 (Jdforrester-WMF) p:Triage>Normal
[19:14:39] (PS1) BryanDavis: Use utils.sudo_check_call instead of subprocess.check_call [tools/scap] - https://gerrit.wikimedia.org/r/220240
[19:14:42] (PS1) BryanDavis: Set --restart batch size to 5% of total hosts [tools/scap] - https://gerrit.wikimedia.org/r/220241
[19:43:10] (CR) Ori.livneh: [C: 2] Use utils.sudo_check_call instead of subprocess.check_call [tools/scap] - https://gerrit.wikimedia.org/r/220240 (owner: BryanDavis)
[19:43:32] (Merged) jenkins-bot: Use utils.sudo_check_call instead of subprocess.check_call [tools/scap] - https://gerrit.wikimedia.org/r/220240 (owner: BryanDavis)
[19:46:11] (CR) Ori.livneh: [C: -1] Set --restart batch size to 5% of total hosts (1 comment) [tools/scap] - https://gerrit.wikimedia.org/r/220241 (owner: BryanDavis)
[19:48:52] (PS2) Ori.livneh: Set --restart batch size to 5% of total hosts [tools/scap] - https://gerrit.wikimedia.org/r/220241 (owner: BryanDavis)
[19:49:35] (CR) Ori.livneh: [C: 2] Set --restart batch size to 5% of total hosts [tools/scap] - https://gerrit.wikimedia.org/r/220241 (owner: BryanDavis)
[19:49:56] (Merged) jenkins-bot: Set --restart batch size to 5% of total hosts [tools/scap] - https://gerrit.wikimedia.org/r/220241 (owner: BryanDavis)
[19:53:40] !log deleted broken renames from centralauth.renameuser_status on beta cluster
[19:53:43] Logged the message, Master
[19:56:55] http://en.wikipedia.beta.wmflabs.org/w/index.php?title=Special:RecentChanges&hidebots=0 [19:59:45] does anyone mind if I run a python script on deployment-bastion to generate junk edits for user merge testing? it's pretty slow from my laptop and I think it's because of the network [20:16:38] (03PS1) 1020after4: .gitignore local config file for make-wmf-branch [tools/release] - 10https://gerrit.wikimedia.org/r/220291 [20:16:40] (03PS1) 1020after4: add deploy-promote script [tools/release] - 10https://gerrit.wikimedia.org/r/220292 [20:17:05] (03CR) 1020after4: [C: 032] .gitignore local config file for make-wmf-branch [tools/release] - 10https://gerrit.wikimedia.org/r/220291 (owner: 1020after4) [20:18:26] (03Merged) 10jenkins-bot: .gitignore local config file for make-wmf-branch [tools/release] - 10https://gerrit.wikimedia.org/r/220291 (owner: 1020after4) [20:20:09] marxarelli: switching to public for the audience :-D [20:20:14] I wrote some very basic doc at https://wikitech.wikimedia.org/wiki/Nodepool [20:20:21] gotta up it [20:20:32] and probably want to update the arch document from febuary [20:20:46] then I guess list all questions regarding nodepool and loopback with chase / andrew from ops [20:37:17] 10Continuous-Integration-Infrastructure: Request Jenkins shell access for account "sniedzielski" - https://phabricator.wikimedia.org/T103192#1394117 (10Niedzielski) [20:38:01] 10Continuous-Integration-Infrastructure, 6Labs, 10Labs-Infrastructure: Cant ssh to integration-slave-jessie-1001.eqiad.wmflabs despite reboot - https://phabricator.wikimedia.org/T102592#1394123 (10hashar) [20:38:04] 10Continuous-Integration-Infrastructure, 6Labs, 10Labs-Infrastructure: Cant ssh to integration-slave-jessie-1001.integration.eqiad.wmflabs - https://phabricator.wikimedia.org/T103312#1394124 (10hashar) [20:39:27] 10Continuous-Integration-Infrastructure, 6Labs, 10Labs-Infrastructure: Cant ssh to integration-slave-jessie-1001.integration.eqiad.wmflabs - 
https://phabricator.wikimedia.org/T103312#1387030 (10hashar) From /var/log/auth.log : ``` Jun 23 20:36:22 integration-slave-jessie-1001 sshd[6109]: Connection from 10.6... [20:41:13] Are there any read-only mirrors of the Beta Cluster, for poking around? [20:41:34] Of the DB I mean. [20:41:38] If not, are there recommended credentials to use (ala the researcher system for production)? [20:42:32] 10Beta-Cluster, 6Labs, 7Shinken: Shinken is showing HTTP 404 warnings for deployment-mathoid/sca02 mathoid services - https://phabricator.wikimedia.org/T103595#1394154 (10Krenair) 3NEW [20:42:48] 10Continuous-Integration-Infrastructure, 6Labs, 10Labs-Infrastructure: Cant ssh to integration-slave-jessie-1001.integration.eqiad.wmflabs - https://phabricator.wikimedia.org/T103312#1394162 (10hashar) So bastion-01.bastion.eqiad.wmflabs has the IP 10.68.17.232 but it is not enabled in the ferm rules! Thus... [20:45:00] matt_flaschen, I don't think so [20:45:13] I guess you're looking for a labs-replica equivalent [20:45:21] that was resolved declined or something iirc [20:45:27] research system though... hm [20:45:55] Something like that. I wasn't expecting it to exist, but thought it was worth asking. [20:46:35] 10Continuous-Integration-Infrastructure, 6Labs, 10Labs-Infrastructure: Cant ssh to integration-slave-jessie-1001.integration.eqiad.wmflabs - https://phabricator.wikimedia.org/T103312#1394190 (10hashar) Stopped ferm service via salt and I can ssh again. Purged the ferm package, removed /etc/ferm and reran pu... [20:47:07] matt_flaschen, why can't you just connect straight to the normal databases? [20:47:22] like you would in production to just read data? [20:48:20] (03PS1) 10BryanDavis: Guard against https://bugs.python.org/issue1731717 [tools/scap] - 10https://gerrit.wikimedia.org/r/220300 [20:48:25] I can, I was just wondering. I prefer to use read-only mirrors where possible, as does etonkovidova. In prod, I do use mirrors, e.g. 
x1-analytics-slave, which I assume is read-only.
[20:49:28] (PS1) Ori.livneh: Handle ECHILD in ssh.py [tools/scap] - https://gerrit.wikimedia.org/r/220301
[20:49:48] (CR) jenkins-bot: [V: -1] Handle ECHILD in ssh.py [tools/scap] - https://gerrit.wikimedia.org/r/220301 (owner: Ori.livneh)
[20:51:19] (PS2) Ori.livneh: Handle ECHILD in ssh.py [tools/scap] - https://gerrit.wikimedia.org/r/220301
[20:51:23] (Abandoned) BryanDavis: Guard against https://bugs.python.org/issue1731717 [tools/scap] - https://gerrit.wikimedia.org/r/220300 (owner: BryanDavis)
[20:53:38] matt_flaschen, I really wouldn't be concerned about that in beta
[20:56:46] Continuous-Integration-Infrastructure, Labs: Continuous integration should not depend on labs NFS - https://phabricator.wikimedia.org/T90610#1394220 (hashar)
[20:56:48] Continuous-Integration-Infrastructure, Labs, Labs-Infrastructure: Cant ssh to integration-slave-jessie-1001.integration.eqiad.wmflabs - https://phabricator.wikimedia.org/T103312#1394218 (hashar) Open>Resolved a:hashar
[21:06:48] Continuous-Integration-Infrastructure: On Jessie CI slaves, install ruby2.1 instead of ruby1.9.3 - https://phabricator.wikimedia.org/T103600#1394281 (hashar) NEW
[21:18:10] (PS3) BryanDavis: Handle ECHILD in ssh.py [tools/scap] - https://gerrit.wikimedia.org/r/220301 (owner: Ori.livneh)
[21:19:17] (CR) BryanDavis: [C: 2] Handle ECHILD in ssh.py [tools/scap] - https://gerrit.wikimedia.org/r/220301 (owner: Ori.livneh)
[21:19:37] (Merged) jenkins-bot: Handle ECHILD in ssh.py [tools/scap] - https://gerrit.wikimedia.org/r/220301 (owner: Ori.livneh)
[21:24:26] Continuous-Integration-Infrastructure: On Jessie CI slaves, install ruby2.1 instead of ruby1.9.3 - https://phabricator.wikimedia.org/T103600#1394364 (hashar) Open>Resolved a:hashar Solved via https://gerrit.wikimedia.org/r/#/c/220308/ and deployed on integration puppet master.
[21:24:28] Continuous-Integration-Infrastructure, Patch-For-Review: Create CI slaves using Debian Jessie (tracking) - https://phabricator.wikimedia.org/T94836#1394367 (hashar)
[21:25:45] (PS1) BryanDavis: Ensure that the minimum batch size used by cluster_ssh is 1 [tools/scap] - https://gerrit.wikimedia.org/r/220316
[21:31:09] (CR) Ori.livneh: [C: 2] Ensure that the minimum batch size used by cluster_ssh is 1 [tools/scap] - https://gerrit.wikimedia.org/r/220316 (owner: BryanDavis)
[21:31:33] (Merged) jenkins-bot: Ensure that the minimum batch size used by cluster_ssh is 1 [tools/scap] - https://gerrit.wikimedia.org/r/220316 (owner: BryanDavis)
[21:34:04] !log updated scap to 33f3002 (Ensure that the minimum batch size used by cluster_ssh is 1)
[21:34:07] Logged the message, Master
[21:36:44] Continuous-Integration-Infrastructure, Continuous-Integration-Isolation, releng-201516-q1, Epic, and 2 others: [Quarterly Success Metric] Jenkins: Run jobs in disposable VMs - https://phabricator.wikimedia.org/T47499#1394425 (greg)
[21:41:39] hashar: doesn't ubuntu use libav as well?
[21:42:54] matanya: maybe, but we got ffmpeg
[21:43:12] matanya: I dont even know what it is used for, most probably for rendering videos
[21:43:27] hashar: your title is misleading :)
[21:43:36] https://phabricator.wikimedia.org/T103335
[21:43:49] it is indeed used for transconding videos
[21:43:58] *transcoding
[21:44:11] matanya: feel free to rephrase, add details to the task :-}
[21:44:32] maybe that bug needs #multimedia team added to it
[21:44:40] will do. 
anyhow, ffmpeg is only in testing of debian
[21:44:53] and ubuntu currently is shipped with libav
[21:45:03] same as jessie
[21:45:27] yeah
[21:45:42] https://packages.debian.org/source/jessie/libav https://launchpad.net/ubuntu/+source/libav
[21:45:50] the idea behind the bug is to identify what relies on ffmpeg
[21:45:56] and figure out the impacts
[21:46:00] so debian ^-------------------------------------- ^ubuntu
[21:46:02] of switching from ffmpeg to libav
[21:46:21] i can tell you that, did it in labs
[21:46:28] ohh
[21:46:31] video transcoding breaks
[21:46:33] write it down on the task https://phabricator.wikimedia.org/T103335 so :-}
[21:46:53] sounds like we will need an epic task to switch to libav
[21:47:01] or maybe we can build ffmpeg on Debian
[21:47:47] we can build ffmpeg in debian
[21:47:51] not very hard
[21:48:13] but why you say we use ffmpeg, ubuntu ships with libav
[21:48:37] and this might be the future: https://lwn.net/Articles/607591/
[21:48:54] and http://packages.ubuntu.com/search?keywords=ffmpeg
[21:49:13] anyway, the purpose of the task is to figure out whether we can migrate transcoding to libav
[21:49:17] and apparently it does not work
[21:49:37] so either a) we add ffmpeg to debian b) we code libav support for transcoding
[21:51:31] hashar: just for ref: https://phabricator.wikimedia.org/diffusion/OPUP/browse/production/modules/mediawiki/manifests/packages/multimedia.pp
[21:51:36] see the first part
[21:52:08] Continuous-Integration-Infrastructure, Multimedia, operations: Investigate impact of switching from ffmpeg to libav (ffmpeg is not in Jessie) - https://phabricator.wikimedia.org/T103335#1394480 (Matanya)
[21:54:37] Continuous-Integration-Infrastructure, Multimedia, operations: Investigate impact of switching from ffmpeg to libav (ffmpeg is not in Jessie) - https://phabricator.wikimedia.org/T103335#1394492 (Matanya) The service that relies on ffmpeg/libav is video transcoding, switching from either to the other 
br...
[21:57:17] matanya: excellent thanks a ton :-}
[21:57:31] :)
[22:04:29] now is bed time :-]
[22:15:51] CirrusSearch repo has just decided its composer stuff is out of date and is -1'ing things :(
[22:16:07] can i fix that from the outside, or is it something that will fix itself if i wait a little longer?
[22:16:33] (by just decided, i mean within the last half hour)
[22:18:05] ebernhardson: It's a core thing; wait a few moments.
[22:18:28] ebernhardson: (Specifically, wait for 220334 to finally merge and make the world work again.)
[22:18:48] awesome :)
[22:20:01] Project beta-update-databases-eqiad build #968: FAILURE in 1.7 sec: https://integration.wikimedia.org/ci/job/beta-update-databases-eqiad/968/
[22:21:00] also, how can i get myself on the BarryTheBrowserTestBot train? :)
[22:23:56] ebernhardson: Ask jdlrobson.
[22:32:33] ebernhardson: Try now, should work fine.
[22:41:59] Deployment-Systems, operations, HHVM, Patch-For-Review, User-Bd808-Test: Scap should restart HHVM - https://phabricator.wikimedia.org/T103008#1394650 (bd808) scap now has a `--restart` command line option that will run `scap-hhvm-restart` across the cluster. The `scap-hhvm-restart` script that r...
[22:55:35] (PS1) Odder: Update e-mail address for myself [integration/config] - https://gerrit.wikimedia.org/r/220345
[22:59:37] Continuous-Integration-Infrastructure, Wikibase-Quality-Constraints, Wikidata: MW_INSTALL_PATH incorrect for WikibaseQualityConstraints jenkins jobs - https://phabricator.wikimedia.org/T103626#1394735 (hoo) NEW a:JanZerebecki
[23:15:47] (PS2) Dduvall: WIP Builder for running MW-Selenium integration tests [integration/config] - https://gerrit.wikimedia.org/r/219513 (https://phabricator.wikimedia.org/T103039)
[23:18:56] (PS1) Dduvall: Integration tests to be run by CI [selenium] - https://gerrit.wikimedia.org/r/220348 (https://phabricator.wikimedia.org/T103039)
[23:20:53] Yippee, build fixed!
[23:20:54] Project beta-update-databases-eqiad build #969: FIXED in 53 sec: https://integration.wikimedia.org/ci/job/beta-update-databases-eqiad/969/
[23:25:38] (PS1) Krinkle: Add npm job for performance/docroot [integration/config] - https://gerrit.wikimedia.org/r/220350
[23:26:59] (CR) Krinkle: [C: 2] Add npm job for performance/docroot [integration/config] - https://gerrit.wikimedia.org/r/220350 (owner: Krinkle)
[23:28:56] (Merged) jenkins-bot: Add npm job for performance/docroot [integration/config] - https://gerrit.wikimedia.org/r/220350 (owner: Krinkle)
[23:29:23] !log Reloading Zuul to deploy https://gerrit.wikimedia.org/r/220350
[23:29:26] Logged the message, Master