[00:00:13] not in any significant way as far as I can tell
[00:00:19] the colors are a bit different
[00:00:38] it's ok paladox you aren't selling me a new car today
[00:00:43] I'm ranting
[00:01:01] bd808: heh
[00:01:02] * bd808 goes to look for his cheese somewhere else
[00:01:26] bd808: the full redesign is still to be implemented anyways :)
[00:01:35] (They started in 2.15)
[00:02:03] that's a big part of why I don't want to use it now
[00:02:09] it's shifting sands
[00:03:04] I do like pg from the master branch more, as it has more colour
[00:03:18] + I like the new status badge on the change screen
[00:04:39] I also like the new editing experience + the file list
[00:19:25] bd808: you're the resident mediawiki-vagrant expert, right?
[00:19:38] apergos: heh. probably
[00:19:52] * bd808 looks at the clock and looks back at apergos
[00:19:53] because I sure have a question: fedora 27, downloaded and installed virtualbox,
[00:20:00] have vagrant uh
[00:20:30] ok. mw-vagrant with virtual box on fedora
[00:20:33] 2.0.2 and vagrant-libvirt 0.40
[00:20:48] well vagrant up hangs on ssh, never connects. I can actually vagrant ssh in another window
[00:21:02] when I vagrant up --debug, it turns out net-ssh whines that
[00:21:13] could not verify server signature
[00:21:21] google was useless for it
[00:21:24] any ideas?
[00:21:34] obviously we're not anywhere near the mediawiki setup yet
[00:21:38] yuck. are you using VirtualBox or libvirt for the image runtime?
[00:21:50] I guess it's virtualbox
[00:22:02] I ran setup.sh from inside mediawiki-vagrant clone so
[00:22:06] whatever that might do
[00:22:23] nothing really. just makes a yaml config file
[00:22:39] `vagrant status` should tell you
[00:22:54] well let me start it again in the one window and let it hang again
[00:22:55] sec
[00:22:58] mine says "default running (virtualbox)"
[00:23:23] even if the vm is down it should tell you what runtime it is picking
[00:23:52] it says libvirt
[00:25:24] ok. the libvirt runtime isn't in my wheelhouse at all
[00:25:27] do i want virtualbox or do we prefer libvirt? I have no idea
[00:25:36] and I wonder how I force one or the other
[00:25:50] there is a way ...
[00:26:15] --provider=virtualbox
[00:26:19] ah ha
[00:26:20] that should force VB
[00:26:31] ok i'll poke at this tomorrow and see what it looks like
[00:26:42] we run --provider=lxc in Cloud VPS land
[00:26:50] orilly
[00:26:52] and I run VirtualBox on my mac
[00:27:05] that's pretty interesting, I didn't know lxc would do the job
[00:27:20] lxc is probably the right thing to try on fedora
[00:27:34] well i would love it because avoiding virtualblahblah would be nice
[00:27:39] already have a linux kernel
[00:27:46] there are notes in support/README-lxc.md
[00:27:47] and vms are so heavy
[00:27:57] cool I will check those out tomorrow for sure
[00:28:02] thanks for the tip!
[00:28:12] the fedora notes there are kind of old but should head you in the right direction
[00:28:14] well "tomorrow"
[00:28:19] heh
[00:28:23] the weekend or if I'm a big slacker then monda
[00:28:24] y
[00:28:29] time shifting!
[00:28:32] yeah
[00:28:41] and now, to bed! yay
[00:28:47] o/
[00:28:47] also, it's after midnight for you :P
[00:28:51] g'night
[00:29:02] ~02:30
[00:29:19] omg I had to hand sign the vbox kernel modules too for secure boot
[00:29:24] would be so happy to lose those
[00:29:26] anyways
[00:29:28] tah!
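Editor's sketch of the provider check discussed above: read the runtime out of `vagrant status` output, then build a `vagrant up` call that forces a specific provider. The helper names `parse_status` and `up_command` are hypothetical, and the regex assumes machine lines shaped like the "default running (virtualbox)" quoted in the channel; real output may differ between Vagrant versions.

```python
import re

# Assumed shape of a machine line in `vagrant status` output,
# e.g. "default                   running (virtualbox)".
STATUS_LINE = re.compile(
    r"^(?P<name>\S+)\s+(?P<state>.+?)\s+\((?P<provider>[\w-]+)\)\s*$"
)


def parse_status(output):
    """Map machine name -> (state, provider) for every machine line found."""
    machines = {}
    for line in output.splitlines():
        match = STATUS_LINE.match(line)
        if match:
            machines[match["name"]] = (match["state"], match["provider"])
    return machines


def up_command(provider):
    """Build the argv for `vagrant up` pinned to one provider."""
    return ["vagrant", "up", f"--provider={provider}"]


if __name__ == "__main__":
    sample = "Current machine states:\n\ndefault    not created (libvirt)\n"
    print(parse_status(sample))
    print(" ".join(up_command("lxc")))
```

The argv list from `up_command` can be handed to `subprocess.run` from inside the mediawiki-vagrant clone; the lxc provider also needs the setup described in support/README-lxc.md.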
[00:31:16] PROBLEM - Long lived cherry-picks on puppetmaster on deployment-puppetmaster02 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[00:35:50] Release-Engineering-Team, MediaWiki-General-or-Unknown, Epic, MW-1.29-release-notes, and 3 others: Port Selenium tests from Ruby to Node.js - https://phabricator.wikimedia.org/T139740#4020028 (greg)
[00:36:28] Release-Engineering-Team, Epic, MW-1.31-release-notes (WMF-deploy-2018-02-27 (1.31.0-wmf.23)), Patch-For-Review, User-zeljkofilipin: Q3 Selenium framework improvements - https://phabricator.wikimedia.org/T182421#4020031 (greg)
[00:37:00] Release-Engineering-Team, Epic, Tracking, User-zeljkofilipin: Selenium framework improvements - https://phabricator.wikimedia.org/T182986#4020033 (greg)
[00:41:34] Release-Engineering-Team, DNS, Operations, Traffic, WMF-Communications: Move Foundation Wiki to new URL when new Wikimedia Foundation website launches - https://phabricator.wikimedia.org/T188776#4020038 (greg)
[00:46:50] hmm
[00:46:53] nodepool launch errors
[00:49:57] Release-Engineering-Team, Epic, Zuul: Update zuul to v3 - https://phabricator.wikimedia.org/T186426#4020066 (greg) p:Triage>Normal
[00:52:55] Release-Engineering-Team, Epic, Zuul: Upgrade to Zuul3 - https://phabricator.wikimedia.org/T186426#4020071 (greg)
[01:02:23] Scap, Scoring-platform-team: Investigate deployment concurrency limitations for ORES - https://phabricator.wikimedia.org/T188281#4020086 (greg) >>! In T188281#4003034, @awight wrote: >>>! In T188281#4003007, @greg wrote: >> Using a fan out method is how this is handled normally. > > @greg Sorry in advan...
[01:03:44] Continuous-Integration-Config, Release-Engineering-Team (Next): Wikimedia Portals needs libpng-dev for npm-browser-node-6 tests - https://phabricator.wikimedia.org/T186117#4020089 (greg)
[01:05:30] Release-Engineering-Team (Watching / External), MediaWiki-Core-Tests, MediaWiki-extensions-ORES, MW-1.31-release-notes (WMF-deploy-2018-02-06 (1.31.0-wmf.20)), and 2 others: How do I test my extension's maintenance scripts? - https://phabricator.wikimedia.org/T184775#4020092 (greg) Open>Re...
[01:27:11] PROBLEM - nodepoold running on labnodepool1001 is CRITICAL: PROCS CRITICAL: 0 processes with UID = 113 (nodepool), regex args ^/usr/bin/python /usr/bin/nodepoold -d
[02:14:30] RECOVERY - nodepoold running on labnodepool1001 is OK: PROCS OK: 1 process with UID = 113 (nodepool), regex args ^/usr/bin/python /usr/bin/nodepoold -d
[02:29:51] (PS1) Legoktm: ci-src-setup: Add setup-mwext-vendor.sh and documentation [integration/config] - https://gerrit.wikimedia.org/r/416203
[02:39:34] PROBLEM - Puppet errors on saucelabs-01 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[03:11:41] RECOVERY - Free space - all mounts on integration-slave-jessie-1004 is OK: OK: All targets OK
[03:19:33] RECOVERY - Puppet errors on saucelabs-01 is OK: OK: Less than 1.00% above the threshold [0.0]
[03:36:48] (CR) Legoktm: [C: -1] ci-src-setup: Add setup-mwext-vendor.sh and documentation [integration/config] - https://gerrit.wikimedia.org/r/416203 (owner: Legoktm)
[04:28:52] (PS2) Legoktm: ci-src-setup: Add setup-mwext-vendor.sh and documentation [integration/config] - https://gerrit.wikimedia.org/r/416203
[04:28:54] (PS1) Legoktm: Create integration-jenkins image [integration/config] - https://gerrit.wikimedia.org/r/416212
[04:28:56] (PS1) Legoktm: [WIP] mwext-test-php70-sqlite image [integration/config] - https://gerrit.wikimedia.org/r/416213
[04:30:25] (CR) jerkins-bot: [V: -1] [WIP] mwext-test-php70-sqlite image [integration/config] - https://gerrit.wikimedia.org/r/416213 (owner: Legoktm)
[04:34:20] (CR) Legoktm: "This is very WIP. Originally I tried to use mw-install-sqlite.sh and those scripts, but I think we're better off forking them just for do" [integration/config] - https://gerrit.wikimedia.org/r/416213 (owner: Legoktm)
[04:50:30] PROBLEM - Mediawiki Error Rate on graphite-labs is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [10.0]
[05:00:30] PROBLEM - Mediawiki Error Rate on graphite-labs is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [10.0]
[05:38:01] (PS1) Legoktm: Update squizlabs/php_codesniffer to 3.2.3 [tools/codesniffer] - https://gerrit.wikimedia.org/r/416214
[09:30:53] Hi
[09:30:56] Jenkins again no work
[10:22:54] As I see, noone no reply
[10:23:11] Jenkins, CI, zuul no work
[10:23:29] 66 tests from before 10 hours
[10:54:07] PROBLEM - Host deployment-videoscaler01 is DOWN: CRITICAL - Host Unreachable (10.68.19.130)
[10:54:53] PROBLEM - Host deployment-tmh01 is DOWN: CRITICAL - Host Unreachable (10.68.16.211)
[11:20:59] Continuous-Integration-Infrastructure: CI is broken - https://phabricator.wikimedia.org/T188820#4020574 (Paladox)
[11:21:38] Continuous-Integration-Infrastructure: CI is broken - https://phabricator.wikimedia.org/T188820#4020585 (Paladox) p:Triage>Unbreak! Changing priority to UBN as tests are not running
[12:06:54] Continuous-Integration-Infrastructure: CI is broken - https://phabricator.wikimedia.org/T188820#4020574 (MarcoAurelio) Confirmed. No tests are running and those showing on the page are stuck. Can someone reboot the CI? Thanks.
[12:07:23] Krinkle: I understand you're able to reboot jenkins/zuul/ci?
[12:11:35] Hauskatze hi, this may not be as simple as rebooting it
[12:11:43] There was a problem with nodepool last night
[12:11:48] but then it was fixed.
[12:11:52] Could be the same problem.
[12:12:32] ok paladox
[12:12:48] ci runs on contint right?
[12:12:56] Continuous-Integration-Infrastructure: CI is broken - https://phabricator.wikimedia.org/T188820#4020627 (Paladox) This may be nodepool, as there was a problem with node pool last night + it only looks like the tests for nodepool are not running.
[12:13:10] Hauskatze on contint1001?
[12:13:21] Hauskatze jenkins runs on there, the slaves runs in labs
[12:13:31] heh, read my mind, if only the -jessie tests are not running then it might be nodepool
[12:13:53] very limited knowledge to make an assessment here though
[12:14:01] got to make some calls, brb
[12:30:53] PROBLEM - Puppet errors on deployment-ores01 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[12:36:24] does Gearman spike in the graph below mean anything?
[13:07:31] PROBLEM - nodepoold running on labnodepool1001 is CRITICAL: PROCS CRITICAL: 0 processes with UID = 113 (nodepool), regex args ^/usr/bin/python /usr/bin/nodepoold -d
[13:09:31] RECOVERY - nodepoold running on labnodepool1001 is OK: PROCS OK: 1 process with UID = 113 (nodepool), regex args ^/usr/bin/python /usr/bin/nodepoold -d
[13:30:07] Continuous-Integration-Infrastructure: CI is broken - https://phabricator.wikimedia.org/T188820#4020675 (Andrew) New VMs in contintcloud were failing with ``` Failed to allocate the network(s) with error Maximum number of fixed ips exceeded, not rescheduling ``` This is because the fixed_ip quota usages w...
[13:39:42] Project mwext-phpunit-coverage-publish build #1724: FAILURE in 21 sec: https://integration.wikimedia.org/ci/job/mwext-phpunit-coverage-publish/1724/
[13:41:36] 13:39:42 [38f8487ce831ffa9080dc44e] [no req] MWException from line 167 of /srv/jenkins-workspace/workspace/mwext-phpunit-coverage-publish/src/includes/Hooks.php: Invalid callback WikibaseQuality\ExternalValidation\WikibaseQualityExternalValidationHooks::onCreateSchema in hooks for LoadExtensionSchemaUpdates
[13:44:51] Continuous-Integration-Infrastructure: CI is broken - https://phabricator.wikimedia.org/T188820#4020713 (Hasechris) @Paladox Could it be a mistake to tag me in this issue? I haven't been developing for wikimedia ever. Greetings
[13:45:44] Continuous-Integration-Infrastructure: CI is broken - https://phabricator.wikimedia.org/T188820#4020715 (Paladox) Open>Resolved a:Andrew Thankyou @Andrew
[13:45:57] Continuous-Integration-Infrastructure: CI is broken - https://phabricator.wikimedia.org/T188820#4020718 (Paladox) @Hasechris i didn't tag you though.
[14:05:10] Continuous-Integration-Infrastructure: CI is broken - https://phabricator.wikimedia.org/T188820#4020753 (Hasechris) @Paladox Interesting. I'm a subscriber of this issue and i absolute dont know why. Here is the first email i got from this issue: > ---------- Forwarded message ---------- > From: Paladox <...
[14:24:50] RECOVERY - Work requests waiting in Zuul Gearman server on contint1001 is OK: OK: Less than 30.00% above the threshold [90.0] https://grafana.wikimedia.org/dashboard/db/zuul-gearman?panelId=10&fullscreen&orgId=1
[14:53:16] PROBLEM - Host deployment-puppetdb01 is DOWN: CRITICAL - Host Unreachable (10.68.23.76)
[15:52:42] Project mwext-phpunit-coverage-publish build #1725: STILL FAILING in 19 sec: https://integration.wikimedia.org/ci/job/mwext-phpunit-coverage-publish/1725/
[16:01:11] ^ that failure looks deja vu
[16:33:16] PROBLEM - Puppet errors on deployment-logstash2 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[17:21:02] Beta-Cluster-Infrastructure: Captchas sent with wrong mime type on beta - https://phabricator.wikimedia.org/T164047#3219881 (Ciencia_Al_Poder) Probably related: T164047
[17:32:48] PROBLEM - Free space - all mounts on deployment-mediawiki05 is CRITICAL: CRITICAL: deployment-prep.deployment-mediawiki05.diskspace.root.byte_percentfree (<11.11%)
[17:37:47] RECOVERY - Free space - all mounts on deployment-mediawiki05 is OK: OK: All targets OK
[18:33:49] Release-Engineering-Team (Watching / External), Analytics-Kanban, Operations, Patch-For-Review, and 2 others: Deprecation of mw.errors.* metrics - https://phabricator.wikimedia.org/T188749#4021055 (elukey) Thanks @Krinkle! @fgiunchedi I think we are ready to go, what do you think?
[18:34:37] PROBLEM - App Server Main HTTP Response on deployment-mediawiki07 is CRITICAL: HTTP CRITICAL: HTTP/1.1 500 hphp_invoke - string 'Wikipedia' not found on 'http://en.wikipedia.beta.wmflabs.org:80/wiki/Main_Page?debug=true' - 287 bytes in 0.003 second response time
[18:54:28] PROBLEM - Puppet errors on deployment-secureredirexperiment is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[19:06:30] RECOVERY - Mediawiki Error Rate on graphite-labs is OK: OK: Less than 1.00% above the threshold [1.0]
[19:41:43] Yippee, build fixed!
[19:41:43] Project mwext-phpunit-coverage-publish build #1726: FIXED in 25 sec: https://integration.wikimedia.org/ci/job/mwext-phpunit-coverage-publish/1726/
[20:09:05] legoktm we should discontinue the use of phantomjs per https://github.com/ariya/phantomjs/issues/15344
[20:28:27] Yippee, build fixed!
[20:28:28] Project selenium-Wikibase-chrome » chrome,beta,Linux,DebianJessie && contintLabsSlave build #131: FIXED in 41 min: https://integration.wikimedia.org/ci/job/selenium-Wikibase-chrome/BROWSER=chrome,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=DebianJessie%20&&%20contintLabsSlave/131/
[21:08:22] PROBLEM - Puppet errors on deployment-mx02 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[22:30:47] Gerrit: Polygerrit search dropdown does not list all projects - https://phabricator.wikimedia.org/T188842#4021255 (Tgr)
[22:45:50] Beta-Cluster-Infrastructure: Captchas sent with wrong mime type on beta - https://phabricator.wikimedia.org/T164047#3219881 (Krenair) Loaded the login page, it downloaded this URL: ```alex@alex-laptop:~$ curl -sI 'https://en.wikipedia.beta.wmflabs.org/w/index.php?title=Special:Captcha/image&wpCaptchaId=10637...
[23:02:12] Beta-Cluster-Infrastructure: Captchas sent with wrong mime type on beta - https://phabricator.wikimedia.org/T164047#4021275 (Krenair) That last-modified date was from near my NFS -> Swift migration: T64835#2459268 So the file itself may be very old
[23:15:42] Beta-Cluster-Infrastructure: Captchas sent with wrong mime type on beta - https://phabricator.wikimedia.org/T164047#4021296 (Krenair) That task shows that I ran `root@deployment-ms-fe01:/data/project/upload7/private/captcha# swift upload global-data-captcha-render *` (I'm so glad I kept records of important...
[23:49:40] Gerrit: Polygerrit search dropdown does not list all projects - https://phabricator.wikimedia.org/T188842#4021347 (Paladox) Seems to be a problem with searching with capitals.
[23:51:34] Gerrit: Polygerrit search dropdown does not list all projects - https://phabricator.wikimedia.org/T188842#4021349 (Paladox) Seems to be fixed in 2.15.