[00:24:35] 10Beta-Cluster-Infrastructure: deployment-mediawiki0[12] have outdated mediawiki-config - https://phabricator.wikimedia.org/T127888#2058294 (10Ladsgroup) Even with several resyncs done, it's not there yet. Tell if I'm wrong. [00:37:18] 10Beta-Cluster-Infrastructure: deployment-mediawiki0[12] have outdated mediawiki-config - https://phabricator.wikimedia.org/T127888#2058345 (10thcipriani) Hmm...code is definitely present on all the beta mediawiki boxes: ``` thcipriani@deployment-mediawiki03:/srv/mediawiki/wmf-config$ ack-grep ORES --php Initia... [01:10:06] 10Continuous-Integration-Config, 10MediaWiki-extensions-Scribunto, 10Wikidata: Add Scribunto to extension-gate in CI - https://phabricator.wikimedia.org/T125050#2058386 (10Paladox) [01:29:26] greg-g, I think ostriches may not have time to look at the External Store Beta issue (https://phabricator.wikimedia.org/T95871), so if there's anyone else I could discuss it with, that would be helpful. [01:29:32] 10Beta-Cluster-Infrastructure: deployment-mediawiki0[12] have outdated mediawiki-config - https://phabricator.wikimedia.org/T127888#2058519 (10Ladsgroup) Also, it's present in API too, it's in Special:Version as well. (It was before these changes too): http://en.wikipedia.beta.wmflabs.org/w/api.php?action=query&... [01:32:00] ori, gilles, or Krinkle, could you look at https://phabricator.wikimedia.org/T127785 if you have a chance? [01:32:26] Sorry, wrong channel, that second one is not specifically a Beta issue. [01:34:00] matt_flaschen: Soooo, I'm not exactly sure what you need from me re: external storage. Of all MW core subsystems, I know basically zilch about it. [01:50:05] 10Beta-Cluster-Infrastructure, 10Staging, 10DBA, 3Collaboration-Team-Current: Use External Store on Beta Cluster - https://phabricator.wikimedia.org/T95871#2058573 (10demon) >>! In T95871#2047217, @greg wrote: > Adding Chad for their help/expertise. So I'm not exactly sure what's needed here from me :) I... 
[02:32:52] ostriches, greg-g pointed me to you. Mainly just want to talk to someone familiar with the Beta Cluster DB setup. [02:44:56] 10Beta-Cluster-Infrastructure, 10Staging, 10DBA, 3Collaboration-Team-Current: Use External Store on Beta Cluster - https://phabricator.wikimedia.org/T95871#2058642 (10Mattflaschen) Talked to @demon a bit about this. >>! In T95871#2047174, @Mattflaschen wrote: > make-all-blobs blobs1 If no one o... [03:00:44] 10Beta-Cluster-Infrastructure, 10Staging, 10DBA, 3Collaboration-Team-Current: Use External Store on Beta Cluster - https://phabricator.wikimedia.org/T95871#2058648 (10Mattflaschen) The config change will first be done for one Beta wiki, then if that goes fine I'll do the rest. [03:16:18] 10Beta-Cluster-Infrastructure, 10Staging, 10DBA, 3Collaboration-Team-Current: Use External Store on Beta Cluster - https://phabricator.wikimedia.org/T95871#2058661 (10demon) Yep, that about summarizes it. Ping me on IRC when you wanna do this so you can have a second set of hands :) [05:33:08] RECOVERY - Puppet failure on deployment-analytics03 is OK: OK: Less than 1.00% above the threshold [0.0] [06:48:33] PROBLEM - Free space - all mounts on deployment-bastion is CRITICAL: CRITICAL: deployment-prep.deployment-bastion.diskspace._var.byte_percentfree (<11.11%) [07:05:56] RECOVERY - Free space - all mounts on deployment-jobrunner01 is OK: OK: All targets OK [07:08:25] RECOVERY - Free space - all mounts on deployment-bastion is OK: OK: All targets OK [08:26:01] 3Scap3: Proof-of-concept: sync l10n cache with git-annex + zsync - https://phabricator.wikimedia.org/T126805#2058822 (10mmodell) 5Open>3declined zsync is a really interesting application, however, it doesn't seem to be well suited to syncing our localization cache CDB files. 
[08:32:56] PROBLEM - Host cache-rsync is DOWN: CRITICAL - Host Unreachable (10.68.23.165) [08:36:25] PROBLEM - SSH on integration-make-wmf-branch is CRITICAL: Connection refused [09:52:48] let's do some more reviews [09:56:39] hashar: You know the python tests, you said that migrating to trusty will not be easy for those. But what if we create a new python template that uses trusty so it uses the newer python. We could then migrate repos to use the new template. [10:04:06] hashar: In jenkins I'm not sure how to fix it so when an extension has a dependency it will test both at the same time so the main extension doesn't error out but it seems to not happen. [10:07:47] (03PS1) 10Paladox: Install dependant extensions first then the main extension [integration/config] - 10https://gerrit.wikimedia.org/r/272949 [10:08:47] (03CR) 10jenkins-bot: [V: 04-1] Install dependant extensions first then the main extension [integration/config] - 10https://gerrit.wikimedia.org/r/272949 (owner: 10Paladox) [10:10:03] 3Scap3, 10scap, 7WorkType-NewFunctionality: [Spike] Benchmark built-in HTTP server options for scap3 fanout - https://phabricator.wikimedia.org/T127733#2059064 (10mmodell) [10:10:26] paladox: good morning [10:10:44] paladox: I think we have / had templates to vary tox to trusty / jessie [10:11:10] but I might have dropped them in favor of just migrating everything to 'tox-jessie' which is running on the Nodepool disposable instances [10:11:19] I am going to dig in the changes you sent yesterday and approve / refine as needed :} [10:11:27] hashar: And you too, Ok thanks. [10:13:02] hashar https://gerrit.wikimedia.org/r/#/c/264333/ this will work for mysql. Reason: I looked at the one for the non generic one and it showed mysql install before moving extension list to src. 
[10:13:28] yeah have to carefully look at that one as well [10:13:36] havent made my mind about it yet [10:14:13] (03Abandoned) 10Paladox: Install dependant extensions first then the main extension [integration/config] - 10https://gerrit.wikimedia.org/r/272949 (owner: 10Paladox) [10:14:33] hashar: Ok, Thanks for reviewing my patches. :) [10:14:45] thanks for all of them :} [10:14:52] though I have trouble to keep up hehe [10:15:23] hashar: Your welcome. :) [10:16:03] (03CR) 10Hashar: [C: 032] "Great excellent / perfect etc :-)" [integration/config] - 10https://gerrit.wikimedia.org/r/272902 (owner: 10Paladox) [10:16:13] ^^^ that kind of change is very helpful [10:16:30] (03CR) 10Paladox: "Thanks. :)" [integration/config] - 10https://gerrit.wikimedia.org/r/272902 (owner: 10Paladox) [10:17:06] (03CR) 10Hashar: "I will bulk clean up the jobs in Jenkins and the workspaces later on." [integration/config] - 10https://gerrit.wikimedia.org/r/272902 (owner: 10Paladox) [10:18:02] (03Merged) 10jenkins-bot: Replace jslint test with jshint and jsonlint tests [integration/config] - 10https://gerrit.wikimedia.org/r/272902 (owner: 10Paladox) [10:18:32] hashar: Your welcome, I think most of the tests legoktm linked in https://phabricator.wikimedia.org/T127362 are passing, Only the ones that are non voting are failing but i think we should just make them jsonlint and when or who wants to start updating the repo again we could turn jshint or npm on for them. That way we can get jslint removed as fast as the migration is complete. 
[10:21:35] hashar: Is https://gerrit.wikimedia.org/r/#/c/272748/ this blocked until zuul is migrated to either trusty or jessie [10:24:16] paladox: yeah should abandon it [10:24:23] hashar: Ok [10:24:41] paladox: it is a lame integration test to make sure we validate the layout using the exact same version that is running the Zuul scheduler (the server) on gallium the production machine [10:24:47] (03Abandoned) 10Paladox: Migrate integration to UbuntuTrusty [integration/config] - 10https://gerrit.wikimedia.org/r/272748 (owner: 10Paladox) [10:24:55] so indeed while the production scheduler is still Precise, the test should be precise [10:25:04] hashar: Ok [10:25:10] note how it references /usr/bin/zuul-server [10:25:22] which is Zuul installed on the Precise slaves using a Debian package [10:25:33] just like it is installed on the production server [10:26:05] so when upgrading Zuul, I upgrade it both on production server and the Precise slaves. If the job start falling we have a trouble in prod [10:26:14] hashar: Yes. ok. [10:26:31] hashar: Is there any plans to migrate zuul to the updated os. [10:26:33] that is what I have said yesterday, it is not as easy as just replacing UbuntuPrecise by UbuntuTrusty ;-} [10:26:40] there are tasks yes [10:26:42] gotta find out a server [10:26:52] sit down on the whiteboard and figure out the target architecture [10:27:19] gallium has been setup ~ 5 years ago and the way it connected to the rest of the infrastructure is legacy / no more supported [10:27:29] hashar: Ok. [10:27:42] its replacement server must be current state of the art. Which has a few different implications I havent looked at yet [10:28:04] not much details / context to give. I haven't even started to look at it [10:28:36] hashar: Ok. Im aware gerrit is being upgraded soon since it is being migrated to a new server so soon some of the features in zuul that only support newer gerrit should be supported. 
[10:29:14] (03PS2) 10Hashar: Add npm-node-4.3 to a few of apps/* templates in experimental: [integration/config] - 10https://gerrit.wikimedia.org/r/272765 (owner: 10Paladox) [10:30:35] (03CR) 10Hashar: [C: 032] "Niedzielski the 'experimental' pipeline triggers the jobs listed under it when someone comment in Gerrit 'check experimental'. That is a " [integration/config] - 10https://gerrit.wikimedia.org/r/272765 (owner: 10Paladox) [10:31:31] (03Merged) 10jenkins-bot: Add npm-node-4.3 to a few of apps/* templates in experimental: [integration/config] - 10https://gerrit.wikimedia.org/r/272765 (owner: 10Paladox) [10:32:22] (03CR) 10Paladox: "Thanks." [integration/config] - 10https://gerrit.wikimedia.org/r/272765 (owner: 10Paladox) [10:35:45] paladox: you should test the changes locally :-} [10:35:56] paladox: have you figured out how to get node/npm installed on your machine? [10:36:05] hashar: Nope [10:36:58] hashar: Maybe this http://blog.teamtreehouse.com/install-node-js-npm-windows [10:37:21] hashar: which one do i download from https://nodejs.org/en/ [10:37:35] acroding to it, it says 4.3.1 lts or 5.x [10:40:46] paladox: LTS stands for Long Term Support [10:40:56] so that is rather an "old" version but it is still actively maintained [10:41:05] hashar: Oh should i use 5.x [10:41:11] namely if there is a security issue or a critical bug, the 4.3.x version will be updated [10:41:26] on Jessie we have 4.3.x [10:41:34] so might want to use 4.3 rather than 5.x [10:41:38] hashar: Ok, so i should use 4.3. Ok. [10:41:47] and I think npm is installed along it [10:42:03] you will have a different version than the one being used on the CI nodes, but it should not be of any concern [10:42:10] hashar: Ok and https://integration.wikimedia.org/ci/job/npm-node-4.3/43/console has passed now. [10:42:30] magic [10:43:14] yep [10:48:54] hashar: Do we archive the tests for https://gerrit.wikimedia.org/r/#/c/272773/ since it was moved to github per its description. 
[10:57:40] (03PS1) 10Paladox: Remove apps-jslint, Add npm to two of apps/* repos [integration/config] - 10https://gerrit.wikimedia.org/r/272954 [10:58:38] (03Abandoned) 10Paladox: Migrate apps tests from apps-jslint to apps-jshint and apps-jsonlint [integration/config] - 10https://gerrit.wikimedia.org/r/271725 (https://phabricator.wikimedia.org/T62619) (owner: 10Paladox) [11:00:47] paladox: indeed [11:00:54] they should probably blank the repository in Gerrit [11:00:58] then we can make it read-only [11:01:02] hashar: Ok [11:01:13] and yeah archived [11:01:26] good catch [11:02:40] I have asked about it in #wikimedia-mobile [11:03:02] (03PS1) 10Paladox: [apps/ios/wikipedia] Archive repo [integration/config] - 10https://gerrit.wikimedia.org/r/272955 [11:03:13] Ok/ [11:03:25] (03CR) 10Hashar: [C: 032] [apps/ios/wikipedia] Archive repo [integration/config] - 10https://gerrit.wikimedia.org/r/272955 (owner: 10Paladox) [11:03:42] (03CR) 10Paladox: "Thanks." [integration/config] - 10https://gerrit.wikimedia.org/r/272955 (owner: 10Paladox) [11:04:08] hashar: Should i remove the php tests for apps/ in a different patch. Because only the ios of apps had php test. [11:04:36] (03Merged) 10jenkins-bot: [apps/ios/wikipedia] Archive repo [integration/config] - 10https://gerrit.wikimedia.org/r/272955 (owner: 10Paladox) [11:06:53] (03PS2) 10Paladox: Remove apps-jslint, Add npm to apps/android/wikipedia repo [integration/config] - 10https://gerrit.wikimedia.org/r/272954 [11:12:42] paladox: no clue :-} [11:12:47] paladox: heading out for lunch [11:12:52] hashar: Ok. [11:16:48] 10Browser-Tests-Infrastructure, 6Release-Engineering-Team, 7Ruby: Ruby mediawiki_api client hides error details - https://phabricator.wikimedia.org/T127786#2059184 (10hashar) When mediawiki-ruby-api catch an HttpError it only reports the status code. Seems we will want to emit the response payload as well to... 
[11:24:17] 10Beta-Cluster-Infrastructure, 6Release-Engineering-Team, 6Operations, 13Patch-For-Review, 7Puppet: deployment-tin puppet Error 400 on SERVER: Failed to parse template nutcracker/nutcracker.yml.erb - https://phabricator.wikimedia.org/T127845#2059202 (10Joe) p:5Triage>3Normal a:3Joe [11:29:56] 10Beta-Cluster-Infrastructure, 6Release-Engineering-Team, 6Operations, 13Patch-For-Review, 7Puppet: deployment-tin puppet Error 400 on SERVER: Failed to parse template nutcracker/nutcracker.yml.erb - https://phabricator.wikimedia.org/T127845#2059209 (10Joe) I removed the cherry pick of https://gerrit.wik... [11:30:05] 10Beta-Cluster-Infrastructure, 6Release-Engineering-Team, 6Operations, 13Patch-For-Review, 7Puppet: deployment-tin puppet Error 400 on SERVER: Failed to parse template nutcracker/nutcracker.yml.erb - https://phabricator.wikimedia.org/T127845#2059211 (10Joe) 5Open>3Resolved [11:56:43] PROBLEM - App Server Main HTTP Response on deployment-mediawiki01 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:57:52] Project browsertests-CentralAuth-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce build #401: 04FAILURE in 51 sec: https://integration.wikimedia.org/ci/job/browsertests-CentralAuth-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce/401/ [12:01:39] RECOVERY - App Server Main HTTP Response on deployment-mediawiki01 is OK: HTTP OK: HTTP/1.1 200 OK - 40340 bytes in 5.596 second response time [12:54:39] Project browsertests-GettingStarted-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce build #767: 04FAILURE in 39 sec: https://integration.wikimedia.org/ci/job/browsertests-GettingStarted-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce/767/ [13:00:38] hashar: Did you know you can install npm through composer [13:03:24] paladox: ohhh [13:04:04] hashar: Yes see https://gerrit.wikimedia.org/r/#/c/272961/ please. I was doing some testing with doing that. 
[13:41:03] 10Continuous-Integration-Infrastructure: make CI for extensions able to run PHPUnit from composer instead of a system wide installation - https://phabricator.wikimedia.org/T112867#2059555 (10JanZerebecki) 5Open>3Resolved a:3Krinkle I think this was solved during T99982. Thx. [13:53:46] 10Continuous-Integration-Config, 10Wikibase-JavaScript-Api, 10Wikidata: [Task] Make Wikibase-JavaScript-API testextension job run composer - https://phabricator.wikimedia.org/T110172#2059605 (10JanZerebecki) 5Open>3Resolved a:3JanZerebecki Fixed during T100654. [14:06:44] Project browsertests-Wikidata-WikidataTests-linux-firefox build #114: 04STILL FAILING in 8 min 43 sec: https://integration.wikimedia.org/ci/job/browsertests-Wikidata-WikidataTests-linux-firefox/114/ [14:26:07] PROBLEM - English Wikipedia Main page on beta-cluster is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:26:37] Project browsertests-Echo-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce build #802: 04FAILURE in 36 sec: https://integration.wikimedia.org/ci/job/browsertests-Echo-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce/802/ [14:31:05] RECOVERY - English Wikipedia Main page on beta-cluster is OK: HTTP OK: HTTP/1.1 200 OK - 40683 bytes in 6.128 second response time [14:35:26] 10Beta-Cluster-Infrastructure: Cannot login on Commons Beta, complains about disabled cookies - https://phabricator.wikimedia.org/T127964#2059755 (10JeanFred) [14:41:08] !log beta: we have a lost a memcached server 11:51am UTC [14:41:11] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master [14:45:32] 10Beta-Cluster-Infrastructure: deployment-mediawiki02 lost memcached access at 11:51am UTC - https://phabricator.wikimedia.org/T127966#2059794 (10hashar) [14:49:06] hashar: Could it be failing because of [14:49:07] rsync: change_dir "/mediawiki-services-mathoid-deploy/master/mathoid-deploy-npm-node-4.3" (in caches) failed: No such file or directory (2) [14:49:07] 14:22:43 rsync 
error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1655) [Receiver=3.1.1] [14:49:07] 14:22:43 rsync: read error: Connection reset by peer (104) [14:49:11] https://integration.wikimedia.org/ci/job/mathoid-deploy-npm-node-4.3/9/console [14:51:29] 10Beta-Cluster-Infrastructure: deployment-mediawiki02 lost memcached access at 11:51am UTC - https://phabricator.wikimedia.org/T127966#2059832 (10hashar) A puppet run on deployment-mediawiki02: ``` Notice: /Stage[main]/Nutcracker/Service[nutcracker]/ensure: ensure changed 'stopped' to 'running' Info: /Stage[main... [14:52:17] 10Beta-Cluster-Infrastructure, 6Release-Engineering-Team, 6Operations, 13Patch-For-Review, 7Puppet: deployment-tin puppet Error 400 on SERVER: Failed to parse template nutcracker/nutcracker.yml.erb - https://phabricator.wikimedia.org/T127845#2055880 (10hashar) That seems to cause the nutcracker.yam file... [14:53:33] 10Beta-Cluster-Infrastructure: deployment-mediawiki02 lost memcached access at 11:51am UTC - https://phabricator.wikimedia.org/T127966#2059794 (10hashar) `/etc/nutcracker/nutcracker.yml` on deployment-mediawiki02: ``` lang=yaml mc-unix: auto_eject_hosts: true distribution: ketama hash: md5 listen: /var/r... [15:02:37] hashar: What are you people doing to MultimediaViewer, your CR comment scared me [15:04:12] MarkTraceur: ah sorry poke zeljkof ^^^ [15:04:27] Mostly I'm just curious. [15:04:35] in meeting with tyler [15:04:55] MarkTraceur: fixing selenium tests [15:05:05] just a one liner that will fix 80% or so of the tests [15:05:14] recent upgrade broke something in MMV [15:05:41] Oh, K [15:05:52] hashar, https://phabricator.wikimedia.org/T111259 bit us again recently .. a patch that got merged 3 weeks back should have failed php parser tests then .. but it got caught y'day somewhat randomly. [15:05:59] it would be good to get that fixed. 
[15:06:04] zeljkof: I figured those tests were unsalvageable based on the backlog of test fail emails I have [15:06:12] but looks like beta cluster is down, so rerunning the tests would probably fail again [15:06:20] MarkTraceur: oh noes [15:06:25] I can fix all the tests ;) [15:06:34] Ah, well, that's good [15:06:39] take a look at the patch, it is one like, if you know what you are doing ;) [15:06:48] hasharAW, arlo thinks that jenkins job is using some cached version or something like that. [15:07:03] but yes, we are working on JS+Selenium, should be ready soon ™️ [15:07:49] !log beta app servers have lost access to memcached due to bad nutcracker conf | T127966 [15:07:52] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master [15:08:05] Good fix! [15:08:12] 7Browser-Tests, 10MediaWiki-extensions-MultimediaViewer: Disable MultimediaViewer scenarios that fail at en.wikipedia.beta.wmflabs.org from running daily - https://phabricator.wikimedia.org/T94157#2059956 (10zeljkofilipin) [15:08:14] zeljkof: And yeah, I'm looking forward to the JS tests. [15:08:14] 10Browser-Tests-Infrastructure, 10MediaWiki-extensions-MultimediaViewer, 13Patch-For-Review: undefined method `test_name' for # (NoMethodError) - https://phabricator.wikimedia.org/T125072#2059955 (10zeljkofilipin) 5Open>3Resolved [15:08:31] zeljkof: does that actually fix the daily browser tests ? 
:) [15:08:40] I was never sure if I was failing at browser tests because Ruby sucks or because PageObject sucks [15:08:45] Now I guess we'll find out for sure [15:08:53] Project browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-os_x_10.9-safari-sauce build #888: 04FAILURE in 19 sec: https://integration.wikimedia.org/ci/job/browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-os_x_10.9-safari-sauce/888/ [15:08:54] Project browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-os_x_10.9-chrome-sauce build #358: 04FAILURE in 20 sec: https://integration.wikimedia.org/ci/job/browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-os_x_10.9-chrome-sauce/358/ [15:08:57] hasharAW: yes [15:09:00] a lot of them [15:09:23] there is a MMV JS problem, so not sure how many tests will fail, but the tooling side is fixed now [15:09:33] MarkTraceur: :P [15:09:45] both ruby and page object pattern are cool as beer [15:09:54] all three new failures https://integration.wikimedia.org/ci/job/browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-os_x_10.9-safari-sauce/888/testReport/ [15:09:59] but I know not everybody likes ruby [15:10:05] but maybe it is due to beta lacking memcached right now [15:10:18] yeah MediawikiApi::LoginError [15:10:19] ... 
[15:10:38] you will still get page objects with JS [15:10:46] Project beta-scap-eqiad build #90965: 04FAILURE in 6 min 4 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/90965/ [15:11:04] hasharAW: yes, will rerun the tests when beta is back, let me know if you remember [15:13:21] PROBLEM - Puppet failure on integration-slave-precise-1012 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [15:15:39] PROBLEM - Puppet failure on deployment-cache-text04 is CRITICAL: CRITICAL: 37.50% of data above the critical threshold [0.0] [15:17:34] PROBLEM - Puppet failure on deployment-redis02 is CRITICAL: CRITICAL: 25.00% of data above the critical threshold [0.0] [15:18:18] PROBLEM - Puppet failure on mira is CRITICAL: CRITICAL: 16.67% of data above the critical threshold [0.0] [15:18:46] PROBLEM - Puppet failure on deployment-urldownloader is CRITICAL: CRITICAL: 16.67% of data above the critical threshold [0.0] [15:19:58] PROBLEM - Puppet failure on deployment-kafka04 is CRITICAL: CRITICAL: 70.00% of data above the critical threshold [0.0] [15:19:58] PROBLEM - Puppet failure on integration-slave-trusty-1003 is CRITICAL: CRITICAL: 77.78% of data above the critical threshold [0.0] [15:20:41] and now puppet is dead [15:23:14] PROBLEM - Puppet failure on deployment-zotero01 is CRITICAL: CRITICAL: 75.00% of data above the critical threshold [0.0] [15:33:42] Yippee, build fixed! 
[15:33:43] Project beta-scap-eqiad build #90967: 09FIXED in 9 min 2 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/90967/ [15:33:44] PROBLEM - Puppet failure on deployment-apertium01 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0] [15:46:16] 7Browser-Tests, 10MediaWiki-Vagrant, 10Wikidata: Problem with running wikidata browser tests targeting mediawiki-vagrant with wikidata role - https://phabricator.wikimedia.org/T127972#2060039 (10zeljkofilipin) [15:46:31] jzerebecki, Jonas_WMDE: https://phabricator.wikimedia.org/T127972 [15:46:46] I have reported my problem with running browser tests for wikidata [15:46:57] any help is appreciated [15:54:58] RECOVERY - Puppet failure on deployment-kafka04 is OK: OK: Less than 1.00% above the threshold [0.0] [15:54:58] RECOVERY - Puppet failure on integration-slave-trusty-1003 is OK: OK: Less than 1.00% above the threshold [0.0] [15:55:42] RECOVERY - Puppet failure on deployment-cache-text04 is OK: OK: Less than 1.00% above the threshold [0.0] [15:56:19] 7Browser-Tests, 10MediaWiki-Vagrant, 10Wikidata: Problem with running wikidata browser tests targeting mediawiki-vagrant with wikidata role - https://phabricator.wikimedia.org/T127972#2060099 (10JanZerebecki) > WIKIDATA_REPO_API: "http://127.0.0.1:8080/w/api.php" > WIKIDATA_REPO_URL: "http://127.0.0.1:8080/w... [15:58:42] RECOVERY - Puppet failure on deployment-apertium01 is OK: OK: Less than 1.00% above the threshold [0.0] [16:00:53] 7Browser-Tests, 10VisualEditor, 13Patch-For-Review, 5WMF-deploy-2016-03-01_(1.27.0-wmf.15): Disable VisualEditor scenarios that fail at en.wikipedia.beta.wmflabs.org from running daily - https://phabricator.wikimedia.org/T94162#2060121 (10zeljkofilipin) [16:02:41] RECOVERY - Puppet failure on deployment-redis02 is OK: OK: Less than 1.00% above the threshold [0.0] [16:04:53] zeljkof: answered on ticket. thx for looking into this. [16:05:18] jzerebecki: thank _you_! 
[16:08:50] RECOVERY - Puppet failure on deployment-urldownloader is OK: OK: Less than 1.00% above the threshold [0.0] [16:12:11] hashar: Maybe because npm fails on mathoid deploy because it needs to use npm install. [16:15:12] paladox: it does not by design [16:15:34] paladox: we have a script nom-inject-dev or something that does install dev dependencies via npm [16:15:41] but all the other node modules are embedded in the /deploy repo [16:15:55] that is because on production we only clone /deploy.git and never ever run npm install on prod [16:16:07] so /deploy is really a snapshot of the code + run dependencies [16:16:36] hashar: Oh but why does it run npm fine on mathoid but not on mathoid deploy [16:17:24] s/npm/tests/ [16:17:26] no clue [16:17:46] the Jenkins job running against /deploy tweaks npm/node related variables [16:17:46] hashar: Oh it does fail to do something for castor. [16:17:52] Oh ok. [16:17:54] so that might have a side effect [16:17:56] ah yeah castor-save can be ignored [16:18:00] undocumented :( [16:18:15] hashar: Ok. So is castor causing the problem. [16:18:44] there is only an introduction paragraph for castor :( https://www.mediawiki.org/wiki/Continuous_integration/Architecture/Castor [16:18:46] my blame [16:19:11] hashar: Oh. [16:19:26] hashar: o/ [16:21:19] hashar: Would this 16:19:39 rsync: change_dir "/mediawiki-services-mathoid-deploy/master/mathoid-deploy-npm-node-4.3" (in caches) failed: No such file or directory (2) indicate a problem. [16:21:20] https://phabricator.wikimedia.org/T127888#2058519 [16:21:42] paladox: na that is ignored [16:21:57] hashar: Oh. Ok. Wonder why it is failing then. [16:21:59] paladox: castor is a way to save the package managers' cache to a central place [16:22:09] hashar: Oh ok. [16:22:16] paladox: and it is run at the start of some jobs to try to populate the cache. But if none exists on the central repo it is ignored. 
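Editor's note on the castor behaviour hashar describes above: the cache is restored at the start of a job and saved back afterwards, and a missing cache on the central store is tolerated rather than treated as a failure. The sketch below illustrates only that pattern. It is not the actual castor implementation (which, per the log output, uses rsync); the function names `restore_cache` and `save_cache`, the directory layout, and the use of local file copies are all assumptions made for illustration.

```python
import shutil
import tempfile
from pathlib import Path


def restore_cache(central_root: Path, namespace: str, dest: Path) -> bool:
    """Copy a saved package-manager cache into dest, if one exists.

    Returns True when a cache was restored and False when the central
    store has nothing for this namespace. A missing cache is not an error.
    (Hypothetical helper; the real castor job talks to an rsync server.)
    """
    source = central_root / namespace
    if not source.is_dir():
        # Comparable to the "No such file or directory" rsync message in
        # the log: nothing has been saved yet, so the job just continues.
        print(f"no cache for {namespace!r} yet, continuing without it")
        return False
    shutil.copytree(source, dest, dirs_exist_ok=True)
    return True


def save_cache(central_root: Path, namespace: str, src: Path) -> None:
    """Publish the job's cache to the central store for later runs."""
    shutil.copytree(src, central_root / namespace, dirs_exist_ok=True)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        central = Path(tmp) / "central"
        central.mkdir()
        workspace = Path(tmp) / "workspace"
        workspace.mkdir()
        # First run: nothing saved yet, so the restore is skipped quietly.
        restore_cache(central, "some-project/master/some-job", workspace)
        (workspace / "package.tgz").write_text("cached bytes")
        save_cache(central, "some-project/master/some-job", workspace)
```

The namespace here follows the project/branch/job shape visible in the rsync path from the log, but the exact layout is a guess.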
[16:22:26] paladox: the script should be enhanced to produce nicer output for sure [16:22:32] hashar: Ok. Yes [16:23:01] Amir1: ah apparently the code is now up-to-date that is great [16:23:08] Amir1: I have no idea how BetaFeatures really :( [16:23:32] hashar: Could it be because npm packages are installed in parent directory then npm goes into src and does npm install for the dev and then tests it there. [16:23:52] I don't think it's related to beta features, because it works in a similar setup [16:24:05] the API of beta features work as well [16:24:40] so maybe mem cache or varnish (I know it sounds stupid) [16:25:08] legoktm might know [16:26:45] 7Browser-Tests, 10MediaWiki-Vagrant, 10Wikidata: Problem with running wikidata browser tests targeting mediawiki-vagrant with wikidata role - https://phabricator.wikimedia.org/T127972#2060245 (10zeljkofilipin) I think I have found wikidata at http://wikidata.wiki.local.wmftest.net:8080/ {F3412849} {F341285... [16:32:01] 5Continuous-Integration-Scaling, 6Labs, 10Labs-Infrastructure, 7Nodepool, 13Patch-For-Review: Nodepool can't refresh snapshot on labs since ~ Feb 15th - https://phabricator.wikimedia.org/T127755#2060251 (10hashar) I have tried again to snapshot a running instance. On labnodepool1001.eqiad.wmnet as nodepo... [16:32:31] hashar: beta still broken? [16:32:38] Amir1: sounds to me the BetaFeatures preference page is generated based on hooks [16:33:00] Amir1: but maybe it indeed cache the result somehow / somewhere and does not properly invalidate its cache when a new hook is added/changed. [16:33:05] yes, it's a hook that the beta features extension intrduced [16:33:12] Amir1: but really I have no idea whether BetaFeatures as such caching mechanism [16:33:25] ".... has such..." 
[16:33:28] zeljkof: afaik yes [16:33:54] zeljkof: I should just hack the nutcracker file I guess [16:34:01] let me browse the source code [16:34:14] Amir1: and might want to poke authors of betaFeatures [16:34:15] hashar: any estimate on when it will be fixed? [16:34:30] 7Browser-Tests, 10MediaWiki-Vagrant, 10Wikidata: Problem with running wikidata browser tests targeting mediawiki-vagrant with wikidata role - https://phabricator.wikimedia.org/T127972#2060039 (10Addshore) This probably relates to https://phabricator.wikimedia.org/T120572 The rest of Wikibase probably still... [16:34:40] !sal [16:34:40] https://tools.wmflabs.org/sal/releng [16:34:49] it's MarkTraceur :) [16:34:54] zeljkof: till someone fix it ? ;)} [16:35:06] oh, I thought you are working on it :) [16:35:13] nobody is working on it? [16:35:15] bah releng sal is blocked [16:35:17] qa-morebots: poke [16:35:17] I am a logbot running on tools-exec-1221. [16:35:17] Messages are logged to https://tools.wmflabs.org/sal/releng. [16:35:18] To log a message, type !log . [16:35:41] * hashar takes a break [16:35:42] James_F: All you buddy [16:35:57] ? [16:36:30] MarkTraceur: hey, do you have a min to check out what's wrong with a beta feature? [16:36:46] https://phabricator.wikimedia.org/T127888#2058519 [16:37:32] Amir1: No, I'm out today, and James owns it now anyway [16:37:52] James_F: heyyy [16:38:20] MarkTraceur: sure, enjoy you're vacation. Please be just a vacation [16:38:28] 7Browser-Tests, 10MediaWiki-Vagrant, 10Wikidata: Problem with running wikidata browser tests targeting mediawiki-vagrant with wikidata role - https://phabricator.wikimedia.org/T127972#2060266 (10zeljkofilipin) If I change `WIKIDATA_REPO_API` and `WIKIDATA_REPO_URL` to point to wikidata.wiki.local.wmftest.net... [16:38:49] Amir1: Sick day really. Kinda mental health. 
[16:39:57] I understand [16:41:01] PROBLEM - Free space - all mounts on deployment-fluorine is CRITICAL: CRITICAL: deployment-prep.deployment-fluorine.diskspace._srv.byte_percentfree (<33.33%) [16:41:12] MarkTraceur: rest well ! [16:42:22] 5Gerrit-Migration, 10Gitblit-Deprecate, 6Release-Engineering-Team, 3releng-201516-q3, and 4 others: [RfC]: Migrate code review / management to Phabricator from Gerrit - https://phabricator.wikimedia.org/T119908#2060299 (10Jdforrester-WMF) [16:42:49] 6Release-Engineering-Team: releng sal is stall since Feb 20 - https://phabricator.wikimedia.org/T127981#2060313 (10hashar) [16:43:16] !log sal on elastic search is stall https://phabricator.wikimedia.org/T127981 [16:43:19] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master [16:43:45] 6Release-Engineering-Team: releng sal is stall since Feb 20 - https://phabricator.wikimedia.org/T127981#2060330 (10hashar) Works on the wiki though but that is a different bot :D https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [16:45:54] zeljkof: so the task is https://phabricator.wikimedia.org/T127966 [16:46:08] zeljkof: in short nutcracker is a proxy running on the app servers [16:46:16] that in turns connect to memcached servers [16:46:24] hashar: yikes. let me see if I can figure out what's up with stashbot [16:46:42] zeljkof: the nutcracker daemon refuses to start because its yaml configuration is invalid. 
The reason is the redis_codfw pool lacks a list of servers [16:47:04] thanks, will follow the task [16:47:10] zeljkof: which in turn has its root cause in hiera defining the list of servers as being {} (an empty list), but somehow the ERB template that generates the yaml file skips it entirely [16:47:22] :d [16:47:23] :D [16:48:19] zeljkof: there https://phabricator.wikimedia.org/diffusion/OPUP/browse/production/modules/nutcracker/templates/nutcracker.yml.erb [16:48:50] nutcracker is a cool name, did not hear it so far :) [16:49:52] 10Beta-Cluster-Infrastructure, 6Release-Engineering-Team, 6Operations, 13Patch-For-Review, 7Puppet: deployment-tin puppet Error 400 on SERVER: Failed to parse template nutcracker/nutcracker.yml.erb - https://phabricator.wikimedia.org/T127845#2060353 (10hashar) 5Resolved>3Open From T127966 A puppet r... [16:50:08] 10Beta-Cluster-Infrastructure, 6Release-Engineering-Team, 6Operations, 13Patch-For-Review, 7Puppet: deployment-tin puppet Error 400 on SERVER: Failed to parse template nutcracker/nutcracker.yml.erb - https://phabricator.wikimedia.org/T127845#2060368 (10hashar) [16:50:25] 10Beta-Cluster-Infrastructure: deployment-mediawiki02 lost memcached access at 11:51am UTC - https://phabricator.wikimedia.org/T127966#2059794 (10hashar) I have reopened {T127845} and made it a blocker [16:58:58] 10Beta-Cluster-Infrastructure, 6Release-Engineering-Team, 6Operations, 13Patch-For-Review, 7Puppet: deployment-tin puppet Error 400 on SERVER: Failed to parse template nutcracker/nutcracker.yml.erb - https://phabricator.wikimedia.org/T127845#2060445 (10hashar) Looking at the nutcracker erb template on ht... [16:59:32] 6Release-Engineering-Team, 15User-bd808: releng sal is stall since Feb 20 - https://phabricator.wikimedia.org/T127981#2060446 (10bd808) 5Open>3Resolved p:5Triage>3Normal a:3bd808 I restarted @Stashbot (instructions at https://wikitech.wikimedia.org/wiki/Tool:Stashbot) and verified that messages are b... 
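Editor's note on the nutcracker failure hashar explains earlier in this stretch of the log: the daemon refused to start because the generated configuration contained a pool (redis_codfw) with no server list. A check along the following lines, run against the rendered configuration before restarting the service, might catch that class of problem. This is a sketch under assumptions: the configuration is taken to be already parsed into a dictionary of pools, the function name `pools_missing_servers` is invented, and the server and listen values in the example are placeholders rather than the real Beta Cluster settings.

```python
from typing import Any, Dict, List


def pools_missing_servers(config: Dict[str, Dict[str, Any]]) -> List[str]:
    """Return the names of pools that have no usable `servers` list.

    A pool counts as broken when the key is absent, is not a list, or is
    an empty list. (Hypothetical pre-flight check, not part of nutcracker.)
    """
    missing = []
    for name, settings in config.items():
        servers = settings.get("servers")
        if not isinstance(servers, list) or not servers:
            missing.append(name)
    return sorted(missing)


# Placeholder data shaped like the pools mentioned in the log. The
# mc-unix option names come from the pasted nutcracker.yml excerpt;
# the listen and server values are made up for this example.
EXAMPLE_CONFIG: Dict[str, Dict[str, Any]] = {
    "mc-unix": {
        "auto_eject_hosts": True,
        "distribution": "ketama",
        "hash": "md5",
        "listen": "/tmp/example-nutcracker.sock",
        "servers": ["192.0.2.10:11211:1"],
    },
    "redis_codfw": {
        # The template dropped the servers key entirely for this pool.
        "listen": "/tmp/example-redis.sock",
    },
}

if __name__ == "__main__":
    broken = pools_missing_servers(EXAMPLE_CONFIG)
    if broken:
        print("pools without servers:", ", ".join(broken))
```

Treating an empty list the same as a missing key is a deliberate choice here, since the log suggests an empty value in hiera is what led to the key being dropped.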
[16:59:40] 6Release-Engineering-Team, 15User-bd808: releng sal is stall since Feb 20 - https://phabricator.wikimedia.org/T127981#2060452 (10bd808) 5Resolved>3Open [17:01:25] !log https://wmflabs.org/sal/releng missing SAL data since 2016-02-20T20:19 due to bot crash; needs to be backfilled from wikitech data (T127981) [17:01:26] T127981: releng sal is stall since Feb 20 - https://phabricator.wikimedia.org/T127981 [17:01:29] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master [17:03:35] 5Continuous-Integration-Scaling, 6Labs, 10Labs-Infrastructure, 7Nodepool, 13Patch-For-Review: Nodepool can't refresh snapshot on labs since ~ Feb 15th - https://phabricator.wikimedia.org/T127755#2060481 (10Andrew) That patch is scattershot, but I changed the defaults so if there were secret policies prev... [17:03:40] 10Beta-Cluster-Infrastructure, 6Release-Engineering-Team, 6Operations, 13Patch-For-Review, 7Puppet: deployment-tin puppet Error 400 on SERVER: Failed to parse template nutcracker/nutcracker.yml.erb - https://phabricator.wikimedia.org/T127845#2060482 (10hashar) hieradata/labs/deployment-prep/common.yaml d... [17:05:54] bd808: thanks :) [17:10:35] 5Continuous-Integration-Scaling, 6Labs, 10Labs-Infrastructure, 7Nodepool, 13Patch-For-Review: Nodepool can't refresh snapshot on labs since ~ Feb 15th - https://phabricator.wikimedia.org/T127755#2060491 (10hashar) Gave it a try and it works fine now: | status | active And `$ openstack image... [17:11:19] I can't login to beta [17:11:57] Login error [17:11:57] There was an unexpected error logging in. Please try again. If the problem persists, it may be because you have cookies disabled, and you should check that they are enabled in your browser settings. [17:12:45] !log Refreshing nodepool snapshot. 
Been stall since Feb 15th T127755 [17:12:46] T127755: Nodepool can't refresh snapshot on labs since ~ Feb 15th - https://phabricator.wikimedia.org/T127755 [17:12:47] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master [17:16:50] 5Continuous-Integration-Scaling, 6Labs, 10Labs-Infrastructure, 7Nodepool, 13Patch-For-Review: Nodepool can't refresh snapshot on labs since ~ Feb 15th - https://phabricator.wikimedia.org/T127755#2060509 (10hashar) 5Open>3Resolved ``` $ nodepool image-update wmflabs-eqiad ci-jessie-wikimedia 2016-02-2... [17:20:17] !log Deleted Nodepool instances so new ones get to use the new snapshot ci-jessie-wikimedia-1456333979 [17:20:20] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master [17:54:58] 6Release-Engineering-Team, 10DBA, 10MediaWiki-Configuration, 6Operations, and 3 others: codfw is in read only according to mediawiki - https://phabricator.wikimedia.org/T124795#2060685 (10jcrespo) [17:57:40] 6Release-Engineering-Team, 10DBA, 10MediaWiki-Configuration, 6Operations, and 3 others: codfw is in read only according to mediawiki - https://phabricator.wikimedia.org/T124795#2060698 (10jcrespo) I'm ok with leaving the master databases pointing to the original ones but that is 1) more reasons to create a... [17:59:09] 7Browser-Tests, 10MediaWiki-Vagrant, 10Wikidata: Problem with running wikidata browser tests targeting mediawiki-vagrant with wikidata role - https://phabricator.wikimedia.org/T127972#2060701 (10zeljkofilipin) 5Open>3Resolved This is resolved, as far as I am concerned. Some of the tests fail, but most of... 
[18:04:47] Project browsertests-VisualEditor-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce build #12: 04FAILURE in 18 sec: https://integration.wikimedia.org/ci/job/browsertests-VisualEditor-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce/12/ [18:08:31] 7Browser-Tests, 10VisualEditor, 13Patch-For-Review, 5WMF-deploy-2016-03-01_(1.27.0-wmf.15): Disable VisualEditor scenarios that fail at en.wikipedia.beta.wmflabs.org from running daily - https://phabricator.wikimedia.org/T94162#2060741 (10zeljkofilipin) [18:08:35] 7Browser-Tests, 10VisualEditor, 13Patch-For-Review, 5WMF-deploy-2016-03-01_(1.27.0-wmf.15): Fix `ve.init is undefined` and `ve.init.target is undefined` error messages in VisualEditor browser tests - https://phabricator.wikimedia.org/T126966#2060740 (10zeljkofilipin) 5Open>3Resolved [18:11:55] ostriches: can i please get a repo in wmf's git server ? [18:12:15] Name? [18:12:27] it will be used for steward tools, scripts and bots [18:12:37] so: stewardbots [18:13:02] Is it gonna be a toollabs thing? So like labs/tools/stewardbots maybe? [18:13:04] Or similar? [18:13:12] yes [18:13:50] all the code exists already, just migrating from old svn server [18:14:28] will it mirror to github too ? [18:17:16] RECOVERY - Puppet failure on mira is OK: OK: Less than 1.00% above the threshold [0.0] [18:24:41] (03PS4) 10Paladox: Add new test mediawiki-core-parallel-lint to mediawiki/core [integration/config] - 10https://gerrit.wikimedia.org/r/267540 [18:24:48] erm [18:24:51] I can't login to beta. [18:24:57] " There was an unexpected error logging in. Please try again. If the problem persists, it may be because you have cookies disabled, and you should check that they are enabled in your browser settings. " [18:25:01] my cookies are fine, etc. [18:25:12] bd808: ^ anything known atm? 
[18:25:37] legoktm: not that I've heard of [18:25:42] * bd808 looks at logstash [18:26:35] looks like some nutcracker process may be busted [18:26:45] (03CR) 10jenkins-bot: [V: 04-1] Add new test mediawiki-core-parallel-lint to mediawiki/core [integration/config] - 10https://gerrit.wikimedia.org/r/267540 (owner: 10Paladox) [18:27:46] !log nutcracker dead on mediawiki01; investigating [18:27:49] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master [18:28:28] thanks [18:28:35] Amir1: I'll look once I can login :P [18:29:09] it's not starting.... [18:30:24] !log "configuration file '/etc/nutcracker/nutcracker.yml' syntax is invalid" [18:30:27] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master [18:33:19] bd808: I know that has was working on this https://phabricator.wikimedia.org/T127966 [18:34:14] thcipriani: ah thanks [18:34:21] prod kills beta cluster again [18:34:57] * bd808 looks to see if the cross-dc bits can be shutoff [18:36:55] 10Beta-Cluster-Infrastructure, 6Release-Engineering-Team, 6Operations, 13Patch-For-Review, 7Puppet: deployment-tin puppet Error 400 on SERVER: Failed to parse template nutcracker/nutcracker.yml.erb - https://phabricator.wikimedia.org/T127845#2060807 (10bd808) p:5Normal>3Unbreak! Nobody can login in b... [18:41:28] thcipriani, legoktm: I poked _joe_ about fixing nutcracker [18:43:49] :/ thanks [18:43:50] :)))) [18:43:59] same here [18:45:08] 7Blocked-on-RelEng, 10Continuous-Integration-Config, 10Parsoid, 7Jenkins: Recheck runs parsoidsvc-php-parsertests on last revision project wide, not patchset - https://phabricator.wikimedia.org/T111259#2060842 (10thcipriani) [18:45:29] 7Blocked-on-RelEng, 10Continuous-Integration-Config, 10Parsoid, 7Jenkins: Recheck runs parsoidsvc-php-parsertests on last revision project wide, not patchset - https://phabricator.wikimedia.org/T111259#2060845 (10ssastry) This happened again yesterday. 
https://github.com/wikimedia/parsoid/commit/e701d9cb3... [18:48:08] legoktm: fixed! [18:50:07] ty :D [18:50:10] 10Beta-Cluster-Infrastructure, 6Release-Engineering-Team, 6Operations, 13Patch-For-Review, 7Puppet: deployment-tin puppet Error 400 on SERVER: Failed to parse template nutcracker/nutcracker.yml.erb - https://phabricator.wikimedia.org/T127845#2060866 (10bd808) p:5Unbreak!>3Normal @Joe put in a quick f... [18:50:44] 10Beta-Cluster-Infrastructure: Cannot login on Commons Beta, complains about disabled cookies - https://phabricator.wikimedia.org/T127964#2060872 (10bd808) [18:50:48] 10Beta-Cluster-Infrastructure: deployment-mediawiki02 lost memcached access at 11:51am UTC - https://phabricator.wikimedia.org/T127966#2060870 (10bd808) 5Open>3Resolved Fixed with https://gerrit.wikimedia.org/r/273016 [18:51:00] 10Beta-Cluster-Infrastructure: Cannot login on Commons Beta, complains about disabled cookies - https://phabricator.wikimedia.org/T127964#2059755 (10bd808) 5Open>3Resolved Fixed with https://gerrit.wikimedia.org/r/273016 [18:51:59] legoktm: interestingly if you open source of page and search for ORES it comes up in RL [18:52:10] yep, I'm looking now [18:52:19] awesome [18:55:56] Amir1: oh, I know why. [18:56:17] \o/ [19:05:56] legoktm: what's happening? 
[19:06:29] I'm waiting ~5 minutes for https://gerrit.wikimedia.org/r/273017 to get deployed [19:07:02] yay [19:07:24] so it was my fault all the time [19:07:29] * Amir1 goes to hide [19:07:49] oh it wasn't my fault [19:07:54] * Amir1 comes back [19:08:07] nono, it was just a mistake in some cleanup [19:09:04] Amir1: http://en.wikipedia.beta.wmflabs.org/wiki/Special:Preferences#mw-prefsection-betafeatures and there we go :) [19:09:42] yay [19:09:43] yay [19:09:56] YEEEEEEEESSSSSSSSSS [19:10:40] 10Beta-Cluster-Infrastructure, 10MediaWiki-extensions-ORES, 6Revision-Scoring-As-A-Service, 13Patch-For-Review: Deploy ORES extension to beta cluster - https://phabricator.wikimedia.org/T127661#2060987 (10Legoktm) Aaaand then it was {de7c31eddcdf2f8b6dd6bb3178c85a2fc7294557}. [19:10:41] :D [19:13:20] Amir1: if everything looks good on beta, close the task? ^ [19:13:56] legoktm: I enabled it but it is not showing anything in RC [19:14:02] hmm. [19:14:09] maybe it's a bug in the extension itself [19:14:28] oh you know, I didn't run the maint script. [19:15:07] oh, please do [19:15:15] and also run another script that adds some data [19:15:37] first CheckModelVersions.php [19:15:47] and then PopulateDatabase.php [19:15:49] legoktm: ^ [19:16:59] Amir1: [06dd4aaa] [no req] RuntimeException from line 46 of /mnt/srv/mediawiki-staging/php-master/extensions/ORES/includes/Api.php: Failed to make ORES request to [https://ores.wmflabs.org/scores/testwiki/? 
[19:16:59] models=damaging%7Cgoodfaith%7Creverted%7Cwp10%7Cdamaging&revids=321532%7C321434%7C321432%7C321430%7C321429%7C321426%7C321425%7C321424%7C321423%7C321422%7C321421%7C321420%7C321417%7C321416%7C321415%7C321414%7C321413%7C321399%7C321268%7C321264%7C321255%7C321254%7C321253%7C321252%7C321250%7C321249%7C321248%7C321245%7C321244%7C321243%7C321242%7C321241%7C321238%7C321237%7C321236%7C321235%7C321234%7C321233%7C321232%7C321231%7C321228%7C321227%7C32122 [19:16:59] 6%7C321225%7C321224%7C321223%7C321222%7C321221%7C321219%7C321217], There was a problem during the HTTP request: 400 BAD REQUEST [19:17:34] "message": "Models '['wp10', 'goodfaith']' not available for testwiki." [19:17:36] hrm. [19:17:46] I removed those [19:18:41] legoktm: https://gerrit.wikimedia.org/r/#/c/272466/ [19:19:05] https://gerrit.wikimedia.org/r/#/c/272466/5/wmf-config/InitialiseSettings-labs.php,cm [19:19:11] oh, eh, that doesn't work [19:19:23] * legoktm works on a patch [19:19:37] why? [19:20:08] instead of replacing, it actually adds to the default [19:24:47] 10Beta-Cluster-Infrastructure: Internal error when assigning user rights on Commons Beta - https://phabricator.wikimedia.org/T128006#2061064 (10JeanFred) [19:33:39] legoktm: ok, all patches are merged now for the extension. I think we need to change to wmf-config files too [19:35:34] * legoktm waits for jenkins [19:36:48] 10Continuous-Integration-Infrastructure, 7Technical-Debt: Remove manual $PHP_BIN handling in slave-scripts - https://phabricator.wikimedia.org/T128008#2061127 (10Legoktm) [19:45:20] Amir1: ok, now we wait 5-10 minutes [19:45:27] and I'll try running the script again [19:45:47] cool [19:54:01] !log legoktm@deployment-tin:~$ mwscript extensions/ORES/maintenance/PopulateDatabase.php --wiki=enwiki [19:54:44] Amir1: script finished... http://en.wikipedia.beta.wmflabs.org/wiki/Special:RecentChanges looks the same though? 
[19:55:11] hm, maybe because they're all flow stuff [19:55:44] Amir1: wooo, I see a few on http://en.wikipedia.beta.wmflabs.org/w/index.php?title=Special:RecentChanges&limit=500 [19:56:05] awesome [19:56:13] I just saw one of them in 250 [19:56:25] (we can use bot to make edits ;) [19:56:40] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master [19:56:45] * legoktm pets qa-morebots [19:56:52] Amir1: ok, so everything good now? :) [19:57:04] yeaaaaaah [19:57:16] I do some edits [19:57:20] here and there [19:57:51] :D [19:59:22] 10Beta-Cluster-Infrastructure, 10MediaWiki-extensions-ORES, 6Revision-Scoring-As-A-Service, 13Patch-For-Review: Deploy ORES extension to beta cluster - https://phabricator.wikimedia.org/T127661#2061245 (10Legoktm) 5Open>3Resolved Finally configuration fixed up with {3d71edef140471a0c0339fd109c551e5a7c0... [20:07:47] * legoktm goes away for food [20:18:27] 10Beta-Cluster-Infrastructure, 10MediaWiki-extensions-ORES, 6Revision-Scoring-As-A-Service, 13Patch-For-Review: Deploy ORES extension to beta cluster - https://phabricator.wikimedia.org/T127661#2061291 (10hashar) That is really evil! Well done @legoktm. [20:19:30] 10Beta-Cluster-Infrastructure, 10MediaWiki-extensions-ORES, 6Revision-Scoring-As-A-Service, 13Patch-For-Review: Deploy ORES extension to beta cluster - https://phabricator.wikimedia.org/T127661#2061294 (10hashar) @Halfak @Ladsgroup I guess we want another task to get the ORES service deployed to beta as we... [20:38:13] ostriches: was it created? will it mirror to github ? [20:40:41] I got distracted, one moment. [20:42:46] matanya: Created labs/tools/stewardbots, replicating to https://github.com/wikimedia/labs-tools-stewardbots [20:42:53] Got a group I can give ownership to? [20:43:05] not yet [20:43:15] you can create stewards [20:43:25] and grant me rights to add/remove users [20:45:54] Group's called stewardbots, but yeah, done. 
[20:46:27] https://gerrit.wikimedia.org/r/#/admin/projects/labs/tools/stewardbots,access & https://gerrit.wikimedia.org/r/#/admin/groups/1152,members [20:48:21] thank you so much ostriches [20:48:47] Yw [20:57:27] 10Continuous-Integration-Infrastructure, 6Operations, 10puppet-compiler: puppet compiler wrongly indicates errors when dealing with subrepositories - https://phabricator.wikimedia.org/T118406#2061468 (10hashar) [20:58:19] 10Continuous-Integration-Infrastructure, 10puppet-compiler: puppet compiler broken - https://phabricator.wikimedia.org/T94631#2061473 (10hashar) [20:58:38] 10Continuous-Integration-Infrastructure, 10puppet-compiler: puppet-compiler should not link to not existing change.*.pson file - https://phabricator.wikimedia.org/T126796#2061475 (10hashar) [20:59:46] 10Continuous-Integration-Infrastructure, 6Labs, 6Operations, 10puppet-compiler, 7Puppet: compiler02.puppet3-diffs.eqiad.wmflabs out of disk space - https://phabricator.wikimedia.org/T122346#2061479 (10hashar) [21:04:58] 10Beta-Cluster-Infrastructure, 6Release-Engineering-Team, 6Operations, 13Patch-For-Review, 7Puppet: deployment-tin puppet Error 400 on SERVER: Failed to parse template nutcracker/nutcracker.yml.erb - https://phabricator.wikimedia.org/T127845#2061502 (10hashar) Thank you @Joe , I was already fighting vari... [21:12:57] ostriches: Good news for you: If everything work as it is planned, we have the possibilty of optional callassigns at 03.03.2016 ;) [21:13:08] Details here: https://phabricator.wikimedia.org/T126797 [21:13:58] and we got the **very** useful feature for workboards... background colors for them... [21:14:21] 10Browser-Tests-Infrastructure, 10MediaWiki-extensions-MultimediaViewer, 13Patch-For-Review, 5WMF-deploy-2016-03-01_(1.27.0-wmf.15): undefined method `test_name' for # (NoMethodError) - https://phabricator.wikimedia.org/T125072#2061560 (10hashar) Manually tr... 
[21:20:25] 3Scap3, 10scap, 7WorkType-NewFunctionality: [Spike] Benchmark built-in HTTP server options for scap3 fanout - https://phabricator.wikimedia.org/T127733#2061582 (10mmodell) [21:23:15] 3Scap3, 10scap, 7WorkType-NewFunctionality: [Spike] Benchmark built-in HTTP server options for scap3 fanout - https://phabricator.wikimedia.org/T127733#2061605 (10mmodell) [21:29:55] Yippee, build fixed! [21:29:56] Project browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-os_x_10.9-chrome-sauce build #359: 09FIXED in 23 min: https://integration.wikimedia.org/ci/job/browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-os_x_10.9-chrome-sauce/359/ [21:30:59] Yippee, build fixed! [21:31:00] Project browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-os_x_10.9-safari-sauce build #889: 09FIXED in 24 min: https://integration.wikimedia.org/ci/job/browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-os_x_10.9-safari-sauce/889/ [21:32:38] zeljkof: MultimediaViewer fixed ^^^ [21:35:35] 10Continuous-Integration-Infrastructure, 6Operations, 10Traffic, 13Patch-For-Review: https://integration.wikimedia.org/ci/api/json is corrupted when required more than one time in a raw - https://phabricator.wikimedia.org/T127294#2061679 (10hashar) Verified on Nodepool, requests to Jenkins are all fine: ``... [21:39:41] 21:32:00 Finished sync-masters (duration: 04m 31s) [21:39:41] 21:39:08 Finished sync-proxies (duration: 07m 07s) [21:39:49] thcipriani: ouch ^ [21:40:14] blerg. in beta? [21:40:18] prod [21:40:19] on tin [21:40:21] no joke [21:41:00] Remember I pushed the actual wmf code last week...the delta shouldn't have been quite so big.... [21:41:15] yeah, that's...strange [21:41:37] so there's code in the latest package to fix unnecessary cdb rebuilding in sync-masters that hasn't deployed yet... [21:41:46] should help a bit [21:42:11] https://phabricator.wikimedia.org/D132 [21:43:39] Yippee, build fixed! 
[21:43:40] Project browsertests-Echo-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce build #803: 09FIXED in 1 min 28 sec: https://integration.wikimedia.org/ci/job/browsertests-Echo-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce/803/ [21:45:23] Yippee, build fixed! [21:45:24] Project browsertests-CentralAuth-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce build #402: 09FIXED in 1 min 54 sec: https://integration.wikimedia.org/ci/job/browsertests-CentralAuth-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce/402/ [21:57:44] 7Blocked-on-RelEng, 10Continuous-Integration-Config, 10Parsoid, 7Jenkins: Recheck runs parsoidsvc-php-parsertests on last revision project wide, not patchset - https://phabricator.wikimedia.org/T111259#2061721 (10hashar) I have marked the build [[ https://integration.wikimedia.org/ci/job/parsoidsvc-php-par... [22:05:14] ostriches: phabrictor gives 404 when i look for the repo, is that intensional ? [22:05:31] Depends on the repo? [22:05:34] Not all repos be there. [22:06:01] ah, mind adding it ? [22:06:15] plus i would like to know how i can merge from gerrit's UI [22:09:08] 7Blocked-on-RelEng, 10Continuous-Integration-Config, 10Parsoid, 7Jenkins: Recheck runs parsoidsvc-php-parsertests on last revision project wide, not patchset - https://phabricator.wikimedia.org/T111259#2061738 (10hashar) If the regression occurred in mediawiki/services/parsoid e701d9cb3427942185e9bcf670c9a... [22:12:13] matanya: There's a backlog of importing them to Phab. Long story. [22:12:46] thanks ostriches what about my other question ? [22:13:29] Same way you would for any repo? Push a patch for review with `git review` or refs/for/* [22:13:32] Then +2 and merge? [22:13:49] i +2ed it [22:13:57] but don't see any merge button [22:14:01] https://gerrit.wikimedia.org/r/#/c/273116/ [22:14:44] +2 and verified? [22:14:57] yes [22:15:23] maybe i am missing rights? [22:15:24] hashar: Im going to start the next stage of migrating the jslint tests to jshint and jsonlint. 
[22:18:53] 7Blocked-on-RelEng, 10Continuous-Integration-Config, 10Parsoid, 7Jenkins: Recheck runs parsoidsvc-php-parsertests on last revision project wide, not patchset - https://phabricator.wikimedia.org/T111259#2061748 (10hashar) The failing build in P2668 has: ``` Running parser tests from: /mnt/jenkins-workspace/... [22:19:53] matanya: Explicitly granted you submit [22:19:59] It should have inherited. [22:20:11] RECOVERY - Puppet failure on deployment-zotero01 is OK: OK: Less than 1.00% above the threshold [0.0] [22:21:13] I must say Special:RecentChanges in beta is stalled [22:21:24] I made several edits but they didn't show up in RC [22:21:52] http://en.wikipedia.beta.wmflabs.org/w/index.php?title=History_Test_Page&action=history compare it against RC [22:21:58] hashar: ^ [22:22:22] ostriches: yes, that worked, thanks [22:25:26] legoktm: ^ [22:40:10] 7Blocked-on-RelEng, 10Continuous-Integration-Config, 10Parsoid, 7Jenkins, 7WorkType-Maintenance: Recheck runs parsoidsvc-php-parsertests on last revision project wide, not patchset - https://phabricator.wikimedia.org/T111259#2061794 (10hashar) 5Open>3Resolved a:3MaxSem I did a grep on all the build... [22:40:33] thcipriani: hello again [22:40:47] thcipriani: from SoS we had Parsoid enquiring about tests breakage ( https://phabricator.wikimedia.org/T111259 ) [22:40:53] yup. [22:40:56] thcipriani: the job actually always passed [22:41:12] until MaxSem fixed it in mediawiki [22:41:24] which as a side effect caused the job to fail as expecting [22:41:34] anyway long story short, fixed -:} [22:41:44] and I have a few page longs of trailing audit log to backup my claim it is fixed! [22:41:58] heh, this ticket is quite a bit longer since SoS :) [22:42:13] awesome! Thank you! [22:53:57] 10Continuous-Integration-Infrastructure, 10Parsoid: Intermittent jenkins failures with tar and disk related errors. 
- https://phabricator.wikimedia.org/T128032#2061848 (10ssastry) [22:57:50] (03PS1) 10Paladox: Replace jslint test with jshint and jsonlint [integration/config] - 10https://gerrit.wikimedia.org/r/273128 (https://phabricator.wikimedia.org/T127362) [22:58:18] 10Continuous-Integration-Infrastructure, 10Parsoid: Intermittent jenkins failures with tar and disk related errors. - https://phabricator.wikimedia.org/T128032#2061877 (10hashar) ``` Running npm install + rm -rf node_modules + npm install WARN engine hawk@3.1.0: wanted: {"node":">=0.10.32"} (current: {"node":"... [22:59:32] (03PS1) 10Paladox: [SimpleSurvey] Archive repo [integration/config] - 10https://gerrit.wikimedia.org/r/273129 (https://phabricator.wikimedia.org/T127362) [22:59:50] hashar: Can I bribe you with beer / etc. to make https://phabricator.wikimedia.org/T119143 happen faster? :-) [23:01:45] 10Continuous-Integration-Infrastructure, 10Parsoid: Intermittent jenkins failures with tar and disk related errors. - https://phabricator.wikimedia.org/T128032#2061885 (10hashar) Builds matching: ``` hashar@gallium:/var/lib/jenkins/jobs/parsoidsvc-source-parse-tool-check$ grep -o 'insufficient space on your sy... [23:02:39] (03PS1) 10Paladox: [PollNY] Add dependance on extension SocialProfile [integration/config] - 10https://gerrit.wikimedia.org/r/273130 [23:03:16] James_F: bribe me with a warm bed instead :-} [23:03:29] James_F: that one is more or less going on [23:03:55] James_F: been distracted with various maintenance tasks. At least some of the services have been switched. [23:04:38] James_F: will get mediawiki switched so we can use the git clone cache [23:04:49] unless prod magically explode again [23:05:09] hashar: I can buy you a hotel room. :-) [23:05:17] hehe [23:05:21] s/buy/rent/ [23:05:41] so yeah in short migrating npm jobs to nodepool is the prio for new feature work [23:05:53] but then 80% of my load is maintenance work ... 
[23:06:20] good news is that it works just fine for mediawiki/services/* repos [23:06:54] Yippee, build fixed! [23:06:54] Project browsertests-Gather-en.m.wikipedia.beta.wmflabs.org-linux-chrome-sauce build #430: 09FIXED in 9 min 53 sec: https://integration.wikimedia.org/ci/job/browsertests-Gather-en.m.wikipedia.beta.wmflabs.org-linux-chrome-sauce/430/ [23:07:37] James_F: sleeping. thank you for the reminder! feel free to ask about status update on irc/phabricator as needed! [23:07:56] :-) [23:08:01] hashar: OOUI npm job migration would be a nice first step. [23:08:07] noted [23:08:13] * James_F coughs. [23:08:13] Sleep! [23:08:34] oh I will just migrate in priority whatever is James related I guess ;D [23:08:48] if mw works fine, I will probably mass migrate everything [23:27:10] (03PS1) 10Paladox: Migrate OOJS repos to npm-node-4.3 [integration/config] - 10https://gerrit.wikimedia.org/r/273135 (https://phabricator.wikimedia.org/T119143) [23:28:10] James_F: https://gerrit.wikimedia.org/r/273135 [23:28:49] I saw. :-) [23:29:05] * James_F will have a poke when he's out of meetings. [23:29:14] Thanks, paladox. [23:29:37] James_F: Your welcome. :) [23:30:52] James_F: Im not sure about the other npm tests that those repos uses since those wont be using nodejs 4.3 they will be using 0.10. But i doint expect them to cause any problems [23:33:51] paladox: I run the OOjs and OOjs UI publication steps on node 4.3.x+; they work fine there. :-) [23:34:44] James_F: Ok thanks. [23:34:58] (03PS1) 10Paladox: Add npm-node-4.3 test to experimental: in a few templates [integration/config] - 10https://gerrit.wikimedia.org/r/273136 [23:37:04] (03PS2) 10Paladox: Add npm-node-4.3 test to experimental: in a few templates [integration/config] - 10https://gerrit.wikimedia.org/r/273136 [23:38:39] (03CR) 10Paladox: "@Hashar @Jforrester tested the other tests that are npm that are not on nodejs 4.3 on nodejs 4.3 and works. So this can be merged please." 
[integration/config] - https://gerrit.wikimedia.org/r/273135 (https://phabricator.wikimedia.org/T119143) (owner: Paladox)