[03:42:01] Project browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-windows_8.1-internet_explorer-11-sauce build #80: FAILURE in 22 min: https://integration.wikimedia.org/ci/job/browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-windows_8.1-internet_explorer-11-sauce/80/
[03:56:19] Project browsertests-Core-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce build #243: FAILURE in 11 min: https://integration.wikimedia.org/ci/job/browsertests-Core-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce/243/
[04:24:05] Yippee, build fixed!
[04:24:07] Project browsertests-Echo-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce build #133: FIXED in 9 min 35 sec: https://integration.wikimedia.org/ci/job/browsertests-Echo-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce/133/
[05:36:23] PROBLEM - BetaLabs: Puppet freshness check on labmon1001 is CRITICAL: CRITICAL: deployment-prep.deployment-logstash1.puppetagent.time_since_last_run.value (10.00%)
[07:04:27] Project browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-windows_7-internet_explorer-9-sauce build #80: FAILURE in 25 min: https://integration.wikimedia.org/ci/job/browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-windows_7-internet_explorer-9-sauce/80/
[07:11:11] Project browsertests-Flow-test2.wikipedia.org-linux-chrome-sauce build #225: FAILURE in 34 min: https://integration.wikimedia.org/ci/job/browsertests-Flow-test2.wikipedia.org-linux-chrome-sauce/225/
[07:39:53] Wikimedia / Quality Assurance: Activate ChuckNorris Plugin after Jenkins builds - https://bugzilla.wikimedia.org/63305#c7 (Željko Filipin) ...as if millions of jobs suddenly cried out in terror, and were suddenly silenced...
[07:40:09] zeljkof: so I am not really there yet :D
[07:40:54] zeljkof: had an horrible night with kids waking up constantly. Zombie breakfasting :D
[07:41:51] hashar: good morning :)
[07:42:27] + I am not wearing a proper pant which is against the rule !
[07:42:44] my block has no electricity 8am-noon, regular maintenance, but I have found that out at 8am when the lights shut down
[07:43:15] now on chromebook, using my phone to get to the intertubes
[07:43:33] this will be interesting day :)
[07:46:21] until noon
[07:46:42] I guess whenever you have power back and that I am fully ready, we can look at Rubocop :D
[07:46:49] I like your idea of moving Gemfiles at the root of the repo
[07:47:24] also wondering whether we can add default actions to bundle
[07:47:31] i.e. if someone runs "bundle"
[07:47:35] it would:
[07:47:37] bundle install
[07:47:40] bundle exec rubocop
[07:47:44] bundle exec cucumber
[07:55:18] * hashar listens to Dark Side of the Moon with kids
[07:55:22] bbl
[07:55:42] Project browsertests-Math-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce build #184: FAILURE in 44 sec: https://integration.wikimedia.org/ci/job/browsertests-Math-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce/184/
[08:03:43] Project browsertests-UniversalLanguageSelector-commons.wikimedia.beta.wmflabs.org-linux-firefox-sauce build #233: FAILURE in 20 min: https://integration.wikimedia.org/ci/job/browsertests-UniversalLanguageSelector-commons.wikimedia.beta.wmflabs.org-linux-firefox-sauce/233/
[08:07:43] PROBLEM - BetaLabs: Puppet freshness check on labmon1001 is CRITICAL: CRITICAL: deployment-prep.deployment-logstash1.puppetagent.time_since_last_run.value (100.00%)
[08:17:20] hashar: I will try to get a few more repos ready while on chromebook, not sure how that will go :)
[08:18:38] (PS1) Hashar: Debian packaging for apertium/apy [integration/config] - https://gerrit.wikimedia.org/r/167753
[08:19:21] (CR) Hashar: [C: 2] Debian packaging for apertium/apy [integration/config] - https://gerrit.wikimedia.org/r/167753 (owner: Hashar)
[08:20:28] (CR) Hashar: "Alexandros, Kartik: that will attempt to craft the Debian package whenever a patch is proposed. The Jenkins job is: https://integration.w" [integration/config] - https://gerrit.wikimedia.org/r/167753 (owner: Hashar)
[08:21:08] (CR) jenkins-bot: [V: -1] Debian packaging for apertium/apy [integration/config] - https://gerrit.wikimedia.org/r/167753 (owner: Hashar)
[08:23:40] (CR) KartikMistry: "Thanks! We've bunch of other packages under operations/debs/contenttranslation/..." [integration/config] - https://gerrit.wikimedia.org/r/167753 (owner: Hashar)
[08:23:43] (Merged) jenkins-bot: Debian packaging for apertium/apy [integration/config] - https://gerrit.wikimedia.org/r/167753 (owner: Hashar)
[08:24:59] hashar: feel free to create jenkins-debian-glue jobs for all packages.
[08:41:34] (CR) Hashar: "Deployed" [integration/config] - https://gerrit.wikimedia.org/r/167753 (owner: Hashar)
[08:43:43] kart_: you can add them based on the example above :-d
[08:47:54] :D
[09:08:05] PROBLEM - BetaLabs: Puppet freshness check on labmon1001 is CRITICAL: CRITICAL: deployment-prep.deployment-logstash1.puppetagent.time_since_last_run.value (100.00%)
[09:25:53] Wikimedia Labs / deployment-prep (beta): no log in deployment-bastion:/data/project/logs from "503 server unavailable" on beta labs - https://bugzilla.wikimedia.org/72275#c1 (Antoine "hashar" Musso (WMF)) Well operations/mediawiki-config has: $ mwscript eval.php --wiki=enwiki > print $wmfUdp2logDe...
[09:28:08] !log deployment-logstash1 disk full
[09:28:10] Logged the message, Master
[09:29:00] !log forget me deployment-logstash1 has a puppet agent error but it is simply because the agent is disabled "'debugging logstash config'"
[09:29:02] Logged the message, Master
[10:07:27] PROBLEM - BetaLabs: Puppet freshness check on labmon1001 is CRITICAL: CRITICAL: deployment-prep.deployment-logstash1.puppetagent.time_since_last_run.value (100.00%)
[10:11:24] hashar_: the power is back :)
[10:11:38] when ever you are ready to talk rubocop, I am ready
[10:20:23] zeljkof: ready in a few minutes :D
[10:28:20] zeljkof: ready :D
[10:29:48] zeljkof: so as I understand it you would update the jenkins jobs to use the Gemfile at the root of the repository
[10:29:56] thus having to craft a patch for each repository having browser tests
[10:29:59] sounds fine to me
[10:30:15] hashar: yes, that is the plan
[10:30:29] hopefully the owner of the other repositories will not mind :-D
[10:30:48] hashar: I guess nobody would really care where the gemfile is located
[10:30:58] the most people do not even know what it is
[10:31:25] maybe the `cd {folder}/browser/ ` can be removed entirely?
[10:31:26] do you think some repos will mind?
[10:31:44] hashar: let me check
[10:32:05] though the exec cucumber will need to be adjusted to look for /feature/ dir in {folder}/browser/
[10:32:18] hashar: yes, let's not go there for now
[10:32:20] (potentially, I have no idea how cucumber find the files)
[10:32:36] cucumber by default looks for features folder in the root of the repo
[10:32:46] that is why we cd into tests/browser
[10:33:20] can we pass the dir as an argument to cucumber?
[10:34:22] I think so
[10:34:28] this should work
[10:34:37] bundle exec cucumber tests/browser
[10:34:38] or
[10:34:47] bundle exec cucumber tests/browser/features
[10:35:03] but I would make that a separate commit
[10:35:27] I do not want to make a lot of changes this time
[10:35:36] we already move a lot of files around
[10:35:47] I mean, a couple of files, but in 20 or so repos
[10:36:40] another way
[10:36:48] would be to have the Jenkins job shell snippet to support both layout
[10:36:51] as a transition
[10:37:03] so the job would first look whether there is a Gemfile in tests/browser and use that
[10:37:12] else, it fallback to use the one at the root of the repo
[10:37:20] that let you deploy the job now
[10:37:25] and migrate the repositories smoothly
[10:37:29] no
[10:37:36] I plan to make all the patches today
[10:37:42] once all have been migrated, remove the code path that cd in tests/browser/ :D
[10:37:46] so one shot? :-D
[10:37:51] deploy the job today or tomorrow
[10:38:11] and then it is up to repos (or us) to merge the jobs that start failing
[10:38:32] I do not want to add more code to the macro to support old layout that should go away in days
[10:38:49] the migration should be quick
[10:38:58] in this case, the changes are minimal
[10:40:51] some repos have ruby/selenium but have no jenkins jobs
[10:40:52] https://gerrit.wikimedia.org/r/#/q/topic:bug/69245,n,z
[10:40:58] (the ones without WIP)
[10:41:14] so we could merge them even before jjb is updated
[10:44:28] sounds good
[10:44:33] lets do it right now ? :-D
[10:46:35] hashar: go ahead :)
[10:46:54] I am working on repos in alpha order, now on echo :)
[10:47:01] https://github.com/wikimedia/mediawiki-selenium#repositories
[10:48:00] and the next step is to move rubocop back from experimental :)
[10:48:31] !log beta: signing puppet cert for deployment-elastic{06,07}. On deployment-salt ran: puppet ca sign i-000006b6.eqiad.wmflabs; puppet ca sign i-000006b7.eqiad.wmflabs
[10:48:33] Logged the message, Master
[10:48:42] !log rerunning puppet manually on deployment-elastic{06,07}
[10:48:44] Logged the message, Master
[10:53:10] zeljkof: can you refresh the jenkins jobs and +2 the integration/config change please ? :-]
[10:53:15] can +1 it if you want :]
[10:53:24] hashar: please do
[10:53:34] I will then +2 and update the jobs
[10:53:47] (please do +1)
[10:55:51] !log deleted salt master key on deployment-elastic{06,07}, restarted salt-minion and reran puppet. It is now passing on both instances \O/
[10:55:53] Logged the message, Master
[10:56:11] (PS6) Hashar: Run "bundle install" before cd-ing into browser folder [integration/config] - https://gerrit.wikimedia.org/r/167566 (https://bugzilla.wikimedia.org/69245) (owner: Zfilipin)
[10:56:14] (PS7) Hashar: Run "bundle install" before cd-ing into browser folder [integration/config] - https://gerrit.wikimedia.org/r/167566 (https://bugzilla.wikimedia.org/69245) (owner: Zfilipin)
[10:56:48] (PS8) Hashar: Run "bundle install" before cd-ing into browser folder [integration/config] - https://gerrit.wikimedia.org/r/167566 (https://bugzilla.wikimedia.org/69245) (owner: Zfilipin)
[10:57:35] (CR) Hashar: [C: 1] "PS6: removed WIP from commit summary message" [integration/config] - https://gerrit.wikimedia.org/r/167566 (https://bugzilla.wikimedia.org/69245) (owner: Zfilipin)
[10:57:38] zeljkof: done :]
[10:58:00] hashar: thanks, will push the job in a minute, just to finish echo repo
[10:58:30] \O/
[11:02:31] (CR) Zfilipin: [C: 2] Run "bundle install" before cd-ing into browser folder [integration/config] - https://gerrit.wikimedia.org/r/167566 (https://bugzilla.wikimedia.org/69245) (owner: Zfilipin)
[11:06:09] (Merged) jenkins-bot: Run "bundle install" before cd-ing into browser folder [integration/config] - https://gerrit.wikimedia.org/r/167566 (https://bugzilla.wikimedia.org/69245) (owner: Zfilipin)
[11:06:50] hashar: updating browsertests* jobs...
[11:06:51] \O/
[11:07:34] I will give repo maintainers a day or two to merge the required changes
[11:07:38] PROBLEM - BetaLabs: Puppet freshness check on labmon1001 is CRITICAL: CRITICAL: deployment-prep.deployment-logstash1.puppetagent.time_since_last_run.value (100.00%)
[11:07:50] if they do not do it, I will merge required changes
[11:08:09] but failing tests should make it more interesting to merge the changes :)
[11:10:57] zeljkof: you probably want to merge without asking
[11:11:03] cause the Jenkins job will fail
[11:11:13] hashar: probably
[11:11:13] which would make developers / qa angry :]
[11:11:30] have to go, lunch
[11:11:34] will merge after lunch
[11:11:37] * zeljkof is out for lunch
[12:07:15] PROBLEM - BetaLabs: Puppet freshness check on labmon1001 is CRITICAL: CRITICAL: deployment-prep.deployment-logstash1.puppetagent.time_since_last_run.value (100.00%)
[12:15:45] * zeljkof is back
[12:17:05] zeljkof: back as well
[12:17:08] though still lunching :D
[12:41:22] zeljkof: should you have rubocop catch all problem and fails ?
[12:41:32] then have the related job flagged as non voting?
[12:41:50] hashar: the first step is to catch new problems
[12:42:00] so it does not overwhelm people
[12:42:02] though it seems you are going to make it voting and let the developers clean out the todo list
[12:42:05] which is fine :D
[12:42:11] and old problems can be fixed one by one
[12:42:17] agree
[12:42:23] so it will be voting by default right? :D
[12:42:32] yes, I would prefer that
[12:42:40] when voting, people will notice it
[12:42:55] and we can clean up old problems with time
[12:51:06] +1 :-D
[12:51:18] /make-it-happen
[13:07:53] PROBLEM - BetaLabs: Puppet freshness check on labmon1001 is CRITICAL: CRITICAL: deployment-prep.deployment-logstash1.puppetagent.time_since_last_run.value (100.00%)
[13:25:43] zeljkof: need any help to finish the rubocop stuff ?
[13:25:46] such as reviews maybe
[13:26:04] hashar: feel free to review any commit here
[13:26:12] https://gerrit.wikimedia.org/r/#/q/topic:bug/69245,n,z
[13:26:31] zeljkof: have you updated the Jenkins jobs already ?
[13:26:37] hashar: yes, jobs are updated
[13:27:36] (CR) Hashar: [C: 2] "Straightforward. The related Jenkins job will be triggered on each patchset soon ™." [ruby/api] - https://gerrit.wikimedia.org/r/167210 (https://bugzilla.wikimedia.org/69245) (owner: Zfilipin)
[13:27:40] zeljkof: +2ing as needed :D
[13:27:50] (Merged) jenkins-bot: Prepare repository for running RuboCop after every push to Gerrit [ruby/api] - https://gerrit.wikimedia.org/r/167210 (https://bugzilla.wikimedia.org/69245) (owner: Zfilipin)
[13:27:52] (PS3) Hashar: Prepare repository for running RuboCop after every push to Gerrit [selenium] - https://gerrit.wikimedia.org/r/167578 (https://bugzilla.wikimedia.org/69245) (owner: Zfilipin)
[13:28:09] (CR) Hashar: "check experimental" [selenium] - https://gerrit.wikimedia.org/r/167578 (https://bugzilla.wikimedia.org/69245) (owner: Zfilipin)
[13:29:09] (CR) Hashar: [C: 2] "Will enable rubocop on patch submission soon ™" [selenium] - https://gerrit.wikimedia.org/r/167578 (https://bugzilla.wikimedia.org/69245) (owner: Zfilipin)
[13:29:20] hashar: thanks :)
[13:29:21] (Merged) jenkins-bot: Prepare repository for running RuboCop after every push to Gerrit [selenium] - https://gerrit.wikimedia.org/r/167578 (https://bugzilla.wikimedia.org/69245) (owner: Zfilipin)
[13:31:56] zeljkof: I am starting to wonder whether we should use rake to define entry points :D
[13:32:14] hashar: we used to, but gave up
[13:32:25] not sure if it is worth the effort
[13:32:30] what did you have in mind?
[13:33:41] not sure
[13:33:55] I am wondering how to pass argument to rubocop
[13:34:13] so mediawiki/core change https://gerrit.wikimedia.org/r/#/c/167794/ still has a bunch of rubocop errors
[13:34:25] hashar: oops, let me check
[13:34:33] maybe I forgot to commit something
[13:34:41] most of them are pretty trivial, I would fix them as well
[13:35:02] for example trailing whitespaces or some spaces missing next to curly braces
[13:35:27] they are trivial enough that I would not bother updating a .rubocop_todo.yml :D
[13:35:37] hashar: but wait
[13:35:45] this job complains about flow
[13:35:46] https://integration.wikimedia.org/ci/job/mediawiki-core-bundle-rubocop/61/console
[13:35:53] but it runs for core?
[13:36:07] I mean, why does core job complain about a violation in flow?
[13:36:12] oh
[13:36:27] I have only ignored violations for the given codebase
[13:36:37] so any new violations would be detected
[13:36:41] but old ones ignored
[13:37:00] does core checkout a few extensions?
[13:37:06] by mistake? or on purpose?
[13:38:18] good catch
[13:38:21] I have no clue what happens
[13:38:42] hashar: do you reuse folder structure across repos?
[13:38:52] ah I found it
[13:39:21] mediawiki-core-bundle-rubocop can be run for mediawiki/core.git wmf branches
[13:39:31] and the wmf branches have extensions as submodules
[13:39:37] we need to adjust the macro to not process submodules
[13:39:58] my bad
[13:40:06] CI FTW :)
[13:40:56] the question is then, what are going to be the impact of not processing submodules :D
[13:41:13] :)
[13:41:15] fun times
[13:41:59] still
[13:42:03] mw core has issues :D
[13:42:29] hashar: I think for some reason rubocop did not log one trivial issue to the dotfile
[13:42:44] but I did not have the time to debug or report it
[13:43:02] it should be trivial to fix in the next commit, I did not want to complicate the current commit
[13:43:06] (PS1) Hashar: bundle jobs no more process submodules [integration/config] - https://gerrit.wikimedia.org/r/167798
[13:43:18] zeljkof: i am going to do it :D
[13:43:27] hashar: go ahead :)
[13:43:27] the less commit we have in mw/core the happier we are !
[13:43:44] did you find the one remaining problem?
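The submodule effect described above can be made visible from a checkout. What follows is only a minimal sketch, assuming a standard `.gitmodules` file at the repository root; the helper name `submodule_paths` is hypothetical, and this is not what the integration/config change does (that change stops the job from processing submodules at all). It merely lists the directories that a linter run from the repository root would wander into.

```shell
#!/bin/sh
# Hypothetical helper, not part of integration/config: print the submodule
# paths declared in a .gitmodules file, one per line.
submodule_paths() {
    gitmodules="${1:-.gitmodules}"
    # No .gitmodules file means no submodules: print nothing.
    [ -f "$gitmodules" ] || return 0
    # Keep only lines of the form "path = extensions/Flow" and print the value.
    sed -n 's/^[[:space:]]*path[[:space:]]*=[[:space:]]*//p' "$gitmodules"
}
```

Run from the root of a wmf branch checkout, `submodule_paths` would print the extension directories carried as submodules; in a checkout without a `.gitmodules` file it prints nothing.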
[13:43:52] meanwhile https://gerrit.wikimedia.org/r/167798 should disable submodules processing
[13:43:55] it should be literally trivial, whitespace
[13:45:00] ah I found the issue
[13:45:16] you want to ignore a bunch of paths such as node_modules
[13:45:19] I have issues with tests/frontend/node_modules :-D
[13:49:02] commented on https://gerrit.wikimedia.org/r/#/c/167794/ :D
[13:51:34] (PS1) Hashar: Enable rubocop on mediawiki/{selenium,ruby/api} [integration/config] - https://gerrit.wikimedia.org/r/167802 (https://bugzilla.wikimedia.org/69245)
[13:52:02] zeljkof: https://gerrit.wikimedia.org/r/#/c/167802/ makes rubocop voting on mediawiki/selenium and mediawiki/ruby/api
[13:52:02] :D
[13:52:12] those two are good to go imho and pass
[13:52:42] hashar: thanks
[13:52:46] there are a few more repos that are ready
[13:52:59] as soon as I finish producing patches I will take a look :)
[13:53:07] a few more left
[13:53:15] currently at PageTriage
[13:56:55] bundle exec rubocop --auto-correct is nice :D
[13:57:11] hashar: yes it is :)
[13:57:18] rubocop is really a nice tool
[14:06:12] Project browsertests-Wikidata-PerformanceTests-linux-firefox-sauce build #25: FAILURE in 11 sec: https://integration.wikimedia.org/ci/job/browsertests-Wikidata-PerformanceTests-linux-firefox-sauce/25/
[14:07:42] PROBLEM - BetaLabs: Puppet freshness check on labmon1001 is CRITICAL: CRITICAL: deployment-prep.deployment-logstash1.puppetagent.time_since_last_run.value (100.00%)
[14:09:10] hashar: zeljkof: did something change with the setup of browsertests? https://integration.wikimedia.org/ci/job/browsertests-Wikidata-PerformanceTests-linux-firefox-sauce/25/console
[14:11:57] Tobi_WMDE_SW: oops, yes
[14:12:19] I did not make the change for wikidata yet
[14:12:22] let me do that right now
[14:16:08] (PS1) Zfilipin: WikidataBrowserTests jobs are now hosted at Wikimedia Jenkins [selenium] - https://gerrit.wikimedia.org/r/167807
[14:17:12] Project browsertests-Wikidata-SmokeTests-linux-firefox-sauce build #23: FAILURE in 11 sec: https://integration.wikimedia.org/ci/job/browsertests-Wikidata-SmokeTests-linux-firefox-sauce/23/
[14:17:28] Tobi_WMDE_SW: what is wikidata browsertests workflow?
[14:17:32] on github
[14:17:43] should I fork the repo and then send pull request?
[14:18:05] zeljkof: you can just send a pull request directly
[14:18:23] I think I've added you to the right team
[14:18:43] so I should clone the repo locally, make the change and push to github?
[14:18:58] that will not automatically merge, but create a pull request?
[14:19:10] oh wait, I should create a topic branch anyway
[14:19:14] ok, let me try
[14:19:25] it has been a while since I have used github :)
[14:26:29] Tobi_WMDE_SW: https://github.com/wmde/WikidataBrowserTests/pull/16
[14:26:31] done
[14:27:33] in short, added rubocop config files to repo root, moved bundler config files (gemfile, gemfile.lock) from tests/browser to repo root
[14:27:39] the commit should fix the buidl
[14:27:40] build
[14:28:28] sorry, it was on my list to do, I was at TwnMainPage
[14:28:31] https://github.com/wikimedia/mediawiki-selenium#repositories
[14:28:46] WikidataBrowserTests commit would happen in the next 30-60 minutes anyway
[14:28:51] apologies for the broken link
[14:35:58] (CR) Zfilipin: [C: 2] Enable rubocop on mediawiki/{selenium,ruby/api} [integration/config] - https://gerrit.wikimedia.org/r/167802 (https://bugzilla.wikimedia.org/69245) (owner: Hashar)
[14:37:53] zeljkof: merged. :)
[14:38:22] Tobi_WMDE_SW: yeah! :)
[14:38:23] so, if I retrigger the job it should be fixed. or are there other things to resolve first?
[14:38:33] Tobi_WMDE_SW: no, this should do it
[14:38:40] well, if I understood the problem
[14:38:43] let me check
[14:38:45] let's try
[14:38:49] just retriggered
[14:38:58] https://integration.wikimedia.org/ci/job/browsertests-Wikidata-PerformanceTests-linux-firefox-sauce/26/console
[14:39:06] looks better
[14:39:16] yes, this was the problem "Could not locate Gemfile"
[14:39:25] it was looking in repo root
[14:39:27] where it is now
[14:39:32] Yippee, build fixed!
[14:39:32] Project browsertests-Wikidata-PerformanceTests-linux-firefox-sauce build #26: FIXED in 52 sec: https://integration.wikimedia.org/ci/job/browsertests-Wikidata-PerformanceTests-linux-firefox-sauce/26/
[14:39:37] yay
[14:39:39] and as a bonus you have rubocop files
[14:39:40] (Merged) jenkins-bot: Enable rubocop on mediawiki/{selenium,ruby/api} [integration/config] - https://gerrit.wikimedia.org/r/167802 (https://bugzilla.wikimedia.org/69245) (owner: Hashar)
[14:39:44] :D
[14:39:47] take a look
[14:39:58] need to look into how to run it on travis
[14:40:11] and when ever you have the time we can pair on fixing a few violations
[14:40:34] ok. let's do this next week?
[14:42:58] \O/
[14:43:30] (CR) Hashar: "deployed" [integration/config] - https://gerrit.wikimedia.org/r/167802 (https://bugzilla.wikimedia.org/69245) (owner: Hashar)
[14:43:45] Tobi_WMDE_SW: deal
[14:44:39] (CR) Hashar: "recheck" [selenium] - https://gerrit.wikimedia.org/r/137673 (owner: Zfilipin)
[14:58:30] Yippee, build fixed!
[14:58:31] Project browsertests-Wikidata-SmokeTests-linux-firefox-sauce build #24: FIXED in 16 min: https://integration.wikimedia.org/ci/job/browsertests-Wikidata-SmokeTests-linux-firefox-sauce/24/
[15:07:22] PROBLEM - BetaLabs: Puppet freshness check on labmon1001 is CRITICAL: CRITICAL: deployment-prep.deployment-logstash1.puppetagent.time_since_last_run.value (100.00%)
[15:49:38] hashar: huh, all repos fixed :)
[15:50:30] now we can turn rubocop to voting ;)
[15:50:41] * zeljkof will brb
[15:50:56] nice
[15:51:03] zeljkof: though
[15:51:20] you might need to ensure it pass in the wmf branches in case folks cherry pick to there
[15:52:39] YuviPanda: is https://wikitech.wikimedia.org/wiki/User:Yuvipanda/Wikitech_hiera how Beta Cluster should be done as well? or since we have -labs files and obvious inherentance from prod we shouldn't?
[15:53:33] maybe hashar knows :)
[15:54:10] yeah i noticed the task about hiera
[15:54:12] haven't committed to it
[15:54:17] BuT
[15:54:29] I probably fixed a nasty Zuul bug
[15:56:04] :)
[16:03:11] hashar: can't hear you if you're talking
[16:08:03] PROBLEM - BetaLabs: Puppet freshness check on labmon1001 is CRITICAL: CRITICAL: deployment-prep.deployment-logstash1.puppetagent.time_since_last_run.value (100.00%)
[16:15:53] Wikimedia / Continuous integration: Jobs are sometime no more being triggered by Zuul / Jenkins - https://bugzilla.wikimedia.org/63760#c13 (Antoine "hashar" Musso (WMF)) That is related to bug 63758 (JJB created jobs not registering). I have upgraded Jenkins Gearman plugin to fix jobs registrations: *...
[16:17:55] zeljkof: https://gerrit.wikimedia.org/r/#/c/167823/ (Rubocop for VE browser tests) LGTM – want me to +2?
[16:18:24] Wikimedia / Continuous integration: [upstream] Jenkins: jobs created via JJB are not properly registered in Zuul Gearman server - https://bugzilla.wikimedia.org/63758#c5 (Antoine "hashar" Musso (WMF)) Upstream patch has been merged and applied on our setup. Need to verify whether the gearman plugin ha...
[16:18:26] Wikimedia / Continuous integration: [upstream] Jenkins: jobs created via JJB are not properly registered in Zuul Gearman server - https://bugzilla.wikimedia.org/63758 (Antoine "hashar" Musso (WMF)) NEW>ASSI
[16:18:44] zeljkof: And, for that matter, all the others. :-)
[16:20:38] James_F: please do :)
[16:20:45] they look good to me too
[16:21:05] Do you have a patch to make them voting?
[16:21:27] might need to add the patches to wmf branches before making the rubocop jobs voting though
[16:21:58] hashar: People backporting can V+2 in the incredibly few cases.
[16:22:07] I guess :)
[16:22:12] {{sold}}
[16:22:16] :-)
[16:22:23] James_F: we will probably make them non voting for a week or so
[16:22:33] zeljkof: Very well.
[16:22:42] zeljkof: But moved from experimental at least?
[16:23:02] James_F: yes, hope to move from experimental soon, probably tomorrow
[16:23:13] as soon as all the commits are merged into master
[16:23:20] * James_F nods.
[16:26:47] James_F: if you have some time, take a look at this :) https://gerrit.wikimedia.org/r/#/q/topic:bug/69245,n,z
[16:27:01] zeljkof: Doing so. :-)
[16:27:12] James_F: great, thanks :)
[16:30:27] zeljkof: There's now quite the back-log in Zuul. :-)
[16:30:40] James_F: :)
[16:30:54] zeljkof: Some of the patches don't pass Rubocop, though.
[16:31:18] James_F: really? I know core has one failed thing, but the rest should be all green
[16:31:24] which one does not pass?
[16:31:27] zeljkof: Cirrus and one other.
[16:31:29] * James_F looks again.
[16:31:37] yes,
[16:31:40] cirrus and core, then
[16:31:52] Oh, and CentralAuth didn't run Rubocop.
[16:32:02] When you did `check experimental`.
[16:32:07] cirrus actually introduced a problem _after_ rubocop dotfile was generated
[16:32:19] Just mediawiki-core-extensions-integration, which failed.
[16:32:30] Is there something special about that repo?
[16:32:34] James_F: yes, by mistake centralauth is not set up for rubocop yet, will be tomorrow, ran out of time today
[16:32:41] Aha, OK.
[16:32:46] So, mostly done, at least.
[16:32:52] James_F: sorry, in a meeting now
[16:33:27] zeljkof: No worries.
[16:49:33] PROBLEM - BetaLabs: Puppet freshness check on labmon1001 is CRITICAL: CRITICAL: deployment-prep.deployment-logstash1.puppetagent.time_since_last_run.value (100.00%)
[16:55:50] ^^^ that instance has puppet disabled manually
[16:58:31] greg-g: got a few minutes for a quick hangout ?
[16:59:50] hashar: sure, I can do 10 minutes (I have an interview in a bit)
[17:00:03] will be short
[17:02:11] hashar: I'll turn puppet back on there later today.
[17:03:58] greg-g: thanks for letting me brain dump my thoughts. I will sleep better tonight hehe
[17:04:04] :) :)
[17:04:08] 'tis what I'm here for
[17:04:15] bd808: yeah I understood you were working on it / tweaked it so did nothing :D
[17:04:25] meanwhile
[17:04:39] Phabricator upstream already proposed a patch to a feature request I sent today https://secure.phabricator.com/D10733
[17:04:41] they are fast
[17:04:53] deployment-elastic is flapping (not sure in what way) according to icinga
[17:04:58] manybubbles: ^
[17:05:07] not sure if it's real or what
[17:06:18] https://phabricator.wikimedia.org/P34
[17:06:28] unhelpful email from icinga ^
[17:06:28] and I am off, see you tomorrow
[17:06:31] g'night!
[17:08:23] greg-g: P34 must be an instance which is down / dead / removed
[17:08:33] had the issue this morning, and that is what Yuvi told me :D
[17:08:39] it is still a bit experimental after all hehe
[17:08:41] *wave*
[17:10:34] PROBLEM - BetaLabs: Puppet freshness check on labmon1001 is CRITICAL: CRITICAL: deployment-prep.deployment-logstash1.puppetagent.time_since_last_run.value (100.00%)
[17:14:30] neat, email notifs when someone comments on your paste
[17:14:33] the future is neat
[17:35:23] Project beta-scap-eqiad build #26507: FAILURE in 1 min 16 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/26507/
[18:03:08] Wikimedia / Quality Assurance: get ZeroPortal test to work - https://bugzilla.wikimedia.org/72326#c1 (Chris McMahon) This seems improperly configured: https://integration.wikimedia.org/ci/view/BrowserTests/view/-All/job/browsertests-ZeroPortal-zero.wikimedia.org-linux-firefox-sauce/15/console
[18:03:52] Wikimedia / Quality Assurance: get ZeroPortal configured in jenkins job builder - https://bugzilla.wikimedia.org/72326 (Chris McMahon)
[18:04:15] Yippee, build fixed!
[18:04:16] Project beta-scap-eqiad build #26508: FIXED in 20 min: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/26508/
[18:07:47] Project browsertests-ZeroBanner-en.m.wikipedia.org-linux-phantomjs build #216: FAILURE in 4.2 sec: https://integration.wikimedia.org/ci/job/browsertests-ZeroBanner-en.m.wikipedia.org-linux-phantomjs/216/
[18:10:13] PROBLEM - BetaLabs: Puppet freshness check on labmon1001 is CRITICAL: CRITICAL: deployment-prep.deployment-logstash1.puppetagent.time_since_last_run.value (100.00%)
[18:43:46] Yippee, build fixed!
[18:43:47] Project browsertests-Flow-test2.wikipedia.org-windows_8-internet_explorer-sauce build #222: FIXED in 42 min: https://integration.wikimedia.org/ci/job/browsertests-Flow-test2.wikipedia.org-windows_8-internet_explorer-sauce/222/
[18:54:00] greg-g: its ^d's doing I imagine.
[18:54:09] deployment-elastic02 that is
[18:55:36] Project browsertests-CirrusSearch-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce build #221: FAILURE in 8.5 sec: https://integration.wikimedia.org/ci/job/browsertests-CirrusSearch-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce/221/
[19:00:04] Yippee, build fixed!
[19:00:05] Project browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-windows_8.1-internet_explorer-11-sauce build #81: FIXED in 38 min: https://integration.wikimedia.org/ci/job/browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-windows_8.1-internet_explorer-11-sauce/81/
[19:10:33] PROBLEM - BetaLabs: Puppet freshness check on labmon1001 is CRITICAL: CRITICAL: deployment-prep.deployment-logstash1.puppetagent.time_since_last_run.value (100.00%)
[19:33:53] Wikimedia / Quality Assurance: ZeroPortal browsertests fails early due to auth issue with zero.wikimedia.org - https://bugzilla.wikimedia.org/72326#c2 (Antoine "hashar" Musso (WMF)) Browser tests jobs assume the targeted URL is a MediaWiki instance with an accessible API. It then query /w/api.php for...
[19:51:09] Project browsertests-Echo-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce build #120: FAILURE in 8 min 11 sec: https://integration.wikimedia.org/ci/job/browsertests-Echo-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce/120/
[19:51:17] Project browsertests-Translate-meta.wikimedia.org-linux-firefox-sauce build #238: FAILURE in 7.2 sec: https://integration.wikimedia.org/ci/job/browsertests-Translate-meta.wikimedia.org-linux-firefox-sauce/238/
[19:59:02] the beta redis server seems to be down
[19:59:13] who knows about rdb1002.eqiad.wmnet ?
[20:01:28] cscott: never heard of rdb1002
[20:01:58] OCG seems to have gone down on beta because its redis server can't be contacted
[20:03:29] ah that is a production one
[20:03:34] I thought it was on beta :-]
[20:03:48] rdb1002.eqiad.wmnet is a production box
[20:03:56] oh, hm.
[20:05:32] hm, switching to #ops
[20:05:32] Project browsertests-PdfHandler-test2.wikipedia.org-linux-firefox-sauce build #153: FAILURE in 5.7 sec: https://integration.wikimedia.org/ci/job/browsertests-PdfHandler-test2.wikipedia.org-linux-firefox-sauce/153/
[20:11:32] PROBLEM - BetaLabs: Puppet freshness check on labmon1001 is CRITICAL: CRITICAL: deployment-prep.deployment-logstash1.puppetagent.time_since_last_run.value (100.00%)
[20:35:35] Project beta-scap-eqiad build #26523: FAILURE in 1 min 33 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/26523/
[20:36:03] flakey dns is flakey
[20:40:45] Yippee, build fixed!
[20:40:45] Project beta-scap-eqiad build #26524: FIXED in 1 min 9 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/26524/
[20:44:36] cscott: is that related to http://icinga.wmflabs.org/cgi-bin/icinga/extinfo.cgi?type=1&host=deployment-soa-cache01.eqiad.wmflabs
[20:46:19] greg-g: probably not -- it turns out that _joe_ pushed a puppet change a few hours ago that broke the OCG configuration on beta. but i didn't notice until I started the service on beta and it reread the config file.
[20:46:27] greg-g: that one just has a dead console with nothing listed
[20:46:51] it's expected that rdb1002 isn't reachable from labs. the bug was that the beta ocg was trying to use that as its redis server in the first place.
[20:47:28] greg-g: but we turned off puppet on deployment-pdf01 temporarily, i manually hacked in a correct config file, and _joe_ promises to fix puppet properly in the morning.
[20:48:09] all of which I tried to !log on #wikimedia-labs, but the log bot isn't running there right now?
[20:49:46] greg-g: can try a reboot of soa-cache01 if you want but I don't think it will do much :) [20:50:46] !log deployment-prep updated OCG to version 523c8123cd826c75240837c42aff6301032d8ff1 [20:50:47] Logged the message, Master [20:51:06] !log deployment-prep turned off puppet on deployment-pdf01, manually fixed broken /etc/ocg/mw-ocg-service.js [20:51:07] Logged the message, Master [20:51:09] cscott: in this channel you can !log without the 'deployment-prep' [20:51:16] cscott: that saves a bunch of key strokes :] [20:51:16] !log deployment-prep _joe_ promises to fix this properly tomorrow am [20:51:18] Logged the message, Master [20:51:36] hashar: but it's more keystrokes if i'm just cut-and-pasting from the other channel [20:51:38] ;) [20:52:05] JohnLewis: greg-g: deployment-soa-cache01 was I guess created to test out co-hosting several -oid systems on the same box [20:52:26] cscott: you can log here, too [20:52:36] heh, you just did [20:52:39] JohnLewis: greg-g: I think Gabriel / Service team is willing to push for that to avoid having dedicated hardware for mathoid, some other hardware for citoid [20:52:41] * greg-g catches up, really [20:52:57] hashar: Ah. [20:53:27] hashar: soo, it's dead though, we should kill it or ack the warnings, right? [20:53:29] JohnLewis: the instance was apparently created by Alexandros from ops. He is I think the ops liaison for *-oid stuff [20:53:34] was probably some experimentation [20:53:50] guess I will have to poke alexandros :-] [20:53:54] :) [20:53:57] kk [20:54:17] replying to YuviPanda on qa list [20:54:19] also, where can I or b-d-8-0-8 ack the labmon one? 
[20:55:14] * bd808 nacks that [20:56:14] eventually we will need to clean up the instances on beta [20:56:19] it has grown out of control :D [20:56:24] bah, I was trying to not ping him [20:56:36] greg-g: I was idling here :) [20:56:36] "eventually" ;) [20:56:41] bd808: :P [20:56:43] bryan must have a clever regex [20:57:00] /b.*d.*8.*0.*8/i or something [20:57:29] greg-g: as for acking alarms, the puppet error is a single check on the icinga production system apparently [20:57:39] huh [20:57:41] a) we don't have write access on the Icinga prod [20:57:56] b) acking the "puppet failures" alarm would ack it for all hosts [20:57:59] so these checks are redundant or ... http://icinga.wmflabs.org/icinga/ [20:58:26] oh [20:58:29] now I am confused [20:58:34] yeah, me too [20:58:54] I get emails from neon.wikimedia, but then there's icinga.wmflabs... [20:59:06] that icinga.wmflabs is probably unrelated [20:59:08] * bd808 blames YuviPanda [20:59:11] neon is the production Icinga [20:59:15] I can't find the things I get emails about in icinga.wmflabs, "obviously" [20:59:18] yeah [20:59:21] it has checks executed on bare metal labmon1001 [20:59:28] (labmon == lab monitoring) [20:59:35] so, we have two icingas, one that spams me and I can't do anything about, and another that doesn't that I could [20:59:45] a dedicated hardware box inside the labs infrastructure solely for the purpose of executing checks [20:59:58] so the icinga prod (on neon) executes checks on labmon1001 over nrpe [21:00:06] yeah [21:00:08] and labmon executes whatever plugin to verify graphite metrics and reports [21:00:13] that is what I understood [21:00:38] maybe http://icinga.wmflabs.org/icinga/ is a WIP :D [21:00:45] or has been abandoned. who knows [21:01:02] anyway, the labs monitoring would be moved to Shinken (reusing labmon1001 ) [21:01:14] * greg-g nods [21:01:20] who knows?! 
[21:01:24] Yuvi [21:01:24] :D [21:01:27] #notrhetorical ;) [21:01:35] busfactor=1 [21:01:44] he nicely replied with a nice roadmap but I did not take note [21:01:48] heh [21:02:21] that is the default: def wmf.assignWork( numpeople=1 ) [21:02:22] :D [21:03:27] at least we have some notifications now. that is a huge improvement [21:04:04] totally [21:04:18] I just have siphoned them off to another folder, they were killing my inbox [21:07:49] (PS1) Hashar: mw-install-sqlite: clear sqlite DB after 20 mins [integration/jenkins] - https://gerrit.wikimedia.org/r/167948 (https://bugzilla.wikimedia.org/71128) [21:08:17] (CR) Hashar: [C: 2] mw-install-sqlite: clear sqlite DB after 20 mins [integration/jenkins] - https://gerrit.wikimedia.org/r/167948 (https://bugzilla.wikimedia.org/71128) (owner: Hashar) [21:08:20] (Merged) jenkins-bot: mw-install-sqlite: clear sqlite DB after 20 mins [integration/jenkins] - https://gerrit.wikimedia.org/r/167948 (https://bugzilla.wikimedia.org/71128) (owner: Hashar) [21:09:59] !log contint: refreshed slave-scripts 0b85d48..8c3f228 sqlite files will be cleared out after 20 minutes (instead of 60 minutes) {{bug|71128}} [21:10:01] Logged the message, Master [21:10:14] I am sure git deploy can handle such !log for us [21:10:23] PROBLEM - BetaLabs: Puppet freshness check on labmon1001 is CRITICAL: CRITICAL: deployment-prep.deployment-logstash1.puppetagent.time_since_last_run.value (100.00%) WARN: deployment-prep.deployment-pdf01.puppetagent.time_since_last_run.value (66.67%) [21:26:48] diaper duties && sleep [21:29:37] Wikimedia / Continuous integration: Jenkins: lanthanum/gallium tmpfs are filling up with stale tmp files - https://bugzilla.wikimedia.org/71128#c9 (Antoine "hashar" Musso (WMF)) PATC>NEW (In reply to Gerrit Notification Bot from comment #8) > Change 167948 merged by jenkins-bot: > mw-install-sql... 
[21:49:37] Project browsertests-MultimediaViewer-mediawiki.org-linux-firefox-sauce build #239: FAILURE in 9.1 sec: https://integration.wikimedia.org/ci/job/browsertests-MultimediaViewer-mediawiki.org-linux-firefox-sauce/239/ [22:12:11] PROBLEM - BetaLabs: Puppet freshness check on labmon1001 is CRITICAL: CRITICAL: deployment-prep.deployment-logstash1.puppetagent.time_since_last_run.value (100.00%) WARN: deployment-prep.deployment-pdf01.puppetagent.time_since_last_run.value (100.00%) [22:23:52] Yippee, build fixed! [22:23:53] Project browsertests-Math-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce build #185: FIXED in 1 min 16 sec: https://integration.wikimedia.org/ci/job/browsertests-Math-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce/185/ [22:36:27] Yippee, build fixed! [22:36:28] Project browsertests-UniversalLanguageSelector-commons.wikimedia.beta.wmflabs.org-linux-firefox-sauce build #234: FIXED in 22 min: https://integration.wikimedia.org/ci/job/browsertests-UniversalLanguageSelector-commons.wikimedia.beta.wmflabs.org-linux-firefox-sauce/234/ [23:09:13] PROBLEM - BetaLabs: Puppet freshness check on labmon1001 is CRITICAL: CRITICAL: deployment-prep.deployment-logstash1.puppetagent.time_since_last_run.value (100.00%) WARN: deployment-prep.deployment-pdf01.puppetagent.time_since_last_run.value (100.00%) [23:26:30] Nothing seems to be merging for MobileFrontend in gerrit: https://gerrit.wikimedia.org/r/#/c/167309/ https://gerrit.wikimedia.org/r/#/c/167983/ [23:33:37] "nothing" [23:33:55] both are merged
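Note on the nick-highlight pattern guessed at 20:57:00 (/b.*d.*8.*0.*8/i): the sketch below shows why spelling the nick as "b-d-8-0-8" would still ping its owner under such a pattern. It is only an illustration in Python; the pattern is the one quoted in the channel, not a confirmed client setting, and the names NICK_PATTERN and would_highlight are hypothetical.

```python
import re

# Pattern as quoted in the channel; the trailing /i maps to re.IGNORECASE.
# Assumption: the real highlight rule is unknown, this is only the guess from the log.
NICK_PATTERN = re.compile(r"b.*d.*8.*0.*8", re.IGNORECASE)


def would_highlight(message: str) -> bool:
    """Return True if the message contains b, d, 8, 0, 8 in that order."""
    return NICK_PATTERN.search(message) is not None


if __name__ == "__main__":
    for text in (
        "also, where can I or b-d-8-0-8 ack the labmon one?",
        "BD808: ping",
        "greg-g: probably not",
    ):
        print(would_highlight(text), text)
```

Because each ".*" accepts any run of characters, a pattern this loose also fires on unrelated messages that merely contain those five characters in order, so a real highlight rule would more likely be anchored on word boundaries.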