[00:00:49] Release-Engineering-Team (Kanban), Product-Analytics, Release, Train Deployments: 1.31.0-wmf.30 deployment blockers - https://phabricator.wikimedia.org/T183969#4141461 (hoo)
[00:10:03] Beta-Cluster-Infrastructure, Puppet: redis/nutcracker down on deployment-prep - https://phabricator.wikimedia.org/T192473#4141484 (EddieGP)
[00:17:49] Beta-Cluster-Infrastructure, Puppet: deployment-prep has jobqueue/caching issues - https://phabricator.wikimedia.org/T192473#4141502 (EddieGP)
[01:58:52] (PS3) Krinkle: Remove composer-dev-args.js [integration/jenkins] - https://gerrit.wikimedia.org/r/412944 (owner: Reedy)
[01:58:55] (CR) Krinkle: [C: 2] Remove composer-dev-args.js [integration/jenkins] - https://gerrit.wikimedia.org/r/412944 (owner: Reedy)
[01:59:48] (Merged) jenkins-bot: Remove composer-dev-args.js [integration/jenkins] - https://gerrit.wikimedia.org/r/412944 (owner: Reedy)
[02:03:33] (PS4) Krinkle: Remove `composer dump-autoload --optimize` [integration/jenkins] - https://gerrit.wikimedia.org/r/394907 (https://phabricator.wikimedia.org/T181940) (owner: Reedy)
[02:04:19] (CR) Krinkle: "I recall something about this being needed because with the regular mode before this (for non-dev deps), adding more may not rebuild the a" [integration/jenkins] - https://gerrit.wikimedia.org/r/394907 (https://phabricator.wikimedia.org/T181940) (owner: Reedy)
[03:04:26] MediaWiki-Releasing, MediaWiki-extensions-LoginNotify, MW-1.31-release: Bundle LoginNotify extension with MW 1.31 - https://phabricator.wikimedia.org/T191746#4141700 (Legoktm) This one is still waiting on Echo
[04:42:21] Beta-Cluster-Infrastructure, Patch-For-Review, Puppet: deployment-prep has jobqueue/caching issues - https://phabricator.wikimedia.org/T192473#4141728 (aaron) The warnings are pointless, the patch above adds an isset() check.
[06:54:38] Beta-Cluster-Infrastructure: Puppet errors on deployment-mediawiki07 - https://phabricator.wikimedia.org/T192507#4141832 (MoritzMuehlenhoff) This is intentional, temporary and can be removed in a few days, while we're rolling out the changes to memcached handling in production it would be useful to have one...
[07:30:40] (CR) Hashar: [C: 2] Use quibble on mediawiki/core@master [integration/config] - https://gerrit.wikimedia.org/r/427472 (owner: Hashar)
[07:31:56] (Merged) jenkins-bot: Use quibble on mediawiki/core@master [integration/config] - https://gerrit.wikimedia.org/r/427472 (owner: Hashar)
[07:44:30] halfak: QUIBBLE
[07:44:33] fuck
[07:44:36] hashar: :P
[07:44:42] sorry for the ping hal...fak
[07:46:20] addshore: yeah I am adding it right now for mediawiki/core @ master :]]]
[07:46:24] YAY
[07:46:28] is it faster? :P
[07:46:33] i guess we dont have to wait for nodepool
[07:47:22] (PS1) Hashar: Also use quibble in gate-and-submit [integration/config] - https://gerrit.wikimedia.org/r/427610
[07:47:22] yup
[07:47:32] it is a bit slower because all test commands are run serially
[07:47:51] but in my experience it takes ~ 6 minutes with a cold cache
[07:47:57] and we can surely make it faster later on
[08:04:50] (PS2) Hashar: Also use quibble in gate-and-submit [integration/config] - https://gerrit.wikimedia.org/r/427610
[08:05:37] (CR) Hashar: [C: 2] "No with tests. Yesterday night I could not turn my mind about how to manage the jobs:" [integration/config] - https://gerrit.wikimedia.org/r/427610 (owner: Hashar)
[08:06:48] (Merged) jenkins-bot: Also use quibble in gate-and-submit [integration/config] - https://gerrit.wikimedia.org/r/427610 (owner: Hashar)
[08:11:46] Beta-Cluster-Infrastructure, Patch-For-Review, Puppet: deployment-prep has jobqueue/caching issues - https://phabricator.wikimedia.org/T192473#4141946 (EddieGP)
[08:11:49] Beta-Cluster-Infrastructure, GlobalRename, MediaWiki-extensions-CentralAuth: Please unblock stuck global rename Artix Kreiger 2 to Artix Krieger 2 - https://phabricator.wikimedia.org/T192471#4141941 (EddieGP) Open>Resolved a:EddieGP Resolved per the comments above. The errors is just logs...
[08:19:41] !log eddie@deployment-tin:~$ for wiki in deploymentwiki enwiki enwikinews loginwiki metawiki simplewiki; do mwscript extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=$wiki --logwiki=loginwiki 'Samtar' "There'sNoTime"; done T192476
[08:19:44] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[08:19:44] T192476: Please unblock stuck global rename Samtar → There'sNoTime - https://phabricator.wikimedia.org/T192476
[08:20:22] Beta-Cluster-Infrastructure, Patch-For-Review, Puppet: deployment-prep has jobqueue/caching issues - https://phabricator.wikimedia.org/T192473#4141975 (EddieGP)
[08:20:24] Beta-Cluster-Infrastructure, GlobalRename, MediaWiki-extensions-CentralAuth: Please unblock stuck global rename Samtar → There'sNoTime - https://phabricator.wikimedia.org/T192476#4141973 (EddieGP) Open>Resolved a:EddieGP
[08:24:17] Beta-Cluster-Infrastructure, Patch-For-Review, Puppet: deployment-prep has jobqueue/caching issues - https://phabricator.wikimedia.org/T192473#4141990 (EddieGP) p:Unbreak!>Low Per aarons comment, just logspam. Seems the actual problem for renames was nutcracker, which I fixed in T192473#4141...
[08:30:16] Beta-Cluster-Infrastructure: Puppet errors on deployment-mediawiki07 - https://phabricator.wikimedia.org/T192507#4141994 (EddieGP) Alright. Please remember to !log such things - if there's no log entry I'll assume puppet breakage is unintended and try to fix it, where it could just be ignored for a few days.
[08:49:30] Beta-Cluster-Infrastructure, GlobalRename, MediaWiki-extensions-CentralAuth: Please unblock stuck global rename Samtar → There'sNoTime - https://phabricator.wikimedia.org/T192476#4142021 (MarcoAurelio) I don't think it matters much because it's beta, but the script as run above is wrong IMHO, as per...
[08:54:54] Beta-Cluster-Infrastructure, GlobalRename, MediaWiki-extensions-CentralAuth: Please unblock stuck global rename Samtar → There'sNoTime - https://phabricator.wikimedia.org/T192476#4142027 (EddieGP) I didn't run that very command above. Actually I've been doing this for deploymentwiki first - Special:G...
[09:03:33] Beta-Cluster-Infrastructure, Patch-For-Review, Puppet: deployment-prep has jobqueue/caching issues - https://phabricator.wikimedia.org/T192473#4142093 (MarcoAurelio) 👍
[09:12:00] hi eddiegp - is the jobqueue issues at beta resolved then?
[09:12:05] is/are
[09:12:42] https://deployment.wikimedia.beta.wmflabs.org/wiki/Special:GlobalRenameProgress?username=Hauskatze <-- fails to start again
[09:12:53] Hauskatze: I think so, I assume it was that dumb thing with nutcracker not creating the directory for its socket files, though I'm not sure how to test that.
[09:17:33] (PS1) Hashar: docker: add tidy from jessie in quibble-stretch [integration/config] - https://gerrit.wikimedia.org/r/427623
[09:18:00] eddiegp: not running again afaics, showJobs == 0
[09:24:31] (PS2) Hashar: docker: add tidy from jessie in quibble-stretch [integration/config] - https://gerrit.wikimedia.org/r/427623 (https://phabricator.wikimedia.org/T191771)
[09:39:46] (PS1) Hashar: Bump quibble-stretch to 0.0.8-2 [integration/config] - https://gerrit.wikimedia.org/r/427625 (https://phabricator.wikimedia.org/T191771)
[09:39:57] (CR) Hashar: [C: 2] docker: add tidy from jessie in quibble-stretch [integration/config] - https://gerrit.wikimedia.org/r/427623 (https://phabricator.wikimedia.org/T191771) (owner: Hashar)
[09:41:14] (Merged) jenkins-bot: docker: add tidy from jessie in quibble-stretch [integration/config] - https://gerrit.wikimedia.org/r/427623 (https://phabricator.wikimedia.org/T191771) (owner: Hashar)
[09:41:56] !log building releng/quibble-stretch:0.0.8-2
[09:41:57] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[09:42:31] (PS2) Hashar: Bump quibble-stretch to 0.0.8-2 [integration/config] - https://gerrit.wikimedia.org/r/427625 (https://phabricator.wikimedia.org/T191771)
[09:44:00] Beta-Cluster-Infrastructure, Patch-For-Review, Puppet: deployment-prep has jobqueue/caching issues - https://phabricator.wikimedia.org/T192473#4142149 (MarcoAurelio) jobqueue at beta is down again; see Continuous-Integration-Config, Operations, puppet-compiler, Puppet: Figure out a way to enable volunteers to use the puppet compiler - https://phabricator.wikimedia.org/T192532#4142170 (EddieGP)
[09:56:05] is there any problem with the train? [[wikidata:Special:Version]] says it’s still on wmf.29 and I can’t see why
[09:56:34] https://tools.wmflabs.org/versions/ claims wikidatawiki and all of group1 are on .30, and the blockers task (T183969) has no open subtasks
[09:56:34] T183969: 1.31.0-wmf.30 deployment blockers - https://phabricator.wikimedia.org/T183969
[09:57:07] ([[wikidata:Special:Version]] → https://www.wikidata.org/wiki/Special:Version – sorry, I thought stashbot would link that)
[10:17:46] Lucas_WMDE: Yes
[10:18:21] I dunno when the blockers were resolved (after the window?)
[10:24:20] so apparently the reason why the versions tool claims group1 is on wmf.30 is that it only checks the version of the first wiki of each group
[10:24:40] and the first wiki in group1 is advisorswiki, which is on wmf.30 even though the rest of group1 still seems to be on wmf.29 as far as I can tell
[10:29:07] it seems the rename job never arrived to the job queue afaics
[10:34:16] !log fixStuckGlobalRename.php ran to unblock rename jobs never arriving to jobqueue
[10:34:18] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[10:40:06] Beta-Cluster-Infrastructure, Cassandra: deployment-cassandra3-0{1,2}: Contact point 0 () is not a valid host name, the following values are valid contact points: ipAddress, hostName or ipAddress:port - https://phabricator.wikimedia.org/T192539#4142340 (MarcoAurelio)
[11:23:48] o/ addshore :P
[11:53:44] Hi halfak :D
[11:54:19] A missed [TAB] is a good excuse as any to say hello :D
[12:16:56] Continuous-Integration-Infrastructure, Release-Engineering-Team (Kanban), Lexicographical data, Wikidata, and 2 others: MediaWiki core's selenium tests flaky when run as part of mwext-mw-selenium-node-composer-jessie job - https://phabricator.wikimedia.org/T191537#4142508 (Addshore) As it seems t...
[12:27:27] (CR) Hashar: [C: 2] Bump quibble-stretch to 0.0.8-2 [integration/config] - https://gerrit.wikimedia.org/r/427625 (https://phabricator.wikimedia.org/T191771) (owner: Hashar)
[12:29:01] (Merged) jenkins-bot: Bump quibble-stretch to 0.0.8-2 [integration/config] - https://gerrit.wikimedia.org/r/427625 (https://phabricator.wikimedia.org/T191771) (owner: Hashar)
[12:35:14] (PS1) Hashar: docker: add luasandbox/tidy/wikidiff to quibble hhvm image [integration/config] - https://gerrit.wikimedia.org/r/427647
[12:39:47] Deployments, Release-Engineering-Team (Kanban), Operations, Patch-For-Review, Release: Deploy Scap 3.8.0 to production - https://phabricator.wikimedia.org/T192124#4142546 (fgiunchedi) @mmodell for sure, I was reading the log and I wonder why architecture changed from all to any?
[12:41:31] (CR) Hashar: [C: 2] docker: add luasandbox/tidy/wikidiff to quibble hhvm image [integration/config] - https://gerrit.wikimedia.org/r/427647 (owner: Hashar)
[12:42:51] (Merged) jenkins-bot: docker: add luasandbox/tidy/wikidiff to quibble hhvm image [integration/config] - https://gerrit.wikimedia.org/r/427647 (owner: Hashar)
[14:16:55] Gerrit supports reloading gerrit.config now without restarting the whole thing :)
[14:16:59] https://gerrit-review.googlesource.com/#/c/gerrit/+/172414/ was merged
[14:17:15] Though some config may require you to still restart
[14:26:06] Continuous-Integration-Config, Wiki-Loves-Monuments-Database: Generate coverage for PHPUnit tests of labs-tools-heritage - https://phabricator.wikimedia.org/T192083#4142788 (Lokal_Profil)
[14:26:19] Release-Engineering-Team (Watching / External), Operations, Parsoid, Patch-For-Review: Provide an archive endpoint for older Parsoid debs (on releases.wikimedia.org or elsewhere) - https://phabricator.wikimedia.org/T150672#4142789 (Dzahn) - https://releases.wikimedia.org/parsoid/ has been create...
[14:52:30] Lucas_WMDE: (a little late, but catching up on scrollback) I didn't roll the train forward yesterday due to blockers that were resolved after the window closed, advisorswiki looks like it is a wiki that was added outside of the train process. It ended up at the top of group1 since group1 is all wikis except wikipedia wikis and explicit group0 wikis.
[14:53:18] ok thanks
[14:53:39] so it’s just an unfortunate accident that advisorswiki happens to be alphabetically at the beginning of group1 so it confused the versions tool
[14:53:58] yes, so it seems
[15:15:38] Beta-Cluster-Infrastructure, Release-Engineering-Team, Operations: Upgrade deployment-prep deployment servers to stretch - https://phabricator.wikimedia.org/T192561#4142985 (thcipriani)
[15:16:58] Beta-Cluster-Infrastructure, Release-Engineering-Team, Operations: Upgrade deployment-prep deployment servers to stretch - https://phabricator.wikimedia.org/T192561#4142999 (thcipriani) This would also give us a place to test various mwscripts used by scap with php7
[15:17:10] Beta-Cluster-Infrastructure, Release-Engineering-Team, Operations: Upgrade deployment-prep deployment servers to stretch - https://phabricator.wikimedia.org/T192561#4143013 (thcipriani)
[15:17:14] Release-Engineering-Team (Kanban), Scap, Operations: mwscript rebuildLocalisationCache.php takes 40 minutes - https://phabricator.wikimedia.org/T191921#4143012 (thcipriani)
[16:12:27] (PS1) Hashar: Switch mediawiki/core and vendor to quibble [integration/config] - https://gerrit.wikimedia.org/r/427697
[16:12:29] (PS1) Hashar: mediawiki/core phpcs job and docker container [integration/config] - https://gerrit.wikimedia.org/r/427698
[16:12:31] (PS1) Hashar: Remove mediawiki-core-npm-node-6-docker [integration/config] - https://gerrit.wikimedia.org/r/427699
[16:18:02] (CR) Hashar: "So that should more or less migrate us to quibble. I will deploy it either tonight or early tomorrow morning :]" [integration/config] - https://gerrit.wikimedia.org/r/427697 (owner: Hashar)
[16:36:55] Continuous-Integration-Infrastructure, Release-Engineering-Team (Kanban), Lexicographical data, Wikidata, and 2 others: MediaWiki core's selenium tests flaky when run as part of mwext-mw-selenium-node-composer-jessie job - https://phabricator.wikimedia.org/T191537#4143301 (Anomie) The one thing t...
[16:54:45] (CR) Jforrester: "Core's phpcs times out if just run through composer (which is why we had this bespoke job) – and quibble tries the composer route and thus" [integration/config] - https://gerrit.wikimedia.org/r/427698 (owner: Hashar)
[16:58:42] Did we move Commons from deployment group 1 to group 2? https://tools.wmflabs.org/versions/ says wmf.30 is on group 1, but https://commons.wikimedia.org/wiki/Special:Version says it's still on wmf.29.
[17:01:29] 14:52:30 +thcipriani | Lucas_WMDE: (a little late, but catching up on scrollback) I didn't roll the train forward yesterday due to blockers that were
[17:01:32] | resolved after the window closed, advisorswiki looks like it is a wiki that was added outside of the train process. It ended up at
[17:01:34] | the top of group1 since group1 is all wikis except wikipedia wikis and explicit group0 wikis.
[17:01:36] gah, bad copy/pasta
[17:01:44] * greg-g is getting tired of this with weechat
[17:02:37] tl;dr: new wiki alphasorts as top was added post train which makes the version checker think group1 is on wmf.30 (it only checks the first in the list)
[17:05:57] Release-Engineering-Team (Watching / External), Operations, Parsoid, Patch-For-Review: Provide an archive endpoint for older Parsoid debs (on releases.wikimedia.org or elsewhere) - https://phabricator.wikimedia.org/T150672#4143465 (Dzahn) In Hiera it is defined which is the currently "active" rel...
[17:07:32] Release-Engineering-Team (Watching / External), Operations, Parsoid, Patch-For-Review: Provide an archive endpoint for older Parsoid debs (on releases.wikimedia.org or elsewhere) - https://phabricator.wikimedia.org/T150672#4143477 (Dzahn) @ssastry I think this should have resolved the ticket. See...
[17:10:19] hasharAway: quibble jobs fail with "no space left on device" breaking master commits
[17:15:29] James_F: Looks like hashar has left for the day.
[17:15:45] The commits unmerged suggest the switch hasn't taken place, and yet they are running.
[17:15:49] I have no idea how to safely revert this.
[17:20:45] is it just integration-slave-docker-1004?
[17:20:54] I can mark that as offline and try to clear space there
[17:21:32] Continuous-Integration-Config, Continuous-Integration-Infrastructure, Release-Engineering-Team: Broken quibble jobs fail all mediawiki commits - https://phabricator.wikimedia.org/T192576#4143521 (Krinkle)
[17:21:42] Continuous-Integration-Config, Continuous-Integration-Infrastructure, Release-Engineering-Team: Broken quibble jobs fail all mediawiki commits - https://phabricator.wikimedia.org/T192576#4143531 (Krinkle) p:Triage>Unbreak!
[17:21:50] ah, looks like it is already offline
[17:21:51] Continuous-Integration-Infrastructure, Release-Engineering-Team, Wikimedia-log-errors (Jenkins Failure): Broken quibble jobs fail all mediawiki commits - https://phabricator.wikimedia.org/T192576#4143533 (Krinkle)
[17:21:56] thcipriani: The job is still broken in other ways.
[17:22:00] Even if it did have space.
[17:22:49] I recall chatter in this channel yesterday about enabling the new quibble job in non-voting fashion first to weed out false negatives before switching
[17:24:00] looks like it was enabled 9 hours ago: https://gerrit.wikimedia.org/r/#/c/427610/
[17:27:50] Continuous-Integration-Infrastructure, Release-Engineering-Team, Wikimedia-log-errors (Jenkins Failure): Broken quibble jobs fail all mediawiki commits - https://phabricator.wikimedia.org/T192576#4143521 (thcipriani) ```counterexample error: copy-fd: write returned: No space left on device fatal: fai...
[17:32:43] (PS1) Krinkle: Revert quibble-related changes [integration/config] - https://gerrit.wikimedia.org/r/427717 (https://phabricator.wikimedia.org/T192576)
[17:33:25] thcipriani: This is an option ^ - but if you or someone else knows a smaller change that would work, we can try that first.
[17:34:16] T192577
[17:34:17] T192577: tools.wmflabs.org/versions shows incorrect data for group1 - https://phabricator.wikimedia.org/T192577
[17:34:45] Release-Engineering-Team: tools.wmflabs.org/versions shows incorrect data for group1 - https://phabricator.wikimedia.org/T192577#4143568 (Catrope)
[17:35:32] Release-Engineering-Team: tools.wmflabs.org/versions shows incorrect data for group1 - https://phabricator.wikimedia.org/T192577#4143568 (Krinkle) > [wikimedia-releng] times in BST > 15:52 <•thcipriani> Lucas_WMDE: (a little late, but catching up on scrollback) I didn't roll the train forward yesterday due t...
[17:35:42] RoanKattouw: I bet it was no_justification moving more wikis into different groups
[17:35:49] I think so too
[17:35:59] bd808 is travelling
[17:36:13] whereas, it should use line one of group1.dblist
[17:36:19] https://phabricator.wikimedia.org/source/tool-versions/
[17:36:28] Aah I see from Krinkle's paste on the task that it was discovered in this channel yesterday
[17:36:42] 30min ago
[17:36:58] in this instance it was determined that it only looks at the first wiki in the group
[17:36:59] Someone else mentioned it before though.. I think one of the WMDE guys
[17:37:01] Ugh it was JUST above my screen cutoff
[17:37:02] https://gerrit.wikimedia.org/r/#/c/426762/
[17:37:03] or.. 30min ago greg repeated a chunk
[17:37:03] Thanks
[17:37:10] from 3 h ago
[17:37:13] and that was added after the group0 rollforward
[17:37:31] Sorry for repeating then
[17:38:14] Krinkle: lets just revert for now, I'm not up-to-speed enough on quibble development, sadly. AFAICT the timeout is set outside of quibble though.
[17:38:42] thcipriani: Yeah, but the old job didn't take so long. Probably has to do with file filters (it's linting too many files)
[17:39:02] It's a clean revert of zuul-only config, so no jenkins jobs changed
[17:40:22] (CR) Thcipriani: [C: 2] Revert quibble-related changes [integration/config] - https://gerrit.wikimedia.org/r/427717 (https://phabricator.wikimedia.org/T192576) (owner: Krinkle)
[17:40:59] Krinkle: I'll deploy ^ once it merges
[17:41:35] (Merged) jenkins-bot: Revert quibble-related changes [integration/config] - https://gerrit.wikimedia.org/r/427717 (https://phabricator.wikimedia.org/T192576) (owner: Krinkle)
[17:43:39] !log Reloading Zuul to deploy https://gerrit.wikimedia.org/r/#/c/427717/
[17:43:41] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[17:43:53] Krinkle: done
[17:44:14] I'll poke hasha.r about it as well
[17:44:55] thcipriani: thx, re-checking jobs now
[17:46:32] Continuous-Integration-Infrastructure, Release-Engineering-Team, Patch-For-Review, Wikimedia-log-errors (Jenkins Failure): Broken quibble jobs fail all mediawiki commits - https://phabricator.wikimedia.org/T192576#4143650 (thcipriani) p:Unbreak!>Normal Deployed a revert of adding quibble...
[17:48:48] antoine should be back before the train (he said)
[18:00:49] !log cleared a bunch of old docker images from integration-slave-docker-1004, freed 4.4GB of space
[18:00:51] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[18:11:25] Continuous-Integration-Infrastructure, Release-Engineering-Team, Patch-For-Review, Wikimedia-log-errors (Jenkins Failure): Broken quibble jobs fail all mediawiki commits - https://phabricator.wikimedia.org/T192576#4143712 (Jdforrester-WMF) >>! In T192576#4143650, @thcipriani wrote: > ```counterex...
[18:30:33] Release-Engineering-Team (Watching / External): tools.wmflabs.org/versions shows incorrect data for group1 - https://phabricator.wikimedia.org/T192577#4143744 (greg) p:Triage>Low (I don't think there's a project for the versions tool, but it's @bd808's baby :) ) As tyler said on IRC, this is due to...
[18:32:56] Deployments, Release-Engineering-Team (Kanban), Operations, Patch-For-Review, Release: Deploy Scap 3.8.0 to production - https://phabricator.wikimedia.org/T192124#4143748 (mmodell) @fgiunchedi: I'm not sure, the commit says it's for py3 transition, but I'm not sure why it matters. @demon, can...
[18:47:02] Release-Engineering-Team (Kanban), Release, Train Deployments: 1.31.0-wmf.30 deployment blockers - https://phabricator.wikimedia.org/T183969#4143805 (thcipriani)
[18:54:32] MediaWiki-Codesniffer: PHPCS should not complain about @covers and @dataProvider being used in traits - https://phabricator.wikimedia.org/T192384#4143826 (Umherirrender) With T191046 (Version 18.0.0) this was fixed, but the trait must follow the naming convention of test classes (ending with Test, TestBase o...
[18:59:42] MediaWiki-Codesniffer: PHPCS should not complain about @covers and @dataProvider being used in traits - https://phabricator.wikimedia.org/T192384#4143834 (daniel) @Umherirrender hm, I'd like to avoid using the "Test" suffix - it's not a runnable test case after all. We use the "TestBase" or "TestCase" ending...
[19:04:51] Continuous-Integration-Infrastructure, Release-Engineering-Team, Patch-For-Review, Wikimedia-log-errors (Jenkins Failure): Broken quibble jobs fail all mediawiki commits - https://phabricator.wikimedia.org/T192576#4143853 (hashar) > No space left on device The Docker slaves have their /var/lib/d...
[19:11:07] Continuous-Integration-Config, Release-Engineering-Team (Kanban), Discovery, Wikidata, and 2 others: Set up user for automatic WDQS GUI builds - https://phabricator.wikimedia.org/T189811#4143877 (mmodell) @smalyshev: Ok that's set up in jenkins: https://integration.wikimedia.org/ci/credentials/st...
[19:11:40] Continuous-Integration-Config, Release-Engineering-Team (Kanban), Discovery, Wikidata, and 2 others: Set up user for automatic WDQS GUI builds - https://phabricator.wikimedia.org/T189811#4143878 (mmodell)
[19:11:54] (PS1) Hashar: Bump composer timeout [integration/quibble] - https://gerrit.wikimedia.org/r/427761 (https://phabricator.wikimedia.org/T192576)
[19:15:45] (CR) Hashar: [C: 2] Bump composer timeout [integration/quibble] - https://gerrit.wikimedia.org/r/427761 (https://phabricator.wikimedia.org/T192576) (owner: Hashar)
[19:16:19] (Merged) jenkins-bot: Bump composer timeout [integration/quibble] - https://gerrit.wikimedia.org/r/427761 (https://phabricator.wikimedia.org/T192576) (owner: Hashar)
[19:18:40] 19:02:41 INFO:zuul.Cloner:Creating repo mediawiki/core from cache /srv/git/mediawiki/core.git
[19:18:41] 19:02:41 DEBUG:git.cmd:AutoInterrupt wait stderr: "fatal: destination path '/src' already exists and is not an empty directory.\n"
[19:18:49] integration-slave-docker-1004
[19:19:21] (PS1) Hashar: docker: quibble 0.0.9 [integration/config] - https://gerrit.wikimedia.org/r/427763 (https://phabricator.wikimedia.org/T192576)
[19:20:21] (PS1) Hashar: Bump quibble jobs to use 0.0.9 [integration/config] - https://gerrit.wikimedia.org/r/427765
[19:21:01] Reedy: job?
[19:21:16] https://integration.wikimedia.org/ci/job/mediawiki-core-php70-phan-docker/8865/
[19:21:17] Reedy: I guess the slave has a dirty workspace somehow
[19:21:34] Reedy: wiped
[19:21:40] cheers
[19:21:42] hmm no
[19:21:46] not yet :)
[19:22:46] !log integration-slave-docker-1004 : wiping all jenkins workspace and all docker images
[19:22:48] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[19:23:40] Continuous-Integration-Infrastructure: integration-slave-docker-1004 out of disk space - https://phabricator.wikimedia.org/T192586#4143911 (Gilles)
[19:23:49] no_justification: gerrit died :(
[19:23:53] Gerrit is down. We're working on bringing it back as soon as possible.
[19:24:02] I logged in ops
[19:24:04] It's not down
[19:24:06] It's restarting ;-)
[19:24:11] ah, just saw your log message hashar :)
[19:24:28] no_justification: YOU HAVE PUT IT DOWN FIX IT !!!!!!!!!!!!!!!!! :]]]]]]]]]]]]
[19:24:38] I don't wanna
[19:24:39] :p
[19:24:48] no_justification yeah I should have read the doc really. Looking at #wikimedia-operations made it obvious it was under a restart :D
[19:28:50] (PS1) Hashar: Use quibble on mediawiki/core@master [2] [integration/config] - https://gerrit.wikimedia.org/r/427767 (https://phabricator.wikimedia.org/T192576)
[19:28:52] (PS1) Hashar: Also use quibble in gate-and-submit [integration/config] - https://gerrit.wikimedia.org/r/427768 (https://phabricator.wikimedia.org/T192576)
[19:36:56] Release-Engineering-Team (Kanban), Release, Train Deployments: 1.31.0-wmf.30 deployment blockers - https://phabricator.wikimedia.org/T183969#4143978 (thcipriani)
[19:38:00] Continuous-Integration-Config, Mobile-Content-Service, Patch-For-Review, Reading-Infrastructure-Team-Backlog (Kanban): Create a CI task for MCS periodic tests - https://phabricator.wikimedia.org/T177896#4143980 (Mholloway) p:Normal>High
[19:38:51] Continuous-Integration-Config, Mobile-Content-Service, Patch-For-Review, Reading-Infrastructure-Team-Backlog (Kanban): Create a CI task for MCS periodic tests - https://phabricator.wikimedia.org/T177896#3674205 (Mholloway) The instability of these tests is a large and ongoing burden.
[19:44:49] (CR) Hashar: [C: 2] Bump quibble jobs to use 0.0.9 [integration/config] - https://gerrit.wikimedia.org/r/427765 (owner: Hashar)
[19:46:21] (Merged) jenkins-bot: Bump quibble jobs to use 0.0.9 [integration/config] - https://gerrit.wikimedia.org/r/427765 (owner: Hashar)
[19:50:27] Continuous-Integration-Infrastructure: integration-slave-docker-1004 out of disk space - https://phabricator.wikimedia.org/T192586#4144026 (thcipriani)
[19:50:33] Continuous-Integration-Infrastructure, Release-Engineering-Team, Patch-For-Review, Wikimedia-log-errors (Jenkins Failure): Broken quibble jobs fail all mediawiki commits - https://phabricator.wikimedia.org/T192576#4144029 (thcipriani)
[19:51:45] Continuous-Integration-Infrastructure: integration-slave-docker-1004 out of disk space - https://phabricator.wikimedia.org/T192586#4143911 (hashar) I have deleted all the workspaces and deleted all the Docker images.
[19:58:27] Continuous-Integration-Infrastructure, Release-Engineering-Team, Patch-For-Review, Wikimedia-log-errors (Jenkins Failure): Broken quibble jobs fail all mediawiki commits - https://phabricator.wikimedia.org/T192576#4144047 (hashar) I have rebuilt the failing job and it passed successfully https://...
[20:03:00] Gerrit, Patch-For-Review: Experiment switching gc back on in gerrit - https://phabricator.wikimedia.org/T190045#4144056 (Paladox) Open>Resolved a:Paladox
[20:03:15] Gerrit: Experiment switching gc back on in gerrit - https://phabricator.wikimedia.org/T190045#4061062 (Paladox)
[20:09:02] (PS2) Hashar: Also use quibble in gate-and-submit [2] [integration/config] - https://gerrit.wikimedia.org/r/427768 (https://phabricator.wikimedia.org/T192576)
[20:21:14] Release-Engineering-Team (Watching / External), Tools: tools.wmflabs.org/versions shows incorrect data for group1 - https://phabricator.wikimedia.org/T192577#4144103 (bd808) >>! In T192577#4143744, @greg wrote: > The version tool apparently just checks the first alphasorted wiki in each group (sensible,...
[20:24:01] Continuous-Integration-Infrastructure, Release-Engineering-Team, Patch-For-Review, Wikimedia-log-errors (Jenkins Failure): Broken quibble jobs fail all mediawiki commits - https://phabricator.wikimedia.org/T192576#4144119 (Jdforrester-WMF) Yay.
[20:55:50] (CR) Hashar: [C: 2] Use quibble on mediawiki/core@master [2] [integration/config] - https://gerrit.wikimedia.org/r/427767 (https://phabricator.wikimedia.org/T192576) (owner: Hashar)
[20:55:54] (CR) Hashar: [C: 2] Also use quibble in gate-and-submit [2] [integration/config] - https://gerrit.wikimedia.org/r/427768 (https://phabricator.wikimedia.org/T192576) (owner: Hashar)
[20:57:14] (Merged) jenkins-bot: Use quibble on mediawiki/core@master [2] [integration/config] - https://gerrit.wikimedia.org/r/427767 (https://phabricator.wikimedia.org/T192576) (owner: Hashar)
[20:57:16] (Merged) jenkins-bot: Also use quibble in gate-and-submit [2] [integration/config] - https://gerrit.wikimedia.org/r/427768 (https://phabricator.wikimedia.org/T192576) (owner: Hashar)
[21:01:21] !log Bringing back quibble (0.0.9), this time with COMPOSER_PROCESS_TIMEOUT=600 | T192576
[21:01:23] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[21:01:24] T192576: Broken quibble jobs fail all mediawiki commits - https://phabricator.wikimedia.org/T192576
[21:08:50] !log rebuilding integration-slave-docker-1002 and integration-slave-docker-1003 ci1.medium > m1.medium (+2G RAM)
[21:08:53] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[21:23:00] !log Pooling in the new integration-slave-docker-1002 and integration-slave-docker-1003
[21:23:01] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[21:24:08] hashar: it seems that the jobqueue at Beta is down. I see you handled this kind of issue in the past. Any idea?
[21:39:50] Hauskatze: no idea. I havent looked at the jobqueue for months and in production it is being replaced entirely as I understand it
[21:40:23] Hauskatze: namely instead of using redis to store the jobs and the mediawiki/services/jobrunner to trigger them, there is a new system using Kafka to spread the jobs and "something" to run them
[21:40:35] but I don't know anything about the new system :^(
[21:42:16] I see kafka errors and redis errors as well
[21:42:32] ie, hashar, globalrename refuses to work there if not forced via a maintenance script
[21:42:40] and I guess other jobs are failing as well
[21:43:42] maybe it is just globalrename being stuck for whatever reason
[21:43:48] Hauskatze: sorry but i cant look into it :(
[21:43:59] ok
[21:44:01] ie: 180419 21:37:12 [Warning] Unsafe statement written to the binary log using statement format since BINLOG_FORMAT = STATEMENT. Statement is unsafe because it accesses a non-transactional table after accessing a transactional table within the same transaction. Statement: UPDATE /* GlobalRenameUserDatabaseUpdates::update */ `globaluser` SET gu_name = 'Spambot~BobFeliciano' WHERE gu_name = 'BobFeliciano'
[21:44:39] I think eddiegp was having a look or knew where this came from
[21:46:00] how is it that it uses a non-transactional table?
[21:46:25] Release-Engineering-Team (Kanban), Scap, Scoring-platform-team, Patch-For-Review: Support git-lfs - https://phabricator.wikimedia.org/T180627#4144423 (awight) @mmodell Do you have an idea how the various git caches will react to the large files? I noticed that the .git/modules/submodules/assets...
[21:49:20] Oh yeah, replacing redis with kafka.
[21:49:58] Beta-Cluster-Infrastructure, JobRunner-Service: GlobalRename refuses to work on Beta Cluster - https://phabricator.wikimedia.org/T192604#4144426 (MarcoAurelio)
[21:50:09] No one updated the docs on wikitech. Not that this should surprise me.
[21:50:38] But that might very well be the reason why the nullJobs I inserted via eval.php never made it to redis.
[21:52:01] Beta-Cluster-Infrastructure, MW-1.32-release-notes (WMF-deploy-2018-04-24 (1.32.0-wmf.1)), Patch-For-Review, Puppet: deployment-prep has jobqueue/caching issues - https://phabricator.wikimedia.org/T192473#4144438 (MarcoAurelio) The error mentioned is gone, thanks. However we still have issues: T1...
[21:52:55] I'm not sure I can do anything to help.
[21:56:31] Me neither - everything I know about Kafka so far is the first sentence of its Wikipedia article.
[21:57:01] maybe the nutcracker fix you did is gone after puppet ran?
[21:57:09] That's not exactly a good position to start troubleshooting from :D
[21:57:47] Umm, no, why would it?
[21:58:08] It'd be insane to have an ensure => absent for that directory.
[21:59:12] puppet do funny things
[22:07:17] eddie@deployment-mediawiki-07:/srv/mediawiki$ mwscript eval.php --wiki=deploymentwiki
[22:07:19] > global $wgJobTypeConf; echo $wgJobTypeConf["default"]["class"];
[22:07:21] JobQueueEventBus
[22:07:53] Yeah, the job queue doesn't even use redis any more in beta. 'JobQueueEventBus' means it's using kafka.
[22:09:20] Wow, that was a waste of time today. :D
[22:10:51] eval.php runs in the global scope
[22:13:23] Reedy: That means...? It'll not respect labs-specific settings?
[22:13:35] no
[22:13:43] you don't need to do global $wgFoobar; first
[22:14:41] Ah, yeah, that makes way more sense than what I thought you'd mean.
[22:15:17] * Hauskatze looks for kafka on logstash
[22:18:19] nothing useful afaics, only level=info messages
[22:20:00] though on cassandra, lots of error/service-runner/master, worker 14724 died (1), restarting.
[22:22:23] Continuous-Integration-Config, ORES, Scoring-platform-team: Daily build integration test to prove that ORES makefiles are sane - https://phabricator.wikimedia.org/T192606#4144488 (awight)
[22:48:46] Release-Engineering-Team (Kanban), Release, Train Deployments: 1.31.0-wmf.30 deployment blockers - https://phabricator.wikimedia.org/T183969#4144555 (thcipriani)
[22:52:19] Release-Engineering-Team (Kanban), Scap, Scoring-platform-team, Patch-For-Review: Support git-lfs - https://phabricator.wikimedia.org/T180627#4144579 (mmodell) @awight: hrm, well, no, the whole point of git-lfs is to avoid that! AFAIK it doesn't keep stuff around in git cache because the files' c...
[22:55:05] thcipriani: reverting from wmf.30 is probably best bet for now, it's going to take a bit and would delay train till monday at least to figure it out: https://gerrit.wikimedia.org/r/427827
[22:55:44] will figure out how to reproduce and get it fixed in the next train
[22:57:56] wmf.30 is reverted for now (merging the reverts at the moment). Ideally we'd backport whatever fixes you all make since the next train coming will likely have its own set of issues.
[23:03:21] +1
[23:16:36] Beta-Cluster-Infrastructure, MW-1.32-release-notes (WMF-deploy-2018-04-24 (1.32.0-wmf.1)), Patch-For-Review, Puppet: deployment-prep has jobqueue/caching issues - https://phabricator.wikimedia.org/T192473#4144668 (EddieGP) p:Low>High Indeed, the jobqueue on beta is still broken, although...
[23:16:56] Deployments, Release-Engineering-Team (Kanban), Operations, Patch-For-Review, Release: Deploy Scap 3.8.0 to production - https://phabricator.wikimedia.org/T192124#4144670 (demon) That was part of that commit. I was kinda following the example set by the conftool package. If this is problemati...