[00:17:14] 10Deployments, 10Phabricator, 10DBA: Mysql Access denied to 'phadmin'@'10.64.0.198' - https://phabricator.wikimedia.org/T198367#4320781 (10mmodell) [00:19:02] 10Deployments, 10Phabricator, 10DBA: Mysql Access denied to 'phadmin'@'10.64.0.198' - https://phabricator.wikimedia.org/T198367#4320793 (10mmodell) p:05Triage>03High Marking high priority because there are security related patches that need to be deployed asap. [00:25:18] 10Deployments, 10Phabricator, 10DBA: Mysql Access denied to 'phadmin'@'10.64.0.198' - https://phabricator.wikimedia.org/T198367#4320781 (10Paladox) that user + ip are in the grants https://github.com/wikimedia/puppet/blob/a7883098b37ecb23d1291d404857a8105a193fee/modules/role/templates/mariadb/grants/producti... [01:03:04] 10Deployments, 10Phabricator, 10DBA: Mysql Access denied to 'phadmin'@'10.64.0.198' - https://phabricator.wikimedia.org/T198367#4320867 (10mmodell) Well maybe the grants didn't get applied properly to the new database master? I'm really not sure. [01:39:28] (03CR) 10Prtksxna: "Thanks Hashar :)" [integration/config] - 10https://gerrit.wikimedia.org/r/442266 (owner: 10Prtksxna) [04:35:02] 10Deployments, 10Phabricator, 10DBA: Mysql Access denied to 'phadmin'@'10.64.0.198' - https://phabricator.wikimedia.org/T198367#4320781 (10Marostegui) Can you try again? [05:29:25] 10Deployments, 10Phabricator, 10DBA: Mysql Access denied to 'phadmin'@'10.64.0.198' - https://phabricator.wikimedia.org/T198367#4320947 (10Marostegui) This seems to be working for me from phab1001, but I would like to get your confirmation: ``` root@phab1001:~# mysql --skip-ssl -hdb1072 -p -uphadmin -e "sele... [05:32:31] 10Deployments, 10Phabricator, 10DBA: Mysql Access denied to 'phadmin'@'10.64.0.198' - https://phabricator.wikimedia.org/T198367#4320964 (10Marostegui) Also works from phab1002: ``` root@phab1002:~# mysql --skip-ssl -hdb1072 -p -uphadmin -e "select now()" Enter password: +---------------------+ | now()... 
[06:10:41] RECOVERY - Free space - all mounts on deployment-kafka-jumbo-2 is OK: OK: deployment-prep.deployment-kafka-jumbo-2.diskspace._mnt_kafka.byte_percentfree (No valid datapoints found) [06:17:21] RECOVERY - Free space - all mounts on deployment-kafka-jumbo-1 is OK: OK: deployment-prep.deployment-kafka-jumbo-1.diskspace._mnt_kafka.byte_percentfree (More than half of the datapoints are undefined) [06:19:13] PROBLEM - Free space - all mounts on deployment-tin is CRITICAL: CRITICAL: deployment-prep.deployment-tin.diskspace._mnt.byte_percentfree (No valid datapoints found)deployment-prep.deployment-tin.diskspace._srv.byte_percentfree (<11.11%) [06:49:50] PROBLEM - Puppet errors on integration-slave-docker-1009 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [07:06:08] 10Release-Engineering-Team (Watching / External), 10DBA, 10Datasets-General-or-Unknown, 10Patch-For-Review, and 2 others: Automate the check and fix of object, schema and data drifts between mediawiki HEAD, production masters and slaves - https://phabricator.wikimedia.org/T104459#4320989 (10jcrespo) @Ladsg... [07:08:45] 10Release-Engineering-Team (Kanban), 10Patch-For-Review, 10Release, 10Train Deployments: 1.32.0-wmf.10 deployment blockers - https://phabricator.wikimedia.org/T191056#4092272 (10jcrespo) I think commons data loss occurring from Wednesday could be related to train? 
T198177 [07:24:50] RECOVERY - Puppet errors on integration-slave-docker-1009 is OK: OK: Less than 1.00% above the threshold [0.0] [07:49:53] 10Continuous-Integration-Infrastructure, 10Cloud-VPS, 10Wikimedia-log-errors (Shared Build Failure): CI jobs takes too long / instances overloaded - https://phabricator.wikimedia.org/T198348#4321061 (10hashar) [07:50:09] 10Continuous-Integration-Infrastructure, 10Cloud-VPS, 10Wikimedia-log-errors (Shared Build Failure): CI jobs takes too long / instances overloaded - https://phabricator.wikimedia.org/T198348#4320116 (10hashar) Looks like the issue is `npm install` issuing `npm ERR! registry error parsing json` and idling/del... [08:13:38] (03PS1) 10Hashar: Use npm --verbose on some Quibble jobs [integration/config] - 10https://gerrit.wikimedia.org/r/442789 (https://phabricator.wikimedia.org/T198348) [08:15:20] (03CR) 10Hashar: [C: 032] Use npm --verbose on some Quibble jobs [integration/config] - 10https://gerrit.wikimedia.org/r/442789 (https://phabricator.wikimedia.org/T198348) (owner: 10Hashar) [08:16:39] (03Merged) 10jenkins-bot: Use npm --verbose on some Quibble jobs [integration/config] - 10https://gerrit.wikimedia.org/r/442789 (https://phabricator.wikimedia.org/T198348) (owner: 10Hashar) [08:25:16] (03PS1) 10Hashar: Use a different job for npm --verbose [integration/config] - 10https://gerrit.wikimedia.org/r/442791 (https://phabricator.wikimedia.org/T198348) [08:25:46] (03CR) 10Hashar: [C: 032] Use a different job for npm --verbose [integration/config] - 10https://gerrit.wikimedia.org/r/442791 (https://phabricator.wikimedia.org/T198348) (owner: 10Hashar) [08:27:05] (03Merged) 10jenkins-bot: Use a different job for npm --verbose [integration/config] - 10https://gerrit.wikimedia.org/r/442791 (https://phabricator.wikimedia.org/T198348) (owner: 10Hashar) [08:48:47] 10Continuous-Integration-Infrastructure, 10Patch-For-Review, 10Wikimedia-log-errors (Shared Build Failure): CI jobs takes too long / instances overloaded - 
https://phabricator.wikimedia.org/T198348#4321111 (10hashar) A build using `npm install --verbose` is https://integration.wikimedia.org/ci/job/mediawiki-... [08:54:23] dcausse: just noticed you do not have swat deployer phab badge, resolved :D https://phabricator.wikimedia.org/people/badges/3157/ [08:54:36] zeljkof: thanks! :) [08:56:43] zeljkof: I am in a massive debugging session with CI / npm causing havoc :( [08:56:58] so I am gonna skip this morning, but we can chat this afternnoon [08:57:05] (03PS1) 10Hashar: Remove npm --verbose [integration/config] - 10https://gerrit.wikimedia.org/r/442799 (https://phabricator.wikimedia.org/T198348) [08:57:09] hashar: sure, I do need help :D [08:57:17] (03CR) 10Hashar: [C: 032] Remove npm --verbose [integration/config] - 10https://gerrit.wikimedia.org/r/442799 (https://phabricator.wikimedia.org/T198348) (owner: 10Hashar) [08:58:37] (03Merged) 10jenkins-bot: Remove npm --verbose [integration/config] - 10https://gerrit.wikimedia.org/r/442799 (https://phabricator.wikimedia.org/T198348) (owner: 10Hashar) [09:06:05] 10Continuous-Integration-Infrastructure, 10Patch-For-Review, 10Wikimedia-log-errors (Shared Build Failure): CI jobs takes too long / instances overloaded - https://phabricator.wikimedia.org/T198348#4321130 (10hashar) With system time (UTC): ``` 08:30:19 npm verb afterAdd /cache/npm/to-array/0.1.4/package/pac... [09:11:03] (03PS1) 10Mglaser: Add test dependencies for some BlueSpice extensions [integration/config] - 10https://gerrit.wikimedia.org/r/442803 [09:12:51] (03PS1) 10Hashar: Campaigns depends on EventLogging [integration/config] - 10https://gerrit.wikimedia.org/r/442805 [09:16:14] 10Deployments, 10Phabricator, 10DBA: Mysql Access denied to 'phadmin'@'10.64.0.198' - https://phabricator.wikimedia.org/T198367#4321166 (10mmodell) the upgrade script is using m3-master.eqiad.wmnet. When I try that: ``` twentyafterfour@phab1001:/srv/phab$ mysql --skip-ssl -h m3-master.eqiad.wmnet -P3323 -p... 
[09:19:00] (03PS2) 10Hashar: Campaigns depends on EventLogging [integration/config] - 10https://gerrit.wikimedia.org/r/442805 (https://phabricator.wikimedia.org/T198378) [09:19:13] (03CR) 10Hashar: [C: 032] Campaigns depends on EventLogging [integration/config] - 10https://gerrit.wikimedia.org/r/442805 (https://phabricator.wikimedia.org/T198378) (owner: 10Hashar) [09:20:34] (03Merged) 10jenkins-bot: Campaigns depends on EventLogging [integration/config] - 10https://gerrit.wikimedia.org/r/442805 (https://phabricator.wikimedia.org/T198378) (owner: 10Hashar) [09:27:15] 10Deployments, 10Phabricator, 10DBA: Mysql Access denied to 'phadmin'@'10.64.0.198' - https://phabricator.wikimedia.org/T198367#4320781 (10jcrespo) Can you try now, I have a theory. [09:28:05] 10Beta-Cluster-Infrastructure, 10Analytics, 10User-Elukey: Disk usage on deployment-kafa-jumbo-* causing alerts - https://phabricator.wikimedia.org/T198262#4321228 (10elukey) [09:29:52] 10Beta-Cluster-Infrastructure, 10Analytics, 10User-Elukey: Disk usage on deployment-kafa-jumbo-* causing alerts - https://phabricator.wikimedia.org/T198262#4317436 (10elukey) This happens periodically and affects people testing on deployment-eventlog05 (that pulls from jumbo hosts). I just moved /srv/kafka t... [09:44:42] 10Deployments, 10Phabricator, 10DBA: Mysql Access denied to 'phadmin'@'10.64.0.198' - https://phabricator.wikimedia.org/T198367#4321265 (10jcrespo) Yeah, let's elevate this to a security incident: ``` $ mysql -h m3-master.eqiad.wmnet -P 3306 -uphadmin -e "select now()" -p Enter password: +-----------------... 
[09:52:49] The selenium tests for UW are failing, seemingly for no reason https://gerrit.wikimedia.org/r/#/q/status:open+project:mediawiki/extensions/UploadWizard [09:53:44] (for the 5 most recent patches) [09:56:13] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10MediaWiki-User-preferences, 10Patch-For-Review, and 2 others: Selenium "User should be able to change preferences" test flaky - https://phabricator.wikimedia.org/T198137#4321314 (10zeljkofilipin) I took a look at [[ https://integration.wikimedi... [10:06:33] do we still use deployment-tin on Beta or there has been a server switch too? [10:07:14] Phab down. 503. (Maybe intentional?) [10:07:22] ah, it is. okay! [10:07:24] sorry [10:18:44] andre__: some issues, being investigated by MM on -operations [10:22:39] (03PS1) 10Hashar: Migrate Campaigns to Quibble [integration/config] - 10https://gerrit.wikimedia.org/r/442823 (https://phabricator.wikimedia.org/T183512) [10:23:05] (03CR) 10Hashar: [C: 032] Migrate Campaigns to Quibble [integration/config] - 10https://gerrit.wikimedia.org/r/442823 (https://phabricator.wikimedia.org/T183512) (owner: 10Hashar) [10:24:27] (03Merged) 10jenkins-bot: Migrate Campaigns to Quibble [integration/config] - 10https://gerrit.wikimedia.org/r/442823 (https://phabricator.wikimedia.org/T183512) (owner: 10Hashar) [10:34:37] 10Continuous-Integration-Infrastructure (shipyard), 10Release-Engineering-Team (Kanban), 10releng-201718-q3, 10Epic, 10Patch-For-Review: [EPIC] Migrate Mediawiki jobs from Nodepool to Docker - https://phabricator.wikimedia.org/T183512 (10hashar) [10:41:41] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10MediaWiki-User-preferences, 10Patch-For-Review, and 2 others: Selenium "User should be able to change preferences" test flaky - https://phabricator.wikimedia.org/T198137 (10WMDE-leszek) @Pablo-WMDE and me tried a bit more to make some sense fro... 
[10:44:00] Project mwext-phpunit-coverage-publish build #5988: 04FAILURE in 50 sec: https://integration.wikimedia.org/ci/job/mwext-phpunit-coverage-publish/5988/ [11:03:31] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10MediaWiki-User-preferences, 10Patch-For-Review, and 2 others: Selenium "User should be able to change preferences" test flaky - https://phabricator.wikimedia.org/T198137 (10zeljkofilipin) p:05Unbreak!>03High >>! In T198137#4321314, @zeljkof... [11:08:56] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10MediaWiki-User-preferences, 10Patch-For-Review, and 2 others: Selenium "User should be able to change preferences" test flaky - https://phabricator.wikimedia.org/T198137 (10zeljkofilipin) [11:09:13] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10MediaWiki-User-preferences, 10User-zeljkofilipin, 10Wikimedia-log-errors (Shared Build Failure): Selenium "User should be able to change preferences" test flaky - https://phabricator.wikimedia.org/T198137 (10zeljkofilipin) [11:09:40] 10Release-Engineering-Team (Watching / External), 10DBA, 10Datasets-General-or-Unknown, 10Patch-For-Review, and 2 others: Automate the check and fix of object, schema and data drifts between mediawiki HEAD, production masters and slaves - https://phabricator.wikimedia.org/T104459 (10Ladsgroup) >>! In T1044... [11:09:45] Yippee, build fixed! 
[11:09:46] Project mwext-phpunit-coverage-publish build #5989: 09FIXED in 1 min 25 sec: https://integration.wikimedia.org/ci/job/mwext-phpunit-coverage-publish/5989/ [11:11:42] 10Release-Engineering-Team, 10MediaWiki-Core-Tests, 10Epic, 10Tracking, 10User-zeljkofilipin: Selenium framework improvements - https://phabricator.wikimedia.org/T182986 (10zeljkofilipin) [11:11:44] 10Release-Engineering-Team, 10MediaWiki-Core-Tests, 10Epic, 10MW-1.31-release-notes (WMF-deploy-2018-02-27 (1.31.0-wmf.23)), and 3 others: Q3 Selenium framework improvements - https://phabricator.wikimedia.org/T182421 (10zeljkofilipin) 05Open>03Resolved a:03zeljkofilipin Resolving this task, since Q3... [11:13:01] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Q1 Selenium framework improvements - https://phabricator.wikimedia.org/T198389 (10zeljkofilipin) [11:13:50] 10Release-Engineering-Team, 10MediaWiki-Core-Tests, 10Epic, 10Tracking, 10User-zeljkofilipin: Selenium framework improvements - https://phabricator.wikimedia.org/T182986 (10zeljkofilipin) a:03zeljkofilipin [11:15:25] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Q1 Selenium framework improvements - https://phabricator.wikimedia.org/T198389 (10zeljkofilipin) [11:15:36] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Q1 Selenium framework improvements - https://phabricator.wikimedia.org/T198389 (10zeljkofilipin) p:05Triage>03Normal [11:17:42] 10Release-Engineering-Team (Someday), 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Someday/maybe Selenium framework improvements - https://phabricator.wikimedia.org/T190995 (10zeljkofilipin) a:03zeljkofilipin [11:18:10] !log Ran namespaceDupes for eswiki and eswikibooks following namespace changes on JADE. 
[11:18:12] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [11:19:38] 10Release-Engineering-Team, 10MediaWiki-Core-Tests, 10Epic, 10Tracking, 10User-zeljkofilipin: Selenium framework improvements - https://phabricator.wikimedia.org/T182986 (10zeljkofilipin) [11:19:41] 10Release-Engineering-Team, 10MediaWiki-Core-Tests, 10MediaWiki-Vagrant, 10User-zeljkofilipin: Running selenium inside Vagrant with xvfb or X11 does not work - https://phabricator.wikimedia.org/T196646 (10zeljkofilipin) [11:20:28] 10Release-Engineering-Team, 10MediaWiki-Core-Tests, 10Epic, 10Tracking, 10User-zeljkofilipin: Selenium framework improvements - https://phabricator.wikimedia.org/T182986 (10zeljkofilipin) [11:21:22] 10Release-Engineering-Team, 10MediaWiki-Core-Tests, 10Epic, 10Tracking, 10User-zeljkofilipin: Selenium framework improvements - https://phabricator.wikimedia.org/T182986 (10zeljkofilipin) [11:21:24] 10Release-Engineering-Team (Someday), 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Someday/maybe Selenium framework improvements - https://phabricator.wikimedia.org/T190995 (10zeljkofilipin) [11:24:24] 10Release-Engineering-Team (Kanban), 10Patch-For-Review, 10User-zeljkofilipin: Run Selenium tests in CI for extensions - https://phabricator.wikimedia.org/T164721 (10zeljkofilipin) a:03zeljkofilipin [11:24:40] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10Patch-For-Review, 10User-zeljkofilipin: Run Selenium tests in CI for extensions - https://phabricator.wikimedia.org/T164721 (10zeljkofilipin) [11:30:58] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10MediaWiki-User-preferences, 10User-zeljkofilipin, 10Wikimedia-log-errors (Shared Build Failure): Selenium "User should be able to change preferences" test flaky - https://phabricator.wikimedia.org/T198137 (10zeljkofilipin) [11:31:39] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10Multimedia, 10UploadWizard, and 2 others: UploadWizard: Selenium tests failing 
- https://phabricator.wikimedia.org/T198384 (10zeljkofilipin) p:05Triage>03High a:03zeljkofilipin [11:32:58] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10MediaWiki-User-preferences, 10User-zeljkofilipin, 10Wikimedia-log-errors (Shared Build Failure): Selenium "User should be able to change preferences" test flaky - https://phabricator.wikimedia.org/T198137 (10zeljkofilipin) [11:38:50] 10Release-Engineering-Team, 10MediaWiki-General-or-Unknown, 10Epic, 10MW-1.29-release-notes, and 3 others: Port Selenium tests from Ruby to Node.js - https://phabricator.wikimedia.org/T139740 (10zeljkofilipin) [11:39:03] 10Release-Engineering-Team, 10MediaWiki-General-or-Unknown, 10Epic, 10MW-1.29-release-notes, and 3 others: Port Selenium tests from Ruby to Node.js - https://phabricator.wikimedia.org/T139740 (10zeljkofilipin) a:03zeljkofilipin [11:39:24] 10Release-Engineering-Team, 10MediaWiki-Core-Tests, 10Epic, 10MW-1.29-release-notes, and 3 others: Port Selenium tests from Ruby to Node.js - https://phabricator.wikimedia.org/T139740 (10zeljkofilipin) [11:42:34] 10Continuous-Integration-Infrastructure, 10Patch-For-Review, 10Wikimedia-log-errors (Shared Build Failure): CI jobs takes too long / instances overloaded - https://phabricator.wikimedia.org/T198348 (10hashar) For the npm side: `npm-registry-client` relies on `request` which creates an `http.ClientRequest`. O... [11:44:21] DanielK_WMDE: o/ [11:44:27] noone will really be around right now though [11:44:36] but the scrollback will be useful of course [11:45:13] yea. 
I'll prepare a patch, we can discuss restarting the train later [11:46:11] I'm wondering if we did already see this on beta DanielK_WMDE [11:46:21] so you remember the slow transaction logs I pasted at some point [11:46:27] * addshore scrolls away to go and find them [11:47:57] 10Release-Engineering-Team, 10MediaWiki-Core-Tests, 10Epic, 10MW-1.29-release-notes, and 3 others: Port Selenium tests from Ruby to Node.js - https://phabricator.wikimedia.org/T139740 (10zeljkofilipin) [11:48:07] It was https://www.irccloud.com/pastebin/Vgsxk1mR/, but nothing relating to comments [11:52:28] addshore: i'm noting we are doing runAtomicSection inside another atomic block. I'm now comparing that to the old code. [11:54:44] naw, that didn't change [11:55:37] 10Release-Engineering-Team, 10MediaWiki-Core-Tests, 10Epic, 10MW-1.29-release-notes, and 3 others: Port Selenium tests from Ruby to Node.js - https://phabricator.wikimedia.org/T139740 (10zeljkofilipin) [11:56:32] 10Release-Engineering-Team, 10MediaWiki-Core-Tests, 10Epic, 10MW-1.29-release-notes, and 3 others: Port Selenium tests from Ruby to Node.js - https://phabricator.wikimedia.org/T139740 (10zeljkofilipin) [11:56:59] tgr: I'm trying to wrap my head around the lock retention issue. I have to admit that i'm blurry on the details of the locking mechanism. [11:57:33] My understanding is that we are talking about locks on individual tables (or sets of rows on individual tables) [11:57:41] but we are seeing errors on several tables. 
[12:00:26] 10Release-Engineering-Team, 10MediaWiki-Core-Tests, 10Epic, 10MW-1.29-release-notes, and 3 others: Port Selenium tests from Ruby to Node.js - https://phabricator.wikimedia.org/T139740 (10zeljkofilipin) [12:04:12] PROBLEM - Free space - all mounts on deployment-tin is CRITICAL: CRITICAL: deployment-prep.deployment-tin.diskspace._mnt.byte_percentfree (No valid datapoints found)deployment-prep.deployment-tin.diskspace._srv.byte_percentfree (<11.11%) [12:07:25] DanielK_WMDE: going to lunch, back in a bit [12:09:09] 10Release-Engineering-Team, 10MediaWiki-Core-Tests, 10Epic, 10MW-1.29-release-notes, and 3 others: Port Selenium tests from Ruby to Node.js - https://phabricator.wikimedia.org/T139740 (10zeljkofilipin) [12:10:09] 10Release-Engineering-Team, 10MediaWiki-Core-Tests, 10Epic, 10MW-1.29-release-notes, and 3 others: Port Selenium tests from Ruby to Node.js - https://phabricator.wikimedia.org/T139740 (10zeljkofilipin) [12:10:32] (03PS1) 10Phedenskog: Enable performance/bttostatsv. [integration/config] - 10https://gerrit.wikimedia.org/r/442833 (https://phabricator.wikimedia.org/T196600) [12:10:57] 10Phabricator, 10Release-Engineering-Team (Kanban), 10media-storage, 10Patch-For-Review: Connect Phabricator to swift for storage of git-lfs and file uploads. - https://phabricator.wikimedia.org/T182085 (10mmodell) @fgiunchedi: * Currently all file uploads get stored in mysql database tables, not locally... 
[12:11:27] 10Release-Engineering-Team, 10MediaWiki-Core-Tests, 10Epic, 10MW-1.29-release-notes, and 3 others: Port Selenium tests from Ruby to Node.js - https://phabricator.wikimedia.org/T139740 (10zeljkofilipin) [12:58:14] 10Release-Engineering-Team, 10MediaWiki-Core-Tests, 10Epic, 10MW-1.29-release-notes, and 3 others: Port Selenium tests from Ruby to Node.js - https://phabricator.wikimedia.org/T139740 (10zeljkofilipin) [13:03:38] 10Release-Engineering-Team (Watching / External), 10DBA, 10Datasets-General-or-Unknown, 10Patch-For-Review, and 2 others: Automate the check and fix of object, schema and data drifts between mediawiki HEAD, production masters and slaves - https://phabricator.wikimedia.org/T104459 (10Marostegui) Thanks Amir... [13:15:32] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10Patch-For-Review, 10User-zeljkofilipin: Run Selenium tests in CI for extensions - https://phabricator.wikimedia.org/T164721 (10zeljkofilipin) [13:18:33] 10Release-Engineering-Team, 10MediaWiki-Core-Tests, 10Epic, 10MW-1.29-release-notes, and 3 others: Port Selenium tests from Ruby to Node.js - https://phabricator.wikimedia.org/T139740 (10zeljkofilipin) [13:19:18] DanielK_WMDE: spot anything? [13:19:49] addshore: no. it's very likely *something* in the MCR patches. But I have no idea what [13:19:58] :D [13:20:05] may be PageUpdater. May be RevisionStore. [13:20:16] I'd like to at least narrow it down to one of the two... [13:20:36] What logging could we add to help? [13:20:37] best I could do so far is to prepare https://gerrit.wikimedia.org/r/c/mediawiki/core/+/442834 [13:21:04] hm... [13:21:12] there's the trx profiler stuff [13:21:36] i have never used it, but it may tell us which transactions are open for (too) long [13:21:45] do you know how to use that? [13:21:52] nope *looks* [13:22:45] "Helper class that detects high-contention DB queries via profiling calls" [13:22:52] that sounds like what we want, right? 
[13:23:49] 10Release-Engineering-Team, 10MediaWiki-Core-Tests, 10Epic, 10MW-1.29-release-notes, and 3 others: Port Selenium tests from Ruby to Node.js - https://phabricator.wikimedia.org/T139740 (10zeljkofilipin) [13:25:40] twentyafterfour hmm i am not sure why https://phabricator.wikimedia.org/T198367 was hidden? It was public early this morning then suddenly it's now private and someone removed me. [13:26:08] addshore: AaronSchulz will be the expert on that thing. Looks like Database always has a trxProfiler, but the default instance uses a NullLogger. [13:26:16] No idea how this is configured in production [13:26:26] no idea if it's safe to enable it, either [13:33:39] So Looking at logstash some more it does look like there were a few "DB transaction writes or callbacks still pending" once the branch landed on group0 [13:33:52] DanielK_WMDE, addshore: one thing that's surprising to me is that there isn't a single error for the two weeks MCR was on group0 [13:35:29] but they are actually for .8 not .999 anyway [13:35:49] tgr: i suspect it's a matter of load. number of concurrent edits. [13:36:27] probably, all the errors are from Commons and Wikidata [13:37:33] that and maybe patrol patterns (creation of a wrong article followed very quickly by fixes/deletes which might sometimes conflict) [13:38:15] addshore: $trxProfiler->setLogger( LoggerFactory::getInstance( 'DBPerformance' ) ); [13:38:32] addshore: is DBPerformance logging enabled? Can you see if that logger is emitting anything? [13:39:23] DanielK_WMDE: it is https://logstash.wikimedia.org/goto/54fa011df559b8abff337b498b8957c5 [13:39:41] DanielK_WMDE: jcrespo collected long-running queries to P7314, FWIW [13:40:07] tgr: I'm more interested in long-running transactions with locks... [13:40:40] sub optimal transactions https://logstash.wikimedia.org/goto/247f41bb8e8ff2532151dbf0b33235f0 [13:40:47] those that *got* locks, not those that *tried* to get locks.. 
[13:41:50] ideally the log system would just record what process it got locked on [13:42:00] not sure how well MySQL supports that [13:43:07] we get a bunch of dbperf warnings all the time, any warnings that don't show a spike around the deploy are probably unrelated [13:44:08] this is just around the deployment time, and now only for commons and wikidata wiki https://logstash.wikimedia.org/goto/994abd77df739d0034cc7d0d55a31fef [13:45:14] Sub-optimal transaction on DB(s) [10.64.48.23 (commonswiki) (TRX#530829)]: [13:45:14] 0 15.435372 query-m: INSERT IGNORE INTO `page` (page_namespace,page_title,page_restrictions,page_is_redirect,page_is_new,page_random,page_touched,page_latest,page_len) VALUES ('X') [TRX#530829] [13:45:18] lovely 15 second insert into page [13:45:52] addshore: so what's the window we are looking at? When was this deployed to group1? [13:46:06] I'm seeing a spike here https://logstash.wikimedia.org/app/kibana#/dashboard/mediawiki-errors?_g=h@a816759&_a=h@2fa4f9d [13:46:20] 18:30 utc yesterday [13:46:27] DanielK_WMDE: yesterday 19:15 to 19:55 ish [13:46:36] utc? 
[13:46:44] so the super confusing thing for me is that most lock timeouts are on revision_comment_temp [13:46:58] DanielK_WMDE: to share logstash dashboard you have to click share in the top right [13:47:28] meh [13:47:32] https://usercontent.irccloud-cdn.com/file/KP3TTJ0V/image.png [13:47:33] DanielK_WMDE: ^^ [13:47:37] https://logstash.wikimedia.org/app/kibana#/dashboard/mediawiki-errors?_g=(refreshInterval%3A(display%3AOff%2Cpause%3A!f%2Cvalue%3A0)%2Ctime%3A(from%3A'2018-06-27T19%3A10%3A20.575Z'%2Cmode%3Aabsolute%2Cto%3A'2018-06-27T20%3A09%3A34.643Z')) [13:47:38] which just should not be possible as far as I can tell from the code, by the time you get to that table there was a lock on the page row ensuring no one else is messing with it [13:47:53] no, this one https://logstash.wikimedia.org/app/kibana#/dashboard/mediawiki-errors?_g=(refreshInterval:(display:Off,pause:!f,value:0),time:(from:'2018-06-27T19:10:20.575Z',mode:absolute,to:'2018-06-27T20:09:34.643Z'))&_a=(description:'',filters:!(('$state':(store:appState),meta:(alias:!n,disabled:!f,index:'logstash-*',key:_type,negate:!f,type:phrase,value:mediawiki),query:(match:(_type:(query:mediawiki)))),('$state':(store:appState), [13:47:55] meta:(alias:!n,disabled:!f,index:'logstash-*',key:channel.raw,negate:!f,type:phrase,value:DBPerformance),query:(match:(channel.raw:(query:DBPerformance,type:phrase)))),('$state':(store:appState),meta:(alias:!n,disabled:!f,index:'logstash-*',key:normalized_message.raw,negate:!f,type:phrase,value:'Sub-optimal+transaction+on+DB(s)+%5B%7Bdbs%7D%5D:+%0A%7Btrace%7D'),query:(match:(normalized_message.raw:(query:'Sub-optimal+transaction+on+DB( [13:47:56] 
s)+%5B%7Bdbs%7D%5D:+%0A%7Btrace%7D',type:phrase))))),options:(darkTheme:!f),panels:!((col:7,id:Event-Level,panelIndex:3,row:3,size_x:3,size_y:3,type:visualization),(col:10,id:Top-Domains,panelIndex:5,row:6,size_x:3,size_y:3,type:visualization),(col:1,id:Trending-Messages,panelIndex:6,row:9,size_x:12,size_y:4,type:visualization),(col:10,id:MediaWiki-Versions,panelIndex:8,row:3,size_x:3,size_y:3,type:visualization),(col:1,id:Trending- [13:47:58] Backtrace-File,panelIndex:10,row:13,size_x:12,size_y:3,type:visualization),(col:1,id:Top-Channels-table,panelIndex:11,row:3,size_x:6,size_y:3,type:visualization),(col:1,id:Top-Hosts-table,panelIndex:12,row:6,size_x:6,size_y:3,type:visualization),(col:7,id:Top-Wikis-table,panelIndex:13,row:6,size_x:3,size_y:3,type:visualization),(col:1,id:Events-Over-Time,panelIndex:14,row:1,size_x:12,size_y:2,type:visualization),(col:1,columns:!(level, [13:47:59] channel,host,wiki,message),id:MediaWiki-Events-List,panelIndex:15,row:16,size_x:12,size_y:16,sort:!('@timestamp',desc),type:search)),query:(match_all:()),timeRestore:!f,title:mediawiki-errors,uiState:(P-10:(vis:(params:(sort:(columnIndex:!n,direction:!n)))),P-11:(vis:(params:(sort:(columnIndex:!n,direction:!n)))),P-12:(vis:(params:(sort:(columnIndex:!n,direction:!n)))),P-13:(vis:(params:(sort:(columnIndex:!n,direction:!n)))),P-14:(vis:( [13:48:01] legendOpen:!f)),P-3:(spy:(mode:(fill:!f,name:!n)),vis:(legendOpen:!f)),P-5:(spy:(mode:(fill:!f,name:!n)),vis:(legendOpen:!f,params:(sort:(columnIndex:!n,direction:!n)))),P-6:(spy:(mode:(fill:!f,name:!n)),vis:(params:(sort:(columnIndex:!n,direction:!n))))),viewMode:view) [13:48:01] DanielK_WMDE: shorturl! :P [13:48:03] gah! 
[13:48:04] sorry :/ [13:48:08] *sigh* [13:48:14] I'm such a noob ;) [13:48:15] 10Release-Engineering-Team, 10MediaWiki-Core-Tests, 10Epic, 10MW-1.29-release-notes, and 3 others: Port Selenium tests from Ruby to Node.js - https://phabricator.wikimedia.org/T139740 (10zeljkofilipin) [13:48:27] honestly, i couldn't even find the right log in this thing. [13:49:06] wait... [13:49:19] query-m: INSERT INTO `text` (old_id,old_text,old_flags) VALUES (NULL,'X') [TRX#fdf0c9] [13:50:21] * addshore listens [13:50:41] why is this writing NULL into old_text? [13:51:35] maybe it isn't, looks like the VALUES bit isn't right in the log [13:51:44] could that actually be confusing logging? and NULL is for old_id for auto inc? and old_flags are not there or something similar? [13:52:12] probably the log sanitizer messing up [13:52:16] anyway, blobstore has been in for ages [13:53:08] uh, which wikis are on what group again? where can i see that? [13:53:30] https://tools.wmflabs.org/versions/ (click on each group) [13:53:55] or rather on the branch name just below the group [13:54:06] 10Release-Engineering-Team, 10MediaWiki-Core-Tests, 10Epic, 10MW-1.29-release-notes, and 3 others: Port Selenium tests from Ruby to Node.js - https://phabricator.wikimedia.org/T139740 (10zeljkofilipin) [13:54:29] It could be something touched in between .999 and .10 also, perhaps we should look at the logs for .10 landing on group0 [13:54:36] * addshore has a call in 5 mins so will run away [13:54:39] i'm seeing a ton of these for commonswiki [13:55:00] it's about 80% commonswiki, 10% wikidatawiki [13:55:07] none of the smaller ones [13:55:16] clearly something requiring significant load [13:55:38] i'm seeing writes to blobs_cluster2N in the same transaction as the revision table insert [13:55:54] that's... odd? [13:56:06] I thought the blob tables are on different db servers? 
[14:00:06] 10Release-Engineering-Team, 10MediaWiki-Core-Tests, 10Epic, 10MW-1.29-release-notes, and 3 others: Port Selenium tests from Ruby to Node.js - https://phabricator.wikimedia.org/T139740 (10zeljkofilipin) [14:02:00] 10Release-Engineering-Team, 10MediaWiki-Core-Tests, 10Epic, 10MW-1.29-release-notes, and 3 others: Port Selenium tests from Ruby to Node.js - https://phabricator.wikimedia.org/T139740 (10zeljkofilipin) [14:03:09] 10Continuous-Integration-Infrastructure, 10Upstream, 10Wikimedia-log-errors (Shared Build Failure): CI jobs takes too long / instances overloaded - https://phabricator.wikimedia.org/T198348 (10hashar) I have filled #upstream bug https://github.com/npm/npm/issues/21101 [14:03:26] 10Gerrit, 10Release-Engineering-Team, 10Patch-For-Review: Make PolyGerrit the default ui - https://phabricator.wikimedia.org/T196812 (10Johan) [14:06:01] 10Release-Engineering-Team, 10MediaWiki-Core-Tests, 10Epic, 10MW-1.29-release-notes, and 3 others: Port Selenium tests from Ruby to Node.js - https://phabricator.wikimedia.org/T139740 (10zeljkofilipin) [14:08:42] (03PS1) 10Hashar: Force npm loglevel to 'info' [integration/config] - 10https://gerrit.wikimedia.org/r/442857 (https://phabricator.wikimedia.org/T198348) [14:08:56] (03CR) 10Hashar: [C: 032] Force npm loglevel to 'info' [integration/config] - 10https://gerrit.wikimedia.org/r/442857 (https://phabricator.wikimedia.org/T198348) (owner: 10Hashar) [14:11:15] (03Merged) 10jenkins-bot: Force npm loglevel to 'info' [integration/config] - 10https://gerrit.wikimedia.org/r/442857 (https://phabricator.wikimedia.org/T198348) (owner: 10Hashar) [14:12:46] tgr, addshore: I'm still trying to make sense of the log... will post to the ticket [14:13:09] no big revelations yet. 
except that i'll check why we are writing ext store blobs inside the trx [14:13:14] ...and whether that's new [14:14:55] 10Release-Engineering-Team, 10MediaWiki-Core-Tests, 10Epic, 10MW-1.29-release-notes, and 3 others: Port Selenium tests from Ruby to Node.js - https://phabricator.wikimedia.org/T139740 (10zeljkofilipin) [14:15:50] side note: maybe it is time to move commons to group2? [14:16:04] these kinds of errors wouldn't be *that* bad [14:16:16] 10Release-Engineering-Team, 10MediaWiki-Core-Tests, 10Epic, 10MW-1.29-release-notes, and 3 others: Port Selenium tests from Ruby to Node.js - https://phabricator.wikimedia.org/T139740 (10zeljkofilipin) [14:16:25] except that we do not have proper transactions for file uploads [14:17:00] so any rollback there results in broken files which then haunt bug reports for years [14:19:25] 10Release-Engineering-Team, 10MediaWiki-Core-Tests, 10Epic, 10MW-1.29-release-notes, and 3 others: Port Selenium tests from Ruby to Node.js - https://phabricator.wikimedia.org/T139740 (10zeljkofilipin) a:05zeljkofilipin>03None [14:24:06] 10Beta-Cluster-Infrastructure, 10Analytics, 10Analytics-Kanban, 10User-Elukey: Disk usage on deployment-kafa-jumbo-* causing alerts - https://phabricator.wikimedia.org/T198262 (10elukey) [14:31:00] tgr: having proper transactions means more lock retention :) [14:31:00] we resolve lock retention by relaxing transactional consistency... [14:31:00] anyway. i think i found something. [14:32:32] DanielK_WMDE: I'm pretty sure jynus is correct: the RevisionStore refactoring caused some FOR UPDATE lock to extend to revision_comment_temp [14:33:07] that field is included in RevisionStore::getQueryInfo and it's hard to follow what else can fetch it from there [14:33:19] tgr, addshore: the MCR code introduced a doAtomicSection() bracket around the code that updates the revision, slots, and content tables. That bracket also covers blob writes and comment insertion. [14:33:42] yea.
[14:33:48] doAtomicSection doesn't really do anything in MediaWiki though? [14:34:02] so. The solution would be to insert comments and blobs first, then do the atomic section. [14:34:10] and if the transaction fails, just live with the orphans [14:34:28] everything is an implicit transaction anyway, unless you are in a job or maintenance script [14:34:41] (03CR) 10Hashar: [C: 032] Enable performance/bttostatsv. [integration/config] - 10https://gerrit.wikimedia.org/r/442833 (https://phabricator.wikimedia.org/T196600) (owner: 10Phedenskog) [14:35:15] (03CR) 10Hashar: [C: 032] Add test dependencies for some BlueSpice extensions [integration/config] - 10https://gerrit.wikimedia.org/r/442803 (owner: 10Mglaser) [14:36:04] (03Merged) 10jenkins-bot: Enable performance/bttostatsv. [integration/config] - 10https://gerrit.wikimedia.org/r/442833 (https://phabricator.wikimedia.org/T196600) (owner: 10Phedenskog) [14:36:24] anyway the issue is not with the comment table, it is with revision_comment_temp and by that time you need to have a revision ID [14:36:25] tgr: you are right, but it's *somehow* related to this [14:36:36] (03Merged) 10jenkins-bot: Add test dependencies for some BlueSpice extensions [integration/config] - 10https://gerrit.wikimedia.org/r/442803 (owner: 10Mglaser) [14:41:27] DanielK_WMDE: could something go wrong with https://gerrit.wikimedia.org/r/c/mediawiki/core/+/442834/2/includes/page/WikiPage.php#b2530 ? [14:41:57] or maybe it inadvertently fixed a bug where before revision_comment_temp did not get selected [14:43:10] although at this point the code should have a write lock on the page row so it's hard to see how it could cause problems [14:44:29] tgr: oh, i forgot about that nasty thing! [14:45:22] possibly, but I would think that there are not enough deletions happening for this to cause problems [14:45:30] but please post this on the ticket!
[14:46:06] I'll work on a patch that minimizes the size of the atomic section [14:47:26] tgr: any idea why this is failing? https://gerrit.wikimedia.org/r/c/mediawiki/core/+/442834 [14:48:11] Build timed out (after 30 minutes). Marking the build as failed. [14:48:35] DanielK_WMDE there's a bug for that [14:48:42] tgr: no, the failure after that [14:48:43] there is an open task about that, npm doing something stupid and hanging [14:48:54] https://phabricator.wikimedia.org/T198348 [14:49:03] DanielK_WMDE tgr ^^ [14:49:04] I don't think there's a failure after that [14:49:29] https://integration.wikimedia.org/ci/job/mediawiki-quibble-vendor-mysql-php70-docker/1537/console [14:49:51] the recheck [14:49:58] same thing, but you have to click on show details to see it [14:50:06] took me some time to figure out yesterday [14:50:37] oh! great. thanks. [14:56:08] tgr: re FOR UPDATE: in wmf-8, WikiPage::doModify calls startAtomic() and lockAndGetLatest(), triggering FOR UPDATE. Then it calls Revision::insertOn which calls RevisionStore->insertRevisionOn which calls CommentStore->insertWithTempTable. [14:56:24] I don't see how with the old code, the comment table insert was NOT in a FOR UPDATE transaction. [14:56:36] am I missing something? 
[14:56:38] lockAndGetLatest only locks page, that's sane [14:56:55] the issue here is something locking revision_comment_temp [14:57:07] i don't see how anything else would get a FOR UPDATE lock [14:57:17] except maybe during deletion [14:57:32] 10Release-Engineering-Team (Kanban), 10User-zeljkofilipin: Sample code in Node.js for repositories that still have Selenium+Ruby tests - https://phabricator.wikimedia.org/T183160 (10zeljkofilipin) [14:58:16] 10Continuous-Integration-Infrastructure, 10Patch-For-Review, 10Upstream, 10Wikimedia-log-errors (Shared Build Failure): CI jobs takes too long / instances overloaded - https://phabricator.wikimedia.org/T198348 (10hashar) [14:58:29] RevisionStore::getQueryInfo joins revision_comment_temp and takes lock options as a parameter [14:58:36] 10Release-Engineering-Team (Kanban), 10User-zeljkofilipin: Patches in Gerrit deleting Selenium+Ruby tests for repositories that still have them - https://phabricator.wikimedia.org/T183162 (10zeljkofilipin) [14:58:53] tgr: but that's not new. [14:58:56] I strongly suspect someone started calling it in locking mode that did not before [14:59:06] might not even be MCR-related [14:59:35] 10Release-Engineering-Team (Kanban), 10User-zeljkofilipin: Sample code in Node.js for repositories that still have Selenium+Ruby tests - https://phabricator.wikimedia.org/T183160 (10zeljkofilipin) [14:59:41] 10Release-Engineering-Team (Kanban), 10User-zeljkofilipin: Patches in Gerrit deleting Selenium+Ruby tests for repositories that still have them - https://phabricator.wikimedia.org/T183162 (10zeljkofilipin) [14:59:56] tgr: uh, no, wait. getQueryInfo does not take lock options. 
[15:00:16] getQueryInfo doesn't return options either [15:00:43] my bad, options are applied when the caller does something with the query fragment [15:01:07] but anyway, anything might get a query involving revision_comment_temp from that method [15:01:28] yes, but again, that's already the case for wmf-8 [15:01:35] that's actually the reason getQueryInfo was introduced [15:02:41] hm... but I think I made a few things use getQueryOptions, so it would in the future do the right thing wrt slots [15:02:58] maybe revision_comment_temp got exposed to something, somehow... [15:04:32] 10Continuous-Integration-Infrastructure, 10Patch-For-Review, 10Upstream, 10Wikimedia-log-errors (Shared Build Failure): CI jobs takes too long / instances overloaded - https://phabricator.wikimedia.org/T198348 (10hashar) I have sent a summary to Cloudflare support via https://support.cloudflare.com/ . We... [15:04:50] ...no, i don't see anything relevant in the diff [15:05:56] DanielK_WMDE: could be anything in wmf.999 or wmf.10 using that method, not necessarily MCR [15:07:27] tgr: i know, i just searched the diff between wmf8 and wmf-10 for getQueryInfo [15:07:35] I found no new usages [15:08:10] so, why do we lock revision_comment_temp during deletion? [15:09:45] 10Continuous-Integration-Infrastructure, 10Patch-For-Review, 10Upstream, 10Wikimedia-log-errors (Shared Build Failure): CI jobs takes too long / instances overloaded - https://phabricator.wikimedia.org/T198348 (10dbarratt) See T179229 for a potential way to mitigate/resolve.
[15:13:33] I don't think there is any use for it [15:13:41] since there is already a lock on page [15:22:28] tgr: well, something other than doEditContent could be messing with the revision table [15:22:52] the old code for null revisions, for instance [15:23:08] and nothing does a FOR UPDATE lock on the revision table, afaik [15:24:05] I can see why locking the revision table would be useful [15:24:23] but revision_comment_temp is basically immutable as long as revision IDs are not reused [15:25:47] yea, I also think that locking just the revision table should be sufficient. [15:26:06] But RevisionStore should also do that, then, I suppose. [15:27:13] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10Patch-For-Review, 10User-zeljkofilipin: Run Selenium tests in CI for extensions - https://phabricator.wikimedia.org/T164721 (10zeljkofilipin) [15:29:35] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10Patch-For-Review, 10User-zeljkofilipin: Run Selenium tests in CI for extensions - https://phabricator.wikimedia.org/T164721 (10zeljkofilipin) [15:30:17] 10Release-Engineering-Team (Kanban), 10Patch-For-Review, 10User-zeljkofilipin: Run selenium-EXTENSION-jessie for all repositores with Selenium tests - https://phabricator.wikimedia.org/T188742 (10zeljkofilipin) [15:31:03] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Q1 Selenium framework improvements - https://phabricator.wikimedia.org/T198389 (10zeljkofilipin) [15:31:13] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10Patch-For-Review, 10User-zeljkofilipin: Run Selenium tests in CI for extensions - https://phabricator.wikimedia.org/T164721 (10zeljkofilipin) [15:31:17] 10Release-Engineering-Team (Kanban), 10MW-1.30-release-notes, 10MediaWiki-Core-Tests, 10JavaScript, and 5 others: Run Selenium Cucumber tests in CI - https://phabricator.wikimedia.org/T179190 (10zeljkofilipin) [15:31:44] 10Release-Engineering-Team (Kanban),
10MediaWiki-Core-Tests, 10User-zeljkofilipin: Q1 Selenium framework improvements - https://phabricator.wikimedia.org/T198389 (10zeljkofilipin) [15:32:32] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10Patch-For-Review, 10User-zeljkofilipin: Run Selenium tests in CI for extensions - https://phabricator.wikimedia.org/T164721 (10zeljkofilipin) [15:32:53] 10Release-Engineering-Team (Kanban), 10User-zeljkofilipin: Create mediawiki-core-qunit-selenium-composer-jessie - https://phabricator.wikimedia.org/T180482 (10zeljkofilipin) a:03zeljkofilipin [15:33:31] 10Release-Engineering-Team (Kanban), 10User-zeljkofilipin: Refactor mediawiki-core-qunit-selenium-jessie Jenkins job so qunit/karma and webdriverio are invoked via npm script - https://phabricator.wikimedia.org/T180125 (10zeljkofilipin) a:03zeljkofilipin [15:34:55] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Refactor mediawiki-core-qunit-selenium-jessie Jenkins job so qunit/karma and webdriverio are invoked via npm script - https://phabricator.wikimedia.org/T180125 (10zeljkofilipin) [15:35:13] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Create mediawiki-core-qunit-selenium-composer-jessie - https://phabricator.wikimedia.org/T180482 (10zeljkofilipin) [15:35:50] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10MinervaNeue, 10Readers-Web-Backlog, and 4 others: Minerva Ruby and Node.js browser tests running side by side - https://phabricator.wikimedia.org/T190710 (10zeljkofilipin) [15:43:04] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10Patch-For-Review, 10User-zeljkofilipin: Run Selenium tests in CI for extensions - https://phabricator.wikimedia.org/T164721 (10zeljkofilipin) [15:43:06] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Create mediawiki-core-qunit-selenium-composer-jessie - https://phabricator.wikimedia.org/T180482 (10zeljkofilipin) 05Open>03Invalid 
mediawiki-core-qunit-selenium-jessie now runs fine for: - Wikibase: [[ https://gerrit.wiki... [15:44:16] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Q1 Selenium framework improvements - https://phabricator.wikimedia.org/T198389 (10zeljkofilipin) [15:44:20] 10Release-Engineering-Team (Kanban), 10Patch-For-Review, 10User-zeljkofilipin: Run selenium-EXTENSION-jessie for all repositores with Selenium tests - https://phabricator.wikimedia.org/T188742 (10zeljkofilipin) [15:44:22] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10Patch-For-Review, 10User-zeljkofilipin: Run Selenium tests in CI for extensions - https://phabricator.wikimedia.org/T164721 (10zeljkofilipin) [15:45:33] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Q1 Selenium framework improvements - https://phabricator.wikimedia.org/T198389 (10zeljkofilipin) [15:52:10] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Run Selenium tests in CI for extensions - https://phabricator.wikimedia.org/T164721 (10zeljkofilipin) [15:53:28] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Refactor mediawiki-core-qunit-selenium-jessie Jenkins job so qunit/karma and webdriverio are invoked via npm script - https://phabricator.wikimedia.org/T180125 (10zeljkofilipin) [15:55:27] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Q1 Selenium framework improvements - https://phabricator.wikimedia.org/T198389 (10zeljkofilipin) [15:55:29] 10Release-Engineering-Team (Kanban), 10MW-1.31-release-notes (WMF-deploy-2018-04-03 (1.31.0-wmf.28)), 10Patch-For-Review, 10User-zeljkofilipin: Video recording for Selenium tests in Node.js - https://phabricator.wikimedia.org/T179188 (10zeljkofilipin) [15:56:27] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Q1 Selenium framework improvements - 
https://phabricator.wikimedia.org/T198389 (10zeljkofilipin) [15:58:34] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Q1 Selenium framework improvements - https://phabricator.wikimedia.org/T198389 (10zeljkofilipin) [15:58:37] 10Release-Engineering-Team (Kanban), 10User-zeljkofilipin: Sample code in Node.js for repositories that still have Selenium+Ruby tests - https://phabricator.wikimedia.org/T183160 (10zeljkofilipin) [15:58:47] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Q1 Selenium framework improvements - https://phabricator.wikimedia.org/T198389 (10zeljkofilipin) [15:58:49] 10Release-Engineering-Team (Kanban), 10User-zeljkofilipin: Patches in Gerrit deleting Selenium+Ruby tests for repositories that still have them - https://phabricator.wikimedia.org/T183162 (10zeljkofilipin) [16:00:17] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Q1 Selenium framework improvements - https://phabricator.wikimedia.org/T198389 (10zeljkofilipin) [16:01:24] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Patches in Gerrit deleting Selenium+Ruby tests for repositories that still have them - https://phabricator.wikimedia.org/T183162 (10zeljkofilipin) [16:01:33] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Sample code in Node.js for repositories that still have Selenium+Ruby tests - https://phabricator.wikimedia.org/T183160 (10zeljkofilipin) [16:02:17] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Set up CI for running Selenium tests for repositories that have Ruby tests - https://phabricator.wikimedia.org/T198409 (10zeljkofilipin) [16:02:28] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Set up CI for running Selenium tests for repositories that have Ruby tests - https://phabricator.wikimedia.org/T198409 (10zeljkofilipin) 
p:05Triage>03Low [16:03:41] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Q1 Selenium framework improvements - https://phabricator.wikimedia.org/T198389 (10zeljkofilipin) [16:04:21] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10Patch-For-Review, 10User-zeljkofilipin: Run selenium-EXTENSION-jessie for all repositores with Selenium tests - https://phabricator.wikimedia.org/T188742 (10zeljkofilipin) [16:04:36] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10MW-1.31-release-notes (WMF-deploy-2018-04-03 (1.31.0-wmf.28)), 10Patch-For-Review, 10User-zeljkofilipin: Video recording for Selenium tests in Node.js - https://phabricator.wikimedia.org/T179188 (10zeljkofilipin) [16:05:14] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Q1 Selenium framework improvements - https://phabricator.wikimedia.org/T198389 (10zeljkofilipin) [16:11:37] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Blog post about implemented Q3 and Q4 Selenium framework improvements and plans for Q1 - https://phabricator.wikimedia.org/T198410 (10zeljkofilipin) [16:11:43] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Blog post about implemented Q3 and Q4 Selenium framework improvements and plans for Q1 - https://phabricator.wikimedia.org/T198410 (10zeljkofilipin) p:05Triage>03Normal [16:12:06] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Q1 Selenium framework improvements - https://phabricator.wikimedia.org/T198389 (10zeljkofilipin) [16:12:09] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Blog post about implemented Q3 and Q4 Selenium framework improvements and plans for Q1 - https://phabricator.wikimedia.org/T198410 (10zeljkofilipin) [16:14:12] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Blog post about implemented Q3 
and Q4 Selenium framework improvements and plans for Q1 - https://phabricator.wikimedia.org/T198410 (10zeljkofilipin) [16:14:51] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Q1 Selenium framework improvements - https://phabricator.wikimedia.org/T198389 (10zeljkofilipin) [16:28:14] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Q1 Selenium framework improvements - https://phabricator.wikimedia.org/T198389 (10zeljkofilipin) [16:28:43] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Q1 Selenium framework improvements - https://phabricator.wikimedia.org/T198389 (10zeljkofilipin) [16:28:47] 10Release-Engineering-Team, 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Q4 Selenium framework improvements - https://phabricator.wikimedia.org/T190994 (10zeljkofilipin) [16:28:49] 10Release-Engineering-Team (Kanban), 10CirrusSearch, 10Discovery, 10Discovery-Search, and 2 others: selenium-CirrusSearch-jessie does not run any tests - https://phabricator.wikimedia.org/T193244 (10zeljkofilipin) [16:29:36] DanielK_WMDE: so I think https://github.com/wikimedia/mediawiki/commit/125bf7e44fe4145127c9e19c25d8c0c6dc1a6c3e#diff-a0f7feeaae57e9d2c735c8919c16ad15R2503 breaks the query [16:30:03] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Q1 Selenium framework improvements - https://phabricator.wikimedia.org/T198389 (10zeljkofilipin) [16:30:06] 10Release-Engineering-Team, 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Q4 Selenium framework improvements - https://phabricator.wikimedia.org/T190994 (10zeljkofilipin) [16:30:09] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10Documentation, 10JavaScript, 10User-zeljkofilipin: Blog posts about new Selenium framework features - https://phabricator.wikimedia.org/T191982 (10zeljkofilipin) [16:30:10] the join key does not match the table name anymore, and you end up with a cartesian product of 
page and revision_comment_temp [16:30:31] so basically any delete locks the entire table [16:31:05] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10Documentation, 10JavaScript, 10User-zeljkofilipin: Blog posts about new Selenium framework features - https://phabricator.wikimedia.org/T191982 (10zeljkofilipin) [16:31:26] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Blog post about implemented Q3 and Q4 Selenium framework improvements and plans for Q1 - https://phabricator.wikimedia.org/T198410 (10zeljkofilipin) [16:31:31] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10Documentation, 10JavaScript, 10User-zeljkofilipin: Blog posts about new Selenium framework features - https://phabricator.wikimedia.org/T191982 (10zeljkofilipin) [16:32:12] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Q1 Selenium framework improvements - https://phabricator.wikimedia.org/T198389 (10zeljkofilipin) [16:32:30] 10Continuous-Integration-Infrastructure, 10Patch-For-Review, 10Upstream, 10Wikimedia-log-errors (Shared Build Failure): CI jobs takes too long / instances overloaded - https://phabricator.wikimedia.org/T198348 (10hashar) Other persons are reporting similar issues in https://github.com/npm/npm/issues/21101...
[16:32:36] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10Documentation, 10JavaScript, 10User-zeljkofilipin: Blog posts about new Selenium framework features - https://phabricator.wikimedia.org/T191982 (10zeljkofilipin) [16:33:27] 10Release-Engineering-Team (Someday), 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Someday/maybe Selenium framework improvements - https://phabricator.wikimedia.org/T190995 (10zeljkofilipin) [16:33:31] 10Release-Engineering-Team, 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Q4 Selenium framework improvements - https://phabricator.wikimedia.org/T190994 (10zeljkofilipin) [16:33:34] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Document differences between Ruby and Node.js Selenium frameworks - https://phabricator.wikimedia.org/T182692 (10zeljkofilipin) [16:35:09] 10Release-Engineering-Team (Someday), 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Someday/maybe Selenium framework improvements - https://phabricator.wikimedia.org/T190995 (10zeljkofilipin) [16:35:12] 10Release-Engineering-Team, 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Selenium test job should install local dependencies before starting tests - https://phabricator.wikimedia.org/T193943 (10zeljkofilipin) [16:35:51] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Q1 Selenium framework improvements - https://phabricator.wikimedia.org/T198389 (10zeljkofilipin) [16:35:55] 10Release-Engineering-Team, 10MediaWiki-Core-Tests, 10MW-1.32-release-notes (WMF-deploy-2018-05-15 (1.32.0-wmf.4)), 10Patch-For-Review, and 2 others: Avoid importing core's selenium/pageobjects files using relative paths - https://phabricator.wikimedia.org/T193088 (10zeljkofilipin) [16:36:32] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Q1 Selenium framework improvements - https://phabricator.wikimedia.org/T198389 (10zeljkofilipin) [16:38:12] 10Release-Engineering-Team, 
10MediaWiki-Core-Tests, 10User-zeljkofilipin: Q4 Selenium framework improvements - https://phabricator.wikimedia.org/T190994 (10zeljkofilipin) [16:38:16] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10MW-1.31-release-notes (WMF-deploy-2018-03-20 (1.31.0-wmf.26)), 10Patch-For-Review, and 2 others: Create selenium-MediaWiki-jessie daily Jenkins job - https://phabricator.wikimedia.org/T185011 (10zeljkofilipin) [16:38:37] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Q1 Selenium framework improvements - https://phabricator.wikimedia.org/T198389 (10zeljkofilipin) [16:40:56] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Docker container with geckodriver - https://phabricator.wikimedia.org/T183163 (10zeljkofilipin) [16:42:31] 10Release-Engineering-Team (Someday), 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Someday/maybe Selenium framework improvements - https://phabricator.wikimedia.org/T190995 (10zeljkofilipin) [16:42:34] 10Release-Engineering-Team, 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Q4 Selenium framework improvements - https://phabricator.wikimedia.org/T190994 (10zeljkofilipin) [16:42:37] 10Release-Engineering-Team (Someday), 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Run Selenium tests using Firefox - https://phabricator.wikimedia.org/T161697 (10zeljkofilipin) [16:42:39] 10Continuous-Integration-Infrastructure, 10Browser-Tests-Infrastructure, 10Packaging, 10Patch-For-Review: Create a Debian package for https://github.com/mozilla/geckodriver for at least Debian Jessie - https://phabricator.wikimedia.org/T137797 (10zeljkofilipin) [16:43:12] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Investigate if WebdriverIO `sync: false` would be useful to us and document how to use it - https://phabricator.wikimedia.org/T182412 (10zeljkofilipin) [16:43:47] 10Release-Engineering-Team (Someday), 10MediaWiki-Core-Tests, 10User-zeljkofilipin: 
Someday/maybe Selenium framework improvements - https://phabricator.wikimedia.org/T190995 (10zeljkofilipin) [16:43:50] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Investigate if WebdriverIO `sync: false` would be useful to us and document how to use it - https://phabricator.wikimedia.org/T182412 (10zeljkofilipin) [16:44:33] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Selenium tests should be easier to run - https://phabricator.wikimedia.org/T182691 (10zeljkofilipin) [16:45:08] 10Release-Engineering-Team (Someday), 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Someday/maybe Selenium framework improvements - https://phabricator.wikimedia.org/T190995 (10zeljkofilipin) [16:45:11] 10Release-Engineering-Team, 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Q4 Selenium framework improvements - https://phabricator.wikimedia.org/T190994 (10zeljkofilipin) [16:45:14] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Selenium tests should be easier to run - https://phabricator.wikimedia.org/T182691 (10zeljkofilipin) [16:50:10] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Selenium tests should be easier to run - https://phabricator.wikimedia.org/T182691 (10zeljkofilipin) [16:50:41] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Q1 Selenium framework improvements - https://phabricator.wikimedia.org/T198389 (10zeljkofilipin) [16:50:44] 10Release-Engineering-Team (Someday), 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Someday/maybe Selenium framework improvements - https://phabricator.wikimedia.org/T190995 (10zeljkofilipin) [16:50:47] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Selenium tests should be easier to run - https://phabricator.wikimedia.org/T182691 (10zeljkofilipin) [16:51:19] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 
10User-zeljkofilipin: Q1 Selenium framework improvements - https://phabricator.wikimedia.org/T198389 (10zeljkofilipin) [16:51:40] tgr: I'll make a patch killing that ugly thing. Anomie was big on keeping it though. We may need to check with him before merging this into master [16:51:46] we could try it out on the branch, though [16:52:00] DanielK_WMDE: https://gerrit.wikimedia.org/r/442889 [16:52:08] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10Vector, 10MW-1.31-release-notes (WMF-deploy-2018-03-06 (1.31.0-wmf.24)), and 2 others: Move one Selenium tests from mediawiki/core to mediawiki/skins/Vector - https://phabricator.wikimedia.org/T187859 (10zeljkofilipin) [16:53:26] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10Vector, 10MW-1.31-release-notes (WMF-deploy-2018-03-06 (1.31.0-wmf.24)), and 2 others: Move one Selenium tests from mediawiki/core to mediawiki/skins/Vector - https://phabricator.wikimedia.org/T187859 (10zeljkofilipin) [16:53:44] 10Release-Engineering-Team (Someday), 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Someday/maybe Selenium framework improvements - https://phabricator.wikimedia.org/T190995 (10zeljkofilipin) [16:53:47] 10Release-Engineering-Team, 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Q4 Selenium framework improvements - https://phabricator.wikimedia.org/T190994 (10zeljkofilipin) [16:53:51] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10Vector, 10MW-1.31-release-notes (WMF-deploy-2018-03-06 (1.31.0-wmf.24)), and 2 others: Move one Selenium tests from mediawiki/core to mediawiki/skins/Vector - https://phabricator.wikimedia.org/T187859 (10zeljkofilipin) [16:54:36] 10Release-Engineering-Team (Someday), 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Someday/maybe Selenium framework improvements - https://phabricator.wikimedia.org/T190995 (10zeljkofilipin) [16:54:39] 10Release-Engineering-Team, 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Q4 Selenium framework improvements - 
https://phabricator.wikimedia.org/T190994 (10zeljkofilipin) [16:56:11] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10Patch-For-Review, 10User-zeljkofilipin: Update page object pattern in Selenium tests - https://phabricator.wikimedia.org/T185094 (10zeljkofilipin) [16:56:42] 10Release-Engineering-Team (Someday), 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Someday/maybe Selenium framework improvements - https://phabricator.wikimedia.org/T190995 (10zeljkofilipin) [16:56:45] 10Release-Engineering-Team, 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Q4 Selenium framework improvements - https://phabricator.wikimedia.org/T190994 (10zeljkofilipin) [16:56:48] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10Patch-For-Review, 10User-zeljkofilipin: Update page object pattern in Selenium tests - https://phabricator.wikimedia.org/T185094 (10zeljkofilipin) [16:58:35] 10Release-Engineering-Team, 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Q4 Selenium framework improvements - https://phabricator.wikimedia.org/T190994 (10zeljkofilipin) [16:58:38] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Sample code in Node.js for repositories that still have Selenium+Ruby tests - https://phabricator.wikimedia.org/T183160 (10zeljkofilipin) [16:58:41] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Patches in Gerrit deleting Selenium+Ruby tests for repositories that still have them - https://phabricator.wikimedia.org/T183162 (10zeljkofilipin) [17:06:32] 10Release-Engineering-Team (Kanban), 10MW-1.30-release-notes, 10MediaWiki-Core-Tests, 10JavaScript, and 5 others: Run Selenium Cucumber tests in CI - https://phabricator.wikimedia.org/T179190 (10zeljkofilipin) a:05zeljkofilipin>03None [17:08:27] 10Release-Engineering-Team (Kanban), 10Documentation, 10MediaWiki-SWAT-deployments, 10User-zeljkofilipin: X-Wikimedia-Debug page should have short video on how to use the extension - 
https://phabricator.wikimedia.org/T197230 (10zeljkofilipin) [17:09:04] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10Multimedia, 10UploadWizard, and 2 others: UploadWizard: Selenium tests failing - https://phabricator.wikimedia.org/T198384 (10zeljkofilipin) a:05zeljkofilipin>03None [17:10:19] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10MediaWiki-User-preferences, 10User-zeljkofilipin, 10Wikimedia-log-errors (Shared Build Failure): Selenium "User should be able to change preferences" test flaky - https://phabricator.wikimedia.org/T198137 (10zeljkofilipin) a:05zeljkofilipin... [17:11:49] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10MinervaNeue, 10Readers-Web-Backlog, and 4 others: Minerva Ruby and Node.js browser tests running side by side - https://phabricator.wikimedia.org/T190710 (10zeljkofilipin) a:05zeljkofilipin>03None [17:13:01] zeljkof: hi - not sure why https://gerrit.wikimedia.org/r/#/c/mediawiki/core/+/441233/ is failing on selenium tests [17:13:44] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10MinervaNeue, 10Readers-Web-Backlog, and 4 others: Minerva Ruby and Node.js browser tests running side by side - https://phabricator.wikimedia.org/T190710 (10zeljkofilipin) Sorry on being slow with this, I am busy with other things. This is high... [17:15:06] Hauskatze: looks related to https://phabricator.wikimedia.org/T198137 could you please add a comment there? [17:15:21] looking [17:16:42] 10Release-Engineering-Team, 10MediaWiki-Core-Tests, 10Epic, 10Tracking, 10User-zeljkofilipin: Selenium framework improvements - https://phabricator.wikimedia.org/T182986 (10zeljkofilipin) a:05zeljkofilipin>03None [17:17:28] 10Continuous-Integration-Infrastructure, 10Patch-For-Review, 10Upstream, 10Wikimedia-log-errors (Shared Build Failure): CI jobs takes too long / instances overloaded - https://phabricator.wikimedia.org/T198348 (10greg) >>! 
In T198348#4322523, @hashar wrote: > Other persons are reporting similar issues in h... [17:17:40] 10Release-Engineering-Team, 10MediaWiki-Core-Tests, 10Epic, 10Tracking, 10User-zeljkofilipin: Selenium framework improvements - https://phabricator.wikimedia.org/T182986 (10zeljkofilipin) p:05Normal>03Low [17:18:13] 10Release-Engineering-Team (Someday), 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Someday/maybe Selenium framework improvements - https://phabricator.wikimedia.org/T190995 (10zeljkofilipin) a:05zeljkofilipin>03None [17:18:34] Done. [17:18:38] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10MediaWiki-User-preferences, 10User-zeljkofilipin, 10Wikimedia-log-errors (Shared Build Failure): Selenium "User should be able to change preferences" test flaky - https://phabricator.wikimedia.org/T198137 (10MarcoAurelio) Per request of @zelj... [17:20:10] 10Phabricator: Cannot change tags on T198350 - https://phabricator.wikimedia.org/T198411 (10daniel) [17:20:53] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Document which per-patch/daily Jenkins job is running for repositories with Ruby/Node.js Selenium tests - https://phabricator.wikimedia.org/T198412 (10zeljkofilipin) [17:20:59] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Document which per-patch/daily Jenkins job is running for repositories with Ruby/Node.js Selenium tests - https://phabricator.wikimedia.org/T198412 (10zeljkofilipin) p:05Triage>03Normal [17:21:36] 10Phabricator: Cannot change tags on T198350 - https://phabricator.wikimedia.org/T198411 (10MarcoAurelio) T198350 is marked as "release" ticket. I think most of us do not have permissions to edit most of the fields on those. 
[17:21:39] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Document which per-patch/daily Jenkins job is running for repositories with Ruby/Node.js Selenium tests - https://phabricator.wikimedia.org/T198412 (10zeljkofilipin) [17:21:42] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Q1 Selenium framework improvements - https://phabricator.wikimedia.org/T198389 (10zeljkofilipin) [17:21:47] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Q1 Selenium framework improvements - https://phabricator.wikimedia.org/T198389 (10zeljkofilipin) [17:22:21] 10Continuous-Integration-Infrastructure, 10Patch-For-Review, 10Upstream, 10Wikimedia-log-errors (Shared Build Failure): CI jobs takes too long / instances overloaded - https://phabricator.wikimedia.org/T198348 (10Jdforrester-WMF) Given that upstream marked them as invalid, I opened https://github.com/npm/n... [17:22:49] tgr: did $wgCommentTableSchemaMigrationStage change recently? [17:23:21] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Document which per-patch/daily Jenkins job is running for repositories with Ruby/Node.js Selenium tests - https://phabricator.wikimedia.org/T198412 (10zeljkofilipin) [17:25:15] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Document which per-patch/daily Jenkins job is running for repositories with Ruby/Node.js Selenium tests - https://phabricator.wikimedia.org/T198412 (10zeljkofilipin) [17:27:50] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Run Selenium tests in CI per-patch for extensions - https://phabricator.wikimedia.org/T164721 (10zeljkofilipin) [17:28:40] DanielK_WMDE: not since 2017 November as far as I can tell. Why? [17:29:51] tgr: because that could have triggered the issue. but it didn't. 
[17:30:08] I'm reluctant to remove the lock on the revision_comment_temp table [17:30:15] it was put there explicitly [17:30:20] and apparently has been there for a while [17:30:25] https://phabricator.wikimedia.org/T191875#4120050 [17:30:33] https://phabricator.wikimedia.org/T191892 [17:30:38] the issue was triggered by mismatching table names [17:30:42] https://gerrit.wikimedia.org/r/#/c/425283/ [17:30:55] the patch I pasted above fixes it [17:32:43] tgr: oh hell. i remember discussing this on the patch! damn! but... your patch undoes https://gerrit.wikimedia.org/r/c/mediawiki/core/+/425283 [17:32:50] the array_recursive thing has basically the same effect as array_values [17:32:50] which was fixing a performance issue [17:32:59] well, partially undoes it [17:33:27] no, the original code does that just fine [17:33:51] all the array_recursive call does is remove table keys [17:34:11] tgr: i added that to deal with table aliases [17:34:19] MediaWiki matches the $table and $join arguments to select() via table keys [17:34:33] the array_intersect will fail if there are complex values in the array [17:34:41] which may happen depending on what getQueryInfo returns. [17:34:50] so the join gets lost and the query locks the whole table instead of just the rows for the page being deleted [17:35:07] yea, i see. very good find! [17:35:16] but i'm worried that the new code isn't safe. [17:35:36] i guess i removed all the complex table alias stuff from getQueryInfo [17:35:46] addshore added it to support read-both fallbacks [17:36:30] well, it's what we had before the RevisionStore patch so it can't be that bad [17:36:38] 10Phabricator: Cannot change tags on T198350 - https://phabricator.wikimedia.org/T198411 (10daniel) @MarcoAurelio what does "release" mean, and why is this setup useful? 
[17:37:01] in any case I don't see any code in getQueryInfo that would return a multilevel array [17:37:12] tgr: i only touched that code because the RevisionStore patch broke on that line with a fatal error.... [17:37:22] yea, i guess i removed that. let's hope nobody puts it back :) [17:40:43] tgr: anyway, thanks for finding this! [17:41:03] i didn't see how deletions could have that much of an effect. but with this locking the *whole* table, it of course would! [17:42:22] tgr and DanielK_WMDE, thanks for mobilizing on this issue! [17:43:10] given we can decide on which fixes to merge, i'm trying to come up with a good approach for the deploy [17:44:03] commons and wikidata are where the issue surfaced, right? [17:44:24] 10Phabricator: Cannot change tags on T198350 - https://phabricator.wikimedia.org/T198411 (10MarcoAurelio) @daniel I think @mmodell can answer you better. Best regards. [17:44:54] marxarelli: if we backport tgr's patch, I don't think mine are needed any more. [17:45:08] I'll abandon them. [17:45:11] i'm wondering if we should target just one (group1 but with one of those removed) so as to monitor the effects without too much prod impact if the issue persists [17:45:19] DanielK_WMDE: k [17:45:47] if we can, don't put it on commons. [17:46:03] the errors caused inconsistencies between file uploads and pages in the file namespace [17:46:05] nasty to clean up [17:46:20] got i [17:46:22] got it [17:46:29] otoh, error volume on wikidata wasn't nearly as high [17:46:32] may go unnoticed [17:46:42] but then, we know what to look for, so... 
[17:46:44] not if we know to watch the logs [17:46:49] yea [17:47:07] and it was ~20% of commons, not that small [17:47:09] we can filter for just that error message and wikidata probably [17:57:16] 10Continuous-Integration-Infrastructure, 10Patch-For-Review, 10Upstream, 10Wikimedia-log-errors (Shared Build Failure): CI jobs takes too long / instances overloaded - https://phabricator.wikimedia.org/T198348 (10hashar) @greg wrote: > They did link to https://status.npmjs.org/incidents/51c7q80zsj9f which... [18:04:51] 10Release-Engineering-Team (Kanban), 10Patch-For-Review, 10Release, 10Train Deployments: 1.32.0-wmf.10 deployment blockers - https://phabricator.wikimedia.org/T191056 (10dduvall) Current plan for today's train rollout is mentioned in {T198350#4322794}. It'll be a graduated rollout of (group1 - commonswiki)... [18:10:33] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Refactor mediawiki-core-qunit-selenium-jessie Jenkins job so qunit/karma and webdriverio are invoked via npm script - https://phabricator.wikimedia.org/T180125 (10zeljkofilipin) [18:11:45] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Run Selenium tests in CI per-patch for extensions - https://phabricator.wikimedia.org/T164721 (10zeljkofilipin) [18:18:57] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Run Selenium tests in CI per-patch for extensions - https://phabricator.wikimedia.org/T164721 (10zeljkofilipin) [18:19:32] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Refactor mediawiki-core-qunit-selenium-jessie Jenkins job so qunit/karma and webdriverio are invoked via npm script - https://phabricator.wikimedia.org/T180125 (10zeljkofilipin) [18:21:19] 10Continuous-Integration-Infrastructure, 10Patch-For-Review, 10Upstream, 10Wikimedia-log-errors (Shared Build Failure): CI jobs takes too long / instances overloaded - 
https://phabricator.wikimedia.org/T198348 (10hashar) > Intermittent 500 errors when installing packages > > Investigating > We are currentl... [18:30:44] 10Continuous-Integration-Infrastructure, 10Patch-For-Review, 10Upstream, 10Wikimedia-log-errors (Shared Build Failure): Quibble CI jobs time out after 30min due to instance stalling at "npm install parse" step - https://phabricator.wikimedia.org/T198348 (10Krinkle) [18:31:49] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10User-zeljkofilipin: Refactor mediawiki-core-qunit-selenium-jessie Jenkins job so qunit/karma and webdriverio are invoked via npm script - https://phabricator.wikimedia.org/T180125 (10zeljkofilipin) [18:34:37] PROBLEM - Free space - all mounts on integration-slave-docker-1004 is CRITICAL: CRITICAL: integration.integration-slave-docker-1004.diskspace.root.byte_percentfree (<30.00%) [18:39:29] 10Continuous-Integration-Infrastructure, 10Patch-For-Review, 10Upstream, 10Wikimedia-log-errors (Shared Build Failure): Quibble CI jobs time out after 30min due to instance stalling at "npm install parse" step - https://phabricator.wikimedia.org/T198348 (10hashar) That still happens has shown in the info l... [18:49:06] 10Phabricator: Cannot change tags on T198350 - https://phabricator.wikimedia.org/T198411 (10mmodell) @daniel: can you try again and see if it works better now? I changed the configuration on the "edit release" form. [18:59:31] 10Phabricator: Cannot change tags on T198350 - https://phabricator.wikimedia.org/T198411 (10daniel) 05Open>03Resolved a:03daniel @mmodell yes, thank you! [19:50:08] [21:48] is there a database problem on commons? [19:50:10] [21:48] i'm getting this after saving: [19:50:11] [21:48] A database query error has occurred. This may indicate a bug in the software. 
[19:50:13] [21:48] [WzU7eApAMFsAAHLkobgAAAAE] 2018-06-28 19:48:25: Fatal exception of type "Wikimedia\Rdbms\DBQueryError" [19:50:14] [21:49] the hash and the timestamp change each time obviously [19:50:16] uuuhhhh [19:50:20] see -operations [19:51:02] but wmf10 did *not* go live on commons, did it? [19:52:47] getting timeouts from CI parallel-lint: [19:52:48] [Symfony\Component\Process\Exception\ProcessTimedOutException] [19:52:48] 19:42:47 The process "parallel-lint . --exclude vendor --exclude node_modules --exclude .git" exceeded the timeout of 350 seconds. [19:52:52] known issue? [19:53:33] RECOVERY - Free space - all mounts on integration-slave-docker-1004 is OK: OK: All targets OK [19:57:35] PROBLEM - Free space - all mounts on deployment-tin is CRITICAL: CRITICAL: deployment-prep.deployment-tin.diskspace._mnt.byte_percentfree (No valid datapoints found)deployment-prep.deployment-tin.diskspace._srv.byte_percentfree (<11.11%) [19:58:33] PROBLEM - Puppet errors on integration-slave-docker-1002 is CRITICAL: CRITICAL: 88.89% of data above the critical threshold [0.0] [20:02:01] PROBLEM - Puppet errors on integration-slave-docker-1004 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [20:03:30] PHP Fatal error: Out of memory (allocated 241172480) (tried to allocate 9437184 bytes) in /srv/composer/vendor/composer/composer/src/Composer/Repository/ComposerRepository.php on line 565 [20:03:53] time to sudo rm -Rf jenkins workspace? [20:05:10] DanielK_WMDE: .10 did go live on commons, commons is group1 [20:05:50] addshore: initial deploy was to "group1 minus commons", that's why i asked. [20:06:04] aaah commons had .10 yesterday, but indeed not today [20:06:11] were there errors again? 
[20:14:28] 10Continuous-Integration-Infrastructure, 10Patch-For-Review, 10Upstream, 10Wikimedia-log-errors (Shared Build Failure): Quibble CI jobs time out after 30min due to instance stalling at "npm install parse" step - https://phabricator.wikimedia.org/T198348 (10Krinkle) Yeah, the time-outs are still happening.... [20:15:11] 10Continuous-Integration-Infrastructure, 10Upstream, 10Wikimedia-log-errors (Shared Build Failure): Quibble CI jobs time out after 30min due to instance stalling at "npm install parse" step - https://phabricator.wikimedia.org/T198348 (10Krinkle) [20:18:04] hasharAway: ^ [20:18:14] Hauskatze: disk != ram - I think this is a ram issue. [20:21:41] Krinkle: true that, maybe anything is consuming many memory atm? [20:22:27] Yeah. I don't know if the CI containers have a fixed amount of ram allocated. If they do, then the problem is likely that the code changed to need more memory, which might be a real bug in the test. (e.g. memory leak) [20:22:50] Or, if their memory use is dynamic, then it might be that other stuff in cloud vps or other CI jobs are pressuring the memory allocation. [20:23:00] I suppose it depends on whether it happens more often :) [20:23:24] I'd file a bug but I'm not sure about it. I trust releng/ci people can check. 
[20:24:10] Project mwext-phpunit-coverage-publish build #5997: 15ABORTED in 1 min 42 sec: https://integration.wikimedia.org/ci/job/mwext-phpunit-coverage-publish/5997/ [20:28:35] RECOVERY - Puppet errors on integration-slave-docker-1002 is OK: OK: Less than 1.00% above the threshold [0.0] [20:30:53] 10Beta-Cluster-Infrastructure, 10Performance-Team, 10Patch-For-Review: Set up webperf node in Beta Cluster - https://phabricator.wikimedia.org/T195314 (10Krinkle) [20:32:49] 10Beta-Cluster-Infrastructure, 10Performance-Team, 10Patch-For-Review: Set up webperf node in Beta Cluster - https://phabricator.wikimedia.org/T195314 (10Krinkle) [20:32:57] 10Beta-Cluster-Infrastructure, 10Performance-Team, 10Patch-For-Review: Set up webperf node in Beta Cluster - https://phabricator.wikimedia.org/T195314 (10Krinkle) [20:40:12] 10Beta-Cluster-Infrastructure, 10Parsoid, 10VisualEditor: VE not loading on Beta Cluster, getting 503s - https://phabricator.wikimedia.org/T198421 (10Ryasmeen) [20:41:49] 10Beta-Cluster-Infrastructure, 10Parsoid, 10VisualEditor: VE is not loading on Beta Cluster, getting 503s - https://phabricator.wikimedia.org/T198421 (10Ryasmeen) [20:42:01] RECOVERY - Puppet errors on integration-slave-docker-1004 is OK: OK: Less than 1.00% above the threshold [0.0] [20:45:27] hasharAway, Krinkle : i'm seeing jenkins jobs failing during timeout of parallel-lint, too: https://gerrit.wikimedia.org/r/442209 [20:46:34] 10Continuous-Integration-Infrastructure, 10Upstream, 10Wikimedia-log-errors (Shared Build Failure): Quibble CI jobs time out after 30min due to instance stalling at "npm install parse" step - https://phabricator.wikimedia.org/T198348 (10cscott) `parallel-lint` is timing out and failing as well: https://gerri... 
[21:01:12] 10Continuous-Integration-Infrastructure, 10Release-Engineering-Team (Kanban), 10DBA, 10MediaWiki-Database, 10Quibble: Enable MariaDB/MySQL strict mode on CI db hosts - https://phabricator.wikimedia.org/T119371 (10hashar) @jcrespo we had MediaWiki tests running with `sql_mode = 'TRADITIONAL'` . I clearly... [21:01:16] 10Continuous-Integration-Infrastructure, 10Release-Engineering-Team (Kanban), 10DBA, 10MediaWiki-Database, 10Quibble: Enable MariaDB/MySQL strict mode on CI db hosts - https://phabricator.wikimedia.org/T119371 (10hashar) [21:01:20] cscott, Krinkle: yeah I have the same problem [21:02:32] 10Deployments: Running scap sync-dir php-1.32.0-wmf.10 fails due to syntax error - https://phabricator.wikimedia.org/T198422 (10Catrope) [21:02:54] 10Continuous-Integration-Infrastructure, 10Upstream, 10Wikimedia-log-errors (Shared Build Failure): Quibble CI jobs time out after 30min due to instance stalling at "npm install parse" step - https://phabricator.wikimedia.org/T198348 (10hashar) >>! In T198348#4323113, @cscott wrote: > `parallel-lint` is timi... [21:05:15] 10Continuous-Integration-Infrastructure, 10Upstream, 10Wikimedia-log-errors (Shared Build Failure): Quibble CI jobs time out after 30min due to instance stalling at "npm install parse" step - https://phabricator.wikimedia.org/T198348 (10hashar) >>! In T198348#4323018, @Krinkle wrote: > Yeah, the time-outs ar... 
[21:05:45] Krinkle: I am gonna nuke a couple castor caches and see what happens [21:05:59] 10Scap: Linting phase in scap doesn't surface errors - https://phabricator.wikimedia.org/T198423 (10Catrope) [21:07:26] !log castor: nuking caches mediawiki-core/master/mediawiki-quibble-vendor-mysql-php70-docker and mediawiki-core/master/mediawiki-quibble-vendor-mysql-hhvm-docker | T198348 [21:07:31] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [21:07:31] T198348: Quibble CI jobs time out after 30min due to instance stalling at "npm install parse" step - https://phabricator.wikimedia.org/T198348 [21:07:33] (bah I also nuked the composer cache ...) [21:09:23] !log castor: nuking caches castor-mw-ext-and-skins/master/wmf-quibble-vendor-mysql-hhvm-docker/npm and castor-mw-ext-and-skins/master/wmf-quibble-vendor-mysql-php70-docker/npm | T198348 [21:09:27] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [21:14:42] 10Release-Engineering-Team (Kanban), 10Patch-For-Review, 10Release, 10Train Deployments: 1.32.0-wmf.10 deployment blockers - https://phabricator.wikimedia.org/T191056 (10dduvall) [21:14:51] 10Release-Engineering-Team (Kanban), 10Patch-For-Review, 10Release, 10Train Deployments: 1.32.0-wmf.10 deployment blockers - https://phabricator.wikimedia.org/T191056 (10dduvall) 05Open>03Resolved [21:17:46] !log castor02: nuking cache of npm/node jobs via rm -fR /srv/jenkins-workspace/caches/*/*/*node* (note: other jobs might still have a npm cache) | T198348 [21:17:51] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [21:17:51] T198348: Quibble CI jobs time out after 30min due to instance stalling at "npm install parse" step - https://phabricator.wikimedia.org/T198348 [21:18:14] Krinkle: I am nuking a bunch of caches. 
Even npm-node-6-docker had issues [21:22:12] 10Scap: Linting phase in scap doesn't show which file caused the error - https://phabricator.wikimedia.org/T198423 (10Krinkle) [21:23:11] 10Release-Engineering-Team (Kanban), 10Patch-For-Review, 10Release, 10Train Deployments: 1.32.0-wmf.10 deployment blockers - https://phabricator.wikimedia.org/T191056 (10Krinkle) [21:23:13] 10Deployments: Running scap sync-dir php-1.32.0-wmf.10 fails due to syntax error - https://phabricator.wikimedia.org/T198422 (10Krinkle) [21:24:20] 10Deployments: Running scap sync-dir php-1.32.0-wmf.10 fails due to syntax error - https://phabricator.wikimedia.org/T198422 (10Krinkle) Traced to which first went out in wmf.10 * 7a7dcec075 (vendor) / c8b0acfe7c (core) [21:24:50] (03PS1) 10Hashar: Bump Quibble jobs timeout from 30 to 45 minutes [integration/config] - 10https://gerrit.wikimedia.org/r/442988 (https://phabricator.wikimedia.org/T198348) [21:25:53] (03CR) 10Hashar: [C: 032] Bump Quibble jobs timeout from 30 to 45 minutes [integration/config] - 10https://gerrit.wikimedia.org/r/442988 (https://phabricator.wikimedia.org/T198348) (owner: 10Hashar) [21:26:13] and I raised the timeout to 45 minutes https://gerrit.wikimedia.org/r/#/c/integration/config/+/442988/ [21:26:19] so at least jobs will vote ok [21:26:44] 10Deployments, 10Release-Engineering-Team, 10MediaWiki-Maintenance-scripts: Running scap sync-dir php-1.32.0-wmf.10 fails due to syntax error - https://phabricator.wikimedia.org/T198422 (10Krinkle) [21:27:08] 10Deployments, 10Release-Engineering-Team, 10MediaWiki-Maintenance-scripts: Running scap sync-dir php-1.32.0-wmf.10 fails due to syntax error - https://phabricator.wikimedia.org/T198422 (10Krinkle) p:05Triage>03Unbreak! [21:27:35] (03Merged) 10jenkins-bot: Bump Quibble jobs timeout from 30 to 45 minutes [integration/config] - 10https://gerrit.wikimedia.org/r/442988 (https://phabricator.wikimedia.org/T198348) (owner: 10Hashar) [21:27:36] hasharAway: Thx. 
[21:29:02] Krinkle: that is all I can do for today, I wake up at 6am tomorrow :/ [21:29:17] maybe cloudflare/npmjs will have it fixed [21:29:26] plan B, I setup a local npm registry.. [21:30:01] (03PS1) 10Hashar: Revert "Bump Quibble jobs timeout from 30 to 45 minutes" [integration/config] - 10https://gerrit.wikimedia.org/r/442989 (https://phabricator.wikimedia.org/T198348) [21:30:16] (03CR) 10Hashar: [C: 04-2] "Hold until the npmjs/cloudflare issue is resolved ( T198348 )" [integration/config] - 10https://gerrit.wikimedia.org/r/442989 (https://phabricator.wikimedia.org/T198348) (owner: 10Hashar) [21:31:49] sleep & [21:31:56] 10Continuous-Integration-Infrastructure, 10Patch-For-Review, 10Upstream, 10Wikimedia-log-errors (Shared Build Failure): Quibble CI jobs time out after 30min due to instance stalling at "npm install parse" step - https://phabricator.wikimedia.org/T198348 (10hashar) * The job now have a 45 minutes timeout (w... [21:37:04] PROBLEM - Free space - all mounts on integration-slave-docker-1011 is CRITICAL: CRITICAL: integration.integration-slave-docker-1011.diskspace.root.byte_percentfree (<55.56%) [22:24:24] Project beta-scap-eqiad build #213751: 15ABORTED in 1 hr 0 min: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/213751/ [22:27:36] PROBLEM - Free space - all mounts on deployment-tin is CRITICAL: CRITICAL: deployment-prep.deployment-tin.diskspace._mnt.byte_percentfree (No valid datapoints found)deployment-prep.deployment-tin.diskspace._srv.byte_percentfree (<11.11%) [23:26:46] (03PS1) 10Jforrester: [Parsoid] Stop injecting MwEmbedHandler as a dependency, it's a no-op [integration/config] - 10https://gerrit.wikimedia.org/r/443002 [23:26:48] (03PS1) 10Jforrester: [mwgate] Drop MwEmbedHandler, it's a no-op [integration/config] - 10https://gerrit.wikimedia.org/r/443003 [23:26:50] (03PS1) 10Jforrester: [CirrusSearch] Stop injecting MwEmbedHandler as a dependency, it's a no-op [integration/config] - 10https://gerrit.wikimedia.org/r/443004 
[23:26:53] (03PS1) 10Jforrester: [TimedMediaHandler] Stop injecting MwEmbedHandler as a dependency, it's a no-op [integration/config] - 10https://gerrit.wikimedia.org/r/443005 [23:26:54] (03PS1) 10Jforrester: [MwEmbedHandler] Archive [integration/config] - 10https://gerrit.wikimedia.org/r/443006 (https://phabricator.wikimedia.org/T197918) [23:27:54] (03CR) 10Paladox: [C: 04-1] "Will break the stable branches even if master won't use it." [integration/config] - 10https://gerrit.wikimedia.org/r/443005 (owner: 10Jforrester) [23:28:28] (03CR) 10jerkins-bot: [V: 04-1] [mwgate] Drop MwEmbedHandler, it's a no-op [integration/config] - 10https://gerrit.wikimedia.org/r/443003 (owner: 10Jforrester) [23:28:31] (03CR) 10jerkins-bot: [V: 04-1] [CirrusSearch] Stop injecting MwEmbedHandler as a dependency, it's a no-op [integration/config] - 10https://gerrit.wikimedia.org/r/443004 (owner: 10Jforrester) [23:28:35] (03CR) 10jerkins-bot: [V: 04-1] [TimedMediaHandler] Stop injecting MwEmbedHandler as a dependency, it's a no-op [integration/config] - 10https://gerrit.wikimedia.org/r/443005 (owner: 10Jforrester) [23:28:57] (03CR) 10Jforrester: "> Patch Set 1: Code-Review-1" [integration/config] - 10https://gerrit.wikimedia.org/r/443005 (owner: 10Jforrester) [23:29:38] (03CR) 10Paladox: [C: 04-1] "> > Patch Set 1: Code-Review-1" [integration/config] - 10https://gerrit.wikimedia.org/r/443005 (owner: 10Jforrester) [23:30:35] paladox: We don't support non-production use of TimedMediaHandler, including patches for old versions. [23:30:50] really? [23:30:56] why was a stable branch created? [23:31:15] Well, yeah, supporting TMH on third parties would be a huge amount of work. [23:31:35] All repos get the version branches for the adventurous. [23:31:50] so we recommend using the master branch? [23:32:12] i thought the master branch breaks support for older mw? [23:32:14] We recommend not using it. 
[23:33:00] Maybe once we set the videojs code as main and then delete the Kaltura code we could think about encouraging people to use it, but that's a long way off. [23:33:27] hmm [23:37:29] 10Beta-Cluster-Infrastructure, 10Parsoid, 10VisualEditor: VE is not loading on Beta Cluster, getting 503s - https://phabricator.wikimedia.org/T198421 (10Krenair) Looks like RB is timing out trying to connect to parsoid: ```krenair@deployment-cache-text04:~$ curl http://deployment-restbase01.deployment-prep.e... [23:53:32] Turns out even at 45 mins quibble can still time out: https://integration.wikimedia.org/ci/job/mediawiki-quibble-vendor-mysql-hhvm-docker/1579/console :-( [23:54:23] 10Beta-Cluster-Infrastructure, 10Parsoid, 10VisualEditor: VE is not loading on Beta Cluster, getting 503s - https://phabricator.wikimedia.org/T198421 (10Krenair) Tried restarting parsoid service on deployment-parsoid09 ad then getting the URI above. Based on `tail -f /srv/log/parsoid/main.log | grep -v Chang...