[00:33:35] Continuous-Integration-Infrastructure (phase-out-jessie), Release-Engineering-Team (CI & Testing services), Release-Engineering-Team-TODO (2020-01 to 2020-03 (Q3)), Cloud-VPS (Debian Jessie Deprecation): "integration" Cloud VPS project jessie deprecation - https://phabricator.wikimedia.org/T236576 (...
[00:42:39] PROBLEM - Host deployment-dumps-puppetmaster02 is DOWN: CRITICAL - Host Unreachable (172.16.4.101)
[00:59:28] MediaWiki-Codesniffer, Patch-For-Review: Set tab-width in the base ruleset file - https://phabricator.wikimedia.org/T243598 (Samwilson) Here's the 6 long lines: https://gerrit.wikimedia.org/r/#/c/mediawiki/core/+/579448
[01:02:28] (CR) DannyS712: [C: +1] "LGTM" [tools/codesniffer] - https://gerrit.wikimedia.org/r/579119 (https://phabricator.wikimedia.org/T243598) (owner: Samwilson)
[01:33:51] Gerrit, Social-Tools: Gerrit group creation request: Create group for Social-Tools - https://phabricator.wikimedia.org/T154078 (ashley) >>! In T154078#5959643, @DannyS712 wrote: > @ashley if this is still desired (per above comment), should it still be stalled? I'm not sure why this seemingly simple requ...
[01:34:56] Gerrit, Social-Tools: Gerrit group creation request: Create group for Social-Tools - https://phabricator.wikimedia.org/T154078 (DannyS712) Stalled→Open
[03:43:07] Continuous-Integration-Infrastructure, Release-Engineering-Team (CI & Testing services), Packaging, Patch-For-Review: debian-glue jobs have a 2 minutes delay when building against Jessie - https://phabricator.wikimedia.org/T247496 (bd808) I think we can probably drop the Jessie build for the labs...
[08:46:16] Beta-Cluster-Infrastructure, Parsoid, Core Platform Team Workboards (Clinic Duty Team): Parsoid-PHP should be publicly accessible in beta - https://phabricator.wikimedia.org/T247589 (Pchelolo)
[09:05:37] Continuous-Integration-Infrastructure, Gerrit, Release-Engineering-Team (CI & Testing services), Release-Engineering-Team-TODO (2020-01 to 2020-03 (Q3)), and 2 others: CI / Zuul not processing changes - https://phabricator.wikimedia.org/T246973 (hashar) Resolved→Open Reopening since that...
[09:07:33] Continuous-Integration-Infrastructure, Gerrit, Release-Engineering-Team (CI & Testing services), Release-Engineering-Team-TODO (2020-01 to 2020-03 (Q3)), and 2 others: CI / Zuul not processing changes - https://phabricator.wikimedia.org/T246973 (hashar)
[09:08:42] Continuous-Integration-Infrastructure, Gerrit, Release-Engineering-Team (CI & Testing services), Release-Engineering-Team-TODO (2020-01 to 2020-03 (Q3)), and 2 others: CI / Zuul not processing changes - https://phabricator.wikimedia.org/T246973 (hashar)
[09:15:13] Gerrit, DBA: Investigate Gerrit troubles to reach the MariaDB database - https://phabricator.wikimedia.org/T247591 (hashar)
[09:15:26] Continuous-Integration-Infrastructure, Gerrit, Release-Engineering-Team (CI & Testing services), Release-Engineering-Team-TODO (2020-01 to 2020-03 (Q3)), and 2 others: CI / Zuul not processing changes - https://phabricator.wikimedia.org/T246973 (hashar)
[09:15:29] Gerrit, DBA: Investigate Gerrit troubles to reach the MariaDB database - https://phabricator.wikimedia.org/T247591 (hashar)
[09:16:05] Continuous-Integration-Infrastructure, Gerrit, Release-Engineering-Team (CI & Testing services), Release-Engineering-Team-TODO (2020-01 to 2020-03 (Q3)), and 2 others: CI / Zuul not processing changes - https://phabricator.wikimedia.org/T246973 (hashar) I have filed T247591 for the DBA, there mig...
[09:57:27] (CR) Daimona Eaytoy: [C: +2] Add tab-width=4 and increase line length to 120 [tools/codesniffer] - https://gerrit.wikimedia.org/r/579119 (https://phabricator.wikimedia.org/T243598) (owner: Samwilson)
[09:58:06] (Merged) jenkins-bot: Add tab-width=4 and increase line length to 120 [tools/codesniffer] - https://gerrit.wikimedia.org/r/579119 (https://phabricator.wikimedia.org/T243598) (owner: Samwilson)
[10:06:35] Gerrit, DBA: Investigate Gerrit troubles to reach the MariaDB database - https://phabricator.wikimedia.org/T247591 (hashar)
[10:09:32] Gerrit, DBA: Investigate Gerrit troubles to reach the MariaDB database - https://phabricator.wikimedia.org/T247591 (jcrespo) > From time to time Could you be more specific, at random times? When under high load? Which approximate frequency: Once every month, every week, every day? I would bet (this is...
[10:10:34] Gerrit, DBA: Investigate Gerrit troubles to reach the MariaDB database - https://phabricator.wikimedia.org/T247591 (jcrespo) Thanks for the extra info, I commented before you added the logstash link. Any way to reproduce it manually?
[10:15:49] Continuous-Integration-Infrastructure, Gerrit, Release-Engineering-Team (CI & Testing services), Release-Engineering-Team-TODO (2020-01 to 2020-03 (Q3)), and 2 others: CI / Zuul not processing changes - https://phabricator.wikimedia.org/T246973 (hashar) Open→Stalled Stall for now, the wor...
[10:17:14] MediaWiki-Codesniffer: Set tab-width in the base ruleset file - https://phabricator.wikimedia.org/T243598 (Daimona) Open→Resolved a:Samwilson
[10:20:16] Release-Engineering-Team (Unit & Int & System Tooling), Release-Engineering-Team-TODO, Patch-For-Review, User-zeljkofilipin: Cypress testing framework evaluation - https://phabricator.wikimedia.org/T230729 (Jpita) @zeljkofilipin I want to use cypress in the new VueJS framework, we're testing it o...
[10:23:28] Release-Engineering-Team (Unit & Int & System Tooling), Release-Engineering-Team-TODO, Patch-For-Review, User-zeljkofilipin: Cypress testing framework evaluation - https://phabricator.wikimedia.org/T230729 (zeljkofilipin) Open→Stalled @Jpita as far as I know, no progress has been made. 😕
[10:28:44] Gerrit, DBA: Investigate Gerrit troubles to reach the MariaDB database - https://phabricator.wikimedia.org/T247591 (hashar) I could not find a way to reproduce it. From the log of events, that seems to be a transient issue, occurred early in january and again now. It started being noticeable for the last...
[10:34:12] Gerrit, DBA: Investigate Gerrit troubles to reach the MariaDB database - https://phabricator.wikimedia.org/T247591 (jcrespo) The last ERROR actually happened during a period of low load on the database: https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db1132&var-port=9104&var-dc=eqiad%2...
[10:37:27] Gerrit, DBA: Investigate Gerrit troubles to reach the MariaDB database - https://phabricator.wikimedia.org/T247591 (jcrespo) > doing too many connections at the same time Not that, it is the first thing I checked, at least not from the perspective of the server (we have metrics of that, and it is not ob...
[10:38:36] Gerrit, Release-Engineering-Team (Development services), Release-Engineering-Team-TODO: Update JavaMelody on Gerrit to 1.82.0 - https://phabricator.wikimedia.org/T232678 (hashar)
[10:41:23] Gerrit, DBA: Investigate Gerrit troubles to reach the MariaDB database - https://phabricator.wikimedia.org/T247591 (jcrespo) The other thing that is interesting is the: > proceedHandshakeWithPluggableAuthentication Could it be some kind of strange compatibility issue? MySQL 8 recently changed how auth...
[10:45:03] Release-Engineering-Team (Unit & Int & System Tooling), Release-Engineering-Team-TODO, Patch-For-Review, User-zeljkofilipin: Cypress testing framework evaluation - https://phabricator.wikimedia.org/T230729 (Jpita)
[10:45:43] Gerrit, DBA: Investigate Gerrit troubles to reach the MariaDB database - https://phabricator.wikimedia.org/T247591 (hashar) Excellent @jcrespo thank you very much for the detailed answers! I have no idea how Gerrit manages the database connections, the settings seem to be all default. Thus I don't know...
[10:48:04] Gerrit, DBA: Investigate Gerrit troubles to reach the MariaDB database - https://phabricator.wikimedia.org/T247591 (jcrespo) Oh, wait, this will connect using a dbproxy, so maybe the issue is there, not on mysql. Will give that a look.
[10:58:13] Gerrit, DBA: Investigate Gerrit troubles to reach the MariaDB database - https://phabricator.wikimedia.org/T247591 (jcrespo) I don't see immediate concerns about haproxy health, but I can see timeout is set as follows: ` timeout connect 3000ms timeout client 28800s timeout server 28800s ` Maybe the c...
[11:04:10] Gerrit, DBA: Investigate Gerrit troubles to reach the MariaDB database - https://phabricator.wikimedia.org/T247591 (Paladox) >>! In T247591#5966895, @hashar wrote: > Excellent @jcrespo thank you very much for the detailed answers! > > I have no idea how Gerrit manages the database connections, the setti...
[11:10:40] Gerrit, DBA: Investigate Gerrit troubles to reach the MariaDB database - https://phabricator.wikimedia.org/T247591 (Paladox) https://dev.mysql.com/doc/connector-j/5.1/en/connector-j-reference-configuration-properties.html
[11:24:10] Gerrit, DBA: Investigate Gerrit troubles to reach the MariaDB database - https://phabricator.wikimedia.org/T247591 (hashar) From Gerrit database configuration documentation ( https://gerrit.wikimedia.org/r/Documentation/config-gerrit.html#database ) > **`database.connectionPool`** > > If `true`, use co...
[11:46:50] Gerrit, DBA: Investigate Gerrit troubles to reach the MariaDB database - https://phabricator.wikimedia.org/T247591 (hashar) I have looked at the [[ https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&refresh=5m&var-server=gerrit1001&var-datasource=eqiad%20prometheus%2Fops&var-cluster=misc&fro...
[12:06:01] (CR) Hashar: "For paramiko < pycrypto, we need gcc, libssl-dev and others. I thought about updating the python3-build containers to have them ship the" [integration/zuul/deploy] - https://gerrit.wikimedia.org/r/577846 (https://phabricator.wikimedia.org/T215458) (owner: 20after4)
[12:06:07] Gerrit, DBA: Investigate Gerrit troubles to reach the MariaDB database - https://phabricator.wikimedia.org/T247591 (jcrespo) @hashar - I recommend the following: * Wait for T246098, which may impact negatively or positively this * Later, try to get some stats regarding TCP for the gerrit host and the pr...
[12:11:12] Gerrit, DBA: Investigate Gerrit troubles to reach the MariaDB database - https://phabricator.wikimedia.org/T247591 (jcrespo) >>! In T247591#5967006, @hashar wrote: > I have looked at the [[ https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&refresh=5m&var-server=gerrit1001&var-datasource=eqi...
[12:14:06] Continuous-Integration-Infrastructure, Release-Engineering-Team (CI & Testing services), Packaging, Patch-For-Review: debian-glue jobs have a 2 minutes delay when building against Jessie - https://phabricator.wikimedia.org/T247496 (akosiaris) If this is limited to just jessie, I 'd advise to not...
[12:16:49] Gerrit, Release-Engineering-Team (Development services): Gerrit javamelody monitoring lacks jdbc connections and SQL informations - https://phabricator.wikimedia.org/T247598 (hashar)
[12:34:23] Gerrit, DBA: Investigate Gerrit troubles to reach the MariaDB database - https://phabricator.wikimedia.org/T247591 (Paladox) @jynus could this https://bugs.mysql.com/bug.php?id=93590 be it? Also that’s fixed in https://dev.mysql.com/doc/relnotes/connector-j/5.1/en/news-5-1-48.html
[12:35:46] Gerrit, Release-Engineering-Team (Development services): Gerrit javamelody monitoring lacks jdbc connections and SQL informations - https://phabricator.wikimedia.org/T247598 (Paladox) It supports it, but we have NoteDB switched on which seems to affect this.
[12:38:58] Release-Engineering-Team, Wikimedia-GitHub: Github Actions do not run correctly on Github mirrors of Gerrit-hosted repositories - https://phabricator.wikimedia.org/T245826 (WMDE-leszek) Anyone from #release-engineering-team having any thoughts on this?
[12:40:51] Release-Engineering-Team (CI & Testing services), Quality-and-Test-Engineering-Team (QTE), User-zeljkofilipin: install cypress dependencies on CI - https://phabricator.wikimedia.org/T247599 (zeljkofilipin) p:Triage→Medium
[12:42:06] Release-Engineering-Team (CI & Testing services), Quality-and-Test-Engineering-Team (QTE), User-zeljkofilipin: install cypress dependencies on CI - https://phabricator.wikimedia.org/T247599 (Jpita) p:Medium→High
[13:33:44] Continuous-Integration-Config, Release-Engineering-Team-TODO (2020-01 to 2020-03 (Q3)), CPT Initiatives (API Integration Tests), Code-Health, and 2 others: In CI, configure MediaWiki to enable API testing - https://phabricator.wikimedia.org/T243978 (AMooney)
[14:02:51] Release-Engineering-Team (Pipeline), Analytics, Analytics-Kanban, Release Pipeline, and 2 others: Migrate EventStreams to k8s deployment pipeline - https://phabricator.wikimedia.org/T238658 (Ottomata) I think there are still some issues, but most things seem to be working fine. There is a period...
[14:05:50] Beta-Cluster-Infrastructure, Parsoid, Core Platform Team Workboards (Clinic Duty Team): Parsoid-PHP should be publicly accessible in beta - https://phabricator.wikimedia.org/T247589 (cscott) The rest_v1 api works, eg: ` curl -X GET "https://en.wikipedia.beta.wmflabs.org/api/rest_v1/page/html/Main_Pag...
[14:22:46] Beta-Cluster-Infrastructure, Parsoid, Core Platform Team Workboards (Clinic Duty Team), User-Ryasmeen: Parsoid-PHP should be publicly accessible in beta - https://phabricator.wikimedia.org/T247589 (cscott) Open→Resolved a:cscott I think this is not a bug. As documented in https://www...
[14:22:52] Beta-Cluster-Infrastructure, Core Platform Team, Parsoid, Patch-For-Review, User-Ryasmeen: Parsoid/RESTbase seems to be unavailable in Beta - https://phabricator.wikimedia.org/T246833 (cscott)
[14:24:08] Release-Engineering-Team (Pipeline), Analytics, Analytics-Kanban, Release Pipeline, and 2 others: Migrate EventStreams to k8s deployment pipeline - https://phabricator.wikimedia.org/T238658 (akosiaris) While things do indeed look way better, the memory leak is most certainly still there. Looking...
[14:45:32] (PS6) Hashar: Scap3 deploy repo for zuul [integration/zuul/deploy] - https://gerrit.wikimedia.org/r/577846 (https://phabricator.wikimedia.org/T215458) (owner: 20after4)
[15:05:38] Beta-Cluster-Infrastructure, Parsoid, Core Platform Team Workboards (Clinic Duty Team), User-Ryasmeen: Parsoid-PHP should be publicly accessible in beta - https://phabricator.wikimedia.org/T247589 (Pchelolo) Resolved→Open All your analysis is correct, but in beta cluster we need to have d...
[15:05:42] Beta-Cluster-Infrastructure, Core Platform Team, Parsoid, Patch-For-Review, User-Ryasmeen: Parsoid/RESTbase seems to be unavailable in Beta - https://phabricator.wikimedia.org/T246833 (Pchelolo)
[15:08:15] (CR) Hashar: "Adding volans as reviewer since he has done the scap repo for operations/software/homer :)" (5 comments) [integration/zuul/deploy] - https://gerrit.wikimedia.org/r/577846 (https://phabricator.wikimedia.org/T215458) (owner: 20after4)
[15:33:27] Beta-Cluster-Infrastructure: "The database is read-only until replication lag decreases" when saving preferences on beta - https://phabricator.wikimedia.org/T247617 (AlexisJazz)
[15:44:34] Beta-Cluster-Infrastructure, MediaWiki-Revision-backend, User-DannyS712: BetaCluster: ExternalStoreException - Unable to store text to external storage - https://phabricator.wikimedia.org/T228088 (AlexisJazz) @Krinkle look at the history of https://commons.wikimedia.beta.wmflabs.org/w/index.php?title...
[15:47:33] Beta-Cluster-Infrastructure, MediaWiki-Revision-backend, User-DannyS712: BetaCluster: ExternalStoreException - Unable to store text to external storage - https://phabricator.wikimedia.org/T228088 (Krinkle)
[16:03:08] (PS1) Hashar: Add .gitreview [integration/zuul/deploy] - https://gerrit.wikimedia.org/r/579588
[16:03:55] (CR) Hashar: [V: +2 C: +2] Add .gitreview [integration/zuul/deploy] - https://gerrit.wikimedia.org/r/579588 (owner: Hashar)
[16:08:04] PROBLEM - Work requests waiting in Zuul Gearman server on contint1001 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [150.0] https://www.mediawiki.org/wiki/Continuous_integration/Zuul https://grafana.wikimedia.org/dashboard/db/zuul-gearman?panelId=10&fullscreen&orgId=1
[16:09:02] Beta-Cluster-Infrastructure, Parsoid, Core Platform Team Workboards (Clinic Duty Team), User-Ryasmeen: Parsoid-PHP should be publicly accessible in beta - https://phabricator.wikimedia.org/T247589 (cscott) Can you point me to what you were using this for? Using rest.php seems to be the wrong thi...
[16:09:48] (PS1) KartikMistry: Add dependency on libgtk-3-0 [integration/config] - https://gerrit.wikimedia.org/r/579590 (https://phabricator.wikimedia.org/T247599)
[16:11:05] (CR) Hashar: [V: +2 C: +2] "That was merely to fix CI on https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/579587/" [integration/zuul/deploy] - https://gerrit.wikimedia.org/r/579588 (owner: Hashar)
[16:12:12] PROBLEM - Work requests waiting in Zuul Gearman server on contint1001 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [150.0] https://www.mediawiki.org/wiki/Continuous_integration/Zuul https://grafana.wikimedia.org/dashboard/db/zuul-gearman?panelId=10&fullscreen&orgId=1
[16:14:26] RECOVERY - Work requests waiting in Zuul Gearman server on contint1001 is OK: OK: Less than 100.00% above the threshold [90.0] https://www.mediawiki.org/wiki/Continuous_integration/Zuul https://grafana.wikimedia.org/dashboard/db/zuul-gearman?panelId=10&fullscreen&orgId=1
[16:26:53] Beta-Cluster-Infrastructure, Parsoid, Core Platform Team Workboards (Clinic Duty Team), User-Ryasmeen: Parsoid-PHP should be publicly accessible in beta - https://phabricator.wikimedia.org/T247589 (cscott) In particular, it looks like there was an alias set up for `parsoid-php-beta.wmflabs.org` -...
[16:26:55] Release-Engineering-Team (CI & Testing services), Release-Engineering-Team-TODO (2020-01 to 2020-03 (Q3)): Research available third-party Argo and Kubernetes cloud providers - https://phabricator.wikimedia.org/T244384 (dduvall)
[16:26:57] Release-Engineering-Team (CI & Testing services), Release-Engineering-Team-TODO (2020-01 to 2020-03 (Q3)): Order of magnitude for New CI hosting budget - https://phabricator.wikimedia.org/T247320 (dduvall) Open→Resolved Hand-wavy estimates using the GCS calculator have been sent to @thcipriani an...
[16:29:57] Continuous-Integration-Config, Release-Engineering-Team (CI & Testing services), Operations, SRE-tools, and 5 others: Integrate automated DNS snippets into CI - https://phabricator.wikimedia.org/T243362 (crusnov)
[16:33:35] Continuous-Integration-Config, Release-Engineering-Team (CI & Testing services), Operations, SRE-tools, and 5 others: Integrate automated DNS snippets into CI - https://phabricator.wikimedia.org/T243362 (crusnov) Summary of status: We have decided that modifying the CI image itself is unnecessar...
[16:34:03] (Abandoned) CRusnov: operations-dnslint: Add support for generated DNS as part of linting process [integration/config] - https://gerrit.wikimedia.org/r/568546 (https://phabricator.wikimedia.org/T243362) (owner: CRusnov)
[16:56:05] hi. I am not sure if this is the right channel for this, so please let me know: I was wondering if it would be possible to add https://travis-ci.org/github/wikimedia/operations-software-censorship-monitoring to the Travis account
[16:56:09] thanks
[17:03:03] or maybe, it's because I just pushed the travis.yml file and I need an additional commit to trigger it? the help message on Travis doesn't seem useful. "You don't have sufficient rights to enable this repo on Travis."
[17:03:15] sukhe: Done.
[17:03:27] oh great, thanks James_F!
[17:03:36] sukhe: But you should consider moving to GitHub Actions; they run faster. :-)
[17:03:58] Beta-Cluster-Infrastructure, Parsoid, Core Platform Team Workboards (Clinic Duty Team), User-Ryasmeen: Parsoid-PHP should be publicly accessible in beta - https://phabricator.wikimedia.org/T247589 (cscott) Ok, the issue seems to be that the restbase test suite uses `https://en.wikipedia.beta.wmfl...
[17:04:27] James_F: ah, is that so? cool.
I will check them out later
[17:18:49] Release-Engineering-Team (CI & Testing services), Quality-and-Test-Engineering-Team (QTE), Patch-For-Review, User-zeljkofilipin: install cypress dependencies on CI - https://phabricator.wikimedia.org/T247599 (Jdforrester-WMF)
[17:18:52] Release-Engineering-Team (Unit & Int & System Tooling), Release-Engineering-Team-TODO, Patch-For-Review, User-zeljkofilipin: Cypress testing framework evaluation - https://phabricator.wikimedia.org/T230729 (Jdforrester-WMF)
[17:20:42] so i set up a web proxy for parsoid-php.wmflabs.org => deployment-parsoid11
[17:20:53] but http://parsoid-php-beta.wmflabs.org/wiki/Main_Page gives "domain not configured"
[17:20:57] where does that come from?
[17:21:25] to be clear, I'm expecting some error, because we're using MWScript, so the Host header needs to be (re)set
[17:21:49] but the mwscript error message is different.
[17:21:51] (CR) Hashar: "Then we just have to bump the various changelog in quibble*/ and we are set." [integration/config] - https://gerrit.wikimedia.org/r/579590 (https://phabricator.wikimedia.org/T247599) (owner: KartikMistry)
[17:22:31] cscott: parsoid-php vs parsoid-php-beta ?
[17:23:16] i stripped -php- from the hostname. no more js or php, it's just parsoid
[17:23:33] oh, but that could be part of the problem, yes, hang on, thanks
[17:23:50] no, http://parsoid-beta.wmflabs.org/ gives the same thing
[17:24:38] and that's the correct hostname according to https://horizon.wikimedia.org/project/proxy/
[17:25:01] (CR) Jdlrobson: "cc Aron in case you are wondering why CI is not running on certain patches."
[integration/config] - https://gerrit.wikimedia.org/r/578983 (owner: Reedy)
[17:25:21] $ curl -x parsoid-beta.wmflabs.org:80 'http://en.wikipedia.beta.wmflabs.org/wiki/Special:Version'
[17:25:30] gives a 404 from nginx
[17:26:08] as does
[17:26:08] cscott: i think that error message actually comes from the backend and not the webproxy
[17:26:10] $ curl -H 'Host: en.wikipedia.beta.wmflabs.org:80' https://parsoid-beta.wmflabs.org/w/rest.php/foo
[17:26:23] that looks like the default apache error template that comes with mediawiki apache
[17:26:47] mutante: it's in mediawiki-config errorpages/default.php
[17:27:23] yea, so the webproxy part probably just works but the apache on an mw appserver does not know this domain name
[17:28:05] but setting Host should help then, right?
[17:28:06] but
[17:28:09] $ curl -H "Host: en.wikipedia.beta.wmflabs.org" https://parsoid-beta.wmflabs.org
[17:28:39] gives an nginx error -- a different one than the Domain not configured message, so that's your theory being right I guess?
[17:29:27] curl -H 'Host: en.wikipedia.org:80' https://parsoid-beta.wmflabs.org/w/rest.php/foo is a 404
[17:29:37] but that is not "Domain not configured"
[17:29:56] mutante: that's the nginx right?
[17:29:58] so en.wikipedia.org is found but then that path is not
[17:30:38] if you ssh into deployment-parsoid11, then this works:
[17:30:38] curl -x deployment-parsoid11:80 http://en.wikipedia.beta.wmflabs.org/wiki/Special:Version
[17:30:45] (CR) Aron Manning: "I'm sorry for leaving you out of the loop, I did not expect your interest. The former discussions are:" [integration/config] - https://gerrit.wikimedia.org/r/578983 (owner: Reedy)
[17:31:03] cscott: yea, that is nginx answering
[17:32:00] this also works (only when logged into deployment-parsoid11):
[17:32:03] $ curl -H 'Host: en.wikipedia.beta.wmflabs.org' http://deployment-parsoid11/wiki/Special:Version
[17:33:02] maybe the web proxy is munging the Host header?
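One way to check the "is something rewriting the Host header?" hypothesis in isolation is to send a request to one address while forcing a different Host header and see what the backend actually receives. The sketch below is only an illustration under assumptions: it uses a throwaway local echo server as a stand-in for the backend (it does not talk to the Horizon web proxy or to deployment-parsoid11), and the names `EchoHostHandler`, `fetch_with_host` and `demo` are hypothetical. It mirrors what `curl -H 'Host: ...' http://some-backend/` does.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoHostHandler(BaseHTTPRequestHandler):
    """Stand-in backend (assumption): replies with the Host header it received."""

    def do_GET(self):
        body = (self.headers.get("Host") or "").encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        pass  # suppress per-request logging


def fetch_with_host(connect_host, connect_port, host_header, path="/"):
    """Connect to one address but send a different Host header."""
    conn = http.client.HTTPConnection(connect_host, connect_port, timeout=5)
    try:
        # Passing an explicit Host header stops http.client from adding its own.
        conn.request("GET", path, headers={"Host": host_header})
        resp = conn.getresponse()
        return resp.status, resp.read().decode()
    finally:
        conn.close()


def demo():
    """Start the echo server on a free local port and query it once."""
    server = HTTPServer(("127.0.0.1", 0), EchoHostHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return fetch_with_host(
            "127.0.0.1",
            server.server_port,
            "en.wikipedia.beta.wmflabs.org",
            "/wiki/Special:Version",
        )
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```

If an intermediate proxy sat between `fetch_with_host` and the echo server, comparing the echoed value with the header that was sent would show whether the proxy forwards, rewrites or drops it.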
[17:33:58] (CR) Jdlrobson: "Ignore I now see https://www.mediawiki.org/wiki/User_talk:Reedy#Jenkins_request" [integration/config] - https://gerrit.wikimedia.org/r/578983 (owner: Reedy)
[17:36:00] cscott: this used to work in the past apparently, something to compare:
[17:36:04] ssastry@deployment-mediawiki-parsoid10:~$ curl -v https://en.wikipedia.beta.wmflabs.org/wiki/Main_Page -x deployment-mediawiki-parsoid10:80
[17:36:13] found that on previous ticket
[17:36:46] https://phabricator.wikimedia.org/T231569#5507884 ff
[17:39:16] i get a 404 when doing that with -x parsoid-beta.wmflabs.org:80 . you might be right about the proxy
[17:40:14] that ticket above seems to be pretty similar. for example the comment "As for testing it externally, MediaWiki config doesn't know about 'parsoid-php-beta.wmflabs.org' as a domain/proxy I think and maybe some config patch is required if you want to hit it from outside "
[17:40:17] (CR) Aron Manning: "> Aron would be great if you were in IRC as it's not clear to me what is the best way to communicate with you more in real time." [integration/config] - https://gerrit.wikimedia.org/r/578983 (owner: Reedy)
[17:40:18] mutante: but that was run on parsoid10
[17:40:24] mutante: it still works from the host itself
[17:40:40] but restbase needs to access the host from outside of labs, because apparently it's in their travis ci
[17:40:42] cscott: yea, so it's something to compare to parsoid11
[17:41:33] Continuous-Integration-Config, Release-Engineering-Team (CI & Testing services), Release-Engineering-Team-TODO (2020-01 to 2020-03 (Q3)), JavaScript: Migrate mediawiki-core-jsduck-docker-publish off node 6 so it works again - https://phabricator.wikimedia.org/T247536 (Jdforrester-WMF) Open→...
[17:41:36] Continuous-Integration-Config, Release-Engineering-Team (CI & Testing services), Release-Engineering-Team-TODO (2020-01 to 2020-03 (Q3)), JavaScript: Upgrade all CI jobs from node6/npm3 to node10/npm6 across all projects - https://phabricator.wikimedia.org/T211784 (Jdforrester-WMF)
[17:41:52] see that last comment about testing it externally and what followed it to "added the needed Varnish config to Horizon "
[17:42:14] cscott@deployment-parsoid11:~$ curl -v https://en.wikipedia.beta.wmflabs.org/wiki/Main_Page -x deployment-parsoid11:80
[17:42:16] works
[17:42:43] oh, hm, is varnish involved?
[17:42:50] after that Marco had a curl from external
[17:42:56] i'd imagined the web proxy as a straight pipe
[17:44:04] Beta-Cluster-Infrastructure, Parsoid, Core Platform Team Workboards (Clinic Duty Team), User-Ryasmeen: Parsoid-PHP should be publicly accessible in beta - https://phabricator.wikimedia.org/T247589 (cscott) @Dzahn points out that T231569#5507884 is pretty similar.
[17:44:40] i am not sure, it's different in beta from other projects i think
[17:44:52] there are instances like deployment-cache-text05
[17:44:59] Project-Admins, User-DannyS712: Create #dependency-injection tag - https://phabricator.wikimedia.org/T247623 (DannyS712)
[17:45:46] (Because why would it be the same?)
[17:46:28] Project-Admins, User-DannyS712: Create #dependency-injection tag - https://phabricator.wikimedia.org/T247623 (DannyS712)
[17:46:45] It appears that the "needed Varnish config" was the "magic" that @mobrovac added
[17:47:16] Project-Admins, User-DannyS712: Create #dependency-injection tag - https://phabricator.wikimedia.org/T247623 (DannyS712) As a component, this can be created without discussion[1], but just wanted to make this task to see if there were any objections. Will create in a few days if there are none [1] https...
[17:47:16] but he did it to redirect /w/rest.php on en.wikipedia.beta.wmflabs.org to parsoid, instead of handling parsoid-beta.wmflabs.org directly
[17:48:12] All REST calls
[17:48:13] ?
[17:48:34] Project-Admins, User-DannyS712: Create #mediawiki-dependency-injection - https://phabricator.wikimedia.org/T247623 (DannyS712)
[17:48:50] Phabricator, Release-Engineering-Team (Kanban): Deploy "Deadlines" feature - https://phabricator.wikimedia.org/T191865 (MBinder_WMF) @mmodell Some challenges my teams are facing: # They would like to have multiple stamps on a task. So, for instance, a Bug Report for a bug that also has a deadline (an...
[17:49:03] Project-Admins, User-DannyS712: Create #MediaWiki-TrackingCategories tag - https://phabricator.wikimedia.org/T247192 (DannyS712) a:DannyS712
[17:49:15] James_F: yes, apparently
[17:49:21] James_F: that's what i'm trying to *avoid* doing this time.
[17:50:07] since /w/rest.php is distinct from /api/rest_v1/ and from the parsoid API and i think it will only cause pain and suffering later if I conflate them.
[17:50:35] So we broke all rest.php testing on Beta Cluster just to make Parsoid / local test RESTbase testing work?
[17:50:56] James_F: i think the reason he did that though is to make the vhosting 'just work' without having to manually specify a desired Host header for parsoid-beta.wmflabs.org
[17:51:15] James_F: must not have mattered because no one noticed, right? :-p
[17:51:15] Right.
[17:53:15] James_F: anyway, where do you think "varnish config on horizon" would live? puppet configuration, i'd think?
[17:53:26] I guess?
[17:53:32] * James_F is not an expert.
[17:53:47] is there any easy way to diff what's in horizon vs what's in puppet git, to see if there are uncommitted changes related to this?
[17:54:13] i've been grepping through the puppet repo for likely strings, but i may have been missing uncommitted changes directly in horizon
[17:55:35] Gerrit, DBA: Investigate Gerrit troubles to reach the MariaDB database - https://phabricator.wikimedia.org/T247591 (hashar) @paladox unlikely :] We need traces and try to find the root cause first! Thanks for more hints Jaime. Indeed let's hold for the m2 upgrade. We might also look into switching to c...
[17:55:55] Gerrit, DBA: Investigate Gerrit troubles to reach the MariaDB database - https://phabricator.wikimedia.org/T247591 (hashar) Open→Stalled p:Triage→Medium
[17:55:57] Continuous-Integration-Infrastructure, Gerrit, Release-Engineering-Team (CI & Testing services), Release-Engineering-Team-TODO (2020-01 to 2020-03 (Q3)), and 2 others: CI / Zuul not processing changes - https://phabricator.wikimedia.org/T246973 (hashar)
[17:59:03] cscott: maybe, since horizon Hiera changes are also some kind of commit now. but i would ask in -cloud
[17:59:51] cscott: regarding varnish. yea, i think that would be Hiera things on the instances called deployment-cache-text*.
[18:00:17] unless varnish got replaced by ATS there meanwhile
[18:00:30] (PS2) Jforrester: dockerfiles: [quibble-stretch] Add dependency on libgtk-3-0 [integration/config] - https://gerrit.wikimedia.org/r/579590 (https://phabricator.wikimedia.org/T247599) (owner: KartikMistry)
[18:04:20] mutante: what is ATS?
[18:04:39] cscott: Apache Traffic Server.
[18:04:44] cscott: Aka new-Varnish.
[18:06:17] Project-Admins, User-DannyS712: Create #MediaWiki-TrackingCategories tag - https://phabricator.wikimedia.org/T247192 (DannyS712) Open→Resolved p:Triage→Medium Created #mediawiki-trackingcategories and followed the steps listed at https://www.mediawiki.org/wiki/Phabricator/Project_manageme...
[18:06:19] cscott: traffic team is replacing varnish with ATS but it's ongoing
[18:08:34] well, i see where you'd add a redirect for /w/rest.php to https://horizon.wikimedia.org/project/instances/eed81e86-2874-4740-a8c2-fee29ced046d/?marker=06c8f1f1-2070-4d05-b97b-386c6bf5636b
[18:08:49] but i don't want to do it that way, i want the web proxy to Just Work
[18:09:34] my working hypothesis is that the horizon web proxy is munging or deleting or not forwarding the Host headers in some way.
[18:09:37] maybe try asking in -cloud or -cloud-admin about the proxy and host header
[18:10:00] ok. i'm going to get lunch, then i'll ask when i get back
[18:14:56] (CR) Jforrester: [C: +2] dockerfiles: [quibble-stretch] Add dependency on libgtk-3-0 [integration/config] - https://gerrit.wikimedia.org/r/579590 (https://phabricator.wikimedia.org/T247599) (owner: KartikMistry)
[18:16:03] (Merged) jenkins-bot: dockerfiles: [quibble-stretch] Add dependency on libgtk-3-0 [integration/config] - https://gerrit.wikimedia.org/r/579590 (https://phabricator.wikimedia.org/T247599) (owner: KartikMistry)
[18:16:36] Phabricator, Project-Admins, CommRel-Design, WMF-Communications: Archive #CommRel-Design and related Phabricator Form? - https://phabricator.wikimedia.org/T246853 (hdothiduc) Thank you all! And thank you Andre, sorry I can't tell you yet when all the ongoing and to do tasks will be completed!
[18:18:05] !log Docker: Pushing quibble-stretch 0.0.40-s1 and cascade.
[18:18:06] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [18:43:03] (03PS3) 10Jforrester: jjb: Make mwext-doxygen-publish publish to DOC_PROJECT not DOC_BASENAME [integration/config] - 10https://gerrit.wikimedia.org/r/574563 (https://phabricator.wikimedia.org/T246042) [18:43:17] (03CR) 10Jforrester: [C: 03+2] jjb: Make mwext-doxygen-publish publish to DOC_PROJECT not DOC_BASENAME [integration/config] - 10https://gerrit.wikimedia.org/r/574563 (https://phabricator.wikimedia.org/T246042) (owner: 10Jforrester) [18:44:08] (03Merged) 10jenkins-bot: jjb: Make mwext-doxygen-publish publish to DOC_PROJECT not DOC_BASENAME [integration/config] - 10https://gerrit.wikimedia.org/r/574563 (https://phabricator.wikimedia.org/T246042) (owner: 10Jforrester) [18:46:50] !log jjb: Make mwext-doxygen-publish publish to DOC_PROJECT not DOC_BASENAME T246042 [18:46:52] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [18:46:52] T246042: Reconcile `mwext-doxygen-publish and `mwext-node10-docs-docker-publish` put files in different locations; pick one - https://phabricator.wikimedia.org/T246042 [18:50:05] 10Deployments, 10Release-Engineering-Team (Deployment services), 10Release-Engineering-Team-TODO (2020-01 to 2020-03 (Q3)), 10serviceops, and 3 others: Cache of wmf-config/InitialiseSettings often 1 step behind - https://phabricator.wikimedia.org/T236104 (10thcipriani) FWIW, ran some tests with apachebench... [19:17:57] 10Project-Admins, 10User-DannyS712: Create #mediawiki-dependency-injection - https://phabricator.wikimedia.org/T247623 (10Aklapper) To help me understand, what would be two example tasks that should receive this new tag? 
[19:22:21] (03PS1) 10Jforrester: Follow-up 6a37f7f: Update doxygen publish paths [integration/docroot] - 10https://gerrit.wikimedia.org/r/579611 (https://phabricator.wikimedia.org/T246042) [19:22:23] (03PS1) 10Jforrester: Add ChessBrowser [integration/docroot] - 10https://gerrit.wikimedia.org/r/579612 [19:22:42] (03CR) 10jerkins-bot: [V: 04-1] Follow-up 6a37f7f: Update doxygen publish paths [integration/docroot] - 10https://gerrit.wikimedia.org/r/579611 (https://phabricator.wikimedia.org/T246042) (owner: 10Jforrester) [19:23:35] (03CR) 10Jforrester: "recheck" [integration/docroot] - 10https://gerrit.wikimedia.org/r/579611 (https://phabricator.wikimedia.org/T246042) (owner: 10Jforrester) [19:24:23] (03CR) 10Jforrester: [C: 03+2] Follow-up 6a37f7f: Update doxygen publish paths [integration/docroot] - 10https://gerrit.wikimedia.org/r/579611 (https://phabricator.wikimedia.org/T246042) (owner: 10Jforrester) [19:24:25] (03CR) 10Jforrester: [C: 03+2] Add ChessBrowser [integration/docroot] - 10https://gerrit.wikimedia.org/r/579612 (owner: 10Jforrester) [19:24:55] (03Merged) 10jenkins-bot: Follow-up 6a37f7f: Update doxygen publish paths [integration/docroot] - 10https://gerrit.wikimedia.org/r/579611 (https://phabricator.wikimedia.org/T246042) (owner: 10Jforrester) [19:24:57] (03Merged) 10jenkins-bot: Add ChessBrowser [integration/docroot] - 10https://gerrit.wikimedia.org/r/579612 (owner: 10Jforrester) [19:26:53] 10Continuous-Integration-Config, 10Patch-For-Review: Reconcile `mwext-doxygen-publish and `mwext-node10-docs-docker-publish` put files in different locations; pick one - https://phabricator.wikimedia.org/T246042 (10Jdforrester-WMF) 05Open→03Resolved a:03Jdforrester-WMF [19:29:55] (03PS1) 10Jforrester: jjb: Migrate all quibble images to version with libgtk-3.0 [integration/config] - 10https://gerrit.wikimedia.org/r/579614 (https://phabricator.wikimedia.org/T247599) [19:32:58] Demian_: Hey there. 
Zuul-merger is a piece of software inside CI that creates virtual git commits that represent a potential state of reality. [19:33:38] When you write a patch, it is responsible for creating a commit that represents what would happen if we merged that patch onto the tip of the branch right now. [19:34:16] So it auto-rebases it (and anything beneath it) onto the branch; it does the same for any Depends-On patches in other repos that are tested alongside this one. [19:34:56] For one patch directly on current master with no dependencies, this is one virtual commit. [19:35:14] If the patch is "on" master but non-current, that's two commits (the former one, plus the auto-rebased one). [19:35:50] If the patch is dependent on a patch beneath it, that's best-case two and normally four. [19:36:04] If the patch is dependent on two patches beneath it, that's best-case four and normally eight (2^n expansion). [19:36:38] A stack of 40 commits instructs zuul-cloner to create 2^40 different versions of reality. It's exceptionally expensive. [19:36:51] 10Project-Admins, 10User-DannyS712: Create #mediawiki-dependency-injection - https://phabricator.wikimedia.org/T247623 (10DannyS712) >>! In T247623#5968257, @Aklapper wrote: > To help me understand, what would be two example tasks that should receive this new tag? More than 2 :) * {T141495} * {T245900} * {T20... [19:37:03] (As in, it massively degraded CI for 9 hours until we killed things.) [19:37:11] Demian_: Is that helpful? [19:37:41] Thanks, processing [19:38:11] !log jjb: Speculatively pushing mediawiki-fresnel-patch-docker with libgtk-3.0 for manual testing of T247599 [19:38:12] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [19:38:13] T247599: install cypress dependencies on CI - https://phabricator.wikimedia.org/T247599 [19:40:07] @James_F I assume there is a reason to automatically rebase tested patches on master.
This caused failed build for me when a conflicting patch was merged in the meantime. What the reason to rebase? [19:40:49] Demian_: So that the test run tells you anything useful about whether or not it can be merged. [19:42:44] I see. The "Merge conflict" tag in gerrit is not sufficient for that? [19:42:57] What do you think works out if there's a merge conflict? :-) [19:43:25] In that case there was no merge conflict without rebase. [19:43:39] That doesn't make any sense? [19:44:25] The build could be made successfully, on the original base, not on the current top of master [19:45:13] That's not a successful build. [19:45:43] We're not in GitHub arbitrary-branch land. We're a merge-to-branch environment. [19:45:50] It is imo. Outdated, but successful as in operational [19:46:09] Telling you that the code you wrote merges on the code it merged on last month is totally irrelevant and a waste of everyone's time. [19:46:20] The only question is "can I usefully merge this into HEAD?". [19:46:37] I'm talking about hours, not a month :D [19:46:54] Sure, months, nanoseconds. Same difference. [19:47:01] Either it's mergeable or it isn't. [19:47:10] We're not in the business of writing random code for no purpose. [19:48:12] (03CR) 10Jforrester: [C: 03+2] "Deployed." [integration/config] - 10https://gerrit.wikimedia.org/r/579614 (https://phabricator.wikimedia.org/T247599) (owner: 10Jforrester) [19:48:40] 10Release-Engineering-Team (CI & Testing services), 10Quality-and-Test-Engineering-Team (QTE), 10Patch-For-Review, 10User-zeljkofilipin: install cypress dependencies on CI - https://phabricator.wikimedia.org/T247599 (10Jdforrester-WMF) 05Open→03Resolved a:05hashar→03Jdforrester-WMF [19:48:43] 10Release-Engineering-Team (Unit & Int & System Tooling), 10Release-Engineering-Team-TODO, 10Patch-For-Review, 10User-zeljkofilipin: Cypress testing framework evaluation - https://phabricator.wikimedia.org/T230729 (10Jdforrester-WMF) [19:48:45] Okay... 
What I needed in that case is, however, to see if the tests succeed on the specific, non-rebased version. It was not code meant to be merged, so it was irrelevant for me whether it can be merged (rebased). [19:49:04] OK, you should write code in a different environment then. [19:49:04] (03Merged) 10jenkins-bot: jjb: Migrate all quibble images to version with libgtk-3.0 [integration/config] - 10https://gerrit.wikimedia.org/r/579614 (https://phabricator.wikimedia.org/T247599) (owner: 10Jforrester) [19:49:22] But you understand what I mean? [19:49:32] I understand you're asking for something we intentionally don't provide. [19:49:41] (And we're not going to provide.) [19:52:30] Okay. Actually I was asking for the reasons, not a feature. Thanks for explaining. [19:52:52] * James_F nods. [19:53:21] We're in the middle of replacing the whole of CI, so all of these issues will change in the medium-term (next 12 months or so). [19:54:26] That's interesting. Are some documents to be read about that project? [19:54:58] Some, yes. [19:56:49] https://www.mediawiki.org/wiki/Wikimedia_Release_Engineering_Team/CI_Futures_WG#Charter -> https://www.mediawiki.org/wiki/Wikimedia_Release_Engineering_Team/CI_Futures_WG/Candidates -> https://www.mediawiki.org/wiki/Wikimedia_Release_Engineering_Team/CI_Futures_WG/Phase2 -> https://www.mediawiki.org/wiki/Wikimedia_Release_Engineering_Team/Seakeeper_proposal [19:56:55] I'm sure there's more docs out there. [20:07:12] > If the patch is dependent on two patches beneath it, that's best-case four and normally eight (2^n expansion). [20:07:15] As I understand for each first patch in the chain it creates one state for the patch applied to its original base and one for the rebased. Then for the next, it creates 2 states: original 1st + 2nd patch, 1st rebased + 2nd patch, but what's the remaining 2? [20:14:48] Because it also has to consider if this can be rebased off the stack underneath.
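[editor's note] The 2^n figure quoted above can be reproduced with a small counting sketch. This is an illustration only, under the assumption that every patch in a stack is independently either left on its original base or auto-rebased; `candidate_states` is a hypothetical helper name and nothing here is taken from zuul-merger's actual code.

```python
from itertools import product

# Assumption for illustration: each patch in a stack is in one of two states.
PATCH_STATES = ("unrebased", "rebased")


def candidate_states(stack_depth):
    """Return every unrebased/rebased combination for a stack of patches.

    Hypothetical helper: it only models the counting argument from the
    chat, not how zuul-merger builds its virtual commits.
    """
    return list(product(PATCH_STATES, repeat=stack_depth))


if __name__ == "__main__":
    # Small stacks can be enumerated directly: the count doubles per patch.
    for depth in (1, 2, 3):
        print(depth, len(candidate_states(depth)))

    # A 40-patch stack is only counted, never enumerated.
    print(2 ** 40)  # 1099511627776
```

Enumerating is only practical for small stacks; for the 40-commit stack mentioned earlier the sketch computes the count arithmetically, which is the point of the "exceptionally expensive" remark.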
[20:25:08] 10Beta-Cluster-Infrastructure, 10Parsoid, 10Core Platform Team Workboards (Clinic Duty Team), 10User-Ryasmeen: Parsoid-PHP should be publicly accessible in beta - https://phabricator.wikimedia.org/T247589 (10cscott) [20:25:11] 10Beta-Cluster-Infrastructure, 10Release-Engineering-Team-TODO (2020-01 to 2020-03 (Q3)), 10Parsoid, 10Patch-For-Review: Replace deployment-mediawiki-parsoid10 with a "purer" deployment-parsoid11 box - https://phabricator.wikimedia.org/T246854 (10cscott) [20:27:29] Ye, I understand that it makes 1 unrebased and one rebased on top of the rebased chain. That's 2 parallel chains: 1 unrebased, 1 rebased on the actual master. I don't understand what are the remaining 2 states (2 states in case of the 2nd patch, 8-2=6 in case the 3rd, etc.) [20:30:48] 1: unrebased plus child unrebased. 2: unrebased plus child rebased. 3: rebased plus child unrebased. 4: rebased plus child rebased. [20:31:03] Each of the commits can be out of sync with its direct parent. [20:34:11] 10Beta-Cluster-Infrastructure, 10Parsoid, 10Core Platform Team Workboards (Clinic Duty Team), 10User-Ryasmeen: Parsoid-PHP should be publicly accessible in beta - https://phabricator.wikimedia.org/T247589 (10cscott) Ok, we've set up the parsoid server with a floating IP and A record: `parsoid-external-ci-a... [20:37:54] I see. The test is only run on " 4: rebased plus child rebased." as I understand. What's the other 3 used for? [20:39:08] (03PS1) 10DannyS712: Whitelist QEDK [integration/config] - 10https://gerrit.wikimedia.org/r/579623 [20:40:00] Demian_: To see if it can fall back to those if the full one doesn't work. [20:40:28] (03PS2) 10DannyS712: Whitelist QEDK [integration/config] - 10https://gerrit.wikimedia.org/r/579623 [20:41:07] (03CR) 10jerkins-bot: [V: 04-1] Whitelist QEDK [integration/config] - 10https://gerrit.wikimedia.org/r/579623 (owner: 10DannyS712) [20:43:05] How this fallback works? 
In case of my patch that conflicted with the new master commit, it could have used the fallback non-rebased state, but it did not. The tests weren't run. [20:43:13] (03PS3) 10DannyS712: Whitelist QEDK [integration/config] - 10https://gerrit.wikimedia.org/r/579623 [20:44:35] 10Beta-Cluster-Infrastructure: Fix scap on deployment-parsoid11 - https://phabricator.wikimedia.org/T247545 (10Krenair) a:03Krenair think I sorted this by adding the `role::beta::mediawiki` role and running puppet [20:45:49] (03CR) 10DannyS712: "@QEDK this will make Jenkins run tests on your changes automatically, rather than waiting for someone to say "recheck" or add a CR vote" [integration/config] - 10https://gerrit.wikimedia.org/r/579623 (owner: 10DannyS712) [20:54:59] Project beta-scap-eqiad build #291690: 04FAILURE in 20 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/291690/ [20:55:54] Yeah, see -cloud chat [21:01:16] (03CR) 10QEDK: "> Patch Set 3:" [integration/config] - 10https://gerrit.wikimedia.org/r/579623 (owner: 10DannyS712) [21:04:29] Hmm. [21:04:41] Hopefully beta-scap-eqiad will fix itself next time. [21:08:44] Yippee, build fixed! [21:08:44] Project beta-scap-eqiad build #291691: 09FIXED in 4 min 9 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/291691/ [21:11:00] 10Beta-Cluster-Infrastructure: Fix scap on deployment-parsoid11 - https://phabricator.wikimedia.org/T247545 (10Jdforrester-WMF) 05Open→03Resolved Thank you! 
[21:11:03] 10Beta-Cluster-Infrastructure, 10Release-Engineering-Team-TODO (2020-01 to 2020-03 (Q3)), 10Parsoid: Replace deployment-mediawiki-parsoid10 with a "purer" deployment-parsoid11 box - https://phabricator.wikimedia.org/T246854 (10Jdforrester-WMF) [21:11:21] 10Beta-Cluster-Infrastructure, 10Release-Engineering-Team-TODO (2020-01 to 2020-03 (Q3)), 10Parsoid: Replace deployment-mediawiki-parsoid10 with a "purer" deployment-parsoid11 box - https://phabricator.wikimedia.org/T246854 (10Jdforrester-WMF) [21:11:43] 10Beta-Cluster-Infrastructure, 10Parsoid, 10Core Platform Team Workboards (Clinic Duty Team), 10User-Ryasmeen: Parsoid-PHP should be publicly accessible in beta - https://phabricator.wikimedia.org/T247589 (10Jdforrester-WMF) 05Open→03Resolved [21:11:46] 10Beta-Cluster-Infrastructure, 10Release-Engineering-Team-TODO (2020-01 to 2020-03 (Q3)), 10Parsoid: Replace deployment-mediawiki-parsoid10 with a "purer" deployment-parsoid11 box - https://phabricator.wikimedia.org/T246854 (10Jdforrester-WMF) [21:11:51] 10Beta-Cluster-Infrastructure, 10Core Platform Team, 10Parsoid, 10Patch-For-Review, 10User-Ryasmeen: Parsoid/RESTbase seems to be unavailable in Beta - https://phabricator.wikimedia.org/T246833 (10Jdforrester-WMF) [21:11:59] 10Beta-Cluster-Infrastructure, 10Release-Engineering-Team-TODO (2020-01 to 2020-03 (Q3)), 10Parsoid: Replace deployment-mediawiki-parsoid10 with a "purer" deployment-parsoid11 box - https://phabricator.wikimedia.org/T246854 (10Jdforrester-WMF) 05Open→03Resolved [21:12:05] 10Beta-Cluster-Infrastructure, 10Core Platform Team, 10Parsoid, 10Patch-For-Review, 10User-Ryasmeen: Parsoid/RESTbase seems to be unavailable in Beta - https://phabricator.wikimedia.org/T246833 (10Jdforrester-WMF) [21:12:41] 10Beta-Cluster-Infrastructure, 10Parsoid, 10Core Platform Team Workboards (Clinic Duty Team), 10User-Ryasmeen: Parsoid-PHP should be publicly accessible in beta - https://phabricator.wikimedia.org/T247589 (10cscott) Fix in progress at 
https://github.com/wikimedia/restbase/pull/1245 [21:13:25] 10Beta-Cluster-Infrastructure, 10Release-Engineering-Team-TODO (2020-01 to 2020-03 (Q3)), 10Parsoid: Replace deployment-mediawiki-parsoid10 with a "purer" deployment-parsoid11 box - https://phabricator.wikimedia.org/T246854 (10cscott) [21:13:28] 10Beta-Cluster-Infrastructure, 10Parsoid, 10Core Platform Team Workboards (Clinic Duty Team), 10User-Ryasmeen: Parsoid-PHP should be publicly accessible in beta - https://phabricator.wikimedia.org/T247589 (10cscott) 05Resolved→03Open (re-opening until I actually fix restbase CI... perhaps I should crea... [21:13:31] 10Beta-Cluster-Infrastructure, 10Core Platform Team, 10Parsoid, 10Patch-For-Review, 10User-Ryasmeen: Parsoid/RESTbase seems to be unavailable in Beta - https://phabricator.wikimedia.org/T246833 (10cscott) [21:30:07] PROBLEM - English Wikipedia Main page on beta-cluster is CRITICAL: CRITICAL - Socket timeout after 10 seconds [21:31:11] James_F: ^ did you have a look at them alerts? [21:34:56] RECOVERY - English Wikipedia Main page on beta-cluster is OK: HTTP OK: HTTP/1.1 200 OK - 93272 bytes in 1.254 second response time [21:45:58] RhinosF1: Looks like it fixed itself. 
[21:48:44] yep [21:49:02] PROBLEM - Work requests waiting in Zuul Gearman server on contint1001 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [150.0] https://www.mediawiki.org/wiki/Continuous_integration/Zuul https://grafana.wikimedia.org/dashboard/db/zuul-gearman?panelId=10&fullscreen&orgId=1 [21:51:14] PROBLEM - Work requests waiting in Zuul Gearman server on contint1001 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [150.0] https://www.mediawiki.org/wiki/Continuous_integration/Zuul https://grafana.wikimedia.org/dashboard/db/zuul-gearman?panelId=10&fullscreen&orgId=1 [21:54:55] merger:merge 272 2 2 [22:12:02] PROBLEM - Work requests waiting in Zuul Gearman server on contint1001 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [150.0] https://www.mediawiki.org/wiki/Continuous_integration/Zuul https://grafana.wikimedia.org/dashboard/db/zuul-gearman?panelId=10&fullscreen&orgId=1 [22:14:59] hashar: is there a big chain of ops/puppet commits about to crash land somewhere? [22:15:13] yeah [22:15:26] should we lock down that zuul repo? [22:15:28] and I guess the git merge on operations/puppet are a bit slow [22:15:32] https://integration.wikimedia.org/zuul/ is mostly empty. [22:16:27] seems to be a lot of tasks in the merge queue though and I'm not sure why [22:16:46] Unreaped zombie tasks? [22:16:51] and some more on mediawiki-config [22:16:58] in the debug log for zuul merger I see stuff about ops/puppet and mw-config, yeah [22:17:04] so that is two chains of commits send at roughly the same time [22:17:15] James and I talked about it earlier today during our CI checkin [22:17:20] we should get a few more merger [22:17:38] the root cause is fixed in Zuul upstream code, but I could not manage to backport it to our fork [22:18:25] And upgrading zuul to zuul3 is a bunch of work. 
[22:18:40] https://gerrit.wikimedia.org/r/#/c/integration/zuul/+/508388/ was my attempt but failed [22:19:03] ah, I see: 2 chains https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/579640/ + https://gerrit.wikimedia.org/r/#/c/operations/mediawiki-config/+/579643/ [22:19:07] yeah [22:19:23] seems strange that would cause sooo many merge requests [22:19:25] Ooh, large new ChessBrowser stack. [22:20:13] https://review.opendev.org/#/c/643703/ has a nice explanation [22:21:28] oh facepalm [22:21:45] anyway my backport was terrible [22:27:30] interesting. [22:29:59] zuul merger is a strange beast. [22:30:09] Yeah. [22:57:17] 10Release-Engineering-Team-TODO (2020-01 to 2020-03 (Q3)), 10MediaWiki-General, 10MW-1.34-notes (1.34.0-wmf.23; 2019-09-17), 10MW-1.35-notes (1.35.0-wmf.24; 2020-03-17), and 3 others: Make MediaWiki core compatible with PHP 7.4 - https://phabricator.wikimedia.org/T233012 (10Krinkle) [23:03:23] 10Project-Admins, 10User-DannyS712: Create #mediawiki-dependency-injection - https://phabricator.wikimedia.org/T247623 (10Krinkle) "Depedency injection" is not a component in MediaWiki core. There is the #mediawiki-servicecontainer which has a component already. For the short/mid-term effort of decoupling mor... 
[23:06:59] 10Release-Engineering-Team-TODO (2020-01 to 2020-03 (Q3)), 10MediaWiki-General, 10MW-1.34-notes (1.34.0-wmf.23; 2019-09-17), 10MW-1.35-notes (1.35.0-wmf.24; 2020-03-17), and 3 others: Make MediaWiki core compatible with PHP 7.4 - https://phabricator.wikimedia.org/T233012 (10Reedy) [23:07:11] RECOVERY - Work requests waiting in Zuul Gearman server on contint1001 is OK: OK: Less than 100.00% above the threshold [90.0] https://www.mediawiki.org/wiki/Continuous_integration/Zuul https://grafana.wikimedia.org/dashboard/db/zuul-gearman?panelId=10&fullscreen&orgId=1 [23:09:25] 10Project-Admins, 10User-DannyS712: Create #mediawiki-dependency-injection - https://phabricator.wikimedia.org/T247623 (10DannyS712) [23:09:49] 10Project-Admins, 10User-DannyS712: Create #mediawiki-dependency-injection - https://phabricator.wikimedia.org/T247623 (10DannyS712) >>! In T247623#5968861, @Krinkle wrote: > "Depedency injection" is not a component in MediaWiki core. There is the #mediawiki-servicecontainer which has a component already. > >... [23:09:56] 10Project-Admins, 10User-DannyS712: Create #dependency-injection - https://phabricator.wikimedia.org/T247623 (10DannyS712)