[04:16:09] 10Release-Engineering-Team (Pipeline), 10Release-Engineering-Team-TODO, 10Operations, 10Platform Engineering, and 6 others: Kask functional testing with Cassandra via the Deployment Pipeline - https://phabricator.wikimedia.org/T224041 (10jeena) Hmm, I tried to deploy again but still couldn't. I would be ha... [04:32:47] 10Release-Engineering-Team-TODO (Release-Engineering-Team-TODO (2020-07-01 to 2020-09-30 (Q1))): deployment-charts: Deploy script failing - kube_env: command not found - https://phabricator.wikimedia.org/T259684 (10jeena) [04:33:08] 10Release-Engineering-Team-TODO (Release-Engineering-Team-TODO (2020-07-01 to 2020-09-30 (Q1))): deployment-charts: Deploy script failing - kube_env: command not found - https://phabricator.wikimedia.org/T259684 (10jeena) [05:20:22] (03PS2) 10Hashar: dockerfiles: drop backports, add git proto v2 to ci-stretch [integration/config] - 10https://gerrit.wikimedia.org/r/611181 (https://phabricator.wikimedia.org/T256844) [05:20:45] (03CR) 10jerkins-bot: [V: 04-1] dockerfiles: drop backports, add git proto v2 to ci-stretch [integration/config] - 10https://gerrit.wikimedia.org/r/611181 (https://phabricator.wikimedia.org/T256844) (owner: 10Hashar) [05:35:33] 10Release-Engineering-Team-TODO (Release-Engineering-Team-TODO (2020-07-01 to 2020-09-30 (Q1))): echostore helm test service checker failing in staging cluster - https://phabricator.wikimedia.org/T259686 (10jeena) [05:39:30] (03CR) 10Hashar: "recheck" [integration/config] - 10https://gerrit.wikimedia.org/r/611181 (https://phabricator.wikimedia.org/T256844) (owner: 10Hashar) [05:49:15] 10Release-Engineering-Team-TODO (Release-Engineering-Team-TODO (2020-07-01 to 2020-09-30 (Q1))): echostore helm test service checker failing in staging cluster - https://phabricator.wikimedia.org/T259686 (10jeena) [07:15:59] 10Phabricator, 10DBA, 10Patch-For-Review: Upgrade m3 to Buster and MariaDB 10.4 - https://phabricator.wikimedia.org/T259589 (10ops-monitoring-bot) Script 
wmf-auto-reimage was launched by marostegui on cumin1001.eqiad.wmnet for hosts: ` ['db1132.eqiad.wmnet'] ` The log can be found in `/var/log/wmf-auto-reima... [07:50:29] 10Phabricator, 10DBA, 10Patch-For-Review: Upgrade m3 to Buster and MariaDB 10.4 - https://phabricator.wikimedia.org/T259589 (10ops-monitoring-bot) Completed auto-reimage of hosts: ` ['db1132.eqiad.wmnet'] ` and were **ALL** successful. [08:28:27] PROBLEM - Host deployment-changeprop is DOWN: CRITICAL - Host Unreachable (172.16.5.21) [08:28:27] PROBLEM - Host deployment-cpjobqueue is DOWN: CRITICAL - Host Unreachable (172.16.4.124) [08:28:27] PROBLEM - Host deployment-chromium02 is DOWN: CRITICAL - Host Unreachable (172.16.4.14) [08:32:35] 10Phabricator, 10DBA: Upgrade m3 to Buster and MariaDB 10.4 - https://phabricator.wikimedia.org/T259589 (10Marostegui) [08:48:12] PROBLEM - Free space - all mounts on integration-agent-docker-1007 is CRITICAL: CRITICAL: integration.integration-agent-docker-1007.diskspace._srv.byte_percentfree (<11.11%) [08:53:12] RECOVERY - Free space - all mounts on integration-agent-docker-1007 is OK: OK: All targets OK [08:57:14] 10Release-Engineering-Team-TODO, 10DBA, 10Product-Infrastructure-Team-Backlog: Drop DB tables for now-deleted zerowiki from production - https://phabricator.wikimedia.org/T227717 (10Marostegui) @jcrespo can we take a logical backup from this specific wiki so I can truncate its tables? The wiki is super small... [09:47:45] 10Release-Engineering-Team-TODO, 10DBA, 10Product-Infrastructure-Team-Backlog: Drop DB tables for now-deleted zerowiki from production - https://phabricator.wikimedia.org/T227717 (10jcrespo) I have created both a mydumper and a mysqldump exports of zerowiki. It was anyway being backed up regularly. You can s... [09:48:41] 10Release-Engineering-Team-TODO, 10DBA, 10Product-Infrastructure-Team-Backlog: Drop DB tables for now-deleted zerowiki from production - https://phabricator.wikimedia.org/T227717 (10Marostegui) Thank you! 
@Jdforrester-WMF I can proceed with the truncate now. However, I do have a last question, do we have to... [09:50:07] 10Release-Engineering-Team-TODO, 10DBA, 10Product-Infrastructure-Team-Backlog: Drop DB tables for now-deleted zerowiki from production - https://phabricator.wikimedia.org/T227717 (10Marostegui) And same question goes for x1 tables? [09:51:04] 10Release-Engineering-Team-TODO, 10DBA, 10Product-Infrastructure-Team-Backlog: Drop DB tables for now-deleted zerowiki from production - https://phabricator.wikimedia.org/T227717 (10Marostegui) >>! In T227717#6362222, @Marostegui wrote: > And same question goes for x1 tables? Disregard that, there are no ta... [10:45:15] (03CR) 10Tchanders: [C: 03+1] Add CheckUser as a integration dependency of WikimediaMessages (031 comment) [integration/config] - 10https://gerrit.wikimedia.org/r/617832 (https://phabricator.wikimedia.org/T256586) (owner: 10Dbarratt) [10:49:44] (03CR) 10Tchanders: [C: 04-1] Add CheckUser as a integration dependency of WikimediaMessages (031 comment) [integration/config] - 10https://gerrit.wikimedia.org/r/617832 (https://phabricator.wikimedia.org/T256586) (owner: 10Dbarratt) [11:00:35] PROBLEM - Host deployment-mcs01 is DOWN: CRITICAL - Host Unreachable (172.16.5.64) [11:38:06] 10Release-Engineering-Team-TODO, 10DBA, 10Product-Infrastructure-Team-Backlog: Drop DB tables for now-deleted zerowiki from production - https://phabricator.wikimedia.org/T227717 (10Jdforrester-WMF) >>! In T227717#6362221, @Marostegui wrote: > Thank you! > @Jdforrester-WMF I can proceed with the truncate now... [11:58:59] (03CR) 10Hashar: [C: 03+2] "Nice thank you! I had read the go mod documentation and I could not even figure out how to install a package with it bah ... 
:]" [integration/config] - 10https://gerrit.wikimedia.org/r/617720 (https://phabricator.wikimedia.org/T257456) (owner: 10Vgutierrez) [12:00:05] (03Merged) 10jenkins-bot: dockerfiles: target specific pebble version [integration/config] - 10https://gerrit.wikimedia.org/r/617720 (https://phabricator.wikimedia.org/T257456) (owner: 10Vgutierrez) [12:09:57] 10Release-Engineering-Team-TODO (Release-Engineering-Team-TODO (2020-07-01 to 2020-09-30 (Q1))), 10Scap: Scap's error message when preppping a non-existent branch is unhelpful - https://phabricator.wikimedia.org/T259699 (10LarsWirzenius) [12:10:49] (03CR) 10Hashar: "GRRRR docker-pkg." [integration/config] - 10https://gerrit.wikimedia.org/r/617720 (https://phabricator.wikimedia.org/T257456) (owner: 10Vgutierrez) [12:59:00] 10Release-Engineering-Team-TODO (Release-Engineering-Team-TODO (2020-07-01 to 2020-09-30 (Q1))), 10Scap: Scap fails when checking train version number - https://phabricator.wikimedia.org/T259706 (10LarsWirzenius) [13:32:53] 10Phabricator, 10Patch-For-Review: tooltip mentions non-existent "Wishlist priority" in Open Tasks by Project and Priority report - https://phabricator.wikimedia.org/T91428 (10Aklapper) [13:42:01] 10Phabricator: Phabricator logs out several times in a day! - https://phabricator.wikimedia.org/T248946 (10Aklapper) 05Open→03Declined Unfortunately closing this Phabricator task as no further information has been provided. @KartikMistry: After you have provided the information asked for and if this still h... [13:43:54] 10Phabricator, 10Browser-Support-Google-Chrome: Double clicking on text to highlight also highlights surrounding strings in Chrome - https://phabricator.wikimedia.org/T247916 (10Aklapper) [13:44:20] 10Phabricator, 10Browser-Support-Google-Chrome: Double clicking on text to highlight also highlights surrounding strings in Chrome - https://phabricator.wikimedia.org/T247916 (10Aklapper) 05Open→03Invalid Phab's HTML source looks correct (`
...
`) so this should be reported under h... [13:47:27] I can't seem to get my head around Blubber/PipelineLib. We have a bunch of preparatory steps before executing blubber and building the image. Basically a bash-script that downloads a bunch of files, compiles a bit of Java, gathers dependencies and what not. How are we supposed to get this executed within the CI? [13:48:19] 10Phabricator (Upstream), 10Upstream: Error 503 when renaming a project - https://phabricator.wikimedia.org/T113024 (10Aklapper) 05Open→03Resolved Haven't experienced this for a while and a good bunch of server changes have taken place. Hence resolving. If this still happens, please reopen. [13:49:00] 10Continuous-Integration-Config, 10Release-Engineering-Team (CI & Testing services), 10Release-Engineering-Team-TODO (Release-Engineering-Team-TODO (2020-07-01 to 2020-09-30 (Q1))), 10JavaScript, 10Patch-For-Review: Upgrade all CI jobs from node6/npm3 to node10/n... - https://phabricator.wikimedia.org/T211784 [13:49:06] 10Release-Engineering-Team (Pipeline), 10Release-Engineering-Team-TODO, 10Operations, 10Release Pipeline, and 2 others: Migrate production services to kubernetes using the pipeline - https://phabricator.wikimedia.org/T198901 (10akosiaris) [13:51:18] 10Phabricator: Tag URL for milestone without board causes weird 404 redirect to https://phabricator.wikimedia.org/tag// - https://phabricator.wikimedia.org/T186173 (10Aklapper) p:05Triage→03Lowest [13:53:31] 10Phabricator: Phabricator: "500 Internal Server Error" when using VPN (due to specific IP range) - https://phabricator.wikimedia.org/T258059 (10Aklapper) @NicoV: Is this still a problem? Asking as the other ticket is resolved. 
[14:08:30] 10Phabricator, 10Operations, 10Traffic: Access Forbidden to Phabricator at WikiArabia 2019 (Morocco) via Indian IP 185.174.156.75 - https://phabricator.wikimedia.org/T234598 (10Aklapper) [14:09:38] 10Phabricator, 10Operations, 10Traffic: Access Forbidden to Phabricator at WikiArabia 2019 (Morocco) via Indian IP 185.174.156.75 - https://phabricator.wikimedia.org/T234598 (10Aklapper) See also T257507, T229575, T258059, T246923, etc. Reason might be vandalism (non-public T218589). (The error itself is a... [14:17:18] 10Phabricator: "Unknown Object (????)" shown in Status field in saved Phabricator search (due to renaming/removing some status) - https://phabricator.wikimedia.org/T227910 (10Aklapper) W6 uses W4 for the "New Tasks" tab (not very obvious to find out). Would have to create a new panel, then add it to W6, rearrang... [14:17:24] (03PS1) 10Vgutierrez: dockerfiles: bump tox-acme-chief container version [integration/config] - 10https://gerrit.wikimedia.org/r/618533 (https://phabricator.wikimedia.org/T257456) [14:17:26] (03PS1) 10Vgutierrez: jjb: Use tox-acme-chief 0.5.1 for the new-pebble acme-chief job [integration/config] - 10https://gerrit.wikimedia.org/r/618534 (https://phabricator.wikimedia.org/T257456) [14:18:01] 10Phabricator: Phabricator issue on 2019-09-08; "Unable to establish a connection to any database host: Cannot assign requested address." - https://phabricator.wikimedia.org/T232276 (10Aklapper) 05Open→03Resolved Nothing to further investigate here, hence resolving. 
[14:18:20] (03CR) 10Vgutierrez: "> Patch Set 2:" [integration/config] - 10https://gerrit.wikimedia.org/r/617720 (https://phabricator.wikimedia.org/T257456) (owner: 10Vgutierrez) [14:21:05] 10Phabricator, 10Release-Engineering-Team (Development services): bzimport uses deprecated certificate auth - https://phabricator.wikimedia.org/T242860 (10Aklapper) [14:21:07] 10Phabricator: User's Bugzilla era comments not (yet) merged into their Phabricator account - https://phabricator.wikimedia.org/T163581 (10Aklapper) [14:31:15] 10Phabricator: Phabricator Maniphest showing "Cancelar suscripción" in English user interface - https://phabricator.wikimedia.org/T232702 (10Aklapper) 05Stalled→03Declined DB states that the language was set to Spanish 2016-10-30 and changed to English on 2019-09-12: ` mysql:phstats@m3-slave.eqiad.wmnet [pha... [14:46:06] 10Phabricator: Imported bugzilla comment by specific account (mwjames) has "Unknown Object (User)" as commenter (does not happen as task author) - https://phabricator.wikimedia.org/T85203 (10Aklapper) Now that I have DB access, looking at the transaction logs of the three tasks via `SELECT usr.userName,trs.* FRO... [14:49:38] 10Release-Engineering-Team, 10Analytics-Radar, 10Product-Analytics, 10Repository-Admins: Create a repository and user for Product Analytics Oozie jobs - https://phabricator.wikimedia.org/T230743 (10mpopov) @nshahquinn-wmf: That's an excellent point! I updated the task description with the `analytics/wmf-pr... [14:51:13] (03CR) 10Dbarratt: Add CheckUser as a integration dependency of WikimediaMessages (031 comment) [integration/config] - 10https://gerrit.wikimedia.org/r/617832 (https://phabricator.wikimedia.org/T256586) (owner: 10Dbarratt) [14:51:14] 10Phabricator: Imported bugzilla comment by specific account (mwjames) has "Unknown Object (User)" as commenter (does not happen as task author) - https://phabricator.wikimedia.org/T85203 (10chasemp) > Might be a WONTFIX / declined nowadays... 
:-/ > last user activity: Sep 22 2018 There was no response to my... [14:52:12] 10Phabricator (Upstream), 10Developer-Wishlist (2017), 10Upstream: Phabricator should suggest possible duplicates when creating a new task - https://phabricator.wikimedia.org/T45 (10Jdlrobson) Is now a good time to reconsider low? [14:53:32] (03CR) 10Tchanders: Add CheckUser as a integration dependency of WikimediaMessages (031 comment) [integration/config] - 10https://gerrit.wikimedia.org/r/617832 (https://phabricator.wikimedia.org/T256586) (owner: 10Dbarratt) [15:01:54] 10Release-Engineering-Team, 10Analytics-Radar, 10Product-Analytics, 10Repository-Admins: Create a repository and user for Product Analytics Oozie jobs - https://phabricator.wikimedia.org/T230743 (10nshahquinn-wmf) >>! In T230743#6362955, @mpopov wrote: > @nshahquinn-wmf: That's an excellent point! I update... [15:06:56] (03CR) 10Hashar: [C: 03+2] "Thanks :)" [integration/config] - 10https://gerrit.wikimedia.org/r/618533 (https://phabricator.wikimedia.org/T257456) (owner: 10Vgutierrez) [15:07:53] (03Merged) 10jenkins-bot: dockerfiles: bump tox-acme-chief container version [integration/config] - 10https://gerrit.wikimedia.org/r/618533 (https://phabricator.wikimedia.org/T257456) (owner: 10Vgutierrez) [15:08:04] @brennen|afk whenever you have a moment today, I'm curious on what it would take to make it so that considerable error spikes on web clients in https://logstash.wikimedia.org/app/kibana#/dashboard/AXDBY8Qhh3Uj6x1zCF56 block trains automatically. [15:09:54] Practically. I'd like to create a task to manage that but don't know what to tag it with :) [15:12:07] Jdlrobson: create a train blocker task maybe? 
[15:12:48] https://phabricator.wikimedia.org/T257971 under "How this work", there is a link "Use this form to create a blocker"
[15:12:50] which leads to https://phabricator.wikimedia.org/maniphest/task/edit/form/46/?parent=257971
[15:13:15] and has the project #Wikimedia-production-error which is the phab project we use to keep track of errors
[15:14:12] after deployment, we primarily look at the mediawiki-new-error dashboard in kibana: https://logstash.wikimedia.org/app/kibana#/dashboard/0a9ecdc0-b6dc-11e8-9d8f-dbc23b470465
[15:14:41] and apparently that does not include messages having meta.stream: mediawiki.client.error
[15:14:46] so those errors are not seen by us
[15:15:20] hashar: i was thinking automatically detecting and creating the task
[15:15:28] cause we filter on type:mediawiki when those have type:clienterror
[15:16:06] So yeh it sounds like adding the mw-client-error to mediawiki-new-error might be the way forward?
[15:16:53] yeah most probably
[15:16:57] which is worth filing a task about
[15:17:17] as for filing a task based on an existing error, in the list of raw events, when you expand the event all the fields are shown
[15:17:26] in a nice table
[15:17:38] there should be tabs: Table | JSON | Phatality
[15:17:50] Phatality is a javascript plugin we wrote for kibana which lets you easily file a task in Phabricator
[15:18:39] so filing a task is just a few clicks: expand the event, click Phatality tab, click submit and a prefilled form shows up
[15:19:41] Jdlrobson: the doc: https://phabricator.wikimedia.org/phame/post/view/177/introducing_phatality/
[15:19:41] ;]
[15:20:34] Release-Engineering-Team, Analytics-Radar, Product-Analytics, Repository-Admins: Create a repository and user for Product Analytics Oozie jobs - https://phabricator.wikimedia.org/T230743 (mpopov)
[15:20:49] (CR) Hashar: [C: +2] "I have updated the job:" [integration/config] - https://gerrit.wikimedia.org/r/618534 (https://phabricator.wikimedia.org/T257456)
(owner: 10Vgutierrez) [15:21:50] (03Merged) 10jenkins-bot: jjb: Use tox-acme-chief 0.5.1 for the new-pebble acme-chief job [integration/config] - 10https://gerrit.wikimedia.org/r/618534 (https://phabricator.wikimedia.org/T257456) (owner: 10Vgutierrez) [15:26:10] 10Release-Engineering-Team, 10Analytics-Radar, 10Product-Analytics, 10Repository-Admins: Create a repository and user for Product Analytics Oozie jobs - https://phabricator.wikimedia.org/T230743 (10mpopov) Requested `analytics/wmf-product/jobs` Gerrit repo [15:29:22] hashar thanks for providing today's reading material :) [15:30:02] Still I'd rather avoid those clicks.. if I want to get it added to mediawiki-new-error what's the process there? [15:37:08] 10Release-Engineering-Team, 10Cloud-VPS (Project-requests), 10cloud-services-team (Kanban): Request creation of gitlab-test VPS project - https://phabricator.wikimedia.org/T259668 (10bd808) a:03nskaggs Approved in 2020-08-05 WMCS meeting [15:37:19] Jdlrobson: it would be good to have a task for discussing that. i expect it has some considerable implications. [15:37:45] if i remember correctly, the client errors are a pretty large volume of logs [15:39:13] practically speaking, they might make it much harder to notice and triage php errors. [15:39:31] 10Continuous-Integration-Config, 10Gerrit, 10Release-Engineering-Team (CI & Testing services), 10Release-Engineering-Team-TODO (Release-Engineering-Team-TODO (2020-07-01 to 2020-09-30 (Q1))), 10User-DannyS712: Jenkins sometimes doesn't rebase with `action = rebas... - https://phabricator.wikimedia.org/T259450 [15:39:36] there's nothing at present that automatically creates a blocker task, so that's sort of new territory. [15:40:12] not _necessarily_ unworkable, but it would definitely need some serious thought. 
[15:41:55] Continuous-Integration-Config, Gerrit, Release-Engineering-Team (CI & Testing services), Release-Engineering-Team-TODO (Release-Engineering-Team-TODO (2020-07-01 to 2020-09-30 (Q1))), User-DannyS712: Jenkins sometimes doesn't rebase with `action = rebas... - https://phabricator.wikimedia.org/T259450
[15:45:24] Jdlrobson: I echo brennen, a task about adding the mediawiki.client errors to our general dashboard would certainly be nice but needs to be considered carefully :]
[15:45:38] cause any spike of errors in there almost immediately causes a rollback of the train
[15:46:00] and we would certainly need some more metadata to be added. It seems the "wiki" field is missing, and we could use the wmf version inserted as well
[15:46:07] but that definitely sounds like a good idea :]
[15:50:28] we should also be clear that by "cause a rollback", we mean a human train deployer is watching it and manually grinding through log triage
[15:51:28] if client errors are a good signal for train deployers to use to determine breakage, that's all well and good - but...
[15:51:46] ...i would perhaps like to be careful about further centralizing that workload.
[15:52:06] a current goal of ours is to more broadly distribute the monitoring of this stuff.
[15:53:44] expanding the group of people suffering through the log triage process is probably a better path to automating things than increasing the suffering of a handful of already-overburdened individuals.
[15:59:14] hashar: the wiki can be inferred from the domain: meta.domain field
[16:02:52] hashar: brennen I guess at minimum if there's a huge (to be defined what that means) spike in client errors after rolling out the train I'd hope that the train gets rolled back in the same way we do for PHP
[16:03:11] I don't think that's the case right now?
[16:03:20] yeah, it's probably not, and i think that's a reasonable goal.
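Editor's sketch of the idea discussed above: select events with type:clienterror, group them by meta.domain, and flag domains whose error count jumps after a train deploy. This is only an illustration under assumptions. The nested event shape, the function names, and the `factor` / `min_count` thresholds are hypothetical (the chat itself leaves "huge" undefined), and nothing here reflects how the Kibana dashboards or the train tooling actually work.

```python
from collections import Counter


def is_client_error(event):
    """True for events tagged type:clienterror (field name taken from the chat)."""
    return event.get("type") == "clienterror"


def errors_per_domain(events):
    """Count client errors per meta.domain (nested dict layout is an assumption)."""
    counts = Counter()
    for event in events:
        if is_client_error(event):
            domain = event.get("meta", {}).get("domain", "unknown")
            counts[domain] += 1
    return counts


def spiking_domains(before, after, factor=3.0, min_count=10):
    """Return domains whose post-deploy count is at least min_count
    and more than factor times the pre-deploy count.

    Both thresholds are placeholders, not agreed values.
    """
    base = errors_per_domain(before)
    now = errors_per_domain(after)
    return sorted(
        domain
        for domain, count in now.items()
        if count >= min_count and count > factor * base.get(domain, 0)
    )
```

A real check would also have to deal with the gadget noise and log volume raised in the conversation; the sketch ignores both.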
[16:03:34] but obviously with client side errors there's the added fun of gadgets adding noise
[16:03:42] which are not part of the deploy itself
[16:04:07] my gut reaction though is that if we're going to try to achieve that goal, i'd definitely like some help outside of releng in achieving it.
[16:04:26] Who should I tag on such a task?
[16:06:07] Jdlrobson: for a start i think K.rinkle, thcipriani, and me?
[16:07:58] Jdlrobson: also, this is kind of off the cuff, but we now have a regular train log triage meeting on wednesdays with platform team folks. expanding that to have representation from someone who focuses more on client-side stuff would probably be a good step.
[16:20:07] Jdlrobson: surely if a popular gadget ends up totally broken, we would want to know about it :]
[16:21:20] and I would love any doc you might have about that mediawiki.client error reporting system. That is entirely new to me
[16:22:45] a similar idea/rfc was to use Sentry to collect the browser errors which got declined a couple weeks ago https://phabricator.wikimedia.org/T382#6310025
[16:23:02] mentioning "Modern Event Platform" so I guess that is the thing I should look at :]
[16:24:15] Jdlrobson: hashar would clearly also be an appropriate person to add to your filed task. :)
[16:24:46] yeah you want old grandpa to come rant at it :D
[16:25:01] Continuous-Integration-Config, Gerrit, Release-Engineering-Team (CI & Testing services), Release-Engineering-Team-TODO (Release-Engineering-Team-TODO (2020-07-01 to 2020-09-30 (Q1))), User-DannyS712: Jenkins sometimes doesn't rebase with `action = rebas...
- https://phabricator.wikimedia.org/T259450 [16:25:11] more seriously, I would love to have the client side error to be taken in account [16:25:23] instead of discovering after the fact that some crucial feature ended up being broken [16:29:38] 10Phabricator (Upstream), 10Developer-Wishlist (2017), 10Upstream: Phabricator should suggest possible duplicates when creating a new task - https://phabricator.wikimedia.org/T45 (10Aklapper) As long as I neither see a good NLP / AI algorithm for the English language (more relevant to me) nor much research t... [16:31:25] 10Continuous-Integration-Config, 10Gerrit, 10Release-Engineering-Team (CI & Testing services), 10Release-Engineering-Team-TODO (Release-Engineering-Team-TODO (2020-07-01 to 2020-09-30 (Q1))), 10User-DannyS712: Jenkins sometimes doesn't rebase with `action = rebas... - https://phabricator.wikimedia.org/T259450 [16:32:26] (03PS1) 10DannyS712: Edit Repo Config [extensions/GlobalWatchlist] (refs/meta/config) - 10https://gerrit.wikimedia.org/r/618305 [16:33:05] (03PS2) 10DannyS712: Edit Repo Config: Set `mergeContent = true` [extensions/GlobalWatchlist] (refs/meta/config) - 10https://gerrit.wikimedia.org/r/618305 (https://phabricator.wikimedia.org/T259450) [16:33:14] (03CR) 10DannyS712: "This change is ready for review." [extensions/GlobalWatchlist] (refs/meta/config) - 10https://gerrit.wikimedia.org/r/618305 (https://phabricator.wikimedia.org/T259450) (owner: 10DannyS712) [16:38:44] (03CR) 10Hashar: dockerfiles: Provide composer-test-php73 (031 comment) [integration/config] - 10https://gerrit.wikimedia.org/r/618331 (owner: 10Jforrester) [16:39:41] hashar can you take a look at https://gerrit.wikimedia.org/r/618305 ? 
Thanks for your help with explaining T259450
[16:39:42] T259450: Jenkins sometimes doesn't rebase with `action = rebase if necessary` - https://phabricator.wikimedia.org/T259450
[16:39:54] (CR) Hashar: "Might need a different version depending on the outcome of the parent change https://gerrit.wikimedia.org/r/c/integration/config/+/618331/" [integration/config] - https://gerrit.wikimedia.org/r/618332 (owner: Jforrester)
[16:41:36] DannyS712: hi! I have been super verbose on that task, sorry ;)
[16:41:56] (CR) Hashar: [V: +2 C: +2] "Looks about correct :]" [extensions/GlobalWatchlist] (refs/meta/config) - https://gerrit.wikimedia.org/r/618305 (https://phabricator.wikimedia.org/T259450) (owner: DannyS712)
[16:42:15] DannyS712: merged ! :]
[16:42:20] thanks
[16:43:07] DannyS712: the default comes from https://gerrit.wikimedia.org/r/admin/repos/mediawiki/extensions and is "Merge if necessary" / "Allow content merges = True"
[16:43:47] Yeah. I guess I only needed to change the "Merge if necessary" bit
[16:44:57] I can live with the merge spam (git log --first-parent or git log --no-merges works fine sometimes)
[16:45:10] but I found the "rebase if necessary" to be nicer to the eyes
[16:45:14] cause the whole history is flattened
[16:45:50] then there is a loss of information, you do not know the original parent commit anymore
[16:46:12] Continuous-Integration-Config, Gerrit, Release-Engineering-Team (CI & Testing services), Release-Engineering-Team-TODO (Release-Engineering-Team-TODO (2020-07-01 to 2020-09-30 (Q1))), User-DannyS712: Jenkins sometimes doesn't rebase with `action = rebas... - https://phabricator.wikimedia.org/T259450
[16:46:14] which can be a bit of a pain in case the merge resolution ended up working but results in a bad state
[16:46:24] but that only occurred to me once in all those years
[16:46:40] rebase if necessary ends up having way more benefit
[16:54:31] (PS3) Hashar: Move castor-save-filter.bash check into castor-save-workspacecache.bash [integration/config] - https://gerrit.wikimedia.org/r/616595 (owner: Ahmon Dancy)
[16:55:47] Phabricator (Upstream), Developer-Wishlist (2017), Upstream: Phabricator should suggest possible duplicates when creating a new task - https://phabricator.wikimedia.org/T45 (Jdlrobson) Thank you for the honesty!
[17:07:00] !log Updated https://integration.wikimedia.org/ci/job/castor-save-workspace-cache/ so that it no more fails when being skipped due to not being triggered from postmerge / gate-and-submit
[17:07:01] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[17:11:48] !log Reverted https://integration.wikimedia.org/ci/job/castor-save-workspace-cache/ back
[17:11:50] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[17:13:17] (CR) Hashar: [C: +2] "That ends up being straight forward.
I procrastinated cause that used to be in the releng/castor container as well but that bit has been r" [integration/config] - 10https://gerrit.wikimedia.org/r/616595 (owner: 10Ahmon Dancy) [17:13:51] !log BAH, updated again https://integration.wikimedia.org/ci/job/castor-save-workspace-cache/ [17:13:53] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [17:14:16] (03CR) 10Hashar: [C: 03+2] "Job updated:" [integration/config] - 10https://gerrit.wikimedia.org/r/616595 (owner: 10Ahmon Dancy) [17:14:24] (03Merged) 10jenkins-bot: Move castor-save-filter.bash check into castor-save-workspacecache.bash [integration/config] - 10https://gerrit.wikimedia.org/r/616595 (owner: 10Ahmon Dancy) [17:24:16] dancy: thanks for the castor patch, it is merged now ;) [17:24:23] no more FAILURE for the castor-save-workspace job! [17:24:28] Awesome! [17:24:36] took me a while to jump in, but that ended up straightforward [17:28:16] 10Continuous-Integration-Infrastructure, 10castor: Castor rsync causes: rsync: failed to set times on "/cache/.": Operation not permitted (1) - https://phabricator.wikimedia.org/T188488 (10hashar) [17:28:21] 10Continuous-Integration-Infrastructure, 10castor: castor caching model seems to be broken for non-voting or coverage jobs - https://phabricator.wikimedia.org/T189077 (10hashar) [17:28:33] 10Continuous-Integration-Infrastructure, 10Release-Engineering-Team (CI & Testing services), 10Release-Engineering-Team-TODO, 10castor: Don't hardcode castor url in castor docker container - https://phabricator.wikimedia.org/T216244 (10hashar) [17:30:12] 10Continuous-Integration-Infrastructure, 10castor: castor rsync's taking 3-5 minutes for mwgate-npm jobs - https://phabricator.wikimedia.org/T188375 (10hashar) [17:30:18] 10Continuous-Integration-Config, 10Release-Engineering-Team, 10Release-Engineering-Team-TODO (2020-04 to 2020-06 (Q4)), 10Patch-For-Review, 10castor: Quibble jobs re-download npm packages every build (Castor not loading?) 
- https://phabricator.wikimedia.org/T234738 (10hashar) [17:30:24] 10Release-Engineering-Team-TODO (201907), 10Wikimedia-Portals, 10Jenkins, 10castor: wikimedia-portals-build job failing to create castor workspace - https://phabricator.wikimedia.org/T227448 (10hashar) [17:30:35] 10Continuous-Integration-Config, 10Release-Engineering-Team (Kanban), 10Patch-For-Review, 10castor: Castor: mediawiki-core-qunit-jessie node_modules cache ineffective - https://phabricator.wikimedia.org/T159591 (10hashar) [17:30:40] 10Continuous-Integration-Infrastructure (shipyard), 10RelEng-Archive-FY201718-Q2, 10Patch-For-Review, 10castor: Port castor to support docker container - https://phabricator.wikimedia.org/T179208 (10hashar) [17:30:47] 10Beta-Cluster-Infrastructure, 10RelEng-Archive-FY201718-Q1, 10Cloud-VPS, 10castor: castor.integration.eqiad.wmflabs unreacheable deadlocking the whole CI - https://phabricator.wikimedia.org/T133652 (10hashar) [17:30:55] 10Continuous-Integration-Config, 10RelEng-Archive-FY201718-Q1, 10Patch-For-Review, 10castor: Trigger a castor save when operations/puppet change are merged and change dependencies - https://phabricator.wikimedia.org/T168063 (10hashar) [17:31:01] 10Continuous-Integration-Infrastructure, 10RelEng-Archive-FY201718-Q1, 10Wikimedia-Incident, 10castor: CI jobs are blocked because castor is unreachable - https://phabricator.wikimedia.org/T171148 (10hashar) [17:31:11] 10Continuous-Integration-Infrastructure, 10Patch-For-Review, 10castor: castor rsyncd spam logs with forward name lookup for ci-trusty-wikimedia-95202.contintcloud.eqiad.wmflabs failed: Name or service not known - https://phabricator.wikimedia.org/T136276 (10hashar) [17:31:16] 10Continuous-Integration-Infrastructure, 10castor: castor-load reports when cache has not been populated yet "broken: rsync No such file or directory" (eg: operations-puppet-tox-pep8-jessie jobs) - https://phabricator.wikimedia.org/T136261 (10hashar) [17:31:23] 10Continuous-Integration-Infrastructure, 
10Release-Engineering-Team (CI & Testing services), 10Release-Engineering-Team-TODO (Release-Engineering-Team-TODO (2020-07-01 to 2020-09-30 (Q1))), 10User-dancy, and 2 others: Selenium quibble jobs have a huge cache overloa... - https://phabricator.wikimedia.org/T258972 [17:31:30] 10Continuous-Integration-Infrastructure, 10Release-Engineering-Team (Kanban), 10castor: Jenkins instances /srv suddenly becomes full causing it to be disconnected - https://phabricator.wikimedia.org/T220948 (10hashar) [17:31:41] 10Continuous-Integration-Infrastructure, 10Release-Engineering-Team (CI & Testing services), 10Release-Engineering-Team-TODO, 10User-dancy, 10castor: Investigate again a central cache for package managers - https://phabricator.wikimedia.org/T147635 (10hashar) [18:35:15] 10Release-Engineering-Team (Logspam), 10Growth-Team, 10StructuredDiscussions, 10User-brennen, 10Wikimedia-production-error: Flow: PHP Notice: Undefined index: flow-workflow-change - https://phabricator.wikimedia.org/T259739 (10brennen) [18:37:19] (03CR) 10Ahmon Dancy: [C: 03+2] Purge Helm releases that fail to install [integration/pipelinelib] - 10https://gerrit.wikimedia.org/r/618156 (https://phabricator.wikimedia.org/T259319) (owner: 10Dduvall) [18:38:17] (03Merged) 10jenkins-bot: Purge Helm releases that fail to install [integration/pipelinelib] - 10https://gerrit.wikimedia.org/r/618156 (https://phabricator.wikimedia.org/T259319) (owner: 10Dduvall) [18:42:59] 10Release-Engineering-Team (Logspam), 10Product-Infrastructure-Team-Backlog, 10Reading List Service, 10User-brennen, 10Wikimedia-production-error: ReadingListRepository::deleteListEntryQuery: BIGINT UNSIGNED value is out of range in '`wikishared`.`reading_list`.`... 
- https://phabricator.wikimedia.org/T259740
[19:03:10] Continuous-Integration-Config, MediaWiki-extensions-ExternalData: Add `ext-mongodb` to the testing environment - https://phabricator.wikimedia.org/T259743 (alex-mashin)
[19:28:39] Release-Engineering-Team, Vue.js (Vue.js-Search): Work with RelEng to add PipelineBot to Vector - https://phabricator.wikimedia.org/T257582 (dduvall) Moving the conversation from email. Thanks for bearing with my catching up, @Niedzielski. > === Code review workflow > 1. A patch is submitted. > 2. NPM an...
[19:35:57] can haz the "the Overall/Read permission" on releases-jenkins.wikimedia.org?
[19:45:17] Release-Engineering-Team-TODO (Release-Engineering-Team-TODO (2020-07-01 to 2020-09-30 (Q1))), Patch-For-Review, Release, Train Deployments, User-brennen: 1.36.0-wmf.3 deployment blockers - https://phabricator.wikimedia.org/T257971 (brennen)
[19:47:18] mutante: are you social engineering me?
[19:48:27] thcipriani: lol. yes. but let me say what i really want. i want to switch the backends for releases* and checking releases.wikimedia.org is easy but releases-jenkins.wikimedia.org i can't see much because after login it tells me "Dzahn is missing the Overall/Read permission"
[19:48:53] but on the other hand.. it is 2 separate changes in ATS for both host names anyways, so i was going to start with just releases.wm.org
[19:49:20] goal is to move to releases1002/2002 on buster
[19:51:10] * thcipriani nods
[19:51:20] mutante: you should have overall read now: let me know if that's not the case
[19:53:30] thcipriani: thank you, i see things now!
[19:55:18] Release-Engineering-Team-TODO (Release-Engineering-Team-TODO (2020-07-01 to 2020-09-30 (Q1))), Patch-For-Review, Release, Train Deployments, User-brennen: 1.36.0-wmf.3 deployment blockers - https://phabricator.wikimedia.org/T257971 (brennen)
[19:56:04] Release-Engineering-Team (Logspam), Product-Infrastructure-Team-Backlog, Reading List Service, User-brennen, Wikimedia-production-error: ReadingListRepository::deleteListEntryQuery: BIGINT UNSIGNED value is out of range in '`wikishared`.`reading_list`.`... - https://phabricator.wikimedia.org/T259740
[20:21:48] Release-Engineering-Team, Vue.js (Vue.js-Search): Work with RelEng to add PipelineBot to Vector - https://phabricator.wikimedia.org/T257582 (Niedzielski) Thank you, @dduvall! > Is this what you want to happen during gate-and-submit (after someone +2s but before the merge) or post merge? I'm worried ther...
[20:54:03] Release-Engineering-Team (Development services), Release-Engineering-Team-TODO (Release-Engineering-Team-TODO (2020-07-01 to 2020-09-30 (Q1))), User-brennen: Install a scratch / test GitLab instance in WMCS / Toolforge for workflow experimentation, etc. - https://phabricator.wikimedia.org/T258220 (nsk...
[20:54:07] Release-Engineering-Team, Cloud-VPS (Project-requests), cloud-services-team (Kanban): Request creation of gitlab-test VPS project - https://phabricator.wikimedia.org/T259668 (nskaggs) Open→Resolved Created gitlab-test project and added thcipriani and brennen to it.
[20:58:22] \o/
[21:16:28] Release-Engineering-Team (Logspam), Product-Infrastructure-Team-Backlog, Reading List Service, User-brennen, Wikimedia-production-error: ReadingListRepository::deleteListEntryQuery: BIGINT UNSIGNED value is out of range in '`wikishared`.`reading_list`.`... - https://phabricator.wikimedia.org/T259740
[21:26:36] !log Creating performance_arclamp Swift account and related container for testing on deployment-ms-fe03.
[21:26:37] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[21:56:42] (PS1) Ahmon Dancy: deploy-promote: Show usage if no args are supplied [tools/release] - https://gerrit.wikimedia.org/r/618632
[21:58:46] nice (re: gitlab-test)
[22:00:01] (CR) Ahmon Dancy: [C: -1] "Just learned that the second argument is optional." [tools/release] - https://gerrit.wikimedia.org/r/618632 (owner: Ahmon Dancy)
[22:00:50] (CR) Thcipriani: "> Patch Set 1: Code-Review-1" [tools/release] - https://gerrit.wikimedia.org/r/618632 (owner: Ahmon Dancy)
[22:05:00] Release-Engineering-Team (Logspam), Growth-Team, StructuredDiscussions, User-brennen, Wikimedia-production-error: Flow: PHP Notice: Undefined index: flow-workflow-change - https://phabricator.wikimedia.org/T259739 (Umherirrender) The `flow-workflow-change` is taken from `rc_params` out of the...
[22:08:26] (PS2) Ahmon Dancy: deploy-promote: Show usage if no args are supplied [tools/release] - https://gerrit.wikimedia.org/r/618632
[22:15:36] Release-Engineering-Team-TODO (Release-Engineering-Team-TODO (2020-07-01 to 2020-09-30 (Q1))), Patch-For-Review, Release, Train Deployments, User-brennen: 1.36.0-wmf.3 deployment blockers - https://phabricator.wikimedia.org/T257971 (brennen) Notes at end-of-day: - Log triage meeting today...
[22:15:37] (CR) Thcipriani: [C: +2] "Much better. Thank you for fixing my mistake!"
[tools/release] - https://gerrit.wikimedia.org/r/618632 (owner: Ahmon Dancy)
[22:16:27] (Merged) jenkins-bot: deploy-promote: Show usage if no args are supplied [tools/release] - https://gerrit.wikimedia.org/r/618632 (owner: Ahmon Dancy)
[22:17:44] Continuous-Integration-Infrastructure, Zuul, Patch-For-Review: Add support for ecdsa keys in zuul (Also update paramiko to 2.2+) - https://phabricator.wikimedia.org/T171165 (hashar) Open→Resolved a:hashar Apparently Paramiko introduced ecdsa keys support with 1.12.0 (see http://www.parami...
[23:32:39] Beta-Cluster-Infrastructure: deployment-perfapt01 seems to be broken - https://phabricator.wikimedia.org/T259540 (dpifke) Open→Resolved a:dpifke The host has been deleted.
[23:54:18] PROBLEM - Host deployment-sentry01 is DOWN: CRITICAL - Host Unreachable (172.16.5.16)