[00:20:06] Project beta-update-databases-eqiad build #32083: 04STILL FAILING in 5.8 sec: https://integration.wikimedia.org/ci/job/beta-update-databases-eqiad/32083/ [00:25:14] !log disabled beta-update-databases-eqiad in the jenkins UI - T216067 [00:25:16] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [00:25:16] T216067: Recover from corrupted beta MySQL slave (deployment-db04) - https://phabricator.wikimedia.org/T216067 [01:35:52] if I want to merge updates from external repo into gerrit repo and I get "committer email address does not match", what should I do? [01:37:42] qchris: ^^ ? [01:40:02] SMalyshev: you can temporarily give yourself permission to push to a repo with an email other than your own [01:40:13] (at least that is what i have done before) [01:40:19] it is in the project settings somewhere [01:40:22] * addshore is bed now [02:02:58] (03PS1) 10Krinkle: Remove unused castor steps from Fresnel job [integration/config] - 10https://gerrit.wikimedia.org/r/491667 [02:09:50] addshore: not sure I have permission to give myself permission for that [03:42:59] 10Gerrit, 10Release-Engineering-Team, 10Wikidata, 10Wikidata-Query-Service, 10User-Smalyshev: Unable to push third-party changes to gerrit repo - https://phabricator.wikimedia.org/T216587 (10Smalyshev) p:05Triage→03Normal [03:43:20] 10Gerrit, 10Release-Engineering-Team, 10Wikidata, 10Wikidata-Query-Service, 10User-Smalyshev: Unable to push third-party changes to gerrit repo - https://phabricator.wikimedia.org/T216587 (10Smalyshev) a:05Smalyshev→03None [03:43:55] SMalyshev: you have to edit ,access for the project [03:44:48] And grant your self “Forge Committer Identity” [03:45:32] 10Gerrit, 10Release-Engineering-Team, 10Wikidata, 10Wikidata-Query-Service, 10User-Smalyshev: Unable to push third-party changes to gerrit repo - https://phabricator.wikimedia.org/T216587 (10Paladox) Yes, you go to the project in gerrits ui, go to access and then click edit, then you add 
“Forge Committer... [03:45:33] paladox: I don't think I have access to that [03:45:43] 10Gerrit, 10Release-Engineering-Team, 10Wikidata, 10Wikidata-Query-Service, 10User-Smalyshev: Unable to push third-party changes to gerrit repo - https://phabricator.wikimedia.org/T216587 (10Paladox) Like https://gerrit.wikimedia.org/r/#/admin/projects/operations/software/gerrit,access [03:45:51] could you do that for me? [03:45:59] Which repo? [03:46:02] paladox: I don't have any "edit" in access screen [03:46:13] paladox: wikidata/query/LDFServer [03:46:30] all is greyed out [03:46:36] Are you on ldap/wmf? [03:46:45] yes probably [03:47:05] You can edit that repo [03:47:14] Your in https://gerrit.wikimedia.org/r/#/admin/groups/1466,members [03:47:19] I am not allowed to see members of ldap/wmf either [03:47:51] paladox: I can edit the repo maybe but not access. there's no edit button and everything is gray [03:49:00] I don't think that group membership allows access editing [03:49:03] (03PS1) 10Paladox: Modify access rules [wikidata/query/LDFServer] (refs/meta/config) - 10https://gerrit.wikimedia.org/r/491680 [03:49:23] (03PS2) 10Paladox: Modify access rules [wikidata/query/LDFServer] (refs/meta/config) - 10https://gerrit.wikimedia.org/r/491680 (https://phabricator.wikimedia.org/T216587) [03:49:28] SMalyshev: ^^ [03:49:39] (You can merge that and it will grant it to you) [03:49:39] (03CR) 10Smalyshev: [C: 03+1] Modify access rules [wikidata/query/LDFServer] (refs/meta/config) - 10https://gerrit.wikimedia.org/r/491680 (https://phabricator.wikimedia.org/T216587) (owner: 10Paladox) [03:50:09] (03CR) 10Smalyshev: [V: 03+2 C: 03+2] Modify access rules [wikidata/query/LDFServer] (refs/meta/config) - 10https://gerrit.wikimedia.org/r/491680 (https://phabricator.wikimedia.org/T216587) (owner: 10Paladox) [03:50:33] paladox: ok didn't know that let me see... [03:51:33] paladox: yep that worked! thanks! 
[03:51:47] Your welcome :) [03:52:13] 10Gerrit, 10Release-Engineering-Team, 10Wikidata, 10Wikidata-Query-Service, and 2 others: Unable to push third-party changes to gerrit repo - https://phabricator.wikimedia.org/T216587 (10Paladox) 05Open→03Resolved a:03Paladox [03:57:36] 10Gerrit, 10Operations, 10serviceops: Gerrit loads very slowly - https://phabricator.wikimedia.org/T215855 (10Paladox) Thanks @hashar [06:58:35] (03CR) 10Giuseppe Lavagetto: [C: 03+2] Rewrite the concurrency logic of OpcacheManager [tools/scap] - 10https://gerrit.wikimedia.org/r/490888 (https://phabricator.wikimedia.org/T211964) (owner: 10Giuseppe Lavagetto) [07:00:36] (03Merged) 10jenkins-bot: Rewrite the concurrency logic of OpcacheManager [tools/scap] - 10https://gerrit.wikimedia.org/r/490888 (https://phabricator.wikimedia.org/T211964) (owner: 10Giuseppe Lavagetto) [07:01:27] (03CR) 10jenkins-bot: Rewrite the concurrency logic of OpcacheManager [tools/scap] - 10https://gerrit.wikimedia.org/r/490888 (https://phabricator.wikimedia.org/T211964) (owner: 10Giuseppe Lavagetto) [07:16:44] 10Release-Engineering-Team (Kanban), 10MediaWiki-Authentication-and-authorization, 10MediaWiki-Core-Testing, 10MW-1.33-notes (1.33.0-wmf.17; 2019-02-12), and 4 others: wdio browser tests fail locally due to session not being persisted before 2nd stage of login star... - https://phabricator.wikimedia.org/T214471 [08:35:18] SMalyshev: for ldap members you can use https://tools.wmflabs.org/ldap/group/wmf [08:35:23] 10Beta-Cluster-Infrastructure, 10User-Ryasmeen: Beta cluster is broken: No working replica DB server: Unknown error (172.16.5.5:3306)) - https://phabricator.wikimedia.org/T216045 (10jcrespo) Sorry, I don't maintain Beta cluster, in fact, I was denied access to it some time ago. [08:47:40] hashar: around by any chance? Would you have time to have a look at https://gerrit.wikimedia.org/r/c/integration/config/+/489800 ? 
[08:48:12] (03PS1) 10Hashar: Register extension FilterSpecialPages [integration/config] - 10https://gerrit.wikimedia.org/r/491702 [08:48:38] 10Beta-Cluster-Infrastructure, 10User-Ryasmeen: Beta cluster is broken: No working replica DB server: Unknown error (172.16.5.5:3306)) - https://phabricator.wikimedia.org/T216045 (10jcrespo) I found this, though: T216404 [08:49:02] hashar: and if you have more time, I also have a few more: https://gerrit.wikimedia.org/r/q/project:integration%252Fconfig+owner:guillaume.lederrey%2540wikimedia.org+status:open [08:49:31] (03PS1) 10Hashar: Register extension ScrollableTables [integration/config] - 10https://gerrit.wikimedia.org/r/491703 [08:50:19] (03PS1) 10Hashar: Register extension WebDAV [integration/config] - 10https://gerrit.wikimedia.org/r/491705 [08:51:08] (03CR) 10Hashar: [C: 03+2] Register extension FilterSpecialPages [integration/config] - 10https://gerrit.wikimedia.org/r/491702 (owner: 10Hashar) [08:51:16] (03CR) 10Hashar: [C: 03+2] Register extension ScrollableTables [integration/config] - 10https://gerrit.wikimedia.org/r/491703 (owner: 10Hashar) [08:51:20] (03CR) 10Hashar: [C: 03+2] Register extension WebDAV [integration/config] - 10https://gerrit.wikimedia.org/r/491705 (owner: 10Hashar) [08:52:43] (03Merged) 10jenkins-bot: Register extension FilterSpecialPages [integration/config] - 10https://gerrit.wikimedia.org/r/491702 (owner: 10Hashar) [08:52:45] (03Merged) 10jenkins-bot: Register extension ScrollableTables [integration/config] - 10https://gerrit.wikimedia.org/r/491703 (owner: 10Hashar) [08:52:47] (03Merged) 10jenkins-bot: Register extension WebDAV [integration/config] - 10https://gerrit.wikimedia.org/r/491705 (owner: 10Hashar) [08:53:23] (03PS1) 10Hashar: Register extension WikibaseLexemeCirrusSearch [integration/config] - 10https://gerrit.wikimedia.org/r/491706 [08:54:41] gehel: bonjour :) [08:58:04] (03CR) 10Hashar: [C: 03+2] "That is straightforward. 
I have deployed the jobs:" [integration/config] - 10https://gerrit.wikimedia.org/r/489800 (owner: 10Gehel) [08:58:15] hashar: thanks! [08:58:26] deploying the glent as soon as the change is merged [08:59:23] gehel: for the maven wrapper, I thought we had that implemented ages ago? [08:59:48] I guess I was wrong ;) [08:59:54] nope, we talked about it ages ago, but it fell through the cracks [09:00:01] (03Merged) 10jenkins-bot: Add search/glent project to jenkins. [integration/config] - 10https://gerrit.wikimedia.org/r/489800 (owner: 10Gehel) [09:00:35] and the last one is failing build, but I'm not sure why [09:10:09] 10Continuous-Integration-Config, 10Discovery-Search (Current work): Setup CI for search/glent - https://phabricator.wikimedia.org/T216599 (10dcausse) looks like it was already done in https://gerrit.wikimedia.org/r/c/integration/config/+/489800 but sadly this does not seem to be effective, CI is not triggered,... [09:11:57] damn sorry^ [09:12:12] 10Continuous-Integration-Config, 10Discovery-Search (Current work): Setup CI for search/glent - https://phabricator.wikimedia.org/T216599 (10Gehel) >>! In T216599#4967899, @dcausse wrote: > looks like it was already done in https://gerrit.wikimedia.org/r/c/integration/config/+/489800 but sadly this does not se... [09:13:09] hashar: can you ping us (dcausse and me) when the config is deployed? So we can check? [09:17:22] (03CR) 10Hashar: [C: 04-1] "tldr I think the logic should be in dockerfiles/java8/mvn in order to reuse some other custom hacks we have ;)" (031 comment) [integration/config] - 10https://gerrit.wikimedia.org/r/487285 (https://phabricator.wikimedia.org/T208938) (owner: 10Gehel) [09:17:58] gehel: dcausse: search/glent now has CI :) [09:18:08] hashar, gehel: thanks! [09:18:10] gehel: and I reviewed the mvnw change ( https://gerrit.wikimedia.org/r/#/c/integration/config/+/487285/ ) [09:18:22] hashar: I'll do another pass! 
[09:18:47] hashar: a "recheck" on search/glent should be sufficient to trigger CI? [09:18:51] gehel: in short, the mvn command is already a wrapper which has some custom hacks/settings which we most probably want to reuse/apply when using mvnw [09:18:59] +1 on "recheck" yes [09:20:26] 10Continuous-Integration-Config, 10Discovery-Search (Current work): Setup CI for search/glent - https://phabricator.wikimedia.org/T216599 (10dcausse) a:03Gehel [09:21:23] (03CR) 10Hashar: [C: 03+2] "Lets see what is going on on CI front ;)" [integration/config] - 10https://gerrit.wikimedia.org/r/491706 (owner: 10Hashar) [09:21:24] hashar: build is failing, so it looks like it is working fine :) [09:22:11] ahah [09:22:49] (03Merged) 10jenkins-bot: Register extension WikibaseLexemeCirrusSearch [integration/config] - 10https://gerrit.wikimedia.org/r/491706 (owner: 10Hashar) [10:22:00] 10Continuous-Integration-Infrastructure, 10Release-Engineering-Team (Kanban), 10puppet-compiler: compiler1002.puppet-diffs.eqiad.wmflabs instance is down - https://phabricator.wikimedia.org/T216513 (10hashar) a:05hashar→03herron @herron restored the instance, not me :] Will be verified in a few hours and... 
[10:23:09] 10Continuous-Integration-Infrastructure, 10Release-Engineering-Team (Kanban), 10puppet-compiler: compiler1002.puppet-diffs.eqiad.wmflabs instance is down - https://phabricator.wikimedia.org/T216513 (10hashar) p:05Triage→03High [10:25:39] (03CR) 10Hashar: "recheck" [integration/zuul] (debian/jessie-wikimedia) - 10https://gerrit.wikimedia.org/r/486702 (https://phabricator.wikimedia.org/T214739) (owner: 10Paladox) [10:26:37] (03CR) 10jerkins-bot: [V: 04-1] Update GitPython to 2.1.11 [integration/zuul] (debian/jessie-wikimedia) - 10https://gerrit.wikimedia.org/r/486702 (https://phabricator.wikimedia.org/T214739) (owner: 10Paladox) [10:36:43] (03PS2) 10Hashar: debian-glue: avoid mangling $distribution and $DIST [integration/config] - 10https://gerrit.wikimedia.org/r/491639 [10:38:13] (03CR) 10Hashar: [C: 03+2] "Seems good now. I really have to rewrite all of that mess to standalone script." [integration/config] - 10https://gerrit.wikimedia.org/r/491639 (owner: 10Hashar) [10:38:31] (03CR) 10Hashar: "recheck" [integration/zuul] (debian/jessie-wikimedia) - 10https://gerrit.wikimedia.org/r/486702 (https://phabricator.wikimedia.org/T214739) (owner: 10Paladox) [10:40:14] (03Merged) 10jenkins-bot: debian-glue: avoid mangling $distribution and $DIST [integration/config] - 10https://gerrit.wikimedia.org/r/491639 (owner: 10Hashar) [10:47:29] 10Continuous-Integration-Config, 10Discovery-Search (Current work): Setup CI for search/glent - https://phabricator.wikimedia.org/T216599 (10hashar) I have deployed the CI change several minutes after it got merged, I was busy reviewed another patch by Gehel. Sorry. Seems some maven step fail due to `forbidde... [10:48:18] 10Continuous-Integration-Config, 10Discovery-Search (Current work): Setup CI for search/glent - https://phabricator.wikimedia.org/T216599 (10Gehel) I confirm that this is working as intended, the failure is expected. 
[10:59:52] 10Gerrit, 10Operations, 10serviceops: Gerrit loads very slowly - https://phabricator.wikimedia.org/T215855 (10hashar) I have pasted P8073 content to [[ https://fastthread.io/ | fastthread.io ]]. It is an analyzer for Java thread dumps. https://fastthread.io/ft-thread-report.jsp?dumpId=1&oTxnId_value=c865888... [11:18:10] 10Gerrit, 10Operations, 10serviceops: Gerrit loads very slowly - https://phabricator.wikimedia.org/T215855 (10hashar) ` zcat /var/log/apache2/gerrit.wikimedia.org.https.access.log.9.gz|cut -b-13|sort|uniq -c 17821 2019-02-11T06 55594 2019-02-11T07 52925 2019-02-11T08 54292 2019-02-11T09 74124 2019-... [11:32:00] 10Gerrit: Cannot assign user name "kipcool" to account 6839; name already in use. - https://phabricator.wikimedia.org/T216605 (10hashar) [11:32:14] 10Gerrit, 10Release-Engineering-Team (Kanban), 10Regression, 10Upstream: Cannot log into Gerrit as of recent upgrade - https://phabricator.wikimedia.org/T152640 (10hashar) 05Open→03Resolved @Kipcool this task is unrelated. That would be T197083 instead. It is better to just fill another task which I di... [11:57:08] 10Release-Engineering-Team (Kanban), 10Scap, 10Operations: Remove trusty-specific hacks from logstash_checker.py - https://phabricator.wikimedia.org/T216380 (10MoritzMuehlenhoff) p:05Triage→03Low [12:32:50] 10Gerrit: Cannot assign user name "kipcool" to account 6839; name already in use. - https://phabricator.wikimedia.org/T216605 (10hashar) [12:40:33] 10Gerrit: Cannot assign user name "kipcool" to account 6839; name already in use. - https://phabricator.wikimedia.org/T216605 (10hashar) ` gerrit> select * from account_external_ids where external_id LIKE '%ipcool%'; account_id | email_address | password | external_id -----------+---------------------+--... 
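The hourly breakdown pasted at 11:18:10 (`zcat ... | cut -b-13 | sort | uniq -c`) groups access-log lines by their first 13 characters, which is the `YYYY-MM-DDTHH` prefix of the timestamp. A minimal Python sketch of the same idea follows, assuming every log line starts with an ISO-8601 timestamp as in the pasted output. The function names and the sample lines are hypothetical and not taken from the real log.

```python
import gzip
from collections import Counter
from typing import Iterable


def requests_per_hour(lines: Iterable[str]) -> Counter:
    """Count lines per hour, keyed by the first 13 characters (YYYY-MM-DDTHH)."""
    # Strip the trailing newline first so short lines behave like `cut -b-13`.
    return Counter(line.rstrip("\n")[:13] for line in lines)


def requests_per_hour_from_gzip(path: str) -> Counter:
    """Same count for a gzip-compressed log file (the `zcat` part of the pipeline)."""
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as handle:
        return requests_per_hour(handle)


if __name__ == "__main__":
    # Made-up sample lines for illustration only.
    sample = [
        "2019-02-11T06:00:01 GET /r/changes/\n",
        "2019-02-11T06:59:59 GET /r/projects/\n",
        "2019-02-11T07:00:00 GET /r/changes/\n",
    ]
    # Print in the same "count key" layout as `uniq -c`, sorted by hour.
    for hour, count in sorted(requests_per_hour(sample).items()):
        print(f"{count:7d} {hour}")
```

Slicing by characters rather than bytes only differs from `cut -b` when the first 13 bytes contain non-ASCII text, which a timestamp prefix should not.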
[12:45:09] hashar: you're looking in the wrong place :)
[12:45:32] That’s now in All-Users (the db will be outdated)
[12:46:09] yeah
[12:46:17] the All-Users record seems fine
[12:46:28] so I suspected that some part still looks it up in the database
[12:46:42] who knows really, I lack all the history / context anyway
[12:47:12] Gerrit: Cannot assign user name "kipcool" to account 6839; name already in use. - https://phabricator.wikimedia.org/T216605 (Paladox) His account is missing an external ID. Also this is no longer stored in the db, the db will be outdated now. It’s all stored in All-Users under refs/external* (or similar)
[12:47:28] hashar: it’s missing the external-id
[12:47:48] I think
[12:48:33] Unless it’s different in All-Users (that being it missing the gerrit: prefix)
[12:50:11] Ah yeah, gerrit:
[12:50:13] https://phabricator.wikimedia.org/T152640#3383814
[12:50:35] Per example from ^^ (it would only happen if the gerrit: prefix was missing)
[12:52:51] I am all confused ;)
[12:54:36] I don't even get what the table account_external_ids is for
[12:59:28] ah doc https://gerrit-review.googlesource.com/Documentation/config-accounts.html
[13:02:30] +++ b/6c/84cc1cfca1fbbc7f405b7a9f5c6510a7d6ad82
[13:02:30] @@ -0,0 +1,2 @@
[13:02:30] +[externalId "username:kipcool"]
[13:02:30] + accountId = 609
[13:02:31] ;)
[13:03:33] 6316 accounts with "username:"
[13:03:42] 6513 accounts with gerrit:
[13:03:43] bah
[13:20:16] Gerrit: Cannot assign user name "kipcool" to account 6839; name already in use. - https://phabricator.wikimedia.org/T216605 (hashar) Indeed sorry, the accounts' external IDs are stored as git notes in `refs/meta/external-ids`: ` $ git fetch origin refs/meta/external-ids $ git checkout -b meta/external-ids FET...
[13:26:50] (PS3) Gehel: java: build maven projects with maven wrapper if it exists [integration/config] - https://gerrit.wikimedia.org/r/487285 (https://phabricator.wikimedia.org/T208938)
[13:27:08] hashar: ^ is that what you were suggesting?
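The diff pasted at 13:02:30 shows the shape of one external-ID note, and the two counts that follow it (6316 `username:` entries against 6513 `gerrit:` entries) suggest comparing the two schemes per account. A rough Python sketch of such an audit follows. It assumes each note is git-config-style text with an `[externalId "scheme:id"]` header and an `accountId = N` line, as in the pasted diff. The function names and sample notes are hypothetical.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, Optional, Set, Tuple

# Header such as: [externalId "username:kipcool"]
HEADER_RE = re.compile(r'^\s*\[externalId\s+"([^":]+):([^"]*)"\]\s*$', re.MULTILINE)
# Value line such as: accountId = 609
ACCOUNT_RE = re.compile(r"^\s*accountId\s*=\s*(\d+)\s*$", re.MULTILINE)


def parse_external_id(note: str) -> Optional[Tuple[str, str, int]]:
    """Return (scheme, identifier, account_id), or None if the note does not match."""
    header = HEADER_RE.search(note)
    account = ACCOUNT_RE.search(note)
    if header is None or account is None:
        return None
    return header.group(1), header.group(2), int(account.group(1))


def accounts_missing_scheme(
    notes: Iterable[str], have: str = "username", want: str = "gerrit"
) -> Set[int]:
    """Account IDs that have a `have:` external ID but no `want:` external ID."""
    schemes: Dict[int, Set[str]] = defaultdict(set)
    for note in notes:
        parsed = parse_external_id(note)
        if parsed is not None:
            scheme, _identifier, account_id = parsed
            schemes[account_id].add(scheme)
    return {acct for acct, seen in schemes.items() if have in seen and want not in seen}


if __name__ == "__main__":
    # Made-up notes; real ones would be read from a checkout of refs/meta/external-ids.
    notes = [
        '[externalId "username:alice"]\n\taccountId = 1\n',
        '[externalId "gerrit:alice"]\n\taccountId = 1\n',
        '[externalId "username:bob"]\n\taccountId = 2\n',
    ]
    print(sorted(accounts_missing_scheme(notes)))
```

This only flags a missing scheme; it does not detect case mismatches such as the lower-case `gerrit:` entry discussed later in the log.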
[13:30:57] (03PS4) 10Gehel: java: build maven projects with maven wrapper if it exists [integration/config] - 10https://gerrit.wikimedia.org/r/487285 (https://phabricator.wikimedia.org/T208938) [13:33:58] 10Release-Engineering-Team, 10Analytics, 10EventBus, 10MediaWiki-Core-Testing, and 4 others: Flaky quibble-vendor-mysql-hhvm-docker test in Jenkins - https://phabricator.wikimedia.org/T216069 (10hashar) EventBus itself seems to be fine since a change got merged yesterday by CI https://gerrit.wikimedia.org/... [14:06:38] 10Release-Engineering-Team (Watching / External), 10MW-1.32-notes, 10MW-1.32-release: Release 1.32.1 as a maintenance release - https://phabricator.wikimedia.org/T213595 (10mmodell) Ok I've uploaded tarballs to releases1001, they should appear soon on releases.wikimedia.org as follows: *********************... [14:10:12] hashar: https://gerrit.wikimedia.org/r/#/c/mediawiki/extensions/Wikibase/+/491414/ is great, thanks [14:10:27] I should go do some maths about how many hours of cpu time you just saved [14:17:50] 10Release-Engineering-Team, 10ORES, 10draftquality-modeling, 10Scoring-platform-team (Current), 10artificial-intelligence: Gerrit mirror from phab broken for source/draftquality - https://phabricator.wikimedia.org/T216616 (10Halfak) [14:18:13] 10Continuous-Integration-Infrastructure (Slipway), 10Wikimedia-Portals: Migrate wikimedia-portals-build to Docker container - https://phabricator.wikimedia.org/T213806 (10hashar) [14:18:13] Hey folks. We have a weird issue with gerrit mirroring from phab that is blocking our deployment today. 
See https://phabricator.wikimedia.org/T216616 [14:18:17] 10Continuous-Integration-Infrastructure (Slipway), 10Wikidata, 10Wikidata Query UI, 10Wikidata-Campsite: Migrate wikidata-query-gui-build to Docker containers - https://phabricator.wikimedia.org/T210286 (10hashar) [14:18:19] 10Continuous-Integration-Infrastructure, 10Release-Engineering-Team: Don't hardcode castor url in castor docker container - https://phabricator.wikimedia.org/T216244 (10hashar) [14:18:22] 10Release-Engineering-Team, 10ORES, 10draftquality-modeling, 10Scoring-platform-team (Current), and 2 others: Gerrit mirror from phab broken for source/draftquality - https://phabricator.wikimedia.org/T216616 (10Zppix) [14:18:27] Would love if someone can take a look to see if the solution is obvious. [14:19:04] We tried turning mirroring off and on again. Of course, we also scheduled a repo update and watched it go by with no error or change. [14:19:45] 10Release-Engineering-Team, 10ORES, 10draftquality-modeling, 10Scoring-platform-team (Current), and 2 others: Gerrit mirror from phab broken for source/draftquality - https://phabricator.wikimedia.org/T216616 (10Halfak) @Ladsgroup tried turning mirroring off and on again. We also tried scheduling an updat... [14:21:33] 10Release-Engineering-Team, 10ORES, 10draftquality-modeling, 10Scoring-platform-team (Current), and 2 others: Gerrit mirror from phab broken for source/draftquality - https://phabricator.wikimedia.org/T216616 (10Halfak) [14:23:00] 10Continuous-Integration-Infrastructure, 10Release-Engineering-Team: Don't hardcode castor url in castor docker container - https://phabricator.wikimedia.org/T216244 (10hashar) It is a bit of a mess. The instance fqdn is hardcoded in jjb/castor-load-sync.bash which is the inlined in the jobs definition in jjb/... 
[14:30:38] 10Continuous-Integration-Infrastructure, 10Operations: jenkins / zuul backing up due to jenkins slaves down - https://phabricator.wikimedia.org/T216039 (10hashar) 05Open→03Resolved a:03thcipriani We can resolve this task since Tyler did the emergency action. The `castor-save` job could not be triggered... [14:32:30] 10Continuous-Integration-Infrastructure: Jenkins failing everything due to npm being screwed up - https://phabricator.wikimedia.org/T216053 (10hashar) Several similar tasks got filled. Namely the central cache (that holds a copy of packagist/npm cache) ends up being corrupted due to the WMCS outage last week. T... [14:50:49] (03CR) 10Giuseppe Lavagetto: [C: 03+2] dnslint: update to 0.0.4 [integration/config] - 10https://gerrit.wikimedia.org/r/490880 (owner: 10BBlack) [14:52:45] (03Merged) 10jenkins-bot: dnslint: update to 0.0.4 [integration/config] - 10https://gerrit.wikimedia.org/r/490880 (owner: 10BBlack) [14:55:16] 10Release-Engineering-Team, 10ORES, 10draftquality-modeling, 10Scoring-platform-team (Current), and 2 others: Gerrit mirror from phab broken for source/draftquality - https://phabricator.wikimedia.org/T216616 (10MarcoAurelio) So, Phab is the one out of sync. if I read this right? [14:57:49] 10Phabricator: On Phabricator workboard, show status of associated Gerrit patches - https://phabricator.wikimedia.org/T215148 (10hashar) Some hints for @Jdrewniak ---- `q=bug:T215861+OR+T198810+OR+T198342` Probably each task should be prefixed with `bug:` eg: `bug:T123 OR bug:T456 ...`, that might makes it a... [15:17:09] (03CR) 10Kosta Harlan: [C: 03+1] "Zeljko, Guillaume and I looked at this today, and it makes sense." [integration/config] - 10https://gerrit.wikimedia.org/r/490950 (owner: 10Thcipriani) [15:18:45] twentyafterfour: having problems with mirroring a diffusion repo to gerrit using either k18/k19 - could you help us when you got some time, please? 
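hashar's hint at 14:57:49 is that each task ID in the Gerrit search should carry its own `bug:` prefix (`bug:T123 OR bug:T456`) rather than prefixing only the first one. A small Python sketch of building such a query follows. The base URL is an assumption based on Gerrit's `/changes/?q=` REST endpoint and the `/r/` prefix seen elsewhere in this log, and the function names are hypothetical. Nothing is fetched.

```python
from typing import Iterable
from urllib.parse import urlencode


def bug_query(task_ids: Iterable[str]) -> str:
    """Join task IDs into a Gerrit search where every term has its own bug: prefix."""
    return " OR ".join(f"bug:{task_id}" for task_id in task_ids)


def changes_url(
    task_ids: Iterable[str],
    base: str = "https://gerrit.wikimedia.org/r/changes/",  # assumed endpoint
) -> str:
    """Build a URL-encoded query URL; urlencode escapes ':' and turns spaces into '+'."""
    return f"{base}?{urlencode({'q': bug_query(task_ids)})}"


if __name__ == "__main__":
    print(bug_query(["T123", "T456"]))
    print(changes_url(["T123", "T456"]))
```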
[15:19:06] hauskatze: sure [15:19:29] twentyafterfour: great - see T216616 [15:19:29] T216616: Gerrit mirror from phab broken for source/draftquality - https://phabricator.wikimedia.org/T216616 [15:19:56] https://gerrit.wikimedia.org/r/plugins/gitiles/scoring/ores/draftquality/+log/master <-- it's gerrit which is out of date [15:20:02] phab working okay [15:20:11] <_joe_> hashar: how can I get a jjb change to take effect? [15:20:17] (03Abandoned) 10Kosta Harlan: Sonar: Specify branch name and target [integration/config] - 10https://gerrit.wikimedia.org/r/487877 (https://phabricator.wikimedia.org/T215175) (owner: 10Kosta Harlan) [15:20:18] but then phab -> gerrit seems it is not working either via K18/K19 [15:23:36] 10Release-Engineering-Team, 10ORES, 10draftquality-modeling, 10Scoring-platform-team (Current), and 2 others: Gerrit mirror from phab broken for source/draftquality - https://phabricator.wikimedia.org/T216616 (10MarcoAurelio) I've tried using {K19} with URI `ssh://phabricator@gerrit.wikimedia.org:29418/sco... [15:26:20] 10Continuous-Integration-Config, 10Fresnel, 10Performance-Team: Jenkins should collect Fresnel records as build artefact - https://phabricator.wikimedia.org/T216622 (10Krinkle) [15:33:24] hauskatze: strange, I see the access denied error from gerrit, not sure what changed [15:33:37] gerrit is rejecting phabricator's credentials [15:33:51] twentyafterfour: aha, so it's k18/k19 right? [15:34:07] twentyafterfour: perhaps trying to get a new http password for K18 would fix it? [15:34:57] 10Release-Engineering-Team, 10ORES, 10draftquality-modeling, 10Scoring-platform-team (Current), and 2 others: Gerrit mirror from phab broken for source/draftquality - https://phabricator.wikimedia.org/T216616 (10mmodell) It seems that gerrit is rejecting phabricator's credentials. I'm not sure what changed... 
[15:35:55] assuming http mirroring is preferred over ssh or vice versa [15:38:52] 10Release-Engineering-Team, 10ORES, 10draftquality-modeling, 10Scoring-platform-team (Current), and 2 others: Gerrit mirror from phab broken for source/draftquality - https://phabricator.wikimedia.org/T216616 (10MarcoAurelio) @mmodell if K18 credentials are being rejected as well maybe we should regenerate... [15:43:45] (03PS2) 10Zfilipin: Sonar: job template for change vs branch [integration/config] - 10https://gerrit.wikimedia.org/r/490950 (https://phabricator.wikimedia.org/T215175) (owner: 10Thcipriani) [15:57:06] 10Release-Engineering-Team, 10Analytics, 10EventBus, 10MediaWiki-Core-Testing, and 4 others: Flaky quibble-vendor-mysql-hhvm-docker test in Jenkins - https://phabricator.wikimedia.org/T216069 (10Pchelolo) I have tried to resubmit the change: https://gerrit.wikimedia.org/r/#/c/mediawiki/extensions/EventBus/... [16:12:19] (03PS2) 10Giuseppe Lavagetto: Remove functionality to talk to conftool [tools/scap] - 10https://gerrit.wikimedia.org/r/491412 [16:12:21] (03PS1) 10Giuseppe Lavagetto: Add tests for scap.cli [tools/scap] - 10https://gerrit.wikimedia.org/r/491789 [16:15:22] hey! I have trouble signing in to Gerrit: "Cannot assign user name "albert221" to account 6851; name already in use.". My credentials are valid, I can login to wikitech with them easily, but Gerrit gives me this error :( [16:18:15] hashar or thcipriani ^^ [16:18:39] im pretty sure that's happening again due to a recent security fix (that affected oauth) [16:18:56] oh my god [16:18:57] :( [16:19:13] Albert221: that is a known issue :/ Someone recently refilled a bug about it [16:19:33] Albert221 when did you last log into gerrit? [16:19:51] paladox: hmm, February 2018? [16:19:54] aha [16:19:55] paladox: could it be the issue twentyafterfour and me are trying to debug as well? 
[16:20:12] i think Albert221's problem is from when we were using a db
[16:20:15] (credentials suddenly not working)
[16:20:37] hauskatze possibly, but i don't think so
[16:20:49] unless you're trying to use a password and not an ssh key?
[16:21:09] Gerrit: Cannot assign user name "XXX" to account ####; name already in use. - https://phabricator.wikimedia.org/T216605 (hashar) p:Triage→High
[16:21:14] paladox: k18 uses https password to mirror diffusion to gerrit iirc
[16:21:20] Albert221: https://phabricator.wikimedia.org/T216605 , I have subscribed you to it
[16:21:20] oh
[16:21:25] but k19 uses ssh keypair and it's not working either
[16:21:42] hauskatze can you log into the gerrit account?
[16:22:04] paladox: I can log in just fine. I'm talking about the Phabricator account on Gerrit :)
[16:22:11] Gerrit: Cannot assign user name "XXX" to account ####; name already in use. - https://phabricator.wikimedia.org/T216605 (hashar)
[16:22:12] hauskatze yeah i meant that :)
[16:22:25] paladox: I don't have access to that account, cannot answer
[16:22:37] but we can revert the fix upstream did
[16:22:41] Phabricator@gerrit I mean
[16:22:42] as it does not affect us
[16:23:01] (PS1) Volans: operations-dnslint: force colors for tox [integration/config] - https://gerrit.wikimedia.org/r/491793
[16:24:02] twentyafterfour can you log into phabricator@gerrit through the ui?
[16:24:04] Gerrit: Cannot assign user name "XXX" to account ####; name already in use. - https://phabricator.wikimedia.org/T216605 (hashar) @Albert221 has a similar issue: ` All-Users$ git fetch origin refs/meta/external-ids All-Users$ git grep -i albert221 80/e6adae5c49fadbb97697d11da542868c34d532:[externalId "usernam...
[16:26:15] hashar lower case "gerrit:Albert221"
[16:26:41] will probably want to audit all of them
[16:26:46] what a mess :/
[16:27:11] yeah, that issue should be fixed now (after the migration to NoteDB in June last year)
[16:27:21] so existing accounts will need fixing.
[16:27:47] Project beta-scap-eqiad build #238626: FAILURE in 9 min 39 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/238626/
[16:27:51] I see there is some major problem with that. Don't feel hassled, I don't _need_ to log in to Gerrit right now, I _just_ wanted to log in and spotted that bug :)
[16:28:07] if need be we can revert https://gerrit-review.googlesource.com/c/gerrit/+/209089 hashar
[16:28:25] I mean, don't feel hurried because of my case
[16:29:10] Albert221 you don't seem to be the only one :) (this has happened before)
[16:29:15] ^
[16:29:21] Albert221: will check with other folks that have dealt with the exact same issue previously
[16:29:32] well namely thcipriani and twentyafterfour :)))
[16:29:35] hashar: FWIW, last time the solution was git-ref surgery on the All-Users repo :\
[16:29:46] and probably we want to audit all accounts and fix them all ;)
[16:29:48] which seems to be the only path forward
[16:29:50] yeah
[16:30:01] I have notes about what we did somewhere.
[16:30:01] gotta change the key id in files under refs/meta/external-ids
[16:30:06] I'll try to post on the task
[16:30:08] get the sha1 sum of the id
[16:30:10] and git mv the file
[16:30:13] craft a commit and push
[16:30:17] (theoretically)
[16:30:39] instead of reopening the task for Kaldari, I created a new one https://phabricator.wikimedia.org/T216605
[16:30:43] with some links to gerrit doc
[16:30:51] hashar: the push doesn't work, of course :)
[16:31:09] oh
[16:31:11] it should
[16:31:13] gerrit tries to audit that emails are unique per account
[16:31:17] hmm
[16:31:25] and that fails because it didn't enforce that during the notedb migration
[16:31:48] ah https://gerrit-review.googlesource.com/Documentation/config-accounts.html#external-ids
[16:31:52] The external IDs are maintained by Gerrit, this means users are not allowed to manually edit their external IDs. Only users with the Access Database global capability can push updates to the refs/meta/external-ids branch. However Gerrit rejects pushes if:
[16:32:00] so need "Access Database" permission apparently
[16:32:08] apparently if someone tries to change their preferred email in gerrit, it breaks your email too :( (i.e. the UI reports you cannot edit the user but then it proceeds to remove it)
[16:32:23] but there is a workaround
[16:32:23] eeek :(
[16:32:42] namely re-adding it through All-Users (users can edit their own ref) thus bypassing the UI
[16:33:03] eek
[16:33:06] ^
[16:33:07] have a call with greg-g (sorry)
[16:33:17] hashar at least i helped someone who had done that in -cloud
[16:33:31] and broke it for myself trying
[16:34:46] if only we had a gerrit cli tool to validate the content of all-users ...
[16:34:49] would let us validate it
[16:35:06] (though potentially one can clone All-Users locally, set up a gerrit and iterate until a push works
[16:35:12] or maybe the push reports all issues
[16:36:47] !log Ran replication start mediawiki/extensions/PageViewInfo --wait on gerrit.wikimedia to populate GitHub mirror (success messages afterwards) | T180864
[16:36:49] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[16:36:50] T180864: Inactive extensions and skins in Diffusion / Github - https://phabricator.wikimedia.org/T180864
[16:37:55] Project beta-scap-eqiad build #238627: STILL FAILING in 8 min 43 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/238627/
[16:42:57] ^ me fiddling with beta-scap-eqiad
[16:43:02] er.../me
[16:48:18] and the big q, can i send a change to refs/meta/external-ids for review :)
[16:48:31] * hashar tries git push origin HEAD:refs/for/meta/external-ids
[16:48:41] ! [remote rejected] HEAD -> refs/for/meta/external-ids (branch meta/external-ids not found)
[16:48:41] bah
[16:49:26] refs/for/refs/meta/external-ids?
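The manual fix outlined around 16:30 (take the SHA-1 of the corrected external ID, `git mv` the note file to that name, commit, push) hinges on computing the note key. A minimal Python sketch follows, assuming the key is the SHA-1 of the UTF-8 encoded `scheme:id` string, as the Gerrit accounts documentation linked in the chat describes, and assuming the two-character fan-out directory visible in the diff pasted at 13:02:30 (`6c/84cc...`). The function names are hypothetical, and the output has not been checked against the live All-Users repository.

```python
import hashlib


def note_key(external_id: str) -> str:
    """SHA-1 hex digest of the external ID key (assumed UTF-8 encoded)."""
    return hashlib.sha1(external_id.encode("utf-8")).hexdigest()


def note_path(external_id: str, fanout: bool = True) -> str:
    """Path of the note file, using a 2-character fan-out directory when requested."""
    key = note_key(external_id)
    return f"{key[:2]}/{key[2:]}" if fanout else key


def rename_command(old_id: str, new_id: str) -> str:
    """The `git mv` that would move a note from the old key to the new key."""
    return f"git mv {note_path(old_id)} {note_path(new_id)}"


if __name__ == "__main__":
    # The IDs differ only in case, so they hash to different note files.
    print(rename_command("gerrit:Albert221", "gerrit:albert221"))
```

Renaming the file is only half of the change: the `[externalId "..."]` header inside the note would need the same edit, and per the chat the push is still subject to Gerrit's own validation.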
[16:50:25] yeah ;/ [16:50:32] sending a dummy commit [16:50:47] and gerrit gives us the list of accounts! [16:51:03] so we now what has to be fixed [16:51:14] not sure whether the list of emails should be made public [16:51:19] s/should/can/ [16:51:37] * hauskatze votes for no [16:51:44] zeljkof: hi! in SoS I was just saying we might reach out to rel-eng this week if we need help updating our fundraising-branch tests to REL1_31 and composer merge plugin [16:52:14] ejegg: ok, will ping the team [16:52:48] I'll take a first pass at the zuul config update for sure [16:53:12] thanks! [16:54:19] Project beta-scap-eqiad build #238628: 04STILL FAILING in 0.66 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/238628/ [16:54:43] (03PS2) 10Thcipriani: Move cxserver to deployment pipeline [integration/config] - 10https://gerrit.wikimedia.org/r/490559 (https://phabricator.wikimedia.org/T213195) (owner: 10Alexandros Kosiaris) [16:54:57] (03CR) 10Thcipriani: [C: 03+2] Move cxserver to deployment pipeline [integration/config] - 10https://gerrit.wikimedia.org/r/490559 (https://phabricator.wikimedia.org/T213195) (owner: 10Alexandros Kosiaris) [16:55:07] (03CR) 10MarcoAurelio: [C: 03+1] "Extension archived already." [integration/config] - 10https://gerrit.wikimedia.org/r/490606 (https://phabricator.wikimedia.org/T213011) (owner: 10MarcoAurelio) [16:55:14] (03PS4) 10MarcoAurelio: Archive the UploadLocal extension [integration/config] - 10https://gerrit.wikimedia.org/r/490606 (https://phabricator.wikimedia.org/T213011) [16:55:51] * thcipriani continues to fiddle with beta-scap-eqiad [16:56:29] (03Merged) 10jenkins-bot: Move cxserver to deployment pipeline [integration/config] - 10https://gerrit.wikimedia.org/r/490559 (https://phabricator.wikimedia.org/T213195) (owner: 10Alexandros Kosiaris) [16:57:54] 10Gerrit: Cannot assign user name "XXX" to account ####; name already in use. - https://phabricator.wikimedia.org/T216605 (10hashar) We can not fix accounts one by one. 
Gerrit enforces a validation when pushing to `refs/meta/external-ids`. I gave it a try using: ` git clone All-Users.git git fetch origin refs/me...
[16:57:56] !log reloading zuul to deploy https://gerrit.wikimedia.org/r/c/integration/config/+/490559
[16:57:57] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[16:58:57] thcipriani: could you https://gerrit.wikimedia.org/r/#/c/integration/config/+/490606/ ?
[17:01:02] (CR) Thcipriani: [C: +2] Archive the UploadLocal extension [integration/config] - https://gerrit.wikimedia.org/r/490606 (https://phabricator.wikimedia.org/T213011) (owner: MarcoAurelio)
[17:01:07] (CR) Hashar: "Sorry I have missed your review request. This morning I noticed the repository lacked some CI and went with a basic one https://gerrit.wik" [integration/config] - https://gerrit.wikimedia.org/r/490792 (https://phabricator.wikimedia.org/T216206) (owner: Smalyshev)
[17:02:45] (Merged) jenkins-bot: Archive the UploadLocal extension [integration/config] - https://gerrit.wikimedia.org/r/490606 (https://phabricator.wikimedia.org/T213011) (owner: MarcoAurelio)
[17:02:54] (PS4) Hashar: Set up CI for WikibaseLexemeCirrusSearch extension [integration/config] - https://gerrit.wikimedia.org/r/490792 (https://phabricator.wikimedia.org/T216206) (owner: Smalyshev)
[17:03:14] (CR) Hashar: [C: +2] "Sorry about that :/" [integration/config] - https://gerrit.wikimedia.org/r/490792 (https://phabricator.wikimedia.org/T216206) (owner: Smalyshev)
[17:04:48] SMalyshev: sorry about WikibaseLexemeCirrusSearch CI setup :/ I have missed your perfect patch and instead deployed my own basic patch :/
[17:05:01] !log reloading zuul to deploy https://gerrit.wikimedia.org/r/c/integration/config/+/490606/
[17:05:02] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[17:05:21] I rebased yours and I am about to deploy it.
WikibaseLexemeCirrusSearch has some issues, I commented on https://gerrit.wikimedia.org/r/#/c/mediawiki/extensions/WikibaseLexemeCirrusSearch/+/490794/
[17:05:43] (Merged) jenkins-bot: Set up CI for WikibaseLexemeCirrusSearch extension [integration/config] - https://gerrit.wikimedia.org/r/490792 (https://phabricator.wikimedia.org/T216206) (owner: Smalyshev)
[17:09:00] !log contint1001: fix broken root ownership on zuul git deploy repo: sudo find /etc/zuul/wikimedia/.git -not -user zuul -exec chown zuul:zuul {} +
[17:09:01] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[17:09:22] :Id1e3afd0afba9b388778066b9b6e8e564a25826b
[17:09:29] !log reloading zuul for Id1e3afd0afba9b388778066b9b6e8e564a25826b
[17:09:30] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[17:15:03] Release-Engineering-Team (Kanban), Code-Stewardship-Reviews, Graphoid, Operations, and 2 others: graphoid: Code stewardship request - https://phabricator.wikimedia.org/T211881 (Jhernandez) >>! In T211881#4954470, @akosiaris wrote: >>>! In T211881#4954092, @Jhernandez wrote: >> The graphoid repo i...
[17:18:48] Release-Engineering-Team (Kanban), MediaWiki-Authentication-and-authorization, MediaWiki-Core-Testing, MW-1.33-notes (1.33.0-wmf.17; 2019-02-12), and 4 others: wdio browser tests fail locally due to session not being persisted before 2nd stage of login star... - https://phabricator.wikimedia.org/T214471
[17:23:09] Beta-Cluster-Infrastructure, Release-Engineering-Team, DBA: MySQL database on deployment-db03 does not start due to InnoDB issue - https://phabricator.wikimedia.org/T216635 (hashar)
[17:24:09] Beta-Cluster-Infrastructure, Release-Engineering-Team, DBA: MySQL database on deployment-db03 does not start due to InnoDB issue - https://phabricator.wikimedia.org/T216635 (Marostegui) Broken storage?: ` Feb 18 13:24:54 mysqld[837]: InnoDB: Error number 5 means 'Input/output error'.
`
[17:24:14] Beta-Cluster-Infrastructure, Cloud-VPS, Wikidata, User-Addshore, and 2 others: deployment-db03.deployment-prep.eqiad.wmflabs instance can not start - https://phabricator.wikimedia.org/T216404 (hashar) Open→Resolved a:hashar Closing this task since it is going off topic. The original i...
[17:24:19] Release-Engineering-Team (Kanban), MediaWiki-Authentication-and-authorization, MediaWiki-Core-Testing, MW-1.33-notes (1.33.0-wmf.17; 2019-02-12), and 4 others: wdio browser tests fail locally due to session not being persisted before 2nd stage of login star... - https://phabricator.wikimedia.org/T214471
[17:24:35] Beta-Cluster-Infrastructure, Release-Engineering-Team, DBA: MySQL database on deployment-db03 does not start due to InnoDB issue - https://phabricator.wikimedia.org/T216635 (hashar) For #DBA , I will poke you on Thursday for some assistance. Not sure whether the Innodb database can be recovered. dep...
[17:24:42] (Abandoned) Joal: Update analytics maven job to fix surefire failure [integration/config] - https://gerrit.wikimedia.org/r/470789 (https://phabricator.wikimedia.org/T208377) (owner: Joal)
[17:26:29] !log For beta cluster the MySQL master database has some innodb issue T216635 , the MySQL slave has an issue as well T216067
[17:26:33] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[17:26:33] T216067: Recover from corrupted beta MySQL slave (deployment-db04) - https://phabricator.wikimedia.org/T216067
[17:26:34] T216635: MySQL database on deployment-db03 does not start due to InnoDB issue - https://phabricator.wikimedia.org/T216635
[17:34:57] Beta-Cluster-Infrastructure, Release-Engineering-Team, DBA: MySQL database on deployment-db03 does not start due to InnoDB issue - https://phabricator.wikimedia.org/T216635 (hashar) ` umount /srv fsck /srv mount /srv ` I restarted mysql. Same I/O error. Maybe the disk is corr...
[17:37:01] Beta-Cluster-Infrastructure, Release-Engineering-Team, DBA: MySQL database on deployment-db03 does not start due to InnoDB issue - https://phabricator.wikimedia.org/T216635 (Marostegui) Anything on `dmesg`? Can you do a `touch /srv/test`?
[17:51:07] Beta-Cluster-Infrastructure, Release-Engineering-Team, DBA: MySQL database on deployment-db03 does not start due to InnoDB issue - https://phabricator.wikimedia.org/T216635 (hashar) Ah dmesg! [Mon Feb 18 13:24:48 2019] EXT4-fs (dm-0): warning: mounting fs with errors, running e2fsck is recommended [...
[17:54:20] Beta-Cluster-Infrastructure, Release-Engineering-Team, DBA: MySQL database on deployment-db03 does not start due to InnoDB issue - https://phabricator.wikimedia.org/T216635 (hashar) I did a fsck again, but I am afraid the partitions are corrupted beyond control :/ head /usr/share/prometheus-node-e...
[17:55:42] Release-Engineering-Team (Kanban), MediaWiki-Authentication-and-authorization, MediaWiki-Core-Testing, MW-1.33-notes (1.33.0-wmf.17; 2019-02-12), and 4 others: wdio browser tests fail locally due to session not being persisted before 2nd stage of login star... - https://phabricator.wikimedia.org/T214471
[18:15:55] Release-Engineering-Team (Kanban), MediaWiki-Authentication-and-authorization, MediaWiki-Core-Testing, MW-1.33-notes (1.33.0-wmf.17; 2019-02-12), and 4 others: wdio browser tests fail locally due to session not being persisted before 2nd stage of login star... - https://phabricator.wikimedia.org/T214471
[18:18:55] hio
[18:19:04] i'm making a new role in mediawiki vagrant
[18:19:09] but it requires nodejs version 10
[18:19:18] and class npm installs version 6
[18:19:28] i want to make it only use 10 if you enable the role
[18:19:48] i can do that with yaml...but it would require manually setting the node version
[18:19:57] i can't do it automatically if someone enables the role
[18:19:59] any suggestions?
[18:45:09] Release-Engineering-Team (Kanban), MediaWiki-Authentication-and-authorization, MediaWiki-Core-Testing, MW-1.33-notes (1.33.0-wmf.17; 2019-02-12), and 4 others: wdio browser tests fail locally due to session not being persisted before 2nd stage of login star... - https://phabricator.wikimedia.org/T214471
[18:54:30] Yippee, build fixed!
[18:54:30] Project beta-scap-eqiad build #238629: FIXED in 0.4 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/238629/
[19:00:55] upstream are making "checkers" in gerrit through a plugin
[19:11:42] (PS1) Mahveotm: Add mahveotm to jekins whitelist (longterm contributor) [integration/config] - https://gerrit.wikimedia.org/r/491831
[19:16:49] paladox: hmm? checkers?
[19:17:08] greg-g yeh, upstream are implementing "first class ci" support
[19:17:19] they are calling this checkers
[19:17:37] greg-g https://groups.google.com/forum/#!topic/repo-discuss/MBYw4wM-7c8
[19:17:51] (PS1) Thcipriani: Revert "scap say: python2/3 compatible" [tools/scap] - https://gerrit.wikimedia.org/r/491832
[19:18:31] they already have the changes up and ready to be reviewed :)
[19:19:29] https://gerrit-review.googlesource.com/q/owner:ekempin%2540google.com
[19:19:37] https://gerrit-review.googlesource.com/q/owner:dborowitz%2540google.com
[19:21:49] (CR) Thcipriani: [C: +2] Revert "scap say: python2/3 compatible" [tools/scap] - https://gerrit.wikimedia.org/r/491832 (owner: Thcipriani)
[19:23:41] Can someone run disc clean-up on integration-slave-docker-1040 ?
[19:25:50] (Merged) jenkins-bot: Revert "scap say: python2/3 compatible" [tools/scap] - https://gerrit.wikimedia.org/r/491832 (owner: Thcipriani)
[19:30:48] Eurgh, and integration-slave-docker-1050 is also failing during phpunit due to lack of space.
[19:31:29] any planned date for fixing beta cluster stuff?
[19:31:43] (CR) jenkins-bot: Revert "scap say: python2/3 compatible" [tools/scap] - https://gerrit.wikimedia.org/r/491832 (owner: Thcipriani)
[19:32:21] James_F: I can take a look. That's strange, we have a job that should be depooling those machines before they run out of space :\
[19:32:46] Yeah, it's been happening a bunch recently.
[19:33:05] (I say as the person who merges the most patches across MW, and so notices most. ;-))
[19:33:39] yšfrč řtřg t
[19:33:51] sorry for the message I sent by mistake
[19:34:58] James_F: hrm, 1050's most full disk has 12GB of space (70%), 1040 is roughly the same (13GB)
[19:35:02] https://integration.wikimedia.org/ci/job/maintenance-disconnect-full-disks/48461/console
[19:35:21] Oh. Odd.
[19:35:28] link to the job?
[19:35:30] In that case I'll tell Krinkle that he's wrong.
[19:35:34] :)
[19:36:51] thcipriani: E.g. https://integration.wikimedia.org/ci/job/quibble-vendor-mysql-hhvm-docker/35974/console – "subprocess.CalledProcessError: Command '['php', 'tests/phpunit/phpunit.php', …]' returned non-zero exit status -11"
[19:38:01] DatGuy: we don't have a crystal ball for that, sorry.
[19:38:14] PHPUnit's status -11 according to what I could glean means "out of disc/memory".
[19:38:15] hrm, that one was on 1043, looks like that one has 15GB of space currently anyway
[19:38:17] ah
[19:38:27] * DatGuy shakes 8-ball furiously
[19:38:31] Yeah, there are a bunch.
[19:38:50] I wonder if we should lower number of executors per machine
[19:43:37] thcipriani you should be able to through the jenkins ui
[19:43:39] (i think)
[19:45:21] yeah, I can, just need to edit nodes, but what I mean is: I wonder if that's what's causing machines to run OOM (i.e.: parallelism) or if there was some test-runner/test change that could account for the change in memory usage.
[19:45:39] !log deployment-db03: restored some old /var/lib/dpkg/status file : sudo zcat /var/backups/dpkg.status.2.gz | sudo tee /var/lib/dpkg/status
[19:45:40] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[19:45:56] !log deployment-db03: restored some old /var/lib/dpkg/status file : sudo zcat /var/backups/dpkg.status.2.gz | sudo tee /var/lib/dpkg/status # T216635
[19:45:58] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[19:45:59] T216635: MySQL database on deployment-db03 does not start due to InnoDB issue - https://phabricator.wikimedia.org/T216635
[20:01:17] why does the world keep exploding constantly ... :(
[20:03:37] Phabricator feature request: Allow me to resolve tickets with "this doesn't spark joy"
[20:03:45] Resolved: Kondo-style
[20:05:25] I like it
[20:07:46] it's not quite declined.. it's not quite resolved.. and it's honest.
[20:18:42] Beta-Cluster-Infrastructure, Release-Engineering-Team, DBA: MySQL database on deployment-db03 does not start due to InnoDB issue - https://phabricator.wikimedia.org/T216635 (hashar) After I have done the fsck the I/O error is gone. It was: Feb 18 13:24:54 mysqld[837]: 190218 13:24:54 [ERROR] InnoDB:...
[20:19:45] Beta-Cluster-Infrastructure, Release-Engineering-Team, DBA: MySQL database on deployment-db03 does not start due to InnoDB issue - https://phabricator.wikimedia.org/T216635 (Addshore) p:Triage→High Marking as High (as I did with the other ticket) as beta is broken until this is fixed afaik.
[20:28:02] Beta-Cluster-Infrastructure, Release-Engineering-Team, DBA: MySQL database on deployment-db03 does not start due to InnoDB issue - https://phabricator.wikimedia.org/T216635 (hashar) Then I am trying: ` name=/etc/my.cnf [mysqld] innodb-force-recovery = 1 ` Then `sudo systemctl start mariadb` which sp...
[20:33:40] (PS1) Smalyshev: WikibaseLexemeCirrusSearch actually doesn't have its own seleium tests now [integration/config] - https://gerrit.wikimedia.org/r/491851
[20:45:28] Beta-Cluster-Infrastructure, Release-Engineering-Team, DBA: MySQL database on deployment-db03 does not start due to InnoDB issue - https://phabricator.wikimedia.org/T216635 (Marostegui) Data looks very corrupted. At this point the best option is to rebuild that host from the slave.
[20:45:49] Beta-Cluster-Infrastructure, Release-Engineering-Team, DBA: MySQL database on deployment-db03 does not start due to InnoDB issue - https://phabricator.wikimedia.org/T216635 (hashar) Without innodb-force-recovery = 1, I get the same P8111 by simply moving enwiki/archive files. So that table definitely...
[20:49:44] Beta-Cluster-Infrastructure: Recover from corrupted beta MySQL slave (deployment-db04) - https://phabricator.wikimedia.org/T216067 (hashar) >>! In T216067#4964127, @Krenair wrote: > And of course I can't find the package from db03/db04 because their apt status is corrupted... I worked around that by restori...
[20:50:22] jdlrobson: haha
[20:53:27] Beta-Cluster-Infrastructure: Recover from corrupted beta MySQL slave (deployment-db04) - https://phabricator.wikimedia.org/T216067 (hashar) >>! In T216067#4961394, @Krenair wrote: > Transfer running the documented way, screen `import` on -db05 and `export` on -db04. > Once this is done I'll probably make dep...
[20:54:29] Beta-Cluster-Infrastructure, Release-Engineering-Team, DBA: MySQL database on deployment-db03 does not start due to InnoDB issue - https://phabricator.wikimedia.org/T216635 (hashar) Open→Stalled Not much to do here. The slave deployment-db04 eventually managed to start mysql and new instances...
[21:52:49] Release-Engineering-Team, Analytics, EventBus, MediaWiki-Core-Testing, and 4 others: Flaky quibble-vendor-mysql-hhvm-docker test in Jenkins - https://phabricator.wikimedia.org/T216069 (hashar) The tests are being run with #quibble which should let us reproduce the failure. Specially if one reuse...
[21:54:19] Project beta-scap-eqiad build #238630: FAILURE in 0.71 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/238630/
[22:06:50] * paladox has finally done this large change https://gerrit-review.googlesource.com/c/gerrit/+/214872 :)
[22:24:41] Yippee, build fixed!
[22:24:41] Project beta-scap-eqiad build #238631: FIXED in 10 min: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/238631/
[22:44:04] Release-Engineering-Team (Kanban), MW-1.32-notes, MW-1.32-release: Release 1.32.1 as a maintenance release - https://phabricator.wikimedia.org/T213595 (greg) a:mmodell
[22:48:29] Release-Engineering-Team (Kanban), MW-1.32-notes, MW-1.32-release: Release 1.32.1 as a maintenance release - https://phabricator.wikimedia.org/T213595 (greg) ` greg@x1 ~/Downloads % gpg --fetch-keys "https://www.mediawiki.org/keys/keys.txt" gpg: requesting key from 'https://www.mediawiki.org/keys/ke...
[23:04:14] Release-Engineering-Team, Discovery-Search: CI fails all tests to CirrusSearch REL1_32 branch - https://phabricator.wikimedia.org/T216663 (EBernhardson)
[23:10:35] Project beta-scap-eqiad build #238636: FAILURE in 1.4 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/238636/
[23:11:36] Release-Engineering-Team, Discovery-Search: CI fails all tests to CirrusSearch REL1_32 branch - https://phabricator.wikimedia.org/T216663 (Krinkle) Might be related/duplicate: {T189560}
[23:24:27] Yippee, build fixed!
[23:24:28] Project beta-scap-eqiad build #238637: FIXED in 10 min: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/238637/