[00:25:45] (03CR) 1020after4: "I'll test this locally and report back..." [releng/phatality] - 10https://gerrit.wikimedia.org/r/668202 (https://phabricator.wikimedia.org/T237682) (owner: 10Krinkle) [01:11:45] 10Continuous-Integration-Config, 10LuaSandbox: Run LuaSandbox tests against PHP with ZTS - https://phabricator.wikimedia.org/T206591 (10tstarling) We need to do some manual testing before Excimer is released, like a high-concurrency multi-threaded stress test, since Remi is questioning whether it will work eve... [02:59:56] (03PS1) 10Jeena Huneidi: PipelineRunner: allowedCredentials [integration/pipelinelib] - 10https://gerrit.wikimedia.org/r/668245 (https://phabricator.wikimedia.org/T269902) [03:17:17] (03CR) 10Jeena Huneidi: [C: 04-1] "Need to throw exception if the binding isn't in our list as well as update PipelineStage.groovy" [integration/pipelinelib] - 10https://gerrit.wikimedia.org/r/668245 (https://phabricator.wikimedia.org/T269902) (owner: 10Jeena Huneidi) [04:31:12] 10Continuous-Integration-Config, 10LuaSandbox: Run LuaSandbox tests against PHP with ZTS - https://phabricator.wikimedia.org/T206591 (10tstarling) I did that manual testing, it seems to work. Sorry for the noise on a mostly irrelevant task. 
[06:33:17] !log create Buster VM deployment-mwlog01 to eventually replace deployment-fluorine02 which is still on Stretch [06:33:19] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [07:22:05] 10Continuous-Integration-Config: CI should validate that the pecl tarball contains all the necessary files to build the extension - https://phabricator.wikimedia.org/T276417 (10Legoktm) [07:33:34] (03PS1) 10Legoktm: Have php*-compile images test pecl installation [integration/config] - 10https://gerrit.wikimedia.org/r/668259 (https://phabricator.wikimedia.org/T276417) [07:45:06] (03CR) 10Legoktm: [C: 03+2] Have php*-compile images test pecl installation [integration/config] - 10https://gerrit.wikimedia.org/r/668259 (https://phabricator.wikimedia.org/T276417) (owner: 10Legoktm) [07:46:15] (03Merged) 10jenkins-bot: Have php*-compile images test pecl installation [integration/config] - 10https://gerrit.wikimedia.org/r/668259 (https://phabricator.wikimedia.org/T276417) (owner: 10Legoktm) [07:47:41] !log rebuilding php*-compile images https://gerrit.wikimedia.org/r/668259 [07:47:43] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [07:49:16] (03PS1) 10Legoktm: jjb: Bump php*-compile images [integration/config] - 10https://gerrit.wikimedia.org/r/668335 [07:55:30] 10Beta-Cluster-Infrastructure, 10User-Majavah: Replace deployment-fluorine02 with a Buster host - https://phabricator.wikimedia.org/T276419 (10Majavah) p:05Triage→03Medium [07:55:46] 10Beta-Cluster-Infrastructure, 10User-Majavah: Replace deployment-fluorine02 with a Buster host - https://phabricator.wikimedia.org/T276419 (10Majavah) [07:55:49] 10Beta-Cluster-Infrastructure, 10Cloud-VPS (Debian Jessie Deprecation): Migrate deployment-prep away from Debian Jessie to Debian Stretch/Buster - https://phabricator.wikimedia.org/T218729 (10Majavah) [08:01:54] 10Beta-Cluster-Infrastructure, 10User-Majavah: Replace deployment-fluorine02 with a Buster host - 
https://phabricator.wikimedia.org/T276419 (10Majavah) I created deployment-mwlog01 as a Buster host. fluorine02 is a disk80 host but the log partition is fairly small: ` /dev/mapper/vd-second--local--disk 60G 6... [08:06:40] 10Beta-Cluster-Infrastructure, 10User-Majavah: Replace deployment-fluorine02 with a Buster host - https://phabricator.wikimedia.org/T276419 (10Legoktm) >>! In T276419#6881654, @Majavah wrote: > I tried to apply the udp2log role to the new VM but it's failing because udplog package is not available for Buster y... [08:11:29] 10Beta-Cluster-Infrastructure, 10User-Majavah: Replace deployment-fluorine02 with a Buster host - https://phabricator.wikimedia.org/T276419 (10Majavah) >>! In T276419#6881673, @Legoktm wrote: >>>! In T276419#6881654, @Majavah wrote: > > Please file a dedicated task for this and tag #SRE, no reason it can't be... [08:12:17] 10Gerrit: Unarchive analytics/udplog repository - https://phabricator.wikimedia.org/T276422 (10Legoktm) [08:12:35] 10Gerrit: Unarchive analytics/udplog repository - https://phabricator.wikimedia.org/T276422 (10Legoktm) [08:12:56] 10Beta-Cluster-Infrastructure, 10User-Majavah: Replace deployment-fluorine02 with a Buster host - https://phabricator.wikimedia.org/T276419 (10Legoktm) [08:15:25] (03CR) 10Awight: Work around python2 in tox container (032 comments) [integration/quibble] - 10https://gerrit.wikimedia.org/r/668176 (https://phabricator.wikimedia.org/T276384) (owner: 10Awight) [08:25:45] Amir1: do you happen to know if lists.beta.wmflabs.org DNS record is needed? it points to an unassigned floating IP and you have a separate project for the mailman3 upgrade [08:38:19] Majavah: have you ever tried Debian packaging before? :) [08:38:39] legoktm: not yet, if that is what you mean [08:41:44] if you're interested I can send you some resources on how to get started, but no pressure :) [08:42:22] I can try :P [08:42:31] does it need a full redo or just some updates? 
[08:43:24] updates really [08:43:56] I just tried and it technically does build on buster but it's using a ton of deprecated stuff [08:46:51] Majavah: the main thing is to switch debian/rules from using the old verbose method to a simpler "dh" method [08:46:55] see https://www.debian.org/doc/manuals/debmake-doc/ch05.en.html#dh [08:47:23] I think you could try swapping in the "Simple debian/rules" it gives there and it might just work [08:48:23] install -m644 udp2log/udp2log.conf.default $(CURDIR)/debian/udplog/etc/udp2log <-- that step should be done by creating a debian/install file with the target and destination, see https://manpages.debian.org/buster/debhelper/dh_install.1.en.html#FILES [08:50:26] https://manpages.debian.org/buster/git-buildpackage/git-buildpackage.1.en.html is the recommended tool to actually build the package [08:50:39] so debian/install should look something like "udp2log/udp2log.conf.default etc/udp2log"? [08:53:08] yep [08:55:54] 10Continuous-Integration-Config: CI should validate that the pecl tarball contains all the necessary files to build the extension - https://phabricator.wikimedia.org/T276417 (10Legoktm) ` 00:52:26 + pecl install package.xml 00:52:26 Cannot install, php_dir for channel "pecl.php.net" is not writeable by the curre... [08:58:22] it fails to compile because I don't have otto@wm.o secret key [08:58:29] I guess I have to update the changelog as well [08:59:18] Run `dch` for that [08:59:33] and there's a flag to disable auto signing, let me find it [08:59:49] Majavah: what command did you run btw to build it? 
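[editor's sketch] The packaging advice above comes down to two small files: a dh-style debian/rules, and a debian/install that replaces the hand-written `install -m644 ...` step. The Python below only writes those two files into a scratch directory so their exact contents are visible. The helper name `write_packaging_files` is hypothetical, the rules body is the generic "simple debian/rules" pattern rather than the real udplog packaging, and the debian/install line is the one confirmed with "yep" in the chat.

```python
import os
import stat
import tempfile

# Generic dh-style rules file. The recipe line must start with a tab.
# This is the common minimal pattern, not the actual udplog debian/rules.
DEBIAN_RULES = "#!/usr/bin/make -f\n%:\n\tdh $@\n"

# One "source destination-directory" pair per line, as quoted in the chat.
DEBIAN_INSTALL = "udp2log/udp2log.conf.default etc/udp2log\n"


def write_packaging_files(source_root):
    """Write debian/rules and debian/install under source_root (hypothetical helper)."""
    debian_dir = os.path.join(source_root, "debian")
    os.makedirs(debian_dir, exist_ok=True)

    rules_path = os.path.join(debian_dir, "rules")
    with open(rules_path, "w") as handle:
        handle.write(DEBIAN_RULES)
    # debian/rules is executed as a makefile script, so mark it executable.
    mode = os.stat(rules_path).st_mode
    os.chmod(rules_path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)

    install_path = os.path.join(debian_dir, "install")
    with open(install_path, "w") as handle:
        handle.write(DEBIAN_INSTALL)

    return rules_path, install_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        for path in write_packaging_files(scratch):
            with open(path) as handle:
                print(path)
                print(handle.read())
```

Whether the simple rules file is enough for udplog is exactly what the conversation leaves open ("it might just work"), so treat this as a starting point only.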
[09:00:08] legoktm: gbp buildpackage [09:00:18] (signing is useless until you actually want to publish it somewhere, but for this, SRE will do that step) [09:01:34] try uh, gbp buildpackage -- --no-sign [09:02:15] it works just fine after adding the changelog entry, since my laptop has a key for my email [09:03:55] it may be working, not sure where it put the deb file [09:04:25] hashar: Where is the magic that suppresses tox integration tests? Is something passed in the environment? [09:04:55] awight: GOOOD MORNING AMERICA^GERMANY [09:05:02] 10Continuous-Integration-Config, 10LuaSandbox: Run LuaSandbox tests against PHP with ZTS - https://phabricator.wikimedia.org/T206591 (10hashar) I never ever heard about Zend Thread Safety before, that seems to be an optional compile flag which would require us to build Debian packages specially for this. It is... [09:05:07] so hm [09:05:12] hashar: hehe sorry for the 24/7 dev cycles [09:05:24] that tox running with python2 ending up selecting the wrong python for some testenv was a super fun investigation last night [09:05:26] like [09:05:35] Majavah: usually in the parent directory [09:05:44] it was late, I should have gone to bed but that definitely got me interested. Most importantly cause I BROKE IT IN THE FIRST PLACE [09:05:59] lol [09:06:06] it was too late for me to find the actual root cause. Most probably it has always been broken somehow maybe [09:06:17] or my change to Quibble tox.ini and/or the update of tox broke it [09:06:18] but [09:06:26] legoktm: it says it put it there, "dpkg-deb: building package 'udplog-dbgsym' in '../udplog-dbgsym_1.8-6~buster1_amd64.deb'." I just don't see it there [09:06:30] I found the actual (poorly named) config that makes it possible [09:06:46] Majavah: paste the full log? [09:07:07] Majavah: legoktm: theoretically our CI should be able to build a Debian package for you when a patch is prepared if that can help. 
[09:07:34] (03PS1) 10Awight: Don't include python2 in the tox-buster image [integration/config] - 10https://gerrit.wikimedia.org/r/668342 (https://phabricator.wikimedia.org/T276384) [09:07:36] hashar: This is my preferred fix btw ^ [09:07:37] legoktm: https://phabricator.wikimedia.org/P14623 [09:07:38] hashar: I cc'd you on https://phabricator.wikimedia.org/T276422 ;-) [09:07:45] awight: the 'integration' testenv I have added it to only run tests flagged with 'integration' keyword cause they spawn HHVM / php. That was helpful when I coded some stuff for php/hhvm [09:08:33] awight: the selection is done via [tox:jenkins] envlist = x,y,z which does not contain "integration" that overides the default [tox] envlist = x,y,z,integration which one get on their local machine (ie when JENKINS_URL env variable is not set) [09:08:48] hashar: That's how it behaves in CI, but I'm confused because when I run the container locally it seems to default to running all test envs listed in envlist, which includes `integration`. [09:08:53] oooh [09:08:54] ty [09:09:09] * awight files magic in notes.txt [09:09:35] awight: and we need python 2.7. We still have some repositories supporting py2.7 for historical purpose. Pretty sure pywikibot does, and the tests for integration/config still do because of Zuul (which is python2 only) [09:09:59] pywikibot dropped 2.7 a while back [09:10:06] puppet still needs 2.7 though [09:10:35] Majavah: uh...try looking for it somewhere in your home dir? :p [09:11:00] I'm not sure, I thought the default was to put it one directory above the git repo [09:11:09] (that's what happens when I run it) [09:11:12] hashar: Okay, I'll splice your suggested tweaks into tox.ini in that case. [09:11:57] legoktm: found it, it was in ../build-area for some reason [09:11:58] https://debmonitor.wikimedia.org/packages/udplog version "1.8-5~jessie1" huhuu [09:12:18] what now? just send the patch in gerrit, or something else? should I test the new package somehow? how? 
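[editor's sketch] The envlist selection hashar describes at 09:08:33 can be modelled in a few lines of Python. This is a simplified model of the rule, not tox's own code: it assumes that a `[tox:jenkins]` section replaces the `[tox]` envlist only when `JENKINS_URL` is present in the environment. The environment names `x`, `y`, `z` are the placeholders from the chat, and `selected_envs` is a hypothetical helper name.

```python
import configparser

# Placeholder tox.ini mirroring the shape described in the chat.
TOX_INI = """
[tox]
envlist = x,y,z,integration

[tox:jenkins]
envlist = x,y,z
"""


def selected_envs(ini_text, environ):
    """Return the envlist a run would use under this simplified model."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)

    section = "tox"
    # On CI (JENKINS_URL set), the [tox:jenkins] envlist takes precedence.
    if "JENKINS_URL" in environ and parser.has_option("tox:jenkins", "envlist"):
        section = "tox:jenkins"

    raw = parser.get(section, "envlist")
    # envlist entries may be separated by commas or newlines.
    return [name.strip() for name in raw.replace("\n", ",").split(",") if name.strip()]


if __name__ == "__main__":
    print(selected_envs(TOX_INI, {}))
    print(selected_envs(TOX_INI, {"JENKINS_URL": "https://ci.example.invalid/"}))
```

Under this model a local run of the container, where `JENKINS_URL` is unset, includes `integration`, which matches what awight observed.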
[09:13:12] legoktm: Majavah: why do you spend time to migrate udplog to Buster anyway? It is supposed to be phased out entirely and hence why it is still running jessie [09:14:07] eh? [09:14:09] https://phabricator.wikimedia.org/T224565 [09:14:10] hashar: {{citation needed}}, https://phabricator.wikimedia.org/T224565 states "update udp2log servers to Buster" [09:14:19] yeah [09:14:26] I am challenging the usefulness of that task though [09:15:01] my understanding is that udp2log is to be decommissioned entirely [09:15:16] I'm not aware of any such plans [09:15:31] in any case, the push to get off of jessie is happening before that [09:16:45] well make sure to raise that to observability at least [09:16:51] Majavah: we need to unarchive the repository first. I'm not exactly sure how to test the individual binaries, I think if they build properly, and work in beta, then it's probably good enough. The code is mostly boost so I'd expect it to be pretty stable [09:17:14] cause we still have to maintain an obsolete stack that is a decade+ years old when it should have been replaced by now [09:17:32] that adds to the stuff we have to maintain and it is a pain overall (yeah for #technical-debt ..) 
[09:18:15] I don't know what you're referring to as "obsolete" [09:18:38] 10Continuous-Integration-Config, 10Gerrit: Unarchive analytics/udplog repository - https://phabricator.wikimedia.org/T276422 (10hashar) [09:19:14] (03PS2) 10Awight: Work around python2 in tox container [integration/quibble] - 10https://gerrit.wikimedia.org/r/668176 (https://phabricator.wikimedia.org/T276384) [09:19:16] Majavah: ahhh, looks like moritzm already did all the packaging work :| sorry, but at least you're unblocked now [09:19:25] lol oops :D [09:19:36] I'll see what happens on that deployment-prep VM [09:20:15] !log Restored analytics/udp2log cause it got to be packaged for Buster # T276422 T180301 [09:20:19] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [09:20:19] T276422: Unarchive analytics/udplog repository - https://phabricator.wikimedia.org/T276422 [09:20:19] T180301: Add CI to all analytics/* repositories and archive obsolete ones - https://phabricator.wikimedia.org/T180301 [09:20:33] 10Continuous-Integration-Infrastructure, 10Patch-For-Review, 10ci-test-error: CI tox-buster image failing with python 2 errors - https://phabricator.wikimedia.org/T276384 (10awight) Thanks for finding the fix! Latest patch includes your changes, and local `tox-buster:0.3.0` tests succeed. 
[09:22:32] (03PS1) 10Hashar: analytics/udplog: add non voting debian glue job [integration/config] - 10https://gerrit.wikimedia.org/r/668347 (https://phabricator.wikimedia.org/T276422) [09:22:54] 10Continuous-Integration-Config, 10Gerrit, 10Patch-For-Review: Unarchive analytics/udplog repository - https://phabricator.wikimedia.org/T276422 (10hashar) a:03hashar [09:23:34] (03CR) 10jerkins-bot: [V: 04-1] analytics/udplog: add non voting debian glue job [integration/config] - 10https://gerrit.wikimedia.org/r/668347 (https://phabricator.wikimedia.org/T276422) (owner: 10Hashar) [09:24:04] thanks hashar [09:24:35] Majavah: anyways, if you do want to learn more packaging stuff, happy to teach/advise :) [09:24:46] there's still plenty to do on that package [09:25:07] Moritz only did the boost fix, so your change will still be useful [09:25:11] * legoktm -> zzz [09:25:25] sure, thanks for your help! [09:25:42] (03CR) 10Hashar: "This will build the Debian package and ignore the result (always vote Verified +1)." [integration/config] - 10https://gerrit.wikimedia.org/r/668347 (https://phabricator.wikimedia.org/T276422) (owner: 10Hashar) [09:25:55] damn [09:25:56] ci fails [09:26:21] test_projects_have_pipeline_gate_and_submit [09:26:22] pff [09:27:24] (03PS2) 10Hashar: analytics/udplog: add non voting debian glue job [integration/config] - 10https://gerrit.wikimedia.org/r/668347 (https://phabricator.wikimedia.org/T276422) [09:30:13] Majavah: ^ that would cause CI to build the debian package for you :] [09:30:18] when I deploy it [09:30:26] thank you! 
[09:31:33] 10Continuous-Integration-Config, 10Release-Engineering-Team (Unit & Int & System Tooling), 10Release-Engineering-Team-TODO, 10MediaWiki-Core-Testing, and 6 others: Reduce runtime of MW shared gate Jenkins jobs to 5 min - https://phabricator.wikimedia.org/T225730 (10awight) [09:31:56] Majavah: and when the job does pass, we will need another CI change to use the voting job (which will vote verified -1 on failure) and add that job to run when someone CR+2 (aka add it to gate-and-submit) [09:32:02] (03CR) 10Hashar: [C: 03+2] analytics/udplog: add non voting debian glue job [integration/config] - 10https://gerrit.wikimedia.org/r/668347 (https://phabricator.wikimedia.org/T276422) (owner: 10Hashar) [09:34:03] (03Merged) 10jenkins-bot: analytics/udplog: add non voting debian glue job [integration/config] - 10https://gerrit.wikimedia.org/r/668347 (https://phabricator.wikimedia.org/T276422) (owner: 10Hashar) [09:36:06] 10Continuous-Integration-Config, 10Gerrit, 10Patch-For-Review: Unarchive analytics/udplog repository - https://phabricator.wikimedia.org/T276422 (10hashar) I have removed the read only flag from the Gerrit repository and added a CI job to build the Debian package (although in non voting mode it would thus no... [09:36:44] Majavah: so yeah just send a change to analytics/udplog with an update of the debian/changelog to point to Buster and you will get the CI job building the deb package for you (which might just fail :D ) [09:37:51] 10Continuous-Integration-Config, 10Gerrit, 10Patch-For-Review: Unarchive analytics/udplog repository - https://phabricator.wikimedia.org/T276422 (10hashar) 05Open→03Stalled This is pending port of udplog packaging to Buster and then we can make CI to enforce successful build. [09:39:27] (03CR) 10Hashar: [C: 04-1] "We definitely still need python2.7 in the image. 
There is at least integration/config still using it and I guess some other repositories m" [integration/config] - 10https://gerrit.wikimedia.org/r/668342 (https://phabricator.wikimedia.org/T276384) (owner: 10Awight) [09:39:50] awight: so yeah essentially: change the CI image to install tox using pip3 / python3 [09:40:15] awight: and for Quibble tox.ini we can fix using [tox] ignore_basepython_conflict = True [testenv] basepython = python3 [09:44:43] 10Continuous-Integration-Config: Introduce non-voting jobs with quibble+apache - https://phabricator.wikimedia.org/T276428 (10awight) [09:45:52] 10Continuous-Integration-Config: Introduce non-voting jobs with quibble+apache - https://phabricator.wikimedia.org/T276428 (10awight) [09:47:14] (03Abandoned) 10Hashar: Remove trigger- jobs [integration/config] - 10https://gerrit.wikimedia.org/r/666353 (https://phabricator.wikimedia.org/T271107) (owner: 10Hashar) [09:47:19] (03Abandoned) 10Hashar: Test direct triggering of a Pipeline job [integration/config] - 10https://gerrit.wikimedia.org/r/666352 (https://phabricator.wikimedia.org/T271107) (owner: 10Hashar) [09:47:57] (03PS21) 10Daimona Eaytoy: dockerfiles: coverage: add pcov, use it if we're on PHPUnit 8+ [integration/config] - 10https://gerrit.wikimedia.org/r/567938 (https://phabricator.wikimedia.org/T234020) [09:48:08] 10VPS-project-Codesearch, 10Wikidata, 10Wikidata Query UI: Add wikidata/query/gui to codesearch - https://phabricator.wikimedia.org/T276429 (10Addshore) [09:48:18] 10VPS-project-Codesearch, 10Wikidata, 10Wikidata Query UI: Add wikidata/query/gui to codesearch - https://phabricator.wikimedia.org/T276429 (10Addshore) [09:49:58] (03PS22) 10Daimona Eaytoy: dockerfiles: coverage: add pcov, use it if we're on PHPUnit 8+ [integration/config] - 10https://gerrit.wikimedia.org/r/567938 (https://phabricator.wikimedia.org/T234020) [09:51:54] 10Continuous-Integration-Infrastructure, 10Release-Engineering-Team (Pipeline), 10Release-Engineering-Team-TODO, 10Release 
Pipeline, 10Patch-For-Review: Have PipelineBot link directly to build console output on test - https://phabricator.wikimedia.org/T271107 (10kostajh) >>! In T271107#6881927, @gerritbo... [09:51:54] (03CR) 10Kosta Harlan: "Do you want to link this with T209149?" [tools/codesniffer] - 10https://gerrit.wikimedia.org/r/647594 (owner: 10Gergő Tisza) [09:52:52] 10Continuous-Integration-Config: composer-package-php72-docker runs with xdebug enabled - https://phabricator.wikimedia.org/T269489 (10Daimona) It seems like xdebug was enabled for more jobs. As a case in point, all phan jobs are now running with xdebug enabled, see [[https://integration.wikimedia.org/ci/job/mwe... [09:53:49] (03CR) 10jerkins-bot: [V: 04-1] Work around python2 in tox container [integration/quibble] - 10https://gerrit.wikimedia.org/r/668176 (https://phabricator.wikimedia.org/T276384) (owner: 10Awight) [10:00:23] 10VPS-project-Codesearch, 10Wikidata, 10Wikidata Query UI: Add wikidata/query/gui to codesearch - https://phabricator.wikimedia.org/T276429 (10Addshore) [10:02:40] 10phabricator maintenance bot: Remove #Patch-For-Review when patch is abandoned in Gerrit - https://phabricator.wikimedia.org/T276390 (10Aklapper) [10:07:00] 10VPS-project-Codesearch, 10Wikidata, 10Wikidata Query UI: Add wikidata/query/gui to codesearch - https://phabricator.wikimedia.org/T276429 (10Addshore) I'm not sure which group this would best live in? [10:08:00] 10Continuous-Integration-Infrastructure, 10Release-Engineering-Team (Pipeline), 10Release-Engineering-Team-TODO, 10Release Pipeline, 10Patch-For-Review: Have PipelineBot link directly to build console output on test - https://phabricator.wikimedia.org/T271107 (10hashar) The Pipeline jobs are instances of... [10:09:49] 10VPS-project-Codesearch, 10Wikidata, 10Wikidata Query UI: Add wikidata/query/gui to codesearch - https://phabricator.wikimedia.org/T276429 (10Lucas_Werkmeister_WMDE) At least “Wikimedia deployed”, I guess. 
wikidata/query/rdf would also be good to include, by the way. [10:10:35] 10Beta-Cluster-Infrastructure, 10Patch-For-Review, 10User-Majavah: Replace deployment-fluorine02 with a Buster host - https://phabricator.wikimedia.org/T276419 (10Majavah) Now that `udplog` is available on Buster, next I'll try live hacking https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/66833... [10:24:57] (03PS9) 10Gergő Tisza: Add Gerrit report format [tools/codesniffer] - 10https://gerrit.wikimedia.org/r/647594 (https://phabricator.wikimedia.org/T209149) [10:25:24] (03PS11) 10Gergő Tisza: Add fix reporting to Gerrit robot comment reporter [tools/codesniffer] - 10https://gerrit.wikimedia.org/r/647595 (https://phabricator.wikimedia.org/T209149) [10:26:48] 10VPS-project-Codesearch, 10Wikidata, 10Wikidata Query UI, 10Patch-For-Review, 10User-Addshore: Add wikidata/query/gui to codesearch - https://phabricator.wikimedia.org/T276429 (10Addshore) a:03Addshore [10:38:28] (03PS2) 10Awight: Install tox with python3 [integration/config] - 10https://gerrit.wikimedia.org/r/668342 (https://phabricator.wikimedia.org/T276384) [10:39:26] (03Abandoned) 10Awight: Work around python2 in tox container [integration/quibble] - 10https://gerrit.wikimedia.org/r/668176 (https://phabricator.wikimedia.org/T276384) (owner: 10Awight) [10:48:55] hashar: Good news, that minor container tweak fixes the issue with no changes to tox.ini. 
[10:51:17] !log stop bogus service udp2log on deployment-mwlog01, no idea what it is but it was using the same port as udp2log-mw.service is [10:51:21] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [11:02:21] !log live hacking https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/668338/ on deployment-deploy01 to test new deployment-mwlog01 ref T276419 [11:02:24] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [11:02:24] T276419: Replace deployment-fluorine02 with a Buster host - https://phabricator.wikimedia.org/T276419 [11:12:04] awight: dropping python2 ? [11:12:12] not sure I can qualify that has minor hehe [11:12:22] s/has/as/ ;] [11:12:41] I think the right fix is to have the CI images install tox using python3 [11:12:49] so it would have a sane basepython [11:13:01] that would break any repository expecting the default to be python2 [11:13:12] but they can either formally phase out python2 [11:13:19] or use basepython python3 [11:13:39] and that will save us from having to add a dirty hack to quibble tox.ini or other repo that would suffer from that [11:13:48] I should write that on the task really but need a coffee [11:13:52] one sure thing [11:13:57] it is a good finding Adam! [11:23:05] 10Beta-Cluster-Infrastructure, 10Patch-For-Review, 10User-Majavah: Replace deployment-fluorine02 with a Buster host - https://phabricator.wikimedia.org/T276419 (10Majavah) Live hacks reverted. I confirmed that mwlog01 receives events and logs are still flowing to logstash-beta.wmflabs.org. Scheduled the medi... [11:26:57] hashar: check the integration-config patch again! 
Now it's just s/pip/pip3/g [11:27:02] (03PS3) 10Pwirth: Remove selenium from BlueSpiceFlaggedRevsConnector [integration/config] - 10https://gerrit.wikimedia.org/r/668159 [11:31:26] 10Continuous-Integration-Config, 10Release-Engineering-Team (CI & Testing services), 10User-zeljkofilipin: Upgrade all CI jobs for WMF-deployed projects from Node 10 to Node 14 LTS - https://phabricator.wikimedia.org/T267890 (10zeljkofilipin) [11:33:36] 10Continuous-Integration-Infrastructure, 10Patch-For-Review, 10ci-test-error: CI tox-buster image failing with python 2 errors - https://phabricator.wikimedia.org/T276384 (10hashar) The `ignore_basepython_conflict` setting I found out yesterday night is a great hack, then it is really just a hack an any repo... [11:34:07] awight: yes definitely ) [11:34:28] awight: though we should only use pip3 for the tox-buster images. The tox one has to be phased out eventually [11:34:59] (03CR) 10Hashar: [C: 03+2] Remove selenium from BlueSpiceFlaggedRevsConnector [integration/config] - 10https://gerrit.wikimedia.org/r/668159 (owner: 10Pwirth) [11:34:59] hashar: FWIW, I ran the resulting image on master quibble and it passes :-) [11:35:12] !!!!!!!!!!!!!!!!!!!!! [11:35:20] Yeah I see what you mean, this will probably break jobs which rely on py2 now. [11:35:21] awight: without any tweak to tox.ini right? [11:35:35] +1 [11:35:57] I guess that is why volans is using environment names explicitly listing py3 [11:36:15] in operations/software/spicerack or operations/software/cumin [11:36:39] (03Merged) 10jenkins-bot: Remove selenium from BlueSpiceFlaggedRevsConnector [integration/config] - 10https://gerrit.wikimedia.org/r/668159 (owner: 10Pwirth) [11:36:47] ah well it'll be nice to trim all that cruft away [11:37:03] awight: do you have docker-pkg installed ? 
[11:37:20] cause in order for an image to be rebuild, it must have a new entry in the ./changelog file [11:37:46] and docker-pkg update command lets one update the changelog for the affected image as well as for all child images that inherits from it [11:37:53] crafting changelog entries for each descendant [11:38:42] (03CR) 10Hashar: "Deployed" [integration/config] - 10https://gerrit.wikimedia.org/r/668159 (owner: 10Pwirth) [11:39:33] hashar: ooh that's right, I just docker rm the image locally and rebuilt the same one. I'll bump versions now... [11:39:56] awight: and docker-pkg should take care of crafting the changelog bumps for all the other tox-buster based images ;] [11:39:59] I am off for lunch [11:40:03] familly waiting for me [11:40:16] will rebuild the fleet of images this afternoon and switch the job [11:40:26] then I guess announce it [11:42:14] docker-pkg gives bad usage help for `update` btw, it's missing the final DIRECTORY arg. [11:44:53] (03PS3) 10Awight: Install tox with python3 [integration/config] - 10https://gerrit.wikimedia.org/r/668342 (https://phabricator.wikimedia.org/T276384) [11:45:09] thanks for the tip, otherwise I would have hunted down all those dependencies by hand! [12:06:10] !log deployment-prep Delete lists.beta.wmflabs.org DNS record, points to an unassigned floating IP and not used according to Amir [12:06:13] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [12:19:14] !log Beta cluster is now using deployment-mwlog01 instead of deployment-fluorine02 for MediaWiki logs. 
fluorine02 is still used for some other misc services, these will be migrated soon [12:19:16] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [12:23:53] 10Phabricator: Request Edit permissions to the "SRE Access Request" form - https://phabricator.wikimedia.org/T276446 (10jbond) p:05Triage→03Medium [12:24:14] 10Phabricator: Request Edit permissions to the "SRE Access Request" form - https://phabricator.wikimedia.org/T276446 (10jbond) [12:24:45] 10Phabricator: Request Edit permissions to the "SRE Access Request" form - https://phabricator.wikimedia.org/T276446 (10jbond) [12:34:09] 10Phabricator: Unhandled exception when trying to use the Next button of the Notifications page - https://phabricator.wikimedia.org/T276447 (10Agabi10) [12:38:10] !log `git rebase origin/production` on deployment-puppetmaster04 to update few settings for T276419 [12:38:14] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [12:38:14] T276419: Replace deployment-fluorine02 with a Buster host - https://phabricator.wikimedia.org/T276419 [12:43:30] awight: yeah docker-pkg has some annoyances ;D [12:48:06] (03PS4) 10Hashar: Install tox with python3 [integration/config] - 10https://gerrit.wikimedia.org/r/668342 (https://phabricator.wikimedia.org/T276384) (owner: 10Awight) [12:49:36] (03PS5) 10Hashar: Install tox with python3 [integration/config] - 10https://gerrit.wikimedia.org/r/668342 (https://phabricator.wikimedia.org/T276384) (owner: 10Awight) [12:50:07] awight: I have made some tweaks to the changelog version ( https://gerrit.wikimedia.org/r/c/integration/config/+/668342/3..5 ) but yeah +2 ing it [12:50:33] (03CR) 10Hashar: [C: 03+2] "Excellent :) I have just altered the changelog entries to create a minor version bump rather than a -sX bump." 
[integration/config] - 10https://gerrit.wikimedia.org/r/668342 (https://phabricator.wikimedia.org/T276384) (owner: 10Awight) [12:51:00] will rebuild the fleet of image then update the ci jobs [12:51:53] (03Merged) 10jenkins-bot: Install tox with python3 [integration/config] - 10https://gerrit.wikimedia.org/r/668342 (https://phabricator.wikimedia.org/T276384) (owner: 10Awight) [13:18:14] !log shutdown deployment-fluorine02 for a scream test for T276419, I believe everything has been moved to deployment-mwlog01 [13:18:17] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [13:18:18] T276419: Replace deployment-fluorine02 with a Buster host - https://phabricator.wikimedia.org/T276419 [13:38:24] 10Continuous-Integration-Infrastructure, 10Patch-For-Review, 10ci-test-error: CI tox-buster image failing with python 2 errors - https://phabricator.wikimedia.org/T276384 (10hashar) Successfully published image docker-registry.discovery.wmnet/releng/tox-pywikibot:0.7.0 Successfully published image docker-reg... [13:44:25] (03PS1) 10Hashar: jjb: use python3 tox for quibble repository [integration/config] - 10https://gerrit.wikimedia.org/r/668424 (https://phabricator.wikimedia.org/T276384) [13:45:07] (03CR) 10Hashar: [C: 03+2] "Job updated so we can verify on integration/quibble.git ;)" [integration/config] - 10https://gerrit.wikimedia.org/r/668424 (https://phabricator.wikimedia.org/T276384) (owner: 10Hashar) [13:46:29] (03Merged) 10jenkins-bot: jjb: use python3 tox for quibble repository [integration/config] - 10https://gerrit.wikimedia.org/r/668424 (https://phabricator.wikimedia.org/T276384) (owner: 10Hashar) [13:48:52] awight: should be good for integration/quibble now [13:49:18] hashar: In case I do that again, is there a flag to tell docker-pkg to create the minor version bump? I saw --version but couldn't figure how that would apply to dependent images. 
[13:49:31] nop :- [13:49:31] ( [13:49:45] the update feature was originally written for security updates [13:49:54] so one would docker-pkg update whatever is the base image [13:50:09] and that would trigger a dummy -s### security update for all containers [13:50:23] guess we can look at adding a flag to bump major/minor/patchset instead [13:50:38] * awight swings lasso around to catch docker-pkg on my next python spree... [13:52:34] the whole stack is inherently broken anyway so [13:54:06] :-D [13:55:00] btw, the quibble integration tests never pass for me, I need to add postgres instructions to the readme. [13:56:41] (03PS37) 10Awight: Parallelism as a command object [integration/quibble] - 10https://gerrit.wikimedia.org/r/587885 (https://phabricator.wikimedia.org/T235449) [13:56:43] (03PS12) 10Awight: Split extension and skin npm and composer tests [integration/quibble] - 10https://gerrit.wikimedia.org/r/587888 [13:56:45] (03PS5) 10Awight: Split core npm and composer tests [integration/quibble] - 10https://gerrit.wikimedia.org/r/588087 [13:56:53] 10Release-Engineering-Team-TODO (2021-01-01 to 2021-03-31 (Q3)), 10Patch-For-Review, 10Release, 10Train Deployments: 1.36.0-wmf.33 deployment blockers - https://phabricator.wikimedia.org/T274937 (10LarsWirzenius) {T276316} has a workaround in this train, but bug is kept open so it's not forgotten. It's a b... [13:57:03] 10Release-Engineering-Team-TODO (2021-01-01 to 2021-03-31 (Q3)), 10Patch-For-Review, 10Release, 10Train Deployments: 1.36.0-wmf.33 deployment blockers - https://phabricator.wikimedia.org/T274937 (10LarsWirzenius) [13:58:04] 10VPS-project-Codesearch, 10Wikidata, 10Wikidata Query UI, 10Patch-For-Review, 10User-Addshore: Add wikidata/query/gui to codesearch - https://phabricator.wikimedia.org/T276429 (10Ladsgroup) It'll be there in 24 hours. 
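[editor's sketch] The two kinds of changelog bump discussed between 13:49 and 13:50 (the automatic "-sX" security bump versus the minor version bump hashar applied by hand) differ only in how the next version string is derived. The Python below illustrates that difference under an assumed version format of `MAJOR.MINOR.PATCH` with an optional `-sN` suffix. Both function names are hypothetical and none of this is docker-pkg's actual implementation.

```python
def bump_security(version):
    """Next dummy security version: append -s1, or increment an existing -sN."""
    base, sep, suffix = version.rpartition("-s")
    if sep and suffix.isdigit():
        return f"{base}-s{int(suffix) + 1}"
    return f"{version}-s1"


def bump_minor(version):
    """Next minor version: drop any -sN suffix, increment MINOR, reset PATCH."""
    base = version.split("-s")[0]
    major, minor, *_ = (int(part) for part in base.split("."))
    return f"{major}.{minor + 1}.0"


if __name__ == "__main__":
    # Example version strings are illustrative only.
    for current in ("0.6.1", "0.6.1-s1"):
        print(current, "->", bump_security(current), "or", bump_minor(current))
```

A flag like the one hashar suggests would essentially let the caller pick which of these two derivations is applied to the base image and each of its child images.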
[13:58:16] Release-Engineering-Team-TODO (2021-01-01 to 2021-03-31 (Q3)), Patch-For-Review, Release, Train Deployments: 1.36.0-wmf.33 deployment blockers - https://phabricator.wikimedia.org/T274937 (LarsWirzenius) {T276353} has a workaround/fix in this train, but bug is open to allow monitoring better. I'm... [13:58:29] Release-Engineering-Team-TODO (2021-01-01 to 2021-03-31 (Q3)), Patch-For-Review, Release, Train Deployments: 1.36.0-wmf.33 deployment blockers - https://phabricator.wikimedia.org/T274937 (LarsWirzenius) [14:07:36] Release-Engineering-Team-TODO (2021-01-01 to 2021-03-31 (Q3)), Patch-For-Review, Release, Train Deployments: 1.36.0-wmf.33 deployment blockers - https://phabricator.wikimedia.org/T274937 (LarsWirzenius) All wikis on 1.36.0-wmf.33 now. Watching logs for problems. [14:22:12] Release-Engineering-Team-TODO (2021-01-01 to 2021-03-31 (Q3)), GitLab (Initialization), User-brennen: Remove Speed & Function blockers for GitLab work - https://phabricator.wikimedia.org/T274458 (wkandek) [14:28:01] Release-Engineering-Team-TODO (2021-01-01 to 2021-03-31 (Q3)), Patch-For-Review, Release, Train Deployments: 1.36.0-wmf.33 deployment blockers - https://phabricator.wikimedia.org/T274937 (LarsWirzenius) Reported {T276461} but it's not happening enough to be a blocker.
[14:33:32] (CR) jerkins-bot: [V: -1] Split core npm and composer tests [integration/quibble] - https://gerrit.wikimedia.org/r/588087 (owner: Awight) [14:33:34] Release-Engineering-Team-TODO (2021-01-01 to 2021-03-31 (Q3)), GitLab (Initialization), User-brennen: Remove Speed & Function blockers for GitLab work - https://phabricator.wikimedia.org/T274458 (thcipriani) [14:34:01] (CR) jerkins-bot: [V: -1] Split extension and skin npm and composer tests [integration/quibble] - https://gerrit.wikimedia.org/r/587888 (owner: Awight) [14:34:04] VPS-project-Codesearch, Wikidata, Wikidata Query UI, User-Addshore: Add wikidata/query/gui to codesearch - https://phabricator.wikimedia.org/T276429 (Addshore) Open→Resolved I'll go ahead and close it now! [14:34:17] Beta-Cluster-Infrastructure: Replace deployment-etcd-01 with a Buster host - https://phabricator.wikimedia.org/T276462 (Majavah) p:Triage→Medium [14:34:53] Beta-Cluster-Infrastructure: Replace deployment-etcd-01 with a Buster host - https://phabricator.wikimedia.org/T276462 (Majavah) [14:34:57] Beta-Cluster-Infrastructure, Cloud-VPS (Debian Jessie Deprecation): Migrate deployment-prep away from Debian Jessie to Debian Stretch/Buster - https://phabricator.wikimedia.org/T218729 (Majavah) [14:43:59] Beta-Cluster-Infrastructure: Replace deployment-etcd-01 with a Buster host - https://phabricator.wikimedia.org/T276462 (Majavah) The current instance is running etcd 2.2.1. Buster has 3.2.26 available. `etcdctl` is convinced that it's running on wikimedia.cloud and refuses to connect, even when specifying a...
[14:54:21] Continuous-Integration-Infrastructure, Patch-For-Review, ci-test-error: CI tox-buster image failing with python 2 errors - https://phabricator.wikimedia.org/T276384 (awight) a:awight→None [14:54:29] Continuous-Integration-Infrastructure, Patch-For-Review, ci-test-error: CI tox-buster image failing with python 2 errors - https://phabricator.wikimedia.org/T276384 (awight) Open→Resolved a:awight Verified working! [14:54:42] phabricator maintenance bot: Remove #Patch-For-Review when patch is abandoned in Gerrit - https://phabricator.wikimedia.org/T276390 (Zabe) [14:57:00] hashar: Not sure what we should do about it, but the quibble-fullrun has been timing out after 30 min. I don't see anything obvious, other than a mysterious 5-minute delay at 07:40, https://integration.wikimedia.org/ci/job/integration-quibble-fullrun/285/consoleFull [15:00:48] !log remove graphoid role from deployment-sca[01-02] ref T276102 and it being decommissioned in T242855 [15:00:53] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [15:00:53] T242855: Undeploy graphoid - https://phabricator.wikimedia.org/T242855 [15:00:53] T276102: broken puppet on deployment-sca01.deployment-prep.eqiad1.wikimedia.cloud and deployment-sca02.deployment-prep.eqiad1.wikimedia.cloud - https://phabricator.wikimedia.org/T276102 [15:24:32] Beta-Cluster-Infrastructure, cloud-services-team (Kanban): broken puppet on deployment-sca01.deployment-prep.eqiad1.wikimedia.cloud and deployment-sca02.deployment-prep.eqiad1.wikimedia.cloud - https://phabricator.wikimedia.org/T276102 (Majavah) Recommendation_api and Apertium are still running on these... [15:27:34] hashar: If you have any thoughts about T276428, I can attempt to configure jobs but will need guidance...
[15:27:35] T276428: Introduce non-voting jobs with quibble+apache - https://phabricator.wikimedia.org/T276428 [15:28:56] Beta-Cluster-Infrastructure: Replace deployment-etcd-01 with a Buster host - https://phabricator.wikimedia.org/T276462 (Majavah) Etcd v2 -> v3 migration [[ https://etcd.io/docs/current/upgrades/upgrade_3_0/ | looks ]] annoying. The cluster needs to first be upgraded to 2.3, then 3.0 to 3.1 and only after tha... [15:29:18] legoktm: I sent a udplog patch updating to debsrc 3 and to dh, please take a look when you have time [15:31:40] awight: the 5 min delay is eslint [15:31:41] 00:07:40.938 Running "eslint:all" (eslint) task [15:31:41] 00:12:59.160 [15:31:58] maybe that is an infra issue [15:32:57] awight: and looking at https://integration.wikimedia.org/ci/job/integration-quibble-fullrun/buildTimeTrend it used to take 10 minutes but now 30+ [15:33:12] not like that job runs often though [15:34:35] hashar: +1 thanks for the hints. The increased run time ramps up evenly, looks like an infra or mw-core thing? [15:34:46] awight: or it is IO starved [15:35:24] cause since October, WMCS instances no longer have a local disk. Instead everything is on the Ceph distributed storage which has a 500 IOPS rate limit [15:35:30] that definitely caused a bunch of havoc [15:35:41] but I think we now have instances with a much larger IOPS rate (like 2000 maybe) [15:35:44] omg everything uses ceph now?
=o [15:35:48] yeah [15:35:52] <3333 [15:35:59] the compute machines (which have the RAM / CPU) run the images [15:36:10] but the instance filesystem is stored in Ceph [15:36:30] so all file actions go from the compute machine to Ceph which adds some delay I guess [15:36:33] and there is a rate limit [15:36:45] but that might not be the reason the job takes so long :] [15:37:00] though well [15:37:07] eslint taking 5 minutes is suspicious ;] [15:37:56] meanwhile [15:38:12] I got the integration/quibble tox job to use the python3 based tox [15:38:13] The regular job is up against the same wall, wmf-quibble-selenium-php72-docker [15:38:41] so whatever issue you had should be solved. I guess you want to rework your chain of patches and drop the patch that was setting basepython=python3, it is still in the chain [15:39:01] slow eslint as well? :-\ [15:39:29] hashar: +1 I did :-) [15:40:38] the Selenium job has at least two issues [15:40:44] the npm install takes ages [15:40:58] it runs every single selenium testsuite [15:42:38] Phabricator: Request Edit permissions to the "SRE Access Request" form - https://phabricator.wikimedia.org/T276446 (RobH) As far as I know, ONLY phabricator administrators can edit forms. So giving you permission to do this would entail giving you full admin in phab. I don't have a vote here, just echoin... [15:46:19] (PS1) Hashar: jjb: use a python3 based tox [integration/config] - https://gerrit.wikimedia.org/r/668467 (https://phabricator.wikimedia.org/T276384) [15:47:03] !log Refreshing jobs based on releng/tox-buster to use latest image.
That brings in tox installed with python3 instead of python2 # T276384 [15:47:06] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [15:47:06] T276384: CI tox-buster image failing with python 2 errors - https://phabricator.wikimedia.org/T276384 [15:47:19] (CR) Hashar: [C: +2] jjb: use a python3 based tox [integration/config] - https://gerrit.wikimedia.org/r/668467 (https://phabricator.wikimedia.org/T276384) (owner: Hashar) [15:50:00] (Merged) jenkins-bot: jjb: use a python3 based tox [integration/config] - https://gerrit.wikimedia.org/r/668467 (https://phabricator.wikimedia.org/T276384) (owner: Hashar) [15:51:44] hashar: Helpful to hear about the potential IO bottleneck, I'll keep that in mind when reading through quibble code + logs. [15:53:32] we had a more specific instance flavor specially for that rate limit though [15:54:28] Phabricator: Request Edit permissions to the "SRE Access Request" form - https://phabricator.wikimedia.org/T276446 (Majavah) @RobH I see "Allow members of any project acl*phabricator, acl*security_team" when trying to edit forms, so I doubt that it's restricted to only full admins [15:56:16] awight: https://phabricator.wikimedia.org/T266777 has the glorious details [15:57:00] fun there is a comment that says integration-quibble-fullrun doubled run time from 11 to 22 minutes [15:57:35] Phabricator: Request Edit permissions to the "SRE Access Request" form - https://phabricator.wikimedia.org/T276446 (RhinosF1) >>! In T276446#6883231, @Majavah wrote: > @RobH I see "Allow members of any project acl*phabricator, acl*security_team" when trying to edit forms, so I doubt that it's restricted to... [15:59:38] hashar: Progress!
8D [16:00:09] I did a patch to have that fullrun job point mysql to a tmpfs https://gerrit.wikimedia.org/r/c/integration/config/+/654933/2/jjb/integration.yaml [16:00:13] but maybe I screwed up the path [16:00:23] and the data are not in /workspace/db [16:04:05] (PS1) Hashar: (DNM) where is MySQL datadir on CI? [integration/quibble] - https://gerrit.wikimedia.org/r/668476 (https://phabricator.wikimedia.org/T2666777) [16:04:41] Cool, I was having a similar thought. Data locality is one of the main gotchas of big data and cloud systems, it seems. If there's enough memory to hold the entire file tree under test, we're golden... [16:05:00] Phabricator: Request Edit permissions to the "SRE Access Request" form - https://phabricator.wikimedia.org/T276446 (RobH) I don't see where to edit that, can you put in step by step? [16:05:04] https://gerrit.wikimedia.org/r/c/integration/quibble/+/668476 (DNM) where is MySQL datadir on CI? [NEW] [16:05:07] hmm [16:05:25] awight: yeah our whole field is crippled by storage [16:05:56] if we could invest billions toward figuring out a storage system that is reliable, fast and cheap, we would enter an entirely new age [16:06:20] after the communication revolution, that would be the archiving revolution [16:06:23] or something like that [16:11:05] hashar: mysql datadir - is it still in a tmpfs mount like we used to? I remember that was a very big perf boost [16:11:21] (CR) jerkins-bot: [V: -1] (DNM) where is MySQL datadir on CI?
[integration/quibble] - https://gerrit.wikimedia.org/r/668476 (https://phabricator.wikimedia.org/T2666777) (owner: Hashar) [16:11:25] I assume that is still the case, but given we have slowed down so much (5m -> 25m) I wonder if maybe that got lost at some point [16:12:32] Phabricator: Proposed changes to the SRE Access request - https://phabricator.wikimedia.org/T276473 (jbond) [16:13:04] LibUp: LibUp hasn't successfully run on AutoCreateCategoryPages for 7 weeks - https://phabricator.wikimedia.org/T275292 (hashar) [16:13:23] Gerrit, Release-Engineering-Team (Development services), Release-Engineering-Team-TODO, Upstream: Can't delete weird ref using git in Gerrit - https://phabricator.wikimedia.org/T275946 (hashar) Open→Resolved a:hashar Solved by deleting the `refs/master` reference. There is some oddit... [16:13:27] Gerrit, Release-Engineering-Team (Development services), Release-Engineering-Team-TODO, Upstream: Can't delete weird ref using git in Gerrit - https://phabricator.wikimedia.org/T275946 (hashar) [16:13:35] awight: Exception: MySQL datadir is: /tmp/quibble-mysql-8occxz06 hehe [16:13:52] Krinkle: yeah it should [16:14:04] but at least for that specific integration-quibble-fullrun it is not :-\ [16:14:20] Phabricator: Proposed changes to the SRE Access request - https://phabricator.wikimedia.org/T276473 (jbond) p:Triage→Medium [16:14:26] cuase you now [16:14:33] cause you know, I am dumb sometimes :-\ [16:14:40] Phabricator: Proposed changes to the SRE Access request - https://phabricator.wikimedia.org/T276473 (jbond) [16:14:57] Phabricator: Proposed changes to the SRE Access request - https://phabricator.wikimedia.org/T276473 (jbond) [16:16:50] Phabricator: Request Edit permissions to the "SRE Access Request" form - https://phabricator.wikimedia.org/T276446 (RhinosF1) It'll be one of the policies shown on https://phabricator.wikimedia.org/transactions/editengine/maniphest.task/edit/8/ [16:17:51]
Phabricator: Request Edit permissions to the "SRE Access Request" form - https://phabricator.wikimedia.org/T276446 (RobH) >>! In T276446#6883406, @RhinosF1 wrote: > It'll be one of the policies shown on https://phabricator.wikimedia.org/transactions/editengine/maniphest.task/edit/8/ So on that I see Name,... [16:21:30] (PS1) Hashar: jjb: point db dir to the tmpfs in Quibble fullrun jobs [integration/config] - https://gerrit.wikimedia.org/r/668483 (https://phabricator.wikimedia.org/T266777) [16:21:58] awight: ^ :] the fullrun jobs missed --db-dir /workspace/db thus the MySQL datadir ends up in /tmp :-\ [16:22:01] which is not a tmpfs [16:22:19] Phabricator: Request Edit permissions to the "SRE Access Request" form - https://phabricator.wikimedia.org/T276446 (Majavah) Do you see anything useful under https://phabricator.wikimedia.org/transactions/editengine/transactions.editengine.config/? Phabricator allows editing the forms used for editing forms. [16:22:39] (Abandoned) Hashar: (DNM) where is MySQL datadir on CI? [integration/quibble] - https://gerrit.wikimedia.org/r/668476 (https://phabricator.wikimedia.org/T2666777) (owner: Hashar) [16:26:26] (CR) Hashar: "recheck fullrun job should be faster after properly setting db dir to point to a tmpfs https://gerrit.wikimedia.org/r/668483" [integration/quibble] - https://gerrit.wikimedia.org/r/587885 (https://phabricator.wikimedia.org/T235449) (owner: Awight) [16:27:08] (CR) Hashar: "Deployed and I have done a recheck of ps 37 https://gerrit.wikimedia.org/r/c/integration/quibble/+/587885" [integration/config] - https://gerrit.wikimedia.org/r/668483 (https://phabricator.wikimedia.org/T266777) (owner: Hashar) [16:37:55] Phabricator: Proposed changes to the SRE Access request - https://phabricator.wikimedia.org/T276473 (jbond) [16:38:41] awight: and yeah eslint is slow...
waiting access which sounds like disk io issue [16:38:59] or maybe not [16:39:45] maybe the eslint config is broken somehow [16:39:52] and it ends up doing a full tree traversal [16:42:14] hashar: great discovery re. /tmp ! [16:42:24] yeah I think eslint is broken somehow [16:42:41] it does a wide range of stat() for some reason [16:42:50] randomly, my eslint is only slow when I overlook an ignore directory and it traverses some crazy dependency rabbithole [16:43:07] looking for .babelrc / .babelrc.json / .babelrc.js in every single parent directory [16:43:47] (CR) Awight: "recheck" [integration/quibble] - https://gerrit.wikimedia.org/r/587888 (owner: Awight) [16:43:53] Beta-Cluster-Infrastructure: Replace deployment-etcd-01 with a Buster host - https://phabricator.wikimedia.org/T276462 (Majavah) Caller survey: ` taavi@deployment-etcd-01:/var/log/nginx$ sudo cat etcd_access.log | cut -d ' ' -f1 | sort | uniq -c | sort 1098 172.16.4.16 13 172.16.5.46 142 172.1... [16:43:53] (CR) Awight: "recheck" [integration/quibble] - https://gerrit.wikimedia.org/r/588087 (owner: Awight) [16:46:55] Release-Engineering-Team-TODO (2021-01-01 to 2021-03-31 (Q3)), Patch-For-Review, Release, Train Deployments: 1.36.0-wmf.33 deployment blockers - https://phabricator.wikimedia.org/T274937 (LarsWirzenius) Reported {T276476}, but it does not warrant rolling back the train, I think. [16:47:23] I have no idea how to flag a task about eslint performance bah [16:57:58] Project-Admins, Maps: Create geoshapes component under the Maps project - https://phabricator.wikimedia.org/T276479 (MSantos) [17:03:34] awight: and yeah eslint:all is slower https://phabricator.wikimedia.org/T276477 ;D [17:18:34] Release-Engineering-Team (Deployment services), Release-Engineering-Team-TODO (2020-10-01 to 2020-12-31 (Q2)), Scap, Patch-For-Review: Update Scap to perform rolling restart for all MW deploy - https://phabricator.wikimedia.org/T266055 (thcipriani) >>!
In T266055#6878922, @LarsWirzenius wrote: >... [17:20:30] (CR) Hashar: "The job is still super slow, but that at least improves it a bit ;)" [integration/config] - https://gerrit.wikimedia.org/r/668483 (https://phabricator.wikimedia.org/T266777) (owner: Hashar) [17:20:47] (CR) Hashar: [C: +2] jjb: point db dir to the tmpfs in Quibble fullrun jobs [integration/config] - https://gerrit.wikimedia.org/r/668483 (https://phabricator.wikimedia.org/T266777) (owner: Hashar) [17:22:18] (Merged) jenkins-bot: jjb: point db dir to the tmpfs in Quibble fullrun jobs [integration/config] - https://gerrit.wikimedia.org/r/668483 (https://phabricator.wikimedia.org/T266777) (owner: Hashar) [17:23:18] (CR) Dduvall: [C: +1] "Looks great to me. Just one nit. Why do you need to update PipelineStage.groovy? I think what you have is sufficient to pass on the whitel" (1 comment) [integration/pipelinelib] - https://gerrit.wikimedia.org/r/668245 (https://phabricator.wikimedia.org/T269902) (owner: Jeena Huneidi) [17:23:45] (CR) Dduvall: [C: -1] "Sorry, I guess it's a -1 on account of my nit. Then a +1." [integration/pipelinelib] - https://gerrit.wikimedia.org/r/668245 (https://phabricator.wikimedia.org/T269902) (owner: Jeena Huneidi) [17:25:37] Beta-Cluster-Infrastructure: Replace deployment-etcd-01 with a Buster host - https://phabricator.wikimedia.org/T276462 (Majavah) Might be related: found some DNS records, ` _etcd._tcp.beta.wmflabs.org.` and ` _etcd_server._tcp.beta.wmflabs.org.`, that are pointing to a non-existent instance that was deleted...
[17:38:51] (CR) Jeena Huneidi: [C: -1] "In PipelineStage (line 548), it passes the credentials id and name to the run method, but since I've added the ability to use other kinds " (1 comment) [integration/pipelinelib] - https://gerrit.wikimedia.org/r/668245 (https://phabricator.wikimedia.org/T269902) (owner: Jeena Huneidi) [17:39:08] Release-Engineering-Team (Deployment services), Release-Engineering-Team-TODO (2020-10-01 to 2020-12-31 (Q2)), Scap, Patch-For-Review: Update Scap to perform rolling restart for all MW deploy - https://phabricator.wikimedia.org/T266055 (jijiki) @thcipriani @dancy we will try to get what is needed... [17:40:55] VPS-project-Codesearch, Wikidata, Wikidata Query UI, User-Addshore: Add wikidata/query/gui to codesearch - https://phabricator.wikimedia.org/T276429 (Legoktm) I think this would fit better under "services" since "deployed" is currently just MediaWiki core+extensions+skins+vendor [17:42:35] Gerrit: gerrit's sshd is incompatible with RSA pubkeys + Fedora 33 clients (and future versions of OpenSSH proper) - https://phabricator.wikimedia.org/T276486 (CDanis) [17:43:06] (CR) Dduvall: [C: -1] "Conceptually this looks great. I like how you were able to get the definitions so close to the project pipeline definitions. Nicely done."
(5 comments) [integration/config] - https://gerrit.wikimedia.org/r/668199 (https://phabricator.wikimedia.org/T269900) (owner: Jeena Huneidi) [17:45:00] (CR) Dduvall: [C: -1] Define and pass allowed credentials to pipeline (1 comment) [integration/config] - https://gerrit.wikimedia.org/r/668199 (https://phabricator.wikimedia.org/T269900) (owner: Jeena Huneidi) [17:45:13] (CR) Mholloway: "I forgot about this for a while since I had a working image built locally from creating this patch, but now I'm on a new machine and looki" [releng/dev-images] - https://gerrit.wikimedia.org/r/626739 (owner: Mholloway) [17:45:23] Gerrit: gerrit's sshd is incompatible with RSA pubkeys + Fedora 33 clients (and future versions of OpenSSH proper) - https://phabricator.wikimedia.org/T276486 (Nemo_bis) Thanks for the help! The error presents itself when making a `git pull` over ssh, like this: ` nemobis@gerrit.wikimedia.org: Permission d... [17:46:29] Gerrit: gerrit's sshd is incompatible with RSA pubkeys + Fedora 33 clients (and future versions of OpenSSH proper) - https://phabricator.wikimedia.org/T276486 (CDanis) [17:47:20] Release-Engineering-Team, MW-on-K8s, serviceops: Progressive rollout of MediaWiki deployment on Kubernetes - https://phabricator.wikimedia.org/T276487 (jijiki) [17:51:45] (CR) Dduvall: [C: -1] "> Patch Set 1:" (1 comment) [integration/pipelinelib] - https://gerrit.wikimedia.org/r/668245 (https://phabricator.wikimedia.org/T269902) (owner: Jeena Huneidi) [17:57:23] Majavah: woot, I'll look in a bit [18:00:18] (PS3) Jeena Huneidi: Define and pass allowed credentials to pipeline [integration/config] - https://gerrit.wikimedia.org/r/668199 (https://phabricator.wikimedia.org/T269900) [18:01:46] (CR) Jeena Huneidi: Define and pass allowed credentials to pipeline (5 comments) [integration/config] - https://gerrit.wikimedia.org/r/668199 (https://phabricator.wikimedia.org/T269900) (owner: Jeena Huneidi)
[18:03:47] Reedy: any chance you could look at https://phabricator.wikimedia.org/T276446, you're a phab admin right [18:07:16] (CR) Jeena Huneidi: [C: -1] "> Patch Set 1:" (1 comment) [integration/pipelinelib] - https://gerrit.wikimedia.org/r/668245 (https://phabricator.wikimedia.org/T269902) (owner: Jeena Huneidi) [18:08:33] Majavah: just logged into deployment-mwlog01 and peeked in /srv/mw-log, nicely done :)) [18:12:58] VPS-project-Codesearch, Wikidata, Wikidata Query UI, User-Addshore: Add wikidata/query/gui to codesearch - https://phabricator.wikimedia.org/T276429 (Lucas_Werkmeister_WMDE) Can we find a better name for the “deployed” group, then? I don’t think it makes sense to exclude the Wikidata Query Servic... [18:20:22] VPS-project-Codesearch, Wikidata, Wikidata Query UI, User-Addshore: Add wikidata/query/gui to codesearch - https://phabricator.wikimedia.org/T276429 (Legoktm) >>! In T276429#6883944, @Lucas_Werkmeister_WMDE wrote: > Can we find a better name for the “deployed” group, then? I don’t think it makes... [18:24:02] LibUp: LibUp hasn't successfully run on AutoCreateCategoryPages for 7 weeks - https://phabricator.wikimedia.org/T275292 (Legoktm) Open→Resolved a:Legoktm I had to manually delete the clone on the libup server so it did a fresh checkout and now we have a successful run: https://libraryupgrader2.wm... [18:25:49] (CR) Dduvall: Define and pass allowed credentials to pipeline (1 comment) [integration/config] - https://gerrit.wikimedia.org/r/668199 (https://phabricator.wikimedia.org/T269900) (owner: Jeena Huneidi) [18:33:28] legoktm: should I add debhelper-compat into Depends, while currently build tools are under Build-Depends? under source or package? [18:33:30] and thank you! [18:39:21] Majavah: er sorry, no, it should be under Build-Depends, my bad [18:45:51] legoktm: thanks, I guess I need to remove the debian/compat file?
gbp complains that debhelper-compat version can't be specified in two places [18:45:58] yes [18:46:13] thanks, done, sent to Gerrit [18:48:53] Phabricator: Request Edit permissions to the "SRE Access Request" form - https://phabricator.wikimedia.org/T276446 (RobH) >>! In T276446#6883437, @Majavah wrote: > Do you see anything useful under https://phabricator.wikimedia.org/transactions/editengine/transactions.editengine.config/? Phabricator allows e... [18:49:00] woo, CI built it [18:49:09] it's failing because there are some lintian warnings, see https://integration.wikimedia.org/ci/job/debian-glue-non-voting/2897/artifact/udplog-1.9+0%7E20210304184436.2897+buster+wikimedia%7E1.gbp14cbf7.lintian.txt [18:49:35] I think if you could just fix the one error ("extended-description-is-empty") that would make jenkins happy? [18:49:45] hmm, looking [18:50:14] if jenkins wants the warnings fixed too then I think we should just suppress them, e.g. writing man pages isn't useful [18:50:45] fyi the Lintian website has descriptions for all the issues https://lintian.debian.org/tags/extended-description-is-empty.html [18:50:56] already found that via ddg [18:51:07] :)) [18:54:51] (CR) Jeena Huneidi: Define and pass allowed credentials to pipeline (1 comment) [integration/config] - https://gerrit.wikimedia.org/r/668199 (https://phabricator.wikimedia.org/T269900) (owner: Jeena Huneidi) [18:56:51] hopefully I did that correctly [18:58:50] > 10:58:44 Finished: SUCCESS [18:59:00] oh I did :D [18:59:13] 20:58:44 Finished: SUCCESS [18:59:20] oh jenkins does that in local time [18:59:31] awesome work ^.^ [18:59:48] thanks, now just the systemd thing remaining [19:00:33] later today I'll merge this and upload it to apt.wm.o, let's do the systemd conversion in a separate patch [19:01:14] yes, I wasn't planning on adding new features to a patch already on review [19:07:26] (CR) Dduvall: "I tested the output for the `wvui-pipeline-release` job.
It looks right and the JSON deserialization works as expected when I run it throu" [integration/config] - https://gerrit.wikimedia.org/r/668199 (https://phabricator.wikimedia.org/T269900) (owner: Jeena Huneidi) [19:13:46] (PS4) Jeena Huneidi: Define and pass allowed credentials to pipeline [integration/config] - https://gerrit.wikimedia.org/r/668199 (https://phabricator.wikimedia.org/T269900) [19:38:03] Continuous-Integration-Config: CI should validate that the pecl tarball contains all the necessary files to build the extension - https://phabricator.wikimedia.org/T276417 (Legoktm) There are a few more settings that needed adjustment: ` pecl config-set ext_dir /tmp pecl config-set php_dir /tmp ` (note that... [20:08:00] Gerrit: fatal: the remote end hung up unexpectedly - https://phabricator.wikimedia.org/T276500 (Kizule) [20:11:46] Gerrit: fatal: the remote end hung up unexpectedly - https://phabricator.wikimedia.org/T276500 (Aklapper) Open→Invalid Hi, this is not a server side issue but a support request, hence closing Please bring this up in a support forum - thanks. [20:17:14] (PS3) Jgleeson: WIP: Added image for civiproxy [releng/dev-images] - https://gerrit.wikimedia.org/r/664919 [20:22:46] (PS1) Kosta Harlan: elasticsearch: Update to latest version of search/extra plugin [releng/dev-images] - https://gerrit.wikimedia.org/r/668549 (https://phabricator.wikimedia.org/T276499) [20:24:17] (CR) Kosta Harlan: "I've tested this and can confirm the maintenance/UpdateWeightedTags.php script in CirrusSearch will work with this, but fails with our cur" [releng/dev-images] - https://gerrit.wikimedia.org/r/668549 (https://phabricator.wikimedia.org/T276499) (owner: Kosta Harlan) [20:34:41] (CR) Ebernhardson: [C: +1] "This will fix the issue today.
I wonder if there is some longer-term thing that should be thought about, this will almost certainly break " [releng/dev-images] - https://gerrit.wikimedia.org/r/668549 (https://phabricator.wikimedia.org/T276499) (owner: Kosta Harlan) [20:44:00] (CR) Kosta Harlan: "> Patch Set 1: Code-Review+1" [releng/dev-images] - https://gerrit.wikimedia.org/r/668549 (https://phabricator.wikimedia.org/T276499) (owner: Kosta Harlan) [21:14:51] Is there some magic place that I can grep through all of the CI jenkins output logs? [21:17:13] (CR) Ebernhardson: [C: +1] "> Patch Set 1:" [releng/dev-images] - https://gerrit.wikimedia.org/r/668549 (https://phabricator.wikimedia.org/T276499) (owner: Kosta Harlan) [21:17:24] Gerrit: fatal: the remote end hung up unexpectedly - https://phabricator.wikimedia.org/T276500 (matmarex) @Aklapper No, it's a real issue, we have a task for it. [21:17:38] Gerrit: fatal: the remote end hung up unexpectedly - https://phabricator.wikimedia.org/T276500 (matmarex) [21:17:49] Gerrit: Can't `git pull` mediawiki/core from Gerrit: "fatal: the remote end hung up unexpectedly" - https://phabricator.wikimedia.org/T263293 (matmarex) [21:20:40] addshore: No. hashar was thinking of feeding them all into an ElasticSearch instance, but nothing's done yet. [21:24:09] Gerrit: fatal: the remote end hung up unexpectedly - https://phabricator.wikimedia.org/T276500 (Kizule) Adding `[protocol] version = 2` in $HOME/.gitconfig works for me. :) Thank you @matmarex for pointing me to the T263293. [22:49:58] Do they all just exist on individual nodes currently? Or are they collected by the Jenkins main server somehow?
[22:50:10] I'd love to be able to get more data on silly browser test failures [22:52:18] Release-Engineering-Team (Deployment services), Release-Engineering-Team-TODO (2021-01-01 to 2021-03-31 (Q3)), Documentation, User-brennen: Update/organize train deployment and related policy documentation - https://phabricator.wikimedia.org/T273802 (brennen) [23:03:07] Phabricator, Release-Engineering-Team, Wikimedia-Phabricator-Extensions: Phab reports: Improve clarity between labels and values - https://phabricator.wikimedia.org/T276513 (Krinkle) [23:03:13] Phabricator, Release-Engineering-Team, Wikimedia-Phabricator-Extensions, Developer Productivity: Phab reports: Improve clarity between labels and values - https://phabricator.wikimedia.org/T276513 (Krinkle) [23:04:44] addshore: they are collected on the master, and a few janitors are able to ssh there and perhaps run some carefully deoptimised ack-greps for your benefit [23:05:14] s/collected/only ever written to disk/ [23:05:31] the nodes have work spaces but no build output, that's piped directly from agent processes to the master