[00:32:55] Beta-Cluster-Infrastructure, Operations: Upgrade puppet in deployment-prep (Puppet agent broken in Beta Cluster) - https://phabricator.wikimedia.org/T243226 (Krenair) So this error was the memory usage problem on puppetdb03 I mentioned above - puppetdb won't work without postgresql, which can't start bec...
[01:23:23] Release-Engineering-Team-TODO (2020-01 to 2020-03 (Q3)), dev-images, docker-pkg, User-brennen: dev/stretch-php72-fpm-apache2-xdebug deploy fails from contint1001 - https://phabricator.wikimedia.org/T245304 (Jdforrester-WMF)
[07:44:11] Phabricator, Release-Engineering-Team: change "default" column on Parsoid workboard - https://phabricator.wikimedia.org/T245308 (Aklapper) I was about to look into this while wondering if this could simply be solved by renaming columns, but seems that @LGoto already did the right thing by renaming `Needs...
[08:00:01] Just fyi, beta is giving a lot of MediaWiki\Storage\BlobAccessException from line 261 of /srv/mediawiki/php-master/includes/Storage/SqlBlobStore.php: Unable to store text to external storage
[08:33:35] uh ohes
[08:34:26] which wiki(s)? do we know if it's via api or something else?
[08:57:25] it was at en.wikipedia.beta.wmflabs.org
[08:57:35] And it's intermittent, editing from web api
[08:57:40] It's happened to me twice
[09:03:34] It also seems to lose my session a surprising amount of times
[11:07:58] Release-Engineering-Team (CI & Testing services), Release Pipeline: Experiment with different PipelineLib-/helm-based approaches to system testing - https://phabricator.wikimedia.org/T244313 (zeljkofilipin)
[13:35:40] paladox: you there?
[13:35:48] hi, yes
[13:35:58] paladox: pm?
[13:36:02] sure
[13:36:05] ok thanks
[15:01:38] Continuous-Integration-Infrastructure: Reenable tests for github.com/wikimedia/texvcjs - https://phabricator.wikimedia.org/T245344 (Physikerwelt)
[17:47:52] PROBLEM - Parsoid on deployment-mediawiki-parsoid10 is CRITICAL: connect to address 172.16.0.141 and port 8000: Connection refused
[17:47:52] PROBLEM - Parsoid on deployment-parsoid09 is CRITICAL: connect to address 172.16.5.63 and port 8000: Connection refused
[19:47:36] Beta-Cluster-Infrastructure, Operations: Upgrade puppet in deployment-prep (Puppet agent broken in Beta Cluster) - https://phabricator.wikimedia.org/T243226 (Krenair) Thanks to Andrew it seems to be running well now. I've copied across /var/lib/puppet/volatile to sort a lot of swift/GeoIP failures.
[20:23:08] Beta-Cluster-Infrastructure, Operations: Upgrade puppet in deployment-prep (Puppet agent broken in Beta Cluster) - https://phabricator.wikimedia.org/T243226 (Krenair) Also copied /etc/conftool-state/mediawiki.yaml to sort out mediawiki::state for mwmaint01 I've also taken /root and /home and put them at...
[20:25:06] !log T243226 Shut off deployment-puppetdb02 and deployment-puppetmaster03 - will leave for a week before deletion
[20:25:07] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[20:25:08] T243226: Upgrade puppet in deployment-prep (Puppet agent broken in Beta Cluster) - https://phabricator.wikimedia.org/T243226
[20:29:24] PROBLEM - Host deployment-puppetmaster03 is DOWN: CRITICAL - Host Unreachable (172.16.4.91)
[20:29:33] PROBLEM - Host deployment-puppetdb02 is DOWN: CRITICAL - Host Unreachable (172.16.4.104)
[21:24:48] Continuous-Integration-Infrastructure, GitHub-Mirrors: Reenable tests for github.com/wikimedia/texvcjs - https://phabricator.wikimedia.org/T245344 (MarcoAurelio)
[21:25:53] Continuous-Integration-Infrastructure, GitHub-Mirrors: Reenable tests for github.com/wikimedia/texvcjs - https://phabricator.wikimedia.org/T245344 (MarcoAurelio) So, you want Travis CI integration tests to run for that repo right?
[21:33:48] !log GitHub: Activated Travis CI at https://travis-ci.org/wikimedia/texvcjs for https://github.com/wikimedia/texvcjs per https://github.com/wikimedia/texvcjs/pull/34 and T245344
[21:33:50] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[21:33:51] T245344: Reenable tests for github.com/wikimedia/texvcjs - https://phabricator.wikimedia.org/T245344
[21:43:44] Phabricator, Release-Engineering-Team: change "default" column on Parsoid workboard - https://phabricator.wikimedia.org/T245308 (LGoto) Looks like only 90 tasks moved when I first did this, though. I'm trying the "move to column" again for the remaining tasks.
[21:58:30] Continuous-Integration-Infrastructure, GitHub-Mirrors: Reenable tests for github.com/wikimedia/texvcjs - https://phabricator.wikimedia.org/T245344 (MarcoAurelio) @Physikerwelt I think I have enabled back tests via the legacy Travis CI integration page. I think this is being deprecated in favour of GitHub...
[22:14:27] (PS1) Legoktm: zuul: Use the composer-test-php72-or-later template [integration/config] - https://gerrit.wikimedia.org/r/572422
[22:14:29] (PS1) Legoktm: zuul: Add legoktm's Toolforge tools [integration/config] - https://gerrit.wikimedia.org/r/572423
[22:16:53] (CR) Legoktm: [C: +2] zuul: Use the composer-test-php72-or-later template [integration/config] - https://gerrit.wikimedia.org/r/572422 (owner: Legoktm)
[22:17:42] (Merged) jenkins-bot: zuul: Use the composer-test-php72-or-later template [integration/config] - https://gerrit.wikimedia.org/r/572422 (owner: Legoktm)
[22:19:00] legoktm: https://www.mediawiki.org/wiki/Extension_talk:UserThrottle#Deprecated? ?
[22:19:01] (CR) Legoktm: [C: +2] zuul: Add legoktm's Toolforge tools [integration/config] - https://gerrit.wikimedia.org/r/572423 (owner: Legoktm)
[22:19:08] hi hauskatze
[22:19:19] hi :)
[22:19:36] * legoktm looks
[22:19:53] (Merged) jenkins-bot: zuul: Add legoktm's Toolforge tools [integration/config] - https://gerrit.wikimedia.org/r/572423 (owner: Legoktm)
[22:21:00] hauskatze: yes, I think UserThrottle is mostly obsolete now, I'll leave a comment shortly
[22:21:17] !log reloading zuul for https://gerrit.wikimedia.org/r/572423
[22:21:18] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[22:21:50] legoktm: Appreciated. That's one extension missing ExtReg and before even bothering I thought I might ask first if it is worth the effort
[22:23:56] indeed :D
[22:24:32] I recently made a patch to migrate UploadBlacklist
[22:24:42] probably unneeded too
[22:24:49] meh
[22:24:55] helps me practice though
[22:27:38] I think UploadBlacklist is mostly superseded by AbuseFilter?
[22:29:12] IIRC there's a SHA hash variable in AF to check for uploads, yes; but I think I heard it's broken?
[22:31:12] I mean, it's AF…
[22:40:17] Continuous-Integration-Infrastructure, GitHub-Mirrors: Reenable tests for github.com/wikimedia/texvcjs - https://phabricator.wikimedia.org/T245344 (Jdforrester-WMF) I'd be happy to help migrate from Travis to GHActions if that's wanted.
[22:41:04] legoktm: Also, you may find https://gerrit.wikimedia.org/r/c/mediawiki/extensions/Jade/+/572429/1 entertaining/depressing. :-(
[22:44:03] oof
[22:44:10] … yeah.
[22:45:28] amazing
[22:45:35] shouldn't grunt complain if it is linting nothing?
[22:45:42] or am I imagining that?
[22:45:48] It does lint things.
[22:45:54] It was reporting that it was linting 26 files.
[22:46:09] oh
[22:46:13] It didn't mention that they were 26 JS files, and it found nothing wrong in the styling in them because they had none.
[22:46:16] "It does lint things." deserve !_bash :P
[22:46:17] heh
[22:46:45] But if there had been a "$div.setStyles( 'color:red;');" in the JS, that'd have been linted.
[22:46:56] Artificial Stupids.
[22:52:30] now back to figuring out what broke our tox setup...
[22:52:35] https://integration.wikimedia.org/ci/job/tox-docker/10211/console
[22:52:46] Oh, right, it was a tox breakage.
[22:53:43] virtualenv 20 was released earlier this week and it mostly broke the world, because they completely rewrote the project
[22:54:02] James_F: are you familiar with our GitHub use of Travis CI and/or GitHub actions? There's a task I think I resolved about it but I'm not sure.
[22:54:20] hauskatze: Yes. I said I'd be happy to help migrate.
[22:54:31] In general, we're slowly moving off Travis.
[22:54:46] oh, geez, I now see the reply
[22:54:49] sorry
[22:54:51] Complicated by the lack of a list of what repos are hosted in GitHub and so should have GH Actions installed.
[22:54:52] ;-)
[22:55:05] But I saw GHA does cost money?
[22:55:24] We have a magical account that doesn't
[22:55:32] Because GitHub like us.
[22:55:42] Well, if we're going to blow 700k in nonsense rebranding...
[22:56:13] Yup, special offer :D
[23:02:15] I hadn't noticed this before
[23:02:59] https://i.imgur.com/0aFrQ9W.png
[23:06:14] Yup
[23:14:55] sigh
[23:15:01] the issue was right in front of me the entire time
[23:15:10] 89 FileNotFoundError: [Errno 2] No such file or directory: '/nonexistent/.local/share/virtualenv/py-info/20.0.4/2d7075eda06572014a3dfd447e6915f0dadc794767f53ba9bc4e3ed19eeb098d.lock' [ERROR __main__:42]
[23:15:18] of course, /nonexistent doesn't exist
[23:21:11] we need export XDG_DATA_HOME=/cache
[23:21:17] Oh, yeah, I was wondering what that was from.
[23:21:33] Krinkle: Got an email you transferred a repo yesterday but I'm not sure I understand why?
[23:21:46] oh well, he's away
[23:21:53] hauskatze: it just went back and forth to create a redirect that's all
[23:21:54] Yeah, Krinkle gave us MediaWiki. How kind, but I think we had it already.
[23:22:04] Ah.
[23:22:19] Some of the messages GitHub gives aren't perfect for our situation.
[23:22:23] GitHub considers renames as "creation" which is odd
[23:22:35] sorry for the noise
[23:22:44] but it continues to mirror from mediawiki/core.git@gerrit
[23:22:45] No worries.
[23:22:52] yeah
[23:23:05] Ok, thanky :)
[23:23:35] I was fearing I broke something when I used the replication plugin yesterday because if you miss the --wait var you end up having to rename repos
[23:23:55] thankfully it was not the case
[23:24:15] legoktm: yeah, the default value of HOME in our images was set to /nonexistent by antoine, I think to hard fail on any programs that expect a writable/persistent home directory.
[23:24:58] I personally think setting it to /tmp would be more useful and yeah, opting in to cached/persistent via XDG_CACHE-something (forgot its exact name) already works
[23:25:21] I'm not sure all writes to XDG_DATA_HOME should accumulate over time via castor though.
[23:25:37] I don't think that's per se expected to be self-cleaning
[23:25:44] it's a bit of a mess anyway
[23:26:04] we keep most /cache stuff opt-in I think, as it generally causes more trouble than it's worth if done too widely
[23:26:11] Everything is a mess.
[23:26:18] but at least it should be a writable directory like /tmp
[23:26:27] Some things are messes but work.
[23:26:35] XDG_DATA_HOME is unset and falls back to HOME which we disallow
[23:29:49] James_F: thx for making review of travis-ci fixes easy with per-repo references :)
[23:31:45] https://github.com/pypa/virtualenv/issues/1640
[23:33:16] Krinkle: ack, I suggested /tmp in the upstream bug ^
[23:36:16] legoktm: well, yeah, but imho the problem is on our end. Why are we running with an unwritable home directory? What principle are we enforcing here?
[23:36:41] I've spent a lot of time working with several JS modules on a similar issue trying to make it configurable and then repeat that configuration in all our repos
[23:37:09] to identify stuff that doesn't respect $XDG_CACHE_whatever I think
[23:38:16] Yeah, that's great for stuff we want to cache across jobs, but for ephemeral stuff it just means we're the only ones that can't consume a package and spend dozens/hundreds of hours fixing non-issues in a bunch of packages.
[23:38:34] I'd rather we default HOME=/tmp and if we find something could be cached, maybe configure it and upstream a fix to use XDG_CACHE_ indeed
[23:41:46] Krinkle: Of course. :-)
[23:43:07] (PS1) Legoktm: docker: Set XDG_DATA_HOME in tox image [integration/config] - https://gerrit.wikimedia.org/r/572453
[23:43:13] Krinkle: yeah, I'd support that
[23:44:19] legoktm: oh virtualenv does honour XDG? Cool, I thought the issue was the other way around
[23:44:29] it does
[23:44:32] I think it's just using the wrong one
[23:44:34] (we setting it, the tool not using it and falling back to HOME)
[23:44:38] hm. yeah
[23:44:54] legoktm: depends, is the thing it writes meant to be executable?
[23:45:30] it's just a lock file
[23:45:34] bunch of packages I had to workaround actually used to use /tmp and moved away from it, to HOME due to some users not allowing executables there.
[23:45:57] I'd expect lock files to go in either /cache or /tmp
[23:45:59] in fact, I think that's the only reason why I keep running into /nonexistent - when a tool used to use /tmp and moved away from it (instead of adopting XDG)
[23:46:03] not a persistent data directory
[23:46:13] sure yeah, that's fair.
[23:50:34] (CR) Legoktm: [C: +2] docker: Set XDG_DATA_HOME in tox image [integration/config] - https://gerrit.wikimedia.org/r/572453 (owner: Legoktm)
[23:51:25] (Merged) jenkins-bot: docker: Set XDG_DATA_HOME in tox image [integration/config] - https://gerrit.wikimedia.org/r/572453 (owner: Legoktm)
[23:52:25] !log rebuilding tox docker image for https://gerrit.wikimedia.org/r/572453
[23:52:27] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[23:57:46] (CR) Krinkle: [C: +2] Use yaml.safe_load to avoid potential RCE [integration/jenkins] - https://gerrit.wikimedia.org/r/570816 (owner: Legoktm)
[23:58:02] (CR) Krinkle: [C: +2] Use pytest instead of deprecated nose [integration/jenkins] - https://gerrit.wikimedia.org/r/570817 (owner: Legoktm)
[23:58:07] (PS1) Legoktm: jjb: Bump tox image to 0.4.3 [integration/config] - https://gerrit.wikimedia.org/r/572454
[23:58:33] (Merged) jenkins-bot: Use yaml.safe_load to avoid potential RCE [integration/jenkins] - https://gerrit.wikimedia.org/r/570816 (owner: Legoktm)
[23:58:36] (CR) Krinkle: [C: +2] Remove cobertura-clover-transform files [integration/jenkins] - https://gerrit.wikimedia.org/r/571225 (owner: Legoktm)
[23:58:47] (Merged) jenkins-bot: Use pytest instead of deprecated nose [integration/jenkins] - https://gerrit.wikimedia.org/r/570817 (owner: Legoktm)
[23:58:49] (CR) jerkins-bot: [V: -1] Simplify configuration with tox-wikimedia [integration/jenkins] - https://gerrit.wikimedia.org/r/570818 (owner: Legoktm)
[23:58:51] (CR) jerkins-bot: [V: -1] Remove cobertura-clover-transform files [integration/jenkins] - https://gerrit.wikimedia.org/r/571225 (owner: Legoktm)
[23:59:24] * Krinkle realises why legoktm was looking at virtualenv vs /nonexistent
[23:59:36] heh yep
[23:59:58] should be fixed in a few minutes...
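
Note on the XDG_DATA_HOME discussion above: a minimal sketch, in Python, of why the lock file ended up under /nonexistent and why exporting XDG_DATA_HOME=/cache moves it. It follows the fallback rule of the XDG Base Directory specification (unset or empty XDG_DATA_HOME means $HOME/.local/share). The function name `xdg_data_home` and the two example environments are hypothetical, chosen to mirror the values quoted in the log; this is not virtualenv's own code, and it skips the spec's rule that relative paths are ignored.

```python
import posixpath


def xdg_data_home(env):
    """Resolve the XDG data directory from an environment mapping.

    Hypothetical helper for illustration only.
    """
    value = env.get("XDG_DATA_HOME", "")
    if value:
        # An explicit, non-empty setting wins.
        return value
    # Unset or empty: fall back to $HOME/.local/share.
    return posixpath.join(env.get("HOME", "/"), ".local", "share")


# Assumed environments: the image default described in the log, and the
# same image once XDG_DATA_HOME=/cache is exported.
before = {"HOME": "/nonexistent"}
after = {"HOME": "/nonexistent", "XDG_DATA_HOME": "/cache"}

print(xdg_data_home(before))  # /nonexistent/.local/share
print(xdg_data_home(after))   # /cache
```

With HOME=/nonexistent and nothing else set, any tool that follows this rule tries to write below a directory that cannot be created, which matches the FileNotFoundError path quoted at 23:15:10.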