[06:50:07] PROBLEM - ORES worker labs on ores.wmflabs.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/ORES
[06:51:41] RECOVERY - ORES worker labs on ores.wmflabs.org is OK: HTTP OK: HTTP/1.1 200 OK - 981 bytes in 8.072 second response time https://wikitech.wikimedia.org/wiki/ORES
[07:23:22] (PS11) Kosta Harlan: Inject Config to ORESService, convert test to unit test [extensions/ORES] - https://gerrit.wikimedia.org/r/510166
[08:18:40] PROBLEM - ORES worker labs on ores.wmflabs.org is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 325 bytes in 0.022 second response time https://wikitech.wikimedia.org/wiki/ORES
[08:19:54] RECOVERY - ORES worker labs on ores.wmflabs.org is OK: HTTP OK: HTTP/1.1 200 OK - 981 bytes in 0.075 second response time https://wikitech.wikimedia.org/wiki/ORES
[08:41:02] ORES, Scoring-platform-team, Technical-Debt: Inject Config to ORESService, convert tests to unit tests - https://phabricator.wikimedia.org/T232440 (kostajh)
[08:41:24] ORES, Scoring-platform-team, Technical-Debt: Inject Config to ORESService, convert tests to unit tests - https://phabricator.wikimedia.org/T232440 (kostajh)
[08:41:29] ORES, Scoring-platform-team, Technical-Debt: Inject Config to ORESService, convert tests to unit tests - https://phabricator.wikimedia.org/T232440 (kostajh) a:kostajh
[08:43:18] (PS12) Kosta Harlan: Inject Config to ORESService, convert test to unit test [extensions/ORES] - https://gerrit.wikimedia.org/r/510166 (https://phabricator.wikimedia.org/T232440)
[14:54:52] PROBLEM - ORES worker labs on ores.wmflabs.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/ORES
[14:55:03] I'm getting further.
[14:55:24] It looks like our uwsgi workers are sitting idle.
[14:55:31] While we're blocking ^
[14:55:51] So I can confirm we are not being overloaded with a bunch of requests.
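The symptom described above (requests hitting the 10-second socket timeout while the uwsgi workers sit idle) can also be watched from the client side by timing repeated requests. What follows is only a minimal sketch using the Python standard library. The names `fetch_url`, `probe` and `summarize`, the injected `fetch` and `clock` parameters, and the 1-second "slow" threshold are all illustrative assumptions, not part of ORES or of the monitoring check.

```python
import time
import urllib.request


def fetch_url(url, timeout):
    """Fetch a URL and return the HTTP status code (raises on error or timeout)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        response.read()
        return response.status


def probe(url, attempts=5, timeout=10.0, fetch=fetch_url, clock=time.monotonic):
    """Time `attempts` sequential requests (hypothetical helper).

    Returns a list of (status, seconds) tuples. status is None when the
    request raised, for example on a socket timeout or a 502 response.
    """
    results = []
    for _ in range(attempts):
        start = clock()
        try:
            status = fetch(url, timeout)
        except OSError:  # URLError, HTTPError and socket timeouts are all OSErrors
            status = None
        results.append((status, clock() - start))
    return results


def summarize(results, slow_after=1.0):
    """Count failed requests and successful requests slower than `slow_after` seconds."""
    failed = sum(1 for status, _ in results if status is None)
    slow = sum(
        1 for status, secs in results if status is not None and secs > slow_after
    )
    return {"total": len(results), "failed": failed, "slow": slow}
```

Something like `summarize(probe("https://ores.wmflabs.org/"))` would then show how many of the requests failed or stalled. Passing `fetch` and `clock` as arguments keeps the timing logic checkable without touching the network.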
[14:57:10] RECOVERY - ORES worker labs on ores.wmflabs.org is OK: HTTP OK: HTTP/1.1 200 OK - 979 bytes in 3.940 second response time https://wikitech.wikimedia.org/wiki/ORES
[15:51:26] ORES, Scoring-platform-team (Current), Puppet: Include git-lfs in ores::base role - https://phabricator.wikimedia.org/T232494 (Halfak)
[15:53:45] ORES, Scoring-platform-team (Current), Puppet: Include git-lfs in ores::base role - https://phabricator.wikimedia.org/T232494 (Halfak) a:Halfak
[15:57:05] ORES, Scoring-platform-team (Current), Patch-For-Review, Puppet: Require git-lfs in ores::base puppet role - https://phabricator.wikimedia.org/T232494 (Halfak)
[15:57:10] ORES, Scoring-platform-team (Current), Patch-For-Review, Puppet: Require git-lfs in ores::base puppet role - https://phabricator.wikimedia.org/T232494 (Halfak) @akosiaris, I noticed that a new worker node I started in labs didn't have git-lfs so I couldn't deploy. I wonder if we had manually ins...
[16:02:49] Jade, ORES, Scoring-platform-team, ApiFeatureUsage, and 24 others: All API help links should use `Special:MyLanguage` - https://phabricator.wikimedia.org/T231269 (JTannerWMF) It appears someone else is working on this. So the #editing-team is moving this to external.
[16:06:59] --Async Standup time --
[16:06:59] Here's my update:
[16:06:59] Y: Reviewed session-orientation PR and ORES feature_injection fix PR. Continued work on Jade CreateAndEndorse api module, almost have writing json to DB (and serialization stuff) sorted out.
[16:07:01] T: Debug Jade schema validation issues related to proposal/base definition and get DB read and writes working. Will also look at docs PR updates.
[16:07:59] Y: Mostly worked on uwsgi issue and docs build CI stuff.
[16:08:38] T: Tons of meetings. Opportunity fund request rewrite. Session-orientation work. Finish documenting the uwsgi issue and increasing the capacity of ORES in labs.
[16:10:44] Also I need to get the ORES paper uploaded to arxiv for sharing out.
[16:18:00] ORES, Scoring-platform-team: Investigate intermittent delay for basic uwsgi requests. - https://phabricator.wikimedia.org/T232228 (Halfak) I got uwsgitop running on ores-web-01. I was able to confirm that, while the timing script is hanging, the majority of workers are idle. It seems that whatever rout...
[16:39:21] PROBLEM - ORES worker labs on ores.wmflabs.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/ORES
[16:40:43] RECOVERY - ORES worker labs on ores.wmflabs.org is OK: HTTP OK: HTTP/1.1 200 OK - 981 bytes in 0.045 second response time https://wikitech.wikimedia.org/wiki/ORES
[16:41:51] ORES, Scoring-platform-team (Current): POST method not allowed for some ORES endpoints - https://phabricator.wikimedia.org/T232500 (Halfak)
[16:41:56] ORES, Scoring-platform-team (Current): POST method not allowed for some ORES endpoints - https://phabricator.wikimedia.org/T232500 (Halfak) a:Halfak https://github.com/wikimedia/ores/pull/332
[16:48:57] ORES, Scoring-platform-team (Current): POST method not allowed for some ORES endpoints - https://phabricator.wikimedia.org/T232500 (Halfak) @ACraze, this should be a quick and easy review. :)
[16:49:14] Sorry to ping early. The build is still in progress
[17:03:29] ORES, Scoring-platform-team: Investigate intermittent delay for basic uwsgi requests. - https://phabricator.wikimedia.org/T232228 (Halfak) From: https://uwsgi-docs.readthedocs.io/en/latest/ThingsToKnow.html > If your (Linux) server seems to have lots of idle workers, but performance is still sub-par, you...
[17:05:57] wikimedia/ores#1377 (post_method - eff8e3b : halfak): The build passed. https://travis-ci.org/wikimedia/ores/builds/583262624
[17:06:08] \o/
[17:06:39] ORES, Scoring-platform-team: Investigate intermittent delay for basic uwsgi requests. - https://phabricator.wikimedia.org/T232228 (Halfak) It seems like that variable is deprecated and we'll need to enable a kernel module to work with the new one. See https://bugs.launchpad.net/swift/+bug/1354909
[17:09:14] ORES, Scoring-platform-team: Investigate intermittent delay for basic uwsgi requests. - https://phabricator.wikimedia.org/T232228 (Halfak) > $ sudo sysctl net.netfilter.nf_conntrack_max > sysctl: cannot stat /proc/sys/net/netfilter/nf_conntrack_max: No such file or directory
[17:16:27] ORES, Scoring-platform-team (Current), Patch-For-Review, Puppet: Require git-lfs in ores::base puppet role - https://phabricator.wikimedia.org/T232494 (akosiaris) We never installed git-lfs via any ores puppet code. It was always done via scap. Relevant commits are c01b8bdd0a3e82fe6aed564dd6060f...
[17:20:48] ORES, Scoring-platform-team (Current): POST method not allowed for some ORES endpoints - https://phabricator.wikimedia.org/T232500 (ACraze) merged!
[19:01:49] ORES, Scoring-platform-team (Current), Patch-For-Review, Puppet: Require git-lfs in ores::base puppet role - https://phabricator.wikimedia.org/T232494 (Halfak) In WMFLabs, we use fabric to do deployments. Hence why I ran into this issue.
[19:02:12] accraze: FYI: I have a few fixes to ORES -- mostly around feature injection. I'd like to get those deployed tomorrow.
[19:03:13] Also, I'm about ready to declare defeat with https://phabricator.wikimedia.org/T232228
[19:03:26] But I made more progress.
[19:03:45] I think I'll need some help at this point.
[19:10:58] Looks like we have a git-lfs mixup in ores/wheels. I'm cleaning it up now.
[19:11:11] But you'll need to do a hard reset on your local copy, accraze
[19:11:24] 5 files were committed directly rather than via LFS
[19:46:51] Arg I'm blocked. But accraze I think you might be able to do this operation for me :)
[19:47:00] Let me know if you are around.
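For the ores/wheels mixup above, one way to tell which files were "committed directly rather than via LFS" is to check whether the stored content is a Git LFS pointer or the raw payload. The sketch below is hedged: it assumes the standard pointer layout (a `version https://git-lfs.github.com/spec/v1` first line, an `oid sha256:` line, a `size` line, and a total size under 1024 bytes). The function names and the file names in the usage note are hypothetical, and the check only makes sense for paths that are supposed to be LFS-tracked.

```python
import re

# Assumed Git LFS pointer layout: version line first, then oid and size lines.
LFS_VERSION_LINE = b"version https://git-lfs.github.com/spec/v1\n"
OID_RE = re.compile(rb"^oid sha256:[0-9a-f]{64}$", re.MULTILINE)
SIZE_RE = re.compile(rb"^size \d+$", re.MULTILINE)
MAX_POINTER_BYTES = 1024  # pointer files are small text files


def is_lfs_pointer(blob: bytes) -> bool:
    """Return True if `blob` looks like a Git LFS pointer file."""
    if len(blob) >= MAX_POINTER_BYTES:
        return False
    return (
        blob.startswith(LFS_VERSION_LINE)
        and OID_RE.search(blob) is not None
        and SIZE_RE.search(blob) is not None
    )


def committed_directly(blobs: dict) -> list:
    """Return the sorted paths whose content is raw data, not an LFS pointer.

    `blobs` maps a path to the bytes stored in the commit for that path.
    """
    return sorted(path for path, data in blobs.items() if not is_lfs_pointer(data))
```

Given a mapping such as `{"a.whl": pointer_bytes, "b.whl": raw_wheel_bytes}`, `committed_directly` would return `["b.whl"]`, which is the kind of file a history rewrite is meant to convert back into a pointer.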
[20:11:56] whoops forgot to set to away
[20:11:59] im back
[20:14:23] what do you need halfak? fix the ores/wheels commits?
[20:15:05] Yeah. If you check out the most recent version of the repo (master) on a machine that has git lfs installed,
[20:15:25] Then run "git lfs migrate import --fixup --everything"
[20:15:45] It's your last two commits that are messed up so I'm hoping it'll let you push up the changes then.
[20:16:12] ahh ok lemme give it a try real quick
[20:17:22] Cool. Thank you :)
[20:22:50] ugh merge conflict, working on sorting it out
[20:25:08] Hmm. Merge conflict? You should have two commits that diverge from origin after running that.
[20:25:12] And then you need to force-push
[20:25:19] accraze, ^
[20:28:54] having trouble force-pushing to gerrit :(
[20:29:28] Aha what does it say?
[20:30:07] error: failed to push some refs to 'ssh://accraze@gerrit.wikimedia.org:29418/research/ores/wheels'
[20:30:45] Probably says something useful higher in the output
[20:31:17] e.g. when I try I get the following 10 lines above it: "remote: error: commit 2d0f95b: email address acraze@wikimedia.org is not registered in your account, and you lack 'forge committer' permission."
[20:31:30] ! [remote rejected] HEAD -> refs/for/master%topic=fix-lfs (change https://gerrit.wikimedia.org/r/#/c/research/ores/wheels/+/527174 closed)
[20:32:00] Did you try force pushing directly to master?
[20:32:00] You have the permission to merge in the repo right?
[20:32:17] Yeah. accraze has merge rights, but I'm not sure if it will let him push.
[20:32:30] merge rights in gerrit iirc are equal to push
[20:32:55] paladox: ^ any idea why accraze is having this issue
[20:33:03] yeah I get the same remote rejected when I force push to master :(
[20:33:15] He is trying to push commits that belong to others?
[20:33:29] he needs forge committer right
[20:33:39] which you can do under the project access settings
[20:35:29] halfak accraze ^
[20:38:03] is this https://gerrit.wikimedia.org/r/#/admin/groups/uuid-cddbf2315647ba438f5741826fffaeedfdcdfe8a group hidden?
[20:38:15] stupid question for me to ask, yes it is.
[20:46:58] If we had the full error message, we'd know :P
[20:47:00] accraze, ^
[20:47:34] just have him send over everything that happens after git push
[20:47:35] :P
[20:48:16] halfak the thing you pasted is all i needed
[20:48:26] "and you lack 'forge committer' permission."
[20:48:41] I don't get any of that
[20:48:53] accraze what's your error message (full)?
[20:49:23] https://phabricator.wikimedia.org/P9077
[20:49:43] paladox, that's my error. I want to know Andy's error.
[20:49:44] aha!
[20:49:47] you're force merging
[20:50:23] right
[20:50:41] It's only two commits that get overwritten and both are Andy's so I'm hoping he can force push.
[20:51:18] halfak ok, can you unhide https://gerrit.wikimedia.org/r/#/admin/groups/uuid-cddbf2315647ba438f5741826fffaeedfdcdfe8a please? Or does it need to stay hidden?
[20:53:32] paladox, not sure what you mean by "unhide"
[20:54:06] I see an option to "Make group visible to all registered users"
[20:54:10] Is that what you're asking for?
[20:54:18] yup
[21:00:51] {{done}}
[21:00:52] You rule, halfak!
[22:10:20] (PS1) Paladox: Modify access rules [research/ores/wheels] (refs/meta/config) - https://gerrit.wikimedia.org/r/535721
[22:10:26] accraze halfak ^
[22:10:39] I completely forgot to check back on this channel, sorry