[00:12:28] Scoring-platform-team, ORES, editquality-modeling, artificial-intelligence: Review training set to check strange examples of labels - https://phabricator.wikimedia.org/T171497#3468626 (Natalia) Here are some results of reviewing labels Damaging/Goodfaith and Damaging/Badfaith for enwiki and ruwik...
[07:08:28] Scoring-platform-team, MediaWiki-extensions-ORES, Collaboration-Team-Triage (Collab-Team-Q1-Jul-Sep-2017), Patch-For-Review: Summarize what it will take to separate UI and infrastructure for ORES Extension - https://phabricator.wikimedia.org/T167908#3469009 (awight)
[07:14:57] (CR) Awight: [C: -1] "I want to merge this as is, but the requirements have changed slightly (sorry!). See my change to the bug description... Now I'm thinkin" [extensions/ORES] - https://gerrit.wikimedia.org/r/359364 (https://phabricator.wikimedia.org/T167908) (owner: Catrope)
[08:16:45] (CR) Ladsgroup: [C: 1] "It looks okay to me with valid use cases but I'm a little bit worried about the storage problems it might cause, in prod we use InnoDB as " [extensions/ORES] - https://gerrit.wikimedia.org/r/367449 (owner: Catrope)
[13:37:25] PROBLEM - https://grafana.wikimedia.org/dashboard/db/ores grafana alert on einsteinium is CRITICAL: CRITICAL: https://grafana.wikimedia.org/dashboard/db/ores is alerting: 5xx rate (Change prop) alert.
[13:40:33] * halfak looks
[13:41:52] Looks like we're fine.
[13:42:14] But holy moley! there was a big batch job that hit us yesterday.
[13:52:35] RECOVERY - https://grafana.wikimedia.org/dashboard/db/ores grafana alert on einsteinium is OK: OK: https://grafana.wikimedia.org/dashboard/db/ores is not alerting.
[14:18:28] Scoring-platform-team, ORES: Set up larger ores-compute instance - https://phabricator.wikimedia.org/T169809#3470179 (Andrew)
[14:18:33] Scoring-platform-team, ORES, Cloud-VPS (Quota-requests), User-bd808: Request increase quota for ores-staging to 52GB RAM - https://phabricator.wikimedia.org/T169811#3470177 (Andrew) Open>Resolved a:Andrew
[14:42:20] OMG we planned a deployment for today.
[14:42:24] I don't see any pull requests.
[14:42:27] I'm going to go check on that.
[15:59:39] o/ Amir1
[15:59:45] still want to do that deployment today?
[15:59:59] Since we announced it, I think so
[16:00:48] Any PR's or patchsets?
[16:00:58] If not, I'll get that out of the way ASAP
[16:06:26] halfak: I couldn't get it out of the door because I had to clone lots of things into misc and took forever
[16:08:57] Amir1, gotcha. I'll work on it.
[16:13:16] Amir1, can you double check https://ores-staging.wmflabs.org/
[16:14:02] Is that a production or labs alert?
[16:14:08] ?
[16:14:24] awight-shoddy, which alert
[16:14:42] Grafana 5xx
[16:15:02] I'm on a phone and busy deleting forensics. It recovered already?
[16:15:31] Oh yeah. That fired and recovered a few hours ago. :)
[16:15:58] I was thinking that I want to know what is happening. We shouldn't have ChangeProp getting 500s at all.
[16:17:18] Nvm, this was a blip hours ago
[16:17:36] Good to know grafana layout work ok
[16:17:48] Works on the old phone
[16:17:53] :D
[16:21:33] halfak: my laptop died, I'm back now
[16:21:34] sorry
[16:21:44] Hey! Looks like the staging deploy is OK
[16:21:51] I'm working on patchset for prod.
[16:24:14] I didn't even know there was a deployment today. Fwiw I'll be situated in 35 minutes
[16:24:46] * awight-shoddy_ revels in continuity between nick, actual cell service
[16:25:00] , and care of duty
[16:25:14] halfak: this needs mediawiki deployment too
[16:25:27] Remind me to get it through swat too
[16:25:28] Amir1, oh? Why is that?
[16:25:30] Amir1, https://gerrit.wikimedia.org/r/367695
[16:25:37] the thresholds have changed
[16:25:50] awight-shoddy_, it's on your calendar :P
[16:25:58] not much but it still needs some adjustments for ores review tool
[16:26:17] Amir1, oh yeah! Gotcha.
[16:26:40] * awight-shoddy_ squints between fingers
[16:39:32] halfak: so, I changed False to True, then rebuilt and retrained the whole thing (make enwiki_models, make enwiki_training_reports) but enwiki_goodfaith.md failed..
[16:40:14] fajne, great progress. Let me know what that error is. I'll need to be AFK soon but I'll look at it while I eat lunch
[16:46:26] OK Beta deploy time.
[16:46:38] Amir1, do you have the facilities to check on our beta deploy not 'sploding anything?
[16:47:10] nope, except the outside endpoint (ores-beta.wmflabs.org)
[16:47:17] I need to go
[16:47:19] be back in a sec
[16:47:31] halfak: Makefile:463: recipe for target 'tuning_reports/enwiki.goodfaith.md' failed make: *** [tuning_reports/enwiki.goodfaith.md] Error 1 make: *** Deleting file 'tuning_reports/enwiki.goodfaith.md'
[16:54:01] fajne, that's just makefile output. I need the error that caused it.
[16:56:10] Amir1, ores-beta deploy is complete.
[16:56:36] Looking at https://grafana.wikimedia.org/dashboard/db/ores-extension?orgId=1
[16:56:47] Nothing looks bad.
[16:56:58] Woops. https://grafana-labs.wikimedia.org/dashboard/db/ores-extension?orgId=1
[16:57:04] Also nothing bad. Generally nothing though
[17:00:49] halfak: yep.. looking for it
[17:01:03] halfak: Can I do deployey things
[17:02:07] Scoring-platform-team-Backlog, ORES, Release-Engineering-Team, Scap: ORES should use git-fat for wheel deployments - https://phabricator.wikimedia.org/T171619#3470967 (demon)
[17:04:19] Scoring-platform-team-Backlog, ORES, Release-Engineering-Team, Scap: ORES should use git-fat for wheel deployments - https://phabricator.wikimedia.org/T171619#3470987 (demon)
[17:20:13] halfak: what was the service that let you generate a link for a text you copy paste in?
[17:20:26] pastebin.ca
[17:20:30] ^ my fav
[17:22:39] hm.. another one?
[17:22:53] (this gives me an error and sorry)
[17:23:51] halfak: oh, it finally worked https://pastebin.ca/3846508
[17:24:43] fajne, that's a warning.
[17:24:44] It's OK
[17:25:13] that's a ValueError
[17:25:34] Right
[17:25:45] also, could it be that a test set contains the same kind of bug as the training set did?
[17:25:55] Scoring-platform-team-Backlog, ORES, Release-Engineering-Team, Scap: ORES should use git-fat for wheel deployments - https://phabricator.wikimedia.org/T171619#3470967 (bd808) #striker could use this too. It has the same sort of wheel blob repo as ORES.
[17:27:53] halfak: I got 6 exceptions like this and no more errors
[17:28:23] Those don't cause the process to fail
[17:28:25] they are expected.
[17:28:36] it built the tuning report for damaging model and reverted one but failed for goodfaith
[17:28:46] Right. What error caused it to fail?
[17:28:50] Scoring-platform-team, Release-Engineering-Team, editquality-modeling, User-Ladsgroup, artificial-intelligence: Split editquality repo to two repos, one with full history, one shallow - https://phabricator.wikimedia.org/T170967#3471090 (awight)
[17:28:51] It should be the last one listed.
[17:29:08] I gave you the very last line in the beginning
[17:29:29] When you’re all less busy… Anyone know what caused the 13:30-13:40 glitch?
[17:30:44] halfak, https://pastebin.ca/3846518
[17:31:33] Looks like tune is not parameterized right
[17:31:49] It's missing a line
[17:31:53] For