[15:52:49] Revision-Scoring-As-A-Service, Wikilabels: Fix PageAsOfRevision view - https://phabricator.wikimedia.org/T137320#2364766 (Halfak)
[15:52:53] Revision-Scoring-As-A-Service, Wikilabels: Fix PageAsOfRevision view - https://phabricator.wikimedia.org/T137320#2364779 (Halfak) https://github.com/wiki-ai/wikilabels/pull/128
[16:01:58] halfak: o/
[16:02:04] o/ Amir1
[16:02:18] Fixed a minor bug in wikilabels. I'm just about to try a deployment
[16:02:24] awesome
[16:02:34] tell me if I can help :)
[16:02:56] the jenkins is down, otherwise I would do much more. I hope it's up now
[16:02:59] I think it's good. I ran some tests with our new data loading scripts too. They are nice :)
[16:03:07] yay
[16:10:22] nice :D
[16:26:59] Going to staging with wikilabels
[16:33:23] Revision-Scoring-As-A-Service-Backlog, rsaas-editquality: hewiki "reverted" model weights strongly against anons - https://phabricator.wikimedia.org/T118982#2364938 (Halfak) Forgot to note that this was deployed a long time ago. I'm going to resolve this for now. Please re-open if it is still an issue.
[16:33:32] Revision-Scoring-As-A-Service, rsaas-editquality: hewiki "reverted" model weights strongly against anons - https://phabricator.wikimedia.org/T118982#2364940 (Halfak)
[16:33:51] Revision-Scoring-As-A-Service, Research-and-Data-Backlog, rsaas-editquality, Epic, RD-2016Q3: [Epic] Explore disparate impacts of damage detection and goodfaith prediction on anons and newcomers.
- https://phabricator.wikimedia.org/T120138#2364944 (Halfak)
[16:33:53] Revision-Scoring-As-A-Service, rsaas-editquality: hewiki "reverted" model weights strongly against anons - https://phabricator.wikimedia.org/T118982#1814538 (Halfak) Open>Resolved
[16:46:20] Revision-Scoring-As-A-Service: Deploy article topic campaign for WikiEd - https://phabricator.wikimedia.org/T137325#2364984 (Halfak)
[16:46:28] Revision-Scoring-As-A-Service, Wikilabels: Deploy article topic campaign for WikiEd - https://phabricator.wikimedia.org/T137325#2364997 (Halfak)
[16:46:46] Revision-Scoring-As-A-Service, Wikilabels: Deploy article topic campaign for WikiEd - https://phabricator.wikimedia.org/T137325#2364984 (Halfak) https://github.com/wiki-ai/wikilabels-wikimedia-config/pull/26
[16:51:34] Revision-Scoring-As-A-Service, Wikilabels: Deploy article topic campaign for WikiEd - https://phabricator.wikimedia.org/T137325#2365027 (Halfak) https://meta.wikimedia.org/wiki/Research_talk:Revision_scoring_as_a_service/Work_log/2016-06-08
[16:51:54] Revision-Scoring-As-A-Service, Wikilabels: Deploy article topic campaign for WikiEd - https://phabricator.wikimedia.org/T137325#2365028 (Halfak) a:Halfak
[16:52:03] Revision-Scoring-As-A-Service, Wikilabels: Fix PageAsOfRevision view - https://phabricator.wikimedia.org/T137320#2365029 (Halfak) a:Halfak
[20:42:54] Hey Amir1, what's this about race condition in ORES?
[20:43:33] halfak: hey, the issue I was talking about wrt puppet and scap
[20:43:47] and we have to change the config path
[20:44:01] (do you remember talking about symlink, etc.)
[20:44:08] Oh yes.
[20:44:10] Thanks
[20:44:20] * halfak is hacking on load testing stuff.
[20:45:18] awesome, I fix these stuff
[20:58:00] Amir1, got time for https://github.com/wiki-ai/ores/pull/146 ?
[21:17:12] halfak: Is ores.wikimedia.org ready for use?
[21:17:18] I see that it still loads content from wmflabs
[21:17:24]
[21:17:31] May wanna keep a copy of that in the repo instead
[21:18:38] Krinkle: that's the UI
[21:18:56] the API is fully based in prod
[21:18:56] (They're separate questions)
[21:19:01] OK
[21:19:16] So I can change tools to use wm.o instead of wmflabs.org?
[21:19:33] We should fix that though, UI or not, it's in prod and our terms of use don't allow it.
[21:19:41] Krinkle: about the question. It's ready and it's working but right now we are testing performance and stuff like that
[21:19:44] before moving on
[21:19:49] Okay
[21:20:22] about fixing the UI, you're right I make a phab card for it
[21:21:04] thx
[21:21:10] I merged the core patch just now
[21:22:13] Revision-Scoring-As-A-Service, ORES: Move labs parts in GUI to its own loader - https://phabricator.wikimedia.org/T137362#2366069 (Ladsgroup)
[21:22:22] ^
[21:22:26] thanks Krinkle
[21:22:51] yessss
[21:23:00] I need to rebase another patch that uses this
[21:25:59] Krinkle, we'll make a big announcement for tool devs to switch :)
[21:26:09] I'm working on load testing scripts now.
[21:27:07] okay :)
[21:27:41] FYI: https://github.com/halfak/load-test-ores
[21:27:42] :)
[21:28:57] Revision-Scoring-As-A-Service: Build load testing scripts for ores.wikimedia.org - https://phabricator.wikimedia.org/T137131#2366102 (Halfak) https://github.com/halfak/load-test-ores This repo has implementations for large-scale data analysis (request_score_batch.py) and ScoredRevisions-like behavior (score...
[21:29:44] * halfak waits for a big query to finish so he can test the batch pattern.
[21:29:44] https://quarry.wmflabs.org/query/10313
[21:33:47] Revision-Scoring-As-A-Service: Build load testing scripts for ores.wikimedia.org - https://phabricator.wikimedia.org/T137131#2366115 (Halfak) This query will get a random sample of revids to use in batching: https://quarry.wmflabs.org/query/10313
[22:06:39] Revision-Scoring-As-A-Service, MediaWiki-Special-pages, MediaWiki-extensions-ORES, MW-1.28-release-notes, and 2 others: Templatize Special:Contributions lines and add proper hooks - https://phabricator.wikimedia.org/T122537#2366207 (Ladsgroup) We need this merged too: https://gerrit.wikimedia.org...
[22:07:05] Yay! Revids
[22:09:19] halfak: \o/
[22:09:23] do you have results?
[22:09:34] Not yet.
[22:09:59] We need to set up a dashboard still
[22:10:07] yeah
[22:10:08] I'm still just testing the testing scripts
[22:10:34] I can do it but first, Are you okay with renaming https://grafana.wikimedia.org/dashboard/db/ores to https://grafana.wikimedia.org/dashboard/db/ores-labs?
[22:11:33] Yeah. looking at that now
[22:12:32] https://grafana-admin.wikimedia.org/dashboard/db/ores-labs
[22:12:33] Done
[22:12:39] \o/
[22:12:56] We need to setup a precaching node for prod too
[22:13:27] I want to put it in a node in ores-staging
[22:13:39] would it be okay for you?
[22:13:55] No.
[22:13:59] Revision-Scoring-As-A-Service, ORES: Load test ores.wikimedia.org - https://phabricator.wikimedia.org/T137365#2366213 (Halfak)
[22:13:59] I don't think that is a good idea
[22:14:22] Revision-Scoring-As-A-Service: Build load testing scripts for ores.wikimedia.org - https://phabricator.wikimedia.org/T137131#2366229 (Halfak)
[22:14:24] Revision-Scoring-As-A-Service, ORES: Load test ores.wikimedia.org - https://phabricator.wikimedia.org/T137365#2366228 (Halfak)
[22:15:01] We want the precacher to run as close as possible to the web node itself.
[22:15:05] halfak: so where do you have in mind?
[22:15:34] Revision-Scoring-As-A-Service, ORES: Create grafana dashboard for ores.wikimedia.org - https://phabricator.wikimedia.org/T137367#2366245 (Halfak)
[22:15:42] On the same machine that runs the web node.
[22:15:47] in prod we have only two nodes (in eqiad and two backup nodes in codfw) which both are web and worker
[22:15:56] Let's pick one.
[22:16:03] and they are a shared node with citoid, graphoid, etc.
[22:16:13] Shared node :/
[22:16:28] That's going to be fun when we reach capacity periodically
[22:16:38] I thought we had our own hardware
[22:16:54] our redis dbs are dedicated
[22:16:59] but other things no
[22:17:05] Revision-Scoring-As-A-Service, ORES: Load test ores.wikimedia.org - https://phabricator.wikimedia.org/T137365#2366259 (Halfak)
[22:17:07] Revision-Scoring-As-A-Service, ORES: Create grafana dashboard for ores.wikimedia.org - https://phabricator.wikimedia.org/T137367#2366260 (Halfak)
[22:17:22] redis dbs are hardware bought only for us
[22:17:37] Well... that's not how things were explained to me. Would be nice to talk to akosiaris about how this all makes sense.
[22:17:50] I suppose we can set an upper bound on load.
[22:18:05] Are we split across VMs or something?
[22:18:10] To manage resource usage?
[22:18:33] I'm not sure
[22:18:39] let me take a look
[22:18:54] but fwiw our nodes are scb1001 and scb1002
[22:19:41] OK. I'm declaring victory on the load balancer scripts. They seem to work pretty well.
[22:20:25] \o/
[22:22:23] Looks like I'm getting a password prompt from scb1001
[22:22:26] Are you able to log in?
[22:22:33] yup
[22:22:38] I did several times
[22:23:17] Hmm... So something is weird.
[22:23:18] Checking
[22:23:29] I just logged in again
[22:25:01] nevermind. Looks like I had weird key rules in my ssh config
[22:25:03] got it.
[22:25:20] Looks like nodejs and celery are competing for CPU
[22:25:24] okay
[22:25:54] 24 CPUs
[22:25:57] several services live there, cxserver, mobile apps
[22:26:09] EventBus
[22:26:24] So, if we go crazy and accidentally use all the CPU, that will crash all these other services?
[22:27:19] I'm not an operational expert but I think so
[22:27:23] Anyway, it seems that scb1001 is a good place to put this.
[22:27:40] that's a little bit hard to implement
[22:27:42] What do you think?
[22:27:51] things in prod are a little bit different
[22:28:07] How do we add a role to this machine?
[22:28:40] I'm not sure but I think it should be done in operations/puppet repo
[22:28:45] (like hiera)
[22:28:54] let me check
[22:30:40] halfak: "manifests/role/scb.pp"
[22:30:52] that's how they do it
[22:30:56] Oh darn. We can't just run this on one of the two?
[22:31:05] oh no
[22:31:21] and even in order to add a role we need to do a lot
[22:32:15] halfak: also L3 says we can't run anything directly in prod machines
[22:32:41] (except probably playgrounds)
[22:32:51] Yeah. I don't think we should ever do that (except for, you know, 'cd' and 'ls' and 'top')
[22:33:21] :D
[22:33:45] So...
[22:33:48] This is funny.
[22:33:54] halfak: here's my idea, let's have it in labs for now but move it to prod asap
[22:34:04] in two or three weeks at the most
[22:34:06] I really don't think that is a good idea.
[22:34:15] OK for performance testing, but not OK for a production service
[22:34:54] yeah, I understand
[22:35:00] o/ yuvipanda, we want to run our precaching role on one of the two scb nodes, but it looks like they are purposefully configured exactly the same. https://github.com/wikimedia/operations-puppet/blob/production/manifests/role/scb.pp
[22:35:02] What do you think?
[22:35:09] I think we should talk to ops and even security
[22:35:41] yeah :)
[22:36:42] what about being in prod but not in scb? (just thinking out loud)
[22:37:45] Yeah.
doesn't seem crazy
[22:39:46] looking at this: https://wikitech.wikimedia.org/wiki/Infrastructure_naming_conventions I can't find a place to put the precaching :(
[22:48:07] Revision-Scoring-As-A-Service, ORES: Deploy ores precaching in production - https://phabricator.wikimedia.org/T137370#2366332 (Halfak)
[22:48:08] https://phabricator.wikimedia.org/T137370
[22:48:14] For capturing the conversation.
[22:49:34] Revision-Scoring-As-A-Service, ORES: Deploy ores precaching in production - https://phabricator.wikimedia.org/T137370#2366363 (Halfak) Optimally, I'd like to run the service on one of the scb100[12] nodes. But if that isn't possible we could potentially run the service on another node somewhere prod-lik...
[22:49:34] Buffer 12 is empty.
[22:49:41] AsimovBot, wat
[22:49:41] Error: Command “wat” not recognized. Please review and correct what you’ve written.
[22:50:07] Gotta go do some house chores.
[22:50:10] ttyl
[22:50:11] o/
[22:50:16] o/
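
The "batch pattern" mentioned in the log (a random sample of revids, requested in batches) could look roughly like the sketch below. This is not the code from load-test-ores: the function names `batches` and `build_url`, the `/scores/<wiki>/` URL layout, the `models` and `revids` parameter names, and the `ores.example` host are all assumptions made for illustration. The sketch only builds request URLs and sends nothing.

```python
from itertools import islice
from urllib.parse import urlencode


def batches(rev_ids, size):
    """Yield successive lists of at most `size` revision IDs."""
    iterator = iter(rev_ids)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def build_url(base_url, wiki, model, batch):
    """Build one request URL for a batch of revision IDs.

    The path layout and parameter names are hypothetical, chosen
    only to illustrate the batching idea.
    """
    query = urlencode({"models": model, "revids": "|".join(map(str, batch))})
    return f"{base_url}/scores/{wiki}/?{query}"


if __name__ == "__main__":
    sample = range(1, 8)  # stand-in for a sampled list of revids
    for batch in batches(sample, 3):
        print(build_url("https://ores.example", "enwiki", "damaging", batch))
```

Keeping the batch size as a parameter makes it easy to vary the per-request load, which fits the concern raised in the log about setting an upper bound on load for shared nodes.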