[10:41:35] o/ The git LFS in GitHub doesn't work
[10:41:44] it's driving me crazy
[11:48:53] It seems I didn't fully rewrite the history, that's the reason
[12:19:52] wikimedia/editquality#382 (simplewiki - 17ac8bd : Aaron Halfaker): The build has errored. https://travis-ci.org/wikimedia/editquality/builds/425235046
[13:25:11] So it was due to branches using the old history; basically the whole thing was useless, no it's waaay faster
[13:25:17] *now
[13:42:33] and now tested pool counter in labs
[13:43:27] at first I thought the lock time is around 1 second, that scared the "sugar honey ice tea" out of me, but it's actually 1 millisecond :D
[13:43:30] that's the median
[13:47:26] The 99th percentile is 1.7 ms
[13:47:29] that's awesome
[13:48:05] o/
[13:48:07] data is here: https://grafana-labs.wikimedia.org/dashboard/db/ores-labs?orgId=1&from=now-3h&to=now
[13:48:10] halfak: hey
[13:49:12] nah, the average of the 99th percentile is 5 ms
[13:49:15] still fine :D
[13:49:50] why are the maxes and averages all the same across percentiles/
[13:49:53] *?
[13:50:28] halfak: do you mean about the staging node?
[13:50:37] because it doesn't have any requests right now
[13:50:37] Oh. yeah I guess so.
[13:50:50] (I'm hammering the labs node)
[13:51:32] I put the median of the response time there just to compare
[13:54:04] I think the overhead is completely negligible
[14:01:56] Seems fair. Is there a graph of overall response time?
[14:02:03] stopped the hammering script
[14:02:03] Ultimately, that's what we care about.
[14:02:11] halfak: yup, it's at the end
[14:02:33] "Median response time"
[14:03:23] We should have percentiles there. Also, I don't think the per-machine breakdown is useful.
[14:03:27] Just seems noisy.
[14:03:45] yeah, let me fix that
[14:03:53] I'm trying to make edits but my wikitech credentials don't seem to work on grafana-labs
[14:03:54] hmm
[14:07:50] halfak: you need to go to grafana-labs-admin
[14:07:59] it still has two systems
[14:08:02] WTF
[14:08:30] I know :D
[14:23:05] (PS1) Ladsgroup: Update ores submodule from phabricator [services/ores/deploy] - https://gerrit.wikimedia.org/r/458513 (https://phabricator.wikimedia.org/T203246)
[14:23:29] halfak: Can you check this ^
[14:23:40] https://grafana-labs-admin.wikimedia.org/dashboard/db/ores-labs?orgId=1&from=now-3h&to=now&panelId=14&fullscreen
[14:23:45] Amir1, check this ^
[14:24:26] We originally switched away from phab for submodules because the large file downloads (models) were making phab cry.
[14:24:38] Maybe that is fixed with LFS
[14:24:50] the ores repo itself is not LFS
[14:25:06] and should not be, we don't store any large files in it
[14:25:24] halfak: the graphs look nice
[14:26:14] Oh. But you are talking about the deploy repo
[14:26:22] It's submodules have LFS.
[14:26:25] *its
[14:26:27] the shame
[14:27:05] I'm just changing the pointer to the ores repo in the deploy repo
[14:27:19] oh. Just one of the submodules?
[14:27:26] Why do the other submodules continue to work?
[14:31:12] the other submodules are not being updated yet. I'm working on them too
[14:31:17] Amir, I see jumps in response timing in this graph.
[14:31:25] where?
[14:32:12] halfak: I'm not sending requests to labs, so the number of requests is around 2 per min right now
[14:50:35] Right. So I guess what I'm trying to work out is whether, if this caused an increase in overall response time, it would be visible in this graph.
[14:50:54] I think you need to hammer labs without a poolcounter lock and then hammer it again with a poolcounter lock.
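The hammering script mentioned above isn't included in the log. The following is a minimal sketch of what such a load script might look like, assuming the labs endpoint https://ores.wmflabs.org and the v3 scores API; the model name and revision ID are placeholders rather than values from the real script. Running it once against a PoolCounter-enabled deployment and once without it gives the latency medians and percentiles being compared on the dashboard:

```python
# Hypothetical hammering sketch: fire repeated scoring requests at the labs
# ORES instance and report latency percentiles. The endpoint, model, and
# revid below are assumptions; swap in whatever the real test used.
import time
import statistics
import requests

URL = "https://ores.wmflabs.org/v3/scores/enwiki/"      # assumed labs endpoint
PARAMS = {"models": "damaging", "revids": "855111111"}  # placeholder revision

def hammer(n=500):
    latencies_ms = []
    for _ in range(n):
        start = time.monotonic()
        requests.get(URL, params=PARAMS, timeout=30)
        latencies_ms.append((time.monotonic() - start) * 1000)
    latencies_ms.sort()
    return {
        "median_ms": statistics.median(latencies_ms),
        "p99_ms": latencies_ms[int(0.99 * len(latencies_ms)) - 1],
    }

if __name__ == "__main__":
    print(hammer())
```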
[14:51:12] Or maybe hammer it with precache and then hammer it without
[14:51:28] Amir1, ^
[14:51:44] hmm, hammering with precache is easier (no need to deploy anything)
[14:52:04] right now it was hammering with the poolcounter
[15:37:20] halfak: hammering is done
[15:37:46] the median and other things are different because we have other requests coming in
[15:37:52] but not for precache
[15:38:09] Hmm. I see a steadily rising latency
[15:38:16] https://grafana-labs-admin.wikimedia.org/dashboard/db/ores-labs?orgId=1&from=now-3h&to=now&panelId=14&fullscreen
[15:38:41] because it's a moving average
[15:39:44] Hmm.
[15:39:47] * halfak fixes
[15:41:16] one thing we can do is to hammer the staging node
[15:41:33] the staging node doesn't have any requests coming to it
[15:41:43] https://grafana-labs-admin.wikimedia.org/dashboard/db/ores-labs?orgId=1&from=now-3h&to=now&panelId=14&fullscreen
[15:42:11] So which hammering was done with precache and which was done directly?
[15:43:35] for hammering precache it goes to another metric
[15:43:46] it's https://grafana-labs-admin.wikimedia.org/dashboard/db/ores-labs?orgId=1&from=now-3h&to=now&panelId=15&fullscreen
[15:51:02] afk for lunch
[16:00:29] afk for the WMF allhands meeting thingie
[16:29:02] back now
[16:29:06] Scoring-platform-team, ORES, Release-Engineering-Team (Kanban): Create gerrit mirrors for all github-based ORES repos - https://phabricator.wikimedia.org/T192042 (Ladsgroup) The phabricator repos themselves are broken: ``` amsa@C235:~$ git clone http://phabricator.wikimedia.org/source/editquality.git...
[17:04:27] I need a little bit of a break
[17:04:30] will be back soon
[17:11:17] Scoring-platform-team (Current), ORES, User-Ladsgroup: Implement PoolCounter support in ORES - https://phabricator.wikimedia.org/T201823 (akosiaris)
[17:11:21] Scoring-platform-team (Current), ORES, Patch-For-Review, User-Ladsgroup: Use poolcounter to limit number of connections to ores uwsgi - https://phabricator.wikimedia.org/T160692 (akosiaris)
[17:11:28] Scoring-platform-team, ORES, Operations, vm-requests, Patch-For-Review: Site: 4 VM request for ORES poolcounter - https://phabricator.wikimedia.org/T203465 (akosiaris) Open→Resolved @Ladsgroup Hosts in both DCs up and running!
[17:26:56] back now
[17:27:48] Scoring-platform-team (Current), ORES, Operations, Patch-For-Review: Spin up a new poolcounter node for ores - https://phabricator.wikimedia.org/T201824 (Ladsgroup) Thank you @akosiaris
[17:36:10] back from meetings.
[17:36:43] But I'm thinking that I might need to step away for a bit.
[17:36:54] Will email the internal list in a minute :)
[17:40:37] I need to relocate
[17:40:44] be back soon
[22:08:22] (PS1) Catrope: SpecialORESModels: Remove workaround for Mustache parent scope bug [extensions/ORES] - https://gerrit.wikimedia.org/r/458602
[22:13:59] Scoring-platform-team (Current), MediaWiki-extensions-ORES, ORES-Support-Checklist, Patch-For-Review, User-Ladsgroup: Change mentions of wp10 to articlequality in products - https://phabricator.wikimedia.org/T203080 (Catrope) I see, your ORES patch adds articlequality but does not remove wp10...
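On the "steadily rising latency" noted at 15:38 above: a long moving average smears a step change (such as the hammering run starting or stopping) into a slow ramp, which is why per-window percentiles are the more useful thing to graph. A toy illustration with made-up numbers, not data from the grafana-labs dashboard:

```python
# Toy illustration (fabricated numbers, not dashboard data) of how a long
# moving average turns a step change in latency into a slow, steady rise.
import statistics
from collections import deque

# Simulated per-minute median latencies: ~45 ms before the hammering run,
# ~55 ms once the PoolCounter-locked hammering starts.
samples = [45.0] * 30 + [55.0] * 30

window = deque(maxlen=15)  # a 15-minute moving-average window
for minute, value in enumerate(samples):
    window.append(value)
    moving_avg = statistics.mean(window)
    # The raw series jumps once at minute 30; the moving average keeps
    # climbing for another 15 minutes and reads as "steadily rising".
    print(f"min {minute:2d}: raw={value:.0f} ms  moving_avg={moving_avg:.1f} ms")
```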
[22:16:05] (CR) Ladsgroup: [C: +2] SpecialORESModels: Remove workaround for Mustache parent scope bug [extensions/ORES] - https://gerrit.wikimedia.org/r/458602 (owner: Catrope)
[22:38:09] (Merged) jenkins-bot: SpecialORESModels: Remove workaround for Mustache parent scope bug [extensions/ORES] - https://gerrit.wikimedia.org/r/458602 (owner: Catrope)
[22:40:24] (CR) jenkins-bot: SpecialORESModels: Remove workaround for Mustache parent scope bug [extensions/ORES] - https://gerrit.wikimedia.org/r/458602 (owner: Catrope)