[06:31:14] Analytics, Analytics-Kanban, EventBus, Operations, and 5 others: EventStreams accumulates too much memory on SCB nodes in CODFW - https://phabricator.wikimedia.org/T199813 (elukey) For posterity, it is easy to spot when Marko deployed and the effects of the new code: {F24087167} {F24087168}
[07:01:48] there is indeed a visible effect --^
[07:03:52] yeah :D
[07:04:11] I am still not sure why https://grafana.wikimedia.org/dashboard/db/eventstreams?refresh=1m&orgId=1&from=now-2d&to=now&var-stream=All&var-topic=All&var-scb_host=All&panelId=1&fullscreen is happening though
[07:05:00] hm
[07:05:14] This one is impressive in a less positive way
[07:05:28] ah maybe it was the wrong metric, "mean" probably is not the right one
[07:05:49] count looks better
[07:06:59] elukey: It's interesting that mean gives a cumulative!
[07:07:19] mmm wait count IIRC was related to the number of datapoints in graphite
[07:07:24] or not?
[07:07:53] restored mean, will talk with marko about this
[07:07:54] weird
[07:09:37] bon jour joal! :)
[07:09:50] Bonjour à toi ausi elukey :)
[07:10:09] Ca va ce matin?
[07:15:29] sa va!
[07:15:43] (or sa va bien?)
[07:15:58] both are good :)
[07:16:19] but you'd need a "ç" for ça :)
[07:16:34] (terrible french for writing)
[07:25:21] ack! :D
[07:25:51] joal: as FYI I deleted the /user dirs that belonged to non-existent users and were also empty (or containing only .staging)
[07:25:55] to remove garbage
[07:26:07] I am re-running the script now to see what's left
[07:26:19] elukey: good for me - have you checked with research about ellery data?
[07:26:48] not yet, I want to contact them when I have a list of users to check
[07:26:54] I hope to do it later in the day
[07:27:05] \o/ :)
[07:31:38] elukey: I have a question for you
[07:32:01] elukey: do we have a place where SQL requests to analytics-store are logged/dumped?
[07:42:40] yes but only the ones taking more than 300s
[07:42:50] hm
[07:43:20] usually we don't log anything on the dbs
[07:43:34] we put that in place to analyse the query patterns
[07:43:56] I recall elukey
[07:44:25] My idea was to try to replicate MySQL queries on event-logging + wiki_dbs with presto :)
[07:45:17] but we don't have el data on analytics-store no?
[07:49:37] anyhow, a while ago I sent a list of queries landing on analytics-store (the big ones taking more than 5 mins)
[07:49:43] to analytics-internal
[07:56:27] I'll use that one elukey, thanks :)
[08:11:32] Analytics, Discovery-Search (Current work), Patch-For-Review: Create kafka topic for mjolinr bulk daemon and decide on cluster - https://phabricator.wikimedia.org/T200215 (elukey) I don't have anything against this but please also sync with @mobrovac and ops :)
[08:12:40] joal: bc for a round of deletes? (if you have time)
[08:13:20] yessir
[08:58:30] Analytics, Pageviews-API, User-Elukey: Improve user management for AQS - https://phabricator.wikimedia.org/T142073 (elukey)
[08:59:20] Analytics, User-Elukey: Secure hue and other private data access sites with 2FA - https://phabricator.wikimedia.org/T159584 (elukey)
[10:41:31] joal: email sent to Leila and Dario :)
[10:41:38] and also one to Tilman
[10:41:56] I'll wait for their green light before deleting anything
[11:15:22] Analytics, Analytics-Kanban, EventBus, Operations, and 5 others: EventStreams accumulates too much memory on SCB nodes in CODFW - https://phabricator.wikimedia.org/T199813 (akosiaris) Wow, nice work!
[11:37:26] Analytics, Analytics-Kanban, EventBus, Operations, and 5 others: EventStreams accumulates too much memory on SCB nodes in CODFW - https://phabricator.wikimedia.org/T199813 (elukey) Resolved>Open There seems to be only one weirdness remaining, namely: {F24089464} For some reason, right a...
[11:40:38] (CR) Joal: [V: 2 C: 2] "Merging before deploy" [analytics/refinery] - https://gerrit.wikimedia.org/r/443409 (https://phabricator.wikimedia.org/T193641) (owner: Jonas Kress (WMDE))
[11:44:29] mforns, milimetric: when you come online, could you please have a look at https://gerrit.wikimedia.org/r/c/analytics/refinery/source/+/447720?
[11:44:43] I'd like to deploy today if possible
[11:45:31] The other one I'd like to see merged would be: https://gerrit.wikimedia.org/r/c/analytics/refinery/+/443487
[12:31:51] * elukey lunch!
[13:16:26] (CR) Milimetric: [C: 2] Add foundation.wikimedia to pageviews (1 comment) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/447720 (https://phabricator.wikimedia.org/T188776) (owner: Joal)
[13:20:37] (CR) Milimetric: Update mw-user-history username to user_text (1 comment) [analytics/refinery] - https://gerrit.wikimedia.org/r/443487 (https://phabricator.wikimedia.org/T197926) (owner: Joal)
[13:24:17] milimetric: forgot about that one: https://gerrit.wikimedia.org/r/c/analytics/refinery/source/+/443483
[13:24:21] Analytics, Operations, WMDE-Analytics-Engineering: Cannot SSH to stat1004 - https://phabricator.wikimedia.org/T200330 (GoranSMilovanovic)
[13:24:22] The missing part :)
[13:24:30] Many thanks for the reviews milimetric :)
[13:24:36] Analytics, Operations, WMDE-Analytics-Engineering: Cannot SSH to stat1004 - https://phabricator.wikimedia.org/T200330 (GoranSMilovanovic) p:Triage>High
[13:26:45] Analytics, Operations, WMDE-Analytics-Engineering, User-GoranSMilovanovic: Cannot SSH to stat1004 - https://phabricator.wikimedia.org/T200330 (GoranSMilovanovic)
[13:26:47] (CR) Milimetric: [C: 2] Update user-history job from username to userText [analytics/refinery/source] - https://gerrit.wikimedia.org/r/443483 (https://phabricator.wikimedia.org/T197926) (owner: Joal)
[13:27:31] (CR) Milimetric: [V: 2 C: 2] "ah ok, https://gerrit.wikimedia.org/r/#/c/analytics/refinery/source/+/443483/ had already addressed my comment." [analytics/refinery] - https://gerrit.wikimedia.org/r/443487 (https://phabricator.wikimedia.org/T197926) (owner: Joal)
[13:28:16] k joal, the source one is being merged by jenkins, so after that you're all good
[13:28:53] just saw that - Huge thanks milimetric :)
[13:32:27] Analytics-Data-Quality, WMDE-Analytics-Engineering, User-GoranSMilovanovic: Data set review for the Wiktionary Cognate Dashboard - https://phabricator.wikimedia.org/T199851 (GoranSMilovanovic) Ping, #Analytics can anyone please take a quick look at this - it is really a simple public dataset review t...
[13:32:40] milimetric: while we're at it, shall we also merge https://gerrit.wikimedia.org/r/c/analytics/refinery/+/445395 before the deploy?
[13:32:49] milimetric: it can wait next week though
[13:33:20] (Merged) jenkins-bot: Update user-history job from username to userText [analytics/refinery/source] - https://gerrit.wikimedia.org/r/443483 (https://phabricator.wikimedia.org/T197926) (owner: Joal)
[13:35:53] joal: looking
[13:36:01] joal: also, wanna batcave with me for a sec?
[13:36:19] sure milimetric - OMW
[13:55:26] (CR) Milimetric: [C: -1] Adds empty dir removal to hive partition dropping jobs (2 comments) [analytics/refinery] - https://gerrit.wikimedia.org/r/445395 (https://phabricator.wikimedia.org/T198600) (owner: Fdans)
[14:10:18] good morning from texas a-team!
[14:13:18] Hi fdans :)
[14:13:46] howdy partner
[14:17:05] o/
[14:23:53] fdans (or anyone): what do you think about https://meta.wikimedia.org/wiki/Config:Dashiki:Annotations/Wikistats/{{metric}} for annotation paths on metawiki?
[14:24:59] I use the Config:Dashiki: path because the mediawiki/Dashiki extension can be extended to take care of all "dashboard configuration" on the wikis, so the name and purpose still fit
[14:26:00] Another reading for who's interested a-team: https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Mediawiki_History_Snapshot_Check
[14:27:10] milimetric: it's a lil weird to me that we use the dashiki namespace for wikistats, but I don't feel strongly either way
[14:28:09] fdans: yeah, but Dashiki as in the dashboarding project could theoretically be extended to configure wikistats, and Dashiki as in the mediawiki extension that makes the Config:Dashiki: namespace available is different as I mention above
[14:28:44] fdans: the alternative is to make another extension like mediawiki/Wikistats that would govern just the Wikistats config, but that seems too narrow
[14:29:46] it should be more like Dashconfiki than Dashiki :)
[14:31:41] (PS1) Joal: Update changelog.md for version v0.0.67 [analytics/refinery/source] - https://gerrit.wikimedia.org/r/447812
[14:31:52] milimetric, elukey --^
[14:31:59] If you guys have a minute
[14:32:30] (CR) Milimetric: [C: 1] Update changelog.md for version v0.0.67 [analytics/refinery/source] - https://gerrit.wikimedia.org/r/447812 (owner: Joal)
[14:33:13] (CR) Joal: [V: 2 C: 2] "Merging for deploy" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/447812 (owner: Joal)
[14:33:32] * elukey is not allowed to even +1, feeling sad
[14:33:48] of course I am kidding, go ahead :D
[14:34:04] * joal feels sad about making elukey feel sad :(
[14:35:30] elukey: Asking permission to deploy please
[14:35:32] h:-P
[14:35:40] haahahah
[14:35:52] you don't need to ask me, deploy anytime :)
[14:36:23] elukey: this is a (more or less) subtle way to tell you I'm doing it
[14:38:20] !log Release refinery v0.0.67 to archiva
[14:38:21] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[14:42:50] elukey: having a "go" from an ops makes me feel safe :)
[14:43:34] Analytics, Operations, WMDE-Analytics-Engineering, User-GoranSMilovanovic: Cannot SSH to stat1004 - https://phabricator.wikimedia.org/T200330 (Reedy) When did you last use it? >Debian GNU/Linux 9 auto-installed on Tue May 22 15:58:49 UTC 2018. Just remove the old key with the command it told yo...
[14:53:18] Analytics, Operations, WMDE-Analytics-Engineering, User-GoranSMilovanovic: Cannot SSH to stat1004 - https://phabricator.wikimedia.org/T200330 (GoranSMilovanovic) @Reedy Thanks - all is fine now. I do not use stat1004 often at all - there is only one of my scripts running there, everything else...
[14:53:31] Analytics, Operations, WMDE-Analytics-Engineering, User-GoranSMilovanovic: Cannot SSH to stat1004 - https://phabricator.wikimedia.org/T200330 (GoranSMilovanovic) Open>Resolved a:GoranSMilovanovic
[15:15:40] heya elukey - anything special for ops standup?
[15:24:02] ah sorry didn't join since I thought you would not, usually it is me and andrew!
[15:24:14] elukey: true :)
[15:24:16] do you want to do it??
[15:24:34] elukey: as you wish, anything special to discuss?
[15:24:35] I don't have anything special besides the things that I already asked for counsel on this morning :)
[15:25:01] elukey: same for me, nothing special, prepping my deploy for tomorrow morning
[15:25:34] elukey: let's skip then :)
[15:32:01] (PS8) Fdans: Adds empty dir removal to hive partition dropping jobs [analytics/refinery] - https://gerrit.wikimedia.org/r/445395 (https://phabricator.wikimedia.org/T198600)
[15:38:50] (PS1) Joal: Bump webrequest refine jar version to 0.0.67 [analytics/refinery] - https://gerrit.wikimedia.org/r/447825
[15:39:11] (PS2) Fdans: Filter out unwanted wikis from wmf.virtualpageview_hourly [analytics/refinery] - https://gerrit.wikimedia.org/r/447665 (https://phabricator.wikimedia.org/T197971)
[15:42:20] (CR) Joal: [V: 2 C: 2] "Self merging before deploy" [analytics/refinery] - https://gerrit.wikimedia.org/r/447825 (owner: Joal)
[15:42:46] !log Deploying refinery onto HDFS
[15:42:47] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[15:52:45] (CR) Joal: "Another round - property should be added to coordinator.xml file as well please :)" (2 comments) [analytics/refinery] - https://gerrit.wikimedia.org/r/447665 (https://phabricator.wikimedia.org/T197971) (owner: Fdans)
[15:53:24] elukey: Hi elukey - I think I need help - deployment failed :(
[15:54:55] elukey: https://gist.github.com/jobar/8081ab5e8d015f865ce543c533d1794d
[15:54:59] joal: sorry - will post another patch, test the query, and update the CR indicating that it works
[15:55:37] ah there you go!
[15:56:00] joal you will not believe it but I was trying to figure out on the routers why stat1005 was trying to call deploy1001 :D
[15:56:04] elukey: That's why I always ask permission :)
[15:56:12] Mwhahaha
[15:56:20] Ok, I think we're onto something here :)
[15:56:33] So, scap deploy goes two ways it seems :)
[15:56:46] yeah scap pulls things from the host
[15:56:55] and tries to guess what it was using before
[15:56:57] IPv6 :)
[15:57:12] right
[15:57:29] it seems to be using port 80
[15:57:32] ok - No big deal, I will roll back, and we can solve that tomorrow
[15:57:56] I need to drop now and will be back later, so let's recombine tomorrow morning if ok for you?
[15:58:18] sure, I am going to ping arzhel to add the network rule
[15:58:30] Awesome :)
[15:58:37] Thanks elukey - Talk later or tomorrow :)
[16:08:17] hey team, back from airport
[16:20:43] hellooo mforns
[16:20:51] hey fdans :]
[16:57:00] joal: should be fixed!
[17:06:50] (PS3) Fdans: Filter out unwanted wikis from wmf.virtualpageview_hourly [analytics/refinery] - https://gerrit.wikimedia.org/r/447665 (https://phabricator.wikimedia.org/T197971)
[17:06:53] joal: query checked :)
[17:49:44] Question: There is a Cloud VPS project called "analytics". Is this used for actual analytics work, or is it for testing the stuff run on the actual analytics servers?
[17:51:14] (CR) Milimetric: [C: -1] WIP: Measure articles published using CX2 (2 comments) [analytics/limn-language-data] - https://gerrit.wikimedia.org/r/442860 (https://phabricator.wikimedia.org/T196435) (owner: Amire80)
[17:52:59] * elukey off!
[17:53:10] harej: testing mostly :)
[17:53:25] Good to know, thank you
[18:52:12] PROBLEM - Check if active EventStreams endpoint is delivering messages. on scb2002 is CRITICAL: CRITICAL: No EventStreams message was consumed from http://scb2002.codfw.wmnet:8092/v2/stream/recentchange within 10 seconds.
[19:52:22] RECOVERY - Check if active EventStreams endpoint is delivering messages. on scb2002 is OK: OK: An EventStreams message was consumed from http://scb2002.codfw.wmnet:8092/v2/stream/recentchange within 10 seconds.