[00:06:39] Analytics, Analytics-EventLogging, Analytics-Kanban, Event-Platform, and 2 others: Vertical: Migrate SearchSatisfaction EventLogging event stream to Event Platform - https://phabricator.wikimedia.org/T249261 (Richard50371)
[00:07:06] Analytics, Analytics-Cluster, Research: Can we have more RAM on stat machines - https://phabricator.wikimedia.org/T255716 (Nuria) > I want to have 1TB+ (if possible) in one of the stat machines so we don't have to come back to this request in 6 months again. Probably RAM can be increased on the small...
[00:50:14] Analytics, Analytics-Cluster, Research: Can we have more RAM on stat machines - https://phabricator.wikimedia.org/T255716 (leila) (for the record, Nuria and I talked briefly now.) Seeking information on how much RAM we can add to one of the existing machines sounds good to me. If we have the technic...
[00:55:10] Analytics, Design: Broken icons on https://analytics.wikimedia.org/ - https://phabricator.wikimedia.org/T255840 (Iniquity)
[01:07:41] Analytics, Design: Broken icons on https://analytics.wikimedia.org/ - https://phabricator.wikimedia.org/T255840 (Iniquity)
[06:25:18] Analytics, Analytics-Cluster, Research: Can we have more RAM on stat machines - https://phabricator.wikimedia.org/T255716 (elukey) I will try to add a last minute request for the budget, but realistically I think that only two/three stat100x boxes may go up to 128G of ram from their actual status. Th...
[07:49:50] * elukey bbiab
[09:58:56] the rollback of the test cluster wasn't super smooth due to a little mistake in command execution, but I also found another weird use case and how to fix it
[09:59:04] first time that I rollout/rollback bigto!
[09:59:07] *bigtop!
[09:59:43] doing it for 50 nodes manually is a nightmare, we need a cookbook
[10:24:47] * elukey lunch!
[11:00:59] Analytics, Analytics-Cluster, Analytics-Kanban, User-Elukey: Upgrade the Hadoop test cluster to BigTop - https://phabricator.wikimedia.org/T244499 (elukey) Today I was able to rollback BigTop, and since the previous attempt to rollout went fine, this is the first time that we do back and forth wi...
[11:46:31] Analytics, Analytics-Cluster, Research: Can we have more RAM on stat machines - https://phabricator.wikimedia.org/T255716 (JAllemandou) Hi, I did some research in relation to the topic - Writing my findings here (sorry for the disorganization). **Software:** - Fasttext is a C++ library built at Faceb...
[11:50:46] mgerlach: Hi! I'm sorry for not noticing the office hours were set on wednesdays - I take care of the kids that day, so can't be available :(
[11:56:15] joal: no worries, thanks for letting me know. anyone else from the team that would potentially be available during that time if analytics-specific questions come up? (no need to monitor anything, I could ping the channel here in this case)
[11:57:31] mgerlach: at that time you can usually find either elukey or fdans - I'll try to be around but can't guarantee availability :S
[11:57:38] mgerlach: sorry again for not noticing :(
[12:01:05] joal: ok good to know who might be around in case something comes up (so far the discussions have been mostly focused around research). in case you can make it, we would be glad to have you around : )
[12:03:10] did I forget a meeting?
[12:03:47] nah elukey - We were talking with mgerlach of office-hours
[12:04:35] elukey: I said I'd be around, but it's wednesday :S So I said that at the time it happens (9/10 UTC) you and fdans are usually nearby
[12:04:57] ah yes, but I don't have it in my gcal, if needed I'll try to be around
[12:05:02] elukey: I'll be sure to check the day next time I get asked to be present :S
[12:05:03] same here
[12:05:24] Thanks a lot folks - Sorry for the mess
[12:42:45] elukey fdans: thanks. will send you an invite.
[12:44:40] thanks for offering to be around if something comes up, will ping you from research-channel in that case : )
[13:30:50] gone for now - back in ~3hours
[13:50:24] * elukey afk for ~1h!
[14:51:18] Analytics, Analytics-Cluster, Research: Can we have more RAM on stat machines - https://phabricator.wikimedia.org/T255716 (Nuria) Let's do some exploration with dc ops team as to what kind of hardware could we provision both in terms of bumping up the RAM of existing machines but in terms of ordering...
[15:32:34] back!
[15:33:10] lovely physiotherapy on a Friday :D
[15:38:55] heh, i'm about to have a doctor appt too!
[15:43:32] (CR) Nuria: [C: +1] Add pageview_actor_hourly table and oozie job [analytics/refinery] - https://gerrit.wikimedia.org/r/606127 (https://phabricator.wikimedia.org/T255467) (owner: Joal)
[15:59:14] Analytics, Analytics-EventLogging, Analytics-Kanban, Event-Platform, and 2 others: Vertical: Migrate SearchSatisfaction EventLogging event stream to Event Platform - https://phabricator.wikimedia.org/T249261 (Ottomata) So I just tried to backfill from the EventError table. After I did, the count...
[16:03:34] Analytics, Analytics-Cluster, Research: Can we have more RAM on stat machines - https://phabricator.wikimedia.org/T255716 (leila) >>! In T255716#6237090, @elukey wrote: > I will try to add a last minute request for the budget, In case it helps, I gave Faidon a heads up about this conversation yester...
[16:03:55] Analytics, Analytics-Cluster, Research: Can we have more RAM on stat machines - https://phabricator.wikimedia.org/T255716 (leila) @Nuria Thanks!
[16:06:07] Analytics, Analytics-Cluster, Research: Can we have more RAM on stat machines - https://phabricator.wikimedia.org/T255716 (leila) >>! In T255716#6237514, @JAllemandou wrote: > Hi, I did some research in relation to the topic - Writing my findings here (sorry for the disorganization). > **Software:** >...
[16:40:37] joal: in the commonswiki
[16:40:59] joal: the "content" namespace is what is defined as "media" on project_namespace map?
[16:41:09] https://www.irccloud.com/pastebin/1LQrtYNf/
[16:48:08] weird my dr apt is not today? it is monday? I thought for sure it was today!
[16:49:22] Analytics-Radar, Product-Analytics, Research-Backlog, Research-consulting: Propose metrics along with qualifiers for the press kit - https://phabricator.wikimedia.org/T144639 (Nuria) It will be so nice to get this done
[16:51:39] Analytics, Analytics-Cluster, Research: Can we have more RAM on stat machines - https://phabricator.wikimedia.org/T255716 (Nuria) >To give you a sense: I'm building a prediction model to detect an outcome. I read a paper and see that model x that is just put out may do really well for my problem. I w...
[16:56:14] Analytics, Analytics-Cluster, Analytics-Kanban: Move Matomo to Debian Buster - https://phabricator.wikimedia.org/T252740 (Nuria) Open→Resolved
[16:56:16] Analytics, Analytics-Kanban, Patch-For-Review: Move the Analytics infrastructure to Debian Buster - https://phabricator.wikimedia.org/T234629 (Nuria)
[16:57:05] Analytics, Analytics-Cluster, Analytics-Kanban: Upgrade AMD ROCm to latest upstream - https://phabricator.wikimedia.org/T247082 (Nuria) making sure @Miriam is not waiting for any work here before proceeding
[16:58:32] Analytics, Analytics-Cluster, Analytics-Kanban: Upgrade AMD ROCm to latest upstream - https://phabricator.wikimedia.org/T247082 (elukey) Didn't add a message in here, but the upgrade is completed, I synced with Miriam and Martin (who uses the GPU) before proceeding, all good.
[16:58:53] Analytics, Analytics-Kanban, Patch-For-Review: mediawiki history dumps sync not working - https://phabricator.wikimedia.org/T255485 (Nuria) Open→Resolved
[16:59:28] hey nuria - I don't know about namespaces in commons
[16:59:48] I'm gonna check
[16:59:52] Analytics, Analytics-EventLogging, Analytics-Kanban, Event-Platform: eventgate-wikimedia should expose runtime stream configuration - https://phabricator.wikimedia.org/T253157 (Nuria) ping @Ottomata What is the path where stream configs are surfaced to?
[17:00:00] joal: how do we check that?
[17:00:12] joal: other than asking Netrom?
[17:00:14] that os ...
[17:00:16] hehehe :)
[17:00:20] * that is
[17:00:46] nuria: I'd look at canonical data (db+tables created by product-analytics)
[17:00:53] maybe some info can be found there
[17:01:12] Asking netrom is probably the fastest/easiest :)
[17:02:03] (PS1) Nuria: [WIP] Usage of commons files for tech tunning session metrics [analytics/reportupdater-queries] - https://gerrit.wikimedia.org/r/606734 (https://phabricator.wikimedia.org/T247417)
[17:03:07] nuria: 1-1?
[17:03:23] Analytics, Analytics-Cluster, Analytics-Kanban: Upgrade AMD ROCm to latest upstream - https://phabricator.wikimedia.org/T247082 (Nuria) Open→Resolved
[17:08:48] still very WIP but looks promising https://gerrit.wikimedia.org/r/#/c/operations/cookbooks/+/606736/
[17:09:24] in theory there is only one manual step to upgrade hadoop masters + workers, that is removing stuff from zookeeper
[17:10:19] I'll start real tests next week
[17:22:13] * elukey off! have a good weekend folks
[17:22:22] bye elukey :)
[17:22:47] Have a good weekend elukey
[17:35:15] nuria: thinking about namespaces again
[17:35:23] joal: yes
[17:35:42] * nuria is going to study how zookeeper works
[17:36:05] nuria: what do you mean by `content` - the ones we flag as content in mediawiki_history?
[17:36:19] joal: on commons that really means files i think
[17:36:47] nuria: for commons, both main-namespace (0) and File (6) are seen as content
[17:37:04] joal: i think I only need file
[17:37:11] select * from wmf_raw.mediawiki_project_namespace_map where snapshot = '2020-05' and dbname = 'commonswiki' and namespace_is_content
[17:37:14] nuria: --^
[17:37:37] joal: i am trying to establish a % of files in commons that are used
[17:37:46] joal: for which i need the total number of fiels
[17:37:56] *of files
[17:38:20] joal: so i do not want to count this page: https://commons.wikimedia.org/wiki/Commons:GLAMwiki_Toolset_Project/NARA_analytics_pilot
[17:38:46] ack nuria
[17:39:02] nuria: files is probably the way to go
[17:39:07] joal: but i want to count this: https://commons.wikimedia.org/wiki/File:Annual_wholesale_price_list_of_the_Fraser_Nursery_Company_(Incorporated)_-_fall_1923,_spring_1924._(IA_CAT31312511).pdf
[17:39:15] joal: that makes sense right?
[17:39:20] it does nuria
[17:39:29] nuria: let me check something
[17:39:42] joal: but commons has this "-2" namespace called "media"?
[17:39:50] joal: that is the part i do not get
[17:39:52] nuria: noted as non-content
[17:40:06] Analytics, Analytics-EventLogging, Analytics-Kanban, Event-Platform: eventgate-wikimedia should expose runtime stream configuration - https://phabricator.wikimedia.org/T253157 (Ottomata) /v1/stream-configs, e.g. ` curl -s https://intake-analytics.wikimedia.org/v1/stream-configs | jq . { "even...
[17:41:32] Analytics, Event-Platform: EventStreamConfig's auto-topics config is incorrect - https://phabricator.wikimedia.org/T255888 (Ottomata)
[17:41:50] joal: right, it is non content but "seemed" pertinent
[17:43:04] nuria: on mediawiki_history, 0 page event on namespace -2, 369716 on namespace 0, 69157418 on namespace 6
[17:43:22] nuria: Looks like you're after namespace 6 alright
[17:43:39] joal: ok, got it
[17:45:14] nuria: I confirm I have no revision in namespace -2 either, and a similar ratio of revisions for namespace 0 and 6 (2751242 and 350424739 respectively)
[17:45:45] (PS2) Nuria: [WIP] Usage of commons files for tech tunning session metrics [analytics/reportupdater-queries] - https://gerrit.wikimedia.org/r/606734 (https://phabricator.wikimedia.org/T247417)
[17:46:19] joal: i still need to test but please let me know if anything stands out here: https://gerrit.wikimedia.org/r/#/c/analytics/reportupdater-queries/+/606734/2/structured-data/commons_file_usage_in_wikimedia_projects
[17:46:30] reading
[17:48:17] Analytics, Analytics-EventLogging, Analytics-Kanban, Event-Platform: eventgate-wikimedia should expose runtime stream configuration - https://phabricator.wikimedia.org/T253157 (Nuria) so nice to see, this, great progress!
[17:48:49] nuria: namespace should be 6, not -6 (line 43)
[17:49:22] nuria: and missing an AND at the end of that same line (before wiki_db =)
[17:49:52] joal: indeed, corrected now
[17:51:06] nuria: any reason to use wmf_raw.page instead of wmf.mediawiki_history_page in the total_number_of_files_in_commons subquery?
[17:51:06] (PS3) Nuria: [WIP] Usage of commons files for tech tunning session metrics [analytics/reportupdater-queries] - https://gerrit.wikimedia.org/r/606734 (https://phabricator.wikimedia.org/T247417)
[17:52:14] joal: isn't page easier to query as it has a lot less records?
[17:52:25] joal: maybe this does not matter
[17:52:50] nuria: wmf_raw.page is avro - the other one is parquet
[17:53:03] nuria: will check quickly - but no big deal
[17:53:32] something else nuria (same line again): p.page_namespace (missing the page_)
[17:53:40] joal: INDEED
[17:55:14] (PS4) Nuria: [WIP] Usage of commons files for tech tunning session metrics [analytics/reportupdater-queries] - https://gerrit.wikimedia.org/r/606734 (https://phabricator.wikimedia.org/T247417)
[17:57:01] nuria: interesting result on perf - the query using wmf_raw only needs 32 files to be read, but it takes longer than the one using page_history (640 files) - columnar for the win :)
[17:57:53] Something else nuria - You can remove the `distinct` - This is an expensive operation that is not needed here (only one row per page)
[17:58:38] small difference in number between wmf_raw.page and wmf.page_history (61827787 and 61826616 respectively)
[18:00:02] joal: does the time include 'distinct" on page_history?
[18:00:19] yes
[18:00:37] joal: i see, then should i change it
[18:00:47] nuria: bulk of the time spent by the wmf_raw.page job is reading and unserializing avro
[18:01:39] nuria: you'll need the distinct if you wish as accurate number as possible from page_history
[18:02:11] nuria: if you rely on `create` event, there is a discrepancy (small - 1187) as files may have changed their namespace
[18:02:49] nuria: so doing a distinct on page_id seen is what gives us best result (even if the opposite case could be true: file moving from namespace 6 to another namespace)
[18:03:17] nuria: this example makes me be in need for a `latest_age_event` boolean field in page table
[18:03:23] latest_page_event
[18:03:25] sorry
[18:03:58] joal: mmm, wait if we use page_history we need created but not deleted though
[18:04:08] joal: is that not correct?
[18:04:21] this is what I used - https://gist.github.com/jobar/bfa34175d1eb8b828142d52e391b3407
[18:04:35] joal: oh i see HANDY field
[18:05:31] nuria: I'm really thinking this `latest` we should create - will be very handy for any type of event we deal with (we could have latest_page_event, latest_user_event for all event types)
[18:06:11] joal: do file ticket
[18:06:17] nuria: about delete - This is true we don't want deleted, so I removed the page_is_deleted from the query - BUT - There still could be pages moved from namespace 6 to namespace 0 for instance showing up in that request
[18:08:47] joal: that *seems* correct no? those pages we want to count and the distinct makes sure we do not count them twice
[18:08:53] Analytics: Add `latest_page_event` and `latest_user_event` fields in mediawiki_history - https://phabricator.wikimedia.org/T255890 (JAllemandou)
[18:09:11] nuria: I'm gonna make an even even more correct one
[18:10:13] joal: "even more correct query"?
[18:12:36] nuria: trying to get most recent event per page
[18:15:27] joal: maybe that is why i started with the other table, just for simplicity
[18:15:33] hehe )
[18:16:02] that works nuria - number differences are small, but at least using wmf_raw you get as correct as possible result easily
[18:16:14] nuria: drop the distinct nonetheless :)
[18:16:20] joal: yessir
[18:17:27] (PS5) Nuria: [WIP] Usage of commons files for tech tunning session metrics [analytics/reportupdater-queries] - https://gerrit.wikimedia.org/r/606734 (https://phabricator.wikimedia.org/T247417)
[18:18:36] OH! today is a wmf holiday day!
[18:18:52] did not realize until now :)
[18:25:56] ottomata: what?
[18:26:03] juneteenth
[18:26:04] nuria: juneteenth today
[18:26:05] i guess is a day off
[18:26:07] yah
[18:26:14] ottomata and nuria WINNERS!
[18:26:21] hahha
[18:26:31] will be taking some time off next week!
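The two ways of counting Commons files discussed above (every page_id ever seen in namespace 6, versus the namespace of each page's most recent event) can be compared on a toy dataset. This is only a sketch under stated assumptions: `PageEvent` and its fields are hypothetical stand-ins, not the actual mediawiki_history schema, and the real job would run as a Hive or Spark query rather than in plain Python.

```python
from typing import List, NamedTuple

FILE_NAMESPACE = 6  # File namespace, as identified in the discussion


class PageEvent(NamedTuple):
    """Hypothetical, simplified page event (not the real table schema)."""
    page_id: int
    namespace: int
    timestamp: str  # ISO-8601 text, so plain string comparison orders it
    is_deleted: bool = False


def count_distinct_seen(events: List[PageEvent]) -> int:
    """Count page ids that appear in the File namespace in any event."""
    return len({e.page_id for e in events if e.namespace == FILE_NAMESPACE})


def count_by_latest_event(events: List[PageEvent]) -> int:
    """Count pages whose most recent event is a live File-namespace page."""
    latest = {}
    for e in events:
        current = latest.get(e.page_id)
        if current is None or e.timestamp > current.timestamp:
            latest[e.page_id] = e
    return sum(
        1 for e in latest.values()
        if e.namespace == FILE_NAMESPACE and not e.is_deleted
    )


events = [
    PageEvent(1, 6, "2020-01-01T00:00:00"),  # created as a file, never moved
    PageEvent(2, 6, "2020-01-02T00:00:00"),  # created as a file...
    PageEvent(2, 0, "2020-02-01T00:00:00"),  # ...then moved to namespace 0
    PageEvent(3, 0, "2020-01-03T00:00:00"),  # created in namespace 0...
    PageEvent(3, 6, "2020-03-01T00:00:00"),  # ...then moved to File
]

print(count_distinct_seen(events))    # 3: page 2 is still counted
print(count_by_latest_event(events))  # 2: page 2 is excluded after its move
```

The gap between the two numbers comes only from pages that changed namespace, which is the kind of small discrepancy mentioned in the conversation.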
[18:26:41] :)
[18:38:43] Analytics, Analytics-EventLogging, Analytics-Kanban, Event-Platform, and 2 others: Vertical: Migrate SearchSatisfaction EventLogging event stream to Event Platform - https://phabricator.wikimedia.org/T249261 (Ottomata) The parse error comes from a very strange extra '=' at the end of the URL enco...
[18:38:45] well crap ^
[18:39:09] yesterday when I tried to switch SearchSatisfaction to EventGate, I ended up disabling it for almost all wikis for about 2 hours :( :( :(
[18:42:19] (PS1) Joal: Webrequest host normalization code had a bug [analytics/refinery/source] - https://gerrit.wikimedia.org/r/606752
[18:59:38] Analytics, Event-Platform: EventStreamConfig's auto-topics config is incorrect - https://phabricator.wikimedia.org/T255888 (Ottomata) Ah, this is just because we never configured the topic prefixes :)
[19:49:44] Analytics, Analytics-Kanban, EventStreams, Operations, and 2 others: EventStreams drops the connection after 15 minutes, which makes it unreliable - https://phabricator.wikimedia.org/T242767 (Ottomata) > We could treat a stream as an unending download, and encode the same information that is curr...
[20:40:05] Analytics, Analytics-Kanban, EventStreams, Operations, and 2 others: EventStreams drops the connection after 15 minutes, which makes it unreliable - https://phabricator.wikimedia.org/T242767 (Nuria) Some thoughts: I agree with @ema and @BBlack that we cannot expect connections to live "forever" a...
[20:41:10] ottomata: thoughts on my reply on https://phabricator.wikimedia.org/T242767 welcome
[20:46:44] nuria: yeah............
[20:46:58] 15 mins just seems so short tho.
hm
[20:47:50] ottomata: the amount of time, i think, does not matter
[20:48:07] ottomata: cause once your client is resilient to connection drop
[20:48:10] *drops
[20:48:23] ottomata: it does not matter that a reconnect happens every 15 min or every hour
[20:48:28] ya i get the point, but
[20:48:35] sometimes i consume just with curl
[20:48:42] ottomata: aham
[20:48:47] and it would be nice to test things for longer than 15 minutes without having to write code
[20:50:01] also, i feel like it is a bit unexpected
[20:50:12] yes the client should be resilient, but also the server should keep the connection alive if it can
[20:50:15] this is an artificial timeout
[20:50:33] that is in place to work around a file descriptor problem with ATS that I don't fully understand
[20:51:20] ottomata: if it is artificial it makes sense to increase it to a sensible amount, sure
[20:51:50] ottomata: but it could very well be that the cdn infra needs to reset connections every 15 mins to ensure health of fleet
[20:52:00] ottomata: that does not seem outlandish
[21:00:04] Analytics, Analytics-EventLogging, Analytics-Kanban, Event-Platform, and 2 others: Vertical: Migrate SearchSatisfaction EventLogging event stream to Event Platform - https://phabricator.wikimedia.org/T249261 (Ottomata) Added a Data Quality note here: https://wikitech.wikimedia.org/wiki/Analytics/...
[22:34:32] Analytics-Data-Quality, QuickSurveys, WMDE-Technical-Wishes-Team, MW-1.35-notes (1.35.0-wmf.37; 2020-06-16), and 2 others: Remove Do Not Track support for QuickSurveys - https://phabricator.wikimedia.org/T254224 (leila)
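The point made above, that a client resilient to connection drops does not care whether the reconnect happens every 15 minutes or every hour, can be sketched roughly as follows. Everything here is an assumption for illustration: `consume_with_reconnect` and `make_fake_stream` are hypothetical names, the stream is simulated in memory rather than read over HTTP, and the resume marker stands in for something like the `Last-Event-ID` header of server-sent events.

```python
from typing import Callable, Iterator, List, Optional, Tuple

Event = Tuple[int, str]  # (event id, payload)


def consume_with_reconnect(
    connect: Callable[[Optional[int]], Iterator[Event]],
    max_reconnects: int = 5,
) -> List[str]:
    """Read a stream to its end, resuming after the last seen id on drops."""
    received: List[str] = []
    last_id: Optional[int] = None
    reconnects = 0
    while True:
        try:
            for event_id, data in connect(last_id):
                last_id = event_id
                received.append(data)
            return received  # the stream ended without an error
        except ConnectionError:
            reconnects += 1
            if reconnects > max_reconnects:
                raise


def make_fake_stream(payloads: List[str], drop_every: int):
    """Simulated server that drops the connection after `drop_every` events."""
    def connect(last_id: Optional[int]) -> Iterator[Event]:
        start = 0 if last_id is None else last_id + 1
        sent = 0
        for event_id in range(start, len(payloads)):
            if sent == drop_every:
                raise ConnectionError("simulated server-side timeout")
            yield event_id, payloads[event_id]
            sent += 1
    return connect


stream = make_fake_stream(["a", "b", "c", "d", "e"], drop_every=2)
print(consume_with_reconnect(stream))  # ['a', 'b', 'c', 'd', 'e']
```

Because the client resumes from the last id it saw, the forced drops cause neither gaps nor duplicates in this simulation. A plain `curl` session has no such loop, which is the case raised in the conversation.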