[00:42:06] Analytics, WMDE-Analytics-Engineering, Wikidata, Patch-For-Review, and 2 others: track number of editors from other Wikimedia projects who also edit on Wikidata over time - https://phabricator.wikimedia.org/T193641 (Aklapper) a:Jonas>None [Resetting assignee as the assignee user account i...
[02:59:08] PROBLEM - Check the last execution of refinery-import-page-history-dumps on stat1007 is CRITICAL: connect to address 10.64.21.118 port 5666: Connection refused
[02:59:27] Analytics, Analytics-EventLogging, Analytics-Kanban, MW-1.32-notes (WMF-deploy-2018-10-16 (1.32.0-wmf.26)), and 3 others: Spin out a tiny EventLogging RL module for lightweight logging - https://phabricator.wikimedia.org/T187207 (Krinkle) Resolved>Open >>! In T187207#4729931, @gerritbot...
[03:19:30] RECOVERY - Check the last execution of refinery-import-page-history-dumps on stat1007 is OK: OK: Status of the systemd unit refinery-import-page-history-dumps
[03:28:12] Analytics, New-Readers: Piwik access for Isaac Johnson and Chelsy Xie - https://phabricator.wikimedia.org/T210902 (chelsyx)
[03:28:33] Analytics, New-Readers: Piwik access for Isaac Johnson and Chelsy Xie - https://phabricator.wikimedia.org/T210902 (chelsyx) Done. Thanks @atgo !
[07:04:17] morning!
[07:22:11] sigh druid1001 with root partition full, zookeeper in there was down
[07:22:17] /var/log/druid filled up..
[07:22:58] https://grafana.wikimedia.org/dashboard/db/druid?refresh=1m&panelId=42&fullscreen&orgId=1&var-datasource=eqiad%20prometheus%2Fanalytics&var-cluster=druid_analytics&var-druid_datasource=All&from=now-7d&to=now
[07:24:01] biggest logs are like
[07:24:04] 2.0G 2018-12-02.log
[07:24:18] so I guess that the KIS spams a bit
[07:29:13] interestingly, only on druid1001
[07:29:47] probably one kafka partition == 1 peon working
[07:42:21] ok back to the refine issue
[08:09:54] Hi elukey
[08:10:23] bonjour!
[08:10:43] elukey: There are 3 tasks for KIS - So I'm assuming the logging on a single machine must come from the task manager itself
[08:11:02] Or, the 3 tasks were on the same machine (would be very bad)
[08:11:41] elukey: I was about to get started on refine issue - Do we sync?
[08:11:45] nono they are on 3 hosts
[08:12:05] probably only one pulls data from the kafka partition, hence it logs more
[08:12:50] joal: sure, as I wrote in the email I didn't see anything from the Varnish/Varnishkafka side (e.g. things exploding during the timeframe)
[08:13:02] but there are some suspicious dt:'-' in the logs
[08:13:05] hm :(
[08:13:14] Ok, let's investigate
[08:14:12] actually elukey, the graph you pasted is on query-counts per second - How is it possible to have so many queries?
[08:14:33] I know that I posted that graph, it was very weird :D
[08:14:54] elukey: The cache miss rate is also super high
[08:15:16] elukey: I'd be interested to know from the FRTech team how they use the druid endpoints
[08:15:23] automated queries or not
[08:15:58] elukey: or, KIS not only ingests data but also queries?
[08:17:10] there seem to be requests from turnilo logged
[08:17:10] mmmm
[08:17:36] elukey: right, but a level of 2k/s?
[08:20:08] I am not sure whether the number is wrong (maybe a bug in the exporter) or it is something else, but it definitely does not look right
[08:22:34] ah wait joal
[08:22:38] that was an idelta
[08:23:14] (in grafana)
[08:23:21] I switched it with irate, more correct
[08:23:26] and now it looks more ok :D
[08:25:40] and all from druid1001 (I am checking directly in grafana, will try to come up with a better graph)
[08:27:39] elukey: IIRC we don't have a reverse-proxy for broker in druid-analytics?
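Note on the idelta/irate exchange above: a minimal sketch, not taken from the dashboard itself, of why the two functions give such different numbers on the same counter. It assumes the documented Prometheus behaviour (idelta is the raw difference between the last two samples, irate divides that difference by the elapsed seconds); the 60-second scrape interval and the counter values are made up for illustration, and the counter-reset handling is an assumption modelled on how Prometheus treats decreasing counters.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (unix timestamp in seconds, counter value)


def idelta(samples: List[Sample]) -> float:
    """Difference between the last two samples, ignoring elapsed time."""
    (_, prev), (_, last) = samples[-2], samples[-1]
    return last - prev


def irate(samples: List[Sample]) -> float:
    """Per-second rate between the last two samples of a counter."""
    (t_prev, prev), (t_last, last) = samples[-2], samples[-1]
    # Assumption: a counter that went down was reset, so count from zero.
    increase = last if last < prev else last - prev
    return increase / (t_last - t_prev)


# Hypothetical query counter scraped twice, 60 seconds apart.
scrapes = [(0.0, 100_000.0), (60.0, 102_000.0)]
print(idelta(scrapes))  # queries since the previous scrape
print(irate(scrapes))   # queries per second over the same window
```

With these made-up numbers idelta reports 2000 (per scrape interval) while irate reports roughly 33 per second, which matches the shape of the confusion in the chat: a panel labelled "per second" that is actually plotting a per-scrape delta.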
[08:27:56] ag hip
[08:27:57] Dec 3 06:48:11 analytics-tool1002 turnilo[6481]: Got the latest time for 'test_kafka_event_centralnoticeimpression' (2018-12-03T06:48:11.000Z)
[08:28:00] Dec 3 06:48:11 analytics-tool1002 turnilo[6481]: Got the latest time for 'test_kafka_event_centralnoticeimpression' (2018-12-03T06:48:11.000Z)
[08:28:11] ahahha that should have been "ah joal!"
[08:28:22] I can see in turnilo's logs a ton of these
[08:28:44] elukey: probably somebody having an autorefresh at 1s
[08:29:12] or maybe turnilo works like this when the druid data source gets updated very often?
[08:29:37] joal: about the reverse proxy - do you mean LVS endpoint? (a load balancer)
[08:29:57] yup
[08:29:58] if so we don't, only for druid public
[08:31:54] ahahahah I can see AndyRussG in the logs of apache on analytics-tool1002
[08:31:57] so I think you are right joal
[08:33:45] yes confirmed
[08:34:02] very interesting use case
[08:35:58] mystery solved
[08:36:09] ok
[08:36:14] hm - this is not great however
[08:36:35] elukey: shall we make druid-logging a bit less verbose, or move those logs to a bigger partition?
[08:37:10] joal: I am planning to open a task to come up with the right log4j config for those files, we don't set any limits for those
[08:37:17] but we do for all the daemons
[08:37:36] right
[08:37:42] ok - Thanks elukey :)
[08:39:10] maybe we could also open an issue to the turnilo people asking for some logic to query druid brokers
[08:40:39] possible elukey - We probably also want to remove the very small time-period refresh in turnilo
[09:08:21] Quarry: View 'thwiki_p.page' references invalid table(s) or column(s) or function(s) or definer/invoker of view lack rights to use them - https://phabricator.wikimedia.org/T210978 (Zoranzoki21) I can reproduce this problem: https://quarry.wmflabs.org/query/31779
[09:16:56] Quarry, Dumps-Generation: View 'thwiki_p.page' references invalid table(s) or column(s) or function(s) or definer/invoker of view lack rights to use them - https://phabricator.wikimedia.org/T210978 (Aklapper)
[09:32:45] joal: sorry I had to go afk (doorbell), are you working on the refine issue?
[09:33:19] elukey: currently running the checking script on warning hours - everything is marked as false-positive
[09:35:47] joal: super, I am trying to see if I can find weird req
[09:36:11] so close to reality for some people: https://pbs.twimg.com/media/DtYrW_cUcAU4zPJ.jpg:large
[09:38:26] ahahah yes
[09:40:39] joal: can you read https://phabricator.wikimedia.org/P7877 ?
[09:41:20] this is an example that I've just found about a request logged without Timestamp:Resp
[09:41:25] but I have no idea where it comes from
[09:42:15] elukey: just read the thing - nothing stands out as special IMO
[10:55:06] Analytics, Analytics-Kanban: Failure while refining webrequest upload 2018-12-01-14 - https://phabricator.wikimedia.org/T211000 (elukey) p:Triage>Normal
[11:15:56] Analytics, Analytics-Kanban: Failure while refining webrequest upload 2018-12-01-14 - https://phabricator.wikimedia.org/T211000 (elukey) p:Normal>High
[11:16:35] Analytics, Analytics-Kanban, DBA, Data-Services, and 2 others: Create materialized views on Wiki Replica hosts for better query performance - https://phabricator.wikimedia.org/T210693 (Banyek)
[11:23:39] joal: in https://phabricator.wikimedia.org/T148412#2781980 I added a comment about how big holes were created, not sure if this is again the case but I keep forgetting about this example
[11:26:50] (PS2) Fdans: Add project families to uniques loading job [analytics/refinery] - https://gerrit.wikimedia.org/r/476220 (https://phabricator.wikimedia.org/T167539)
[11:27:19] (CR) Fdans: "Thank you for your thorough review Joal!!" (8 comments) [analytics/refinery] - https://gerrit.wikimedia.org/r/476220 (https://phabricator.wikimedia.org/T167539) (owner: Fdans)
[11:36:56] elukey: joal helloooo with your permission I want to deploy aqs :)
[11:37:28] fdans: hello! I am going afk now and joseph is not here atm, can we do it after lunch?
[11:37:36] elukey: yessir
[11:40:36] super thanks :)
[11:40:41] going afk!
[11:40:43] * elukey lunch!
[12:13:21] (PS1) Fdans: Add shnwiki, yuewiktionary and liwikinews to sqoop list [analytics/refinery] - https://gerrit.wikimedia.org/r/477247 (https://phabricator.wikimedia.org/T209822)
[12:16:04] Hi fdans - Good for me - We can either wait for elukey or not, as you wish :0
[12:16:07] Analytics, Analytics-Kanban: Remove sessionId, pageId pairs from whitelist - https://phabricator.wikimedia.org/T205458 (fdans) @HaeB quick ping on this so that it doesn't get buried :)
[12:16:34] joal: I'm going to go off to make lunch soon so I'll do it in the afternoon :)
[12:16:44] sounds good fdans :)
[12:17:24] joal: hmm, actually I got some time so I'll do it now!
[12:24:38] (PS1) Fdans: Update aqs to adb38e6 [analytics/aqs/deploy] - https://gerrit.wikimedia.org/r/477250
[12:26:17] joal: mind approving this --^
[12:26:44] fdans: you trust the node_modules update?
[12:27:58] fdans: also, I think we're missing something related to https://gerrit.wikimedia.org/r/c/analytics/aqs/+/476297/1/v1/unique-devices.yaml
[12:29:38] joal: hmmm, regarding node_modules... I guess i can git checkout master node_modules on that patch and only update dependencies when we actually want to?
[12:30:35] (PS1) Joal: Update insert-test-data script for unique change [analytics/aqs/deploy] - https://gerrit.wikimedia.org/r/477251 (https://phabricator.wikimedia.org/T167539)
[12:30:38] fdans: --^
[12:30:51] joal: oooo
[12:31:24] not the first time I miss that joal, I should document it
[12:31:38] abandoning deploy patch
[12:32:02] (Abandoned) Fdans: Update aqs to adb38e6 [analytics/aqs/deploy] - https://gerrit.wikimedia.org/r/477250 (owner: Fdans)
[12:32:03] fdans: you didn't forget the change in automated-test description , I would have had :)
[12:32:33] fdans: About the node_modules, I don't know what strategy we should take ...
[12:32:58] this has got to be part of docs though, there's a bunch of places to add stuff when we add a new/change endpoint
[12:33:16] joal: the problem is that the dockerized aqs runs npm install
[12:33:27] fdans: I think best could possibly be to update the deploy-patch-creation script (the one that uses docker) to provide node_modules updates only if explicitly asked
[12:33:42] yeah
[12:34:49] (CR) Fdans: [V: 2 C: 2] Update insert-test-data script for unique change [analytics/aqs/deploy] - https://gerrit.wikimedia.org/r/477251 (https://phabricator.wikimedia.org/T167539) (owner: Joal)
[12:48:13] Hi bmansurov - you have multiple spark jobs running on the cluster, I'm assuming some of them could be killed :)
[13:07:17] elukey: for when you;re back
[13:07:25] s/;/'
[13:08:36] I have an example of a request (currently finding others) for which the sequence-number is badly misaligned with the others given the timestamp (at second level)
[13:08:41] elukey: --^
[13:32:41] hey team :]
[13:32:45] Hi mforns
[13:35:05] Analytics, New-Readers: Piwik access for Isaac Johnson and Chelsy Xie - https://phabricator.wikimedia.org/T210902 (Isaac)
[13:36:17] Analytics, New-Readers: Piwik access for Isaac Johnson and Chelsy Xie - https://phabricator.wikimedia.org/T210902 (Isaac) Thanks for setting this up @atgo! Just a small correction - I removed the underscore from my LDAP.
[13:40:33] o/ joal, I'm gonna kill some of the jobs. Thanks
[13:41:01] bmansurov: Please don't if you use them - I'm just suspecting some lay there without being used
[13:41:26] joal: ok, makes sense
[13:44:21] joal: I am back :)
[13:44:50] heya elukey - I found 83 requests generating 99% of the problem
[13:45:01] fdans: I am ok if you want to deploy
[13:45:28] elukey: Those seem to be real requests, but they behave in a weird way relative to timestamp/sequence
[13:45:48] I think it'd be great to talk about this when you have a minute :)
[13:46:10] sure! gimme 5m and then bc?
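Note on the "sequence-number misaligned given the timestamp" finding above: a minimal plain-Python sketch of one way such requests could be spotted. This is not the checking script used in the chat; the Request fields, the example hostnames, and the max_jump threshold are all hypothetical, and the sketch assumes that the first request seen per host is a trustworthy baseline.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class Request(NamedTuple):
    hostname: str
    dt: str        # ISO-8601 timestamp at second precision
    sequence: int  # per-host counter assigned by the log producer


def find_sequence_jumps(requests: List[Request], max_jump: int) -> List[Request]:
    """Return requests whose sequence runs far ahead of their host's baseline."""
    by_host: Dict[str, List[Request]] = defaultdict(list)
    for req in requests:
        by_host[req.hostname].append(req)

    suspicious: List[Request] = []
    for host_requests in by_host.values():
        # Same-format ISO timestamps sort chronologically as strings.
        host_requests.sort(key=lambda r: r.dt)
        baseline = host_requests[0].sequence
        for req in host_requests[1:]:
            if req.sequence - baseline > max_jump:
                suspicious.append(req)   # far ahead of its neighbours
            else:
                baseline = req.sequence  # normal request, move the baseline
    return suspicious


# Hypothetical rows around the end of an hour.
rows = [
    Request("cp-example-a", "2018-12-01T14:59:57", 100),
    Request("cp-example-a", "2018-12-01T14:59:58", 101),
    Request("cp-example-a", "2018-12-01T14:59:58", 900_000),
    Request("cp-example-a", "2018-12-01T14:59:59", 102),
    Request("cp-example-b", "2018-12-01T14:59:59", 5),
]
for req in find_sequence_jumps(rows, max_jump=1_000):
    print(req)
```

Keeping a baseline of the last accepted sequence, rather than comparing each row only to its immediate predecessor, avoids flagging the normal request that follows an outlier.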
[13:46:18] when you want elukey :)
[13:46:30] no rush anymore on my side, I'll document my findings in the task
[13:51:26] joal: bc?
[13:51:32] OMW !
[14:48:21] Analytics, Analytics-Kanban: Failure while refining webrequest upload 2018-12-01-14 - https://phabricator.wikimedia.org/T211000 (JAllemandou) TL;DR: 83 requests in the failed hour are responsible for the failure. Their sequence-number is fairly bigger than the ones in the current hour, while their timest...
[14:48:27] elukey: --^
[14:49:20] o/
[14:49:55] Hi milimetric - I have focused on webrequest issue today - I'll double check data validity after
[14:51:29] no problem, joal, I was just about to build with your last patch and run
[14:51:46] Great milimetric
[14:52:34] also realized some of the fixes you made, I had locally but accidentally didn't include them in my patch :( double sorry about that
[14:53:26] no problem at all :)
[14:55:01] joal: wow I need some time to parse it :D
[14:55:18] elukey: It took some time to write ;)
[14:57:20] Gone for kids folks - See you at standup
[15:17:22] 2018-09 done, 2018-10 underway, now figuring out how to run the checker
[15:29:21] ok, cool, pretty easy, will check 09-new vs 10-new, then 08-old vs 09-new, then 09-old vs 10-new
[15:38:11] Quarry, MediaWiki-Database, Tool-Database-Queries: database dewiki_p yields error - https://phabricator.wikimedia.org/T211021 (Herzi.Pinki)
[15:41:22] (PS1) Fdans: Update aqs to adb38e6 [analytics/aqs/deploy] - https://gerrit.wikimedia.org/r/477292
[15:43:00] Quarry, MediaWiki-Database, Tool-Database-Queries: database dewiki_p yields error - https://phabricator.wikimedia.org/T211021 (JJMC89)
[15:44:23] (PS2) Fdans: Update aqs to adb38e6 [analytics/aqs/deploy] - https://gerrit.wikimedia.org/r/477292
[15:44:45] (PS3) Fdans: Update aqs to adb38e6 [analytics/aqs/deploy] - https://gerrit.wikimedia.org/r/477292
[15:45:39] joal: this change only updates the submodule in the deploy repo, shouldn't be worried about deps
[15:58:15] (PS1) Mforns: Allow for custom transforms in DataFrameToDruid [analytics/refinery/source] - https://gerrit.wikimedia.org/r/477295 (https://phabricator.wikimedia.org/T210099)
[15:58:16] Analytics, Analytics-Kanban: Remove sessionId, pageId pairs from whitelist - https://phabricator.wikimedia.org/T205458 (Tbayer) >>! In T205458#4793811, @fdans wrote: > @HaeB quick ping on this so that it doesn't get buried :) You're referring to the subtasks assigned to me - I think we have made all the...
[15:58:16] (CR) Mforns: [C: -2] "Still testing" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/477295 (https://phabricator.wikimedia.org/T210099) (owner: Mforns)
[15:58:16] fdans: let's wait to discuss with the team at standup for the process - otherwise looks good to me :)
[15:58:30] (CR) jerkins-bot: [V: -1] Allow for custom transforms in DataFrameToDruid [analytics/refinery/source] - https://gerrit.wikimedia.org/r/477295 (https://phabricator.wikimedia.org/T210099) (owner: Mforns)
[16:01:18] ping fdans mforns
[16:01:21] standduppp
[16:06:34] Analytics, Analytics-Kanban, Product-Analytics, Reading-analysis: Final Vetting of Family Wide unique devices data - https://phabricator.wikimedia.org/T169550 (Tbayer) >>! In T169550#4760456, @Nuria wrote: > @Tbayer: do you have some more comments related to vetting of this metric or is this the...
[16:07:41] (CR) Ottomata: "This is very cool! Looks still WIP, when you want I'd love a walkthrough!" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/477295 (https://phabricator.wikimedia.org/T210099) (owner: Mforns)
[16:07:47] (CR) Milimetric: [V: 2 C: 2] Add shnwiki, yuewiktionary and liwikinews to sqoop list [analytics/refinery] - https://gerrit.wikimedia.org/r/477247 (https://phabricator.wikimedia.org/T209822) (owner: Fdans)
[16:13:11] Analytics, Analytics-Kanban, Patch-For-Review: Clickstream dataset for Persian Wikipedia only includes external values - https://phabricator.wikimedia.org/T191964 (Milimetric) >>! In T191964#4759013, @Ladsgroup wrote: > It gives me application not found :/ > Where can I submit a spark job like you d...
[16:15:31] Analytics, Analytics-Kanban, DBA, User-Banyek: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5] - https://phabricator.wikimedia.org/T210478 (Banyek) On the dbstore1002 host the following databases exists along the wikis (with sizes) | Schema | Size | Comment | |centralauth | 88G |...
[16:20:01] Analytics, Analytics-Kanban, DBA, User-Banyek: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5] - https://phabricator.wikimedia.org/T210478 (Marostegui) Some comments: `ops` doesn't need to be migrated. I believe `datasets` isn't used anymore, but needs double checking. `centralau...
[16:21:32] Analytics, Analytics-Kanban, DBA, User-Banyek: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5] - https://phabricator.wikimedia.org/T210478 (Banyek) If the tables mentioned in https://phabricator.wikimedia.org/T210478#4794524 could be moved together (no need to create the own 'sta...
[16:33:01] Analytics: Use virtual image views to filter mediacounts - https://phabricator.wikimedia.org/T211030 (Milimetric)
[16:33:33] Analytics, Tool-Pageviews: Statistics for views of individual Wikimedia Commons images - https://phabricator.wikimedia.org/T210313 (Milimetric) >>! In T210313#4790688, @Tgr wrote: > FWIW, there is a way to detect that - virtual media views (T89088) were developed for that specific purpose (and MediaViewe...
[16:38:51] Analytics, Analytics-EventLogging, Analytics-Kanban, MW-1.32-notes (WMF-deploy-2018-10-16 (1.32.0-wmf.26)), and 3 others: Spin out a tiny EventLogging RL module for lightweight logging - https://phabricator.wikimedia.org/T187207 (Milimetric) I can take care of this, @Krinkle, unless you're doing...
[17:13:06] Analytics: Use virtual image views to filter mediacounts - https://phabricator.wikimedia.org/T211030 (fdans) p:Triage>Normal
[17:16:51] Analytics: Refine Monitor should be a systemd timer such if process cannot start we get notified - https://phabricator.wikimedia.org/T210759 (fdans) p:Triage>High
[17:16:59] Analytics: Refine Monitor should be a systemd timer such if process cannot start we get notified - https://phabricator.wikimedia.org/T210759 (fdans) a:mforns
[17:21:25] Analytics, Analytics-Dashiki, Analytics-Kanban, Patch-For-Review: Dashiki should filter out empty newlines - https://phabricator.wikimedia.org/T210570 (fdans) p:Triage>Normal
[17:30:04] Analytics, Operations, Performance-Team, Traffic: Only serve debug HTTP headers when x-wikimedia-debug is present - https://phabricator.wikimedia.org/T210484 (fdans) Analytics needs x-analytics in every request, not only in debugging ones but we don't need to include it in the response headers. W...
[17:35:45] Analytics, Analytics-EventLogging: Please delete ChangesListFilters events from before 2016-09-22 - https://phabricator.wikimedia.org/T147346 (fdans) If you want to remove event data for a particular time range, the data needs to be removed manually from hive and mysql. If what we want is for this data t...
[17:36:20] Analytics, Analytics-EventLogging: Please delete ChangesListFilters events from before 2016-09-22 - https://phabricator.wikimedia.org/T147346 (fdans) p:Triage>Low
[17:38:44] Analytics, Analytics-Kanban: Failure while refining webrequest upload 2018-12-01-14 - https://phabricator.wikimedia.org/T211000 (fdans) p:High>Unbreak!
[17:41:03] Analytics, Analytics-Kanban, DBA, Data-Services, and 2 others: Create materialized views on Wiki Replica hosts for better query performance - https://phabricator.wikimedia.org/T210693 (Milimetric) @Marostegui, we would like to go over plans for implementation during our Wednesday meeting. Is the...
[17:44:03] Analytics, Analytics-Kanban, DBA, Data-Services, and 2 others: Create materialized views on Wiki Replica hosts for better query performance - https://phabricator.wikimedia.org/T210693 (Marostegui) @Banyek @Bstorm ^
[17:44:56] Analytics, Analytics-Kanban, DBA, Data-Services, and 2 others: Create materialized views on Wiki Replica hosts for better query performance - https://phabricator.wikimedia.org/T210693 (Banyek) @Bstrom when could we talk about the details?
[17:45:52] Analytics, Operations, Security-Team, WMF-Legal, Software-Licensing: Can exfat be used in WMF production? - https://phabricator.wikimedia.org/T210667 (chasemp) I want to acknowledge a few things: - @Legoktm I appreciate that you feel strongly about this - The use of exfat is not any sort of...
[17:49:59] Analytics, Analytics-Kanban, Analytics-Wikistats, Patch-For-Review: Create report for "articles with most contributors" in Wikistats2 - https://phabricator.wikimedia.org/T204965 (fdans) We'll be creating a new top metric that will rank pages by the number of contributors instead of the number of...
[17:54:34] Analytics, Analytics-Kanban, WMDE-Analytics-Engineering, Wikidata, and 3 others: track number of editors from other Wikimedia projects who also edit on Wikidata over time - https://phabricator.wikimedia.org/T193641 (fdans) a:JAllemandou
[17:55:13] Analytics, Analytics-Kanban, Operations, ops-eqiad, and 2 others: rack/setup/install cloudvirtan100[1-5].eqiad.wmnet - https://phabricator.wikimedia.org/T207194 (RobH) >>! In T207194#4789624, @Ottomata wrote: > Hm. They are cattle, but it would probably be nice if the whole node doesn't go down...
[17:56:26] Analytics, Continuous-Integration-Infrastructure (Slipway): Migrate analytics/refinery/source release jobs to Docker - https://phabricator.wikimedia.org/T210271 (fdans)
[17:56:51] Analytics, Continuous-Integration-Infrastructure (Slipway): Migrate analytics/refinery/source release jobs to Docker - https://phabricator.wikimedia.org/T210271 (fdans) p:Normal>High
[17:57:00] Analytics, Continuous-Integration-Infrastructure (Slipway): Migrate analytics/refinery/source release jobs to Docker - https://phabricator.wikimedia.org/T210271 (fdans) p:High>Normal
[17:57:42] Analytics, New-Readers: Piwik access for Isaac Johnson and Chelsy Xie - https://phabricator.wikimedia.org/T210902 (fdans) Open>Resolved
[17:58:11] Analytics, New-Readers: Piwik access for Isaac Johnson and Chelsy Xie - https://phabricator.wikimedia.org/T210902 (fdans) If you have access to Turnilo you have access to piwik. Let us know otherwise.
[17:58:56] Analytics, Analytics-Kanban, Operations, ops-eqiad, and 2 others: rack/setup/install cloudvirtan100[1-5].eqiad.wmnet - https://phabricator.wikimedia.org/T207194 (RobH) Updated from IRC chat with Otto: These should have identical networking vlan setup as the cloudvirts. So we'll have to add the...
[18:03:25] Analytics: Give access to Turnilo to Pau - https://phabricator.wikimedia.org/T211036 (mforns)
[18:06:04] Analytics, Analytics-Kanban, DBA, Data-Services, and 2 others: Create materialized views on Wiki Replica hosts for better query performance - https://phabricator.wikimedia.org/T210693 (Banyek) We talked about the kick-off, and tomorrow we'll sync up about what we found. @Bstorm gives me a few SQ...
[18:06:37] Analytics, Tool-Pageviews: Statistics for views of individual Wikimedia Commons images - https://phabricator.wikimedia.org/T210313 (Nuria) @TgrI imagine that special beacon was implemented in 2015 for 'virtual mediaviews' due to scaling concerns with the old eventlogging backend. Those concerns no longer...
[18:10:46] Analytics, New-Readers: Piwik access for Isaac Johnson and Chelsy Xie - https://phabricator.wikimedia.org/T210902 (chelsyx) Hi @fdans, I can log in with my LDAP credential to https://piwik.wikimedia.org (the first step, see the screenshot), but can't log into the Mexico campaign dashboard tracking http:/...
[18:11:37] !log rerun webrequest upload load job for 2018-12-01T14:00
[18:11:38] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[18:11:53] Analytics, New-Readers: Piwik access for Isaac Johnson and Chelsy Xie - https://phabricator.wikimedia.org/T210902 (Isaac) Yep, I'm in the same boat as far as getting through the LDAP stage but not to the appropriate dashboard.
[18:21:05] mforns: as learning step for me - did you run the coordinator for just that hour?
[18:36:29] Analytics, Services, Wikimedia-Stream, Patch-For-Review: EventStreams process occasionally OOMs - https://phabricator.wikimedia.org/T210741 (Pchelolo) Mm.. I will not be that certain the deserialization of JSON is the issue here. We deserialize much much bigger messages in the job queue and have...
[18:39:31] Analytics, Services, Wikimedia-Stream, Patch-For-Review: EventStreams process occasionally OOMs - https://phabricator.wikimedia.org/T210741 (Ottomata) Hm ok. Yeah I'm not so sure this is the problem either, it just wouldn't hurt. Next time this happens let me know and we'll log in and get the d...
[18:47:07] nuria: did you see https://phabricator.wikimedia.org/T210749#4789758
[18:47:10] ;
[18:47:11] ??
[18:47:24] this is kinda bad news, we cannot use the new hadoop workers
[18:47:39] Cc ottomata --^
[18:53:52] Analytics, Research, WMDE-Analytics-Engineering, User-Addshore, User-Elukey: Phase out and replace analytics-store (multisource) - https://phabricator.wikimedia.org/T172410 (elukey) Hi everybody, I know that you probably will not believe this but we are planning the dbstore1002 migration to...
[18:56:24] elukey, yes, we run it just for that hour
[18:58:18] ack thanks :)
[18:58:27] mforns: btw you should be able to systemctl restart turnilo now
[18:58:40] elukey, cool!
[18:58:54] elukey, will help me test the changes I'm doing right now
[19:00:41] elukey: going to comment and ask why
[19:02:56] ottomata: I guess that it might mean getting very slow query performance, compared to the current settings
[19:03:06] Analytics, DBA, Data-Services, User-Elukey: Hardware for cloud db replicas for analytics usage - https://phabricator.wikimedia.org/T210749 (Ottomata) @marostegui, I think these nodes will not have any 'heavy queries'. There won't be any huge joins etc. (unless I don't fully understand the materi...
[19:08:43] Analytics, Analytics-Kanban: Failure while refining webrequest upload 2018-12-01-14 - https://phabricator.wikimedia.org/T211000 (JAllemandou) Operational problem solved: the refinement job has been restarted with higher error-acceptance rates - Blocked jobs have caught up. Some more analysis to come to t...
[19:09:16] also, milimetric, can you comment on T210749 so we know exactly what kind of things the new cloud-db replica will need to do?
[19:09:17] T210749: Hardware for cloud db replicas for analytics usage - https://phabricator.wikimedia.org/T210749
[19:09:44] (just to be sure so we lose less time in back and forth)
[19:12:34] Analytics, New-Readers: Piwik access for Isaac Johnson and Chelsy Xie - https://phabricator.wikimedia.org/T210902 (chelsyx) @Isaac I just chat with @Nuria and she told me the credential of the bienvenida dashboard. I will send to you in chat.
[19:15:37] Analytics, Analytics-Kanban: Failure while refining webrequest upload 2018-12-01-14 - https://phabricator.wikimedia.org/T211000 (elukey) p:Unbreak!>High
[19:16:05] Analytics, Analytics-Kanban: Failure while refining webrequest upload 2018-12-01-14 - https://phabricator.wikimedia.org/T211000 (elukey) Since the refined data should now be there, lowering the priority to High :)
[19:16:14] Analytics, DBA, Data-Services, User-Elukey: Hardware for cloud db replicas for analytics usage - https://phabricator.wikimedia.org/T210749 (Marostegui) >>! In T210749#4795112, @Ottomata wrote: > @marostegui, I think these nodes will not have any 'heavy queries'. There won't be any huge joins etc...
[19:17:09] Analytics, DBA, Data-Services, User-Elukey: Hardware for cloud db replicas for analytics usage - https://phabricator.wikimedia.org/T210749 (Ottomata) Ah, interesting ok.
[19:23:24] going to dinner, o/
[19:23:26] * elukey off
[19:30:42] Analytics, DBA, Data-Services, User-Elukey: Hardware for cloud db replicas for analytics usage - https://phabricator.wikimedia.org/T210749 (Milimetric) The queries that we'll be running on here will be of the form: `select [subset of fields] from [one of the tables listed below] where [timestamp...
[19:34:15] a-team, my cat has arrived with a swollen face, I'm bringing him to the vet, be back later!
[19:35:45] Analytics, Analytics-Kanban: Failure while refining webrequest upload 2018-12-01-14 - https://phabricator.wikimedia.org/T211000 (JAllemandou) Problematic requests for end-of-hour `upload 2018-12-01T14`: ` spark.sql(""" SELECT hour, hostname, sequence, LAG(sequence) OVER hour_hostname_window AS pr...
[19:40:30] Analytics, DBA, Data-Services, User-Elukey: Hardware for cloud db replicas for analytics usage - https://phabricator.wikimedia.org/T210749 (Marostegui) @Milimetric I think T210749#4795213 should go to T210693 as you are explaining what you need and kinda drafting an SQL. The problem with the cur...
[19:46:16] Analytics, DBA, Data-Services, User-Elukey: Hardware for cloud db replicas for analytics usage - https://phabricator.wikimedia.org/T210749 (Marostegui) I checked the hosts @elukey mentioned at T210749#4789691 and those have even SATA disks, so not even SAS, so even slower.
[19:47:07] Analytics, DBA, Data-Services, User-Elukey: Hardware for cloud db replicas for analytics usage - https://phabricator.wikimedia.org/T210749 (Milimetric) Got it. Luca asked me to comment here describing exactly what we need to do on the boxes. But if basic replication can't even run, I of course...
[19:53:47] Analytics, DBA, Data-Services, User-Elukey: Hardware for cloud db replicas for analytics usage - https://phabricator.wikimedia.org/T210749 (Marostegui) >>! In T210749#4795321, @Milimetric wrote: > Got it. Luca asked me to comment here describing exactly what we need to do on the boxes. But if b...
[19:56:51] Analytics, Analytics-Kanban, DBA, Data-Services, and 2 others: Create materialized views on Wiki Replica hosts for better query performance - https://phabricator.wikimedia.org/T210693 (Milimetric) More details on requirements for Analytics. The queries that we'll be running will be of the form:...
[19:57:12] Analytics, Analytics-Kanban: Failure while refining webrequest upload 2018-12-01-14 - https://phabricator.wikimedia.org/T211000 (JAllemandou) Something to notice is that the problem with the rows listed above occurs all along the hour but only shows-up at the end-border of the calendar-hour since this is...
[19:58:23] Analytics, DBA, Data-Services, User-Elukey: Hardware for cloud db replicas for analytics usage - https://phabricator.wikimedia.org/T210749 (Milimetric) Ok, done and agreed. But instead of trying to find hardware that will keep up with replication, I'm asking if replication is necessary, could we...
[20:05:41] !log dropping and recreating hive event.mediawiki_revision_score table and data - T210465
[20:05:43] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[20:05:43] T210465: Refinery Spark HiveExtensions schema merge should support merging of arrays with struct elements - https://phabricator.wikimedia.org/T210465
[20:07:21] Analytics, Operations, Security-Team, WMF-Legal, Software-Licensing: Can exfat be used in WMF production? - https://phabricator.wikimedia.org/T210667 (JBennett) > 2) With respect to the WMF charter and the values and manifestation thereof, it seems the exception process and/or the bar for ea...
[20:08:44] Analytics, Analytics-Kanban, Operations, ops-eqiad, User-Elukey: rack/setup/install cloudvirtan100[1-5].eqiad.wmnet - https://phabricator.wikimedia.org/T207194 (RobH)
[20:09:01] Analytics, Analytics-Kanban, Operations, ops-eqiad, User-Elukey: rack/setup/install cloudvirtan100[1-5].eqiad.wmnet - https://phabricator.wikimedia.org/T207194 (RobH) Ok, I have these booting into the installer, but it dislikes something about the new recipe I made for them. I'm troubleshoot...
[20:12:36] Analytics, Analytics-Kanban, DBA, Data-Services, and 2 others: Create materialized views on Wiki Replica hosts for better query performance - https://phabricator.wikimedia.org/T210693 (Marostegui) Thanks @milimetric, very useful information. Could I ask for a specific and concrete SQL? With the...
[20:14:22] Analytics, Analytics-Kanban, DBA, Data-Services, and 2 others: Create materialized views on Wiki Replica hosts for better query performance - https://phabricator.wikimedia.org/T210693 (Ottomata) @Marostegui while Dan answers your Q, I like his idea: Could the HDDs handle replication if we only r...
[20:14:49] Analytics, Tool-Pageviews: Statistics for views of individual Wikimedia Commons images - https://phabricator.wikimedia.org/T210313 (Tgr) >>! In T210313#4794936, @Nuria wrote: > @Tgr imagine that special beacon was implemented in 2015 for 'virtual mediaviews' due to scaling concerns with the old eventlogg...
[20:17:08] Quarry, Patch-For-Review: Quarry should refuse to save results that are way too large - https://phabricator.wikimedia.org/T188564 (Framawiki) Unless we can find a similar system to nfs but lighter, we should change the workers' architecture (highly related to {T178520}). Here are a few ideas, probably stup...
[20:23:44] Analytics, Analytics-Kanban, DBA, Data-Services, and 2 others: Create materialized views on Wiki Replica hosts for better query performance - https://phabricator.wikimedia.org/T210693 (Milimetric) Yes, all the templates for the queries are here, and they're easy to read, basic python templating:...
[20:24:16] ottomata: I think you wanted to put that question on the hardware task
[20:29:31] ohhho hhhoops oh well
[20:37:01] Analytics, Analytics-Kanban, DBA, Data-Services, and 2 others: Create materialized views on Wiki Replica hosts for better query performance - https://phabricator.wikimedia.org/T210693 (Marostegui) >>!
In T210693#4795448, @Ottomata wrote: > @Marostegui while Dan answers your Q, I like his idea: C...
[20:52:21] Analytics, New-Readers: Piwik access for Isaac Johnson and Chelsy Xie - https://phabricator.wikimedia.org/T210902 (Isaac) Just adding documentation that this was taken care of!
[21:16:28] Analytics, Tool-Pageviews: Statistics for views of individual Wikimedia Commons images - https://phabricator.wikimedia.org/T210313 (Nuria) > Reading Infrastructure and Analytics I suppose? Our team can support any team in reading with migration of the beacon to EL infrastructure but I think it should be...
[21:23:05] Analytics, DBA, Data-Services, User-Elukey: Hardware for cloud db replicas for analytics usage - https://phabricator.wikimedia.org/T210749 (Nuria) >So to sum up the hardware approach so far: the proposed hardware wouldn't be able to catch up with the current replication stream for all sections. I...
[21:30:28] (CR) Nuria: [V: 2 C: 2] Add shnwiki, yuewiktionary and liwikinews to sqoop list [analytics/refinery] - https://gerrit.wikimedia.org/r/477247 (https://phabricator.wikimedia.org/T209822) (owner: Fdans)
[21:46:55] (CR) Nuria: Allow for custom transforms in DataFrameToDruid (3 comments) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/477295 (https://phabricator.wikimedia.org/T210099) (owner: Mforns)
[21:47:38] Hey milimetric - Have you managed to test data with the checkers ?
[21:47:53] milimetric: I'm asking to know if I should try tomorrow :)
[21:50:04] joal: yes, not all the combinations I wanted to do, but most
[21:50:20] user: 1/11 errors, denormalized 1/22
[21:50:24] page: 0/11
[21:50:39] for the 08-old vs 09-new comparison
[21:50:48] I'll write it all in the task along with the other checks I did so far
[21:51:43] great milimetric :) The checker builds on the fact that some small diffs could be happening, so some errors on comparing only 3 wikis seem legit :)
[21:52:13] yeah, I'm still looking into it, just to see and understand
[21:54:55] Ok, gone for tonight team - see you tomorrow
[21:57:25] hey, I'm back
[22:04:37] Analytics, New-Readers: Piwik access for Isaac Johnson and Chelsy Xie - https://phabricator.wikimedia.org/T210902 (atgo) Thank you everyone!
[22:28:07] Analytics, Analytics-EventLogging, Patch-For-Review: Resurrect eventlogging_EventError logging to in logstash - https://phabricator.wikimedia.org/T205437 (Tgr) >>! In T205437#4781437, @Ottomata wrote: > The way Sam puts it here would be suitable, but we don't think that using EventLogging itself for...
[23:17:33] chelsyx: question if you may
[23:37:31] Analytics, Analytics-Kanban: Failure while refining webrequest upload 2018-12-01-14 - https://phabricator.wikimedia.org/T211000 (Nuria) When I try to repro these steps, I get a permission denied error? >spark.sql("select * from wmf_raw.webrequest where webrequest_source = 'upload' and year = 2018 and mo...
[23:41:08] Analytics, Dumps-Generation, ORES, Scoring-platform-team, and 3 others: Decide whether we will include raw features - https://phabricator.wikimedia.org/T211069 (awight)
[23:43:50] Analytics, Dumps-Generation, ORES, Scoring-platform-team, and 3 others: Decide whether we will include raw features - https://phabricator.wikimedia.org/T211069 (awight)