[01:19:10] Analytics, Product-Analytics (Kanban): SQL definition for wikidata metrics for tunning session - https://phabricator.wikimedia.org/T247099 (jwang) > it's not guaranteed that e.g. Namespace 104 will be the same thing in every wiki. Thanks for sharing this info. Then, querying content page based on a sta...
[07:33:15] Analytics, Analytics-Kanban, Patch-For-Review, User-Elukey: Add Prometheus Presto metrics and dashboards - https://phabricator.wikimedia.org/T247884 (elukey) Dashboard with first set of metrics in https://grafana.wikimedia.org/d/pMd25ruZz/presto
[07:33:26] Analytics, Analytics-Kanban, Patch-For-Review, User-Elukey: Add Prometheus Presto metrics and dashboards - https://phabricator.wikimedia.org/T247884 (elukey)
[07:34:02] Analytics, Analytics-Kanban, Product-Analytics, Patch-For-Review, User-Elukey: Kerberize Superset to allow Presto queries - https://phabricator.wikimedia.org/T239903 (elukey)
[07:34:49] Analytics, Analytics-Kanban: Upgrade jupyterhub-systemdspawner from 0.9.9 to 0.13 to allow the use of systemd custom slices - https://phabricator.wikimedia.org/T247055 (elukey) a:elukey
[07:35:22] Analytics, Analytics-Kanban, Patch-For-Review, User-Elukey: Add Prometheus Presto metrics and dashboards - https://phabricator.wikimedia.org/T247884 (elukey) p:Triage→Medium a:elukey
[10:01:44] Analytics, Analytics-Kanban, Product-Analytics, Inuka-Team (Kanban), Patch-For-Review: Set up pageview counting for KaiOS app - https://phabricator.wikimedia.org/T244547 (nshahquinn-wmf) @Milimetric is the new refinement code deployed yet? If not, do you have any idea when it will be?
[10:24:36] Quarry, DBA, Data-Services: Quarry query became work much slower - https://phabricator.wikimedia.org/T247978 (Marostegui) Another thing we can also try is upgrading labsdb1011 to Buster and MariaDB 10.4. We recently had some CPU issues with wikidatawiki (after they all got InnoDB compressed - like our...
[10:59:58] Analytics, ContentTranslation, Language-Team (Language-2020-Focus-Sprint): Test Performance of Marian NMT translation in stat cluster - https://phabricator.wikimedia.org/T247245 (MoritzMuehlenhoff) I had a closer look into a rebuild of OpenBlas for Skylake/avx512, but it turns out this isn't actually...
[11:28:24] I am thinking about how to roll back from BigTop to CDH if for some reason the upgrade doesn't go well
[11:28:29] there are two use cases:
[11:28:42] 1) something doesn't work straight after upgrade
[11:29:08] 2) we run the new version for some days before finalizing the hdfs changes
[11:29:36] from the point of view of rolling back hdfs and oozie, they are both straightforward
[11:29:45] but 2) is particularly challenging for hive/oozie
[11:30:00] since we do need to modify their database schemas
[11:30:43] when I upgraded hadoop test I made a copy of all the dbs
[11:30:58] but I just realized that something might not work fine now
[11:38:00] * elukey lunch
[13:12:38] Analytics, Release-Engineering-Team: wmfphab reverting github/wikimedia/KafkaSSE master to old commit - https://phabricator.wikimedia.org/T248170 (Ottomata)
[13:12:55] Analytics, Analytics-Cluster, User-Elukey: Upgrade the Hadoop test cluster to BigTop - https://phabricator.wikimedia.org/T244499 (elukey) The rollback of HDFS at this stage should be easy, the main question mark is the oozie/hive db schemas. We have been running the Hadoop cluster with the new versi...
[13:14:06] Analytics, EventStreams: KafkaSSE issue tracker and continuous integration - https://phabricator.wikimedia.org/T248044 (Ottomata) Ah, I did remove the Travis tag from the README! My commits are being reverted due to {T248170}
[13:15:05] Analytics, Analytics-Cluster, User-Elukey: Upgrade the Hadoop test cluster to BigTop - https://phabricator.wikimedia.org/T244499 (elukey) p:High→Medium
[13:15:17] Analytics, Analytics-Cluster, Analytics-Kanban, User-Elukey: Upgrade the Hadoop test cluster to BigTop - https://phabricator.wikimedia.org/T244499 (elukey)
[13:38:22] Analytics, Event-Platform: Event schemas common schema should set additionalProperties: false - https://phabricator.wikimedia.org/T248173 (Ottomata)
[13:59:53] Analytics, Analytics-Cluster, Analytics-Kanban, User-Elukey: Upgrade the Hadoop test cluster to BigTop - https://phabricator.wikimedia.org/T244499 (elukey) Today while re-installing the new version of oozie/hadoop packages I experienced a problem that I forgot to fix, namely: ` Unpacking oozie (...
[14:07:21] Analytics, Analytics-Kanban, Product-Analytics, Inuka-Team (Kanban), Patch-For-Review: Set up pageview counting for KaiOS app - https://phabricator.wikimedia.org/T244547 (Milimetric) Yep, it was deployed on Wednesday, I moved the task to Done (usually it would be in "In Code Review" or "Ready...
[14:10:30] Analytics, Analytics-Kanban, Product-Analytics, Inuka-Team (Kanban), Patch-For-Review: Set up pageview counting for KaiOS app - https://phabricator.wikimedia.org/T244547 (SBisson) (Moving to QA on the Inuka board to better represent the current status)
[14:10:32] Analytics, Analytics-Cluster, Analytics-Kanban, User-Elukey: Upgrade the Hadoop test cluster to BigTop - https://phabricator.wikimedia.org/T244499 (elukey) Opened https://issues.apache.org/jira/browse/BIGTOP-3330
[14:10:44] snap I forgot to open another bug for oozie https://issues.apache.org/jira/browse/BIGTOP-3330
[14:15:44] Analytics, Better Use Of Data, Desktop Improvements, Product-Infrastructure-Team-Backlog, and 7 others: Client side error logging production launch - https://phabricator.wikimedia.org/T226986 (Ottomata) Ok, https://gerrit.wikimedia.org/r/c/schemas/event/primary/+/580980 and https://gerrit.wikimed...
[14:15:48] heya luca, want to do some eventgate stuff with me on monday? :)
[14:15:55] https://phabricator.wikimedia.org/T226986#5986998
[14:17:23] Analytics, Analytics-EventLogging, Event-Platform, Patch-For-Review, Services (watching): Switch all eventgate clients to use new TLS port - https://phabricator.wikimedia.org/T242224 (Ottomata)
[14:18:13] ottomata: sure! But before standup in case, I have meetings + an interview after that :(
[14:18:36] Analytics, Analytics-EventLogging, Event-Platform, Patch-For-Review, Services (watching): Switch all eventgate clients to use new TLS port - https://phabricator.wikimedia.org/T242224 (Ottomata) EventBus switches are being worked on as part of {T247484}
[14:18:46] ok let's do before!
[14:22:40] Analytics, Analytics-Kanban, Event-Platform, User-Elukey: Documentation improvements for Eventstreams - https://phabricator.wikimedia.org/T240181 (Ottomata) > The problem is that the SSE connection breaks every now and then It does, but hopefully it won't now!
https://phabricator.wikimedia.org/T...
[14:46:51] fdans: o/ if you have time we can chat about interviews
[14:50:55] Analytics, Operations, Traffic: Create replacement for Varnishkafka - https://phabricator.wikimedia.org/T237993 (ema)
[14:50:58] Analytics, Operations, Traffic, Patch-For-Review: Test atskafka deployment - https://phabricator.wikimedia.org/T247497 (ema) Open→Resolved a:ema We do have an atskafka instance currently running in production on cp3050. This task can be considered now done, further improvements to at...
[15:45:49] Analytics, Operations, Research, Traffic, and 2 others: Enable layered data-access and sharing for a new form of collaboration - https://phabricator.wikimedia.org/T245833 (elukey) I had a chat with Miriam about this: - The pageview granularity request from Sukhbir should be handled as separate t...
[15:47:56] ottomata: not sure if you saw my ping yesterday, did you see https://grafana.wikimedia.org/d/pMd25ruZz/presto?orgId=1&from=now-12h&to=now&fullscreen&panelId=19 ?
[15:48:32] I was wondering if we could think about multi-instance presto on beefy nodes, to have smaller jvms (I mean regarding heap size)
[15:49:23] that usage will grow over time for sure, but probably handling 32G max jvm heap sizes could be more efficient
[15:49:35] Hm
[15:49:49] not sure if it makes sense or not, of course we'll need to see how presto behaves when a lot of people start using it
[15:49:49] well we will def do that when we colocate presto
[15:49:58] yeah, we should ask joal what he thinks maybe too?
[15:50:11] maybe presto works best with tons of RAM so it can do joins nicely?
[15:50:23] needs to be tested yes
[16:02:41] mforns: fdans standup?
[16:06:25] is the wikistats 2.0 aqs data queryable in hive or available via hdfs?
can't find any way to access it that's not the API over HTTP
[16:15:27] oops coming
[16:20:42] Hi bearloga - data is based on mediawiki_history, using mediawiki_history_reduced, a bizarre datasource computed on purpose for Druid
[16:25:00] hi joal: thanks! but the pre-computed data isn't accessible within the analytics cluster, then? only through the REST API?
[16:26:38] bearloga: precomputed is named "mediawiki_history_reduced" - It has a table in hive: wmf.mediawiki_history_reduced, with a snapshot partition
[16:27:00] bearloga: the data in there is even more denormalized/bizarre than the mediawiki_history one
[16:27:55] joal: ah, I understand it now. thank you very much for clarifying!
[16:30:02] np bearloga - don't hesitate with questions on that dataset, it is hidden because really not easy
[17:04:52] * elukey off!
[17:05:34] (PS1) Joal: [WIP] Airflow aqs-data-extraction example [analytics/refinery] - https://gerrit.wikimedia.org/r/582114 (https://phabricator.wikimedia.org/T241246)
[17:05:42] mforns: --^ This is my first try
[17:05:58] thanks joal :]
[17:08:07] mforns: please ping me if you want to discuss :)
[17:08:20] sure
[17:11:58] joal: wanna bc? to discuss?
[17:21:15] Analytics, Release-Engineering-Team-TODO, Continuous-Integration-Infrastructure (phase-out-jessie), Release-Engineering-Team (CI & Testing services): Migrate analytics/refinery/source release jobs to Docker - https://phabricator.wikimedia.org/T210271 (Krinkle)
[17:30:40] mforns: i was gone making fire :)
[17:31:02] np! wanna now?
[17:31:19] give me 5mins for the fire to confirm, and yes :)
[17:40:08] ready mforns! to the cave :)
[17:40:27] ok joal omw
[18:08:02] Analytics, Release-Engineering-Team-TODO, Continuous-Integration-Infrastructure (phase-out-jessie), Release-Engineering-Team (CI & Testing services): Migrate analytics/refinery/source release jobs to Docker - https://phabricator.wikimedia.org/T210271 (Jdforrester-WMF) The jessie boxes this tries...
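A minimal sketch of how the Hive table joal mentions at 16:26:38 might be queried one snapshot at a time. Only the table name `wmf.mediawiki_history_reduced` and the existence of a `snapshot` partition come from the chat; the `YYYY-MM` snapshot format, the helper name `build_snapshot_query` and the `LIMIT` default are assumptions for illustration. It only builds the query text and does not connect to Hive.

```python
import re

# Table name as given in the chat; everything else here is illustrative.
TABLE = "wmf.mediawiki_history_reduced"


def build_snapshot_query(snapshot: str, limit: int = 10) -> str:
    """Return a HiveQL string that reads a single snapshot partition.

    Assumption: snapshots are monthly and named like '2020-02'.
    """
    if not re.fullmatch(r"\d{4}-(0[1-9]|1[0-2])", snapshot):
        raise ValueError(f"unexpected snapshot format: {snapshot!r}")
    if limit <= 0:
        raise ValueError("limit must be positive")
    # Filtering on the partition column keeps Hive from scanning every snapshot.
    return f"SELECT * FROM {TABLE} WHERE snapshot = '{snapshot}' LIMIT {limit}"


if __name__ == "__main__":
    print(build_snapshot_query("2020-02"))
```

Validating the snapshot string before interpolating it keeps the sketch from producing malformed or unsafe query text; a real client would more likely use parameter binding.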
[18:12:21] Analytics, Release-Engineering-Team-TODO, Continuous-Integration-Infrastructure (phase-out-jessie), Release-Engineering-Team (CI & Testing services): Migrate analytics/refinery/source release jobs to Docker - https://phabricator.wikimedia.org/T210271 (JAllemandou) Thanks @Jdforrester-WMF for the...
[18:31:52] (PS1) Ottomata: Support multiple possible schema base URIs in EventSchemaLoader [analytics/refinery/source] - https://gerrit.wikimedia.org/r/582131 (https://phabricator.wikimedia.org/T240985)
[18:36:21] (PS2) Ottomata: Support multiple possible schema base URIs in EventSchemaLoader [analytics/refinery/source] - https://gerrit.wikimedia.org/r/582131 (https://phabricator.wikimedia.org/T240985)
[18:38:34] Analytics, Analytics-Kanban, Release-Engineering-Team-TODO, Continuous-Integration-Infrastructure (phase-out-jessie), Release-Engineering-Team (CI & Testing services): Migrate analytics/refinery/source release jobs to Docker - https://phabricator.wikimedia.org/T210271 (hashar) >>! In T210271#...
[19:28:29] Quarry, DBA, Data-Services: Quarry: Lost connection to MySQL server during query - https://phabricator.wikimedia.org/T246970 (Mike_Peel) I sort of have my query running on paws now (at https://paws.wmflabs.org/paws/user/Mike_Peel/notebooks/Query%20for%20Wikidata%20Infobox.ipynb if you can access that...
[19:41:21] hey a-team, do any of you know how to map the project column (from webrequests logs) to wikidb (such as enwiki, enwikiquote, warwiki, etc.)?
[19:41:53] dsaez: use either wmf_raw.mediawiki_project_namespace_map or the not as up to date but more friendly:
[19:42:03] canonical_data.wikis
[19:42:36] milimetric, thanks, let me check
[19:45:21] dsaez: if you want an example you can use this one: https://github.com/wikimedia/analytics-refinery/blob/master/oozie/mediawiki/geoeditors/bucketed/generate_geoeditors_bucketed_public.hql#L39
[19:45:37] amazing! thanks milimetric
[19:47:23] god dammit!
this is really good, every time I read code from the analytics team's members, it makes me feel completely ashamed of my messy code.
[20:29:11] Analytics, EventStreams: KafkaSSE issue tracker and continuous integration - https://phabricator.wikimedia.org/T248044 (Krinkle) >> However, we do generally keep at least a mirror of these in Gerrit > Is there a way to automate mirroring or does it require a manual push? Maybe! Best ask RelEng who k...
[20:52:45] heh, it's just me being obsessive about spacing and lining things up, nothing too deep
[21:13:25] ottomata: are you around?
[21:35:49] hello leila only kinda!
[21:37:28] I'll send you a link for a call if you're around. if not, monday.
[22:29:37] (PS1) QChris: Add .gitreview [analytics/wmf-product] - https://gerrit.wikimedia.org/r/582225
[22:29:39] (CR) QChris: [V: +2 C: +2] Add .gitreview [analytics/wmf-product] - https://gerrit.wikimedia.org/r/582225 (owner: QChris)
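A rough sketch of the project-to-wikidb mapping discussed at 19:41, written as a plain in-memory lookup rather than the Hive join in the linked HQL. The sample rows, the `(domain_name, database_code)` row shape, the assumption that a project value looks like `en.wikipedia` (hostname without `.org`), and the helper names are all illustrative and not taken from the chat; in practice the rows would come from a lookup table such as `canonical_data.wikis`.

```python
from typing import Dict, Iterable, Optional, Tuple

# Hypothetical sample rows shaped as (domain_name, database_code).
SAMPLE_WIKIS = [
    ("en.wikipedia.org", "enwiki"),
    ("en.wikiquote.org", "enwikiquote"),
    ("war.wikipedia.org", "warwiki"),
]


def build_lookup(rows: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Index the rows by lower-cased domain name."""
    return {domain.lower(): dbname for domain, dbname in rows}


def project_to_wikidb(project: str, lookup: Dict[str, str]) -> Optional[str]:
    """Map a project such as 'en.wikipedia' to a wikidb such as 'enwiki'.

    Assumption: appending '.org' to the project yields the domain name.
    Returns None when the project is not in the lookup.
    """
    key = project.strip().lower()
    if not key.endswith(".org"):
        key += ".org"
    return lookup.get(key)


if __name__ == "__main__":
    lookup = build_lookup(SAMPLE_WIKIS)
    for project in ("en.wikipedia", "en.wikiquote", "war.wikipedia"):
        print(project, "->", project_to_wikidb(project, lookup))
```

Returning `None` for unknown projects makes gaps in the lookup table visible instead of silently guessing a database name.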