[04:17:13] PROBLEM - Webrequests Varnishkafka log producer on cp4021 is CRITICAL: NRPE: Command check_varnishkafka-webrequest not defined
[04:24:13] RECOVERY - Webrequests Varnishkafka log producer on cp4021 is OK: PROCS OK: 1 process with args /usr/bin/varnishkafka -S /etc/varnishkafka/webrequest.conf
[08:41:52] Hi elukey
[08:50:27] (CR) Joal: [C: 1] "Looks good to me! should we merge?" [analytics/refinery] - https://gerrit.wikimedia.org/r/351667 (https://phabricator.wikimedia.org/T143119) (owner: Ottomata)
[08:58:00] joal: o/
[08:58:10] \o
[08:58:32] elukey: Shall we deploy refinery? There is a patch I'd like to see running?
[09:04:39] joal: here I am sorry, constructors at home :/
[09:04:45] np elukey
[09:04:52] +1 for the deployment
[09:05:27] k, let's go :)
[09:06:22] joal: there was an issue this morning with Varnish for text, we might see some data loss
[09:06:29] (just as FYI)
[09:06:37] ok elukey, thanks for letting me know !
[09:07:25] anything that I can help with the deploymnet?
[09:07:28] *deployment?
[09:07:44] elukey: just double checking, has deployment.eqiad been rehosted recently?
[09:08:15] so it was naos (codfw) up to yesterday, not sure if they have switched to tin yet
[09:08:35] checking
[09:09:01] yeah naos.codfw.wmnet seems the deployment server
[09:09:32] ok
[09:09:36] elukey: can I still go?
[09:11:12] from naos yes :D
[09:11:38] I think that tin will become again the deployment server today
[09:11:42] not sure when
[09:12:17] elukey: just to be sure: everything should work as expected from naos, right?
[09:12:22] Deployment server: Thursday, May 4th 2017 16:00 UTC
[09:12:29] yep yep all good
[09:12:37] k awesome :)
[09:12:44] sorry, paranoid mode :)
[09:13:10] nono it is always safe to double check, you know how I think about it :)
[09:13:23] !log Deploy refinery from naos :)
[09:13:24] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[09:14:05] joal: one thing that we might want to do in the future is to add a deploy message to scap deploy
[09:14:17] like "Weekly Analytics Refinery deployment"
[09:14:18] or similar
[09:14:33] elukey: would be good indeed !
[09:14:51] elukey: can we provide it manually at deploy ?
[09:15:13] I think that it should be enough to do scap deploy "bblablabla"
[09:15:50] yeah just double checked
[09:16:13] g
[09:16:16] great
[09:17:17] !log Deploy refinery onto hdfs
[09:17:17] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[09:49:26] Analytics: Preserve userAgent field in apps schemas - https://phabricator.wikimedia.org/T164125#3234590 (Tbayer) @mforns: Thanks for these explanations! Yes, I remember that audit and the surrounding discussions, and I'm also aware of the important privacy concerns surrounding user agents in general. That sa...
[09:56:06] added apache metrics to bohrium - https://grafana-admin.wikimedia.org/dashboard/db/piwik?orgId=1&from=now-24h&to=now
[09:56:09] \o/
[09:56:22] I've also rebooted kafka1012 for kernel upgrades
[09:59:33] (also ran preferred-replica-election)
[10:09:13] !log Rerun full druid loading for daily uniques - 0012911-170424154741156-oozie-oozi-C
[10:09:14] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[10:41:11] * elukey lunch!
[10:50:54] (PS4) Mforns: Add monthly sanitized job for banner activity [analytics/refinery] - https://gerrit.wikimedia.org/r/350219 (https://phabricator.wikimedia.org/T157582)
[11:12:07] (PS5) Mforns: Add monthly sanitized job for banner activity [analytics/refinery] - https://gerrit.wikimedia.org/r/350219 (https://phabricator.wikimedia.org/T157582)
[11:21:28] (PS6) Mforns: Add monthly sanitized job for banner activity [analytics/refinery] - https://gerrit.wikimedia.org/r/350219 (https://phabricator.wikimedia.org/T157582)
[11:36:27] joal, helloooo :]
[11:36:32] Hi mforns :)
[11:38:19] mforns: batcave?
[11:38:24] hey, qq: I've tested the monthly sanitized banner job and after a couple changes I think I got it, but I don't see the output data set in pivot, I was trying to enter the admin UI to look for it, but the ssh tunnel explained in the docs doesn't work for me
[11:38:26] sure
[11:47:36] taking a break a-team :)
[12:05:24] joal, elukey: I'm installing security updates for mysql-connector-java, the JDBC driver for mysql: https://www.computest.nl/advisories/CT-2017-0425_MySQL-Connector-J.txt
[12:05:29] this is used by sqoop (it symlinks to the package installed from Debian in cdh::sqoop)
[12:06:07] moritzm: ack!
[12:06:20] but in addition to installing the updated package we also need to review the autoDeserialize setting for sqoop, is the source code somewhere searchable?
[12:07:48] moritzm: do you mean on the hosts?
[12:08:20] no, just somewhere, I'd like to check how the version of sqoop we're using on the hadoop cluster uses Connector/J
[12:08:54] from http://sqoop.apache.org/ it seems it's a tool which is mostly used for maintenance, so it's not running as a daemon or so?
[12:09:03] ah I'd say https://github.com/apache/sqoop
[12:09:10] (this is the ASF mirror)
[12:09:43] I am completely ignorant about the sqoop usage, but I am 99% sure that it is not a daemon
[12:12:30] sqoop's usage of Connector/J is fine, so we only need the package update
[12:12:38] super
[12:12:58] shall I upgrade a single server and wait for some tests? but rather sounds safe to upgrade all, right?
[12:13:08] my only understanding is that we use it to transfer data from Hadoop to Druid
[12:13:14] but I'd need more info :)
[12:13:27] we can wait for joal to return?
[12:14:59] yep yep
[12:15:13] that would be better
[12:16:52] afaik we use mysql-connector-java probably only to contact the Hive Metastore (and maybe the Oozie one)
[12:17:10] but let's ask to Master Joseph first :)
[12:17:59] I checked the jars on a hadoop worker and it's only present for sqoop (but symlinked to the jessie deb)
[13:30:13] * elukey is drawing a map of analytics slave dbs
[14:00:13] mforns_away: let me know when I can ask you questions about EL master :)
[14:09:48] ahhh i'm here, switched to phone internet
[14:10:24] HALLO ottomata!
[14:12:44] * elukey grabs a coffee
[14:16:01] Analytics, Pageviews-API: Investigate possible pageview manipulation - https://phabricator.wikimedia.org/T164491#3235403 (Fjalapeno)
[14:17:44] joal: i'm putting my mind back in hive scala stuff for today and tomorrow...any thoughts on the casing/recursion stuff?
[14:17:54] i'm inclined to just leave it out, at least for now
[14:18:35] Analytics, Pageviews-API: Investigate possible pageview manipulation - https://phabricator.wikimedia.org/T164491#3235355 (Fjalapeno) @tbayer @milimetric any idea what's going on here? Is there an existing ticket?
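The Connector/J review above (12:05-12:17, picked up again around 14:27 below once joal is back) comes down to how the sqoop imports build their JDBC connect string: the advisory and moritzm both point at the autoDeserialize connection property, so the mitigation is the upgraded Debian package plus keeping that property out of the connect URLs. A rough sketch of the kind of one-shot import sqoop runs against a MariaDB replica, wrapped in Python purely for illustration; host, credentials, table and HDFS paths are placeholders, not the real job's parameters:

    # Hypothetical sqoop import against an analytics MariaDB replica. The --connect
    # URL is where a Connector/J property such as autoDeserialize would have to be
    # appended to matter; an import like this never sets it.
    import subprocess

    subprocess.check_call([
        "sqoop", "import",
        "--connect", "jdbc:mysql://analytics-store.example/enwiki",  # placeholder host/db
        "--username", "research",                                    # placeholder user
        "--password-file", "/user/hdfs/mysql-pass.txt",              # placeholder path
        "--table", "revision",                                       # placeholder table
        "--target-dir", "/wmf/data/raw/mediawiki/revision",          # placeholder HDFS dir
        "--num-mappers", "4",
    ])

As noted in the chat, sqoop is not a daemon: each invocation like this launches a short-lived Hadoop job, which is why upgrading the package everywhere is the whole rollout.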
[14:19:26] Analytics, Pageviews-API, Wikipedia-Android-App-Backlog, Wikipedia-iOS-App-Backlog: Investigate possible pageview manipulation - https://phabricator.wikimedia.org/T164491#3235422 (Fjalapeno)
[14:20:56] Analytics-Tech-community-metrics, Phabricator: Closed tickets in Bugzilla migrated without closing event? - https://phabricator.wikimedia.org/T107254#3235431 (hashar)
[14:22:33] Analytics-Tech-community-metrics, Phabricator: Closed tickets in Bugzilla migrated without closing event? - https://phabricator.wikimedia.org/T107254#1490798 (hashar) From T164477 that mess up the burnup charts at https://phabricator.wikimedia.org/maniphest/report/burn/ for the #beta-cluster-infrastructu...
[14:26:45] Hey ottomata - Nothing really more than what we have last time
[14:26:50] Hi moritzm
[14:27:35] moritzm: Sqoop is a utility launching hadoop jobs to grab data from SQL stores.
[14:28:17] moritzm: We use it against our mariadb instances (analytics store and labs), I think this is the most prominent usage of the MysqlConnector
[14:29:07] ahhhh I didn't understand anything then :D
[14:29:19] good that I waited for you :)
[14:30:50] so, let's upgrade mysql-connector-java on one node, then you can run some tests and then proceed with the rest?
[14:31:22] moritzm: not that easy unfortunately - I don't choose which node the jobs gets launch on
[14:31:44] joal, back
[14:31:59] however moritzm, I don't think we have production run other than the one we run monthly, so we have some time to test: )
[14:33:35] I'll upgrade all of them, then? if that runs only once per months it's easy to spot a potential regression
[14:34:59] moritzm: You can upgrade all of them, I'll run a manual test once you're done, and we'll be careful next month :)
[14:35:27] ok, doing that now
[14:36:08] (CR) Ottomata: [C: 1] Update restbase oozie spark job [analytics/refinery] - https://gerrit.wikimedia.org/r/349266 (https://phabricator.wikimedia.org/T163479) (owner: Joal)
[14:36:16] mforns: to the cave?
[14:36:22] joal, yessir
[14:37:27] (CR) Milimetric: [V: 2 C: 2] Add README.mediawiki-tables-sqoop-orm [analytics/refinery] - https://gerrit.wikimedia.org/r/351667 (https://phabricator.wikimedia.org/T143119) (owner: Ottomata)
[14:37:45] joal: done, let me know if anything breaks
[14:37:58] moritzm: will test later on today and keep you updated
[14:38:05] moritzm: Thanks for keeping our stuff safe: )
[14:38:18] ok, thanks
[14:39:42] (CR) Milimetric: [V: 2 C: 2] "We can submit this one but it's chained with the parent, and I'm not sure if that's tested." [analytics/refinery] - https://gerrit.wikimedia.org/r/351667 (https://phabricator.wikimedia.org/T143119) (owner: Ottomata)
[14:41:53] (CR) Milimetric: "I missed this bug in my previous reviews, but we can't call TimeseriesData.mergeAll on objects with duplicateDates, because we merge on th" [analytics/dashiki] - https://gerrit.wikimedia.org/r/351210 (owner: Nuria)
[14:42:35] Analytics: Cleaning scheme for banner data _SUCCESS files - https://phabricator.wikimedia.org/T164497#3235487 (mforns)
[14:44:20] moritzm: joal, mysql-connector-java is used for any hadoop/jvmy service that uses the mysql metastore on analytics1003 too
[14:44:31] hive, oozie, druid, etc.
[14:44:37] ottomata: Arf, didn't think about that
[14:45:22] This is the only thing that I got right :)
[14:45:24] 50%
[14:45:28] ottomata: obviously, and it's probably used by spark as well when using Hive
[14:45:36] (CR) Milimetric: "For the first problem, if there are less than 35 days of data available," [analytics/dashiki] - https://gerrit.wikimedia.org/r/350692 (https://phabricator.wikimedia.org/T160796) (owner: Milimetric)
[14:46:16] hmm, joal probably not directly
[14:46:16] hive clients talk to hive-server / hive-metastore services
[14:46:16] which in turn use MySQL to look up tables, partitions, etc.
[14:46:17] (CR) Milimetric: "oops, trying again, for the first problem, we'd have to add links between the data and the filters that don't currently exist, and I'd pre" [analytics/dashiki] - https://gerrit.wikimedia.org/r/350692 (https://phabricator.wikimedia.org/T160796) (owner: Milimetric)
[14:47:35] ottomata: so what about the other hadoop services using the metastore directly?
[14:47:49] I thought that all were using mysql-connector when using hive
[14:48:04] but there is also the hive-server
[14:48:06] mmmm
[14:48:14] okok makes sense
[14:48:28] only a few of them for specific reason access the metastore directly
[14:48:36] (CR) Milimetric: "Wondering what the status of this is, I look at it from time to time but I haven't seen movement and it's still labeled [WIP]. I'm happy " [analytics/refinery] - https://gerrit.wikimedia.org/r/331100 (https://phabricator.wikimedia.org/T137321) (owner: Gergő Tisza)
[14:48:37] using hive == hitting the hive-server
[14:48:41] right ?
[14:49:20] elukey: ya, either hive-server or hive-metastore (both are daemons on analytics1003)
[14:49:25] i don't remember exactly which happens when
[14:49:30] but hive clients don't talk to mysql directly
[14:50:27] yep yep makes sense
[14:56:25] nuria_: https://grafana.wikimedia.org/dashboard/db/aqs-elukey?orgId=1&from=now-6h&to=now - Giuseppe just switched Restbase to eqiad again, latency going down
[14:56:45] Analytics, Pageviews-API, Wikipedia-Android-App-Backlog, Wikipedia-iOS-App-Backlog: Investigate possible pageview manipulation - https://phabricator.wikimedia.org/T164491#3235355 (Nuria) @Fjalapeno this is likely an undetected bot doing tons of requests.
[14:56:53] elukey: tears come to my eyes
[14:57:12] :)
[15:01:30] milimetric: standup
[15:02:39] Analytics, Analytics-Kanban: Document rationale of choosing druid - https://phabricator.wikimedia.org/T164302#3229272 (Nuria) a:Nuria
[15:05:11] Analytics: Add jobs for druid compaction for pageviews data set - https://phabricator.wikimedia.org/T164500#3235588 (mforns)
[15:14:36] (CR) Ottomata: [C: 1] "Haven't reviewed in detail, as it seems others have. +1 from me for the idea." [analytics/refinery] - https://gerrit.wikimedia.org/r/347653 (https://phabricator.wikimedia.org/T159727) (owner: Joal)
[15:17:31] Analytics-Kanban: Create purging script for mediawiki-history data - https://phabricator.wikimedia.org/T162034#3235638 (Nuria) a:fdans>None
[15:20:54] Analytics, Pageviews-API, Wikipedia-Android-App-Backlog, Wikipedia-iOS-App-Backlog: Investigate possible pageview manipulation - https://phabricator.wikimedia.org/T164491#3235646 (Fjalapeno) @nuria anything we can/should do?
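To make the hive-server exchange above (14:44-14:50) concrete: a Hive client only ever talks to the HiveServer2 / metastore daemons on analytics1003, and it is those daemons that reach MySQL through mysql-connector-java; the client side never loads the jar. A minimal sketch of the client path, using PyHive purely as an illustration (not necessarily the client library in use here; the hostname form and the stock HiveServer2 port 10000 are assumptions):

    # Illustration only: table/partition metadata comes back from HiveServer2, which
    # asks the metastore service, which in turn queries MySQL via Connector/J.
    # The MySQL hop happens entirely inside those daemons, never in this client process.
    from pyhive import hive  # any HiveServer2 client (beeline, JDBC, ...) behaves the same

    conn = hive.connect(host="analytics1003.eqiad.wmnet", port=10000)  # assumed host/port
    cursor = conn.cursor()
    cursor.execute("SHOW TABLES IN wmf")  # resolved through the metastore, not MySQL directly
    print(cursor.fetchall())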
[15:30:35] Analytics-Tech-community-metrics, Developer-Relations (Apr-Jun 2017), Regression: Only display organizations defined in Wikimedia's DB (disable assuming orgs via hostnames in email addresses) - https://phabricator.wikimedia.org/T161308#3235658 (Albertinisg) Sorry for the long delay into this. The lis...
[15:43:06] Analytics-Tech-community-metrics, Developer-Relations (Apr-Jun 2017): On the "Git" dashboard, filtering on one organization still lists authors who are with another organization - https://phabricator.wikimedia.org/T157709#3235676 (Albertinisg) @Aklapper , after the changes made in the DB the issue is gon...
[15:47:37] Analytics-Tech-community-metrics, Developer-Relations (Apr-Jun 2017): Updated data in mediawiki-identities DB not deployed onto wikimedia.biterg.io? - https://phabricator.wikimedia.org/T157898#3235688 (Albertinisg) @Aklapper the issue itself should be fixed now. However, as we are still not updating the...
[15:52:33] Analytics, Pageviews-API, Wikipedia-Android-App-Backlog, Wikipedia-iOS-App-Backlog: Investigate possible pageview manipulation - https://phabricator.wikimedia.org/T164491#3235355 (JAllemandou) Looking at [[ https://tools.wmflabs.org/pageviews/?project=en.wikipedia.org&platform=mobile-web&agent=us...
[15:52:46] Analytics, Pageviews-API, Wikipedia-Android-App-Backlog, Wikipedia-iOS-App-Backlog: Investigate possible pageview manipulation - https://phabricator.wikimedia.org/T164491#3235708 (JAllemandou) Open>Invalid
[16:03:57] Analytics, Pageviews-API, Wikipedia-Android-App-Backlog, Wikipedia-iOS-App-Backlog: Investigate possible pageview manipulation - https://phabricator.wikimedia.org/T164491#3235735 (Tbayer) @Nuria, what makes you think this is a bot? Actually that rise looks rather organic to me. It coincides wit...
[16:06:04] Analytics, Pageviews-API, Wikipedia-Android-App-Backlog, Wikipedia-iOS-App-Backlog: Investigate possible pageview manipulation - https://phabricator.wikimedia.org/T164491#3235737 (Tbayer) (PS: hadn't seen @JAllemandou's response and close, but agree with them ;)
[16:15:21] Analytics: Add jobs for druid compaction for pageviews data set - https://phabricator.wikimedia.org/T164500#3235588 (Nuria) p:Triage>Normal
[16:17:59] Analytics: Cleaning scheme for banner data _SUCCESS files - https://phabricator.wikimedia.org/T164497#3235487 (Nuria) p:Triage>Normal
[16:19:22] Analytics, Analytics-EventLogging: EventLogging tests fail for python 3.4 in Jenkins - https://phabricator.wikimedia.org/T164409#3235785 (Nuria) p:Triage>Normal
[16:21:59] Analytics, Scoring-platform-team, rsaas-articlequality, Spike: [Spike] Store article quality data inside hadoop and make AQS outputs a public API - https://phabricator.wikimedia.org/T164377#3231651 (Nuria) Could we add a bit more info here?
[16:22:54] Analytics, Scoring-platform-team, rsaas-articlequality, Spike: [Spike] Store article quality data inside hadoop and make AQS outputs a public API - https://phabricator.wikimedia.org/T164377#3231651 (Nuria) p:Triage>Low
[16:23:45] Analytics: Investigate the use of local_quorum for AQS - https://phabricator.wikimedia.org/T164348#3230766 (Nuria) We need to do a puppet change, check latencies and rollback/proceed as pertains
[16:23:55] Analytics-Kanban: Investigate the use of local_quorum for AQS - https://phabricator.wikimedia.org/T164348#3235812 (Nuria)
[16:25:29] Analytics, Analytics-Dashiki: Compare layout doesn't handle files with non daily resolution - https://phabricator.wikimedia.org/T164335#3235821 (Nuria) Open>Invalid
[16:25:43] Analytics-Kanban: Document rationale of choosing druid - https://phabricator.wikimedia.org/T164302#3235822 (Nuria)
[16:26:18] Analytics, Research-and-Data-Backlog: Host API for token persistence dataset - https://phabricator.wikimedia.org/T164280#3235824 (Nuria) p:Triage>Normal
[16:26:21] !log set daily cron archiver (rather than every hour) for Piwik on bohrium
[16:26:22] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[16:27:26] Analytics: Alarms on pageview API latency increase - https://phabricator.wikimedia.org/T164243#3226937 (Nuria) p:Triage>High
[16:28:26] Analytics: AQS unique devices api should report offset/underestimate separately - https://phabricator.wikimedia.org/T164201#3235834 (Nuria) p:Triage>Normal
[16:28:40] Analytics, Pageviews-API, Wikipedia-Android-App-Backlog, Wikipedia-iOS-App-Backlog: Investigate possible pageview manipulation - https://phabricator.wikimedia.org/T164491#3235836 (Fjalapeno) @JAllemandou @Tbayer that makes a lot of sense… thanks for looking. I think because of the fact it was an...
[16:29:13] Analytics: Spike, test idea on spark job that reads tags and produces different outputs - https://phabricator.wikimedia.org/T164020#3235838 (Nuria)
[16:29:38] Analytics-Kanban: Webrequest tagging and distribution. Measuring non-pageview requests - https://phabricator.wikimedia.org/T164019#3235840 (Nuria)
[16:30:48] Analytics: Investigate oozie suspended workflows - https://phabricator.wikimedia.org/T163933#3215250 (Nuria) p:Triage>Normal
[16:31:29] Analytics, Analytics-Cluster: Monitor that no worker nodes are in the default rack in net topology - https://phabricator.wikimedia.org/T163909#3235846 (Nuria) p:Triage>Low
[16:32:43] Analytics, Analytics-Cluster: Monitor HDFS blocks problems - https://phabricator.wikimedia.org/T163908#3235861 (Nuria) p:Triage>Normal
[16:32:58] Analytics, Scoring-platform-team, rsaas-articlequality, Spike: [Spike] Store article quality data inside hadoop and make AQS outputs a public API - https://phabricator.wikimedia.org/T164377#3235864 (Halfak) See T146718. There are no privacy considerations. Right now, we have a dataset we want...
[16:33:24] Analytics-Cluster, Analytics-Kanban: Monitor hdfs-balancer - https://phabricator.wikimedia.org/T163907#3235878 (Nuria)
[16:34:54] Analytics, Analytics-Cluster: Monitor hdfs-balancer - https://phabricator.wikimedia.org/T163907#3213994 (Nuria) p:Triage>Normal
[16:36:57] Analytics, Analytics-General-or-Unknown: Provide regular cross-wiki reports on flagged revisions status - https://phabricator.wikimedia.org/T44360#3235893 (Zache) In fiwiki we are tracking couple key numbers.
* maximum pending lag.
* number of pending changes
Max pending lag doesn't really matter, but...
[16:37:21] ottomata: kafka1012 is running with the 4.9 kernel FYI
[16:37:27] (will complete the rollout this week)
[16:41:15] Analytics: add a more friendly message to ladp authentication box for pivot - https://phabricator.wikimedia.org/T163797#3235916 (Nuria)
[16:41:32] Analytics: add a more friendly message to ladp authentication box for pivot - https://phabricator.wikimedia.org/T163797#3210388 (Nuria) p:Triage>Low
[16:44:22] Analytics: Enable nested on-wiki config pages in mediawiki-storage - https://phabricator.wikimedia.org/T163725#3207314 (Nuria) p:Triage>Normal
[16:46:03] * milimetric lunch
[16:46:05] nuria_: have a minute for global uniques?
[16:46:45] nice
[16:48:40] Analytics-Tech-community-metrics: Author names that include commata or "and" are split into separate identities in the frontend - https://phabricator.wikimedia.org/T161241#3235966 (Albertinisg) a:Albertinisg Issue fixed, see https://wikimedia.biterg.io:443/goto/42815da25b8d3bf1eb2dbb9081fe3139
[16:50:01] ottomata: yessss managed to wrap the kafka reader, I think
[16:50:27] niiice
[16:50:36] fdans: i have an idea about making a generic filter reader...
[16:50:37] trying it
[16:53:42] nuria_: ping?
[17:03:21] (CR) Nuria: [V: 2 C: 2] "I can no longer repro problem #2, tried with build source and several interactions but everything seems to work well. Thank you." [analytics/dashiki] - https://gerrit.wikimedia.org/r/350692 (https://phabricator.wikimedia.org/T160796) (owner: Milimetric)
[17:03:59] fdans: https://codeshare.io/aITwc
[17:04:13] Analytics-Kanban: Document rationale of choosing druid - https://phabricator.wikimedia.org/T164302#3236039 (Nuria) Please see: https://wikitech.wikimedia.org/wiki/Analytics/Systems/Druid
[17:04:19] lookin!
[17:04:23] Analytics-Kanban: Document rationale of choosing druid - https://phabricator.wikimedia.org/T164302#3236040 (Nuria) Open>Resolved
[17:08:15] Analytics-Cluster, Analytics-Kanban, Operations: Reinstall Analytics Hadoop Cluster with Debian Jessie - https://phabricator.wikimedia.org/T157807#3236053 (Nuria)
[17:08:17] Analytics-Cluster, Analytics-Kanban, Operations, User-Elukey: Reimage the Hadoop Cluster to Debian Jessie - https://phabricator.wikimedia.org/T160333#3095234 (Nuria) Open>Resolved
[17:08:33] Analytics-Kanban: Productionize Edit History Reconstruction and Extraction - https://phabricator.wikimedia.org/T152035#3236054 (Nuria) Open>Resolved
[17:08:35] Analytics-Kanban: Wikistats 2.0. - https://phabricator.wikimedia.org/T130256#3236056 (Nuria)
[17:08:44] Analytics-EventLogging, Analytics-Kanban: Implement EventLogging Hive refinement - https://phabricator.wikimedia.org/T162610#3236058 (Nuria)
[17:08:46] Analytics, Analytics-EventLogging: Find an alternative query interface for eventlogging on analytics cluster that can replace MariaDB - https://phabricator.wikimedia.org/T159170#3236059 (Nuria)
[17:08:48] Analytics-EventLogging, Analytics-Kanban: Research Spike: Better support for Eventlogging data on hive - https://phabricator.wikimedia.org/T153328#3236057 (Nuria) Open>Resolved
[17:08:57] Analytics-Kanban, Patch-For-Review: Label mediawiki_history snapshots for the last month they include - https://phabricator.wikimedia.org/T163483#3236060 (Nuria) Open>Resolved
[17:09:07] Analytics-Kanban: Piwik improvements - https://phabricator.wikimedia.org/T163000#3236064 (Nuria)
[17:09:09] Analytics-Kanban, Patch-For-Review, User-Elukey: Piwik puppet configuration refactoring and updates - https://phabricator.wikimedia.org/T159136#3236063 (Nuria) Open>Resolved
[17:10:00] fdans: whatcha think?
[17:10:44] (sorry 1s, had to go afk)
[17:10:48] np
[17:11:47] * elukey afk!
[17:13:16] joal: do you have a min to brain bounce how to run hive el stuff in prod?
[17:23:38] ottomata: this looks cool! it actually helped me understand plugged in consumers better :)
[17:23:56] yeahh awesome
[17:24:06] not totally sure how we'd get our custom function in there...
[17:24:19] maybe we have to commit it to EL code somewhere silly?
[17:24:44] i wonder if we deployed it as a plugin file, if we could auto load it into scope
[17:24:53] like, if there was some python script in /usr/local/lib/eventlogging
[17:24:55] that had a function
[17:25:05] filter_not_bot(event):
[17:25:19] could we just use it?
[17:25:30] yeah I think I did that earlier
[17:25:31] function=filter_not_bot
[17:25:32] ?
[17:26:06] like from eventlogging import udfs
[17:26:17] and then call it like udfs[function]
[17:26:27] hmm
[17:26:36] how would udfs get populated?
[17:26:39] (my javascripter within is twitching with the use of "function" as varname)
[17:26:50] haha, we can call it whatever
[17:26:58] it'll be whatever makes the most sense as the URI query param though
[17:27:21] filterFn=filter_not_bot
[17:27:25] naww
[17:27:28] to me it makes sense to commit udfs as a module within eventlogging
[17:27:28] something like that though
[17:27:31] hmm
[17:27:39] but then they are not 'Udfs'
[17:27:43] if they have to be committed to EL repo
[17:27:53] they are developer defined functions :)
[17:28:06] ddfs it is then :P
[17:28:13] hah
[17:28:24] but, if it does auto load in a plugin file
[17:28:26] nah, yeah that makes sense
[17:28:29] we don't really need to maintain a structure, right?
[17:28:36] we can just check scope
[17:28:38] i'm not sure that will work
[17:28:39] but it might
[17:28:46] yeah I thought that made sense in your code
[17:28:59] why wouldn't it work?
[17:29:20] hey ottomata
[17:29:43] hmm, maybe it will
[17:29:48] load_plugins() happens in handlers.py
[17:29:48] and does
[17:29:49] imp.load_source('__eventlogging_plugin_%x__' % hash(plugin), plugin)
[17:29:59] so, hopefully, that will load whatever defs are there into local or global scope
[17:30:05] and they will be totally accessible
[17:30:18] so then, we can just deploy a plugin (via puppet), that only contains the filter_not_bot function
[17:30:35] add this @reads('filter') handler to EL codebase
[17:30:41] and then the rest is configuration in puppet
[17:30:54] joal: hey
[17:31:15] so, yaaaaaa. things
[17:31:22] 1. oozie + hive are not friends, so no oozie
[17:31:28] ottomata: right
[17:31:34] 2. there are LOTS of different el schemas
[17:31:47] ottomata: or oozie launching spark through shell
[17:32:01] hmmm, yeah i guess we could try that...we didn't try that yet, right?
[17:32:10] ottomata: I don't think so
[17:32:57] ok i will try first
[17:32:57] https://gerrit.wikimedia.org/r/#/c/201009/
[17:33:12] if we do that joal, that means LOTS of new small jobs
[17:33:17] ottomata: I'm going to play with your code in beta
[17:33:22] fdans: ok cool
[17:33:54] ottomata: I'd rather go for 1 single big job, no?
[17:34:07] joal: so, one that depends on all schemas for a single hour or day?
[17:34:14] yes ottomata
[17:34:19] once they are all present, launch, and iterate over each one in the job?
[17:34:25] yah
[17:34:28] hm ok
[17:34:47] i guess we are importing them all in the same camus job
[17:35:04] ottomata: 1 oozie job per EL schema makes it really unmanageable IMO
[17:35:25] yeah, it's huge. it is cleaner from a separation of concerns perspective, but managing all those jobs would be pretty annoying
[17:35:38] ok, i'll see what i can do with spark submit shell oozie with what I got now, and then see if i can write a job that iterates
[17:35:51] k
[17:36:09] (PS19) Ottomata: Spark + JSON -> Hive [analytics/refinery/source] - https://gerrit.wikimedia.org/r/346291 (https://phabricator.wikimedia.org/T161924) (owner: Joal)
[17:36:14] joal: if you have a sec, I think ^ is ready as is
[17:36:31] i removed the recursive denormalize stuff
[17:38:20] k
[17:39:03] gonna run home, back shortly
[17:47:45] (CR) Mforns: [C: 2] "+2, didn't merge because we still want to talk about reruns, right? Otherwise, self-merge please :]" [analytics/refinery] - https://gerrit.wikimedia.org/r/347653 (https://phabricator.wikimedia.org/T159727) (owner: Joal)
[17:53:24] joal, what do you think happens if we have druid data compacted in monthly segments, and we overwrite partial data on top of it in daily segments? will it still be queryable?
[18:05:05] Analytics, Research-and-Data-Backlog: Host API for token persistence dataset - https://phabricator.wikimedia.org/T164280#3236375 (DarTar)
[18:05:05] mforns: I actually don't know
[18:05:23] joal, been googling but didn't find anything
[18:05:36] mforns: I can imagine data will still be queryable, but I'm not sure
[18:05:56] intuitively for me, I'd say it won't work...
[18:06:19] because druid is going to look for the data in monthly segments...
[18:06:38] mmm
[18:07:08] mforns: you can provide data to druid in various segments granularity
[18:07:23] joal, yes, but overlapping...
[18:08:00] That's why I'm not sure
[18:08:28] nuria_: ping again before I leave?
[18:08:37] Joal: yes here
[18:08:41] joal: sorry
[18:08:41] Heya !
[18:08:44] np nuria_ :)
[18:08:45] batcave?
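Picking up the plugin idea discussed above (17:24-17:31): because load_plugins() in handlers.py imports every plugin file with imp.load_source(), a puppet-deployed script under /usr/local/lib/eventlogging only needs to define the function at module level for the proposed @reads('filter') handler to look it up by name. A minimal sketch of such a plugin; the file path, the capsule field name and the bot heuristic are all assumptions, and the handler that would call it still has to be written:

    # Hypothetical plugin file, e.g. /usr/local/lib/eventlogging/filters.py (the path
    # floated above). Loaded via imp.load_source(), so top-level defs end up importable
    # by whatever handler resolves function=filter_not_bot from the reader URI.

    BOT_MARKERS = ("bot", "spider", "crawler")  # assumed heuristic, not an agreed rule


    def filter_not_bot(event):
        """Return True when the event does not look bot-generated.

        Assumes the event is a dict whose capsule carries a raw 'userAgent' string;
        adjust the field name to the actual EventCapsule schema.
        """
        user_agent = str(event.get('userAgent') or '').lower()
        return not any(marker in user_agent for marker in BOT_MARKERS)

The handler side would then map a reader URI parameter like function=filter_not_bot (or filterFn=, as floated above) onto this callable and drop events for which it returns False; that piece is the part that still needs to land in the EventLogging codebase.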
[18:08:51] nuria_: 3minutes yes please: )
[18:08:56] yessir
[18:11:08] nuria_: alternate cave am I?
[18:14:28] joal, this discussion (kinda) suggests that overlapping intervals delete the old data (in our case the daily rerun would delete monthly data?): https://groups.google.com/forum/#!topic/druid-development/klPt_qiICMw
[18:30:36] elukey, yt?
[18:35:53] (CR) Mforns: [C: 2] "I think daily reruns (which is our most likely case) will work perfectly when the re-run dates belong to the current month. Once the month" [analytics/refinery] - https://gerrit.wikimedia.org/r/347653 (https://phabricator.wikimedia.org/T159727) (owner: Joal)
[18:57:38] joal: when you add the new charts since you have the data for non globals daily
[18:58:01] joal: can you include details on underestimate and offset variations?
[18:58:05] for non globals
[18:58:09] makes sense?
[19:16:36] (CR) Mforns: [C: 1] "LGTM! Just one potentially silly comment :]" (1 comment) [analytics/refinery] - https://gerrit.wikimedia.org/r/349266 (https://phabricator.wikimedia.org/T163479) (owner: Joal)
[20:05:41] (PS3) Nuria: Changes datasets api to accept a project or array of same [analytics/dashiki] - https://gerrit.wikimedia.org/r/351210
[20:08:22] (CR) Milimetric: [V: 2 C: 2] Changes datasets api to accept a project or array of same [analytics/dashiki] - https://gerrit.wikimedia.org/r/351210 (owner: Nuria)
[20:09:00] Analytics, Community-Tech-Sprint: Investigation: How can we improve the speed of the popular pages bot - https://phabricator.wikimedia.org/T164178#3236946 (Niharika) I made https://github.com/wikimedia/popularpages/pull/5 to use promises for fetching redirects. This PR also takes out throttling.
[20:51:30] Analytics, Community-Tech-Sprint: Investigation: How can we improve the speed of the popular pages bot - https://phabricator.wikimedia.org/T164178#3237015 (Niharika) I believe @kaldari did some testing with the above PR and found it to be ~3 times faster for one wikiproject. It's worth running the bot wi...
[23:39:08] Analytics-Kanban, Patch-For-Review: Label mediawiki_history snapshots for the last month they include - https://phabricator.wikimedia.org/T163483#3237366 (Neil_P._Quinn_WMF) {meme, src="tech-barnstar"}
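On the Druid question from 17:53-18:14 above: the monthly compaction and a daily re-run differ only in the segmentGranularity of their indexing specs, while intervals controls which slice of time gets (re)written; the linked thread is about what happens when a daily interval lands inside a month that was already indexed as one monthly chunk. A sketch of the two granularitySpec blocks, written as Python dicts purely for illustration (field names follow the standard Druid batch ingestion spec; dates are examples):

    # Assumed shape of the granularitySpec section of the two indexing jobs being
    # compared; whether the rest of the month stays queryable after the daily
    # re-run is exactly the behaviour to verify before relying on it.
    monthly_compaction = {
        "type": "uniform",
        "segmentGranularity": "MONTH",           # one segment set per month
        "queryGranularity": "HOUR",
        "intervals": ["2017-04-01/2017-05-01"],  # re-indexes the whole month
    }

    daily_rerun = {
        "type": "uniform",
        "segmentGranularity": "DAY",             # one segment set per day
        "queryGranularity": "HOUR",
        "intervals": ["2017-04-15/2017-04-16"],  # overlaps the monthly interval above
    }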