[00:15:29] milimetric: yes, tags are working, next would be to split
[00:15:33] milimetric: webrequest
[00:15:57] milimetric: but i do not think we will do that until wikistats backend is on its way
[00:15:59] yeah, I was considering it for Amir's thing but it's gotta aggregate on top of tagging so it won't work easily
[00:16:50] milimetric: we have work regarding tagging that is still pending but we need to have an alpha of wikistats before we move back to that work
[00:17:28] oh yeah, definitely agreed
[00:17:35] milimetric: we even have pending changesets for split: https://gerrit.wikimedia.org/r/#/c/357814/
[00:17:54] cool, maybe this work I'm doing now will give that some new ideas when we pick it back up
[00:18:09] because sometimes tags we want might be a little more complicated and geared towards aggregation
[00:18:19] but I'll just remember when we get there, no rush
[00:44:22] (PS2) Milimetric: [WIP] Create an Oozie job for interlanguage link table creation [analytics/refinery] - https://gerrit.wikimedia.org/r/365517 (https://phabricator.wikimedia.org/T170764) (owner: Amire80)
[05:32:51] Analytics, DBA: Drop MoodBar tables from all wikis - https://phabricator.wikimedia.org/T153033#3656161 (Marostegui) >>! In T153033#3653939, @Nuria wrote: > @marostegui: let's put them on a mediawiki-archive database, the staging database (if I am not mistaken) has open permits for everyone to delete /up...
[07:12:43] mooooooorning!
[07:20:51] Analytics-Kanban, Patch-For-Review, User-Elukey: dbstore1002 /srv filling up - https://phabricator.wikimedia.org/T168303#3656279 (jcrespo) Compressing the same table on dbstore1002 gave us 1TB back, it took 1 day and 12 hours and required 150GB during the process: https://grafana.wikimedia.org/dashb...
[07:38:51] fdans: hola :)
[08:22:41] * elukey running errand for ~1h, ttl!
[09:08:19] back :)
[09:19:22] Analytics-Tech-community-metrics, Developer-Relations (Oct-Dec 2017): Zero results shown for certain repositories on Git dashboard (though there has been Git activity) - https://phabricator.wikimedia.org/T175351#3656497 (Aklapper) Open>Resolved a:Aklapper Confirmed that it was an issue with s...
[09:19:45] Analytics-Tech-community-metrics, Developer-Relations (Oct-Dec 2017): Zero results shown for certain repositories on Git dashboard (though there has been Git activity) - https://phabricator.wikimedia.org/T175351#3591043 (Aklapper) a:Aklapper>None
[09:22:06] (PS1) Fdans: Add top articles by pageviews metric [analytics/wikistats2] (develop) - https://gerrit.wikimedia.org/r/382139 (https://phabricator.wikimedia.org/T175266)
[09:24:37] omg I never thought I'd be this excited about using gerrit
[09:33:33] (PS1) Fdans: Add stub of new contributing and content metrics [analytics/wikistats2] (develop) - https://gerrit.wikimedia.org/r/382143 (https://phabricator.wikimedia.org/T175268)
[09:38:23] lol
[11:29:33] * elukey lunch!
[13:39:15] hey all
[13:39:44] o/
[13:53:01] milimetric: Jaime this morning started the compression of a wikidata s5 table, he said that the same move on dbstore1001 got 1TB back :P
[13:53:15] woah?!
[13:53:22] but it was only 490+GB?
[13:53:29] how... in the hell does that toku thing work
[13:56:32] there is also the index etc.. so more GBs :)
[14:03:51] another interesting thing discovered today: http://druid.io/docs/0.9.1/configuration/index.html
[14:04:15] I was convinced that Druid exposed a http api to retrieve metrics
[14:04:30] since on the mbeans/jmx there are only jvm metrics
[14:04:44] instead, Druid has the concept of emitters
[14:05:06] so the metrics are jsonized and sent either to a logfile or to a http endpoint
[14:06:15] so I am thinking to write a prometheus agent to run on those hosts that accepts a POST request from the druid daemons, keep the last result and returns it (properly formatted) when a http get call asks prometheus metrics
[14:09:25] that seems valuable but does not sound like fun at all to me :)
[14:11:30] the alternative would be to only get mbeans via jmx, and ask to upstream if they could expose some metrics via mbeans
[14:11:46] but I am pretty sure they don't want to do it :)
[14:12:43] http://druid.io/docs/0.9.1/operations/metrics.html is really full of useful metrics
[14:15:30] you're probably right that they won't change fundamental things like that at this point
[14:15:47] I like that they went with a flexible approach so everyone can adapt it
[14:19:53] (PS1) Fdans: Keep breakdowns selection regardless of time range [analytics/wikistats2] (develop) - https://gerrit.wikimedia.org/r/382164
[14:21:56] milimetric: I've become a fan of working with several commits in a branch locally => squash merge and correct message for gerrit => push to gerrit :)
[14:22:32] fdans: :) that's what gerrit pros do
[14:22:49] ohhh look at me
[14:23:06] well, more accurately, they don't just squash everything, they re-arrange their commits in a neat series of gerrit changes
[14:23:15] but roughly same thing :)
[14:24:01] I like committing every change I do and then using the squash message to make a pretty comprehensive gerrit change message
[14:24:47] right but *true* gerrit changes should be atomic, should only change one thing
[14:24:48] milimetric: I'd argue about the "flexible" approach but let's say we can work with that :D
[14:25:27] and since people's minds have a hard time focusing on one thing - rebase / rearrange comes in very handy
[14:26:05] elukey: http is flexible, no? everything can speak http
[14:26:49] I definitely agree it would've made a lot more sense to go with some standard metric sending protocol
[14:30:10] (CR) Mforns: [V: 2 C: 2] Replace references to dbstore1002 by db1047 [analytics/limn-edit-data] - https://gerrit.wikimedia.org/r/380755 (https://phabricator.wikimedia.org/T176639) (owner: Mforns)
[14:31:11] (CR) Mforns: [V: 2 C: 2] Remove reference to old cname x1-analytics-slave [analytics/limn-ee-data] - https://gerrit.wikimedia.org/r/380757 (https://phabricator.wikimedia.org/T176639) (owner: Mforns)
[14:32:01] milimetric: I was joking, it is not that big deal :)
[14:32:25] ottomata: hiiiiiiiiii! ops sync?
[14:32:28] oh :) well, to me it sounds like a pain
[14:33:56] is Andrew working today?
[14:34:45] elukey, hiye!
[14:34:55] are you in meeting?
[14:36:30] mforns: I should but atm Andrew hasn't joined, so I am free :)
[14:36:51] free faaallin'''
[14:37:13] hehe, could you review this 3 mini CRs?
[14:37:34] https://gerrit.wikimedia.org/r/#/c/380751/ https://gerrit.wikimedia.org/r/#/c/380759/ https://gerrit.wikimedia.org/r/#/c/380778/
[14:37:34] HII
[14:37:35] yes
[14:41:21] mforns: ack I'll review them!
[14:43:14] thanks elukey :]
[15:00:56] a-team: standup
[15:01:24] ping milimetric fdans joal (still sick?)
[15:04:54] ping fdans
[15:27:06] Analytics-Kanban: Easter Egg: wikistats classic style on wikistats 2.0 - https://phabricator.wikimedia.org/T177408#3657650 (Milimetric)
[15:29:41] Analytics-Kanban: Easter Egg: wikistats classic style on wikistats 2.0 - https://phabricator.wikimedia.org/T177408#3657673 (Milimetric) ooh, and the logo should have Nostalgia Mode as the subtitle when someone unlocks this.
[16:02:06] elukey, joal: coming?
[16:44:03] * elukey off!
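Editor's note on the 14:06 idea above (an agent that accepts POSTs from the Druid daemons, keeps the last value and serves it to Prometheus on GET): a minimal sketch in Python, not anything the team actually wrote. The event shape (a JSON array of objects with "metric", "value", "service" and "host" keys), the `druid_` name prefix, and the names `record_events`, `render_metrics`, `AgentHandler` and `serve` are all assumptions made for illustration; label escaping and metric type hints are left out.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-memory store: latest value per (metric, service, host).
LATEST = {}


def record_events(payload, store=LATEST):
    """Keep only the most recent value for each metric/service/host."""
    for event in json.loads(payload):
        # Assumed event shape; events without a metric/value pair are skipped.
        if "metric" not in event or "value" not in event:
            continue
        key = (event["metric"], event.get("service", ""), event.get("host", ""))
        store[key] = float(event["value"])


def render_metrics(store=LATEST):
    """Format the stored values as Prometheus text exposition lines."""
    lines = []
    for (metric, service, host), value in sorted(store.items()):
        # Prometheus metric names cannot contain "/" or "-".
        name = "druid_" + metric.replace("/", "_").replace("-", "_")
        lines.append('%s{service="%s",host="%s"} %s' % (name, service, host, value))
    return "".join(line + "\n" for line in lines)


class AgentHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Druid's HTTP emitter would POST a batch of events here (assumption).
        length = int(self.headers.get("Content-Length", 0))
        record_events(self.rfile.read(length))
        self.send_response(200)
        self.end_headers()

    def do_GET(self):
        # Prometheus scrapes the latest known values.
        body = render_metrics().encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.end_headers()
        self.wfile.write(body)


def serve(port):
    """Run the agent on localhost; the port is chosen by the caller."""
    HTTPServer(("127.0.0.1", port), AgentHandler).serve_forever()
```

Calling `serve(9999)` (an arbitrary port) would start the agent; keeping only the last value per key matches the "keep the last result" wording, at the cost of losing anything emitted between two scrapes.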
[17:04:17] (CR) Mforns: "FEELS AMAZING! The fact that we have a tops metric working makes a huge difference. Congrats! Some non-code-related details:" [analytics/wikistats2] (develop) - https://gerrit.wikimedia.org/r/382139 (https://phabricator.wikimedia.org/T175266) (owner: Fdans)
[17:04:41] mforns: <3
[17:05:15] fdans, will review the code next
[17:07:02] (CR) Elukey: [C: 1] Replace references to dbstore1002 by db1047 [analytics/limn-flow-data] - https://gerrit.wikimedia.org/r/380751 (https://phabricator.wikimedia.org/T176639) (owner: Mforns)
[17:07:05] (CR) Elukey: [C: 1] Replace references to dbstore1002 by db1047 [analytics/reportupdater-queries] - https://gerrit.wikimedia.org/r/380759 (https://phabricator.wikimedia.org/T176639) (owner: Mforns)
[17:07:20] \o/
[17:07:49] mforns: merging the puppet change
[17:08:00] elukey, thanks awesome
[17:08:18] this should stop that job when puppet runs
[17:08:36] well, stop being called
[17:08:59] done
[17:09:06] thx!
[17:19:55] ottomata: I managed to add all the ACLs for ANONYMOUS to Jumbo, ready to merge if you are ok
[17:20:05] +1 elukey
[17:24:45] Analytics-Kanban, Operations, Patch-For-Review, User-Elukey: Tune Kafka logs to register clients connected - https://phabricator.wikimedia.org/T173493#3658150 (elukey) Added the following ACLs (still not active since the above patch is not merged): ``` elukey@kafka-jumbo1001:~$ kafka acls --list...
[17:28:27] Analytics-Kanban, Analytics-Wikistats, Patch-For-Review: Create Druid public cluster such AQS can query druid public data - https://phabricator.wikimedia.org/T176223#3658171 (Ottomata)
[17:28:36] Analytics-Kanban, Analytics-Wikistats, Patch-For-Review: Create Druid public cluster such AQS can query druid public data - https://phabricator.wikimedia.org/T176223#3617909 (Ottomata)
[17:29:55] [2017-10-04 17:29:26,927] DEBUG Principal = User:ANONYMOUS is Allowed Operation = ClusterAction from host = 10.64.0.176 on resource = Cluster:kafka-cluster (kafka.authorizer.logger)
[17:30:02] ottomata: --^
[17:30:04] \o/
[17:30:10] (kafka-jumbo1001)
[17:31:00] (CR) Mforns: "Code LGTM! Just 1 nit and 2 typos." (4 comments) [analytics/wikistats2] (develop) - https://gerrit.wikimedia.org/r/382139 (https://phabricator.wikimedia.org/T175266) (owner: Fdans)
[17:31:05] Nice! :)
[17:31:29] running puppet to all of them and restarting kafka
[17:32:11] !log deploying new LVS service for druid-analytics-broker
[17:32:19] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[17:32:33] mforns, milimetric : question if you may
[17:34:54] yep nuria_?
[17:35:35] mforns: how are we handling on aqs backend the multiple snapshots of mediawiki? Are we always pulling metrics from druid latest indexed snapshot?
[17:35:40] Analytics-EventLogging, Analytics-Kanban, Page-Previews, Readers-Web-Backlog, and 5 others: EventLogging subscriber module in ready state but not sending tracked events - https://phabricator.wikimedia.org/T175918#3658220 (Tbayer) Great work! Can someone summarize which schemas were affected (as f...
[17:36:02] ottomata: producing fine with kafkacat - https://grafana-admin.wikimedia.org/dashboard/db/prometheus-kafka?refresh=5m&orgId=1&from=now-1h&to=now
[17:37:17] nuria_, I'm not familiar with the productionization of the druid loader for mediawiki data...
[17:38:16] aaand I can consume properly
[17:38:20] yessssssss \o/
[17:38:33] so first ACLs deployed, no-op as expected
[17:38:37] mforns: ok, we can talk to joal when he is back, i think doing it that way will create downtime when indexing of a new dataset is happening as the whole dataset is overwritten
[17:38:41] great!
[17:38:55] cc milimetric
[17:38:57] nuria_, aha, could be
[17:40:45] * elukey dances
[17:41:15] next step is adding certs and tightening the ACLs, I'll write some documentation on wikitech tomorrow
[17:42:07] nuria_, yes, the last snapshot is taken: https://github.com/wikimedia/analytics-refinery/blob/master/oozie/mediawiki/history/druid/coordinator.xml#L67
[17:42:29] mforns: ya, i think that is going to be a problem when it comes to uptime
[17:43:30] mforns: We should probably have two snapshots and flip to the new one once completed
[17:43:43] aha
[17:47:18] (CR) Mforns: [V: 2 C: 2] Replace references to dbstore1002 by db1047 [analytics/limn-flow-data] - https://gerrit.wikimedia.org/r/380751 (https://phabricator.wikimedia.org/T176639) (owner: Mforns)
[17:47:37] (CR) Mforns: [V: 2 C: 2] Replace references to dbstore1002 by db1047 [analytics/reportupdater-queries] - https://gerrit.wikimedia.org/r/380759 (https://phabricator.wikimedia.org/T176639) (owner: Mforns)
[17:52:28] ok, I'm back nuria_
[17:52:34] wanna chat in cave?
[17:53:02] milimetric: it is fine here, is pretty short
[17:53:20] so which phase are you seeing the problem in, indexing?
[17:53:44] milimetric: i was thinking that we cannot run druid queries always of latest snapshot as that will impact uptime as snapshot is recreated everytime (unlike pageviews, where we append everytime)
[17:54:09] nuria_: but there's no downtime while indexing though
[17:54:24] it indexes one segment at a time and makes it available for answers, leaving the old segments in place
[17:54:39] there may be a correctness problem, with incomplete answers during indexing
[17:54:47] but I don't see a downtime problem
[17:55:11] the snapshot isn't involved here, it's only used to generate the data that Druid then indexes
[17:55:24] or am I missing what you're saying?
[17:55:31] milimetric: mmm...i think it being uptime or incorrect data our queries will not work while indexing is taking place
[17:55:57] milimetric: we will basically be caching bad results (bad best case) or returning empty results (bad worst case)
[17:56:16] they'll work, but they might be running on some segments generated from the old snapshot and half new segments from the new snapshot
[17:56:33] generally the difference will be very small, if any, because we're building our metrics to not change
[17:57:01] I don't see the empty results, and the bad results would be a bug in the data that's getting corrected if anything
[17:58:13] milimetric: i do not think so, we could, for example change schema or have a field whose format in schema is different (already happened)
[17:58:23] milimetric: and old data would not match new data at all
[17:58:41] milimetric: because ingestion spec is different
[17:59:19] milimetric: so i see issues from ingestion taking place but also from data differing between runs
[18:00:27] milimetric: this is a different ingestion than others we do, right?
[18:00:32] milimetric: cause we normally append
[18:00:37] milimetric: in thsi case we override
[18:00:40] *this
[18:01:19] nuria_: if we change schema, yeah, we'd have to make sure it's backwards compatible or version the datasource name and re-deploy the front-end, but we're talking about the normal ingestion case, where schemas don't change, right?
[18:02:10] nuria_: and yes, I agree we override, but it should be generally fine as I was saying above, unless there's a schema change
[18:02:33] this is definitely not a quick topic though, and we should talk it over with Joseph too. I agree at least in the schema change scenario we have to do something different.
[18:02:40] milimetric: in the normal case they change cause it is likely we add more fields right?
[18:02:47] milimetric: ya, agreed, no need to solve it now
[18:02:56] milimetric: but we need to test it
[18:03:05] well, if we add fields that'd be a schema change, and if it's not backwards compatible, yeah, that would be different
[18:03:37] it's a good thing to think about though, we hadn't considered it yet and it might catch us unprepared if we wait until the second load after launch
[18:22:15] * elukey off! (for real this time :P)
[18:39:22] druid-analytics.svc.eqiad.wmnet:8082 should only be accessible to analytics networks
[18:39:24] !log druid-analytics.svc.eqiad.wmnet:8082 should only be accessible to analytics networks
[18:39:31] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[19:09:50] Analytics-Kanban, Analytics-Wikistats, Patch-For-Review: Create Druid public cluster such AQS can query druid public data - https://phabricator.wikimedia.org/T176223#3658619 (Ottomata) OOF. We can't add LVS to stuff in the Analytics VLAN. At least, not so easily. To be discussed tomorrow post standup.
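Editor's note on the 17:43 to 18:03 snapshot discussion above ("have two snapshots and flip to the new one once completed", "version the datasource name"): a minimal sketch of that flip, assuming each snapshot is indexed under its own datasource name and queries are pointed at the new one only after all of its segments are available. `DatasourceSwitch` and its methods are hypothetical helpers, not part of Druid, AQS or the refinery jobs, and the segment counts would have to come from whatever the real loader reports.

```python
class DatasourceSwitch:
    """Tracks which versioned datasource queries should hit (hypothetical helper)."""

    def __init__(self, active):
        self.active = active    # datasource currently served to queries
        self.loading = None     # datasource being indexed, not yet served

    def start_load(self, name):
        """Begin indexing a new snapshot under its own datasource name."""
        if name == self.active:
            raise ValueError("new snapshot needs its own datasource name")
        self.loading = name

    def finish_load(self, segments_loaded, segments_expected):
        """Flip to the new datasource only once every segment is available."""
        if self.loading is None:
            raise RuntimeError("no load in progress")
        if segments_loaded < segments_expected:
            return False        # keep answering from the old snapshot
        self.active, self.loading = self.loading, None
        return True
```

With this shape, a half-indexed snapshot never mixes with the old one in query results, and an incompatible schema change only becomes visible at the moment of the flip, which is also when a front-end change could be deployed.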
[19:51:17] Analytics-Tech-community-metrics, Developer-Relations (Oct-Dec 2017): Fix duplicated enrollments in database - https://phabricator.wikimedia.org/T176786#3658796 (Aklapper) stalled>Resolved I reviewed the SQL query output based on the latest DB dump (git id `d357d21754988e610ac31a906d42dd234bb821a...
[19:59:31] Analytics-Tech-community-metrics, Developer-Relations (Oct-Dec 2017): Automatically sync mediawiki-identities/wikimedia-affiliations.json DB dump file with the data available on wikimedia.biterg.io - https://phabricator.wikimedia.org/T157898#3658813 (Aklapper) Status update: As discussed in non-public h...
[20:18:30] Analytics, Analytics-Cluster, Operations, Research-management: GPU upgrade for stats machine - https://phabricator.wikimedia.org/T148843#3658835 (dr0ptp4kt) Just so we have it here, for TensorFlow people, there's an encouraging comment at https://github.com/tensorflow/tensorflow/issues/22#issueco...
[20:25:57] Analytics, DBA: Drop MoodBar tables from all wikis - https://phabricator.wikimedia.org/T153033#3658843 (demon) >>! In T153033#3656161, @Marostegui wrote: > Just to be clear, you are talking about dbstore1002/db1047? > We also have to keep in mind that there are thousands of tables (two per wiki basically...
[20:37:49] ottomata: there in a minute
[21:27:41] Analytics, DBA: Drop MoodBar tables from all wikis - https://phabricator.wikimedia.org/T153033#3658983 (Nuria) >Could/should we drop the ones that are completely empty already--assuming some wikis never actually used it. Would that make it more manageable? Yes, please. I think that makes loads of sense.
[21:37:48] fdans: yt?
[21:38:32] nuria_: hola!
[21:38:54] fdans: hola , did you sorted out teh issues with differential versus gerrit to deploy wikistats?
[21:38:57] *the
[21:40:52] nuria_: yes, we’re pushing patches to the develop branch and then opening another patch for master (with squashed commit) whenever we want to deploy
[21:41:27] fdans: and did you do changes on puppet so it pulls from gerrit repo?
[21:41:40] yes luca did that yesterday
[21:42:11] CR is working too
[21:42:15] fdans: ah, ok, then are we deploying this: https://phabricator.wikimedia.org/T175265
[21:42:27] CI* not CR
[21:42:28] fdans: and this: https://phabricator.wikimedia.org/T176240
[21:42:45] fdans: ah CI, yes. I forgot about that one
[21:44:16] nuria_ yes I wanted to include the bug fix that I worked on earlier today, will deploy all that tomorrow
[21:44:43] fdans: ok, sounds good, let's touch base on those tomorrow then
[21:44:48] I also wanted to include marcels fix on breakdowns
[21:44:58] which is not yet merged
[22:05:10] wait, fdans we said develop on master then push to release
[22:05:25] cc nuria_ ^ at least that was the plan yesterday, so we would be "normal" in terms of developing on master
[22:05:39] (sorry just got back to work, we can talk about it tomorrow, it's late in EU)
[22:22:01] o/
[22:22:25] I'd like to ssh out from the stat machines, but I can't reach an external IP through port 22
[22:22:31] Is that explicitly blocked?
[22:22:41] I can access the IP from my local machine FWIW
[22:24:19] I'm actually trying to SCP in some big files FWIW.
[22:42:38] (PS3) Milimetric: Create Oozie job for interlanguage nav table [analytics/refinery] - https://gerrit.wikimedia.org/r/365517 (https://phabricator.wikimedia.org/T170764) (owner: Amire80)
[22:43:21] halfak: why not scp from your local to the machine? Or are you trying to scp between two stat machines?
[22:43:35] milimetric, HUGE files
[22:43:37] oh! scp from an external IP to the stat box
[22:43:42] From a remote machine at the university
[22:43:45] got it
[22:44:18] I think everyone that could help is gone, sorry :(
[22:44:48] No worries. I'll try again tomorrow morning :)
[22:44:51] Thanks for trying :D
[22:44:55] Have a good night!
[22:45:15] (CR) Milimetric: "This is ready for review, I'm testing it for real now." [analytics/refinery] - https://gerrit.wikimedia.org/r/365517 (https://phabricator.wikimedia.org/T170764) (owner: Amire80)
[23:51:48] (CR) Milimetric: "Testing went well, uploading a final patchset. @Amire80 please take a look at the SQL and make sure it is what you were thinking. You ca" [analytics/refinery] - https://gerrit.wikimedia.org/r/365517 (https://phabricator.wikimedia.org/T170764) (owner: Amire80)
[23:57:01] (PS4) Milimetric: Create Oozie job for interlanguage nav table [analytics/refinery] - https://gerrit.wikimedia.org/r/365517 (https://phabricator.wikimedia.org/T170764) (owner: Amire80)
[23:59:17] (CR) Milimetric: "ok, job finished very fast actually. You can query it like this:" [analytics/refinery] - https://gerrit.wikimedia.org/r/365517 (https://phabricator.wikimedia.org/T170764) (owner: Amire80)