[08:26:05] Analytics-Kanban: Reduction of stat1005's disk space usage - https://phabricator.wikimedia.org/T186776#3954945 (elukey) p:Triage>High [08:26:49] need to follow up with people to reduce stat1005's disk space consumption [08:27:56] Analytics-Kanban: Reduction of stat1005's disk space usage - https://phabricator.wikimedia.org/T186776#3954958 (elukey) [08:47:48] currently copying over (again) the /var/lib/archiva dir to /mnt/archiva on meitnerium (the archiva host) [08:48:02] this time with rsync --bwlimit, let's see if ganeti likes it [08:57:41] interesting, json_refine for netflow failed because of no "netflow" database [08:57:48] should I create it first in hive? [08:58:28] elukey: IIRC the json-refine job will create a netflow table [08:58:34] hello joal! [08:58:48] (since you capture table name in regex, and it's named netflow) [08:58:50] Hi elukey ) [08:59:06] So, having a netflow db is maybe not the best here? [08:59:20] netflow.netflow ... mwarf [08:59:29] CREATE EXTERNAL TABLE `netflow.netflow` ( [08:59:29] etc.. [09:00:13] elukey: saying you wish to use db 'wmf' for instance, or maybe we could create an 'ops' db in hive? [09:00:59] joal: maybe a network db? [09:01:32] elukey: why not - I wonder about how many network datasets (tables) we'll have [09:01:54] ah no idea about the future [09:02:08] not sure if you saw the code review but in wmf/data/raw we have netflow/netflow now [09:02:38] elukey: Ah ... I kinda don't like it :) [09:02:41] it seems cumbersome but me and Andrew thought that it could reflect reality better (netflow camus job, netflow topic) [09:02:43] elukey: But it works :) [09:02:52] yeah I know [09:03:05] I don't have strong preferences [09:03:37] elukey: anyhow, you just need an existing DB to put your data in hive with json-refine (however it's named) [09:05:35] joal: that is a simple hive --> create database batman; ? [09:05:49] NaNaNaNaNaNa - Yes :) [09:05:53] :D [09:06:01] about naming.. any preference?
[09:06:08] (ops are now sres :P) [09:07:03] true ! hm, we could replicate the camus path style netflow.netflow, but I really don't like it [09:07:21] network was a good call elukey [09:07:44] hope that Andrew will like it :D [09:08:10] elukey: Andrew never likes names before ahving talked through them :D [09:10:04] joal: atm the output base path would be /wmf/data/wmf/netflow [09:10:26] if we use a 'network' db it doesn't make sense right? [09:10:29] elukey: then wmf.netflow [09:10:33] correct [09:11:52] joal: https://gerrit.wikimedia.org/r/#/c/408981/1/modules/profile/manifests/analytics/refinery/job/json_refine.pp right? [09:12:11] or sorry, wmf.netflow [09:12:19] mmm no wmf is a db [09:12:21] and netflow the table [09:12:24] so it should be ok [09:12:32] correct elukey - therefore patch is good :) [09:12:45] And database exists [09:12:53] merging :) [09:13:17] joal: do you have time later on to sanity check my work for archiva? [09:13:32] sure elukey, what do you want me to do? [09:13:42] I'd need to stop archiva, mount a new partition under /var/lib/archiva and then restart [09:13:50] just test that it works as expected :) [09:14:04] Sounds like an easy bit :) [09:14:37] after that we'll have 100G for /var/lib/archiva, plenty of space :) [09:31:17] (03PS2) 10Joal: Update sqoop-mediawiki-tables script [analytics/refinery] - 10https://gerrit.wikimedia.org/r/408930 (https://phabricator.wikimedia.org/T186541) [09:38:24] 10Analytics-Kanban, 10Patch-For-Review: Make sqoop cron job report errors if success flags are not written - https://phabricator.wikimedia.org/T186541#3955044 (10JAllemandou) [09:38:34] 10Analytics-Kanban, 10Patch-For-Review: Make sqoop python code write success flags for each table that's fully imported for all wikis - https://phabricator.wikimedia.org/T186542#3955045 (10JAllemandou) [09:42:19] so first refine done successfully! [09:42:35] but I can see a lot of previous ./year=2018/month=2/day=8/hour=7/_REFINE_FAILED etc.. 
[09:42:46] do I need to remove all of them to make that data to refine? [09:46:14] elukey: correct - or use a flag (I can't recall which one) [09:50:01] re-running it then :) [09:50:23] elukey: you found the flag I assume :) [09:50:35] https://wikitech.wikimedia.org/wiki/Analytics/Systems/JsonRefine#Rerunning_jobs :) [09:50:41] I recalled that Andrew sent the doc [11:03:42] 10Analytics-Tech-community-metrics: Provide statistics about the delinquency of patches under review - https://phabricator.wikimedia.org/T186759#3955173 (10Aklapper) removing #Gerrit as this is not about the software itself. [11:06:18] as FYI archiva went down [11:06:37] I tried different rsync with bw-limit and eventually ganeti1005 crashed [11:06:49] everything should be back up [11:27:01] * elukey lunch! [11:31:51] https://www.percona.com/blog/2014/04/28/oom-relation-vm-swappiness0-new-kernel/ - really interesting [11:31:57] joal: --^ [11:46:48] hellooo :] [12:53:50] o/ [13:05:40] 10Analytics-Kanban: Reduction of stat1005's disk space usage - https://phabricator.wikimedia.org/T186776#3955382 (10elukey) @Ottomata do you think that we can delete the stat1002-a's dir? 
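Editor's note on the `_REFINE_FAILED` flags mentioned at 09:42: a minimal sketch of how one might list the partitions that still carry such a flag before deciding whether to remove them or rerun with the documented option. It assumes a local directory tree that mirrors the `year=/month=/day=/hour=` layout; the real data lives in HDFS, and the function name and the `_REFINED` success flag used in the demo are illustrative assumptions, not taken from the log.

```python
import os
import tempfile

FAILED_FLAG = "_REFINE_FAILED"  # flag file name as quoted in the log


def find_failed_partitions(base_dir, flag_name=FAILED_FLAG):
    """Return sorted partition paths (relative to base_dir) containing flag_name."""
    failed = []
    for dirpath, _dirnames, filenames in os.walk(base_dir):
        if flag_name in filenames:
            failed.append(os.path.relpath(dirpath, base_dir))
    return sorted(failed)


if __name__ == "__main__":
    # Build a throwaway tree mimicking the hourly partition layout.
    with tempfile.TemporaryDirectory() as base:
        # "_REFINED" is an assumed success-flag name, used only for contrast.
        for hour, flag in ((7, "_REFINE_FAILED"), (8, "_REFINED")):
            part = os.path.join(base, "year=2018", "month=2", "day=8", "hour=%d" % hour)
            os.makedirs(part)
            open(os.path.join(part, flag), "w").close()
        # Only the hour=7 partition should be reported.
        print(find_failed_partitions(base))
```

Listing first and deleting later keeps the step reversible, which seems in the spirit of the "rerunning jobs" page linked at 09:50.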
[13:22:49] (PS1) GoranSMilovanovic: Introducing Structure [analytics/wmde/WDCM-GeoDashboard] - https://gerrit.wikimedia.org/r/409019 [13:24:30] (PS1) GoranSMilovanovic: Introducing Structure [analytics/wmde/WDCM-Overview-Dashboard] - https://gerrit.wikimedia.org/r/409020 [13:25:13] (PS1) GoranSMilovanovic: Introducing Structure [analytics/wmde/WDCM-Semantics-Dashboard] - https://gerrit.wikimedia.org/r/409021 [13:26:29] (PS1) GoranSMilovanovic: Introducing Structure [analytics/wmde/WDCM-Structure-Dashboard] - https://gerrit.wikimedia.org/r/409022 [13:27:12] (PS1) GoranSMilovanovic: Introducing Structure [analytics/wmde/WDCM-Usage-Dashboard] - https://gerrit.wikimedia.org/r/409023 [13:28:29] (PS1) GoranSMilovanovic: Feb 8 2018 - Introducing Structure [analytics/wmde/WDCM] - https://gerrit.wikimedia.org/r/409025 [13:28:56] (CR) GoranSMilovanovic: [V: 2 C: 2] Introducing Structure [analytics/wmde/WDCM-GeoDashboard] - https://gerrit.wikimedia.org/r/409019 (owner: GoranSMilovanovic) [13:29:14] (CR) GoranSMilovanovic: [C: 2] Introducing Structure [analytics/wmde/WDCM-Overview-Dashboard] - https://gerrit.wikimedia.org/r/409020 (owner: GoranSMilovanovic) [13:29:26] (CR) GoranSMilovanovic: [C: 2] Introducing Structure [analytics/wmde/WDCM-Semantics-Dashboard] - https://gerrit.wikimedia.org/r/409021 (owner: GoranSMilovanovic) [13:29:41] (CR) GoranSMilovanovic: [V: 2 C: 2] Introducing Structure [analytics/wmde/WDCM-Structure-Dashboard] - https://gerrit.wikimedia.org/r/409022 (owner: GoranSMilovanovic) [13:29:56] (CR) GoranSMilovanovic: [C: 2] Introducing Structure [analytics/wmde/WDCM-Usage-Dashboard] - https://gerrit.wikimedia.org/r/409023 (owner: GoranSMilovanovic) [13:30:07] (CR) GoranSMilovanovic: [C: 2] Feb 8 2018 - Introducing Structure [analytics/wmde/WDCM] - https://gerrit.wikimedia.org/r/409025 (owner: GoranSMilovanovic) [13:30:34] Analytics-Tech-community-metrics:
Provide statistics about the delinquency of patches under review - https://phabricator.wikimedia.org/T186759#3955445 (10Huji) >>! In T186759#3955172, @Aklapper wrote: >> what are the most delinquent patches awaiting review > Only those ones that have CR=0? (If so, why?) Or any... [13:36:38] 10Analytics, 10Analytics-Wikistats: Can't combine 'Editor type' and editor 'Activity level' filters to narrow results (in WikiStats 2.0) - https://phabricator.wikimedia.org/T183316#3849995 (10jeblad) Just tested this, and I agree that it should be possible to combine filters. I sort of assumed this to be the c... [13:46:10] (03PS1) 10GoranSMilovanovic: Introducing Structure [analytics/wmde/WDCM-ShinyServerFrontPage] - 10https://gerrit.wikimedia.org/r/409031 [13:46:22] (03CR) 10GoranSMilovanovic: [C: 032] Introducing Structure [analytics/wmde/WDCM-ShinyServerFrontPage] - 10https://gerrit.wikimedia.org/r/409031 (owner: 10GoranSMilovanovic) [13:56:44] 10Analytics, 10Analytics-Wikistats: Trends for editor types, and new editors in particular (in Wikistats 2.0) - https://phabricator.wikimedia.org/T186791#3955497 (10jeblad) [14:04:40] Super interesting read elukey - thanks for sharing !!! [14:05:02] elukey: one of those days, I'll make time and have a go at better understanding linux-kernel inners [14:05:02] joal: there is also https://chrisdown.name/2018/01/02/in-defence-of-swap.html if you are bored :) [14:07:53] 10Analytics, 10Analytics-Wikistats: Trends for editor types, and new editors in particular (in Wikistats 2.0) - https://phabricator.wikimedia.org/T186791#3955514 (10jeblad) [14:16:02] 10Analytics-Tech-community-metrics: Provide statistics about the delinquency of patches under review - https://phabricator.wikimedia.org/T186759#3955559 (10Aklapper) >>! In T186759#3955445, @Huji wrote: > this is close to what I mean, but it seems like biterg only shows "never reviewed" patches That is not corre... 
[14:22:48] Archiva is down again, we are doing the "io load tests" on ganeti [14:46:04] ok! [14:46:53] 10Analytics-Kanban: Reduction of stat1005's disk space usage - https://phabricator.wikimedia.org/T186776#3955741 (10Ottomata) Yeah I think so! Maybe we can just shove it in HDFS for posterity? [14:47:43] elukey: so ad-hoc misc test sends from all miscs? [14:49:01] ottomata: hello1 [14:49:03] !!! [14:49:14] hiiiiI! [14:49:24] ottomata: I haven't merged, wanted to wait you [14:49:35] yaya, just asking, that's what it does, right? [14:49:38] yeppp [14:49:43] cool, +1 :) [14:49:51] super, let me poke traffic then :) [14:52:18] elukey: this is incorrect, right? https://wikitech.wikimedia.org/wiki/Analytics/Systems/EventLogging/Data_representations#MySQL_/_MariaDB_database_on_m2 [14:53:42] ah yes! [14:56:37] mforns: , yt? [14:57:37] She's got a point! https://twitter.com/kellabyte/status/960983633614979072 [14:58:16] hahaha [14:59:45] Gone to get Lino - later [14:59:46] 10Analytics-Kanban, 10Operations, 10monitoring, 10netops, and 2 others: Pull netflow data in realtime from Kafka via Tranquillity/Spark - https://phabricator.wikimedia.org/T181036#3955768 (10elukey) Finally we have something working! Example from stat1004 ``` elukey@stat1004:~$ hive [.. som output ..] h... [14:59:59] this is niceee --^ [15:00:02] \o/ [15:00:40] coool [15:00:41] ! [15:05:02] 10Analytics-Kanban: Reduction of stat1005's disk space usage - https://phabricator.wikimedia.org/T186776#3955782 (10elukey) >>! In T186776#3955741, @Ottomata wrote: > Yeah I think so! Maybe we can just shove it in HDFS for posterity? +1 [15:18:56] ottomata, hey! didn't hear you sorry, what's up? [15:22:59] am thinking about https://phabricator.wikimedia.org/T184793#3955715 [15:23:04] ip/geocoding in eventlogging [15:23:16] wondering if it would be possible to solve mysql/eventlcapsule problems [15:23:24] by somehow making jrm.py smart enough [15:23:30] to drop fields that dont' exist in mysql tables [15:23:48] e.g. 
just don't try to insert those values if the columns don't exist [15:24:53] OH, we do have [15:24:53] NO_DB_PROPERTIES [15:24:54] hmm [15:24:57] we could just put clientIp there [15:25:00] hmmmmm [15:27:20] YEAHH that could work! [15:27:31] then we could just have an ip field in the eventlogging data [15:27:35] but it wouldn't be inserted into MySQL [15:27:36] and then [15:27:44] we could use the tranformFunction stuff that I added to JsonRefine [15:27:50] to configure geocoding for eventlogging data [15:28:06] so all Hive tables would then have geocoded data and ip [15:28:14] and tilman would be very happy [15:28:26] purging stuff could take care of getting rid of that, right? [15:28:26] in hive? [15:32:13] 10Analytics-Tech-community-metrics: Provide statistics about the delinquency of patches under review - https://phabricator.wikimedia.org/T186759#3955840 (10Huji) Well, when I go to [[ https://wikimedia.biterg.io/app/kibana#/dashboard/Gerrit-Backlog?_g=()&_a=(filters:!(),options:(darkTheme:!f),panels:!((col:8,id:... [15:32:39] mforns: ^^^ [15:32:51] hey! [15:32:55] readingh [15:35:38] it'd be nice to get your eventcapsule change merged first for that [15:38:13] ottomata, one thing, the clientIp field is still there (NULL) [15:39:08] so we could repopulate it without changing anything capsule-wise, also the purging script already purges it for all schemas, so it wouldn't represent a privacy issue [15:40:10] still there? [15:40:11] where? [15:40:17] in the capsule [15:40:19] ?? [15:40:27] all tables have a clientIp field [15:40:29] empty [15:40:31] hmm [15:40:40] not all [15:40:41] desc MobileWikiAppEdit_9003125 [15:40:43] no clientIp there [15:40:47] hmmmmmm... [15:40:55] the few I checked don't have it [15:41:05] that one is verrrrrry old no? [15:41:13] really? 
wait [15:41:14] i dunno [15:41:32] desc Popups_16364296 [15:41:49] desc Edit_17541122; [15:41:55] ok ok [15:42:11] I thought they had it all [15:42:14] but nullified [15:42:37] so, yes, then, what you say makes sense, we can add it to NO_DB_PROPERTIES [15:43:02] and the purging job in hive would take care of purging that by default [15:44:41] 👍 [15:45:51] ottomata, one semi-related question: do you think we can add the following flags to x-analytics header: whether or not the user has cookies enabled, whether or not the user has javascript enabled, whether or not the user has DNT enabled? [15:47:17] don't see why not, although i don't know how that is detected [15:47:23] but if it can be detected somehow, then sure [15:47:29] this way we could get the exact proportion of those situations and be able to normalize all events that are collected using cookies/JS and not when DNT [15:47:55] I saw in that task that they want to ignore DNT for that specific metric... [15:48:09] sounds weird [15:49:39] nuria_: check my idea up at 10:23, what do you think? [15:49:44] been reading eventlogging code, trying things [15:49:56] i think we can actually geocode in Hadoop only without affecting MySQL tables. [15:52:12] maybe nuria_ has a different time for your messages [15:52:20] ottomata, ^ [15:53:50] oh ya [15:53:51] heheh [16:05:39] ottomata: what idea wait [16:07:07] ottomata: i see, it adds data to events that otherwise do not need it and we have to make sure to drop it so whitelist would be different for mysql /hive.. make sense? (cc mforns ) [16:13:37] mforns: whether user has cookies enabled is not possible to know server side [16:13:59] mforns: you can know whether teh request has cookies or not and that is already there [16:14:06] mforns: on x-analytics [16:15:01] nuria_, I think the geocode could be purged by default in hive, avoiding to have 2 different whitelists [16:15:13] mforns: see nocookie: https://gerrit.wikimedia.org/r/#/c/244626/ [16:15:22] nuria_, re. 
cookies, I see, makes sense [16:15:33] mforns: that work is already done but [16:15:58] I really do not see much of an intersection of people navigating without cookies and js enabled [16:16:18] mforns: in the absence of evidence to the contrary [16:17:23] nuria_, we maybe could create a new EL schema to measure those 3 things: cookies enabled, js enabled, DNT enabled [16:17:36] and sample it like 1/10000 [16:18:00] so we get an estimate of proportion, and are able to normalize other event data sets [16:21:22] hm, but of course that does not work... [16:21:51] O.o [16:21:58] FYI, I'm installing PHP security updates on bohrium/piwik, expect a few seconds of unavailablity [16:22:32] moritzm: ack! [16:22:42] done [16:23:52] !log stop archiva on meitnerium to swap /var/lib/archiva from the root partition to a new separate one [16:23:55] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [16:24:02] a-team: I am stopping archiva [16:27:47] +1 [16:32:16] ottomata: just mounted the new partition under /var/lib/archiva [16:32:27] everything looks good, is there any sanity check that we can do? [16:32:34] old dir is in my home [16:32:44] when I get the confirmation that everything is good I'll delete it [16:37:08] elukey: if you can browse around in https://archiva.wikimedia.org/#browse [16:37:10] things are probably good [16:45:43] ottomata: I am talking with gehe*l, he is rebuilding one of his local maven builds, everything looks good [16:45:55] I guess I could build refinery too? [16:49:22] sure [16:49:25] so ignorant about this part [16:49:29] if you don't have any local .m2 artifacts cached [16:49:38] you can always wipe that part [16:49:46] if you are building yourself, remove it frmo yoru homedir [16:49:50] rm -r ~/.m2 [16:49:59] you can do that locally on your laptop too [16:52:58] and then mvn $something? 
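Editor's note on the `NO_DB_PROPERTIES` idea from 15:24-15:43: a minimal sketch of the two options raised in the chat, namely excluding a named field such as `clientIp` before the MySQL insert, or keeping only the keys that exist as table columns. The function names, the contents of the tuple, and the flat event shape are hypothetical and only illustrate the idea; this is not EventLogging's actual `jrm.py` code.

```python
# Hypothetical sketch; names and contents are assumptions, not the real jrm.py.
NO_DB_PROPERTIES = ("clientIp",)


def strip_no_db_properties(event, excluded=NO_DB_PROPERTIES):
    """Return a shallow copy of event without the excluded top-level keys."""
    return {key: value for key, value in event.items() if key not in excluded}


def keep_existing_columns(event, table_columns):
    """Alternative from the chat: keep only keys that exist as table columns."""
    return {key: value for key, value in event.items() if key in table_columns}


if __name__ == "__main__":
    # Illustrative event; field names other than clientIp are made up.
    event = {"schema": "Popups", "wiki": "enwiki", "clientIp": "192.0.2.1"}

    # Option 1: the IP stays in the event stream but never reaches MySQL.
    print(strip_no_db_properties(event))

    # Option 2: silently drop anything the table has no column for.
    print(keep_existing_columns(event, {"schema", "wiki"}))
```

The first option is explicit about what is withheld, while the second adapts to older tables without a `clientIp` column; the chat settles on the first.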
[16:53:13] Analytics: Make the Wikistats 2 UI responsive - https://phabricator.wikimedia.org/T186812#3956038 (fdans) [16:55:56] Analytics-Kanban: Do not show a split section if there is nothing to split by - https://phabricator.wikimedia.org/T186813#3956054 (fdans) [16:56:12] Analytics-Kanban: Do not show a split section if there is nothing to split by - https://phabricator.wikimedia.org/T186813#3956065 (fdans) [16:57:32] (PS1) Fdans: Do not show splits/filters if there is nothing to split by [analytics/wikistats2] - https://gerrit.wikimedia.org/r/409076 (https://phabricator.wikimedia.org/T186813) [16:58:05] Analytics-Kanban, Patch-For-Review: Do not show a split section if there is nothing to split by - https://phabricator.wikimedia.org/T186813#3956072 (fdans) a:fdans [17:00:20] Hey ottomata - Would you have a minute for me on JsonRefine monitoring? [17:01:23] Analytics-Tech-community-metrics: Provide statistics about the delinquency of patches under review - https://phabricator.wikimedia.org/T186759#3956100 (Aklapper) >>! In T186759#3955840, @Huji wrote: > How come the latter is not shown in the former link? Your link says "Last 90 days" in the upper right corne... [17:02:42] elukey: Just tried to rebuild a jar after having deleted .m2 - looks like archiva is working great ) [17:03:06] joal: nice! How did you do it?
(curious and ignorant) [17:03:38] Well, .m2 folder is your home dir (I used stat1004, to prevent skipping jars accross the oceans for test) [17:03:47] I deleted it [17:04:09] Then I ran an maven package command in refinery source: mvn clean packageb [17:04:43] This builds the jars we have in refinery-source, and since no more deps are available in .m2 folder, they're downloaded again from archiva [17:05:39] nice thanks :) [17:06:06] joal: all right so I am going to remove the backup of the old /var/lib/archiva [17:07:46] 10Analytics-Kanban, 10Operations, 10User-Elukey: Expand meitnerium's root partition to 100G - https://phabricator.wikimedia.org/T186020#3956106 (10elukey) ``` elukey@meitnerium:~$ df -h Filesystem Size Used Avail Use% Mounted on udev 10M 0 10M 0% /dev tmpfs 792M 8.4M 783... [17:11:40] mforns: mmmm...we already measured js enabled a while back (https://www.mediawiki.org/wiki/Analytics/Reports/Clients_without_JavaScript#Preliminary_results, https://commons.wikimedia.org/w/index.php?title=File:Browsers,_Geography,_and_JavaScript_Support_on_Wikipedia_Portal.pdf&page=5) I do not think we need to do it again, cookies enabled we can already do with nocookie header. That leaves DNT which we [17:11:40] might want to measure at a later time, mozilla reported 11% of their users using dnt a while back. [17:12:24] nuria_, thx [17:29:42] (03CR) 10Nuria: [C: 032] Do not show splits/filters if there is nothing to split by [analytics/wikistats2] - 10https://gerrit.wikimedia.org/r/409076 (https://phabricator.wikimedia.org/T186813) (owner: 10Fdans) [17:29:58] (03CR) 10Nuria: [C: 032] "Much better, thanks for doing these changes." [analytics/wikistats2] - 10https://gerrit.wikimedia.org/r/409076 (https://phabricator.wikimedia.org/T186813) (owner: 10Fdans) [17:30:03] a-team are we doing gross king? 
[17:30:14] (03CR) 10Nuria: [V: 032 C: 032] Do not show splits/filters if there is nothing to split by [analytics/wikistats2] - 10https://gerrit.wikimedia.org/r/409076 (https://phabricator.wikimedia.org/T186813) (owner: 10Fdans) [17:30:20] fdans: I am going to be 10 mins late, merging a puppet change sorry [17:30:28] mwahaha :) GROSSSEUH KING fdans :) [17:30:40] joal: you got a sec for scala/spark bc [17:30:43] not a big one... [17:30:44] a-team: grossking [17:30:44] :) [17:30:46] OHHH [17:30:47] sure [17:30:51] cc elukey fdans [17:31:24] nuria_: I am going to be late, merging a change for varnishkafka sorry [17:32:58] (03Merged) 10jenkins-bot: Do not show splits/filters if there is nothing to split by [analytics/wikistats2] - 10https://gerrit.wikimedia.org/r/409076 (https://phabricator.wikimedia.org/T186813) (owner: 10Fdans) [17:32:58] elukey: k [17:53:46] ottomata: Is there a Kafka cluster in the Wikimedia Cloud VPS? If so, it must be read-only? [17:54:06] Or at least, unable to write back to production... [17:54:38] there is one in deployment-prep (deployment-kafka-jumbo-[12]), not related to any prod service [17:55:19] awight: ya, no prod data [17:55:20] but beta [17:55:51] elukey: OK, thanks. Are Cloud VPS boxes outside of deployment-prep able to read from deployment-kafka-jumbo, and receive Beta Cluster events, then? [17:56:46] hmm no [17:56:55] not unless there is some special firewall rule set up i think [17:56:58] not sure how that works in labs [17:57:08] but, awight you can make your own kafka clsuter in your project if you want [17:57:11] Cool, that’s all I need to know. We’ll plan to create a new deployment-prep box... [17:57:39] ok [18:20:31] * fdans lunch! [18:21:14] a-team: archiva is now officially fine, new 100G partition up and running, no more restrictions [18:29:15] thank you luca! [18:31:19] 10Analytics: Pageviews/Stats on research.wikimedia.org - https://phabricator.wikimedia.org/T186819#3956274 (10diego) [18:45:36] * elukey off! 
[18:45:37] byyyeee [18:45:44] bye elukey :) [18:45:51] Thanks for archiva and kafka - big day :) [18:46:11] milimetric: I think my patch for python is ready - It's not awesome, but it should do the job [18:47:49] will take a look soon [18:48:01] milimetric: no rush - just so that you know :) [18:49:55] joal: i have the same hive udf spark problem with new code [18:49:59] so it wasn't the map object stuff [18:54:08] (CR) Ottomata: Refactor geo-coding function and add ISP (1 comment) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/403916 (https://phabricator.wikimedia.org/T167907) (owner: Joal) [19:05:55] (CR) Ottomata: "Hm, also, we should make sure this works in Spark. Currently I am getting:" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/403916 (https://phabricator.wikimedia.org/T167907) (owner: Joal) [19:07:21] ottomata: I really suggest trying to use the IDF [19:07:25] UDF sorry [19:07:35] Analytics, Operations, ops-eqiad, User-Elukey: Check analytics1037 power supply status - https://phabricator.wikimedia.org/T179192#3956393 (RobH) p:Triage>Low [19:07:47] (PS1) GoranSMilovanovic: Minor [analytics/wmde/WDCM-Structure-Dashboard] - https://gerrit.wikimedia.org/r/409116 [19:07:56] joal: i tried that too [19:07:57] that gives me null pointer stuff [19:08:12] org.apache.spark.sql.AnalysisException: No handler for Hive udf class org.wikimedia.analytics.refinery.hive.GetGeoDataUDF because: null.; line 1 pos 20 [19:08:13] Mwarf??
[19:08:17] (PS1) GoranSMilovanovic: Minor [analytics/wmde/WDCM] - https://gerrit.wikimedia.org/r/409117 [19:08:19] sqlContext.sql("CREATE TEMPORARY FUNCTION get_geo_data as 'org.wikimedia.analytics.refinery.hive.GetGeoDataUDF'") [19:08:29] val g = sqlContext.sql("select get_geo_data('2604:2000:12c1:273:c517:bb01:ad08:7f26') as g") [19:08:31] (CR) GoranSMilovanovic: [V: 2 C: 2] Minor [analytics/wmde/WDCM-Structure-Dashboard] - https://gerrit.wikimedia.org/r/409116 (owner: GoranSMilovanovic) [19:08:40] hmm- Unexpected ottomata [19:08:43] (CR) GoranSMilovanovic: [V: 2 C: 2] Minor [analytics/wmde/WDCM] - https://gerrit.wikimedia.org/r/409117 (owner: GoranSMilovanovic) [19:08:51] that's the same error i got with the old geocode code too [19:10:28] ottomata: Currently trying to replicate [19:10:41] lemme know if you wanna bc [19:18:46] ottomata: I have more precise info - batcave? [19:23:18] ya [19:23:48] Analytics-Tech-community-metrics: Provide statistics about the delinquency of patches under review - https://phabricator.wikimedia.org/T186759#3956433 (Huji) Open>Invalid You are right. [[https://wikimedia.biterg.io/app/kibana#/dashboard/Gerrit-Backlog?_g=(refreshInterval:(display:Off,pause:!f,value:... [20:02:58] Analytics: Record and aggregate page previews - https://phabricator.wikimedia.org/T186728#3956542 (Tbayer) @Ottomata @Nuria Since it looks from the discussion at T184793 that there wasn't yet full clarity around this: Please take a look at this list of required fields; we should make sure that your team wil...
[20:23:43] Gone for tonight team [20:24:56] (PS6) Nuria: Refactor geo-coding function and add ISP [analytics/refinery/source] - https://gerrit.wikimedia.org/r/403916 (https://phabricator.wikimedia.org/T167907) (owner: Joal) [20:27:43] (PS7) Nuria: Refactor geo-coding function and add ISP [analytics/refinery/source] - https://gerrit.wikimedia.org/r/403916 (https://phabricator.wikimedia.org/T167907) (owner: Joal) [20:29:51] Analytics: Upload XML dumps to hdfs - https://phabricator.wikimedia.org/T186559#3956614 (bmansurov) @JAllemandou I heard you've done this kind of work before. Would you be able to give us some pointers on how to do this? Although systematic upload is nice, I have an immediate need for this. [20:31:13] (CR) Nuria: Refactor geo-coding function and add ISP (1 comment) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/403916 (https://phabricator.wikimedia.org/T167907) (owner: Joal) [20:32:42] Analytics: Pageviews/Stats on research.wikimedia.org - https://phabricator.wikimedia.org/T186819#3956621 (Nuria) This is done with piwik, you need some tracking javascript, @bmansurov can let us know when he has a minute and we can provide a snippet [20:35:45] Analytics: Pageviews/Stats on research.wikimedia.org - https://phabricator.wikimedia.org/T186819#3956628 (bmansurov) @Nuria, please share the snippet.
[20:39:53] (PS1) Hashar: Set bower.config.storage.empty to current dir [analytics/mediawiki-storage] - https://gerrit.wikimedia.org/r/409145 [20:43:10] (PS2) Hashar: tweak bower configuration to run in a Docker container [analytics/mediawiki-storage] - https://gerrit.wikimedia.org/r/409145 [21:00:49] Analytics, Discovery-Analysis, Reading-analysis: Productionize per-country daily & monthly active app user stats - https://phabricator.wikimedia.org/T186828#3956689 (mpopov) [21:06:56] (PS1) Fdans: Remove Curaçao as a country [analytics/wikistats2] - https://gerrit.wikimedia.org/r/409155 [21:12:07] Analytics, Discovery-Analysis, Reading-analysis: Productionize per-country daily & monthly active app user stats - https://phabricator.wikimedia.org/T186828#3956765 (mpopov) [21:22:22] nuria_: did you send the patch for the numeric locale? I can do it otherwise [21:22:40] fdans: not yet, i needed to work with andrew on geocode [21:24:56] fdans: i will though, numeric needs to bundle all locales too and those are not being bundled [21:25:13] fdans: but regardless the css of table needs to change so it is aligned properly right? [21:26:02] mmmm I don't think so nuria_ , what's wrong with the current alignment? [21:26:22] fdans: let me see [21:26:57] if I remember correctly we changed the alignment a few months ago so that number and name are displayed close together, and the gap between those is in constant alignment [21:26:57] https://usercontent.irccloud-cdn.com/file/21bN9zwR/Screen%20Shot%202018-02-08%20at%201.26.38%20PM.png [21:27:04] this is one [21:27:11] oh that's weird [21:27:15] https://usercontent.irccloud-cdn.com/file/jfdcXORp/Screen%20Shot%202018-02-08%20at%201.26.26%20PM.png [21:27:16] Analytics-Kanban: Include X-Client-IP in EventLogging data and geocode during Hive JSON Refinement - https://phabricator.wikimedia.org/T186833#3956817 (Ottomata) p:Triage>Normal [21:27:20] so different [21:27:24] right?
[21:27:29] fdans: do you see that too? [21:27:58] nuria_: btw, if you and joseph are good with your geocode code, yall shoud just merge it [21:28:00] oh I see nuria_ [21:28:02] that way I don't have to rebase on top of it later [21:28:11] we won't do a release/deploy til after java 8 upgrade [21:28:35] the change we made to the alignments only affects top metrics [21:28:39] fdans: ya it is good, do merge at your leisure we did not do it so it will not get picked up on next deploy [21:29:33] i think that last msg is for ottomata :) [21:29:40] fdans: RIGHT! [21:29:58] ottomata: ya it is good, do merge at your leisure we did not do it so it will not get picked up on next deploy [21:30:05] ha [21:30:25] fdans: i know it is a pain but can we fix it such aligment is consistant in both? [21:30:50] nuria_: not a pain at all, I think right now we're just checking the type of metric to control the alignment [21:30:57] it should be removing a conditional [21:31:35] fdans: ok, thank you let's test in other browsers too [21:31:41] well, i'd prefer if you or joal merged, i'm just a bystander on that one :) [21:31:51] leaving for today team, see you tomorrow! [21:31:56] ottomata: ok, will triple check with joal tomorrow [21:32:00] mforns: ciao [21:32:05] lattrrrs [21:32:11] :] [21:33:09] 10Analytics-Kanban, 10Operations, 10User-Elukey: Expand meitnerium's root partition to 100G - https://phabricator.wikimedia.org/T186020#3931188 (10Dzahn) 05Open>03Resolved a:03Dzahn [21:36:02] nuria_ this is better? https://usercontent.irccloud-cdn.com/file/eFNxqWqi/Screen%20Shot%202018-02-08%20at%203.34.43%20PM.png [21:38:26] fdans: is that what we were shooting for ? 
(cc milimetric ) seems like if there are more than 2 columns is hard to center align, easier to right aligh but not sure [21:38:44] deferring to what fdans and milimetric agreed to before [21:39:32] nuria_: I like it this way when we select a breakdown [21:39:40] https://usercontent.irccloud-cdn.com/file/gwXWmIYj/Screen%20Shot%202018-02-08%20at%203.39.11%20PM.png [21:39:43] oh, no, numbers should always be right aligned, and have thousands separators [21:40:01] milimetric: the thousands will come with teh locale change [21:40:13] milimetric: now they are not there cause we have our locale to international [21:40:23] milimetric: in numeral [21:40:49] yeah, but right aligned or left padded with spaces and in monospace, otherwise it’s impossible to read and make sense of it [21:40:54] milimetric: hopefully that makes sense [21:41:09] milimetric: i defer to you guys on alignment [21:41:39] I agree the formatting makes more sense in the locales [21:41:52] nuria_: if we're following milimetric 's guidelines, which are pretty solid, we should leave it like that (alignment) [21:41:53] Unless we want to override and do kmb ourselves [21:42:13] 10Analytics-Kanban, 10Operations, 10User-Elukey: Expand meitnerium's root partition to 100G - https://phabricator.wikimedia.org/T186020#3956869 (10Dzahn) a:05Dzahn>03elukey Icinga for meitnerium looks fine. no disk space warnings. Though one thing: puppet is still disabled there... should it? 
[21:42:20] Analytics-Kanban, Operations, User-Elukey: Expand meitnerium's root partition to 100G - https://phabricator.wikimedia.org/T186020#3956871 (Dzahn) Resolved>Open [21:42:45] this is Erik’s point, btw, I’m just parroting, but he would definitely file a bug if numbers are not right-aligned [21:45:01] milimetric: sounds good, i cannot review the change (other than visually) given i know LESS THAN ZERO about css so deferring to ya'all [21:53:12] Analytics, Continuous-Integration-Config: Add CI to all analytics/* repositories and archive obsolete ones - https://phabricator.wikimedia.org/T180301#3956892 (hashar) [21:53:25] Analytics, Continuous-Integration-Config: Add CI to all analytics/* repositories and archive obsolete ones - https://phabricator.wikimedia.org/T180301#3753283 (hashar) Open>Resolved a:hashar Some repositories had fairly recent commits and hence I flagged them as active in the table: analytic... [21:59:21] (CR) Hashar: "recheck" [analytics/mediawiki-storage] - https://gerrit.wikimedia.org/r/409145 (owner: Hashar) [22:03:12] (CR) Hashar: [C: 2] tweak bower configuration to run in a Docker container [analytics/mediawiki-storage] - https://gerrit.wikimedia.org/r/409145 (owner: Hashar) [22:05:36] (Merged) jenkins-bot: tweak bower configuration to run in a Docker container [analytics/mediawiki-storage] - https://gerrit.wikimedia.org/r/409145 (owner: Hashar) [23:41:23] Analytics-EventLogging, Analytics-Kanban: Sunset MySQL data store for eventlogging. Find an alternative query interface for eventlogging on analytics cluster that can replace MariaDB - https://phabricator.wikimedia.org/T159170#3957048 (Nuria) [23:57:01] (CR) Nuria: "Adding luca to CR for perspective but since this job is executed in a cron could we not use MAILTO to send e-mail ?" (1 comment) [analytics/refinery] - https://gerrit.wikimedia.org/r/408930 (https://phabricator.wikimedia.org/T186541) (owner: Joal)