[06:37:09] 10Analytics-Kanban, 10User-Elukey: Reduction of stat1005's disk space usage - https://phabricator.wikimedia.org/T186776#3991909 (10elukey) ``` elukey@stat1005:~$ df -h Filesystem Size Used Avail Use% Mounted on /dev/mapper/stat1005--vg-data 7.2T 5.6T 1.3T 82% /srv ``` Way better now,... [06:37:34] 10Analytics-Kanban, 10User-Elukey: Reduction of stat1005's disk space usage - https://phabricator.wikimedia.org/T186776#3991910 (10elukey) p:05High>03Normal [07:15:05] hello people! [07:15:27] joal: o/ - I am going to stop camus to drain the cluster a bit, I'd need to restart oozie and hive-metastore to apply prometheus monitoring [07:37:08] 10Analytics-Kanban, 10User-Elukey: Reduction of stat1005's disk space usage - https://phabricator.wikimedia.org/T186776#3991995 (10elukey) @ezachte Hi again :) any news about dropping some backup data? [07:51:10] Hi elukey - Ack ! [07:54:57] bonjour! [07:55:11] Bonjour elukey ! Ça va? [07:55:31] très bien! [07:55:57] With accents - Man - You're a French writer! :) [07:56:15] hahahah [07:56:15] Moi aussi, je vais bien :) [07:56:36] :) [07:58:42] elukey: please tell me if you want any help with hive/oozie [07:59:13] joal: sure! I'll warn you before doing the restarts, but this time should work (I hope) [07:59:18] :) [07:59:29] we won't have the hive server's metrics though, this is still pending [07:59:41] elukey: pending? [08:00:41] joal: https://issues.apache.org/jira/browse/HIVE-12582 [08:00:58] once we have HIVE_SERVER2_HADOOP_OPTS we should be ok! 
[08:01:34] or maybe once cdh is properly packaged with systemd and not with a ton of bash scripts [08:01:37] :D [08:06:00] elukey: the latter will never happen :D [08:06:18] elukey: I wonder also when the former could happen, given CDH doesn't release [08:07:58] joal: we can always do some "hacks" in puppet to make it work [08:08:02] like for oozie [08:08:21] I put a puppet rule to remove the link causing the issue [08:08:38] I tried to contact upstream but it is not clear how to report bugs [08:08:45] this is a thing that I don't like [08:10:23] hm - An apache project without clear bug reporting - not nice [08:10:43] no no I meant cdh packaging [08:11:11] I think that they are the ones doing debs, I didn't find all the init scripts on the apache projects' github repos [08:11:16] but maybe I need to dig a bit more [08:11:56] the oozie link for example is created by the deb install workflow [08:11:58] not sure why [08:15:40] weird elukey - I've never installed oozie, but I expected it to be like hadoop - A bunch of shell scripts [08:16:36] joal: yes but all the deb packages are doing something to install files etc.. [08:16:44] right [08:16:45] in this case, it might make some sense [08:17:19] but the side effect of the link is that, since the files are both sourced, java_opts gets duplicated [08:17:34] now Xmx duplicated is fine, -javaagent no :D [08:17:42] elukey: right [08:28:31] joal: yarn shows on banner impressions [08:28:42] ready for the restarts? [08:28:45] Yay ! 
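The side effect elukey describes (the same env file sourced twice because of the symlink, duplicating every JVM flag) can be sketched in a few lines. This is a toy model, not the actual CDH scripts; the paths and port are made up for illustration:

```python
# Toy model of the double-sourcing problem: the JVM keeps only the last -Xmx
# value (so a duplicate is harmless), but it tries to load *every* -javaagent,
# and two copies of the prometheus jmx agent fight over the same listener port.

def double_sourced(opts):
    """Simulate both copies of the symlinked env file being sourced."""
    return opts + opts

def jvm_view(opts):
    """Roughly how the JVM treats the flags: last -Xmx wins, agents accumulate."""
    xmx = None
    agents = []
    for opt in opts:
        if opt.startswith("-Xmx"):
            xmx = opt                  # later value overrides earlier one
        elif opt.startswith("-javaagent:"):
            agents.append(opt)         # every agent is loaded, duplicates too

# hypothetical flags, not the real oozie/hive config
opts = ["-Xmx2g", "-javaagent:/usr/share/java/jmx_exporter.jar=9183:/etc/jmx.yaml"]

def effective(opts):
    xmx = None
    agents = []
    for opt in opts:
        if opt.startswith("-Xmx"):
            xmx = opt
        elif opt.startswith("-javaagent:"):
            agents.append(opt)
    return xmx, agents

xmx, agents = effective(double_sourced(opts))
# xmx is still "-Xmx2g", but two identical agents now compete for port 9183
```

This is why the duplicated `-Xmx` went unnoticed while the duplicated `-javaagent` broke the daemon.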
[08:30:56] metastore should be up [08:33:08] mmm oozie didn't pick up the new config [08:33:14] maybe I forgot something [08:36:46] Feb 22, 2018 8:31:49 AM org.apache.catalina.startup.ContextConfig init [08:36:49] SEVERE: Exception fixing docBase for context [/oozie] [08:36:52] java.util.zip.ZipException: zip file is empty [08:36:54] mmmm [08:45:02] PROBLEM - Oozie Server on analytics1003 is CRITICAL: PROCS CRITICAL: 0 processes with command name java, args org.apache.catalina.startup.Bootstrap [08:45:15] yeah I know [08:47:02] RECOVERY - Oozie Server on analytics1003 is OK: PROCS OK: 1 process with command name java, args org.apache.catalina.startup.Bootstrap [08:47:28] ok, downtiming an1003 [08:48:17] of course this didn't happen in labs [08:48:53] going to try again [08:50:06] so that symlink is needed for oozie to start, apparently [08:50:15] need to roll back the oozie prometheus config then [08:50:16] sigh [08:52:13] joal: would you mind checking if hive is fine? [08:52:24] elukey: sure ! [08:53:43] elukey: looks good to me (a test request succeeded) [08:53:53] super [08:54:02] at least the metastore is now monitored :D [08:55:45] \o/ This is super great elukey :) [09:04:39] elukey: oozie still not happy? 
[09:06:40] joal: should be good now [09:06:52] everything restarted [09:07:31] Awesome elukey - will monitor [09:11:25] * elukey feels frustrated by oozie [09:11:55] now another alternative could be to create a systemd unit to run the jmx exporter in standalone mode, like jmxtrans does [09:12:26] but oozie's support seems not ideal, so I'll probably concentrate only on hive-server2 [09:22:54] elukey: I confirm stuff seems back online (webrequest-load jobs started and all) [09:23:05] joal: https://grafana.wikimedia.org/dashboard/db/analytics-hive?orgId=1 :) [09:23:30] <3 [09:23:34] I am going to add a note about the hive server [09:23:54] Thanks again a million elukey for doing that - This is awesome :) [09:24:51] joal: I am sorry that oozie didn't work out, it was working fine in labs :( [09:26:38] elukey: Nothing to bother - Gentle steps :) [09:40:45] 10Analytics-Kanban, 10Patch-For-Review, 10User-Elukey: Fix outstanding bugs preventing the use of prometheus jmx agent for Hive/Oozie - https://phabricator.wikimedia.org/T184794#3992143 (10elukey) created https://grafana.wikimedia.org/dashboard/db/analytics-hive for the Hive Metastore. Oozie failed to boot... [09:42:27] * elukey brb! [10:22:18] (03PS14) 10Joal: Upgrade scala to 2.11.7 and Spark to 2.2.1 [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/348207 [10:24:39] 10Analytics: Spark 2.x as cluster default (working with oozie) - https://phabricator.wikimedia.org/T159962#3992243 (10JAllemandou) [10:24:50] 10Analytics: Spark 2.x as cluster default (working with oozie) - https://phabricator.wikimedia.org/T159962#3084705 (10JAllemandou) Can we try that: https://community.hortonworks.com/questions/114243/oozie-spark2-compatibility.html ?? 
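The standalone alternative elukey mentions would look roughly like this: the prometheus jmx_exporter project ships a `jmx_prometheus_httpserver` jar that polls a JMX endpoint out-of-process and serves metrics over HTTP, so a unit file could run it next to oozie instead of injecting `-javaagent`. A sketch only; the jar path, port, and config path below are assumptions, not the actual puppet config:

```ini
# /etc/systemd/system/oozie-jmx-exporter.service  (hypothetical sketch)
[Unit]
Description=Standalone prometheus JMX exporter for Oozie
After=oozie.service

[Service]
# jmx_prometheus_httpserver polls the JMX URL named in the yaml config
# and exposes /metrics on the port given as its first argument.
ExecStart=/usr/bin/java -jar /usr/share/java/jmx_prometheus_httpserver.jar \
    9183 /etc/prometheus/oozie_jmx_exporter.yaml
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

This sidesteps the `-javaagent` duplication entirely, at the cost of requiring remote JMX to be enabled on the oozie JVM.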
[10:25:14] 10Analytics-Kanban: Spark 2.x as cluster default (working with oozie) - https://phabricator.wikimedia.org/T159962#3992246 (10JAllemandou) a:03JAllemandou [10:25:19] (03CR) 10jerkins-bot: [V: 04-1] Upgrade scala to 2.11.7 and Spark to 2.2.1 [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/348207 (owner: 10Joal) [10:25:41] 10Analytics-Kanban: Spark 2.2.1 as cluster default (working with oozie) - https://phabricator.wikimedia.org/T159962#3084705 (10JAllemandou) [11:08:09] 10Analytics, 10Analytics-EventLogging, 10User-Elukey: Upgrade eventlogging servers to Jessie - https://phabricator.wikimedia.org/T114199#3992373 (10elukey) So restarting this work to see how we can proceed to move Eventlogging to systemd. I'd start from the last comment from Andrew, related to daemon groupin... [11:42:19] heya teaaam [11:49:30] o/ [11:57:29] (03CR) 10Mforns: [C: 031] "LGTM! Left one comment regarding Json first line." (031 comment) [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/413267 (owner: 10Ottomata) [12:10:20] (03PS15) 10Joal: Upgrade scala to 2.11.7 and Spark to 2.2.1 [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/348207 [12:13:11] (03CR) 10jerkins-bot: [V: 04-1] Upgrade scala to 2.11.7 and Spark to 2.2.1 [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/348207 (owner: 10Joal) [12:30:16] (03PS3) 10Mforns: [WIP] Add EL and whitelist sanitization [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/412939 (https://phabricator.wikimedia.org/T181064) [12:31:25] (03CR) 10jerkins-bot: [V: 04-1] [WIP] Add EL and whitelist sanitization [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/412939 (https://phabricator.wikimedia.org/T181064) (owner: 10Mforns) [12:32:14] hey guys! anyone here with some knowledge about PAWS? 
[12:33:00] Hi dsaez [12:33:02] dsaez: not a ton but please ask, we'll see if we can answer :) [12:34:25] so, my account in PAWS gets stuck in a loop saying: "Your server is starting up. [12:34:38] but never starts [12:35:07] I've created another, personal account, and it works, so I imagine it is a problem with my configuration [12:35:25] but I can't find a place to reset my conf. [12:40:28] cold silence ;) [12:41:37] indeed dsaez - no idea :( [12:42:02] no worries joal, I'll wait for madu :) [12:42:42] she's the one indeed :) [13:13:56] 10Analytics, 10TCB-Team, 10Two-Column-Edit-Conflict-Merge, 10WMDE-Analytics-Engineering, and 5 others: How often are new editors involved in edit conflicts - https://phabricator.wikimedia.org/T182008#3992623 (10GoranSMilovanovic) @Lea_WMDE @Addshore The report is now [[ https://docs.google.com/document/d/1... [13:18:49] (03PS16) 10Joal: Upgrade scala to 2.11.7 and Spark to 2.2.1 [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/348207 [13:51:54] (03PS17) 10Joal: Upgrade scala to 2.11.7 and Spark to 2.2.1 [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/348207 [14:02:56] ottomata: o/ [14:03:30] whenever you have time can you tell me if https://gerrit.wikimedia.org/r/#/c/413362/ is an idea worth developing or not? [14:03:49] I've just done the processor part, but the rest is easy enough [14:07:21] elukey: reading emails, will do! [14:07:24] then let's do webrequest misc! [14:07:36] all rightzzzz! 
[14:07:44] I am going to warn the traffic people [14:14:23] (03PS18) 10Joal: Upgrade scala to 2.11.7 and Spark to 2.2.1 [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/348207 [14:18:06] (03CR) 10jerkins-bot: [V: 04-1] Upgrade scala to 2.11.7 and Spark to 2.2.1 [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/348207 (owner: 10Joal) [14:18:26] ottomata: they are ok, I'd say that I can start removing the testing instance [14:19:02] ottomata: Leaving now, but I have a working version of refine with the test line you sent [14:19:54] ottomata: I think I'd like this to be better at finding how many files it writes - We can talk about that later [14:20:17] k! [14:21:58] ottomata: https://gerrit.wikimedia.org/r/#/c/413370/ [14:23:34] +1, prob have to manually stop/remove them [14:25:47] elukey: i'm ready, wanna bc? [14:29:40] ottomata: sure, gimme 2 min [14:30:11] k [14:39:27] 10Analytics-Cluster, 10Analytics-Kanban, 10Patch-For-Review: Move webrequest varnishkafka and consumers to Kafka jumbo cluster. - https://phabricator.wikimedia.org/T185136#3992780 (10Ottomata) [14:45:01] hey ottomata :], looking into refactoring the sanitization code, I'm moving files from refinery-job to refinery-core, but not sure if that makes sense. Some of the files in refinery-job/.../job/refine are not *jobs* so I thought they would be better in core, what do you think? [14:50:35] like for instance DataFrameToHive [14:51:22] mforns: joal made me put them in job :) [14:55:29] mforns: we might move the refine spark/sql hive stuff out into a standalone repo one day... so we kind of wanted to keep them in the same module for now [14:55:31] dunno [14:56:22] ottomata, is there a way to import job modules from core ones? [14:56:30] I didn't manage to do that [14:57:25] I guess the order of compilation does not allow that [14:58:05] if we keep DataFrameToHive in refinery-job, then all other components that use it will have to go into job as well... 
mmhh [15:24:20] mforns: yeah [15:24:25] hm [15:24:31] let's ask joal [15:24:38] i'm not too opinionated on that, i originally had them in core too [15:24:47] i think putting them in job made migrating to spark 2 easier [15:24:52] because then core didn't have spark deps [15:24:59] ok, aha [15:25:06] it might be easier to move them back to core after that is done [15:25:42] ottomata, I already moved things back to job, on my patch, if we decide to refactor to core, I can do that later, after we talk with josl [15:25:44] josl [15:25:46] ah! [15:25:48] joal, [15:25:51] xD [15:30:24] (03PS4) 10Mforns: [WIP] Add EL and whitelist sanitization [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/412939 (https://phabricator.wikimedia.org/T181064) [15:32:03] mforns: i'm going to test that format-agnostic Refine job stuff today and write some tests [15:32:07] if all goes well i'll merge that too [15:32:13] then you might be able to rebase on top of it and make use of it [15:32:27] ok, looked good to me! [15:32:31] you saw my comment? [15:33:00] I guess, as it will always be DB records, it will always be a JSON object, as opposed to a JSON array? [15:33:02] (03CR) 10jerkins-bot: [V: 04-1] [WIP] Add EL and whitelist sanitization [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/412939 (https://phabricator.wikimedia.org/T181064) (owner: 10Mforns) [15:33:05] oh yes yes [15:33:10] json array isn't really supported like that [15:33:12] unless the schema is stored in another place? dunno [15:33:14] json can be an array [15:33:19] but not really for hadoop stuff [15:33:23] I see [15:33:33] hadoop json expects newline delimited objects [15:33:34] HMMM [15:33:38] or maybe it can be newl [15:33:44] maybe it can be one array per line [15:33:45] hmmm [15:33:48] i will test mforns [15:33:50] ok [15:34:05] you just can't do a whole file with a single array [15:34:12] newline means there is one full json value interpretable on each line, no? 
[15:34:14] each record must be newline delimited [15:34:19] right [15:34:23] so maybe array is ok [15:34:26] yes, but a line can start with [ [15:34:29] ya [15:34:34] but that would be weird for a dataframe [15:34:40] what would the schema be? [15:34:50] exactly, that was my question [15:34:52] aye [15:34:56] will test [15:35:03] I thought the schema might be stored in another place? dunno [15:35:29] anyway! if we need that in the future it's as easy as adding a line, so np! [16:08:19] 10Analytics, 10InteractionTimeline, 10Anti-Harassment (AHT Sprint 15): Measure how many unique people visit the Timeline - https://phabricator.wikimedia.org/T187374#3993045 (10dbarratt) @Milimetric Can you give permission to me and @TBolliger to site 14 on https://piwik.wikimedia.org ? [16:22:55] 10Analytics, 10InteractionTimeline, 10Anti-Harassment (AHT Sprint 15): Measure how many unique people visit the Timeline - https://phabricator.wikimedia.org/T187374#3993108 (10Milimetric) @dbarratt I made the site stats viewable to anonymous users (you still have to login with your ldap account when you visi... [16:25:32] 10Analytics, 10InteractionTimeline, 10Anti-Harassment (AHT Sprint 15): Measure how many unique people visit the Timeline - https://phabricator.wikimedia.org/T187374#3993112 (10dbarratt) @Milimetric Thanks! @TBolliger If I did this correctly, stats should start showing up in https://piwik.wikimedia.org/index... [16:39:08] mforns: huh, some versions of json array do work just fine [16:39:16] hmm [16:39:47] ottomata, aha [16:39:58] you can even do like [16:40:06] [{"a": "b"}, {"a": "c"}] [16:40:06] [{"a": "b"}, {"a": "d"}] [16:40:23] so, multiple records per array, and multiple newline-delimited arrays per file [16:40:28] so, i'll add that as a possibility [16:40:29] thanks [16:40:34] np! 
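What ottomata observes here (line-oriented JSON input where each line may be a single object or a whole array of records, but never one array spanning the file) can be mimicked outside Spark. A small plain-Python sketch; `read_records` is a hypothetical helper, not refinery code:

```python
import json

# Line-oriented JSON readers (Hadoop/Spark style) parse one JSON value per
# line. Each line is either an object (the usual newline-delimited case) or
# an array of objects, which flattens into several records.

def read_records(lines):
    """Yield one record per JSON object, flattening one-array-per-line input."""
    for line in lines:
        value = json.loads(line)
        if isinstance(value, list):   # one array per line -> several records
            yield from value
        else:                         # newline-delimited objects
            yield value

# the example lines from the chat above
data = [
    '[{"a": "b"}, {"a": "c"}]',
    '[{"a": "b"}, {"a": "d"}]',
]
records = list(read_records(data))
# -> four records, all sharing the same schema {"a": <string>}
```

The schema question resolves itself: since the arrays only ever contain objects, the inferred schema is that of the objects, not of the wrapping array.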
[16:40:39] ottomata, another question: [16:41:07] (03PS2) 10Ottomata: RefineTarget can now infer input file format [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/413267 [16:41:12] DataFrameToHive passes a DataFrame and a HivePartition to transformationFunctions [16:41:21] yes [16:41:40] is the transformationFunction supposed to dice the dataframe according to the partition? [16:41:57] or is the input DataFrame supposed to be already diced according to the partition? [16:41:58] no, the HivePartition is just given for context, in case you need it in the transformFunction [16:42:09] diced? meaning have partition fields added? [16:42:10] yes [16:42:14] ok ok [16:42:31] I mean, filtered by year = 2018 and month = 1 and ... [16:42:32] actually, the way I did this was kinda nice [16:42:36] oh [16:42:39] hm [16:42:57] i mean, it will be if you are using RefineTarget.find [16:43:01] but [16:43:01] hm [16:43:27] ah [16:43:30] it doesn't matter mforns [16:43:33] it'll be smart [16:43:50] so, ya, ok [16:44:01] xD, not following you [16:44:10] HMMM wait yes ok. 
[16:44:15] DataFrameToHive ALSO takes a HivePartition [16:44:21] aha [16:44:23] the first 'transform' that happens to the df [16:44:29] is to add the HivePartition columns to the df [16:44:33] these are static columns [16:44:38] yes [16:44:45] so it is expected that the df you are passing in belongs to that HivePartition [16:44:49] or, is intended to go there [16:45:01] ok, perfect [16:45:06] so, DataFrameToHive will end up inserting df into whatever HivePartition you set [16:45:17] but, by the time your transform functions get the df [16:45:25] it will already have had the columns from HivePartition added to it [16:45:34] so, HivePartition is really only given for context [16:45:35] yes, I saw that in the code [16:45:43] (03CR) 10jerkins-bot: [V: 04-1] RefineTarget can now infer input file format [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/413267 (owner: 10Ottomata) [16:45:48] the dataFrameWithHivePartitions thing [16:45:48] in case you need to know other things, like table or database name, etc. 
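The ordering ottomata spells out (static partition columns added first, transform functions run afterwards with the HivePartition passed only as context) can be modeled with plain dicts. This is a toy model, not the real Scala code; the function names, the record fields, and the `drop_bots` transform are all made up for illustration:

```python
# Toy model of the DataFrameToHive flow described above:
#   1. add the static HivePartition columns (e.g. year/month) to every record,
#   2. run each transform function, which receives the records *and* the
#      partition, but only needs the partition for context (table name, etc.).

def data_frame_to_hive(records, partition, transform_functions):
    # step 1: every record now carries the partition columns
    records = [{**r, **partition} for r in records]
    # step 2: transforms see records that already belong to the partition
    for transform in transform_functions:
        records = transform(records, partition)
    return records

def drop_bots(records, partition):
    """Example transform: uses only the records, ignores the partition."""
    return [r for r in records if not r.get("is_bot")]

out = data_frame_to_hive(
    [{"page": "Main", "is_bot": False}, {"page": "X", "is_bot": True}],
    {"year": 2018, "month": 2},
    [drop_bots],
)
# -> one surviving record, already carrying year/month partition columns
```

The point of the design: transforms never need to "dice" the data by partition themselves, because the input is expected to belong to that partition already.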
[16:45:53] aye ya [16:46:15] ok [16:46:22] thaaaanks :] [17:00:48] (03PS3) 10Ottomata: RefineTarget can now infer input file format [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/413267 [17:01:13] elukey: fdans yoohoo [17:04:13] 10Analytics-EventLogging, 10Analytics-Kanban, 10User-Elukey: Upgrade eventlogging servers to Jessie - https://phabricator.wikimedia.org/T114199#3993226 (10elukey) [17:04:57] (03CR) 10jerkins-bot: [V: 04-1] RefineTarget can now infer input file format [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/413267 (owner: 10Ottomata) [17:07:03] 10Analytics-Kanban: Show compact view of metrics in detail page when visited from mobile - https://phabricator.wikimedia.org/T188013#3993233 (10fdans) [17:12:52] (03PS4) 10Ottomata: RefineTarget can now infer input file format [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/413267 [17:16:15] (03CR) 10jerkins-bot: [V: 04-1] RefineTarget can now infer input file format [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/413267 (owner: 10Ottomata) [17:30:31] mforns: ottomata yo get in da hangout [17:30:42] :] [17:33:20] 10Analytics, 10Analytics-Wikistats: Provide easier way of accessing metrics as defined in Wikistats 1 - https://phabricator.wikimedia.org/T187806#3993526 (10fdans) [17:34:17] 10Analytics, 10Analytics-Wikistats: Provide easier way of accessing metrics as defined in Wikistats 1 - https://phabricator.wikimedia.org/T187806#3986460 (10fdans) This will become easier as we add bookmarking for Wikistats 2's splits/breakdowns. [17:36:17] 10Analytics, 10InteractionTimeline, 10Anti-Harassment (AHT Sprint 15): Measure how many unique people visit the Timeline - https://phabricator.wikimedia.org/T187374#3993534 (10TBolliger) Thank you @Milimetric and @dbarratt ! @SPoore and @CSindersWMF — If you want to access piwik, you'll need to create an... 
[17:37:11] 10Analytics, 10Analytics-Wikistats: Beta: Provide easier way of accessing metrics as defined in Wikistats 1 - https://phabricator.wikimedia.org/T187806#3993536 (10fdans) [17:40:15] 10Analytics-Kanban, 10EventBus, 10Services (later): Investigate why disk usage on Kafka nodes is 2 times lower in codfw - https://phabricator.wikimedia.org/T187554#3993543 (10fdans) [17:41:10] 10Analytics-Kanban, 10User-Elukey: find out usage of dbstore1002 among analysts and reserachers - https://phabricator.wikimedia.org/T187476#3993557 (10fdans) [17:43:13] 10Analytics, 10Analytics-Wikistats: Beta: Y-axis units and rounding issues - https://phabricator.wikimedia.org/T187429#3993571 (10fdans) [17:46:24] 10Analytics, 10Analytics-Wikistats: Wikistats 2.0: "aa.wikipedia.org" exists and has data available, but marked "Invalid" - https://phabricator.wikimedia.org/T187414#3974447 (10fdans) This wiki needs to be added to the sqoop list for it to appear as valid (and to appear in WS1). [17:47:17] 10Analytics-Kanban, 10Analytics-Wikistats: Wikistats 2.0: "aa.wikipedia.org" exists and has data available, but marked "Invalid" - https://phabricator.wikimedia.org/T187414#3993603 (10fdans) [17:50:20] 10Analytics-Kanban, 10Analytics-Wikistats: Wikistats 2.0: Page heading style varies - https://phabricator.wikimedia.org/T187412#3993616 (10fdans) [17:50:49] 10Analytics, 10Analytics-Wikistats: Wikistats pageviews by country table view - https://phabricator.wikimedia.org/T187407#3993617 (10fdans) 05Open>03Resolved [17:52:13] 10Analytics: Create refinery-spark package - https://phabricator.wikimedia.org/T188025#3993620 (10mforns) [17:52:43] 10Analytics, 10Analytics-Wikistats: Beta: Pageviews by Country Monthly should specify month in question - https://phabricator.wikimedia.org/T187389#3973745 (10fdans) [17:58:37] 10Analytics, 10Analytics-Kanban, 10EventBus: Failure in eventlogging schema for mediawiki/revision/visibility-change - https://phabricator.wikimedia.org/T187362#3993654 (10fdans) 
[17:58:42] 10Analytics-Kanban, 10EventBus: Failure in eventlogging schema for mediawiki/revision/visibility-change - https://phabricator.wikimedia.org/T187362#3973172 (10fdans) [17:59:06] 10Analytics-Kanban, 10EventBus: Failure in eventlogging schema for mediawiki/revision/visibility-change - https://phabricator.wikimedia.org/T187362#3973172 (10fdans) cc @Ottomata @Pchelolo [18:01:52] 10Analytics-Kanban, 10Discovery, 10Wikidata, 10Wikidata-Query-Service, 10Wikimedia-Stream: Increase kafka event retention to 14 or 21 days - https://phabricator.wikimedia.org/T187296#3993669 (10fdans) [18:08:18] 10Analytics-Kanban, 10Cloud-VPS, 10EventBus, 10Patch-For-Review, 10Services (watching): Add page-related topics to EventStreams - https://phabricator.wikimedia.org/T187241#3993680 (10fdans) [18:12:17] 10Analytics, 10Analytics-EventLogging, 10Performance: Spin out a tiny EventLogging URL module for lightweight logging - https://phabricator.wikimedia.org/T187207#3993690 (10fdans) [18:13:49] 10Analytics, 10EventBus, 10Services (watching): EventBus schema validation should report the name of the failed property - https://phabricator.wikimedia.org/T188027#3993694 (10Pchelolo) [18:16:15] 10Analytics, 10Analytics-EventLogging, 10Performance: Spin out a tiny EventLogging URL module for lightweight logging - https://phabricator.wikimedia.org/T187207#3967721 (10fdans) What's the actionable here analytics wise? [18:21:38] 10Analytics-Kanban, 10EventBus, 10Services (watching): Failure in eventlogging schema for mediawiki/revision/visibility-change - https://phabricator.wikimedia.org/T187362#3993752 (10Pchelolo) @awight I've checked the message that you've provided and that one passes the validation. The error message is rathe... 
[18:24:30] 10Analytics-Kanban, 10EventBus, 10Services (watching): Failure in EventBus schema for mediawiki/revision/visibility-change - https://phabricator.wikimedia.org/T187362#3993767 (10mobrovac) [18:35:30] 10Analytics-Kanban, 10EventBus, 10Services (later): Investigate why disk usage on Kafka nodes is 2 times lower in codfw - https://phabricator.wikimedia.org/T187554#3978767 (10elukey) I checked some Kafka topic partition's data dir size and it seems that the mirrored data on the other cluster is always less:... [18:36:46] * elukey off! [18:44:59] (03PS5) 10Ottomata: RefineTarget can now infer input file format [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/413267 [18:47:51] (03CR) 10jerkins-bot: [V: 04-1] RefineTarget can now infer input file format [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/413267 (owner: 10Ottomata) [18:49:46] hey elukey. I'm trying to respond to https://phabricator.wikimedia.org/T186776#3988334 but I cannot access stat1002. Do you have 5 min to go to batcave together? [18:55:47] oh. I just noticed elukey is off. :) ottomata, can I bug you for 5 min in batcave? [18:56:51] hey leila - stat1002 doesn't exist anymore - It's been replaced by sat1005 [18:56:58] s/sat1005/stat1005 [18:57:28] deh! that's why. [18:57:34] :) [18:58:18] owww! of course! I got confused with stat1002 in that task's command-line, but of course, that's just a folder in stat1005. sorry, people. [18:58:28] np leila [19:09:58] Hi there -- question: I'm trying to do a little more detailed investigation of https://phabricator.wikimedia.org/T186437 (Save Timing regression on Feb 04), which appears to be related to a very large spike in requests that was received by MediaWiki just before. In order to investigate this, I would like to look at the raw request logs from the period of the request spike. What are my options? 
[19:10:07] I'm happy to just save them out from Hive, if that's reasonable. But I'm guessing there might be an easier way? [19:10:11] (03PS6) 10Ottomata: RefineTarget can now infer input file format [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/413267 [19:13:27] (03CR) 10jerkins-bot: [V: 04-1] RefineTarget can now infer input file format [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/413267 (owner: 10Ottomata) [19:19:17] (03PS7) 10Ottomata: RefineTarget can now infer input file format [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/413267 [19:22:22] (03CR) 10jerkins-bot: [V: 04-1] RefineTarget can now infer input file format [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/413267 (owner: 10Ottomata) [19:31:01] 10Analytics-Kanban, 10EventBus, 10Services (later): Investigate why disk usage on Kafka nodes is 2 times lower in codfw - https://phabricator.wikimedia.org/T187554#3994075 (10Ottomata) Maybe MirrorMaker is causing more messages to be saved in the same batch, resulting in better compression on the mirrored-t... [19:33:03] 10Analytics-Kanban: Spark 2.2.1 as cluster default (working with oozie) - https://phabricator.wikimedia.org/T159962#3994077 (10Ottomata) TODO: puppetize oozie with spark 2.2.1: like https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.0/bk_spark-component-guide/content/ch_oozie-spark-action.html#spark-config-o... [19:35:07] (03PS8) 10Ottomata: RefineTarget can now infer input file format [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/413267 [19:35:58] ottomata: I went back but you were gone :( [19:36:09] ottomata: what was that last thing? 
[19:36:15] 10Analytics, 10Analytics-EventLogging, 10TimedMediaHandler, 10Wikimedia-Video: Record and report metrics for audio and video playback - https://phabricator.wikimedia.org/T108522#3994097 (10Jdforrester-WMF) [19:36:57] oh was just going to say i'd be out on monday, but i'll be around tomorrow [19:37:05] no problem :) [19:39:11] (03CR) 10jerkins-bot: [V: 04-1] RefineTarget can now infer input file format [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/413267 (owner: 10Ottomata) [19:41:55] (03PS9) 10Ottomata: RefineTarget can now infer input file format [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/413267 [19:45:51] (03CR) 10jerkins-bot: [V: 04-1] RefineTarget can now infer input file format [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/413267 (owner: 10Ottomata) [19:53:15] (03PS10) 10Ottomata: RefineTarget can now infer input file format [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/413267 [19:57:43] (03CR) 10jerkins-bot: [V: 04-1] RefineTarget can now infer input file format [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/413267 (owner: 10Ottomata) [20:01:17] (03PS11) 10Ottomata: RefineTarget can now infer input file format [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/413267 [20:03:40] (03CR) 10jerkins-bot: [V: 04-1] RefineTarget can now infer input file format [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/413267 (owner: 10Ottomata) [20:12:18] (03PS12) 10Ottomata: RefineTarget can now infer input file format [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/413267 [20:15:38] marlier: you can query webrequest logs in hive [20:15:40] is that good enough? [20:15:46] or do you want to actually look at log files? [20:15:59] do you need full? (oxygen has sampled/1000 as files on disk) [20:19:08] ottomata: have you removed our test from oozie? [20:19:30] ottomata: hoping for full, but /1000 may be good enough. 
Is oxygen the name of a machine? [20:19:40] joal: no, i'm ok with leaving it there for now [20:19:43] ya [20:19:45] oxygen.eqiad.wmnet [20:19:46] ok [20:19:48] /srv/log/webrequest [20:19:54] Awesome, thanks [20:19:59] but marlier you don't want to just query hive? [20:20:01] ottomata: I have a failure on another try - will investigate tomorrow [20:20:05] Gone for tonight ! [20:20:08] laters! [20:21:05] ottomata: nah, for this kind of investigation visual inspection tends to work better [20:22:21] (03CR) 10Ottomata: "YAYY SUCCEEDED FINALLY!" [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/413267 (owner: 10Ottomata) [20:22:34] Once I find some patterns, hive will help to quantify [20:22:35] \o/ [20:23:23] k, marlier sounds good, lemme know if you got more Qs [20:23:40] the stuff on oxygen is there exactly for opsy investigations like this [20:28:22] (03CR) 10Ottomata: [C: 032] RefineTarget can now infer input file format [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/413267 (owner: 10Ottomata) [20:28:51] (03PS1) 10Ottomata: Update changelog.md with JsonRefine -> Refine details [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/413463 [20:32:19] 10Analytics-Kanban, 10User-Elukey: Reduction of stat1005's disk space usage - https://phabricator.wikimedia.org/T186776#3994181 (10leila) @elukey I reviewed my share. All of it is related to one project, and I need to keep that data because we still refer to it and need to take subsets out of it from time to t... 
[20:33:49] 10Analytics-Kanban, 10User-Elukey: Reduction of stat1005's disk space usage - https://phabricator.wikimedia.org/T186776#3994189 (10Ottomata) @leila, if you have just random data that you don't want deleted, but is large, you could put it in your HDFS home directory and save it there :) [20:39:47] (03CR) 10Ottomata: [V: 032 C: 032] Update changelog.md with JsonRefine -> Refine details [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/413463 (owner: 10Ottomata) [20:41:26] thanks ottomata, will use that [20:53:11] 10Analytics: Generate pagecounts-ez data back to 2008 - https://phabricator.wikimedia.org/T188041#3994247 (10Milimetric) [21:13:23] 10Analytics-Kanban, 10Patch-For-Review, 10User-Elukey: Add the prometheus jmx agent to AQS Cassandra - https://phabricator.wikimedia.org/T184795#3994281 (10Eevans) [21:15:17] Random question about ReportUpdater: the documentation at https://wikitech.wikimedia.org/wiki/Analytics/Tutorials/Dashboards#Adapt_your_SQL_queries_to_reportupdater's_conventions says the timestamp placeholders are {timestamp_from} and {timestamp_to}, but many SQL files I see in the repos use {from_timestamp} and {to_timestamp}. Which is correct? Or do both work? [21:17:31] nuria_: (Sorry for responding 2 weeks late) yay thanks, thanks to that pointer I found the data at https://analytics.wikimedia.org/datasets/periodic/reports/metrics/echo/days_to_read/ [21:27:32] nuria_: do you have feature development for SWAP as part of FY18-19 goals? [22:14:21] 10Analytics: Functionality to share & view SWAP notebooks - https://phabricator.wikimedia.org/T156934#3994584 (10madhuvishy) [22:26:48] Leila: we have a bucket of time for that, yes. We have not mapped out exactly what goes in it quite yet. 
[22:40:41] joal: just briefly checking in, super nice to hear about presto [23:04:57] (03PS1) 10Ottomata: [WIP] Add RefineMonitor job [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/413633 (https://phabricator.wikimedia.org/T186602) [23:07:39] (03PS2) 10Ottomata: [WIP] Add RefineMonitor job [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/413633 (https://phabricator.wikimedia.org/T186602) [23:08:13] (03PS3) 10Ottomata: [WIP] Add RefineMonitor job [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/413633 (https://phabricator.wikimedia.org/T186602)