[00:14:12] 10Analytics-Kanban: Record and aggregate page previews - https://phabricator.wikimedia.org/T186728#3974250 (10Nuria) > geodata may be a little trickier unless EventLogging can take care of that. EL can take care of it, no need to send from client. > https://en.m.wikipedia.org/wiki/USA Inferring title from url... [00:21:55] 10Analytics-Kanban: Record and aggregate page previews - https://phabricator.wikimedia.org/T186728#3974261 (10Jdlrobson) > Inferring title from url can be done too but given that a preview is happening title or page_id should be available right ? as that content was retrieved from content service. The page_id we... [00:26:55] 10Analytics-Kanban: Record and aggregate page previews - https://phabricator.wikimedia.org/T186728#3974275 (10Nuria) >The only information we have available is that in the response: https://en.wikipedia.org/api/rest_v1/page/summary/San_Francisco - so we'd be able to get page_id and title but not namespace (but c... [00:58:23] 10Analytics, 10Analytics-Wikistats: Wikistats 2.0: Page heading style varies - https://phabricator.wikimedia.org/T187412#3974402 (10Krinkle) [01:15:23] 10Analytics, 10Analytics-Wikistats: Wikistats 2.0: "aa.wikipedia.org" exists and has data available, but marked "Invalid" - https://phabricator.wikimedia.org/T187414#3974447 (10Krinkle) [01:18:59] 10Analytics, 10Analytics-Wikistats: Wikistats 2.0: "aa.wikipedia.org" exists and has data available, but marked "Invalid" - https://phabricator.wikimedia.org/T187414#3974482 (10Krinkle) [01:19:13] 10Analytics, 10Analytics-Wikistats: Wikistats 2.0: "aa.wikipedia.org" exists and has data available, but marked "Invalid" - https://phabricator.wikimedia.org/T187414#3974447 (10Krinkle) [01:39:12] 10Analytics, 10EventBus, 10Services: Enable multiple topics in EventStreams URL - https://phabricator.wikimedia.org/T187418#3974509 (10Smalyshev) [01:54:46] 10Analytics, 10Analytics-Wikistats: Wikistats pageviews by country table view - 
https://phabricator.wikimedia.org/T187407#3974523 (10fdans) This change (https://gerrit.wikimedia.org/r/#/c/410488/) removes links in map metrics :) [07:30:57] 10Analytics-Kanban: English Wikivoyage traffic spike possible bot - https://phabricator.wikimedia.org/T187244#3974712 (10Tbayer) Out of curiosity I took a quick peek at various dimensions in Pivot. (BTW, we should write up some kind of playbook on efficient initial investigations of such pageview anomalies, as t... [07:31:08] 10Analytics-Kanban: Productionitize netflow job - https://phabricator.wikimedia.org/T176984#3974722 (10elukey) [07:31:10] 10Analytics-Kanban, 10Operations, 10monitoring, 10netops, and 2 others: Pull netflow data in realtime from Kafka via Tranquillity/Spark - https://phabricator.wikimedia.org/T181036#3974720 (10elukey) [07:51:19] !log removed default-java packages from analytics1003 and re-launched refinery-drop-mediawiki-snapshots [07:51:20] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [08:04:40] joal: o/ - there might be something more about refinery-drop-mediawiki-snapshots's failure [08:05:44] ah no wait, hive uses the same bigtop util [08:05:45] sigh [08:18:35] !log removed jmxtrans and java 7 from analytics1003 and re-launched refinery-drop-mediawiki-snapshots [08:18:43] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [08:20:10] 10Analytics-Cluster, 10Analytics-Kanban, 10Patch-For-Review, 10User-Elukey: Upgrade Analytics Cluster to Java 8 - https://phabricator.wikimedia.org/T166248#3974803 (10elukey) >>! In T166248#3972607, @elukey wrote: > Cluster upgraded to java8 and java 7 packages removed from all analytics hosts except analy... 
[08:24:22] super weird, it fails at the same point [09:44:14] 10Analytics-Kanban, 10Analytics-Wikistats: Maps: Zoom limit and name for "Unknown" - https://phabricator.wikimedia.org/T187427#3974857 (10Milimetric) [09:47:57] 10Analytics-Kanban: English Wikivoyage traffic spike possible bot - https://phabricator.wikimedia.org/T187244#3974875 (10Milimetric) +1 that this doesn't look at all like a bot. @kaldari, any more thoughts? I'll just close this otherwise. [09:48:34] early wake up milimetric ? :D [09:48:41] da baby [09:48:47] I am trying to wake up so I can deploy [09:48:56] (phabricator wakes me up like jumping into a cold pool) [09:49:11] ahahhah [09:49:16] good early morning then :) [09:56:04] I am currently watching it.wikipedia's traffic on https://stats.wikimedia.org/v2/#/it.wikipedia.org/reading/pageviews-by-country [09:56:56] I had no idea that it wikipedia was watched everywhere [09:57:25] maybe a lot of "search $something_with_italian_name" -> "oh no this is wikipedia is in italian lan" [09:57:28] :D [09:58:44] yeah, actually I got really confused when I first wanted to use Wikipedia, I accidentally ended up on some sanskrit page [09:59:32] I was just thinking that it's amazing how few people use Italian Wiki but are really really close geographically. Like Turkey [09:59:42] but then realized Turkey's probably still blocked [10:00:21] but yeah, these metrics are really great, and even better is the infrastructure that surfaces them, I think we built something so solid, that we can depend on for many years to coe [10:00:23] *come [10:00:36] joal's not around, hm, I'll eat another cookie [10:00:37] :) [10:06:05] ok, elukey so that Hive error is failing on ALTER TABLE {0} DROP IF EXISTS PARTITION ({1}); executed by python [10:06:33] so I'll just try to run that statement manually, see what happens [10:07:12] milimetric: how did you find that? 
[10:07:28] * elukey goes to the corner of ignorant people [10:07:33] ALTER TABLE mediawiki_history DROP IF EXISTS PARTITION (... oh! there aren't enough partitions, maybe it's just a python bug ...); [10:07:41] elukey: psh, as if [10:07:49] ok, so here's my path [10:08:05] I looked at the script, looked for the line that logged that ERROR [10:08:21] found that it's the variable "hive" which I saw is instantiated as utils.HiveUtils [10:08:32] so I looked in utils.HiveUtils.drop_partition_ddl [10:08:39] and found the statement it's trying to use [10:09:11] but if I do "hive -- use wmf; show partitions mediawiki_history;", then I see there are only 3 [10:09:18] weird... why are there only 3... [10:09:33] and I made a private one, that's not there... [10:09:35] hm......... [10:10:08] thanks :) [10:10:16] so now I'm reading the original script more carefully to see if maybe it's not doing what we want [10:14:38] elukey: so this must've been failing for at least the last 3 months, and something else must be wrong, because some tables have 3 partitions and some have 9 [10:14:55] joal: hi! [10:14:58] wanna deploy? [10:17:33] Hi milimetric ! [10:17:38] Yes ! Let's do that ! [10:17:47] wait a minute [10:17:48] Man, that's early for you milimetric [10:18:09] let's figure out the drop-mediawiki issue first [10:18:09] elukey: sure - What's up? [10:18:16] oh, I've been up for like 2 hours [10:18:27] milimetric: WHHAT? [10:18:41] elukey: besides the log config problem, I don't think it's a new issue [10:18:54] because based on the partitions, they haven't been truncated properly for a while [10:19:08] I would disable that cron and re-examine that job closely [10:19:38] but why is it causing an error only now? [10:19:43] something must have changed [10:20:42] either we missed it before or it truncated the tables unevenly before and now it assumes they're evenly truncated and has errors due to that?
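A hedged sketch of the snapshot-pruning logic being debugged above: refinery-drop-mediawiki-snapshots keeps only the newest few "snapshot" partitions and issues the DDL that HiveUtils builds ("ALTER TABLE {0} DROP IF EXISTS PARTITION ({1});"). The function names, the `keep` parameter, and the example snapshot values below are illustrative, not the real refinery-python API.

```python
# Illustrative sketch only: names and defaults here are assumptions, not the
# actual refinery-drop-mediawiki-snapshots implementation.

def snapshots_to_drop(snapshots, keep=6):
    """Return snapshot partition values to drop, keeping the newest `keep`.

    Snapshots are 'YYYY-MM' strings, so lexicographic sort order is
    chronological order.
    """
    ordered = sorted(snapshots)
    return ordered[:-keep] if len(ordered) > keep else []

def drop_partition_ddl(table, snapshot):
    # Mirrors the statement the script logs before handing it to hive.
    return "ALTER TABLE {0} DROP IF EXISTS PARTITION (snapshot='{1}');".format(
        table, snapshot)

# A table that was never pruned might have 9 snapshots while others have 3,
# matching the uneven state observed in the channel above.
old = snapshots_to_drop(['2017-{:02d}'.format(m) for m in range(1, 10)])
statements = [drop_partition_ddl('wmf.mediawiki_history', s) for s in old]
```

If the pruning silently failed for months, different tables end up with different partition counts, which is exactly the 3-vs-9 mismatch noted above.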
[10:20:54] either way, it's a python issue, nothing that affects the cluster [10:21:04] and if we don't deploy now (and soon) then we miss our window for the week [10:22:14] but I could be wrong, I don't hold this opinion strongly [10:23:48] all right, if you guys are ok to proceed I will not oppose [10:26:26] milimetric: let's double check everything we wish to deploy has been merged [10:26:27] elukey: but we should disable that cron [10:26:37] ok joal, checking gerrit [10:26:59] milimetric, elukey : could very weel be related to the change in MW-History I made a few weeks ago (moving tables etc) [10:27:19] elukey: I +1 the need for investigation and fix, but would do that after deploy [10:29:14] that's weird, how come https://gerrit.wikimedia.org/r/#/c/357814/ doesn't show up on the project dashboard: https://gerrit.wikimedia.org/r/#/projects/analytics/refinery,dashboards/default [10:29:38] no idea milimetric [10:29:45] also, joal you don't want the webrequest_split in this time, right? [10:29:46] but that one is not to be merged yet :) [10:29:47] k [10:30:22] ok, so https://gerrit.wikimedia.org/r/#/c/405938/ and https://gerrit.wikimedia.org/r/#/c/405899/ [10:30:28] that last one isn't submitted, but it's got +2s [10:30:32] that first one? [10:30:54] milimetric: merging both :) [10:30:58] sorry brb! [10:31:11] (03CR) 10Joal: [V: 032 C: 032] "LGTM - Merge before deploy" [analytics/refinery] - 10https://gerrit.wikimedia.org/r/405938 (https://phabricator.wikimedia.org/T185419) (owner: 10Ottomata) [10:31:52] milimetric: last check before merge [10:31:56] ok, good, that's the only two I see on the project dashboard, but as we've established that's broken, so joal do you remember anything else? [10:33:19] milimetric: new field for ISP data in webrequest will be named isp_data (as for geocode_data) [10:33:55] milimetric: andrew suggested isp_info - I think I'd rather go with data since data comes from same backend (maxmind) [10:34:00] milimetric: ok with that? 
[10:34:24] The cat jumped up, clicked on my mouse, and my computer crashed hard. While I was dealing with it, she drank my milk [10:34:33] Checkmate, kitty, checkmate [10:34:55] :D [10:35:05] I have to reboot it looks like, shit I lose a bunch of work [10:35:11] Wait until Ada gets older ;) [10:35:26] oh crap :( [10:35:33] milimetric: waiting for your reboot [10:36:23] ok, back [10:36:33] holy crap, I don't understand what she did, she just clicked my mouse [10:36:46] anyway, isp_data sounds good to me [10:36:59] but is that consistent (looking at schema) [10:37:35] milimetric: consistent with geocode_data, but not with pageview_info :( [10:37:39] hm, we have pageview_info and stuff_map [10:37:44] and geocode_data [10:37:46] milimetric: I messed up consistency in naming a while ago [10:37:47] :( [10:37:49] so we have all of them! :) [10:38:14] ok, then, whatever you feel, it's more like geocode stuff I guess, so _data it is! [10:38:36] Ok ! [10:38:37] Merging [10:38:51] (03CR) 10Joal: [V: 032 C: 032] "Merging before deploy" [analytics/refinery] - 10https://gerrit.wikimedia.org/r/405899 (https://phabricator.wikimedia.org/T167907) (owner: 10Joal) [10:40:35] milimetric: merging a last one [10:40:43] (03CR) 10Joal: [C: 032] Update camus part checker topic name normalization [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/405867 (https://phabricator.wikimedia.org/T171099) (owner: 10Joal) [10:41:17] oh, I didn't look at refinery-source, oops [10:41:26] milimetric: no worry :) [10:41:33] ok, nothing else there [10:41:53] ok, so start the refinery-source deploy? [10:42:12] milimetric: let's wait a minute for the last patch to be merged by jenkins [10:42:25] yes, ofc [10:44:15] so we're at 57, eh? 
[10:46:40] milimetric: correct - Moving to 58 [10:52:14] 10Analytics, 10Analytics-Wikistats: Wikistats Bug: Y-axis units and rounding issues - https://phabricator.wikimedia.org/T187429#3974963 (10Volans) [10:52:37] well, joal, Jenkins is being a pain [10:52:54] milimetric: so sloooooooooow :( [10:53:22] milimetric: self merging [10:53:24] (03CR) 10Joal: [V: 032 C: 032] Update camus part checker topic name normalization [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/405867 (https://phabricator.wikimedia.org/T171099) (owner: 10Joal) [10:53:33] Merged [10:53:54] k, did you run the tests? Should I run them as well? [10:54:30] milimetric: triple checking now, but I think it should be fine [10:54:47] milimetric: I let you patch changelog.md? [10:55:00] yeah, I'll do the deploy from start to finish [10:55:20] but, just checking, does anything need to happen to our testing / pom files for the java upgrade? [10:55:22] 10Analytics, 10Operations, 10Patch-For-Review, 10User-Elukey, 10User-Joe: rack/setup/install conf1004-conf1006 - https://phabricator.wikimedia.org/T166081#3974987 (10elukey) I think that we can proceed in this way: 1) Check in labs what zookeeper version would end up in stretch. On conf100[123] we have... [10:56:03] I didn't have maven installed so I'm running mvn test fresh [10:57:30] 10Analytics, 10Analytics-Wikistats: Wikistats Bug: Y-axis units and rounding issues - https://phabricator.wikimedia.org/T187429#3974963 (10Milimetric) Thanks for the report! I agree with your proposals, was saying something similar yesterday. We're trying to internationalize our number formatting so we're go...
[10:57:51] milimetric: worked for me - You'll have to download the entire world :) [10:58:02] yep, world downloading 40% complete I think [10:58:05] 10Analytics, 10Analytics-Wikistats: Wikistats Bug: Y-axis units and rounding issues - https://phabricator.wikimedia.org/T187429#3974992 (10elukey) p:05Triage>03Normal [10:58:15] a lot faster than a couple years ago!! The world got smaller [10:58:36] milimetric: it's been better compressed ;) [10:58:54] poor archiva :D [11:02:05] uh, I'm stuck because my JAVA_HOME isn't set up on this machine [11:02:24] (not stuck, just slow) [11:03:35] milimetric: forgot to let you know but best way would probably have been to clone refinery-source to a stat machine and build there [11:03:52] ah, yeah [11:06:05] 10Analytics, 10Operations, 10Patch-For-Review, 10User-Elukey, 10User-Joe: rack/setup/install conf1004-conf1006 - https://phabricator.wikimedia.org/T166081#3975006 (10MoritzMuehlenhoff) >>! In T166081#3974987, @elukey wrote: > 1) Check in labs what zookeeper version would end up in stretch. On conf100[123... [11:07:36] ok, test ran, hm, one error: [11:07:39] https://www.irccloud.com/pastebin/Toc6gGFT/ [11:08:03] java.lang.ClassNotFoundException: org.wikimedia.analytics.refinery.camus.CamusStatusReader [11:09:04] milimetric: ?? Super unexpected :( [11:10:05] it's scala, do I need to install something special to recognize that? [11:10:12] but it's the only thing that failed, there are tons of other scala things [11:10:18] milimetric: should be done through maven for you [11:10:32] I ran mvn test on my machine - succeeded [11:10:35] WEIRD [11:10:52] milimetric: maybe a mvn clean package? [11:11:39] heh, here comes the whole world again?
[11:11:57] milimetric: normally just checks for versions [11:12:19] looks like it's downloading stuff, it's giving me speeds [11:12:43] but yeah, weird, the CamusStatusReader.scala is there, I don't see anything wrong with it, let's see if it fails with clean again [11:12:56] milimetric: could have been that some stuff was missing when you were testing previous time [11:13:07] confirmed it's passing a bunch of other scala tests [11:13:42] it looked like mvn test was also downloading tons of stuff, but do you always need to do mvn clean package before running mvn test by itself? [11:13:56] milimetric: normally not [11:14:19] milimetric: I prefer to do mvn clean before any other thing I do with maven (package, test or whatever) - To start afresh [11:14:39] right, but this was a new laptop, no .mvn [11:14:48] .m2 :) [11:15:34] right, that thing [11:15:53] although, it looks like it's downloading more of the world now, so maybe there was something stale somewhere [11:16:28] world-size has increased since last build ;) [11:18:03] success!!! [11:18:08] ok, moving on [11:18:11] \o/ [11:21:39] 10Analytics, 10DC-Ops, 10Operations, 10ops-codfw: Decomission eventlog2001 - https://phabricator.wikimedia.org/T182397#3975056 (10MoritzMuehlenhoff) >>! In T182397#3974123, @RobH wrote: > Not showing there now, someone did a cleanup. Not quite, that is related to some changes in Puppet 4 and their interac... [11:23:09] (03PS1) 10Milimetric: Update changelog to v0.0.58 before deploy [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/410808 [11:23:26] (03CR) 10Milimetric: [V: 032 C: 032] Update changelog to v0.0.58 before deploy [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/410808 (owner: 10Milimetric) [11:23:32] that's a bunch of changes [11:24:14] hey people, I'd need to go in a bit for lunch + errand, anything that I can do to help now? 
[11:24:17] i swear to god, jenkins hates me [11:24:22] no worries elukey [11:26:09] milimetric: anything I can help with? [11:26:25] no, it just never has the right version when I go to https://integration.wikimedia.org/ci/job/analytics-refinery-release/m2release/, I'm refreshing it [11:26:49] but everyone else claims they have the right version there, when they do it, so I'm jealous :) [11:26:58] milimetric: I does the same to me - I just think jenkins is the kind of guys who's always late [11:27:33] oh, I see, so it does that based on the changelog commit? We're just too fast then [11:27:36] that I can live with :) [11:28:05] milimetric: I don't think so, I think it tracks mvn versions, but incorrectly [11:28:25] oh, I see, then, yeah, I don't understand why some people claim it always works for them [11:32:50] all right, going afk! ttl! [11:38:27] 10Analytics-Kanban: Record and aggregate page previews - https://phabricator.wikimedia.org/T186728#3975127 (10phuedx) >>! In T186728#3974261, @Jdlrobson wrote: >> Inferring title from url can be done too but given that a preview is happening title or page_id should be available right ? as that content was retrie... [11:41:34] sync job launched, with the right version number this time :) [11:46:43] (03PS5) 10Milimetric: [WIP] Saving in case laptop catches on fire [analytics/refinery] - 10https://gerrit.wikimedia.org/r/408848 [11:47:24] ok, joal, do we need commits to point the webrequest jobs to 0.0.58? 
[11:47:29] or is that already there [11:47:50] milimetric: for webrequest we're good - let me double check if there's anything else [11:47:59] k, then I'll deploy refinery [11:48:05] (after) [11:48:46] I we're good milimetric [11:48:51] +think [11:48:52] great [11:49:03] obviously, me thinking doesn't like to get written [11:49:05] we can always merge/restart more jobs later [11:49:09] sure [12:08:14] ok, refinery deployed and synced [12:08:27] so next step: drop and create webrequest [12:08:35] milimetric: nope [12:08:42] stop jobs [12:08:44] right [12:09:44] yup [12:10:12] stop webrequest-load bundle (kill) - wait for any refine job to be done [12:12:27] milimetric: there currently are 2 webrequest load jobs running - Let's not kill them but rather wait for them to finish [12:12:32] I can monitor that [12:15:10] no worries joal, I got this. I’ll read the docs again and watch the last of the bundle-started stuff until they finish, then kill, drop table, create with new statement, and launch jobs with new prop files [12:15:34] milimetric: no drop please :) [12:15:42] milimetric: alter table [12:16:11] milimetric: we keep the create table script in sync, but we update with an alter table [12:16:16] I wouldn’t do anything without testing first but how do you add a column, alter table? [12:16:44] yessir [12:32:38] looks like those jobs will be going on for another 20 minutes or so [12:37:48] (03PS2) 10Joal: Add TransformFunctions for JsonRefine job [analytics/refinery/source] (jsonrefine) - 10https://gerrit.wikimedia.org/r/410240 (https://phabricator.wikimedia.org/T185237) (owner: 10Ottomata) [12:37:50] (03PS3) 10Joal: Add dataframe conversion to new schema function [analytics/refinery/source] (jsonrefine) - 10https://gerrit.wikimedia.org/r/410241 (owner: 10Ottomata) [12:41:12] (03CR) 10Fdans: "Your points have been addressed except for the one with the tooltip and the cursor, which I can't reproduce with my computer.
We should p" [analytics/wikistats2] - 10https://gerrit.wikimedia.org/r/410488 (owner: 10Fdans) [12:41:23] (03PS2) 10Fdans: Bunch of small map UI fixes [analytics/wikistats2] - 10https://gerrit.wikimedia.org/r/410488 (https://phabricator.wikimedia.org/T187205) [12:47:49] 10Analytics-Kanban: Allow switching metrics in a dashboard widget - https://phabricator.wikimedia.org/T187440#3975285 (10fdans) [12:47:59] milimetric: webrequest load jobs finished [12:48:52] milimetric: thinking of that, we'll also need to wait for jobs USING webrequest, before proceeding [12:51:21] milimetric: I kill the webrequest-bundle [12:51:31] ok [12:51:34] (03CR) 10jerkins-bot: [V: 04-1] Add dataframe conversion to new schema function [analytics/refinery/source] (jsonrefine) - 10https://gerrit.wikimedia.org/r/410241 (owner: 10Ottomata) [12:51:36] (03CR) 10jerkins-bot: [V: 04-1] Add TransformFunctions for JsonRefine job [analytics/refinery/source] (jsonrefine) - 10https://gerrit.wikimedia.org/r/410240 (https://phabricator.wikimedia.org/T185237) (owner: 10Ottomata) [12:52:16] !log Killing webrequest-load bundle (next restart should be at hour 12:00) [12:52:17] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [12:53:12] joal: I'm reading the datasets to find the dependent jobs, will wait for them before altering the table [12:54:04] milimetric: I can trigger you :) [12:54:37] (03PS3) 10Fdans: Bunch of small map UI fixes [analytics/wikistats2] - 10https://gerrit.wikimedia.org/r/410488 (https://phabricator.wikimedia.org/T187205) [12:57:54] milimetric: We're good now [12:58:02] Let's DO IT :) [12:58:40] joal: I can see the following dependent jobs: [12:58:41] https://www.irccloud.com/pastebin/mrKDmW16/ [12:58:54] you checked all those? 
[12:59:08] milimetric: daily and monthly are ok by default [12:59:13] ya [12:59:33] hourly are done :) [12:59:50] ok, great [12:59:55] altering table [13:00:01] milimetric: And, cluster has been empty for more than 5 minutes ;) [13:00:29] alter table webrequest add column isp_data map<string,string>; ? [13:00:48] milimetric: with comment? [13:00:48] heh, nice, give the poor thing a break [13:00:56] right, comment, doh [13:01:48] hm, failed [13:03:27] CRAP [13:03:50] ok, joal, good [13:04:16] just a bad statement, it has to be "alter table ____ add columns (____)" [13:04:25] the plural columns instead of column and the parens are required [13:06:32] ok, docs updated https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Traffic/Webrequest#wmf.webrequest [13:06:57] ok, joal, so I'll relaunch the bundle with the new props? [13:07:38] good for me milimetric [13:13:02] joal: just double checking: [13:13:03] https://www.irccloud.com/pastebin/UXSdPQAr/ [13:17:26] that looked good to me, so I ran it [13:17:40] I gotta run and take care of baby, but ping me if you need me [13:18:04] sorry milimetric - was on the phone [13:23:39] milimetric: confirmed to work (misc has finished and has data !!)
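A minimal sketch of the Hive DDL fix discussed above: ALTER TABLE needs the plural COLUMNS keyword and parentheses around the column list; the singular "ADD COLUMN name type" is a parse error. The helper name and the comment text below are illustrative assumptions, not the exact production statement.

```python
# Illustrative helper (not real refinery code) that builds the correct
# "ALTER TABLE ... ADD COLUMNS (...)" form described above.

def add_columns_ddl(table, columns):
    """Build "ALTER TABLE ... ADD COLUMNS (name type COMMENT '...', ...)".

    `columns` is a list of (name, hive_type, comment) tuples. Note the
    plural COLUMNS and the required parentheses.
    """
    col_list = ", ".join(
        "{0} {1} COMMENT '{2}'".format(name, hive_type, comment)
        for name, hive_type, comment in columns)
    return "ALTER TABLE {0} ADD COLUMNS ({1});".format(table, col_list)

ddl = add_columns_ddl(
    'webrequest',
    [('isp_data', 'map<string,string>', 'ISP data from MaxMind')])
```

Running this against the webrequest example yields the working form of the statement that initially failed in the channel.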
:) [13:24:06] wow, nice [13:24:12] misc is tiny :) [13:24:16] yup [13:27:10] milimetric: I'm super eager to look at top ISPs :) [13:27:24] already did that on misc, willing to do on text :) [13:33:51] 10Analytics, 10Operations, 10Wikimedia-Stream, 10hardware-requests, 10Patch-For-Review: decommission rcs100[12] - https://phabricator.wikimedia.org/T170157#3420970 (10faidon) [13:34:19] 10Analytics, 10Operations, 10Wikimedia-Stream, 10hardware-requests, 10ops-eqiad: decommission rcs100[12] - https://phabricator.wikimedia.org/T170157#3420970 (10faidon) [13:34:40] back :) [13:50:28] so people here's the requirements for the replacement of dbstore1002 [13:50:31] https://phabricator.wikimedia.org/T159423#3975296 [13:50:56] 3 more db hosts if we want to keep dbstore1002, or 6 new ones [13:53:23] joal,milimetric deployment done? [13:53:47] Yes, sorry didn’t log [13:54:09] !log deployment of refinery and refinery-source done [13:54:11] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [13:56:52] 10Analytics, 10TCB-Team, 10Two-Column-Edit-Conflict-Merge, 10WMDE-Analytics-Engineering, and 5 others: How often are new editors involved in edit conflicts - https://phabricator.wikimedia.org/T182008#3975445 (10Lea_WMDE) Even with the missing ids, is the data reliable? If yes, then yay! Great work Goran! T... [13:58:34] 10Analytics-Kanban: Record and aggregate page previews - https://phabricator.wikimedia.org/T186728#3975451 (10Ottomata) BTW, for webrequest, iirc we only get page_id and namespace_id sent from the the client in the X-Analytics header, which is set by MediaWiki. So, sending them from the client is how pageviews... [13:59:44] elukey: i don't understand that [13:59:51] why do we need more hosts if we don't want to keep it? [14:00:18] 10Analytics, 10Operations, 10Patch-For-Review, 10User-Elukey, 10User-Joe: rack/setup/install conf1004-conf1006 - https://phabricator.wikimedia.org/T166081#3975464 (10elukey) >>! 
In T166081#3975006, @MoritzMuehlenhoff wrote: >>>! In T166081#3974987, @elukey wrote: >> 1) Check in labs what zookeeper versio... [14:00:45] ottomata: morning :) [14:01:36] so they are proposing, as far as I can see, either dbstore1002 + 3 hosts or 6 new ones [14:02:19] OHHH [14:02:33] i didn't realize that was a choice [14:02:43] we need to get rid of dbstore1002 next year, no? [14:03:18] ottomata: suggested replacement is in eary 2019 yes [14:04:25] so, we need to budget for replacing it, so we don't have a choice [14:04:27] we need 6 hosts? [14:05:28] to avoid rushing to a hdfs only solution in a year I'd say yes [14:09:07] ok, i've budgeted for 3, but probably bigger than they need to be [14:09:34] elukey: i'm going to look at the drop-mediawiki-snapshots thing [14:09:38] Heya ottomata - let me know when you have time to investigate with JsonRefine again [14:09:58] Also ottomata - Can we have a chat over JsonRefine monitoring? [14:10:14] joal: ya! i see you made some patches! [14:10:24] so leaf should be option, but struct should be null...iiiinteresting [14:10:29] ottomata: I think I have found a hacky way to make it work [14:10:48] something else ottomata - I was converting leafs incorrectly [14:10:56] oh? [14:11:16] Leaf is an option of index [14:11:36] None for new values, index where to find value for existing ones [14:12:21] I was putting Some(None) when r.isNullAt(idx) was true [14:12:25] Instead of None [14:12:31] makes sense ottomata --< [14:12:34] ? [14:12:38] sort of! [14:12:40] :D [14:12:53] mapping over the opt(idx) was returning nothing? [14:13:15] milimetric: do you have time later on for https://gerrit.wikimedia.org/r/#/c/405687/ ? [14:13:15] I was mapping, but returning none in some cases [14:13:32] So basically returning Some(None) [14:13:35] ohhhhh [14:13:36] which doesn't make sense [14:13:37] yes ok makes sense [14:13:43] yes elukey [14:13:43] so now None in both cases [14:13:45] where is null [14:13:49] or where is non present? 
[14:13:52] correct [14:13:55] got it [14:13:55] cool [14:14:23] And then for Rows, Some(Row) is not accepted by spark, so Nulls [14:14:31] How hacky [14:15:00] I added more tests, hopefully it'll be tested enough not to fail (again)!!! [14:19:06] ahhh i see [14:19:07] crazy! [14:19:09] ok joal amazing [14:19:13] will test soon [14:23:27] ottomata: qq - is there any constraint in keeping zookeeperd and zookeeper pkgs with the same cdh version? [14:24:00] I am reviewing the work to be done for conf100[456] [14:24:09] in which we'll install stretch [14:24:41] that comes with zookeeperd 3.4.9-3, as opposed to 3.4.5+dfsg-2+deb8u2 that we use now (rebuilt by Moritz with a security patch) [14:24:50] cdh version? oh [14:24:51] i see [14:25:00] we'd be upgrading server but cdh would be using older version? [14:25:21] elukey: i don't know of any problems, i'd hope that a newer server would work with the older client [14:26:00] in theory they have the same 3.4 version, it shouldn't be a big issue (needs to be tested of course) [14:26:12] aye [14:26:16] I am only exploring the possibility of getting the zookeeperd pkg straight from debian [14:26:23] isn't that what we do? [14:26:32] nope, we force 3.4.5+dfsg-2+deb8u2 [14:26:37] oh, Moritz's version [14:26:39] yeah [14:26:57] mforns: holaaa can you send me a screenshot of the issue with the map tooltip? [14:27:03] elukey: the only reason we force, is so that zookeeperd uses the debian version (or our apt version), and not the cdh version [14:27:15] hi fdans! better we pair in da caif no? [14:27:17] the cdh nodes need to use (force?) the cdh version [14:27:23] because the cdh debs depend on it [14:27:31] mforns: now? [14:27:38] so, ya, we should get from debian for server [14:27:43] (or from moritz [14:27:43] ) [14:27:45] ottomata: yesss sorry I misread versions, you are right [14:27:46] fdans, whenever good for you, now is good for me [14:27:46] either way [14:27:47] but not cdh [14:27:57] mforns: omw!
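A Python analogue of the Scala bug described above (the real code is in refinery-source's jsonrefine branch). When converting a row to an evolved schema, each target field maps either to a source index (the field already existed) or to nothing (a newly added field). The bug was emitting a wrapped null (Scala's Some(None)) when the source value was null, instead of a plain null. All names below are illustrative, not the actual JsonRefine API.

```python
# Illustrative sketch only: models the "Some(None) vs None" bug in Python.

def convert_row(row, source_indices):
    """Convert `row` to an evolved schema.

    `source_indices` has one entry per target field: a source index where
    the field already existed, or None for a newly added field.
    """
    out = []
    for idx in source_indices:
        if idx is None or row[idx] is None:
            # "Field didn't exist" and "field was null" must both become a
            # plain null in the converted row, never a wrapped null.
            out.append(None)
        else:
            out.append(row[idx])
    return out

# Old row has two fields; the target schema appends a third, newly added
# field (e.g. a geocoded_data-style map).
converted = convert_row(['enwiki', None], [0, 1, None])
```

With the fix, both the absent third field and the null second field come out as plain nulls, which is what Spark expects when it builds the new Rows.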
[14:27:59] it is a long one and I confused it with cdh's [14:27:59] ok! [14:28:01] nevermind [14:28:03] I am stupid [14:28:58] I confused it with 3.4.5+cdh5.10.0+104-1.cdh5.10.0.p0.71~jessie-cdh5.10.0 [14:29:03] that is the cdh one [14:29:04] okok [14:30:24] 10Analytics, 10Operations, 10Patch-For-Review, 10User-Elukey, 10User-Joe: rack/setup/install conf1004-conf1006 - https://phabricator.wikimedia.org/T166081#3975538 (10elukey) Nevermind I am stupid, I confused the long version 3.4.5+cdh5.10.0+104-1.cdh5.10.0.p0.71~jessie-cdh5.10.0 with 3.4.5+dfsg-2+deb8u2,... [14:32:04] heya mforns [14:32:08] i don't know why [14:32:12] hey ottomata :] [14:32:19] but hive doesn't like this mediawiki_metrics drop partitions hacky thing :) [14:32:22] # HACK: For tables partitioned by dimensions other than snapshot [14:32:22] # add !='' to snapshot spec, so that HiveUtils deletes [14:32:22] # the whole snapshot partition with all sub-partitions in it. [14:32:29] oh, come on, again... [14:32:33] hive (wmf)> ALTER TABLE mediawiki_metrics DROP IF EXISTS PARTITION (snapshot='2017-07',metric!='',wiki_db!=''); [14:32:33] FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. null [14:32:38] again? [14:33:00] ottomata, yes, it fails every month with a different error... [14:33:05] hhaah [14:33:25] really?? [14:33:49] ottomata, I'm in meeting, will look into that after [14:34:25] oook! :) [14:36:09] ottomata: so I am going to try a mixed zookeeper cluster in labs and see how it goes [14:36:10] joal, did you mean to remove the geoip test stuff in your patch 2 here? https://gerrit.wikimedia.org/r/#/c/410240/1..2 [14:36:17] elukey: +1 cool [14:36:42] ottomata: sorry for the confusion, the pkg names sometimes confuse me :D [14:37:00] no sorrys! 
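The failing mediawiki_metrics statement above comes from the quoted HACK: for tables partitioned by dimensions other than snapshot, the script appends `!=''` clauses so that a single DROP removes the snapshot partition together with all of its sub-partitions. A hedged sketch of how that spec is assembled (helper name is illustrative; the real logic lives in refinery's HiveUtils, and as the log shows, Hive sometimes rejects the resulting statement):

```python
# Illustrative reconstruction of the partition-spec hack quoted above; not
# the actual HiveUtils code.

def drop_snapshot_partition_spec(snapshot, other_dimensions=()):
    """Build a partition spec that matches a snapshot and, via !='' clauses,
    every sub-partition under it."""
    clauses = ["snapshot='{0}'".format(snapshot)]
    clauses.extend("{0}!=''".format(dim) for dim in other_dimensions)
    return ",".join(clauses)

# Reproduces the exact spec from the failing statement in the log.
spec = drop_snapshot_partition_spec('2017-07', ('metric', 'wiki_db'))
ddl = "ALTER TABLE mediawiki_metrics DROP IF EXISTS PARTITION ({0});".format(spec)
```

This makes the fragility visible: the generated statement is syntactically what the script logs, yet Hive returned "Execution Error ... null" for it, so the hack depends on Hive accepting inequality comparators in DROP PARTITION specs.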
:) [14:41:12] 10Analytics, 10Operations, 10Patch-For-Review, 10User-Elukey, 10User-Joe: rack/setup/install conf1004-conf1006 - https://phabricator.wikimedia.org/T166081#3284834 (10ops-monitoring-bot) Script wmf-auto-reimage was launched by elukey on neodymium.eqiad.wmnet for hosts: ``` ['conf1004.eqiad.wmnet'] ``` The... [14:44:01] (03PS3) 10Ottomata: Add TransformFunctions for JsonRefine job [analytics/refinery/source] (jsonrefine) - 10https://gerrit.wikimedia.org/r/410240 (https://phabricator.wikimedia.org/T185237) [14:44:23] (03CR) 10Ottomata: [V: 032 C: 032] "Merging this one in branch to help with rebasing future ones." [analytics/refinery/source] (jsonrefine) - 10https://gerrit.wikimedia.org/r/410240 (https://phabricator.wikimedia.org/T185237) (owner: 10Ottomata) [14:44:54] (03PS4) 10Ottomata: Add dataframe conversion to new schema function [analytics/refinery/source] (jsonrefine) - 10https://gerrit.wikimedia.org/r/410241 [14:46:23] ottomata: Oh nop, I just rebase :( [14:47:21] k :) [14:47:57] (03PS1) 10Ottomata: [WIP] Refactor JsonRefine to use DataFrame converter [analytics/refinery/source] (jsonrefine) - 10https://gerrit.wikimedia.org/r/410942 [14:53:19] joal: 18/02/15 14:50:38 INFO JsonRefine: Successfully refined 1 of 1 raw JSON dataset partitions into table otto.MobileWikiAppSessions (total # refined records: 8378) [14:53:21] :o :o :o [14:53:23] !!!! [14:53:41] \\\o/// ! YAY [14:53:46] wowwee [14:53:51] now to test transform funcs and cleanup [14:53:52] wow [14:54:05] it feels much faster so far too... [15:00:00] it just evolved the schema, added geocoded_data, and then inserted it! [15:00:02] WOWWW [15:00:18] now the table has new field, what if dont' geocode!!!? [15:05:26] gone to catch Lino - Later ! 
[15:09:26] (03PS4) 10Fdans: Bunch of small map UI fixes [analytics/wikistats2] - 10https://gerrit.wikimedia.org/r/410488 (https://phabricator.wikimedia.org/T187205) [15:12:38] 10Analytics, 10Operations, 10Patch-For-Review, 10User-Elukey, 10User-Joe: rack/setup/install conf1004-conf1006 - https://phabricator.wikimedia.org/T166081#3975633 (10ops-monitoring-bot) Completed auto-reimage of hosts: ``` ['conf1004.eqiad.wmnet'] ``` and were **ALL** successful. [15:14:08] (03CR) 10Mforns: [C: 032] Bunch of small map UI fixes [analytics/wikistats2] - 10https://gerrit.wikimedia.org/r/410488 (https://phabricator.wikimedia.org/T187205) (owner: 10Fdans) [15:15:58] wow everyting works [15:15:59] wow [15:16:30] (03Merged) 10jenkins-bot: Bunch of small map UI fixes [analytics/wikistats2] - 10https://gerrit.wikimedia.org/r/410488 (https://phabricator.wikimedia.org/T187205) (owner: 10Fdans) [15:17:38] ottomata: do you have a tl;dr for poor souls not knowing any scala magic? [15:17:41] :D [15:17:55] and not "Joseph is awesome" since we know it [15:18:58] 10Analytics-Kanban: We should prevent the user from trying to rediscover America - https://phabricator.wikimedia.org/T187452#3975650 (10fdans) [15:21:45] 10Analytics-Kanban: Map tooltip and line graph guide are misaligned in Ubuntu Chrome - https://phabricator.wikimedia.org/T187453#3975666 (10fdans) [15:22:29] 10Analytics-Kanban, 10Patch-For-Review: Launch top per country pageviews on UI - https://phabricator.wikimedia.org/T185510#3975677 (10fdans) [15:26:09] elukey: for sure! running home from cafe [15:26:10] then ya [15:37:36] 10Analytics, 10Operations, 10Patch-For-Review, 10User-Elukey, 10User-Joe: rack/setup/install conf1004-conf1006 - https://phabricator.wikimedia.org/T166081#3975752 (10elukey) In labs I've extended the one-zookeeper-node analytics project's cluster to three nodes, adding two stretch hosts. Except the puppe... [15:55:13] interesting.. 
it seems that Yarn RM when zk is acting weird says "nope, shutdown" [15:55:22] meanwhile hdfs tries to failover [15:57:08] better: RM tries to set itself to standby [15:57:33] elukey: happy to bc scala tldr if you want [15:57:44] same thing for NM (as expected), but for some reason NM-2 was up, RM-2 was not in labs [15:57:49] anyhow, zk is important :D [15:58:15] ottomata: maybe we can do a post standup so others will joing? [15:58:18] *join [15:59:06] sure [16:00:46] ahahh today standup is in a hour [16:00:55] I always forgetttttt [16:19:34] (03PS1) 10Fdans: Release 2.1.9 [analytics/wikistats2] - 10https://gerrit.wikimedia.org/r/411002 [16:21:35] (03PS2) 10Fdans: Release 2.1.9 [analytics/wikistats2] - 10https://gerrit.wikimedia.org/r/411002 [16:23:08] (03PS3) 10Fdans: Release 2.1.9 [analytics/wikistats2] - 10https://gerrit.wikimedia.org/r/411002 [16:23:32] (03CR) 10Fdans: [V: 032 C: 032] Release 2.1.9 [analytics/wikistats2] - 10https://gerrit.wikimedia.org/r/411002 (owner: 10Fdans) [16:25:06] (03PS1) 10Fdans: Release 2.1.9 [analytics/wikistats2] (release) - 10https://gerrit.wikimedia.org/r/411004 [16:25:51] (03CR) 10Fdans: [V: 032 C: 032] Release 2.1.9 [analytics/wikistats2] (release) - 10https://gerrit.wikimedia.org/r/411004 (owner: 10Fdans) [16:25:53] 10Analytics, 10Analytics-EventLogging: uBlock blocks EventLogging - https://phabricator.wikimedia.org/T186572#3975994 (10Milimetric) We really did try and ping people several times about blocking *pageview* (which catches our pageview-api). We submitted a patch, I think Kaldari even knew some devs on there an... [16:31:42] 10Analytics, 10Analytics-Wikistats: Wikistats pageviews by country table view - https://phabricator.wikimedia.org/T187407#3976019 (10Nuria) Sounds good, let's deploy that one as it looks like it is been merged. 
[16:32:06] 10Analytics, 10Operations, 10Patch-For-Review, 10User-Elukey, 10User-Joe: rack/setup/install conf1004-conf1006 - https://phabricator.wikimedia.org/T166081#3976023 (10ops-monitoring-bot) Script wmf-auto-reimage was launched by elukey on neodymium.eqiad.wmnet for hosts: ``` ['conf1005.eqiad.wmnet', 'conf10... [16:32:43] 10Analytics-Kanban: Small map UI changes - https://phabricator.wikimedia.org/T187205#3976024 (10Nuria) [16:44:18] I'm a bit surprised about that finding elukey - For HDFS-failover I knew, but Yarn (RM + NM) was supposed in my mind to be free from zk [16:45:50] ottomata: do you have aminute before standup? [16:46:03] joal: yep, it uses zk to figure out who is the leader [16:46:31] elukey: Ah ... Ok - Weird - I thought not - But since you tried, I trust you ! [16:47:57] joal: why weird? It is the same thing as the hdfs HA no? Yarn has it embedded, the HDFS NN have it on a separate daemon [16:48:31] (trying to get your doubts because usually they carry good findings on a later stage :D) [16:48:51] joal: i hope you don't mind, im' doign some code cleaning [16:48:51] :) [16:48:53] in your patch... [16:49:19] elukey: Right, I just had in mind (can remember where I read that) that it didn't use zk - Maybe I'm just confused with embeded/separate [16:49:26] ottomata: please do :) [16:49:36] ottomata: a minute before standup on monitoring? [16:49:47] yes [17:01:55] 10Analytics-Kanban, 10Analytics-Wikistats, 10Hindi-Sites, 10Patch-For-Review: Hindi Wikiversity is not showing in Wikimedia Stats - https://phabricator.wikimedia.org/T183682#3976214 (10Nuria) [17:05:17] 10Analytics, 10Operations, 10Patch-For-Review, 10User-Elukey, 10User-Joe: rack/setup/install conf1004-conf1006 - https://phabricator.wikimedia.org/T166081#3976232 (10ops-monitoring-bot) Completed auto-reimage of hosts: ``` ['conf1005.eqiad.wmnet', 'conf1006.eqiad.wmnet'] ``` and were **ALL** successful. 
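The ZooKeeper dependency elukey and joal settle above comes from Yarn ResourceManager HA: like HDFS NameNode failover, the RMs elect a leader through ZooKeeper, but Yarn embeds the elector in the RM itself rather than running a separate failover daemon. A hedged `yarn-site.xml` fragment showing the Hadoop 2.x properties involved (hostnames here are placeholders, not the production config):

```xml
<!-- Illustrative Yarn RM HA configuration (Hadoop 2.x property names);
     hosts and cluster-id are placeholders, not WMF's actual values. -->
<property>
  <name>yarn.resourcemanager.ha.enabled</name>
  <value>true</value>
</property>
<property>
  <name>yarn.resourcemanager.ha.rm-ids</name>
  <value>rm1,rm2</value>
</property>
<property>
  <name>yarn.resourcemanager.cluster-id</name>
  <value>analytics-hadoop</value>
</property>
<property>
  <!-- The embedded leader elector talks to this ZooKeeper ensemble;
       if zk misbehaves, the active RM tries to step down to standby,
       which matches the behaviour observed in labs above. -->
  <name>yarn.resourcemanager.zk-address</name>
  <value>conf1004:2181,conf1005:2181,conf1006:2181</value>
</property>
```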
[17:11:13] 10Analytics-Kanban, 10Operations, 10Patch-For-Review, 10User-Elukey, 10User-Joe: rack/setup/install conf1004-conf1006 - https://phabricator.wikimedia.org/T166081#3976253 (10elukey) [17:32:02] 10Analytics: find out usage of dbstore wiki data - https://phabricator.wikimedia.org/T187476#3976374 (10Nuria) [17:37:48] 10Analytics: find out usage of dbstore1002 among analysts and reserachers - https://phabricator.wikimedia.org/T187476#3976419 (10Nuria) [17:38:46] 10Analytics-Kanban: Record and aggregate page previews - https://phabricator.wikimedia.org/T186728#3976421 (10phuedx) >>! In T186728#3975127, @phuedx wrote: > I'll open a ticket with Readers Infra about adding namespace ID to the response. Working around a lack of data in the response by sending partial data fro... [17:44:50] 10Analytics-Kanban: Record and aggregate page previews - https://phabricator.wikimedia.org/T186728#3976433 (10Jdlrobson) > I'll open a ticket with Readers Infra about adding namespace ID to the response. New endpoint has this (I believe!) [17:45:29] 10Analytics, 10Analytics-Wikistats: Wikistats Bug: Y-axis units and rounding issues - https://phabricator.wikimedia.org/T187429#3974963 (10Nuria) Agreed with 1 and 2 but not so much with 3. Regarding 1: Localization issues as hard, the "bn" sufix might make sense to americans but not to anyone else reading t... [17:49:53] 10Analytics-Kanban: Record and aggregate page previews - https://phabricator.wikimedia.org/T186728#3976447 (10Jdlrobson) I had a lovely chat with @Ottomata and I think I can pin down this schema today. In terms of what I need to log: * **page_title** will be provided by Popups and retrieved from response * **pa... 
[18:00:26] 10Analytics, 10Analytics-Wikistats: Wikistats 2.0 - incorrect links in table views - https://phabricator.wikimedia.org/T187480#3976479 (10Quiddity) [18:01:25] 10Analytics, 10Analytics-Wikistats: Wikistats 2.0 - incorrect links in table views - https://phabricator.wikimedia.org/T187480#3976490 (10Quiddity) 05Open>03Resolved Oh, this was fixed (the links were removed) whilst I was asleep. I should've reconfirmed it, before filing from my notes... >.< [18:04:05] 10Analytics, 10TCB-Team, 10Two-Column-Edit-Conflict-Merge, 10WMDE-Analytics-Engineering, and 5 others: How often are new editors involved in edit conflicts - https://phabricator.wikimedia.org/T182008#3976500 (10GoranSMilovanovic) @Lea_WMDE > The only other thing that is confusing to me: You write there w... [18:14:48] (03PS5) 10Ottomata: Add dataframe conversion to new schema function [analytics/refinery/source] (jsonrefine) - 10https://gerrit.wikimedia.org/r/410241 [18:35:02] 10Analytics, 10Analytics-Wikistats: Beta Release: Resiliency, Rollback and Deployment of Data - https://phabricator.wikimedia.org/T177965#3976608 (10JAllemandou) First round of discussion with the team: - Things we agre on: - using multiple datasources in druid (snapshots) seems the way to go to facilitate... [18:35:18] milimetric: do you mind reviewing / commenting ? --^ [18:35:23] Gone for diner a-team ! [18:35:39] will do jo [18:44:41] 10Analytics, 10Analytics-Wikistats: Beta Release: Resiliency, Rollback and Deployment of Data - https://phabricator.wikimedia.org/T177965#3676320 (10Milimetric) For the record, I liked Joseph's idea of 3 data sources. One being served right now, one backup, and one being loaded next. When loaded_next is done... [18:59:27] * elukey off! 
[19:01:42] (03CR) 10jerkins-bot: [V: 04-1] Add dataframe conversion to new schema function [analytics/refinery/source] (jsonrefine) - 10https://gerrit.wikimedia.org/r/410241 (owner: 10Ottomata) [19:07:45] joal: o/ I have a couple of questions regarding https://phabricator.wikimedia.org/T186559#3957604. Can you share the link to XML2Parquet job patch? [19:08:18] joal: Also, is your bash script available anywhere? [19:17:10] file:///home/milimetric/Pictures/Ada/six.jpg https://usercontent.irccloud-cdn.com/file/RstcJCp9/image.png [19:17:17] don't mind me, just spamming cute babies [19:20:39] BABYYYYYYYY ! [19:25:05] :DDDD [19:27:29] wowowowowow [19:44:19] (03PS2) 10Joal: Add XmlConverter spark job [analytics/wikihadoop] - 10https://gerrit.wikimedia.org/r/361440 [19:54:24] 10Analytics: Upload XML dumps to hdfs - https://phabricator.wikimedia.org/T186559#3976834 (10bmansurov) @JAllemandou I found some [[ https://wikitech.wikimedia.org/wiki/Analytics/Archive/Hadoop_Streaming#Using_WikiHadoop_to_parse_XML_dumps | documentation ]] about using Wikihadoop. Is there any other place where... [20:01:48] (03PS3) 10Joal: Add XmlConverter spark job [analytics/wikihadoop] - 10https://gerrit.wikimedia.org/r/361440 [20:02:54] i may be biased, but i think the cirrus dumps are more useful than the xml dumps in many cases ;P [20:03:39] very possible ebernhardson - I don't even know of the cirrus dumps existence [20:04:06] 10Analytics-Cluster, 10Analytics-Kanban, 10Patch-For-Review: Move webrequest varnishkafka and consumers to Kafka jumbo cluster. - https://phabricator.wikimedia.org/T185136#3976864 (10Jgreen) [20:04:44] I gotta run out for a couple hours, will try to catch up a bit more tonight but I might crash 'cause I woke up at 3 today [20:04:46] joal: they are on dumps.wikimedia.org and created for all wikis every week. 
They have this for every page: https://en.wikipedia.org/wiki/Kennedy?action=cirrusdump [20:04:53] joal: (in json) [20:05:03] +1 ebernhardson, those are great [20:05:14] Super cool ebernhardson :) [20:12:00] (03PS4) 10Joal: Add XmlConverter spark job [analytics/wikihadoop] - 10https://gerrit.wikimedia.org/r/361440 [20:38:36] 10Analytics: Upload XML dumps to hdfs - https://phabricator.wikimedia.org/T186559#3977087 (10JAllemandou) @bmansurov : This doc is a it outdated. It should work but there is better tooling now. - Script to copy to HDFS: https://gerrit.wikimedia.org/r/#/c/409960/ - XML converter in spark: https://gerrit.wikimed... [20:38:52] (03PS5) 10Joal: Add XmlConverter spark job [analytics/wikihadoop] - 10https://gerrit.wikimedia.org/r/361440 [20:39:48] (03PS2) 10Joal: Manual importer of xml dumps to hdfs [analytics/refinery] - 10https://gerrit.wikimedia.org/r/409960 [20:40:02] (03PS3) 10Joal: Manual importer of xml dumps to hdfs [analytics/refinery] - 10https://gerrit.wikimedia.org/r/409960 [20:40:47] nuria: are there actually any difference between https://phabricator.wikimedia.org/T164201 and https://phabricator.wikimedia.org/T164596 ? [20:47:02] ebernhardson: is your current job memory setting on purpose super-heavy for memoryOverhead? [20:48:18] joal: yes, it loads 10G files from hdfs into memory and spits them back out as binary optimized files [20:48:42] joal: but i need to figure out how to make it release the executors it's not using, as this final stage (that takes 30+ minutes) only needs 4 executors and the data is pulled from hdfs not spark [20:49:04] ebernhardson: why do you set memoryOverhead instead of regular memory? [20:49:09] joal: because it's C++ [20:49:39] ebernhardson: of course ! 
therefore it uses memory outside the process - makes sense [20:49:49] Thanks for explanation ebernhardson [20:50:22] it's almost done though, will give the resources back :) [20:50:50] ebernhardson: no worrie - better with 1/3 of the cluster than half is all :-P [20:51:23] * ebernhardson waits for you to go to sleep and takes 80% [20:53:54] 10Analytics, 10ChangeProp, 10EventBus, 10Patch-For-Review, and 4 others: Create generalized "precache" endpoint for ORES - https://phabricator.wikimedia.org/T148714#3977133 (10awight) Argh, attached to the wrong bug, please ignore. [20:58:24] (03PS1) 10Ottomata: Clean up the normalize function and add a new makeNullable function [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/411088 [20:59:33] (03Abandoned) 10Ottomata: Clean up the normalize function and add a new makeNullable function [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/411088 (owner: 10Ottomata) [20:59:46] ottomata: have a minute for monitoring ? [20:59:57] (03PS1) 10Ottomata: Clean up the normalize function and add a new makeNullable function [analytics/refinery/source] (jsonrefine) - 10https://gerrit.wikimedia.org/r/411090 [21:03:39] (03PS1) 10Ottomata: Clean up the normalize function and add a new makeNullable function [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/411092 [21:03:42] joa yes [21:03:45] joal [21:03:54] in bc [21:03:58] cave ! [21:12:12] (03CR) 10jerkins-bot: [V: 04-1] Clean up the normalize function and add a new makeNullable function [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/411092 (owner: 10Ottomata) [21:16:08] -1? it says +2! 
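The memoryOverhead exchange above hinges on how YARN sizes Spark containers: the container gets `spark.executor.memory` (JVM heap) plus the memory overhead, and off-heap allocations — here, C++ code loading 10G files — must fit in the overhead, not the heap. Illustrative arithmetic, assuming Spark's documented default overhead of max(384 MB, 10% of executor memory):

```python
# Illustrative container-sizing arithmetic for the memoryOverhead discussion
# above. YARN allocates executor_memory + memory_overhead per container;
# native (non-JVM) memory lives in the overhead portion.

def container_size_mb(executor_memory_mb, memory_overhead_mb=None):
    """Approximate YARN container size for one Spark executor."""
    if memory_overhead_mb is None:
        # Spark's default: max(384 MB, 10% of executor memory).
        memory_overhead_mb = max(384, executor_memory_mb // 10)
    return executor_memory_mb + memory_overhead_mb

# Default overhead on an 8 GiB executor:
print(container_size_mb(8192))          # 9011
# "Super-heavy" explicit overhead to hold off-heap C++ buffers:
print(container_size_mb(2048, 12288))   # 14336
```

This is why ebernhardson's job looks inverted (small heap, huge overhead): the JVM is only a shim around native memory.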
https://gerrit.wikimedia.org/r/#/c/411090/ [21:16:19] oh that was the master bad one [21:16:26] (03Abandoned) 10Ottomata: Clean up the normalize function and add a new makeNullable function [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/411092 (owner: 10Ottomata) [21:22:24] (03PS2) 10Ottomata: Clean up the normalize function and add a new makeNullable function [analytics/refinery/source] (jsonrefine) - 10https://gerrit.wikimedia.org/r/411090 [21:22:57] 10Analytics-EventLogging, 10Analytics-Kanban: Monitor and alert if no new data from JsonRefine jobs - https://phabricator.wikimedia.org/T186602#3977189 (10JAllemandou) After more thoughts, looks like the current need only needs to cron-check tha new data flows in regularly and email if not. Accumulators and re... [21:26:36] (03CR) 10jerkins-bot: [V: 04-1] Clean up the normalize function and add a new makeNullable function [analytics/refinery/source] (jsonrefine) - 10https://gerrit.wikimedia.org/r/411090 (owner: 10Ottomata) [21:29:49] 10Analytics-Kanban: Provide unqiues estimate/offset breakdowns in AQS - https://phabricator.wikimedia.org/T164593#3977199 (10Nuria) [21:29:51] 10Analytics: Provide uniques offset/underestimate breakdowns in AQS - https://phabricator.wikimedia.org/T164596#3977201 (10Nuria) [21:36:01] (03PS3) 10Ottomata: Clean up the normalize function and add a new makeNullable function [analytics/refinery/source] (jsonrefine) - 10https://gerrit.wikimedia.org/r/411090 [22:09:49] (03PS2) 10Ottomata: [WIP] Refactor JsonRefine to use DataFrame converter [analytics/refinery/source] (jsonrefine) - 10https://gerrit.wikimedia.org/r/410942 [22:17:42] (03PS3) 10Ottomata: Refactor JsonRefine to use DataFrame converter [analytics/refinery/source] (jsonrefine) - 10https://gerrit.wikimedia.org/r/410942 [22:32:30] 10Analytics, 10Operations, 10hardware-requests: Refresh or replace oxygen - https://phabricator.wikimedia.org/T181264#3977328 (10faidon) a:05faidon>03RobH OK, let's do this, approved. 
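The T186602 comment above concludes that JsonRefine monitoring only needs a cron job checking that new data keeps landing, emailing if not, rather than Spark accumulators. A hypothetical sketch of such a freshness check — the path layout, threshold, and function name are all assumptions for illustration, not the eventual implementation:

```python
# Hypothetical freshness check in the spirit of T186602's "cron-check that
# new data flows in regularly and email if not". Paths and thresholds are
# illustrative; the real datasets live in HDFS, not the local filesystem.

import os
import time

def is_stale(dataset_path, max_age_hours=6, now=None):
    """True if the newest entry under dataset_path is older than the
    threshold (or the dataset has no entries at all)."""
    now = time.time() if now is None else now
    entries = [os.path.join(dataset_path, e) for e in os.listdir(dataset_path)]
    if not entries:
        return True
    newest = max(os.path.getmtime(e) for e in entries)
    return (now - newest) > max_age_hours * 3600
```

A cron wrapper would call this per refined dataset and mail the team on `True`; the same shape works against HDFS via a client library or `hdfs dfs -ls` parsing.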
It's spinning rust which is unfortunate, but with 64GB of RAM we could probably fit most of the dataset in the page cache... [22:53:04] hm... any reason that I might not be getting any offsets from eqiad.mediawiki.page-delete topic when calling offsetsForTimes? [22:53:30] also, codfw.* topics seem to have only random old events [22:53:36] ottomata: ^ is this normai? [22:53:48] I am talking about production cluster [22:56:08] SMalyshev: codfw will have very few events, if any, because it is in the inactive datacenter [22:56:12] but eqiad should have events [22:56:30] * ebernhardson thought mirrormaker was being used to somehow unify the DC's [22:56:43] somehow page delete/undelete topics refuse to supply timestamp offsets to me [22:57:08] also all codfw topics - even though they should have _some_events [22:57:24] ah wait... what happens if all events are older than given timestamp? [22:57:52] ebernhardson: the analytics events (terabytes) are not moved across data centers [22:57:58] not sure SMalyshev, you know more about offsetsForTimes than others now :) [22:58:01] ebernhardson: events on main cluster are [22:58:15] SMalyshev: there are def eqiad page-delete messages in jumbo [22:58:18] so that's all working fine [22:58:21] heh ok I'm gonna hit the docs then :) [22:58:27] dunno why times not working for ya ya :) [22:58:36] ottomata: yeah the messages are there but for some reason I am not getting offsets I expected... [22:58:51] will check maybe I misunderstood how the API works [22:59:04] SMalyshev: but for your real use cases you will be consuming from jumbo or eqiad right? [22:59:51] nuria_: I would think jumbo from both eqiad.* and codfw.* topics? [23:00:45] SMalyshev: if i understand right all analytics events will be on eqiad topics right ottomata ? 
[23:01:15] hmm that would make it a bit easier though not much [23:01:29] but I do see some events on codfw too [23:04:50] nuria_: it doesn't cost me much to add both sets of topics, so probably I'd do that [23:05:29] in case some updates go to codfw or datacenters switch [23:06:24] SMalyshev: i just do not see why would that be needed i can see a few testing events in non-eqiad but that's it as -per design- we do not move analytics data across datacenters and all analytics infrastructure is in eqiad, ottomata knows best [23:06:45] SMalyshev: and there is no such a thing of "fallback" of analytics infrastructure [23:06:57] ok if ottomata says I don't need codfw events it's easy for me to ignore it. It's just a command-line switch [23:07:10] so let's hear what he says I guess :) [23:08:58] nuria_: SMalyshev is consuming replicated eventbus events [23:09:05] which originate in main kafka cluster [23:09:25] you need codfw events [23:09:34] if you don't, if a DC switchover happens, you wont' get anything :) [23:09:49] ottomata:i see, those are not webrequest right [23:09:51] that sounded like Mandarin to me but I got the last two lines :) [23:09:55] jajaj [23:10:11] lol [23:10:38] so I conclude I need both sets [23:10:51] INDEED [23:11:16] cool [23:13:48] hha yes [23:13:53] not webrequest nuria
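SMalyshev's hunch above ("what happens if all events are older than given timestamp?") is the answer: for each partition, `offsetsForTimes` returns the earliest offset whose message timestamp is at or after the target, and no offset at all (null in the Java client) when every message is older — which also explains the empty results on sparsely-used codfw topics. A toy model of that semantic, an illustration rather than the Kafka client itself:

```python
# Toy model of Kafka's offsetsForTimes semantics for one partition:
# return the first offset whose timestamp is >= the target, else None
# (the Java client returns null for that partition in this case).

def offset_for_time(messages, target_ts):
    """messages: list of (offset, timestamp) pairs in offset order."""
    for offset, ts in messages:
        if ts >= target_ts:
            return offset
    return None  # every event is older than the target -> no offset

msgs = [(0, 100), (1, 200), (2, 300)]
print(offset_for_time(msgs, 150))  # 1
print(offset_for_time(msgs, 400))  # None
```

So a topic whose newest event predates the requested timestamp legitimately yields no offset; the consumer has to fall back to the end offset (or skip the partition) rather than treat null as an error.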