[07:47:05] good morning to you oozie [07:47:12] always a pleasure to read emails from you [07:47:14] :/ [07:48:52] !log created 0038054-160922102909979-oozie-oozi-C to re-run webrequest-load-check_sequence_statistics-wf-upload-2016-10-20-3 (oozie errors) [07:49:40] !log created 0038058-160922102909979-oozie-oozi-C to re-run webrequest-load-check_sequence_statistics-wf-upload-2016-10-20-5 (oozie errors) [08:10:14] joal: morninggg [08:10:28] we need to reboot $everything [08:11:27] except our routers :-) [08:11:41] let's reboot them too! :P [08:12:18] moritzm: I am going to send an email also for the stat boxes [08:12:35] because the last time I broke several people's jobs [08:12:41] and they were not super happy [08:12:47] ok, thanks. currently in the process of installing fixed kernels on those [08:25:20] elukey: Hiiii !b [08:25:45] elukey: please rebott me as well, I'd feel better with some kernel updates ;) [08:26:40] hahahahah [08:26:57] so I am going to start with AQS and EventLogging [08:27:02] K [08:27:10] elukey: Can I help for anything? [08:27:10] then I'll send an email to engineering@ for stat boxes [08:27:37] joal: if you could keep an eye on metrics and let me know if you see fire it would be great [08:27:46] elukey: sure :) [08:27:48] or if you see me rebooting things in a weird order [08:27:49] :D [08:28:02] like "rebooting analytics100[12] [08:28:21] huhuhu [08:28:21] :) [08:28:33] ok elukey, will keep an eye open [08:28:47] elukey: or let's say, as much open as it can be ! [08:29:55] :D [08:37:20] joal: I suspended new-cassandra-bundle just to be sure.. all the coordinators are stopped, but I feel better with it stopped [08:37:38] elukey: cool ! [08:38:07] elukey: I'll also put a reminder to myself to restart this bundle from prod code at the beginning of the month [08:46:58] elukey: I need to trop off for about an hour [08:47:07] elukey: sorry :( [08:48:01] joal: it is super fine! I am going to do aqs and EL during the next 30 mins, the hadoop cluster is battle tested :D [08:48:27] this is the first time that we are rebooting the new aqs cluster so I am a bit paranoid [08:48:50] aqs1004 is up and running after the reboot, nodetool is happy [08:52:20] https://grafana.wikimedia.org/dashboard/db/aqs-elukey?from=now-3h&to=now - it seems that it didn't go super smooth [08:52:57] but yeah only a bit of impact, probably ongoing connections [08:53:08] even if I didn't see them with tcpdump [08:53:14] will wait a bit more for aqs1005 [08:54:18] ok while waiting I am going to reboot eventlog1001 [08:57:34] (CR) Hoo man: WikidataArticlePlaceholderMetrics also send search referral data (1 comment) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/305989 (https://phabricator.wikimedia.org/T142955) (owner: Addshore) [09:07:34] Eventlogging looks good [09:07:49] Going to proceed with aqs1005 [09:26:35] cassandra up on aqs1005, all good so far [09:31:33] moritzm: how soon should we restart stat100[234] ? I am preparing the email to send and I'd like to set a dealine (I am also trying to contact people owning screen/tmux sessions) [09:37:01] how about tomorrow morning? that should also give the SF staff a chance to react [09:44:57] elukey: Back ! [09:45:14] elukey: from the 24h graph I look at, aqs seems fine :) [09:45:33] moritzm: good to me, I just sent a personal email to all the tmux/screen session owner on stat100[234] [09:45:43] I'll also send an email to engineering [09:46:04] joal: super thanks! 
aqs1006 is the only one left, doing it now [09:46:48] elukey: p99 latency has a bump when you reboot, but that's really a big deal (as you said, probably ongoing connections [09:48:17] yes some timeouts or something similar cause this, we can't really avoid it [09:56:58] aqs1006 done [09:57:00] and pooled [09:57:22] awesome elukey :) Thanks for doing this ! [09:57:31] :) [09:57:39] now the fun begins, hadoop :) [09:57:43] hehe [09:58:07] Analytics: Some recent ExternalLinksChange data lost - https://phabricator.wikimedia.org/T146815#2730905 (Samwalton9) Open>Resolved a:Samwalton9 Just going to close this in favour of discussing ongoing issues at T115119, since it does seem like this is a direct schema issue. [09:58:56] elukey: we can try to prevent failing job by checking application masters [09:59:18] yep yep good idea! [09:59:23] I have suspended the bundles [09:59:23] cool [09:59:32] and will wait a bit before starting [09:59:40] going to prepare the email to announce the restarts [09:59:40] elukey: Have you stopped camus? [09:59:52] nope, will do it in a sec [09:59:56] ok :) [10:00:31] it is running atm, but I'll disable the cron in the meantime [10:00:44] elukey: That's the way to go :) [10:00:57] elukey: Recall we should do it via puppet to prevent the bug we had last time ;) [10:02:02] Analytics, MediaWiki-extensions-WikimediaEvents, The-Wikipedia-Library, Wikimedia-General-or-Unknown, and 2 others: Implement Schema:ExternalLinksChange - https://phabricator.wikimedia.org/T115119#2730912 (Samwalton9) Looks like we're running into some problems, but it's hard to pinpoint why. I'v... [10:02:09] so there are two ways [10:02:21] you can comment and then uncomment paying attention on the next puppet run [10:02:28] or just comment, wipe the crontab, run puppet [10:02:53] I am extra careful after that issue [10:02:57] :) [10:03:23] elukey: I completely trust you :) [10:04:48] !log stopped camus on an1027 and all the oozie bundles as prep step for the reboots of analytics* [10:22:39] mmm isn't wiki-research-l@lists.wikimedia.org the right email for research? [10:22:51] yes [10:23:00] hello! [10:23:02] hi :) [10:23:05] I got bounced [10:23:13] you need to be on the list [10:23:23] ahhh makes sense [10:23:30] https://lists.wikimedia.org/mailman/listinfo/wiki-research-l [10:23:53] (I think I just got added this year) [10:25:34] ok email sent [10:25:49] this time I am alerting people with screen sessions on stat boxes, engineering, analytics, research [10:26:03] should be enough :D [10:26:17] yes [10:26:50] I don't know if you really need to take the trouble to look for screen sessions, but that's very nice of you [10:27:29] elukey: 6 jobs runnin [10:27:41] yep I am watching Yarn [10:27:48] for the moment I have only sent emails :P [10:27:49] Do you want us to move or to wait a bit more ? (4 prod) [10:27:53] :) [10:28:09] I think that we can wait a bit [10:28:17] there's no real hurry [10:28:24] I need also to plan kafka reboots [10:28:45] that implies also the main clusters [10:28:48] mobrovac: --^ [10:30:05] kafka reboots for the kernel updates? [10:32:01] yeppa [10:34:34] when is that happening elukey? 
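    [note] The per-node sequence behind the AQS reboots above (aqs1004-1006) is roughly the following. This is a sketch rather than the exact commands used: the depool/pool step is assumed to be whatever conftool/LVS wrapper is standard on the host, and timing/ordering is simplified.
        # on each aqs100x node, one at a time
        depool                      # assumption: standard depool wrapper on the host
        nodetool drain              # flush memtables and stop accepting traffic cleanly
        sudo reboot
        # once the node is back up:
        nodetool status             # all nodes should report UN (Up/Normal)
        pool                        # repool only after Cassandra and AQS look healthy
    As noted in the chat, a small p99 latency bump from dropped in-flight connections is expected even with a clean drain.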
[10:34:52] mobrovac: whenever you prefer but asap [10:35:04] I pinged you to ask how/when you want to do it [10:35:09] ah i see [10:35:18] elukey: let's do it now, starting with codfw [10:35:25] sure [10:36:09] you restart both boxes, and i'll restart changeprop after that to be on the safe side [10:36:35] elukey: after you restart and make sure that kafka is up, we'll also need to ensure the eventbus proxy service is up and kicking [10:44:42] yep! [10:50:54] elukey: eta? [10:51:33] now :) [10:52:04] doing kafka2001 in 2 mins [11:01:21] done, proceeding with kafka2002 [11:06:20] mobrovac: main-codfw done [11:06:29] ok, restarting [11:16:06] starting Hadoop restarts! [11:16:11] elukey: cool [11:16:22] elukey: let me tell you which nodes not to touch yet please ;) [11:16:34] already checked app masters :) [11:16:44] ok, you're faster than I am :) [11:16:53] 1034/1037 and 104somethingthatIdon'tremember [11:19:39] elukey: oh you're not continuing with the main kafka eqiad cluster straightaway ? [11:19:58] if that's the case, i'm going to have lunch [11:20:22] mobrovac: let's do it this afternoon ok? [11:20:34] w4m [11:20:43] super, enjoy lunch :) [11:21:07] grazie! [11:25:01] elukey: 1034 and 1037 have finished, you can restart them :) [11:25:59] gooooood [11:26:25] TestEditHistoryRunner seems on 1034 though [11:26:29] elukey: but we're gonna have trouble with 1047: ellery's job is huge and not yet finsh [11:27:09] aouch, really ? [11:27:19] elukey: Thanks for having double Checked ! [11:27:38] I'll skip it, we can leave a couple of nodes behind [11:27:39] :) [11:27:41] elukey: Sparj tellws me an IP, not a name [11:32:15] super paranoid with journalnodes, the last time analytics1001 stopped :P [11:32:59] (CR) Joal: "One comment" (1 comment) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/316845 (https://phabricator.wikimedia.org/T130249) (owner: Nuria) [11:37:00] elukey: my job dies, you can restart 1034 ;) [11:39:28] hi a-team, are you firefighting? can I help? [11:40:06] mforns: hola! We are rebooting all the things for a kernel upgrade, nothing on fire [11:40:12] only boring :( [11:40:29] elukey is on fire, yes ! He restarts nodes since teh early morning ! [11:40:33] elukey, I though it was tomorrow [11:40:38] xD [11:40:39] Give the man a break ;) [11:41:06] (CR) Joal: "One comment in file." (1 comment) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/305989 (https://phabricator.wikimedia.org/T142955) (owner: Addshore) [11:41:32] mforns: :P :P :P only the stat nodes! [11:41:41] otherwise people will kill me [11:41:45] ah ok ok [11:41:58] (CR) Joal: [C: -1] "Same problem as usual: spaces instead of tabs." [analytics/refinery] - https://gerrit.wikimedia.org/r/316838 (https://phabricator.wikimedia.org/T130249) (owner: Nuria) [11:43:11] (CR) Joal: "@nuria: Waiting for the java code to be deployed in a new jar before being able to dpeloy that patch (with potential changes in jar number" [analytics/refinery] - https://gerrit.wikimedia.org/r/315241 (https://phabricator.wikimedia.org/T147841) (owner: Joal) [11:44:24] elukey: have you restarted 1034? [11:44:28] Can I relaunch my spark? [11:45:04] it is in progress, finishing in a bit :) [11:45:14] elukey: ok great, thanks :) [11:46:16] joal: done! [11:46:35] Thanks elukey :) [11:49:08] joal: Is ellery using Hive right? 
I was wondering if I could have rebooted an1003 but probably it will need to be postponed [11:49:28] elukey: You're right [11:50:11] maybe I can restart an1027 in the meantime [11:51:47] elukey: an1027 is camu machine, right? [11:52:04] yep! [11:54:55] ouch joal sorry, I didn't see that the spark job was on an1043 :( [11:55:24] elukey: that is life :) [11:55:41] elukey: test jobs, no big deal [11:56:53] new one is on 1043 so I will not destroy it anymore :) [12:10:59] ok except 100[12], 1047 and 1003 we are good [12:11:27] I'd be super happy to do 1003 and 1047 together before re-enabling the whole thing [12:14:38] ok so I am going to grab a bite very quickly and then we can decide what to do [12:15:05] we could think about re-enabling everything and then do 1003/1047 tomorrow [12:17:48] * elukey lunch! [12:24:25] mforns: Heya, are you somewhere around? [12:30:24] joal, hey! [12:30:28] what's up? [12:30:46] mforns: would you have a minute exchanging on some candidate? [12:30:54] joal, sure! [12:30:59] batcave [12:31:03] OMW [12:33:46] mforns: I'm fighting with the test setup, awful stuff, but I have a meeting with Erik in 30, sorry to leave you hanging with that refactor [12:35:05] milimetric, np, I've been doing task reviews [12:37:26] goooood ellery's job has finished [12:37:45] spawned another one on 1057 [12:38:12] proceeding with 1047 and 1003 then [12:43:42] milimetric, will be afk for lunch, see you in a bit [12:44:34] elukey: when back, ellery job is finished ! [12:44:48] you can restart :) [12:45:15] joal: already done 1003 and 1047 :) [12:45:26] everything seems good to me now [12:45:28] mind to double check? [12:45:32] 1057 is ok [12:45:43] elukey: I'm running spark, and things look ok [12:46:02] even hue is no complaining, now mysql on 1003 comes up nicely [12:47:03] elukey: Yay ! this is a complete success full reboot :) [12:47:22] elukey: less than a day, and nothing complains :) [12:48:55] elukey: And I can see that other jobs have not been restarted yet, full cluster for my own little tests ;) [12:51:22] \o/ [12:51:30] joal: shall we reboot also 100[12]? [12:51:37] I can surely do 1002 now [12:51:45] elukey: please go ahead with 2 [12:53:31] doing it now [12:57:09] completed [12:57:30] we can do 1001 tomorrow or now [12:57:43] joal: --^ [12:57:53] please go ahead [12:57:59] elukey: Better with everything done [12:58:14] elukey: jobs are not running, no risk , let's do it [12:58:31] yeah same thought [13:03:49] 1002 is the new master now [13:04:55] rebooting 1001 [13:09:23] Analytics-Visualization: Bug with sorting in some pages due to string sorting instead of numerical sorting - https://phabricator.wikimedia.org/T147749#2731345 (Samwalton9) Seems unrelated to TWL. [13:10:32] elukey: not yet rebooted I guess [13:10:33] all right forcing the failover again to 1001 [13:10:40] elukey: same timing: ) [13:13:18] elukey: Thanks mate, back on track :) [13:13:37] completed! [13:13:48] !log re-enabling oozie and camus after cluster reboot [13:14:20] elukey: arf, I can't keep all the computing resources for only myself ? :( [13:14:46] elukey: I really need to share with damn oozie boy? [13:14:48] :D :D :D [13:14:53] you know oozie [13:14:57] he complains otherwise! 
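    [note] The "check application masters" step mentioned above, i.e. finding which workers are hosting ApplicationMasters before rebooting them, can be approximated with the stock YARN CLI. A sketch; the awk/grep parsing is illustrative and the exact report fields depend on the Hadoop version:
        # list running applications, then find which worker hosts each ApplicationMaster
        yarn application -list -appStates RUNNING
        for app in $(yarn application -list -appStates RUNNING 2>/dev/null \
                     | awk '/^application_/ {print $1}'); do
            yarn application -status "$app" | grep -E 'Application-Id|AM Host'
        done
        # workers not hosting an AM are the safer ones to reboot first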
[13:15:06] He has already started [13:15:08] ;) [13:21:58] elukey: I guess you see my spark job not willing to release resources ;) [13:26:02] :P [14:03:24] joal: https://yarn.wikimedia.org/proxy/application_1476969128131_0063/ :) [14:03:42] elukey: This makes my day :) [14:04:03] it is still live hacking but I found the issue [14:04:14] * joal sends a huge cookie to elukey :) [14:07:27] elukey: what was it? [14:08:08] ottomata: o/ https://httpd.apache.org/docs/2.4/mod/mod_proxy_html.html#comment_3329 [14:08:34] I tried to make a tunnel from my localhost:8088 to stat1001 port 80 [14:08:43] and chrome failed for a decode error [14:09:05] and then I remembered about the comment [14:10:38] hM! [14:10:51] elukey: also, thanks for reboots, lemme know if you need any help with those [14:11:51] ottomata: first time that a Hadoop reboot goes smoothly, I am writing it in my calendar :) [14:12:08] I left out kafka's main-eqiad and analytics [14:12:26] but I can do main-eqiad tomorrow with Marko [14:12:43] ok [14:13:08] so if you have time to do kafka analytics during your daytime I'll complete the work tomorrow morning CEST [14:13:19] ok, i should have time today [14:13:22] ah also you may want to check kafka2001 [14:13:24] i can start now even [14:13:25] oh? [14:13:31] because mirror maker stopped again.. [14:13:38] hm on 2001 [14:13:38] hm [14:13:48] did you just restart it? [14:13:51] (or puppet?) [14:13:52] yep! [14:13:54] hm ok [14:13:56] I did manually [14:13:59] ottomata: new kernels already installed on kafka*, only needs the rolling reboot [14:14:08] ok great moritzm thanks [14:20:24] !log starting rolling restart of analytics-eqiad kafka brokers to apply kernel update [14:21:25] elukey: ? do these graphs look weird to you? [14:21:26] https://grafana.wikimedia.org/dashboard/db/kafka?from=1476851936873&to=1476973229471 [14:22:21] woa [14:22:24] yes [14:22:29] jmxtrans weirdness? [14:22:32] i guess so [14:22:33] but [14:22:35] i dunno [14:22:49] beacuse if you hover over, you see brokers in the tooltip that aren't in the graph [14:22:55] so probably grafana weirdness? [14:23:13] but kafka1018 is not there [14:23:16] mmmmm [14:23:37] I recall that Riccardo showed an icinga alerts yesterday about missing datapoints for kafka1018 [14:23:38] true [14:23:44] but I forgot to follow up [14:24:03] hm [14:24:04] [20 Oct 2016 14:23:51] [ServerScheduler_Worker-8] 761746130 ERROR (com.googlecode.jmxtrans.jobs.ServerJob:41) - Error [14:24:04] java.nio.BufferOverflowException [14:24:09] kicking jmxtrans [14:25:26] ahhhh snap [14:25:40] joal: fix is now permanent, enjoy spark shells :) [14:25:54] YAY ! [14:26:00] :) [14:26:20] Analytics-Kanban: Make yarn.wikimedia.org correctly proxy to Spark UI - https://phabricator.wikimedia.org/T147927#2731520 (elukey) [14:27:37] ok well, elukey i'm proceeding with broker restarts, starting at 1012 [14:27:41] btw, the quarterly review meeting is in 30 minutes, the meeting is on the WMF staff calendar and I have the bluejeans link if you need (not sure it's ok to paste) [14:27:45] jmxtrans looks ok after restart [14:27:47] on 1018 [14:27:49] OH [14:27:57] ya [14:27:59] ok [14:28:04] so no standup today? [14:28:53] yes, no standup [14:29:25] nuria: do you want a pivot screenshot of the Chrome 41 bug? 
I find that one's fun to explain [14:32:33] just in case you do: [14:32:33] https://pivot.wikimedia.org/#pageviews-daily/line-chart/2/EQUQLgxg9AqgKgYWAGgN7APYAdgC5gQAWAhgJYB2KwApgB5YBO1Azs6RpbutnsEwGZVyxALbVeAfQlhSY4AF9kwYhBkdmeANroVazsApU6jFmw55uOfABtSYag2LWqANycBXcV2DMwxBmC8AEwADACMAGwAtCHRYQAscCEhuMmpIQB0ySEAWkbkACbB4dEhAJwxYUkpaclZyXmKwGAAnlhewHAAkgCyIBIASgCCAHIA4iAKijqq7PrEhUb0TKxzFphWBCSGSsYrZpyWvAJCoh3uxBIARgwYAO7MDhL8oqTWLQp [14:32:33] Kumve3+b4GBcDmsxBwu2Wph+RxsdgcTlcHi86EeYDgbQ6AGU4AMuuMjNZqGJyGANLhNMAEIRbnIALpNVrtXgYkBwKbyeR05DaGgQ1b/aF8aiCJTCOT4C7XW4PJ4iYgAKwwDE+PkVYCGs35yg1+mYqqWJj5hw2xyFpzFwBcpGodwkEAw7mJyoKpCY2t4BRYEGohQoAHMpmgeQaDusePgTiKzpJpLJxE0ru4IABrahqt2/dPNWMAIUTKcCSgK7kcel4AAUwgARZW6gLq0sZhsqgL6/ZQ43h02R81SGRyJrO11N4jML0+8j+9k05DkdzWaxKS3W232x1ci1Wm12h2Baeaaez+eB51EkPiy43e6P [14:32:34] BgSWUKpVKWwiOx4ACs8iAA [14:32:37] ew!!! [14:32:38] lol [14:32:56] https://goo.gl/8rA9to [14:37:06] thanks milimetric [14:37:30] milimetric, oh! I though this was the blue jeans link, hehehe [14:37:36] *thought [14:37:54] https://bluejeans.com/569999548/ [14:38:02] thx! :] [14:43:28] milimetric: what about tasking ? Are we skipping it as well? [14:43:33] or only standup? [14:47:08] elukey, I guess when qr ends, we'll fallback to tasking. nuria had the idea of using the first part of it to talk about how to improve the tasking meeting, but I may be wrong? [14:51:42] there is no wrong :) [14:51:59] we can resume tasking after the qr if everyone's ok with it [15:00:56] a-team: I am in batcave ... standup? [15:01:09] nuria: QR? [15:01:54] joal: our QR is at 10 , ops one right? [15:02:16] Oooh, right [15:02:36] ohhh [15:02:37] so standup yes? [15:02:43] ok joining xstandup [15:02:52] ok [15:03:21] milimetric: seems that current QR meeting is not ours [15:03:31] er? [15:03:38] nuria: are you still in batcave? [15:03:42] a-team: I sent you invites to our quaterly, did you get them, it is at 10 am PST [15:03:45] oooh yeah!! [15:03:48] it's at 1pm [15:04:00] cc ottomata elukey joal mforns [15:04:16] nuria I'm in it, thanks [15:04:37] mforns: it's the next one https://bluejeans.com/411326803/ [15:04:42] a-team: Let's do standup [15:04:50] nuria, oh! [15:04:52] ok [15:05:21] trying to join [15:05:58] need to reboot chrome, argh [15:06:13] mforns: you comin to standup? [15:09:46] Analytics, Commons, Multimedia, Tabular-Data, and 4 others: Review shared data namespace (tabular data) implementation - https://phabricator.wikimedia.org/T134426#2731622 (Yurik) [15:15:50] gah, elukey looks like disks got swapped aroudn on 1020 after reboot [15:15:51] doh! [15:16:00] argh [15:16:05] will do the uuid thing [15:39:03] milimetric, nuria, I see some kafka-related errors in eventloging:/srv/log/upstart/eventlogging_processor-client-side.00.log http://pastebin.com/fbL1Cy3W [15:39:38] cc ottomata thus the low throughput, maybe eventlogging is going to need a re-start too? [15:42:01] I can do that, won't hurt [15:42:30] hm, those are normal errors though [15:42:40] it should do that during every restart for a sec [15:42:46] can't hurt though [15:42:49] to restart el [15:42:53] i've still got a few brokers to reboot [15:42:56] ottomata: no leader for partition shouldn't happen no? 
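    [note] A quick way to confirm the analytics brokers are healthy again after each restart (before moving on to the next one) is to look for under-replicated partitions. This is a sketch using the stock Kafka CLI tools with a placeholder ZooKeeper connect string, not necessarily the wrapper used in production:
        # should print nothing once every partition has its full replica set in sync
        kafka-topics.sh --describe --zookeeper <zookeeper-connect-string> \
            --under-replicated-partitions

        # the consumer-side errors seen in the eventlogging processor log should also stop
        tail -f /srv/log/upstart/eventlogging_processor-client-side.00.log | grep -i NotLeaderForPartition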
[15:43:06] 'NotLeaderForPartitionError' [15:43:08] not 'no' [15:43:08] :) [15:43:13] ahhhhh sorry [15:43:15] :) [15:43:40] !log restarted EventLogging after throughput drop [16:00:19] Analytics-Kanban, Operations, Performance-Team, Reading-Admin, Traffic: Preliminary Design document for A/B testing - https://phabricator.wikimedia.org/T143694#2731863 (dr0ptp4kt) [16:06:18] lzia: what do I need to do to make the tool labs survey form live before I send out the emails? [16:10:26] Analytics, Analytics-Cluster: Audit fstabs on Kafka and Hadoop nodes to use UUIDs instead of /dev paths - https://phabricator.wikimedia.org/T147879#2731893 (Ottomata) Did some Kafka reboots today and ran into the issue where /dev numbers are rearranged after reboot. I had to manually use UUIDs in fstab... [16:15:32] Analytics, Commons, Multimedia, Tabular-Data, and 3 others: Allow structured datasets on a central repository (CSV, TSV, JSON, GeoJSON, XML, ...) - https://phabricator.wikimedia.org/T120452#2731897 (Yurik) [16:17:17] done with reboots, gonna kcik el one last time [16:17:37] !log restarting eventlogging after rebooting kafka brokers [16:28:47] (CR) Nuria: Enhancing regex to support pageviews to non-knowledge wikis (1 comment) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/316845 (https://phabricator.wikimedia.org/T130249) (owner: Nuria) [16:47:50] ottomata, EL looks good in general, but it seems that the kafka restarts affected the sending of the metrics to grafana [16:49:08] and restarting EL didn't fix it [16:53:12] mfyeah something is weird with the kafka dash metrics too [16:53:13] checking jmxtrans [16:57:09] hm yeah jmxtrans is having problems [17:01:18] vk seems fine now [17:05:47] hi bd808. let me check it. [17:06:30] lzia: thanks. I'm waiting on advice from ops-l on sending the emails. [17:06:39] but I hope to get them out today [17:06:49] !log created 0000294-161020124223818-oozie-oozi-C to re-run webrequest-load-check_sequence_statistics-wf-upload-2016-10-20-13 (oozie errors) [17:08:13] bd808: if you're using old Google Forms, you should o to Responses tab and then click on Not accepting responses [17:08:24] I just clicked on that and now it says Accepting responses, bd808 [17:08:56] bd808: if you want, you can also submit a test response just to be sure, but from what I can see, it's on now. :) [17:29:46] lzia: I filled in a test response (well actually a real response but from an insider). I don't have access to the speadsheet to see if it looks right there. [17:32:25] gave you access bd808. sorry. [17:32:36] your response is registered bd808. [17:32:53] lzia: no worries on the access. I never needed it before :) [17:33:57] * bd808 updates the link in the email template to point to the right form [17:35:44] a-team: we might just had an outage with kafka, connections were dropped [17:35:52] ok [17:35:55] :( [17:35:58] we are investigating it but if you see jobs failing this might be the issue [17:35:59] need help? 
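    [note] The fstab audit described in T147879 above comes down to replacing /dev paths with filesystem UUIDs so entries survive device renumbering across reboots. A minimal sketch for one entry; the device name and pattern are illustrative:
        # look up the UUID of the partition currently mounted from /dev/sdd1
        UUID=$(sudo blkid -s UUID -o value /dev/sdd1)
        # swap the /dev path for the UUID in /etc/fstab, keeping a backup
        sudo sed -i.bak "s|^/dev/sdd1[[:space:]]|UUID=${UUID} |" /etc/fstab
        # sanity-check before the next reboot
        grep "$UUID" /etc/fstab && sudo mount -a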
[17:36:07] yea [17:36:58] no no now it is recovered [17:37:03] or at least it seems [17:37:12] oh cool [17:39:40] * milimetric appreciates the kafka whisperers very much [17:48:20] team going afk for a bit, I am a bit tired from today :D [17:48:34] will re-join in an hour to check if everything is ok [17:57:51] Analytics-Kanban: Tech Talk: Pivot - https://phabricator.wikimedia.org/T148776#2732347 (Milimetric) [17:58:31] Analytics, MediaWiki-API, Reading-Infrastructure-Team: Add pageview stats to the action API - https://phabricator.wikimedia.org/T144865#2612941 (Tgr) [18:18:17] mforns: I'm taking care of the 2 oozie jobs not yet relaunched [18:18:41] joal, oh can I watch? [18:18:49] mforns: you pay ? [18:18:51] :-P [18:18:59] mforns: sorru , too easy [18:19:06] mforns: sure, let's do it together [18:19:17] xDDD [18:19:17] batcave [18:19:20] omw [18:19:51] mforns: I'm very much puzzled by the karma setup, I'm going to take some time off now, will be back to it later but you might be gone by then [18:19:58] we can catch up tomorrow if so, I'll ping here when I'm back. [18:20:27] milimetric, ok I'll be here for a while, but if we do not meet afterwards, lets do it tomorrow [18:32:02] mforns: select * from webrequest_sequence_stats_hourly WHERE webrequest_source = 'upload' and year = 2016 and month = 10 and day = 20 and hour IN (13, 14, 15, 16) ; [18:32:50] (CR) Jonas Kress (WMDE): [C: 1] Use ^ and $ while spliting metric value and type [analytics/statsv] - https://gerrit.wikimedia.org/r/308959 (owner: Addshore) [18:38:31] !log created 0000390-161020124223818-oozie-oozi-C to re-run webrequest-load-check_sequence_statistics-wf-upload-2016-10-20-14&15 (oozie errors) [19:01:35] Analytics-Kanban, EventBus, Wikimedia-Stream, Services (watching): Prepare eventstreams (with KafkaSSE) for deployment - https://phabricator.wikimedia.org/T148779#2732491 (Ottomata) [19:04:37] ottomata: all good? [19:04:57] Hi elukey, we restarted a couple job with mforns :) [19:05:11] Logging off for tonight a-team :) [19:05:20] bye joal! [19:05:47] nice thank you guys! I was about to do them :) [19:08:00] all seems good (kafka, vk, etc..) [19:08:02] so logging off [19:08:06] elukey: yeah [19:08:07] all seems good [19:08:09] thanks [19:08:10] ttyt [19:08:14] hola! [19:08:20] I thought you were akf [19:08:22] *afk [19:08:32] super, ttyt :) [19:08:54] got back! [19:08:58] no longer afk! [19:09:00] I am [19:09:07] otk [19:09:10] (on the keyboard?) [19:09:11] hehe [19:09:13] ok laters! [19:19:13] (PS3) Nuria: Adding several wikis to Pageview whitelist [analytics/refinery] - https://gerrit.wikimedia.org/r/316838 (https://phabricator.wikimedia.org/T130249) [19:20:09] (CR) Nuria: "Sorry, again, corrected spaces to be tabs." [analytics/refinery] - https://gerrit.wikimedia.org/r/316838 (https://phabricator.wikimedia.org/T130249) (owner: Nuria) [19:28:42] (PS4) Nuria: Adding several wikis to Pageview whitelist [analytics/refinery] - https://gerrit.wikimedia.org/r/316838 (https://phabricator.wikimedia.org/T130249) [19:44:18] Analytics-Kanban, Patch-For-Review: Count pageviews for all wikis/systems behind varnish - https://phabricator.wikimedia.org/T130249#2732600 (Nuria) a:Nuria [19:54:07] Analytics-Kanban, EventBus, Wikimedia-Stream, Services (watching): Prepare eventstreams (with KafkaSSE) for deployment - https://phabricator.wikimedia.org/T148779#2732491 (mobrovac) > - create mediawiki/services/eventstreams/deploy repo with scap config and node_modules (figure out how services b... 
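    [note] The query joal pasted above is the usual check that the flagged hours are really affected before re-running the coordinator. Together with the oozie CLI it looks roughly like this, assuming the CLI is pointed at the right server (OOZIE_URL or -oozie) and that the table lives in the wmf database; the coordinator id is the one from the !log line:
        # status of the re-run coordinator created above
        oozie job -info 0000390-161020124223818-oozie-oozi-C

        # sequence statistics for the affected hours (same query as above)
        hive --database wmf -e "
          SELECT * FROM webrequest_sequence_stats_hourly
          WHERE webrequest_source = 'upload'
            AND year = 2016 AND month = 10 AND day = 20
            AND hour IN (13, 14, 15, 16);"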
[19:56:11] Analytics-Kanban, EventBus, Wikimedia-Stream, Services (watching): Prepare eventstreams (with KafkaSSE) for deployment - https://phabricator.wikimedia.org/T148779#2732491 (Pchelolo) >>! In T148779#2732642, @mobrovac wrote: >> - create mediawiki/services/eventstreams/deploy repo with scap config a... [19:57:57] Analytics-Kanban, EventBus, Wikimedia-Stream, Services (watching), User-mobrovac: Public Event Streams - https://phabricator.wikimedia.org/T130651#2732665 (Ottomata) Thanks for the feedback everyone. At this time, we are moving forward with SSE. We can always revisit possible websocket supp... [20:01:23] (PS2) Nuria: Enhancing regex to support pageviews to non-knowledge wikis [analytics/refinery/source] - https://gerrit.wikimedia.org/r/316845 (https://phabricator.wikimedia.org/T130249) [20:02:12] (CR) Nuria: Enhancing regex to support pageviews to non-knowledge wikis (1 comment) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/316845 (https://phabricator.wikimedia.org/T130249) (owner: Nuria) [20:08:06] hi, what's the command to do dynamic listening of the hive-destined event pipe? e.g. if i want to see requests to the scb cluster / graphoid? [20:08:20] Analytics-Kanban, Mobile-Content-Service, Wikipedia-Android-App-Backlog, Spike: [Feed] Establish criteria for blacklisting likely bot-inflated most-read articles - https://phabricator.wikimedia.org/T143990#2585510 (JMinor) Has anyone actually looked at using the community developed criteria I hav... [20:08:26] milimetric or ottomata ^^ ? [20:09:09] yurik: not sure what you are asking [20:09:18] you want to see live logs? [20:09:22] ottomata, yep [20:09:24] webrequest logs as they come in? [20:09:28] live events as they come [20:09:33] yep, to the scb cluster [20:09:46] hmm, there's nothing to get you just scb cluster [20:09:47] i'm trying to figure out what is causing all the extra traffic to the graphoid [20:09:48] but you can grep [20:09:52] yep [20:09:56] what cache cluster is it? [20:09:59] misc? [20:10:01] yep [20:10:36] i know there is a magic command, but i couldn't find it [20:10:39] kafkcat -C -b kafka1012.eqiad.wmnet:9092 -t webrequest_text [20:10:45] ottomata, on stat1002? [20:10:46] oh [20:10:47] do misc [20:10:48] yes [20:10:54] do misc what? [20:10:54] kafkcat -C -b kafka1012.eqiad.wmnet:9092 -t webrequest_misc [20:10:57] not text [20:10:58] ah [20:11:01] excellent, thanks! [20:11:14] ack [20:11:15] kafkacat [20:11:19] kafkacat -C -b kafka1012.eqiad.wmnet:9092 -t webrequest_misc [20:11:23] typos, sorry [20:12:11] thanks :) [20:12:33] sho thang [20:15:42] ottomata, sorry to bug again - i'm trying to catch these items, and i don't see them --- https://www.mediawiki.org/api/rest_v1/page/graph/png/Extension%3AGraph%2FDemo/2221877/f6ba41370b2fd4e44cece2d4b5b6720795f8dbdb.png [20:15:58] yurik: are you grepping for just that? [20:16:08] the uri parts are split up into different fields [20:16:18] ottomata, i'm grepping for -E '[0-9]\.png' [20:16:25] aye [20:16:30] hm [20:16:35] i dunno which cache cluster rest api goes to [20:16:36] Analytics-Kanban, Mobile-Content-Service, Wikipedia-Android-App-Backlog, Spike: [Feed] Establish criteria for blacklisting likely bot-inflated most-read articles - https://phabricator.wikimedia.org/T143990#2732755 (Mholloway) @JMinor Where are you finding info on mobile vs. desktop views? Are yo... [20:16:38] i doubt its misc [20:16:45] i thought it was ... [20:16:50] maybe it is text now, let me try [20:16:50] mobrovac: ^ do you know? 
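    [note] To narrow the kafkacat stream down to the graphoid requests being chased below, the consumer command above can be piped through grep, or through jq if it is available on the host. A sketch, assuming webrequest messages are JSON with dt/uri_host/uri_path fields:
        # tail the text topic from the current end and keep only graphoid PNG requests
        kafkacat -C -b kafka1012.eqiad.wmnet:9092 -t webrequest_text -o end -q \
          | grep -E '/graph/png/.*[a-f0-9]\.png'

        # or, with structured filtering:
        kafkacat -C -b kafka1012.eqiad.wmnet:9092 -t webrequest_text -o end -q \
          | jq -r 'select(.uri_path | test("/graph/png/")) | .dt + " " + .uri_host + .uri_path'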
[20:17:50] yep, i'm getting it [20:17:53] thanks, it is text [20:17:55] should be in text [20:18:24] thanks [20:20:56] Analytics-Kanban, Mobile-Content-Service, Wikipedia-Android-App-Backlog, Spike: [Feed] Establish criteria for blacklisting likely bot-inflated most-read articles - https://phabricator.wikimedia.org/T143990#2732814 (JMinor) I've been using the pageviews web tool to check things that look suspcious... [20:28:41] yurik: not sure if you saw this as well but [0-9]\.png wouldn't match the ".....b.png" uri you had above [20:29:07] probably want like [a-z0-9]\.png or something [20:29:10] milimetric, yeah, already fixed that, but it was the text cluster that was the culprit :) [20:29:17] k [20:29:18] and no, its a-f0-9 :) [20:29:57] Analytics-Kanban, Mobile-Content-Service, Wikipedia-Android-App-Backlog, Spike: [Feed] Establish criteria for blacklisting likely bot-inflated most-read articles - https://phabricator.wikimedia.org/T143990#2732861 (Nuria) >This is a simple metric which looks at the ratio of mobile vs. desktop vie... [20:29:58] milimetric, a question for you though - do we have any good way of copying hive -> grafana (summarized of course) ? [20:30:10] e.g. where we would provide queries, and all the plumbing is already done? [20:30:12] yurik: like, preiodically? [20:30:17] our analytics team is looking at it [20:30:17] yep [20:30:21] yurik: we do that with spark [20:30:23] yep [20:30:25] yurik: not hive [20:30:38] the request data - whatever the storage :) [20:30:39] nuria: it's ok with hive, and probably easier with reportupdater [20:31:01] it's not real time you want, right yurik? You'd want like daily... weekly? [20:31:10] milimetric: with report updater you can easily get the results from hive, sure, send them to graphana is an additional step [20:31:35] milimetric, well, my interest is somewhat in between - e.g. minute/hourly [20:31:48] also, it would be great to be able to backfill stuff [20:31:48] oh ok, hm [20:32:05] yurik: maybe the best thing would be an hourly oozie job? [20:32:13] i'm adding this to the list: https://etherpad.wikimedia.org/p/stream-processing [20:32:16] because request data is only available hourly, so you won't get lower than that [20:32:17] yurik: what type of data, you know that we do not have real time anything [20:32:20] ottomata: yep, +1 [20:32:36] yurik: and there is no guarantee the last hour is processed [20:32:39] yurik: what kind of thing do you want to graph? [20:32:40] Analytics-Kanban, Mobile-Content-Service, Wikipedia-Android-App-Backlog, Spike: [Feed] Establish criteria for blacklisting likely bot-inflated most-read articles - https://phabricator.wikimedia.org/T143990#2732864 (Mholloway) Thanks, @JMinor. Our main concern behind this task is that, practicall... [20:32:47] milimetric, nuria, we have a script that runs hourly on stat1002 to query tons of tables for select count() ..., and upload it to grafana [20:33:01] but it would be great to combine it with some good data from webrequests [20:33:06] as well as eventlogging [20:33:10] What type of data? [20:33:15] requests [20:33:21] webreqs [20:33:32] and eventlogging data [20:33:47] ok, it might help if only one of us walks you through this [20:33:49] :) [20:33:50] basically - custom queries that our analytics would come up with, going into grafana [20:34:03] hehe, actually is there a doc page somewhere? 
[20:34:05] yep [20:34:13] i would point our analysts at it [20:34:22] not explaining this specific thing [20:34:29] but yes, there's lots of docs [20:34:47] I think the main message to get across is that webrequest is not made for real time, it is at least one hour behind and possibly more [20:35:02] bearloga, ^ [20:35:04] so the key is that data right now is ingested into the wmf.webrequest table "when an hour is ready" [20:35:11] nuria: yep, I got it [20:35:21] milimetric: jaja i know you know [20:35:43] so yurik / bearloga: that hour may be delayed if there are problems processing [20:35:45] milimetric, its ok for the data to be updated with a delay, as long as its resolution is an hour or more [20:35:52] so if you set up a cron with a hive script, you're not guaranteed to have proper data [20:36:07] yurik: ok, so then the best tool for this, since it's batch and not real-time is oozie [20:36:15] chelsyx, ^^ also [20:36:16] oozie will trigger jobs based on when hours are available [20:36:37] yurik / chelsyx / bearloga: so the question is, do you need data from the webrequest table or will pageview_hourly do? [20:36:48] milimetric, right, but do you have all the plumbing to push that data to grafana storage? [20:36:51] https://wikitech.wikimedia.org/wiki/Analytics/Data explains the difference [20:37:05] yurik: we have pieces, and oozie is the major piece [20:37:16] Yurik: right, I would be more interested to understand what is the end goal? What do we hope to learn from the data? [20:37:27] yurik: and based on that advice on best solution [20:37:37] also, we do a little bit of spark -> graphite stuff and use that to set data timestamp in graphite [20:37:40] oozie will execute arbitrary scripts and we have scripts that write stats to graphite [20:37:41] milimetric, pageviews wouldn't have the needed data - i want things like requests to Graphoid or kartotherian service, etc [20:37:43] so graphs that are delayed will be properly backfilled [20:37:49] but we don't do a lot of that, so it isn't very streamlined [20:37:54] i did this dashboard - https://grafana.wikimedia.org/dashboard/db/interactive-team-kpi [20:37:57] its code [20:38:33] it has some useful data for us, but not as much as we would like. So we need to add some more magical webrequest and eventlogging queries to feed into grafana [20:38:41] milimetric yurik nuria chelsyx: yeah, maps is interested in computing usage via webrequest and having that data available in grafana [20:39:02] bearloga: nothing thus far implies real time or even less resolution than daily [20:39:06] bearloga, its interactive, thank you very much :)))))) [20:39:27] right, yurik, why hourly? [20:39:35] yurik: https://github.com/wikimedia/analytics-refinery-source/blob/699614fabaf0d19f219c5b594a184422110ae8a3/refinery-job/src/main/scala/org/wikimedia/analytics/refinery/job/RESTBaseMetrics.scala#L80 [20:39:42] bearloga: usage is affected by daily patterns thus daily resolution seems that would be the lowest you want to go [20:39:51] nuria, it would be good to have higher res than daily, but hourly allows us to track service load times through the day [20:40:05] bearloga: load times is not a metrics of usage [20:40:12] yurik: but rather performance [20:40:17] yurik: but how do you get load time from webrequest? 
[20:40:20] yurik: those are two different use cases [20:40:29] i never said load time - i want server load [20:40:36] how many requests server gets through the day [20:40:41] hourly resolution is good enough [20:40:47] nuria: yurik I think he's talking about "time during the day that have higher usage" type of load [20:40:52] yep [20:40:54] thx :) [20:40:56] I see, so you match request count against server performance [20:41:03] that too [20:41:32] the idea here is not the specifics of the actual data, but the overall approach of how our analytics team can extract the data we need and pipe it into grafana, without reinventing the wheel :) [20:41:49] yurik: but actually teh specifics of teh case matter a lot [20:41:59] yurik: cause we have like 3 different ways to do that [20:42:06] ok, so there's no built-in for this, but it is something we talked about. I don't really want to make a one-off for everyone doing this kind of filtering. Because everyone would then touch every webrequest (and that's a lot of processing) [20:42:26] two basic data sources at this point being webrequests (but not page reqs - subpage resources might be needed), as well as eventlogging mysql tables [20:42:27] instead, I'd like a central place where people register a filter and an action to take with the matching data [20:42:31] nuria milimetric: yurik basically wants to have http://discovery.wmflabs.org/maps/#tiles_total_by_zoom (and other metrics that we calculate from our hive queries of webrequest data on a daily basis) but in grafana [20:43:30] right, but it might be "show me per-country distribution of the map usage data) [20:43:39] yurik: we'll figure out the general part, but you can pick which way you prefer: spark -> grafana or oozie -> custom script -> grafana [20:43:40] or - give me unique user count per country [20:44:27] milimetric, not sure of the difference. As long as we can provide a custom filter / query / where clause with aggregation, its good enough :) [20:44:27] yurik / bearloga: the spark way would be something like this: https://github.com/wikimedia/analytics-refinery-source/blob/699614fabaf0d19f219c5b594a184422110ae8a3/refinery-job/src/main/scala/org/wikimedia/analytics/refinery/job/RESTBaseMetrics.scala#L80 [20:44:32] yurik: most of that type of data can be calculated daily though, i see little value in that type of fine grained data being at a lower resolution. I can see # of requests but that's it [20:44:47] well oozie would probably be in there either way [20:44:53] oozie -> spark ->grafana [20:44:59] oozie -> custom thing -> grafana [20:45:00] right , i was going to say that [20:45:00] and the oozie way would be a workflow like this: https://github.com/wikimedia/analytics-refinery/blob/master/oozie/pageview/hourly/workflow.xml [20:45:03] s/grafana/graphite/ [20:45:19] ottomata: but isn't that filter already ooziefied? [20:45:20] milimetric ottomata: thanks for the links! [20:45:25] milimetric: no [20:45:31] how's it work? [20:45:49] the RESTBaseMetrics one [20:45:49] milimetric, yurik, bearloga an example that uses oozie to send graphana wikidatametrics: [20:46:08] yurik: https://github.com/wikimedia/analytics-refinery-source/blob/master/refinery-job/src/main/scala/org/wikimedia/analytics/refinery/job/WikidataArticlePlaceholderMetrics.scala [20:46:24] but again, that type of data sounds daily to me [20:46:28] bearloga, are you taking notes? 
i just hope my irc logs won't be lost :)( [20:46:39] it's all documented [20:46:51] yurik: IRC Cloud :) [20:47:01] yurik: https://wikitech.wikimedia.org/wiki/Analytics/Cluster/Spark [20:47:12] yurik: https://wikitech.wikimedia.org/wiki/Analytics/Cluster/Oozie [20:47:27] nuria, grafana would auto-aggregate data as it gets older, so if the original data comes from hourly slobs, it should be ok. What about eventlogging/ [20:47:39] nuria / ottomata: but shouldn't we combine all these filters? [20:47:41] is there a good solution for the eventlogging stream/mysql tables source? [20:47:51] yurik: but hitting teh cluster at that resolution when it is not needed should not be necessary [20:47:55] and have a class with a list of the ones the oozie job is applying? [20:48:10] yurik: , and oozie job that runs the wikidata article placeholder spark job [20:48:10] https://github.com/wikimedia/analytics-refinery/tree/master/oozie/wikidata/articleplaceholder_metrics [20:48:23] milimetric: whatchamean? [20:48:32] * yurik feels lost with the tech names :( [20:48:36] hahah [20:48:46] so, we have oozie set up to run each of these filters individually [20:48:54] yurik: https://wikitech.wikimedia.org/wiki/Analytics/Cluster#Glossary [20:49:06] and each one pulls all the data for an hour, filters it with a where clause, and sends it to grafana [20:49:08] ah, milimetric you mean you want to consolodate the oozie jobs [20:49:11] yep [20:49:14] and the scala [20:49:19] milimetric: i don't know much about the scala code [20:49:22] yurik: but also be aware of graphana's limitations. Showing per project data is not one of its strengths, on my opinion wikidata's graphanas dashboards are not readable: https://grafana.wikimedia.org/dashboard/db/article-placeholder [20:49:24] if it is that simple, then taht would make a lot of sense [20:49:42] ottomata: yeah, it should be pretty simple and then adding one of these would just be a copy/paste of another filter and only modify scala code [20:49:47] well, plus a -source deploy [20:50:12] nuria, grafana shows per project stuff just fine imo - its all about how you set it up. See the bottom graphs at https://grafana.wikimedia.org/dashboard/db/interactive-team-kpi [20:50:17] milimetric: sounds like you want a generic spark job that is paramaterizable: data source, fitlers, etc. and then graphite output keys? soethign like that [20:50:29] yep [20:50:40] (heheh sounds a little like reportupdater :p ) [20:50:44] yurik: if you have 10 project s sure, if you have 100, no , not really [20:51:01] yeah, but we agreed to not do webrequest or realtime stuff with reportupdater [20:51:07] nuria, i do have 100s, i just show top 10 :) [20:51:09] yeah [20:51:11] makes sense [20:51:45] and of course the aggregation of "top 10 + others" can be done by the query as well [20:52:05] yurik: ya, i meant if you want to see all is too much on that ui [20:52:13] yurik: but nevermind that [20:52:24] nuria: what do you think? I can make a task to clean this up and make a single place where we do these kinds of filters [20:52:28] nuria, sure, but again - its all about how you set up your dashboard (plus with an option to drill down works well) [20:52:30] oozie + spark -> grafana [20:53:05] all i care about is a place where i can give you a big sql/hive/... query to get extractly the timeseries i want :) [20:53:14] yurik: but so you'd have to still collect all data and filter out after it gets to grafana... 
seems somewhat wasteful if 95% is never looked at [20:53:32] that's why i need your guidelines :) [20:53:32] wow that glossary text is old [20:53:34] updating some [20:54:21] hm milimetric it soudns lke yurik's use cases might be more complicated than just applying a filter for an hour and counting [20:54:26] for many data series, we do want at least hourly resolution. And we can totally process it AFTER the hour or even AFTER the day - but the resolution of the grafana data should be at least per hour [20:54:28] milimetric: i agree [20:54:40] milimetric: with otto [20:54:43] yurik: are you talking about joining eventlogging + webrequest ? [20:54:51] ottomata, no [20:54:55] oh ok [20:54:57] he wants those separately [20:54:59] then maybe ok [20:55:02] the joining might happen on the dashboard [20:55:04] from webrequest he only wants a count [20:55:07] e.g. show two timeseries on the same graph [20:55:12] milimetric: i would let yurik try a bit at spark and consolidate once use cases become more clear [20:55:12] if you just want aggregation with filter and counting [20:55:24] that should be easy to make a generic spark job and manage it all in one oozie bundle [20:55:34] nuria: seems like a bad user experience though, we should be the ones consolidating that [20:55:35] nuria: not a bad idea [20:55:47] I give you that priority on this is lower than the other stuff this quarter [20:55:53] ottomata, example resulting timeseries: show me the per-country unique user count map usage [20:56:03] well, milimetric it might be good to get yurik into spark so he knows what he can do there [20:56:18] sure, I'm cool with that [20:56:31] ottomata, i think bearloga knows much more about data querying than i do by now :) [20:56:35] as well as chelsyx [20:56:36] milimetric: agreed, but doing this is significant work that gets in the way of other work we are set up to do and there is a not-sohard workarround [20:56:47] yurik: if you start banging your head into your desk, let us know and we'll attempt to prioritize cleaning this up [20:56:52] I would also expect our analysts to be able to write spark [20:57:11] nuria: maybe python-spark, but I think scala is unfair [20:57:19] nawwwwwwww [20:57:22] milimetric: scala is fun! [20:57:25] no [20:57:27] but ja, python is a better place to start [20:57:33] :) [20:57:38] bearloga: how do you fill about scala/spark? [20:57:39] scala is a bad language [20:57:41] my biggest concern - I do NOT want our analysts or devs to spend time writing custom grafana upload code, or cron job management, or job monitoring :) [20:57:52] hha [20:58:02] yurik: they do not have to do that [20:58:06] yurik: nope, we take care of that stuff, and the grafana code is super easy [20:58:08] yeah yurik we can help with all of that. 
at the very least you can copy what is in analyics/refinery [20:58:08] yurik: that is what oozie is for [20:58:13] so far seems like we have been doing that :) [20:58:23] see our dashboards - it has all that AFAIK [20:58:25] yurik: that's your fault for not asking us :) [20:58:28] and we can deploy another job [20:58:29] :-P [20:58:43] yurik: agree with milimetric , you brought it up on yourselves [20:58:43] milimetric: wants to do some abstraction to make adding more jobs like this easier in the future [20:58:45] which would be nice [20:58:48] milimetric, users not using your stuff is always your fault - its a usability issue :)))) [20:58:48] but we might not have it for now [20:59:09] yurik: agree with that , but our usability had improved 100% [20:59:15] yeah, oozie is not the easiest to use.... [20:59:16] haha [20:59:19] yurik: by the time those dashboards came to exist [20:59:21] yurik: it would be true if you tried to use it, failed, and told us [20:59:30] yurik: but nevermind that [20:59:33] haha [20:59:33] anyway, now we're talking :) [20:59:44] and the customer's always right [20:59:53] so like I say, try what andrew suggested and if it's too hard let us know [20:59:55] milimetric, not exactly - if our analysts weren't even aware of its existance, we have failed cross-team information sharing :) [21:00:02] oh yeah, +100 [21:00:10] cross-team info sharing is 'da worst [21:00:23] I think I did like 3 tech talks on dashiki and 0 people know it exists [21:01:03] yurik: but that is a wmf problem yurik, our team spends tons of time evangelizing so we hope that anyone interested in data access comes to talk to us before rolling out its own idea [21:01:20] yep. bearloga, chelsyx, ottomata, milimetric, nuria - how about a sync up half an hour hangout to go over the reqs and implementation and general presentations? this way discovery would present what we created, you show what you have (a short demo), and we get to be on the same page [21:01:29] yurik: it could be that our answer is that is real easy what you wnat to do or taht is real hard [21:01:31] possibly an hour [21:01:40] :) [21:02:00] i feel that without a proper demo of these techs, 5-way talking on IRC is only getting a bit more confusing :) [21:02:03] yurik: sure, anytime [21:02:13] esp when we also discuss how some of these techs should be improved [21:02:28] ok, need to logoff now. please set up something for next week [21:02:43] I'm happy to show you this existing workflow now if you want, or next week [21:02:48] +1 [21:02:52] you prob don't need me though :) [21:02:53] bearloga, chelsyx, i feel like it would be ideal for you two to drive it, as you would stand the most to gain from it? [21:02:57] or maybe you do [21:03:02] so i can evangelize stream processing [21:03:14] ottomata, exactly, we would love everyone to present their stuff [21:03:18] yurik chelsyx ottomata milimetric nuria: +1 to a healthy hangout of discussions and demos. -1 to blaming each other and saying "you failed" and "it's your fault" [21:03:25] i also think we might want a youtube recording of sorts [21:03:37] ottomata: do you mean you'd wanna talk about using flink here? Or the spark example you pasted? 
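    [note] Of the two paths sketched above (oozie -> spark -> graphite, or oozie -> custom script -> graphite), the custom-script path can be as small as the following hourly job. This is a hypothetical sketch, not existing refinery code: the metric key, the graphite host (placeholder) and the exact filter are assumptions to be adapted, while the table, partition fields and the graphite plaintext protocol are as normally used on the cluster.
        #!/bin/bash
        # hypothetical hourly job: count webrequests matching a filter, push to graphite
        YEAR=$1; MONTH=$2; DAY=$3; HOUR=$4

        COUNT=$(hive -S -e "
          SELECT COUNT(*) FROM wmf.webrequest
          WHERE webrequest_source = 'text'
            AND year=${YEAR} AND month=${MONTH} AND day=${DAY} AND hour=${HOUR}
            AND uri_path LIKE '/api/rest_v1/page/graph/png/%';")

        # graphite plaintext protocol: '<metric.key> <value> <unix timestamp>'
        TS=$(date -u -d "${YEAR}-${MONTH}-${DAY} ${HOUR}:00:00" +%s)
        echo "maps.graphoid.requests.hourly ${COUNT} ${TS}" | nc -w 3 <graphite-host> 2003
    Driving this from an oozie coordinator triggered when the hour's webrequest partition is marked done, rather than from a plain cron, is what guards against the "last hour may not be processed yet" problem discussed above.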
[21:03:44] bearloga, it wasn't blaming (other than myself :))) [21:04:01] bearloga: sorry if you lost context, I'm smiling and kidding all the time, that wasn't meant to be harsh in any way [21:04:11] I just assumed yurik knew I was messing around [21:04:16] i think we haven't done enough to cross-team communicate, but that's why we are talking here :) [21:04:35] yeah, i think i have known you all long enough to take anyone seriously :))) [21:04:48] haha [21:04:52] good, good [21:05:07] yep, bearloga / yurik: feel free to put something on my calendar next week, you can make ottomata and nuria optional [21:05:12] oki, how about this - next week, everyone does their own demo, we get in sync to what's needed, and reduce tech dups :) [21:05:21] yep [21:05:30] bearloga, chelsyx, how does that sound on your side? [21:06:02] * yurik still blames US politics for messing up this discussion [21:06:20] ottomata: where's the oozie job for RESTBaseMetrics? [21:06:27] lol [21:06:30] hahaah nonono [21:06:39] not time to evangelize flink [21:06:44] ok [21:06:46] phew [21:06:46] yurik: you do'nt want to here waht I say about stream processing [21:06:57] because we won't have it for like a year maybe [21:07:04] HM, but you can do spark streaming now [21:07:07] for analytics stuff [21:07:09] maybe! :D [21:07:09] haha [21:07:57] hmmmmm you know, we could, as a first deploy of flink, do this general filtering thing [21:08:30] consume webrequest from kafka, filter, produce a topic that can be consumed and sent straight to grafana [21:08:47] seems like a good base case [21:09:20] (and everyone would push different grafana metrics to the same topic) [21:09:39] milimetric: kinda like statsv :p [21:10:00] i mean...eventlogging can do that (just not at webrequest scale) [21:10:04] buut yayayay [21:10:35] milimetric: that sounds actually a little like statsd [21:10:40] using kafka as a buffer, but ja [21:10:41] something [21:11:00] consuming and filtering webrequests isn't cheap, but def doable [21:11:07] yurik: sounds good [21:11:27] ottomata, i would love to hear about it as part of your demo, so that everyone knows what to expect in a years time :) [21:12:23] bearloga, want to schedule it, or should i? [21:14:50] ottomata: I mean, not cheap sure, but we're doing it in it looks like 3 separate places already for this filtering thing and about to add a 4th. So replacing 4 with 1 is a win [21:17:35] ja [21:17:40] agree [23:04:23] Analytics-Kanban, Mobile-Content-Service, Wikipedia-Android-App-Backlog, Spike: [Feed] Establish criteria for blacklisting likely bot-inflated most-read articles - https://phabricator.wikimedia.org/T143990#2733377 (JMinor) Thanks. Yes, my whole push here is exactly to have a heuristic that is not... [23:12:28] Analytics-Kanban, Mobile-Content-Service, Wikipedia-Android-App-Backlog, Spike: [Feed] Establish criteria for blacklisting likely bot-inflated most-read articles - https://phabricator.wikimedia.org/T143990#2733395 (JMinor) >>! In T143990#2732861, @Nuria wrote: >>This is a simple metric which look... [23:36:21] Analytics, Reading-Web-Backlog: mobile-safari has very few internally-referred pageviews - https://phabricator.wikimedia.org/T148780#2733451 (Aklapper)