[06:47:48] Analytics, EventBus, Wikimedia-Stream, Patch-For-Review, Services (watching): Expose revision-create in EventStreams - https://phabricator.wikimedia.org/T167670#3385113 (akosiaris) Change merged, eventstreams service has been restarted and should be publishing the new stream
[06:57:33] hey nuria_. can you tell me when the quarterly review meeting is? I'm lost. ;)
[06:57:54] (none of us are invited and I'm thinking it should be soon)
[06:59:29] ooo, nuria_, nevermind. it's quite late. i found it in Dario's calendar.
[09:26:28] Analytics-Kanban, Patch-For-Review, User-Elukey: Make non-nullable columns in EL database nullable - https://phabricator.wikimedia.org/T167162#3385546 (elukey) >>! In T167162#3341610, @mforns wrote: > I think the idea and script are pretty good! > > If we want to reduce the number of alter statement...
[09:29:21] a-team: running errand for a bit (cat to the vet) + early lunch!
[09:29:24] * elukey afk!
[09:29:26] +
[11:03:56] helooooo
[11:04:56] o/
[11:05:05] mforns: Stage: 1 of 2 'Queried about 42675001 rows, Inserted about 42675000 rows' 53.4% of stage done
[11:05:08] :D :D :D
[11:05:38] elukey, hello! is that the alter tables?
[11:05:49] yeah :/
[11:05:54] EchoInteraction_5782287
[11:05:55] \o/
[11:06:03] I just read your comment in phab
[11:06:13] awesome
[11:06:35] sorry to haven't tested it before
[11:06:41] and you launched this a couple hours ago?
[11:07:06] np, I also thought this wouldn't be necessary
[11:10:27] mforns: yes ~1h30m afo
[11:10:30] *ago
[11:10:43] but I guess this is one of the biggest tables right?
[11:10:58] probably around 90M rows
[11:11:02] aha
[11:11:11] I think Edit has 600M :D
[11:11:37] nope
[11:11:38] MariaDB [log]> ALTER TABLE Edit_10676603 MODIFY COLUMN event_platform varbinary(191) NULL, MODIFY COLUMN `event_user.editCount` bigint NULL, MODIFY COLUMN `event_user.id` bigint NULL, MODIFY COLUMN event_version bigint NULL;
[11:11:42] Query OK, 12917440 rows affected (32 min 39.39 sec)
[11:11:59] (not sure what revision though)
[11:13:05] oh
[11:13:30] if there is a revision with 600M rows we are screwed :D
[11:13:40] it will take half a day to run
[11:13:44] ahhahaah
[11:14:24] Hi elukey and mforns :)
[11:14:31] hello!
[11:14:38] elukey: I have some info I want to share with you on slider/presto
[11:14:55] joal: hello! Should I be scared? :D
[11:15:22] elukey: my brain shouldn't be the only deposit of the few things I learnt :)
[11:17:17] elukey: nothing to be scared of - it doesn't work :)
[11:18:47] Analytics-Kanban: Extraneous whitelist items for WikimediaBlogVisit schema - https://phabricator.wikimedia.org/T168475#3385860 (mforns) @Tbayer Awesome, will move it to DONE then. Thanks for the heads up anyway! :]
[11:19:53] Analytics-Kanban: Extraneous whitelist items for WikimediaBlogVisit schema - https://phabricator.wikimedia.org/T168475#3365522 (mforns)
[11:21:29] Analytics-Kanban, Patch-For-Review: Modify EL purging script to not use limit/offset - https://phabricator.wikimedia.org/T168071#3355041 (mforns)
[11:34:59] (CR) Mforns: [C: 2] Use native timestamps in mediawiki history [analytics/refinery/source] - https://gerrit.wikimedia.org/r/358916 (https://phabricator.wikimedia.org/T161150) (owner: Joal)
[11:46:55] (CR) Mforns: [C: 2] Add new fields in mediawiki_history job [analytics/refinery/source] - https://gerrit.wikimedia.org/r/359019 (https://phabricator.wikimedia.org/T161147) (owner: Joal)
[11:49:59] (CR) Mforns: [V: 2 C: 2] "LGTM!"
[analytics/refinery] - https://gerrit.wikimedia.org/r/361471 (owner: Joal)
[12:12:21] (CR) Mforns: [C: 1] "LGTM! Just left a comment that is totally optional. Please let me know and I'll merge." (2 comments) [analytics/refinery] - https://gerrit.wikimedia.org/r/361500 (https://phabricator.wikimedia.org/T161147) (owner: Joal)
[12:16:33] (CR) Joal: "I kinda knew when doing it I should have done it the way you suggested :) providing patch now." (2 comments) [analytics/refinery] - https://gerrit.wikimedia.org/r/361500 (https://phabricator.wikimedia.org/T161147) (owner: Joal)
[12:16:55] (PS3) Joal: Update mediawiki history related tables [analytics/refinery] - https://gerrit.wikimedia.org/r/361500 (https://phabricator.wikimedia.org/T161147)
[12:17:01] mforns: --^
[12:17:06] Many thanks for the reviews again :)
[12:17:09] mforns: --^
[12:26:24] mforns: what do you think about running eventlogging_cleaner in beta?
[12:29:34] ideally we could check the status of the log db before and after the script
[12:29:55] the only caveat is that I've already run it a while ago and we truncated most of the tables :D
[12:30:12] there should be data to sanitize/purge though
[12:33:35] from the debian security advisor:
[12:33:35] The security update announced as DSA-3886-1 caused regressions for some
[12:33:38] applications using Java - including jsvc, LibreOffice and Scilab - due
[12:33:41] to the fix for CVE-2017-1000364. Updated packages are now available to
[12:33:44] correct this issue. For reference, the relevant part of the original
[12:33:47] advisory text follows.
[12:33:56] we haven't seen it up to now buuuut we might need to update the kernel again in the near future just in case
[12:44:06] * elukey coffee!
[13:08:12] taking a break a-team :)
[13:30:27] elukey, sorry, back from lunch
[13:30:33] elukey, yesssssss beta!
[13:30:43] mforns: I am now angry at you, not talking with you anymore
[13:30:50] I can even consider now to talk with Francisco
[13:31:00] :'(((((
[13:31:03] but he is not going to read this
[13:31:09] pretty sure
[13:31:15] soooo mad are you with me? xD
[13:31:25] otherwise I'll read a "shut-up luca!" immeditely
[13:31:36] hehe
[13:31:40] mforns: for today no, I'll re-think about it tomorrow
[13:31:54] k :]
[13:32:26] elukey, you mean being mad at me, or runnig the script in beta?
[13:32:41] the former!
[13:32:47] let's run the script :)
[13:34:20] (CR) Mforns: [V: 2 C: 2] "LGTM!" [analytics/refinery] - https://gerrit.wikimedia.org/r/361500 (https://phabricator.wikimedia.org/T161147) (owner: Joal)
[13:34:53] elukey, k
[13:34:57] batcave?
[13:35:48] mforns: sure
[13:38:26] elukey, are you hiding from me?
[13:38:32] https://hangouts.google.com/hangouts/_/wikimedia.org/a-batcave-2
[13:47:21] elukey, ping me whenever, no rush!
[13:53:23] mforns: sorry! Trying to get the root pass
[13:53:36] elukey, np
[13:54:56] ah there you go, sudo :)
[13:55:04] elukey, cave?
[14:32:26] joal: ops sync or skipping?
[14:49:28] elukey: please excuse me, I completely missed the ops sync ping
[14:49:39] elukey: We can do a pre-standup talk if you want :)
[14:50:13] don't worry! I am listening Dan and Francisco getting excited for JS
[14:50:18] let's do batcave-2?
[14:50:27] elukey: --^
[15:00:32] ping mforns
[15:05:05] elukey: is your +1 on druid data deletion an agreement to merge?
[15:05:52] joal: yep
[15:05:56] thanks :_)
[15:09:20] Analytics-Kanban: Make banner realtime jobs more resilient - https://phabricator.wikimedia.org/T169101#3387091 (JAllemandou)
[15:23:01] (PS2) Joal: Improve resiliency of Banner streaming job [analytics/refinery/source] - https://gerrit.wikimedia.org/r/359461 (https://phabricator.wikimedia.org/T169101)
[15:25:27] (PS3) Joal: Improve resiliency of Banner streaming job [analytics/refinery/source] - https://gerrit.wikimedia.org/r/359461 (https://phabricator.wikimedia.org/T169101)
[15:28:46] (PS5) Joal: Add script deleting druid deep storage data [analytics/refinery] - https://gerrit.wikimedia.org/r/361651 (https://phabricator.wikimedia.org/T168614)
[15:29:08] (CR) Joal: [V: 2 C: 2] "Self merging before deploy." [analytics/refinery] - https://gerrit.wikimedia.org/r/361651 (https://phabricator.wikimedia.org/T168614) (owner: Joal)
[15:29:38] (CR) Joal: [V: 2 C: 2] "Self merging before deploy - correct patch" [analytics/refinery] - https://gerrit.wikimedia.org/r/361651 (https://phabricator.wikimedia.org/T168614) (owner: Joal)
[15:30:50] (PS4) Joal: Add two tables to sqoop on hadoop [analytics/refinery] - https://gerrit.wikimedia.org/r/360866
[15:31:12] (CR) Joal: [V: 2 C: 2] "Self merging before deploy." [analytics/refinery] - https://gerrit.wikimedia.org/r/360866 (owner: Joal)
[15:40:27] fdans: Arf, droppped too fast :)
[15:40:43] fdans: wanted to joke you, but actually joked myself !
[15:40:58] THE PLAYER... PLAYED HIMSELF
[15:41:30] fdans: a fairly classical one
[15:41:55] joal: this is the patch right?
https://gerrit.wikimedia.org/r/#/c/359461/
[15:42:06] correct nuria
[15:42:15] !log analytics1030 back to the worker nodes after maintenance
[15:42:16] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[15:42:23] joal: i think there are still unanswered comments, for ex the loging config seems like it belongs on log4j
[15:42:41] nuria_: Arf, my comment didn't get log - let me try and find them
[15:43:04] (CR) Joal: "See inline." (4 comments) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/359461 (https://phabricator.wikimedia.org/T169101) (owner: Joal)
[16:06:16] Joal: so the banner job is going to run under oozie now?
[16:06:23] nope
[16:06:43] joal: because it is realtime?
[16:06:44] nuria_: since it's streaming, we have no proper way to deal with that
[16:06:57] joal:and do we truly need that to be realtime?
[16:07:11] joal: seems like banner dat acould be just as pageviews delayed for couple hours
[16:07:12] nuria_: when it fails, they ask me to revive it
[16:07:36] joal: ok, so someone is looking at the data
[16:08:02] nuria_: probably not every minute, but regularly yes
[16:08:07] joal: but having the job run under your user doesn't seem the best choice
[16:08:21] joal: specially when you will not be there to monitor it
[16:08:27] nuria_: With the last patch (currently running under my user), I have successfuly manage not to fail the job for 2 weeks
[16:08:41] joal: who was the person looking at that data?
[16:09:08] nuria_: the person that contaact me most often if Joseph Seddon
[16:09:42] nuria_: I also think it's good, since it allowed me to get a good learning on spark streaming resiliency
[16:10:04] Joal: Joseph Seddon does not do fundraising, is community rather, seems like he could do with data that is outdated for 2 hours
[16:11:15] joal: is he looking at the data in teh cluster?
[16:11:42] joal: or in pivot , i am not sure if there is a companion job to load that data into pivot
[16:11:45] nuria_: he looks at it in pivot
[16:11:57] nuria_: we have default jo
[16:12:07] nuria_: the data we are talking about is actually druid only
[16:14:52] joal: i see I think the log4j config needs improvement seems odd to have those settings on the script when they configure loggingoverall , can we leave this code in CR and work on deploying the rest?
[16:15:12] nuria_: we can
[16:16:08] joal: i prefer to do that and sort out the config , also would love to have a way to run it that does not depend on it running under your user
[16:16:16] nuria_: on logging, we took that approach on mediawiki_history as well: https://github.com/wikimedia/analytics-refinery-source/blob/master/refinery-job/src/main/scala/org/wikimedia/analytics/refinery/job/mediawikihistory/MediawikiHistoryRunner.scala#L150
[16:16:27] nuria_: as you wish
[16:16:52] nuria_: cave?
[16:16:56] joal: i think we can improve that approach no? seems that those logging settings should not be replicated per job
[16:17:03] milimetric: coming sorry
[16:17:10] nuria_: some of it, yes
[16:17:22] nuria_: I'll leave it and deloy the rest
[16:17:30] joal: ok, thank you
[16:20:49] (PS1) Joal: Bump changelog version to 0.0.48 [analytics/refinery/source] - https://gerrit.wikimedia.org/r/361895
[16:20:59] nuria, elukey, if you have a minute --^
[16:21:24] joal: extra whitespace, looks good :)
[16:21:30] crap
[16:22:39] (PS2) Joal: Bump changelog version to 0.0.48 [analytics/refinery/source] - https://gerrit.wikimedia.org/r/361895
[16:23:18] (CR) Joal: [V: 2 C: 2] "Self merging before deploy." [analytics/refinery/source] - https://gerrit.wikimedia.org/r/361895 (owner: Joal)
[16:23:47] mforns: we need to review the code..
[16:25:33] !log Building / Deploying refinery-source from jenkins to archiva (v0.0.480
[16:25:34] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[16:28:16] elukey, ok, do we have comments? looking
[16:28:58] pretty serious ones :D
[16:29:01] elukey, I found a table that we can use to test: MobileWikiAppSessions_15522505
[16:29:09] O.o
[16:31:12] Analytics-Kanban, Operations, ops-eqiad: analytics1030 stuck in console while booting - https://phabricator.wikimedia.org/T162046#3387528 (Cmjohnson) Open>Resolved New board added, updated idrac license...back online.
[16:35:21] mforns: in the meantime I started the script on db1047 to null all the remaining tables
[16:35:33] will go through the night and I hope that tomorrow it will be done
[16:35:36] awesome elukey
[16:56:26] (CR) Nuria: "Thanks for taking care of this, there must be a companion change to this one in puppet that executes this script as a cron, correct?" [analytics/refinery] - https://gerrit.wikimedia.org/r/361651 (https://phabricator.wikimedia.org/T168614) (owner: Joal)
[16:58:19] Analytics-Kanban, Patch-For-Review: Add a job that regularly deletes druid webrequest deep-stored data - https://phabricator.wikimedia.org/T168614#3369965 (Nuria) We need to add corresponding puppet code to execute this script as a cron
[16:58:35] thanks joal for deploying
[17:07:04] * elukey off!
[17:12:09] bye!
[17:27:21] hey folks! Any word on when we should expect the new stat machines to be ready for use?
[17:27:28] Also, I'm curious what OS they are running.
[17:28:33] Currently, it looks like stat1003 is running Ubuntu 14.04
[17:31:12] halfak: we are provisioning hw now so we should have done the replacement by next quarter (aug/sep)
[17:31:32] halfak: I *think* they will run debian but elukey correct me if I am wrong
[17:31:59] nuria_, great thanks.
I'm hoping to match our prod env for ORES to the stat machines so we can safely train models on the stat boxes :)
[17:32:24] Given the super old Ubuntu 14.04 I assume that upgrading that OS is hard.
[17:33:39] halfak: it is debian, see task: https://phabricator.wikimedia.org/T165368
[17:34:26] halfak: "debian strech"
[17:42:11] Gotcha. Perfect!
[18:15:54] (PS1) Joal: Bump refinery-job jar version in mediawiki_history [analytics/refinery] - https://gerrit.wikimedia.org/r/361917
[18:16:21] (CR) Joal: [V: 2 C: 2] "Self merging for deploy." [analytics/refinery] - https://gerrit.wikimedia.org/r/361917 (owner: Joal)
[18:17:19] !log Deploying refinery with scap
[18:17:20] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[18:20:50] elukey: Are you nearby?
[18:20:57] Arf, no he said he's gone
[18:21:11] There's just enough space on stat1002 to deploy
[18:21:34] I'm doing it, I'll ask elukey to clean tomorrow morning
[18:31:58] Mwarf - deployment failed because of space - Weird
[18:32:06] It'll wait tomorrow morning and elukey
[18:32:11] Have a good night a-tre
[18:32:13] a-team
[19:08:36] Analytics, New-Readers, Easy: Split opera mini in proxy or turbo mode - https://phabricator.wikimedia.org/T138505#3388103 (atgo) Hello! I just wanted to check in on this one as I see the timeline has changed. This is really useful data from the New Readers perspective as we're looking at reaching rea...
[19:48:14] Analytics, New-Readers, Easy: Split opera mini in proxy or turbo mode - https://phabricator.wikimedia.org/T138505#3388331 (Nuria) @atgo: sorry we had to move it back. FYI that this work needs to happen on ua-parser an open source project we benefit from, other than updating software there is no work...
[20:27:58] Analytics, New-Readers, Easy: Split opera mini in proxy or turbo mode - https://phabricator.wikimedia.org/T138505#3388453 (atgo) Thanks @nuria. I'll see what I can do.
[22:01:44] Analytics-Kanban, Analytics-Wikistats: Deploy new Wikistats to stats.wikimedia.org/v2 - https://phabricator.wikimedia.org/T167684#3388944 (Milimetric) @fdans, @Nuria, keeping you up to speed on this task: Got the production build to semi-work with this patch: https://phabricator.wikimedia.org/D699 Fran...
[22:34:41] Analytics, EventBus, Wikimedia-Stream, Services (watching), User-mobrovac: Bikeshed what events should be exposed in public EventStreams API - https://phabricator.wikimedia.org/T149736#3389055 (mobrovac)
[22:34:44] Analytics, EventBus, Wikimedia-Stream, Services (watching): Expose revision-create in EventStreams - https://phabricator.wikimedia.org/T167670#3389051 (mobrovac) Open>Resolved The [revision-create stream](https://stream.wikimedia.org/v2/stream/revision-create) is now live.