[02:03:17] 10Quarry: Users blocked from account creation on meta can not use Quarry - https://phabricator.wikimedia.org/T157342#3002152 (10Harej) >>! In T157342#3008064, @bd808 wrote: > Switching the wiki contacted for the OAuth handshake would really be a game of whack-a-mole. Today someone is affected by a meta ban, tomo... [05:58:10] 10Analytics, 10EventBus, 10Wikimedia-Stream, 13Patch-For-Review: Implement server side filtering (if we should) - https://phabricator.wikimedia.org/T152731#2858381 (10Tomayac) Sorry if this has come up before, but SSE actually does allow for event type specification. Currently you send your events like so:... [07:21:32] morning! [07:21:53] aqs1009-b is still bootstrapping, will take an hour and something to finish [08:42:51] 06Analytics-Kanban, 13Patch-For-Review: Run a 1-off sqoop over the new labsdb servers - https://phabricator.wikimedia.org/T155658#3012332 (10Marostegui) >>! In T155658#3009580, @JAllemandou wrote: > @Marostegui : I looked at [[ https://grafana-admin.wikimedia.org/dashboard/file/server-board.json?var-server=lab... 
[09:45:36] for mforns when he arrives: need to leave for ~2h, will be online around 12:30 for deploy :) [09:46:56] joal: what if I told you that I've restarted java on all the hadoop workers in ~20 mins :D [09:47:36] nothing exploded so far :D [09:47:54] I need to restart hive on 1003 and do the master failover [09:49:10] and oozie [09:49:44] !log suspended oozie bundles temporarily to allow graceful restarts [09:49:45] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [10:03:39] hellooo team :] [10:04:21] o/ [10:04:40] !log restarted oozie, hive-server and metastore for java upgrades [10:04:40] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [10:04:58] !log performed master failover from an1001 to an1002 (and vice-versa) for java upgrades [10:04:59] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [10:05:11] !log re-enabled oozie bundles after maintenance [10:05:12] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [10:06:15] one hour of maintenance in total, oozie stopped for 15 minutes [10:06:16] \o/ [10:16:11] :] [10:17:15] spoken too soon, an1035 didn't come up cleanly [10:18:27] (03PS1) 10Mforns: Add v0.0.41 to the changelog [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/336771 [10:20:01] (03CR) 10Mforns: [V: 032 C: 032] Add v0.0.41 to the changelog [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/336771 (owner: 10Mforns) [10:26:08] mforns: let me know if you see any issue.. I don't see any job failed for the moment! [10:26:34] elukey, ok, but I didn't deploy refinery/source yesterday, doing it now [10:27:20] sure sure [10:37:23] joal, hey! in the deploying refinery-source documentation it says the jenkins maven job should show v0.0.41 as the release version parameter, but it shows 0.0.41 (without the v). Is that OK? 
If so, I'll update the docs [10:38:00] https://wikitech.wikimedia.org/wiki/Analytics/Cluster/Deploy/Refinery-source [10:41:00] yep it is ok! [10:41:34] mforns: you meant "and supplying the version number of the latest jars released (e.g 0.0.29)," ? [10:41:37] if so yes [10:41:45] elukey, no [10:42:39] mforns: all the other steps need the v [10:42:47] I mean in here: https://wikitech.wikimedia.org/wiki/Analytics/Cluster/Deploy/Refinery-source, Deploy procedure #2 "Example: Release v0.0.40 - Development 0.0.41-SNAPSHOT" the version number jenkins displays does not have the v in it, is it OK or should I add it? [10:43:21] ah that one is probably what I wrote the last time [10:43:24] let me check [10:43:28] ok [10:43:46] you can click on the jenkins link to see it [10:43:54] to see what I mean [10:44:54] it should be fine since the v will be needed in Specify custom SCM tag to allow a meaningful git tag [10:45:27] also "usually Release Version + 1 with -SNAPSHOT at the end" [10:45:31] doesn't mention a "v" [10:45:37] the example is probably my mistake [10:45:38] aha [10:45:46] sorry :) [10:45:49] so, can I remove the v in that line? [10:45:53] in the docs? [10:45:54] sure sure [10:46:18] OK, thanks for explaining! :] [10:46:52] my bad, typo :( [10:47:30] np! I'm just afraid of breaking stuff! xD [10:48:29] :) [10:48:36] I am going to restart druid [10:48:42] k [10:48:47] !log restarting druid daemons for Java upgrades [10:48:48] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [10:52:53] 10Quarry: Users blocked from account creation on meta can not use Quarry - https://phabricator.wikimedia.org/T157342#3002152 (10Base) Can't the user just change the URL address in the browser and handle OAuth manually through another wiki? Or OAuth does not allow it to be done like that? 
[10:58:17] 10Quarry: Users blocked from account creation on meta can not use Quarry - https://phabricator.wikimedia.org/T157342#3012592 (10Base) I have just logged in to Quarry by changing Meta to angwiki so I guess it is possible. I am not banned on Meta though, if needed I can ban myself for test purposes. [11:06:26] !log Deployed refinery-source using jenkins [11:06:27] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [11:10:58] * elukey sees other screen sessions for "tranquillity" on druid [11:11:04] * elukey blames joal and ottomata [11:23:18] joal, yt? [11:31:22] mforns: all good with the deployment? [11:31:44] elukey, I need to execute an ALTER TABLE on wmf.webrequest and wmf.pageview_hourly but for that I guess I have to kill jobs that use those tables [11:33:17] or maybe wait until they are done [11:33:28] I'd go for the second one :) [11:33:56] probably suspending the bundles via hue might be good as well [11:34:26] but I guess that you'd need to kill/re-create jobs as well [11:34:39] elukey, yes [11:34:40] so might be good to just stop them, do the work, re-create them [11:35:31] ok, will do that [11:39:26] mforns: lets do something like [11:39:41] stop, wait for jobs to finish, kill, alter, restart [11:40:05] makes sense, what do you mean with stop? [11:40:11] stop the deployment? [11:40:19] there is a "suspend" hue functionality [11:40:27] ah ok [11:40:33] so oozie stops submitting things :D [11:41:12] then things in https://yarn.wikimedia.org/cluster/apps/RUNNING will keep going [11:41:28] suspend the bundles that touch those tables, wait for jobs to finish, kill the webrequest-load-bundle, alter, restart and unsuspend [11:41:51] elukey, if I suspend a bundle, the underlying coordinators are also suspended? 
[11:44:29] yep [11:47:22] 10Analytics: Agreggate banner dataset for long term retention - https://phabricator.wikimedia.org/T157582#3012642 (10mforns) @AndyRussG Hi, a couple questions on banners that can help decide here :] 1) On which pages or page types are banners shown? 2) If banners are shown in all pages, then for campaigns with... [11:47:42] elukey, OPK [11:47:43] OK [11:49:55] elukey, I don't think I have permits to suspend, resume or kill from hue... [11:53:11] buuuu [11:53:13] :) [11:53:16] xD [11:53:31] can you bless me with that power? [11:54:03] have you tried? [11:54:06] you should be able to [11:54:29] elukey, I can not select any bundle... [11:56:54] !log stopped webrequest-load-bundle from hue [11:56:55] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [11:57:10] I can see the buttons on top of the oozie bundle dashboard: resume, suspend and stop [11:57:32] but they are inactive, and I can not select any bundle [11:58:10] it looks like in the left side of the table, I should see checkboxes to select them, but there are no check boxes [11:58:37] mmmm [11:58:44] I am checking how hue assigns perms [11:58:52] but in the meantime you are free to go :) [11:59:45] mforns: can you re-check now? [11:59:58] elukey, yeeeesss!!! thanks a lot [12:00:06] I can see the checkboxes :] [12:00:12] you are now a superuser [12:00:23] [12:00:26] !log added Marcel as superuser in Hue [12:00:27] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [12:07:32] Hey mforns [12:07:36] Sorry I'm late [12:07:41] joal, hi! 
[12:07:42] np [12:07:53] I didn't break anything [12:07:55] yet [12:07:58] I think [12:08:01] :] [12:08:04] 10Analytics-Tech-community-metrics, 06Developer-Relations (Jan-Mar-2017): Kibana's Mailing List data sources do not include recent activity on wikitech-l mailing list - https://phabricator.wikimedia.org/T146632#3012657 (10Aklapper) [12:08:13] I'm about to execute the alter tables [12:09:16] awesome mforns, I was backlogging IRC :) [12:09:50] ok, proceeding with alter table then [12:11:13] mforns: all jobs are finished? [12:11:19] joal, they are suspended [12:11:30] mforns: hm [12:11:40] we just suspended the webrequest-bundle [12:11:59] I can resume then if better [12:12:19] mforns: I think suspend is not good for altering something [12:12:25] mforns: suspend means it'll restart [12:12:39] and, it'll restart with a different backend config - not cool [12:12:42] joal, we'll need to kill them and restart anyway right? [12:12:48] correct :) [12:13:02] ah snap camus [12:13:04] mforns: kill webrequest-load bundle (making sure you know when to restart) [12:13:04] so you suggest killing them right away, or waiting for them to finish? [12:13:16] mforns: They are suspended, so not even trying to finish [12:13:32] joal, yes, but we can resume them and wait... [12:13:44] ok, will kill [12:13:47] mforns: what I suggest is look at the 2 big ones (text and upload), let them finish if already started, and kill them at the beginning of new ones [12:14:02] ok [12:14:24] do we need to stop camus for this task? Thinking a bit more I'd say no [12:14:29] but let me know otherwise [12:14:42] elukey: no need to stop camus, no [12:14:59] mforns: let's resume the bundle and kill coordinators [12:15:03] mforns: batcave? [12:15:04] resumed [12:15:07] ok [12:15:21] joal: why resume it ? [12:15:27] can't we just kill it now? 
[12:15:41] * elukey don't understand [12:16:21] i mean, we suspend, let it drain and kill [12:16:22] give me a minute elukey [12:16:25] perform the alter table [12:16:28] start again [12:16:29] sure sure [12:17:12] elukey: the mistake is, suspending doesn't drain them [12:19:03] mmmm [12:19:26] this is not what I was convinced of [12:19:33] elukey: I notived ;) [12:19:39] I noticed sorry [12:19:39] how are other jobs going to be submitted if they are suspende? [12:19:42] *suspended? [12:20:00] elukey: come to batcave, it'll be easier [12:20:21] don't worry, need to step away in a bit [12:20:30] will leave you guys to do the work [12:20:31] :) [12:20:54] np elukey :) [12:20:58] We can debrief after [12:27:59] 06Analytics-Kanban, 15User-Elukey: Ongoing: Give me permissions in LDAP - https://phabricator.wikimedia.org/T150790#3012714 (10MoritzMuehlenhoff) @TBolliger : I've added you to the wmf group. [12:37:40] going afk for an hour more or less, ttl! If you need me ping me on hangouts :) [12:37:43] * elukey afk [12:46:54] !log Deployed refinery using scap, then deployed onto hdfs [12:46:54] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [12:48:32] mforns: can be useful https://gist.github.com/jobar/78b9e394437e28a8a9f7ab25ed695612 [13:02:24] !log Restarted webrequest-load-bundle and pageview-hourly-coord [13:02:25] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [14:17:34] mforns_brb: I added the command to start spark 1.6 from stat1004 to c9 :) [14:21:11] 10Analytics, 10Analytics-Cluster: Hadoop: Remove priority queue and add a new one with lower-than-user priorty - https://phabricator.wikimedia.org/T156841#3013045 (10JAllemandou) [14:23:01] joal: o/ [14:23:07] Hi elukey :) [14:24:02] so we were saying about suspended jobs! [14:24:08] Ah :) [14:24:10] what is the best practice when we need to stop the world? 
[14:24:23] kill seemed brutal [14:24:41] Kill is the way when it's needed :) [14:24:53] I prefer to do it one coord at a time [14:26:49] elukey: 2 big coords are text and upload: you wait for a time when an hour is just finished (this is the boring part) - kill the coord [14:27:20] Then you kill the bundle (we don't mind rerunning load for small partitions (misc and maps), but not rerunning it for text and upload is good) :) [14:28:32] makes sense elukey ? [14:29:26] joal: more or less what I thought I was doing with bundle suspend, wait + kill [14:30:03] 'suspend' suspends inner steps of jobs, so it's actually preventing the running jobs from finishing [14:32:14] ah I didn't think about this part [14:32:18] ok it makes sense [14:33:09] elukey: waiting for the text and upload hours to finish is boring, I'd love to have a better solution :( [14:34:58] joal: I know that you explained this trick to me a lot of times but now I finally got the meaning of suspended [14:35:09] huhu :) [14:35:10] it might be good when I do maintenance [14:35:15] elukey: do you have a second per batcavernare? [14:35:15] but not in this use case [14:35:20] hahhahahahhha [14:35:50] :D fdans [14:36:18] la grotte aux chauve-souris [14:36:19] fdans is the man managing to make cross-language jokes in writing [14:36:26] * joal claps loud [14:36:30] :D [14:36:32] hahah [14:37:04] I shall change my title [14:37:07] fdans: give me 10 minutes! :) [14:37:07] elukey: indeed maintenance stop jobs, so for maintenance without restart, it's good [14:37:16] super [14:37:27] joal: this morning I restarted the whole cluster in 1 hour (more or less) [14:37:28] But when a restart is needed (for conf change for instance), kill/restart is the way [14:37:30] \o/ [14:37:36] elukey: read that [14:37:43] * joal claps for elukey [14:37:51] elukey: This really is a huge improvement! 
[14:38:30] 10Analytics-Tech-community-metrics, 06Developer-Relations (Jan-Mar-2017): Create orgs and enroll DB identities who are likely only activity from imported upstream repos (to make it visible how incorrect our data potentially is) - https://phabricator.wikimedia.org/T157569#3013109 (10Aklapper) 05Open>03Resolv... [14:42:18] 06Analytics-Kanban, 13Patch-For-Review: Run a 1-off sqoop over the new labsdb servers - https://phabricator.wikimedia.org/T155658#3013113 (10JAllemandou) > Seeing these numbers and going back to your previous comment about increasing connections from 10 to 50, that means you'd be increasing the network traffic... [14:45:12] 10Analytics-Tech-community-metrics: Have "Last Attracted Developers" information for Gerrit (already exists for Git) - https://phabricator.wikimedia.org/T151161#3013118 (10Aklapper) (Unrelated sidenote: Due to its UI this can only work after changing the default (2y) timespan in the upper right corner to somethi... [14:47:13] 10Analytics-Tech-community-metrics: Provide equivalent of "SCR: Open changesets vs. Open changesets waiting for review (CR0 / CR+1)" in Kibana - https://phabricator.wikimedia.org/T151555#3013125 (10Aklapper) If I understood Jesus correctly last week, "we don't have that" in the backend, so this really is not pos... [14:47:16] 10Analytics-Tech-community-metrics: Provide equivalent of "SCR: People uploading patchsets vs. Reviewers per month" in Kibana - https://phabricator.wikimedia.org/T151559#3013126 (10Aklapper) If I understood Jesus correctly last week, there is no data for reviewers yet in the backend, so this really is not possib... [14:48:08] fdans: still there? I am free now [14:48:16] yes! [14:48:52] lets "batcavernare" [14:48:54] :D [14:50:31] 10Analytics-Tech-community-metrics: Provide equivalent of "SCR: Age of open changesets (monthly snapshots)" in Kibana - https://phabricator.wikimedia.org/T151557#3013131 (10Aklapper) [15:01:02] hi joal! 
would you be able to spend some time today or tomorrow helping me set up my own table with only the necessary fields I'm using from the webrequest table? I've been using the tablesample method, but it's not very accurate for small wikipedia projects so it's time to switch back to unsampled data, but I want to be mindful of resource usage... [15:01:14] Hi zareen :) [15:01:35] saw your message from yesterday, yes I can help, timing today is tight [15:01:49] 10Analytics-Tech-community-metrics: "Attracted Developers" widget in "Demographics" panel unexpectedly lists "old" developers - https://phabricator.wikimedia.org/T157688#3013140 (10Aklapper) [15:02:02] We can spend ~1h now, if you want? [15:02:03] joal: awesome! is tomorrow better for you? [15:02:18] joal: yeah, right now works as well [15:03:16] zareen: how do you want us to proceed? [15:03:25] zareen: irc / batcave? [15:04:19] batcave? [15:05:59] sure :) [15:09:34] milimetric did the banner go up? [15:09:40] didn't see it on the wikistats page [15:09:48] and seems to not be that many more responses? [15:10:08] ashgrigas: no, no banner, sorry [15:10:18] yeah, the lists were quiet [15:10:33] ottomata: HHHHHHIIIIiiiiiiiiiiiIIIIIIIIIIIIIIIIIIIiiiiiiiiiiiiiiiiiiiiiiiiiiii [15:10:33] what happened to the banner? too hard to do? [15:10:59] ottomata: we'd need to restart all the kafka brokerz for open-jdk upgrades [15:11:09] shall I start? [15:13:11] 10Analytics-Tech-community-metrics: "Attracted Developers" widget in "Demographics" panel unexpectedly lists "old" developers - https://phabricator.wikimedia.org/T157688#3013168 (10Aklapper) I had an (incorrect) suspicion that developers count as "attracted" (new) when they use a different email address that the... [15:13:21] elukey: oook! [15:13:40] * elukey restarts all the things [15:13:51] it has been a while since we haven't restarted those brokers [15:14:31] ashgrigas: looking into it now, I'm just not sure how to deploy to wikistats. 
Will talk to nuria later when she's up [15:16:28] ok sounds good milimetric thanks for the update [15:18:17] 10Analytics, 10EventBus, 10Wikimedia-Stream, 13Patch-For-Review: Implement server side filtering (if we should) - https://phabricator.wikimedia.org/T152731#3013179 (10Ottomata) Thanks for the response! > `event: enedit` This doesn't actually do server side filtering, it just allows you to register a local... [15:20:07] 10Analytics, 10Analytics-Cluster: Hadoop: Remove priority queue and add a new one with lower-than-user priorty - https://phabricator.wikimedia.org/T156841#3013185 (10Ottomata) Sure, not sure I understand though. Lower-than-user priority means lower than default? [15:39:12] 10Analytics-Tech-community-metrics, 06Developer-Relations (Jan-Mar-2017): Deployment of Maniphest panel - https://phabricator.wikimedia.org/T138002#3013268 (10Aklapper) @Lcanasdiaz: Did Mukunda's comment help? (Or is there some issue that after a 503 you might need to start from scratch again and hence miss d... [15:49:26] aqs1009 is serving traffic! [15:49:50] aqs1006-b is still completing the cleanup, buuuuut the aqs cluster is officially expanded [15:50:02] 12 cassandra instances in service and 6 nodes serving HTTP traffic [15:50:05] * elukey dances [15:53:36] 10Analytics, 06Operations, 03Scap3: Package + deploy new version of git-fat - https://phabricator.wikimedia.org/T155856#2956925 (10Ottomata) 05Open>03Resolved Done! https://apt.wikimedia.org/wikimedia/pool/main/g/git-fat/ Reopen if it doesn't work! :) [15:54:39] 10Analytics, 06Operations, 03Scap3: Package + deploy new version of git-fat - https://phabricator.wikimedia.org/T155856#3013355 (10Ottomata) 05Resolved>03Open Ok, I'm just clicking buttons way too fast over here. It's been packaged. To deploy, we need to run a `apt-get install git-fat` everywhere. Can... 
[15:54:48] 06Analytics-Kanban, 06Operations, 03Scap3: Package + deploy new version of git-fat - https://phabricator.wikimedia.org/T155856#3013358 (10Ottomata) [15:57:56] 10Analytics, 10ChangeProp, 10EventBus, 10Revision-Scoring-As-A-Service-Backlog, and 2 others: Rewrite ORES precaching change propagation configuration as a code module - https://phabricator.wikimedia.org/T148714#3013371 (10Halfak) [16:01:11] milimetric: standup [16:04:57] milimetric mforns fdans nuria update the feedback doc with what we have so far for friday's discussion, if we get more before then i'll add it: https://docs.google.com/a/wikimedia.org/document/d/1FFa5L2IKJtTed5B6fsfQHwGQcfFP9AvnPVLI1CGSLJs/edit?usp=sharing [16:05:36] 10Quarry: Users blocked from account creation on meta can not use Quarry - https://phabricator.wikimedia.org/T157342#3013392 (10Reguyla) @Base, I don't think Oauth works like that. I think the developer has to assign a home wiki to use for the login process and although they can choose just about anything, in th... [16:07:41] 10Analytics-EventLogging, 06Analytics-Kanban, 13Patch-For-Review: Change userAgent field to user_agent_map in EventCapsule - https://phabricator.wikimedia.org/T153207#3013422 (10Nuria) a:05fdans>03Nuria [16:13:04] zareen: job finished, I have some info for you: you don't even need partitioning :) [16:13:22] total size for 2 months will be ~2Gb [16:13:30] total size for 2 months will be ~20Gb sorry [16:13:46] so very ok for a single folder :) [16:13:56] so i can run just 1 query instead of breaking it into 2 timespans? [16:14:17] zareen: I suggest monthly partitioning to facilitate updates, but no more [16:14:26] zareen: it actually needs a few more changes [16:14:33] zareen: let me update the stuff [16:14:51] joal: okay, i'll hold off on running right now [16:15:00] please zareen :) [16:15:13] joal: are you able to write in the etherpad now? 
[16:17:34] (03PS6) 10Fdans: Add map visualizer to Dashiki [analytics/dashiki] - 10https://gerrit.wikimedia.org/r/333922 (https://phabricator.wikimedia.org/T153921) [16:18:23] zareen: etherpad modified, and big job launched (filling end of december) [16:19:01] zareen: I need to leave now, I'll be back later on tonight, will run the 2017 part of the query and change ownership of the folder after [16:19:22] zareen: You should be setup to go either your evening time, or tomorrow :) [16:19:23] joal: oh nice, thank you so much for the help! very useful and hopefully will take some burden off the servers :) [16:19:36] zareen: It'll surely will ;) [16:21:12] joal: awesome! talk to you later [16:22:33] (03CR) 10Milimetric: [V: 032 C: 032] "nice, well done." [analytics/dashiki] - 10https://gerrit.wikimedia.org/r/333922 (https://phabricator.wikimedia.org/T153921) (owner: 10Fdans) [16:40:29] 06Analytics-Kanban, 06Operations, 03Scap3: Package + deploy new version of git-fat - https://phabricator.wikimedia.org/T155856#3013520 (10thcipriani) >>! In T155856#3013355, @Ottomata wrote: > Ok, I'm just clicking buttons way too fast over here. It's been packaged. To deploy, we need to run a `apt-get ins... [16:40:37] 10Analytics: Add error component to Dashiki - https://phabricator.wikimedia.org/T157697#3013521 (10fdans) [16:41:35] 06Analytics-Kanban: x1-analytics-slave hangs forever - https://phabricator.wikimedia.org/T157514#3013535 (10Nuria) [16:42:36] 10Analytics, 10Analytics-Dashiki: Add error component to Dashiki - https://phabricator.wikimedia.org/T157697#3013536 (10Nuria) [16:43:36] 06Analytics-Kanban: Investigate rise in IE views from Pakistan since 2015 - https://phabricator.wikimedia.org/T157404#3013552 (10Nuria) [16:44:06] 06Analytics-Kanban: Investigate rise in IE views from Pakistan since 2015 - https://phabricator.wikimedia.org/T157404#3004126 (10Nuria) This is likely related to IE7 bot traffic increase, most of it comes from that part of the world. 
[16:49:57] 06Analytics-Kanban, 06Performance-Team: Check if the EventLogging User Agent schema upgrade breaks any performance tool/metric - https://phabricator.wikimedia.org/T156760#3013567 (10Nuria) [16:54:52] 06Analytics-Kanban: x1-analytics-slave hangs forever - https://phabricator.wikimedia.org/T157514#3007801 (10Ottomata) Has this worked before? [16:57:54] 06Analytics-Kanban: x1-analytics-slave hangs forever - https://phabricator.wikimedia.org/T157514#3013605 (10Ottomata) Ah, it did! Looks like the IP has changed, and we need VLAN ACLs updated. https://github.com/wikimedia/operations-dns/commit/9c4b845f73f2e6d9eb15b140e17dd0f0931b431b [16:59:35] 10Analytics, 07Easy: Don't accept data from automated bots in Event Logging - https://phabricator.wikimedia.org/T67508#3013608 (10Nuria) a:05Milimetric>03None [17:04:44] 10Analytics, 10EventBus, 13Patch-For-Review, 06Services (watching): Check eventbus Kafka cluster settings for reliability - https://phabricator.wikimedia.org/T144637#2605884 (10Nuria) Related to kafka tier-1 goal for next year [17:05:07] 10Analytics: Add Analytics-Wikistats 2.0 phab project tag - https://phabricator.wikimedia.org/T146043#3013620 (10Nuria) 05Open>03declined [17:05:50] 06Analytics-Kanban: Much more pageviews in Tagalog Wikipedia since mid-June 2016 - https://phabricator.wikimedia.org/T144635#3013621 (10Nuria) [17:06:05] 06Analytics-Kanban: x1-analytics-slave hangs forever - https://phabricator.wikimedia.org/T157514#3013622 (10elukey) 05Open>03Resolved a:03elukey Fixed the ACLs, it works now! [17:07:39] nuria: you froze [17:08:27] 10Analytics-Tech-community-metrics: scr-backlog.html should also exclude V-2 changesets - https://phabricator.wikimedia.org/T123848#3013627 (10Aklapper) 05Open>03declined There are currently no such stats on https://wikimedia.biterg.io/app/kibana#/dashboard/Gerrit-Backlog and korma.wmflabs.org is dead, hence... 
[17:09:56] 10Analytics, 10Wikimedia-General-or-Unknown, 07Easy: Browser and platform stats for logged-in vs. anon users for security and product support decisions - https://phabricator.wikimedia.org/T58575#3013637 (10Milimetric) [17:10:18] 06Analytics-Kanban, 10Wikimedia-General-or-Unknown, 07Easy: Browser and platform stats for logged-in vs. anon users for security and product support decisions - https://phabricator.wikimedia.org/T58575#3013639 (10Milimetric) [17:11:42] 06Analytics-Kanban, 10Wikimedia-Stream: Port RCStream clients to EventStreams - https://phabricator.wikimedia.org/T156919#3013649 (10Milimetric) [17:23:43] 10Analytics: AQS: Verify that node not being able to restart logs locally to errorlog not to logstash - https://phabricator.wikimedia.org/T155791#3013705 (10Nuria) [17:25:14] 06Analytics-Kanban: AQS: Verify that node not being able to restart logs locally to errorlog not to logstash - https://phabricator.wikimedia.org/T155791#3013709 (10Milimetric) a:03elukey [17:26:12] 06Analytics-Kanban: Add hardware capacity to AQS - https://phabricator.wikimedia.org/T144833#3013727 (10Milimetric) a:03elukey [17:26:24] 06Analytics-Kanban: Add hardware capacity to AQS - https://phabricator.wikimedia.org/T144833#2611862 (10Milimetric) 05Open>03Resolved [17:28:17] 06Analytics-Kanban: Hadoop cluster expansion.Add Nodes - https://phabricator.wikimedia.org/T152713#3013740 (10Milimetric) p:05Triage>03Normal a:03Ottomata [17:29:04] 10Analytics: Hadoop cluster expansion. 
Add Nodes - https://phabricator.wikimedia.org/T152713#2857806 (10Milimetric) [17:33:05] 06Analytics-Kanban, 06Operations, 10Traffic, 13Patch-For-Review: Add global last-access cookie for top domain (*.wikipedia.org) - https://phabricator.wikimedia.org/T138027#3013792 (10Milimetric) [17:38:15] 06Analytics-Kanban, 06Operations, 03Scap3: Package + deploy new version of git-fat - https://phabricator.wikimedia.org/T155856#3013818 (10thcipriani) I just tried the above ^ on a labs project machine with the new git fat version and it worked! Here's the tl;dr: ``` $ /bin/bash fattest.sh Cloning into '/ho... [17:43:24] 10Analytics, 10Analytics-Cluster: Kafka mirror maker failures when kafka brokers are restarted - https://phabricator.wikimedia.org/T157705#3013836 (10elukey) [17:43:29] 10Analytics, 10Analytics-Cluster: Kafka mirror maker failures when kafka brokers are restarted - https://phabricator.wikimedia.org/T157705#3013848 (10elukey) p:05Triage>03Normal [17:43:35] ottomata: --^ [17:45:26] 06Analytics-Kanban, 06Operations, 03Scap3: Package + deploy new version of git-fat - https://phabricator.wikimedia.org/T155856#3013854 (10Ottomata) Ok, great. Can I install on a remote target and we try in prod for something? [17:45:40] +1 thanks elukey [17:51:07] busing home in the snow (cya car, hope you will be ok out here!), and lunchin, bbl [17:53:51] going afk people! byyyeeee o/ [18:07:45] 10Analytics-Tech-community-metrics: On the "Git" dashboard, filtering on one organization still lists authors who are with another organization - https://phabricator.wikimedia.org/T157709#3013972 (10Aklapper) [18:42:22] 06Analytics-Kanban, 06Operations, 03Scap3: Package + deploy new version of git-fat - https://phabricator.wikimedia.org/T155856#3014122 (10thcipriani) >>! In T155856#3013854, @Ottomata wrote: > Ok, great. Can I install on a remote target and we try in prod for > something? hrm. I usually test in beta and do... [18:56:14] Dealing with issues with a deploy of ORES. 
Will be late to research/analytics/devops [18:57:34] 06Analytics-Kanban, 15User-Elukey: Ongoing: Give me permissions in LDAP - https://phabricator.wikimedia.org/T150790#3014205 (10TBolliger) Thank you @MoritzMuehlenhoff - I can confirm I now have the access I need. [19:17:17] do we have an upcoming refinery deploy with source jar change? [19:17:37] 06Analytics-Kanban, 06Operations, 03Scap3: Package + deploy new version of git-fat - https://phabricator.wikimedia.org/T155856#3014261 (10Ottomata) I think we’ll have a refinery deploy to do soon. Will check… [19:17:55] Looks like I might not make it to the checkin at all. [19:17:59] :( [19:28:25] bye a-team, see you tomorrow! [19:35:57] 06Analytics-Kanban, 06Performance-Team: Check if the EventLogging User Agent schema upgrade breaks any performance tool/metric - https://phabricator.wikimedia.org/T156760#3014301 (10Krinkle) @Nuria Okay, makes sense! I assumed the DB consumer would automatically store those as separate columns (e.g. "obj_subke... [19:36:39] 06Analytics-Kanban, 06Performance-Team: Update webperf EventLogging consumers for userAgent schema change - https://phabricator.wikimedia.org/T156760#3014302 (10Krinkle) p:05Triage>03High a:03Krinkle [19:48:54] halfak: hey! i was in the meeting for a bit, s'ok! [19:48:58] i guessed no one was coming [19:49:00] 06Analytics-Kanban, 06Performance-Team: Update webperf EventLogging consumers for userAgent schema change - https://phabricator.wikimedia.org/T156760#3014362 (10Nuria) >I'm glad this restriction is on the way out, and I suppose we'll migrate to an object sometime after. Yes, we would need to do small changes b... [19:50:05] ottomata, boo to that. I'm extra sorry not to show then. [19:50:17] * halfak does his devops work for the day rather than going to devops meeting [19:53:08] nuria: do you know if we have a refinery deploy (with jar updates) pending? [19:54:14] ottomata: i thought we had deployed already? 
[19:55:00] nuria: we may have, i haven't kept track, just heard yall talking about it [19:55:11] i want to do a test scap deploy in prod that uses git fat [19:55:13] and refinery does [19:55:33] ottomata: i think we deployed already but can verify, give me a sec [19:59:34] ottomata: ya, we did 2017-02-09 12:45 hdfs:///wmf/refinery/current [19:59:46] ok [19:59:47] cool [19:59:50] ottomata: but that is good actually, you can redeploy what is already there [19:59:59] ottomata: does that work? [20:00:11] i think so, [20:00:54] ottomata: k [20:06:49] 10Analytics, 10Analytics-Cluster, 10EventBus: Delete stale topics from main Kafka clusters - https://phabricator.wikimedia.org/T149594#3014473 (10Ottomata) [20:16:02] elukey: , moritzm, FYI, main kafka brokers in eqiad and codfw have been restarted [20:48:33] 06Analytics-Kanban: Add hardware capacity to AQS - https://phabricator.wikimedia.org/T144833#3014644 (10Nuria) [20:48:35] 06Analytics-Kanban, 06Operations, 10ops-eqiad, 13Patch-For-Review, 15User-Elukey: rack and set up aqs100[7-9] - https://phabricator.wikimedia.org/T155654#3014643 (10Nuria) 05Open>03Resolved [20:48:47] 06Analytics-Kanban, 13Patch-For-Review: Add namespace ID to pageview_hourly - https://phabricator.wikimedia.org/T156993#3014646 (10Nuria) 05Open>03Resolved [20:49:02] 06Analytics-Kanban, 13Patch-For-Review: Debian package for ua parser latest version - https://phabricator.wikimedia.org/T156821#3014648 (10Nuria) 05Open>03Resolved [20:49:31] 06Analytics-Kanban, 10Wikimedia-Stream: Report number of stream connections to statsd - https://phabricator.wikimedia.org/T157492#3014661 (10Nuria) 05Open>03Resolved [20:49:34] 06Analytics-Kanban, 10EventBus, 10Wikimedia-Stream, 06Services (watching), 15User-mobrovac: EventStreams - https://phabricator.wikimedia.org/T130651#3014662 (10Nuria) [20:49:47] 06Analytics-Kanban, 10Fundraising-Backlog, 13Patch-For-Review: Productionize banner impressions druid/pivot dataset - https://phabricator.wikimedia.org/T155141#3014664 
(10Nuria) 05Open>03Resolved [21:46:40] 06Analytics-Kanban, 13Patch-For-Review: CDH 5.10 upgrade - https://phabricator.wikimedia.org/T152714#3014918 (10Ottomata) Ah ha! Hue did break because of a change. Had to do: https://gerrit.wikimedia.org/r/#/c/336906/1/templates/hue/hue.ini.erb So, with that, everything looks good! Time to schedule... [21:49:11] 06Analytics-Kanban, 13Patch-For-Review: CDH 5.10 upgrade - https://phabricator.wikimedia.org/T152714#3014922 (10Ottomata) So, we briefly talked about doing this on a weekend..buut I don't really have a free weekend day until March 4. I suppose this can wait that long. Thoughts? [21:57:44] 06Analytics-Kanban, 06Operations, 13Patch-For-Review, 03Scap3: Package + deploy new version of git-fat - https://phabricator.wikimedia.org/T155856#3014951 (10Ottomata) Updating here: git-fat 0.1.2 now installed everywhere by puppet. [22:19:25] ottomata: thanks!