[02:57:41] Analytics, MediaWiki-Releasing: Create dashboard showing MediaWiki tarball download statistics - https://phabricator.wikimedia.org/T119772#3901867 (Nuria) My advice , rather than using hadoop for this would be to instrument with piwik, releases.wikimedia.org. Combing terabytes of data for this few reques...
[03:33:56] (CR) Nuria: "Would be uploading a patch with a proposal of how I think we can simplify this a bit via separating database reader classes from POJOS tha" (5 comments) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/403916 (https://phabricator.wikimedia.org/T167907) (owner: Joal)
[05:56:40] Analytics, MediaWiki-Releasing: Create dashboard showing MediaWiki tarball download statistics - https://phabricator.wikimedia.org/T119772#3901946 (Legoktm) People don't visit releases.wikimedia.org directly - they click the direct tarball link from mediawiki.org. I don't think piwik will work for that.
[08:39:05] Hey, all -- looping back to a question that I asked the other day, but didn't actually see a response to. The EventLogging guide suggests that arrays are not a valid data type. Is that the case? (In this particular instance, I have a property that I'd ideally like to have set to an array of strings.)
[08:39:35] Analytics-Kanban, User-Elukey: Fix outstanding bugs preventing the use of prometheus jmx agent for Hive/Oozie - https://phabricator.wikimedia.org/T184794#3902066 (elukey) In `metastore.sh` it is used the `HIVE_METASTORE_HADOOP_OPTS` that works fine (just tested), but there seems to be no equivalent for H...
[08:41:12] marlier: Hi! I'd suggest to ask directly to Marcel (mforns)
[09:02:33] Will do, thanks
[11:09:40] hi marlier! I'm here if you want to discuss something
[11:10:32] "The EventLogging guide suggests that arrays are not a valid data type. Is that the case? (In this particular instance, I have a property that I'd ideally like to have set to an array of strings.)
[11:10:36] mforns: --^
[11:10:52] elukey, yes, I saw, looking into it
[11:10:54] thanks!
[11:50:34] * elukey lunch!
[12:08:00] (CR) Joal: "Answered inline to some comments. Given the reader classes are already impletemented and the current classes are wrapper around them, I'm " (3 comments) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/403916 (https://phabricator.wikimedia.org/T167907) (owner: Joal)
[12:13:50] Hi elukey - just an email explaining I won't have a lot of time today
[12:36:23] (CR) Fdans: "This change is fantastic. I spent a bit of time playing with it and it addresses most of the feedback we've gotten so far about the select" (8 comments) [analytics/wikistats2] - https://gerrit.wikimedia.org/r/402387 (https://phabricator.wikimedia.org/T179530) (owner: Mforns)
[12:55:42] joal: ack!
[12:55:51] let me know if I can help
[12:58:26] (CR) Mforns: "Thanks Fran! Tried some answers to your questions. I'll wait for Dan's comments and they apply changes." (7 comments) [analytics/wikistats2] - https://gerrit.wikimedia.org/r/402387 (https://phabricator.wikimedia.org/T179530) (owner: Mforns)
[12:59:37] Currently rebooting kafka100[23]
[13:52:13] Rebooting druid1004
[14:03:42] everything looks good as far as I can see
[14:12:31] going to reboot druid100[56] then
[14:12:46] fdans,mforns - if you see anything weird for wikistats/aqs let me know
[14:13:15] will do elukey :)
[14:13:23] hello!
[14:13:45] o/
[14:22:16] elukey, OK
[14:22:22] hey milimetric :]
[14:23:32] hi mforns
[14:49:10] fdans: qq if you have time
[14:49:26] elukey: shoot!
[14:49:38] maybe I am getting it wrong but I am browsing https://stats.wikimedia.org/v2/#/all-projects and when I try to select say Wikinews in the search bar nothing changes
[14:50:54] elukey: damn, this is the issue I was complaining about yesterday
[14:51:32] mforns 's change on the wikiselector fixes it
[14:51:52] ahhh okok
[14:51:53] but I wonder if we should revert or wait for merge and deploy
[14:52:40] the website looks a bit broken in this state, even if we are in a stage that we can tolerate bugs this one seems big :)
[14:52:55] I agree
[14:53:28] if you guys are close to deploy a fix I'd say that we can wait (it has been broker for a day now right?)
[14:53:52] elukey: if I revert the gerrit change on the release branch, does puppet rerun?
[14:54:20] I can re-run it manually, it should grab the last revision
[14:54:30] (PS1) Fdans: Revert "Release 2.1.4" [analytics/wikistats2] (release) - https://gerrit.wikimedia.org/r/404451
[14:54:39] ok elukey reverted
[14:54:58] well you need to submit no?
[14:55:30] also, it would be great if we could add a message about why we are rolling back
[14:55:33] in the commit msg
[14:55:39] so people will be aware
[14:55:49] elukey: yes, sorry, on it
[14:56:02] no rush :_
[14:56:03] :)
[14:57:46] (PS1) Fdans: Revert "Release 2.1.4" [analytics/wikistats2] - https://gerrit.wikimedia.org/r/404454
[14:57:53] (CR) jerkins-bot: [V: -1] Revert "Release 2.1.4" [analytics/wikistats2] - https://gerrit.wikimedia.org/r/404454 (owner: Fdans)
[14:58:57] (Abandoned) Fdans: Revert "Release 2.1.4" [analytics/wikistats2] - https://gerrit.wikimedia.org/r/404454 (owner: Fdans)
[14:59:24] (PS1) Faidon Liambotis: Define IOV_MAX properly, using _GNU_SOURCE [analytics/kafkatee] - https://gerrit.wikimedia.org/r/404456
[14:59:26] (PS1) Faidon Liambotis: Bump debhelper compat to 11, Standards-Version [analytics/kafkatee] - https://gerrit.wikimedia.org/r/404457
[14:59:28] (PS1) Faidon Liambotis: Remove kafkatee upstart script [analytics/kafkatee] - https://gerrit.wikimedia.org/r/404458
[14:59:30] (PS1) Faidon Liambotis: Don't create (or remove) /var/run/kafkatee anymore [analytics/kafkatee] - https://gerrit.wikimedia.org/r/404459
[14:59:34] (PS1) Faidon Liambotis: Use /run instead of /var/run in the source [analytics/kafkatee] - https://gerrit.wikimedia.org/r/404460
[14:59:36] (PS1) Faidon Liambotis: Remove +x from debian/kafkatee.post{inst,rm} [analytics/kafkatee] - https://gerrit.wikimedia.org/r/404461
[14:59:38] (PS1) Faidon Liambotis: d/control: fix indentation of the extended description [analytics/kafkatee] - https://gerrit.wikimedia.org/r/404462
[14:59:40] (PS1) Faidon Liambotis: Drop versioned Build-Depends to pre-jessie versions [analytics/kafkatee] - https://gerrit.wikimedia.org/r/404463
[14:59:42] (PS1) Faidon Liambotis: Set Maintainer to ops, add myself to Uploaders [analytics/kafkatee] - https://gerrit.wikimedia.org/r/404464
[14:59:44] (PS1) Faidon
Liambotis: Release 0.1.7-1 [analytics/kafkatee] - https://gerrit.wikimedia.org/r/404465
[15:00:21] (PS2) Fdans: Revert release 2.1.4 [analytics/wikistats2] (release) - https://gerrit.wikimedia.org/r/404451
[15:00:37] (CR) Fdans: [V: 2 C: 2] Revert release 2.1.4 [analytics/wikistats2] (release) - https://gerrit.wikimedia.org/r/404451 (owner: Fdans)
[15:00:59] elukey: tested locally, submitted :)
[15:01:15] fdans: https://gerrit.wikimedia.org/r/#/c/404451/ looks not submitted
[15:02:00] damn, sorry, I'm being useless elukey, now it is
[15:02:30] ahahah don't be silly, I was just reviewing things, nothing to say sorry for
[15:03:46] elukey: thank you for the report Luca :D
[15:05:11] working now! just ran puppet and verified
[15:06:21] (CR) Elukey: [V: 2 C: 2] Define IOV_MAX properly, using _GNU_SOURCE [analytics/kafkatee] - https://gerrit.wikimedia.org/r/404456 (owner: Faidon Liambotis)
[15:06:48] (CR) Elukey: [V: 2 C: 2] Bump debhelper compat to 11, Standards-Version [analytics/kafkatee] - https://gerrit.wikimedia.org/r/404457 (owner: Faidon Liambotis)
[15:07:03] (CR) Elukey: [V: 2 C: 2] Remove kafkatee upstart script [analytics/kafkatee] - https://gerrit.wikimedia.org/r/404458 (owner: Faidon Liambotis)
[15:07:42] (CR) Elukey: [V: 2 C: 2] Don't create (or remove) /var/run/kafkatee anymore [analytics/kafkatee] - https://gerrit.wikimedia.org/r/404459 (owner: Faidon Liambotis)
[15:08:18] (CR) Elukey: [V: 2 C: 2] Use /run instead of /var/run in the source [analytics/kafkatee] - https://gerrit.wikimedia.org/r/404460 (owner: Faidon Liambotis)
[15:08:37] (CR) Elukey: [V: 2 C: 2] Remove +x from debian/kafkatee.post{inst,rm} [analytics/kafkatee] - https://gerrit.wikimedia.org/r/404461 (owner: Faidon Liambotis)
[15:08:55] (CR) Elukey: [V: 2 C: 2] d/control: fix indentation of the extended description [analytics/kafkatee] - https://gerrit.wikimedia.org/r/404462 (owner:
Faidon Liambotis)
[15:09:10] (CR) Elukey: [V: 2 C: 2] Drop versioned Build-Depends to pre-jessie versions [analytics/kafkatee] - https://gerrit.wikimedia.org/r/404463 (owner: Faidon Liambotis)
[15:09:30] (CR) Elukey: [V: 2 C: 2] Set Maintainer to ops, add myself to Uploaders [analytics/kafkatee] - https://gerrit.wikimedia.org/r/404464 (owner: Faidon Liambotis)
[15:09:51] (CR) Elukey: [V: 2 C: 2] Release 0.1.7-1 [analytics/kafkatee] - https://gerrit.wikimedia.org/r/404465 (owner: Faidon Liambotis)
[15:10:41] this is 30 mins of Faidon's work time --^
[15:10:43] :D
[15:11:05] it would have taken me 3 hours probably
[15:11:05] ahahah
[15:48:07] currently in the process of rebooting druid1003
[15:48:13] I've drained the middlemanager
[15:48:42] so after the hour we should be able to safely stop everything and reboot
[15:49:57] Analytics, Patch-For-Review, Performance-Team (Radar): Explore NavigationTiming by faceted properties - EventLogging refine - https://phabricator.wikimedia.org/T166414#3902996 (Krinkle)
[16:03:09] (Abandoned) Milimetric: Implement Wikistats metrics as Druid queries [analytics/refinery] - https://gerrit.wikimedia.org/r/365806 (https://phabricator.wikimedia.org/T170882) (owner: Milimetric)
[16:05:03] (Abandoned) Milimetric: Update database where data is stored [analytics/geowiki] - https://gerrit.wikimedia.org/r/391841 (owner: Milimetric)
[16:06:27] (Abandoned) Milimetric: [WIP] Design thoughts for AQS edit history API [analytics/aqs] - https://gerrit.wikimedia.org/r/347637 (owner: Milimetric)
[16:50:54] anyone have the wikitech page link that has all the analytics usergroups?
[16:51:14] nm, found it https://wikitech.wikimedia.org/wiki/Analytics/Data_access#Access_Groups
[16:56:17] :)
[17:02:52] fdans, do you have 10 minutes to pair and try and fix the placeholder in safari?
[17:03:05] mforns: hell yea
[17:03:06] mforns: so ok to merge https://gerrit.wikimedia.org/r/#/c/402907/ ?
[17:03:08] batcave
[17:03:26] elukey, LGTM!
[17:03:29] ok!
[17:03:33] fdans omw
[17:36:26] there was a hole today in eventlogging metrics https://grafana.wikimedia.org/dashboard/db/eventlogging?panelId=6&fullscreen&orgId=1
[17:36:39] that is what burrow complained about
[17:41:38] I just discovered that brrd (not sure what that is) logs [2018-01-16 17:41:06,490] Connecting to ...
[17:41:45] that is the forwarder?
[17:42:29] ahh seems to try to connect to python /srv/deployment/eventlogging/analytics/bin/eventlogging-forwarder @/etc/eventlogging.d/forwarders/legacy-zmq
[17:44:38] aye
[17:44:46] elukey: btw if https://phabricator.wikimedia.org/T175087 happens, we can get rid of zmq forwarder
[17:44:46] https://github.com/wikimedia/operations-software-brrd ahhh
[17:45:26] ottomata: is that normal that spams dmesg and syslog that heavily?
[17:45:34] i've never noticed that
[17:45:48] brrd?
[17:45:50] where does it run?
[17:45:58] i think i didn't realize this was a thing really
[17:46:01] eventlog1001 afaics
[17:46:02] maybe it is also a dependancy
[17:46:02] really?
[17:46:05] Logs Navigation Timing data to an RRD file.
[17:46:06]
[17:46:06] ?
[17:46:10] why would that run on eventlog1001???
[17:46:14] looking...
[17:46:28] I was checking https://grafana.wikimedia.org/dashboard/db/eventlogging?panelId=6&fullscreen&orgId=1 and found brrd
[17:46:28] it isn't in puppet afaic
[17:46:29] t
[17:46:41] ?
[17:47:00] nope
[17:47:28] /etc/init/brrd.conf
[17:47:35] it has a upstart script
[17:47:48] wow elukey i dunno, must be super old
[17:47:49] exec /usr/bin/python -m brrd \ --log-file "/var/log/brrd/brrd.log" \ "tcp://eventlogging.eqiad.wmnet:8600" \ "/var/lib/brrd/navtiming.rrd"
[17:47:57] we should ask Krinkle, buuut he might know anything
[17:47:59] i doubt they are using it
[17:48:02] we should probably remove it
[17:51:09] yeah, just asked to Krinkle
[17:54:36] ottomata: did we have an outage today?
[17:54:46] I mean, the hole in the metrics
[17:55:18] https://grafana.wikimedia.org/dashboard/db/eventlogging-schema?orgId=1&var-schema=Popups&from=1516113840279&to=1516115052842
[17:56:25] elukey: did you reboot some brokers or something?
[17:56:43] kafka100[123]
[18:03:54] hmmm so not analytics hm
[18:04:00] ejegg|away: yooohoo
[18:07:05] ottomata: just trying to get into the hangout
[18:08:54] AndyRussG: i think your personal email was added to invite
[18:08:57] https://hangouts.google.com/hangouts/_/wikimedia.org/queenmary
[18:14:43] ottomata: soooo. are we in the right hangout or the meeting is canceled? :D
[18:15:51] milimetric ^
[18:16:01] leila: we have a conflicting meeting
[18:16:02] I see ottomata has said no already.
[18:16:08] sorry
[18:16:15] I see. Let me cancel this then, milimetric.
[18:16:26] yeah, apologies I didn't do it already
[18:16:52] done
[18:27:28] milimetric: just saw Andrew had already sent an email about it. thanks, and sorry for the extra ping.
[19:14:33] https://blog.godatadriven.com/divolte-kafka-druid-superset
[19:14:37] this one?
[19:15:01] ottomata: DIVOLTE, one two thre volte
[19:15:04] tre
[19:15:25] yeah i think so
[19:15:48] yea!
[19:27:30] Analytics, EventBus, Services (next): EventBus rejecting events because of malformed characters in the comment - https://phabricator.wikimedia.org/T184698#3903764 (Pchelolo) I've found several more similar issues, around 50 over the last 60 days. What all of them have in common is that the comment/...
[19:32:45] Analytics, Operations, ops-eqiad: Decommission kafka1018 - https://phabricator.wikimedia.org/T182955#3903797 (Ottomata) p:Triage>Normal
[20:09:06] Quarry, Operations, cloud-services-team (Kanban): let quarry use the mariadb module - https://phabricator.wikimedia.org/T181205#3904036 (Ottomata) p:Triage>Normal
[20:09:31] Analytics, Operations, Research, Traffic, and 6 others: Referrer policy for browsers which only support the old spec - https://phabricator.wikimedia.org/T180921#3904037 (Ottomata) p:Triage>Normal
[20:43:57] Analytics-Kanban, Analytics-Wikistats: Beta Release: Resiliency, Rollback and Deployment of Data - https://phabricator.wikimedia.org/T177965#3904129 (Nuria) >Given we probably want to use Druid as a query engine to check numbers between old and new, cache warming would actually be a side-effect of checki...
[20:44:24] Analytics-Kanban, Analytics-Wikistats: Beta Release: Resiliency, Rollback and Deployment of Data - https://phabricator.wikimedia.org/T177965#3904130 (Nuria) Let's move this task to tasking i think there is quite a bit to talk about.
[20:44:36] Analytics, Analytics-Wikistats: Beta Release: Resiliency, Rollback and Deployment of Data - https://phabricator.wikimedia.org/T177965#3904131 (Nuria)
[20:59:53] Analytics, ChangeProp, EventBus, Reading-Infrastructure-Team-Backlog, and 2 others: Update node-rdkafka version to v2.x - https://phabricator.wikimedia.org/T176126#3904165 (Ottomata)
[21:00:50] Analytics, ChangeProp, EventBus, Reading-Infrastructure-Team-Backlog, and 2 others: Update node-rdkafka version to v2.x - https://phabricator.wikimedia.org/T176126#3614140 (Ottomata) We've also got librdkafka 0.11 backported for Jessie in our apt repo now too. I don't see any blockers to using i...
[21:01:41] Analytics, ChangeProp, EventBus, Reading-Infrastructure-Team-Backlog, and 2 others: Update node-rdkafka version to v2.x - https://phabricator.wikimedia.org/T176126#3904167 (Ottomata) It is also installed on cp1008.wikimedia.org (cache canary) and used by varnishkafka there.
[21:03:18] Analytics, EventBus, Operations, hardware-requests, and 2 others: SSDs for main Kafka clusters - https://phabricator.wikimedia.org/T166341#3904171 (Ottomata) Do we still want to do this?
[21:04:03] Analytics, Patch-For-Review: Update druid to latest release (0.10) - https://phabricator.wikimedia.org/T164008#3904176 (Ottomata)
[21:04:05] Analytics-Kanban: Upgrade druid - https://phabricator.wikimedia.org/T157977#3904178 (Ottomata)
[21:05:22] Analytics, ChangeProp, EventBus, Reading-Infrastructure-Team-Backlog, and 2 others: Update node-rdkafka version to v2.x - https://phabricator.wikimedia.org/T176126#3904182 (Pchelolo) Hm, so librdkafka 0.11 getting on beta might be the reason of T185016 ?
[22:51:05] (PS3) Nuria: Refactor geo-coding function and add ISP [analytics/refinery/source] - https://gerrit.wikimedia.org/r/403916 (https://phabricator.wikimedia.org/T167907) (owner: Joal)
[22:52:50] (PS4) Nuria: Refactor geo-coding function and add ISP [analytics/refinery/source] - https://gerrit.wikimedia.org/r/403916 (https://phabricator.wikimedia.org/T167907) (owner: Joal)
[22:55:32] joal: pushed refactor patch for maxmind on top of your changes, hopefully you do not hate it...
[23:42:13] Analytics, EventBus, MediaWiki-JobQueue, Services (next): Migrate RefreshLinks job to kafka - https://phabricator.wikimedia.org/T185052#3904698 (Pchelolo) p:Triage>Normal