[02:14:39] (CR) Yuvipanda: "Thank you for the patch! Did you manage to test this?" [analytics/quarry/web] - https://gerrit.wikimedia.org/r/283839 (https://phabricator.wikimedia.org/T117645) (owner: Zhuyifei1999)
[05:06:13] (CR) Zhuyifei1999: [C: -1] "No, not yet. Do you have some docs on how to setup a test instance?" [analytics/quarry/web] - https://gerrit.wikimedia.org/r/283839 (https://phabricator.wikimedia.org/T117645) (owner: Zhuyifei1999)
[05:09:30] (CR) Zhuyifei1999: "Nevermind, I just saw the README.md. Will do later." [analytics/quarry/web] - https://gerrit.wikimedia.org/r/283839 (https://phabricator.wikimedia.org/T117645) (owner: Zhuyifei1999)
[08:27:15] Analytics-EventLogging, Operations, Patch-For-Review: "Throughput of EventLogging NavigationTiming events" UNKNOWN - https://phabricator.wikimedia.org/T132770#2213440 (fgiunchedi) btw the alert still shows up as UNKNOWN in icinga since 2d `Throughput of EventLogging NavigationTiming events UNKNOWN 2...
[08:44:25] Analytics-Kanban: Puppet on stat1003 keeps failing for git errors - https://phabricator.wikimedia.org/T132445#2213473 (elukey) Double checked and the stats user is pushing data to the repo: https://github.com/wikimedia/analytics-geowiki-data-public/commits/master
[09:11:08] Analytics-Tech-community-metrics, Differential, Developer-Relations (Apr-Jun-2016): Make MetricsGrimoire/korma support gathering Code Review statistics from Phabricator's Differential - https://phabricator.wikimedia.org/T118753#2213564 (Lcanasdiaz) >>! In T118753#2198466, @Qgil wrote: >>>! In T118753...
[09:21:02] Analytics-EventLogging, Operations, Patch-For-Review: "Throughput of EventLogging NavigationTiming events" UNKNOWN - https://phabricator.wikimedia.org/T132770#2213659 (elukey) ``` elukey@stat1002:~$ kafkacat -L -b kafka1012.eqiad.wmnet:9092 | grep -i navigation topic "eventlogging_NavigationTiming"...
[09:41:35] Analytics-EventLogging, Operations, Patch-For-Review: "Throughput of EventLogging NavigationTiming events" UNKNOWN - https://phabricator.wikimedia.org/T132770#2213691 (elukey) Cancelled the above comment since it was outdated by https://gerrit.wikimedia.org/r/#/c/283673/2/modules/eventlogging/manifes...
[10:02:50] PROBLEM - Check status of defined EventLogging jobs on eventlog1001 is CRITICAL: CRITICAL: Stopped EventLogging jobs: processor/client-side-04 forwarder/legacy-zmq
[10:04:14] good morning, just restarted kafka1018 for an upgrade --^
[10:05:50] joal: aloha :)
[10:06:51] socket.error: [Errno 111] Connection refused
[10:07:01] ahhh from eventlogging_processor-client-side-04.log
[10:11:37] !log execute sudo eventloggingctl restart on eventlogging1001
[10:11:39] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log, Master
[10:12:59] RECOVERY - Check status of defined EventLogging jobs on eventlog1001 is OK: OK: All defined EventLogging jobs are runnning.
[10:20:06] everything seems good, even from the kafka broker's perspective (kafka1018 was the one restarted). I asked Moritz to pause the upgrade to double check with you what happened
[10:20:11] even if it seems to be the old issue
[10:21:59] brb ~30 mins!
[10:51:20] (PS2) Zhuyifei1999: output: add wikitable as an option [analytics/quarry/web] - https://gerrit.wikimedia.org/r/283839 (https://phabricator.wikimedia.org/T117645)
[10:53:10] Analytics-Tech-community-metrics, Developer-Relations (Apr-Jun-2016): Check whether mailing list activity per person on korma is in sync with current "mlstats_mailing_lists.conf" - https://phabricator.wikimedia.org/T132907#2213805 (Aklapper)
[10:53:20] Analytics-Tech-community-metrics, Developer-Relations (Apr-Jun-2016): Check whether mailing list activity per person on korma is in sync with current "mlstats_mailing_lists.conf" - https://phabricator.wikimedia.org/T132907#2213805 (Aklapper) p:Triage>Low
[10:54:03] (CR) Zhuyifei1999: "I tested non-ascii characters in the header, but not in the data itself." [analytics/quarry/web] - https://gerrit.wikimedia.org/r/283839 (https://phabricator.wikimedia.org/T117645) (owner: Zhuyifei1999)
[10:55:07] Analytics-Tech-community-metrics, Developer-Relations (Apr-Jun-2016): Check whether mailing list activity per person on korma is in sync with current "mlstats_mailing_lists.conf" - https://phabricator.wikimedia.org/T132907#2213805 (Aklapper)
[10:55:09] Analytics-Tech-community-metrics, Developer-Relations: Who are the top 50 independent contributors and what do they need from the WMF? - https://phabricator.wikimedia.org/T85600#2213831 (Aklapper)
[10:56:29] Analytics-Tech-community-metrics, Developer-Relations (Apr-Jun-2016): Check whether mailing list activity per person on korma is in sync with current "mlstats_mailing_lists.conf" - https://phabricator.wikimedia.org/T132907#2213833 (Aklapper)
[11:10:39] mmm not sure if anybody wrote anything because my irssi went awol
[11:12:50] the only weird thing that I can see now in the EL dashboard is the absence of EventLogging Client Errors by Schema
[11:14:36] (CR) Zhuyifei1999: "I'm wondering, should it be download=true by default? Since it's wikitext, it's mainly for user to copy-paste." [analytics/quarry/web] - https://gerrit.wikimedia.org/r/283839 (https://phabricator.wikimedia.org/T117645) (owner: Zhuyifei1999)
[11:29:42] Analytics-Wikistats: Recent "Edits per country" data not available - https://phabricator.wikimedia.org/T131596#2213893 (Aklapper) I also asked in https://lists.wikimedia.org/pipermail/analytics/2016-April/005114.html
[11:44:03] lunch brb!
[12:00:21] so yeah I can't figure out what https://grafana.wikimedia.org/dashboard/db/eventlogging?panelId=11&fullscreen means
[12:00:41] total client error seems 100% but I can't find any evident issue in the logs
[12:00:49] surely I am missing something
[12:12:17] Hi elukey
[12:12:41] joal: helloooo
[12:13:16] so EL has issues, elukey
[12:13:24] sigh
[12:14:12] After a kafka restart, correct?
[12:14:23] yeah, only one node (kafka1018)
[12:14:46] I restarted EL and the main alerts went away, but the above graph is weird
[12:15:07] what kind of errors are you seeing?
[12:15:50] not seeing anything yet elukey :)
[12:16:20] ahhhhhhh joal sorry! I thought you said "yes issues!"
[12:16:28] :D
[12:16:44] all right the only weird thing is https://grafana.wikimedia.org/dashboard/db/eventlogging?panelId=11&fullscreen
[12:17:28] now that I see eventlogging_consumer-client-side-events-log.log shows a weird coincidence
[12:17:56] last log is 2016-04-18 10:01:49,842 (Thread-23 ) RequestHandler worker: exiting cleanly
[12:18:09] more or less when the broker rebooted
[12:19:53] elukey: I have log lines in /var/log/upstart/eventlogging_processor-client-side-00.log
[12:20:18] elukey: tail tells me 2016-04-18 12:18:38,726
[12:20:49] elukey: with a validation error
[12:21:01] elukey: on that side, looks ok to me
[12:22:31] joal: mmm so the processor is ok but the consumer for some reason is not getting anything?
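The comparison made by hand above (consumer log last written at 2016-04-18 10:01:49,842, processor log still active at 2016-04-18 12:18:38,726) can be sketched as a small check. This is only an illustration: the function names and the 10-minute threshold are assumptions, not part of EventLogging or its monitoring.

```python
from datetime import datetime, timedelta

# Timestamp layout seen in the quoted log lines, e.g. "2016-04-18 10:01:49,842"
LOG_TS_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def parse_log_ts(text):
    """Parse a timestamp in the layout used by the quoted log lines."""
    return datetime.strptime(text, LOG_TS_FORMAT)


def log_gap(last_line_ts, reference_ts):
    """Return how far `last_line_ts` lags behind `reference_ts`."""
    return parse_log_ts(reference_ts) - parse_log_ts(last_line_ts)


def is_stalled(last_line_ts, reference_ts, max_gap=timedelta(minutes=10)):
    """True if a log stopped more than `max_gap` before the reference time.

    The 10-minute default is an arbitrary, hypothetical threshold.
    """
    return log_gap(last_line_ts, reference_ts) > max_gap
```

With the two timestamps from the conversation, `log_gap("2016-04-18 10:01:49,842", "2016-04-18 12:18:38,726")` is a little under 2 hours 17 minutes, which is consistent with the consumer having gone quiet around the broker restart.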
[12:23:20] elukey: I think the problem is on the chart: This "Total client_errors" shows 100% over the last 7 days
[12:28:31] yeah could be, I was worried about the other metrics
[12:36:59] hm, I don't get it elukey
[12:37:38] joal: well NavigationTiming for example is not publishing datapoints anymore
[12:38:08] (maybe I am missing something)
[12:38:09] elukey: no more errors, right
[12:38:35] it seemed pretty consistent during the past days, up to when I rebooted the broker
[12:39:10] and the nice thing is that we'd need to restart all the workers this week PLUS the same thing again next week for a Java update
[12:39:14] elukey: looking at https://grafana.wikimedia.org/dashboard/db/eventlogging Vraw-valid diff, there is something happening when you restart the broker
[12:39:44] elukey: mwarf
[12:40:42] elukey: can't say about the navigation timing errors disappearing ...
[12:43:27] elukey: one weird thing with these charts is that the throughput one shows EventError at about 0 for the last 6 hours, even when there supposedly were errors
[12:44:38] mmm I was checking https://wikitech.wikimedia.org/wiki/Analytics/EventLogging/Oncall#Fix_graphite_counts_if_not_working
[12:45:37] * elukey discovers what hafnium.eqiad.wmnet is
[12:45:42] elukey: hafnium might be the culprit
[12:59:07] elukey: has it worked?
[13:00:22] joal: didn't take any action because I don't get the "start eventlogging/consumer NAME=graphite CONFIG=/etc/eventlogging.d/consumers/graphite"
[13:01:27] elukey: I don't know the syntax ... Is that systemd ?
[13:03:04] I am not sure what that is to be honest :D
[13:03:18] oki, we'll ask ottomata :)
[13:05:41] ahhh maybe systemctl start etc..
[13:05:45] that would make sense
[13:07:56] mmm nah systemctl -a doesn't show anything related
[14:07:37] Analytics-EventLogging, Operations, Patch-For-Review: "Throughput of EventLogging NavigationTiming events" UNKNOWN - https://phabricator.wikimedia.org/T132770#2214065 (fgiunchedi) fwiw looks like it is known now but warning for the last 4h ```Throughput of EventLogging NavigationTiming events WARNIN...
[14:11:56] Analytics-EventLogging, Operations, Patch-For-Review: "Throughput of EventLogging NavigationTiming events" UNKNOWN - https://phabricator.wikimedia.org/T132770#2214066 (elukey) @fgiunchedi: the above warning is it likely related to the issue that I was experiencing this morning with Event Logging afte...
[14:12:08] ottomata: gooooooood morning
[14:12:10] :)
[14:12:18] whenever you have time, EventLogging doesn't like me
[14:12:42] MORNING
[14:12:48] i'm seeing that convo on that ticket, looking too
[14:12:53] what's up?
[14:13:20] was NavTiming just not producing any EL events maybe?
[14:14:00] so this morning I restarted kafka on kafka1018 for a security update as canary
[14:14:06] and EventLogging alarmed
[14:14:29] processor/client-side-04 to be precise (there is also a Burrow email)
[14:14:45] checked the error logs and the problem was socket errors
[14:14:55] so I restarted EL and everything was ok
[14:15:10] EXCEPT https://grafana.wikimedia.org/dashboard/db/eventlogging?panelId=11&fullscreen
[14:15:24] HM
[14:15:25] (no metrics pushed)
[14:15:47] oh
[14:15:49] so
[14:15:53] https://wikitech.wikimedia.org/wiki/Analytics/EventLogging/Oncall#Fix_graphite_counts_if_not_working seemed related
[14:15:58] i think those come from performance el daemons
[14:16:05] el stuff that we don't really manage
[14:16:26] not totally sure where...but probably on hafnium?
[14:17:27] yeah I discovered its existence today :D
[14:17:55] i don't see errors there
[14:18:06] it subscribes to the zmq forwarder from eventlog1001 though
[14:18:54] Apr 18 09:56:41 hafnium python[18520]: File "/usr/lib/python2.7/dist-packages/pykafka/handlers.py", line 55, in get
[14:18:57] Apr 18 09:56:41 hafnium python[18520]: raise self.error
[14:19:00] Apr 18 09:56:41 hafnium python[18520]: SocketDisconnectedError
[14:19:10] this is from statsv status
[14:19:18] ahhhhhhhhhhhhhhhhhh ori told me this once
[14:19:19] the forwarder is working fine
[14:19:20] let me check
[14:19:25] oh!
[14:19:37] hm i was looking at the navtiming daemon
[14:19:57] ah!
[14:20:02] they have an old version of pykafka there!
[14:20:06] elukey: i'm upgrading it...
[14:20:11] gogogo!
[14:20:32] I forgot that he told me once to restart statsv just in case when a broker was restarted
[14:20:43] yeah, hopefully with new version you won't need to though
[14:20:44] iof pykafka
[14:20:46] of pykafka
[14:20:54] that's why we upgraded it for EL if you remember
[14:21:14] just restarted statsv
[14:21:26] \o/
[14:21:55] logs are better
[14:23:15] joal: I knew what to do the whole time but it was buried in my head
[14:24:03] see ottomata comes and solves problem while having breakfast
[14:24:46] haha
[14:24:52] i always forget about those hafnium services :/
[14:24:58] hopefully with this newer version we won't have to deal
[14:27:21] Analytics-EventLogging, Operations, Patch-For-Review: "Throughput of EventLogging NavigationTiming events" UNKNOWN - https://phabricator.wikimedia.org/T132770#2214074 (Ottomata) Ah! So there are some services that use this stuff on hafnium managed by the performance team. I noticed that the pykafka...
[14:28:13] https://grafana.wikimedia.org/dashboard/db/eventlogging?panelId=11&fullscreen --> data coming again!
[14:31:23] cool!
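The failure mode discussed above (a daemon dying with `socket.error: [Errno 111] Connection refused` or `SocketDisconnectedError` when a broker restarts, then needing a manual restart) is commonly avoided by reconnecting with backoff. A minimal Python 3 sketch of that general idea follows; `connect` is a hypothetical callable, and none of this is EventLogging, statsv or pykafka code.

```python
import time


def connect_with_backoff(connect, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call `connect()` until it succeeds, doubling the wait after each failure.

    `connect` is a hypothetical zero-argument callable that opens the broker
    connection; it stands in for whatever the real client does.
    `sleep` is injectable so the waiting can be observed or skipped in tests.
    """
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return connect()
        except ConnectionError:  # covers ConnectionRefusedError (errno 111)
            if attempt == max_attempts:
                raise  # give up so supervision and alerting still see the failure
            sleep(delay)
            delay *= 2
```

Re-raising after the last attempt is a deliberate choice in this sketch: a broker that stays down should still surface as a stopped job rather than be retried silently forever.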
[14:33:33] thanks elukey and ottomata for having solved that :)
[14:34:44] joal: I usually only waste hours trying to solve something and then ottomata fixes them
[14:35:42] elukey: Don't be stupid ! (elukey TM)
[14:38:28] Analytics-Kanban: Puppet on stat1003 keeps failing for git errors - https://phabricator.wikimedia.org/T132445#2214084 (Ottomata) The problem was that the stat’s user’s git config email did not match what was in gerrit, and it did not have forge committer identity rights. This was fixed on Friday.
[14:51:49] added https://wikitech.wikimedia.org/wiki/Analytics/EventLogging/Oncall#Dependent_systems
[14:56:44] Analytics-Kanban: Puppet on stat1003 keeps failing for git errors - https://phabricator.wikimedia.org/T132445#2214132 (elukey) Open>Resolved
[15:02:16] thanks elukey!
[15:04:11] Analytics-Cluster, Analytics-Kanban, Operations, Patch-For-Review: setup stat1004/WMF4721 for hadoop client usage - https://phabricator.wikimedia.org/T131877#2214142 (elukey) Checked the netboot.cfg blame list and the raid was configured on purpose to leverage all the 4 disks: ``` elukey@stat100...
[15:26:20] Analytics: EventLogging suffers for Kafka broker restarts - https://phabricator.wikimedia.org/T132922#2214221 (elukey)
[15:35:05] * elukey afk for a bit!
[15:42:19] Analytics-Kanban: Make reportupdater output emtpy values when query returns no results. - https://phabricator.wikimedia.org/T117537#2214261 (mforns) a:mforns
[15:57:08] Analytics-Kanban, Patch-For-Review: Fix number formatting in charts - https://phabricator.wikimedia.org/T132579#2214308 (Nuria) Open>Resolved
[15:57:23] ottomata: milimetric: hi! quick question... is there a standard way of sharing ipython notebooks, for example, to link from a Phab task or an e-mail?
[15:57:28] Analytics-Kanban, Patch-For-Review: Fix limn-mobile-data mobile-options-last-3-months report after RU changes {lama} - https://phabricator.wikimedia.org/T131849#2214309 (Nuria) Open>Resolved
[15:58:02] Analytics-Kanban: Unique devices endpoint Graphana Dashboard - https://phabricator.wikimedia.org/T132795#2214310 (Nuria)
[15:58:52] haha, AndyRussG you are speaking the language of madhuvishy and yuvipanda
[15:59:12] ottomata: ah hmmm
[15:59:24] K I'll await their arrival then :) thx much!
[15:59:58] Hi! Quick question - is there a database mirror on Labs which we can use instead of hitting the Pageviews API for a little tool we run?
[16:00:03] Analytics: Make metrics-by-project breakdown interactive and bookmarkable - https://phabricator.wikimedia.org/T130255#2214311 (Nuria)
[16:00:05] Analytics-Kanban, Patch-For-Review: Allow filtering of data breakdowns in pageview metric - https://phabricator.wikimedia.org/T131547#2214312 (Nuria)
[16:00:37] trying to enter the batcave...
[16:00:50] same
[16:00:51] a-team: trying to join
[16:00:53] slow...
[16:01:05] mforns: I'd say unresponsive :)
[16:01:09] hehehe
[16:01:18] I'd say screwed
[16:01:18] hangouts, every day a new adventure
[16:01:26] I'm in !
[16:02:07] i think we all need to sign in
[16:02:20] again
[16:02:50] madhuvishy: yt?
[16:06:22] a-team: injecting post-standup - how to backup stat1001 data (hadoop or asking ops for an alternate backup?) - tracking phab: https://phabricator.wikimedia.org/T76348
[16:08:09] elukey: you can't! it's too big! :)
[16:08:25] i *think* you gotta just install and make it not wipe the /a disk :/
[16:08:29] or /srv, whatever it is
[16:08:31] oh stat1001 though.
[16:08:32] hmm
[16:08:35] that might not be so big
[16:22:00] elukey: discussed in standup: ops should have a way to backup that is not hadoop
[16:23:06] joal: thanksssss
[16:23:12] I'll follow up with Daniel on this
[16:23:29] * elukey everybody blames ops
[16:26:17] joal: this is a special case
[16:26:18] heh
[16:26:37] they do have a way to backup, but there are few uses that ask to backup so much data at once
[16:27:08] elukey, ottomata : I supported the idea of maybe, as a one off, doing it on hadoop - nuria_ prefers not to set a precedent and not put ops-level dependency on hadoop
[16:27:38] elukey, ottomata, nuria_ : I'm happy to discuss it more :)
[16:28:09] no need, it was just to get the quickest way (and of course the one supported the most by the team)
[16:30:37] the suggestion to use hadoop is not to use it for regular backups, but for a one off reinstall
[16:31:43] if you guys feel strongly about hadoop do use it but it seems that it adds 1 more piece to a backup process that can fail and wikistats is still the most used piece of data we have
[16:32:15] cc ottomata elukey
[16:34:58] Analytics-Kanban: Describe threat model for sanitized pageview data {mole} - https://phabricator.wikimedia.org/T131158#2214452 (Nuria)
[16:35:37] Analytics-Kanban: Sanitize pageview_hourly - subtasked {mole} - https://phabricator.wikimedia.org/T114675#2214453 (Nuria)
[16:35:49] it's not for regular backups
[16:35:57] i wouldn't even call it a 'backup'
[16:36:06] more like a holding zone during a reinstall
[16:36:24] but, we will have this same problem for stat1002 and stat1003 eventually, and i'm not sure using hadoop there will work, since the data is so much bigger
[16:36:28] Analytics-Kanban: Research and validate assumptions for pageview sanitization with Research Team {mole} - https://phabricator.wikimedia.org/T120640#1857777 (Nuria) Open>Resolved
[16:38:18] Analytics, Analytics-Cluster, Operations, ops-eqiad: Analytics hosts showed high temperature alarms - https://phabricator.wikimedia.org/T132256#2214476 (Nuria)
[16:43:22] Analytics-Kanban: harvest unique devices endpoint request data like we do for pageviewAPI - https://phabricator.wikimedia.org/T132931#2214490 (Nuria)
[16:44:09] ottomata: I am still unclear about how we value the data on stat1001 but if we do, a "regular" backup strategy should be in place anyway in my opinion..
[16:44:33] Analytics-Kanban: Harvest unique devices endpoint request data like we do for pageviewAPI - https://phabricator.wikimedia.org/T132931#2214502 (Nuria)
[16:47:26] Analytics: Break down "Other" a little more? - https://phabricator.wikimedia.org/T131127#2214510 (Nuria)
[16:47:49] Analytics: Break down "Other" a little more? - https://phabricator.wikimedia.org/T131127#2156750 (Nuria) p:Low>Normal
[16:47:53] Analytics-Kanban: Document the new AQS Unique devices endpoint and launch it - https://phabricator.wikimedia.org/T129520#2214512 (JAllemandou)
[16:47:55] Analytics-Kanban: Harvest unique devices endpoint request data like we do for pageviewAPI - https://phabricator.wikimedia.org/T132931#2214513 (JAllemandou)
[16:48:01] elukey: if we could, that would be nice
[16:48:23] when i've looked into this in the past, i've been told that it is too much data
[16:48:30] for our usual backup stuff
[16:48:31] :(
[16:48:52] Hi! Quick question - is there a database mirror on Labs which we can use instead of hitting the Pageviews API for a little tool we run? (Pardon me for asking again)
[16:49:48] nuria_, mforns : hangout logged me out, I'll be back !
[16:51:35] I'll be bach
[16:51:40] joal: ok, try to re-sign
[16:51:49] nuria_: trying, getting troubles
[16:52:05] Niharika: for pageviews? no, there is nothing like that
[16:52:21] nuria_: Okay, thanks!
[16:54:07] joal: it no work?
[16:54:13] Nope :(
[16:54:16] Analytics-Kanban: {pika} Proactive Pageview Definition - https://phabricator.wikimedia.org/T109745#1558031 (Nuria) Phase 1 and 2 are done, Phase 3 differs quite a bit from what we are doing in the near/mid term thus closing this ticket.
[16:54:31] Analytics-Kanban: {pika} Proactive Pageview Definition - https://phabricator.wikimedia.org/T109745#2214573 (Nuria) Open>Resolved
[16:55:22] Back !
[16:55:46] Analytics-Kanban: Unique devices endpoint Graphana Dashboard {bear} - https://phabricator.wikimedia.org/T132795#2214577 (Nuria)
[16:56:01] Analytics-Kanban: Harvest unique devices endpoint request data like we do for pageviewAPI {bear} - https://phabricator.wikimedia.org/T132931#2214578 (Nuria)
[16:56:18] Analytics-Kanban: Document the new AQS Unique devices endpoint and launch it {bear} - https://phabricator.wikimedia.org/T129520#2214579 (Nuria)
[16:56:44] Analytics: Visualize unique devices data in dashiki {bear} - https://phabricator.wikimedia.org/T122533#2214582 (mforns)
[16:56:59] FYI a-team, aqs and druid nodes are currently in the 'pending receiving' queue on the hardware procurement workboard
[16:57:06] as of last thurs
[16:57:10] so hopefully they will arrive soon!
[16:57:14] o/
[16:57:51] thanks ottomata for the heads up !
[16:57:55] Analytics-Kanban: {bull} Unique Tokens - https://phabricator.wikimedia.org/T102221#2214583 (Nuria) Open>Resolved
[17:03:18] joal, what?
[17:03:21] a-team, uhhh meeting about edit data?
[17:03:32] that's what I thought :)
[17:03:36] mforns: --^
[17:03:46] i'm in the meeting, oh should i be in batcave?
[17:03:47] oh, I don't have it in my calendar..
[17:03:54] Arf
[17:04:02] nuria_: batcave for that?
[17:04:13] yes, !
[17:04:31] joining batcave
[17:05:16] didn't have it in my calendar
[17:09:33] Analytics-Kanban: Examine wikistats reports, make a summary of the most granular data needed that would serve all reports - https://phabricator.wikimedia.org/T131783#2214615 (Nuria) Notes on this so far: https://etherpad.wikimedia.org/p/wikistats-edits We are leaning to 1st load data from mysql
[17:27:14] Analytics: Swap hardware on pageview API with new SSD-able machines - https://phabricator.wikimedia.org/T132938#2214688 (Nuria)
[17:27:28] Analytics: Swap hardware on pageview API with new SSD-able machines - https://phabricator.wikimedia.org/T132938#2214700 (Nuria) p:Triage>High
[17:29:27] Analytics-Kanban: Harvest unique devices endpoint request data like we do for pageviewAPI {bear} - https://phabricator.wikimedia.org/T132931#2214704 (JAllemandou) a:JAllemandou
[17:32:30] (PS1) Joal: Update aqs oozie job to harvest more endpoints [analytics/refinery] - https://gerrit.wikimedia.org/r/283999
[17:32:35] nuria_: --^
[17:32:44] yessir
[17:34:10] Analytics-Kanban: Examine wikistats reports, make a summary of the most granular data needed that would serve all reports - https://phabricator.wikimedia.org/T131783#2214726 (JAllemandou) a:JAllemandou
[17:34:20] Analytics: Swap hardware on pageview API with new SSD-able machines - https://phabricator.wikimedia.org/T132938#2214727 (elukey) a:elukey
[17:35:00] ---^ joal assigned it to me, we should grab some time this week if you have time to decide how we do it
[17:35:48] elukey: cool, a lot on my plate currently, will try to find time for this :)
[17:36:26] joal: yep I know, I'll try to study a plan and then get back to you for final validation and moral support :D
[17:36:34] :D
[17:37:03] I'd like also to figure out if we could do something about the sstable compaction but still had no time to look into cassandra :(
[17:39:59] brb!
[17:45:53] Analytics, Operations, WMF-Legal, Privacy: Honor DNT header for access logs & varnish logs - https://phabricator.wikimedia.org/T98831#2214758 (ZhouZ)
[17:46:12] Analytics, Analytics-EventLogging, WMF-Legal, Privacy: Allow opting out from logging some of the default EventLogging fields on a schema-by-schema basis - https://phabricator.wikimedia.org/T108757#2214762 (ZhouZ)
[18:06:47] Analytics-Wikistats: Recent "Edits per country" data not available - https://phabricator.wikimedia.org/T131596#2214933 (Aklapper) ...and received helpful replies [[ https://lists.wikimedia.org/pipermail/analytics/2016-April/005115.html | by Nemo ]] and by [[ https://lists.wikimedia.org/pipermail/analytics/20...
[18:23:29] hey ottomata! I made a few more pokes to https://gerrit.wikimedia.org/r/#/c/269467/ and it would be great to get a few high level comments on it again!
[18:24:35] hmm, addshore think you still have to upload patch :)
[18:24:42] ottomata: have you seen mobrovac comments in ops chan ?
[18:24:50] no looking
[18:25:00] ottomata: it's a draft, you may need to log in ;)
[18:25:03] ottomata: timeouts were related to aqs 500
[18:25:16] hello
[18:25:19] ah ok
[18:25:23] ottomata: actually root cause is waiting for read, more than compaction
[18:25:34] just to clarify :)
[18:25:44] addshore: log in? i'm logged in
[18:25:52] https://gerrit.wikimedia.org/r/#/c/269467/
[18:25:57] did you see my comments this morning?
[18:25:59] :)
[18:26:06] joal: ok
[18:26:11] things are ok though?
[18:26:13] oh ottomata ! hah!
[18:26:21] I totally missed the new comments! :D
[18:26:38] ottomata: not really, would be better not to timeout, but nothing we can do now
[18:27:14] a-team, I'm off for tonight, will see you tomorrow
[18:27:20] ok
[18:27:23] laters joal!
[18:43:39] Analytics: Swap AQS nodes with new SSD-able machines - https://phabricator.wikimedia.org/T132938#2215075 (Ottomata)
[18:43:51] Analytics: Swap AQS nodes with new SSD-able machines - https://phabricator.wikimedia.org/T132938#2214688 (Ottomata)
[18:43:53] Analytics, Operations, hardware-requests, Patch-For-Review: eqiad: (3) AQS replacement nodes - https://phabricator.wikimedia.org/T124947#2215079 (Ottomata)
[18:46:54] (PS1) Catrope: Add frwikisource to flow_beta wiki list [analytics/limn-language-data] - https://gerrit.wikimedia.org/r/284021 (https://phabricator.wikimedia.org/T132914)
[18:56:08] (CR) Catrope: [C: 2] Add frwikisource to flow_beta wiki list [analytics/limn-language-data] - https://gerrit.wikimedia.org/r/284021 (https://phabricator.wikimedia.org/T132914) (owner: Catrope)
[18:56:36] (Merged) jenkins-bot: Add frwikisource to flow_beta wiki list [analytics/limn-language-data] - https://gerrit.wikimedia.org/r/284021 (https://phabricator.wikimedia.org/T132914) (owner: Catrope)
[19:06:17] Analytics-EventLogging, Operations, Patch-For-Review: "Throughput of EventLogging NavigationTiming events" UNKNOWN - https://phabricator.wikimedia.org/T132770#2215217 (Krinkle) @ottomata It looks to be back up and working fine, but the restart appears to have a change in traffic.
Analytics: Swap AQS nodes with new SSD-able machines - https://phabricator.wikimedia.org/T132938#2215242 (JAllemandou) See @Eevans comments on that task: https://phabricator.wikimedia.org/T124314 There is more to this swap than just hardware replacement :)
[19:51:51] As another data point, the post-restart throughput of (?) EventLogging seems very high for the VE data too: https://grafana.wikimedia.org/dashboard/db/visualeditor-load-save not great
[19:53:13] HaeB: yt?
[19:53:24] hi
[19:53:46] HaeB: do you know of these reports: http://www.pewinternet.org/three-technology-revolutions/
[19:53:58] HaeB: sorry, a better url
[19:54:30] HaeB: http://www.pewinternet.org/datasets/
[19:54:45] HaeB: they seem to have data about devices per user for example
[19:54:58] or users to device
[19:55:20] no, looks interesting
[19:55:26] HaeB: http://www.pewinternet.org/datasets/april-2012-cell-phones/
[19:55:47] IIRC the previous mobile team(s) spent quite a bit of time looking at such general industry data in earlier years
[19:55:56] e.g. mary meeker's annual reports
[19:56:08] HaeB: well, if we have those numbers available then nm
[19:57:18] nah, i'm not sure they looked at devices per user. so if you're researching that, the pew data might still be interesting
[19:58:45] nuria_, mforns_gym : If you have a minute, can you have a look at https://wikitech.wikimedia.org/wiki/Analytics/Unique_Devices?
[19:59:06] joal: looking
[19:59:13] Analytics-Kanban: Document the new AQS Unique devices endpoint and launch it {bear} - https://phabricator.wikimedia.org/T129520#2215380 (JAllemandou) a:JAllemandou
[20:00:18] nuria_: but Pew is usually US only, right?
[20:00:35] HaeB: yes, seems that way.
[20:01:55] joal: looks good, made 1 small edit
[20:02:18] cool nuria_ :)
[20:02:45] joal: that went real fast seems to me
[20:02:56] nuria_: ?
[20:03:14] joal: as in "rolling out that end point was real fast"
[20:03:22] nuria_: I agree !!!
[20:03:43] nuria_: I feel we have a "pipeline" for data presentation now :)
[20:03:51] joal: great!
[20:03:56] nuria_: And I had to fight a bit with cassandra :)
[20:04:06] joal: a lot, not a bit
[20:04:19] nuria_: For endpoint launching, email to engineering ?
[20:04:47] joal: we have to e-mail wikitech-l, analytics and wiki-research-l at least
[20:05:30] nuria_: sounds good
[20:05:34] joal: but let it bake today, we can do it tomorrow
[20:05:41] sure !
[20:06:05] I still wanted to have that doc done, I had been postponing it for days :)
[20:06:12] joal: thank you
[20:06:15] Anyway, thanks for review nuria_
[20:07:27] nuria_, ottomata: if you have time for a 1 liner review before end of your day, that'd be awesome (merging/deploying tomorrow): https://gerrit.wikimedia.org/r/#/c/283999/
[20:07:31] Thanks again :)
[20:07:35] bye folks! Talk with you tomorrow! o/
[20:07:35] good night a-team
[20:07:41] joal: will CR
[20:08:19] (CR) Nuria: [C: 2 V: 2] Update aqs oozie job to harvest more endpoints [analytics/refinery] - https://gerrit.wikimedia.org/r/283999 (owner: Joal)
[20:10:01] Thx nuria_ !
[20:31:13] Analytics, Wikipedia-Android-App-Backlog: Investigate recent decline in views and daily users - https://phabricator.wikimedia.org/T132965#2215521 (Tbayer)
[20:34:17] Analytics, Wikipedia-Android-App-Backlog: Investigate recent decline in views and daily users - https://phabricator.wikimedia.org/T132965#2215557 (Tbayer) And it does not seem to be related to the Google bug around March 11, which caused about 600k new installs, many of which we lost again in subs...
[21:04:45] Analytics-EventLogging, Operations, Patch-For-Review: "Throughput of EventLogging NavigationTiming events" UNKNOWN - https://phabricator.wikimedia.org/T132770#2215675 (Ottomata) Talked a bit with Timo in IRC. The new version of pykafka change the default value of `auto_offset_reset` to from latest t...
[21:09:37] Analytics-EventLogging, Operations, Performance-Team, Patch-For-Review: "Throughput of EventLogging NavigationTiming events" UNKNOWN - https://phabricator.wikimedia.org/T132770#2215696 (Krinkle)
[21:15:04] Analytics, Hovercards, Unplanned-Sprint-Work, Reading-Web-Sprint-70-Lady-and-the-Trumps: Capture hovercards fetches as previews in analytics - https://phabricator.wikimedia.org/T129425#2215726 (dr0ptp4kt) Confirmed on latest stable Firefox, Chrome, Opera, and Safari desktop UAs client side. Conf...
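The `auto_offset_reset` comment above is truncated, so the log does not say which value became the default. As general background only, the toy model below shows what an "earliest" versus "latest" reset policy usually means for where a Kafka-style consumer starts reading. It is a hypothetical sketch with made-up names and numbers, not pykafka's implementation.

```python
def starting_offset(committed, earliest, latest, auto_offset_reset):
    """Pick the offset a consumer starts from on one partition (toy model).

    committed: last committed offset, or None if there is none
    earliest:  oldest offset still retained by the broker
    latest:    offset the next produced message will get
    auto_offset_reset: "earliest" or "latest"; only consulted when the
        committed offset is missing or outside the retained range
    """
    if committed is not None and earliest <= committed <= latest:
        return committed
    if auto_offset_reset == "earliest":
        return earliest
    if auto_offset_reset == "latest":
        return latest
    raise ValueError("auto_offset_reset must be 'earliest' or 'latest'")


def backlog(committed, earliest, latest, auto_offset_reset):
    """Number of already-produced messages the consumer will read first."""
    return latest - starting_offset(committed, earliest, latest, auto_offset_reset)
```

In this model a consumer with no committed offset reads nothing old under "latest" but replays the whole retained range under "earliest". Whether that has anything to do with the change in traffic Krinkle noticed after the restart is not established by the log.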
[21:17:35] Analytics, Hovercards, Unplanned-Sprint-Work, Reading-Web-Sprint-70-Lady-and-the-Trumps: Capture hovercards fetches as previews in analytics - https://phabricator.wikimedia.org/T129425#2215740 (dr0ptp4kt) The next step should be to verify in Hive once this starts rolling on the train. With the sc...
[21:18:36] dr0ptp4kt: who you HELLOin?!
[21:23:17] (CR) Alex Monk: "ok..." [analytics/quarry/web] - https://gerrit.wikimedia.org/r/266925 (https://phabricator.wikimedia.org/T76466) (owner: Alex Monk)
[22:16:49] Analytics, Commons, Multimedia, Wikidata, and 3 others: Allow tabular datasets on Commons (or some similar central repository) (CSV, TSV, JSON, XML) - https://phabricator.wikimedia.org/T120452#2215951 (Legoktm)
[23:27:03] hey, is there a way in Hive to tell whether a request is HTTP or HTTPS?