[00:03:22] Analytics-Kanban, Learning-and-Evaluation: Add instruction text next to the input fields in the Program Global Metrics Report {kudu} - https://phabricator.wikimedia.org/T121899#1988237 (Milimetric) @Abit - I thought the message next to the dates was redundant because those date controls don't allow you to... [00:04:08] (PS1) Milimetric: Add explanatory text to the program-metrics form [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/267806 (https://phabricator.wikimedia.org/T121899) [00:04:13] milimetric: joal: ping? [00:04:19] hi mobrovac [00:04:25] joseph's gone for the night, what's up [00:04:33] ah right, makes sense [00:04:50] milimetric: i see quite some 500 Error in Cassandra table storage backend coming form AQS [00:04:53] in the rb logs [00:05:07] uri is /analytics.wikimedia.org/v1/pageviews/per-article/ro.wikipedia/all-access/user/Thales_din_Milet/daily/2015080100/2016013100 [00:05:10] e.g. [00:07:23] mobrovac: interesting: https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/ro.wikipedia/all-access/all-agents/Thales_din_Milet/daily/2015080100/2016013100 [00:07:23] (CR) Nuria: [C: 2 V: 2] "Tested that it works and it does. Seems an awful long dropdown but ..." [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/267790 (https://phabricator.wikimedia.org/T121167) (owner: Milimetric) [00:07:23] milimetric: hm indeed [00:07:23] milimetric: i'll monitor the logs a while longer and see if i see a pattern [00:07:39] from what i can see usage hasn't increased... hm [00:11:19] hm, there were 4k such errors in the last hour [00:11:19] Analytics-Kanban: Expose the results of the global metric at a public link, that's available immediately for the API {kudu} [8 pts] - https://phabricator.wikimedia.org/T118310#1988256 (Milimetric) @Abit - I'm just making sure about something. I think this task is the last unfinished thing that you needed us... 
[00:11:19] Analytics-Kanban: Expose the results of the global metric at a public link, that's available immediately for the API {kudu} [8 pts] - https://phabricator.wikimedia.org/T118310#1988270 (Milimetric) a:Milimetric [00:11:19] mobrovac: i see, one per second doesn't move the graphs but it seems significant for sure [00:11:19] what graph are you looking at, btw? [00:11:19] I was looking at https://grafana.wikimedia.org/dashboard/db/pageviews [00:11:19] milimetric: logstash @ https://logstash.wikimedia.org/#/dashboard/elasticsearch/restbase [00:14:37] mobrovac: I think this makes sense, and it's most likely due to loading: https://hue.wikimedia.org/oozie/list_oozie_coordinators/ [00:14:37] oh ok [00:14:37] 5 jobs are running, the end of the month reports [00:14:37] milimetric: yup, you are right, makes sense [00:14:37] ah yes, it's inventory day :) [00:14:37] :) yes [00:14:37] so the plan is to get two more boxes that replicate to the main cluster, and load through those boxes but not query them [00:14:37] also we'll get SSDs - we're submitting the hardware requests very soon [00:14:38] SSDs should help a lot [00:14:38] but we'll have to limp along until that happens [00:14:38] that's a good idea [00:14:49] gwicke: I wanted to ask about that, you looked at the samsung 1TB ones right, did you see the SanDisk Extreme Pro ones? [00:15:10] 10 year warranty and all: http://www.amazon.com/SanDisk-Extreme-2-5-Inch-Warranty-SDSSDXPS-960G-G25/dp/B00KHRYR0U [00:15:15] seem pretty fast, and a bit cheaper [00:15:26] ottomata: running anything important on analytics1021? [00:15:57] milimetric: I tried to use cheaper SSDs once & we got a bad batch; since then ops has not been keen to use anything but those samsungs [00:16:05] even those took a lot of convincing [00:16:21] ottomata: nvm :) [00:17:01] hm... my personal experience is the opposite. 
I've had trouble with Samsung SSDs, never with SanDisk [00:17:09] milimetric: you can give it a try -- I think technically those should be fine [00:17:42] one consideration is write rate [00:17:43] k, I'm running them in my laptop, seem pretty great to me :) but if ops don't like'm I'm not arguing [00:17:49] and endurance [00:18:22] endurance is the big one to me, most of these are bumping up against the SATA 3 limit anyway [00:18:26] http://www.anandtech.com/show/8170/sandisk-extreme-pro-240gb-480gb-960gb-review says they are rated at ~22G writes per day [00:18:35] for 10 years [00:19:15] the warranty is meaningless for us, as we'll likely write more than that [00:20:48] but, large disks in a raid make for low per-disk write rates [00:21:48] interesting, I get some weird results when I look for long endurance, like the Corsair SSDs [00:22:32] large SSDs have much higher durability just by having more flash cells to burn [00:23:09] so if sandisk says 80T for the 240G model, then it should be better for 1T [00:23:42] the more expensive disks with higher endurance typically have more extra capacity [00:24:13] all that said, I think you'll be fine with either of those options [00:24:55] http://www.anandtech.com/show/8216/samsung-ssd-850-pro-128gb-256gb-1tb-review-enter-the-3d-era says 150T endurance for the Samsung, so a bit higher than 80 [00:27:04] cool, that works for me. Interesting reads [00:27:19] avg write rate over the last 30 days was 2.3 mb/s [00:27:34] https://grafana.wikimedia.org/dashboard/db/aqs-cassandra-system?panelId=8&fullscreen [00:28:36] (CR) Nuria: [C: 2] "There is an additional comment in ticket about adding some text for dates but I think those are self-explanatory." [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/267806 (https://phabricator.wikimedia.org/T121899) (owner: Milimetric) [00:29:45] (CR) Milimetric: "yeah, I commented the same about the date fields on that task. 
The format of the date picker is forced, so any explanation of it would ju" [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/267806 (https://phabricator.wikimedia.org/T121899) (owner: Milimetric) [00:29:58] nite all o/ [00:36:22] Analytics-Kanban, MediaWiki-General-or-Unknown, Beta-Cluster-reproducible: Regression: action=info pages broken - https://phabricator.wikimedia.org/T125432#1988454 (Nuria) [00:37:03] Analytics, Analytics-Kanban, Wikipedia-Android-App: Database not updated for beta event logging and all-events.log reports 8x for each event - https://phabricator.wikimedia.org/T125423#1988456 (Nuria) [00:42:40] Analytics, Analytics-Kanban, Wikipedia-Android-App: Database not updated for beta event logging and all-events.log reports 8x for each event - https://phabricator.wikimedia.org/T125423#1988474 (Nuria) Rebooted instance, restarted Eventlogging and restarted mysql [04:10:58] Analytics, Analytics-Kanban, Wikipedia-Android-App: Database not updated for beta event logging and all-events.log reports 8x for each event - https://phabricator.wikimedia.org/T125423#1988849 (Nuria) Other tables are inserting events correctly so I think things are back to work: MariaDB [log]> select... [05:10:36] Analytics: Create directory for output of discovery analytics data in HDFS - https://phabricator.wikimedia.org/T125488#1988866 (EBernhardson) NEW [05:11:23] Analytics: Create directory for output of discovery analytics data in HDFS - https://phabricator.wikimedia.org/T125488#1988873 (EBernhardson) [05:16:57] Analytics: Create directory for output of discovery analytics data in HDFS - https://phabricator.wikimedia.org/T125488#1988875 (EBernhardson) [07:08:35] Analytics-Kanban, Editing-Analysis: Edit schema needs purging, table is too big for queries to run (500G before conversion) {oryx} [8 pts] - https://phabricator.wikimedia.org/T124676#1988933 (jcrespo) We can delete in batches, using pt-archiver every nth row. For example, using the timestamp "seconds" col... 
[07:38:46] Analytics: Make AQS return 0 instead of no values {slug} - https://phabricator.wikimedia.org/T118402#1988997 (Nemo_bis) The 404 comes from T102725. [07:39:54] Analytics: Inspect Pageview API queries (after launch ) {slug} - https://phabricator.wikimedia.org/T117242#1989000 (Nemo_bis) [07:40:58] Analytics-Kanban, Datasets-General-or-Unknown: Remove all-access and spider from top endpoint {slug} - https://phabricator.wikimedia.org/T121300#1989002 (Nemo_bis) [07:41:00] Analytics-Kanban, Datasets-General-or-Unknown: Gather preliminary metrics of Pageview API usage for quaterly review {slug} [5pts] - https://phabricator.wikimedia.org/T120845#1989003 (Nemo_bis) [07:41:02] Analytics-Kanban, Datasets-General-or-Unknown: Backfill cassandra pageview data - August [5 pts] {slug} - https://phabricator.wikimedia.org/T118845#1989005 (Nemo_bis) [07:41:04] Analytics-Kanban, Datasets-General-or-Unknown: Missing Pageview API data for one article {slug} [3 pts] - https://phabricator.wikimedia.org/T118785#1989004 (Nemo_bis) [07:41:06] Analytics-Kanban, Datasets-General-or-Unknown: Investigate cassandra daily top job [5 pts] {slug} - https://phabricator.wikimedia.org/T118449#1989007 (Nemo_bis) [07:41:08] Analytics-Kanban, Datasets-General-or-Unknown: Backfill cassandra pageview data - September [5 pts] {slug} - https://phabricator.wikimedia.org/T118450#1989006 (Nemo_bis) [07:41:10] Analytics-Kanban, Datasets-General-or-Unknown: Update Cassandra loading job - per-project [5 pts] {slug} - https://phabricator.wikimedia.org/T118447#1989009 (Nemo_bis) [07:41:12] Analytics-Kanban, Datasets-General-or-Unknown: Update cassandra monthly top job [3 pts] {slug} - https://phabricator.wikimedia.org/T118448#1989008 (Nemo_bis) [07:41:14] Analytics-Kanban, Datasets-General-or-Unknown: AQS should expect article names uriencoded just once {slug} - https://phabricator.wikimedia.org/T118403#1989010 (Nemo_bis) [07:41:16] Analytics, Datasets-General-or-Unknown: Inspect Pageview API queries (after 
launch ) {slug} - https://phabricator.wikimedia.org/T117242#1989012 (Nemo_bis) [07:41:18] Analytics, Datasets-General-or-Unknown: Make AQS return 0 instead of no values {slug} - https://phabricator.wikimedia.org/T118402#1989011 (Nemo_bis) [07:41:20] Analytics-Kanban, Datasets-General-or-Unknown: Pageview API documentation for end users {slug} [8 pts] - https://phabricator.wikimedia.org/T117226#1989013 (Nemo_bis) [07:41:22] Analytics-Backlog, Analytics-Kanban, Datasets-General-or-Unknown: Testing druid data loading/retrieval on labs {slug} - https://phabricator.wikimedia.org/T116764#1989014 (Nemo_bis) [07:41:24] Analytics-Kanban, Datasets-General-or-Unknown: Reformat pageview API responses to allow for status reports and messages {slug} - https://phabricator.wikimedia.org/T117017#1989015 (Nemo_bis) [07:41:26] Analytics-Backlog, Analytics-Kanban, Datasets-General-or-Unknown: Test Elastic search pageview data loading/retrieval on labs {slug} [8 pts] - https://phabricator.wikimedia.org/T116763#1989017 (Nemo_bis) [07:41:28] Analytics-Kanban, Datasets-General-or-Unknown: Druid testing on labs to asses whether is a suitable Cassandra replacement. 
{slug} [8 pts] - https://phabricator.wikimedia.org/T116409#1989018 (Nemo_bis) [07:41:30] Analytics-Kanban, Datasets-General-or-Unknown, Patch-For-Review: Remove loading of hourly data from Cassandra loading scripts and hql [5 pts] {slug} - https://phabricator.wikimedia.org/T116408#1989020 (Nemo_bis) [07:41:32] Analytics-Kanban, Datasets-General-or-Unknown: Document Cassandra SLAS and storage requirements for daily and hourly data {slug} [5 pts] - https://phabricator.wikimedia.org/T116407#1989021 (Nemo_bis) [07:41:34] Analytics-Kanban, Datasets-General-or-Unknown: Improve record size on cassandra storage for pageview API data (RESTBase changes) {slug} [8 pts] - https://phabricator.wikimedia.org/T116209#1989022 (Nemo_bis) [07:41:36] Analytics-Kanban, Datasets-General-or-Unknown: cassandra backfill monitoring [0 pts] {slug} - https://phabricator.wikimedia.org/T115360#1989024 (Nemo_bis) [07:41:38] Analytics, Datasets-General-or-Unknown: optimize Analytics Query Service {slug} - https://phabricator.wikimedia.org/T115361#1989023 (Nemo_bis) [07:41:40] Analytics-Kanban, Datasets-General-or-Unknown: run job using oozie {slug} [13 pts] - https://phabricator.wikimedia.org/T115355#1989026 (Nemo_bis) [07:41:42] Analytics-Kanban, Datasets-General-or-Unknown: special character stripping on cassandra loading (tabs) {slug} [5 pts] - https://phabricator.wikimedia.org/T115356#1989025 (Nemo_bis) [07:41:44] Analytics-Kanban, Datasets-General-or-Unknown: improve timeuuid writing {slug} [5 pts] - https://phabricator.wikimedia.org/T115353#1989027 (Nemo_bis) [07:41:46] Analytics-Kanban, Datasets-General-or-Unknown: Improve loading Analytics Query Service with data {slug} [5 pts] - https://phabricator.wikimedia.org/T115351#1989028 (Nemo_bis) [07:41:48] Analytics-Kanban, Datasets-General-or-Unknown, RESTBase, Services, RESTBase-API: configure RESTBase pageview proxy to Analytics' cluster {slug} [34 pts] - https://phabricator.wikimedia.org/T114830#1989029 (Nemo_bis) [07:41:50] 
Analytics-Kanban, Datasets-General-or-Unknown: Deploy the Analytics RESTBase {slug} [13 pts] - https://phabricator.wikimedia.org/T113991#1989030 (Nemo_bis) [07:41:52] Analytics-Kanban, Datasets-General-or-Unknown, Patch-For-Review: Create Hadoop Job to load data into cassandra [34 pts] {slug} - https://phabricator.wikimedia.org/T108174#1989032 (Nemo_bis) [07:41:54] Analytics-Kanban, Datasets-General-or-Unknown, RESTBase-API: create third RESTBase endpoint [8 pts] {slug} - https://phabricator.wikimedia.org/T107055#1989034 (Nemo_bis) [07:41:57] Analytics-Kanban, Datasets-General-or-Unknown, netops, operations, Patch-For-Review: Puppetize a server with a role that sets up Cassandra on Analytics machines [13 pts] {slug} - https://phabricator.wikimedia.org/T107056#1989033 (Nemo_bis) [07:42:01] Analytics-Kanban, Datasets-General-or-Unknown: POC RestBase with cassandra in labs on test data [8 pts] {slug} - https://phabricator.wikimedia.org/T106821#1989037 (Nemo_bis) [07:42:03] Analytics-Kanban, Datasets-General-or-Unknown, RESTBase-API: create second RESTBase endpoint [8 pts] {slug} - https://phabricator.wikimedia.org/T107054#1989035 (Nemo_bis) [07:42:06] Analytics-Kanban, Datasets-General-or-Unknown, RESTBase-API: create RESTBase endpoints [34 pts] {slug} - https://phabricator.wikimedia.org/T107053#1989036 (Nemo_bis) [07:42:09] Analytics-Cluster, Analytics-Kanban, Datasets-General-or-Unknown: Test Cassandra as a storage strategy {slug} [5 pts] - https://phabricator.wikimedia.org/T101786#1989040 (Nemo_bis) [07:42:11] Analytics-Cluster, Analytics-Kanban, Datasets-General-or-Unknown: Generate test data for Pageview API {slug} [5 pts] - https://phabricator.wikimedia.org/T101785#1989041 (Nemo_bis) [07:42:13] Analytics-Cluster, Analytics-Kanban, Datasets-General-or-Unknown: {slug} Pageview API - https://phabricator.wikimedia.org/T101792#1989039 (Nemo_bis) [07:45:10] Nemo_bis: moonlighting as a scrum master? 
[09:41:08] hi a-team :] [09:41:15] 'morning mforns :) [09:41:20] How is the family ? [09:41:57] hey joal, my daughter is better, almost no fever tonight, but my wife had almost 40 :/ [09:42:12] Marf :\ [09:42:35] and you 3? [09:42:44] Everything ok :) [09:43:00] :] [09:43:01] Lino has been sick this weekend as well, but now ok [10:30:28] ori: hm? just some bug triaging as usual [11:57:55] Analytics-Kanban, Patch-For-Review: Reorganize oozie jobs to not use mobile cache webrequest_source {hawk} [13 pts] - https://phabricator.wikimedia.org/T122651#1989436 (BBlack) [11:58:10] Analytics-Kanban, Patch-For-Review: Reorganize oozie jobs to not use mobile cache webrequest_source {hawk} [13 pts] - https://phabricator.wikimedia.org/T122651#1909893 (BBlack) [12:39:45] ahhh, for future reference ns is added to the X-analytics header in the WikimediaEvents extension [13:34:09] Analytics-Kanban, Patch-For-Review: Reorganize oozie jobs to not use mobile cache webrequest_source {hawk} [13 pts] - https://phabricator.wikimedia.org/T122651#1989662 (BBlack) Just to be clear: no real user traffic is flowing in webrequest_mobile, but there's still some internal healthcheck traffic keepin... [13:44:33] Analytics-Kanban, Patch-For-Review: Reorganize oozie jobs to not use mobile cache webrequest_source {hawk} [13 pts] - https://phabricator.wikimedia.org/T122651#1989683 (JAllemandou) Thanks for the clarification @bblack, we plan on working on moving our oozie stuff starting today with @ottomata. [14:49:25] morning a-team! :) [14:52:01] joal: i'm drinking some coffeeeeee, checking emails, wanna start on mobile move stuff in 10 mins? [14:57:22] Analytics-General-or-Unknown, The-Wikipedia-Library: Category based-pageview collection for non-Article space, via Treeviews or similar - https://phabricator.wikimedia.org/T112157#1989836 (Sadads) Hey @Magnus, I attemped again with a pagepile, and it seems to be throwing a similar error. Have you looked... 
[15:02:57] joal, i'm going to go ahead and unrevert/merge some stuff... [15:03:13] lets seeee can I revert a revert?! [15:10:43] Analytics-Kanban, Editing-Analysis: Edit schema needs purging, table is too big for queries to run (500G before conversion) {oryx} [8 pts] - https://phabricator.wikimedia.org/T124676#1989883 (Krenair) I think you'd want to delete a percentage of distinct editing sessions rather than rows. [15:18:06] joal, whenever you come around: [15:18:06] https://gerrit.wikimedia.org/r/#/c/267889/ [15:18:06] https://gerrit.wikimedia.org/r/#/c/267891/ [15:18:10] the answer is yes! you can revert a revert [15:28:03] awesome ottomata :) [15:28:07] reviewing ! [15:28:42] ottomata: we need to be carefull --> some stuff have been merged in the mean time (like mforns patvh) [15:29:29] joal, yes! and I'd like to deploy it [15:29:42] I can imagine [15:29:52] joal, ja i got a conflict on each of those changes [15:29:53] and fixed [15:30:01] ok great ottomata [15:30:03] the only code change was in app session metric, ja? [15:30:09] i kept the code from latest [15:30:17] a change to some filter, ja? [15:30:24] mforns: ^ [15:31:11] ottomata, I think in the end the code you added for the mobile to text migration was kept, because madhu said it wouldn't harm the query [15:31:32] One thing mforns and ottomata : I think we need to add a filter for mobile only events, no? [15:31:37] IIRC, it was just the access_method = 'mobile app'? no? [15:32:03] mforns: My bad, it's already there :) [15:32:05] ja, should be [15:32:06] ja cool [15:32:11] great [15:32:21] apparently we can't deploy right now tho...:) [15:32:22] ottomata, was there any other change from your part in that file? [15:32:31] just a sec [15:32:41] conflict? no just the stuff starting on line 251 [15:32:44] in pathListToDataframe [15:32:49] i think i kept what you had [15:32:57] .filter ... .selectExpr [15:32:57] mforns, ottomata: I go and merge this patch ? 
[15:33:15] if mforns says ja, then yes, let's merge and push to archiva, and then update refinery jars [15:33:21] one sec [15:33:29] sure mforns [15:33:34] we can't use git-deploy atm, but I can just do a manualy deploy via ssh/rsync :p [15:33:39] sure ottomata [15:33:49] can't git-deploy ? WTF ? [15:34:08] :) [15:35:07] _joe_ is working on cross DC deployment stuff, and something is broken [15:35:09] tin has been reimaged [15:35:12] but something isn't working right [15:35:32] k ottomata [15:36:12] also ottomata, the only jobs I have not rerun last weekend is the appSession one --> data was already in transition [15:36:18] joal, how are you going to do the re-revert? [15:36:22] so that mean we'll have to move data and recompute :) [15:36:34] mforns: ottomata did for most of it already :) [15:37:27] ottomata, did you revert on top of HEAD, or you just went back? [15:37:58] mforns: i reverted, which made a patch on top of the orignal patch [15:38:00] then [15:38:02] i rebased [15:38:04] and got conflicts [15:38:08] which i resolved as i said above [15:38:19] ok, ok [15:38:40] sure, I had misunderstood you guys [15:39:09] joal, can I watch while you deploy? [15:39:30] sure mforns :) [15:39:34] ok [15:39:54] mforns: but mostly andrew will do (since it's funky time on tin) [15:40:01] ah [15:40:04] ottomata: merging ref-source ? [15:41:24] source go right ahead [15:41:31] joal: you doing the jar dance? i can if you like [15:41:40] ottomata: as you prefer [15:42:27] ottomata: merged [15:43:47] hehe, ok i'm on it [15:43:50] i haven't done the jar dance in a while [15:43:58] joal: merge refinery too? [15:44:22] ottomata: Let's wait for new jars first ? [15:44:25] joal: is this version 0.0.26 [15:44:26] ok [15:44:35] it is ottomata [15:44:41] v0,0,25 nerver got used [15:44:42] k making changelog entry [15:45:46] ottomata: got a conflict on my master in ref-source [15:45:47] :( [15:45:50] oh me too [15:45:51] ! 
[15:46:00] hUhhH [15:46:16] going to cehckout fresh origin master and make sure its what i want [15:47:03] joal: its ok, origin/master looks good [15:47:53] ottomata: you sure ? [15:48:10] yes [15:48:20] origin master has that it as text [15:48:23] has it* [15:48:34] HEAD is correct and what origin/master has [15:48:39] val webrequestMobilePath = params.webrequestBasePath + "/webrequest_source=text" [15:50:33] heh, joal are we going to be forever on v0.0 ? :p [15:50:55] ottomata: maybe one day we'll decide to move to 0.1 ;) [15:51:02] https://gerrit.wikimedia.org/r/#/c/267898/ [15:51:08] mforns: https://gerrit.wikimedia.org/r/#/c/267898/1/changelog.md [15:51:15] is my summary there for your change correct? [15:51:19] i just stole your git commit heading [15:51:24] Maybe the day :) [15:51:30] naw not the day [15:51:32] mayyybe hm [15:51:45] maybe if/when we expand webrequest_source partitoin [15:51:46] s [15:51:48] to have lots more [15:51:49] ottomata, thx!! [15:51:51] when we do perf improvement and use streaming like stuff ? :) [15:51:51] and refactor lots of crap [15:51:54] sure! [15:51:56] whenver! [15:51:57] hehe [15:52:01] :) [15:52:50] merged ottomata [15:53:07] ottomata: Why don't we have message on tasks merged anymore ? [15:53:10] ;( [15:53:25] dunno! [15:53:39] starting jar dance [15:53:45] watch me dance! [15:54:27] ottomata: https://www.youtube.com/watch?v=GU6psn30Pyk [15:54:41] very famous french song :) [15:56:21] that is exactly what i'm doing right now! [16:05:25] ottomata: https://archiva.wikimedia.org/#artifact-details-download-content/org.wikimedia.analytics.refinery.core/refinery-core/0.0.26 [16:05:28] Yay ! [16:05:41] Shall we go and merge that into refinery ? [16:06:05] Analytics: wmit-* account creation campaigns totals - https://phabricator.wikimedia.org/T123059#1990049 (Milimetric) >>! In T123059#1981068, @FedericoLeva-WMIT wrote: >>>! In T123059#1978846, @Milimetric wrote: >> We don't keep the raw data that far back. 
If you'd like to do this sort of query, you'd have t... [16:06:53] joal: still uploading stuff [16:06:57] doing refinery hive now [16:06:59] k ottomata [16:08:36] Analytics, Analytics-Kanban, Wikipedia-Android-App: Database not updated for beta event logging and all-events.log reports 8x for each event - https://phabricator.wikimedia.org/T125423#1990053 (Niedzielski) Thanks for restarting the instances! Hm, I think I'm still seeing the same issue on my end :( Th... [16:09:01] Analytics-Kanban, Research-and-Data: Research Spike: How do redirects affect pageviews [8 pts] {hawk} - https://phabricator.wikimedia.org/T108867#1990054 (Milimetric) I'll jump in without being addressed. I think the basic consensus is that the problem of redirects is too deeply woven into mediawiki for a... [16:09:10] Analytics: wmit-* account creation campaigns totals - https://phabricator.wikimedia.org/T123059#1990056 (Ironholds) [16:10:07] joal: I didn't follow your morning chats, but did you see Marko's ping yesterday about 4000 errors in AQS within 1 hour? [16:10:26] I have seen that, nothing else today [16:10:27] I looked and they seemed to be timeouts, I guessed due to the monthly load jobs being kicked off since it was Feb. 1st [16:10:38] the disk usage seemed to agree with me: https://grafana.wikimedia.org/dashboard/db/aqs-cassandra-system?panelId=8&fullscreen [16:10:47] Seemed indeed related to load, therefore [16:11:10] yeah, let that bounce around in the back of your head and let me know if it collides with disagreement :) [16:11:33] milimetric: My guess is that at load time, cassandra response time peeks, timeouting restabase [16:11:47] yep [16:13:46] oh joal [16:13:47] wink wink [16:13:47] https://gerrit.wikimedia.org/r/#/c/267167/ [16:13:49] milimetric: there is not really something I can think of as a fast solution [16:14:06] Ahhh, ottomata, yes ! 
[16:15:54] i didn't mean a solution, just meant in case you disagree that the timeouts were caused by the load and think something else might have coincided [16:16:27] oh no milimetric, I think you have pinpointed the stuff right :) [16:16:57] milimetric: one thing that might help is when tables are well compacted (yesterday was heavy compaction day, not making it easy on cassandra) [16:17:15] And obviously ssds [16:18:03] yep, gotta get that sweet solid state goodness [16:18:05] joal: ok so, lets go over this because i kinda forget [16:18:08] ottomata: Your change in camus, I trust you have tested it ? [16:18:12] after we deploy to hdfs [16:18:13] joal: yes [16:18:44] ottomata: shall I merge? [16:18:52] But it's a shame, you just deployed a new version ! [16:19:06] yeah! i know i forgot [16:19:18] joal: yeah merge away, but let's change it later [16:19:20] it isn't ahurry [16:19:21] I'm sorry ottomata ... I have thought about it earlier as well, and procrastinated :( [16:19:26] we'll deploy again later [16:19:28] Analytics: wmit-* account creation campaigns totals - https://phabricator.wikimedia.org/T123059#1990071 (FedericoLeva-WMIT) It's not currently planned for me to ask access to this data. We'd just need someone to run a SELECT COUNT(*) with the specified conditions, is that difficult? [16:19:32] oki [16:19:48] ok so ja [16:19:52] after we deploy refinery stuff to hdfs [16:20:02] what do we have to move? [16:20:07] data from mobile from the last week? [16:20:08] ottomata: since it'a a submodule, doesn't it need to be also merged in the ref? [16:20:23] submodule??...> [16:20:52] ottomata: My bad, I thought it was refinery/refinery-camus [16:21:21] refinery-camus is a maven uh, module?project? [16:21:28] Merging that one then getting back to the refinery stuff [16:21:39] that depends on analytics/camus [16:21:42] ja k [16:21:56] Ok merged [16:22:13] So, IRRC the mobile data move happened last week, right ? 
[16:22:14] Analytics, Research-and-Data: wmit-* account creation campaigns totals - https://phabricator.wikimedia.org/T123059#1990081 (FedericoLeva-WMIT) [16:22:20] or started the week before last ? [16:22:58] Analytics, Data release, Research-and-Data: wmit-* account creation campaigns totals - https://phabricator.wikimedia.org/T123059#1920159 (FedericoLeva-WMIT) [16:24:50] ottomata: you recall when the move has startrted ? [16:25:09] cause best would be to move data from the week when mobile move has started [16:25:29] (not counting the false positive move, it's an artefact of the thing :) [16:26:27] I explain why: currently AppSession metric only uses mobile partition, so data are false during cache-moving time ottomata [16:26:37] ah [16:26:40] yes [16:26:45] so we need to re run those [16:27:33] joal: according to this, https://phabricator.wikimedia.org/T109286#1949446, jan 20 [16:27:39] maybe jan 19 [16:27:44] weird ottomata, https://hue.wikimedia.org/oozie/list_oozie_coordinator/0029517-150605005438095-oozie-oozi-C/ [16:28:16] hm, jobs not started? [16:28:34] Since the job works "working" weeks, we should move data from jan 17th [16:28:37] Yeah ! [16:28:46] ooook [16:28:53] that's a little werid, but ok [16:29:12] ok, joal, i'm still updating jars in refinery, you wanna do move of mobile data? [16:29:13] Starting the sunday before the moving day [16:29:18] k [16:29:22] sounds good [16:29:44] ottomata: Yessir, I'll plan that, prepare the stuff, and when ready we stop the jobs, move and restart [16:29:47] k cool [16:29:56] Analytics-Kanban: Expose the results of the global metric at a public link, that's available immediately for the API {kudu} [8 pts] - https://phabricator.wikimedia.org/T118310#1990121 (Abit) @milimetric: this is less important for the mid-february deadline than T121262, since program leaders won't see it. i... 
[16:29:57] actually, stop the jobs, move, merge, deploy, and restart :) [16:30:01] ottomata: --^ [16:30:07] aye [16:30:21] ottomata: before, I'm double checking the AppSession oozie stuff [16:33:28] k cool [16:33:38] i'm writing a little script ot help with dling jars from archiva... [16:34:17] ottomata : oozie stuff looks normal except for missing files actually exisitng :) [16:34:22] Now, [16:36:39] restarting the jobs starting from the ones that have not yet run, we'll miss one of the weeks, and we won't be able to re-run the e previous ones [16:37:33] ottomata: --^ Since there is 3 weeks of data overlap on this job,either we move data to accomodate the non last-run job, or fail it [16:37:44] Analytics-Kanban, Patch-For-Review: Investigate adding piwik to transparency report {3] - https://phabricator.wikimedia.org/T125175#1990149 (Nuria) [16:38:31] Analytics-Cluster, Analytics-Kanban, Patch-For-Review: Uninstall Impala [3 pts] - https://phabricator.wikimedia.org/T125141#1990162 (Nuria) Open>Resolved [16:38:33] Analytics, Analytics-Cluster: Upgrade to CDH 5.5 - https://phabricator.wikimedia.org/T119646#1990163 (Nuria) [16:38:41] Analytics-Kanban, Services, RESTBase-API: RESTBase pageview data not updated - https://phabricator.wikimedia.org/T125048#1990164 (Nuria) Open>Resolved [16:38:44] Concrete stuff: next job to run is jan 10th one (using data from jan 10th to feb 8th) [16:39:07] ok cool [16:39:09] Analytics, Data release, Research-and-Data: wmit-* account creation campaigns totals - https://phabricator.wikimedia.org/T123059#1990166 (Milimetric) mmm, I wouldn't normally do it, not my job kind of thing, because usually it turns into more than a simple select. But since you asked nicely ;) ```... [16:39:10] So, either we move data from jan 10 to accomodate that job, or we move from 17th, and fail jan 10th [16:39:12] so we should move all data starting jan 10? [16:39:14] let [16:39:19] let's move jan 10, might as well, eh? 
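The date reasoning in the exchange above (start from the Sunday before the move, and the Jan 10 job reading data through Feb 8) can be sketched as follows. This is only an illustration, not the team's tooling: the function names are hypothetical, the move date of 2016-01-20 is taken from the task comment linked in the chat, and the 30-day window is inferred from the "jan 10th to feb 8th" remark.

```python
from datetime import date, timedelta


def sunday_on_or_before(day):
    """Return the most recent Sunday that is not after `day`."""
    # date.weekday() numbers Monday as 0 and Sunday as 6.
    days_since_sunday = (day.weekday() + 1) % 7
    return day - timedelta(days=days_since_sunday)


def job_window(start, length_days=30):
    """Return the inclusive (first, last) days read by a job starting on `start`.

    The 30-day default is an assumption inferred from the chat.
    """
    return start, start + timedelta(days=length_days - 1)


move_day = date(2016, 1, 20)  # assumed start of the mobile cache move
print(sunday_on_or_before(move_day))  # 2016-01-17
first, last = job_window(date(2016, 1, 10))
print(first, last)  # 2016-01-10 2016-02-08
```

Under these assumptions, moving data only from Jan 17 would leave the Jan 10 job without part of its input window, which is why moving from Jan 10 covers both cases.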
[16:39:21] Analytics, Data release, Research-and-Data: wmit-* account creation campaigns totals - https://phabricator.wikimedia.org/T123059#1990169 (Milimetric) declined>Resolved [16:39:26] works for me :) [16:39:37] Analytics-Kanban, Discovery, Discovery-Analysis-Sprint, Patch-For-Review: Create UDFs for categorising referers - https://phabricator.wikimedia.org/T115919#1990170 (Nuria) Open>Resolved [16:39:54] ottomata: only thing to remember will then be that jobs before jan 10th for AppSession CAN'T be backfilled [16:39:55] Analytics-Kanban, Services: Improve pageview API response time with cache headers [8 pts] - https://phabricator.wikimedia.org/T119886#1990173 (Nuria) Open>Resolved [16:39:58] (without moving data) [16:40:23] ok ottomata: preparing script to move data from jan 10 ! [16:40:26] Analytics-Kanban: Restore MobileWebSectionUsage_14321266 and MobileWebSectionUsage_15038458 - https://phabricator.wikimedia.org/T123595#1990174 (Nuria) Open>Resolved [16:40:42] Analytics-Kanban: Aggregator for projectviews stuck in 2015 {hawk} [3pts] - https://phabricator.wikimedia.org/T123832#1990176 (Nuria) Open>Resolved [16:40:54] Analytics-Kanban, Patch-For-Review: Remove avro schema from jar [1 pts] - https://phabricator.wikimedia.org/T119893#1990178 (Nuria) Open>Resolved [16:41:27] Analytics: Move vital signs to its own instance {crow} [5 pts] - https://phabricator.wikimedia.org/T123944#1990185 (Nuria) [16:41:29] Analytics-Kanban, Patch-For-Review: Remove LegacyPageviews from vital-signs [3 pts] - https://phabricator.wikimedia.org/T124244#1990184 (Nuria) Open>Resolved [16:41:41] Analytics-Kanban: Problem with hadoop data ingestion impacting data delivery [8 pts] - https://phabricator.wikimedia.org/T125079#1990186 (Nuria) Open>Resolved [16:42:17] Analytics-Kanban, Patch-For-Review: Investigate adding piwik to transparency report {3] - https://phabricator.wikimedia.org/T125175#1990187 (Nuria) a:Nuria [16:42:35] Analytics, Analytics-Kanban, Wikipedia-Android-App: 
Database not updated for beta event logging and all-events.log reports 8x for each event [3] - https://phabricator.wikimedia.org/T125423#1990189 (Nuria) a:Nuria [16:47:55] Analytics-Kanban, Editing-Analysis: Edit schema needs purging, table is too big for queries to run (500G before conversion) {oryx} [8 pts] - https://phabricator.wikimedia.org/T124676#1990208 (Nuria) @Neil_P._Quinn_WMF: can you chime in on whether @jcrespo approach would work? [16:47:57] Analytics, Data release, Research-and-Data: wmit-* account creation campaigns totals - https://phabricator.wikimedia.org/T123059#1990209 (FedericoLeva-WMIT) Thanks! [16:48:24] having an odd pyspark + oozie issue if anyone has seen this before, pyspark in application_1454006297742_2404 threw an exception and failed, oozie in application_1454006297742_2401 that kicked it off marked as accepted [16:48:44] i'll probably write up something today to try and fail in spark w/scala to see if its limited to pyspark, but was wondering if anyone had ideas [16:49:29] ebernhardson: currently in the middle of something, can't really help :) [16:49:32] Sorry :S [16:49:36] no worries [16:49:55] Analytics, Editing-Analysis, Editing-Department: Consider scrapping Schema:PageContentSaveComplete and Schema:NewEditorEdit, given we have Schema:Edit - https://phabricator.wikimedia.org/T123958#1990215 (Neil_P._Quinn_WMF) declined>Open I've been thinking about it, and I'm not actually sure it ma... [16:50:20] (PS3) Milimetric: Clean up syntax errors and bad names [analytics/dashiki] - https://gerrit.wikimedia.org/r/267389 [16:50:21] nuria_: Heyyyy !P [16:50:22] (PS1) Milimetric: Fix metrics with spaces in URL [analytics/dashiki] - https://gerrit.wikimedia.org/r/267910 [16:50:31] nuria_: you should have pinged me !!!! 
[16:50:38] I missed the time :) [16:50:47] Analytics-Kanban: Lower parallelization on EventLogging to 1 consumer - https://phabricator.wikimedia.org/T125225#1990217 (mforns) [16:50:49] Analytics-Kanban: Eventlogging replication not working with mysql parallel consumption - https://phabricator.wikimedia.org/T125113#1990218 (mforns) [16:50:50] ahhh, i thought you were on "duty' [16:51:01] I was, but would have make some time :) [16:51:03] No issue :) [16:51:09] jaja, ok, will do next time! [16:51:12] It'll wait next week :) [16:53:36] (PS1) Ottomata: Update refinery source jars to version 0.0.26 [analytics/refinery] - https://gerrit.wikimedia.org/r/267912 [16:55:37] (PS1) Ottomata: Add helper script to download and symlink refinery source jars for a deployment [analytics/refinery] - https://gerrit.wikimedia.org/r/267915 [16:55:41] joal ^^ and ^ [16:56:29] Analytics-Kanban, Learning-and-Evaluation, Patch-For-Review: Add instruction text next to the input fields in the Program Global Metrics Report {kudu} - https://phabricator.wikimedia.org/T121899#1990239 (Abit) @Milimetric: thanks, that's fine! [16:57:03] ottomata: While I like your script, we almost never upgrade all the jars at one ... [16:57:59] joal: no? shouldn't we do that every time we upgrade? [16:58:28] for instance, when upgrading to 0.0.25, I didn't upgrade cassandra version [16:58:42] why though? even though we didn't change something there, we did build a new jar [16:58:49] wouldn't it be less confusing to do a full release? [16:59:15] what if something changes in reifnery-core 0.0.26 that refinery-cassandra 0.0.24 depended on [16:59:25] I wonder if it's less confusing to do a full release with no change, than leaving the thing unchanged ... 
[16:59:52] seems more consistent to keep the symlinked jars all pointing at the same versino [16:59:58] sure, old jobs might still point at old jars [16:59:59] that's fine [17:00:15] the folks that will be using the non versioned symlink jars are mostly ad-hoc queries and other users [17:00:29] i think it'd be less confusing for them if they didn't have to look at see what version of hive vs core was deployed, etc. [17:00:31] Analytics-Kanban, Data release, Research-and-Data: wmit-* account creation campaigns totals - https://phabricator.wikimedia.org/T123059#1990252 (Milimetric) [17:00:33] true ottomata [17:00:38] ok, let's go for that [17:00:48] I'll merge the jar bump anyhow :) [17:00:51] k [17:01:23] (CR) Joal: [C: 2 V: 2] "Can be improved, but will be usefull as-is !" [analytics/refinery] - https://gerrit.wikimedia.org/r/267915 (owner: Ottomata) [17:01:53] oh standup! [17:02:18] (CR) Joal: [C: 2 V: 2] "LGTM !" [analytics/refinery] - https://gerrit.wikimedia.org/r/267912 (owner: Ottomata) [17:03:15] elukey: standup [17:04:33] milimetric: I am off today, reading emails though :) [17:06:44] sorry elukey, didn't know [17:06:52] enjoy your day off [17:10:12] thanks! 
[17:11:28] Analytics-Kanban, Patch-For-Review: Investigate adding piwik to transparency report {3 pts] - https://phabricator.wikimedia.org/T125175#1990278 (Milimetric) [17:11:36] Analytics, Analytics-Kanban, Wikipedia-Android-App: Database not updated for beta event logging and all-events.log reports 8x for each event [3 pts] - https://phabricator.wikimedia.org/T125423#1990279 (Milimetric) [17:12:32] Analytics-Kanban, Analytics-Wikimetrics, Patch-For-Review: Include all timezones in global metrics report interface {kudu} [3 pts] - https://phabricator.wikimedia.org/T121167#1990290 (Milimetric) [17:12:38] Analytics-Kanban, Learning-and-Evaluation, Patch-For-Review: Add instruction text next to the input fields in the Program Global Metrics Report {kudu} [1 pts] - https://phabricator.wikimedia.org/T121899#1990291 (Milimetric) [17:14:36] Analytics-Kanban: Rotate kafka GC logs [3 pts] {hawk} - https://phabricator.wikimedia.org/T124644#1990316 (Milimetric) a:elukey [17:18:18] Analytics: Create directory for output of discovery analytics data in HDFS - https://phabricator.wikimedia.org/T125488#1990331 (EBernhardson) @ottomata this should be an easy one if you could help us out. Thanks! [17:27:59] (PS7) Milimetric: Development environment for wikimetrics using docker [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/267172 (https://phabricator.wikimedia.org/T123749) (owner: Madhuvishy) [17:29:45] Analytics: Create directory for output of discovery analytics data in HDFS - https://phabricator.wikimedia.org/T125488#1990365 (Ottomata) Open>Resolved a:Ottomata Done! ``` $ hdfs dfs -ls -d /wmf/data/discovery drwxr-xr-x - analytics-search analytics-search-users 0 2016-02-02 17:28 /wmf/... 
[17:32:30] Analytics, Datasets-General-or-Unknown, Services: Many error 500 from pageviews API "Error in Cassandra table storage backend" - https://phabricator.wikimedia.org/T125345#1990375 (mobrovac) [17:43:46] (PS8) Milimetric: Use docker to develop wikimetrics [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/267172 (https://phabricator.wikimedia.org/T123749) (owner: Madhuvishy) [17:49:47] madhuvishy: Hi ! [17:51:00] (CR) Milimetric: [C: 2 V: 2] Make redirect uris configurable [analytics/wikimetrics-deploy] - https://gerrit.wikimedia.org/r/267639 (owner: Madhuvishy) [17:52:46] ottomata: Here ? [17:53:37] joal: ja [17:53:41] about to head to a cafe [17:53:46] wassup?! [17:54:52] I plan on starting this in a screen: https://gist.github.com/jobar/d85def0bf95f07de476b [17:54:57] ottomata: --^ [17:55:23] 10..10? [17:55:28] ottomata: With actually changing the numbers, right :) [17:55:38] milimetric: can you try submodule update now? [17:55:48] ay ek :) [17:55:49] ottomata: 1-31, 0-23 [17:56:07] joal: I'm around if you want to chat :) [17:56:13] madhuvishy: still denied [17:56:18] cool madhuvishy :) [17:56:48] hmm, i was about to ask how safe we should be, but that should be fine [17:56:52] would be easy to revert that if we had to [17:56:57] since we can differentiate the files, ja? [17:57:23] joal: looks good [17:57:24] milimetric: hmmmm this is the relevant group and i added you - https://gerrit.wikimedia.org/r/#/admin/groups/1131,members. may be it takes some time [17:57:26] i'd run that in a screen [17:57:33] and pipe stderr and stdout to tee -a file [17:57:33] like [17:57:54] sudo -u hdfs mv_mobile.sh 2>&1 | tee -a /tmp/mv_mobile.log [17:58:13] maybe put some date commands in there too [17:58:15] so you can see timing [17:58:24] makes sense ottomata ! [17:58:25] echo "$(date) Moving $f to $nf"... 
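A minimal dry-run sketch of the move loop discussed above, assuming hourly partitions and the day/hour ranges mentioned (1-31, 0-23). The actual script is the gist linked above and is not reproduced here; `BASE`, `plan_moves`, and `log_line` are hypothetical names and the path layout is an assumption. The sketch only builds source/destination pairs and timestamped log lines, so nothing is moved.

```python
from datetime import datetime, timezone

# Hypothetical base path; the real layout used by the gist is not shown in this log.
BASE = "/example/webrequest"


def plan_moves(year, month, days=range(1, 32), hours=range(0, 24)):
    """Return (source, destination) directory pairs for every day/hour partition."""
    moves = []
    for day in days:
        for hour in hours:
            part = f"year={year}/month={month}/day={day}/hour={hour}"
            src = f"{BASE}/webrequest_source=mobile/{part}"
            dst = f"{BASE}/webrequest_source=text/{part}"
            moves.append((src, dst))
    return moves


def log_line(src, dst, now=None):
    """Format a timestamped line, in the spirit of `echo "$(date) Moving ..."`."""
    now = now or datetime.now(timezone.utc)
    return f"{now.isoformat()} Moving {src} to {dst}"


if __name__ == "__main__":
    # Print the plan only; piping this through `tee -a` keeps a log of the run.
    for src, dst in plan_moves(2016, 1):
        print(log_line(src, dst))
```

Printing the plan first makes it easy to review (and later revert) what would be moved before running anything as the hdfs user.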
[17:58:30] Thanks for ideas :) [17:58:44] cool [17:58:49] ok, running out, back in 15ish [18:00:27] madhuvishy: that's weird, you're right, the secrets/wikimetrics repo allows that group, and I'm in it... [18:00:37] I'll try deleting my known_hosts [18:00:40] milimetric: yeah [18:01:48] nope, still denied [18:01:49] so werid [18:03:51] !log moving refined mobile files to refined text for january on HDFS [18:07:02] madhuvishy: Hi ! Sorry :) [18:07:25] So, numbers I had for offset vs estimate was way too big [18:07:43] And then I reallized the code I had was counting multiple time the same uniqyues [18:10:30] okay, joal oh [18:10:36] * madhuvishy looks at code [18:11:12] madhuvishy: I can walk you through my thoughts in batcave if you want [18:11:20] joal: sure [18:24:00] madhuvishy: decembrer 1st is gone already :( [18:24:07] joal: yeah i guessed [18:24:08] here! [18:24:10] it might be okay [18:24:31] if we start from dec 3 - the monthly data won't be off by too much [18:24:36] true [18:24:51] we can put a note somewhere saying it's slightly inaccurate [18:25:41] Right madhu [18:25:54] I'll push my version now then [18:26:04] joal: great! [18:26:56] Oh yeah madhuvishy, wanted your opnion on that [18:27:45] madhuvishy: Instead of using: (la.last_access < la.evt_dt) [18:28:17] I rebuild the day out of year-month-day (to have propoer results for monthly) [18:28:23] Sounds correct to you ? [18:29:56] joal: fyi am etherpad planning here https://etherpad.wikimedia.org/p/refinery_mobile2text [18:29:56] Like for the monthly, I use (last_access < unix_timestamp(year-month-01)) - With al the hive syntax needed for this to work :0 [18:30:07] Thanks ottomata ! [18:30:10] Files are moving :) [18:30:57] :) [18:30:58] Analytics-EventLogging, MediaWiki-extensions-MultimediaViewer: 60% of MultimediaViewerNetworkPerformance events dropped (exceeds maxUrlSize) - https://phabricator.wikimedia.org/T113364#1990678 (Jdlrobson) p:Normal>Low Any suggested actions here @krinkle ? 
Setting to low as unlikely to work on this... [18:31:22] joal: i dont think i understand [18:32:38] hm, when doing last_access < event.dt, you use the event date as "working day" [18:33:38] When doing monthly, we don't want a "working day", but a "working month", so last_access < (beginning of the month), is that completely wrong ? [18:33:47] I'm wondering now madhuvishy :) [18:34:11] joal: mmmmm [18:34:16] joal: aah - i think they way i did it was to use date format to just pick year and month [18:34:31] and calculate unix timestamp of it [18:34:40] right [18:34:49] which would be beginning of month i think? [18:34:50] Would be the same to use ozzie params, right ? [18:35:13] yes i think so [18:35:28] I find it easier to understand in the job using the year and month parameters than the actual event date [18:35:42] joal: sure [18:35:46] Cool P) [18:36:03] joal: is patchset 15 the one you finally uploaded? or is it not final yet? [18:36:30] not fianl yet, give a minute, double checking everything vefore removing the WIP :) [18:36:38] I should actually give you that pleasure :) [18:37:01] ha ha no you should do that [18:37:12] joal: thnak you for taking care of this, boy is going to be exciting when it is done [18:37:35] joal: line 60 on patchset 15 filters for nocookie is NULL - should be NOT NULL [18:37:47] that's the only thing i wanted to point out [18:38:58] Right madhuvishy, corrected that one already :) [18:39:11] Also, is it ok all the project info the way I have put it ? 
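The "working day" versus "working month" comparison discussed above can be sketched as follows. This is an illustration under assumptions, not the Hive query from the patch: `period_start` and `is_first_visit_in_period` are hypothetical names, timestamps are treated as UTC Unix seconds, and the nocookie handling mentioned above is left out.

```python
from datetime import datetime, timezone


def period_start(year, month, day=None):
    """Unix timestamp (UTC) of the start of a day, or of the month if day is None."""
    return int(datetime(year, month, day or 1, tzinfo=timezone.utc).timestamp())


def is_first_visit_in_period(last_access_ts, year, month, day=None):
    """True when the last access predates the working day (or working month)."""
    return last_access_ts < period_start(year, month, day)
```

For example, a device last seen on 5 January 2016 would count as new for the working day of 20 January, but not for the working month of January, because 5 January is not earlier than the first of the month.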
[18:39:31] less easy than uri_host, but normalized :( [18:39:44] joal: i think so - i was wondering if it could be just en.wikipedia and en.m.wikipedia [18:40:04] but i dont know if there are caveats to that [18:40:26] concern is, there could multiple qualifiers (it's an array) [18:40:38] Same thing, I keep tld, in case we have others [18:41:47] joal: right [18:43:23] dont know if the other qualifiers are useful and if having it this way makes it easy to query for mobile uniques [18:43:44] madhuvishy: it sure doesn't :( [18:44:17] joal: yeah.. also it kinda tempts people to just do project = 'en.wikipedia' and sum [18:44:45] madhuvishy: What should go for, then ? [18:45:47] Quick double check: offset is on 28% of estimate on average for the top 20 estimate [18:45:51] madhuvishy: --^ [18:45:57] Sounds better, right ? [18:46:08] joal: yeah definitely sounds better [18:46:18] 35million for en.m, + 7milion offset [18:46:22] cool [18:46:25] makes sense [18:46:39] Ok [18:46:53] joal about the project - doing en.wikipedia and en.m.wikipedia makes sense to me. nuria_ ^ what do you think? [18:46:54] I'm gonna push the patch, and run the code as me [18:47:02] Like that we don't loose to much of december [18:47:23] madhuvishy: versus? what is teh other option? [18:47:24] *the [18:47:25] Analytics-EventLogging, MediaWiki-extensions-MultimediaViewer: 60% of MultimediaViewerNetworkPerformance events dropped (exceeds maxUrlSize) - https://phabricator.wikimedia.org/T113364#1990782 (Krinkle) [18:47:47] nuria_: the other option is to have all the normalized_host fields [18:48:04] all? [18:48:09] nuria_: the quelifiers particularly don't lok good: it'a an array of string [18:48:25] nuria_: ya - it'll look like ('m', 'zero') [18:48:29] for mobile [18:48:30] i think [18:48:40] correct madhu [18:49:03] is taht documented somewhere here? 
https://wikitech.wikimedia.org/wiki/Analytics/Data/Webrequest [18:49:23] nuria_: yup [18:49:34] normalized_host struct<project_class:string,project:string,qualifiers:array<string>,tld:string> struct containing project_class (such as wikipedia or wikidata for instance), project (such as en or commons), qualifiers (a list of in-between values, such as m and/or zero) and tld (org most often) [18:50:18] joal: , milimetric, can you jump in a hangout with nuria and I in a few mins? [18:50:26] we just talked to mark about hw stuff and need to make some changes [18:50:29] (PS16) Joal: [WIP] Daily last_access uniques oozie job [analytics/refinery] - https://gerrit.wikimedia.org/r/216341 (https://phabricator.wikimedia.org/T92977) (owner: Madhuvishy) [18:50:30] sure [18:50:35] sure [18:50:35] k in 10 minutes? [18:50:36] ok? [18:51:03] madhuvishy: the project must be the website domain so it cannot be anything but en.wikipedia [18:51:07] k --> nuria_ madhuvishy, maybe jump in there before about project stuff ? [18:51:29] joal: sure [18:51:29] and en.m.wikipedia [18:51:37] batcave? [18:51:51] madhuvishy: last patch sent (using project fields) [18:51:57] yes nuria_ [18:52:13] 10 minutes - sure, i'll grab lunch [18:52:14] ottomata: on batcave talking about last access [18:52:22] ottomata: let's talk hardware after [18:55:16] Analytics-EventLogging, MediaWiki-extensions-MultimediaViewer: 60% of MultimediaViewerNetworkPerformance events dropped (exceeds maxUrlSize) - https://phabricator.wikimedia.org/T113364#1990818 (Krinkle) @Jdlrobson: The suggested action is to rethink the purpose of what is being logged. Because whatever i... [18:56:58] Analytics-EventLogging, MediaWiki-extensions-MultimediaViewer: 60% of MultimediaViewerNetworkPerformance events dropped (exceeds maxUrlSize) - https://phabricator.wikimedia.org/T113364#1990825 (Jdlrobson) @tgr are you able to shed some light? I'm not sure what this was used for and if it's relevant any m...
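A small sketch of turning the normalized_host fields described above into an "en.wikipedia" / "en.m.wikipedia" style project name. The join order (project, then qualifiers, then project_class) and the name `project_name` are assumptions for illustration; the patch under review may build the value differently.

```python
def project_name(project, qualifiers, project_class):
    """Join normalized_host parts into a dotted project name, skipping empty parts."""
    parts = [project, *qualifiers, project_class]
    return ".".join(p for p in parts if p)
```

With an empty qualifier list this gives `en.wikipedia`; with `["m"]` it gives `en.m.wikipedia`. Several qualifiers at once would all be kept in the name, which is the caveat raised in the discussion.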
[18:57:47] k [19:04:10] Analytics-Kanban: Debug wikimetrics docker dev setup failing on ubuntu 14.04 - https://phabricator.wikimedia.org/T125415#1990848 (madhuvishy) a:madhuvishy [19:22:19] (PS9) Madhuvishy: Use docker to develop wikimetrics [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/267172 (https://phabricator.wikimedia.org/T123749) [19:24:43] (PS10) Madhuvishy: Use docker to develop wikimetrics [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/267172 (https://phabricator.wikimedia.org/T123749) [19:27:18] (PS17) Joal: [WIP] Daily last_access uniques oozie job [analytics/refinery] - https://gerrit.wikimedia.org/r/216341 (https://phabricator.wikimedia.org/T92977) (owner: Madhuvishy) [19:27:30] madhuvishy: --^ [19:27:36] That one should be alrighty :) [19:27:39] joal: thanks, looking :) [19:27:45] thx:) [19:28:01] ottomata: one more hdfs directory if you don't mind :) it looks like oozie is now erroring trying to create /user/analytics-search so it has a place for the .sparkStaging and .staging directories [19:28:16] * ebernhar1son keeps finding more one step at a time... [19:32:12] makes sense [19:32:28] Analytics: Create directory for output of discovery analytics data in HDFS - https://phabricator.wikimedia.org/T125488#1990993 (madhuvishy) I see that the /wmf/data/discovery folder already exists. I was able to run the second two commands - @EBernhardson and @Ottomata - I leave you to verify that it's fine.... [19:32:48] hmm, it'd be nice if we could put the analytics-search system user into the analytics-search-users group [19:32:51] ebernhar1son [19:33:36] ottomata: ha ha i thought i could do the directory creation it looked easy enough except you already did [19:33:58] madhuvishy: you can do the one he just asked for [19:34:01] /user/analytics-search [19:34:17] sure [19:34:24] make it analytics-search:analytics-search-users 775 [19:34:30] okay [19:35:39] ebernhardson: i just did /user/analytics-search. look alright to you? 
[19:36:26] ottomata: i wanted to ask you about eventlogging and running setup.py install - let me know if you have time to chat [19:36:49] madhuvishy: looks right i'll kick this job to try again. thanks! [19:37:33] madhuvishy: gimme some minutes [19:37:38] ottomata: suree [19:37:40] doing more hw budget wranglin [19:40:31] nuria_: lemme know when you are done with your meeting, got some more hw thoughts [19:40:37] ottomata: k [20:02:46] Analytics-Kanban, hardware-requests, operations, Patch-For-Review: 8 x 3 SSDs for AQS nodes. - https://phabricator.wikimedia.org/T124947#1991215 (Ottomata) Hold on this, it seems will be replacing the aqs1xxx nodes since they are out of warranty. [20:03:13] Analytics-Cluster, Analytics-Kanban, hardware-requests, operations: Hadoop Node expansion for end of FY - https://phabricator.wikimedia.org/T124951#1991218 (Ottomata) Hold on this, we may be using the remainder budget for other things. [20:03:59] ottomata: done [20:04:15] nuria_: check spreadsheet [20:04:22] madhuvishy: what's up!? [20:04:25] ottomata: looked at etherpad...what about hadoop node for next year? [20:04:31] move to spreadsheet :) [20:04:33] heheh [20:04:46] nuria_: main q is about what to do with remainder budget for this year, with [20:04:48] ottomata: wow [20:04:54] wikimedia/mediawiki-extensions-EventLogging#528 (wmf/1.27.0-wmf12 - 3e68708 : Chad Horohoe): The build has errored. [20:04:54] Change view : https://github.com/wikimedia/mediawiki-extensions-EventLogging/commit/3e687085bb43 [20:04:54] Build details : https://travis-ci.org/wikimedia/mediawiki-extensions-EventLogging/builds/106561477 [20:04:57] that is SOME spreadsheet cc milimetric joal [20:04:57] with new stuff, only $10K remaining [20:05:11] ottomata: i created an instance - activated the role::eventlogging, ran puppet - cloned eventlogging, and ran sudo python setup.py install [20:05:13] maybe we should just do the new 'stat' box thing this quarter [20:05:16] which spreadsheet? 
[20:05:18] and do any hadoop expansion all at once next year [20:05:21] net FY [20:05:22] Analytics-Kanban, Research-and-Data: Research Spike: How do redirects affect pageviews [8 pts] {hawk} - https://phabricator.wikimedia.org/T108867#1991230 (mforns) @DarTar Sorry, I overlooked your comment. > the ticket is about exploring and documenting issues and possible solutions, not settling on one, c... [20:05:22] next* [20:05:28] https://docs.google.com/spreadsheets/d/1123OTmek4eRriBkZrAjbp06aH0RMmR0e69TMUlVF84s/edit#gid=45861119 [20:06:02] ottomata: el-test.analytics.eqiad.wmflabs, my home folder [20:06:14] ottomata: it fails with [20:06:18] https://www.irccloud.com/pastebin/AAwAYrwt/ [20:06:49] configure: error: no acceptable C compiler found in $PATH [20:06:50] Analytics-Kanban, Research-and-Data: Research Spike: How do redirects affect pageviews [8 pts] {hawk} - https://phabricator.wikimedia.org/T108867#1991233 (mforns) @Milimetric +1 [20:06:53] maybe [20:06:57] apt-get install gcc [20:06:58] ? [20:06:59] :) [20:07:06] or g++ or something? [20:07:08] ottomata: yeah - hmmm... [20:07:11] or some package that gets you one? [20:07:40] ottomata: what usually happens though - is it always available? i thought may be puppet would install it [20:07:50] dunno [20:07:56] hmmm [20:07:57] weird [20:08:09] i really dislike the setup.py thing we do with EL :p [20:08:32] oh you are using the puppet class [20:08:36] do you need setup.py install at all? [20:08:46] role::eventlogging should get you all deps from .debs [20:09:36] hmm ImportError: No module named functools32 [20:09:58] ottomata: aah - but what about the bin scripts? [20:10:12] export PYTHONPATH=/home/madhuvishy/eventlogging [20:10:19] python bin/eventlogging-consuer ... [20:10:32] you shouldn't have to pip install inorder to dev [20:10:35] is dumb [20:10:51] we shouldn't globally install deployed app in prod either [20:10:53] cool, spreadsheet makes a lot more sense [20:11:11] ottomata: hmmm - alright. 
i thought may be i'd have to run all of eventlogging using eventloggingctl inorder to test what happens when broker goes down [20:11:25] but i guess i dont have to [20:11:31] i dunno, don't think so, i think you just need a running processor and/or consumer [20:11:36] probably just processor [20:11:39] yeah [20:11:40] consuming and producing back to kafka [20:11:42] i'll do that [20:12:18] ottomata: the problem is that the processor dies right? [20:12:24] madhuvishy: that or consumer [20:12:31] okay [20:12:34] i'll play with it [20:12:37] thanks :) [20:12:37] but hte 'processor' runs both pykafka consumer and kafka-python producer [20:12:42] so that is probably a good way to test [20:12:44] right [20:12:58] nuria_: , milimetric ja my open q is if we should go ahead and and do the new stat type box this quarter [20:13:01] with our remainder budget [20:13:06] ottomata: may be also mysql consumer? [20:13:18] maybe, but it shouldn't matter either way [20:13:21] right [20:13:22] from the kafka lib point of view [20:13:28] okayy [20:13:28] a pykafka consumer is a kafka consumer [20:13:41] whether or not its being used by bin script from processor or consumer [20:13:51] yeah that makes sense [20:13:58] thank youu [20:13:59] nuria_: milimetric, +1stat box seems to make sense, since with what I have for this quarter [20:14:03] we only have ~$10K remaining [20:14:05] aham [20:14:08] which i guess we don't have to spend [20:14:14] no, we do not [20:15:39] but, it does make next quarter look better :p [20:15:39] :) [20:15:45] since we don't have to factor in the request for it then [20:16:14] ottomata:ok, let's do it then [20:18:57] ah nuria_ i had the wrong $ guess for hadoop nodes [20:19:02] budget is less now :o [20:19:57] we should figure out better estimates for the other servers too [20:20:41] ottomata: k corrected [20:23:03] also, i miscounted worker nodes, we have 30, not 29 [20:23:07] so, for round number's sake [20:23:11] changed # of new nodes to 20 [20:23:12] :) 
[20:23:15] from 21 [20:24:32] (fixing Analytics/Cluster/Hardware page) [20:38:28] Analytics-Kanban: Eventlogging replication not working with mysql parallel consumption - https://phabricator.wikimedia.org/T125113#1991469 (mforns) [20:38:32] Analytics-EventLogging, Analytics-Kanban: Add autoincrement id to EventLogging MySQL tables. - https://phabricator.wikimedia.org/T125135#1991470 (mforns) [20:40:06] Analytics-Kanban: Lower parallelization on EventLogging to 1 consumer - https://phabricator.wikimedia.org/T125225#1991476 (mforns) duplicate>Open [20:45:51] milimetric, are you planning to work on https://phabricator.wikimedia.org/T124296 ? [20:46:25] mforns: not right now, you're welcome to it if you want [20:46:39] I'm doing the text visualization now [20:47:03] but there are some things I did that might help you, so if you do it you should chain on top of my change [20:47:08] lemme know if you want me to push that [20:47:40] milimetric, there's also the other one for the bookmarks of the tabular layout, is that one better? [20:47:53] it will be the same right? 
[20:50:23] mforns: oh, I already did that one, sorry I didn't know it was a separate task [20:51:00] mforns: but neither of them are "bad" or anything, the hierarchy one is probably going to take the longest [20:51:11] and I hadn't decided what viz to use, so it's probably the most fun [20:51:14] milimetric, ok, I thought you still would work on that one after your current patch, so I think I'll take another task and leave the hierarchy to you [20:51:53] no, it's OK [20:51:53] mforns: I was planning on doing the AQS config change after the text visualization actually [20:52:09] so it's really all yours if you want it, and should be fun [20:52:24] I'm just saying you'll need to chain onto my patch because I'm using semantic 2 and you'll need some of the build changes that I'm making [20:52:33] I see [20:52:37] just stupid stuff like where the fonts go and such [20:52:59] so it's totally up to you, and if you grab it and don't want it, I'll pick it up when I finish the AQS config [20:53:55] Thanks a lot ottomata for the spreadsheet, looks good :) [20:54:26] ja [20:54:29] joal: ah you are still here [20:54:36] do yo know why app session metrics has a bundle? [20:54:40] not for long, back after diner monitoruing stuff [20:54:40] i see only a coordinator running [20:54:43] :) [20:54:48] milimetric, ok I'll go for it, thanks! [20:54:53] we can talk tomorrow! [20:54:55] no worries! [20:55:01] it's a new bundle introdiced by mforns [20:55:07] mforns: ok, then, one sec, lemme make something that actually works and I'll submit [20:55:12] ottomata: I have a few minutes: ) [20:55:32] also ottomata, moving takes Looooooooong time ! [20:55:33] ah, cool, joal will we use that to start the job tomorrow? [20:55:36] oh boy! [20:55:39] Yessir [20:55:46] ottomata, it has a single coordinator, but it is called twice [20:55:49] Like, it's at day 4 now, Yaya ! 
[20:55:54] joal: maybe parallelize, most of it is probably just dumb hdfs namenode latency, i don't *think* it should have to move blocks [20:56:08] agreed [20:56:17] iiinteresting, ok [20:56:20] milimetric, no rush, I think I'm done for today [20:56:23] I'll do that tomorrow morning of not finished [20:56:25] will start tomorrow [20:56:26] k [20:56:44] mforns: can you update README.md in oozie/mobile_apps/session_metrics [20:56:45] ? [20:56:48] it says to start via coordinator.xml [20:56:51] I'll also review the plan tomorrow morning, adding things if I can think of [20:57:08] ottomata, oh... [20:57:28] madhuvishy: Are you ok for me to start daily and monthly uniques jobs tomorrow morning? [20:57:37] joal: sure! [20:57:41] great: ) [20:57:45] k [20:57:56] mforns: actually [20:57:59] don't worry about it, i'll do it [20:58:04] since i'm editing it now anyway to make up a command [20:58:34] ottomata, np in doing that, I think I forgot the readme?? hmmm [20:58:50] oh joal, do we need to change refinery .jar version for those jobs hten, ja? [20:58:59] spark_job_jar = ${artifacts_directory}/org/wikimedia/analytics/refinery/refinery-job-0.0.14.jar [20:59:05] ottomata: we do [20:59:07] ottomata, let me do that, it was my fault [20:59:22] ok mforns [20:59:28] ottomata: I'll do that tomorrow morning as well, [20:59:33] on the way will you set default refinery-job jar to 0.0.26 [20:59:34] ? [20:59:35] I think I have done it and reverted it [20:59:36] in bundle.properties? [21:00:31] ottomata: for refinery, the is a better referer_class udf we should use, so we should bump the jar AND the record_version, as well as change the udf [21:00:39] if it's ok [21:00:45] I'll provide code tomorrow :) [21:01:55] cool [21:02:09] ok [21:02:17] for refine you mean? 
[21:02:18] oh cool [21:02:18] ok [21:02:22] yessir [21:02:23] ja joal, i'll leave that up to you [21:02:29] cool [21:02:30] danke [21:02:57] Then, I'm just gonna get to bed :) [21:02:59] wikimedia/mediawiki-extensions-EventLogging#529 (wmf/1.27.0-wmf.12 - b38f4e7 : Dan Duvall): The build has errored. [21:02:59] Change view : https://github.com/wikimedia/mediawiki-extensions-EventLogging/commit/b38f4e7a3693 [21:02:59] Build details : https://travis-ci.org/wikimedia/mediawiki-extensions-EventLogging/builds/106574813 [21:03:07] a-team, have a good end of day ! [21:04:09] (PS1) Milimetric: Add textual table "visualization" [analytics/dashiki] - https://gerrit.wikimedia.org/r/267993 (https://phabricator.wikimedia.org/T124297) [21:04:10] nite! [21:04:26] mforns: ^ that patch above is the textual table, and the hierarchy visualization should be similar [21:04:31] laters! [21:04:49] milimetric, super thanks [21:05:07] milimetric, did you have something in mind that you'd like me to consider? [21:05:14] lib or idea? [21:05:41] not really. I was vaguely thinking a tree-browser with a pie chart would be the most intuitive [21:06:01] but I hadn't tried it out and so it could be a terrible idea :) [21:06:46] Analytics, MediaWiki-API, Reading-Infrastructure-Team, MW-1.27-release-notes, and 2 others: Publish detailed Action API request information to Hadoop - https://phabricator.wikimedia.org/T108618#1991588 (bd808) [21:07:59] milimetric, aha, OK [21:08:11] cool [21:09:33] (CR) Madhuvishy: [WIP] Daily last_access uniques oozie job (1 comment) [analytics/refinery] - https://gerrit.wikimedia.org/r/216341 (https://phabricator.wikimedia.org/T92977) (owner: Madhuvishy) [21:10:01] Analytics-Cluster, hardware-requests, operations: New Hive / Oozie server node in eqiad Analytics VLAN - https://phabricator.wikimedia.org/T124945#1991604 (Ottomata) @robh, we'd like to move forward with this one as quickly as possible. We were going to use an older OOW Dell for this, and were about... 
[21:10:39] Analytics-Kanban: Restore MobileWebSectionUsage_14321266 and MobileWebSectionUsage_15038458 - https://phabricator.wikimedia.org/T123595#1991608 (Tbayer) Great, thanks everyone! As a test of the integrity of the restored `MobileWebSectionUsage_14321266`, I re-ran the first two queries from https://phabricator... [21:14:01] nuria_: you took away my edit access! :p [21:14:03] to the sheet! [21:14:08] argh SORRY! [21:14:15] heheh [21:14:30] corrected now [21:14:31] Analytics, Analytics-Wikimetrics: Display global metrics report results on same page as report inputs {kudu} - https://phabricator.wikimedia.org/T121262#1991616 (madhuvishy) I'm not sure we should do this - it's a very complicated change in the code for less gain, given that there is only one row in the... [21:15:56] Analytics-Cluster, hardware-requests, operations: New Hive / Oozie server node in eqiad Analytics VLAN - https://phabricator.wikimedia.org/T124945#1991623 (RobH) a:mark We don't have a spare that meets this criteria, so we would have to allcoate a spare that has 4 * 4TB disks, or order a new system... [21:17:52] Analytics-Cluster, hardware-requests, operations: eqiad: New Hive / Oozie server node in eqiad Analytics VLAN - https://phabricator.wikimedia.org/T124945#1991640 (RobH) [21:20:55] (PS1) Mforns: Correct app session metrics README file [analytics/refinery] - https://gerrit.wikimedia.org/r/267996 (https://phabricator.wikimedia.org/T117615) [21:21:02] ottomata, ^ [21:22:26] cool, mforns could you also change the default spark_job_jar version in bundle.properties, since you need that to have your recent change applied [21:22:27] ? 
[21:22:44] ottomata, sure [21:22:51] 0.0.26 [21:24:19] (PS2) Mforns: Correct app session metrics README and jar version [analytics/refinery] - https://gerrit.wikimedia.org/r/267996 (https://phabricator.wikimedia.org/T117615) [21:24:51] ottomata, done ^ [21:26:53] (CR) Ottomata: [C: 2 V: 2] Correct app session metrics README and jar version [analytics/refinery] - https://gerrit.wikimedia.org/r/267996 (https://phabricator.wikimedia.org/T117615) (owner: Mforns) [21:27:06] danke! [21:27:16] (PS18) Ottomata: [WIP] Daily last_access uniques oozie job [analytics/refinery] - https://gerrit.wikimedia.org/r/216341 (https://phabricator.wikimedia.org/T92977) (owner: Madhuvishy) [21:27:19] macht nichts [21:27:27] see you tomorrow a-team! [21:27:42] (PS3) Ottomata: Revert "Revert "Remove mobile webrequest_source merging it in text"" [analytics/refinery] - https://gerrit.wikimedia.org/r/267891 (https://phabricator.wikimedia.org/T122651) [21:27:43] laters! [21:27:44] (CR) Madhuvishy: "I think we are good to go, Joseph I can merge when you remove WIP :)" [analytics/refinery] - https://gerrit.wikimedia.org/r/216341 (https://phabricator.wikimedia.org/T92977) (owner: Madhuvishy) [21:33:14] Analytics-Kanban, Editing-Analysis: Edit schema needs purging, table is too big for queries to run (500G before conversion) {oryx} [8 pts] - https://phabricator.wikimedia.org/T124676#1991677 (Neil_P._Quinn_WMF) I think the core idea is sound, but as Krenair points out it would make more sense to use the c... [21:40:00] Analytics-Kanban, Editing-Analysis: Edit schema needs purging, table is too big for queries to run (500G before conversion) {oryx} [8 pts] - https://phabricator.wikimedia.org/T124676#1991694 (Nuria) @Neil_P._Quinn_WMF: let's do purge before you are done with your schema revision. The purge is a solution f... [21:41:09] Is the Pageviews API down or am I making something wrong with my queries? 
[21:59:13] My bad, missed that the space should be _ and not %20 [21:59:29] Analytics, Team-Practices, User-JAufrecht: Get regular traffic reports on TPG pages - https://phabricator.wikimedia.org/T99815#1991795 (JAufrecht) [22:07:42] Analytics-Kanban, Editing-Analysis: Edit schema needs purging, table is too big for queries to run (500G before conversion) {oryx} [8 pts] - https://phabricator.wikimedia.org/T124676#1991836 (Neil_P._Quinn_WMF) @Nuria, I'm fine with that. But it sounded like Jaime was advising against that; as long as he'... [22:58:06] (PS2) Milimetric: Add textual table "visualization" [analytics/dashiki] - https://gerrit.wikimedia.org/r/267993 (https://phabricator.wikimedia.org/T124297) [23:48:10] Analytics, Analytics-Kanban, Wikipedia-Android-App: Database not updated for beta event logging and all-events.log reports 8x for each event [3 pts] - https://phabricator.wikimedia.org/T125423#1992311 (Nuria) Most definitely something is wrong here as unique ids are appearing 9 times in most occasions:... [23:52:44] Analytics, Analytics-Kanban, Wikipedia-Android-App: Database not updated for beta event logging and all-events.log reports 8x for each event [3 pts] - https://phabricator.wikimedia.org/T125423#1992351 (Nuria) Looks like this is not happening on production so it is only an issue on beta labs
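The fix noted at [21:59:13] (underscores rather than %20 for spaces in article titles) can be sketched like this. The URL layout follows the per-article example earlier in this log; `per_article_url` is a hypothetical helper that only builds the string and does not call the API.

```python
from urllib.parse import quote

BASE_URL = "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article"


def per_article_url(project, title, start, end,
                    access="all-access", agent="all-agents", granularity="daily"):
    """Build a per-article pageviews URL, using underscores for spaces in the title."""
    # Replace spaces first, then percent-encode whatever else needs escaping.
    article = quote(title.replace(" ", "_"), safe="")
    return "/".join([BASE_URL, project, access, agent, article, granularity, start, end])
```

Calling it with `"ro.wikipedia"` and the title `"Thales din Milet"` for `2015080100` to `2016013100` reproduces the URL form quoted earlier in the log.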