[04:26:03] Analytics, Pageviews-API, Services: Pageviews API returning `Error in Cassandra table storage backend` - https://phabricator.wikimedia.org/T133005#2216759 (MusikAnimal)
[04:39:48] (CR) Amire80: Add sorted errors (3 comments) [analytics/limn-language-data] - https://gerrit.wikimedia.org/r/282228 (https://phabricator.wikimedia.org/T127283) (owner: Amire80)
[04:39:51] (PS4) Amire80: Add sorted errors [analytics/limn-language-data] - https://gerrit.wikimedia.org/r/282228
[05:08:00] (PS5) Amire80: Add sorted errors [analytics/limn-language-data] - https://gerrit.wikimedia.org/r/282228
[05:11:54] (PS6) Amire80: Add sorted errors [analytics/limn-language-data] - https://gerrit.wikimedia.org/r/282228
[05:11:56] (PS1) Amire80: Add a simple script to run all the sorted errors in one go [analytics/limn-language-data] - https://gerrit.wikimedia.org/r/284128
[05:16:37] (CR) KartikMistry: Add a simple script to run all the sorted errors in one go (1 comment) [analytics/limn-language-data] - https://gerrit.wikimedia.org/r/284128 (owner: Amire80)
[06:28:20] (CR) Nikerabbit: Add sorted errors (1 comment) [analytics/limn-language-data] - https://gerrit.wikimedia.org/r/282228 (owner: Amire80)
[06:37:41] Analytics, Operations, Traffic: cronspam from cpXXXX hosts related to varnishkafka non existent processes - https://phabricator.wikimedia.org/T132346#2216914 (elukey) Resolved>Open
[06:37:53] Analytics, Operations, Traffic: cronspam from cpXXXX hosts related to varnishkafka non existent processes - https://phabricator.wikimedia.org/T132346#2195218 (elukey) Closed it too soon, I can see the root@ notifications again :( ``` elukey@cp4003:~$ ls /etc/logrotate.d/varnishkafka* /etc/logrotate....
[08:21:43] !log deploying refinery from tin
[08:21:45] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log, Master
[08:23:40] elukey: hello !
[08:23:53] r
[08:33:05] !log deployed new refinery on hadoop
[08:33:07] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log, Master
[08:35:33] joal: hello!
[08:35:39] Good morning :)
[08:36:46] elukey: do you know something about stat1004?
[08:38:16] joal: a bit, I just puppetized it and brought it up and running
[08:39:02] elukey: it doesn't work properly when deploying from tin (I still went ahead since stat1002 and analytics1027 were ok)
[08:39:54] joal: let me guess, errors with salt?
[08:40:24] mmm no it is working fine
[08:40:53] joal: what kind of errors are you seeing?
[08:41:22] elukey: https://gist.github.com/jobar/f3d16164c487bd8a56792adc82a9a825
[08:43:26] 2016-04-19 08:24:20,441 [salt.minion ][ERROR ] The return failed for job 20160419082419807637 global name '__pillar__' is not defined
[08:43:30] mmmmmmmmmmmmm
[08:44:19] there might be something to do with salt grains that I didn't know
[08:45:29] hmm
[08:46:01] So, I basically deployed but stat1004 didn't ... There might be something to fix :S
[08:46:57] yeah I am reading that I need to kick the minion a couple of times
[08:47:00] let me check
[08:47:11] elukey: good kicking :)
[08:48:07] joal: can you retry?
[08:48:42] sure elukey
[08:49:27] same
[08:50:31] elukey: --^
[08:51:05] joal: sorry can you retry again
[08:51:07] ?
[08:51:41] YAY :)
[08:51:53] y
[08:51:53] gooooooooD!
[08:52:06] elukey: is a minion kicker :)
[08:52:25] needed to be synced poor guy
[08:52:56] synchro .... Every engineer runs into that one day or another :)
[08:53:18] Thanks a lot elukey for having solved that one :)
[08:54:26] * elukey used ottomata's mmmmm x-factor correctly this time
[08:54:31] ;d
[08:54:36] :)
[08:55:19] also joal I thought that we could backup stat1001 /srv to stat1004, we have ~7TB free in there
[08:55:48] elukey: I have no strong opinion on that
[08:55:58] elukey: wherever you think is good will work for me :)
[08:56:11] stat1004 is not yet in use, why not :)
[08:57:10] I am only struggling with the fact that we don't usually backup that data, even if it is important for us.. The reasons are good, namely the 2TB size, but we are only relying on RAID to backup that data
[08:57:35] it would be great to have a more permanent solution
[08:57:50] agreed elukey, but not sure which :)
[08:58:54] Analytics-Kanban: analyse AQS queries over the previous month or weeks to have a better understanding of how compaction should behave - https://phabricator.wikimedia.org/T133016#2217168 (JAllemandou)
[08:59:07] Analytics-Kanban: analyse AQS queries over the previous month or weeks to have a better understanding of how compaction should behave - https://phabricator.wikimedia.org/T133016#2217185 (JAllemandou)
[08:59:09] Analytics-Kanban: Better response times Pageview API - https://phabricator.wikimedia.org/T124314#2217184 (JAllemandou)
[09:16:37] ---^ joal +1, please keep me in the loop :)
[09:16:55] elukey: on which one? AQS queries analysis?
[09:17:15] elukey: I know you're the one for cassandra compaction help, I'll keep in touch for sure :)
[09:17:26] all of them!! :P
[09:17:35] huhuhu, OK !!!
[09:17:54] elukey: wanna see how I do that analysis?
[09:22:14] yep! But I am still trying to figure out the DateTieredCompactionStrategy problem that we are seeing with aqs (the out of order writes)
[09:22:26] elukey: batcave?
[09:22:27] so don't want to slow you down, I'll read your updated :)
[09:22:30] *updates
[09:22:40] as you wish :)
[09:23:55] yep yep thanks :)
[09:25:16] elukey: come to batcave for a minute
[09:25:17] :)
[10:22:24] Analytics, Pageviews-API, Services: Pageviews API returning `Error in Cassandra table storage backend` - https://phabricator.wikimedia.org/T133005#2217462 (elukey) a:elukey
[10:27:27] Analytics-Cluster, Analytics-Kanban, Operations, Patch-For-Review: setup stat1004/WMF4721 for hadoop client usage - https://phabricator.wikimedia.org/T131877#2217463 (elukey) Open>Resolved
[10:27:47] Analytics: Polish script that checks eventlogging lag to use it for alarming - https://phabricator.wikimedia.org/T124306#2217465 (elukey) a:elukey>None
[10:28:15] Analytics, Analytics-Cluster, Patch-For-Review: Single Kafka partition replica periodically lags - https://phabricator.wikimedia.org/T121407#2217466 (elukey) a:elukey>None
[10:37:30] * joal AFK for a while
[10:44:04] * elukey lunch!
[11:22:29] Analytics, Operations, Traffic: cronspam from cpXXXX hosts related to varnishkafka non existent processes - https://phabricator.wikimedia.org/T132346#2217535 (BBlack) I've killed the rest of them, I think. I'll let you confirm->close this time :)
[11:22:38] (PS1) Mforns: Add null values when query returns no results [analytics/reportupdater] - https://gerrit.wikimedia.org/r/284162 (https://phabricator.wikimedia.org/T117537)
[11:51:18] (CR) Mforns: "I left some comments to help in the review." (11 comments) [analytics/reportupdater] - https://gerrit.wikimedia.org/r/284162 (https://phabricator.wikimedia.org/T117537) (owner: Mforns)
[12:57:16] elukey: I'm no ops, but if you need help with the DC stuff, please ring
[12:57:29] what DC stuff? :P
[12:57:37] thankssssss
[12:57:49] like washington DC you know
[12:57:52] elukey: --^
[12:58:20] ahhhhh right!
[13:00:16] I am trying to add mysql to hue but with puppet everything takes very long
[13:00:39] yeah, I don't know enough puppet, but seems not always easy
[13:28:59] mooorning!
[13:29:05] madhuvishy: morning! quick question, what is the standard way to share an ipython notebook running on stat1002? Thx!!!
[13:29:07] elukey: conf200x servers are finally ready!
[13:29:14] that means we can set up the main kafka cluster in codfw
[13:29:21] i'm happy to do it, but thought you might want to
[13:29:24] either way!
[13:29:59] AndyRussG: sorry i think madhu may have been off yesterday, not sure. she's also on west coast, so probably not up yet(?)
[13:30:10] but, if i think about it, i think there is not a standard way :( probably ssh tunnel
[13:30:18] madhuvishy: and yuvi really want to make that very very easy
[13:30:47] but their project has not been prioritized because there wasn't a clear need for it. if you think there is, you should be vocal about it!
[13:31:07] ottomata: ah cool.... Yeah I just saw there was no "away" status on her IRC nickname...
[13:31:19] ottomata: o/
[13:31:24] hiyaa
[13:31:27] sure I can do it, it would be great!
[13:31:44] ok cool
[13:31:47] will add you to tickets
[13:31:52] super
[13:32:18] step 1: zookeeper
[13:32:19] https://phabricator.wikimedia.org/T131959
[13:32:25] this can be the same as conf100x in eqiad, but no etcd
[13:32:27] just zookeeper for now
[13:32:40] ottomata: yeah I did see those spin-uppable Jupyter notebooks, looks great! I just followed the instructions here https://wikitech.wikimedia.org/wiki/Analytics/Cluster/Spark#Spark_and_Ipython , and got some results that I wanted to post on a Phab task as well as in Gerrit
[13:32:54] then step 2: kafka
[13:32:54] https://phabricator.wikimedia.org/T121558
[13:32:57] For now I'm just sharing results copied and pasted into a text file
[13:33:09] should be the same as kafka100[12] in eqiad
[13:33:13] * AndyRussG loooooved using ipython notebook
[13:33:14] kafka and eventbus stuff
[13:33:27] AndyRussG: that is awesome!
[13:33:37] yeah it's soooo fun
[13:34:01] yeah, i think to share you'll have to have people use an ssh tunnel to your running notebook port
[13:34:06] so they'd have to have stat1002 access :(
[14:14:23] a-team: fyi ops is doing codfw master switchover right now! :o
[14:14:28] a-team ^
[15:14:27] ottomata: I am almost ready to send a code review to add the possibility to specify an external db for hue, keeping sqlite as default
[15:14:49] so we can roll it out (after a backup JUST IN CASE) as first step
[15:15:16] it should not change anything
[15:15:56] then we create tables/schema/username on analytics meta, update the role/private repo and switch
[15:28:16] ok cool
[15:28:18] sounds good
[15:31:04] Analytics-Cluster, Analytics-Kanban, Patch-For-Review: Use MySQL as Hue data backend store - https://phabricator.wikimedia.org/T127990#2218238 (elukey)
[15:31:15] coming!
[16:01:35] nuria_: hiii standup!
[16:11:32] hey analytics! I understand the "Error in Cassandra table storage backend" error with the Pageviews API is a known issue? context: https://phabricator.wikimedia.org/T133005
[16:12:39] my quick question is, am I safe to re-do API queries that failed with this error? With my new Langviews tool, I see that after getting those errors, I can re-run the tool and usually it will work
[16:13:03] so if it's OK I'll just alter the code to automatically re-run those individual API queries that failed with that error
[16:13:38] maybe have it try 3 times before giving up
[16:17:20] Hey MusikAnimal
[16:17:47] hey!
[16:17:57] Retrying against the pageview API is feasible, the second thing is to make sure you query small(ish) amounts of data at once (one month max)
[16:18:51] MusikAnimal: 500 mostly happens on biggish data (meaning long time periods)
[16:19:22] one month max, no problem. I've been seeing the issue when querying for as little as a few days, but the thing is I'm doing an individual query for up to 200+ projects
[16:19:25] madhuvishy: just tested beeline: that's great :)
[16:19:34] Analytics-Cluster, Analytics-Kanban: Publicize stat1004 - https://phabricator.wikimedia.org/T133056#2218431 (Ottomata)
[16:19:41] joal: :)
[16:20:03] Thanks madhuvishy for setting this up :)
[16:20:35] MusikAnimal: let me give you some more info on the current issue
[16:20:40] ok thanks
[16:20:56] MusikAnimal: the pageview API relies on cassandra for data storage
[16:21:30] currently, the column family (equivalent of a table in SQL) containing per-article data is about 2TB
[16:21:44] and the hardware we have has rotating disks
[16:22:18] So, when querying either long time ranges or in bursts, you're asking cassandra to do a lot of reading
[16:22:32] Since restbase has a timeout of 2secs IIRC, the requests fail
[16:22:56] And some seconds later, when cassandra has loaded the previous data into RAM, the queries succeed
[16:23:07] MusikAnimal: does it make sense ?
[16:23:14] MusikAnimal: About solutions now
[16:23:25] yes it does
[16:23:35] We are awaiting new hardware with SSDs anytime soon
[16:24:00] awesome
[16:24:13] so for now, just re-run those queries, and keep my 100ms throttling that I have now
[16:24:29] that brings it down to 10 individual queries per second
[16:24:31] Once this has arrived, there is some work on our side to not only take advantage of SSDs, but also do a better job at architecting the cassandra stuff, and hopefully in a few weeks, we'll have a responsive pageviewAPI
[16:24:55] MusikAnimal: your current approach is great for the time being
[16:25:09] awesome, what I needed to hear, thank you!
[16:25:22] assuming you already have a phab for that maintenance work, we can close https://phabricator.wikimedia.org/T133005
[16:25:29] MusikAnimal: Please subscribe to https://phabricator.wikimedia.org/T124314, in order to get updates on the new stuff
[16:25:34] that phab was mainly to get the info you just gave me
[16:25:46] MusikAnimal: I'll close it then :)
[16:26:01] cheers
[16:26:16] Thanks for being nice with the API, we try to give it some love with elukey , but it's kinda very demanding
[16:26:56] Analytics, Pageviews-API, Services: Pageviews API returning `Error in Cassandra table storage backend` - https://phabricator.wikimedia.org/T133005#2218503 (JAllemandou)
[16:26:59] Analytics-Kanban: Better response times Pageview API - https://phabricator.wikimedia.org/T124314#2218505 (JAllemandou)
[16:33:29] a-team: done with standup?
[16:33:37] yup
[16:33:38] yes
[16:33:43] madhuvishy: yt?
[16:33:47] ah sorry
[16:34:26] nuria_: yeah!
[16:58:10] a-team: let's do tasking on batcave
[16:58:53] ok!
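To make the retry scheme agreed above (16:13–16:24) concrete: a minimal Python sketch of the client-side logic — up to three attempts per query, throttled to roughly 10 queries per second. The endpoint is the public per-article Pageview API; the function name and error handling are illustrative, not MusikAnimal's actual (JavaScript) Langviews code.

```
import time
import requests

API = ('https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/'
       '{project}/all-access/user/{article}/daily/{start}/{end}')

def get_pageviews(project, article, start, end, attempts=3, throttle=0.1):
    """Query the Pageview API, throttling to ~10 queries/second and
    retrying the transient 500s ('Error in Cassandra table storage
    backend') up to `attempts` times before giving up."""
    url = API.format(project=project, article=article, start=start, end=end)
    for _ in range(attempts):
        time.sleep(throttle)  # the 100ms throttle discussed above
        r = requests.get(url)
        if r.status_code != 500:
            r.raise_for_status()
            return r.json()
    raise RuntimeError('gave up after %d attempts: %s' % (attempts, url))
```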
[17:00:58] madhuvishy: coming to tasking, we are meeting on batcave
[17:01:00] ?
[17:09:25] Analytics, Hovercards, Unplanned-Sprint-Work, Reading-Web-Sprint-70-Lady-and-the-Trumps: Capture hovercards fetches as previews in analytics - https://phabricator.wikimedia.org/T129425#2218742 (dr0ptp4kt)
[17:10:04] Analytics-Cluster, Analytics-Kanban, Operations, hardware-requests, netops: setup/deploy server analytics1003/WMF4541 - https://phabricator.wikimedia.org/T130840#2218749 (Ottomata) a:Ottomata
[17:11:10] Analytics-Cluster, Analytics-Kanban, Operations, hardware-requests, netops: setup/deploy server analytics1003/WMF4541 - https://phabricator.wikimedia.org/T130840#2218757 (Ottomata) Will work on this today/tomorrow.
[17:11:45] Analytics, Hovercards, Reading-Web-Sprint-71-m: Verify X-Analytics: preview=1 in stable - https://phabricator.wikimedia.org/T133067#2218761 (dr0ptp4kt)
[17:15:41] Analytics-Kanban: Upgrade scripts to facilitate wiki data loading / treatment on hadoop - https://phabricator.wikimedia.org/T132590#2218816 (Nuria) This is in regards to loading editing data from dumps. Data is translated from xml into json and from those we create hive tables.
[17:16:07] Analytics-Kanban: Upgrade scripts to facilitate wiki data loading / treatment on hadoop - https://phabricator.wikimedia.org/T132590#2218832 (Nuria) Code is here: https://github.com/jobar/research-cluster/tree/etl_checkers
[17:28:19] Analytics-Cluster, Analytics-Kanban, Patch-For-Review: Use MySQL as Hue data backend store - https://phabricator.wikimedia.org/T127990#2218957 (Ottomata)
[17:28:23] Analytics-Cluster, Analytics-Kanban, Operations, hardware-requests, netops: setup/deploy server analytics1003/WMF4541 - https://phabricator.wikimedia.org/T130840#2218956 (Ottomata)
[17:42:39] Analytics: Provision new SSD-able machines on pageviewAPI - https://phabricator.wikimedia.org/T132938#2219066 (Nuria)
[17:43:17] Analytics: Provision new SSD-able machines on AQS - https://phabricator.wikimedia.org/T132938#2219071 (Nuria)
[17:44:03] Analytics-Kanban: Better response times on AQS (Pageview API mostly) - https://phabricator.wikimedia.org/T124314#2219073 (Nuria)
[17:47:10] Analytics-Kanban, Operations, Patch-For-Review: Upgrade stat1001 to Debian Jessie - https://phabricator.wikimedia.org/T76348#2219074 (Dzahn) We now have an rsyncd running on stat1004, ready to accept data from stat1001, it will be in /srv/stat1001/ , there are 3 modules, one for home, one for srv and...
[17:48:23] Analytics-Kanban: Better response times on AQS (Pageview API mostly) {slug} - https://phabricator.wikimedia.org/T124314#1952692 (Nuria)
[17:48:40] Analytics-Kanban: Better response times on AQS (Pageview API mostly) {melc} - https://phabricator.wikimedia.org/T124314#1952692 (Nuria)
[18:07:31] ottomata, re. the EL changes to avoid breaking when the schema does not exist
[18:07:37] ja?
[18:08:25] I wrote the try/catch in the eventlogging-consumer, but I realized this was not optimal, because the problem is local to the mysql writer no?
[18:08:44] it should be inside jrm.py maybe
[18:09:46] but then jrm should reraise the error to be caught by the eventlogging-consumer as an "acceptable error"?
[18:10:59] yes
[18:11:02] it should be done in the handler
[18:11:08] or in jrm.py even better
[18:11:22] no, i think jrm.py should catch it and log it
[18:11:24] but not reraise
[18:11:29] aha, but then...
[18:11:39] all stuff that happens after jrm returns... that should not happen
[18:11:45] looking
[18:11:49] like inserting
[18:12:31] well, the inserting happens inside of jrm.py in the insertion thread
[18:12:35] ja? in store_sql_events
[18:12:35] ?
[18:12:38] but.
[18:12:39] yes
[18:12:47] still there is stuff in the handler that shouldn't happen
[18:12:51] aha
[18:12:51] like statsd increments
[18:13:02] so mforns yeah maybe you are right
[18:13:13] I was thinking of having declare_table reraise a wrapper error
[18:13:22] that is known by the eventlogging-consumer code
[18:13:34] catching at handler level is fine if possible...but i'm not sure how that will work in the thread
[18:13:43] mforns: def not in eventlogging-consumer code
[18:13:51] i think that's too high up, no?
[18:13:54] mhm ok
[18:13:56] anything could use the handler
[18:13:58] not just the consumer
[18:14:05] you could do processor straight to mysql if you wanted to
[18:14:14] the bin/ scripts are like lib users
[18:14:17] I see
[18:14:25] stuff in eventlogging/ should all work
[18:16:29] it makes sense, I guess the handler needs to do some catch work too
[18:16:35] thanks ottomata
[18:17:06] yeah, mforns sounds tricky because i guess get_schema() is called in the insert thread
[18:17:14] unless...did we get rid of threading?
[18:17:21] ottomata, yes
[18:17:29] oh cool easy then
[18:17:30] right
[18:17:32] nice
[18:17:33] yep
[18:18:10] nuria_: around?
[18:18:15] yes
[18:18:33] nuria_: ah I PM-ed not sure you're getting my messages though :)
[18:18:43] sorry, just moved
[18:18:51] and got those now
[18:50:12] (PS12) Nuria: Allow filtering of data breakdowns [analytics/dashiki] - https://gerrit.wikimedia.org/r/278395 (https://phabricator.wikimedia.org/T131547) (owner: Jdlrobson)
[18:55:35] AndyRussG: Hi!
[18:55:44] I saw you were asking about notebooks
[18:55:48] madhuvishy: hey! how's it going?
[18:55:49] Yeah
[18:56:05] That's such a fun system!
[18:56:23] I just wanted to know if there was a facility for linking to contents or results of a notebook
[18:56:26] Just remembered tho
[18:56:37] afaik, only good way to share notebooks right now is downloading them
[18:56:44] a few months ago I think Ellery sent me one as a saved HTML document
[18:56:50] and maybe putting them on github
[18:57:04] or something that supports ipynb
[18:57:06] aah
[18:57:23] ahh foo joal, elukey, i am going to a mentoring orientation event 8:30am - 10am tomorrow
[18:57:23] hmmm
[18:57:23] Analytics, Commons, Multimedia, Wikidata, and 2 others: Allow tabular datasets on Commons (or some similar central repository) (CSV, TSV, JSON, XML) - https://phabricator.wikimedia.org/T120452#2219474 (matmarex) a:Milimetric>Yurik So I guess Yurik is working on making this happen? I'm not...
[18:57:34] i'd still like to do the migration, but won't be able to do at 91m
[18:57:36] 9am
[18:57:37] my time
[18:59:44] AndyRussG: it was probably a github page or something like that?
[19:01:00] Mmm no it was just an html document, with images embedded, somehow or another, which he e-mailed as an attachment... Yeah I hadn't remembered that when I asked here earlier... Hmmm lemme see exactly what it was
[19:01:35] AndyRussG: oh i know how
[19:01:37] https://ipython.org/ipython-doc/1/interactive/nbconvert.html
[19:01:47] you can convert ipynb to other formats
[19:02:00] he probably just converted to html and attached it
[19:02:13] madhuvishy: ah beautiful, thx!
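To spell out madhuvishy's nbconvert pointer just above: a minimal sketch using nbconvert's Python API, assuming the split-out nbformat/nbconvert packages (the equivalent CLI of the era was `ipython nbconvert --to html notebook.ipynb`). The filenames are placeholders.

```
import nbformat
from nbconvert import HTMLExporter

# Render a saved notebook to a standalone HTML document (plots embedded),
# suitable for e-mailing as an attachment or pushing to a gist/github.
nb = nbformat.read('analysis.ipynb', as_version=4)
body, _resources = HTMLExporter().from_notebook_node(nb)
with open('analysis.html', 'w') as f:
    f.write(body)
```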
[19:02:41] Yeah I could convert them and set up a github repo for versions I want to share then
[19:03:09] AndyRussG: it's pretty painful now - i'll keep you posted on progress on https://wikitech.wikimedia.org/wiki/PAWS/Internal experimental setup i'll work on soonish
[19:03:14] Analytics-Cluster, EventBus, Operations, Services: Investigate proper set up for using Kafka MirrorMaker with new main Kafka clusters. - https://phabricator.wikimedia.org/T123954#2219495 (RobH)
[19:04:28] madhuvishy: ah cool! Yeah I see there's a paragraph on publishing
[19:06:18] AndyRussG: ya that'll come later - to start with at least accessing them like - http://paws-public.wmflabs.org/paws-public/User:YuviPanda/Test.ipynb will happen
[19:10:07] Right! I saw an early demo in SF a few months ago
[19:11:04] cool!
[19:30:37] ottomata: Heya
[19:30:47] ottomata: postponing to 10:30 EST?
[19:39:57] ja, joal sounds good, i haven't sent email out yet because i wanted to be sure i was ready
[19:40:02] am prepping node now beforehand to be sure
[19:40:39] great ottomata
[19:40:50] Thanks for letting us know :)
[19:41:16] madhuvishy: from services team, doc issue solved (different versions of restbase on different servers, problem solved)
[19:41:34] Gone for night !
[19:41:44] Have a good end of day a-team :)
[19:42:12] laters!
[19:44:46] byee
[19:48:59] Analytics, Commons, Multimedia, Wikidata, and 2 others: Allow tabular datasets on Commons (or some similar central repository) (CSV, TSV, JSON, XML) - https://phabricator.wikimedia.org/T120452#2219801 (Yurik) merged, please take a look at https://commons.wikimedia.org/wiki/Commons:Village_pump/Pr...
[20:00:16] Hi
[20:00:32] I'm trying to establish an EventLogging dev setup following the instructions at https://www.mediawiki.org/wiki/Extension:EventLogging#Sanity_checking_your_setup
[20:00:57] Analytics-Cluster, Analytics-Kanban, Operations, hardware-requests, and 2 others: setup/deploy server analytics1003/WMF4541 - https://phabricator.wikimedia.org/T130840#2219864 (Ottomata) Ok, we are ready to proceed. Plan here: https://etherpad.wikimedia.org/p/analytics-meta 1. stop camus early...
[20:01:08] I've been able to get eventlogging-devserver to listen on port 8081, but the JS test does not record any event
[20:01:26] strainu: are you using mediawiki-vagrant or doing this manually?
[20:01:38] I'm using vagrant, yes
[20:02:27] here is what I added to LocalSettings.php: http://pastebin.com/ZwYdMpvw
[20:02:55] joal: let's plan to do 10:45 for downtime, just so there is time for me to get settled and ready
[20:03:12] strainu: you've enabled the eventlogging role?
[20:03:34] yes, and followed the server setup instructions from the README
[20:03:53] I see the extension in Special:Version
[20:04:11] the server starts and says # Listening to events.
[20:05:15] as a side note, I also needed to enable the geshi role, although this isn't mentioned anywhere :)
[20:05:24] but otherwise eventlogging would just fail
[20:06:04] hm, that's strange
[20:07:20] OH
[20:07:24] strainu: i think these docs are outdated
[20:07:26] hm
[20:07:40] eventlogging::devserver should be run by the role and managed, you shouldn't have to run it manually out of server/
[20:07:47] i will fix these docs if we figure this out
[20:08:00] loading my vagrant...
[20:08:39] I don't think this is possible at the moment - you just don't have the necessary dependencies installed when enabling the role
[20:08:50] perhaps you have some new, unmerged version?
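Circling back to the jrm.py thread from earlier (18:07–18:18, with mforns's 21:08 follow-up further down): a minimal sketch of the catch-and-log approach settled on there. The shapes of store_sql_events/declare_table and the exception name are illustrative stand-ins, not the actual EventLogging code.

```
import logging

log = logging.getLogger('eventlogging')

class SchemaError(Exception):
    """Illustrative wrapper for 'schema could not be retrieved'."""

def declare_table(meta, schema_name):
    # Illustrative stub: in EL this is where the schema lookup,
    # and therefore the failure, would happen.
    raise SchemaError(schema_name)

def store_sql_events(meta, events):
    # Each call gets a single-schema batch (per mforns at 21:08).
    # Catch and log here, per the discussion above, so nothing
    # downstream in the handler (statsd increments, etc.) runs
    # for a failed batch - and bin/ consumers never see the error.
    try:
        table = declare_table(meta, events[0]['schema'])
    except SchemaError:
        log.exception('Skipping batch of %d events: missing schema',
                      len(events))
        return
    table.insert().execute(events)  # illustrative SQLAlchemy-style insert
```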
[20:09:12] you do, they are now managed using virtualenv and gitupdate
[20:09:18] it is merged
[20:09:45] https://github.com/wikimedia/mediawiki-vagrant/blob/master/puppet/modules/role/manifests/eventlogging.pp#L14
[20:09:48] ah, :8100?
[20:09:49] hm
[20:09:56] oh right, because apache rewrites them
[20:09:58] that is fine
[20:10:14] strainu: are you running with up to date mw-vagrant repo?
[20:10:17] if so
[20:10:23] yes
[20:10:30] and someone is listening on :8100
[20:10:37] ok cool
[20:10:38] let me try to change the port on the LocalSettings
[20:10:39] not you, right?
[20:10:41] nono
[20:10:48] i think apache is configured to proxy to 8100
[20:10:51] so, check
[20:10:59] /vagrant/logs/eventlogging.log
[20:11:06] do you see events there?
[20:11:17] mforns: I cloned reportupdater in 1002, can you send me the config changes you were using to test your change?
[20:11:31] nuria_, sure
[20:12:00] nuria_, are you testing using the browser reports?
[20:12:07] ottomata: nope
[20:12:18] mforns: whatever is easier, seems that we better test with those than edit
[20:12:50] nuria_, ok, but the browser reports do not fall in the specific scenario
[20:13:02] where the query/script returns empty results
[20:13:12] ok strainu, so what does
[20:13:24] ps aux | grep eventlogging-devserver show?
[20:13:28] nuria_, I used them too, to test that the regular functionalities were not broken
[20:13:33] /vagrant/srv/eventlogging/virtualenv/bin/python /vagrant/srv/eventlogging/virtualenv/bin/eventlogging-devserver ...
[20:13:33] ?
[20:13:36] or something else
[20:14:00] nuria_, but to test the actual change, we should use the limn-edit-data repo, for example
[20:14:08] yep, that's the one
[20:14:18] ok, cool, so it's running
[20:14:18] nuria_, if you want, we can batcave, I think the documentation does not include that
[20:14:20] hm
[20:15:18] strainu: ok, do
[20:15:28] sudo tail -f /var/log/upstart/eventlogging-devserver.log
[20:15:31] and try to submit an event
[20:16:27] nothing
[20:17:46] hm, strainu, remove your special LocalSettings.php stuff
[20:17:56] pretty sure the role should take care of it
[20:18:12] nuria_, but to test with the browser reports the only thing I changed in /home/mforns/reportupdater-queries/browser/config.yaml was the 'start' parameter of all reports to 2016-04-03
[20:18:35] this will generate the last 2 weeks of data
[20:18:55] still nothing - my LocalSettings.php is not identical with the one from git
[20:19:48] oh, well the ones you sent me
[20:19:53] does anything need to be restarted / reprovisioned?
[20:20:01] the puppet stuff tries to set wgEventLoggingBaseUri
[20:20:07] so maybe re-provision after removing that?
[20:20:12] i'm not sure how LocalSettings.php works
[20:20:18] ok, will try
[20:20:23] mforns: k
[20:21:51] ottomata: no luck :(
[20:22:17] and no output in /var/log/upstart/eventlogging-devserver.log ?
[20:23:16] strainu: cat /etc/apache2/env-enabled/define-ELDevServer
[20:23:21] should say
[20:23:21] export APACHE_ARGUMENTS="$APACHE_ARGUMENTS -D ELDevServer"
[20:23:22] and
[20:23:59] cat /etc/apache2/site-confs/devwiki/00-default.conf
[20:24:00] correct
[20:24:07] should have
[20:24:08] <IfDefine ELDevServer>
[20:24:08] ProxyPass /event.gif http://127.0.0.1:8100/event.gif
[20:24:08] ...
[20:24:09] at the bottom
[20:24:28] correct again
[20:24:40] let me try to see if the variable has been correctly set
[20:24:46] ok, how are you submitting events?
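Before the answer that follows: the sanity check can also be scripted. A minimal Python sketch of roughly what the EventLogging beacon does — percent-encode the JSON event into the event.gif query string (apache proxies /event.gif to the devserver on :8100). Both the payload and the exact wire format (encoded JSON plus a trailing semicolon) should be treated as assumptions here.

```
import json
import urllib.parse
import urllib.request

# Illustrative payload; a real event names an existing schema + revision.
event = {'schema': 'TestSchema', 'revision': 1, 'event': {'foo': 42}}

qs = urllib.parse.quote(json.dumps(event))
resp = urllib.request.urlopen('http://localhost:8080/event.gif?%s;' % qs)
print(resp.getcode())  # expect 204 No Content from the devserver
```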
[20:25:15] that's the problem:
[20:25:21] mw.config.get( 'wgEventLoggingBaseUri' ); gives me "//localhost:8080/event.gif"
[20:25:36] that is correct
[20:25:38] ok, if you do
[20:25:41] curl -I 'http://localhost:8080/event.gif?woohooo'
[20:25:44] for submitting events I'm using the dev console in firefox
[20:25:45] do you get a 204?
[20:25:52] and the code from https://www.mediawiki.org/wiki/Extension:EventLogging#Sanity_checking_your_setup
[20:26:17] yes, I get a 204
[20:26:24] ok cool, so it's working
[20:26:46] and still nothing in /var/log/upstart/eventlogging-devserver.log
[20:27:54] strainu: i'm not very familiar with the frontend eventlogging stuff
[20:28:04] but, are you sure it is emitting events to localhost:8080/event.gif?
[20:28:08] strainu: FYI, that "//localhost" will be using https if you are accessing mw under https
[20:28:24] strainu: you want to make sure to use http
[20:28:37] strainu: what do you see on your console when events are logged ?
[20:28:44] on the network panel
[20:29:35] nuria_: nothing
[20:29:43] just the stuff that got loaded on loading the page
[20:30:13] so events are not being logged when you log them with mw.logEvent ?
[20:30:24] are you trying to log them that way?
[20:30:32] strainu: maybe you need to reload the js stuff since you changed the localsettings and reprovisioned?
[20:30:39] a page refresh maybe? not sure
[20:30:44] tried that
[20:30:46] hm
[20:30:52] ok
[20:31:06] strainu: mw.eventLog.logEvent( null, { "foo": 42 } );
[20:31:12] strainu: if you do
[20:31:13] sudo tcpdump -A port 8080
[20:31:18] and then try to fire an event
[20:31:20] do you get anything?
[20:32:41] no, hold on, there might be some caching issue still
[20:32:51] let me try in private browsing
[20:34:34] still nothin
[20:34:46] so it would appear that the events are not going to the right place
[20:34:50] strainu: can you log with: mw.eventLog.logEvent( null, { "foo": 42 } );
[20:35:09] if you do not see your request in the network panel for chrome
[20:35:20] strainu: there is no event being sent
[20:35:28] that's what I'm using
[20:36:08] strainu: that should give you "empty" schema
[20:36:18] a warning on console, so event will not be sent
[20:36:27] strainu: now you want to use a real event
[20:36:52] FYI, updated docs for mw-vagrant here: https://www.mediawiki.org/wiki/Extension:EventLogging#Using_mediawiki-vagrant
[20:37:13] but nuria
[20:37:15] I do have a warning, but docs say "The last line will generate console warnings in debug mode as null is not a known schema, but eventlogging-devserver should dump the event along with its own warnings."
[20:37:19] "The last line will generate console warnings in https://www.mediawiki.org/wiki/ResourceLoader/Features#Debug_mode as null is not a known schema, but eventlogging-devserver should dump the event along with its own warnings."
[20:37:21] hah
[20:37:21] Analytics, Hovercards, Unplanned-Sprint-Work, Reading-Web-Sprint-70-Lady-and-the-Trumps: Capture hovercards fetches as previews in analytics - https://phabricator.wikimedia.org/T129425#2105269 (dr0ptp4kt) Open>Resolved Signing off.
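For context on the ValidationError that comes up below: server-side, the devserver validates each event against its schema (EventLogging relies on the jsonschema library for this), while client-side a null schema only yields the console warning. A toy illustration — the schema here is made up.

```
from jsonschema import validate, ValidationError

schema = {
    'type': 'object',
    'properties': {'foo': {'type': 'integer'}},
    'required': ['foo'],
}

validate({'foo': 42}, schema)  # conforms, passes silently
try:
    validate({'foo': 'not-an-int'}, schema)
except ValidationError as e:
    print('invalid event:', e.message)
```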
[20:37:38] so strainu should see something in /var/log/upstart/eventlogging-devserver.log i think
[20:38:10] strainu: sorry, event will "not be valid" but should be visible on network panel like:
[20:38:38] yeah, strainu when I do that, i see ValidationError('Invalid revision ID %s' % revision) in eventlogging-devserver.log
[20:38:48] and also in network panel
[20:39:00] [null] Missing or empty schema
[20:39:15] strainu:
[20:39:29] https://www.dropbox.com/s/n6c7eqjiano0xy1/Screen%20Shot%202016-04-19%20at%201.38.14%20PM.png?dl=0
[20:39:30] I see "[null] Missing or empty schema" as the output in the console
[20:39:36] but nothing in network
[20:41:42] ok, so it works in Chromium
[20:41:46] but not in Firefox
[20:42:03] are you debugging all exceptions?
[20:42:04] I can work with Chromium ok, but perhaps you want to take a look at that
[20:42:15] nuria_: can you be more verbose? :)
[20:42:17] do you have adblock?
[20:42:46] strainu: stopping on all exceptions on js console
[20:43:01] strainu: cause it works fine on my ff
[20:43:14] I do have uBlock, but it's disabled
[20:44:02] huh, interesting! so strainu you see an error in eventlogging-devserver.log when you do that in Chromium?
[20:44:13] yep
[20:44:16] ok cool!
[20:44:20] let me try with the extensions disabled
[20:44:23] at least the other stuff works, phew! :)
[20:44:28] :))
[20:45:30] strainu: ya, i bet it's your adblock, as it works fine on latest FF
[20:46:00] strainu: ah, wait i do not have latest, i have 44
[20:46:06] ok, I can live with that :)
[20:46:50] one other difference I see is that when I do mw.loader.load( 'ext.eventLogging' );
[20:47:07] FF says: cached, while Chromium fetches the page again
[20:48:41] but, anyway, if it works with Chromium I'm good - I'll start working with real events tomorrow.
[20:48:52] thanks for all the help ottomata and nuria_ !
[20:53:26] ok cool glad you got it working!
[21:08:09] ottomata, re EL, one call to store_sql_events() has only events of a single schema
[21:08:36] actually a single batch
[21:08:39] oh
[21:09:13] fixing the tox problems
[21:09:41] ok cool
[21:09:44] that makes sense then mforns
[21:09:47] ok
[22:28:51] Analytics-Kanban, Operations, Patch-For-Review: Upgrade stat1001 to Debian Jessie - https://phabricator.wikimedia.org/T76348#2220771 (ezachte) I looked at the backups at stat1001. I need to tidy things up. Some backups occur too often, and have a lot of garbage in them. Apologies for the overhead this...