[00:24:54] Analytics / General/Unknown: Make sure 2013 traffic logs are gone from /a/squids/archive on stat1002 - https://bugzilla.wikimedia.org/63543 (Kevin Leduc)
[01:09:11] (CR) Nuria: "Since I saw that Marcel comment on the details I will restrict my comments to the design." (2 comments) [analytics/dashiki] - https://gerrit.wikimedia.org/r/168488 (owner: Milimetric)
[01:23:20] (CR) Nuria: "Yes. Please be so kind as to write a unit test that verifies that a cohort with tags can be deleted. I have tested your change and it fixe" [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/171726 (https://bugzilla.wikimedia.org/72434) (owner: Bmansurov)
[02:10:28] (PS3) Milimetric: Add Separated Values converter [analytics/dashiki] - https://gerrit.wikimedia.org/r/168488
[02:11:29] (CR) Milimetric: "Thank you for the comments. I addressed those and only one thing remains: changing the wikimetrics api's getData function to be able to f" (9 comments) [analytics/dashiki] - https://gerrit.wikimedia.org/r/168488 (owner: Milimetric)
[02:20:54] nite!
[09:32:48] (PS1) Yurik: Updated scripts [analytics/zero-sms] - https://gerrit.wikimedia.org/r/171810
[10:37:12] [Ops] [puppet-private] (8086a19) new DB 'research' user password
[10:37:27] that's the user we were using to generate the limn data for our team
[10:37:46] should we have been using another one? one way or another we're going to need new credentials for the script that generates our TSVs
[10:38:11] research_prod maybe?
[10:38:46] (CR) Yurik: [C: 2] Updated scripts [analytics/zero-sms] - https://gerrit.wikimedia.org/r/171810 (owner: Yurik)
[10:39:31] (CR) Yurik: [V: 2] Updated scripts [analytics/zero-sms] - https://gerrit.wikimedia.org/r/171810 (owner: Yurik)
[12:02:12] Analytics / Refinery: Raw webrequest partitions for 2014-11-06T21/1H not marked successful - https://bugzilla.wikimedia.org/73131 (christian) NEW p:Unprio s:normal a:None None of the webrequest partitions [1] for 2014-11-06T21/1H have been marked successful. What happened? [1] _____...
[12:02:42] Analytics / Refinery: Raw webrequest partitions for 2014-11-06T21/1H not marked successful - https://bugzilla.wikimedia.org/73131#c1 (christian) NEW>RESO/FIX Commit fe3b1679776adb3e0251a5b6fe12b7f5dab69438 got merged, which updated the varnishkafka configuration for the caches. This caused varnis...
[12:03:06] !log Marked all 4 raw webrequest partitions for 2014-11-06T21/1H ok (See {{bug|73131}})
[12:14:27] gi11es: DarTar sent out an email yesterday to the internal list that mentions that more people need to get access to the credentials.
[12:14:46] You were included in that list of people who need to have access granted.
[13:56:09] (CR) QChris: Transform projectcounts hourly files (1 comment) [analytics/refinery] - https://gerrit.wikimedia.org/r/169974 (https://bugzilla.wikimedia.org/72740) (owner: Milimetric)
[13:58:52] (CR) Milimetric: [C: 2] Hide cohort details when filtering results no results. [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/171489 (https://bugzilla.wikimedia.org/73040) (owner: Bmansurov)
[13:58:59] (Merged) jenkins-bot: Hide cohort details when filtering results no results. [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/171489 (https://bugzilla.wikimedia.org/73040) (owner: Bmansurov)
[13:59:56] Analytics / Wikimetrics: Misleading search result displayed when filtering cohorts - https://bugzilla.wikimedia.org/73040#c3 (Dan Andreescu) PATC>RESO/FIX will get deployed this Thursday
[14:19:18] milimetric, let's talk datasets!
[14:20:15] k
[14:20:24] so first - do you need any help generating it?
[14:24:17] Ironholds: ^
[14:24:26] nope! Not to my knowledge.
[14:24:35] We're good on that front (307 days in!)
[14:25:46] ok, and it'll be done when?
[14:26:42] (i don't know how much data you were asked for, or any details really :))
[14:31:30] Ironholds: ^
[14:31:41] * Ironholds thinks
[14:31:45] probably by the end of the weekend
[14:31:55] I was asked for as much data as I could provide, which is ~590 days' worth at the moment
[14:32:03] it'll be less after we sanitise the old logs, obviously.
[14:32:07] Ironholds: oh :) well then :) I'll have to teach someone how to load it into the cube
[14:32:17] yo! :P
[14:33:09] Ironholds: so when's the deadline?
[14:33:16] as in, after your set finishes, how long do we have?
[14:33:41] it absolutely has to be done by the 17th, but the ask is for Monday, as I understand it.
[14:33:49] Toby/DarTar will know better than me any specific restrictions.
[14:33:50] got it, ok
[14:34:30] qchris / ottomata: it looks like I'll be surfing when someone needs to load Ironholds' data into a new cube
[14:34:46] I'll do some prep work today, but can one of you do the actual loading?
[14:35:04] or just bully Marcel or Nuria into it
[14:35:11] * qchris reads backscroll
[14:36:15] so I can set up the new cube, load it with some fake data, and write down clear instructions for loading it
[14:36:31] milimetric: is this just some clicky-clicky or does this need more scripting/sanitization work too?
[14:36:32] Ironholds: I assume you're going to leave the output in the staging database on analytics-store?
[14:36:50] qchris: i'll do the clicky-clicky so you don't have to
[14:37:14] milimetric, yep!
[14:37:26] not sure about scripting, but you'll basically have to move data from analytics-store/staging to dan-pentaho/warehouse
[14:37:44] dan-pentaho/warehouse is a mysql/mariadb database?
[14:37:48] yes
[14:37:55] dan-pentaho.eqiad.wmflabs
[14:38:00] k.
[14:38:26] I guess I can help there.
[14:38:32] ok, thanks
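(Aside: the staging-to-warehouse move described above could, as one rough sketch, be a straight table dump piped between the two MariaDB servers from a host that can reach both. The table name example_table and the warehouse credentials file are placeholders, not names from this log, and this assumes the research credentials can read the staging database.)

    # Rough sketch only: copy one table from the staging database on
    # analytics-store to the warehouse database on dan-pentaho.
    # "example_table" and the warehouse .cnf path are hypothetical.
    mysqldump --defaults-file=/etc/mysql/conf.d/research-client.cnf \
        -h analytics-store.eqiad.wmnet staging example_table \
      | mysql --defaults-file="$HOME/.my.warehouse.cnf" \
        -h dan-pentaho.eqiad.wmflabs warehouse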
[14:38:43] sounds like the absolute deadline is the 17th and the nice-to-have is Monday
[14:39:08] and I've already told Toby that this is very disruptive at a point when we're trying to finish a challenging sprint
[14:39:46] * qchris withholds his opinion here.
[14:40:05] honestly, as scrum master, I can push back
[14:40:09] I've not entirely made up my mind
[14:40:20] let's talk briefly after standup
[14:40:38] k
[15:20:30] milimetric, I agree!
[15:20:39] it's pretty disruptive at the R&D end too
[15:21:02] there I was, hoping to work on our session definition and suchlike, and bam, boardwerk.
[15:58:25] qchris: talking to toby now
[15:58:29] on batcave
[16:14:02] ottomata: hellooouuu
[16:14:51] helloooww
[16:14:56] ottomata: we were going to hang the pageview files in the dataset endpoint
[16:15:02] ottomata: does that sound ok?
[16:15:05] eh?
[16:15:43] ottomata: you know that dan is working on creating pageview files per project from the current webstatscollector files, right?
[16:15:51] yes, daily counts, right?
[16:15:55] yes
[16:15:56] yes
[16:16:14] ottomata: so those files will be "served" via http
[16:16:30] from the datasets endpoint, makes sense?
[16:16:42] not exactly sure what you mean by datasets endpoint
[16:16:47] datasets.wikimedia.org?
[16:16:54] why not keep them on dumps, with the other files?
[16:16:56] http://dumps.wikimedia.org/other/pagecounts-all-sites/2014/2014-11/
[16:17:03] maybe a daily/ subdir there
[16:17:04] or something
[16:17:25] ottomata: http://datasets.wikimedia.org/
[16:17:36] aye, why put them on a different host than the hourly files?
[16:17:52] I thought that is what qchris thought was best
[16:19:48] hm, qchris would know then? I'd say keep them on the same host
[16:19:50] ottomata: it was his suggestion
[16:20:41] I think qchris might be gone for the weekend
[17:06:16] ottomata: I was going to work on puppet changes for datasets to add a directory where files should go, but if you feel things should be under dumps I can postpone that
[17:12:52] nuria
[17:12:52] hm
[17:12:53] https://github.com/wikimedia/operations-puppet/blob/production/manifests/role/dataset.pp#L20
[17:12:56] if you put it in hdfs
[17:13:08] it will automatically get synced to dumps
[17:16:58] nuria__: ^
[17:17:09] ottomata: at your service
[17:17:29] hdfs dfs -ls /wmf/data/archive/webstats
[17:17:33] so, if you put it in there somewhere
[17:17:37] it will automatically get synced over
[17:17:59] no puppet changes needed :)
[17:18:28] ottomata: I think we have to wait for qchris to be back then, 'cause he mentioned datasets explicitly
[17:19:17] ottomata: 'cause also we would need to send things to a particular directory
[17:19:28] where we can set up CORS and cache headers
[17:21:17] hm, is CORS to dumps a problem?
[17:21:19] you could do
[17:21:28] /wmf/data/archive/webstats/daily/...
[17:21:29] if you want
[17:21:31] iunno
[17:21:35] yeah qchris probably has opinions
[17:25:55] ottomata: ok, let's wait then, both milimetric & qchris mentioned datasets as the place where we wanted to serve those files from
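(Aside: per ottomata above, anything placed under /wmf/data/archive/webstats in HDFS is picked up by the existing sync to dumps, so publishing the daily files could be as simple as the sketch below. The daily/ subdirectory is only the example floated in the conversation, and the file name is made up for illustration.)

    # Sketch, assuming the rsync behind the role::dataset puppet code
    # linked above really picks up everything under this archive path.
    # pagecounts-2014-11-07.gz is a hypothetical example file name.
    hdfs dfs -mkdir -p /wmf/data/archive/webstats/daily
    hdfs dfs -put pagecounts-2014-11-07.gz /wmf/data/archive/webstats/daily/
    # Confirm the file landed where the sync job will find it:
    hdfs dfs -ls /wmf/data/archive/webstats/daily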
[17:28:28] Analytics / EventLogging: Event Logging tables that we can drop as of Nov 7th - https://bugzilla.wikimedia.org/73140 (nuria) NEW p:Unprio s:normal a:None From Dan Garry on Nov 6th I'm quickly checking through the EventLogging database, and there are a lot of redundant tables in there which...
[17:29:24] milimetric, if I have a question about EL data collection, should I pass it to you or someone else on the team?
[17:29:52] shoot lzia
[17:30:44] nuria__, mforns: I'm just finishing my burger, but batcave in a bit?
[17:30:53] milimetric: sure
[17:30:57] yep 10mins?
[17:30:58] ContentTranslation started collecting data in Beta yesterday. The table is not in EL, but I see logs in /a/eventlogging/archive/server-side-events.log-20141107.gz
[17:31:00] k
[17:31:07] 10 min
[17:31:39] lzia: most likely events are invalid then, one sec
[17:31:40] I'm wondering why the data hasn't gone into the ContentTranslation table (and the table is not there). Can this be a schema problem?
[17:31:43] milimetric: make it 15, fast food, eat slow
[17:31:45] oh okay.
[17:33:41] lzia: I'm here.
[17:34:28] hi jsahleen, we're trying to see what's up with your events
[17:34:30] cool jsahleen
[17:34:53] nuria__: if event logging events don't show up in http://graphite.wikimedia.org/ under eventlogging -> schema -> ContentTranslation, that means the events sent with that schema are not valid, right?
[17:35:10] It's entirely possible the collection code is not written correctly.
[17:35:12] or that they are not being sent
[17:35:29] nuria__: leila sees them in the logs, so they appear to be sent
[17:35:30] milimetric: which is most often the case
[17:35:49] milimetric: i can look in the logs for any occurrence of those events
[17:36:05] leila: in what logs?
[17:36:11] in /a/eventlogging/archive/server-side-events.log-20141107.gz
[17:36:44] yep, i just checked and they show up there
[17:37:02] oh :) no they don't
[17:37:18] mm. what are the three logs there then, milimetric?
[17:37:22] i see Extension:ContentTranslation but no Schema of ContentTranslation
[17:37:30] doh!
[17:37:40] milimetric, bmansurov lemme look; in our experience, more often than not the problem, rather than events not validating (which you CAN check in vagrant),
[17:37:41] lzia: you can see \"schema\":\"PageCreation\" in those lines where ContentTranslation shows up
[17:37:47] is that events are not being sent
[17:37:58] nuria__: you're right - we accidentally thought they were in the logs
[17:38:05] because the word ContentTranslation shows up
[17:38:28] so jsahleen: the events are not being sent, could you point us to the code that should send them?
[17:38:42] Getting it now...
[17:38:47] nuria__: I totally should have vagrant on my machine. my next todo for next week
[17:38:51] actually jsahleen: can you guys test (using a vagrant instance)
[17:38:57] that events are being sent 1st
[17:39:20] jsahleen: that is easy to do in the test environment and many errors can be corrected there
[17:39:22] I have vagrant on my machine. Do I just enable the python server?
[17:39:47] The debug server?
[17:39:54] jsahleen: ya, go through your code that triggers the event and look at the network requests
[17:40:01] jsahleen: being sent 1st
[17:40:13] Ok. I can do that. Makes sense. I did not write this code so...
[17:40:25] jsahleen: right, 1st) check that events are sent
[17:40:34] 2nd) check whether they are valid
[17:41:00] jsahleen: to see whether they are sent, the network tab of Chrome, say, is sufficient
[17:41:07] jsahleen: let us know what you find
[17:41:29] Will do. Thanks. Here is a link to the code in case anyone needs that. https://git.wikimedia.org/blob/mediawiki%2Fextensions%2FContentTranslation/89b6284f06b4419ddec6dcccee0eed500f267100/modules%2Feventlogging%2Fext.cx.eventlogging.js
[17:41:32] nuria__: I'm in the 'cave
[17:41:41] milimetric: going
[17:41:43] thx jsahleen, good luck
[17:41:59] Thanks. May need it.
[17:43:21] thanks, nuria__, milimetric for chiming in.
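(Aside: the check that tripped everyone up above — the string ContentTranslation appearing in the log without any actual ContentTranslation schema events — can be made explicit with zgrep. A sketch; the escaped quotes follow milimetric's paste above, and if the archived file stores plain unescaped JSON the extra backslashes should be dropped.)

    # Any occurrence of the word; this also matches Extension:ContentTranslation,
    # which is what caused the false positive above:
    zgrep -c 'ContentTranslation' \
        /a/eventlogging/archive/server-side-events.log-20141107.gz
    # Actual events tagged with the schema (quotes escaped as in the paste above):
    zgrep -c '\\"schema\\":\\"ContentTranslation\\"' \
        /a/eventlogging/archive/server-side-events.log-20141107.gz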
[17:44:57] Analytics / Quarry: WMF employees use an internal Quarry instance to share data & sample queries - https://bugzilla.wikimedia.org/73142 (Kevin Leduc) NEW p:Unprio s:enhanc a:None Yuvi's description of the task: If you guys are worried about this taking too much of analytics' time, here is a...
[17:45:24] ottomata: so are you against this data going to datasets? I think it would be harder to enable CORS on dumps than on datasets
[17:50:29] Analytics / Quarry: AnalyticsEng has a transition plan for maintenance of Internal Quarry - https://bugzilla.wikimedia.org/73143 (Kevin Leduc) NEW p:Unprio s:enhanc a:None Dan's comment on internal list: I think the initial setup should be de-coupled from the dev team's process. It sounded...
[17:51:27] Analytics / Quarry: AnalyticsEng has a transition plan for maintenance of Internal Quarry - https://bugzilla.wikimedia.org/73143 (Kevin Leduc)
[17:51:27] Analytics / Quarry: WMF employees use an internal Quarry instance to share data & sample queries - https://bugzilla.wikimedia.org/73142 (Kevin Leduc)
[17:57:35] milimetric: not against, it just makes more sense to keep the data together
[17:57:47] especially since there is already code to copy the stuff to dumps
[18:00:18] kevinator: yay on the bug being filed, etc :)
[18:00:48] you’re welcome!
[18:16:23] Ironholds: hey so the schema for the table you're going to generate
[18:16:33] mforns and I are trying to prepare our end
[18:16:49] if you're going to create that in staging anyway, you wanna do it now?
[18:17:01] not for ~2 hours. Meetings :/
[18:17:12] ok, cool
[18:17:21] we'll look for it then and go by your email now
[18:23:12] Analytics / Wikistats: WikiStats update cronjob failing - https://bugzilla.wikimedia.org/73146 (Yuvi Panda) NEW p:Unprio s:normal a:None Cron /usr/bin/php /usr/lib/wikistats/update.php wx > /var/log/wikistats/update_wx@18.log 2>&1 /bin/sh: 1: cannot create /v...
[18:53:30] is anyone from WMF analytics set up with Google Webmaster Tools yet?
[19:05:15] eloquence: first time I've heard about it...
[19:05:27] Eloquence: sorry, first time I've heard about it...
[19:14:11] Analytics / Wikimetrics: report table performance, cleanup, and number of items - https://bugzilla.wikimedia.org/72635 (Dan Andreescu) PATC>RESO/FIX
[19:14:29] Analytics / Wikimetrics: Story: WikimetricsUser tags a cohort using a pre-defined tag - https://bugzilla.wikimedia.org/72746 (Dan Andreescu) PATC>RESO/FIX
[19:16:26] Analytics / EventLogging: List tables/schemas with data retention needs - https://bugzilla.wikimedia.org/72741 (Dan Andreescu) NEW>RESO/FIX
[19:23:50] Eloquence: I was for some personal sites, what are you looking for?
[19:37:53] milimetric, nuria__ - we're interested in poking at average search ranking -- I know Stu's already set up on it, but was wondering if anyone at WMF is
[19:37:56] (stu west from the board)
[20:36:29] Eloquence: I see, you mean access to Google Webmaster Tools for, say, the en.wikipedia website?
[20:38:20] milimetric: i'm taking your patch #3 and starting from there, ok?
[20:38:23] on https://gerrit.wikimedia.org/r/168488
[20:38:23] nuria__, Eloquence: I’d love to see that data (if we have it), is that something that Stu owns?
[21:05:16] ironholds@stat1002:~$ cat /etc/mysql/conf.d/research-client.cnf
[21:05:16] cat: /etc/mysql/conf.d/research-client.cnf: No such file or directory
[21:05:18] ottomata, ^
[21:05:21] plz explain ;p
[21:09:38] stat1003
[21:09:41] not stat1002
[21:10:26] Ironholds: there is a file on stat1002, but the researchers group can't access it; this is because not all researchers have access to stat1002
[21:10:29] on stat1002
[21:10:38] okay
[21:10:46] wait, which researchers don't have access to 002?
[21:10:58] /etc/mysql/conf.d/analytics-research-client.cnf
[21:11:01] Ironholds: many?
[21:11:05] stat1002 has private data
[21:11:20] huh. Who's in the research group? :/
[21:11:21] iunno man, there is a 'researchers' group that is maintained for access to stat1003 and to this password file
[21:11:37] https://gerrit.wikimedia.org/r/#/c/171828/2/modules/admin/data/data.yaml
[21:11:47] That was updated only this morning, after the password changed and people finally cared :p
[21:26:44] aha
[21:29:02] hey mforns
[21:29:07] I see you tried sudo on stat1003 :)
[21:29:12] please don't do that.
[21:29:17] are you not able to read the password otherwise?
[21:29:26] hi!
[21:29:51] I'm trying to connect to analytics-store.eqiad.wmnet mysql
[21:30:06] sorry for the sudo
[21:30:20] let me look
[21:32:07] YuviPanda: ottomata says I'm not in the researchers group
[21:32:19] yeah, just noticed.
[21:32:22] YuviPanda / mforns: the few lines above are relevant to you here. https://gerrit.wikimedia.org/r/#/c/171828/2/modules/admin/data/data.yaml
[21:32:29] yeah, I'm there atm :)
[21:32:37] I'm not sure what the process is to add someone to a group...
[21:32:46] i guess a patch similar to that one :)
[21:32:50] milimetric: mforns ah, he's taking care of it
[21:32:53] as soon as that got merged, I had access
[21:32:54] yeah :)
[21:33:02] (PS1) Ottomata: Add mforns to researchers group [puppet] - https://gerrit.wikimedia.org/r/171961
[21:33:08] mforns: you should have access shortly, I think :)
[21:33:31] thanks guys!
[21:33:37] milimetric: remember what happened when I/we first tried sudo on stat1 long, long ago? :)
[21:34:55] YuviPanda: I know!! That was like - welcome to the real world
[21:35:11] mforns is lucky / unlucky that leslie's not around to yell at him :)
[21:35:20] milimetric: yeah! :D I'm just laughing a fair bit here since this happened again, but like this :D
[21:35:25] try now mforns
[21:35:25] O.o
[21:35:35] chain continues, etc :)
[21:35:53] i know! Tis the Ciiiiircle.... Circleeee of Sudo
[21:35:56] :D
[21:36:16] yes, it worked
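(Aside: once the researchers group membership is in place, the credentials file can be read directly — no sudo — and handed to the mysql client. A sketch; --defaults-file is a stock client option that must come before other options, and on stat1002 the file is the analytics- prefixed one ottomata mentions above.)

    # On stat1003, members of the researchers group can read the
    # credentials and connect to the analytics slave without sudo:
    cat /etc/mysql/conf.d/research-client.cnf
    mysql --defaults-file=/etc/mysql/conf.d/research-client.cnf \
        -h analytics-store.eqiad.wmnet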
[21:36:17] milimetric: now I'm on the other side of https://xkcd.com/838/
[21:36:45] coool
[21:36:56] for context to others - about two years ago, when milimetric and I were messing around on stat1 (the old stat box), we tried to sudo apt-get install some things. received a stern letter from leslie :)
[21:37:06] (who was in ops then)
[21:38:05] anyway, nevermind :)
[21:39:21] stern letter is a bit of an understatement
[21:39:31] I thought I was gonna get fired, I started packing up my laptop
[21:39:39] :/
[21:43:22] oh, I remember that letter
[21:43:27] THOU SHALT NOT INSTALL LOCALLY
[21:43:32] THOU SHALT NOT FORWARD AGENTS
[21:43:44] result: now instead of doing it and talking about it, people merely do it.
[22:03:40] hey so I'm heading out on vacation
[22:03:51] Ironholds: you and mforns have been talking - so you guys are good, right?
[22:04:13] yeah
[22:04:20] k, I'll be checking my mail, have a nice weekend everyone
[22:04:31] have nice days!!
[22:04:38] thx :)
[22:14:53] hey ottomata yt?
[22:15:03] have fun milimetric!
[22:17:50] DarTar, ja hey
[23:06:56] Analytics / Wikistats: WikiStats update cronjob failing - https://bugzilla.wikimedia.org/73146 (Daniel Zahn) a:Daniel Zahn
[23:55:03] (PS2) Bmansurov: Delete cohort tags when a cohort is deleted [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/171726 (https://bugzilla.wikimedia.org/72434)
[23:59:14] nuria__: hi, any suggestions on the next wikimetrics bug/feature I should work on?