[14:14:26] halfak: you said you'd make your dataset public?
[14:29:56] harej, yeah. Sorry to log off before telling you where to grab it yesterday.
[14:30:07] http://datasets.wikimedia.org/public-datasets/enwiki/etc/references.enwiki-20150403.tsv.bz2
[14:30:10] harej, ^
[14:30:53] SHINY
[15:00:59] * guillom waves.
[15:02:15] good morning guillom
[15:02:18] :)
[15:04:51] hey halfak
[15:24:24] morning halfak, guillom :)
[15:24:51] o/ Ironholds
[15:24:55] hey Ironholds
[15:32:07] Today in WikiProject insanity: a WikiProject with two completely redundant categorization schemes
[15:32:36] Must be Monday.
[16:50:49] halfak, you got anything for today's meeting? I've got pretty much nothing unless you'd like to chat EventLogging schemas over readers :)
[16:51:56] Ironholds, talk page parser?
[16:52:05] Or mobile editor stuff.
[16:52:12] I haven't made progress yet :(
[16:54:24] Ironholds, ^
[16:54:54] halfak, wfm!
[16:55:00] I haven't made progress on either either :/
[16:55:04] what does?
[16:55:34] OK. Let's skip the 1:1 today and promise to consider making progress for next week :)
[16:58:59] halfak, okies!
[17:39:09] Quarry, Analytics-Kanban: it would be useful to run the same Quarry query conveniently in several database - https://phabricator.wikimedia.org/T95582#1257804 (kevinator) a:Milimetric
[17:44:43] guillom, ping :)
[17:52:50] Ironholds: yup?
[17:53:53] guillom, what google webmaster...thing..tools do we have access to? Wes is asking
[17:54:27] Ironholds: Well, Wes could read the email I sent him weeks ago in which I answered that question :p
[17:55:03] hah. He did! He just told me he'd already spoken to you slightly after he told me "do we have these tools?" and I type fast, apparently
[17:55:05] Ironholds: Basically: Google Webmaster Tools for all sites, and Bing webmaster tools for at least some sites, not sure if all
[17:55:10] neat
[17:55:56] Ironholds: Also, I'm drafting an email to Wes, Toby and Dario about who wants to take over the search engine monitoring project.
[17:56:00] (FYI)
[17:56:18] I think the answer is "me" ;p
[17:56:43] I was going to Cc: you and Jon K. :)
[18:22:10] hey guillom, I have some more context on that that I should share with you
[18:22:29] copying you and Ironholds
[18:22:42] DarTar, cool; thanks :). So much to do with new team :/
[18:23:20] I’m actually not 100% positive this belongs to S&D, I raised the question with lila and damon and haven’t heard back yet
[18:24:13] milimetric: omg you got assigned a quarry ticket!!!1
[18:25:33] :) don't get excited but I'm trying to argue that quarry should take precedence over wikimetrics based on usage. How do you feel about putting piwik on quarry?
[18:25:42] milimetric: +1
[18:25:49] https://piwik.wmflabs.org
[18:26:00] milimetric: I can also steal the logster code from wikimetrics
[18:26:14] woah
[18:26:16] who set that up?
[18:33:51] milimetric: I increased timeout to 20mins
[18:35:01] Quarry is a *lot* more user-friendly than Wikimetrics. It is a great joy to use because it is very pretty and the results actually output to a page instead of a weird file you have to download.
[18:35:46] (The only downside is that you have to know how to construct SQL queries, of course.)
[18:35:58] How does one use piwik?
[18:36:07] Do I need to send it events or something?
[18:36:51] halfak: it’s like Google Analytics
[18:36:55] Maybe a cool little JS snippet
[18:37:07] I'd like to be able to track some things on the server-side too.
[18:37:09] E.g. API usage.
[18:39:28] halfak: yeah.
[18:39:44] halfak: not sure how to work this in with privacy policy tho
[18:40:46] yuvipanda, indeed. IPs and Useragents might be an issue
[18:41:03] halfak: I think IPs will have to be scrubbed by default. UAs, idk.
[18:41:12] halfak: toollabs does that (scraps IPs, not UAs0
[18:41:14] )
[18:44:24] halfak: it's a little js snippet
[18:44:32] hang on, i'll come back in a bit, ops needs me
[18:44:50] No worries.
[18:51:59] DarTar: Sorry, I was picking up lunch. I don't have a strong opinion on who should take over SEM, as long as it's someone who can devote more time to it than me :) which is why I was going to email the three of you and see who wants it the most
[18:52:59] halfak: https://piwik.org/docs/privacy/
[18:53:09] halfak: we can just turn everything up to max, allow it only for https
[18:53:38] Why limit to https?
[18:54:00] Piwik is quite privacy-minded.
[18:54:06] why not? :)
[18:54:07] anyway
[18:54:10] doesn’t matter
[18:55:17] guillom: totally agreed
[18:56:22] yuvipanda, milimetric: I told a bunch of people that piwik on Labs makes me cringe for its privacy implications :(
[18:56:34] it does me too
[18:56:54] so we should step back and see what exactly we wanat to measure
[18:57:38] DarTar: I think we should / can experiment with it, however.
[18:58:07] yuvipanda: as long as it’s with private reports and Legal is fine with that, I’m cool
[18:58:37] piwik?!
[18:58:39] noooo
[19:00:19] DarTar: legal signed off, everything seems kosher
[19:00:36] kevinator had some reservations before we started rolling it out
[19:00:38] like we can limit some of the widgets
[19:01:16] milimetric: wow, so exposing my location at city level non aggregated is kosher?
[19:01:44] brb
[19:01:45] no, but we can turn that off
[19:02:03] point is, it's customizable and we should customize it
[19:19:07] milimetric: the part that scares me is that right now there’s a lof of PII about myself that is publicly disclosed and I’m surprised to hear Legal signed off, before figuring out what needs to be switched off and also whether this is consistent with our overall approach to privacy (there’s no opt-out, an ugly lot of passive fingerprinting etc)
[19:19:59] DarTar: well, labs privacy is very loose to say the least
[19:20:00] I’m in favor of reusing 3rd party open source libraries otherwise
[19:20:17] we all acknowledge that most data in labs can be hacked fairly easily and that we do nothing to actively prevent that
[19:20:18] milimetric: true, and that’s a problem
[19:20:23] i agree
[19:20:34] i can turn off piwik, that's no problem
[19:20:43] or at least turn off anonymous access, would that be ok for now?
[19:21:00] it was just an experiment on two very inconsequential sites
[19:21:17] I don’t want to be a PITA but I want to make sure we’re on the same page for legal (and yeah, turning off anon access would help)
[19:25:00] ok, nobody has access to the piwik instance. It was just for playing with anyway. If anyone would like to run any experiments, please contact me.
[19:26:07] but in general, it seems like people would like at least some of the functionality and like the idea of reusing simple third party tools with quick integration. I think we have enough use cases to warrant standing up a productionized intstance
[19:26:31] and we have enough experience with privacy to make the right tweaks - country level only, etc. Basically, we can turn privacy settings all the way up as others were saying
[19:26:59] milimetric: thanks, makes sense
[19:48:38] milimetric: who is hosting it?
[19:49:30] we could either host it in labs or prod i guess, but you mean the current one?
[19:49:32] milimetric: I’m up for setting up a controlled, labs-supported instance of piwiki with all controls turned way up :)
[19:49:34] it's on a separate instance
[19:49:36] milimetric: yeah the current one
[19:49:41] oh,
[19:49:41] yeah!
[19:50:03] milimetric: with its own project, and heavily controlled shell (similar to our https proxy)
[19:50:32] yeah, that'd be great
[19:51:04] milimetric: and proper perf / security (separate https, no access to IPs, HHVM, etc)
[19:51:09] milimetric: can you file a ticket?
[19:52:05] yuvipanda: https://phabricator.wikimedia.org/T98058
[19:52:57] milimetric: commented already :D
[19:53:23] milimetric: what’s the timeline you’re looking at?
[19:53:41] i don't think it's super urgent from our point of view
[19:53:46] but there are a lot of stakeholders
[19:53:59] so i guess in aggregate, the earlier the better
[19:54:31] hmm
[19:55:15] * halfak watches on with interest
[19:56:31] so mysql + hhvm + nginx
[19:56:54] milimetric: since this could help put more resources on quarry… :D
[19:57:56] one of my quarries has been queued for the past like hour
[19:58:59] harej: hit ‘run query’ again?
[19:59:04] harej: it’s a bug that happens sometimes
[19:59:20] there we go
[20:00:30] And now to wait another ten minutes...
[20:11:25] anyone know who I go to to have a new phabricator board created?
[20:11:56] wait, found the guide
[20:22:31] * Ironholds sigh. high-stress day.
[20:22:53] * yuvipanda gives Ironholds hugs
[20:23:09] Ironholds: I’ve phab creation rights if you want to expediate a request
[20:24:23] yuvipanda, https://phabricator.wikimedia.org/T98064 please!
[20:27:18] Ironholds: I responded asking for more info
[20:55:34] yuvipanda, done
[20:55:36] halfak, for the annual reviews...do we review the people who picked us out to review them, or what?
[20:55:36] because I submitted my names and am now twiddling my thumbs
[20:55:37] Ironholds: No, I think you need to wait until people pick you. Or something.
[20:55:37] * guillom is still lost in the process due to old-manager-leaving and 1-monty-old-team-disbanding.
[20:55:37] Ironholds: https://phabricator.wikimedia.org/project/profile/1232/ I called it Search-Data-Analytics (more precise). is that ok?
[20:55:37] month*
[20:55:37] Ironholds, I think our due date is the 8th, so I think we wait a while
[20:55:38] TO get assignments anyway.,
[20:55:38] gotcha
[20:55:38] * halfak see guillom's answer and feels shame for repeating
[20:55:38] yuvipanda, sure
[20:55:38] thanks :)
[20:55:38] halfak: yours was more specific! No shame required!
[20:55:38] Ironholds: yw
[21:38:43] so, last call
[21:38:50] does anyone need stat1002 for anything big in the next 24 hours?
[21:48:46] Ironholds, did you ping EZ?
[21:49:09] halfak, well, I sent out an email to the mailing list
[21:50:12] either he's unconscious because it's late (in which case, he won't be running anything until tomorrow) or he's not got anything running (great) or he's not following the mailing list (a problem, but not a problem at my end ;p)
[21:50:30] I sort of assume the lack of response was "nobody has anything so nobody said anything".
[21:51:55] I think 4 cores is pretty safe too.
[21:52:01] What's the memory footprint going to look like?
[21:52:17] Lack of response was: I don't read my email that often.
[21:52:51] ahh
[21:52:55] and, 20-30% tops?
[21:53:22] Sounds like you maybe didn't even need to announce :)
[21:53:47] yes, but I wanted to be polite
[21:54:08] and I don't particularly enjoy email threads informing me my code and tasks are garbage that should not be sullying a machine CLEARLY designated for one SPECIFIC employee. Say ;p
[21:54:30] (unrelated: so apparently when I naturalise, I have to take an english-language test. This'll be good.)
[21:54:54] You're going to yankeefy yourself?
[21:55:13] ottomata: btw, I setup a mesos cluster on labs yesterday :)
[21:55:22] ottomata: is running marathon: marathon.wmflabs.org
[21:55:34] guillom, at some point
[21:55:37] it'll mean crazy things!
[21:55:44] like "being able to leave the country for extended periods!"
[21:56:00] as a permanent resident, let me say how much side-eye I give to people freaked out by the NSA, actually
[21:56:08] the freakouts are legitimate, but my brain does go
[21:56:15] Ironholds: Does it mean renouncing your UK citizenship?
[21:56:15] * yuvipanda is going to be out of the country for close to 2 months
[21:56:17] this gonna be fun
[21:56:26] "sorry, I was just making sure I had my government-mandated ID card on me, because if I don't carry it with me 24/7 I can be immediately deported with 48 hours notice"
[21:56:31] "what were you saying about your emails?"
[21:56:49] guillom, nope; I wouldn't mind tremendously if it did, though
[21:56:51] less tax paperwork
[21:57:02] I looked into it a few years ago, but when I saw that it'd mean losing my French citizenship I said "uugh, no".
[21:57:06] whoa! yuvipanda cool!
[21:57:11] ottomata: it’s puppetized too
[21:57:13] not merged yet
[21:57:39] don't know much about it, aside from having heard of marathon
[21:57:43] Ironholds: oh, hmm. I thought you had to renounce your citizenship to become US citizen (unless you were lucky and had both citizenships due to borth or something).
[21:57:56] ottomata: yeah, is like a ‘meta’ framework. you run stuff on top of it.
[21:57:57] to become a*
[21:58:02] oh! no. You have to renounce allegiance, but the state department decided this doesn't mean citizenship
[21:58:05] birth*
[21:58:13] * guillom can't type today.
[21:58:14] unless you're like, Roan, and come from a country that objects to dual nationality ;p
[21:58:19] ottomata: allows you to use one cluster of machines between different frameworks, with first class support for hadoop, spark, docker and other things.
[21:58:35] Ironholds: Hmm, interesting. How recent is this?
[21:58:36] ottomata: and I’m going to write support for ipython notebooks on it in a while
[21:59:27] * guillom looks up https://en.wikipedia.org/wiki/United_States_nationality_law .
[22:00:04] http://travel.state.gov/content/travel/english/legal-considerations/us-citizenship-laws-policies/citizenship-and-dual-nationality/dual-nationality.html
[22:00:21] "A U.S. national may acquire foreign nationality by marriage, or a person naturalized as a U.S. national may not lose the nationality of the country of birth. U.S. law does not mention dual nationality or require a person to choose one nationality or another"
[22:00:25] yuvipanda: cool, I know mesos, but marathon makes it easier to do services somehow?
[22:00:34] guillom: It's not recent. I think it was found to be unconstitutional to deny someone their other citizenships, even though the oath makes you renounce it.
[22:00:41] guillom: So, practically speaking, that oath means nothing.
[22:00:44] ottomata: marathon runs long-running process in nicely scalable ways
[22:00:57] ottomata: so you give it a process, give it a count, and it keeps that many of them running across the cluster
[22:01:03] does health checks and starts them back up if necessry
[22:01:14] ah, hm, cooooOl
[22:01:15] that is cool
[22:01:46] Deskana: ok, thanks. Weird. I really thought I had read the opposite a few years ago. Not that it would have changed anything in my case. But good to know; thank you both :)
[22:02:05] ottomata: so I’m going to put ipython on top of marathon / mesos, and have that be what’s for jupyter.wmflabs.org :)
[22:02:30] super cool
[22:02:41] ottomata: do you guys have something similar for prod?
[22:03:02] guillom: IANAL, blah blah blah. :-)
[22:03:07] :)
[22:03:44] guillom: Are you going to the Hackathon?
[22:04:17] guillom: I had the "Move to the US, go back to my home country for Wikimania" thing, I wondered whether you'd have the same thing with the Hackathon :-p
[22:04:34] Deskana: Sadly, no :( I was hoping to go, but $budget had reached its upper bound.
[22:04:36] guillom: I was *so* disappointed when I moved to the US and found out that my first Wikimania would be in my home country.
[22:05:02] pffft. I was really happy it was in the UK :P
[22:05:06] let’s do more things in the UK!11
[22:05:45] You guys will have to survive in France without me. And I won't be inviting you to my brother's big house in the mountains.
[22:06:44] !??!?
[22:06:50] yuvipanda: no
[22:06:54] we use YARN, but that is very hadoop specific
[22:06:58] well now i'm disappointed and i didn't even know this mountain house existed!
[22:07:06] ottomata: aaah, I see.
[22:07:07] i have heard that mesos is really great in general for managing distributed services
[22:07:12] yarn can be run in mesos
[22:07:19] but we don't do it
[22:08:51] ottomata: cool :D
[22:13:49] harej: heh. it's ~400 km (250 mi) from Esino Lario, so there might be another opportunity next year ;)
[22:19:23] ottomata: so i was going to rename the zookeeper roles to be more generic (role::zookeeper::client and role::zookeeper::server). objections?
[22:19:29] I’ll change all the other things that rely on it as well
[22:41:33] ottomata: do you have any zookeeper instances running precise still?
[23:00:19] Hey yuvipanda. I'm playing around with uwsgi and could use a protip or two.
[23:02:30] Specifically, when "sudo service uwsgi restart" responds "OK", but starts up nothing despite my config appearing in /etc/uwsgi/apps-enabled, where do I look to debug?
[23:02:38] It seems that there's no logs.
[23:02:44] I can start the server manually it seems
[23:09:53] halfak: ah where is this?
[23:10:23] Where? If you want to scope out the config, I'm working from labels.eqiad.wmflabs
[23:10:32] halfak: I'll be back in about 30mins
[23:10:36] kk
[23:10:42] Am out atm
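
On the uwsgi question at 23:02 ("It seems that there's no logs"), one possible first check, offered as a rough sketch and not as what anyone in the channel actually did, is whether the app config under /etc/uwsgi/apps-enabled sets any logging option at all. The snippet assumes an ini-style config with a [uwsgi] section. `logto`, `logto2`, `daemonize` and `logger` are uWSGI option names; the function name, the sample paths and the module name are made up for illustration.

```python
import configparser

# uWSGI options that send log output somewhere explicit.
LOG_OPTIONS = ("logto", "logto2", "daemonize", "logger")


def find_log_targets(ini_text):
    """Return {option: value} for each logging option set in [uwsgi]."""
    parser = configparser.ConfigParser(
        strict=False,          # uWSGI configs may repeat keys (e.g. env)
        interpolation=None,    # leave uWSGI "%" magic variables untouched
        allow_no_value=True,   # tolerate bare flags without "= value"
    )
    parser.read_string(ini_text)
    if not parser.has_section("uwsgi"):
        return {}
    return {
        opt: parser.get("uwsgi", opt)
        for opt in LOG_OPTIONS
        if parser.has_option("uwsgi", opt)
    }


# Hypothetical config: every path and name here is an assumption.
SAMPLE = """
[uwsgi]
plugins = python3
chdir = /srv/labels
module = labels_app:app
logto = /tmp/labels-uwsgi.log
"""

if __name__ == "__main__":
    targets = find_log_targets(SAMPLE)
    if targets:
        print("log output configured:", targets)
    else:
        print("no logging option set in [uwsgi]")
```

If the result is empty, adding a `logto` line to the app config and restarting the service is one way to get startup errors written to a file that can be inspected.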