[10:12:16] Revision-Scoring-As-A-Service, ORES: [spike] Find out if we can still get health check warnings after lb rebalance - https://phabricator.wikimedia.org/T134782#2333205 (schana) After speaking with @akosiaris, since the web nodes have private IP addresses in the labs environment, they can't be monitored di...
[11:05:09] akosiaris: hey, what do we need to do to go to prod now? What are the steps?
[11:28:44] the parent patch got merged now
[11:42:45] Revision-Scoring-As-A-Service-Backlog, ORES: [discuss] What to do with vagrant? - https://phabricator.wikimedia.org/T135623#2304834 (schana) Since Vagrant can be configured to apply Puppet roles, I think it may make sense to have the Vagrantfile do that. That being said, it would be nice to not be platfo...
[12:51:36] o/
[12:52:02] halfak: o/
[12:52:49] I'm at WMDE. I'm aiming to get about an hour of ORES refactor hacking in and some documentation of what I have been doing
[12:53:08] awesome
[12:53:08] halfak: https://phabricator.wikimedia.org/T136406
[12:53:09] Then I'll be getting on my bike to cross town back towards my hotel and to return my bike rental.
[12:54:21] Just added my endorsement
[12:54:47] halfak: nice, tell me how bike rental works in Berlin
[12:55:08] my hotel is 1 km away from the WMDE office
[12:55:39] It's pretty nice. You pay ~12 Euro to rent a bike for the day. They give you a lock and you can bring it back to where you started by 2300.
[12:56:21] The bike I have is a big cruiser. Not exactly the most efficient, but it is definitely comfortable
[12:56:34] I just found a shop near my hotel that had them for rent
[12:59:35] nice
[13:05:05] halfak: my first action item for today is to fix the staging. Tell me if you think I need to do something else (higher priority, etc.)
[13:05:22] also, trying to get the scap configs merged :D
[13:06:25] Those sound good to me. After that, KDD paper review summary and change proposal.
[13:06:44] I'll be flying during our sync session on Monday.
[13:07:11] But it would be great if we could meet later in the week to discuss next steps
[13:07:29] is it possible to push it a little?
[13:11:02] halfak: ^
[13:11:17] push the KDD paper work back?
[13:11:26] nope, the meeting
[13:11:32] Oh yeah. I suppose so
[13:11:55] How about the same time on Tuesday?
[13:12:36] works just fine
[13:15:17] Hello everyone
[13:15:25] o/
[13:16:58] I am pretty interested in working on something or the other, but not sure where I can contribute best. Halfak suggested I check out here to know :)
[13:17:13] o/
[13:17:40] Amir1 is the other fulltime engineer on the revscoring project stuff. He's also User:ladsgroup, so you may have seen him around.
[13:17:51] SoniWP was one of my collaborators on the Snuggle project
[13:18:02] nice to meet you SoniWP :)
[13:18:03] He's been on an extended wikibreak (at least break from our collabs) to do school stuff.
[13:18:29] SoniWP used to be TheOriginalSoni, but I think it's just User:Soni now, right?
[13:18:49] Yes
[13:19:05] i pointed SoniWP to our backlog board to get some thoughts going.
[13:19:55] I'm currently studying Computer science (undergraduate) and am currently looking to get some working experience in one field or another. Hopefully that'd help me figure out what fields I would like to work in the future in :)
[13:21:31] in matter of CS there are tons of stuff to do
[13:21:36] We have a lot of data analysis and modeling work to do :)
[13:21:45] The "ideas" column has a lot of that
[13:21:48] halfak: I think we should have a "scientific" tag or column
[13:22:23] SoniWP, see a talk I gave about our work from last year here: https://www.youtube.com/watch?v=Hj7o5d-OEis
[13:22:37] I've given more recent talks, but I can't find a recording.
[13:23:12] Oh wait! Here: https://www.youtube.com/watch?v=nMnUjIh6FUg
[13:23:16] From the berkman center
[13:23:21] That one is pretty recent.
[13:29:28] Great, looking
[13:30:25] I think you'll find many of the arguments made in that talk familiar, but the approach to supporting Wikipedians is more novel.
[13:31:58] halfak: we probably need to migrate to scap3 deploy in labs eventually. Not a big fan...
[13:32:33] Yeah. I hear you. The fabfile pattern has been pretty good to us
[13:33:16] it's much more flexible and more importantly doesn't require a deployment server and puppet configurations
[13:33:38] very good for labs-based projects
[13:33:46] halfak, Among the items on the phabricator board, is there any one of these I could get started on right away
[13:33:47] ?
[13:34:02] (It sounds like one scarily long and extensive list)
[13:35:27] Heh. Well, one we are looking at working on in the short term is building a quality model specific to new page creations.
[13:35:34] It would be designed to detect spam and vandalism
[13:35:45] We can use delete reasons to build a train/test set.
[13:35:55] I bet that our editquality feature sets would carry a lot of signal
[13:36:19] I'd have to help you extract features for deleted pages (unless you have some advanced rights)
[13:37:01] (I don't)
[13:37:38] Sounds good to me. Is there any set of resources I can look at to know where to begin with it?
[13:38:34] I was trying to turn ores-scaptest-01 into a deployment server but that role doesn't work in jessie, needs ubuntu
[13:38:43] * Amir1 swears really hard
[13:39:56] halfak: precaching is down: https://grafana.wikimedia.org/dashboard/db/ores
[13:40:06] just kidding, it can't be down
[13:40:13] lol
[13:40:15] :P
[13:41:31] SoniWP, if I were you, I'd start by generating a set of labeled data using the `logging` table on enwiki.
[13:41:47] Generate a dataset of recently deleted articles/drafts with deletion reasons.
[13:41:58] We can then use the `archive` table to get the first revision ID.
[13:42:29] I think you'll want to query with log_type="delete" and log_action="delete"
[13:42:41] I believe that log_namespace in (0, 118) will work pretty well
[13:42:49] You can parse log_comment to look for the reason.
[13:42:54] might need to be clever with that parsing.
[13:43:09] We want the subset that have anything to do with spam/vandalism
[13:43:11] Um, Sorry if it's a noob question but how exactly do I access these tables
[13:43:23] Oh! You might have not seen quarry.wmflabs.org
[13:43:24] :)
[13:43:28] Check it out
[13:43:29] I haven't exactly been around much for the last couple months or so
[13:43:34] You can run queries directly from the web.
[13:43:55] Oh right. Quarry. I remember looking at it once
[13:44:21] Can we access all of enwiki's database using Quarry?
[13:47:04] Basically, yes
[13:47:16] You can access metadata from deleted pages in the `archive` table
[13:47:24] But not user password hashes or the checkuser tables
[13:50:17] Gotcha. And should I have any doubts or queries (heh), who should I ping?
[13:50:41] Should I ping you here or mail you halfak?
[13:50:47] Ping here is OK.
[13:50:57] Also see #wikimedia-research
[13:51:05] Lots of people there query often
[13:51:11] Or #wikimedia-analytics
[13:51:37] BTW, I'm hoping to use this model we are discussing to support this: https://meta.wikimedia.org/wiki/Grants:IdeaLab/Fast_and_slow_new_article_review
[13:52:45] Got it.
[13:53:01] Thanks a lot
[14:07:22] Hey halfak, I'd be expected to know and learn how to use ORES, right?
[14:12:31] SoniWP, to some extent. ORES isn't that complex of a system
[14:12:36] It makes predictions about revisions :)
[14:21:37] YuviPanda: one question. It would be great if you could help us. Is there anything else I need to do to get this done? https://phabricator.wikimedia.org/T136406
[14:21:47] besides patience :D
[14:22:38] Amir1: usually the person who is on 'clinic duty' will handle it (see /topic of -operations to find out who it is)
[14:22:47] Amir1: and for deployment access you usually need an 'ok' from someone in releng
[14:23:19] oh, thanks
[14:23:22] Amir1: so my advice would be: find someone in releng to approve it, and get halfak to get Dario to approve it as well (managerial and stuff)
[14:24:17] YuviPanda: isn't it enough? https://phabricator.wikimedia.org/T134651#2276823
[14:24:36] Amir1: you know how paperwork goes. it's never enough.
[14:24:42] :D
[14:24:50] Amir1: I'll agree with you it is enough, but i'm saying the things to make the experience smoother :)
[14:25:14] thank you for helping out
[14:25:16] :)
[14:25:21] It is a great help
[14:25:26] Amir1: you might also have fewer headaches with splitting it into two access requests actually
[14:25:29] let me talk to Tyler in releng
[14:25:33] one for stat and one for deployment access
[14:25:51] Amir1: you should also rephrase the scb one to be 'deployment access (for deploying to scb)' rather than straight up just 'access to scb'
[14:25:53] sure
[14:25:57] since those are different things and you want the former
[14:26:29] * Amir1 writes down
[14:27:02] Amir1: yeah, since deployment access comes with access to tin + your hosts...
[14:27:18] Amir1: you're the first non-WMF employee who's requesting that access, so I expect it to be slightly bumpy.
[14:27:22] Amir1: the stat one should be easier
[14:27:29] so another reason to split it
[14:27:55] YuviPanda: doing it right now
[14:28:09] ok :)
[14:28:17] Amir1: robh is in SF timezone, so he won't be up for a few more hours
[14:29:07] thanks, I will ping him very soon.
[14:30:15] cool :)
[16:31:01] Amir1: we are more or less ready for the deployment finally, aren't we?
[16:31:26] akosiaris: yes, I think so
[16:31:28] \o/
[16:31:40] the patch on keyholder got merged
[16:31:46] yup :-)
[16:32:22] also it would be great if you or anyone merge this: https://gerrit.wikimedia.org/r/#/c/291255/
[16:32:38] (I'm not sure if what I did was correct)
[16:36:29] akosiaris: I'm around. assign anything you want to me
[16:39:09] Amir1: so, that one requires the usual 3 days waiting time. It's standard, but I can try to expedite it on Monday in ops meeting
[16:39:16] 3 days == 3 working days
[16:47:42] akosiaris: okay, thanks
[16:47:43] :)
[17:31:56] akosiaris: about moving to prod. Do you have an ETA now? I'm so impatient. Sorry for that :(
[17:34:26] afk for now
[17:55:38] Amir1: so, we would try it on Monday. probably even succeed. There is one thing that would take a bit more time and it's setting up the public endpoint, but the internal production stuff should be good to go by then
[17:56:08] akosiaris: LVS and varnish?
[17:56:31] you made patches for them AFAIK but probably more settings are needed
[17:56:40] yup
[17:56:52] more or less ready if I remember correctly
[17:57:13] and with that, it's 21:00 here, and I am signing off. c ya
[17:57:31] akosiaris: it's great. Tell me if I need anything
[17:57:35] *you need anything
[17:57:41] I need to go too
[17:57:44] see you
[17:57:45] o/
[17:58:00] o/
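
Editor's sketch of the labeling step suggested between 13:41 and 13:43 (query the `logging` table for deletions in namespaces 0 and 118, then parse log_comment for spam/vandalism reasons). This is only an illustration under stated assumptions, not the approach the team ended up using: the filter columns come from the chat, while the selected columns log_title and log_timestamp, the LIMIT, the function name, and the keyword patterns (including the guess that reasons mention the enwiki speedy-deletion codes G3 and G11) are assumptions. The query text is meant to be pasted into Quarry by hand; nothing here connects to a database.

```python
import re

# Query text for the deletion log. The WHERE filters are the ones named in the
# chat; the selected columns, ordering and LIMIT are assumptions for illustration.
DELETION_LOG_QUERY = """
SELECT log_timestamp, log_namespace, log_title, log_comment
FROM logging
WHERE log_type = 'delete'
  AND log_action = 'delete'
  AND log_namespace IN (0, 118)
ORDER BY log_timestamp DESC
LIMIT 1000
"""

# Hypothetical keyword patterns for deletion reasons. Real log comments are
# free text, so these would need tuning against actual data.
SPAM_PATTERN = re.compile(r"\bG11\b|\bspam\b|advertis|promotion", re.IGNORECASE)
VANDALISM_PATTERN = re.compile(r"\bG3\b|vandal|hoax", re.IGNORECASE)


def label_delete_reason(log_comment):
    """Return 'spam', 'vandalism', or None for a deletion log comment."""
    if log_comment is None:
        return None
    if SPAM_PATTERN.search(log_comment):
        return "spam"
    if VANDALISM_PATTERN.search(log_comment):
        return "vandalism"
    return None
```

Rows labeled None would simply be left out of the spam/vandalism subset. Spam is checked first, so a comment mentioning both reasons is labeled as spam; that ordering is an arbitrary choice in this sketch.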