[02:08:07] Analytics, Pageviews-API, Reading-analysis: Suddenly outrageous higher pageviews for main pages - https://phabricator.wikimedia.org/T141506#2517800 (Tbayer) OK, this is just a vague hunch. But looking at the Google Search Console (webmaster tools) for some of our domains, it's interesting that they s...
[06:10:02] (PS1) Amire80: Script for CLL daily statistics [analytics/limn-language-data] - https://gerrit.wikimedia.org/r/302656 (https://phabricator.wikimedia.org/T139326)
[07:14:16] (CR) Nikerabbit: [C: 2] Script for CLL daily statistics [analytics/limn-language-data] - https://gerrit.wikimedia.org/r/302656 (https://phabricator.wikimedia.org/T139326) (owner: Amire80)
[07:14:22] (Merged) jenkins-bot: Script for CLL daily statistics [analytics/limn-language-data] - https://gerrit.wikimedia.org/r/302656 (https://phabricator.wikimedia.org/T139326) (owner: Amire80)
[07:17:12] joal: morningggg
[07:17:17] so aqs1004
[07:17:21] Receiving 1064 files, 133885900040 bytes total. Already received 942 files, 118694385993 bytes total
[07:17:28] for instance a
[07:17:32] meanwhile instance b
[07:17:49] Receiving 1525 files, 202343261268 bytes total. Already received 859 files, 113279161367 bytes total
[07:17:52] Receiving 1947 files, 275009995979 bytes total. Already received 810 files, 113965776942 bytes total
[07:19:32] instance a looks very good, it will finish this morning, but instance b looks a bit slow
[07:19:53] or anyway far from completion
[07:21:21] but if it completes today it might be faster than the regular loading time
[07:26:11] Analytics, Pageviews-API, Reading-analysis: Suddenly outrageous higher pageviews for main pages - https://phabricator.wikimedia.org/T141506#2518094 (Tbayer) Some further remarks: # This has a considerable **effect on [[https://www.mediawiki.org/wiki/Wikimedia_Product#Reading | our global traffic metr...
[07:31:02] Analytics, Pageviews-API, Reading-analysis: Suddenly outrageous higher pageviews for main pages - https://phabricator.wikimedia.org/T141506#2518096 (Legoktm) >>! In T141506#2517800, @Tbayer wrote: > OK, this is just a vague hunch. But looking at the Google Search Console (webmaster tools) for some of...
[08:17:30] elukey: o/ !
[08:17:40] https://meta.wikimedia.org/wiki/Research:Quantifying_the_global_attention_to_public_health_threats_through_Wikipedia_pageview_data
[08:17:57] this looks cool!
[08:18:50] elukey: I think milimetric should read that and comment - there have already been projects like this one, and IIRC they were not pursued due to too much noise in the data to extract a proper signal
[08:18:55] elukey: But cool for sure !
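The streaming figures pasted above come from `nodetool netstats` on each Cassandra instance. A minimal sketch of polling them on a multi-instance host follows; the per-instance nodetool-a/nodetool-b wrapper names are an assumption, not verified here:

```python
# Sketch only: report inbound streaming progress during a bootstrap.
# "nodetool-a"/"nodetool-b" are assumed per-instance wrappers; plain
# `nodetool netstats` would work on a single-instance host.
import subprocess

for tool in ("nodetool-a", "nodetool-b"):
    out = subprocess.run([tool, "netstats"],
                         capture_output=True, text=True).stdout
    for line in out.splitlines():
        # netstats prints lines like:
        # "Receiving N files, X bytes total. Already received n files, y bytes total"
        if "Receiving" in line:
            print(tool, line.strip())
```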
[08:19:10] elukey: AQS signals look good
[08:19:31] will forward this to the ML just in case
[08:19:33] elukey: We also need to remember that we need to do the bootstrapping on every machine, so time * 3
[08:19:41] yep yep
[08:20:20] elukey: I also prefer to let cassandra manage its own data this way, looks cleaner than having to reload everything :)
[08:23:46] joal: I made some quick calculations and maybe next Monday we could be able to restart loading
[08:23:57] k elukey :)
[08:24:21] fingers crossed :)
[08:24:44] I'll keep an eye on Cassandra, and will double check with you before proceeding with aqs1005
[08:25:37] another good thing is https://grafana.wikimedia.org/dashboard/db/aqs-elukey?panelId=16&fullscreen
[08:25:59] Mutation drops have stopped (but read drops are still present sometimes)
[08:26:13] the former looked scarier than the latter
[08:27:07] elukey: It seems there actually were no mutation drops - or only very sporadic spikes ...
[08:28:40] elukey: The "read message dropped" drop is significant, as well as the latency drop and the 5xx response drop, but I think those are the only metrics showing a change
[08:29:26] joal: yeah I was checking the sporadic spikes, I didn't like them much.. anyway, better to not have them at all
[09:24:18] Analytics-General-or-Unknown, Monitoring: Switch jmxtrans from statsd to graphite line protocol - https://phabricator.wikimedia.org/T73322#2518283 (fgiunchedi)
[09:25:11] Analytics, Operations: Jmxtrans failures on Kafka hosts caused metric holes in grafana - https://phabricator.wikimedia.org/T136405#2518294 (fgiunchedi) see also {T73322} about switching statsd -> graphite, once the upgrade is done
[09:56:19] (PS7) Joal: [WIP] Refactor Mediawiki History scala code [analytics/refinery/source] - https://gerrit.wikimedia.org/r/301837 (https://phabricator.wikimedia.org/T141548)
[10:16:12] elukey: I can see that aqs1004-a has finished bootstrapping (compaction started ;)
[10:24:25] joal: yep aqs1004-a is now working (not in joining anymore)
[10:26:00] elukey: keyspace settings checked: everything FINE :)
[10:26:31] \o/
[10:46:45] (PS8) Milimetric: [WIP] Refactor Mediawiki History scala code [analytics/refinery/source] - https://gerrit.wikimedia.org/r/301837 (https://phabricator.wikimedia.org/T141548) (owner: Joal)
[10:55:01] taking a break a-team, later
[11:02:48] me too, just sent an email :)
[11:14:25] elukey: I've been involved with the ISI folks and helping them understand what's possible, etc. I have high hopes for this one
[11:15:22] I'll run them through Hive once they finish getting access (they're close, just requesting shell and groups now)
[13:35:45] milimetric: nice!
[13:50:50] Hi milimetric
[13:51:34] hi
[13:52:44] milimetric: enwiki has run !
[13:53:35] milimetric: I was about to pick a new task, possibly that one: T130656
[13:53:35] T130656: Stop generating pagecounts-raw and pagecounts-all-sites - https://phabricator.wikimedia.org/T130656
[13:53:46] milimetric: do you think it's a good idea?
[13:53:52] woa- stashbot !
[13:54:27] yes, joal, I was on an email thread just recently where someone was *still* using pagecounts-raw. So yeah, let's deprecate it before it gets silly :)
[13:54:57] joal: also, in like 10 minutes can we chat a bit about the oozie task?
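An aside on the bootstrap states mentioned this morning: "not in joining anymore" corresponds to `nodetool status` flipping an instance from UJ (Up/Joining) to UN (Up/Normal). A sketch of waiting on that transition, with the same assumed wrapper name as above:

```python
# Sketch: poll until the instance leaves the UJ (Up/Joining) state.
import subprocess
import time

def still_joining(tool="nodetool-a"):  # assumed wrapper name
    out = subprocess.check_output([tool, "status"], text=True)
    # status rows start with the node state, e.g. "UN 10.x.x.x ..."
    return any(line.startswith("UJ") for line in out.splitlines())

while still_joining():
    time.sleep(300)  # check every five minutes
print("bootstrap finished (UN); compactions can start")
```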
[13:55:23] been researching since last night and I'm not super comfortable with running 5000 oozie workflows :)
[13:55:26] milimetric: I'll follow the process described in the task: stop it, then email :)
[13:55:47] joal: yeah, I started a thread on analytics-l a while back, you can reply to that or start a new one
[13:55:48] hi a-team
[13:55:51] hi mforns
[13:55:55] hi mforns
[13:55:55] hey :]
[13:56:02] I'll brb 10 min
[13:56:27] milimetric: I think we should have only one workflow, and the wiki list hardcoded in sqoop or something similar
[13:56:31] sure
[13:56:38] mforns: enwiki ran !
[13:56:44] * joal is happy
[13:56:47] joal, woooohoooo!
[13:56:50] how was that?
[13:57:46] a few adaptations: I took your idea to split the non-event-related states into different partitions, and fine-tuned caching
[13:57:59] but nothing major :)
[13:58:04] aha
[13:58:19] and it actually doesn't take that long
[13:58:23] cool, I moved everything in user history to your patch, but still testing
[13:58:28] olé
[13:59:03] so curious to see ze data
[14:00:33] mforns: in spark, you can go for: sqlContext.read.parquet("/user/joal/page_history").registerTempTable("ph")
[14:00:55] mforns: Ah, yes, forgot to mention: I added the database field to the page-related things
[14:00:57] joal, ok
[14:01:10] makes sense
[14:01:14] will do that as well
[14:01:19] and the page_history folder contains both enwiki and simplewiki :)
[14:01:32] awesome
[14:01:57] Analytics-Kanban: Stop generating pagecounts-raw and pagecounts-all-sites - https://phabricator.wikimedia.org/T130656#2518897 (JAllemandou) a:JAllemandou
[14:02:09] Analytics-Kanban, Datasets-Webstatscollector, RESTBase-Cassandra, Patch-For-Review: Better response times on AQS (Pageview API mostly) {melc} - https://phabricator.wikimedia.org/T124314#2518898 (elukey) Status update: We loaded four months of data to the cluster and fixed some misconfigurations...
[14:02:12] joal, can't we create external tables on those?
[14:02:28] mforns: We can :)
[14:02:44] ok, I can do that also if you want
[14:03:04] mforns: sure, sounds good
[14:12:09] mforns / joal: let Ellery know when that data looks good
[14:12:12] he was very interested in it
[14:12:40] (it's ok if it's not beautiful / perfect / cleaned up, he can help us vet it and tell us what else he might need for his use case)
[14:14:22] milimetric, sure
[14:15:15] Analytics-General-or-Unknown, Monitoring: Switch jmxtrans from statsd to graphite line protocol - https://phabricator.wikimedia.org/T73322#2518930 (elukey) We discussed today on the ops IRC channel how to proceed, here's some details! The WMF jmxtrans version is very old and not supported anymore, so t...
[14:15:15] joal: ok, I'll look to see if sqoop supports running things in parallel on different databases or if there are sqoop utilities or similar. So far I found nothing that looks good in Oozie, so I'm glad you agree that's not the place to do it
[14:16:09] Analytics, Monitoring, Operations: Switch jmxtrans from statsd to graphite line protocol - https://phabricator.wikimedia.org/T73322#2518931 (elukey) p:Triage>Normal
[14:17:49] Analytics, Operations: Jmxtrans failures on Kafka hosts caused metric holes in grafana - https://phabricator.wikimedia.org/T136405#2518950 (elukey) Open>Resolved a:elukey All the next steps outlined in https://phabricator.wikimedia.org/T73322, we can close this task.
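The spark-shell one-liner joal quotes above ([14:00:33]) can be extended the same way to eyeball the test output. A sketch using the same sqlContext API, where the wiki_db column name is an assumption based on the "database field" joal mentions adding:

```python
# Sketch (Spark 1.x sqlContext API, matching the line quoted above):
# register the page_history parquet output and break it down per wiki.
sqlContext.read.parquet("/user/joal/page_history").registerTempTable("ph")

# the folder holds both enwiki and simplewiki; "wiki_db" is an
# assumed name for the database field mentioned in the discussion
sqlContext.sql("""
    SELECT wiki_db, COUNT(*) AS page_events
    FROM ph
    GROUP BY wiki_db
""").show()
```

A Hive external table pointed at the same path, as mforns suggests, would give the equivalent view from Hive.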
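As for running sqoop outside of oozie: the batching idea milimetric raises above amounts to a small driver script rather than one workflow per wiki. A rough sketch, where the wiki list, JDBC host, paths, table, and batch size are all illustrative assumptions:

```python
# Sketch of batched sqoop imports across many wikis; everything
# concrete here (host, paths, table, worker count) is an assumption.
from concurrent.futures import ThreadPoolExecutor
import subprocess

WIKIS = ["simplewiki", "rowiki", "etwiki"]  # ...all 853 dbs in practice

def sqoop_wiki(db):
    """Run one sqoop import and return its exit code."""
    return subprocess.call([
        "sqoop", "import",
        "--connect", "jdbc:mysql://analytics-store.eqiad.wmnet/" + db,
        "--table", "revision",
        "--target-dir", "/wmf/data/raw/mediawiki/%s/revision" % db,
    ])

# big wikis like enwiki would run alone; small wikis many at a time
with ThreadPoolExecutor(max_workers=50) as pool:
    failed = [db for db, rc in zip(WIKIS, pool.map(sqoop_wiki, WIKIS)) if rc != 0]
```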
[14:22:03] milimetric: the only idea I have, if we need to do it in oozie, is going for forks inside a single workflow, but that's not great
[14:38:45] joal: yeah, I read about those, I also read about oozie recursion!
[14:39:10] but all of those have limits on job sizes (like the actual text size of the xml) and recursion depth
[14:39:28] which are by default too small for our use, and it's not recommended to increase them
[14:40:44] at the same time, it doesn't make sense to do every table from all 853 databases in serial, because that would take a very long time I think
[14:40:58] while we do jobs like enwiki, we could go one at a time, but for smaller wikis we should be able to pull 50-100 at a time and still be ok
[14:41:10] (back to learning)
[14:44:41] joal, do you have 5 mins? I'm struggling with serializable problems
[14:56:08] a-team, I have updated https://wikitech.wikimedia.org/wiki/User:Elukey/Ops/AQS_Settings
[14:56:27] milimetric: from my calculations we could tolerate some scenarios with 4 disk failures at the same time
[14:56:32] (not all of them of course)
[14:56:38] right
[14:56:49] 4 disks failed out of 8*3 = 24
[14:56:55] but I mean I have bad luck and I'm willing to jinx it by saying that's never gonna happen :)
[14:57:04] oh yes now I am super happy
[14:57:20] happy ops - happy life :)
[14:59:51] we also have to account for stuff like a rack going down (so two cassandra instances down) plus disk failures
[15:00:05] I'll write them down
[15:00:16] but again if it happens we'll take the outage
[15:00:28] I think we have a very good trade-off
[15:00:47] now I hope to finish the reimage by sunday
[15:00:55] (keeping the current 4 months of data)
[15:04:22] all right, added "Combined failures" to the page :)
[15:05:47] Analytics, Pageviews-API, Reading-analysis: Suddenly outrageous higher pageviews for main pages - https://phabricator.wikimedia.org/T141506#2519163 (Nemo_bis) >>! In T141506#2518094, @Tbayer wrote: > # Looking at http://discovery.wmflabs.org/external/#traffic_summary , the additional pageviews appear...
[15:30:52] (PS9) Mforns: [WIP] Refactor Mediawiki History scala code [analytics/refinery/source] - https://gerrit.wikimedia.org/r/301837 (https://phabricator.wikimedia.org/T141548) (owner: Joal)
[15:33:17] (CR) jenkins-bot: [V: -1] [WIP] Refactor Mediawiki History scala code [analytics/refinery/source] - https://gerrit.wikimedia.org/r/301837 (https://phabricator.wikimedia.org/T141548) (owner: Joal)
[15:56:42] ottomata: random thought: does mobile editing trigger the same hooks we're instrumenting, or are there separate editing hooks in the MobileFrontend extension?
[15:56:50] (standup soon, we can talk there)
[15:56:52] ha, um, no idea.
[15:57:02] if it is mobile web, i assume it would
[15:57:06] hmm
[15:57:07] you know
[15:57:08] hm
[15:57:25] yeah i dunno, i would think it would, since the hooks are usually fired from mw 'models' or objects
[15:57:30] which i would hope are also used by the api
[15:57:31] but, who knows
[16:06:34] Analytics-Kanban, Cassandra: Investigate why cassandra per-article-daily oozie jobs fail regularly - https://phabricator.wikimedia.org/T140869#2519388 (Nuria) Open>Resolved
[16:07:54] Analytics-Kanban: Get jenkins to automate releases {hawk} - https://phabricator.wikimedia.org/T130122#2519390 (Nuria)
[16:07:56] Analytics-Kanban: Create separate archiva credentials to be loaded to the Jenkins cred store {hawk} - https://phabricator.wikimedia.org/T132177#2519389 (Nuria) Open>Resolved
[16:10:12] https://analytics.wikimedia.org/dashboards/vital-signs/#projects=all/metrics=Pageviews - was there a way to link directly to the mobile/desktop graphs, or does one always have to click "data breakdowns" for that?
[16:11:59] HaeB: data breakdowns for now, but we have a ticket to fix that
[16:14:48] ok thanks
[16:33:14] urandom: hi! we are in the process of reviewing the AQS settings for users that write/read from the Cassandra cluster (since we are using the admin one). Are there any best practices that you would like us to follow?
[16:33:42] I am going to RTFM myself tomorrow but your opinion would be really good to avoid making horrible mistakes
[16:37:08] elukey: How many users do you have?
[16:40:35] urandom: I think we are using 'cassandra' for both writes and reads
[16:40:47] (reads from the AQS httpd service)
[16:40:54] yeah, that's the super-user
[16:40:59] don't do that :)
[16:41:46] oh yes we have never changed it
[16:42:15] I would add one user for reading and one for writing? Keeping admin only for "admin" activities
[16:42:22] elukey: I think the standard recommendation there would be to create a new super-user with a sound password, and then delete the old one
[16:42:33] definitely
[16:42:44] in the Services cluster we ended up just changing the password
[16:43:00] we still use the user cassandra, but have the password set out of the private repo
[16:43:21] and how many other users do you have?
[16:43:27] and then we have a user for the app that also has a good password
[16:43:29] 2
[16:43:38] all right
[16:43:49] so one user that can write/read and the super user
[16:44:00] ya
[16:44:05] so for example the user can't mess with system_auth
[16:44:14] but only cassandra can
[16:44:24] correct
[16:44:39] all right, I got the full picture, thanks :)
[16:44:43] hrmm
[16:44:53] you don't get the adduser.cql in your /etc
[16:45:33] aha!
[16:45:43] you need to set an application_username
[16:46:15] if $application_username != undef {
[16:46:18] yeah
[16:46:58] and that script facilitates the job right?
[16:47:24] yeah, trying to remember if that is automated, or if you have to run it manually after a change
[16:48:03] I think you have to run: cqlsh --cqlshrc=/etc/cassandra/cqlshrc -f /etc/cassandra/adduser.cql $HOSTNAME
[16:49:50] elukey: super_username and super_password get written out to /etc/cassandra-$i/cqlshrc mostly as a convenience as well
[16:49:54] and afaiu cassandra_user == application_username
[16:50:45] yup, restbase::cassandra_user == cassandra::application_username
[16:51:13] super, it makes sense: one creates the user on cassandra and the other one tells restbase to use it for queries
[16:51:17] goooooood
[16:51:21] thanks urandom as always :)
[16:51:28] elukey: i think this is what you have to do
[16:51:55] a) set the passwords in private
[16:52:12] b) set the super password on the cluster to match (one-time op)
[16:52:23] c) cqlsh --cqlshrc=/etc/cassandra/cqlshrc -f /etc/cassandra/adduser.cql $HOSTNAME
[16:52:31] (c) is also a one-timer
[16:52:41] one-time unless/until you change a password
[16:53:01] joal, yt?
[16:53:14] and then d) set restbase::cassandra_user = cassandra::application_username
[16:53:41] elukey: do you have encryption between nodes?
[16:54:32] not that I am aware of
[16:54:55] yeah, that's a Nope
[16:55:43] so, your data might not be sensitive, but, without encryption it would be possible for someone to person-in-middle you
[16:55:54] ah yes for sure
[16:56:13] does restbase use encryption between nodes?
[16:56:26] though, I guess they'd need to get around the ferm rules first
[16:56:31] it does, yeah
[16:56:49] * elukey writes down notes
[16:57:03] urandom: does c) depend on d) by any chance?
[16:57:16] no, other way around
[16:57:31] you'd want to have the user created before pointing restbase at it
[16:57:54] elukey: we're using client encryption, too
[16:57:57] oh yes but /etc/cassandra/adduser.cql will be created without application_username set ?
[16:57:57] fwiw
[16:58:09] nope
[16:58:35] yeah I am missing how to execute c) then :(
[16:58:55] oh, (a)
[16:59:19] hey mforns
[16:59:21] i guess i meant it to be implicit that when setting the passwords, you'd also set the app username
[16:59:33] hi joal do you have a couple minutes?
[16:59:36] sure
[16:59:45] elukey: (a) assign an application_username, password, and super_password
[16:59:46] batcave?
[16:59:49] yes!
[17:00:19] elukey: (b) update the cluster (manually) to change the super user password to match what you used in (a)
[17:00:34] urandom: ahhh okok because only d) will cause restbase to switch user
[17:00:34] elukey: (c) cqlsh --cqlshrc=/etc/cassandra/cqlshrc -f /etc/cassandra/adduser.cql $HOSTNAME
[17:00:37] right
[17:00:40] yup
[17:00:47] oh
[17:01:24] elukey: yeah, if you're trying to do this without creating downtime, you'll need to be careful of steps
[17:01:37] because (a) will break auth for restbase, i guess
[17:01:54] if it's using user cassandra now, then changing the password....
[17:02:41] all right, I might not want to do it for the current aqs cluster but only for the new one :P
[17:03:54] urandom: thanks! Going afk, will let you know my progress tomorrow :)
[17:05:19] elukey: https://phabricator.wikimedia.org/P3633 (i think)
[17:06:08] elukey: ... and chat at you later; enjoy the rest of your day
[17:06:10] I can try it on the new cluster, see how many things explode, and then decide what to do with the current one
[17:06:16] thankssss!!
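For reference, steps (b)/(c) above amount to creating a non-superuser application user. A sketch of the same thing via the Python cassandra-driver instead of cqlsh; the host, user name, and password placeholders are assumptions, not the real AQS values:

```python
# Sketch only: create the application user, which is roughly what
# feeding /etc/cassandra/adduser.cql to cqlsh does. Host, user name,
# and passwords below are placeholders.
from cassandra.cluster import Cluster
from cassandra.auth import PlainTextAuthProvider

auth = PlainTextAuthProvider(username="cassandra",
                             password="<super password>")
session = Cluster(["aqs1004-a.eqiad.wmnet"],
                  auth_provider=auth).connect()

# a non-superuser can read/write data but cannot touch system_auth
session.execute(
    "CREATE USER IF NOT EXISTS aqs WITH PASSWORD '<app password>' NOSUPERUSER"
)
```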
[17:06:31] (a) will just create adduser.cql
[17:06:38] (b) will just create that user on the cluster
[17:06:47] (c) will reconfigure restbase to use it
[17:07:05] got it
[17:07:06] and then (d) and (e) are just resetting/setting the Cassandra super user password
[17:07:11] yep yep
[17:07:17] it looks sound to me
[17:07:31] I'll add documentation if this works as expected
[17:07:40] https://wikitech.wikimedia.org/wiki/Cassandra#Authentication
[17:08:00] elukey: you can put it there
[17:08:16] sure!
[17:10:31] milimetric: let's keep on deploying aqs once you are done talking to joal
[17:14:09] nuria I'm done but was just about to head out to lunch
[17:14:21] I'll ping you when I'm done eating
[17:14:34] oh :( scrum of scrums is coming up, nvm, my lunch will wait
[17:16:01] milimetric: k
[17:28:25] Analytics: Add global last-access cookie for top domain (*.wikipedia.org) - https://phabricator.wikimedia.org/T138027#2519793 (Slaporte) Having cross-project stats would be helpful for our work as well, such as evidence for defending Wikimedia trademarks in various countries. Follow up with me via email if y...
[17:35:02] ottomata: you all were planning on upgrading the clusters to kafka 0.9 tomorrow, right?
[17:35:29] or did I hear that wrong?
[17:35:42] nono
[17:35:50] that was about the scap3 refinery deploy
[17:35:57] main-eqiad is the only cluster still on 0.8
[17:36:08] both main-codfw and analytics-eqiad are already on 0.9
[17:36:17] we want to deploy the eventbus changes before we do the 0.9 upgrade for main-eqiad
[17:42:49] logging off a-team, see you all tomorrow !
[17:43:13] thx otto, makes sense
[17:47:19] laters!
[18:05:54] (PS1) Nuria: [WIP] Service Worker to cache locally AQS data [analytics/dashiki] - https://gerrit.wikimedia.org/r/302755 (https://phabricator.wikimedia.org/T138647)
[18:36:08] Analytics: Add global last-access cookie for top domain (*.wikipedia.org) - https://phabricator.wikimedia.org/T138027#2520012 (Nuria) cc @BBlack let us know if you think this work can also be tackled next quarter as the new cookie (let's call it WMF-Last-Access-Global) would need to be added to VCL code...
[18:59:34] nuria_: did you wanna deploy?
[19:00:02] milimetric: can we do it in a bit? I am in the middle of CRing the druid loader
[19:00:10] np
[19:00:13] just ping
[19:10:19] (CR) Nuria: "Have we tested how does this loading behave in the case of a rerun on the druid side? Is it possible to reinsert the data on the druid en" (9 comments) [analytics/refinery] - https://gerrit.wikimedia.org/r/298131 (https://phabricator.wikimedia.org/T138264) (owner: Joal)
[19:10:31] ping milimetric , shoudl we deploy?
[19:10:44] *should
[19:10:47] sure, in the cave
[19:12:41] nuria_: ^
[19:12:47] omw
[19:35:20] Pchelolo: do you know who set up our beta aqs instance?
[19:35:32] we're trying to figure out if anyone loaded any data to test in there
[19:35:35] (couldn't find docs about it)
[19:35:45] milimetric: heh.. I have literally no idea :)
[19:35:53] ok :) thanks, no worries
[19:49:49] (PS10) Mforns: [WIP] Refactor Mediawiki History scala code [analytics/refinery/source] - https://gerrit.wikimedia.org/r/301837 (https://phabricator.wikimedia.org/T141548) (owner: Joal)
[19:52:25] (CR) jenkins-bot: [V: -1] [WIP] Refactor Mediawiki History scala code [analytics/refinery/source] - https://gerrit.wikimedia.org/r/301837 (https://phabricator.wikimedia.org/T141548) (owner: Joal)
[19:56:35] ottomata: can you help me figure out why, although I have permission to ssh to aqs1001, I cannot deploy there from tin?
[19:56:41] ottomata: I get:
[19:57:04] https://www.irccloud.com/pastebin/EE9kLFf0/
[20:01:07] nuria_: you are not in the correct deployer group
[20:01:09] you need to be in some group
[20:01:13] probably deploy-service
[20:01:31] milimetric, Pchelolo: wow, did you know
[20:01:36] did you know that during page restore
[20:01:49] ALL of the page's revisions have RevisionInsertComplete called on them?
[20:02:01] that means we get new revision-create events for every revision that is restored
[20:02:03] :/
[20:02:18] ottomata: that's actually kind of brilliant
[20:02:29] I can tell you why but it's too complicated for IRC :)
[20:03:18] milimetric: that includes any hidden revision as well
[20:03:26] so it is possible to 'create' a new revision that has been 'deleted'
[20:03:31] :o
[20:05:16] which means (and this is how i found this out) that it might be impossible to know if a particular revision is a redirect during a restore, since the redirect status of a revision is defined by its content
[20:05:26] but I can't get its content if the revision text is hidden
[20:05:59] HMM actually I can
[20:06:17] i can pass a special flag to override the current permissions and just get me the content anyway
[20:06:17] hm.
[20:06:51] would it be more correct to set is_redirect properly on a hidden revision? or to leave it absent?
[20:07:59] ottomata: setting an is_redirect property for a hidden revision kinda reveals a little piece of information about it
[20:08:05] i'm not sure if the fact that a revision is hidden should keep the public from knowing it was a redirect when it was visible
[20:08:12] yeah
[20:08:29] ottomata: we certainly shouldn't expose where it was redirecting
[20:08:30] i can pretty easily omit is_redirect if I can't get the public content for the revision
[20:08:40] Pchelolo: we don't have that in any events (yet)
[20:08:46] but we had talked about maybe adding it if we needed it
[20:08:47] but yeah...
[20:08:56] it makes sense to just omit is_redirect, ja?
[20:09:03] if the content is not publicly visible?
[20:09:18] makes sense to me.
[20:09:21] k thanks
[20:09:33] i'm glad you guys are here to help make these decisions :)
[20:10:08] milimetric: it's not going to mess with your history that you are getting new revision create events on a page restore
[20:10:10] but i haven't decided yet how to react to the new knowledge that page_restore creates all those revision_create events.. It has deep implications on CP and RB...
[20:10:28] you'll get revision create events with rev_ids that past revision create events have also had
[20:10:38] yeah...
[20:10:55] Pchelolo: i'm trying to remember, did we switch the hook from ArticleSaveComplete to RevisionInsertComplete at some point?
[20:11:12] sorry, PageContentSaveComplete
[20:12:36] Pchelolo: i see that resource_change is using PageContentSaveComplete
[20:12:48] * It's used to detect null edits and create 'resource_change' events for purges.
[20:12:48] * Actual edits are detected by the RevisionInsertComplete hook.
[20:13:55] ottomata: yea, one way or another we need to deal with that case. Because the data in the revision table does the same thing, but it's much harder to request
[20:14:02] *much harder to understand
[20:14:28] since if you select count(*), min(rev_id), max(rev_id) from revision where rev_timestamp between A and B
[20:14:36] where A and B are fixed timestamps in the past,
[20:14:45] ottomata: do i need to file a ticket with ops to get added to deploy-service?
[20:14:45] you'll get different results for that query over time
[20:15:20] the discrepancy is the restored pages, but that's why we can't sqoop incrementally, because we have no way to know where the new revisions are added based on any key or date
[20:15:52] https://github.com/wikimedia/mediawiki-extensions-EventBus/commit/6f8fe37d
[20:16:04] ottomata: so basically, it's fine if there are new events, and ideally we can just join to the page_restore stream and see that they're restored revisions
[20:16:24] nuria_: ja, although i'm not sure why that didn't happen in the ticket about making eventbus-admins
[20:16:27] i guess it was just an oversight
[20:16:33] oh sorry
[20:16:37] totally separate thing
[20:16:41] aqs not eventbus :p
[20:16:42] duh
[20:16:44] umm, yeah
[20:16:46] you need to file with them
[20:18:11] yeah nuria_, it's deploy-service
[20:18:17] which kinda sucks because that is a catch-all group for node services :/
[20:18:24] it can be changed to be aqs-admins
[20:18:25] which is a group
[20:18:40] but that would require some coordination with scap and keyholder and puppet stuff
[20:19:00] luca can probably help, but ja i think it would require some ops consultation
[20:19:10] ottomata: k
[20:19:22] OH
[20:19:23] no wait
[20:19:25] ok cool
[20:19:28] aqs-admins can deploy.
[20:19:42] it just uses the deploy-service users to own files on the targets
[20:19:43] ok
[20:19:44] ah ok, so i add myself in puppet?
[20:20:02] nuria_: ja i think that should be fine...although it does have some sudo privs, so technically ops has to review it.
[20:20:07] not sure if i should push that one through or now
[20:20:08] not
[20:20:48] milimetric: ok
[20:21:04] hm, i'm not sure i fully understand, but i'm going to proceed :)
[20:22:06] yes, proceed
[20:25:15] ottomata: sorry, was getting lunch
[20:25:21] yes we did
[20:44:05] milimetric, Pchelolo: need a quick field name bikeshed
[20:44:10] on page move
[20:44:19] what should the object that contains info about the newly created redirect page be called?
[20:44:19] shoot
[20:44:23] new_redirect_page
[20:44:23] ?
[20:44:32] redirect_page
[20:44:33] ?
[20:45:06] what event is this on again?
[20:45:11] like what's the schema for the event called?
[20:45:16] page_move?
[20:45:34] ja page/move
[20:45:36] yea
[20:45:37] that one
[20:45:42] I like new_redirect_page
[20:45:50] because it shows that a new page was created
[20:45:55] https://gerrit.wikimedia.org/r/#/c/301284/12/jsonschema/mediawiki/page/move/1.yaml
[20:46:02] hm maybe
[20:46:09] created_redirect_page
[20:46:21] hm
[20:46:28] hmmm
[20:46:36] new_redirect_page
[20:46:37] i guess
[20:46:39] hm
[20:46:48] doesn't feel perfect, but it's ok...
[20:47:35] ottomata: are you making this object optional?
[20:47:43] yes
[20:47:47] ok, good
[20:59:08] Quarry: Make a Quarry automatically refresh on a set time interval - https://phabricator.wikimedia.org/T141698#2520629 (yuvipanda) a:yuvipanda>None I too would like this to happen, but don't have time to work on it actively right now though :(
[21:18:44] milimetric: did it matter that we don't have information about the archived page during a move over redirect?
[21:19:26] yes ottomata that's ok
[21:19:46] we can get that from the old page_move's new_redirect_page
[21:21:17] milimetric: how?
[21:21:30] new_redirect_page isn't related
[21:21:39] you can move a page over a redirect without leaving a redirect behind, no?
[21:23:14] if B is a redirect page
[21:23:27] with page_id 2
[21:23:33] and A is a normal page with page_id 1
[21:23:35] you can move A -> B
[21:23:43] and page_id 2 and all its revisions are archived
[21:23:48] but, we don't get a page delete event from that
[21:24:29] actually not sure if the revisions are archived, that might be that bug we saw with orphans
[21:24:49] but either way there's no event information about the delete of page id 2
[21:27:12] Hm Pchelolo is there a specific reason we are calling page restore a 'restore' and not an 'undelete'
[21:27:13] ?
[21:27:16] MW calls it undelete.
[21:27:41] ottomata: not that I'm aware of
[21:27:51] mobrovac: ^^ do you know or have an opinion?
[21:29:47] ottomata: I think it's too late for Marko
[21:29:54] but here's my logic on new_redirect_page
[21:30:09] oh ja
[21:30:26] if A -> B -> A, then the A -> B move event will have new_redirect_page: (new id of A)
[21:30:54] so if we now move B -> A, that (new id of A) is what would be archived
[21:31:28] milimetric: new_redirect_page is only present if the user checks the "leave a redirect behind" checkbox
[21:31:28] and we know that a move_redir has to be preceded by such a move, so we can search back in that page's history
[21:31:34] if they leave that absent, there's no new_redirect_page created
[21:31:54] but then it's not a move_redir I don't think
[21:32:00] that's just a delete and move
[21:32:13] so we'd get the archived information from the delete
[21:32:32] i thought move_over_redirect was when a page is moved on top of and replaces an existing redirect page
[21:32:41] and I suppose we can check for a "delete page with title A" happening within X time of "move page to title A"
[21:32:47] milimetric: that's what i'm saying though, we don't get a delete event
[21:33:02] move A -> B (redirect) does not fire a page delete event
[21:33:08] ottomata: right, but you're saying what happens when they don't leave a redirect
[21:33:12] nono
[21:33:17] i mean, yes
[21:33:19] :)
[21:33:21] haha
[21:33:24] one at a time
[21:33:30] batcave?
[21:33:33] sure :)
[22:50:05] (PS11) Mforns: [WIP] Refactor Mediawiki History scala code [analytics/refinery/source] - https://gerrit.wikimedia.org/r/301837 (https://phabricator.wikimedia.org/T141548) (owner: Joal)
[22:53:33] Analytics: 2016-06-02 hour 14 file missing? - https://phabricator.wikimedia.org/T142052#2521047 (dr0ptp4kt)
[22:54:52] bye a-team!
[23:35:23] ottomata: are you ok if I add a new patch-set to https://gerrit.wikimedia.org/r/#/c/301284/ ? I've started to review it to check that all the required fields are ok and nothing is forgotten to match the generic mediawiki/page/revision/user schemas, but it was too painful, so I wrote a script and converted it into a unit test
[23:35:30] wanna push it to your change?
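To close the loop on the schema discussion above, a hypothetical example of a page/move event carrying the optional new_redirect_page object; apart from new_redirect_page itself, the field names are guesses, and the real schema is the 1.yaml linked earlier in the log:

```python
# Hypothetical page/move event, NOT the actual schema from
# jsonschema/mediawiki/page/move/1.yaml; field names are illustrative.
example_page_move = {
    "page_id": 1,            # page A keeps its page_id through the move
    "page_title": "B",       # title after the move
    "prior_state": {"page_title": "A"},
    # optional: present only when the mover leaves a redirect behind
    "new_redirect_page": {
        "page_id": 42,       # the freshly created redirect page at "A"
        "page_title": "A",
    },
}
```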