[01:00:58] Analytics-Cluster, Analytics-Kanban: Audit users and account expiry dates for stat boxes - https://phabricator.wikimedia.org/T170878#3446913 (yuvipanda) I still want it, but I'm part of the ops group anyway. So in the interest of keeping this list clean you might remove me (I was added to this group befo...
[01:07:20] Quarry, Cloud-Services: Consider moving Quarry to be an installation of Redash - https://phabricator.wikimedia.org/T169452#3446916 (yuvipanda) Yup, we can keep a static version running forever. re: history, I also record query history + output history, never wrote UI to expose it :(
[02:14:55] Analytics-EventLogging, Analytics-Kanban: ChangesListHighlights events missing from MySQL starting 2017-07-11 - https://phabricator.wikimedia.org/T170486#3446977 (Nuria) > Is there a plan to backfill page-create as well, or are we leaving that how it is? We are backfilling it now, the process is slightly...
[02:15:30] (CR) Nuria: "Can you explain how are this queries run a little bit?" [analytics/refinery] - https://gerrit.wikimedia.org/r/365806 (https://phabricator.wikimedia.org/T170882) (owner: Milimetric)
[02:24:34] Analytics-Cluster, Analytics-Kanban: Audit users and account expiry dates for stat boxes - https://phabricator.wikimedia.org/T170878#3446983 (MarkTraceur) I would like to continue to have access, as I am still the person on the Multimedia team who has the most experience with our metrics infrastructure (...
[06:55:04] Analytics-Cluster, Analytics-Kanban: Audit users and account expiry dates for stat boxes - https://phabricator.wikimedia.org/T170878#3447150 (Samtar) Feel free to **remove ** me as I only had access for T115119 - if development restarts I'll request it back
[07:28:45] Analytics, WMDE-Analytics-Engineering: Migrate WMDE campaign and WDCM analytics from stat1002, 1003 to stat1005, 1006 - https://phabricator.wikimedia.org/T170664#3447164 (GoranSMilovanovic) All datasets and code migrated, now testing.
[08:48:08] Analytics-Cluster, Analytics-Kanban: Audit users and account expiry dates for stat boxes - https://phabricator.wikimedia.org/T170878#3447246 (Gilles) I use EL occasionally to investigate performance incidents, I would like to retain access.
[09:00:47] Analytics, Wikimedia-Mailing-lists: Delete eventlogging alerts e-mail list - https://phabricator.wikimedia.org/T170864#3447255 (Peachey88)
[09:10:52] Analytics-Cluster, Analytics-Kanban: Audit users and account expiry dates for stat boxes - https://phabricator.wikimedia.org/T170878#3447263 (Jan_Dittrich) Hi, I am user researcher at Wikimedia Deutschland and occasionally need access. If it is easy to get rights back again you can disable me at least fo...
[09:10:54] Analytics, Analytics-Data-Quality: Please Check Pivot Data on Campaign Banners - https://phabricator.wikimedia.org/T170792#3447264 (GoranSMilovanovic) Addendum: as of 07/18/2017, two data points, for 07/15 (already reported here) and 07/16 (reporting on this now) are missing: https://pivot.wikimedia.org...
[09:27:14] Analytics-Kanban, User-Elukey: dbstore1002 /srv filling up - https://phabricator.wikimedia.org/T168303#3360724 (elukey) Started the eventlogging cleaner script for 2014 data on dbstore1002, we'll run optimize queries afterwards to see how much space from the log database we gain..
[09:42:37] Analytics, WMDE-Analytics-Engineering: Migrate WMDE campaign and WDCM analytics from stat1002, 1003 to stat1005, 1006 - https://phabricator.wikimedia.org/T170664#3447315 (GoranSMilovanovic) Tested on one regularly run R script from stat1006; test successful. Testing from stat1005 now. If test Ok, all W...
[09:48:56] Analytics-Cluster, Analytics-Kanban: Audit users and account expiry dates for stat boxes - https://phabricator.wikimedia.org/T170878#3447322 (daniel) Hi! I'd like to keep my access, please. I'm the principal platform engineer at WMDE, looking after wikidata among other things. I'm also member of the Arch...
[10:20:31] Analytics-Cluster, Analytics-Kanban: Audit users and account expiry dates for stat boxes - https://phabricator.wikimedia.org/T170878#3447372 (schana)
[10:28:27] Analytics-Cluster, Analytics-Kanban: Audit users and account expiry dates for stat boxes - https://phabricator.wikimedia.org/T170878#3447385 (matthiasmullie)
[10:39:26] * elukey lunch!
[10:47:00] GoranSM: gotta drive somewhere this morning, I may be a little late (but maybe not)
[11:26:00] milimetric: I will be online, but I can be with you until 17:00 CEST, which would be your 11:00 AM (it's six hours, not eight... I've lived in NYC for two years and forgot about this). Then I have to go to meet my daughter and I will be back again at 20:00 CEST or so (your 14:00).
[11:37:50] Analytics-Cluster, Analytics-Kanban: Audit users and account expiry dates for stat boxes - https://phabricator.wikimedia.org/T170878#3447577 (matmarex)
[11:46:03] Analytics-Cluster, Analytics-Kanban: Audit users and account expiry dates for stat boxes - https://phabricator.wikimedia.org/T170878#3447599 (SBisson)
[11:50:59] Analytics, WMDE-Analytics-Engineering: Migrate WMDE campaign and WDCM analytics from stat1002, 1003 to stat1005, 1006 - https://phabricator.wikimedia.org/T170664#3447608 (GoranSMilovanovic) Tests on stat1005 successful.
I will keep using stat1002 and stat1003 as production machines until the Analytics d...
[11:51:10] Analytics-Cluster, Analytics-Kanban, Patch-For-Review: Replacement of stat1002 and stat1003 - https://phabricator.wikimedia.org/T152712#3447610 (GoranSMilovanovic)
[11:51:12] Analytics, WMDE-Analytics-Engineering: Migrate WMDE campaign and WDCM analytics from stat1002, 1003 to stat1005, 1006 - https://phabricator.wikimedia.org/T170664#3447609 (GoranSMilovanovic) Open>Resolved
[11:54:40] Analytics-Cluster, Analytics-Kanban: Audit users and account expiry dates for stat boxes - https://phabricator.wikimedia.org/T170878#3447614 (Pcoombe)
[12:18:55] mforns: o/ - the script is running on dbstore1002 too (purging 2014 and older data)
[12:19:00] still running on db1047
[12:19:09] cool!
[12:19:11] :]
[12:19:46] I also discovered that Toku might release disk space when it drops data
[12:28:16] elukey, that is good news :]
[12:29:04] mforns: but long term we should figure out what to do with eventlogging tables, for example if we could drop 2014/2015 data
[12:29:49] I mean, it is good to keep data for trends etc.. but as Manuel puts it, if we don't find a good solution for all those TB occupied we will not allow more data to be inserted :D
[12:30:25] on the EL master we have ~2.6TB stored (60% space occupied)
[12:30:34] that is still good, but it is a ton of data :D
[12:33:01] yes
[12:33:18] Analytics, Analytics-Cluster, User-Elukey: Perf test RAID vs JBOD with new hardware and kafka versions - https://phabricator.wikimedia.org/T168538#3447676 (elukey) a:elukey
[12:48:12] (CR) Milimetric: "The way to run these queries is to post them to the Druid broker service. This is serving on port 8082 on druid100[123].
So, for example" [analytics/refinery] - https://gerrit.wikimedia.org/r/365806 (https://phabricator.wikimedia.org/T170882) (owner: Milimetric)
[12:49:53] Analytics, Analytics-EventLogging, Analytics-Kanban, Contributors-Analysis, and 3 others: Visualize page create events for all wikis - https://phabricator.wikimedia.org/T170850#3447710 (Milimetric) Yeah, it's kind of a bug in Phabricator that it copies all parent tags on subtasks. I think that's...
[12:55:39] mforns: elukey, i got a bit of a EL mysql problem, got a sec?
[12:55:45] ottomata, yes
[12:55:55] so, we are inserting some of these eventbus events
[12:56:08] but, there is no 'uuid' field,
[12:56:10] there is a uuid
[12:56:14] but it is meta_id
[12:56:18] aha
[12:56:22] as such, the eventlogging jrm code doesn't add an index
[12:56:31] i was trying to backfill the page create events from last week
[12:56:35] and it was WAY too slow
[12:56:37] aha
[12:56:43] because it checks the meta_id for every event to see if it exists
[12:56:49] which has to scan the whole table for every event
[12:56:52] i stopped that process
[12:56:55] I see
[12:57:08] hi GoranSM, you around?
[12:57:16] so, i can maybe modify the EL code to automatically add an index for these fields
[12:57:21] but i'm worried it would be to broad
[12:57:23] in the mean time
[12:57:30] i kinda need an index on meta_id
[12:57:35] really soon, especially on master
[12:57:41] you think I can just add it?
[12:58:00] are the tables receiving events as of now?
[12:58:01] index / unique_key on meta_id
[12:58:02] yes
[12:58:07] but we could pause it
[12:58:08] milimetric: Yes
[12:58:23] ottomata, and the tables are not huge right?
[12:58:26] ok GoranSM, as we say, to the batcave!
:) https://plus.google.com/hangouts/_/wikimedia.org/a-batcave
[12:58:29] 903641 records
[12:58:30] in the biggest one
[12:58:42] so not too bad
[12:58:44] :-D
[12:58:58] milimetric: 8-)
[12:59:18] ottomata, yes, I don't see any potential issue in adding them by hand
[12:59:41] ok, you think i should pause the process that is inserting them?
[12:59:46] milimetric: Please give me a sec to switch to Chromium
[12:59:48] only pausing the mediawiki eventlogging mysql consumer
[13:00:32] ottomata, I'm not sure what behavior mysql will have when indexing a 1M table, regarding incoming events
[13:00:33] milimetric: joining in
[13:00:48] problably will block the table
[13:00:54] lock
[13:01:15] but maybe it's better to pause the consumer
[13:01:21] +1
[13:01:24] so that no events get lost in a buffer
[13:01:25] please do :)
[13:01:43] yeah
[13:01:46] +1 ok
[13:01:49] will do
[13:01:57] and we have a separate consumer process for eventbus mysql now
[13:02:00] so i can just stop that one
[13:02:08] i'm going to add the timestamp index while i'm at it too, so we have it
[13:02:09] aha
[13:02:22] ottomata: -slave and -store are running el script now so likely to be a bit overloaded, FYI
[13:03:09] ok, so we can do those later
[13:03:12] i really need it on master soon, so i can backfill
[13:03:19] we can add to slaves whenever
[13:03:38] I'd also inform manuel about this as FYI
[13:03:42] ok
[13:04:07] !log adding unique index on meta_id and index on meta_dt to mediawiki_page_{create,delete,move,undelete}_1 on db1046 MySQL eventlogging master
[13:04:09] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[13:06:50] Analytics, Analytics-EventLogging: Ensure indexes are added to `meta_dt` and unique `meta_id` fields in eventbus MySQL tables in eventlogging databases.
- https://phabricator.wikimedia.org/T170925#3447756 (Ottomata)
[13:37:06] Analytics, Analytics-EventLogging: Ensure indexes are added to `meta_dt` and unique `meta_id` fields in eventbus MySQL tables in eventlogging databases. - https://phabricator.wikimedia.org/T170925#3447936 (Ottomata) I manually added these indexes to the 4 eventbus tables on db1046: ``` CREATE TABLE otto...
[13:37:23] ottomata: do you have some handy examples about how to test produce/consume on kafka?
[13:37:31] I realized I've never really done it
[13:37:42] Analytics-Cluster, Analytics-Kanban: Audit users and account expiry dates for stat boxes - https://phabricator.wikimedia.org/T170878#3447938 (matmarex)
[13:38:13] Analytics-Cluster, Analytics-Kanban: Audit users and account expiry dates for stat boxes - https://phabricator.wikimedia.org/T170878#3445820 (matmarex)
[13:39:06] elukey: for the perf tests? or just in general?
[13:39:16] kafka console-producer
[13:39:19] and kafka console-consumer
[13:39:28] are what i usually use for in -> out testing
[13:39:32] just to make sure things are working
[13:39:37] you could also use kafkacat
[13:40:52] ah with the space like topics describe
[13:41:18] no no I am starting the task to figure out how to authenticate clients, the kafka-jumbo hosts are not racked
[13:41:33] so not much that I can do
[13:41:51] aye
[13:41:51] yeah
[13:41:53] right, yeah
[13:41:58] with or without the space :)
[13:42:05] the one with the space is my kafka shell wrapper
[13:42:11] but it ends up calling kafka-console-consumer
[13:42:16] but fills in some args for ya
[13:42:34] elukey i think if you just type
[13:42:34] kafka
[13:42:42] it will show you a bunch of subcommands
[13:42:50] I tried them on kafka3-1 etc..
but it was hanging, so I wondered what I was doing wrong
[13:42:56] okok
[13:43:04] also 3-* had kafka down
[13:43:11] I am starting it up again
[13:43:16] yeah, who knows what happens to labs boxes if you don't look at them for a while
[13:43:53] now that eventlogging_cleaner seems running and piwik is done done done I can concentrate on kafka (finally!)
[13:46:01] :)
[13:46:06] haha, wish i was there
[13:46:12] stat boxes still dragging
[13:46:20] el eventbus mysql stuff still pulling on me :)
[13:47:31] do you need some help with stat boxes? I can definitely help out and speed up
[13:47:44] ah btw! I moved /home on stat1006 to /srv/home
[13:47:50] root partition filled up
[13:48:53] !!!!!!!
[13:48:55] elukey: I'm sorry!
[13:48:58] i thoguht i had already done that
[13:49:02] i started rsyncing homes last night
[13:49:08] i didn't even check
[13:49:15] ahhhh, i think maybe I did it before I reinstalled it to stretch?
[13:49:49] Analytics, Analytics-Wikistats: Wikistats2 bugs (1/4) - Dashboard and general UI - https://phabricator.wikimedia.org/T170933#3447985 (mforns)
[13:49:49] could be, but it was a 5 min thinghy, no need to say sorry :)
[13:51:03] Analytics-Kanban, Analytics-Wikistats: Wikistats 2.0 UI second deployment/iteration - https://phabricator.wikimedia.org/T170460#3448003 (mforns)
[13:51:06] Analytics, Analytics-Wikistats: Wikistats2 bugs (1/4) - Dashboard and general UI - https://phabricator.wikimedia.org/T170933#3448002 (mforns)
[13:51:23] if you want to fill me in about stat next steps I can definitely help, I'll let you decide
[13:52:25] Analytics-EventLogging, Analytics-Kanban: ChangesListHighlights events missing from MySQL starting 2017-07-11 - https://phabricator.wikimedia.org/T170486#3448026 (Ottomata) Also note that event for EventLogging Analytics events, 'bot' events are only dropped from MySQL. The MySQL dbs are causing lots of...
[13:52:58] hmm, elukey i'm almost ready to send email
[13:53:03] next up it will be just people wrangling
[13:53:06] getting them to move
[13:53:30] elukey: this one we could do together i guess
[13:53:33] https://phabricator.wikimedia.org/T170878
[13:54:49] wow
[13:55:23] wow also, i haven't looked since i made that ticket yesterday! so many responses! :)
[13:58:18] Analytics-Cluster, Analytics-Kanban: Audit users and account expiry dates for stat boxes - https://phabricator.wikimedia.org/T170878#3448055 (elukey)
[13:59:41] oh okok, I'll leave you change the description then
[14:05:15] Analytics-Cluster, Analytics-Kanban: Audit users and account expiry dates for stat boxes - https://phabricator.wikimedia.org/T170878#3448062 (TJones)
[14:06:02] Analytics, Analytics-Wikistats: Wikistats2 bugs (2/4) - Wiki selector - https://phabricator.wikimedia.org/T170936#3448065 (mforns)
[14:06:04] Analytics-Cluster, Analytics-Kanban: Audit users and account expiry dates for stat boxes - https://phabricator.wikimedia.org/T170878#3448064 (Ottomata) > Also - is this just about shell access, or also about access to dashboards? This is about shell access only, and specifically to the various analytic...
[14:06:26] Analytics-Kanban, Analytics-Wikistats: Wikistats 2.0 UI second deployment/iteration - https://phabricator.wikimedia.org/T170460#3448080 (mforns)
[14:06:28] Analytics, Analytics-Wikistats: Wikistats2 bugs (2/4) - Wiki selector - https://phabricator.wikimedia.org/T170936#3448079 (mforns)
[14:13:55] Analytics-Cluster, Analytics-Kanban, Patch-For-Review: Audit users and account expiry dates for stat boxes - https://phabricator.wikimedia.org/T170878#3448102 (Ottomata)
[14:15:32] Analytics-EventLogging, Analytics-Kanban: ChangesListHighlights events missing from MySQL starting 2017-07-11 - https://phabricator.wikimedia.org/T170486#3448114 (Ottomata) FYI, mediawiki_page_create_1 events are now backfilled.
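The backfill above was slow because every incoming event triggered a lookup on `meta_id`, and without an index that lookup scans the whole table. A minimal sketch of the effect, using Python's built-in `sqlite3` as a stand-in for the MySQL master (the table name `page_create_demo`, the index names, and the sample rows are all hypothetical, and SQLite's planner output is only an analogy for what MySQL would do):

```python
import sqlite3


def lookup_plan(conn, table):
    """Return SQLite's query-plan text for an existence check on meta_id."""
    rows = conn.execute(
        f"EXPLAIN QUERY PLAN SELECT 1 FROM {table} WHERE meta_id = ?", ("x",)
    ).fetchall()
    # The human-readable plan detail is the fourth column of each row.
    return " ".join(row[3] for row in rows)


conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for one of the eventbus tables.
conn.execute("CREATE TABLE page_create_demo (meta_id TEXT, meta_dt TEXT)")
conn.executemany(
    "INSERT INTO page_create_demo VALUES (?, ?)",
    [(f"id-{i}", "2017-07-18T00:00:00") for i in range(1000)],
)

# Without an index, the per-event existence check is a full table scan.
plan_before = lookup_plan(conn, "page_create_demo")

# Mirror the change from the !log entry: unique index on meta_id, plain index on meta_dt.
conn.execute("CREATE UNIQUE INDEX ix_demo_meta_id ON page_create_demo (meta_id)")
conn.execute("CREATE INDEX ix_demo_meta_dt ON page_create_demo (meta_dt)")

# With the index, the same check becomes an index search.
plan_after = lookup_plan(conn, "page_create_demo")

print(plan_before)
print(plan_after)
```

The unique index also makes the database itself reject a second row with the same `meta_id`, which is the property the duplicate check relies on.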
[14:20:12] Analytics, Analytics-Wikistats: Wikistats2 bugs (3/4) - Data issues - https://phabricator.wikimedia.org/T170937#3448129 (mforns)
[14:20:57] Analytics-Kanban, Analytics-Wikistats: Wikistats 2.0 UI second deployment/iteration - https://phabricator.wikimedia.org/T170460#3448145 (mforns)
[14:21:01] Analytics, Analytics-Wikistats: Wikistats2 bugs (3/4) - Data issues - https://phabricator.wikimedia.org/T170937#3448144 (mforns)
[14:29:18] ottomata: do you have a min for a el consult?
[14:34:22] Analytics, Analytics-Wikistats: Wikistats2 bugs (4/4) - Detail page - https://phabricator.wikimedia.org/T170940#3448220 (mforns)
[14:36:43] ya
[14:36:52] Analytics-Cluster, Analytics-Kanban, Security, User-Addshore: Access rights for HDFS on stat100* for Sqoop tasks - https://phabricator.wikimedia.org/T170052#3448246 (Milimetric) Here is a template for how to do this in general. The example is using a test directory in hdfs under my user, and the...
[14:37:48] Analytics-Kanban, Analytics-Wikistats: Wikistats 2.0 UI second deployment/iteration - https://phabricator.wikimedia.org/T170460#3448248 (mforns)
[14:37:50] Analytics, Analytics-Wikistats: Wikistats2 bugs (4/4) - Detail page - https://phabricator.wikimedia.org/T170940#3448247 (mforns)
[14:37:51] on bc is used
[14:38:01] ottomata: so the eventlogging_cleaner script just failed because we assumed in the code that uuid was char, and this is the case on db1047.. but a lot of tables on dbstore1002 have uuid as binary
[14:38:03] in a-batcave-2
[14:38:05] oh ok
[14:38:07] irc is good
[14:38:11] yep yep sorry
[14:38:13] :)
[14:38:16] !!!!!
[14:38:20] :D
[14:38:23] el tables have uuid as binary?!
[14:38:31] * mforns has to run an errand for 20 mins, will be back before standup
[14:38:35] but only on dbstore1002?
[14:38:37] that is relaly weird
[14:38:43] yep..
[14:38:47] maybe some old EL cold used to make them that way?
[14:39:00] but, not sure why they'd only be like that on dbstore1002
[14:39:39] I didn't check what uuid is supposed to be
[14:39:44] I mean on the schema
[14:39:56] 'uuid5-hex': {'type_': sqlalchemy.CHAR(32), 'index': True,
[14:39:56] 'unique': True},
[14:40:49] all right so the -slave seems the correct one
[14:40:52] right ?
[14:45:04] yeah, it should be a char
[14:45:07] elukey: ^
[14:45:10] not binary
[14:48:38] Analytics, Analytics-EventLogging, Analytics-Kanban, Contributors-Analysis, and 6 others: Record an event every time a new content namespace page is created - https://phabricator.wikimedia.org/T150369#3448270 (Nuria) >Ok, events have been backfilled. However, I accidentally backfilled TOO many e...
[14:48:46] Analytics, Analytics-EventLogging, Analytics-Kanban, Contributors-Analysis, and 6 others: Record an event every time a new content namespace page is created - https://phabricator.wikimedia.org/T150369#3448271 (Nuria) Open>Resolved
[14:51:35] Analytics, Analytics-EventLogging: Ensure indexes are added to `meta_dt` and unique `meta_id` fields in eventbus MySQL tables in eventlogging databases.
- https://phabricator.wikimedia.org/T170925#3448292 (Nuria) Ping @kaldari so he knows this is on the works
[14:53:57] holaaa team
[14:55:03] hollaaaa
[14:55:17] nuria_: eventbus events are backfilled on master, they should replicate to slaves too
[14:55:44] ottomata: ok, sent e-mail about outage with reports and such just now
[14:55:56] Analytics-Kanban, Analytics-Wikistats, Continuous-Integration-Config: Set up continuos integration for wikistats 2.0 UI - https://phabricator.wikimedia.org/T170458#3448318 (fdans)
[14:56:41] gr8
[14:56:44] thank you
[14:56:54] milimetric: hdfs dfs -rm -r /mnt/hdfs/user/goransm/test returns: rm: `/mnt/hdfs/user/goransm/test': No such file or directory
[14:57:21] GoranSM: /mnt/hdfs is a local path, not an hdfs path
[14:57:34] that is the local fs mount on the stat boxes of HDFS
[14:57:40] milime
[14:57:41] so, use same path, but strip the /mnt/hdfs part
[14:57:48] milimetric: Got it
[14:57:52] ottomata: thank you
[14:57:56] when you are using hdfs commands (rather than regular filesystem ones0
[14:58:43] ottomata: milimetric has already explained that, it's just that I'm still not used to it
[15:00:58] a-team brt, waiting for a coffee...
[15:05:44] Analytics-Kanban, Analytics-Wikistats, Continuous-Integration-Config: Set up continuos integration for wikistats 2.0 UI - https://phabricator.wikimedia.org/T170458#3448332 (fdans) As discussed with @hashar, we'd like to add Continuous Integration to the UI part of the new Wikistats, which is hosted i...
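The `/mnt/hdfs` mix-up above comes down to one rule: a path under the local mount maps to the same path in HDFS with the `/mnt/hdfs` prefix stripped. A small sketch of that rule in Python (the helper name `to_hdfs_path` is hypothetical and not part of any tooling mentioned in this channel; only the `/mnt/hdfs` mount point comes from the conversation):

```python
from pathlib import PurePosixPath

# Local filesystem mount of HDFS on the stat boxes, as described above.
LOCAL_HDFS_MOUNT = PurePosixPath("/mnt/hdfs")


def to_hdfs_path(local_path: str) -> str:
    """Translate a path under the local mount into the path hdfs commands expect."""
    path = PurePosixPath(local_path)
    try:
        relative = path.relative_to(LOCAL_HDFS_MOUNT)
    except ValueError:
        # Not under the mount: assume it is already an HDFS path and leave it alone.
        return str(path)
    return str(PurePosixPath("/") / relative)


print(to_hdfs_path("/mnt/hdfs/user/goransm/test"))
```

With that translation, the failing command from the log would be pointed at `/user/goransm/test` instead.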
[15:13:32] Analytics-Kanban, Analytics-Wikistats, Continuous-Integration-Config: Set up continuous integration for wikistats 2.0 UI - https://phabricator.wikimedia.org/T170458#3448343 (Milimetric)
[15:14:26] Analytics-Kanban, Beta-Cluster-Infrastructure: deployment-kafka01 out of disk space - https://phabricator.wikimedia.org/T170523#3434437 (Milimetric) a:Milimetric
[15:14:38] Analytics-Cluster, Analytics-Kanban, Patch-For-Review: Audit users and account expiry dates for stat boxes - https://phabricator.wikimedia.org/T170878#3448347 (TJones)
[15:17:03] a-team: if we use diffusion do commits not show up on phabricator tickets?
[15:17:59] i think they can/should nuria
[15:32:37] mforns_brb: I tried CAST(uuid AS CHAR) on my local test env and it works
[15:32:55] elukey, aha
[15:33:59] elukey, do you want to pair on testing that, like testing that the update in (...) will work with binary in (string, string, ...)?
[15:34:20] no, wait! we can also add the cast in the in clause, of course
[15:34:45] ... where cast(uuid as char) in (...)
[15:34:51] this will work
[15:35:57] mforns + fdans: new rule: we should use arc so we can code review and associate changes with phab tasks
[15:36:09] milimetric, makes sense :]
[15:36:33] yes let's do that from now on
[15:36:36] k
[15:36:38] you can search for the arc instructions on mediawiki.org, they're pretty good
[15:37:04] (lemme know if you can't findem)
[15:37:26] mforns: bc?
[15:37:30] yep!
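The `cast(uuid as char) in (...)` idea above can be sketched as a small query-building helper. This is only an illustration under stated assumptions, not the code in `eventlogging_cleaner.py`: the function name `build_uuid_filter` is hypothetical, and the `%s` placeholder style assumes a MySQL driver that uses the "format" paramstyle.

```python
def build_uuid_filter(uuids):
    """Build a WHERE fragment that matches uuid whether it is stored as CHAR or BINARY.

    Returns the SQL fragment and the parameter list to hand to cursor.execute().
    """
    uuids = list(uuids)
    if not uuids:
        raise ValueError("need at least one uuid to filter on")
    # One placeholder per value, so the driver handles quoting and escaping.
    placeholders = ", ".join(["%s"] * len(uuids))
    return f"CAST(uuid AS CHAR) IN ({placeholders})", uuids


clause, params = build_uuid_filter(["a3f0", "b71c"])
print(clause)
print(params)
```

One trade-off worth keeping in mind: wrapping the column in `CAST` generally stops the database from using a plain index on `uuid` for that predicate, so this is probably better suited to a cleanup job than to a hot query path.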
[16:00:23] (CR) DCausse: UDF for extracting primary full text search request (2 comments) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/327855 (https://phabricator.wikimedia.org/T162054) (owner: EBernhardson)
[16:03:33] Analytics-Kanban, Analytics-Wikistats, Continuous-Integration-Config, Release-Engineering-Team (Kanban): Set up continuous integration for wikistats 2.0 UI - https://phabricator.wikimedia.org/T170458#3432047 (hashar)
[16:10:09] Analytics-Kanban, Analytics-Wikistats: Manage application state with vuex - https://phabricator.wikimedia.org/T169371#3448549 (Nuria) Open>Resolved
[16:13:21] Quarry, Cloud-Services: Consider moving Quarry to be an installation of Redash - https://phabricator.wikimedia.org/T169452#3448556 (Halfak) Ahh yes, but the URLs to old result sets still work.
[16:13:42] mforns: nuria_: whenever you get a chance https://gerrit.wikimedia.org/r/#/c/365999/
[16:14:05] ottomata: after QR!
[16:16:02] :)
[16:18:10] Analytics-Kanban, Analytics-Wikistats: Define, Document (and test) Desktop and Mobile browser support for wikistats 2.0 - https://phabricator.wikimedia.org/T170457#3448564 (Nuria) Did we deploy this code or we are still waiting on deployment?
[16:18:44] Analytics-EventLogging, Analytics-Kanban, Patch-For-Review: Ensure indexes are added to `meta_dt` and unique `meta_id` fields in eventbus MySQL tables in eventlogging databases. - https://phabricator.wikimedia.org/T170925#3448565 (Ottomata)
[16:28:45] Analytics-Kanban, Operations, Wikimedia-Stream, hardware-requests, Patch-For-Review: Decommission RCStream - https://phabricator.wikimedia.org/T170157#3448619 (RobH) I already synced up with @ottomata about this via IRC, and I'll snag from here. There is a checklist for decoms, which I'll ap...
[16:35:08] ottomata,mforns
[16:35:14] yaa
[16:35:20] elukey@db1047:~$ sudo mysql -h localhost --skip-ssl <<< "show create database log;"
[16:35:27] log CREATE DATABASE `log` /*!40100 DEFAULT CHARACTER SET utf8 COLLATE utf8_unicode_ci */
[16:35:48] on dbstore1002
[16:35:49] log CREATE DATABASE `log` /*!40100 DEFAULT CHARACTER SET binary */
[16:36:50] I need to figure out where it takes them
[16:37:07] db1046 has log CREATE DATABASE `log` /*!40100 DEFAULT CHARACTER SET binary */
[16:37:27] and with Marcel we discovered that in there only one table is with uuid == binary
[16:37:30] weeeird
[16:37:35] hmmmmm
[16:37:36] weeeiiird
[16:37:46] i betcha log is created manually
[16:37:50] log db
[16:37:56] by whenever someone sets up the server
[16:38:02] and whatever default is the default in the configs is what it takes
[16:42:44] lovely
[16:42:49] opening a phab task :(
[16:52:49] Analytics-Kanban, DBA: Inconsistent default charset for analytics slaves - https://phabricator.wikimedia.org/T170952#3448710 (elukey)
[16:53:17] mforns: --^
[16:55:15] a-team when do you think is a good EoL date for stat1002/stat1003?
[16:55:21] maybe early september?
[16:55:47] +1
[16:56:19] mforns: https://gerrit.wikimedia.org/r/#/c/365992/3/modules/role/files/mariadb/eventlogging_cleaner.py
[16:57:20] Analytics-Cluster, Analytics-Kanban, Patch-For-Review: Audit users and account expiry dates for stat boxes - https://phabricator.wikimedia.org/T170878#3448731 (Niharika)
[16:59:20] Analytics, Operations, ops-eqiad: Smartctl errors for one kafka1012 disk - https://phabricator.wikimedia.org/T168927#3448734 (Cmjohnson) @elukey, I have plenty of disks on-site...just let me know which slot number.
[17:00:27] elukey: ^ want me to respond and coordinate?
[17:00:29] i have a moment
[17:01:39] ottomata: if you have time it would be great thanks!
[17:01:57] k on it
[17:15:02] mforns: eventlogging_cleaner restarted on dbstore1002, it seems working fine
[17:15:09] let's see how it goes
[17:15:11] Analytics-Kanban, Operations, Wikimedia-Stream, hardware-requests, Patch-For-Review: Decommission RCStream (rcs100[12]) - https://phabricator.wikimedia.org/T170157#3448838 (RobH) a:Ottomata>RobH
[17:15:23] +1 on early september, that's plenty of time
[17:15:29] ottomata: do you need any help? Otherwise I'd pack my stuff and go home :)
[17:15:35] elukey, sorry did not see your previous pings
[17:15:37] naw its good elukey thanks!
[17:15:59] all right thanks!
[17:16:23] mforns: no problem! I left you some stuff to read :)
[17:17:11] * elukey afk!
[17:34:26] Analytics-Kanban, Operations, ops-eqiad: Smartctl errors for one kafka1012 disk - https://phabricator.wikimedia.org/T168927#3448934 (Ottomata)
[17:35:04] Analytics-Kanban, Operations, ops-eqiad: Smartctl errors for one kafka1012 disk - https://phabricator.wikimedia.org/T168927#3381297 (Ottomata) disk replaced as spare. Mounted as /var/spool/kafka/h with UUID=247e0397-066b-4b5c-b6c3-cacd1ecf8cdd. Kafka is back up and is replicating missing data from...
[17:36:13] Analytics, Analytics-EventLogging, Analytics-Kanban, Contributors-Analysis, and 3 others: Visualize page create events for all wikis - https://phabricator.wikimedia.org/T170850#3448949 (Ottomata) a:Ottomata>None
[17:38:20] Analytics, Analytics-Cluster, Operations: rack/setup/install replacement stat1006 (stat1003 replacement) - https://phabricator.wikimedia.org/T165366#3448958 (Ottomata)
[17:38:32] Analytics-Cluster, Analytics-Kanban, Operations: rack/setup/install replacement stat1006 (stat1003 replacement) - https://phabricator.wikimedia.org/T165366#3264224 (Ottomata)
[17:38:50] Analytics-Cluster, Analytics-Kanban, Operations, Patch-For-Review: rack/setup/install replacement to stat1005 (stat1002 replacement) - https://phabricator.wikimedia.org/T165368#3448961 (Ottomata)
[17:42:00] ottomata: consider putting a signature in your email. :)
[17:42:34] ottomata: this is especially helpful for lists like wiki-research-l when they see an email from you and don't have much context. Knowing who you are helps. :)
[17:42:51] Analytics-Cluster, Analytics-Kanban, Patch-For-Review: Audit users and account expiry dates for stat boxes - https://phabricator.wikimedia.org/T170878#3448975 (gpaumier) I don't anticipate a need to use my access in the foreseeable future. You can remove my access and I'll ask for it back later shoul...
[17:44:15] Analytics-Cluster, Analytics-Kanban, Patch-For-Review: Replacement of stat1002 and stat1003 - https://phabricator.wikimedia.org/T152712#3449003 (Ottomata)
[17:47:09] leila: aye
[17:47:15] i should have written more ya
[17:47:16] k
[17:48:08] nah, np about that specific email. If someone is really confused, I'm sure they will ask. it's helpful for the future though. I resisted signatures for a while (they seem to braggy to me), I can see their value when publicly communicating though.
[17:48:23] too*
[17:55:19] Analytics-Kanban, Patch-For-Review: Add "desktop by browser" tab to browser reports - https://phabricator.wikimedia.org/T170286#3449053 (Nuria) Open>Resolved
[17:56:04] Analytics-Kanban, Analytics-Wikistats: Interface from Graphs to DimensionalData - https://phabricator.wikimedia.org/T167679#3340653 (Nuria) Let's make sure that going forward code commits show up in phab tickets
[17:56:13] Analytics-Kanban, Analytics-Wikistats: Interface from Graphs to DimensionalData - https://phabricator.wikimedia.org/T167679#3449057 (Nuria) Open>Resolved
[17:56:15] Analytics-Kanban, Analytics-Wikistats: Implement pageview metric in Wikistats UI - https://phabricator.wikimedia.org/T163817#3449058 (Nuria)
[18:17:46] Analytics-Cluster, Analytics-Kanban, Patch-For-Review: Audit users and account expiry dates for stat boxes - https://phabricator.wikimedia.org/T170878#3449126 (leila) @ottomata: I went over the list and none of the formal collaborators I'm their point contact is on that list, and I myself am not. Tha...
[18:20:28] Analytics-Kanban, Operations, ops-eqiad: Smartctl errors for one kafka1012 disk - https://phabricator.wikimedia.org/T168927#3449173 (Ottomata) Ah, we accidentally swapped the wrong disk. My fault. We put the good one back in, took the defected one out, and put the spare back in the other slot. So /...
[18:38:29] changing locations, back shortly! ;)
[18:38:30] :)
[18:42:37] milimetric: quick q! :) where's published-datasets on stat1005?
[18:42:49] milimetric: since there's no /a
[18:43:09] hm, my guess is /srv bearloga, checking
[18:43:36] yep
[18:43:57] bearloga: yes. I think it was bugging all of us that stat1002 was in /a and stat1003 was in /srv so I think otto is cleaning that up with the new boxes
[18:45:13] milimetric: there is a /srv/published-datasets...but it's missing the discovery folder
[18:46:02] bearloga: hm, that makes sense.
I think right now otto is just moving over the puppetized reportupdater jobs. The other ones are managed by you, right?
[18:46:09] yup
[18:46:23] ok, then I think it's safe to hold off for now, and wait until otto says to start moving them
[18:46:35] milimetric: got it, thank youy!
[18:46:36] don't worry, stat1002 won't go away for a while
[18:49:53] cool *thumbs up*
[18:51:14] (CR) Smalyshev: [C: 2] "Re-applying +2 from Nuria after rebase." [analytics/refinery/source] - https://gerrit.wikimedia.org/r/364542 (https://phabricator.wikimedia.org/T169798) (owner: Smalyshev)
[18:56:58] (Merged) jenkins-bot: Add tagger for Wikidata Query Service requests [analytics/refinery/source] - https://gerrit.wikimedia.org/r/364542 (https://phabricator.wikimedia.org/T169798) (owner: Smalyshev)
[19:07:34] Analytics-Cluster, Analytics-Kanban, Patch-For-Review: Audit users and account expiry dates for stat boxes - https://phabricator.wikimedia.org/T170878#3449468 (DarTar) Thanks for starting this @Ottomata Relevant to this thread I wanted to give you the heads up that @MoritzMuehlenhoff @RStallman-leg...
[20:13:49] I am looking at https://analytics.wikimedia.org/dashboards/reportcard/#pagecounts-dec-2007-dec-2016/monthly-pagecounts and I see "all projects," which presumably means more than Wikipedia, but the breakouts are only Wikipedias. Can I get an article count for all Wikipedias?
[20:18:49] Analytics-Cluster, Analytics-Kanban, Patch-For-Review: Audit users and account expiry dates for stat boxes - https://phabricator.wikimedia.org/T170878#3449824 (RStallman-legalteam) I can confirm that the following people from the list have signed NDAs that are on file with legal: ellery jdcc goransm...
[20:20:25] Also, does https://analytics.wikimedia.org/dashboards/reportcard/#pageviews-july-2015-now measure page views per month, or something else?
[20:20:44] Or is it a rolling daily average?
[20:27:05] And does English Wikipedia actually have 7.7 billion views per day? :o
[20:28:24] Analytics-Cluster, Analytics-Kanban, Patch-For-Review: Audit users and account expiry dates for stat boxes - https://phabricator.wikimedia.org/T170878#3449880 (Lydia_Pintscher) Like Guillaume: I don't anticipate a need to use my access in the foreseeable future. You can remove my access and I'll ask...
[20:33:06] Analytics, Performance: Add MediaWikiInstallPingback to EventLogging purging white-list - https://phabricator.wikimedia.org/T170986#3449888 (mforns)
[20:39:21] ottomata: yt?
[20:40:15] yaaa
[20:40:29] ottomata: any ideas on how can I free space from beta EL
[20:40:49] ottomata: i tried dropping logs all around to no effect
[20:40:57] ottomata: issue is revision-create on db
[20:41:08] ottomata: but not even dropping that table seems to work
[20:41:19] harej: hello
[20:41:22] hmm
[20:42:06] harej: as the graph indicates it measures pageviews per month for both users+automated traffic
[20:42:10] harej: https://analytics.wikimedia.org/dashboards/reportcard/#pageviews-july-2015-now/monthly-pageviews-2015-now
[20:42:21] hmm, nuria_ wonder why beta is still importing that...
[20:42:30] ottomata: it is not
[20:42:36] so it's 7.7 billion per *month* not day.
[20:42:37] it is
[20:42:39] ottomata: but disk was full when it was
[20:42:44] the mysql consumer is still configured to do so
[20:42:47] also, is there a way to filter out automated views?
[20:42:50] maybe because puppet can run to fix it
[20:42:55] aye
[20:43:30] harej: there are other tools in which you can do that: https://tools.wmflabs.org/pageviews/?project=en.wikipedia.org&platform=all-access&agent=user&range=latest-20&pages=Cat|Dog
[20:43:44] ottomata: i also cleaned up a bunch of puppet logs
[20:43:50] ottomata: but yes nothing can run
[20:43:56] ottomata: also rebooted box
[20:43:59] nuria_: but can I get that for "all of English Wikipedia"?
[20:44:38] harej: please take a look at the tool i just sent you: https://tools.wmflabs.org/pageviews/?project=en.wikipedia.org&platform=all-access&agent=user&range=latest-20&pages=Cat|Dog
[20:44:40] hmm, nuria_ i also just removed logs
[20:44:43] doesn't free up space
[20:44:52] i dunno nuria_ i think we should wipe this instance
[20:44:53] make a new one
[20:44:54] harej: https://tools.wmflabs.org/siteviews/?platform=all-access&source=pageviews&agent=user&range=latest-20&sites=en.wikipedia.org
[20:45:06] excellent, thank you nuria_
[20:45:19] ottomata: how about unistalling mysql?
[20:45:33] ottomata: don't know... just throwing it all out here
[20:45:40] *uninstalling
[20:46:11] ottomata: i stopped EL days ago there so it has not run for a while
[20:47:05] nuria_: something is def wrong
[20:47:07] if we delete files
[20:47:10] and disk space doesn't free up
[20:47:15] ottomata: ya
[20:47:21] i just deleted like 3G of data
[20:47:23] ottomata: i did notice that yesterday
[20:47:31] ottomata: same for me yesterday
[20:56:45] ottomata: but it seems that milimetric was able to fix this one: https://phabricator.wikimedia.org/T170523
[20:57:36] ottomata: same disk filling up on kafka beta
[20:58:09] Analytics-Kanban: Piwik improvements - https://phabricator.wikimedia.org/T163000#3449973 (Nuria)
[20:58:10] Analytics-Kanban, DBA, Operations, Patch-For-Review, User-Elukey: Puppetize Piwik's Database and set up periodical backups - https://phabricator.wikimedia.org/T164073#3449972 (Nuria) Open>Resolved
[20:58:49] nuria_: ya dunno, i looked at that too, and in that case, space did free up
[20:59:22] Analytics-Kanban, Documentation, Services (watching): Document revision-create event for EventStreams - https://phabricator.wikimedia.org/T169245#3391503 (Nuria) Before closing this do we want to do any sort of announcement of this data?
[20:59:39] ottomata: ok, let me poke some more
[21:00:17] Analytics-Kanban, Analytics-Wikistats: Build Dashboard on top of dynamic data - https://phabricator.wikimedia.org/T167677#3340613 (Nuria) Let's please make sure that going forward commits of code display on ticket.
[21:00:25] Analytics-Kanban, Analytics-Wikistats: Implement pageview metric in Wikistats UI - https://phabricator.wikimedia.org/T163817#3449983 (Nuria)
[21:00:27] Analytics-Kanban, Analytics-Wikistats: Build Dashboard on top of dynamic data - https://phabricator.wikimedia.org/T167677#3449982 (Nuria) Open>Resolved
[21:00:41] Analytics-Kanban, Beta-Cluster-Infrastructure: deployment-kafka01 out of disk space - https://phabricator.wikimedia.org/T170523#3449984 (Nuria) Open>Resolved
[21:01:28] Analytics-Kanban: Clean up PageContentSaveComplete event if there are no data users - https://phabricator.wikimedia.org/T170720#3449986 (Nuria)
[21:03:32] (PS27) Ottomata: Camus JSON datasets -> Hive [analytics/refinery/source] - https://gerrit.wikimedia.org/r/346291 (https://phabricator.wikimedia.org/T161924) (owner: Joal)
[21:04:27] Heyyayayyya a-team! Since jo al is gone, who do you think I should ask for a final code review of the eventlogging refine stuff?
[21:04:36] its all scala :/
[21:04:56] he's reviewed most of it, so i could self merge and try to start running it via oozie
[21:05:01] i'm sure we'll run into something
[21:05:33] ottomata: let's not self merge w/o CR, i can CR and help test
[21:05:52] ottomata: also no need to self merge to run via oozie , we can test it from the patch right?
[21:06:27] oh ya, i've tested from patch pretty much
[21:06:32] there's more to oozie too
[21:06:34] Analytics, Analytics-EventLogging, Contributors-Analysis, DBA, and 2 others: Add index to mediawiki_page_create_1 table - https://phabricator.wikimedia.org/T170990#3450014 (kaldari)
[21:06:37] ok nuria_ i will add as review
[21:06:43] its had CR
[21:06:44] just not FINAL CR
[21:06:53] ottomata: no DONE DONE
[21:06:55] juas
[21:07:14] (CR) jerkins-bot: [V: -1] Camus JSON datasets -> Hive [analytics/refinery/source] - https://gerrit.wikimedia.org/r/346291 (https://phabricator.wikimedia.org/T161924) (owner: Joal)
[21:07:31] ahh this is the oozie one: https://gerrit.wikimedia.org/r/#/c/201009/ joal has some comments for me to resolve
[21:07:31] Analytics, Analytics-EventLogging, Analytics-Kanban, Contributors-Analysis, and 3 others: Visualize page create events for all wikis - https://phabricator.wikimedia.org/T170850#3444668 (kaldari)
[21:07:33] Analytics, Analytics-EventLogging, Contributors-Analysis, DBA, and 2 others: Add index to mediawiki_page_create_1 table - https://phabricator.wikimedia.org/T170990#3450032 (kaldari)
[21:08:35] (PS28) Ottomata: Camus JSON datasets -> Hive [analytics/refinery/source] - https://gerrit.wikimedia.org/r/346291 (https://phabricator.wikimedia.org/T161924) (owner: Joal)
[21:09:03] ok nuria_ i gotta run, i can look at making a new beta EL instance tomorrow if you like
[21:09:06] shall I?
[21:11:08] Analytics, Analytics-EventLogging, Analytics-Kanban, Contributors-Analysis, and 3 others: Visualize page create events for all wikis - https://phabricator.wikimedia.org/T170850#3450051 (kaldari) @Nuria, @Ottomata: Is there an existing service that we can piggyback for this? Such as Kibana or Graf...
[21:12:02] ok, nuria_ if you want me to, send me an email and I will in the morning tomorrow!
[21:12:02] laters!
[21:15:11] Analytics, Analytics-EventLogging, Contributors-Analysis, DBA, and 2 others: Add index to mediawiki_page_create_1 table - https://phabricator.wikimedia.org/T170990#3450075 (kaldari)
[21:16:36] Analytics, Analytics-EventLogging, Contributors-Analysis, DBA, and 2 others: Add index to mediawiki_page_create_1 table - https://phabricator.wikimedia.org/T170990#3450014 (kaldari) I see that T170925 already exists and is almost the same thing. Will follow up there and likely close this as a dup...
[21:17:49] bearloga: what was the name of your reportupdater depot?
[21:18:28] Analytics-EventLogging, Analytics-Kanban, Patch-For-Review: Ensure indexes are added to `meta_dt` and unique `meta_id` fields in eventbus MySQL tables in eventlogging databases. - https://phabricator.wikimedia.org/T170925#3447756 (kaldari) What is the mysteriously named `meta_id` field? Don't we also...
[21:20:03] nuria_: wikimedia/discovery/golden
[21:20:17] nuria_: it's a golden [data] retriever :P
[21:20:47] bearloga: jajaja
[21:20:56] ay ay
[21:23:00] Analytics-Cluster, Analytics-Kanban, Security, User-Addshore: Access rights for HDFS on stat100* for Sqoop tasks - https://phabricator.wikimedia.org/T170052#3450141 (GoranSMilovanovic) @Milimetric First, thank you for our session today. It was really helpful. I want to let you know about the foll...
[21:26:27] I love it
[21:26:38] Analytics, Analytics-EventLogging, Analytics-Kanban, Contributors-Analysis, and 3 others: Visualize page create events for all wikis - https://phabricator.wikimedia.org/T170850#3450151 (Nuria) @kaldari : visualizing things for all wikis is not easy to do with neither Grafana or Kibana, for 2 met...
[21:52:17] Analytics-Kanban, Beta-Cluster-Infrastructure: deployment-eventlogging03 out of disk space - https://phabricator.wikimedia.org/T170522#3434423 (Nuria) Thanks to help from some releng folks looks like things are getting better: root@deployment-eventlogging03:~# df -h Filesystem Size...
[21:53:49] Analytics-Kanban, Beta-Cluster-Infrastructure: deployment-eventlogging03 out of disk space - https://phabricator.wikimedia.org/T170522#3434423 (Reedy) There's around 700MB in `/var/lib/mysql.5.5.46-0ubuntu0.14.04.2` that hasn't been touched since 2015
[22:01:39] Analytics, Analytics-Data-Quality: Please Check Pivot Data on Campaign Banners - https://phabricator.wikimedia.org/T170792#3450288 (GoranSMilovanovic) Disregard the task; a constraint was not used properly in my data selection on Pivot.
[22:01:48] Analytics, Analytics-Data-Quality: Please Check Pivot Data on Campaign Banners - https://phabricator.wikimedia.org/T170792#3450289 (GoranSMilovanovic) Open>Resolved a:GoranSMilovanovic
[22:19:00] Analytics-Kanban, Beta-Cluster-Infrastructure: deployment-eventlogging03 out of disk space - https://phabricator.wikimedia.org/T170522#3450342 (Nuria) Dropped tables and logs and ran apt-get clean and apt-get autoclean, we are now at 80% space. Restarted eventlogging again.
[22:24:28] nuria_: Where is https://analytics.wikimedia.org/datasets/periodic/reports/metrics/ served from? When I go into stat1003:/srv/published-datasets/periodic the directory is empty for me.
[22:43:13] kaldari: probably the new 1005 , we are moving that as we speak to the new stats machines, what are you looking for?
[22:43:19] kaldari: reportupdater examples
[22:43:20] ?
[22:43:22] Analytics, Analytics-EventLogging, Analytics-Kanban, Community-Tech, and 6 others: Record an event every time a new content namespace page is created - https://phabricator.wikimedia.org/T150369#3450439 (DannyH)
[22:43:39] Analytics, Analytics-EventLogging, Community-Tech, MW-1.30-release-notes, Patch-For-Review: Remove EventLogging for cookie blocks - https://phabricator.wikimedia.org/T166247#3450441 (DannyH)
[22:46:55] nuria_: Actually, the main thing I wanted to find out was where to put the query and script files. The documentation (https://wikitech.wikimedia.org/wiki/Analytics/Systems/Reportupdater#Where_to_put_these_files.3F) just says "You should put all queries and scripts inside the same dedicated directory.", but it doesn't say where.
[22:47:14] kaldari: ah sorry, they should be in a repo
[22:47:37] kaldari: we will clone it and take care of rsyncs and such, we need to document that better
[22:47:46] kaldari: let me send you an example
[22:47:55] Yeah, I was about to create a repo, but didn't know where to clone it. That answers my question :)
[22:53:35] kaldari: let me ask team tomorrow for best location of depot , they are in very different places, for example discovery: https://gerrit.wikimedia.org/r/#/admin/projects/wikimedia/discovery/golden
[22:54:18] Thanks!
[22:54:32] Either GitHub or Gerrit are fine with us
[22:58:11] kaldari: gerrit would be best i think maybe something like: git clone https://gerrit.wikimedia.org/r/analytics/reportupdater-queries
[23:52:17] Many thanks to milimetric for his support in sqooping the wbc_entity_usage tables today. Almost a 2h Google Hangouts session, but the problem seems to be solved.