[00:01:17] Analytics-Kanban, RESTBase-API, Services (doing): Analyze surge of traffic in AQS that lead to 504s - https://phabricator.wikimedia.org/T190213#4067232 (Nuria) I think we need to investigate the effects of 404 in api, caching those for a short time will not stop the effects of a storm of 404s but...
[00:20:36] (PS7) Nuria: Create and manipulate date objects according to UTC timezone [analytics/wikistats2] - https://gerrit.wikimedia.org/r/417476 (https://phabricator.wikimedia.org/T189266) (owner: Fdans)
[02:51:19] Quarry: Add export to pagepile functionality to Quarry - https://phabricator.wikimedia.org/T190242#4067406 (zhuyifei1999)
[05:15:44] (CR) Nuria: [C: -1] "@mforns fixed issue with selector and couple others on time selector" [analytics/wikistats2] - https://gerrit.wikimedia.org/r/417476 (https://phabricator.wikimedia.org/T189266) (owner: Fdans)
[05:22:45] (PS8) Nuria: Create and manipulate date objects according to UTC timezone [analytics/wikistats2] - https://gerrit.wikimedia.org/r/417476 (https://phabricator.wikimedia.org/T189266) (owner: Fdans)
[05:36:04] Analytics, Analytics-Wikistats: Change '--' to something more helpful in Wikistats page views by country table view - https://phabricator.wikimedia.org/T187427#4067508 (sahil505) Yes, I feel the same. Since https://en.wikipedia.org/wiki/Unknown does not have anything related (for now :P ) to a Country (n...
[05:38:23] (PS9) Nuria: Create and manipulate date objects according to UTC timezone [analytics/wikistats2] - https://gerrit.wikimedia.org/r/417476 (https://phabricator.wikimedia.org/T189266) (owner: Fdans)
[05:39:18] Analytics-Kanban, Patch-For-Review: Wikistats: labeling of pageviews is wrong on table and graph. Issues with dates - https://phabricator.wikimedia.org/T189266#4037002 (Nuria)
[05:43:01] Analytics-Kanban: Puppetize job that saves old versions of geoIP database - https://phabricator.wikimedia.org/T136732#4067521 (Nuria)
[05:43:03] Analytics-Kanban: Hand off of Christian's MaxMind geolocation databases repository - https://phabricator.wikimedia.org/T89453#4067523 (Nuria)
[05:43:37] Analytics: Order hardware labs storage for mediawiki history analytics friendly DB - https://phabricator.wikimedia.org/T175604#4067524 (Nuria)
[05:44:15] Analytics: Easter Egg: wikistats classic style on wikistats 2.0 - https://phabricator.wikimedia.org/T177408#4067525 (Nuria)
[05:44:47] Analytics: Update Mediawiki Table manuals on wiki - https://phabricator.wikimedia.org/T179407#4067526 (Nuria)
[05:46:56] Analytics-Kanban: Allow switching metrics in a dashboard widget - https://phabricator.wikimedia.org/T187440#3975285 (Nuria) ping @fdans is there a carrousel ticket?
[05:56:57] Analytics, Analytics-Wikistats: Intervals for data arround pageviews in wikistats maps - https://phabricator.wikimedia.org/T188928#4067531 (Nuria)
[05:56:59] Analytics-Kanban, Analytics-Wikistats: Create Daily & Monthly pageview dump with country data and Visualize on UI - https://phabricator.wikimedia.org/T90759#4067530 (Nuria)
[06:36:16] Analytics: Jupyter Notebooks TLC 2018-2019 - https://phabricator.wikimedia.org/T188275#4067564 (Aklapper)
[07:46:56] morning everybody!
[07:47:36] so due to a big puppet infrastructure upgrade we are without kafka/hadoop metrics since yesterday at 21:00 UTC (more or less)
[07:47:42] we are working on it
[08:16:01] Hi elukey - noted - Kid's day oday, I'll be on and off until tonight
[08:16:10] RECOVERY - YARN NodeManager JVM Heap usage on analytics1055 is OK: OK - analytics_hadoop_yarn_nodemanager is 1024424648 https://grafana.wikimedia.org/dashboard/db/analytics-hadoop?orgId=1&panelId=17&fullscreen
[08:16:38] joal: ack!
[08:16:44] it seems that we are in recovering
[08:17:01] PROBLEM - YARN NodeManager JVM Heap usage on analytics1051 is CRITICAL: CRITICAL - analytics_hadoop_yarn_nodemanager is 3941446544 https://grafana.wikimedia.org/dashboard/db/analytics-hadoop?orgId=1&panelId=17&fullscreen
[08:17:21] PROBLEM - HDFS active Namenode JVM Heap usage on analytics1001 is CRITICAL: CRITICAL - hadoop-hdfs-namenode-heap-usage is 6139775912 https://grafana.wikimedia.org/dashboard/db/analytics-hadoop?panelId=4&fullscreen&orgId=1
[08:17:29] oh noes this is a storm of false pages
[08:17:30] PROBLEM - HDFS DataNode JVM Heap usage on analytics1055 is CRITICAL: CRITICAL - analytics_hadoop_hdfs_datanode is 3904443016 https://grafana.wikimedia.org/dashboard/db/analytics-hadoop?panelId=1&fullscreen&orgId=1
[08:18:21] RECOVERY - HDFS active Namenode JVM Heap usage on analytics1001 is OK: OK - hadoop-hdfs-namenode-heap-usage is 5285365184 https://grafana.wikimedia.org/dashboard/db/analytics-hadoop?panelId=4&fullscreen&orgId=1
[08:18:50] downtime all the hosts for 30 mins
[08:19:01] RECOVERY - YARN NodeManager JVM Heap usage on analytics1051 is OK: OK - analytics_hadoop_yarn_nodemanager is 3345668812 https://grafana.wikimedia.org/dashboard/db/analytics-hadoop?orgId=1&panelId=17&fullscreen
[08:24:30] RECOVERY - HDFS DataNode JVM Heap usage on analytics1055 is OK: OK - analytics_hadoop_hdfs_datanode is 3842382624 https://grafana.wikimedia.org/dashboard/db/analytics-hadoop?panelId=1&fullscreen&orgId=1
[08:50:48] Analytics-Tech-community-metrics: Offer "time to first review" data for patches - https://phabricator.wikimedia.org/T190251#4067712 (Aklapper) p:Triage>Low
[10:15:37] elukey: helloooo, the link here that points to the purging script is broken - https://wikitech.wikimedia.org/wiki/Analytics/Systems/EventLogging/Data_retention_and_auto-purging#The_purging_script
[10:16:18] fdans: I think it moved after a refactoring, checking
[10:17:18] https://phabricator.wikimedia.org/source/operations-puppet/browse/production/modules/profile/files/mariadb/misc/eventlogging/eventlogging_cleaner.py
[10:18:27] Analytics-Tech-community-metrics, Developer-Relations: Identify volunteer code contributors that we should consider inviting for the first time to a Hackathon - https://phabricator.wikimedia.org/T190259#4067947 (Aklapper) p:Triage>Normal
[10:21:20] thank youuu elukey
[10:23:04] Analytics-Tech-community-metrics, Developer-Relations: Identify volunteer code contributors that we should consider inviting for the first time to a Hackathon - https://phabricator.wikimedia.org/T190259#4067965 (Aklapper)
[11:44:46] * elukey lunch + errand! be back in 2h
[11:59:00] Analytics-Kanban, Google-Summer-of-Code (2018): [Analytics] Improvements to Wikistats2 front-end - https://phabricator.wikimedia.org/T189210#4035243 (Jibin2706) Where is the source code?
[12:03:53] Analytics-Tech-community-metrics, Developer-Relations: Investigate how to identify our code contributors to on-wiki code (gadgets, templates, modules) across WMF sites - https://phabricator.wikimedia.org/T190164#4068346 (Aklapper) Well, quick and dirty theoretical and untested idea, for poor people witho...
[12:13:09] (PS3) Fdans: Add hi.wikimedia, zh.wikidata and wikipedia.commons to whitelist [analytics/refinery] - https://gerrit.wikimedia.org/r/420346
[12:25:38] (PS1) Fdans: Metrics carousel [analytics/wikistats2] - https://gerrit.wikimedia.org/r/421001 (https://phabricator.wikimedia.org/T187440)
[12:29:46] (PS2) Fdans: Metrics carousel [analytics/wikistats2] - https://gerrit.wikimedia.org/r/421001 (https://phabricator.wikimedia.org/T187440)
[12:32:41] (PS3) Fdans: Metrics carousel [analytics/wikistats2] - https://gerrit.wikimedia.org/r/421001 (https://phabricator.wikimedia.org/T187440)
[12:54:22] Hi hi Ateam!
[12:54:53] Does anyone know of any numbers "How many Wikipedia editors are editing Wikidata as well"?
[12:55:46] addshore: I have never heard of that number - but computation should be doable with the assumption wikipedia-username == wikidata-username
[12:55:52] yup
[12:55:56] And by the wasy, Hi addshore :)
[12:56:04] I guess using the editing data in the data lake?
[12:56:34] addshore: I suggest using wmf.mediawiki_history table in hive
[12:56:38] correct
[12:56:47] addshore: I can help :)
[12:56:50] :D
[12:56:57] * addshore hasn;t looked at mediawiki_history at all yet
[12:57:20] addshore: https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Edits/Mediawiki_history
[12:58:03] * addshore tries to remember how to write hadoop queries
[12:59:29] addshore: let me rephrase the question as a query-style language: You're after distinct usernames of non-anonymous users that have at least one revision on a wikipedia project and at least one revision on wikidata project - I'm assuming we should remove deleted revisions from the count, and that we should retrieve the number of edits on wiki projects, and the number of edits on wikidata projet -
[12:59:35] Any dates restrictions?
[13:00:17] That sounds easier to think about :)
[13:00:23] "I'm assuming we should remove deleted revisions from the count" I'm not sure, let me check
[13:00:29] sure
[13:00:44] When I say deleted, I mean archived, not hidden
[13:00:48] addshore: --^
[13:00:51] "and that we should retrieve the number of edits on wiki projects, and the number of edits on wikidata projet" as far as I know no, we literally just want the number of users
[13:01:00] addshore: ack!
[13:04:12] So, let's include edits that are archived
[13:04:35] works for me addshore
[13:13:09] addshore: should we remove bots?
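The question as rephrased above (distinct registered users with at least one revision on a Wikipedia project and at least one on Wikidata, archived revisions included) amounts to a set intersection. A minimal Python sketch on made-up rows follows. The field names (`user`, `project`, `is_anonymous`, `is_bot`) and the sample data are hypothetical, not the real `wmf.mediawiki_history` schema, and this is not the query from the gists linked in this log. The bot question is left as a flag.

```python
def users_editing_both(revisions, include_bots=True):
    """Count distinct registered users with revisions on both projects."""
    wikipedia_users, wikidata_users = set(), set()
    for rev in revisions:
        if rev["is_anonymous"]:
            continue  # only non-anonymous users are counted
        if rev["is_bot"] and not include_bots:
            continue
        if rev["project"] == "wikipedia":
            wikipedia_users.add(rev["user"])
        elif rev["project"] == "wikidata":
            wikidata_users.add(rev["user"])
    # Same username on both sides is assumed to be the same person.
    return len(wikipedia_users & wikidata_users)


def row(user, project, is_anonymous=False, is_bot=False):
    return {"user": user, "project": project,
            "is_anonymous": is_anonymous, "is_bot": is_bot}


# Hypothetical sample data, for illustration only.
sample = [
    row("alice", "wikipedia"), row("alice", "wikidata"),
    row("bob", "wikipedia"),
    row("carol-bot", "wikipedia", is_bot=True),
    row("carol-bot", "wikidata", is_bot=True),
    row("127.0.0.1", "wikipedia", is_anonymous=True),
    row("127.0.0.1", "wikidata", is_anonymous=True),
]

print(users_editing_both(sample))
print(users_editing_both(sample, include_bots=False))
```

Counting only users, not edits, keeps the result a single number, which is what was asked for.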
[13:13:15] *checks*
[13:13:38] addshore: meaning, bots identified as such in the group
[13:17:06] I think includes them for now :)
[13:18:21] addshore: query started - https://gist.github.com/jobar/c3ed06c826cf111e080d71d4b57a9166
[13:18:51] ooooooohhh
[13:19:12] is it running? :)
[13:19:50] addshore: stage 8 over 10
[13:20:02] * addshore watches it on yarn
[13:21:01] SUCCEEDED!
[13:22:10] 68GB of data in 45 seconds, aaaah, i love the data lake
[13:22:18] :)
[13:23:03] addshore: https://gist.github.com/jobar/c3ed06c826cf111e080d71d4b57a9166
[13:24:10] addshore: for fun --> 14:17:05 < addshore> I think includes them for now :)
[13:24:17] oops - misclick
[13:24:22] addshore: Total MapReduce CPU Time Spent: 0 days 17 hours 11 minutes 41 seconds 440 msec
[13:24:39] real time taken: 254.075 seconds
[13:25:22] :D
[13:25:32] joal: any chance I could get it excluding bots? >.>
[13:25:37] the overlords changes their minds :D
[13:26:32] joal: fixed the indexing template and re-testing everything from scratch now
[13:26:51] awesome milimetric
[13:27:56] |oo|/
[13:28:22] Hi ottomata
[13:28:29] addshore: currently running
[13:28:35] joal: thanks!
[13:28:59] joal: do you have a dump of the updated query again? (so I don't have to ask you in the future) :)
[13:29:31] addshore: https://gist.github.com/jobar/a6b2351f19aa66dcc170ce20694339b1
[13:36:42] (CR) Fdans: "WIth this change, for me right now all the dashboard metrics still show January as last month, while they should be February right?" (6 comments) [analytics/wikistats2] - https://gerrit.wikimedia.org/r/417476 (https://phabricator.wikimedia.org/T189266) (owner: Fdans)
[13:37:51] addshore: https://gist.github.com/jobar/f6b171532f655fca5e41dd3489274c36
[13:38:04] Thanks a million!
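The two timing figures quoted above (about 17 hours of MapReduce CPU time against roughly 254 seconds of wall-clock time) give a rough idea of how parallel the job was. A small worked calculation, assuming both figures refer to the same run and treating the ratio only as an approximate average of concurrently busy cores:

```python
from datetime import timedelta

# Figures as pasted in the chat.
cpu_time = timedelta(hours=17, minutes=11, seconds=41, milliseconds=440)
wall_seconds = 254.075

# CPU-seconds divided by elapsed seconds ~ average number of busy cores.
avg_parallelism = cpu_time.total_seconds() / wall_seconds

print(f"{cpu_time.total_seconds():.2f} CPU-seconds over {wall_seconds} s "
      f"is roughly {avg_parallelism:.0f} cores busy on average")
```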
[13:38:13] np addshore :)
[13:38:35] a-team - Gone again, kids awaken - Will be back if not for standup, for gaols meeting for sure
[13:40:58] (CR) Ottomata: [C: 1] "One nit, but +1" (1 comment) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/420795 (https://phabricator.wikimedia.org/T190202) (owner: Mforns)
[13:44:43] aaah! No such file or directory
[13:45:17] ottomata: lol to |oo|/
[13:45:19] :D
[13:45:21] morning :)
[13:45:50] fdans: I am here if you want to chat about the data deletion
[13:45:59] yesssss let's do it elukey
[13:46:42] (PS15) Milimetric: Compute geowiki statistics from cu_changes data [analytics/refinery] - https://gerrit.wikimedia.org/r/413265 (https://phabricator.wikimedia.org/T188113)
[13:46:57] i'm in the batcaverna elukey
[13:47:09] fdans: 2 mins that I am merging a change!
[13:47:16] how dare you
[13:47:20] :D
[13:47:22] I thought I had more time :D :D
[13:52:57] oh man, I really don't understand this. This stupid command not found error is intermittent and I have no idea why
[13:54:18] * fdans hugs milimetric
[14:11:14] hi teaam
[14:16:46] Analytics-Kanban, Patch-For-Review: Remove sensitive fields from whitelist for QuickSurvey schemas (end of Q2) - https://phabricator.wikimedia.org/T174386#3560006 (elukey) Quick sanity check: ``` UPDATE QuickSurveyInitiation_15278946 SET userAgent = null where timestamp < '20171221000000'; UPDATE Quick...
[14:32:22] Analytics-Kanban: Allow switching metrics in a dashboard widget - https://phabricator.wikimedia.org/T187440#3975285 (fdans) For some reason the patch is not being linked: https://gerrit.wikimedia.org/r/#/c/421001/
[14:34:21] joal: you said something yesterday about a corrupt file that you fixed? I think something in my environment must be corrupting what I'm copying up to hdfs (git, maybe?) and I can't figure out what. What was the corruption you found?
[14:57:42] Analytics-Kanban, Patch-For-Review: [EL] Correct Print schema mysql whitelist - https://phabricator.wikimedia.org/T190223#4066652 (mforns)
[14:57:56] Analytics-Kanban, Patch-For-Review: Add defaults section to WhitelistSanitization.scala - https://phabricator.wikimedia.org/T190202#4069038 (mforns)
[14:58:45] Analytics, EventBus, MediaWiki-JobQueue, Services (watching): Create .change-prop.partitioned.mediawiki.job.refreshLinks topic - https://phabricator.wikimedia.org/T190196#4069039 (Ottomata)
[14:59:53] mforns: I found something disturbing, namely the eventlogging cleaner not logging anything and not updating the start timestamp file
[15:00:03] elukey, hmmm
[15:00:14] file permissions?
[15:00:26] it seems fine on db1107 but not on 1108
[15:00:35] nothing registered in the logs
[15:00:46] elukey, how do you know it is executing?
[15:01:18] elukey, BTW, thanks for merging that :]
[15:01:25] Analytics, EventBus, MediaWiki-JobQueue, Services (watching): Create .change-prop.partitioned.mediawiki.job.refreshLinks topic - https://phabricator.wikimedia.org/T190196#4069054 (Ottomata) Done in both eqiad and codfw for both prefixed topics, e.g.: ``` [@kafka1001:/home/otto] $ kafka topic...
[15:01:32] elukey@db1108:/var/run$ ls /var/run/eventlogging_cleaner
[15:01:32] ls: cannot access '/var/run/eventlogging_cleaner': No such file or directory
[15:01:37] * elukey cries
[15:02:40] so it has already happened to db1107 and I re-created the file
[15:05:16] elukey, so what happened? I don't see it
[15:05:29] * mforns taps elukey's shoulder
[15:05:48] noooooooooooooooooooooooooooooooooooooo all the db1108's emails were in the spam folder
[15:06:09] basically since one month ago
[15:06:18] nobody noticed those??
[15:06:30] elukey, I didn't receive any of them
[15:07:08] check your spam dir for db1108
[15:07:08] elukey, oh! I also have them in the spam folder!
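The failure above (a missing `/var/run/eventlogging_cleaner` state file whose error emails ended up in spam) is the kind of thing a file-freshness check can catch without depending on email. A minimal sketch, not an existing Wikimedia check: the function name, the return convention (0 = OK, 2 = CRITICAL, as Nagios-style plugins use) and the one-day threshold are all assumptions for illustration.

```python
import os
import time


def check_state_file(path, max_age_seconds, now=None):
    """Return (exit_code, message): 0 when the file is fresh, 2 otherwise."""
    now = time.time() if now is None else now
    try:
        age = now - os.path.getmtime(path)
    except FileNotFoundError:
        # The exact symptom seen on db1108 in the log above.
        return 2, f"CRITICAL - {path} does not exist"
    if age > max_age_seconds:
        return 2, f"CRITICAL - {path} not updated for {age:.0f}s"
    return 0, f"OK - {path} updated {age:.0f}s ago"


if __name__ == "__main__":
    # Hypothetical threshold: the cleaner is assumed to run at least daily.
    code, message = check_state_file("/var/run/eventlogging_cleaner", 86400)
    print(message)
```

A monitoring system would call this periodically and alert on a non-zero code, so a silently failing job shows up as a stale or missing file rather than as an unread email.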
[15:07:11] sigh
[15:07:59] ok this needs a specific alarm
[15:08:05] can't rely on emails
[15:08:26] mforns: the older one seems from Feb 21st
[15:09:23] and 3 months before is Nov 21st
[15:09:50] so I'd put the start date as 20171101000000
[15:10:00] just to be sure
[15:10:11] we'd need to triple check when it stopped though
[15:11:34] ok created on db1108
[15:11:55] Analytics, EventBus, MediaWiki-JobQueue, Patch-For-Review, Services (designing): Support per-db-shard concurrency in ChangeProp - https://phabricator.wikimedia.org/T189738#4069074 (Pchelolo)
[15:11:58] Analytics, EventBus, MediaWiki-JobQueue, Services (watching): Create .change-prop.partitioned.mediawiki.job.refreshLinks topic - https://phabricator.wikimedia.org/T190196#4069072 (Pchelolo) Open>Resolved Thank you!
[15:14:44] elukey, ok, do you want to pair on that today?
[15:18:31] nuria_: hey, good morning! Tell me when you have some free time to work on the email
[15:19:00] mforns: if you could just check in a couple of tables that need sanitization the last timestamp that we sanitized on db1108 would be great
[15:19:35] elukey, 1108 is the slave or the master?
[15:20:13] (PS1) Fdans: Adds "Load more rows..." UI button to table chart [analytics/wikistats2] - https://gerrit.wikimedia.org/r/421041 (https://phabricator.wikimedia.org/T188953)
[15:21:47] mforns: slave
[15:21:52] k looking
[15:33:57] joal: is it possible to get that same query & data, just for edits made in the last 3 months? so Jan 1st 208 until the present day?
[15:44:56] elukey, it seems that the script didn't run since 20171122110001
[15:44:59] is that possible?
[15:45:33] milimetric: Let's discuss your issue
[15:45:43] milimetric: batcave?
[15:45:55] addshore: edits oneither wikidata or wikipedias?
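The date reasoning above can be checked mechanically. The sketch below assumes a 90-day retention window (the "3 months" mentioned in the chat) and the `YYYYMMDDHHMMSS` timestamp format seen in the log. The helper name `purge_cutoff` is hypothetical and is not taken from `eventlogging_cleaner.py`.

```python
from datetime import datetime, timedelta

TS_FORMAT = "%Y%m%d%H%M%S"  # e.g. 20171122110001


def purge_cutoff(run_time, retention_days=90):
    """Timestamp before which rows would be sanitized by a run at run_time."""
    return (run_time - timedelta(days=retention_days)).strftime(TS_FORMAT)


# Last sanitized timestamp found on db1108, as reported in the chat.
last_cutoff = datetime.strptime("20171122110001", TS_FORMAT)

# If that cutoff was "run time minus 90 days", the last good run was:
last_run = last_cutoff + timedelta(days=90)
print(last_run.isoformat())

# The restart value chosen in the chat is earlier than the last cutoff,
# so restarting from it re-covers already sanitized data instead of
# leaving a gap.
print("20171101000000" < last_cutoff.strftime(TS_FORMAT))
```

Under these assumptions the last good run falls on the day before the oldest error email (Feb 21st), which agrees with what was found in the spam folder.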
[15:47:24] joal: Yes, same one :) number of users that edited both a wikidata and a wikipedia in Jan Feb and march of this year (with the same other conditions, such as bot etc as before)
[15:48:06] addshore: same conditions meaning: no bots, possibly deleted pages
[15:48:12] yes!
[15:48:15] thanks [=
[15:48:51] mforns: yep exactly what my calculations suggested! So the oldest email in spam is Feb 21st, three months before is the date that you posted
[15:49:05] I put in the start date the Nov 1st just as precaution
[15:49:10] super good then
[15:49:22] addshore: You'll only have Jan and Feb 2018 - March data will be imported at the end of month
[15:49:29] okay!
[15:49:47] Analytics-Tech-community-metrics, Developer-Relations (Jan-Mar-2018): Find out if Kibana/Elasticsearch allows queries based on the results of other queries - https://phabricator.wikimedia.org/T189903#4069226 (Aklapper) So the current workaround (until these two indices are merged) to get the list of //al...
[15:51:25] elukey: if interested, I have funny charts for yesterday AQS spike
[15:53:56] Analytics-Tech-community-metrics: Include gerrit DB's "author_bot" field also in the gerrit_demo DB - https://phabricator.wikimedia.org/T184907#4069250 (Aklapper) This might also get fixed by merging the `gerrit_demo` index into the `gerrit` index as a potential part of T151161.
[15:53:58] elukey, oh of course, it's not that the script did not run since 20171122110001, rather since 90 days later
[15:54:45] yeah!
[15:54:56] joal: sure! Did you see my updates in the task?
[15:55:02] nope elu
[15:55:05] nope elukey
[15:55:13] https://phabricator.wikimedia.org/T190213
[15:55:20] lemme know what you think about them
[15:55:31] denied access elukey :(
[15:57:00] joal: you should be able now
[16:04:27] (CR) Nuria: "@fdans: let's talk in irc, the assumption that tz does not play a part in date operations is incorrect." [analytics/wikistats2] - https://gerrit.wikimedia.org/r/417476 (https://phabricator.wikimedia.org/T189266) (owner: Fdans)
[16:05:35] addshore: https://gist.github.com/jobar/09005ad29b399e28b73d7773477fdaaf
[16:05:46] addshore: query + result in the same gist ;)
[16:06:35] Thankss!!
[16:06:58] addshore: You're welcome :)
[16:24:16] Quarry, cloud-services-team: Quarry server errors caused by Cloud VPS shared proxy failures - https://phabricator.wikimedia.org/T190218#4069396 (bd808)
[16:24:52] joal: right, the last thing I need to the same data for Jan and Feb of 2017
[16:25:12] I see the 2 event_timestamp LIKE pats of the query, i guess I need to alter them to be 2017
[16:25:38] * addshore is not sure about the snapshot bit
[16:26:19] oh, and I guess I have to explicitly ask for 01 and 02 in the event_timestamp regex too now) as we have all of the data for 2017
[16:26:19] addshore: Don't change the snapshot - Every snapshot contains all data
[16:26:26] okay!
[16:26:42] addshore: you just need to change the LIKE in the 2 CTEs indeed
[16:27:07] what regex will work? 2017-(01|02)%
[16:27:27] nope
[16:28:01] You should do RLIKE '^2017-0[12].*'
[16:28:10] RLIKE stands for regexp-like
[16:28:15] LIKE is sql-like
[16:28:20] addshore: --^
[16:28:21] * addshore facepalsm, of course
[16:28:26] *palms
[16:30:17] addshore: You have it running :)
[16:30:23] :D
[16:30:50] addshore: Thanks for doing so - While I love to help, the team try not to get into the habit of being data-providers ;)
[16:31:04] yup :)
[16:31:18] Now you have written the complex query for me ;) I can run it and modify it at will!
[16:31:37] Any idea what user group people need to be added to in order to access the mediawiki_history tables in hadoop?
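The LIKE versus RLIKE exchange above can be illustrated outside Hive. In SQL `LIKE`, only `%` and `_` are wildcards, so `2017-(01|02)%` looks for literal parentheses and a literal pipe. `RLIKE` takes a regular expression that may match anywhere in the value. The two helper functions below are a Python emulation written for this example (Hive itself uses Java regular expressions), and the sample timestamps are made up.

```python
import re


def sql_like(value, pattern):
    """Emulate SQL LIKE: % is any run of characters, _ is exactly one."""
    regex = "".join(
        ".*" if ch == "%" else "." if ch == "_" else re.escape(ch)
        for ch in pattern
    )
    return re.fullmatch(regex, value, flags=re.DOTALL) is not None


def sql_rlike(value, pattern):
    """Emulate RLIKE: true if the regex matches any part of the value."""
    return re.search(pattern, value) is not None


# Hypothetical event_timestamp values.
timestamps = [
    "2017-01-15 10:00:00.0",
    "2017-02-28 23:59:59.0",
    "2017-03-01 00:00:00.0",
    "2018-01-01 00:00:00.0",
]

for ts in timestamps:
    print(ts,
          sql_like(ts, "2017-(01|02)%"),     # parentheses are literal here
          sql_rlike(ts, "^2017-0[12].*"))    # January and February 2017 only
```

The `^` anchor matters because the regex is otherwise free to match in the middle of the string.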
[16:32:12] https://wikitech.wikimedia.org/wiki/Analytics/Data_access#Production_access doesnt look like it explicitely states, so I guess it is one of the one with hadoop access, but not sure which would be neeed
[16:32:34] addshore: data comes from labs, so normally anybody would have the right to access it
[16:32:50] I guess analytics-users would have access to it then
[16:32:56] Access to stat1004 to connect to the Analytics/Cluster (Hadoop/Hive) (NO HADOOP PRIVATE DATA).
[16:33:16] analytics-users should work addshore
[16:33:23] addshore: however, our permissions are not detailed enough I think for us to easily allow access to this data and not PII one
[16:33:36] joal: not true, analytics-users vs analytics-privatedata-users
[16:33:50] ottomata: Thanks for correction !!
[16:34:18] So here we go addshore - We need to ensure mediawiki-history is in the correct group :)
[16:36:01] :D
[16:36:35] ottomata: Just checked - history files are not in the correct group as of now
[16:36:46] ottomata: Do you mind double checking I'm not doing mistakes?
[16:37:07] joal: they are world readable
[16:37:12] so if you have cluster access, you can read them
[16:37:29] Ohhhh ! right ottomata - Sorry to even boterh
[16:42:06] super nice analysis @elukey on 404s
[16:43:27] ottomata: Indeed, in the venv/lib folder of notebook, there are 2 subfolders: python3.4 and python3.5
[16:44:42] hmm
[16:46:11] ping ottomata goals goals goals
[16:58:21] ottomata: I've managed to make hive work with py3.5 - but the patch is awefull
[16:59:58] Analytics-Kanban, MW-1.31-release-notes (WMF-deploy-2018-02-27 (1.31.0-wmf.23)), Patch-For-Review: Record and aggregate page previews - https://phabricator.wikimedia.org/T186728#4069562 (Tbayer) More notes from our meeting on Friday, for reference: - @Jdlrobson verified (by tracking down the source...
[17:46:22] fdans:want to talk about dates?
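On the dates topic: the review comment quoted earlier in this log says that the timezone does play a part in date operations. A small illustration of why, written in Python rather than the JavaScript used by wikistats2, and not taken from the patch under review: the same instant can fall in different calendar months depending on the offset used to read it. The UTC-8 offset is a hypothetical viewer timezone.

```python
from datetime import datetime, timedelta, timezone

# One instant, shortly after midnight UTC on the first of the month.
instant = datetime(2018, 3, 1, 0, 30, tzinfo=timezone.utc)

# Hypothetical viewer offset (fixed UTC-8, no DST handling).
utc_minus_8 = timezone(timedelta(hours=-8))
local = instant.astimezone(utc_minus_8)

utc_month = instant.strftime("%Y-%m")    # month bucket when read as UTC
local_month = local.strftime("%Y-%m")    # month bucket in the viewer's zone

print(utc_month, local_month)
```

Bucketing by month in local time would label this data point a month earlier than bucketing in UTC.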
[17:46:36] sure, omw
[17:47:20] (nuria_ )
[18:16:03] ottomata: o/ - on deployment-puppetmaster02 I can see error: could not apply 5b5cff6aab... [WIP] point eventlogging processes at Kafka jumbo
[18:16:24] can I remove it or do you need it ?
[18:16:46] well probably not since we have already done the work, auto-answering :D
[18:21:15] Analytics, EventBus, Services (next): Support multiple partitions per topic in EventBus - https://phabricator.wikimedia.org/T157822#4070010 (Pchelolo)
[18:22:14] Analytics, EventBus, MediaWiki-JobQueue, Patch-For-Review, Services (designing): Support per-db-shard concurrency in ChangeProp - https://phabricator.wikimedia.org/T189738#4070015 (Pchelolo)
[18:22:22] Analytics, EventBus, Services (done): Support multiple partitions per topic in EventBus - https://phabricator.wikimedia.org/T157822#3017728 (Pchelolo) Open>declined We've decided to go with a different approach when CP is handling partitioning. Declining.
[18:22:46] ah wonderful now role::eventlogging::analytics::server in deployment-prep gets stuck with kafka_config('jumbo-eqiad')
[18:24:45] Analytics, EventBus, MediaWiki-JobQueue, Services (done): Support per-db-shard concurrency in ChangeProp - https://phabricator.wikimedia.org/T189738#4070019 (Pchelolo) Open>Resolved a:Pchelolo Deployed. Seem to be working fine, resolving.
[18:24:50] Analytics, DBA, EventBus, MediaWiki-Database, and 7 others: High (2-3x) write and connection load on enwiki databases - https://phabricator.wikimedia.org/T189204#4070023 (Pchelolo)
[18:33:10] Analytics, EventBus, MediaWiki-JobQueue, Services (done): Support per-db-shard concurrency in ChangeProp - https://phabricator.wikimedia.org/T189738#4070056 (mobrovac)
[18:33:14] Analytics, EventBus, MediaWiki-JobQueue, Patch-For-Review, Services (doing): Migrate RefreshLinks job to kafka - https://phabricator.wikimedia.org/T185052#4070057 (mobrovac)
[18:33:57] Analytics, DBA, EventBus, MediaWiki-Database, and 7 others: High (2-3x) write and connection load on enwiki databases - https://phabricator.wikimedia.org/T189204#4070065 (Pchelolo) The change that partitioned the `refreshLinks` topic in line with MySQL sharding has been deployed. Now we just need...
[18:36:14] Analytics, EventBus, MW-1.31-release-notes (WMF-deploy-2018-03-13 (1.31.0-wmf.25)), Services (done): Mediawiki EventBus should set meta.dt to UTC time - https://phabricator.wikimedia.org/T189243#4070074 (Pchelolo) Open>Resolved This has been deployed with a regular mediawiki train, resolv...
[18:37:51] Analytics, DBA, EventBus, MediaWiki-Database, and 7 others: High (2-3x) write and connection load on enwiki databases - https://phabricator.wikimedia.org/T189204#4070079 (mobrovac) Relevant dashboards to monitor (for posterity): - [MySQL open connections](https://grafana-admin.wikimedia.org/dashb...
[18:39:44] Analytics, EventBus, MediaWiki-JobQueue, Goal, Services (doing): FY17/18 Q3 Program 8 Services Goal: Migrate two high-traffic jobs over to EventBus - https://phabricator.wikimedia.org/T183744#4070086 (mobrovac)
[18:39:49] Analytics, EventBus, MediaWiki-JobQueue, Services (done): Migrate RefreshLinks job to kafka - https://phabricator.wikimedia.org/T185052#4070083 (mobrovac) Open>Resolved The RefreshLinks jobs have been fully migrated to the EventBus system. Resolving.
[18:43:18] Wow - his is perf improvement ! https://www.ibm.com/blogs/research/2018/03/machine-learning-benchmark/
[18:43:49] ottomata: lemme know how should we handle kafka_config in role::eventlogging::analytics::server (maybe a simple check for labs until we refactor the role?)
[18:46:55] back from metting milimetric and fdans want to talk some more?
[18:49:58] Amir1: we can look at e-mails if you are there, i know is late on your end
[18:50:20] elukey: yes you can remove that cherry pick thank you!
[18:50:26] (just got out of meeting)
[18:50:53] elukey: are you refactoring?
[18:50:55] OHHH
[18:50:58] m
[18:50:58] https://gerrit.wikimedia.org/r/#/c/421104/1/modules/role/manifests/eventlogging/analytics/server.pp
[18:51:01] hm
[18:51:01] :)
[18:51:01] i see
[18:51:02] yes
[18:51:09] yeah, conditional makes sense
[18:51:12] ergh
[18:51:14] really horrible
[18:51:16] I know
[18:51:21] nuria_: I grabbed a late lunch, I'm just looking at the patch now. I'll just leave comments there or ping you
[18:51:35] milimetric: sounds good, will check in in afew mins
[18:51:47] milimetric: let's set up a meeting with asaf to look at geowiki
[18:52:06] nuria_: after the superset dashboard is up
[18:52:12] hmm i can't switch to deployment-prep project in horizon anymore...?
[18:52:14] milimetric: yeah yeah
[18:52:33] ottomata: the alternative could be to create a jumbo-eqiad cluster definition in there
[18:52:52] (I mean where we define the deployment-prep jumbo)
[18:52:54] maybe cleaner
[18:53:02] yeah i want to look at hiera but i can't seem to switch to that project...
[18:53:03] can you?
[18:53:08] yep
[18:53:23] weird
[18:53:30] :(
[18:53:43] its not in my list of projects...
[18:53:47] should I simply duplicate jumbo-deployment-prep in jumbo-eqiad?
[18:53:55] might be worth to ask to releng
[18:53:57] can you paste the top of the hiera?
[18:54:03] sure
[18:54:07] for kafka_clustesr
[18:54:08] s
[18:54:42] tehcnically elukey, it would work in both places if we did
[18:54:48] kafka_config('jumbo')
[18:54:50] https://etherpad.wikimedia.org/p/temp
[18:54:54] but, if we ever inlucded that class in codfw, it would break
[18:55:25] well when we'll migrate to profiles this issue will not be there anymore
[18:55:30] elukey: i thikn probably a conditional for now
[18:55:36] if production jumbo-eqiad, else 'jumbo'
[18:56:15] ah because it adds the -eqiad
[18:56:23] even in labs right?
[18:56:28]   jumbo-eqiad:
[18:56:28] is in labs deployment prep?
[18:56:58] yeah
[18:56:59] if realm == 'labs'
[18:56:59] "#{prefix}-#{labsp}"
[18:57:02] no I just added it to show you how I'd change it
[18:57:07] (hiera I mean)
[18:57:08] ah ok
[18:57:22] yeah, i think conditional is better
[18:57:26] for now
[18:57:45] amending the patch
[18:58:42] ottomata: like this? https://gerrit.wikimedia.org/r/#/c/421104/2/modules/role/manifests/eventlogging/analytics/server.pp
[19:00:35] elukey: do just 'jumbo'
[19:00:49] that way it'll work if we were crazy people and wanted it in analytics project
[19:00:56] for if labs
[19:01:15] ah I thought you didn't want it due to codfw etc..
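The conditional being discussed above can be sketched as follows. This is a Python stand-in, not the Puppet patch and not the real `kafka_config` function: both helper names are hypothetical, and the only behaviour modelled is what the chat states, namely a hard-coded `'jumbo-eqiad'` in production, plain `'jumbo'` elsewhere, and the labs expansion quoted as `"#{prefix}-#{labsp}"`.

```python
def cluster_argument(realm):
    """Name the role would pass to kafka_config, per the suggestion above."""
    # Production pins the eqiad cluster because jumbo does not exist in codfw.
    return "jumbo-eqiad" if realm == "production" else "jumbo"


def expand_in_labs(prefix, labs_project):
    """Labs-side expansion as quoted in the chat: "#{prefix}-#{labsp}"."""
    return f"{prefix}-{labs_project}"


# In deployment-prep the short name expands to the cluster name
# mentioned in the chat (jumbo-deployment-prep).
print(expand_in_labs(cluster_argument("labs"), "deployment-prep"))

# In production the fully qualified name is used as-is.
print(cluster_argument("production"))
```

Passing only the prefix in labs keeps the role usable in any labs project, since the project name is appended by the expansion rather than hard-coded.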
[19:01:38] nono, just 'jumbo-eqiad' hardocded for prod
[19:01:41] 'jumbo' else
[19:01:52] all prod should use jumbo-eqiad
[19:02:03] but only because jumbo doesn't exist in codfw
[19:02:08] yeah that's what I've done no?
[19:02:28] no you made all labs use 'jumbo-deployment-prep', and everything else use production
[19:02:33] sorry
[19:02:38] everything else use 'jumbo-eqiad'
[19:02:42] you want
[19:02:51] if prod 'jumbo-eqiad' else 'jumbo'
[19:02:57] sure, but did you check https://gerrit.wikimedia.org/r/#/c/421104/2/modules/role/manifests/eventlogging/analytics/server.pp
[19:03:01] REV 2
[19:03:03] (three is wrong)
[19:03:08] NO just the email i got
[19:03:09] haha
[19:03:14] gr8!
[19:03:21] all right :D
[19:03:44] ahhahaha
[19:10:43] ottomata: sorry I am a bit tired so I might miss something.. I just realized that 'jumbo' is not defined in deployment-prep, should I just add it to the list of kafka clusters in hiera there right?
[19:12:10] ah nooooo
[19:12:13] it should work sorry
[19:12:15] ufff
[19:12:25] after this I promise I'll log off :D
[19:12:50] it is still not working since the git_sync hasn't worked on the deployment-prep's puppet master
[19:12:53] ufff
[19:13:22] nuria_: I go grab something to eat and come back soon
[19:14:05] Analytics, EventBus, MediaWiki-JobQueue, Goal, Services (next): FY17/18 Q4 Program 8 Services Goal: Complete the JobQueue transition to EventBus - https://phabricator.wikimedia.org/T190327#4070155 (mobrovac) p:Triage>Normal
[19:14:18] fdans: looking at button batch
[19:16:54] (PS2) Mforns: Add defaults section to WhitelistSanitization [analytics/refinery/source] - https://gerrit.wikimedia.org/r/420795 (https://phabricator.wikimedia.org/T190202)
[19:17:20] Analytics, ChangeProp, EventBus, MediaWiki-JobQueue, and 3 others: [EPIC] Develop a JobQueue backend based on EventBus - https://phabricator.wikimedia.org/T157088#4070171 (mobrovac)
[19:17:24] Analytics, EventBus, MediaWiki-JobQueue, Goal, Services (next): FY17/18 Q4 Program 8 Services Goal: Complete the JobQueue transition to EventBus - https://phabricator.wikimedia.org/T190327#4070170 (mobrovac)
[19:17:36] (CR) Mforns: Add defaults section to WhitelistSanitization (1 comment) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/420795 (https://phabricator.wikimedia.org/T190202) (owner: Mforns)
[19:19:20] Analytics, ChangeProp, EventBus, MediaWiki-JobQueue, and 4 others: Select candidate jobs for transferring to the new infrastucture - https://phabricator.wikimedia.org/T175210#4070187 (mobrovac)
[19:19:25] Analytics, EventBus, MediaWiki-JobQueue, Goal, Services (next): FY17/18 Q4 Program 8 Services Goal: Complete the JobQueue transition to EventBus - https://phabricator.wikimedia.org/T190327#4070155 (mobrovac)
[19:22:03] (CR) Milimetric: [C: -1] "Added some minor nits as is my style and a couple of changes I'd like to see like the graphData being separate from the selector. After g" (13 comments) [analytics/wikistats2] - https://gerrit.wikimedia.org/r/417476 (https://phabricator.wikimedia.org/T189266) (owner: Fdans)
[19:25:02] elukey: all ok?
[19:25:34] ottomata: yes I am fixing role::eventlogging::analytics::mysql too
[19:25:43] in two patches since I am stupid
[19:25:50] :D
[19:26:29] after these I hope that labs is fixed
[19:26:41] if not, I'll flip my desk and go home :D
[19:27:09] (CR) Mforns: "LGTM! But there's still one file with tabs and spaces, see comments!" (3 comments) [analytics/wikistats2] - https://gerrit.wikimedia.org/r/416999 (https://phabricator.wikimedia.org/T187345) (owner: Fdans)
[19:29:20] haha ok
[19:29:28] lemme know if i can help, thanks for cleaning that up
[19:30:02] ottomata: can you give me a functioning brain like yours? Mine sometimes is faulty :D
[19:36:24] all right eventlogging cleaner deployed in labs
[19:36:33] elukey, :D
[19:36:41] will test tomorrow morning
[19:37:05] thanks!
[19:37:13] yep, need to add the eventlogging_cleaner user to the db buuut will do it tomorrow :)
[19:37:18] now logging off!!
[19:37:40] thanks ottomata for the patience!
[19:38:33] bye luca
[19:38:36] oook haha np laters!
[19:39:23] (CR) Milimetric: Adds "Load more rows..." UI button to table chart (3 comments) [analytics/wikistats2] - https://gerrit.wikimedia.org/r/421041 (https://phabricator.wikimedia.org/T188953) (owner: Fdans)
[19:43:04] Analytics-Kanban, Patch-For-Review: Limit length of table and add "Load more rows" button - https://phabricator.wikimedia.org/T188953#4025130 (Nuria) {F15950377} I think this change should sit of top of master rather than responsive changes. Please take a look at screenshot.
[19:46:58] (CR) Nuria: "I think this change should sit on top of master given that it is not related to responsivenes. tested on chrome and looks good and does no" [analytics/wikistats2] - https://gerrit.wikimedia.org/r/421041 (https://phabricator.wikimedia.org/T188953) (owner: Fdans)
[19:51:01] (CR) Nuria: [C: 2] Adds "Load more rows..." UI button to table chart [analytics/wikistats2] - https://gerrit.wikimedia.org/r/421041 (https://phabricator.wikimedia.org/T188953) (owner: Fdans)
[19:51:29] (CR) Nuria: "Sorry , mean to '0'" [analytics/wikistats2] - https://gerrit.wikimedia.org/r/421041 (https://phabricator.wikimedia.org/T188953) (owner: Fdans)
[19:54:53] (CR) Nuria: Adds "Load more rows..."
UI button to table chart (1 comment) [analytics/wikistats2] - https://gerrit.wikimedia.org/r/421041 (https://phabricator.wikimedia.org/T188953) (owner: Fdans)
[20:19:17] (PS10) Nuria: Create and manipulate date objects according to UTC timezone [analytics/wikistats2] - https://gerrit.wikimedia.org/r/417476 (https://phabricator.wikimedia.org/T189266) (owner: Fdans)
[20:22:46] (CR) Nuria: Create and manipulate date objects according to UTC timezone (10 comments) [analytics/wikistats2] - https://gerrit.wikimedia.org/r/417476 (https://phabricator.wikimedia.org/T189266) (owner: Fdans)
[20:29:54] Analytics: Gather all constants related to mobile/responsiveness in config - https://phabricator.wikimedia.org/T190339#4070536 (fdans)
[20:30:35] (PS2) Fdans: Adds "Load more rows..." UI button to table chart [analytics/wikistats2] - https://gerrit.wikimedia.org/r/421041 (https://phabricator.wikimedia.org/T188953)
[20:30:44] (CR) Fdans: Adds "Load more rows..." UI button to table chart (3 comments) [analytics/wikistats2] - https://gerrit.wikimedia.org/r/421041 (https://phabricator.wikimedia.org/T188953) (owner: Fdans)
[20:35:12] (PS9) Fdans: Responsive Wikistats 2 UI [analytics/wikistats2] - https://gerrit.wikimedia.org/r/416999 (https://phabricator.wikimedia.org/T187345)
[21:41:16] chelsyx: hola
[21:41:30] chelsyx: could you tell me if we can turn off piwik gathering of ios metrics?
[21:42:53] nuria_: No... iOS team don't use a lot of EL, so piwik is the only source we know about events on ios app
[21:43:13] chelsyx: k
[21:43:14] nuria_: what's the problem?
[21:43:47] chelsyx: that none seems to have looked at that data for a while and just keeps on growing , piwik is mean to be used for real small sites
[21:44:10] chelsyx: and ios usage is less and les sustainable
[21:44:14] *less
[21:44:46] nuria_: yeah...
[21:45:29] chelsyx: i see it dropped data again but is is nothing we can "fix" , it is just over the volume it can handle
[21:45:42] Analytics, Analytics-Cluster, Analytics-Kanban: Spike: Consider alternatives to MirrorMaker: uReplicator, Confluent Replicator - https://phabricator.wikimedia.org/T190049#4070868 (Ottomata) From https://github.com/uber/uReplicator/issues/8#issuecomment-375104514 > uReplicator consumer is using 0.10....
[21:45:50] nuria_: oh no...
[21:46:00] chelsyx: only for like 2 days
[21:46:18] chelsyx: but the problem remains
[21:46:47] nuria_: should we chat we josh about potential solution for this?
[21:47:16] chelsyx: i think he is aware but you can ping him again and let him know
[21:47:26] nuria_: ok
[21:47:31] chelsyx: thank you
[21:47:41] chelsyx: Thanks!
[21:49:03] nuria_: Except starting to send data to EL, do you see any other potential solution?
[21:51:11] chelsyx: I think EL is the only alternative i can think of other than downsampling data
[21:51:19] chelsyx: downsampling might be best
[21:52:26] nuria_: ok. I will chat with Josh and ask what he thinks
[21:52:30] nuria_: thanks!
[21:52:36] chelsyx: thank you
[22:08:20] (CR) Mforns: [V: 1 C: 1] "LGTM! +1 Let's merge the one with the dates first, though, no?" [analytics/wikistats2] - https://gerrit.wikimedia.org/r/416999 (https://phabricator.wikimedia.org/T187345) (owner: Fdans)
[23:24:42] Analytics, Performance-Team: Archive Kasocki repository - https://phabricator.wikimedia.org/T190365#4071243 (Krinkle)