[00:37:45] Analytics, Analytics-EventLogging, Analytics-Kanban, MediaWiki-Vagrant: How to use Wikipedia EventLogging schemas in Vagrant setup? - https://phabricator.wikimedia.org/T153641 (srishakatux) Thanks @Milimetric for looking into this! Uninstalling mysqlclient did the trick and I am now able to see l...
[03:41:44] (PS5) Awight: [WIP] Import ORES scores [analytics/refinery] - https://gerrit.wikimedia.org/r/481025 (https://phabricator.wikimedia.org/T209732)
[03:41:46] (PS1) Awight: [WIP] Nonsense copypaste to produce ORES data [analytics/refinery] - https://gerrit.wikimedia.org/r/482753
[03:47:26] Analytics, ORES, Patch-For-Review, Scoring-platform-team (Current): Wire ORES scoring events into Hadoop - https://phabricator.wikimedia.org/T209732 (awight) This is the idea, https://gerrit.wikimedia.org/r/#/c/analytics/refinery/+/481025/ Should I avoid creating a view, though? I don't see any...
[04:22:10] Analytics, ORES, Patch-For-Review, Scoring-platform-team (Current): Wire ORES scoring events into Hadoop - https://phabricator.wikimedia.org/T209732 (awight) >>! In T209732#4861475, @awight wrote: > Should I avoid creating a view, though? I don't see any precedent for it, although the access pat...
[04:26:00] (PS6) Awight: [WIP] Import ORES scores [analytics/refinery] - https://gerrit.wikimedia.org/r/481025 (https://phabricator.wikimedia.org/T209732)
[04:26:02] (PS2) Awight: [WIP] Nonsense copypaste to produce ORES data [analytics/refinery] - https://gerrit.wikimedia.org/r/482753
[07:18:27] morning!
[07:19:56] morniiiing
[07:22:16] hey Fran
[07:22:19] how's going?
[07:24:31] !log decommission analytics10[39-41] from Analytics Hadoop
[07:24:32] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[07:24:35] these are the last three people
[07:25:14] elukey, fdans : hola!
will be here for couple hours
[07:30:28] all right 3 nodes in decom process, those are the last ones
[07:31:26] Analytics, Analytics-Kanban, Patch-For-Review, User-Elukey: Decommission old Hadoop worker nodes and add newer ones - https://phabricator.wikimedia.org/T209929 (elukey)
[07:31:39] Analytics, Analytics-Kanban, Patch-For-Review, User-Elukey: Decommission old Hadoop worker nodes and add newer ones - https://phabricator.wikimedia.org/T209929 (elukey)
[07:32:41] PROBLEM - Hadoop NodeManager on analytics1040 is CRITICAL: PROCS CRITICAL: 0 processes with command name java, args org.apache.hadoop.yarn.server.nodemanager.NodeManager
[07:32:47] PROBLEM - Hadoop NodeManager on analytics1041 is CRITICAL: PROCS CRITICAL: 0 processes with command name java, args org.apache.hadoop.yarn.server.nodemanager.NodeManager
[07:33:16] good morning Luca, after coffee please remember to silence nodes
[07:33:17] sigh
[07:33:20] sorry for the noise
[07:33:23] PROBLEM - Hadoop NodeManager on analytics1039 is CRITICAL: PROCS CRITICAL: 0 processes with command name java, args org.apache.hadoop.yarn.server.nodemanager.NodeManager
[07:34:19] (silenced)
[07:36:12] (brb)
[07:39:40] fdans: did you guys had time to look at superset ?
[07:41:14] nuria: I was going to ping elukey now to maybe pair and deploy the fork
[07:41:25] fdans: k
[08:01:17] Analytics, Analytics-Wikistats: Wikistats New Feature - DB size - https://phabricator.wikimedia.org/T212763 (Nuria) @TheSandDoctor I think you want size of wikipedia for current revisions , no history and no talk pages and such, correct?. The best way to estimate that size that i can think of is using du...
[08:03:06] joal, elukey: hello, from what i see we did deploy cluster yesterday right?
[08:03:43] Analytics, Analytics-Kanban, Product-Analytics, Patch-For-Review: [BUG] userAgent missing from all EventLogging analytics Hive tables between 2018-11-29 and 2018-11-14 - https://phabricator.wikimedia.org/T211833 (Nuria) Open→Resolved
[08:03:56] Analytics, Analytics-EventLogging, EventBus, Core Platform Team Backlog (Watching / External), Services (watching): Modern Event Platform: Stream Intake Service: Implementation - https://phabricator.wikimedia.org/T206785 (Nuria)
[08:04:02] Analytics, Analytics-EventLogging, Analytics-Kanban, EventBus, and 4 others: Prototype in node intake service - https://phabricator.wikimedia.org/T206815 (Nuria) Open→Resolved
[08:06:13] Hi folks
[08:06:17] Analytics, Analytics-Kanban: Remove sessionId, pageId pairs from whitelist - https://phabricator.wikimedia.org/T205458 (Nuria)
[08:06:21] Analytics, Analytics-Kanban, Readers-Web-Backlog, Patch-For-Review: Print schema is whitelisting both session ids and page ids - https://phabricator.wikimedia.org/T209050 (Nuria) Open→Resolved
[08:06:33] nuria: marcel deployed yes - I'm currently double checking global status
[08:06:49] Analytics, Analytics-Kanban, Readers-Web-Backlog, Patch-For-Review: Print schema is whitelisting both session ids and page ids - https://phabricator.wikimedia.org/T209050 (Nuria) whitelist deployed
[08:09:28] !log manual stop of hdfs balancer to ease the under replicated blocks healing (worker nodes already decently balanced)
[08:09:29] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[08:09:39] Analytics, Analytics-Kanban: Remove sessionId, pageId pairs from whitelist - https://phabricator.wikimedia.org/T205458 (Nuria)
[08:09:42] Analytics, Patch-For-Review: ReadingDepth schema is whitelisting both session ids and page ids - https://phabricator.wikimedia.org/T209051 (Nuria) Open→Resolved
[08:10:54] Analytics, Analytics-Kanban: Remove
sessionId, pageId pairs from whitelist - https://phabricator.wikimedia.org/T205458 (Nuria)
[08:10:59] Analytics, Product-Analytics, Patch-For-Review: MobileWebSectionUsage schema is whitelisting both session ids and page ids - https://phabricator.wikimedia.org/T209049 (Nuria) Open→Resolved
[08:11:04] Analytics, Analytics-Kanban: Remove sessionId, pageId pairs from whitelist - https://phabricator.wikimedia.org/T205458 (Nuria) Open→Resolved
[08:12:11] Analytics, Analytics-Kanban, Patch-For-Review: Presto cluster online and usable with test data pushed from analytics prod infrastructure accessible by Cloud (labs) users - https://phabricator.wikimedia.org/T204951 (Nuria)
[08:12:14] Analytics, Analytics-Kanban, Patch-For-Review: Create .deb package for Presto - https://phabricator.wikimedia.org/T203115 (Nuria) Open→Resolved
[08:12:32] Analytics, Analytics-Kanban: Presto on Cloud Platform Design Document - https://phabricator.wikimedia.org/T208614 (Nuria) Open→Resolved
[08:12:34] Analytics, Analytics-Kanban, Patch-For-Review: Presto cluster online and usable with test data pushed from analytics prod infrastructure accessible by Cloud (labs) users - https://phabricator.wikimedia.org/T204951 (Nuria)
[08:13:24] elukey: HiiiiIIIIii - I'm trying to restart turnilo on analytics-tool1002 but it seems I have no right :(
[08:13:57] joal: you should have the powa!
[08:13:59] joal: i thought we had sudo , marcel has restarted it a bunch as of late
[08:14:12] we do, you guys can sudo directly
[08:14:12] joal: ya, you can be yourself no need to be hdfs right?
[08:14:16] that's what I thought as well !
[08:14:23] sudo systemctl restart turnilo
[08:14:49] elukey: Worked - I used service command :S
[08:14:55] * joal should learn to use the powab
[08:15:06] Thanks a lot elukey :)
[08:15:19] joal: added command to docs: https://wikitech.wikimedia.org/wiki/Analytics/Systems/Turnilo-Pivot#Administration
[08:15:28] Thanks a lot nuria
[08:15:57] ah yes that's my bad, you guys can only use systemctl with sudo
[08:16:00] and journalctl
[08:16:04] if you want I can add service
[08:16:07] elukey: sounds perfect :)
[08:16:32] elukey: I actually knew and tried systemctl but it didn't help me (no autocomplete), while service did :-P
[08:18:00] also added https://wikitech.wikimedia.org/wiki/Analytics/Systems/Turnilo-Pivot#Logs
[08:18:40] joal: it does auto-complete but only after you add the action (like restart)
[08:18:48] Ahhhh!
[08:18:54] it is really confusing if you are used with service I know
[08:19:00] * joal should learn even more than expected :(
[08:19:06] I stopped using service due to that, my brain was always mixing stuff
[08:19:09] :(
[08:19:14] yeah
[08:19:34] I actually never use service either - I just need to make sure the thing is wired correctly in my brain
[08:19:47] And since I use those command once every now and then, it's not that easy :)
[08:23:12] Analytics, Analytics-Kanban, User-Elukey: Create staging domain for turnilo to test config changes - https://phabricator.wikimedia.org/T212958 (Nuria) @Nuria will document after this moment of enlightenment about ssh -L
[08:26:27] joal: did you restarted yesterday the webrequest job in refinery?
[08:26:37] nuria: Marcel did
[08:27:00] nuria: I'm triple checking data before moving tasks to done
[08:29:53] and everything looks correct - There is a bunch of stuff we need to wait for validation (mediawiki-history changes, wikidata-editors, clickstream), but the rest seems ok (namely, wikitech in pageview and labs-IPs
[08:32:06] joal: mmmm but https://turnilo.wikimedia.org/#webrequest_sampled_128
[08:32:15] joal: does not have the is_pageview field
[08:32:32] nuria: old webrequest segments don't have the field, so it doesn't show up
[08:38:13] Analytics, Pageviews-API: Yearly endpoint for the /pageviews/top API - https://phabricator.wikimedia.org/T154381 (Nuria) We can get back to our pageviewAPI work after we make significant improvement on quality and addition of new tables in Data Lake, moving to priority normal for Q4. the earliest we could...
[08:38:49] Analytics, Pageviews-API: Yearly endpoint for the /pageviews/top API - https://phabricator.wikimedia.org/T154381 (Nuria) p:Normal→High
[08:45:04] Analytics: small bot activity marked as user in Manuel_de_Pedrolo page - https://phabricator.wikimedia.org/T213148 (Nuria)
[09:03:04] (PS2) Elukey: Bump to superset version 0.26.3-wikimedia1 [analytics/superset/deploy] - https://gerrit.wikimedia.org/r/481056 (owner: Ottomata)
[09:03:53] fdans: --^
[09:04:04] Andrew's patch needed a manual rebase
[09:04:21] I expected less dependency changes, but the ones that were affecting us seems not touched
[09:04:37] we are now pulling directly from the requirements.txt file with Andrew's build code
[09:04:51] elukey: looking
[09:05:01] so in theory we should be able to deploy this in labs, test if it is good and do the same in prod
[09:08:50] Analytics, Analytics-Kanban, User-Elukey: Create staging domain for turnilo to test config changes - https://phabricator.wikimedia.org/T212958 (Nuria) Documented process, will tune as I test today but closing ticket:
https://wikitech.wikimedia.org/wiki/Analytics/Systems/Turnilo-Pivot#Test_config_changes
[09:08:58] Analytics, Analytics-Kanban, User-Elukey: Create staging domain for turnilo to test config changes - https://phabricator.wikimedia.org/T212958 (Nuria) Open→Resolved
[09:09:00] Analytics, Analytics-Kanban, Patch-For-Review: unique devices monthly should be configured with default "monthly" granularity in turnilo - https://phabricator.wikimedia.org/T209103 (Nuria)
[09:10:12] (CR) Fdans: [C: +1] Bump to superset version 0.26.3-wikimedia1 [analytics/superset/deploy] - https://gerrit.wikimedia.org/r/481056 (owner: Ottomata)
[09:11:08] elukey: let's do it!
[09:20:04] fdans: ah snap I don't have anymore a turnilo instance in labs, creating it
[09:20:23] elukey: turnilo?
[09:21:34] yeah sorry brain fault, superset, it is there
[09:21:36] nevermind
[09:21:57] :)
[09:23:46] so fdans, in theory we should simply deploy without touching the db
[09:23:53] since this is the same version
[09:24:07] uuu sounds cool
[09:27:14] fdans: ssh -L 9080:superset.eqiad.wmflabs:9080 superset.eqiad.wmflabs
[09:31:21] elukey: testing
[09:32:49] elukey: hmmm csv upload doesn't work, I guess the tmp folder setting is not set?
[09:35:06] yeah lemme fix ut
[09:35:07] *it
[09:36:06] fdans: try now
[09:37:11] elukey: (_mysql_exceptions.InternalError) (3, 'Error writing file \'./superset/testcsv.frm\' (Errcode: 28 "No space left on device")') [SQL: '\nCREATE TABLE testcsv (\n\tcountry TEXT, \n\tviews TEXT, \n\t`rank` BIGINT, \n\toldviews TEXT, \n\tmin FLOAT(53), \n\tmax FLOAT(53)\n)\n\n'] (Background on this error at: http://sqlalche.me/e/2j85)
[09:37:20] lolz
[09:37:53] that is the coordinator
[09:38:48] I am trying to fix it
[09:38:54] Quarry: How to STOP running a Query? - https://phabricator.wikimedia.org/T213152 (Taher2000)
[09:39:57] fdans: should be good now
[09:41:20] Quarry: How to STOP running a Query?
- https://phabricator.wikimedia.org/T213152 (zhuyifei1999) What is the use case for stopping the query? The query being running doesn’t prevent you from editing the query and submitting it again.
[09:50:26] elukey: no errors found with filter box
[09:53:24] fdans: is it the bug that we are trying to fix right? (don't remember the exact details)
[09:53:47] O WAIT NO
[09:54:02] checked the wrong thing, sorry, I've only had 2 coffees
[09:55:24] elukey: na periodicity pivot is not fixed
[09:55:38] it's still saying 'NoneType' object is not iterable
[09:55:58] elukey: version should say "0.26.3"?
[09:56:11] yeah
[09:56:45] ugh
[09:56:55] what do you mean?
[09:57:06] we didn't change that, it is still 26.3
[09:57:16] so the fix is the correct one in https://github.com/wikimedia/incubator-superset/commits/wikimedia right?
[09:57:34] is there a quick way to check in the js code if the fix got deployed or not?
[09:57:40] elukey: yeah just making sure that it wouldn't say "0.6.3-wikimedia" or something like that
[09:57:41] because I think that we might have missed it
[09:57:54] elukey: this is on the python side though
[09:57:59] ah okok
[10:00:36] well we can easily see inside the wheel
[10:01:39] ah wait
[10:01:40] superset-0.26.3-py3-none-any.whl superset-0.26.3_wikimedia1-py3-none-any.whl
[10:01:44] this might be the issue
[10:02:26] we are adding a new superset artifact and not removing the other one
[10:02:54] elukey ooo
[10:03:02] * fdans sees a glimmer of hope
[10:03:24] updating the cr and re-attempting the deploy
[10:07:49] (PS3) Elukey: Bump to superset version 0.26.3-wikimedia1 [analytics/superset/deploy] - https://gerrit.wikimedia.org/r/481056 (owner: Ottomata)
[10:09:08] yeah I just realized that there are multiple deps that are duplicated by my rebase
[10:10:54] anyway let's see if this fixes it
[10:11:40] Collecting superset==0.26.3 (from -r /srv/deployment/analytics/superset/deploy/frozen-requirements.txt (line 1)) Could not find a version that
satisfies the requirement superset==0.26.3 (from -r /srv/deployment/analytics/superset/deploy/frozen-requirements.txt (line 1)) (from versions: 0.26.3-wikimedia1)
[10:11:45] No matching distribution found for superset==0.26.3 (from -r /srv/deployment/analytics/superset/deploy/frozen-requirements.txt (line 1))
[10:11:56] no bueno
[10:13:27] fdans: you might be right about the version number, this could explain even what we had before
[10:13:30] mmmm
[10:14:01] trying a manual fix
[10:14:05] let's see if it works
[10:14:30] yeah better
[10:14:31] Could not find a version that satisfies the requirement cryptography==2.4.2 (from superset==0.26.3-wikimedia1->-r /srv/deployment/analytics/superset/deploy/frozen-requirements.txt (line 1)) (from versions: 2.3.1)
[10:17:33] and we have cryptography-2.3.1-cp34-abi3-manylinux1_x86_64.whl
[10:18:32] and patchset 1 doesn't contain it
[10:18:50] https://github.com/wikimedia/incubator-superset/commit/3585175c165bcb4c87e85cc5bffadb88f8ec062a
[10:20:55] I am going to update the task with the info and ask Andrew later on
[10:21:00] is it ok fdans ?
[10:34:02] Analytics, Analytics-Kanban, Patch-For-Review: Upgrade Superset to 0.28.1 - https://phabricator.wikimedia.org/T211605 (elukey) So I tried it today and I have a couple of notes: * my revert to 0.28.3 and https://gerrit.wikimedia.org/r/#/c/analytics/superset/deploy/+/481056/ were preventing a clean re...
[10:47:52] elukey: yes, sorry, I had to run out for a lil bit
[11:33:18] Analytics, Analytics-Kanban, Patch-For-Review: Update html language for per-domain uniques - https://phabricator.wikimedia.org/T168477 (fdans) All docs I could find had been updated in a previous task. I just added a change to update the dumps page to summarize uniques more accurately
[11:35:04] elukey: in what host in labs is superset deployed?
[11:39:18] superset.eqiad.wmflabs
[11:42:44] going afk for lunch + errand :)
[11:42:57] * elukey lunch!
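A small sketch related to the duplicate-artifact problem spotted at [10:01:40] and [10:02:26] above: listing which packages have more than one wheel in an artifacts directory. This is only an illustration, not part of the deploy tooling; the function name `duplicate_wheels` and the directory path in the usage comment are hypothetical. It relies on the wheel file naming convention, where the distribution name is everything before the first hyphen.

```shell
#!/bin/sh
# Sketch only: duplicate_wheels is a hypothetical helper, not part of any deploy repo.
# Prints each distribution name that has more than one .whl file in the given directory.
duplicate_wheels() {
    for f in "$1"/*.whl; do
        [ -e "$f" ] || continue          # directory has no wheels at all
        basename "$f" | sed 's/-.*//'    # keep the name part before the first hyphen
    done | sort | uniq -d                # uniq -d prints only repeated names
}

# Usage (the path is an assumption, adjust to wherever the wheels live):
#   duplicate_wheels ./artifacts/wheels
```

With the two superset wheels from the log side by side, this would print `superset`, which is the situation that had to be cleaned up before the redeploy.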
[11:43:32] (nuria: not sure if superset in labs now work since the last deployments didn't succeed)
[12:00:54] Analytics, Pageviews-API, wikitech.wikimedia.org, Patch-For-Review: wikitech.wikimedia.org missing from pageviews API - https://phabricator.wikimedia.org/T153821 (Nuria) Wikitech pageviews are now available as of yesterday, closing: https://wikimedia.org/api/rest_v1/metrics/pageviews/aggregate/wi...
[12:01:06] Analytics, Pageviews-API, wikitech.wikimedia.org, Patch-For-Review: wikitech.wikimedia.org missing from pageviews API - https://phabricator.wikimedia.org/T153821 (Nuria) Open→Resolved
[12:01:24] Analytics, Analytics-Kanban, Patch-For-Review: Update IP addresses of cloud labs to mark internal traffic on refinery code - https://phabricator.wikimedia.org/T212862 (Nuria) Open→Resolved
[12:07:59] Quarry: How to STOP running a Query? - https://phabricator.wikimedia.org/T213152 (Aklapper) Open→Invalid a:ASammour→None Hi @Taher2000. This does not sound like something is wrong in the code base (a so-called "software bug"), but instead like a support request (how to change settings, questi...
[12:28:05] Analytics, Research, WMDE-Analytics-Engineering, User-Addshore, User-Elukey: Provide tools for querying MediaWiki replica databases without having to specify the shard - https://phabricator.wikimedia.org/T212386 (jcrespo) There is already an 'sql' tool that developers that query production us...
[12:34:34] Analytics, Research, WMDE-Analytics-Engineering, User-Addshore, User-Elukey: Provide tools for querying MediaWiki replica databases without having to specify the shard - https://phabricator.wikimedia.org/T212386 (jcrespo) > the best way to accomplish this would probably be a library I would...
[12:37:48] Analytics, Analytics-EventLogging, EventBus, Core Platform Team Backlog (Watching / External), Services (watching): Modern Event Platform: Stream Intake Service: Implementation - https://phabricator.wikimedia.org/T206785 (charlotteportero)
[12:37:53] Analytics, Analytics-EventLogging, EventBus, Security-Team, and 3 others: T206785: Modern Event Platform: Stream Intake Service: AJV usage security review - https://phabricator.wikimedia.org/T208251 (charlotteportero) Open→Declined For Security work, we need all the information included i...
[12:38:45] hey team!
[13:11:57] Quarry: How to STOP running a Query? - https://phabricator.wikimedia.org/T213152 (alanajjar) @Taher2000 if you still facing this issue, please send to me [[https://ar.wikipedia.org/wiki/%D8%AE%D8%A7%D8%B5:%D9%85%D8%B1%D8%A7%D8%B3%D9%84%D8%A9_%D8%A7%D9%84%D9%85%D8%B3%D8%AA%D8%AE%D8%AF%D9%85/%D8%B9%D9%84%D8%A7...
[13:32:52] Analytics, Analytics-Data-Quality, Growth-Team, Product-Analytics, Patch-For-Review: Add EditAttemptStep properties to the schema whitelist - https://phabricator.wikimedia.org/T208332 (mforns) @Neil_P._Quinn_WMF looking into this right now.
[13:39:42] Analytics, Analytics-Kanban, DBA, Data-Services, and 3 others: Create materialized views on Wiki Replica hosts for better query performance - https://phabricator.wikimedia.org/T210693 (Banyek) @Bstorm so, what do you think, should I drop these?
[14:04:31] hey all
[14:05:01] hi fdans
[14:05:08] hellooo
[14:06:02] milimetric: pair on data quality?
[14:06:28] fdans: I was just about to say that but now I have to go to the bathroom, few minutes and then cave?
[14:06:35] yep!
[14:15:13] ok fdans, ready
[14:19:43] Analytics, Analytics-EventLogging, EventBus, Security-Team, and 3 others: T206785: Modern Event Platform: Stream Intake Service: AJV usage security review - https://phabricator.wikimedia.org/T208251 (Ottomata) @charlotteportero I don't think any of us knew there was a security review form. Can y...
[14:21:48] (PS4) Ottomata: Bump to superset version 0.26.3-wikimedia1 [analytics/superset/deploy] - https://gerrit.wikimedia.org/r/481056
[14:22:07] Analytics, Analytics-Kanban, Patch-For-Review: Upgrade Superset to 0.28.1 - https://phabricator.wikimedia.org/T211605 (Ottomata) Hm, why was '0.26.3' in frozen-requirements.txt if you cherry picked it? It should be pointing at the github fork link. https://gerrit.wikimedia.org/r/#/c/analytics/superse...
[14:24:55] ottomata: o/
[14:24:58] did you submit https://gerrit.wikimedia.org/r/#/c/analytics/superset/deploy/+/481054/ ?
[14:25:35] ah wait I should have used all the chain of code commits
[14:25:48] uuuuuu sorry I missed it, my bad
[14:25:50] now I get it
[14:26:25] I cherry picked the last one
[14:26:33] this is why it wasn't working
[14:26:34] okok
[14:27:18] so my last rebase is not ok, we can restart from patchset 1
[14:28:51] https://phabricator.wikimedia.org/T211605#4862552
[14:29:07] well it looks like my patchset one also doesn't have cryptography 2.4.2
[14:29:32] elukey: i gotta go to dentist, back later!
[14:30:06] sure sure
[14:30:16] fdans: ---^
[14:36:13] Analytics, Analytics-EventLogging, EventBus, Security-Team, and 3 others: T206785: Modern Event Platform: Stream Intake Service: AJV usage security review - https://phabricator.wikimedia.org/T208251 (sbassett) Hey @Ottomata - Here's [[ https://www.mediawiki.org/wiki/Wikimedia_Security_Team/Secur...
[14:39:25] I am going to recheck superset later on, need to take care of other changes now :(
[14:56:31] elukey:
[14:56:42] sorry ...
elukey: https://gerrit.wikimedia.org/r/c/operations/puppet/+/482816
[14:56:48] for when you have a minute :)
[14:57:11] elukey: I'm gone for kids, will be back for standup - We can do the usual procedure at that time if you want
[14:57:24] joal: sure!
[14:57:42] in the meantime I'll deploy the patch
[15:19:46] nuria: ready to merge with the puppet patch or still wip? (saw it passing in the ops chan)
[15:20:14] elukey: ready i think, i tested everything but had to transfer settings manually as i cannot copy files directly
[15:20:22] elukey: monthly datasets make a lot more sense now
[15:20:42] nuria: what do you mean with "cannot copy files directly" ?
[15:23:56] elukey: that i tested all setting on the config file that is passed to turnilo but that is not a puppet template ..let me explain:
[15:23:59] elukey: /usr/bin/nodejs /srv/deployment/analytics/turnilo/deploy/node_modules/.bin/turnilo --config config.yaml
[15:24:25] elukey: this is the 'start turnilo command' which i executed with a config file with new settings
[15:24:49] elukey: i did not modified the erb file directly
[15:24:57] elukey: ahem... makes sense?
[15:25:36] nuria: so you mean /etc/turnilo/config.yaml? I am asking just to understand if anything can be done to ease testing :)
[15:26:31] elukey: right , that file comes from executing puppet from the erb template , i did not touched it. i just copied the file to my homedir to analytics-tool1002 and used it to instantiate other turnilo
[15:26:35] ah or you mean that you don't exactly know if the erb template will render as the config.yaml that you tested?
[15:26:57] elukey: right, it should be fine, i just had to cut and paste all my changes
[15:27:07] elukey: now makes more sense?
[15:28:27] nuria: sure!
I just ran the puppet compiler https://puppet-compiler.wmflabs.org/compiler1002/14214/analytics-tool1002.eqiad.wmnet/
[15:29:03] elukey: ok, i think is good to go
[15:30:03] elukey: will run puppet and test UI
[15:30:19] elukey: will run puppet/restart and test UI that is
[15:30:47] I am running puppet now, you shouldn't be able to on analytics-tool1002 (sudo rules are only for some systemctl commands)
[15:31:09] config.yaml updated, you are free to restart turnilo and test :)
[15:41:20] Analytics, Analytics-Data-Quality, Growth-Team, Product-Analytics, Patch-For-Review: Add EditAttemptStep properties to the schema whitelist - https://phabricator.wikimedia.org/T208332 (mforns) @Neil_P._Quinn_WMF The white-list patch above (merged on Dec 11th) is perfectly fine. The problem...
[15:52:48] elukey, I have 2 micro puppet patches that need merge to complete yesterday's deployment, can you please have a look? If you need clarification we can meet in da cave?
[15:52:58] https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/482826/
[15:53:02] https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/482727/
[15:54:58] hey i'm here and will be at standupwow!
[15:56:35] Analytics, Analytics-EventLogging, EventBus, Core Platform Team Backlog (Watching / External), Services (watching): Modern Event Platform: Stream Intake Service: Implementation - https://phabricator.wikimedia.org/T206785 (Ottomata)
[15:56:38] Analytics, Analytics-EventLogging, EventBus, Security-Team, and 3 others: T206785: Modern Event Platform: Stream Intake Service: AJV usage security review - https://phabricator.wikimedia.org/T208251 (Ottomata) Declined→Open
[15:56:58] Analytics, Analytics-EventLogging, EventBus, Security-Team, and 3 others: Modern Event Platform: Stream Intake Service: AJV usage security review - https://phabricator.wikimedia.org/T208251 (Ottomata)
[15:57:47] Analytics, Analytics-EventLogging, EventBus, Security-Team, and 3 others: Modern Event Platform: Stream Intake Service: AJV usage security review - https://phabricator.wikimedia.org/T208251 (Ottomata) Ok thanks @sbasset. I've brought the form template over to this task and filled it out. @charlo...
[16:07:19] Analytics, Analytics-Kanban, DBA, Data-Services, and 3 others: Create materialized views on Wiki Replica hosts for better query performance - https://phabricator.wikimedia.org/T210693 (Bstorm) Well, my end can't use 'em very effectively. It's all whether we are going to build it around analytics...
[16:11:15] Analytics, Analytics-Kanban, Patch-For-Review, User-Elukey: JVM pauses cause Yarn master to failover - https://phabricator.wikimedia.org/T206943 (elukey) We bumped the Xmx/Xms settings of the HDFS namenode to 12G (was 8G) for unrelated changes and I haven't seen any more pauses since then. The in...
[16:17:44] Analytics, Analytics-EventLogging, Analytics-Kanban, EventBus, and 2 others: Modern Event Platform: Stream Intake Service: Implementation: Deployment Pipeline - https://phabricator.wikimedia.org/T211247 (Ottomata)
[16:46:31] Analytics, Analytics-Kanban, DBA, Data-Services, and 3 others: Create materialized views on Wiki Replica hosts for better query performance - https://phabricator.wikimedia.org/T210693 (Marostegui) +1 to drop them if @Milimetric doesn't need them
[17:11:45] Analytics, Operations, ops-eqiad: Rack A2's hosts alarm for PSU broken - https://phabricator.wikimedia.org/T212861 (RobH) So I asked for an update for the quote on T210776 and nothing yet. Dell acknowledges they received and are working on it. If we do not have a quote back today, I'd recommend s...
[17:14:20] joal: aqs1004 depooled and ready for testing whenever you are
[17:18:31] thanks elukey - testing NOW :)
[17:22:14] elukey: validated !
[17:22:18] Good to go
[17:22:19] goooooo
[17:26:03] Thanks elukey :)
[17:26:08] Gone for dinner, back after
[17:27:04] joal: aqs restarted!
[17:39:45] Analytics, Analytics-Kanban, Patch-For-Review: unique devices monthly should be configured with default "monthly" granularity in turnilo - https://phabricator.wikimedia.org/T209103 (Nuria) ping @JKatzWMF take a look at http://turnilo.wikimedia.org , monthly datasets display now with monthly defaults...
[17:43:20] milimetric: can you please answer this ticket so they can drop those views? https://phabricator.wikimedia.org/T210693
[17:46:33] Analytics, Analytics-Kanban, DBA, Data-Services, and 3 others: Create materialized views on Wiki Replica hosts for better query performance - https://phabricator.wikimedia.org/T210693 (Milimetric) +1 to drop, our other solution is feasible and we're going that way for the foreseeable future. Hap...
[17:59:19] Analytics, Analytics-EventLogging, Analytics-Kanban, EventBus, and 3 others: Modern Event Platform: Stream Intake Service: Implementation: Deployment Pipeline - https://phabricator.wikimedia.org/T211247 (Ottomata) @Pchelolo, hmmmm. eventgate in prod will need to have the event-schemas repo(s) av...
[18:08:49] * elukey off!
[18:16:37] Is there anything I can read about smoke-testing Oozie jobs in a dev sandbox?
[18:17:11] I was hoping to make a limited sample of real tables I plan to work with, copy these into a personal database in Hadoop, and test my scripts there...
[18:17:51] It seems I can just run the HQL from the commandline, with -d arguments, but I'm not sure quite how to do that with the Oozie .xml
[18:19:08] yeah you can do that awight
[18:19:15] I'd love to write tests but haven't found any examples of that, yet.
[18:19:18] that's why we try to parameterize a lot of stuff via the .properties files
[18:19:20] tests dunno.
[18:19:27] but running in dev mode ya
[18:19:49] so you need to upload all the .xml files as they somewhere into hadoop, probably in your hdfs user dir
[18:20:15] ok
[18:20:21] then, if you have things like oozie_directory and e.g. destination_table parameterized in your properties file
[18:20:39] you can oozie job -submit them and override the relevant properties on the CLI
[18:20:40] e.g.
[18:21:25] I'm looking at example invocations in http://oozie.apache.org/docs/5.1.0/DG_Examples.html, what would the oozie URL be for analytics servers?
[18:21:28] oozie job -Duser=$USER -Doozie_directory=/user/awight/oozie -Ddest_table=awight.table_name -submit -config ./oozie/job/coordinator.properties (or whatever)
[18:21:31] oh
[18:21:39] it is in an env var, you normally don't have to provide it
[18:21:40] if you do
[18:21:42] its $OOZIE_URL
[18:21:43] you can do
[18:21:46] ok awesome
[18:21:46] -oozie $OOZIE_URL
[18:22:25] awight: some (possible a bit old, dunno) docs here https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Oozie#Running_a_real_oozie_example
[18:22:38] Hi awight - I was suggesting that one as well --^
[18:22:57] perfect, thanks ottomata and joal!
[18:23:29] awight: An important point is to put the whole oozie folder onto HDFS (wherever you prefer, I use /user/joal/oozie for instance)
[18:23:54] It's important because of cross-referencing datasets-files between folders
[18:24:01] I'm taking another pass at the ORES jobs, conforming to the new design discussed in the task... will have more review soon :)
[18:24:14] Great awight :)
[18:24:16] joal: Okay, noted. That does seem like a gotcha that would have bitten me ;-)
[18:25:31] one more thing. I'm building one job which imports from event.revision_score as the datasets become available. That one seems close to other existing jobs, so I can copypasta.
[18:25:43] The second job however is meant to run monthly, for each wiki.
[18:25:58] That seems trickier, and I'm not sure where to find a precedent.
[18:26:33] The monthly job builds a re-denormalized table of the scores, intended to be directly dumped for public use.
[18:26:34] awight: monthly jobs exist - However we try not to have per-wiki jobs - Too many wikis !
[18:26:57] awight: We would build a single monthly job working all wikis
[18:27:18] kk, the monthly job will easily work for all wikis at once. However, I think the dump itself should still be per-wiki
[18:28:30] awight: Needs to be handled through a script I think - IMO you'll easily build dumps by wikis in single files.
Then it'll be about moving those files to the archive folders
[18:29:22] very nice. I won't be touching the actual dump mechanism for now, FYI. Just building the denormalized data set in preparation for dumps.
[18:30:31] awight: things to keep in mind about that dataset: partitioning into folders (hadoop doesn't do by-file) and use a single reducer at the end (or you'll end up with plenty of files)
[18:31:49] something else awight: It'll be interesting to see how much data gets worked by the regular data-transforming jobs
[18:32:16] awight: if data is very small, doing it at a larger timespan could be better (for instance daily instead of hourly)
[18:32:45] I don't understand the note about partitioning, but in case it answers the question, I'm planning for the denormalized data to be partitioned by monthly snapshot dates: ".../snapshot=2019-01/..."
[18:33:18] Okay noted about the first (normalizing) job, I'll keep that to a daily cadence.
[18:33:32] awight: I was thinking about the by-wiki output
[18:33:59] generating by-wiki partitioning can be done at the monthly-dataset layer
[18:34:13] Or at a temporary layer after the monthly job
[18:35:11] joal: That makes sense, thanks!
[18:35:57] On that note, I had briefly considered using a view to do the denormalized queries, but that doesn't seem like the preferred paradigm, so I'm writing to a table.
[18:38:30] I think it's better awight - We've not used views in hive yet, and the use-case seems to be schema-facilitation (reduce the number of columns)
[18:42:27] great, nice to have that question settled.
[18:57:02] Analytics, ORES, Patch-For-Review, Scoring-platform-team (Current): Wire ORES scoring events into Hadoop - https://phabricator.wikimedia.org/T209732 (awight)
[19:18:38] Tech news before leaving - https://globenewswire.com/news-release/2019/01/08/1681851/0/en/The-Apache-Software-Foundation-Announces-Apache-Airflow-as-a-Top-Level-Project.html
[19:18:44] See you tomorrow team
[19:29:28] nice!
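To make the partitioning note above concrete: partitions are folders, so a monthly snapshot and an optional by-wiki layer both show up as nested `key=value` directories. A minimal sketch, assuming a hypothetical base directory and a hypothetical `wiki_db` partition key (only the `snapshot=2019-01` form comes from the conversation):

```python
from datetime import date


def snapshot_partition(base_dir, month, wiki_db=None):
    """Build a folder path for a monthly snapshot partition.

    month is any date within the target month; only year and month are used.
    wiki_db, if given, adds a nested by-wiki partition folder
    (the key name `wiki_db` is an assumption for illustration).
    """
    parts = [base_dir.rstrip("/"), f"snapshot={month:%Y-%m}"]
    if wiki_db is not None:
        parts.append(f"wiki_db={wiki_db}")
    return "/".join(parts)


# Hypothetical base directory, for illustration only.
monthly = snapshot_partition("/user/someuser/ores_scores", date(2019, 1, 1))
per_wiki = snapshot_partition("/user/someuser/ores_scores", date(2019, 1, 1), "enwiki")

print(monthly)
print(per_wiki)
```

With a single reducer per partition, each such folder would hold one output file, which is what makes the later per-wiki dump a matter of moving files.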
[19:43:02] Analytics, Analytics-EventLogging, Analytics-Kanban, EventBus, and 3 others: Modern Event Platform: Stream Intake Service: Implementation: Deployment Pipeline - https://phabricator.wikimedia.org/T211247 (Pchelolo) We could consider using https://kubernetes.io/docs/concepts/storage/volumes/ for th...
[20:01:32] Analytics, Analytics-Dashiki: Improve Dashiki defaults for Browser selection - https://phabricator.wikimedia.org/T213215 (Quiddity)
[20:04:08] Analytics, Analytics-Dashiki: Improve Dashiki defaults for Browser selection - https://phabricator.wikimedia.org/T213215 (Milimetric) Indeed a bug, I think it's trying to show top 5. Taking a look now.
[20:13:53] Analytics, Analytics-Kanban, DBA, Data-Services, and 3 others: Create materialized views on Wiki Replica hosts for better query performance - https://phabricator.wikimedia.org/T210693 (Banyek) Ok, then we all agree, I'll drop those tables tomorrow in the morning
[20:37:29] Analytics, Analytics-Kanban, DBA, User-Elukey: Review dbstore1002's non-wiki databases and decide which ones needs to be migrated to the new multi instance setup - https://phabricator.wikimedia.org/T212487 (leila) >>! In T212487#4851201, @elukey wrote: >>>! In T212487#4839932, @elukey wrote: >>...
[20:41:56] (PS1) Milimetric: Handle null values when sorting [analytics/dashiki] - https://gerrit.wikimedia.org/r/482887 (https://phabricator.wikimedia.org/T213215)
[20:42:07] (CR) Milimetric: [V: +2 C: +2] Handle null values when sorting [analytics/dashiki] - https://gerrit.wikimedia.org/r/482887 (https://phabricator.wikimedia.org/T213215) (owner: Milimetric)
[20:48:29] (PS1) Milimetric: Updating browsers dashboard [analytics/analytics.wikimedia.org] - https://gerrit.wikimedia.org/r/482889
[20:48:42] (CR) Milimetric: [V: +2 C: +2] Updating browsers dashboard [analytics/analytics.wikimedia.org] - https://gerrit.wikimedia.org/r/482889 (owner: Milimetric)
[20:50:58] Analytics, Analytics-Dashiki, Analytics-Kanban, Patch-For-Review: Improve Dashiki defaults for Browser selection - https://phabricator.wikimedia.org/T213215 (Milimetric) p:Triage→High a:Milimetric
[20:54:13] Analytics, Analytics-Kanban: Reportupdater queries jobs failing - https://phabricator.wikimedia.org/T213219 (Milimetric)
[20:54:15] Analytics, Analytics-Kanban: Reportupdater queries jobs failing - https://phabricator.wikimedia.org/T213219 (Milimetric) p:Triage→High
[20:54:17] Analytics, Analytics-Dashiki, Analytics-Kanban, Patch-For-Review: Improve Dashiki defaults for Browser selection - https://phabricator.wikimedia.org/T213215 (Milimetric) ok, done and deployed, will be refreshed when puppet runs and pulls the change. As I was doing this I realized another bug, th...
[20:58:38] Analytics, Analytics-Kanban: Reportupdater queries jobs failing - https://phabricator.wikimedia.org/T213219 (Milimetric) turns out the .reportupdater.pid file was just an empty file written on Nov 6, 2019!!! Which... seems not possible. And was messing up all reportupdater-queries runs for the last few...
[21:00:34] Analytics, Analytics-Kanban: Reportupdater queries jobs failing - https://phabricator.wikimedia.org/T213219 (Milimetric)
[21:01:29] Analytics, Analytics-Kanban: Reportupdater queries jobs failing - https://phabricator.wikimedia.org/T213219 (Milimetric) Update: executing fine now, will move this to done tentatively, and bring it back if anything else goes wrong.
[21:22:39] Analytics, Analytics-EventLogging, Analytics-Kanban, MediaWiki-Vagrant: How to use Wikipedia EventLogging schemas in Vagrant setup? - https://phabricator.wikimedia.org/T153641 (Milimetric) > * As per Step 4 in [[ https://www.mediawiki.org/wiki/Extension:EventLogging/Guide#Debugging | installation...
[21:33:28] Analytics, MediaWiki-API: API Analytics - page views by country - https://phabricator.wikimedia.org/T213221 (Green_Cardamom)
[21:41:37] Analytics, Analytics-Dashiki, Analytics-Kanban, Patch-For-Review: Improve Dashiki defaults for Browser selection - https://phabricator.wikimedia.org/T213215 (Quiddity) Thank you! (yeah, I saw the empty sections, but didn't know how to clearly describe it, and I figured you'd see it :)
[21:49:31] Analytics, Analytics-EventLogging, Analytics-Kanban, EventBus, and 3 others: Modern Event Platform: Stream Intake Service: Implementation: Deployment Pipeline - https://phabricator.wikimedia.org/T211247 (Ottomata) @akosiaris In [[ https://wikitech.wikimedia.org/wiki/User:Alexandros_Kosiaris/Bench...
[21:51:28] Analytics, Analytics-EventLogging, Analytics-Kanban, EventBus, and 3 others: Modern Event Platform: Stream Intake Service: Implementation: Deployment Pipeline - https://phabricator.wikimedia.org/T211247 (Ottomata) OH! Nevermind I see, that isn't an instruction...but a summary of what we are doin...
[22:11:34] Analytics, Analytics-Dashiki, Analytics-Kanban, Patch-For-Review: Improve Dashiki defaults for Browser selection - https://phabricator.wikimedia.org/T213215 (Milimetric) "everything is bad" works in this case :) Good news is the jobs are fine now. It'll take a while to update everything but one...
[22:49:04] Analytics, Analytics-EventLogging, Analytics-Kanban, EventBus, and 3 others: Modern Event Platform: Stream Intake Service: Implementation: Deployment Pipeline - https://phabricator.wikimedia.org/T211247 (Ottomata) Ok, I'm pretty close. I've got the charts deployed in minikube via helm. It seems...
[22:50:48] (PS7) Awight: [WIP] Schema for ORES scores [analytics/refinery] - https://gerrit.wikimedia.org/r/481025 (https://phabricator.wikimedia.org/T209732)
[22:50:50] (PS3) Awight: [WIP] Oozie jobs to produce ORES data [analytics/refinery] - https://gerrit.wikimedia.org/r/482753
[23:48:22] Analytics, Analytics-EventLogging, Analytics-Kanban, EventBus, and 3 others: Modern Event Platform: Stream Intake Service: Implementation: Deployment Pipeline - https://phabricator.wikimedia.org/T211247 (Ottomata) I think kubectl describe pod is the most helpful. I'm onto something great here!