[00:08:26] Analytics, EventBus, Core Platform Team Backlog (Watching / External), Patch-For-Review, Services (later): Modern Event Platform: Stream Intake Service: Migrate change-prop events to new (EventGate) style schemas - https://phabricator.wikimedia.org/T226522 (Ottomata) After our meeting today w...
[07:04:08] Just re-installed ROCm 2.6 on stat1005, this time tensorflow seems to work
[07:04:27] a bit weird but... better than hours of debugging :D
[07:34:26] Analytics, Operations, Research-management, Patch-For-Review, User-Elukey: Remove computational bottlenecks in stats machine via adding a GPU that can be used to train ML models - https://phabricator.wikimedia.org/T148843 (elukey) Restarted from a clean state as indicated by upstream, and ten...
[08:08:47] Analytics, Analytics-Cluster, Analytics-Kanban, Patch-For-Review: Beeline does not print full stack traces when a query fails - https://phabricator.wikimedia.org/T136858 (elukey) Open→Resolved
[08:08:50] Analytics, Analytics-Kanban, Patch-For-Review, User-Elukey: Allow all Analytics tools to work with Kerberos auth - https://phabricator.wikimedia.org/T226698 (elukey)
[09:21:30] Analytics: User knissen can't access Superset - https://phabricator.wikimedia.org/T226431 (elukey) Open→Resolved Closing this task in favor of https://phabricator.wikimedia.org/T224159
[09:45:35] Analytics, Better Use Of Data, Reading-Infrastructure-Team-Backlog, Epic: Client side error logging production launch - https://phabricator.wikimedia.org/T226986 (fgiunchedi)
[09:53:07] Analytics, Analytics-Dashiki, Analytics-Kanban: Pie charts not showing on "User Agent Breakdowns" dashboard - https://phabricator.wikimedia.org/T228187 (Esanders) Great, thanks!
[10:02:26] Analytics, Better Use Of Data, Reading-Infrastructure-Team-Backlog, Epic: Client side error logging production launch - https://phabricator.wikimedia.org/T226986 (Tgr) "Schema validation for eventgate events" - does that only mean adding to mediawiki/event-schemas? If so, I can do that. "Choose...
[10:09:48] (PS2) Fdans: Add UDF to get wiki project from referer string [analytics/refinery/source] - https://gerrit.wikimedia.org/r/523903 (https://phabricator.wikimedia.org/T228151)
[10:13:15] (CR) jenkins-bot: [V: -1] Add UDF to get wiki project from referer string [analytics/refinery/source] - https://gerrit.wikimedia.org/r/523903 (https://phabricator.wikimedia.org/T228151) (owner: Fdans)
[10:27:34] Analytics: Check home leftovers of cwdent - https://phabricator.wikimedia.org/T226404 (elukey) Open→Resolved a:elukey Done!
[10:28:17] Analytics, Research: Check home leftovers of ISI researchers - https://phabricator.wikimedia.org/T215775 (elukey) @Isaac anything needed from my side or are you good for data review? (Sorry I want to make sure that you are not waiting on me). If so, gentle ping :)
[10:30:02] Analytics, Analytics-Kanban, Operations, Traffic, and 2 others: TLS certificates for Analytics origin servers - https://phabricator.wikimedia.org/T227860 (elukey)
[10:32:58] (PS3) Fdans: Add UDF to get wiki project from referer string [analytics/refinery/source] - https://gerrit.wikimedia.org/r/523903 (https://phabricator.wikimedia.org/T228151)
[13:06:41] Analytics, Research: Check home leftovers of ISI researchers - https://phabricator.wikimedia.org/T215775 (Isaac) @elukey all good thanks -- there had been a hold-up while we were trying to figure out which files were the ones that were being proposed to release publicly, but that's been sorted out now so...
[13:17:27] Analytics, Better Use Of Data, Reading-Infrastructure-Team-Backlog, Epic: Client side error logging production launch - https://phabricator.wikimedia.org/T226986 (Ottomata) > "Schema validation for eventgate events" - does that only mean adding to mediawiki/event-schemas? If so, I can do that. Th...
[13:17:47] Analytics, Better Use Of Data, Reading-Infrastructure-Team-Backlog, Epic: Client side error logging production launch - https://phabricator.wikimedia.org/T226986 (Ottomata)
[13:18:43] Analytics: Check home of bmansurov - https://phabricator.wikimedia.org/T226956 (elukey) @leila thanks! Summary of what I have copied over to you: * /srv/leila/nschaaf on stat1006 * /user/leila/{nschaaf| ashwinpp} on HDFS I didn't find the dir ashwinpp elsewhere.
[13:19:15] Analytics, Better Use Of Data, Reading-Infrastructure-Team-Backlog, Epic: Client side error logging production launch - https://phabricator.wikimedia.org/T226986 (Ottomata) Actually, @tgr... we are beginning to use some new tooling to aide in managing mediawiki/event-schemas: https://gerrit.wikim...
[13:24:33] * elukey afk for a bit!
[14:13:24] Analytics: Check home of bmansurov - https://phabricator.wikimedia.org/T226956 (elukey) ` elukey@stat1004:~$ DATABASE_TO_CHECK=a2v elukey@stat1004:~$ for t in $(hive -S -e "show tables from $DATABASE_TO_CHECK;" 2>/dev/null | grep -v tab_name); do echo "checking table: $DATABASE_TO_CHECK.$t"; hive -S -e "desc...
[14:16:28] Analytics: Check home of bmansurov - https://phabricator.wikimedia.org/T226956 (elukey) Open→Resolved a:elukey
[14:27:27] nuria: morning :)
[14:27:48] Analytics, Analytics-Kanban: deployments to analytics1030 failing - https://phabricator.wikimedia.org/T228347 (Nuria) a:Nuria
[14:28:16] !log deployed refinery v0.0.40
[14:28:19] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[14:28:32] elukey: holaaa
[14:32:57] let me know if you need any ops help
[14:32:57] ottomata: hellooo whenever you want we can merge/deploy this :) https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/517085/
[14:33:47] nuria: jelooo there's a change pending to deploy in wikistats ui, if you're doing the whole train
[14:33:55] fdans: I can merge it if you want!
[14:34:15] elukey: yesss let's do it! rock and roll
[14:34:44] fdans: ya, let me finish refinery deploy that was stopped yesterday and can do wikistats after
[14:34:58] yeayea totally
[14:35:09] nuria i think you could have kept going with the deploy yesterday
[14:35:13] an30 is on the test cluster
[14:35:58] ok fdans so I think this will just work. after we merge and the jobs run at least once, we can manually remove the old repo checkouts
[14:36:27] ah ottomata is here :) o/
[14:36:29] sounds goood
[14:36:32] https://puppet-compiler.wmflabs.org/compiler1002/17476/ - this is the pcc fdans
[14:36:39] I'll leave to you guys the rest!
[14:40:14] oh sorry elukey missed your comment
[14:41:33] done fdans
[14:46:14] (CR) Fdans: [C: -1] "code looks good, i only have objections with naming, but we can discuss in standup" (5 comments) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/517641 (https://phabricator.wikimedia.org/T225911) (owner: Fdans)
[14:46:32] thank you ottomata !
[14:48:12] ottomata: ah i see, i just could not think what was that host used for
[14:50:34] I added info to https://wikitech.wikimedia.org/wiki/User:Elukey/Analytics/Hadoop_testing_cluster
[14:50:40] including hostnames if you guys want to bookmark
[14:50:59] ottomata: nginx on analytics-tool1001 seems to work fine
[14:51:14] today I have created a single cert with ema for our services (turnilo, yarn, etc..)
[14:51:37] nice!
[14:52:03] the only thing that currently happens when deploying nginx though is that installing the deb package causes nginx to start, and trying to bind port 80
[14:52:08] in our case it is taken
[14:52:18] so I had to manually fix the listen dir, restart nginx, run puppet
[14:52:22] that is not great of course
[14:52:36] aye
[14:53:15] we could add port 8888 or similar to the listen dir of httpd on analytics-tool1001, move varnish to 8888, remove port 80
[14:53:24] then the puppet runs will not fail
[14:53:41] (PS4) Fdans: Add UDF to get wiki project from referer string [analytics/refinery/source] - https://gerrit.wikimedia.org/r/523903 (https://phabricator.wikimedia.org/T228151)
[14:53:45] eventually (long term) ATS will call us only on :443
[14:54:45] elukey: could you just puppetize the listen port for nginx, and make the puppet dependency notify nginx?
[14:55:06] ottomata: it is pupperized, the problem is the default config that comes with the deb package
[14:56:12] *puppetized
[14:56:25] what I meant is that the 443 port is eventually the only one in the config
[14:56:37] but nginx needs to install first, and the deb tries to start it
[14:56:46] causing it to fail and making puppet sad
[14:57:15] ah yes
[14:57:18] that is annoying
[14:57:22] i never liked that auto start stuff
[14:57:36] it's just two puppet runs then, yes?
[15:00:52] yes but with a manual change in the middle (the listen port)
[15:13:45] why is manual change needed if port is puppetized?
[15:26:49] Analytics, WMDE-Analytics-Engineering, Wikidata, Wikidata-Campsite: Track WDQS updater UA in wikidata-special-entitydata grafana dashboard - https://phabricator.wikimedia.org/T218998 (Addshore) p:Triage→Low
[15:39:24] Analytics, Better Use Of Data, Reading-Infrastructure-Team-Backlog, Epic: Client side error logging production launch - https://phabricator.wikimedia.org/T226986 (Milimetric) I am starting to worry that I won't have enough time before my paternity leave to work on the client, someone else should...
[15:39:47] Analytics, Discovery-Search, Multimedia, Reading-Admin, and 3 others: Image Classification Working Group - https://phabricator.wikimedia.org/T215413 (Miriam)
[15:39:58] ottomata: sorry I was in a meeting - so nginx is ensured (the package), and that comes with a default config for port 80
[15:40:18] the manual change is fixing the listen port, so nginx could start (and the deb installed correctly)
[15:40:32] at that point, puppet can run, deploy the new 443 only config, restart nginx
[15:41:03] why can't the listen port be puppetized too?
[15:41:10] the deb will still be installed by puppet the first time, no?
[15:41:14] it will just make puppet fail
[15:41:39] i guess if puppetization of listen port has require -> Package['nginx'] hm, that would not happen.
[15:43:47] yeah it is a weird issue, possibly less of a hassle if we simply don't listen on port 80 for http
[15:43:53] that is not a big deal
[15:44:27] aye
[15:45:55] but, in the meantime, I'll add the rest of the TLS proxies!
[15:47:18] :)
[15:51:45] elukey: o/
[15:51:57] elukey: in T215800, should I still aim to review HDFS? :D
[15:52:55] leila: o/ yes I was about to ask if you needed to review HDFS or not :)
[15:53:14] the Hive databases might be backed by HDFS dirs under /user/ellery, I need to check
[15:53:29] elukey: I haven't reviewed it, yet. I keep it in my todo then. Maybe I can finish it this afternoon PST.
[15:53:37] +1, no rush :)
[15:53:42] thanks a lot for the work!
[15:54:39] elukey: if when you're deleting Hive you have interdependencies for which you have to delete HDFS, go for it and just delete the whole HDFS. otherwise, I'll review HDFS and let you know if any of it should be saved. (in general, given the time passed I'm leaning towards deletion as without tight documentation, it's hard to reuse the data anyway)
[15:55:22] elukey: np. thanks for your patience. I know it's been very much delayed by me.
[15:55:26] nono I'll wait, I'll add later on more info about where the Hive tables are stored
[15:55:40] leila: it was a lot of data, I completely understand :)
[15:55:48] ;)
[16:18:07] Analytics, Analytics-Kanban, EventBus, Core Platform Team Backlog (Watching / External), Services (watching): Factor out eventgate-wikimedia factory into its own gerrit repo and use it for deployment pipeline - https://phabricator.wikimedia.org/T226668 (Ottomata)
[16:24:50] Analytics, Analytics-Kanban, Release-Engineering-Team: issues with artifact cache in an-coord1001 - https://phabricator.wikimedia.org/T227132 (mmodell) >>! In T227132#5301532, @Nuria wrote: > Can the release engineering chime as to whether scap config settings should also delete artifacts from the ta...
[16:30:03] Analytics, Analytics-Kanban, Release-Engineering-Team: issues with artifact cache in an-coord1001 - https://phabricator.wikimedia.org/T227132 (mmodell) > Does this scap config take effect in the source as well as the target of the deploy? The cached revs are only used on the target, not on the source.
[16:31:38] Analytics, Analytics-Kanban: deployments to analytics1030 failing - https://phabricator.wikimedia.org/T228347 (Milimetric) p:Triage→High
[16:32:02] Analytics, Analytics-Kanban: deployments to analytics1030 failing - https://phabricator.wikimedia.org/T228347 (Milimetric) Open→Resolved
[16:33:09] Analytics: Refine should accept principal name for hive2 jdbc connection for DDL - https://phabricator.wikimedia.org/T228291 (Milimetric) p:Triage→High
[16:34:56] Analytics, Analytics-Kanban: Page creation data stream died June 6 - https://phabricator.wikimedia.org/T228188 (Milimetric) p:Triage→High a:Milimetric
[16:34:59] Analytics, Analytics-Kanban, Release-Engineering-Team (Deployment services): issues with artifact cache in an-coord1001 - https://phabricator.wikimedia.org/T227132 (greg)
[16:35:32] Analytics, Analytics-Kanban, Wikimedia-Stream, Documentation: stream.wikimedia.org/?doc returns an error page - https://phabricator.wikimedia.org/T227958 (Milimetric) p:Triage→High
[16:35:37] Analytics, Analytics-Kanban, Wikimedia-Stream, Documentation: stream.wikimedia.org/?doc returns an error page - https://phabricator.wikimedia.org/T227958 (Milimetric) Open→Resolved
[16:39:07] Analytics, Tool-Pageviews, Patch-For-Review: The mediacounts dataset doesn't have a project dimension - https://phabricator.wikimedia.org/T228151 (Milimetric) p:Triage→High
[16:39:13] Analytics, Tool-Pageviews: Load media requests data into cassandra (or druid?) - https://phabricator.wikimedia.org/T228149 (Milimetric) p:Triage→High
[16:39:42] Analytics, Analytics-Kanban, ULS-CompactLinks, UniversalLanguageSelector: The Interlanguage Navigation Statistics dashboard stops at 2019-05-26 - https://phabricator.wikimedia.org/T228033 (Milimetric) a:Milimetric
[16:40:18] Analytics, Analytics-Kanban, ULS-CompactLinks, UniversalLanguageSelector: The Interlanguage Navigation Statistics dashboard stops at 2019-05-26 - https://phabricator.wikimedia.org/T228033 (Milimetric) p:Triage→High
[16:42:33] Analytics, Analytics-Kanban, Wikimedia-Stream, Documentation: stream.wikimedia.org/?doc returns an error page - https://phabricator.wikimedia.org/T227958 (stjn) Is component documentation supposed to contain nothing? In fact, there are multiple React errors on doing any action on the page. (Firef...
[16:42:52] Analytics: Pagecounts merged archive with incorrect encoding and weird content - https://phabricator.wikimedia.org/T227955 (Milimetric) p:Triage→Low
[16:43:23] Analytics: Pagecounts merged archive with incorrect encoding and weird content - https://phabricator.wikimedia.org/T227955 (Milimetric) We don't have the source data any more, so best we can do is change the encoding. This will be lower priority than our current work.
[16:43:35] Analytics, Continuous-Integration-Config, JavaScript: Fix the analytics/mediawiki-storage repo to work on node10 - https://phabricator.wikimedia.org/T228451 (Jdforrester-WMF)
[16:43:59] Analytics, WMDE-Analytics-Engineering: Public Data Review Needed - https://phabricator.wikimedia.org/T227905 (Milimetric) p:Triage→High a:Nuria
[16:44:01] Analytics, JavaScript: Fix the analytics/wikistats2 repo to work on node10 - https://phabricator.wikimedia.org/T228452 (Jdforrester-WMF)
[16:44:03] Analytics, WMDE-Analytics-Engineering: Public Data Review Needed - https://phabricator.wikimedia.org/T227905 (Milimetric) p:High→Triage
[16:44:07] Analytics, WMDE-Analytics-Engineering: Public Data Review Needed - https://phabricator.wikimedia.org/T227905 (Milimetric) p:Triage→High
[16:44:41] Analytics, Discovery, Operations, Research-Backlog: Make oozie swift upload emit event to Kafka about swift object upload complete - https://phabricator.wikimedia.org/T227896 (Milimetric) p:Triage→High
[16:45:37] Analytics, Analytics-Kanban, JavaScript: Fix the analytics/wikistats2 repo to work on node10 - https://phabricator.wikimedia.org/T228452 (Milimetric) p:Triage→High a:Milimetric
[16:46:49] Analytics, Continuous-Integration-Config, JavaScript: Fix the analytics/mediawiki-storage repo to work on node10 - https://phabricator.wikimedia.org/T228451 (Milimetric) p:Triage→High
[16:49:28] Analytics, Analytics-Data-Quality: Set entropy alarm in editors per country per wiki - https://phabricator.wikimedia.org/T227809 (Milimetric)
[16:49:37] Analytics, Analytics-Data-Quality: Set entropy alarm in editors per country per wiki - https://phabricator.wikimedia.org/T227809 (Milimetric) p:Triage→High
[16:49:52] Analytics, Analytics-Kanban: mylvmbackup on an-coord1001 not working - https://phabricator.wikimedia.org/T227941 (Milimetric) Open→Resolved
[16:49:58] Analytics, Analytics-Kanban: LDAP ldap-ro.eqiad.wikimedia.org not reachable from Analytics VLAN - https://phabricator.wikimedia.org/T227611 (Milimetric) Open→Resolved
[16:56:29] Analytics, Analytics-Kanban, ULS-CompactLinks, UniversalLanguageSelector: The Interlanguage Navigation Statistics dashboard stops at 2019-05-26 - https://phabricator.wikimedia.org/T228033 (Milimetric) dashboard is working fine now. It's possible there was just a temporary server hiccup, reportup...
[16:57:48] Analytics, Analytics-Kanban, Release-Engineering-Team (Deployment services): issues with artifact cache in an-coord1001 - https://phabricator.wikimedia.org/T227132 (Ottomata) We're having a problem where the cached revs are not removed in the case of a scap failure. Occasionally we run out of disk s...
[17:06:21] Analytics, Analytics-Kanban, Release-Engineering-Team (Deployment services): issues with artifact cache in an-coord1001 - https://phabricator.wikimedia.org/T227132 (Nuria) @mmmodell: this an example of such a problem: https://phabricator.wikimedia.org/T228347
[17:07:15] ottomata: there's a mediawiki_page_create_3 and a mediawiki_page_create_4, but there's a gap of 12 days between the last event in 3 and the first event in 4
[17:07:23] know anything about that?
[17:07:34] ?
[17:07:40] in mysql?
[17:07:48] Analytics-EventLogging, Analytics-Kanban, EventBus, Core Platform Team (Modern Event Platform (TEC2)), and 3 others: Modern Event Platform: Stream Intake Service: Migrate eventlogging-service-eventbus events to eventgate-main - https://phabricator.wikimedia.org/T211248 (WDoranWMF)
[17:08:16] hmm, i wonder what is happening in mysql when schema version is like 1.0.0
[17:09:20] yeah, in mysql, on analytics-slave, log db
[17:09:41] I don't see any versions like X.Y.Z
[17:09:46] it's just whole numbers
[17:11:18] well it couldn't be
[17:11:31] milimetric: we switched the events to new schema format
[17:11:36] i will have to look
[17:11:39] is someone still using those?
[17:11:53] yeah, the bug's filed because the dashboard is missing data
[17:11:54] we have this quarter a goal to decom EL mysql
[17:12:00] and it's missing it because they didn't upgrade to query the new table
[17:12:10] well, the new table might be missing data too
[17:12:17] will have to look
[17:12:21] you can assign to me
[17:12:22] what is bug?
[17:12:27] yeah, they should just query the hive table, but reportupdater can't do what they're doing right now
[17:12:39] cc nuria / fdans: this is a blocker for the reportupdater work
[17:12:58] Analytics, Analytics-Kanban: Page creation data stream died June 6 - https://phabricator.wikimedia.org/T228188 (Milimetric) this is due to a change in schema from mediawiki-page-create version 3 to version 4.
[17:13:01] this one ^
[17:13:24] Analytics, Analytics-Kanban: Page creation data stream died June 6 - https://phabricator.wikimedia.org/T228188 (Ottomata) a:Milimetric→Ottomata
[17:13:32] milimetric: ah, the page create dashboard pulls data from mysql
[17:13:47] yea, and it's in the reportupdater repo so we didn't look at it closely
[17:13:53] milimetric: what is the functionality missing they could not get from hive?
[17:13:55] I knew I wasn't crazy when I said someone was doing this
[17:14:02] they query across all dbs
[17:14:19] milimetric: can you give me an example?
[17:14:19] so their template explodes wiki_db to equal enwiki, etwiki, etc.
[17:14:33] example: https://github.com/wikimedia/analytics-reportupdater-queries/blob/master/page-creation/pagecreations.sql
[17:14:49] and relevant config: https://github.com/wikimedia/analytics-reportupdater-queries/blob/master/page-creation/config.yaml#L20
[17:15:05] if we port that to hive, we'd have to come up with some way to do it
[17:15:14] hm but that's not using another db
[17:15:22] that is just filtering on the database field
[17:15:23] no?
[17:15:55] yeah, it's querying log.mediawiki_page_create_3, but the where clause is dynamic
[17:16:09] RU makes a copy of this query for each of the 900+ dbs in that txt file
[17:16:24] runs it and puts the output in a different file, like enwiki.tsv, etwiki.tsv, etc
[17:16:27] and Hive can't do that
[17:16:44] and Dashiki needs that to display the dashboard
[17:17:20] milimetric: let's see, the data is in hive on the same table for all wikis
[17:17:21] we can change Dashiki or RU
[17:17:26] yes
[17:19:53] Analytics, Analytics-EventLogging: Move reportupdater reports that pull data from eventlogging mysql to pull data from hadoop - https://phabricator.wikimedia.org/T223414 (Nuria) An example of the page creation stream that was broken as of late includes functionality to make a report per wiki that after i...
[17:20:42] milimetric: ya would be better if we can turn that into one query then
[17:20:44] and do it in hive
[17:20:54] just group by database or something
[17:21:31] that's fine, then we have to change Dashiki
[17:21:54] it'll take a bit longer than updating RU, but might be the right call
[17:24:24] milimetric: ya, added this note to the ticket we have about this. let me look at one thing
[17:26:24] ottomata: btw, I'm done reviewing the EG doc
[17:26:30] *MEP
[17:28:20] milimetric: seems to me
[17:28:26] milimetric: this data is now in wikistats
[17:28:30] milimetric: right?
[17:29:38] nuria: there are other fancier queries about autopatrolled and autoconfirmed, and all that
[17:29:45] but from a quick look at them... maybe it's all in AQS yea
[17:29:56] oh, no, it's not
[17:29:59] it's in mediawiki_history
[17:30:01] but not in AQS
[17:30:20] and mw history is updated monthly and this is a weekly dashboard
[17:30:23] thanks milimetric !
[17:30:27] sorry, daily
[17:32:37] milimetric: ya but let's mention this fact, i think the most useful thing is "autoconfirmed/not" but everything else is there with monthly availability
[17:33:48] Analytics, Analytics-Kanban: Page creation data stream died June 6 - https://phabricator.wikimedia.org/T228188 (Nuria) @kaldari When these dashboards were created this data was not in wikistats, now there are monthly updates of, say, pages created on namespace 0 for estonian wikipedia: https://stats.wik...
[17:35:26] Analytics, Analytics-Kanban: Page creation data stream died June 6 - https://phabricator.wikimedia.org/T228188 (Nuria) @kaldari are these dashboards still needed?
[17:36:42] milimetric: let's hear from kaldari on whether these dashboards need to exist
[17:37:13] milimetric: the autoconfirmed/not part is missing but from what i see (let me know otherwise) that data hasn't been updated for a while
[17:38:16] Analytics, Analytics-Kanban: Page creation data stream died June 6 - https://phabricator.wikimedia.org/T228188 (Nuria) Also, while now you can only split data by (bot/user) and namespace separately it is our plan to allow multiple splits soon.
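Editor's note on the reportupdater discussion above (17:14 to 17:21): the two approaches being compared, one copy of the query per wiki versus a single query grouped by database, can be sketched roughly as follows. This is only an illustration under stated assumptions: the table `page_create` and its columns `wiki_db` and `day` are hypothetical stand-ins (the real schema is not shown in the log), and an in-memory SQLite database is used in place of MySQL or Hive so the sketch runs anywhere.

```python
import sqlite3
from collections import defaultdict

# Hypothetical stand-in table; the real event table schema is not in the log.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE page_create (wiki_db TEXT, day TEXT)")
conn.executemany(
    "INSERT INTO page_create VALUES (?, ?)",
    [
        ("enwiki", "2019-07-17"),
        ("enwiki", "2019-07-17"),
        ("etwiki", "2019-07-17"),
        ("enwiki", "2019-07-18"),
    ],
)


def per_wiki_explode(wikis):
    """Run one filtered query per wiki, like the exploded query template."""
    results = {}
    for wiki in wikis:
        results[wiki] = conn.execute(
            "SELECT day, COUNT(*) FROM page_create "
            "WHERE wiki_db = ? GROUP BY day ORDER BY day",
            (wiki,),
        ).fetchall()
    return results


def single_grouped_query():
    """Run one query for all wikis, then split the rows per wiki afterwards."""
    results = defaultdict(list)
    rows = conn.execute(
        "SELECT wiki_db, day, COUNT(*) FROM page_create "
        "GROUP BY wiki_db, day ORDER BY wiki_db, day"
    )
    for wiki, day, count in rows:
        results[wiki].append((day, count))
    return dict(results)
```

Both functions produce the same per-wiki rows for this sample data. With the grouped form, the per-wiki split (the enwiki.tsv, etwiki.tsv files mentioned above) would move out of the query layer and into whatever consumes the rows, which matches the point that Dashiki or RU would then have to change.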
[17:38:42] nuria: he filed the bug with the message that these dashboards are needed
[17:39:37] but I'll wait then to hear back
[17:39:44] we have many courses of action
[17:39:52] milimetric: right, but let's verify what is the use case they serve now, when ACTTRIAL for enwiki happened this data was not available any other way.
[17:46:17] (PS1) Milimetric: Update Karma to allow testing from node 10 [analytics/mediawiki-storage] - https://gerrit.wikimedia.org/r/524286 (https://phabricator.wikimedia.org/T228451)
[17:46:32] (CR) Milimetric: [C: +2] Update Karma to allow testing from node 10 [analytics/mediawiki-storage] - https://gerrit.wikimedia.org/r/524286 (https://phabricator.wikimedia.org/T228451) (owner: Milimetric)
[17:47:50] (Merged) jenkins-bot: Update Karma to allow testing from node 10 [analytics/mediawiki-storage] - https://gerrit.wikimedia.org/r/524286 (https://phabricator.wikimedia.org/T228451) (owner: Milimetric)
[17:47:53] yeah milimetric
[17:48:00] i forgot about this mysql consumer
[17:48:09] the new schemas are not compatible with it
[17:48:13] Analytics, Analytics-Kanban, Continuous-Integration-Config, JavaScript: Fix the analytics/mediawiki-storage repo to work on node10 - https://phabricator.wikimedia.org/T228451 (Milimetric)
[17:48:14] or with eventlogging really
[17:48:26] so when we migrated page_create etc. to eventgate
[17:48:28] they stopped working
[17:48:32] it isn't just the _4 table
[17:48:36] everything since then
[17:48:48] there are some recent events in the _4 table
[17:48:51] hmmm i think...
[17:48:54] let me keep checking
[17:49:03] but anyway, yeah, this is all broken, we should move it to Hive, there's no reason to keep it in mysql
[17:49:16] (CR) Jforrester: "check experimental" [analytics/mediawiki-storage] - https://gerrit.wikimedia.org/r/524286 (https://phabricator.wikimedia.org/T228451) (owner: Milimetric)
[17:49:25] we just have to pick the way we want to do that, and do it, it's not more than a few days of work either way, it's just more unplanned stuff
[17:49:46] (CR) jenkins-bot: Update Karma to allow testing from node 10 [analytics/mediawiki-storage] - https://gerrit.wikimedia.org/r/524286 (https://phabricator.wikimedia.org/T228451) (owner: Milimetric)
[17:50:10] * elukey off!
[17:52:27] milimetric: yeah, we should move it
[17:52:34] fixing it on the EL side won't be trivial either
[17:54:09] brb
[18:00:31] https://www.irccloud.com/pastebin/oZ8h1BrK/
[18:00:49] (PS1) Jforrester: karma.conf.js: Switch from Chrome to ChromeHeadless [analytics/mediawiki-storage] - https://gerrit.wikimedia.org/r/524289 (https://phabricator.wikimedia.org/T228451)
[18:01:12] (CR) Jforrester: "check experimental" [analytics/mediawiki-storage] - https://gerrit.wikimedia.org/r/524289 (https://phabricator.wikimedia.org/T228451) (owner: Jforrester)
[18:02:39] (PS1) Jforrester: Commit package-lock.json to make CI builds much faster [analytics/mediawiki-storage] - https://gerrit.wikimedia.org/r/524290
[18:02:57] (CR) Jforrester: "check experimental" [analytics/mediawiki-storage] - https://gerrit.wikimedia.org/r/524290 (owner: Jforrester)
[18:09:19] iflorez: can't tell why that would be, i can't repro myself, him.
[18:09:21] hm*
[18:09:42] if you are there we can try to troubleshoot together
[18:12:29] ottomata: i think this still needs merging: https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/524256/
[18:12:43] Analytics, Analytics-Kanban: Page creation data stream died June 6 - https://phabricator.wikimedia.org/T228188 (Ottomata) Once again, my fault. We've been migrating events to the new EventGate service in T211248. The new events are not compatible with EventLogging. We would like to decommission the...
[18:12:52] nuria: yes
[18:12:53] sorry
[18:12:58] i'm going to run my backfills real quick first
[18:13:03] if they go well then will merge
[18:13:21] ottomata: k
[18:17:32] @ottomata, troubleshooting together would be great, yes please
[18:26:43] iflorez: seems quite a prevalent problem in hue: https://community.hortonworks.com/questions/22151/hue-does-not-list-hive-databases-for-most-users.html
[18:26:56] iflorez: have you tried access using jupyter notebooks?
[18:27:17] I will try that and will keep you posted. thank you for the suggestion.
[18:29:19] Analytics, Analytics-Kanban, Patch-For-Review: Refine JsonSchemaLoader should use JsonParser instead of YAMLParser to load JSON data - https://phabricator.wikimedia.org/T227484 (Ottomata) Backfilled offending hours with e.g. ` sudo -u analytics /usr/bin/spark2-submit \ --name otto_refine0 \ --class...
[18:34:41] !log backfilling MobileWikiAppDailyStats data since June 7 to populate missing fields (e.g. appinstallid) in refined data. - T226219
[18:34:43] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[18:34:44] T226219: [BUG] Logging error of MobileWikiAppDailyStats for the iOS app - https://phabricator.wikimedia.org/T226219
[18:36:51] Analytics, Analytics-Dashiki, Analytics-Kanban: Pie charts not showing on "User Agent Breakdowns" dashboard - https://phabricator.wikimedia.org/T228187 (Nuria) Open→Resolved
[18:37:15] milimetric: what was the problem here? https://phabricator.wikimedia.org/T228187
[18:37:26] milimetric: re: dashiki dashboards
[18:40:39] (CR) Milimetric: [C: +2] karma.conf.js: Switch from Chrome to ChromeHeadless [analytics/mediawiki-storage] - https://gerrit.wikimedia.org/r/524289 (https://phabricator.wikimedia.org/T228451) (owner: Jforrester)
[18:41:22] (Merged) jenkins-bot: karma.conf.js: Switch from Chrome to ChromeHeadless [analytics/mediawiki-storage] - https://gerrit.wikimedia.org/r/524289 (https://phabricator.wikimedia.org/T228451) (owner: Jforrester)
[18:41:29] (PS1) Nuria: Bumping up jar version on webrequest load [analytics/refinery] - https://gerrit.wikimedia.org/r/524298 (https://phabricator.wikimedia.org/T226730)
[18:42:32] Analytics, Analytics-Kanban, Continuous-Integration-Config, JavaScript, Patch-For-Review: Fix the analytics/mediawiki-storage repo to work on node10 - https://phabricator.wikimedia.org/T228451 (Jdforrester-WMF) Open→Resolved a:Milimetric Thank you!
[19:04:46] (CR) Jforrester: "recheck" [analytics/mediawiki-storage] - https://gerrit.wikimedia.org/r/524290 (owner: Jforrester)
[19:09:05] (CR) Ottomata: [C: +1] Bumping up jar version on webrequest load [analytics/refinery] - https://gerrit.wikimedia.org/r/524298 (https://phabricator.wikimedia.org/T226730) (owner: Nuria)
[20:17:43] Analytics, Analytics-Kanban, EventBus, Core Platform Team Backlog (Watching / External), Services (watching): Factor out eventgate-wikimedia factory into its own gerrit repo and use it for deployment pipeline - https://phabricator.wikimedia.org/T226668 (Ottomata)
[20:18:18] Analytics, Analytics-Kanban, EventBus, Core Platform Team Backlog (Watching / External), Services (watching): Factor out eventgate-wikimedia factory into its own gerrit repo and use it for deployment pipeline - https://phabricator.wikimedia.org/T226668 (Ottomata) Looking good in staging. I w...
[20:25:40] 10Analytics, 10Analytics-Kanban, 10EventBus, 10Core Platform Team Backlog (Watching / External), 10Services (watching): Factor out eventgate-wikimedia factory into its own gerrit repo and use it for deployment pipeline - https://phabricator.wikimedia.org/T226668 (10Ottomata) [20:26:11] nuria: this backfil is taking a while, i'm going to merge the jar bump [20:32:12] ottomata: k [20:33:57] 10Analytics, 10Analytics-Kanban, 10Operations, 10Patch-For-Review, 10User-Elukey: Import AMD rocm packages in wikimedia-buster - https://phabricator.wikimedia.org/T224723 (10Nuria) 05Open→03Resolved [20:34:04] 10Analytics, 10Operations, 10Research-management, 10Patch-For-Review, 10User-Elukey: Remove computational bottlenecks in stats machine via adding a GPU that can be used to train ML models - https://phabricator.wikimedia.org/T148843 (10Nuria) [20:35:26] 10Analytics, 10Analytics-Kanban, 10Operations, 10Patch-For-Review, 10User-Elukey: Investigate if a Prometheus exporter for the AMD GPU(s) can be easily created - https://phabricator.wikimedia.org/T220784 (10Nuria) 05Open→03Resolved [20:35:34] 10Analytics, 10Operations, 10Research-management, 10Patch-For-Review, 10User-Elukey: Remove computational bottlenecks in stats machine via adding a GPU that can be used to train ML models - https://phabricator.wikimedia.org/T148843 (10Nuria) [20:35:39] 10Analytics, 10Analytics-Cluster, 10Analytics-Kanban, 10User-Elukey: Enable base::firewall on stat boxes after restricting Spark REPL ports. 
- https://phabricator.wikimedia.org/T170826 (Nuria) Open→Resolved [20:36:06] Analytics, Analytics-Kanban, ULS-CompactLinks, UniversalLanguageSelector: The Interlanguage Navigation Statistics dashboard stops at 2019-05-26 - https://phabricator.wikimedia.org/T228033 (Nuria) Open→Resolved [20:36:25] Analytics-Kanban: Not all metrics have daily and monthly granularities available - https://phabricator.wikimedia.org/T226397 (Nuria) Open→Resolved [20:36:46] Analytics, Analytics-Kanban, Operations, Wikimedia-Incident: Move icinga alarm for the EventStreams external endpoint to SRE - https://phabricator.wikimedia.org/T227065 (Nuria) Open→Resolved [20:36:51] Analytics, Operations, Security, Services (watching), Wikimedia-Incident: Eventstreams in codfw down for several hours due to kafka2001 -> kafka-main2001 swap - https://phabricator.wikimedia.org/T226808 (Nuria) [20:37:09] Analytics, Analytics-Kanban, User-Elukey: Enable encryption and authentication for TLS-based Hadoop services - https://phabricator.wikimedia.org/T217412 (Nuria) Open→Resolved [20:37:11] Analytics: Enable Security (stronger authentication and data encryption) for the Analytics Hadoop cluster and its dependent services - https://phabricator.wikimedia.org/T211836 (Nuria) [20:37:27] Analytics, Analytics-Kanban, Analytics-Wikistats: Wikistats UI workarround for time interval bounds - https://phabricator.wikimedia.org/T226421 (Nuria) Open→Resolved [20:37:46] Analytics, Analytics-Data-Quality, Analytics-Kanban, Product-Analytics: Many revision events in mediawiki_history have missing page and namespace information - https://phabricator.wikimedia.org/T221338 (Nuria) [20:37:53] Analytics, Analytics-Data-Quality, Analytics-Kanban, Product-Analytics: Many revision events in mediawiki_history have missing page and namespace information - https://phabricator.wikimedia.org/T221338 (Nuria) Open→Resolved
[20:37:55] Analytics, Analytics-Kanban: Mediawiki-history release - Snapshot 2019-06 - https://phabricator.wikimedia.org/T221825 (Nuria) [20:37:58] Analytics, Analytics-Kanban, Better Use Of Data, Product-Analytics, Patch-For-Review: "Edit" equivalent of pageviews daily available to use in Turnilo and Superset - https://phabricator.wikimedia.org/T211173 (Nuria) [20:38:00] Analytics-Kanban, Product-Analytics: Address data quality issues in the mediawiki_history dataset - https://phabricator.wikimedia.org/T204953 (Nuria) [20:38:56] Analytics, Analytics-Kanban: Decide: start_timestamp for mediawiki history - https://phabricator.wikimedia.org/T220507 (Nuria) Open→Resolved [20:38:58] Analytics, Analytics-Kanban: Mediawiki-history release - Snapshot 2019-06 - https://phabricator.wikimedia.org/T221825 (Nuria) [20:39:24] Analytics, Analytics-Kanban: Bug: geoeditors (editors per country data) 2019-06 snapshot broken - https://phabricator.wikimedia.org/T227812 (Nuria) Open→Resolved [20:39:41] Analytics, Analytics-Kanban, Better Use Of Data, Product-Analytics, Patch-For-Review: "Edit" equivalent of pageviews daily available to use in Turnilo and Superset - https://phabricator.wikimedia.org/T211173 (Nuria) Open→Resolved [20:40:07] Analytics, Analytics-Kanban: Update bot user check in mediawiki-user-history-checker to use historical bot values - https://phabricator.wikimedia.org/T225247 (Nuria) Open→Resolved [20:41:12] Analytics, Analytics-Data-Quality, Analytics-Kanban, Product-Analytics: page_creation_timestamp not always correct in mediawiki_history - https://phabricator.wikimedia.org/T214490 (Nuria) Open→Resolved [20:41:14] Analytics-Kanban, Product-Analytics: Address data quality issues in the mediawiki_history dataset - https://phabricator.wikimedia.org/T204953 (Nuria) [20:41:17] Analytics, Analytics-Kanban: Mediawiki-history release - Snapshot 2019-06 -
https://phabricator.wikimedia.org/T221825 (Nuria) [20:43:33] (CR) Nuria: Hash tokens from the EL Sanitization white-list for iOS app (2 comments) [analytics/refinery] - https://gerrit.wikimedia.org/r/520134 (https://phabricator.wikimedia.org/T226849) (owner: Chelsyx) [20:56:52] Analytics: Pagecounts merged archive with incorrect encoding and weird content - https://phabricator.wikimedia.org/T227955 (Lofhi) Thank you. It seems that all the archives before the data for January 2015 are not in UTF8. [21:02:32] Analytics: Pagecounts merged archive with incorrect encoding and weird content - https://phabricator.wikimedia.org/T227955 (Nuria) Open→Resolved [21:24:44] @nuria and @ottomata, [21:24:44] yes, I am able to access the tables missing on Hue via Jupyter Notebook. I can proceed to rely on notebooks. Please let me know if there are any concerns that I should be aware of given this issue: any implications or related potential roadblocks. ty! [21:36:01] PROBLEM - Check the last execution of refine_eventlogging_analytics on an-coord1001 is CRITICAL: CRITICAL: Status of the systemd unit refine_eventlogging_analytics https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers [22:43:36] ottomata: are you by any chance still around and up for a call?
[22:44:26] (CR) Chelsyx: Hash tokens from the EL Sanitization white-list for iOS app (2 comments) [analytics/refinery] - https://gerrit.wikimedia.org/r/520134 (https://phabricator.wikimedia.org/T226849) (owner: Chelsyx) [22:55:02] ottomata: https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/524374/ if you are around [23:00:49] (CR) Nuria: Hash tokens from the EL Sanitization white-list for iOS app (1 comment) [analytics/refinery] - https://gerrit.wikimedia.org/r/520134 (https://phabricator.wikimedia.org/T226849) (owner: Chelsyx) [23:33:19] RECOVERY - Check the last execution of refine_eventlogging_analytics on an-coord1001 is OK: OK: Status of the systemd unit refine_eventlogging_analytics https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers [23:38:37] leila: i am here if i can help you with anything