[00:18:59] Analytics, Analytics-Kanban, Privacy Engineering, Product-Analytics, and 3 others: Drop data from Prefupdate schema that is older than 90 days - https://phabricator.wikimedia.org/T250049 (Milimetric) I had to tackle some performance problems with the actual sanitization. Parking this here {F3235...
[01:42:59] Analytics-Data-Quality, Analytics-EventLogging, Analytics-Radar, Product-Analytics, and 3 others: WikiEditor records all edits as platform = desktop in EventLogging - https://phabricator.wikimedia.org/T249944 (DLynch) VisualEditor does `ve.init.target.constructor.static.platformType || 'other'` (...
[02:46:50] Analytics, Analytics-Kanban: Import page_props table to Hive - https://phabricator.wikimedia.org/T258047 (MMiller_WMF) @Nuria -- thank you! That will be in about two weeks I guess. @Miriam FYI.
[06:12:36] Analytics, Analytics-Kanban, Analytics-Wikistats: Stats for newer projects not available - https://phabricator.wikimedia.org/T258033 (The_Discoverer) Thanks :)
[06:22:14] good morning!
[06:27:15] so I am trying to debug the webrequest_load problem with hue-next and it is a pain
[06:27:21] also I see some errors here and there
[06:29:24] !log re-run webrequest-load-text 21/09T21 - failed due to sporadic hive/kerberos issue (SQLException: Could not open client transport with JDBC Uri: jdbc:hive2://an-coord1001.eqiad.wmnet:10000/default;principal=hive/an-coord1001.eqiad.wmnet@WIKIMEDIA: Peer indicated failure: Failure to initialize security context)
[06:29:26] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[06:45:01] Good morning
[06:46:52] bonjour
[06:46:54] just opened https://github.com/cloudera/hue/issues/1272
[06:46:56] elukey: o/
[06:47:03] elukey: can I help with webrequest?
[06:47:26] joal: already restarted, it was the hive/kerberos glitch that sometimes happens
[06:47:33] ack
[06:47:39] thanks!
[06:47:55] it will take a bit to iron out all the hue issues..
[06:48:04] oh btw, what was the one that you found yesterday?
[06:50:44] actually elukey it was me trying to do something not possible (namely kill not yet started runs)
[06:50:56] ahh okok
[06:51:18] I fixed manually on the host the problem that Andrew had with the "Configuration" button
[06:51:31] and sent a pull request, will re-package once they merge
[06:51:36] ok elukey - does it require a patch upstream?
[06:51:39] Ah right
[06:51:57] yeah https://github.com/cloudera/hue/pull/1271
[06:52:13] they didn't merge it yesterday due to some linter issues
[06:52:21] buuut should be mergeable now
[06:53:06] awesome elukey
[06:53:50] the more we use it the better, hopefully bugs will be ironed out soon
[06:53:53] (famous last words)
[06:56:34] joal: if you are ok I am going to merge the change to swap the presto TLS certs
[06:56:47] works for me elukey
[07:17:01] ok of course it doesn't work
[07:18:36] :(
[07:22:38] wow - cluster is very late
[07:26:03] pageview-hourly is currently computing yesterday hour 21 and 22 :(
[07:26:56] Ahhhh! I know
[07:27:19] we do have some backlog due to the killed webrequest-load no?
[07:30:45] elukey: before the actor tables, given that webrequest had been computed for hours after the failed ones, pageview hours would have followed, leaving only one hour undone
[07:31:20] elukey: Now with actors, we have a rollup table - Meaning that any hour is dependent on the 24 previous ones - therefore a failed hour blocks the whole pageview
[07:31:49] so webrequest is not late, pageview is
[07:32:50] ok I pretend to understand and approve :)
[07:33:12] :) More details at your disposal if needed elukey :)
[07:34:40] np, I am trying to understand why presto doesn't work
[07:35:16] I'm trying it now, and the CLI works, but not the data gathering FWICS
[07:35:38] Or maybe it is coords connection failing
[07:35:47] yes the workers cannot contact the presto coord on an-coord1001, probably due to https, but there is no indication why
[07:35:48] elukey: let me know how I can help
[07:35:54] :(
[07:35:56] like "SSL blabla horrible error blalba"
[07:35:58] nothing
[07:36:02] MEH
[07:47:24] Analytics, Analytics-Kanban: Import page_props table to Hive - https://phabricator.wikimedia.org/T258047 (Miriam) Thank you @MMiller_WMF and @Nuria!!
[07:58:25] Analytics-Radar, Operations, Traffic, Patch-For-Review: Package varnish 6.0.x - https://phabricator.wikimedia.org/T261632 (ema)
[08:03:00] ok I am going to revert the presto change
[08:09:45] worked
[08:11:14] so frustrating
[08:47:53] Morning!
[08:52:16] Good morning klausman :)
[09:13:01] (CR) Joal: [V: +2] "Merging for next deploy" [analytics/refinery] - https://gerrit.wikimedia.org/r/628917 (owner: Joal)
[09:35:07] Analytics, Analytics-Kanban: Prevent wikidata-entity jobs to wait indefinitely - https://phabricator.wikimedia.org/T263529 (JAllemandou)
[09:35:21] Analytics, Analytics-Kanban: Prevent wikidata-entity jobs to wait indefinitely - https://phabricator.wikimedia.org/T263529 (JAllemandou) a:JAllemandou
[09:35:38] (PS1) Joal: Set timeout in oozie wikidata-entity jobs [analytics/refinery] - https://gerrit.wikimedia.org/r/629073 (https://phabricator.wikimedia.org/T263529)
[09:38:41] (PS2) Joal: Set timeout in oozie dumps-dependent jobs [analytics/refinery] - https://gerrit.wikimedia.org/r/629073 (https://phabricator.wikimedia.org/T263529)
[09:39:44] Analytics, Analytics-Kanban, Patch-For-Review: Prevent dumps-dependent jobs to wait indefinitely - https://phabricator.wikimedia.org/T263529 (JAllemandou)
[10:03:00] o/ Question about the wikidata dumps loaded into hadoop. Is it possible / fine to load some of the older JSON dumps in there, say from 2014 etc?
[10:08:07] addshore: no reason it wouldn't be
[10:08:18] addshore: maybe one - if format changes
[10:09:35] ack! yeah the format should be stable
[10:09:39] well, stable so far
[10:09:46] I might write a phab ticket up soon, thanks :)
[10:11:23] o/
[10:34:13] hi guys, quick question, is there any table where I can quickly count vandalism-"reversion" edits?
[10:35:46] dsaez: is there a flag for 'vandalism'|
[10:35:47] ?
[10:36:05] that would be in the edit summary?
[10:36:16] I don't know?
[10:36:23] me neither :D
[10:37:42] dsaez: you can imagine that no precise definition means no table ;)
[10:37:57] true.
[10:38:22] dsaez: you can try to filter rows of mediawiki-history using the `event_comment` field with some regex - I have nothing better
[10:39:13] joal: yep, thanks, I'll try that approach.
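The regex approach suggested above could look roughly like the following sketch. It works on in-memory rows rather than on the real mediawiki-history table, and everything except the `event_comment` field name is an assumption: the keyword pattern is an illustrative guess at how vandalism reverts are described in edit summaries, and `count_vandalism_reverts` and `sample_rows` are hypothetical names.

```python
import re

# Assumption: these keywords are illustrative guesses, not an official
# definition of a "vandalism revert" (the log notes that none exists).
VANDALISM_REVERT_RE = re.compile(
    r"\b(?:revert(?:ed|ing)?|rv|undid|undo)\b.*\bvandal(?:ism|s)?\b|\brvv\b",
    re.IGNORECASE,
)


def count_vandalism_reverts(rows):
    """Count rows whose event_comment looks like a revert of vandalism."""
    count = 0
    for row in rows:
        # event_comment can be missing or null, so fall back to "".
        comment = row.get("event_comment") or ""
        if VANDALISM_REVERT_RE.search(comment):
            count += 1
    return count


# Hypothetical sample rows, only to exercise the function.
sample_rows = [
    {"event_comment": "Reverted vandalism by 192.0.2.1"},
    {"event_comment": "rvv"},
    {"event_comment": "fixed typo"},
    {"event_comment": None},
    {"event_comment": "Undid revision 123: obvious vandalism"},
]

print(count_vandalism_reverts(sample_rows))
```

On the cluster the same idea would more likely be written as a Hive or Spark SQL filter on `event_comment`; whatever pattern is used will both miss some reverts and match some unrelated edits, since summaries are free text.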
[10:52:38] going afk for lunch!
[11:49:54] I forgot I’d be at the dentist this morning, but I should be back by meetings
[11:58:15] morning!
[11:58:25] Hi fdans :)
[11:58:40] joal: o/
[12:29:08] I'm fixing a wiring mistake I made yesterday. Back in 30m or so
[13:00:39] I opened https://github.com/prestodb/presto/issues/15207 for the Presto pkcs12 issue
[13:00:49] let's see if upstream answers
[13:04:38] (never a joy)
[13:24:46] Well, that turned out a right mess (cabling)
[13:29:11] Hey analytics! I have a question. What percentage of actual users utilise something like noscript or have javascript disabled?
[13:35:36] Hi Seddon - very difficult to know - nuria might have a number having looked at webrequest calls more than I have but we have no well-defined way of knowing
[13:52:12] Analytics-Radar, Operations, Traffic, Patch-For-Review: Package varnish 6.0.x - https://phabricator.wikimedia.org/T261632 (ema)
[14:03:45] hey teammmm
[14:03:51] hellooo mforns
[14:03:55] uou, lots of alerts, readingg
[14:03:55] :
[14:03:57] :]
[14:13:07] @joal makes sense. Do we know if our general traffic generally matches global trends?
[14:20:24] Seddon: for what purpose?
[14:20:32] Seddon: can you be a bit more specific?
[14:21:00] Seddon: global trends for..?
[14:23:56] Analytics, Analytics-Kanban, Patch-For-Review: Set up automatic deletion/sanitization for netflow data set in Hive - https://phabricator.wikimedia.org/T231339 (Nuria)
[14:28:04] nuria: Browser, OS, device breakdowns. Basically there is likely to be a phase whereby the graph extension will render client side and not server side. This will impact a very small subset of users. From a cursory exploration of the internet, the number of users who disable javascript is about 0.2%.
[14:28:50] Seddon: i think you rather need to look at the number of users to whom WE DO NOT serve javascript
[14:29:14] Seddon: cause as you know browsers like ie9 support javascript but WILL NOT receive js files from us
[14:29:29] Seddon: that is probably the number you are after
[14:29:57] Seddon: see, https://www.mediawiki.org/wiki/Compatibility#General_information
[14:30:25] Seddon: grade C does get JS
[14:32:43] Seddon: sorry, grade C does NOT get js
[14:32:49] Seddon: grade A and B do
[14:34:24] nuria: is there a formal definition for what "core functionality of the MediaWiki platform" consists of?
[14:34:53] PROBLEM - Check the last execution of produce_canary_events on an-launcher1002 is CRITICAL: CRITICAL: Status of the systemd unit produce_canary_events https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[14:35:19] Seddon: probably mediawiki-devs can answer that better but i would think reading wikipedia and using wikitext
[14:35:55] Analytics-Radar, Operations, Traffic, Patch-For-Review: Package varnish 6.0.x - https://phabricator.wikimedia.org/T261632 (ema)
[14:43:42] ottomata: are you testing produce_canary_events?
[14:43:55] (CR) Nuria: "I see, so the SLA is not going to trigger until the job has started, right?" (1 comment) [analytics/refinery] - https://gerrit.wikimedia.org/r/629073 (https://phabricator.wikimedia.org/T263529) (owner: Joal)
[14:44:56] (CR) Nuria: Set timeout in oozie dumps-dependent jobs (1 comment) [analytics/refinery] - https://gerrit.wikimedia.org/r/629073 (https://phabricator.wikimedia.org/T263529) (owner: Joal)
[14:45:29] mforns: yes test ones are being sent regularly
[14:45:33] but i'm not doing anything new atm
[14:45:49] ok ok, looking at the icinga alarm, thx
[14:51:10] ottomata: error says: eventlogging_Test does not have a schema_title setting
[14:51:30] ottomata: is this urgent? I have a meeting in 9 mins
[14:51:48] oh, you have the meeting too :]
[14:55:00] i thought we fixed that.........
[14:55:02] not urgent at all
[14:55:08] i'll try to look into that later this afternoon
[14:57:55] ok thanks ottomata :]
[15:05:11] (CR) Joal: "Indeed, SLAs only trigger for jobs instantiated (meaning not waiting for data -- ready to run but not yet started possibly because of thro" [analytics/refinery] - https://gerrit.wikimedia.org/r/629073 (https://phabricator.wikimedia.org/T263529) (owner: Joal)
[15:17:36] nuria: Is there a readily available percentage of Class C traffic?
[15:20:20] Seddon: there are percentages for browsers so you can estimate that, there is data in hive/turnilo and publicly: https://analytics.wikimedia.org/dashboards/browsers/#all-sites-by-browser
[15:28:23] Analytics, Analytics-Kanban, Patch-For-Review: Fix TLS certificate location and expire for Hadoop/Presto/etc.. and add alarms on TLS cert expiry - https://phabricator.wikimedia.org/T253957 (elukey) Presto seems not working with the new pkcs12 config, opened: https://github.com/prestodb/presto/issues/...
[15:28:26] Analytics-Radar, Operations, Traffic, Patch-For-Review: Package varnish 6.0.x - https://phabricator.wikimedia.org/T261632 (ema) Open→Resolved a:ema All packages ready for prime time!
[15:38:51] RECOVERY - Check the last execution of produce_canary_events on an-launcher1002 is OK: OK: Status of the systemd unit produce_canary_events https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[15:52:31] a-team i'm going to skip standup so i can practice my talk a bit more, i'll come to the first 15 mins of retro though
[15:52:53] ok :]
[16:01:09] ping razzi , mforns , milimetric standddupppppp
[17:02:40] FYI live talk happening right now about event platform: https://www.youtube.com/watch?v=Tb_9gOYsWJQ&feature=emb_title
[17:43:34] which irc room is being used for questions for the talk (i don't have one, i'm just curious as i don't seem to be in it)
[17:52:51] ottomata: great talk, thank you!
[17:53:03] jbond42: #wikimedia-office or the yt chat of the livestream itself, both are monitored by Sarah
[17:53:50] #-office also gets other use like techconf meetings and the like
[17:54:00] thanks cdanis ! i
[18:01:40] Analytics-Radar, Product-Analytics, Product-Infrastructure-Data: prefUpdate schema contains multiple identical events for the same preference update - https://phabricator.wikimedia.org/T218835 (jlinehan)
[18:13:39] tha
[18:13:47] thanks cdanis
[18:25:15] * elukey afk!
[19:02:10] greetings! I am back to do a few small new analyses to wrap up the ores bias project I did with halfak
[19:03:04] my access was extended through the end of this month: https://gerrit.wikimedia.org/r/c/operations/puppet/+/618779/
[19:03:18] I'm having some trouble with authentication however
[19:03:43] I can ssh into stat1006, but I can't seem to login to jupyterhub or authenticate with kerberos
[19:11:46] Analytics, Platform Engineering, Epic, Platform Team Initiatives (API Gateway): AQS 2.0 - https://phabricator.wikimedia.org/T263489 (Milimetric) > 1. (Optional) Add support in service-template-node for talking to Cassandra. Since we are planning to move storage from RESTBase down to individual se...
[19:28:16] groceryheist: o/
[19:29:09] mmm I see login failures in the jupyterhub
[19:29:28] can you kinit?
[19:30:22] I don't see your principal for kerberos, so I think you are not anymore
[19:30:39] @elukey I cannot kinit. one sec and I can send you my error message
[19:31:03] groceryheist: what credentials are you using for jupyterhub?
[19:32:07] elukey: I'm using my shell username (nathante)
[19:32:23] and I just reset my wikitech password and I'm trying to use that password
[19:33:21] ahhh ok so there might be some issues
[19:33:36] so kerberos, I am going to re-create your account
[19:33:56] is the email in puppet the correct one?
[19:34:03] yeah
[19:34:38] groceryheist: ok you should have an email with the tmp password, so you can kinit again
[19:36:20] for jupyter I just restarted the hub, if you can retry on 1006
[19:36:45] also, https://lists.wikimedia.org/mailman/listinfo/analytics-announce is probably something that you want to subscribe to
[19:36:48] :)
[19:37:05] ok that worked
[19:37:14] we are going to reimage stat1006 tomorrow to Buster as FYI
[19:37:58] sorry Thursday morning
[19:38:01] (EU time)
[19:38:13] @elukey that will preserve data right?
[19:38:22] i don't think that will be a big problem for me
[19:38:32] yes yes home dir preserved
[19:38:32] kinit worked
[19:38:35] super
[19:38:42] I see in the hub's logs: Failed login for nathante
[19:38:44] jupyter not working yet
[19:38:46] yeah
[19:40:23] it should be the wikitech password for jupyterhub yeah?
[19:40:35] yep but lemme check something
[19:41:17] you were previously in the nda LDAP group I imagine
[19:41:42] ok
[19:41:49] ah yes https://phabricator.wikimedia.org/T205454
[19:41:55] and you were removed
[19:41:58] lemme re-add you
[19:42:44] groceryheist: can you retry?
[19:43:41] victory !!
[19:44:07] thanks a ton elukey
[19:44:52] super :)
[19:58:48] mforns: where did you see the produce canary events schema_title alert?
[20:10:22] Analytics, Platform Engineering, Epic, Platform Team Initiatives (API Gateway): AQS 2.0 - https://phabricator.wikimedia.org/T263489 (Pchelolo) > This would be nice, we sometimes update schemas. But it is very rare, so if this is the blocker, maybe we can talk and work around it. I'd estimate impl...
[20:23:16] Analytics, Operations: Augment NEL reports with GeoIP country code and network AS number - https://phabricator.wikimedia.org/T263496 (Ottomata) > The long-term answer (which might be stream processing stuff?) is stream processing stuff > In the very short term, I don't think it would be too hard to have...
[20:26:13] Analytics, Event-Platform, Performance-Team, Product-Infrastructure-Data: Research and consider network connections made due to Event Platform - https://phabricator.wikimedia.org/T263049 (Ottomata) Also relevant: {T261340} Chris is making use of the fact that there is a separate endpoint that eve...
[20:50:32] Analytics, Operations: Augment NEL reports with GeoIP country code and network AS number - https://phabricator.wikimedia.org/T263496 (Nuria) I cannot think of case of any piece of data we hold that includes an IP where we would not want the geo localization. So even doing that by default at all times mak...
[20:58:24] Analytics, Operations: Augment NEL reports with GeoIP country code and network AS number - https://phabricator.wikimedia.org/T263496 (Ottomata) Nuria, except in {T262626}, this is exactly what Timo wants. He thinks that we should remove IPs from the data, and use the GeoIP country header that varnish se...
[21:34:26] Analytics, Data-Services, cloud-services-team (Kanban): labstore1006 persistent high iowait - https://phabricator.wikimedia.org/T263329 (nskaggs) The dashboard links are out of date for this. Can you link a current dashboard for labstore1006? I added it to https://grafana.wikimedia.org/d/000000568/la...
[21:35:22] Analytics, Data-Services, cloud-services-team (Kanban): labstore1006 persistent high iowait - https://phabricator.wikimedia.org/T263329 (nskaggs) I'm guessing though there should be a dumps dashboard and I should probably revert my edits :-)
[21:45:59] Analytics, Data-Services, cloud-services-team (Kanban): labstore1006 persistent high iowait - https://phabricator.wikimedia.org/T263329 (Bstorm) There's no special dashboard for this. It's just the host dashboard: https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&refresh=5m&var-server=l...
[21:48:52] congrats on editors by country API..
[21:48:55] * leila claps
[21:50:14] Analytics, Data-Services, cloud-services-team (Kanban): labstore1006 persistent high iowait - https://phabricator.wikimedia.org/T263329 (Bstorm) >>! In T263329#6485593, @nskaggs wrote: > The dashboard links are out of date for this. Can you link a current dashboard for labstore1006? I added it to htt...
[22:10:06] Analytics, Product-Analytics, Structured Data Engineering, SDAW-MediaSearch (MediaSearch-Beta), Structured-Data-Backlog (Current Work): [L] Instrument MediaSearch results page - https://phabricator.wikimedia.org/T258183 (egardner) Thanks @Nuria for the suggestion; @nettrom_WMF and I just had...