[06:50:08] 10Analytics, 10Analytics-Kanban: Create kerberos principals for users - https://phabricator.wikimedia.org/T237605 (10elukey) [06:50:36] 10Analytics, 10Analytics-Kanban: Prepare the Hadoop Analytics cluster for Kerberos - https://phabricator.wikimedia.org/T237269 (10elukey) [08:16:06] 10Analytics, 10incubator.wikimedia.org: Create dashiki dashboard / small tool to track statistics about incubated wikis - https://phabricator.wikimedia.org/T237389 (10Ooswesthoesbes) Yes, that's the basics. The advantage of the catanalysis is the number of bytes added, as some contributors make a lot of low ch... [08:51:11] Good morning team [08:51:31] elukey: if you have a minute, I have a patch for you on puppet (turnilo config, should be easy) [08:55:55] joal: bonjour! [08:55:55] sure [08:56:33] elukey: o\ [08:56:38] o/ is better :) [08:57:00] elukey: superset has a broken feature for druid-datasource metadata refresh :( [08:57:37] joal: what do you mean? [08:57:40] SQLAlchemy complains about a duplicate entry for a column [08:57:50] elukey: no druid-metadata-refresh [08:58:10] ah yes it has always been like that IIRC, you need to remove the duplicate before refreshing [08:58:25] hm, remove duplicates?
[08:58:32] turnilo restarted [08:58:35] yes lemme check [08:59:39] elukey: turnilo is looking good :) Thanks a lot [09:00:45] joal: I am checking superset, this time it's different: the last time that I had to fix it, it was due to a new datasource name that was conflicting with an existing one (same name under case-insensitive comparison) [09:00:50] this time it seems something else [09:01:02] elukey: column-names related IIUC [09:01:13] but same idea - there is a duplicate in the db, if we remove it (if possible) then it will work [09:01:25] superset is a little bit brittle for these things sadly [09:02:58] 10Analytics, 10Analytics-Kanban, 10Operations, 10Traffic, and 2 others: Update webrequest_128 dataset in turnilo to include TLS fields once available - https://phabricator.wikimedia.org/T237117 (10JAllemandou) Done! Hiding the awful turnilo link under [[ https://turnilo.wikimedia.org/#webrequest_sampled_... [09:06:23] ok joal, superset complains because the datasource id 371 (event_pageissues) already has a userAgent_browser_family column [09:07:30] https://superset.wikimedia.org/druiddatasourcemodelview/edit/371 [09:09:19] elukey: and the case has changed, is that it? [09:09:48] joal: try now to refresh [09:09:58] yes almost surely it changed or similar [09:10:05] elukey: Worked :) [09:10:13] elukey: you are a superset master :) [09:10:20] Thanks a lot!!! [09:10:30] \o/ [09:11:57] 10Analytics, 10Analytics-Kanban, 10Operations, 10Traffic, and 2 others: Publish tls related info to webrequest via varnish - https://phabricator.wikimedia.org/T233661 (10JAllemandou) Done! Example: ` spark2-shell --master yarn --driver-memory 4G --executor-memory 8G --executor-cores 4 --conf spark.dynamicA...
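The superset failure above boils down to two column entries on the same datasource that are distinct strings but equal under a case-insensitive comparison (userAgent_browser_family vs. a differently-cased twin). A minimal sketch of that duplicate check; the helper and the column list are illustrative, not taken from Superset's actual metadata tables:

```python
def find_case_insensitive_duplicates(columns):
    """Return pairs of column names that differ only by case.

    This mirrors the conflict that breaks Superset's druid-datasource
    metadata refresh: two entries that are different strings but
    collide when compared case-insensitively.
    """
    seen = {}
    duplicates = []
    for name in columns:
        key = name.lower()
        if key in seen and seen[key] != name:
            duplicates.append((seen[key], name))
        else:
            seen[key] = name
    return duplicates

# Illustrative column list for a datasource like event_pageissues.
cols = ["userAgent_browser_family", "useragent_browser_family", "country"]
print(find_case_insensitive_duplicates(cols))
```

Removing one side of each reported pair from the metadata (as elukey did for datasource 371) lets the refresh go through.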
[09:15:07] 10Analytics, 10Operations, 10ops-eqiad, 10User-Elukey: Check if a GPU fits in any of the remaining stat or notebook hosts - https://phabricator.wikimedia.org/T220698 (10elukey) @Cmjohnson we might need to add a new GPU next quarter (need to triple check with the Research team), is there any of the above ho... [09:23:02] (03PS1) 10Joal: Add licenses and correct minimal typos in scripts [analytics/refinery] - 10https://gerrit.wikimedia.org/r/549427 [09:23:30] (03CR) 10Joal: "Should be a nop, only comments modified/added" [analytics/refinery] - 10https://gerrit.wikimedia.org/r/549427 (owner: 10Joal) [09:27:24] joal: new jvm, brace yourself, roll restarts are coming [09:27:37] /o\ [09:28:11] * joal fakes being afraid, but knows that as usual all the hard work will be done by elukey - Thank you elukey <3 [09:28:45] this time should be done by cumin :D [09:29:19] With a pinch of spice :) [09:37:49] 10Analytics, 10Analytics-Kanban, 10Patch-For-Review, 10User-Elukey: Shorten the time it takes to move files from hadoop to dump hosts by Kerberizing/hadooping the dump hosts - https://phabricator.wikimedia.org/T234229 (10elukey) After a chat with Moritz and my team, this is what we are planning to do: 1)... [09:41:54] 10Analytics, 10Analytics-Kanban, 10Operations, 10Traffic, 10observability: Update webrequest_128 dataset in turnilo to include TLS fields once available - https://phabricator.wikimedia.org/T237117 (10Vgutierrez) @BBlack I'm seeing some "nil" values on the TLS KeyExchange field when AES128-SHA is being us... [09:47:12] joal: hello! you're suggesting to have a date-from and a date-to in the oozie intervals script, but what should the script do when the edge date is reached? exit with nonzero code so that the oozie command doesn't go through? 
[09:48:32] fdans: Good idea for that case, and also, for the last batch, number of days of the interval should be recomputed, cause it will probably be smaller than the defined one [09:51:28] 10Analytics, 10Analytics-Kanban, 10Operations, 10Traffic, 10observability: Update webrequest_128 dataset in turnilo to include TLS fields once available - https://phabricator.wikimedia.org/T237117 (10Vgutierrez) I'm already loving the data, thanks @JAllemandou <3 [10:02:52] 10Analytics, 10Analytics-Wikistats: Wikistats data discrepancy for India page views from hive data pull - https://phabricator.wikimedia.org/T237579 (10Addshore) [10:05:20] (03PS1) 10Awight: Fix another nonexistent field [analytics/reportupdater-queries] - 10https://gerrit.wikimedia.org/r/549437 (https://phabricator.wikimedia.org/T233108) [10:08:33] (03CR) 10Thiemo Kreuz (WMDE): [C: 03+1] Fix another nonexistent field [analytics/reportupdater-queries] - 10https://gerrit.wikimedia.org/r/549437 (https://phabricator.wikimedia.org/T233108) (owner: 10Awight) [10:50:27] (03CR) 10Mforns: [V: 03+2 C: 03+2] "LGTM!" [analytics/reportupdater-queries] - 10https://gerrit.wikimedia.org/r/549437 (https://phabricator.wikimedia.org/T233108) (owner: 10Awight) [11:03:31] 10Analytics, 10Analytics-Wikistats: Wikistats data discrepancy for India page views from hive data pull - https://phabricator.wikimedia.org/T237579 (10JAllemandou) Hi @Iflorez, Data presented in Wikistats comes from AQS endpoints (see [[ https://wikitech.wikimedia.org/wiki/Analytics/AQS/Pageviews | here ]] for... [11:03:40] 10Analytics, 10Analytics-Wikistats: Wikistats data discrepancy for India page views from hive data pull - https://phabricator.wikimedia.org/T237579 (10JAllemandou) 05Open→03Invalid [11:05:20] (03PS2) 10Joal: Add licenses and correct minimal typos in scripts [analytics/refinery] - 10https://gerrit.wikimedia.org/r/549427 [11:05:50] elukey: while I'm pythoning on refinery, should I remove imports from python2 stuff? 
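joal's two suggestions above (stop at the end date, and recompute a smaller final batch) can be sketched like this; `day_intervals` is a hypothetical helper, not the actual refinery script, and a wrapper would exit non-zero once no intervals remain so the oozie command doesn't go through:

```python
from datetime import date, timedelta

def day_intervals(start, end, batch_days):
    """Yield (from, to) pairs covering [start, end) in batches of
    batch_days days; the final batch is recomputed to fit the edge
    date, so it is usually smaller than the defined interval."""
    current = start
    while current < end:
        nxt = min(current + timedelta(days=batch_days), end)
        yield current, nxt
        current = nxt

# Backfilling 7 days in 3-day batches: two full batches plus a 1-day tail.
intervals = list(day_intervals(date(2019, 11, 1), date(2019, 11, 8), 3))
print(intervals)
```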
[11:13:00] yes plese :) [11:13:04] please [11:13:08] I was reading https://docs.cloudera.com/documentation/enterprise/5-16-x/topics/cm_sg_principal_keytab.html#delegation-tokens [11:13:11] very interesting [11:13:27] so I was wrong, the token that the namenode generates is not user agnostic [11:13:38] but once you have it, you can impersonate a user [11:14:09] therefore not having wire encryption is kinda crazy, I am not sure why people enable sasl auth and not encryption [11:14:30] (that seems a lot of use cases judging from the state of the libs doing hadoop rpc natively) [11:17:04] Cc: moritzm (interesting read) [11:17:28] Indeed elukey - very interesting :) [11:18:13] especially the part "If valid, the client and the NameNode will then authenticate each other by using the TokenAuthenticator that they possess as the secret key, and MD5 as the protocol." [11:18:33] I guess it refers to MD5 digest? [11:19:01] (03PS3) 10Joal: Update python scripts (licenses, imports, typos) [analytics/refinery] - 10https://gerrit.wikimedia.org/r/549427 [11:19:58] that would make sense, since it would use sasl as well [11:20:02] (that supports it) [11:20:44] yep seems like that [11:21:27] https://issues.apache.org/jira/browse/HADOOP-11962?attachmentSortBy=dateTime [11:22:42] ok now I have more clear ideas, debugging snakebite helped :D [11:23:02] :) [11:23:25] next elukey task will be to implement an HDFS api in puppet :-P [11:39:59] * elukey lunch! [11:40:01] nop! [11:40:02] :D [12:17:35] Special favor: would anyone please remove stat1007:/srv/reportupdater/output/metrics/reference-previews/baseline.tsv at your convenience? Thanks in advance. 
[12:17:46] (03PS4) 10Mforns: Add Spark job to update data quality table with incoming data [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/549115 (https://phabricator.wikimedia.org/T235486) [12:18:43] (03PS8) 10Mforns: Refactor data_quality oozie bundle to fix too many partitions [analytics/refinery] - 10https://gerrit.wikimedia.org/r/547320 (https://phabricator.wikimedia.org/T235486) [12:18:55] !log Deleting stat1007:/srv/reportupdater/output/metrics/reference-previews/baseline.tsv as asked by awight [12:18:57] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [12:19:09] Done awight :) [12:20:07] joal: Thanks! There was some incorrect output that needs to be regenerated with updated queries. [12:20:22] awight: I had something like that in mind :) [12:20:49] (03PS9) 10Mforns: Refactor data_quality oozie bundle to fix too many partitions [analytics/refinery] - 10https://gerrit.wikimedia.org/r/547320 (https://phabricator.wikimedia.org/T235486) [12:21:29] hehe I see you've done this before ;-) [12:28:31] * awight fails to find Grafana metric for the total number of pageviews per wiki. [12:29:16] awight: use turnilo :) [12:31:43] Here's something weird. I found https://wikitech.wikimedia.org/wiki/Analytics/Systems/Turnilo , however: https://wikitech.wikimedia.org/w/index.php?search=turnilio&title=Special%3ASearch&go=Go&ns0=1&ns12=1&ns116=1&ns498=1 [12:32:25] awight: not LIO at the end: LO [12:32:49] :) facepalm (: [12:33:26] awight: turnilo.wikimedia.org - Use pageview_daily datasource :) [12:33:26] I'm trying to make a public dashboard however. It's fine, I do have a secret data source I can use to approximate pageviews in Grafana. [12:33:38] ok :) [12:33:48] Good to know, though! I've had it on my list to read about Druid. 
[12:58:36] 10Analytics, 10Analytics-Kanban, 10User-Elukey: Create test Kerberos identities/accounts for some selected users in hadoop test cluster - https://phabricator.wikimedia.org/T212258 (10Isaac) @elukey I played around with it and didn't run into any major issues. Thanks for the detailed notes! My only two concer... [13:34:48] 10Analytics, 10WMDE-Analytics-Engineering, 10WMDE-FUN-Funban-2019, 10WMDE-FUN-Sprint-2019-10-14, 10WMDE-New-Editors-Banner-Campaigns (Banner Campaign Autumn 2019): Implement banner design for WMDEs autum new editor recruitment campaign - https://phabricator.wikimedia.org/T235845 (10chrp) 05Open→03Reso... [13:42:03] joal: any chance you're here? [13:43:30] oh nvm [13:48:10] Hey fdans - all good? [13:48:36] joal: yeayea facing some shenanigans with unittest in python, but it's all good now [13:48:42] thank you for your CR yesterday :) [13:49:07] np fdans - were you ok with my comments? [13:49:24] joal: yes, it all makes sense [13:51:04] cool - fdans I realise I forgot to mention: maybe you could rename the file "oozie_day_intervals", to point out the day-only aspect of it? [13:51:23] joal: sure, no problem [13:51:32] thanks fdans :) [14:06:40] 10Analytics, 10Analytics-Kanban, 10Patch-For-Review, 10User-Elukey: Shorten the time it takes to move files from hadoop to dump hosts by Kerberizing/hadooping the dump hosts - https://phabricator.wikimedia.org/T234229 (10Ottomata) > it might mean that we'll have to stop rsyncs for a couple of days Do we n... [14:13:08] 10Analytics, 10Analytics-Kanban, 10Patch-For-Review, 10User-Elukey: Shorten the time it takes to move files from hadoop to dump hosts by Kerberizing/hadooping the dump hosts - https://phabricator.wikimedia.org/T234229 (10elukey) >>! In T234229#5644194, @Ottomata wrote: >> it might mean that we'll have to... [14:15:59] joal: meeting!
:) [14:16:14] joining in a minute [14:37:55] 10Analytics, 10Research-Backlog, 10Wikidata: Copy Wikidata dumps to HDFS - https://phabricator.wikimedia.org/T209655 (10Ottomata) [14:49:02] joal: I'm following up on the daily dumps import (https://github.com/wikimedia/analytics-refinery/blob/master/bin/import-mediawiki-dumps) [14:49:18] 10Analytics, 10Analytics-Kanban, 10User-Elukey: Create test Kerberos identities/accounts for some selected users in hadoop test cluster - https://phabricator.wikimedia.org/T212258 (10elukey) Thanks a lot for the tests! >>! In T212258#5644058, @Isaac wrote: > @elukey I played around with it and didn't run in... [14:49:24] it seems the refinery-import-page-current-dumps timer has failed [14:49:34] because some wikis were unavailable [14:49:35] milimetric: Hi! [14:50:02] milimetric: we've looked into that with elukey yesterday - Issue was due to a json file not correctly read from the job [14:50:23] oh ok, but the timer still shows " Active: inactive (dead) since Thu 2019-11-07 07:00:00 UTC; 7h ago" [14:50:32] milimetric: I think we've experienced bad luck in term of file-reading/writing, and we tried to read a file not yet correctly written [14:50:52] ok, sounds fine, I'm just trying to clear the nagios alert [14:51:02] milimetric: 7th is today- I think it has run correctly today (error was yesterday) [14:51:12] hm... why's the timer say inactive then... [14:51:34] milimetric: still in incorrect format? I thought it had been solved by my rerun [14:51:52] oh... it's inactive (dead) with SUCCESS code... so that means it's fine? Wouldn't that mean it would've sent a RECOVERY email? 
[14:51:59] milimetric: yesterday - 19:20:38 < icinga-wm> RECOVERY - Check the last execution of refinery-import-page-current-dumps on stat1007 is OK: OK: Status of the systemd unit refinery-import-page-current-dumps [14:52:03] https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers [14:53:06] oh, ok, I just missed the recovery email, it was in my trash [14:53:07] thanks! [14:53:33] np milimetric :) [14:53:36] thanks for caring! [14:54:17] (I did work with the managing systemd timers but honestly it's not that friendly, like it's a bit confusing that "inactive (dead)" can be a good thing :)) [14:55:02] true milimetric - Let's confirm with elukey [14:56:50] inactive means only that it is not running, that's it :) [14:57:47] as Joseph pointed out the last return code is also displayed, but it might not be intuitive I agree [14:58:10] we can add some documentation to the page if it can help [14:58:37] are you guys able to execute something like sudo systemctl -a | grep failed [14:58:40] Gone for kids [14:58:40] ? [14:58:53] I usually run that to see if a job is still in failed state [15:11:38] 10Analytics, 10Research-management: Test GPUs with an end-to-end training task (Photo vs Graphics image classifier) - https://phabricator.wikimedia.org/T221761 (10Miriam) Finally I managed to do some progress. I built a simple model for testing purposes. TL; DR: I built a tensorflow-based model to classify i... [15:12:06] 10Analytics, 10Research-management: Test GPUs with an end-to-end training task (Photo vs Graphics image classifier) - https://phabricator.wikimedia.org/T221761 (10Miriam) [15:34:17] a-team: is it ok if I start the cumin cookbook to restart all the jvms on the hadoop workers? [15:34:20] (new jvm out..) [15:34:27] it might mean that some jobs will fail [15:35:15] nuria: coming to meeting? [15:37:49] heya team :] [15:37:54] mforns: you coming to this meeting/ [15:38:11] oh, man, sure! 
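On the "inactive (dead)" confusion above: for a oneshot timer job, inactive only means the unit is not running right now; whether the last run was fine is in the unit's result code (the ActiveState and Result properties that `systemctl show` reports). A small illustrative helper, not production code, encoding that reading:

```python
def unit_is_healthy(active_state: str, result: str) -> bool:
    """'inactive (dead)' plus Result=success is a good thing: the last
    run finished cleanly and the unit simply isn't running right now.
    Only a failed state or a non-success result is a problem."""
    if active_state == "failed":
        return False
    return result == "success"

print(unit_is_healthy("inactive", "success"))   # the confusing-but-fine case
print(unit_is_healthy("failed", "exit-code"))   # what an icinga CRIT means
```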
[15:41:12] !log roll restart all jvms on Hadoop Analytics Workers to pick up the new jvm [15:41:14] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [15:42:01] elukey: yes! ty [15:42:53] ottomata, is there something I need to do to be able to access a hive table through presto in superset? I'm getting some errors... [15:44:12] 10Analytics, 10Knowledge-Integrity, 10Epic, 10Patch-For-Review: Citation Usage: run third round of data collection - https://phabricator.wikimedia.org/T213969 (10Aklapper) [15:51:08] I'm trying to create a table on top of presto_analytics_hive, but it says: Table [wmf.data_quality_metrics] could not be found, please double check your database connection, schema, and table name, error: wmf.data_quality_metrics [15:51:33] if I try without wmf. it gives the same error [15:52:10] joal? ^ [15:57:43] create a table? [15:57:44] mforns: ? [15:57:47] you mean in superset? [15:57:56] ottomata, yes! [15:57:58] ah [15:58:42] mforns: i can see it in the SQL Lab editor just fine [15:59:05] the table? [15:59:09] yes [15:59:15] oh [15:59:15] maybe [15:59:22] with presto if you are using CLI or something directly [15:59:36] you have to refer to the full database schema (database) table [15:59:37] so [15:59:53] presto_analytics_hive.wmf.data_quality_metrics [15:59:54] but [16:00:00] yeah so if you are using that in a query [16:00:03] you might have to do it like that [16:00:05] yes [16:00:06] that's why [16:00:09] a bit confusing for sure
:D [16:00:26] yeah, since presto is a frontend for many data sources [16:00:34] presto_analytics_hive refers to our hive instance [16:00:38] schema is the database in hive [16:00:40] and then table is table [16:00:40] exactly [16:03:54] (03PS5) 10Fdans: Add python script to generate intervals for long backfilling [analytics/refinery] - 10https://gerrit.wikimedia.org/r/547750 (https://phabricator.wikimedia.org/T237119) [16:05:52] (03CR) 10Fdans: Add python script to generate intervals for long backfilling (037 comments) [analytics/refinery] - 10https://gerrit.wikimedia.org/r/547750 (https://phabricator.wikimedia.org/T237119) (owner: 10Fdans) [16:20:09] milimetric: o/ [16:20:19] got some time today to talk stream config with me? [16:20:24] i keep waffling [16:20:27] need brainbouncer [16:21:02] hm i actually have a lot of meetings today too [16:35:43] aaaand we have the kerberos puppet patch! https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/549566/ [16:36:09] it still needs presto and it excludes the work on labstore nodes [16:36:23] but it doesn't look scary at all [16:46:16] 10Analytics, 10Analytics-EventLogging, 10Analytics-Kanban, 10Event-Platform, and 5 others: Public schema.wikimedia.org endpoint for schema.svc - https://phabricator.wikimedia.org/T233630 (10Ottomata) @ema, @Joe informs me that the nginx server serving schema.svc should terminate TLS for the 'encrypt al... [16:46:26] 10Analytics, 10Analytics-Kanban, 10Patch-For-Review: Prepare the Hadoop Analytics cluster for Kerberos - https://phabricator.wikimedia.org/T237269 (10elukey) [16:53:52] Hi team - Are we going to have standup or staff meeting?
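What ottomata spells out is Presto's three-part naming: catalog.schema.table, where the catalog is presto_analytics_hive and the "schema" is the Hive database. A tiny sketch with a hypothetical helper, just to make the shape concrete:

```python
def qualify(catalog: str, schema: str, table: str) -> str:
    """Build the fully-qualified name Presto expects; through the
    presto_analytics_hive catalog, the schema is the Hive database."""
    return ".".join((catalog, schema, table))

fq = qualify("presto_analytics_hive", "wmf", "data_quality_metrics")
query = f"SELECT * FROM {fq} LIMIT 10"
print(query)
```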
[16:59:21] 10Analytics, 10Analytics-Kanban, 10Patch-For-Review: Prepare the Hadoop Analytics cluster for Kerberos - https://phabricator.wikimedia.org/T237269 (10elukey) [17:00:02] (03CR) 10Fdans: "@nuria oh, yes please :)" [analytics/refinery] - 10https://gerrit.wikimedia.org/r/548776 (owner: 10Fdans) [17:00:04] joal, I want to go to staff [17:00:09] :] [17:00:25] k mforns :) Let's go to staff then! [17:00:52] oh ok staff? a-team ? [17:01:01] yep [17:01:05] mforns: should be good st'ff I guess [17:01:13] xD [17:01:16] joal: nice [17:01:47] * joal feels proud to be acknowledged by lord fdans :) [17:03:00] joal: all hadoop workers roll restarted without any manual intervention [17:03:03] \o/ [17:03:15] \o/! Thanks elukey - Will monitor for errors :) [17:03:19] not even a job failed [17:03:20] wow [17:04:37] nuria: is it sufficient to use a fixed salt for hashing as opposed to switching it for every unique value? [17:05:32] lexnasser: yes, a global salt that is not stored anywhere and is large enough should be sufficient [17:33:26] !log restart all jvms on hadoop test workers [17:33:28] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [17:33:36] !log restart zookeeper on druid nodes for jvm upgrades [17:33:38] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [17:51:45] ottomata, mforns , elukey , joal , milimetric , fdans: given updates on staff meeting we will postpone our grokking-kerberos to monday, to happen right after standup. I think we all will be interested in Q & A [17:51:57] ok [17:52:02] k [17:53:59] 10Analytics, 10Analytics-EventLogging, 10Better Use Of Data, 10Event-Platform, and 6 others: Modern Event Platform: Stream Configuration: Implementation - https://phabricator.wikimedia.org/T233634 (10Ottomata) I think some of the issues I'm having are caused by the combination of 3 users of stream config a...
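nuria's answer on salting (one global salt, large enough, never stored anywhere) could look roughly like this; the helper is a sketch under those assumptions, not the code lexnasser actually wrote:

```python
import hashlib
import secrets

# Generated once per run and kept only in memory -- never persisted,
# matching the "not stored anywhere and is large enough" guidance.
SALT = secrets.token_bytes(32)

def anonymize(value: str) -> str:
    """Hash a sensitive value with the fixed global salt, rather than
    switching salts for every unique value."""
    return hashlib.sha256(SALT + value.encode("utf-8")).hexdigest()

a = anonymize("sensitive-value-1")
b = anonymize("sensitive-value-1")
c = anonymize("sensitive-value-2")
# Consistent within a run (a == b), distinct across values (a != c),
# and unrecoverable once the process exits and the salt is gone.
```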
[17:54:42] nuria: ok so we'll enable kerberos on Monday, got it, thanks [17:54:44] :D [17:55:02] elukey: totally, just ping if you need any help, i will be at the BEACH [17:55:06] ahahahahah [18:09:10] 10Analytics, 10Analytics-Kanban, 10User-Elukey: Create test Kerberos identities/accounts for some selected users in hadoop test cluster - https://phabricator.wikimedia.org/T212258 (10Isaac) > The option that is currently available is a keytab Ok, that works for me. I'll avoid it but it's good to know it's an... [18:19:56] lol [18:20:03] :) [18:39:00] (03CR) 10Nuria: [C: 03+1] Update python scripts (licenses, imports, typos) [analytics/refinery] - 10https://gerrit.wikimedia.org/r/549427 (owner: 10Joal) [18:52:20] * elukey off! [19:03:18] a-team :D https://superset.wikimedia.org/superset/dashboard/73/ [19:03:55] very nice mforns ! [19:14:51] milimetric: are you planning to send the announcement you sent to analytics list to wiki-research-l and wikidata lists? [19:15:02] milimetric: or do you rely on us to do it? :D [19:15:48] leila: I'm sorry I never really understood what lists I should be sending stuff to, so I stay close to home and branch out as needed. I will send to those two now [19:16:00] leila: but wait, wikidata?! [19:16:13] milimetric: makes sense. and thanks! [19:16:25] milimetric: why not wikidata? this is not wp specific, is it? [19:16:52] leila: it is, only wikipedias for now [19:16:55] oh wait [19:17:00] I think it might include wikidata :) [19:17:06] no, it doesn't [19:17:18] milimetric: well then no Wikidata if it doesn't include it. ;) [19:17:52] I actually have no idea what the wikidata list is, I've never seen it [19:18:01] milimetric: let's change that.
;) [19:25:29] 10Analytics, 10Analytics-Kanban, 10Patch-For-Review: Hive data quality alarms pipeline - https://phabricator.wikimedia.org/T235486 (10mforns) [19:25:51] 10Analytics, 10Analytics-Kanban, 10Patch-For-Review: Hive data quality alarms pipeline - https://phabricator.wikimedia.org/T235486 (10mforns) [19:26:00] 10Analytics, 10Analytics-Kanban, 10Patch-For-Review: Hive data quality alarms pipeline - https://phabricator.wikimedia.org/T235486 (10mforns) [19:26:33] 10Analytics, 10Analytics-Kanban, 10Patch-For-Review: Hive data quality alarms pipeline - https://phabricator.wikimedia.org/T235486 (10mforns) Here's the data quality dashboard in Superset: https://superset.wikimedia.org/superset/dashboard/73/ [19:42:36] 10Analytics, 10Better Use Of Data, 10Product-Infrastructure-Team-Backlog, 10Wikimedia-Logstash, and 3 others: Client side error logging production launch - https://phabricator.wikimedia.org/T226986 (10Ottomata) FYI: https://blog.sentry.io/2019/11/06/relicensing-sentry We weren't planning on using Sentry in... [19:44:14] (03CR) 10Joal: "Minor things - looks a lot better to me :) Thanks fdans!" (033 comments) [analytics/refinery] - 10https://gerrit.wikimedia.org/r/547750 (https://phabricator.wikimedia.org/T237119) (owner: 10Fdans) [19:44:42] gone for diner team - see you tomorrow [20:18:31] mforns: NICE [20:23:16] 10Analytics, 10Analytics-Kanban, 10Patch-For-Review: Hive data quality alarms pipeline - https://phabricator.wikimedia.org/T235486 (10Nuria) So nice! Let's show metrics from the beginning of time if we can as this series is very small. [20:31:20] 10Analytics, 10Analytics-Kanban, 10Patch-For-Review: Hive data quality alarms pipeline - https://phabricator.wikimedia.org/T235486 (10mforns) It's still backfilling, will take a couple days. 
[20:35:49] (03CR) 10Nuria: [C: 03+2] Add date to daily cassandra loading job titles,reduce SLA hours [analytics/refinery] - 10https://gerrit.wikimedia.org/r/548776 (owner: 10Fdans) [20:35:58] (03CR) 10Nuria: [V: 03+2 C: 03+2] Add date to daily cassandra loading job titles,reduce SLA hours [analytics/refinery] - 10https://gerrit.wikimedia.org/r/548776 (owner: 10Fdans) [20:41:34] (03PS6) 10Fdans: Add python script to generate intervals for long backfilling [analytics/refinery] - 10https://gerrit.wikimedia.org/r/547750 (https://phabricator.wikimedia.org/T237119) [20:41:46] (03CR) 10Fdans: Add python script to generate intervals for long backfilling (033 comments) [analytics/refinery] - 10https://gerrit.wikimedia.org/r/547750 (https://phabricator.wikimedia.org/T237119) (owner: 10Fdans) [21:04:38] nuria: from my and milimetric's sync-up, we were wondering why the 2016 dataset filters x-cache by a specific cache as opposed to the hostname that actually serves the content (ex. x-cache LIKE '%cp3030%' vs x-cache LIKE '%cp3030 hit%' vs hostname LIKE '3030%') [21:05:59] lexnasser: if those two fields in 2016 were the same as they are now, ya, seems like hostname should be sufficient [21:06:42] nuria: so all that is needed in the dataset is entries that ended up being served by that host? [21:08:14] lexnasser: mmm...no [21:08:24] lexnasser: re-reading ticket i think we want x-cache [21:08:55] lexnasser: * i think* , reading https://phabricator.wikimedia.org/T128132#2536351 [21:09:16] lexnasser: (but you can verify with data) that x-cache will give you entries that will not appear if you use hostname alone [21:09:22] nuria: I'm trying to decide between filtering by hostname or filtering by x-cache.
both would include an x-cache field [21:09:43] lexnasser: per the comment i linked i think x-cache is the correct one [21:10:29] nuria: got it, thanks for clarifying [21:13:21] 10Analytics, 10Analytics-Kanban: Add Mon Wikipedia to analytics setup - https://phabricator.wikimedia.org/T235747 (10Nuria) 05Open→03Resolved [21:16:46] 10Analytics: Request for a large request data set for caching research and tuning - https://phabricator.wikimedia.org/T225538 (10lexnasser) @Danielsberger Also, I saw that in your 2016 dataset request ([[ https://phabricator.wikimedia.org/T128132 | link ]]) that you wanted a separate query field for a save fla... [21:19:24] (03CR) 10Joal: [C: 03+1] "Thanks for the changes fdans :) Ok to go for me" [analytics/refinery] - 10https://gerrit.wikimedia.org/r/547750 (https://phabricator.wikimedia.org/T237119) (owner: 10Fdans) [21:22:18] (03CR) 10Joal: "I think this needs to be reverted or updated. Coordinators don't have access to year/month/day params (workfloew do). Maybe you were think" [analytics/refinery] - 10https://gerrit.wikimedia.org/r/548776 (owner: 10Fdans) [21:22:21] nuria: --^ [21:22:46] (03PS1) 10Nuria: Revert "Add date to daily cassandra loading job titles,reduce SLA hours" [analytics/refinery] - 10https://gerrit.wikimedia.org/r/549636 [21:23:16] joal: thanks for the catch, see revert here: https://gerrit.wikimedia.org/r/#/c/analytics/refinery/+/549636/ [21:24:02] nuria: full revert ok, or do you want to keep smaller SLAs? [21:27:04] reverting [21:27:12] (03CR) 10Joal: [V: 03+2 C: 03+2] "Merging to revert buggy patch" [analytics/refinery] - 10https://gerrit.wikimedia.org/r/549636 (owner: 10Nuria) [22:01:38] a-team link to wikistats live in the footer of every wikipedia :) [22:01:57] well done fdans [22:02:07] O.o!!!! 
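The distinction being weighed above: x-cache records the cache hosts a request passed through, so x-cache LIKE '%cp3030%' matches any request that touched cp3030, '%cp3030 hit%' matches only requests cp3030 answered from cache, and hostname alone matches only the host that finally served the response. A sketch of the two x-cache predicates (the x-cache string format here is illustrative, not the exact production format):

```python
def passed_through(x_cache: str, host: str) -> bool:
    """x-cache LIKE '%cp3030%': the request touched this cache host
    at some point, even if another host served the response."""
    return host in x_cache

def hit_on(x_cache: str, host: str) -> bool:
    """x-cache LIKE '%cp3030 hit%': this host answered from cache."""
    return f"{host} hit" in x_cache

# Illustrative x-cache value: the request passed through cp1066
# (miss) and was answered from cache by cp3030.
xc = "cp1066 miss, cp3030 hit/5"
print(passed_through(xc, "cp3030"), hit_on(xc, "cp3030"))
print(passed_through(xc, "cp1066"), hit_on(xc, "cp1066"))
```

This is why filtering on x-cache surfaces entries that a hostname-only filter would drop.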
[22:02:22] now we shall be fixing bugs on wikistats for approximately 35 quarters [22:02:27] xDDDD [22:03:25] milimetric: this is actually an elaborate plan of mine to pressure the team to do the i18n work :D [22:05:37] good plan! [22:14:40] 10Analytics, 10Analytics-Cluster, 10Operations, 10ops-eqiad: analytics1062 lost one of its power supplies - https://phabricator.wikimedia.org/T237133 (10Jclark-ctr) Received new psu and replaced. Tracking for RMA {F31057616} [22:15:31] 10Analytics, 10Analytics-Cluster, 10Operations, 10ops-eqiad: analytics1062 lost one of its power supplies - https://phabricator.wikimedia.org/T237133 (10Jclark-ctr) 05Open→03Resolved [22:16:35] fdans: jajajaja [22:16:43] fdans: looking into PIWIK NOW [22:30:55] 10Analytics, 10Research, 10Epic, 10Patch-For-Review: Citation Usage: run third round of data collection - https://phabricator.wikimedia.org/T213969 (10leila) [22:46:39] 10Analytics, 10Analytics-Cluster, 10Operations, 10ops-eqiad: analytics1062 lost one of its power supplies - https://phabricator.wikimedia.org/T237133 (10Dzahn) 05Resolved→03Open it is still shown as CRIT in monitoring: https://icinga.wikimedia.org/cgi-bin/icinga/extinfo.cgi?type=2&host=analytics1062&s... [22:49:33] 10Analytics, 10Analytics-EventLogging, 10Better Use Of Data, 10Event-Platform, and 6 others: Modern Event Platform: Stream Configuration: Implementation - https://phabricator.wikimedia.org/T233634 (10Ottomata) Ok, I think I'm making some progress after a short chat with Petr on IRC today. I'm going to mak... [22:59:08] 10Analytics, 10Analytics-EventLogging, 10Advanced-Search, 10TCB-Team, and 2 others: Advanced Search eventlogging messages don't validate against the schema - https://phabricator.wikimedia.org/T237060 (10awight) 05Open→03Resolved a:03awight https://logstash.wikimedia.org/goto/c7115907236f3f6b57db134bc...
[23:02:33] (03CR) 10Nuria: [C: 03+1] Add python script to generate intervals for long backfilling [analytics/refinery] - 10https://gerrit.wikimedia.org/r/547750 (https://phabricator.wikimedia.org/T237119) (owner: 10Fdans) [23:05:43] 10Analytics, 10Analytics-Kanban, 10Patch-For-Review, 10Services (watching): Add mediarequests per referer endpoint to AQS - https://phabricator.wikimedia.org/T232857 (10Nuria) 05Open→03Resolved [23:05:47] 10Analytics, 10Patch-For-Review, 10Services (watching): Add mediacounts data to AQS and, from there, Restbase - https://phabricator.wikimedia.org/T207208 (10Nuria) [23:06:00] 10Analytics, 10Analytics-Kanban: Move Analytics Report Updater to Python 3 - https://phabricator.wikimedia.org/T204736 (10Nuria) 05Open→03Resolved [23:06:02] 10Analytics-Kanban: Deprecate Python 2 software from the Analytics infrastructure - https://phabricator.wikimedia.org/T204734 (10Nuria) [23:06:23] 10Analytics, 10Analytics-Kanban, 10Operations, 10Traffic, and 2 others: Publish tls related info to webrequest via varnish - https://phabricator.wikimedia.org/T233661 (10Nuria) 05Open→03Resolved [23:06:45] 10Analytics, 10Analytics-Kanban, 10Operations, 10Traffic, 10observability: Update webrequest_128 dataset in turnilo to include TLS fields once available - https://phabricator.wikimedia.org/T237117 (10Nuria) 05Open→03Resolved [23:06:48] 10Analytics, 10Analytics-Kanban, 10Operations, 10Traffic, and 2 others: Publish tls related info to webrequest via varnish - https://phabricator.wikimedia.org/T233661 (10Nuria) [23:06:54] 10Analytics, 10Analytics-Kanban, 10Operations, 10Traffic, 10observability: Update webrequest_128 dataset in turnilo to include TLS fields once available - https://phabricator.wikimedia.org/T237117 (10Nuria) [23:07:07] 10Analytics, 10Analytics-Cluster, 10Analytics-Kanban, 10Patch-For-Review: Upgrade Spark to 2.4.x - https://phabricator.wikimedia.org/T222253 (10Nuria) 05Open→03Resolved [23:08:24] 10Analytics, 10Analytics-EventLogging, 
10Analytics-Kanban, 10Performance-Team (Radar): Drop Navigationtiming data entirely from mysql storage? - https://phabricator.wikimedia.org/T233891 (10Nuria) 05Open→03Resolved [23:08:26] 10Analytics, 10Analytics-EventLogging, 10Analytics-Kanban: Archive data on eventlogging MySQL to analytics replica before decomisioning - https://phabricator.wikimedia.org/T231858 (10Nuria) [23:13:18] 10Analytics, 10Analytics-Kanban: Enable TLS encryption for the MapReduce Shufflers in the Hadoop Analytics cluster - https://phabricator.wikimedia.org/T236995 (10Nuria) 05Open→03Resolved [23:13:20] 10Analytics: Enable Security (stronger authentication and data encryption) for the Analytics Hadoop cluster and its dependent services - https://phabricator.wikimedia.org/T211836 (10Nuria) [23:13:37] 10Analytics, 10Analytics-Kanban: Check Avro as potential better file format for wikitext-history - https://phabricator.wikimedia.org/T236687 (10Nuria) 05Open→03Resolved [23:14:34] 10Analytics, 10Analytics-Kanban: Understand why SQL string pattern matching differ from Hive to Spark - https://phabricator.wikimedia.org/T236985 (10Nuria) 05Open→03Resolved [23:15:14] 10Analytics, 10Analytics-Kanban: Import siteinfo dumps onto HDFS - https://phabricator.wikimedia.org/T234333 (10Nuria) Did we documented this data is available? 
[23:15:55] 10Analytics, 10Analytics-EventLogging, 10Analytics-Kanban: Disable production EventLogging analytics MySQL consumers - https://phabricator.wikimedia.org/T232349 (10Nuria) 05Open→03Resolved [23:15:57] 10Analytics-EventLogging, 10Analytics-Kanban: Sunset MySQL data store for eventlogging - https://phabricator.wikimedia.org/T159170 (10Nuria) [23:17:14] 10Analytics, 10Analytics-Cluster, 10Analytics-Kanban: Optimize archiva git-fat symlink script - https://phabricator.wikimedia.org/T235668 (10Nuria) 05Open→03Resolved [23:17:42] 10Analytics, 10Analytics-Kanban, 10Patch-For-Review, 10User-Elukey: Upgrade eventlogging to Python 3 - https://phabricator.wikimedia.org/T233231 (10Nuria) 05Open→03Resolved [23:17:44] 10Analytics-Kanban: Deprecate Python 2 software from the Analytics infrastructure - https://phabricator.wikimedia.org/T204734 (10Nuria) [23:18:25] 10Analytics-EventLogging, 10Analytics-Kanban: Remove references in doc to mysql storage for EL data - https://phabricator.wikimedia.org/T236403 (10Nuria) 05Open→03Resolved [23:18:27] 10Analytics-EventLogging, 10Analytics-Kanban: Sunset MySQL data store for eventlogging - https://phabricator.wikimedia.org/T159170 (10Nuria) [23:20:15] 10Analytics, 10Analytics-Kanban, 10Patch-For-Review: Set up automatic deletion for netflow datasource in Druid - https://phabricator.wikimedia.org/T229674 (10Nuria) We can merge this next week. ping @mforns [23:22:01] 10Analytics, 10Analytics-Kanban, 10Tool-Pageviews: Topviews Analysis of the Hungarian Wikipedia is flooded with spam - https://phabricator.wikimedia.org/T237282 (10Nuria) [23:50:32] 10Analytics, 10Analytics-Kanban, 10Tool-Pageviews: Topviews Analysis of the Hungarian Wikipedia is flooded with spam - https://phabricator.wikimedia.org/T237282 (10Nuria) After running the data for hu.wikipedia through bot spikes detection the top list for 2019/10/16 looks like the following. Most rogue page... 
[23:53:42] 10Analytics, 10Analytics-Kanban, 10Tool-Pageviews: Topviews Analysis of the Hungarian Wikipedia is flooded with spam - https://phabricator.wikimedia.org/T237282 (10Nuria) Pinging here #product-analytics so they are aware that effects of bots in "small" sites like these can be dramatic