[07:34:16] hello
[07:34:22] o/
[08:48:27] (CR) Andrew-WMDE: [C: +1] Push job start date forward to first data collection [analytics/reportupdater-queries] - https://gerrit.wikimedia.org/r/655949 (https://phabricator.wikimedia.org/T262209) (owner: Awight)
[09:10:28] Analytics, SRE, ops-eqiad: an-test-worker1002 may need a DAC replace - https://phabricator.wikimedia.org/T272009 (elukey)
[09:10:46] really nice --^
[09:11:00] today I wanted to test the cookbook and one of the workers is down
[09:12:33] (CR) Andrew-WMDE: [C: +1] Collect metrics of all wikis [analytics/reportupdater-queries] - https://gerrit.wikimedia.org/r/655886 (https://phabricator.wikimedia.org/T271894) (owner: WMDE-Fisch)
[10:19:26] (CR) WMDE-Fisch: [C: +1] Push job start date forward to first data collection [analytics/reportupdater-queries] - https://gerrit.wikimedia.org/r/655949 (https://phabricator.wikimedia.org/T262209) (owner: Awight)
[11:00:07] (PS5) DCausse: Add rdf-streaming-updater schemas for side outputs [schemas/event/secondary] - https://gerrit.wikimedia.org/r/647723 (https://phabricator.wikimedia.org/T269619)
[11:18:58] Hi a-team. We were investigating setting up a Matomo instance to collect pageview data for the Library Card platform (https://phabricator.wikimedia.org/T265001) and found that we already have piwik.wikimedia.org. Is there a process for setting up a new tool there?
[11:20:44] Samwalton9: hi! Yes we maintain a piwik instance but it is basically for microsites, very low in traffic.. I am a little ignorant about the Library Card platform, can you give us more details?
[11:23:35] It's a WMF project (https://www.mediawiki.org/wiki/The_Wikipedia_Library), a library tool for finding sources for WP research (https://wikipedialibrary.wmflabs.org/). I'm not sure I have a good estimate for daily visitors but currently it's somewhere in the region of ~100 users per day.
We do have Q4 plans to ramp that number up quite substantially, which may take us out of the "very low" traffic group.
[11:24:39] (If we're not already above 'very low') :)
[11:24:47] Samwalton9: yes exactly, Matomo is very painful to maintain when the traffic rises, and it is not meant to scale very well.. Maybe if this is the goal it would be nice to track pageviews via something like the Modern Event Platform?
[11:25:20] Potentially so - do you have more info on that?
[11:26:31] Samwalton9: https://wikitech.wikimedia.org/wiki/Event_Platform - I'll also ask ottomata to follow up on this later on
[11:26:54] Samwalton9: anyway, to draw lines, 100 users/day is super fine, even 10000/day, etc..
[11:27:18] when we rise to say 10 users/second etc.. that is where things get bad for matomo
[11:27:48] Thanks for the link. Those numbers are helpful, I can't imagine we'd get to that level, so perhaps Matomo is fine for our purposes.
[11:28:15] yep yep I'd start with getting some more precise estimates, I'll check the current sites that we have and report back
[11:31:38] Ok, I'll see if I can get some more accurate data. Appreciate the help :)
[12:24:33] joal: just to add more fun to an-presto1004 - while rebooting with the cookbook, I realized (since it was taking ages) that the host was PXE booting (so os install)
[12:24:42] I didn't stop it in time, so I am reimaging the node now
[12:24:59] (wrong bios settings, boot pxe before boot disk)
[12:25:35] that host is really cursed! :D
[12:35:09] Maaaaaan
[12:35:42] * joal sends wikilove to elukey and counter-curses to an-presto1004!
[12:39:19] (CR) Joal: [V: +2 C: +2] "Merging for next deploy" [analytics/refinery] - https://gerrit.wikimedia.org/r/655975 (https://phabricator.wikimedia.org/T271980) (owner: Gerrit maintenance bot)
[12:39:29] ok an-presto1004 back up
[12:39:52] * elukey afk! lunch
[12:53:09] same
[13:09:27] o/
[13:09:58] "EventStreams" are only configurable via mediawiki-config?
[13:12:26] I mean this: https://meta.wikimedia.org/w/api.php?format=json&action=streamconfigs&all_settings=true (I understand that this is the official repo for stream config even for streams that are not mediawiki related)
[13:15:37] Hi dcausse - I'm sorry I'm not knowledgeable enough to help :(
[13:15:55] dcausse: you'll need to wait for either mforns or ottomata I think
[13:16:07] joal: sure, np, thanks!
[13:23:07] morning!
[13:23:20] Hi fdans
[13:25:08] joal: hello my friend!
[13:25:26] How are you, fdans?
[13:26:05] elukey: sorry about all the worker headaches, lmk if you need to rubberduck or some help :)
[13:41:19] fdans: ahahah nono it was just to add another data point to the an-presto1004 curse!
[14:13:17] (PS1) Milimetric: Update logic per Isaac [analytics/reportupdater-queries] - https://gerrit.wikimedia.org/r/656166 (https://phabricator.wikimedia.org/T271571)
[14:15:18] Analytics, Product-Analytics, Patch-For-Review: Update Image usage metric - https://phabricator.wikimedia.org/T271571 (Milimetric) @cchen: patch is up, will be reviewed shortly and will apply on the next monthly run. I'm wondering if there's a need to backfill old data, I think we keep at least one...
[14:27:31] mforns: so I just saw this refine failure for event_sanitized, at the 45-day mark
[14:27:50] and I looked at the source data, and it just has a _REFINED flag in the folder and nothing else (no success or anything)
[14:27:53] it's in /wmf/data/event/SpecialInvestigate/year=2020/month=11/day=29/hour=10
[14:28:46] I'm not sure how that happened, but I think the right move is to delete that hour folder
[14:28:58] oh, I should check the raw source, one sec
[14:35:53] hm... weird. There are 4 records in the source data, so there was a silent failure on the original ingestion, and just a _REFINED flag was written with no data.
I rerun refine and then rerun sanitization
[14:48:53] mforns:
[14:48:56] i mean
[14:48:56] milimetric:
[14:49:03] those 4 records are probably canary events
[14:49:07] which are filtered out
[14:49:13] there was a failure in the original refine job?
[14:49:29] raw -> event ?
[14:50:37] oh! ottomata no, there was just nothing in the data/event/ folder, only a _REFINED flag and nothing else
[14:50:51] hm, this is an interesting case
[14:50:59] how do I know they're canaries?
[14:51:00] milimetric: that makes sense (although we only recently started filtering canaries, like last week)
[14:51:06] meta.domain == 'canary'
[14:51:29] actually... oh no
[14:51:34] for eventlogging legacy data
[14:51:42] we've always filtered them and also any non 'wmf' domain
[14:51:48] so that makes sense
[14:52:23] hm, no, these don't have meta, hdfs dfs -text /wmf/data/raw/eventlogging/eventlogging_SpecialInvestigate/hourly/2020/11/29/10/eventlogging_SpecialInvestigate.1007.0.5.3951.1606644000000
[14:52:35] oh right that hasn't been migrated yet...i think?
[14:52:47] sounds right... wait did I use the wrong script then
[14:52:53] brb
[14:53:20] "webHost": "login.wikimedia.org"
[14:54:21] hm that should be fine
[15:07:14] hm... ok, well, either way something's wrong with the refine script because it should either fail or succeed. I'm not sure if it really wrote _REFINED and nothing else the first time, or if that was some kind of manual intervention, but now it just runs and doesn't output anything, not even a _REFINED flag.
[15:13:13] Analytics, Analytics-Kanban: Investigate oozie banner monthly job timeouts - https://phabricator.wikimedia.org/T264358 (Ottomata) The October job still timed out after 60 days :( https://hue.wikimedia.org/oozie/list_oozie_coordinator/0000497-191216160148723-oozie-oozi-C/ ` 15:08:15 [@an-launcher1002:/...
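Editor's note on the canary check mentioned around 14:49-14:51: the rule quoted in the chat is `meta.domain == 'canary'`. The snippet below is only a minimal sketch of that rule in plain Python, not the actual Refine code; the function name `drop_canaries` and the dict-shaped events are assumptions made for illustration. Events without a `meta` field (like the legacy records seen at 14:52) are kept by this sketch, and the separate legacy filter on non-'wmf' domains is not modeled.

```python
from typing import Any, Dict, Iterable, List

# Value quoted in the chat: meta.domain == 'canary'
CANARY_DOMAIN = "canary"


def drop_canaries(events: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return the events whose meta.domain is not 'canary'.

    Hypothetical helper: events are assumed to be plain dicts.
    Events with no 'meta' field at all are kept unchanged.
    """
    kept = []
    for event in events:
        meta = event.get("meta") or {}
        if meta.get("domain") != CANARY_DOMAIN:
            kept.append(event)
    return kept


if __name__ == "__main__":
    sample = [
        {"meta": {"domain": "canary"}},
        {"meta": {"domain": "en.wikipedia.org"}},
        {"webHost": "login.wikimedia.org"},  # legacy-style record, no meta
    ]
    for event in drop_canaries(sample):
        print(event)
```

If an hour only ever received canary events, a filter like this would leave nothing to write, which is one way to end up with an empty refined folder.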
[15:15:10] (PS1) Ottomata: Explicitly set timeout in banner_activity-druid-monthly-coord [analytics/refinery] - https://gerrit.wikimedia.org/r/656185 (https://phabricator.wikimedia.org/T264358)
[15:15:18] milimetric: will look with ya in a few mins
[15:15:41] (CR) Ottomata: [V: +2 C: +2] Explicitly set timeout in banner_activity-druid-monthly-coord [analytics/refinery] - https://gerrit.wikimedia.org/r/656185 (https://phabricator.wikimedia.org/T264358) (owner: Ottomata)
[15:20:02] !log Deployed refinery using scap to an-launcher1002, then deployed onto hdfs - T264358
[15:20:02] T264358: Investigate oozie banner monthly job timeouts - https://phabricator.wikimedia.org/T264358
[15:37:13] Analytics, Analytics-Kanban, Patch-For-Review: Investigate oozie banner monthly job timeouts - https://phabricator.wikimedia.org/T264358 (Ottomata) OH! DOH! Nope, I had just followed the link to the oozie coordinator job that we killed in December. DOH! https://hue.wikimedia.org/oozie/list_oozie_c...
[15:38:38] (PS1) Ottomata: Revert Explicitly set timeout in banner_activity-druid-monthly-coord back to -1 [analytics/refinery] - https://gerrit.wikimedia.org/r/656192 (https://phabricator.wikimedia.org/T264358)
[15:39:03] (CR) Ottomata: [V: +2 C: +2] Revert Explicitly set timeout in banner_activity-druid-monthly-coord back to -1 [analytics/refinery] - https://gerrit.wikimedia.org/r/656192 (https://phabricator.wikimedia.org/T264358) (owner: Ottomata)
[15:39:55] ok milimetric i'm going to run that refine and see what i see :)
[15:40:17] ok, happy to brainbounce after
[15:40:28] milimetric: you were running refine_eventlogging_analytics
[15:40:28] right?
[15:40:31] not _legacy?
[15:40:48] yes, I ran _legacy by accident once, but I ran _analytics, yes
[15:42:06] oh milimetric your refine didn't do anything because there was a _REFINED flag in the dest dir, right?
[15:42:15] to make it do something, if that is present
[15:42:16] you also need
[15:42:18] no, I deleted that
[15:42:21] --ignore_done_flag=true
[15:42:21] oh.
[15:42:22] ok
[15:42:48] oh, hang on, folder has the _REFINE_FAILED flag in there still from my _legacy run
[15:43:01] oh
[15:43:06] milimetric: this schema IS migrated
[15:43:13] or...wait
[15:43:20] ah! maybe it is now but wasn't then
[15:43:30] right
[15:43:34] that would make sense, then re-refining won't work 'cause it's in the middle
[15:43:42] OH right.
[15:43:47] :)
[15:43:51] ok, mystery solved
[15:43:55] well we can use the _analytics job
[15:44:04] but have to change the blacklist and the whitelist to make it run
[15:44:09] right
[15:44:11] makes sense
[15:44:18] you gonna do that?
[15:44:19] trying that
[15:44:22] k, thx
[15:50:02] milimetric: there is now data in /wmf/data/event/SpecialInvestigate/year=2020/month=11/day=29/hour=10
[15:50:13] will you re-run the refine_sanitize?
[15:51:16] hey milimetric and ottomata, was in a meeting, reading scrollback
[15:51:35] ottomata: I'll rerun sanitize, yes, thank you!
[15:51:38] no worries mforns, we got it
[16:30:39] Analytics, Discovery, Product-Analytics, Research: New anaconda-wmf release with updated packages - https://phabricator.wikimedia.org/T271960 (nshahquinn-wmf) As I understand it, this just determines what's installed by default; the user can always install packages that aren't included later. Is...
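Editor's note on the done-flag behaviour discussed at 15:42: per the chat, a refine run does nothing when a `_REFINED` flag already sits in the destination directory unless `--ignore_done_flag=true` is passed. The following is a rough Python sketch of that decision and of the "flag but no data" symptom from 14:27; it is not the real Refine implementation. The function names and the list-of-file-names input are hypothetical, and the handling of `_REFINE_FAILED` is deliberately left out because the log does not describe it.

```python
from typing import Iterable

# Flag name as it appears in the chat log.
DONE_FLAG = "_REFINED"


def should_refine(dest_files: Iterable[str], ignore_done_flag: bool = False) -> bool:
    """Decide whether a (hypothetical) refine run would process this target.

    Assumption taken from the chat: an existing done flag makes the job
    skip the target unless ignore_done_flag is set.
    """
    if DONE_FLAG in set(dest_files) and not ignore_done_flag:
        return False
    return True


def is_flag_only(dest_files: Iterable[str]) -> bool:
    """True when the directory holds the done flag and nothing else,
    i.e. the odd state described for the SpecialInvestigate hour."""
    return set(dest_files) == {DONE_FLAG}


if __name__ == "__main__":
    listing = ["_REFINED"]
    print(should_refine(listing))                         # skipped by default
    print(should_refine(listing, ignore_done_flag=True))  # forced re-run
    print(is_flag_only(listing))                          # suspicious partition
```

A check like `is_flag_only` could be run over a directory listing to spot other hours in the same state before sanitization reaches them.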
[16:32:48] (PS1) Awight: [WIP] Segment CodeMirror metrics by user edit count [analytics/reportupdater-queries] - https://gerrit.wikimedia.org/r/656210 (https://phabricator.wikimedia.org/T269986)
[16:37:36] Analytics, SRE, Traffic: Traffic anomalies: Factor out list of countries into a dedicated Hive table - https://phabricator.wikimedia.org/T272052 (mforns)
[17:02:09] Analytics, Discovery, Product-Analytics, Research: New anaconda-wmf release with updated packages - https://phabricator.wikimedia.org/T271960 (Ottomata) That's right! To get the deps on the worker nodes if they aren't already there is a bit of extra work for the user (that's you!) though.
[17:28:00] Analytics, Analytics-Kanban: Add client TCP source port to webrequest - https://phabricator.wikimedia.org/T271953 (elukey) @Ladsgroup we should follow up with Traffic to make sure that ATS-TLS adds the client's source port in one HTTP header, since Varnish (and varnishkafka) are behind it (so the source...
[17:35:12] Analytics, Better Use Of Data, Event-Platform, MW-1.35-notes (1.35.0-wmf.37; 2020-06-16), and 2 others: Clients need to generate an ISO 8601 formatted timestamp - https://phabricator.wikimedia.org/T240460 (razzi)
[17:38:16] Analytics, Analytics-Kanban: Address refinery-source security vulnerabilities - https://phabricator.wikimedia.org/T237774 (Ottomata)
[17:39:07] Analytics: Address jackson version security vulnerabilities in refinery-source - https://phabricator.wikimedia.org/T272058 (Ottomata)
[17:39:49] Analytics-Radar, DBA: mariadb on dbstore hosts, and specifically dbstore1004, possible memory leaking - https://phabricator.wikimedia.org/T270112 (razzi)
[17:40:47] Analytics-Radar, SRE, ops-eqiad, Patch-For-Review: Degraded RAID on an-coord1002 - https://phabricator.wikimedia.org/T270768 (razzi)
[17:51:47] Analytics: Retain nonsensitive mediawiki_api_request logging data - https://phabricator.wikimedia.org/T265952 (razzi) Open→Declined Closing since there has been no reply; feel free to reopen.
[17:58:32] Analytics, Analytics-Kanban, Patch-For-Review: Update Spicerack cookbooks to follow the new class API conventions - https://phabricator.wikimedia.org/T269925 (razzi) a:elukey
[17:59:13] Analytics-Radar: Presto error in Superest - only when grouping - https://phabricator.wikimedia.org/T270503 (razzi)
[18:01:25] Analytics: Implement Data Governance Tool - https://phabricator.wikimedia.org/T272060 (Milimetric) p:Triage→Medium
[18:02:09] Analytics, Pageviews-API: 404.php shows up in pageview API for 2017 - https://phabricator.wikimedia.org/T271870 (razzi) a:JAllemandou
[18:03:48] Analytics, Analytics-EventLogging: In PHP EventLogging::logEvent() should automatically read the schema revision id from extension.json - https://phabricator.wikimedia.org/T207034 (Ottomata) Open→Resolved a:Ottomata `$revId` is now overridden if the revision (or schema URI) is set in extensio...
[18:04:22] Analytics, Machine Learning Platform: Get pytorch running on AMD GPU - https://phabricator.wikimedia.org/T245449 (Ottomata)
[18:04:33] Analytics: Gather all data-purge into a single job - https://phabricator.wikimedia.org/T262201 (Milimetric)
[18:04:35] Analytics: Implement Data Governance Tool - https://phabricator.wikimedia.org/T272060 (Milimetric)
[18:04:55] Analytics: Gather all data-purge into a single job - https://phabricator.wikimedia.org/T262201 (Milimetric) p:High→Medium
[18:04:58] Analytics, Analytics-Kanban, Pageviews-API: 404.php shows up in pageview API for 2017 - https://phabricator.wikimedia.org/T271870 (JAllemandou)
[18:06:14] Analytics, Analytics-Kanban: Add client TCP source port to webrequest - https://phabricator.wikimedia.org/T271953 (razzi) a:JAllemandou
[18:06:17] Analytics: Flink Spike - https://phabricator.wikimedia.org/T241185 (Ottomata) Open→Declined Unneeded here, but we might use Flink for other things in the future.
[18:06:19] Analytics: MW REST API Historical Data Endpoint Needs - https://phabricator.wikimedia.org/T240387 (Ottomata)
[18:06:56] Analytics, Analytics-Kanban, Discovery, Product-Analytics, Research: New anaconda-wmf release with updated packages - https://phabricator.wikimedia.org/T271960 (razzi)
[18:08:28] Analytics-Clusters, Analytics-Kanban, Patch-For-Review: Update Spicerack cookbooks to follow the new class API conventions - https://phabricator.wikimedia.org/T269925 (razzi)
[18:09:58] Analytics-Radar, SRE, ops-eqiad: an-test-worker1002 may need a DAC replace - https://phabricator.wikimedia.org/T272009 (razzi)
[18:11:31] (PS1) Bharatkhatri: Wikistats Bug [analytics/wikistats2] - https://gerrit.wikimedia.org/r/656227 (https://phabricator.wikimedia.org/T263973)
[18:11:40] Analytics, SRE, Traffic: Traffic anomalies: Factor out list of countries into a dedicated Hive table - https://phabricator.wikimedia.org/T272052 (razzi) p:Triage→Medium
[18:13:09] Analytics, Product-Analytics: Streamline Superset signup and authentication - https://phabricator.wikimedia.org/T203132 (Ottomata) Open→Resolved a:Ottomata
[18:13:54] Analytics, Analytics-Kanban, SRE, Traffic: Traffic anomalies: Factor out list of countries into a dedicated Hive table - https://phabricator.wikimedia.org/T272052 (razzi) p:Medium→High
[18:15:41] Analytics, Analytics-Wikistats, Patch-For-Review, good first task: Wikistats Bug - easy to understand language for pageviews - https://phabricator.wikimedia.org/T263973 (Bharatkhatri351) Hey @Kipala please review my patch
[18:20:11] elukey: hey, I'll do it, in the mean time this one has PCC https://gerrit.wikimedia.org/r/c/operations/puppet/+/656015 :D
[18:21:08] Analytics, Analytics-Kanban: Check home/HDFS leftovers of dcipoletti - https://phabricator.wikimedia.org/T271092 (JAllemandou) a:JAllemandou
[18:21:14] Analytics, Analytics-Kanban: Check home/HDFS leftovers
of kaldari - https://phabricator.wikimedia.org/T271089 (JAllemandou) a:JAllemandou
[18:22:01] Analytics, Analytics-Kanban, Product-Analytics, Patch-For-Review: Update Image usage metric - https://phabricator.wikimedia.org/T271571 (JAllemandou) a:Milimetric
[18:22:10] milimetric: I just assigned T271571 to you and moved it to kanban
[18:22:11] T271571: Update Image usage metric - https://phabricator.wikimedia.org/T271571
[18:23:32] Analytics-Radar, SRE, ops-eqiad: an-test-worker1002 may need a DAC replace - https://phabricator.wikimedia.org/T272009 (Cmjohnson) an-test-worker1002 is in the wrong vlan. I see it's in private1-c not analytics. Making the change now
[18:32:46] thx Jo
[18:32:51] Analytics-Radar, SRE, ops-eqiad: an-test-worker1002 may need a DAC replace - https://phabricator.wikimedia.org/T272009 (Cmjohnson) Open→Resolved a:Cmjohnson The issue should be resolved now [edit interfaces interface-range vlan-analytics1-c-eqiad] member ge-5/0/13 { ... } + memb...
[18:36:02] Analytics-Radar, SRE, ops-eqiad: an-test-worker1002 may need a DAC replace - https://phabricator.wikimedia.org/T272009 (elukey) Thanks a lot! It is weird since it worked fine up to yesterday, was it done by mistake? Anyway, I'll also check vlans next time!
[18:40:52] Dumb question: when y'all say "LGTM" on CRs, do you mean "Let's get this merged" or "Looks good to me"?
[18:41:09] Analytics-Radar, SRE, ops-eqiad: an-test-worker1002 may need a DAC replace - https://phabricator.wikimedia.org/T272009 (elukey) Yep it was part of a commit that lines up with the drop in connectivity: ` elukey@asw2-c-eqiad> show system rollback compare 2 1 [edit interfaces interface-range disabled...
[19:09:08] lexnasser: the latter :)
[19:24:12] elukey: random question. is renewing a kerberos ticket more lightweight than just re-kiniting? stat1008 is showing my kerberos ticket expired on the 11th but can be renewed until the 16th.
to get access, should i just kinit and type my password in or is there a different thing I can do?
[19:25:26] isaacj: it is a good question, you can try kinit -R to renew the ticket without password
[19:25:55] elukey: oh nice! thanks!
[19:26:09] the 48 hours are how long the ticket will last without a renew, but if you renew it, it should last up to a week.. I am trying to have this done automatically for all users
[19:26:16] but there are some things to check etc..
[19:26:34] the idea then would be to have a week-long session by default
[19:26:40] isaacj: --^
[19:27:25] ahh -- actually it looks like i had to do this before it expired? i get `kinit: Ticket expired while renewing credentials` if i run kinit -R now. but that would be fantastic! obviously not super hard to kinit but all simplification is appreciated
[19:29:36] Analytics, Better Use Of Data, Event-Platform: Produce an instrumentation event stream using new EPC and EventGate from client side browsers - https://phabricator.wikimedia.org/T241241 (Ottomata) Open→Resolved Closing this, we are producing events like session tick!
[19:30:11] isaacj: ah yes the renew is only valid if the ticket is not expired
[19:30:22] otherwise you'll need to re-input the password
[19:30:47] sorry I didn't get before that the ticket was expired :(
[19:30:51] Analytics, Better Use Of Data, Event-Platform, Product-Infrastructure-Data, Goal: BUOD-KR1-Q3: Require that all new schema/instruments are created with the MEP system - https://phabricator.wikimedia.org/T259157 (Ottomata)
[19:30:53] Analytics, Analytics-EventLogging, Event-Platform: Prevent schema creation in meta for eventlogging schemas - https://phabricator.wikimedia.org/T259201 (Ottomata) Open→Declined We are doing this by edit protecting the pages.
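Editor's note on the renewal rule from 19:25-19:30: as stated in the chat, `kinit -R` renews a ticket without a password, but only while the ticket has not yet expired; after that a full `kinit` with password is needed. Below is a small Python sketch of that decision, under the assumption that the ticket's expiry and renew-until times are already known (for example, read off `klist` output by hand). The function name `ticket_action` and the dates in the usage example are hypothetical.

```python
from datetime import datetime


def ticket_action(now: datetime, expires: datetime, renew_until: datetime) -> str:
    """Pick the command to run for a Kerberos ticket.

    Rule taken from the chat: renewal only works while the ticket is
    still valid (and within its renewable lifetime); otherwise the
    password has to be typed again.
    """
    if now < expires and now < renew_until:
        return "kinit -R"  # renew without password
    return "kinit"         # expired: re-authenticate with password


if __name__ == "__main__":
    # Illustrative dates only, mirroring "expired on the 11th,
    # renewable until the 16th".
    expires = datetime(2021, 1, 11)
    renew_until = datetime(2021, 1, 16)
    print(ticket_action(datetime(2021, 1, 10), expires, renew_until))
    print(ticket_action(datetime(2021, 1, 14), expires, renew_until))
```

This is why the renew-until date alone is misleading: once the ticket itself has lapsed, the remaining renewable window can no longer be used.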
[19:30:55] Analytics, Analytics-EventLogging, Analytics-Kanban, Event-Platform, and 2 others: Decommission EventLogging backend components by migrating to MEP - https://phabricator.wikimedia.org/T238230 (Ottomata)
[19:31:44] * elukey afk!
[19:32:01] haha, it's like a tamagotchi (https://en.wikipedia.org/wiki/Tamagotchi) where you have to remember to feed it :)
[19:32:16] (CR) Isaac Johnson: "Thanks Milimetric!" [analytics/reportupdater-queries] - https://gerrit.wikimedia.org/r/656166 (https://phabricator.wikimedia.org/T271571) (owner: Milimetric)
[19:32:50] isaacj: hahahah
[19:33:13] Analytics, Better Use Of Data, Event-Platform, Product-Infrastructure-Data, Goal: BUOD-KR1-Q3: Require that all new schema/instruments are created with the MEP system - https://phabricator.wikimedia.org/T259157 (Ottomata)
[19:33:36] Analytics, Better Use Of Data, Event-Platform, Product-Infrastructure-Data, Product-Infrastructure-Team-Backlog: Develop test environment solution for MEP analytics events - https://phabricator.wikimedia.org/T238837 (Ottomata) Open→Resolved a:Ottomata Closing this, reopen if neces...
[19:34:41] Analytics, Event-Platform: jsonschema-tools should ensure schema examples exist - https://phabricator.wikimedia.org/T270134 (Ottomata) Open→Resolved a:Ottomata
[19:35:21] Analytics-Radar, ChangeProp, MassMessage, WMF-JobQueue: The mass-message queue reports 0 when there are still queued messages - https://phabricator.wikimedia.org/T209899 (Ottomata)
[19:36:25] milimetric: heya - you here?
[19:36:34] Analytics, Event-Platform: DesktopWebUIActionsTracking Event Platform Migration - https://phabricator.wikimedia.org/T271164 (Ottomata)
[19:36:37] Analytics, Event-Platform: DesktopWebUIActionsTracking Event Platform Migration - https://phabricator.wikimedia.org/T267342 (Ottomata)
[19:38:04] Analytics-Radar, Instrument-ClientError: Bot throwing large amount of errors - https://phabricator.wikimedia.org/T264453 (Ottomata)
[19:40:54] Analytics-EventLogging, Analytics-Radar, Event-Platform, Product-Infrastructure-Data, and 2 others: OperationError: The operation failed for an operation-specific reason in generateRandomSessionId - https://phabricator.wikimedia.org/T263041 (Ottomata) a:Ottomata→None
[19:43:29] Analytics, Analytics-Kanban: Follow up on hdfs:///tmp perms issues after umask change on HDFS - https://phabricator.wikimedia.org/T271560 (JAllemandou) a:JAllemandou
[19:44:05] Analytics, Analytics-Kanban: Follow up on hdfs:///tmp perms issues after umask change on HDFS - https://phabricator.wikimedia.org/T271560 (JAllemandou)
[19:44:08] Analytics: Review /tmp usage when using hive in oozie workflows - https://phabricator.wikimedia.org/T271687 (JAllemandou)
[19:44:12] Analytics, Better Use Of Data, Event-Platform, Product-Infrastructure-Data: Obtain evidence-based guidance on capacity for event streams - https://phabricator.wikimedia.org/T259155 (Ottomata) Open→Resolved a:Ottomata Reopen if necessary.
[19:44:14] Analytics, Better Use Of Data, Event-Platform, Product-Infrastructure-Data, Goal: BUOD-KR1-Q3: Require that all new schema/instruments are created with the MEP system - https://phabricator.wikimedia.org/T259157 (Ottomata)
[19:44:32] Analytics, Analytics-Kanban: Follow up on hdfs:///tmp perms issues after umask change on HDFS - https://phabricator.wikimedia.org/T271560 (JAllemandou)
[19:44:37] Analytics: Follow up on /tmp/DataFrameToDruid permissions after umask change - https://phabricator.wikimedia.org/T271558 (JAllemandou)
[19:47:05] Analytics, Event-Platform, Product-Infrastructure-Data: Automate EventGate validation error reporting - https://phabricator.wikimedia.org/T268027 (Ottomata) In lieu of stream ownership, perhaps we could just do {T257237} and have alerts go to Data Eng team, which would then be forwarded to the proper...
[19:48:17] Analytics, Better Use Of Data, Event-Platform, Platform Engineering: Adopt conventions for server receive and client/event timestamps in non analytics event schemas - https://phabricator.wikimedia.org/T267648 (Ottomata) a:jlinehan→None
[19:56:59] Analytics, Analytics-Kanban: Evaluate possible replacements for Camus: Gobblin, Marmaray, Kafka Connect HDFS, etc.
- https://phabricator.wikimedia.org/T238400 (Ottomata)
[19:57:33] joal: sorry was wrapping up a meeting and got another one now
[19:57:34] Analytics, Analytics-EventLogging, Analytics-Kanban, Event-Platform, and 2 others: Modern Event Platform: Stream Intake Service: Implementation: Deployment Pipeline - https://phabricator.wikimedia.org/T211247 (Ottomata)
[19:57:46] Analytics, Event-Platform, SRE, Services (watching): Discovery for Kafka cluster brokers - https://phabricator.wikimedia.org/T213561 (Ottomata) Stalled→Declined Declining in favor of {T253058}
[19:57:49] np milimetric - I solved my own problem :)
[19:57:54] good luck with meetings
[19:57:57] milimetric: --^
[19:58:26] aw, I missed a chance to help you :( this makes me sad
[19:58:51] Analytics, Event-Platform, serviceops-radar, Services (watching): Datacenter aware configs for EventGate topic prefixes - https://phabricator.wikimedia.org/T213564 (Ottomata) Open→Resolved We have datacenter / k8s cluster specific values file overrides now, which accomplishes the goal here.
[19:59:00] Analytics, Analytics-EventLogging, Analytics-Kanban, Event-Platform, and 2 others: Modern Event Platform: Stream Intake Service (EventGate): Implementation - https://phabricator.wikimedia.org/T206785 (Ottomata)
[19:59:01] milimetric: more a talk about names, we'll continue that tomorrow :)
[19:59:16] I'm afraid :)
[19:59:21] huhu
[19:59:39] milimetric: folder names for tmp_data_transfer - nothing major
[19:59:49] ah!
gotcha
[20:00:25] I'm not married to those at all btw
[20:01:32] Ack milimetric - I'll still have you review my ideas, more brains, more sharing :)
[20:04:33] Going to kick off a reboot for the analytics druid cluster - nodes should be rebooted maintaining uptime, but elukey and I expect some in-flight queries will fail; I'll keep an eye out for oozie jobs that need restarting
[20:08:06] (CR) Joal: "Asking for a more descriptive commit message - code looks good!" (1 comment) [analytics/reportupdater-queries] - https://gerrit.wikimedia.org/r/656166 (https://phabricator.wikimedia.org/T271571) (owner: Milimetric)
[20:11:47] Gone for tonight team - see y'all tomorrow
[20:26:17] (CR) Milimetric: "updated, thanks for the review!" [analytics/reportupdater-queries] - https://gerrit.wikimedia.org/r/656166 (https://phabricator.wikimedia.org/T271571) (owner: Milimetric)
[20:26:26] (PS2) Milimetric: Update logic per Isaac [analytics/reportupdater-queries] - https://gerrit.wikimedia.org/r/656166 (https://phabricator.wikimedia.org/T271571)
[20:26:48] Analytics: Analytics Hardware for Fiscal Year 2019/2020 - https://phabricator.wikimedia.org/T244211 (Ottomata)
[20:27:02] Analytics: Analytics Hardware for Fiscal Year 2019/2020 - https://phabricator.wikimedia.org/T244211 (Ottomata) @elukey can we close this? :D
[20:38:48] razzi: o/ would like to lurk in arch office hours in 20 mins
[20:38:57] we can do sync up now if you like?
[20:41:56] ottomata: sounds good, bc?
[20:42:46] ya otw!
[20:44:09] Analytics-Clusters: Balance Kafka topic partitions on Kafka Jumbo to take advantage of the new brokers - https://phabricator.wikimedia.org/T255973 (Ottomata) Oh, and mirror maker too.
[21:12:28] milimetric yt?
[21:12:49] hey dsaez
[21:12:53] Analytics, Analytics-Kanban: Add client TCP source port to webrequest - https://phabricator.wikimedia.org/T271953 (Ladsgroup) >>!
In T271953#6748341, @elukey wrote: > @Ladsgroup we should follow up with Traffic to make sure that ATS-TLS adds the client's source port in one HTTP header, since Varnish (and...
[21:13:02] trouble with the GII data?
[21:13:11] hehe! you read my mind
[21:13:25] bracing for the worst
[21:14:02] two questions: first, what are the two columns, and which should I use? and second, is there an easy way to join the country names with some ISO code?
[21:26:00] dsaez: columns are total edit count and total edits on namespace zero count, the country name comes from a join to canonical_data.countries, so you can join there and get it back
[21:26:29] cool, thanks!
[21:26:32] dsaez: if you want to repeat the query it's very low-resource, you can just run this script with the parameters as suggested: https://github.com/wikimedia/analytics-refinery/blob/master/oozie/mediawiki/geoeditors/yearly/write_geoeditors_edits_yearly_data.hql
[21:26:51] (except the country info data, that comment is out of date and it should be using canonical_data.countries, let me know if you have any trouble
[21:27:02] sorry about the column names not being there, I thought they'd be output but was wrong)
[21:27:13] no, this is great, thanks.
[21:54:10] (PS1) Mholloway: Fix README typo [schemas/event/secondary] - https://gerrit.wikimedia.org/r/656265
[21:59:12] Analytics, Privacy Engineering: Implement Data Governance Tool - https://phabricator.wikimedia.org/T272060 (JFishback_WMF)
[22:05:43] (CR) Mholloway: [C: +2] "Merging trivial change. My real purpose here is to test replication to GitHub." [schemas/event/secondary] - https://gerrit.wikimedia.org/r/656265 (owner: Mholloway)
[22:24:50] (PS1) Mholloway: Fix README typo [schemas/event/primary] - https://gerrit.wikimedia.org/r/656267
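Editor's note on the GII data exchange at 21:14-21:26: the suggestion there is to join the exported country names back to canonical_data.countries to recover ISO codes. The sketch below illustrates that kind of join with an in-memory SQLite database so it can run anywhere; it does not touch Hive. The table layouts, the column names (`country_name`, `total_edits`, `ns0_edits`, `name`, `iso_code`) and every number in the usage example are assumptions for illustration, not the real schema or real data.

```python
import sqlite3
from typing import Iterable, List, Optional, Tuple

EditRow = Tuple[str, int, int]   # (country_name, total_edits, ns0_edits)
CountryRow = Tuple[str, str]     # (name, iso_code)


def attach_iso_codes(
    edit_rows: Iterable[EditRow], country_rows: Iterable[CountryRow]
) -> List[Tuple[Optional[str], str, int, int]]:
    """Join edit counts to a country lookup table by country name.

    A LEFT JOIN is used so that rows whose name has no match are kept
    (with a NULL/None ISO code) and can be inspected, not silently lost.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE edits (country_name TEXT, total_edits INTEGER, ns0_edits INTEGER)"
        )
        conn.execute("CREATE TABLE countries (name TEXT, iso_code TEXT)")
        conn.executemany("INSERT INTO edits VALUES (?, ?, ?)", list(edit_rows))
        conn.executemany("INSERT INTO countries VALUES (?, ?)", list(country_rows))
        return conn.execute(
            "SELECT c.iso_code, e.country_name, e.total_edits, e.ns0_edits "
            "FROM edits e LEFT JOIN countries c ON c.name = e.country_name "
            "ORDER BY e.country_name"
        ).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    # Made-up sample values, purely to show the shape of the result.
    edits = [("France", 10, 7), ("Italy", 5, 4)]
    countries = [("France", "FR"), ("Italy", "IT")]
    for row in attach_iso_codes(edits, countries):
        print(row)
```

Joining on a display name is fragile if spellings differ between sources, which is one reason to check for unmatched rows after the join.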