[00:04:23] <wikibugs>	 10Analytics, 10Jupyter-Hub, 10Operations: notebook1001 shown as DOWN in icinga, due to firewall rules - https://phabricator.wikimedia.org/T138685 (10Dzahn) 05Open→03Resolved a:03Dzahn no response since 2016 and meanwhile there is no more notebook1001. closing.
[00:15:45] <wikibugs>	 10Analytics, 10Contributors-Analysis, 10Product-Analytics, 10Epic: Support all Product Analytics data needs in the Data Lake - https://phabricator.wikimedia.org/T212172 (10nettrom_WMF) Here's another use case that came up during the analysis of the survey results. I was asked if I could figure out what pro...
[00:23:59] <wikibugs>	 10Analytics, 10Operations: notebook server(s) running out of memory - https://phabricator.wikimedia.org/T212824 (10Dzahn)
[00:26:11] <wikibugs>	 10Analytics, 10Operations: notebook server(s) running out of memory - https://phabricator.wikimedia.org/T212824 (10Dzahn) ` Jan  2 22:33:15 notebook1004 kernel: [9646042.221155] R invoked oom-killer: gfp_mask=0x24201ca(GFP_HIGHUSER_MOVABLE|__GFP_COLD), nodemask=0-1, order=0, oom_score_adj=0  Jan  3 00:06:33 no...
[00:45:45] <Nettrom>	 a-team: is there a problem with EventLogging on betalabs? We're testing our HelpPanel schema there, but I don't see a table for that in the log database.
[00:46:02] <Nettrom>	 Or more specifically, with getting the EventLogging data into the database?
[00:46:14] <Nettrom>	 I don't find any errors for the schema in the processor logs
[00:46:21] <Nettrom>	 and I can find the events through kafka
[00:46:21] <milimetric>	 mmm, maybe Nettrom, I'll try to take a look
[01:05:01] <milimetric>	 I'm confused Nettrom, everything appears fine on those servers, but I don't see any activity since October... I'm obviously doing something stupid
[01:05:19] <milimetric>	 did EL move from deployment-eventlog05?
[01:06:10] <Nettrom>	 milimetric: not as far as I know, we worked with ottomata in December and some of the documentation was updated and refers to it
[01:06:18] <milimetric>	 k
[01:07:01] <Nettrom>	 it's not that urgent at the moment as I'm about to head out, maybe I should ping ottomata in the morning and see if he can figure it out?
[01:10:48] <milimetric>	 yeah, I've restarted the services, they seem fine, so if it's still not working ping ottomata 
[01:12:05] <milimetric>	 Nettrom: heh, ok, it was just catching up, HelpPanel_18721886 exists now, and it has a few events in there
[01:12:13] <Nettrom>	 milimetric: looks like that did the trick, thanks!
[01:12:23] <Nettrom>	 the timestamp appears to match what I found in kafka earlier
[01:12:32] <milimetric>	 for the record, all I did that was meaningful was to restart the service, which is often enough I guess
[01:12:43] <milimetric>	 k, back to bathing the baby :)
[01:12:50] <Nettrom>	 have fun, and thanks again! :)
[08:09:28] <elukey>	 morning!
[08:09:31] <elukey>	 very nice
[08:09:36] <elukey>	 Jan  3 06:00:00 an-coord1001 hdfs-balancer[114867]: date: extra operand ‘%H:%M:%S'’
[08:09:39] <elukey>	 Jan  3 06:00:00 an-coord1001 hdfs-balancer[114867]: Try 'date --help' for more information.
[08:09:42] <elukey>	 Jan  3 06:00:00 an-coord1001 hdfs-balancer[114867]:  WARN Not starting hdfs balancer, it is already running (or the lockfile exists).
[08:10:11] <elukey>	 ahhh okok now I get it from the script
[08:10:22] <elukey>	 if /tmp/hdfs-balancer is not removed then it doesn't work
[08:10:39] <elukey>	 but that code path needs to be fixed, doing it now :)
[08:14:50] <fdans>	 helloooo elukey :)
[08:15:00] <icinga-wm>	 RECOVERY - Check if the Hadoop HDFS Fuse mountpoint is readable on notebook1004 is OK: OK
[08:15:22] <elukey>	 holaaaa
[08:20:54] <elukey>	 ok script fixed, I'll start the balancer when the under replicated blocks are done
[08:21:18] <elukey>	 I hope to have the hosts ready for the new cluster next week
[09:08:36] <joal>	 morning folks
[09:09:41] <wikibugs>	 10Analytics, 10Analytics-Kanban, 10DBA, 10User-Elukey: Review dbstore1002's non-wiki databases and decide which ones needs to be migrated to the new multi instance setup - https://phabricator.wikimedia.org/T212487 (10elukey) >>! In T212487#4839932, @elukey wrote: >  > `project_illustration` was requested i...
[09:10:13] <wikibugs>	 10Analytics, 10Analytics-Kanban, 10DBA, 10User-Elukey: Review dbstore1002's non-wiki databases and decide which ones needs to be migrated to the new multi instance setup - https://phabricator.wikimedia.org/T212487 (10elukey) Ok so as far as I can see this task seems waiting a few confirmations but if the p...
[09:11:24] <joal>	 Sorry for not answering about decom yesterday elukey - I was trying to concentrate on mediawiki-reduced and it consumed all m brain
[09:13:25] <wikibugs>	 10Analytics, 10Analytics-Kanban: Clean up staging db - https://phabricator.wikimedia.org/T212493 (10elukey) If possible I'd prefer to skip this part and keep the staging db as it is, from what I can see it is not that huge.. The main issue is that getting all those tables reviewed before the April deadline wil...
[09:14:58] <wikibugs>	 10Analytics, 10Analytics-Kanban, 10DBA, 10Patch-For-Review, 10User-Banyek: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5] - https://phabricator.wikimedia.org/T210478 (10elukey) As far as I can see we'd need to work on https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/479224/ right?
[09:16:28] <elukey>	 joal: bonjour! I know that in reality you don't like to work with me anymore
[09:16:37] <elukey>	 you can say that
[09:16:40] <elukey>	 no offense taken
[09:16:41] <elukey>	 :P
[09:17:00] <joal>	 elukey: Arf - I'm discovered :)
[09:17:42] <joal>	 elukey: looks like the decom of 29/30 is doing good :)
[09:17:52] <elukey>	 yeah but suuuper sloooowwwww
[09:18:02] <elukey>	 I'll start the balancer when those are done
[09:18:17] <elukey>	 realistically I think that we'll get this task done by end of next week
[09:18:36] <elukey>	 I'll try to prepare puppet patches to spin up the cluster in the meantime
[09:19:06] <joal>	 elukey: datanode disks are in average 50/60% full - meaning reinforcing rep-factor for one or 2 nodes over other ones means a lot of data
[09:19:48] <elukey>	 I know I knowwww I was only ranting :)
[09:19:57] <joal>	 elukey: About yarn - I see that the nodes are removed (65 active, 3 decoms) - You remove them at the same time zas with hdfs?
[09:20:23] <elukey>	 exactly, I issued a -refreshNodes to yarn as well
[09:20:31] <elukey>	 it reads the same hosts.exclude file
[09:20:38] <joal>	 elukey: About disk-usage - I feel we use a lot of space currently - I wonder if there wouldn't be a way to drop some data
[09:20:51] <joal>	 great for yarn elukey :)
[09:21:20] <joal>	 thanks for explaining (and still being cooperative with me :-P)
[09:21:32] <elukey>	 joal: no idea about space used, you know that I am only a guy working in the engine basement of the ship :D
[09:21:53] <elukey>	 jokes aside, it would be great if we could free space
[09:22:37] <joal>	 I agree very much elukey - Possibly we could drop some snapshots from MWH and wikitext - But I assume most of the space is taken from webrequest :(
[09:23:04] <elukey>	 let's drop that!
[09:23:24] <joal>	 no need about requests from the web - We served them already
[09:25:13] <elukey>	 that's what I meant! :P
[09:35:51] <wikibugs>	 10Analytics, 10User-Elukey: Kerberos service running in production - https://phabricator.wikimedia.org/T211836 (10elukey)
[10:05:41] <joal>	 elukey: did a quick check for space usage - In user space, a lot of it was taken by trash (me and hdfs) - I cleaned mine, HDFS one should be cleaned in a few days automatically
[10:06:52] <elukey>	 nice!
[10:21:32] <elukey>	 fdans: o/
[10:21:52] <elukey>	 fdans: do you remember the task to clean up old user dirs?
[10:22:05] <fdans>	 yessss
[10:22:19] <elukey>	 mind to update the on-call docs to reflect the work to be done? IIRC we had to do it but I believe we forgot
[10:24:58] <fdans>	 yesss sorry I was updating it a couple weeks ago and I closed the browser by accident
[10:25:30] <elukey>	 no problem! Thanks :)
[10:26:06] * fdans wants to refactor the on call article
[10:27:17] <elukey>	 what I'd like to make sure is that we come up with some sort of procedure to check every week what are the clean ups left to do
[10:27:26] <elukey>	 for example there is one ongoing
[10:30:50] <fdans>	 i'd love to structure the document as:
[10:31:03] <fdans>	 - things to do/check at the start of the ops week
[10:31:13] <fdans>	 - things to do/check at then end of the ops week
[10:31:32] <fdans>	 - things to do/check every day of the ops week
[10:32:09] <fdans>	 it needs to be more concise and actionable than what it is now, like a pre-flight check
[10:32:23] <elukey>	 do we need the start/end parts?
[10:32:34] <elukey>	 (asking for curiosity)
[10:33:25] <fdans>	 I don't know, there might be things that only need to be done once per ops week, but it's just a way to structure it
[10:33:52] <fdans>	 e.g. you might not need to check for user dirs to clean up every day of your duty
[10:34:16] <fdans>	 instead of start/end you could say "to be done once a week"
[10:36:33] <elukey>	 sure, but the user dirs is most of the times a task, so I'd re-write it as "check if tasks with tag X and make sure to progress them if they are not blocked"
[10:36:36] <elukey>	 etc..
[10:36:38] <elukey>	 that can be done every day
[10:36:46] <elukey>	 like reviewing alarms, etc..
[10:48:27] <elukey>	 ah! https://github.com/apache/incubator-druid/releases/tag/druid-0.13.0-incubating
[10:48:30] <fdans>	 elukey: https://wikitech.wikimedia.org/wiki/Analytics/Team/Oncall#Offboard_users
[10:49:35] <fdans>	 oh wow elukey when did they make that the stable release? I've been checking periodically :)
[10:51:48] <elukey>	 fdans: looks good thanks! Small nits: the task should already be opened by an SRE, the main problem is now how to find them
[10:52:18] <elukey>	 I think that a new tag is worth it, so we can identify the outstanding tasks with a clik
[10:52:21] <elukey>	 *click
[10:52:25] <elukey>	 what do you think?
[10:52:41] <fdans>	 corrected accordingly
[10:53:06] <fdans>	 elukey: hmmm dont knoooooow maybe bring it up in standup?
[10:53:06] <elukey>	 about druid - not sure when they released, I think recently since I checked in december and didn't find anything new :)
[10:53:17] <elukey>	 fdans: standup makes sense
[10:53:43] <elukey>	 the other thing to keep checking is superset - 0.29 is still in RC 
[10:53:51] <elukey>	 but hopefully it will get out soon
[11:15:41] <wikibugs>	 10Analytics, 10Operations, 10ops-eqiad: kakfa1013 shows a failed PSU - https://phabricator.wikimedia.org/T212844 (10elukey)
[11:26:07] <elukey>	 !log manually started the hdfs-balancer (failed earlier on due to the presence of a lock file)
[11:26:08] <stashbot>	 Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[11:26:12] <elukey>	 joal: --^
[11:30:32] <wikibugs>	 10Analytics, 10Core Platform Team Backlog (Watching / External), 10Services (watching): Evaluate using TypeScript on node projects - https://phabricator.wikimedia.org/T206268 (10Physikerwelt) I definitely support this motion. However, before I consider to create a subtask I would like to know a bit more abou...
[11:51:08] <wikibugs>	 (03CR) 10Joal: "Some comments inline - Sorry for the duplication with the todos." (036 comments) [analytics/refinery] - 10https://gerrit.wikimedia.org/r/481025 (https://phabricator.wikimedia.org/T209732) (owner: 10Awight)
[11:59:13] * elukey lunch!
[13:14:04] <joal>	 Gone for a medical apointement, back in a while
[13:18:33] <nuria>	 hello team europe
[13:18:49] <elukey>	 nuria: o/
[13:19:30] <elukey>	 are you at home or travelling? Seems really early :D
[13:36:44] <wikibugs>	 10Analytics, 10Readers-Web-Backlog (Tracking): [Bug] Many JSON decode ReadingDepth schema errors from wikiyy - https://phabricator.wikimedia.org/T212330 (10Nuria) This is a frequent occurrence as wikimedia's code base is used by many other sites as-is. In this case this is an unlawful usage of content and we w...
[13:37:24] <wikibugs>	 10Analytics: Throttling /request dispatcher that fronts eventlogging /MEP endpoint - https://phabricator.wikimedia.org/T212853 (10Nuria)
[14:03:09] <elukey>	 team: the heating system in my home decided to break, I'd need to take some time to try to sort it out
[14:03:18] <elukey>	 will be afk for a bit, ping me if needed
[14:05:21] <fdans>	 Luca SRE: Home Edition
[14:09:49] <elukey>	 it was quick, the technician said that he'll need to clean it up sigh
[14:11:59] <wikibugs>	 10Analytics, 10ORES, 10Patch-For-Review, 10Scoring-platform-team (Current): Wire ORES scoring events into Hadoop - https://phabricator.wikimedia.org/T209732 (10Nuria) Let's please make sure this data and tables  are documented on wikitech .
[14:12:08] <addshore>	 elukey: everything breaks over the holiday periods :D
[14:12:26] <addshore>	 I'm now working on a laptop with no screen in need of a motherboard replacement :(
[14:12:30] <addshore>	 happy new year all! :D
[14:14:08] <elukey>	 addshore: to you too!
[14:14:27] <wikibugs>	 10Analytics: Upgrade python ua parser to 0.6.3 version - https://phabricator.wikimedia.org/T212854 (10Nuria) p:05Triage→03High
[14:27:07] <ottomata>	 o?
[14:27:23] <ottomata>	 haha i mean o/
[14:27:26] <ottomata>	 although i like that first one
[14:27:32] <ottomata>	 kinda looks like i'm scratching my head
[14:30:13] <wikibugs>	 10Analytics, 10Readers-Web-Backlog (Tracking): [Bug] Many JSON decode ReadingDepth schema errors from wikiyy - https://phabricator.wikimedia.org/T212330 (10Ottomata) I betcha we could also use some kinda of special secret key whitelisting.  Even if not secure and easily spoofable, it would at least keep stuff...
[14:40:20] <nuria>	 ottomata: jajaja
[14:40:28] <elukey>	 joal: the balancer seems doing its work, I can see more green bars in the dfs-health page now.. It is still running, so I'll leave it doing its work before removing other hadoop nodes 
[14:40:29] <nuria>	 joal: question if you are there
[14:44:24] <wikibugs>	 (03CR) 10Nuria: [C: 03+2] Add direct kafka-to-druid ingestion example [analytics/refinery] - 10https://gerrit.wikimedia.org/r/480956 (https://phabricator.wikimedia.org/T203669) (owner: 10Joal)
[14:46:10] <wikibugs>	 10Analytics, 10Analytics-Kanban, 10Fundraising-Backlog, 10Patch-For-Review, 10User-Elukey: Return to real time banner impressions in Druid - https://phabricator.wikimedia.org/T203669 (10Nuria) Code merged, if @JAllemandou closes job we can close this ticket
[14:48:51] <elukey>	 so I am a bit confused about https://www.cloudera.com/documentation/enterprise/5-15-x/topics/admin_hdfs_balancer.html
[14:49:02] <elukey>	 "The Balancer has a default threshold of 10%, which ensures that disk usage on each DataNode differs from the overall usage in the cluster by no more than 10%. For example, if overall usage across all the DataNodes in the cluster is 40% of the cluster's total disk-storage capacity, the script ensures that DataNode disk usage is between 30% and 50% of the DataNode disk-storage capacity."
[14:49:35] <elukey>	 atm if I've done my calculations correctly we should be at ~59% of usage
[14:49:59] <elukey>	 so the balancer tries to keep each datanode's disk usage between 49%-69%
[14:51:25] <elukey>	 but currently there are ~20 nodes outside the windonw
[14:51:27] <elukey>	 *window
[14:51:51] <elukey>	 yesterday we killed a super long running balancer, that probably got confused when we added nodes to the cluster
[14:52:15] <elukey>	 now I am a bit worried that the balancer takes ages only to because of the default window of 10%
[14:55:42] <joal>	 hm
[14:55:45] <joal>	 Back
[14:56:52] <joal>	 nuria: Hi - Happy new year :) How may I help
[14:57:31] <nuria>	 joal: I  think I figured it out, give me a sec
[14:58:41] <ottomata>	 elukey:  yeah, balancer by default runs really slow
[14:58:49] <ottomata>	 to not take up bandwidth, etc.
[14:59:08] <ottomata>	 i think in the past, when i've needed to rebalance a bunch (new nodes, etc.) i've manually run balancer with a higher threshold
[14:59:18] <joal>	 interesting ottomata 
[15:00:09] <joal>	 After a refresh I've seen that datanode-histogram in `Datanode Information` of HDFS UI got more compact
[15:02:38] <wikibugs>	 10Analytics, 10Product-Analytics: "Edit" equivalent of pageviews daily available to use in Turnilo and Superset - https://phabricator.wikimedia.org/T211173 (10Nuria) > because the complexity makes the data harder to use and because a lot of those dimensions wouldn't work well with Druid anyway. Agreed, I made...
[15:03:12] <nuria>	 joal: please see my comment on this ticket to make sure it makes sense: https://phabricator.wikimedia.org/T211173
[15:04:55] <joal>	 nuria: The comment makes sense at a functional level - I however don't know of extracting a value of an array works as you suggested using transforms :)
[15:05:10] <elukey>	 ottomata: my doubt is if there could be a use case that causes the balancer to keep "rebalancing" since some nodes cannot fall into the "balanced" percentage window
[15:05:41] <joal>	 nuria: I think that if the analysts give us a requirement, we can analyse it and build the druid indexation job (or explain why some stuff if not doable, in case)
[15:06:48] <nuria>	 joal: ah, it is an array and not  abunch of strings, it might nor work then
[15:07:03] <nuria>	 joal: *not work
[15:08:20] <joal>	 nuria: Depending on their need, we will either process the data in spark to modify/pre-aggregate etc, or let druid do it if modifications are simple
[15:08:36] <nuria>	 joal: I rather avoid a custom spark job
[15:09:00] <nuria>	 joal: as it would be a higher maintenance  cost
[15:10:50] <joal>	 nuria: from the "Druid Expression" doc page: Multi-value types are not fully supported yet. Expressions may behave inconsistently on multi-value types, and you should not rely on the behavior in this case to stay the same in future releases.
[15:11:38] <ottomata>	 elukey:  wouldn't they eventually fall into that window?
[15:11:41] <ottomata>	 after enough rebalancing?
[15:12:02] <nuria>	 joal: but you can cast, right? from: http://druid.io/docs/latest/misc/math-expr.html
[15:12:24] <nuria>	 joal: if aarray cast to string as a "a, b, c" it might work
[15:12:35] <nuria>	 joal: point taken though, it needs to be tried
[15:13:51] <joal>	 nuria: I also agree a spark job would cost more than a single druid indexation, but if the modifications/aggregations are complex, it'll probably be needed
[15:14:20] <wikibugs>	 10Analytics, 10Operations, 10ops-eqiad: PSU broken on two Analytics Hadoop Workers - https://phabricator.wikimedia.org/T212861 (10elukey) p:05Triage→03High
[15:14:38] <elukey>	 lovely --^
[15:16:01] <elukey>	 ottomata: yes in theory if the balancer is smart enough I guess so, but yesterday's occurrence has ran for ~900 hours before me killing it :D
[15:16:39] <wikibugs>	 10Analytics: Update IP addresses of cloud labs to mark internal traffic on refinery code - https://phabricator.wikimedia.org/T212862 (10Nuria)
[15:16:39] <wikibugs>	 (03PS2) 10Nuria: Add new Cloud VPS ip addresses to network origin UDF [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/481225 (owner: 10BryanDavis)
[15:17:33] <wikibugs>	 10Analytics, 10Operations, 10ops-eqiad: PSU broken on two Analytics Hadoop Workers - https://phabricator.wikimedia.org/T212861 (10fgiunchedi) Judging by icinga there's a few other hosts with PS alerts, all in A2. I suspect it has to do with one of the rack PDU themselves  ` cloudelastic1001 db1082 db1107 ms-...
[15:20:10] <wikibugs>	 10Analytics, 10Operations, 10ops-eqiad: Rack A2's hosts alarm for PSU broken - https://phabricator.wikimedia.org/T212861 (10elukey)
[15:20:25] <wikibugs>	 10Analytics, 10Operations, 10ops-eqiad: kakfa1013 shows a failed PSU - https://phabricator.wikimedia.org/T212844 (10elukey)
[15:20:28] <wikibugs>	 10Analytics, 10Operations, 10ops-eqiad: Rack A2's hosts alarm for PSU broken - https://phabricator.wikimedia.org/T212861 (10elukey)
[15:30:51] <wikibugs>	 (03PS3) 10Nuria: Add new Cloud VPS ip addresses to network origin UDF [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/481225 (https://phabricator.wikimedia.org/T212862) (owner: 10BryanDavis)
[15:31:09] <wikibugs>	 (03CR) 10Nuria: [V: 03+2 C: 03+2] Add new Cloud VPS ip addresses to network origin UDF [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/481225 (https://phabricator.wikimedia.org/T212862) (owner: 10BryanDavis)
[15:31:38] <wikibugs>	 10Analytics, 10Analytics-Kanban, 10Patch-For-Review: Update IP addresses of cloud labs to mark internal traffic on refinery code - https://phabricator.wikimedia.org/T212862 (10Nuria)
[15:32:01] <wikibugs>	 10Analytics, 10Analytics-Kanban, 10Patch-For-Review: Update IP addresses of cloud labs to mark internal traffic on refinery code - https://phabricator.wikimedia.org/T212862 (10Nuria) Thanks to @bd808 for updating config
[16:12:40] <ottomata>	 milimetric:  i'm onto something if you are around
[16:12:53] <ottomata>	 i think it might be a bug (or misunderstanding) in lodash _.defaults()
[16:13:21] <milimetric>	 I'm around ottomata
[16:13:37] <milimetric>	 going cave
[17:06:55] <wikibugs>	 10Analytics, 10Analytics-Kanban, 10Patch-For-Review: Update IP addresses of cloud labs to mark internal traffic on refinery code - https://phabricator.wikimedia.org/T212862 (10Nuria) a:03bd808
[17:58:02] <wikibugs>	 (03PS3) 10Milimetric: Don't add links if all-projects is selected [analytics/wikistats2] - 10https://gerrit.wikimedia.org/r/479251 (owner: 10Fdans)
[17:58:19] <wikibugs>	 (03CR) 10Milimetric: [C: 03+2] Don't add links if all-projects is selected [analytics/wikistats2] - 10https://gerrit.wikimedia.org/r/479251 (owner: 10Fdans)
[18:01:43] <nuria>	 ping ottomata 
[18:02:39] <wikibugs>	 10Analytics, 10Analytics-Kanban, 10Patch-For-Review: Update IP addresses of cloud labs to mark internal traffic on refinery code - https://phabricator.wikimedia.org/T212862 (10Milimetric) p:05Triage→03High
[18:03:03] <wikibugs>	 10Analytics: Throttling /request dispatcher that fronts eventlogging /MEP endpoint - https://phabricator.wikimedia.org/T212853 (10Milimetric) p:05Triage→03High
[18:03:29] <wikibugs>	 10Analytics, 10Operations: notebook server(s) running out of memory - https://phabricator.wikimedia.org/T212824 (10Milimetric) p:05Triage→03High
[18:05:04] <wikibugs>	 (03CR) 10jerkins-bot: [V: 04-1] Don't add links if all-projects is selected [analytics/wikistats2] - 10https://gerrit.wikimedia.org/r/479251 (owner: 10Fdans)
[18:10:17] <wikibugs>	 10Analytics, 10Operations: notebook server(s) running out of memory - https://phabricator.wikimedia.org/T212824 (10Milimetric) p:05High→03Normal A proper fix is to manage resources through containerization (kubernetes), so marking low priority for now as other solutions we could think of are a little hacky.
[18:13:28] <wikibugs>	 10Analytics, 10Pageviews-API, 10wikitech.wikimedia.org, 10Patch-For-Review: wikitech.wikimedia.org missing from pageviews API - https://phabricator.wikimedia.org/T153821 (10Quiddity) Thanks for the merge, @JAllemandou ! What are the next steps to complete this task?  IIUC the 2 steps that Milimetric descri...
[18:13:30] <wikibugs>	 10Analytics: Add is_pageview as a dimension to the 'webrequest_sampled_128' Druid dataset - https://phabricator.wikimedia.org/T212778 (10Milimetric) p:05Triage→03Normal
[18:14:05] <wikibugs>	 10Analytics, 10Analytics-Data-Quality, 10Tool-Pageviews: Anomalous statistics results in eu.wikipedia siteviews - https://phabricator.wikimedia.org/T212879 (10Theklan)
[18:15:58] <wikibugs>	 10Analytics, 10Operations: notebook server(s) running out of memory - https://phabricator.wikimedia.org/T212824 (10elukey) A couple of things that we discussed with the team:  * this is the same problem that happens on stat machines, sometimes users are not conservative in their usage of those hosts consuming...
[18:16:07] <wikibugs>	 10Analytics, 10Operations, 10User-Elukey: notebook/stat server(s) running out of memory - https://phabricator.wikimedia.org/T212824 (10elukey)
[18:17:38] <wikibugs>	 10Analytics, 10Analytics-Wikistats: Wikistats New Feature - DB size - https://phabricator.wikimedia.org/T212763 (10Milimetric) For clarity, are you asking for the number of articles per wiki or the size in bytes of every current article?  We have both of these available via the API behind wikistats:  https://s...
[18:17:49] <wikibugs>	 10Analytics, 10Analytics-Wikistats: Wikistats New Feature - DB size - https://phabricator.wikimedia.org/T212763 (10Milimetric) p:05Triage→03High
[18:19:34] <wikibugs>	 10Analytics, 10Operations, 10User-Elukey: notebook/stat server(s) running out of memory - https://phabricator.wikimedia.org/T212824 (10elukey) I am pretty ignorant about it, but would cgroups fit in this use case? @MoritzMuehlenhoff ?
[18:19:47] <wikibugs>	 10Analytics, 10Product-Analytics: Columns named "dt" in the Data Lake have different formats - https://phabricator.wikimedia.org/T212529 (10Milimetric) p:05Triage→03High
[18:20:11] <wikibugs>	 10Analytics: https://www.tracemyfile.com/ is a bot, UA: Mozilla/5.0 (compatible; tracemyfile/1.0) - https://phabricator.wikimedia.org/T212486 (10Milimetric) p:05Triage→03Normal
[18:20:56] <wikibugs>	 10Analytics, 10Pageviews-API, 10wikitech.wikimedia.org, 10Patch-For-Review: wikitech.wikimedia.org missing from pageviews API - https://phabricator.wikimedia.org/T153821 (10JAllemandou) This is exactly it @Quiddity : A deploy on our side should unlock the thing.
[18:21:12] <wikibugs>	 10Analytics, 10Pageviews-API: enetunreach responses - https://phabricator.wikimedia.org/T212477 (10Milimetric) Can you give us a sample of an actual request that's resulting in this error?
[18:21:35] <wikibugs>	 10Analytics, 10Pageviews-API: enetunreach responses - https://phabricator.wikimedia.org/T212477 (10Milimetric) p:05Triage→03Normal
[18:24:25] <ottomata>	 joal i think maybe no one is coming to this meeting!
[18:24:33] <joal>	 k ottomata :)
[18:25:51] <wikibugs>	 10Analytics, 10Analytics-Wikistats: Wikistats New Feature - DB size - https://phabricator.wikimedia.org/T212763 (10TheSandDoctor) The size of the Size of the English Wikipedia database. I would recommend looking at [[ https://en.wikipedia.org/wiki/Wikipedia:Size_of_Wikipedia#Size_of_the_English_Wikipedia_datab...
[18:26:10] <wikibugs>	 10Analytics: Create Spark code to compare DateTimes with partition columns - https://phabricator.wikimedia.org/T212451 (10Milimetric)
[18:28:00] <dsaez>	 ottomata, I've just joined
[18:30:14] <wikibugs>	 10Analytics: Create Spark code to compare DateTimes with partition columns - https://phabricator.wikimedia.org/T212451 (10Milimetric)
[18:30:44] <wikibugs>	 10Analytics, 10Pageviews-API: enetunreach responses - https://phabricator.wikimedia.org/T212477 (10Adamwiggall) Milimetric,  I ended up changing the library I was using to handle the http calls. Issue went away.  Thanks for the response.
[18:33:48] <wikibugs>	 10Analytics: Create Spark code to compare DateTimes with partition columns - https://phabricator.wikimedia.org/T212451 (10Milimetric) p:05Triage→03Normal
[18:36:20] <ottomata>	 oh
[18:36:21] <ottomata>	 dsaez:  ok coming
[18:36:40] <wikibugs>	 10Analytics: Set up a Analytics Hadoop test cluster in production that runs a configuration as close as possible to the current one. - https://phabricator.wikimedia.org/T212256 (10Milimetric) p:05Triage→03Normal
[18:36:47] <wikibugs>	 10Analytics: Set up a Analytics Hadoop test cluster in production that runs a configuration as close as possible to the current one. - https://phabricator.wikimedia.org/T212256 (10Milimetric) p:05Normal→03High
[18:37:17] <wikibugs>	 10Analytics: Staging environment for upgrades of superset - https://phabricator.wikimedia.org/T212243 (10Milimetric) p:05Triage→03Normal
[18:53:06] <wikibugs>	 10Analytics: Create Spark code to compare DateTimes with partition columns - https://phabricator.wikimedia.org/T212451 (10Milimetric) Joseph was right, you can't substitute a string in as a WHERE clause condition, only as a value to compare to within the WHERE clause.  The parser gets an error:  ` hive (wmf)> se...
[19:07:36] <ottomata>	 joal:  i gotta eat some lunch but if you are around in a bit i'd love some intellij help...maybe we can both try to set up spark from scratch and compile
[19:09:30] * elukey off!
[19:47:02] <joal>	 Hi ottomata - I'm here :)
[19:55:25] <ottomata>	 joal:  coo here in 7 mins or less!
[19:59:51] <ottomata>	 ok joal going to bc
[20:00:01] <joal>	 joining ottomata 
[20:06:29] <joal>	 my download is faster ottomata, but still long :(
[20:06:31] <joal>	 I'll be back !
[20:07:27] <ottomata>	 ok!
[20:50:36] <wikibugs>	 10Analytics, 10Product-Analytics: Metrics request on portal namespace usage - https://phabricator.wikimedia.org/T205681 (10Tbayer) Just got to work on this again inbetween other things - sorry about the delay, but figuring out what is going on with those referrers that shouldn't exist turned out to be a bit mo...
[21:01:03] <nuria>	 milimetric: yt?
[21:01:40] <milimetric>	 yeah nuria hi
[21:02:16] <nuria>	 milimetric: when we scoop now the comment table is being scooped from prod right?
[21:03:19] <milimetric>	 nuria: we don't sqoop the comment table, we sqoop from the compatibility view made for logging, which includes a join to the comment table
[21:03:32] <milimetric>	 so right now we're only getting comments for logging, not revision
[21:03:58] <nuria>	 milimetric: ah i see
[21:04:11] <milimetric>	 and we don't sqoop anything from prod.  The change we were working on before break sqoops from both prod and cloud
[21:04:13] <wikibugs>	 10Analytics, 10Contributors-Analysis, 10Product-Analytics, 10Epic: Support all Product Analytics data needs in the Data Lake - https://phabricator.wikimedia.org/T212172 (10Nuria) @chelsyx  FYI that the cu_changes table is on wmf_raw as mediawiki_private_cu_changes this data is used for this dataset: https:...
[22:43:30] <wikibugs>	 10Analytics, 10Analytics-Data-Quality, 10Product-Analytics: mediawiki_history datasets have null user_text for IP edits - https://phabricator.wikimedia.org/T206883 (10Tbayer) >>! In T206883#4757850, @JAllemandou wrote: > I hear your point and it makes a lot of sense. I think our views differ in the notion of...
[23:06:04] <wikibugs>	 (03PS15) 10Mforns: Allow for custom transforms in DataFrameToDruid [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/477295 (https://phabricator.wikimedia.org/T210099)
[23:15:58] <wikibugs>	 (03CR) 10Ottomata: "THANKS FOR THESE AWESOME COMMENTS!" (033 comments) [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/477295 (https://phabricator.wikimedia.org/T210099) (owner: 10Mforns)
[23:35:08] <wikibugs>	 10Analytics, 10Contributors-Analysis, 10Product-Analytics, 10Epic: Support all Product Analytics data needs in the Data Lake - https://phabricator.wikimedia.org/T212172 (10chelsyx) Thanks @Nuria . I'm aware that `mediawiki_private_cu_changes` is on `wmf_raw`, but to my understanding it is scooped to `wmf_r...
[23:44:07] <wikibugs>	 10Analytics, 10Reading-Admin, 10Zero: Country mapping routine for proxied requests - https://phabricator.wikimedia.org/T116678 (10Reedy) 05Open→03Declined