[00:13:09] Analytics, Product-Analytics: prefUpdate schema contains multiple identical events for the same preference update - https://phabricator.wikimedia.org/T218835 (Tbayer) [04:01:56] Analytics, Analytics-Wikistats: URLs in description break annotations format - https://phabricator.wikimedia.org/T218845 (fdans) [04:03:24] Analytics: Long annotations text being clipped - https://phabricator.wikimedia.org/T218846 (fdans) [04:03:39] Analytics, Analytics-Wikistats: Long annotations text being clipped - https://phabricator.wikimedia.org/T218846 (fdans) [05:56:04] (PS1) Fdans: Extract loadData method to GraphModel [analytics/wikistats2] - https://gerrit.wikimedia.org/r/498002 [06:39:46] Analytics, Analytics-Data-Quality, Product-Analytics: A few alterblocks events have event_timestamps from before 2001 - https://phabricator.wikimedia.org/T218824 (Neil_P._Quinn_WMF) [06:40:06] Analytics, Analytics-Data-Quality, Product-Analytics: A few alterblocks events have event_timestamps from before 2001 - https://phabricator.wikimedia.org/T218824 (Neil_P._Quinn_WMF) [06:52:22] (PS2) Fdans: Extract loadData method to GraphModel [analytics/wikistats2] - https://gerrit.wikimedia.org/r/498002 [07:21:58] * elukey sees a fdans sending patches \o/ [07:23:53] Analytics, ChangeProp, Community-Tech, Core Platform Team, and 6 others: Provide the ability to have time-delayed or time-offset jobs in the job queue - https://phabricator.wikimedia.org/T218812 (Joe) [07:31:40] Analytics, ChangeProp, Community-Tech, Core Platform Team, and 6 others: Provide the ability to have time-delayed or time-offset jobs in the job queue - https://phabricator.wikimedia.org/T218812 (Joe) I'm a bit conflicted about this, and let me clarify why: in all the use-cases referenced above... 
[07:41:10] (PS14) Elukey: Add artifacts for Debian Buster and upgrade to 0.31.0rc18-wikimedia1 [analytics/superset/deploy] - https://gerrit.wikimedia.org/r/495182 (https://phabricator.wikimedia.org/T212243) [08:12:00] joal: hehe, yes, we saw the comment in line 15, lots of \\ \\\\ fun! [08:19:34] Analytics, Analytics-Data-Quality, Product-Analytics: A few alterblocks events have event_timestamps from before 2001 - https://phabricator.wikimedia.org/T218824 (JAllemandou) Thanks @Neil_P._Quinn_WMF and @matmarex :) This error is indeed related to wrong filtering of weird block-expirations. A patc... [08:21:24] New sqoop test succeeded in less than 11h - I think we can call this a success :) [08:21:30] morning elukey :) [08:22:23] bonjour :) [08:22:24] nice! [08:22:46] (CR) Joal: [V: +1] "Tested on cluster - succeeded in 11 hours" (6 comments) [analytics/refinery] - https://gerrit.wikimedia.org/r/495266 (https://phabricator.wikimedia.org/T215550) (owner: Joal) [08:23:13] elukey: waiting for CRs, then merge/deploy and puppet patch :) [08:29:08] (CR) Elukey: "Spotted a wild typo! No idea about the rest because I am too ignorant in sqoop :)" (1 comment) [analytics/refinery] - https://gerrit.wikimedia.org/r/495266 (https://phabricator.wikimedia.org/T215550) (owner: Joal) [08:29:16] ack! [08:30:11] (CR) Joal: "We have tested oozie-loop. It works, but only accepts a small number of items to loop-over (too many sub-workflows otherwise). 
See https:/" [analytics/refinery] - https://gerrit.wikimedia.org/r/496885 (https://phabricator.wikimedia.org/T210844) (owner: Bmansurov) [08:32:42] (PS10) Joal: Update refinery sqoop to use dedicated labsdb host [analytics/refinery] - https://gerrit.wikimedia.org/r/495266 (https://phabricator.wikimedia.org/T215550) [08:33:04] (CR) Joal: Update refinery sqoop to use dedicated labsdb host (1 comment) [analytics/refinery] - https://gerrit.wikimedia.org/r/495266 (https://phabricator.wikimedia.org/T215550) (owner: Joal) [08:36:37] https://github.com/apache/incubator-superset/issues/7074 seems fixed! [08:36:42] (superset 0.31) [08:37:34] world maps still broken [08:37:39] sadly [08:37:46] :( [08:38:44] even if I have proposed a fix [08:41:05] joal: if you have time, would you mind checking superset on analytics-tool1004 to see if you spot anything weird again? I have live patched viz.py, would be great to know if the things that you spotted last time are working [08:41:27] ssh -L 9080:analytics-tool1004.eqiad.wmnet:80 analytics-tool1004.eqiad.wmnet [08:48:46] ah I just spotted that the edit filter fails [08:48:47] sigh [08:50:15] elukey: pageview dashboards work for me [08:50:37] joal: this is superset 0.31.0rc18 + patches [08:50:44] Ah ok :) [08:50:49] + patch :) [08:51:53] yeah the one from upstream for the filter box, and mine for the world map [08:53:02] joal: https://gist.github.com/jobar/80cacffdeff98c85c724a978cf2cd037 seems working right? 
[08:53:05] all the examples I mean [08:54:34] I feel that they use people as giant net of unit testing [08:54:37] for their releases [08:55:31] :) [08:56:36] mechanicalytics-turc [08:57:18] joal: last one if you have patience - the edit filter chart [08:57:24] the current one returns no data [08:57:32] but the one for 0.31 returns an error for 'count' [08:57:41] that is a KeyError exception in the logs [08:58:14] elukey: using the same metric in world-map works, but doesn't in bubble-chart [08:58:47] the first example you mean [08:58:48] ? [08:59:42] what can I do to repro? [09:00:27] in the bubble chart example, use the same metric (SUM(view_count) for instance) in X-Axis, Y-Axis and Bubble-Size [09:00:33] ah getting Wrong number of items passed 2, placement implies 1 [09:00:55] correct [09:01:13] Same type of error we were having with map before [09:02:11] mmmm no with the map it was 'Too many indexers' [09:04:09] but the bubble one seems indeed broken, and now that I remember it was also for 0.29 [09:04:18] so I need to open another issue then [09:05:01] they bumped pandas to a newer version and it broke a lot of things [09:06:44] elukey: the filter-box one works, issue is coming from invalid date in time-range (`infinite`_) [09:07:40] joal: did you find a way to make it work? 
[09:07:51] it breaks for me even if I change the time range [09:08:03] http://localhost:9080/superset/explore/?form_data=%7B%22datasource%22%3A%22332__druid%22%2C%22viz_type%22%3A%22filter_box%22%2C%22slice_id%22%3A163%2C%22url_params%22%3A%7B%7D%2C%22granularity%22%3A%22P1M%22%2C%22druid_time_origin%22%3Anull%2C%22time_range%22%3A%221990-01-01T00%3A00%3A00+%3A+now%22%2C%22filter_configs%22%3A%5B%7B%22asc%22%3Afalse%2C%22clearable%22%3Atrue%2C%22column%22%3A%22wiki_ [09:08:09] db%22%2C%22metric%22%3A%22count%22%2C%22multiple%22%3Atrue%7D%2C%7B%22asc%22%3Afalse%2C%22clearable%22%3Atrue%2C%22column%22%3A%22page_namespace%22%2C%22metric%22%3A%22count%22%2C%22multiple%22%3Atrue%7D%2C%7B%22asc%22%3Afalse%2C%22clearable%22%3Atrue%2C%22column%22%3A%22page_is_redirect%22%2C%22metric%22%3A%22count%22%2C%22multiple%22%3Atrue%7D%2C%7B%22asc%22%3Afalse%2C%22clearable%22%3Atrue%2C% [09:08:15] 22column%22%3A%22event_user_is_anonymous%22%2C%22metric%22%3A%22count%22%2C%22multiple%22%3Atrue%7D%2C%7B%22asc%22%3Afalse%2C%22clearable%22%3Atrue%2C%22column%22%3A%22event_user_is_bot_by_name%22%2C%22metric%22%3A%22count%22%2C%22multiple%22%3Atrue%7D%5D%2C%22date_filter%22%3Atrue%2C%22instant_filtering%22%3Atrue%2C%22show_sqla_time_granularity%22%3Atrue%2C%22show_sqla_time_column%22%3Atrue%2C%2 [09:08:21] 2show_druid_time_granularity%22%3Afalse%2C%22show_druid_time_origin%22%3Afalse%2C%22adhoc_filters%22%3A%5B%5D%7D [09:08:22] ahhhh [09:08:24] :D [09:08:24] Arfff [09:08:26] :) [09:08:28] sorry :( [09:08:31] comment in gist [09:08:33] elukey: --^ [09:08:34] :) [09:08:48] * joal wishes he'd learn from previous mistakes [09:09:34] ah nice! 
What I don't like though is that now the filter box says "No data" [09:09:41] in 0.26 I mean [09:09:47] hm [09:09:47] and in 0.31 says 'count' [09:09:59] because it relates to a key error exception [09:10:00] it doesn't for me [09:10:19] yeah well, the error is indeed very not-explanatory [09:11:00] joal: try to select time granularity different than month [09:11:06] it should return the error [09:11:29] nope - works for me [09:11:54] in 0.31 ? [09:11:58] yes [09:12:32] Now I think the time granularity doesn't provide any change in the result I get [09:13:09] joal: try time window set from today's date : now [09:13:23] elukey: no data for this dataset [09:13:38] whatttt [09:13:39] And indeed error is not nice [09:13:49] elukey: data for this dataset starts last month [09:13:56] I get 'count' [09:14:05] elukey: so if you use now as starting point, well no data available to work with [09:14:25] sure, I don't dispute the error condition, but the error message [09:14:28] can you bc a second? [09:14:28] indeed [09:14:30] sure [09:38:08] brb [10:21:32] joal: as FYI I am prepping a change to add to the hadoop test cluster 2000,20 as retry policy and 5000 as maximum ids stored [10:21:41] if it goes fine I'll apply to prod [10:21:54] does it sound good? [10:23:14] +1 elukey :) [10:30:27] I am curious about what happens to the exceeding app ids [10:30:32] will they be cleaned up? [10:30:58] I don't know! [10:31:34] in theory it should already clean those up periodically [10:38:30] elukey@analytics1028:~$ hdfs dfs -ls /user/yarn/rmstore/FSRMStateRoot/RMAppRoot | wc -l [10:38:33] Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=UTF-8 [10:38:35] joal: --^ [10:38:38] 5001 [10:38:40] ah! [10:38:43] so all good in testing [10:38:46] moving to prod? 
[10:42:43] YES:) [10:42:52] elukey: --^ :) [10:45:54] ok proceeding, restarted yarn on 1001 [10:46:14] 1002 is of course taking its time to load from hdfs [10:46:29] it took 1m in the test cluster [10:46:32] as reference [10:50:29] brb [10:57:50] 12 minutes [10:57:53] what a disaster [10:58:34] ok master is 1002 now [10:58:42] will wait 5 mins and then I'll restart it again [11:02:56] failover started to get back to 1001 [11:10:58] seems to have been faster elukey :) [11:12:17] elukey@an-master1001:~$ hdfs dfs -ls /user/yarn/rmstore/FSRMStateRoot/RMAppRoot | wc -l [11:12:20] Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=UTF-8 [11:12:23] 5009 [11:12:29] \o/ [11:14:10] 7 minutes [11:14:26] not that fast :S [11:15:27] I feel better now though that retry policy and appids are better :) [11:15:38] next step is to think about zookeeper [11:15:53] I would love to move back to zookeeper directly with the new hosts [11:16:06] but I think that we don't have enough remaining budget [11:17:31] elukey: with the new settings we are a lot safer than before relatively to failures [11:17:33] elukey: ha, you know all these people who do crosswords or sudokus during vacation time? Wikistats refactors are my sudokus :D [11:17:38] elukey: But, the startup time is still high [11:18:04] elukey: it can wait a quarter IMO :) [11:18:30] fdans: :D master-level :) [11:20:06] joal: 2 weeks is like my longest holiday ever, I need just a lil bit of codage [11:20:39] :) [11:21:24] btw I just realized that I logged off the irccloud mobile app, but not the channel, so apologies for any unresponded pings [11:22:38] fdans_out: please stay safe from pings! HOLIDAYS! 
[11:23:56] :) [11:24:23] joal: I'll follow up with sre to see if we could order the hw in advance, we have some budget left [11:24:53] the alternative could be to explore co-locating zookeeper on the worker nodes [11:25:01] like we do for druid [11:25:12] and also for the journal nodes [11:27:50] joal: yeayea totally, just I'm sorry for appearing as online when I'm not :D [11:49:43] PROBLEM - YARN NodeManager Node-State on an-worker1089 is CRITICAL: CRITICAL: YARN NodeManager an-worker1089.eqiad.wmnet:8041 Node-State: Could not find the node report for node id : an-worker1089.eqiad.wmnet:8041 [11:49:43] PROBLEM - YARN NodeManager Node-State on an-worker1082 is CRITICAL: CRITICAL: YARN NodeManager an-worker1082.eqiad.wmnet:8041 Node-State: Could not find the node report for node id : an-worker1082.eqiad.wmnet:8041 [11:49:43] PROBLEM - Hadoop NodeManager on an-worker1092 is CRITICAL: PROCS CRITICAL: 0 processes with command name java, args org.apache.hadoop.yarn.server.nodemanager.NodeManager [11:49:57] PROBLEM - Hadoop NodeManager on an-worker1089 is CRITICAL: PROCS CRITICAL: 0 processes with command name java, args org.apache.hadoop.yarn.server.nodemanager.NodeManager [11:50:17] mmmm [11:50:49] PROBLEM - YARN NodeManager Node-State on an-worker1092 is CRITICAL: CRITICAL: YARN NodeManager an-worker1092.eqiad.wmnet:8041 Node-State: Could not find the node report for node id : an-worker1092.eqiad.wmnet:8041 [11:50:53] PROBLEM - Hadoop NodeManager on an-worker1082 is CRITICAL: PROCS CRITICAL: 0 processes with command name java, args org.apache.hadoop.yarn.server.nodemanager.NodeManager [11:53:23] RECOVERY - YARN NodeManager Node-State on an-worker1092 is OK: OK: YARN NodeManager an-worker1092.eqiad.wmnet:8041 Node-State: RUNNING [11:53:27] RECOVERY - Hadoop NodeManager on an-worker1082 is OK: PROCS OK: 1 process with command name java, args org.apache.hadoop.yarn.server.nodemanager.NodeManager [11:53:33] RECOVERY - Hadoop NodeManager on an-worker1092 is OK: PROCS OK: 1 
process with command name java, args org.apache.hadoop.yarn.server.nodemanager.NodeManager [11:53:35] RECOVERY - YARN NodeManager Node-State on an-worker1082 is OK: OK: YARN NodeManager an-worker1082.eqiad.wmnet:8041 Node-State: RUNNING [11:53:35] RECOVERY - YARN NodeManager Node-State on an-worker1089 is OK: OK: YARN NodeManager an-worker1089.eqiad.wmnet:8041 Node-State: RUNNING [11:53:49] RECOVERY - Hadoop NodeManager on an-worker1089 is OK: PROCS OK: 1 process with command name java, args org.apache.hadoop.yarn.server.nodemanager.NodeManager [11:54:24] so I have restarted the node managers on those [11:54:42] they went down when we did the failover due to conn refused [11:54:46] and only now downtime expired [11:58:38] ok so I checked on all and yarn should be fine [11:58:45] icinga is not complaining anymore [11:59:31] going out for lunch with a friend, but I'll keep my laptop with me just in case [11:59:40] please ping me on hangouts whatever if anything happens :) [11:59:42] * elukey lunch! [12:52:14] (PS10) Joal: Update mediawiki-reconstruction with log info [analytics/refinery/source] - https://gerrit.wikimedia.org/r/493012 [12:53:38] (PS2) Joal: Correct mw user-history create event timestamp [analytics/refinery/source] - https://gerrit.wikimedia.org/r/497604 (https://phabricator.wikimedia.org/T218463) [12:58:42] Analytics, ChangeProp, Community-Tech, Core Platform Team, and 6 others: Provide the ability to have time-delayed or time-offset jobs in the job queue - https://phabricator.wikimedia.org/T218812 (Pchelolo) There is already an ability to execute jobs after a delay or at more-or-less specific time,... 
[13:26:20] (PS3) Fdans: Extract loadData method to GraphModel [analytics/wikistats2] - https://gerrit.wikimedia.org/r/498002 [13:38:05] as FYI I just restarted yarn/hdfs on an-worker1080 to pick up new openjdk [13:53:03] Analytics-Kanban, Product-Analytics: Superset Updates - https://phabricator.wikimedia.org/T211706 (elukey) [13:54:35] Analytics-Kanban, Product-Analytics: Superset Updates - https://phabricator.wikimedia.org/T211706 (elukey) [13:56:51] also opened https://github.com/apache/incubator-superset/issues/7079 [13:57:02] so atm 3 upstream issues not fixed [13:59:26] ottomata: o/ [13:59:39] I'm ready when you are. [14:01:10] bmansurov: yes now is good! [14:01:26] ottomata: ok call in hangout [14:09:13] https://github.com/wikimedia/wikimedia-discovery-analytics [14:17:40] Random question that I couldn't answer by reading the operations-puppet repo: what version of Hive do we run on the stats cluster? [14:19:29] awight: o/ - it is the one shipped via cdh 5.15, Hive 1.1.0-cdh5.15.0 [14:21:13] awight: on a host you can try [14:21:20] dpkg -l| grep hive [14:25:57] Thanks! [14:26:39] joal: something interesting.. if I apply your bubble chart example as it is to superset.wikimedia.org it returns no data [14:28:11] :S [14:29:10] can you check if you have time? doesn't seem to be a regression? [14:29:21] but a partial fix? :D [14:32:38] elukey: I can't say the difference in settings, but when trying manually on superset.wikimedia.org I get the same error(Wrong number of items passed 3, placement implies 1) [14:33:52] Can you paste the link? (in the gist) [14:34:24] done [14:34:28] thanks :) [14:37:01] joal: congrats! You have discovered a bug! 
[14:37:02] :D [14:37:08] \o/ :) [14:37:14] I have updated https://github.com/apache/incubator-superset/issues/7079 accordingly [14:40:42] Analytics, Analytics-Kanban, Patch-For-Review: Improve speed and reliability of Yarn's Resource Manager failover - https://phabricator.wikimedia.org/T218758 (elukey) ` elukey@an-master1001:~$ hdfs dfs -ls /user/yarn/rmstore/FSRMStateRoot/RMAppRoot | wc -l Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=... [15:14:59] (PS3) Joal: Update mw user-history timestamps [analytics/refinery/source] - https://gerrit.wikimedia.org/r/497604 (https://phabricator.wikimedia.org/T218463) [15:15:51] milimetric: let's discuss before you look at that one --^ or you'll think I'm making fun of you [15:16:00] please :) [16:04:37] Analytics, Product-Analytics: Add ExternalGuidance event logging table to whitelist - https://phabricator.wikimedia.org/T218838 (Nuria) @chelsey: you just need to submit a CR, we will look at it, thanks [16:06:31] Analytics, Analytics-Wikistats, JavaScript: broken javascript on SquidReportPageViewsPerCountryBreakdownHuge - https://phabricator.wikimedia.org/T43663 (awight) [16:06:57] Analytics, Analytics-Wikimetrics: "Created" date in report queue showing up strange - https://phabricator.wikimedia.org/T56300 (awight) [16:07:08] Analytics, Analytics-Wikimetrics: Reports download with the filename "Bytes" - https://phabricator.wikimedia.org/T65401 (awight) [16:07:21] Analytics, Analytics-Wikimetrics: csv upload tricks you into staring at a "Refresh" button - https://phabricator.wikimedia.org/T65402 (awight) [16:07:46] Analytics, ChangeProp, Community-Tech, Core Platform Team, and 6 others: Provide the ability to have time-delayed or time-offset jobs in the job queue - https://phabricator.wikimedia.org/T218812 (Mooeypoo) >>! In T218812#5042813, @Joe wrote: > @aezell is there something I'm missing? Wouldn't a sc... 
[16:07:50] Analytics, Analytics-Kanban, Patch-For-Review: Eventlogging mysql consumer restarting for several hours due to schema parsing errors - https://phabricator.wikimedia.org/T218831 (Nuria) p:Triage→High [16:08:58] Analytics, Analytics-Cluster, Analytics-Kanban: Access to yarn.wikimedia.org for julia.glen - https://phabricator.wikimedia.org/T218815 (Nuria) a:elukey [16:09:18] Analytics, Analytics-Cluster, Analytics-Kanban: Access to yarn.wikimedia.org for julia.glen - https://phabricator.wikimedia.org/T218815 (Nuria) p:Triage→High [16:10:10] Analytics, EventBus: EventGate Helm chart should POST test event for readinessProbe - https://phabricator.wikimedia.org/T218680 (Nuria) p:Triage→High [16:15:11] Analytics, Product-Analytics: Fix EventLogging schemas that use array for items type - https://phabricator.wikimedia.org/T218617 (Nuria) Can we have an ETA of when this work would be completed (since it is not a code patch it will not be visible here on ticket). Once fixed we can refine that data and ma... [16:17:26] Analytics, Product-Analytics: Fix EventLogging schemas that use array for items type - https://phabricator.wikimedia.org/T218617 (Nuria) FYI that this blocks some changes to improve refinement of data, cause until these schemas are changed they cannot be refined with the new code. Ping @AndyRussG [16:18:52] Analytics, Fundraising-Backlog, Product-Analytics: Fix EventLogging schemas that use array for items type - https://phabricator.wikimedia.org/T218617 (AndyRussG) [16:19:03] Analytics, Analytics-Wikistats, JavaScript: broken javascript on SquidReportPageViewsPerCountryBreakdownHuge - https://phabricator.wikimedia.org/T43663 (Nuria) Open→Declined [16:19:20] Analytics, Fundraising-Backlog, Product-Analytics: Fix EventLogging schemas that use array for items type - https://phabricator.wikimedia.org/T218617 (AndyRussG) >>! 
In T218617#5044948, @Nuria wrote: > FYI that this blocks some changes to improve refinement of data, cause until these schemas are chan... [16:20:02] Analytics, ChangeProp, Community-Tech, Core Platform Team, and 6 others: Provide the ability to have time-delayed or time-offset jobs in the job queue - https://phabricator.wikimedia.org/T218812 (aezell) >>! In T218812#5042813, @Joe wrote: > - TTL-based expiry of records I agree. This is the "wo... [16:20:49] Analytics, Product-Analytics: prefUpdate schema contains multiple identical events for the same preference update - https://phabricator.wikimedia.org/T218835 (nettrom_WMF) I used data from this schema in T216185. My experience was similar to what @Tbayer mentions in that some preferences appear to have i... [17:35:13] * elukey off! [17:35:33] bye elukey :) [17:52:38] elukey: when you send that message I always imagine you like a server powering down 😂 [17:59:56] (CR) Nuria: "Some comments as to variable naming that (I think) make more clear what is going on." (4 comments) [analytics/refinery] - https://gerrit.wikimedia.org/r/495266 (https://phabricator.wikimedia.org/T215550) (owner: Joal) [18:18:25] Analytics, Analytics-Cluster, Analytics-Kanban: Access to yarn.wikimedia.org for julia.glen - https://phabricator.wikimedia.org/T218815 (elukey) Julia is not in the `nda` LDAP group, I'll follow up with @MoritzMuehlenhoff tomorrow (EU time) and I'll add the username in there! [18:20:43] Analytics, Dumps-Generation, WikiCite, Wikidata, Patch-For-Review: Update wikidata-entities dump generation to fixed day-of-month instead of fixed weekday - https://phabricator.wikimedia.org/T216160 (Rosiestep) Why 1st and 20th of each month, rather than more equally-spaced, e.g. 1st and 15th... 
[18:21:20] Analytics, Knowledge-Integrity, Research, Epic, Patch-For-Review: Citation Usage: run third round of data collection - https://phabricator.wikimedia.org/T213969 (bmansurov) Data collection has started. [18:26:35] Analytics, Dumps-Generation, WikiCite, Wikidata, Patch-For-Review: Update wikidata-entities dump generation to fixed day-of-month instead of fixed weekday - https://phabricator.wikimedia.org/T216160 (ArielGlenn) The reason for the reschedule is to have the first run correspond with the xm/sql... [18:50:19] Analytics, Fundraising-Backlog, Product-Analytics: Fix EventLogging schemas that use array for items type - https://phabricator.wikimedia.org/T218617 (chelsyx) Hi @Ottomata , IIUC this only requires changes on the schema on meta wiki ([MobileWikiAppiOSUserHistory](https://meta.wikimedia.org/wiki/Sche... [18:58:26] nuria: here? Just your comment on sqoop and was willing to make decision on naming [19:03:55] Analytics, CirrusSearch, Discovery, Discovery-Search: Ingest cirrusserachrequest data into druid - https://phabricator.wikimedia.org/T218347 (debt) [19:04:00] Analytics, Analytics-Kanban, EventBus, MW-1.33-notes (1.33.0-wmf.22; 2019-03-19), Patch-For-Review: Make Refine use JSONSchemas of event data to support Map types and proper types for integers vs decimals - https://phabricator.wikimedia.org/T215442 (debt) [19:21:29] Analytics, Dumps-Generation, WikiCite, Wikidata, Patch-For-Review: Update wikidata-entities dump generation to fixed day-of-month instead of fixed weekday - https://phabricator.wikimedia.org/T216160 (Melderick) @ArielGlenn I'm getting confused here. Wikidata dumps are generated every week (we... 
[19:24:53] (CR) Joal: [C: +1] "A bunch of minimal comments - feel free to ignore and merge :)" (6 comments) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/494831 (https://phabricator.wikimedia.org/T215442) (owner: Ottomata) [19:28:49] (CR) Joal: "Thanks for comments @nuria - As stated in in-file comments, I'd rather the shorter names `sqoop_commands_map` and `sqoop_commands` if ok f" (4 comments) [analytics/refinery] - https://gerrit.wikimedia.org/r/495266 (https://phabricator.wikimedia.org/T215550) (owner: Joal) [20:18:35] Analytics, Fundraising-Backlog, Product-Analytics: Fix EventLogging schemas that use array for items type - https://phabricator.wikimedia.org/T218617 (Ottomata) > IIUC this only requires changes on the schema on meta wiki I believe so yes! Unless yall have reasons for specifying the items type as a... [20:52:52] Analytics, Dumps-Generation, WikiCite, Wikidata, Patch-For-Review: Update wikidata-entities dump generation to fixed day-of-month instead of fixed weekday - https://phabricator.wikimedia.org/T216160 (Rosiestep) @ArielGlenn Am I understanding correctly that the xm/sql dumps run only on the 1st... [21:17:55] Analytics, EventBus, Wikimedia-production-error: Warning: get_class expects object (string given) from EventBusHooks.php - https://phabricator.wikimedia.org/T218952 (Krinkle) [21:18:16] Analytics, Anti-Harassment, EventBus, Wikimedia-production-error: Warning: get_class expects object (string given) from EventBusHooks.php - https://phabricator.wikimedia.org/T218952 (Krinkle) [21:18:47] Analytics, Anti-Harassment, EventBus, Wikimedia-production-error: Warning: get_class expects object (string given) from EventBusHooks.php - https://phabricator.wikimedia.org/T218952 (Krinkle) Tagging AHT as this affect events from onBlockIpComplete, which might be of interest to you. 
[21:25:59] Analytics, Fundraising-Backlog, Product-Analytics: Fix EventLogging schemas that use array for items type - https://phabricator.wikimedia.org/T218617 (chelsyx) @Ottomata I was trying to change the schema on meta wiki, but got this error message: > Error: Invalid node: expecting "array", got "object".... [21:28:09] Analytics, Fundraising-Backlog, Product-Analytics: Fix EventLogging schemas that use array for items type - https://phabricator.wikimedia.org/T218617 (Ottomata) HUH! I wonder if this is why the items are arrays......is EventLogging extension enforcing this with its weirdo JSONSchema? Will check... [21:35:36] Analytics, Research, Article-Recommendation: Some worker nodes don't seem to have numpy installed - https://phabricator.wikimedia.org/T218955 (bmansurov) [21:48:11] Analytics, Anti-Harassment, EventBus, Core Platform Team (Multi-DC (TEC1)), and 4 others: Warning: get_class expects object (string given) from EventBusHooks.php - https://phabricator.wikimedia.org/T218952 (mobrovac) a:mobrovac The code actually behaves correctly, since a block target can be... [22:14:41] Analytics, Research, Article-Recommendation, Patch-For-Review: Generate article recommendations in Hadoop for use in production - https://phabricator.wikimedia.org/T210844 (bmansurov) [22:14:43] Analytics, Research, Article-Recommendation: Some worker nodes don't seem to have numpy installed - https://phabricator.wikimedia.org/T218955 (bmansurov) Open→Invalid Never mind, I had to create the virtual environment with `--system-site-packages`. 
[22:58:23] Analytics, Analytics-Kanban, Better Use Of Data, Product-Analytics: "Edit" equivalent of pageviews daily available to use in Turnilo and Superset - https://phabricator.wikimedia.org/T211173 (Neil_P._Quinn_WMF) [22:59:38] Analytics, Analytics-Kanban, EventBus, Operations, and 3 others: eventgate-analytics k8s pods occasionally can't produce to kafka - https://phabricator.wikimedia.org/T218268 (Ottomata) I don't know much more, but I have a lot more data! Here is a staging pod with trace logging enabled reproducin... [23:37:36] Analytics-Kanban, Product-Analytics: Address data quality issues in the mediawiki_history dataset - https://phabricator.wikimedia.org/T204953 (Neil_P._Quinn_WMF) [23:38:15] Analytics, Product-Analytics: Ingest data from PrefUpdate EventLogging schema into Druid - https://phabricator.wikimedia.org/T218964 (Tbayer)