[02:45:58] <icinga-wm>	 PROBLEM - Hadoop NodeManager on analytics1071 is CRITICAL: PROCS CRITICAL: 0 processes with command name java, args org.apache.hadoop.yarn.server.nodemanager.NodeManager https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Hadoop/Administration
[05:29:41] <wikibugs>	 Analytics, Analytics-Kanban: Make anomaly detection correctly handle holes in time-series - https://phabricator.wikimedia.org/T251542 (Nuria) >Logs display an error about one of the basic hive classes not being available: Seems that if the series does not have 24  points (for say, a daily measure of hour...
[05:33:34] <elukey>	 !log restart hadoop yarn nodemanager on analytics1071
[05:33:36] <stashbot>	 Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[05:33:45] <elukey>	 failed due to spark shuffle --^
[05:33:50] <elukey>	 (and heap oom)
[05:34:30] <icinga-wm>	 RECOVERY - Hadoop NodeManager on analytics1071 is OK: PROCS OK: 1 process with command name java, args org.apache.hadoop.yarn.server.nodemanager.NodeManager https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Hadoop/Administration
[05:35:45] <elukey>	 !log re-run two failed hours for webrequest load text (07/05T05) and upload (06/05T23)
[05:35:46] <stashbot>	 Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[05:38:42] <nuria>	 elukey: was the info  about spark shuffle in the logs on 1071 on var/log?
[05:39:30] <elukey>	 nuria: hola! yes, /var/log/hadoop-yarn/hadoop-yarn-nodemanager..etc..
[05:39:58] <nuria>	 elukey: and can i see those with sudo -u hdfs?
[05:39:59] <elukey>	 it is sadly something that we have been seeing recently, I am going to increase the yarn node manager heap size today
[05:40:56] <elukey>	 nuria: even from your user, they should be readable by all
[05:42:23] <nuria>	 elukey: ah yes, i was looking at syslogs
[05:42:34] <nuria>	 elukey: got it 
[05:46:10] <elukey>	 !log re-run mediarequest-hourly-wf-2020-5-6-19
[05:46:11] <stashbot>	 Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[06:01:31] <elukey>	 mmm a lot of jobs failing for java.lang.NoClassDefFoundError: org/apache/hive/service/cli/HiveSQLException
[06:01:35] <elukey>	 including mw history
[06:03:41] <elukey>	 all spark jobs
[06:06:02] <elukey>	 and they started after the deploy
[06:21:31] <elukey>	 ah no lovely also webrequest fails
[06:38:07] <fdans>	 elukey: I'm here if you need me to look at anything
[06:40:48] <elukey>	 fdans: hola!
[06:40:58] <elukey>	 I am not sure what's happening, there is a weird hive error
[06:41:10] <elukey>	 it seems as if the hive-service.jar wasn't picked up by oozie or similar
[06:41:44] <elukey>	 let me try with the hammer
[06:41:50] <elukey>	 !log restart oozie on an-coord1001
[06:41:51] <stashbot>	 Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[06:43:31] <elukey>	 re-running a job to see if anything changes
[06:43:52] <elukey>	 yesterday I used oozie admin shlib upgrade and I just want to make sure that oozie isn't in a weird state
[06:51:51] <elukey>	 lol it completed fine
[06:52:02] <elukey>	 trying with webrequest
[06:54:06] <wikibugs>	 Analytics: Creation of canonical pageview dumps for users to download - https://phabricator.wikimedia.org/T251777 (fdans) Some more things to consider after team discussion:    - For this new dump to replace the Pageviews dump we would have to provide not only the access method, but also the agent type dimen...
[07:06:07] <elukey>	 !log restart mediawiki-history-load via hue
[07:06:08] <stashbot>	 Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[07:08:13] <elukey>	 and of course the cluster is super busy
[07:12:15] <elukey>	 Nathan is using ~25% of the total RAM with a spark job :)
[07:18:45] <elukey>	 !log execute yarn application -movetoqueue application_1583418280867_332862 -queue root.nice
[07:18:46] <stashbot>	 Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[07:18:48] <elukey>	 let's see if it works
[07:28:06] <elukey>	 ebernhardson: o/ I was checking https://yarn.wikimedia.org/proxy/application_1583418280867_333560/mapreduce/conf/job_1583418280867_333560
[07:28:16] <elukey>	 it consumes ~25% of the total ram available
[07:28:22] <elukey>	 (in the cluster)
[07:28:35] <elukey>	 it is in the root.nice queue, but is it expected to be so massive?
[07:32:11] <elukey>	 !log re-run mediawiki history load
[07:32:13] <stashbot>	 Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[07:38:52] <elukey>	 dcausse: o/
[07:38:58] <dcausse>	 o/
[07:39:11] <elukey>	 goood morning :)
[07:39:16] <dcausse>	 morning! :)
[07:39:22] <elukey>	 so there is a big job being run by analytics-search
[07:39:22] <elukey>	 https://yarn.wikimedia.org/proxy/application_1583418280867_333560/mapreduce/conf/job_1583418280867_333560
[07:39:35] <elukey>	 that is now consuming ~ half of the total ram of the cluster :D
[07:39:51] <elukey>	 if you look for "hive.query.string"
[07:39:54] <elukey>	 you can see the full query
[07:40:03] <elukey>	 (there is a search box in the top right corner)
[07:40:11] <elukey>	 SELECT '2020-05-06' AS date, search_classify(uri_path, uri_query) AS api, referer_class, COUNT(1) AS calls FROM webrequest WHERE etc..
[07:40:40] <dcausse>	 interesting
[07:41:29] <dcausse>	 looks like a query for populating our dashboards at https://discovery.wmflabs.org/
[07:41:38] <dcausse>	 which have been broken for months
[07:42:43] <elukey>	 I'd be tempted to kill the job and see if we can tune it
[07:42:54] <dcausse>	 elukey: yes it's fine to kill
[07:43:02] <elukey>	 currently allocated MBs: 1933312
[07:43:17] <elukey>	 that is ~2TB :D
[07:43:20] <elukey>	 ack then thanks!
[07:43:35] <dcausse>	 we might just want to stop generating this data
[07:43:43] <dcausse>	 I'll bring this up
[07:43:46] <elukey>	 !log kill application_1583418280867_333560 after a chat with David, the job is consuming ~2TB of RAM
[07:43:47] <stashbot>	 Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[07:43:49] <elukey>	 dcausse: <3
[07:44:28] <elukey>	 done, half of the cluster free now :D
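For reference, the unit conversion behind the "~2TB" estimate can be sketched as below. This is only an illustration: it assumes the "allocated MBs" figure is in binary megabytes (MiB), and the helper names are hypothetical, not part of any Hadoop tool.

```python
# Minimal sketch: convert the allocated-memory figure quoted in the chat.
# Assumption: the figure is expressed in MiB (1 GiB = 1024 MiB).

def mb_to_gib(mb: int) -> float:
    """Convert MiB to GiB."""
    return mb / 1024


def mb_to_tib(mb: int) -> float:
    """Convert MiB to TiB."""
    return mb / (1024 * 1024)


allocated_mb = 1933312  # "currently allocated MBs" from the chat

print(f"{mb_to_gib(allocated_mb):.0f} GiB")
print(f"{mb_to_tib(allocated_mb):.2f} TiB")
```

Under that assumption the figure comes to 1888 GiB, a little under 2 TiB, which matches the rough "~2TB" quoted in the conversation.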
[07:45:18] <elukey>	 !log re-run mediawiki-history-denormalize
[07:45:19] <stashbot>	 Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[07:56:16] <elukey>	 ok the cluster is busy but manageable
[07:56:44] * elukey coffee, brutal start of the morning :D
[08:14:58] <wikibugs>	 Analytics, I18n, RTL: Support right-to-left languages in Wikistats - https://phabricator.wikimedia.org/T251376 (Amire80)
[09:37:52] <mgerlach>	 elukey: I cannot kinit on stat1005 anymore but it works on stat1007. I get "kinit: Failed to store credentials: No credentials cache found". any idea what could be the reason?
[09:39:17] <elukey>	 mgerlach: hey! Sorry I am working on it, I thought I was alone, I am changing some settings and I made a mistake. One min and it should be fixed
[09:40:38] <mgerlach>	 elukey: thanks 
[09:43:15] <elukey>	 mgerlach: ok so you should log off and login again
[09:43:17] <elukey>	 it should work
[09:43:39] <elukey>	 one thing - I am trying to change the default location of the kerberos credential cache
[09:43:43] <elukey>	 so something might not work
[09:44:07] <elukey>	 for example, if possible, I'd ask you to completely stop your notebook and start it again
[09:44:11] <mgerlach>	 elukey: checked - it works. thanks again.
[10:03:44] <elukey>	 mgerlach: let me know if anything doesn't work, you are now a tester of the new settings :D
[10:04:14] <mgerlach>	 elukey so far so good ; )
[10:05:57] <elukey>	 mgerlach: I am having issues with pyspark in notebooks, so you'll probably see some as well :(
[10:07:17] <mgerlach>	 for me pyspark+notebooks actually works
[10:09:02] <elukey>	 mgerlach: did you start a new one or kept using the old one?
[10:09:05] <elukey>	 a running one sorry
[10:09:15] <elukey>	 because it might be still using the old credentials 
[10:10:51] <mgerlach>	 probably the old one. but it asked for my credentials at some point and failed (that's when I pinged you); after re-entering it worked again
[10:39:58] * elukey lunch!
[10:52:40] <joal>	 Hi folks- Wow a lot of errors :(
[10:58:20] <joal>	 !log Rerun wikidata-articleplaceholder_metrics-wf-2020-5-6
[10:58:22] <stashbot>	 Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[11:00:41] <joal>	 !log Moving application_1583418280867_334532 to the nice queue
[11:00:42] <stashbot>	 Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[11:14:55] <wikibugs>	 Analytics, Analytics-Kanban: Make anomaly detection correctly handle holes in time-series - https://phabricator.wikimedia.org/T251542 (JAllemandou) I think we should fill holes with 0s (that's actually the meaning of the hole).
[11:17:42] <wikibugs>	 (CR) Joal: "We need to discuss how we want to handle data-dependency here. I don't think we have a datasets file for events yet. Should we have one? S" [analytics/refinery] - https://gerrit.wikimedia.org/r/594719 (https://phabricator.wikimedia.org/T249773) (owner: Milimetric)
[11:33:30] <wikibugs>	 Analytics, DC-Ops, Operations, ops-eqiad: (Need by: TBD) rack/setup/install an-druid100[12] and druid100[78] - https://phabricator.wikimedia.org/T245569 (Cmjohnson)
[11:51:41] <wikibugs>	 Analytics, DC-Ops, Operations, ops-eqiad: (Need by: TBD) rack/setup/install an-druid100[12] and druid100[78] - https://phabricator.wikimedia.org/T245569 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['an-druid1001.eqiad.wmnet'] `  Of which those **FAILED**: ` ['an-druid1001.eqiad.wmnet'...
[11:52:21] <wikibugs>	 Analytics, DC-Ops, Operations, ops-eqiad: (Need by: TBD) rack/setup/install an-druid100[12] and druid100[78] - https://phabricator.wikimedia.org/T245569 (ops-monitoring-bot) Script wmf-auto-reimage was launched by cmjohnson on cumin1001.eqiad.wmnet for hosts: ` an-druid1002.eqiad.wmnet ` The log...
[12:05:15] <wikibugs>	 Analytics, DC-Ops, Operations, ops-eqiad: (Need by: TBD) rack/setup/install an-druid100[12] and druid100[78] - https://phabricator.wikimedia.org/T245569 (ops-monitoring-bot) Script wmf-auto-reimage was launched by cmjohnson on cumin1001.eqiad.wmnet for hosts: ` an-druid1001.eqiad.wmnet ` The log...
[12:08:40] <wikibugs>	 Analytics, DC-Ops, Operations, ops-eqiad: (Need by: TBD) rack/setup/install an-druid100[12] and druid100[78] - https://phabricator.wikimedia.org/T245569 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['an-druid1002.eqiad.wmnet'] `  Of which those **FAILED**: ` ['an-druid1002.eqiad.wmnet'...
[12:17:25] <wikibugs>	 Analytics, DC-Ops, Operations, ops-eqiad: (Need by: TBD) rack/setup/install an-druid100[12] and druid100[78] - https://phabricator.wikimedia.org/T245569 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['an-druid1001.eqiad.wmnet'] `  Of which those **FAILED**: ` ['an-druid1001.eqiad.wmnet'...
[12:27:06] <elukey>	 so bad news from /run/user/etc..
[12:27:22] <elukey>	 jupyter doesn't support it by default, tried to pass the env var around but needs a bit more testing
[12:27:31] <elukey>	 I fear that this issue will repeat with multiple tools
[12:27:32] <elukey>	 sigh
[12:27:58] <elukey>	 mgerlach: I am reverting my change on stat1005, you might need to kinit again etc.. sorry :(
[12:28:20] * joal sends a lot of love to elukey, and some meat to feed the 3-headed dog :S
[12:31:55] <elukey>	 yeah it seems that everything is a problem if outside /tmp/krb
[12:31:56] <wikibugs>	 Analytics, DC-Ops, Operations, ops-eqiad, Patch-For-Review: (Need by: TBD) rack/setup/install an-druid100[12] and druid100[78] - https://phabricator.wikimedia.org/T245569 (Cmjohnson)
[12:32:24] <elukey>	 fdans: when you have a moment I'd like to ask you a question about the geoip archive stuff
[12:32:43] <fdans>	 elukey: batcave?
[12:33:32] <elukey>	 fdans: here is fine, should be quick :)
[12:33:40] <fdans>	 elukey: fire away!
[12:34:01] <elukey>	 so I am trying to complete the refactoring of the roles for the stat boxes
[12:34:09] <elukey>	 last thing standing is the profile for geoip archive
[12:34:28] <elukey>	 I'd love to move it on an-launcher, since IIRC that thing pushes to hdfs
[12:34:37] <elukey>	 so there is no real reason to have it on stat1007 right?
[12:44:01] <wikibugs>	 Analytics, DC-Ops, Operations, ops-eqiad, Patch-For-Review: (Need by: TBD) rack/setup/install an-druid100[12] and druid100[78] - https://phabricator.wikimedia.org/T245569 (ops-monitoring-bot) Script wmf-auto-reimage was launched by cmjohnson on cumin1001.eqiad.wmnet for hosts: ` an-druid1001....
[12:46:41] <fdans>	 elukey: yeah, as long as it can fetch the newest version of the db without issues there should be no problem
[12:49:27] <elukey>	 super thanks :)
[13:01:21] <wikibugs>	 Analytics, DC-Ops, Operations, ops-eqiad, Patch-For-Review: (Need by: TBD) rack/setup/install an-druid100[12] and druid100[78] - https://phabricator.wikimedia.org/T245569 (ops-monitoring-bot) Script wmf-auto-reimage was launched by cmjohnson on cumin1001.eqiad.wmnet for hosts: ` an-druid1002....
[13:03:29] <elukey>	 joal: is 00/4:45:00 on purpose?
[13:04:07] <joal>	 elukey: different times of days?
[13:04:20] <joal>	 elukey: I don't master those interval things :S
[13:05:17] <elukey>	 so that should be hour:minute:second
[13:05:33] <elukey>	 and usually the /something are to execute every something time
[13:05:45] <elukey>	 I think that the above means every 4 hours
[13:05:47] <elukey>	 or similar
[13:05:57] <elukey>	 what is the target that you have in mind?
[13:06:12] <joal>	 elukey: exactly that - every 4 hours, but a different minute
[13:06:29] <joal>	 elukey: mimic of webrequest (hourly data)
[13:09:16] <wikibugs>	 Analytics, DC-Ops, Operations, ops-eqiad, Patch-For-Review: (Need by: TBD) rack/setup/install an-druid100[12] and druid100[78] - https://phabricator.wikimedia.org/T245569 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['an-druid1001.eqiad.wmnet'] `  and were **ALL** successful.
[13:09:33] <elukey>	 joal: okok perfect
[13:09:35] <elukey>	 merging
[13:10:14] <wikibugs>	 Analytics, DC-Ops, Operations, ops-eqiad, Patch-For-Review: (Need by: TBD) rack/setup/install an-druid100[12] and druid100[78] - https://phabricator.wikimedia.org/T245569 (ops-monitoring-bot) Script wmf-auto-reimage was launched by cmjohnson on cumin1001.eqiad.wmnet for hosts: ` druid1007.eqi...
[13:10:41] <joal>	 elukey: Many thanks :) I would be happy to change if you prefer!
[13:12:14] <wikibugs>	 Analytics, DC-Ops, Operations, ops-eqiad: (Need by: TBD) rack/setup/install an-druid100[12] and druid100[78] - https://phabricator.wikimedia.org/T245569 (ops-monitoring-bot) Script wmf-auto-reimage was launched by cmjohnson on cumin1001.eqiad.wmnet for hosts: ` druid1008.eqiad.wmnet ` The log can...
[13:15:33] <elukey>	 joal: nono just wanted to check if it was the intended interval, we can change it in the future in case
[13:15:45] <joal>	 Ack elukey - T
[13:16:05] <joal>	 elukey: thanks for taking the time - Let me know if I can help with anything :S
[13:16:15] <elukey>	 sure!
[13:16:19] <elukey>	 I am deploying now
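The interval reading discussed above ("hour:minute:second", with "/N" meaning "every N") can be sketched as follows. This is a simplified illustration under the assumption that the expression follows a start/step convention in the hour field (as in systemd-style calendar specs); the function names are hypothetical and only this one shape of expression is handled.

```python
# Minimal sketch: expand an "HH/STEP:MM:SS" expression into daily run times.
# Assumption: "00/4" in the hour field means "hour 0, then every 4 hours".

def expand_hours(hour_field: str) -> list:
    """Expand 'start/step' (or a plain hour) into the matching hours of a day."""
    start, _, step = hour_field.partition("/")
    if not step:
        return [int(start)]
    return list(range(int(start), 24, int(step)))


def run_times(calendar: str) -> list:
    """Return the daily run times for an 'hours:minute:second' expression."""
    hours, minute, second = calendar.split(":")
    return [f"{h:02d}:{minute}:{second}" for h in expand_hours(hours)]


print(run_times("00/4:45:00"))
```

Under that assumption, "00/4:45:00" yields six runs per day, each at minute 45 (00:45, 04:45, and so on up to 20:45), which matches the "every 4 hours, but a different minute" intent stated in the chat.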
[13:27:35] <wikibugs>	 Analytics, DC-Ops, Operations, ops-eqiad: (Need by: TBD) rack/setup/install an-druid100[12] and druid100[78] - https://phabricator.wikimedia.org/T245569 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['an-druid1002.eqiad.wmnet'] `  and were **ALL** successful.
[13:33:11] <wikibugs>	 (CR) Ottomata: "Hm.  Refine will write the _REFINED flag, and generally that will mean things are ready to go.  Refine does wait 2 hours before attempting" [analytics/refinery] - https://gerrit.wikimedia.org/r/594719 (https://phabricator.wikimedia.org/T249773) (owner: Milimetric)
[13:35:27] <elukey>	 ottomata: morningggg
[13:35:35] <elukey>	 https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/594941/ - green light?? :D
[13:35:43] <elukey>	 (stat1007 to role explorer!)
[13:36:13] <wikibugs>	 Analytics, DC-Ops, Operations, ops-eqiad: (Need by: TBD) rack/setup/install an-druid100[12] and druid100[78] - https://phabricator.wikimedia.org/T245569 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['druid1007.eqiad.wmnet'] `  and were **ALL** successful.
[13:36:32] <ottomata>	 WOWO
[13:36:34] <ottomata>	 go for it luca
[13:36:35] <ottomata>	 is that the last one?
[13:36:42] <ottomata>	 or are there still some conditionals floating around?
[13:37:55] <ottomata>	 luca how about greenlight for 
[13:37:55] <ottomata>	 https://gerrit.wikimedia.org/r/c/operations/puppet/+/594565
[13:37:56] <ottomata>	 ?
[13:37:57] <ottomata>	 :
[13:37:58] <ottomata>	 :)
[13:38:14] <ottomata>	 Right now it will only start importing test.event and some eventgate error topics
[13:38:17] <elukey>	 ottomata: so there are some conditionals around that I had to make, but I have some ideas about how to fix them :)
[13:38:27] <ottomata>	 cooo
[13:38:52] <wikibugs>	 Analytics, DC-Ops, Operations, ops-eqiad: (Need by: TBD) rack/setup/install an-druid100[12] and druid100[78] - https://phabricator.wikimedia.org/T245569 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['druid1008.eqiad.wmnet'] `  and were **ALL** successful.
[13:39:56] <elukey>	 ottomata: I completely lost the dynamic stuff
[13:40:03] <ottomata>	 hah
[13:40:30] <ottomata>	 elukey: 
[13:40:31] <ottomata>	 https://gerrit.wikimedia.org/r/c/analytics/refinery/+/593047
[13:41:07] <ottomata>	 it makes the camus wrapper request from meta.wikimedia.org/w/api.php?action=streamconfigs... when it is run
[13:41:09] <ottomata>	 and use that to set
[13:41:13] <ottomata>	 -Dkafka.whitelist.topics
[13:41:16] <elukey>	 ahh you extended the camus wrapper! okok now it makes more sense :D
[13:41:28] <elukey>	 yesyes looks good
[13:41:36] <ottomata>	 a bit magical but it'll do
[13:41:44] <ottomata>	 until we figure out how to clean the entire thing up
[13:41:55] <ottomata>	 doesn't make things better, but also not worse ¯\_(ツ)_/¯
[13:42:20] <ottomata>	 k merging that will make sure it works ok
[13:47:02] <joal>	 Gone buying food - hopefully back on time for standup
[13:49:45] <ottomata>	 wow this old laptop's keyboard is much easier to type on than my newer one
[13:50:19] <elukey>	 ottomata: one thing that I forgot yesterday - is it ok if I bump the yarn nm's heap to 6G (and reduce the max mem available for containers accordingly) ?
[13:50:30] <ottomata>	 sure!
[13:50:33] <elukey>	 there seems to be an OOM issue sometimes with spark shuffling
[13:50:39] <elukey>	 and the nm gets to OOM :(
[13:50:46] <ottomata>	 hm, does sound worth it but maybe risky too?
[13:50:56] <ottomata>	 what would we be reducing the container max mem to?
[13:51:09] <elukey>	 reducing it by 2G
[13:51:18] <elukey>	 basically the diff between 4G->6G
[13:51:33] <elukey>	 so -2G 
[13:51:55] <ottomata>	 that means that jobs won't be able to request more than 4G per worker, right?
[13:52:01] <ottomata>	 maybe fine, but also maybe some jobs need that?
[13:52:13] <ottomata>	 worth a try, might want to check with joseph about that too
[13:52:32] <elukey>	 mmm wait a sec, I am not getting why 
[13:52:40] <elukey>	 I mean the 4G limit per worker
[13:53:06] <elukey>	 !log move stat1007 to role::statistics::explorer (adding jupyterhub)
[13:53:07] <stashbot>	 Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[13:53:12] <ottomata>	 the spark executor runs in a container, no?  
[13:53:18] <ottomata>	 and also other mappers et.c
[13:53:30] <ottomata>	 you'd be reducing the max mem per container from 6G to 4G?
[13:53:31] <ottomata>	 right?
[13:53:43] <ottomata>	 (brb)
[13:54:39] <elukey>	 ottomata: ahhh nono sorry, what I want to do is raise the NM heap max size to 6G
[13:54:59] <elukey>	 and reduce yarn_scheduler_maximum_allocation_mb: 53248 to 53248 - 2G
[13:56:31] <elukey>	 but now that I think about it, we have different nodes, that is not the value 
[13:57:36] <ottomata>	 b
[14:00:10] <ottomata>	 right but by reducing yarn_scheduler_maximum_allocation_mb, you will be reducing the memory available to any given e.g. map task, right?
[14:00:30] <ottomata>	 so, if there is currently some map task that needs 53248 to work, it will no longer be able to get that much
[14:00:32] <ottomata>	 and might fail
[14:01:12] <ottomata>	 like, what happens if someone runs spark with --executor-memory 53248M
[14:01:13] <ottomata>	 ?
[14:01:33] <elukey>	 we can argue that having a worker with 50G of ram is a little bit too much :D
[14:01:46] <ottomata>	 OH THAT IS G??
[14:02:01] <ottomata>	 right.
[14:02:01] <ottomata>	 ok
[14:02:03] <ottomata>	 np
[14:02:20] <ottomata>	 sounds fine.
[14:02:24] <ottomata>	 hah
[14:02:26] <ottomata>	 yes proceed!
[14:02:36] <elukey>	 but I am not remembering very well all those calculations about space etc.. so I'll recheck, what I want to do is bump the NM heap size :D
[14:02:47] <mforns>	 hey teaaaammmm, more alarrrmmmms!
[14:05:36] <ottomata>	 elukey:  sounds good
[14:05:47] <ottomata>	 sorry I misunderstood for a sec what you were reducing
[14:05:53] <ottomata>	 +1 to your change, should be fine
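The arithmetic behind the proposed change can be sketched as below. It is an illustration only: it assumes the quoted values are in MiB, that the NodeManager heap goes from 4G to 6G as stated in the chat, and that exactly the heap increase is subtracted from yarn_scheduler_maximum_allocation_mb. The variable and function names are hypothetical.

```python
# Minimal sketch of the "bump NM heap, shrink max container allocation" math.
# Assumptions: values are MiB; heap grows 4G -> 6G; allocation shrinks by the same amount.

MB_PER_GB = 1024

current_max_allocation_mb = 53248  # yarn_scheduler_maximum_allocation_mb quoted in the chat
old_heap_gb = 4
new_heap_gb = 6


def reduced_allocation_mb(current_mb: int, old_gb: int, new_gb: int) -> int:
    """Subtract the heap increase from the maximum container allocation."""
    heap_delta_mb = (new_gb - old_gb) * MB_PER_GB
    return current_mb - heap_delta_mb


new_max_mb = reduced_allocation_mb(current_max_allocation_mb, old_heap_gb, new_heap_gb)
print(current_max_allocation_mb // MB_PER_GB, "GiB ->", new_max_mb // MB_PER_GB, "GiB")
```

Under those assumptions the maximum single-container allocation would move from 52 GiB to 50 GiB, which is why the limit stays far above what a typical executor requests.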
[14:11:52] <wikibugs>	 Analytics, Analytics-Kanban: Make anomaly detection correctly handle holes in time-series - https://phabricator.wikimedia.org/T251542 (mforns) @Nuria I agree with @JAllemandou. I think we should fill in a default value, so that we can still use the existing data and alert accordingly. Usually 0 is a good...
[14:13:17] <elukey>	 aaaand stat1007 done!
[14:13:31] <elukey>	 still not perfect but happy about the resul
[14:13:35] <elukey>	 *result
[14:18:00] <ottomata>	 :D
[14:21:39] <wikibugs>	 Analytics, DC-Ops, Operations, ops-eqiad: (Need by: TBD) rack/setup/install an-druid100[12] and druid100[78] - https://phabricator.wikimedia.org/T245569 (Cmjohnson)
[14:21:54] <wikibugs>	 Analytics, DC-Ops, Operations, ops-eqiad: (Need by: TBD) rack/setup/install an-druid100[12] and druid100[78] - https://phabricator.wikimedia.org/T245569 (Cmjohnson) Open→Resolved
[14:53:19] <joal>	 Yay! Back in time :)
[14:53:29] <joal>	 shall we do standup or staff-meeting team?
[14:54:38] <wikibugs>	 Analytics: Decomission notebook hosts - https://phabricator.wikimedia.org/T249752 (elukey) Stalled→Open
[14:54:40] <wikibugs>	 Analytics, Analytics-Kanban: Unify puppet roles for stat and notebook hosts - https://phabricator.wikimedia.org/T243934 (elukey)
[14:59:31] <milimetric>	 a-team: standup or staff meeting (same q as joal)
[15:00:58] <joal>	 ok - staff meeting :)
[15:01:24] <ottomata>	 oh i guess staff ya
[15:31:52] <wikibugs>	 Analytics, Analytics-Cluster: Monitoring GPU Usage on stat Machines - https://phabricator.wikimedia.org/T251938 (elukey) @Aroraakhil there are two ways of checking metrics:  1) `sudo radeontop` 2) https://grafana.wikimedia.org/d/ZAX3zaIWz/amd-rocm-gpu?orgId=1  rocm-smi is unfortunately a python script th...
[15:33:21] <wikibugs>	 Analytics, good first task: Javascript-less Wikistats - https://phabricator.wikimedia.org/T251979 (Milimetric) p:Triage→Medium Cool would be some kind of Server Side Rendering (even snapshot) :)  Which I'm for.  Let's do it.
[15:34:47] <wikibugs>	 Analytics, Analytics-Cluster: Monitoring GPU Usage on stat Machines - https://phabricator.wikimedia.org/T251938 (Milimetric) Open→Resolved a:Milimetric
[15:35:22] <wikibugs>	 Analytics, Analytics-Cluster: Monitoring GPU Usage on stat Machines - https://phabricator.wikimedia.org/T251938 (Milimetric) Documented on wikitech https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/AMD_GPU
[15:38:34] <wikibugs>	 Analytics, Analytics-Kanban: Tune up thresholds of data quality hourly alarms - https://phabricator.wikimedia.org/T251814 (Milimetric) p:Triage→High
[15:39:50] <wikibugs>	 Analytics, Analytics-Cluster, Analytics-Wikistats: Add proper trend numbers to wikistats metrics - https://phabricator.wikimedia.org/T251813 (Milimetric) p:Triage→Medium
[15:40:12] <wikibugs>	 Analytics, Analytics-EventLogging, MediaWiki-Vagrant: eventlogging vagrant role: 'ParsedRequirement' object has no attribute 'req' - https://phabricator.wikimedia.org/T251864 (Milimetric) Open→Declined We're putting most effort on MEP and the new flow.  This is the python side of EventLogging...
[15:40:49] <wikibugs>	 Analytics, Analytics-Kanban: Troubleshoot EventLogging sanitization immediate - https://phabricator.wikimedia.org/T251794 (Milimetric) p:Triage→High
[15:42:19] <wikibugs>	 Analytics, Analytics-Kanban: Add folder creation for sqoop initial installation in puppet - https://phabricator.wikimedia.org/T251788 (Milimetric) p:Triage→High a:fdans Debate: could be the script or puppet that creates the folder.
[15:44:49] <wikibugs>	 Analytics: Cannot see SQL lab tab on UI - https://phabricator.wikimedia.org/T251787 (Milimetric) Open→Resolved p:Triage→High a:elukey The superset admin user bug thing came up again.  Luca re-fixed it.
[15:45:52] <elukey>	 joal: ouch https://yarn.wikimedia.org/cluster/app/application_1583418280867_333836 :(
[15:46:00] <wikibugs>	 Analytics: Creation of canonical pageview dumps for users to download - https://phabricator.wikimedia.org/T251777 (Milimetric) p:Triage→High
[15:46:26] <joal>	 hm
[15:46:38] <joal>	 elukey: too much pressure on cluster I think :(
[15:46:39] <elukey>	 after 8h /o\
[15:46:53] <joal>	 elukey: at the end :)
[15:47:03] <joal>	 elukey: will look into logs to make sure
[15:47:15] <joal>	 elukey: Thanks for the heads up!
[15:48:26] <elukey>	 also Cc: mforns, denormalize failed :( - see yarn link above
[15:48:36] <mforns>	 ok
[15:48:40] <joal>	 elukey: shuffle errors
[15:48:48] <joal>	 plenty of them all along
[15:48:57] <elukey>	 buuuu
[15:49:02] <wikibugs>	 Analytics, Analytics-EventLogging, Product-Analytics: EditAttemptStep sent event with  "ready_timing": -18446744073709543000 - https://phabricator.wikimedia.org/T251772 (Milimetric) +1 @mpopov, the client should never send negative numbers, no matter what the browsers are telling you :)  I'd come up...
[15:49:02] <elukey>	 but what kind of errors? OOM?
[15:49:04] <joal>	 elukey: can we release the patch and roll-restart before restarting the job?
[15:49:10] <elukey>	 sure sure
[15:50:06] <joal>	 elukey: from analytics1047.eqiad.wmnet (at least some)
[15:50:40] <wikibugs>	 Analytics, Analytics-Kanban: Add page_restrictions table to sqoop list - https://phabricator.wikimedia.org/T251749 (Milimetric) a:JAllemandou
[15:51:46] <wikibugs>	 Analytics, Analytics-Kanban: check leftovers of jmorgan - https://phabricator.wikimedia.org/T251600 (Milimetric) a:elukey with honorable mention of @mforns to do any cleanup in HDFS
[15:52:19] <wikibugs>	 Analytics, Analytics-Kanban, Operations, Traffic, Patch-For-Review: Remove North Korea from data quality traffic entropy reports - https://phabricator.wikimedia.org/T251546 (Milimetric) p:Triage→High
[15:52:33] <wikibugs>	 Analytics, Analytics-Kanban: Make anomaly detection correctly handle holes in time-series - https://phabricator.wikimedia.org/T251542 (Milimetric) p:Triage→High
[15:53:35] <joal>	 elukey: the error is java.lang.NullPointerException - this is unexpected :(
[15:54:54] <wikibugs>	 Analytics, Analytics-EventLogging, Product-Analytics: EditAttemptStep sent event with  "ready_timing": -18446744073709543000 - https://phabricator.wikimedia.org/T251772 (Ottomata) Ya I betcha you could add `minimum: 0` to the field.  https://json-schema.org/understanding-json-schema/reference/numeric...
[15:56:20] <wikibugs>	 Analytics, I18n, RTL: Support right-to-left languages in Wikistats - https://phabricator.wikimedia.org/T251376 (Milimetric) p:Triage→Medium I have thoughts about collaborating on this with the more mainstream effort of finding/building a design system for mediawiki (part of the slow migration...
[15:57:54] <wikibugs>	 Analytics, I18n, RTL, good first task: Support right-to-left languages in Wikistats - https://phabricator.wikimedia.org/T251376 (Milimetric)
[15:59:30] <wikibugs>	 Analytics: Change Wikistats UI language without reloading the page - https://phabricator.wikimedia.org/T251375 (Milimetric) p:Triage→High
[16:01:13] <wikibugs>	 Analytics, Better Use Of Data, Product-Analytics: Augment Hive event data with normalized host info from meta.domain - https://phabricator.wikimedia.org/T251320 (Milimetric) p:Triage→High
[16:01:18] <wikibugs>	 Analytics, Analytics-EventLogging, Analytics-Kanban, Event-Platform: All EventGate instances should use EventStreamConfig - https://phabricator.wikimedia.org/T251935 (Ottomata) Open→Declined After discussion the Event Platform Engineering sync yesterday, we all agreed that this is hard an...
[16:01:21] <wikibugs>	 Analytics, Analytics-EventLogging, Analytics-Kanban, Event-Platform, Patch-For-Review: Automate ingestion and refinement into Hive of event data from Kafka - https://phabricator.wikimedia.org/T251609 (Ottomata)
[16:01:23] <wikibugs>	 Analytics, Analytics-EventLogging, Event-Platform, CPT Initiatives (Modern Event Platform (TEC2)), and 2 others: Refactor EventBus mediawiki configuration - https://phabricator.wikimedia.org/T229863 (Ottomata)
[16:01:55] <joal>	 something feels wrong - after 8h the job should be almost finished (or well advanced) - And it seems not so advanced :(
[16:02:00] <wikibugs>	 Analytics, Analytics-Kanban, Better Use Of Data, Event-Platform, Product-Analytics: Augment Hive event data with normalized host info from meta.domain - https://phabricator.wikimedia.org/T251320 (Ottomata)
[16:02:36] <wikibugs>	 Analytics, Analytics-Kanban: Corrupted parquet statistics when querying webrequest data via Superset/Presto - https://phabricator.wikimedia.org/T251231 (Milimetric) a:elukey
[16:02:55] <wikibugs>	 Analytics, Analytics-Kanban: Corrupted parquet statistics when querying webrequest data via Superset/Presto - https://phabricator.wikimedia.org/T251231 (Milimetric) p:Triage→High a:elukey→Nuria
[16:04:21] <wikibugs>	 Analytics, Fundraising-Analysis: Statistics on a CN banner - https://phabricator.wikimedia.org/T251177 (Milimetric)
[16:05:13] <wikibugs>	 Analytics: Decomission notebook hosts - https://phabricator.wikimedia.org/T249752 (Addshore) Will there be any automatic rsync / backup from the notebook hosts for all users? Or is that something I'll have to take care of myself?
[16:05:28] <wikibugs>	 Analytics: Idea: Add 'top X bigger than Y' sanitization method to EL-to-Druid - https://phabricator.wikimedia.org/T251145 (Milimetric) Describe more.  Do you mean per metric, dimension, etc?
[16:06:07] <wikibugs>	 Analytics, Analytics-Kanban, Patch-For-Review: Create anaconda .deb package with stacked conda user envs - https://phabricator.wikimedia.org/T251006 (Milimetric) p:Triage→High
[16:06:17] <wikibugs>	 Analytics: Decomission notebook hosts - https://phabricator.wikimedia.org/T249752 (elukey) >>! In T249752#6116532, @Addshore wrote: > Will there be any automatic rsync / backup from the notebook hosts for all users? > Or is that something I'll have to take care of myself?  Managed by users since it requires...
[16:07:45] <elukey>	 joal: https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/594992/
[16:08:23] <joal>	 elukey: we need another change I think
[16:08:44] <elukey>	 sure which one?
[16:09:37] <joal>	 the change from 53248 to 49152 is about max-allocation for a single container - We need to change the available memory by nodeManager I think (yarn_nodemanager_resource_memory_mb)
[16:09:53] <wikibugs>	 Analytics, EventStreams: EventStreams socket stays connected without any traffic incoming - https://phabricator.wikimedia.org/T250912 (Milimetric) p:Triage→High Thanks for the report, can you please link us to the client code?  This is a python bot?  How often does it happen?  Every time after X...
[16:09:58] <elukey>	 joal: I thought the same but I didn't find any setting about it
[16:10:02] <joal>	 elukey: --^
[16:10:11] <joal>	 elukey: It's by node - maybe a different place?
[16:10:51] <wikibugs>	 Analytics: Anomaly detection alarms for the edit event stream - https://phabricator.wikimedia.org/T250845 (Milimetric) p:Triage→High
[16:11:03] <elukey>	 ah it may be auto-calculated
[16:11:19] <elukey>	 because we have different nodes
[16:11:26] <elukey>	 this is why I didn't find it probably
[16:11:29] <joal>	 could be - I'm interested by the formula (or the place) :)
[16:12:15] <wikibugs>	 Analytics: MEP: canary alarms so we know events are flowing through pipeline - https://phabricator.wikimedia.org/T250844 (Milimetric) p:Triage→Medium
[16:12:26] <wikibugs>	 Analytics: MEP: canary events so we know events are flowing through pipeline - https://phabricator.wikimedia.org/T250844 (Milimetric)
[16:13:19] <wikibugs>	 Analytics: Release a public dataset about percentage of referrers in wikipedia traffic - https://phabricator.wikimedia.org/T250840 (Milimetric) p:Triage→Low
[16:14:23] <joal>	 elukey: passed to hadoop.pp, undefined in (so defined somewhere else)
[16:14:52] <elukey>	 I am confused, I don't find where it gets defined
[16:16:23] <wikibugs>	 Analytics: Unique devices, retrofit with bot detection code - https://phabricator.wikimedia.org/T250744 (Milimetric) p:Triage→Medium Good to let the pageview detection bake for a bit before doing this.
[16:16:31] <wikibugs>	 Analytics: We should get an alarm for partitions that have no data for topics that have data influx at all times, most of the mediawiki.* - https://phabricator.wikimedia.org/T250699 (Milimetric) p:Triage→High
[16:17:04] <elukey>	 ahhhh
[16:17:06] <elukey>	 we set yarn_nodemanager_os_reserved_memory_mb
[16:17:16] <elukey>	 and we have
[16:17:17] <elukey>	     $yarn_nodemanager_resource_memory_mb      = $hadoop_config['yarn_nodemanager_os_reserved_memory_mb'] ? {
[16:17:20] <elukey>	             undef   => undef,
[16:17:23] <elukey>	             default => floor($facts['memorysize_mb']) - $hadoop_config['yarn_nodemanager_os_reserved_memory_mb'],
[16:17:26] <elukey>	     }
[16:17:26] <elukey>	 ok swapping completed :D
[16:17:28] <elukey>	 yes yes
[16:17:39] <joal>	 Thanks elukey <3
[16:17:45] <joal>	 Where is it?
[16:18:18] <wikibugs>	 Analytics: Verify if Superset can authenticate to Druid via TLS/Kerberos - https://phabricator.wikimedia.org/T250487 (Milimetric) p:Triage→Medium
[16:18:26] <elukey>	 it is in hieradata/common.yaml
[16:18:30] <joal>	 Meh
[16:18:33] <joal>	 ok :)
[16:18:39] <wikibugs>	 Analytics: Verify if Turnilo can pull data from Druid using Kerberos/TLS - https://phabricator.wikimedia.org/T250485 (Milimetric) p:Triage→Medium
[16:23:36] <elukey>	 joal: ok updated
[16:23:45] <joal>	 I think the cluster business is at the gist of our problem - There have been retries all along the job
[16:24:13] <joal>	 reading elukey 
[16:24:42] <elukey>	 I am running pcc to see changes
[16:25:25] <joal>	 elukey: dumb idea while we are at it: can you add a comment line 714 to be able to more easily find yarn_nodemanager_resource_memory formula?
[16:25:32] <joal>	 ack elukey 
[16:25:39] <elukey>	 joal: how dare you asking such things
[16:25:41] <elukey>	 :P
[16:25:46] <joal>	 elukey: :)
[16:26:05] <joal>	 Future-me is actually tapping my shoulder, that's why I ask
[16:26:47] <elukey>	 I also changed the value in the wrong place (test cluster)
[16:27:30] <wikibugs>	 Analytics, Analytics-Kanban, Product-Analytics: Technical contributors emerging communities metric definition, thick data - https://phabricator.wikimedia.org/T250284 (Milimetric) p:Triage→High
[16:27:35] <joal>	 Wow good catch elukey !
[16:27:41] <joal>	 Didn't notice elukey 
[16:28:01] <joal>	 I looked at that in the previous patch and did not do it on that one (bad joal)
[16:28:23] <wikibugs>	 Analytics: Kerberos-run-command doesn't work with spark-submit [workaround] - https://phabricator.wikimedia.org/T250161 (Milimetric) p:Triage→High
[16:29:39] <mforns>	 heya elukey and joal can I help with anything?
[16:29:56] <elukey>	 mforns: denormalize kaput :(
[16:30:05] <mforns>	 reading
[16:30:06] <elukey>	 joal: new version ready
[16:30:15] <joal>	 ack elukey 
[16:31:18] <joal>	 +1ed elukey 
[16:31:40] <elukey>	 pcc https://puppet-compiler.wmflabs.org/compiler1003/22398/
[16:31:44] <joal>	 I'm gonna monitor the job tonight
[16:32:49] <elukey>	 joal: ok merging, running puppet and roll restarting
[16:32:52] <elukey>	 ETA 10 mins
[16:32:54] <joal>	 pfff-what a beginning of month
[16:33:23] <joal>	 ack elukey - Thanks again a million - I'm gonna share a virtual beer with you tonight
[16:33:51] <elukey>	 <3
[16:35:05] <joal>	 elukey: That rsync-between-client link is a must have :)
[16:37:22] <elukey>	 !log roll restart of all the nodemanagers on the hadoop cluster to pick up new jvm settings
[16:37:24] <stashbot>	 Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[16:44:30] <elukey>	 joal: https://grafana.wikimedia.org/d/000000585/hadoop?panelId=17&fullscreen&orgId=1&var-datasource=eqiad%20prometheus%2Fanalytics&var-hadoop_cluster=analytics-hadoop&var-worker=All&from=now-3h&to=now
[16:44:39] <elukey>	 so 8g might be more than we need, we can tune it later
[16:44:54] <elukey>	 all done, when you want to restart denormalize please go
[16:46:43] <elukey>	 going afk to do some gardening, checking later! (ping me on the phone if anything explodes)
[16:49:40] <joal>	 !log Restart and babysit mediawiki-history-denormalize-wf-2020-04
[16:49:41] <stashbot>	 Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[16:59:28] <wikibugs>	 Analytics, Analytics-Kanban: Make anomaly detection correctly handle holes in time-series - https://phabricator.wikimedia.org/T251542 (Nuria) >I think we should fill holes with 0s (that's actually the meaning of the hole). The danger of doing that is that you get a stream of data with zeros that indicate...
[17:22:07] <isaacj>	 @analytics-team postpone start of research-analytics meeting for half hour to go to tech monthly? if not, i can start on time and watch tech monthly later
[17:22:39] <joal>	 works for me isaacj 
[17:23:47] <isaacj>	 i'll take that for consensus unless i hear otherwise :)
[17:26:29] * joal makes consensus with himself
[17:29:02] <isaacj>	 the best kind!
[17:34:28] <wikibugs>	 (PS1) Ottomata: bin/camus - check_java_opts should override extra_java_opts [analytics/refinery] - https://gerrit.wikimedia.org/r/595007 (https://phabricator.wikimedia.org/T251609)
[17:35:47] <wikibugs>	 (CR) Ottomata: [V: +2 C: +2] "Merging this and deploying it to stop false positive camus failure report emails." [analytics/refinery] - https://gerrit.wikimedia.org/r/595007 (https://phabricator.wikimedia.org/T251609) (owner: Ottomata)
[17:36:04] <wikibugs>	 Analytics, Analytics-Cluster: Monitoring GPU Usage on stat Machines - https://phabricator.wikimedia.org/T251938 (Aroraakhil) @elukey thanks much for your response. However, none of these monitoring tools give information about the pids of the processes or the number of processes currently using the GPU....
[17:39:12] <ottomata>	 !log deploying fix to refinery bin/camus CamusPartitionChecker when using dynamic stream configs
[17:39:13] <stashbot>	 Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[17:39:33] <wikibugs>	 Analytics: Release a public dataset about percentage of referrers in wikipedia traffic - https://phabricator.wikimedia.org/T250840 (Nuria)
[17:39:50] <joal>	 actually isaacj I'm gonna need to skip the meeting (family need) - Please reach out to me if needed!
[17:40:31] <isaacj>	 joal: :thumbs up:
[17:43:20] <wikibugs>	 Analytics, Analytics-Cluster: Monitoring GPU Usage on stat Machines - https://phabricator.wikimedia.org/T251938 (elukey) @Aroraakhil this is the output of rocm-smi (I executed manually via sudo) by default is:  ` elukey@stat1005:~$ sudo /opt/rocm/bin/rocm-smi    ========================ROCm System Manage...
[17:43:36] * elukey off!
[17:52:15] <wikibugs>	 Analytics, Analytics-EventLogging, MediaWiki-Vagrant: eventlogging vagrant role: 'ParsedRequirement' object has no attribute 'req' - https://phabricator.wikimedia.org/T251864 (Nuria) See: https://phabricator.wikimedia.org/T238230
[17:54:12] <wikibugs>	 Analytics, Analytics-Cluster: Monitoring GPU Usage on stat Machines - https://phabricator.wikimedia.org/T251938 (Aroraakhil) @elukey thanks much for your prompt response. This is what I get from 'nvidia-smi' in our EPFL machine. As you can see it displays the number of processes currently running, and th...
[18:28:43] <wikibugs>	 Analytics: Add a "latest" partition to Hive tables - https://phabricator.wikimedia.org/T252148 (Isaac)
[18:44:00] <wikibugs>	 Analytics: Add a "latest" partition to Hive tables - https://phabricator.wikimedia.org/T252148 (Ottomata) I think this is a cool idea.  @JAllemandou @elukey @Milimetric.  What if every time we added a new Hive Partition, we'd select for the 'latest' one and then add another Hive partition pointing to its loc...
[18:49:17] <wikibugs>	 Analytics, Analytics-Kanban: check leftovers of jmorgan - https://phabricator.wikimedia.org/T251600 (mforns) [x] Removed listed directories from HDFS.
[18:58:11] <wikibugs>	 (PS1) Mforns: Add outreach.wikipedia and incubator.wikipedia to the pageview whitelist [analytics/refinery] - https://gerrit.wikimedia.org/r/595028
[18:59:37] <wikibugs>	 (CR) Mforns: [V: +2 C: +2] "Self-merging to avoid alarms" [analytics/refinery] - https://gerrit.wikimedia.org/r/595028 (owner: Mforns)
[19:38:17] <addshore>	 ottomata: o/ im curious, is there still a simple event logging / event bus -> sql thing?
[19:38:30] <addshore>	 (for use on other wikis)
[19:40:48] <ottomata>	 addshore:  no not really
[19:41:14] <ottomata>	 eventlogging still works, but we aren't targeting sql-like support for third parties
[19:41:24] <ottomata>	 the old stuff should all still work though
[20:10:09] <joal>	 Ah team - forgot to mention - Tomorrow is off in France - I'll keep an eye on ops but will mostly be not available
[20:27:45] <wikibugs>	 (CR) Milimetric: "My sense is that it's too tricky to be very sure about streaming data.  So you run with the best you have at the time, and when you get yo" [analytics/refinery] - https://gerrit.wikimedia.org/r/594719 (https://phabricator.wikimedia.org/T249773) (owner: Milimetric)
[20:29:05] <mforns>	 k joal :]
[20:34:26] <wikibugs>	 Analytics, Analytics-Kanban, Research, Patch-For-Review: Proposed adjustment to wmf.wikidata_item_page_link to better handle page moves - https://phabricator.wikimedia.org/T249773 (Milimetric) > One other thing that I thought of that might speed up the query: I can never remember how snapshot dat...
[20:34:28] <wikibugs>	 (CR) Ottomata: "Hm, I wouldn't call this 'streaming' data though, any more than webrequest is. The data is generated and imported here in pretty much exac" [analytics/refinery] - https://gerrit.wikimedia.org/r/594719 (https://phabricator.wikimedia.org/T249773) (owner: Milimetric)
[21:17:16] <wikibugs>	 (PS3) Milimetric: Use page move events to improve joining to entity [analytics/refinery/source] - https://gerrit.wikimedia.org/r/594428 (https://phabricator.wikimedia.org/T249773)
[21:18:14] <wikibugs>	 (CR) Milimetric: "done testing and vetting, everything looks good except the page_namespace column is now bigint for whatever reason so there's a type misma" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/594428 (https://phabricator.wikimedia.org/T249773) (owner: Milimetric)
[21:18:53] <ottomata>	 byaaaaa
[21:21:53] <milimetric>	 yay, finished vetting.  Now to look at this sqoop...
[21:22:03] <wikibugs>	 (CR) Milimetric: "Hm... I don't know, I think this is just an artifact of how we're using the page move data right now.  I think the idea here is to make it" [analytics/refinery] - https://gerrit.wikimedia.org/r/594719 (https://phabricator.wikimedia.org/T249773) (owner: Milimetric)
[21:22:03] <wikibugs>	 (CR) Milimetric: "In any case, the new job code is vetted, this can be properly reviewed now, and I'll do as you two want, no real preference from me." [analytics/refinery] - https://gerrit.wikimedia.org/r/594719 (https://phabricator.wikimedia.org/T249773) (owner: Milimetric)
[21:33:29] <wikibugs>	 Analytics, Analytics-Kanban, Research, Patch-For-Review: Proposed adjustment to wmf.wikidata_item_page_link to better handle page moves - https://phabricator.wikimedia.org/T249773 (Isaac) @Milimetric thanks for actually verifying my conjecturing around snapshot dates :)  Based on the history for...