[00:02:43] <icinga-wm>	 PROBLEM - Check the last execution of refinery-sqoop-mediawiki-production on an-coord1001 is CRITICAL: CRITICAL: Status of the systemd unit refinery-sqoop-mediawiki-production
[00:20:24] <wikibugs>	 10Analytics, 10Analytics-EventLogging, 10Discovery, 10EventBus, and 2 others: Rewrite Avro schemas (ApiAction, CirrusSearchRequestSet) as JSONSchema and produce to EventGate - https://phabricator.wikimedia.org/T214080 (10Nuria) +1 to @Tgr  specially to "log schemas are fluid while webhook schemas need to b...
[00:30:48] <nuria>	 I think our scooping failed cause it could not write to /wmf/data/raw/mediawiki/tables/logging/snapshot=2019-01 as it already existed  on hdfs
[00:31:55] <nuria>	 milimetric: yt?
[00:37:14] <nuria>	 I am going to delete /wmf/data/raw/mediawiki/tables/logging/snapshot=2019-01 and restart scooping ? ping ottomata or milimetric to see it sounds ok
[00:43:29] <nuria>	 hopefully ahem ... this checks out, if it does not the 2019-01 will be in trash 
[00:45:41] <nuria>	 errors:
[00:45:51] <nuria>	 https://www.irccloud.com/pastebin/2QRbV9Qy/
[00:48:46] <nuria>	 will wait for milimetric a bit, can do same later
[01:06:57] <milimetric>	 nuria: those errors are from today, the sqoop had already finished.  I can take a closer look after bathtime
[01:07:46] <milimetric>	 but we figured the sqoop ran fine, we miscommunicated on the oozie job restarting
[01:08:05] <milimetric>	 it did seem fast so maybe something’s wrong
[01:23:34] <nuria>	 milimetric: see alarm: "Check the last execution of refinery-sqoop-mediawiki-production on an-coord1001 is CRITICAL: CRITICAL: Status of the systemd unit refinery-sqoop-mediawiki-production"
[01:23:47] <nuria>	 milimetric: and dates of errors 2019-02-07T00:00:56
[01:24:32] <nuria>	 milimetric: and directories on hdfs 2019-02-05 22:22 /wmf/data/raw/mediawiki/tables/logging/snapshot=2019-01/wiki_db=tawikisource
[01:24:55] <nuria>	 milimetric: are created before the errors on the sqoop log, right?
[01:54:38] <milimetric>	 nuria: ok, so looking at this now
[01:54:50] <milimetric>	 it seems maybe that sqoop job is wrong, but we all looked at it, I'm surprised
[01:54:53] <milimetric>	 there are 3 jobs now
[01:55:08] <milimetric>	 there's mediawiki, mediawiki-production, and mediawiki-private
[01:55:21] <milimetric>	 the production one is just supposed to sqoop actor and comment, but I guess it's not
[01:57:09] <milimetric>	 the normal one, that sqoops most tables, is good.  I checked revision, archive, logging, etc. and they're all fine
[01:57:17] <milimetric>	 on the other hand, actor and comment haven't been touched
[01:57:24] <milimetric>	 checking logs now
[01:59:39] <milimetric>	 hm, something seems wrong with the systemd timer.  It says pretty clearly "-t actor,comment"
[01:59:46] <milimetric>	 but it must be calling the wrong one or something
[02:07:58] <milimetric>	 ok nuria, this patch is what was wrong: https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/488670/
[02:08:15] <milimetric>	 nuria: I'll trigger the correct sqoop job manually, it's only a little late, should be fine
[02:18:36] <milimetric>	 found another bug, added to change, sqoop is running now
[03:31:34] <wikibugs>	 10Analytics, 10User-Elukey: Restoring the daily traffic anomaly reports - https://phabricator.wikimedia.org/T215379 (10Tbayer) >>! In T215379#4930611, @elukey wrote: > I think that the first step, if this job is important, could be to restore the cronjob under a user's crontab that will actively maintain it. T...
[06:10:33] <wikibugs>	 10Analytics, 10Analytics-Kanban, 10User-Elukey: Sqoop staging.mep_word_persistence to HDFS and drop the table from dbstore1002 - https://phabricator.wikimedia.org/T215450 (10Marostegui) @JAllemandou dbstore1002 crashed, let's start the same thing but with less jobs I would suggest.
[06:10:51] <wikibugs>	 10Analytics, 10Analytics-Kanban, 10Operations, 10Product-Analytics, 10Patch-For-Review: dbstore1002 Mysql errors - https://phabricator.wikimedia.org/T213670 (10Marostegui) dbstore1002 crashed, possibly due to {T215450}
[06:42:29] <icinga-wm>	 RECOVERY - Check the last execution of refinery-sqoop-mediawiki-production on an-coord1001 is OK: OK: Status of the systemd unit refinery-sqoop-mediawiki-production
[06:43:09] <elukey>	 I manually reset the status of --^ since dan is executing the job in screen
[07:19:09] <wikibugs>	 10Analytics, 10Analytics-Kanban, 10User-Elukey: Sqoop staging.mep_word_persistence to HDFS and drop the table from dbstore1002 - https://phabricator.wikimedia.org/T215450 (10JAllemandou) @Marostegui : Sqoop finished yesterday at 23:31 UTC with expected number of rows. Maybe the db crash was unrelated? @Halfa...
[07:22:15] <joal>	 \o/ - manual production-sqooping is done, mediawiki-history job started as expected :)
[07:23:31] <joal>	 Something else to change in the production-sqoop config: use a different logging file than the priovate one
[07:23:35] <joal>	 milimetric: --^
[07:24:05] <wikibugs>	 10Analytics, 10Analytics-Kanban, 10User-Elukey: Sqoop staging.mep_word_persistence to HDFS and drop the table from dbstore1002 - https://phabricator.wikimedia.org/T215450 (10Marostegui) @JAllemandou at what time did you start the job? From what I can see it crashed at around 18:32, so it crashed before then...
[07:26:36] <wikibugs>	 10Analytics, 10Analytics-Kanban, 10User-Elukey: Sqoop staging.mep_word_persistence to HDFS and drop the table from dbstore1002 - https://phabricator.wikimedia.org/T215450 (10JAllemandou) @Marostegui : Indeed I started the job later (comment time is almostsynchronous).
[07:37:45] <wikibugs>	 10Analytics, 10Analytics-Kanban, 10User-Elukey: Sqoop staging.mep_word_persistence to HDFS and drop the table from dbstore1002 - https://phabricator.wikimedia.org/T215450 (10Marostegui) Great - so the crash happened before. Thanks a lot for helping out here :) Let's wait for @Halfak to verify he's got everyt...
[08:05:31] <wikibugs>	 10Analytics, 10Operations, 10SRE-Access-Requests, 10Patch-For-Review: Allow Erik Bernhardson to have root access on stat1005 for GPU testing - https://phabricator.wikimedia.org/T215384 (10Joe) I second the idea, and I see @Nuria has given +1 to the patch which I assume can count as manager approval. Given...
[08:42:53] <wikibugs>	 10Analytics, 10Operations, 10SRE-Access-Requests, 10Patch-For-Review: Allow Erik Bernhardson to have root access on stat1005 for GPU testing - https://phabricator.wikimedia.org/T215384 (10Joe) a:03Joe
[08:50:43] * elukey afk for a bit, cat to the vet
[09:47:52] <elukey>	 spent more than one hour fixing dbstore1002's replication
[09:47:55] <elukey>	 what a mess
[09:48:03] <elukey>	 now I really feel Manuel and Jaime
[10:15:26] <elukey>	 still broken
[10:28:28] <elukey>	 joal: o/
[10:43:51] <wikibugs>	 10Analytics, 10Analytics-Kanban, 10User-Elukey: Sqoop staging.mep_word_persistence to HDFS and drop the table from dbstore1002 - https://phabricator.wikimedia.org/T215450 (10Marostegui) I am also mysqldumping that table, just in case.
[10:52:43] <wikibugs>	 10Analytics, 10User-Elukey: Restoring the daily traffic anomaly reports - https://phabricator.wikimedia.org/T215379 (10elukey) This is jdcc's crontab (I thought it was not an active user anymore, I was wrong):  ` 0 15 * * * USER=jdcc /home/jdcc/anaconda3/bin/python /home/jdcc/project_monitoring/scripts/check_p...
[10:53:34] <elukey>	 addshore: o/
[10:53:38] <addshore>	 HI
[10:53:50] <elukey>	 would you like to be the first one to move to the new dbstores? :D
[10:54:04] <addshore>	 i could certainly think about trying it out today
[10:55:03] <elukey>	 I have updated https://wikitech.wikimedia.org/wiki/Analytics/Data_access#MariaDB_replicas with some notes
[10:55:12] <elukey>	 about the new cnames etc..
[10:55:17] <addshore>	 thanks!
[10:55:24] <elukey>	 dbstore1002 is getting broken and broken
[10:56:28] <addshore>	 :D
[11:59:27] <wikibugs>	 10Analytics, 10Analytics-Kanban, 10Operations, 10Product-Analytics, 10Patch-For-Review: dbstore1002 Mysql errors - https://phabricator.wikimedia.org/T213670 (10Marostegui) @elukey @jcrespo Any objection to put dbstore1002 as IDEMPOTENT? This host crashes every single day, the data is already drifts a lot...
[12:00:17] <wikibugs>	 10Analytics, 10Discovery-Search, 10Multimedia, 10Research, and 2 others: Image Classification Working Group - https://phabricator.wikimedia.org/T215413 (10Gilles) There is an image classifier worth building that probably wouldn't fall into preexisting bias, which is determining whether an image is a photog...
[12:04:44] <wikibugs>	 10Analytics, 10Analytics-Kanban, 10Operations, 10Product-Analytics, 10Patch-For-Review: dbstore1002 Mysql errors - https://phabricator.wikimedia.org/T213670 (10jcrespo) ok to me, data is already garbage, more garbage would not be a problem :-)
[12:07:16] <wikibugs>	 10Analytics, 10Analytics-Kanban, 10User-Elukey: Sqoop staging.mep_word_persistence to HDFS and drop the table from dbstore1002 - https://phabricator.wikimedia.org/T215450 (10Marostegui) @elukey the .sql file is at: `dbstore1003:/srv/tmp.staging/staging.sql` Can you grab it and store it somewhere else, so we...
[12:14:30] <wikibugs>	 10Analytics, 10Operations, 10Research, 10serviceops, and 4 others: Transferring data from Hadoop to production MySQL database - https://phabricator.wikimedia.org/T213566 (10Joe)
[12:18:42] <wikibugs>	 10Analytics, 10Analytics-Kanban, 10User-Elukey: Sqoop staging.mep_word_persistence to HDFS and drop the table from dbstore1002 - https://phabricator.wikimedia.org/T215450 (10JAllemandou) I suggest using `hdfs://wmf/data/archive/sqldumps` as a base for sql-dumps, with content-oriented subfolders, leading to `...
[12:34:31] <wikibugs>	 10Analytics, 10Analytics-Kanban, 10User-Elukey: Sqoop staging.mep_word_persistence to HDFS and drop the table from dbstore1002 - https://phabricator.wikimedia.org/T215450 (10elukey) I am moving the dump to stat1004, then I'll move it to the HDFS dir that Joseph proposed. Will update the task once done :)
[12:53:48] * elukey lunch!
[13:05:22] <miriam_>	 ciao analytics! I have a question: I recall someone mentioned that ores revision scores were going to be available on hive? Or is it just something I really hoped for and I modified my memories accordingly?
[13:05:26] <miriam_>	 :)
[13:07:23] <joal>	 Hi miriam_ - Indeed you can get ores revision scores in hive (or even better, Spark)
[13:08:04] <miriam_>	 joal: ohh this is amazing news :) which tables should I look at?
[13:10:11] <joal>	 miriam_: You'll have only new stuff as data is flowing in through events (revisions are flowing in from 2018-12)
[13:10:55] <joal>	 miriam_: actually, the reason for which data is available back for 3 month only is because of the data-rentention policy I guess
[13:12:52] <joal>	 miriam_: he table you should use is event.mediawiki_revision_score
[13:13:44] <joal>	 miriam_: Also, the data being not that big (12G), I really suggest using spark :)
[13:13:47] <miriam_>	 joal oh, I see, my data is from september, so I won't be able to use it this time but this is very good to know for the future!
[13:14:53] <miriam_>	 yes makes sense to use spark :)
[13:15:03] <miriam_>	 thanks joal!
[13:15:09] <joal>	 np miriam_ 
[13:15:15] <miriam_>	 merci :)
[13:15:23] <joal>	 hehe
[13:18:32] <wikibugs>	 10Analytics, 10Analytics-Kanban, 10Operations, 10Product-Analytics, 10Patch-For-Review: dbstore1002 Mysql errors - https://phabricator.wikimedia.org/T213670 (10JAllemandou) sqoop for actor and comment tables just finished and we should use the new hardware next month, ,so no problem fir me either :)
[13:26:14] <edsanders>	 Hey, I'm having problems running queries in Hue...
[13:26:46] <edsanders>	 "Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask"
[13:34:34] <joal>	 Hi edsanders - unfortunately error messages from hue are not informative most of the time :(
[13:34:44] <joal>	 edsanders: con you tell me more about what you;re doing?
[13:37:41] <wikibugs>	 (03PS1) 10Joal: Add change_tag and change_tag_def to sqoop script [analytics/refinery] - 10https://gerrit.wikimedia.org/r/488927 (https://phabricator.wikimedia.org/T205940)
[13:40:49] <milimetric>	 joal: aaah, can't believe we didn't catch all of these the first time, I must be more tired than I thought
[13:42:10] <joal>	 milimetric: too many breaking stuff at one - we're getting back on track, let's even try to have stuff cleaner than it was before ;)
[13:42:39] <milimetric>	 oh, at least the private log file I think was on purpose
[13:42:48] <joal>	 Ah ok :)
[13:42:57] <milimetric>	 I remember thinking it was ok to have it in the private one, I didn't make variables for a different log
[13:43:02] <milimetric>	 do you prefer it different?
[13:45:28] <wikibugs>	 (03CR) 10Milimetric: "one small syntax thing, then merging" (031 comment) [analytics/refinery] - 10https://gerrit.wikimedia.org/r/488927 (https://phabricator.wikimedia.org/T205940) (owner: 10Joal)
[13:45:46] <milimetric>	 joal: do you prefer different log files?
[13:46:16] <joal>	 milimetric: given the 2 jobs (private and prioduction) are different, I think it makes sense to have 2 different files
[13:46:39] <joal>	 also milimetric, commented on T154370
[13:46:40] <stashbot>	 T154370: Create script for moving orphaned revisions to the archive table - https://phabricator.wikimedia.org/T154370
[13:47:24] <wikibugs>	 (03PS2) 10Joal: Add change_tag and change_tag_def to sqoop script [analytics/refinery] - 10https://gerrit.wikimedia.org/r/488927 (https://phabricator.wikimedia.org/T205940)
[13:47:46] <wikibugs>	 (03CR) 10Joal: Add change_tag and change_tag_def to sqoop script (031 comment) [analytics/refinery] - 10https://gerrit.wikimedia.org/r/488927 (https://phabricator.wikimedia.org/T205940) (owner: 10Joal)
[13:48:35] <wikibugs>	 (03CR) 10Milimetric: [V: 03+2 C: 03+2] Add change_tag and change_tag_def to sqoop script [analytics/refinery] - 10https://gerrit.wikimedia.org/r/488927 (https://phabricator.wikimedia.org/T205940) (owner: 10Joal)
[13:48:51] <joal>	 thanks milimetric --^
[13:49:17] <milimetric>	 joal: nice!!!  How did you find that task?!
[13:49:33] <joal>	 milimetric: phabricator-search
[13:49:44] <joal>	 milimetric: I must say I have been lucky ;)
[13:50:15] <milimetric>	 you are a genius kind of lucky, jo
[13:50:28] <joal>	 mwarf ;)
[13:51:50] <joal>	 milimetric: we're get settled on mediawiki-history soon - only 2 stages to go
[13:52:27] <milimetric>	 cool.  I mean, it would be weird if something happened, I tested it multiple times
[13:52:54] <joal>	 milimetric: I'm just eager to tick the line and deploy the new snapshot ;)
[13:56:38] <milimetric>	 for sure
[13:56:44] <milimetric>	 ok, logfile split is up for review
[13:56:49] <milimetric>	 gonna go have breakfast and stuff
[14:03:43] <elukey>	 merged and deployed
[14:07:16] <joal>	 thanks elukey :)
[14:08:49] <elukey>	 joal: I am adding the .sql file to /wmf/data/archive/backup/misc/dbstore1002_backup/staging_dbstore1002_07022019.sql since all the others are already there
[14:08:59] <elukey>	 but I just realized that naming is not optimal
[14:09:04] <elukey>	 too many "backup" etc..
[14:09:19] <elukey>	 so after the copy we can rename as we thing it's best
[14:09:28] <elukey>	 is it ok?
[14:09:37] <joal>	 elukey: is that file is only for mep_word_persistence table, I bet we should rename it :)
[14:10:24] <elukey>	 yes yes it is similar to how manuel named it, so but I realized after starting the copy
[14:10:32] <elukey>	 shouldn't be a big deal to rename
[14:10:51] <joal>	 elukey: for sure no, but let's do it now while we still remember what it's about :)
[14:12:09] <elukey>	 joal: I think it would be fun to discover it after some months
[14:12:13] <elukey>	 you ruin all the fun
[14:12:19] <joal>	 Mwahahahaha :D
[14:12:53] <joal>	 elukey: knowing it might be part of the *I* could be doing, I'm in the mood to ask for a rename, now :)
[14:12:57] <elukey>	 joal: do you have a minute today for the hive db drop stuff?
[14:13:29] <joal>	 elukey: Yessir, now?
[14:13:47] <elukey>	 sure!
[14:25:16] <wikibugs>	 10Analytics, 10Operations, 10Research, 10serviceops, and 4 others: Transferring data from Hadoop to production MySQL database - https://phabricator.wikimedia.org/T213566 (10akosiaris) How is the data going to make it from Hadoop, which resides in the analytics cluster and is firewalled at the router level...
[14:29:01] <wikibugs>	 10Analytics, 10Operations, 10Research, 10serviceops, and 4 others: Transferring data from Hadoop to production MySQL database - https://phabricator.wikimedia.org/T213566 (10Ottomata) > How is the data going to make it from Hadoop, which resides in the analytics cluster and is firewalled at the router level...
[14:33:44] <wikibugs>	 10Analytics, 10Operations, 10Research, 10serviceops, and 4 others: Transferring data from Hadoop to production MySQL database - https://phabricator.wikimedia.org/T213566 (10bmansurov) >>! In T213566#4934832, @akosiaris wrote: > Is it just a `LOAD DATA INFILE "something.tsv"` or is it something more complex...
[14:34:41] <wikibugs>	 10Analytics, 10Analytics-EventLogging, 10Discovery, 10EventBus, and 2 others: Rewrite Avro schemas (ApiAction, CirrusSearchRequestSet) as JSONSchema and produce to EventGate - https://phabricator.wikimedia.org/T214080 (10Ottomata) > If we were to use Monolog, we'd likely want to do it with the aim of conve...
[14:36:07] <wikibugs>	 10Analytics, 10Discovery: Workflow to be able to move data files computed in jobs from analytics cluster to production - https://phabricator.wikimedia.org/T213976 (10Ottomata) @fgiunchedi ping on this...or can you ping someone who might know more about using Swift for this kind of thing?  Should we consider th...
[14:36:58] <wikibugs>	 10Analytics, 10Discovery: Workflow to be able to move data files computed in jobs from analytics cluster to production - https://phabricator.wikimedia.org/T213976 (10Ottomata)
[14:41:17] <wikibugs>	 10Analytics, 10Operations, 10Research, 10serviceops, and 4 others: Transferring data from Hadoop to production MySQL database - https://phabricator.wikimedia.org/T213566 (10akosiaris) >>! In T213566#4934835, @Ottomata wrote: >> How is the data going to make it from Hadoop, which resides in the analytics cl...
[14:44:32] <wikibugs>	 10Analytics, 10Operations, 10Research, 10serviceops, and 4 others: Transferring data from Hadoop to production MySQL database - https://phabricator.wikimedia.org/T213566 (10bmansurov) > That does look simple enough and not resource expensive on mwmaint1002. I guess it can fit in there as well? But a VM is...
[14:56:44] <wikibugs>	 10Analytics, 10Operations, 10Research, 10serviceops, and 4 others: Transferring data from Hadoop to production MySQL database - https://phabricator.wikimedia.org/T213566 (10Ottomata) > they will also not allow them to send the SYN/ACK packet required for the second (of the three) phase of the TCP handshake...
[14:57:14] <wikibugs>	 10Analytics, 10Analytics-Kanban, 10User-Elukey: Sqoop staging.mep_word_persistence to HDFS and drop the table from dbstore1002 - https://phabricator.wikimedia.org/T215450 (10elukey) Naming is not great (yet) but the backup is on HDFS:  ` elukey@stat1004:~$ sudo -u hdfs hdfs dfs -ls /wmf/data/archive/backup/m...
[14:58:15] <edsanders>	 joal: just trying to run any query with a "group by" clause
[14:58:27] <edsanders>	 I can run simple queries such as "select * from foo limit 1"
[14:58:54] <joal>	 edsanders: might depend on the table as well
[14:59:34] <edsanders>	 so this query which I copied off a phab task:
[14:59:34] <edsanders>	 select
[14:59:34] <edsanders>	     to_date(dt) as date,
[14:59:34] <edsanders>	     count(*) as events
[14:59:34] <edsanders>	 from editattemptstep
[14:59:34] <edsanders>	 where year = 2018 and month=12
[14:59:34] <edsanders>	 group by to_date(dt)
[14:59:50] <edsanders>	 in event_sanitized
[15:00:20] <wikibugs>	 10Analytics, 10Analytics-Kanban, 10User-Elukey: Sqoop staging.mep_word_persistence to HDFS and drop the table from dbstore1002 - https://phabricator.wikimedia.org/T215450 (10Marostegui) Thanks @elukey - ok to delete the .sql file I created from dbstore1003?
[15:00:57] <wikibugs>	 10Analytics, 10Analytics-Kanban, 10User-Elukey: Sqoop staging.mep_word_persistence to HDFS and drop the table from dbstore1002 - https://phabricator.wikimedia.org/T215450 (10elukey) >>! In T215450#4934900, @Marostegui wrote: > Thanks @elukey - ok to delete the .sql file I created from dbstore1003?  +1
[15:02:55] <joal>	 edsanders: the query worked for me in CLI - Might be related to reading rights
[15:02:56] <wikibugs>	 10Analytics, 10Analytics-Kanban, 10User-Elukey: Sqoop staging.mep_word_persistence to HDFS and drop the table from dbstore1002 - https://phabricator.wikimedia.org/T215450 (10Marostegui) Done - thank you
[15:03:48] <edsanders>	 yeah - my googling suggested it was a rights issue
[15:03:50] <joal>	 edsanders: I need to go errand for a ~2h, I'll be back after that - something else: ou should change the query month to use 2019-01, it contains data while 2018-12 doesn't :)
[15:04:13] <joal>	 elukey: if you have a minute, can you check with edsanders his LDAP groups and all?
[15:04:21] <joal>	 gone for kids team - see you at standup
[15:04:56] <elukey>	 sure
[15:05:28] <elukey>	 what is it needed from LDAP?
[15:11:51] <wikibugs>	 10Analytics: Check home leftovers of user imarlier (Ian Marlier) - https://phabricator.wikimedia.org/T213702 (10elukey) Everything cleaned up, thanks all for the feedback!
[15:12:02] <wikibugs>	 10Analytics, 10Analytics-Kanban: Check home leftovers of user imarlier (Ian Marlier) - https://phabricator.wikimedia.org/T213702 (10elukey)
[15:12:23] <wikibugs>	 10Analytics, 10Analytics-Kanban: Check home leftovers of user imarlier (Ian Marlier) - https://phabricator.wikimedia.org/T213702 (10elukey) a:03elukey
[15:25:51] <elukey>	 updated https://wikitech.wikimedia.org/wiki/Analytics/Ops_week#Have_any_users_left_the_Foundation? with info about how to drop a hive database
[15:25:55] <elukey>	 (credits to Joseph)
[15:28:23] <ottomata>	 NICE elukey and joal! :)
[15:28:42] <ottomata>	 btw mforns wanted to say thanks for all the recent wikitech docs too (HiveToDruid, etc.) those are great!@
[15:37:45] <wikibugs>	 10Analytics, 10User-Elukey: Restoring the daily traffic anomaly reports - https://phabricator.wikimedia.org/T215379 (10Jdcc-berkman) I only setup the reports to be dumped to disk. I didn't know they were getting emailed out, so that must have been handled somewhere else.
[15:56:27] <wikibugs>	 10Analytics, 10Analytics-Kanban, 10User-Elukey: Sqoop staging.mep_word_persistence to HDFS and drop the table from dbstore1002 - https://phabricator.wikimedia.org/T215450 (10Halfak) Confirmed that they have the same number of rows (SELECT COUNT(*) FROM mep_word_persistence;).  I confirmed that I can run a us...
[15:57:14] <wikibugs>	 10Analytics, 10Analytics-Kanban, 10User-Elukey: Sqoop staging.mep_word_persistence to HDFS and drop the table from dbstore1002 - https://phabricator.wikimedia.org/T215450 (10Marostegui) Thanks @Halfak, can we drop the table from dbstore1002 then?
[15:57:15] <edsanders>	 joal: yup, just noticed that, although still fails with the correct month
[16:09:15] <wikibugs>	 10Analytics, 10Analytics-Kanban, 10User-Elukey: Sqoop staging.mep_word_persistence to HDFS and drop the table from dbstore1002 - https://phabricator.wikimedia.org/T215450 (10Halfak) Yes.  Thanks for your patience.
[16:29:29] <wikibugs>	 10Analytics, 10EventBus, 10Parsoid, 10Research, and 5 others: Surface link changes as a stream - https://phabricator.wikimedia.org/T214706 (10bmansurov)
[16:50:42] <wikibugs>	 10Analytics, 10Analytics-Kanban, 10User-Elukey: Convert Aria/Tokudb tables to InnoDB on dbstore1002 - https://phabricator.wikimedia.org/T213706 (10Marostegui) I finished converting `mep_word_persistence` to InnoDB on dbstore1003: ` (1 day 9 hours 52 min 45.49 sec) `  This was obviously done before we knew it...
[16:52:38] <wikibugs>	 10Analytics, 10EventBus, 10Research, 10Wikidata, and 4 others: Surface link changes as a stream - https://phabricator.wikimedia.org/T214706 (10bmansurov)
[16:53:57] <wikibugs>	 10Analytics, 10Analytics-Kanban, 10User-Elukey: Sqoop staging.mep_word_persistence to HDFS and drop the table from dbstore1002 - https://phabricator.wikimedia.org/T215450 (10JAllemandou) Thanks @Halfak for the double check :) Also, If you like hive speed, try Spark :D Massive thanks again to @Marostegui for...
[16:56:48] <wikibugs>	 10Analytics, 10Analytics-Kanban, 10User-Elukey: Sqoop staging.mep_word_persistence to HDFS and drop the table from dbstore1002 - https://phabricator.wikimedia.org/T215450 (10Marostegui) >>! In T215450#4935181, @Halfak wrote: > Yes.  Thanks for your patience.   Thank you! I will kill this table tomorrow and c...
[16:59:32] <joal>	 ottomata: I confirm the cast fails from struct to maps, even with correctly typed structs
[16:59:43] <joal>	 ottomata: Double-reading is the only way I can think of
[17:02:03] <ottomata>	 does double reading even work though?
[17:02:05] <wikibugs>	 10Analytics, 10EventBus, 10Research, 10Wikidata, and 4 others: Surface link changes as a stream - https://phabricator.wikimedia.org/T214706 (10bmansurov) p:05Triage→03Normal
[17:02:24] <joal>	 ottomata: not even sure no, but worth a try :)
[17:02:40] <nuria>	 elukey: standdup?
[17:04:44] <joal>	 \o/ - mediawiki-history-check worked :)
[17:05:12] <wikibugs>	 10Analytics, 10Analytics-Kanban, 10Patch-For-Review: Reportupdater queries jobs failing - https://phabricator.wikimedia.org/T213219 (10Nuria)
[17:05:14] <wikibugs>	 10Analytics, 10Analytics-Kanban: Reportupdater should alert if it fails over and over - https://phabricator.wikimedia.org/T213309 (10Nuria) 05Open→03Resolved
[17:05:34] <joal>	 dependent jobs have kicked in, seems it's working :)
[17:10:45] <joal>	 edsanders: we're in standup, we'll discuss what could be wrong with your access to the data - Will report after
[17:25:58] <edsanders>	 thanks!
[17:36:49] <milimetric>	 edsanders: you need to be in the analytics-privatedata group, and for that you need to file a task with Ops-Access-Requests
[17:41:45] <edsanders>	 milimetric: thanks, is there a template?
[17:42:03] <wikibugs>	 10Analytics, 10Discovery-Search, 10Multimedia, 10Research, and 2 others: Image Classification Working Group - https://phabricator.wikimedia.org/T215413 (10Isaac) If we go down that pathway of trying to identify what images are photographs, we should look into work by a former colleague of mine on detecting...
[17:42:20] <milimetric>	 edsanders: I'll try to find an example, I don't think there's a template, it's just like "please give me access to this group, this is why, here's my manager CC-ed for approval"
[17:42:58] <edsanders>	 I think I already did something similar that gave me access to the stats servers
[17:44:42] <edsanders>	 I can access the old mysql databases on stat1005.eqiad.wmnet
[17:56:38] <elukey>	 edsanders: you shouldn't be able to log in into stat1005, is it the case?
[17:56:50] <elukey>	 (it has been deprecated in favor of stat1007 a while ago)
[18:00:25] <wikibugs>	 10Analytics, 10Operations, 10ops-eqiad, 10Patch-For-Review, 10User-Elukey: rack/setup/install labsdb1012.eqiad.wmnet - https://phabricator.wikimedia.org/T215231 (10elukey) Since this host is important for the Analytics team, I'd be up to take over from the OS install perspective to remove some work from...
[18:03:20] <wikibugs>	 10Analytics, 10Performance-Team (Radar): [Bug] Type mismatch between NavigationTiming EL schema and Hive table schema - https://phabricator.wikimedia.org/T214384 (10fdans) p:05Triage→03Low
[18:06:53] <wikibugs>	 10Analytics: Aggregate pageviews to Wikidata entities - https://phabricator.wikimedia.org/T215438 (10fdans) p:05Triage→03Low
[18:08:25] <wikibugs>	 10Analytics, 10Growth-Team, 10Product-Analytics: Revisions missing from mediawiki_revision_create - https://phabricator.wikimedia.org/T215001 (10fdans) p:05Triage→03High
[18:10:42] <wikibugs>	 10Analytics, 10Performance-Team (Radar): [Bug] Type mismatch between NavigationTiming EL schema and Hive table schema - https://phabricator.wikimedia.org/T214384 (10Milimetric) @Krinkle: refined data was down-cast and is going to remain incorrect unless we re-ingest.  We have raw data going back 90 days [1], s...
[18:12:37] <wikibugs>	 10Analytics, 10Performance-Team (Radar): [Bug] Type mismatch between NavigationTiming EL schema and Hive table schema - https://phabricator.wikimedia.org/T214384 (10Ottomata) Eeee I don't know if that is a good idea... to treat all data as doubles.  The right solution is to create the table from the JSONSchema...
[18:14:03] <wikibugs>	 10Analytics: Replace analytics mailto link in  analytics.wikimedia.org - https://phabricator.wikimedia.org/T215362 (10fdans) p:05Triage→03Normal
[18:14:07] <wikibugs>	 10Analytics, 10Analytics-Kanban: update mw scooping to be able to scoop from new db cluster - https://phabricator.wikimedia.org/T215290 (10fdans) p:05Triage→03High
[18:15:53] <wikibugs>	 10Analytics, 10Analytics-Kanban: update mw scooping to be able to scoop from new db cluster - https://phabricator.wikimedia.org/T215290 (10fdans)
[18:15:58] <wikibugs>	 10Analytics, 10Analytics-Kanban: Update reportupdater to be able to query the new db cluster that will substitute 1002 - https://phabricator.wikimedia.org/T215289 (10fdans) p:05Triage→03High
[18:16:04] <wikibugs>	 10Analytics, 10Analytics-Wikistats: There are two entries for Cantonese language - https://phabricator.wikimedia.org/T215139 (10fdans) p:05Triage→03High
[18:16:38] <wikibugs>	 10Analytics, 10Wikimedia-Stream: EventStreams returns 502 errors from outside the WMF network - https://phabricator.wikimedia.org/T215013 (10fdans) @Ottomata are there any actionables for us here?
[18:17:56] <wikibugs>	 10Analytics, 10Analytics-Kanban, 10User-Elukey: Convert Aria/Tokudb tables to InnoDB on dbstore1002 - https://phabricator.wikimedia.org/T213706 (10fdans)
[18:17:58] <wikibugs>	 10Analytics, 10Analytics-Kanban, 10User-Elukey: Sqoop staging.mep_word_persistence to HDFS and drop the table from dbstore1002 - https://phabricator.wikimedia.org/T215450 (10fdans) 05Open→03Resolved
[18:19:08] <elukey>	 ottomata: I forgot to ask yesterday about Spark 1.x deprecation! :)
[18:19:14] <elukey>	 I think that we should be ready right?
[18:19:17] <edsanders>	 elukey: yes, I can log into stat1007, not stat1005
[18:19:23] <elukey>	 ack thanks :)
[18:19:44] <elukey>	 for a moment I was like "did I mess with access for that host?" :D
[18:19:57] <edsanders>	 do I still need to request more access?
[18:20:27] <elukey>	 for analytics-privatedata? I think so, I didn't follow the conversation but it shouldn't change
[18:23:02] <wikibugs>	 10Analytics, 10EventBus: Spike: Can Refine handle map types if Hive Schema already exists with map fields? - https://phabricator.wikimedia.org/T215442 (10fdans) a:03JAllemandou
[18:23:13] <wikibugs>	 10Analytics, 10Analytics-Kanban, 10EventBus: Spike: Can Refine handle map types if Hive Schema already exists with map fields? - https://phabricator.wikimedia.org/T215442 (10fdans)
[18:26:49] <milimetric>	 ottomata: agreed that eventually refine should be schema-aware, I always liked that better.  But in the meantime, we need to solve this problem, we're butchering data
[18:32:44] <wikibugs>	 10Analytics, 10User-Elukey: Restoring the daily traffic anomaly reports - https://phabricator.wikimedia.org/T215379 (10Nuria) @Slaporte: if these scripts are important let's please find a person responsible for maintaining them. They are very sparsely documented and as you can see it is not even clear who setu...
[18:33:41] * elukey ofF!
[18:59:04] <ottomata>	 milimetric:  we're butchering data only in a few cases, right?
[18:59:12] <ottomata>	 we could precreate schemas for ones we're worried about
[18:59:26] <ottomata>	 its only if the schema gets created incorretly during the first refine (or addition of a field)
[19:00:09] <ottomata>	 joal: would it be possible to move anlaytics systems hangtime to before standup on thursdays?  
[19:00:12] <ottomata>	 i see you have a busy block there...
[19:09:58] <joal>	 ottomata: I have the kids before standup, I won't make it :(
[19:11:56] <ottomata>	 aye ok hmmm
[19:12:31] <ottomata>	 leila: so yaa if we want joseph to come (which we do!) we can't do it until later
[19:15:45] <joal>	 ottomata, leila: could we do it on fridays, after analytics standup?
[19:16:32] <joal>	 There seem to be a free slot for both of you
[19:17:19] <ottomata>	 neither of us like friday meetings :)
[19:17:30] <ottomata>	 i am off every other friday too so sometimes it might not work
[19:17:36] <joal>	 makes sense - i don't either, but he
[19:17:42] <joal>	 Ah right forgot about that
[19:17:45] <joal>	 hm
[19:18:03] <joal>	 Could be on wednesdays?
[19:21:13] <ottomata>	 not before think not before standup, others have meeting then
[19:30:42] <joal>	 ottomata: I'm out of suggestions :(
[19:32:06] <joal>	 Nettrom: Good evening - Are you by any chance still around?
[19:33:36] <Nettrom>	 joal: I am, but about to have a meeting :(
[19:34:16] <joal>	 Nettrom: no problem, no emergency - Let's see if I'll still have a question tomorrow ;)
[19:34:34] <Nettrom>	 joal: sounds good, I should be around tomorrow morning PST, so feel free to ping me then!
[19:34:44] <joal>	 Thanks Nettrom :)
[19:37:38] <joal>	 ottomata: a question for you - Have we closed new artifacts mirroring on archiva?
[19:41:51] <ottomata>	 new artifacts mirroring?
[19:41:56] <joal>	 yessir
[19:42:02] <ottomata>	 not sure what that means
[19:42:30] <joal>	 meaning, if I add a new dependency in a pom depending on archiva, will it be mirrored autmogically?
[19:46:29] <ottomata>	 i think so yes
[19:46:31] <ottomata>	 it should be iirc
[19:46:37] <joal>	 hm
[19:46:46] <joal>	 ok - I have an issue with maven repo maybe then :)
[19:47:56] <joal>	 Ah ! Found it
[19:48:49] <joal>	 ottomata: I'd need to mirror a package located in a different repo than maven-core - namely: 
[19:49:03] <joal>	 https://dl.bintray.com/spark-packages/maven/
[19:49:39] <joal>	 where reside a bunch of spark packages, among which graphframes, that I'd like to update to 0.7.0 (at least 0.6.0)
[19:51:33] <wikibugs>	 (03PS1) 10Joal: Update graphframes to 0.7.0 in refinery-spark [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/489004
[19:51:36] <joal>	 ottomata: here is the pull request --^
[19:51:49] <joal>	 ottomata: the build fails because the jars are not handled by the main repo
[19:51:53] <ottomata>	 ah
[19:51:59] <ottomata>	 yeah we only mirror from maven central and cdh
[19:52:05] <joal>	 makes sense
[19:52:09] <ottomata>	 do you know which packages you need?
[19:52:12] <ottomata>	 can you upload them manually?
[19:52:31] <joal>	 yessir - details in the PR :)
[19:52:47] <ottomata>	 https://wikitech.wikimedia.org/wiki/Archiva#Uploading_dependency_artifacts
[19:53:45] <wikibugs>	 (03CR) 10jerkins-bot: [V: 04-1] Update graphframes to 0.7.0 in refinery-spark [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/489004 (owner: 10Joal)
[19:53:48] <joal>	 Will follow ottomata - Thanks 1
[19:55:21] <ottomata>	 yup!
[19:55:23] <ottomata>	 lemme know if it works
[19:57:36] <joal>	 ottomata: worked like a charm - Many thanks :)
[19:58:53] <wikibugs>	 10Analytics-Kanban: Bump graphframes version to 0.6.0+ - https://phabricator.wikimedia.org/T215547 (10JAllemandou)
[19:59:01] <wikibugs>	 10Analytics-Kanban: Bump graphframes version to 0.6.0+ - https://phabricator.wikimedia.org/T215547 (10JAllemandou) a:03JAllemandou
[19:59:39] <wikibugs>	 (03PS2) 10Joal: Update graphframes to 0.7.0 in refinery-spark [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/489004 (https://phabricator.wikimedia.org/T215547)
[20:02:14] <leila>	 joal: I'm trying to keep my Friday's free from meetings for focus time. 
[20:02:30] <joal>	 leila: I definitely understand :)
[20:02:38] <joal>	 leila: We'll find a time :)
[20:02:56] <leila>	 joal: I'd really appreciate it. Friday is the one day I can fully focus and deep dive.
[20:03:44] <joal>	 leila: Please keep it this way, I'm sure there'll be a time that we can accomodate without too much disturbance
[20:03:57] <leila>	 joal: thanks! :)
[20:07:03] <joal>	 ottomata: added the line you asked for in https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Spark#Spark_tuning_for_big_jobs
[20:07:24] <joal>	 I'll call it done and merge the code patches (mforns +1ed them already)
[20:08:22] <mforns>	 :]
[20:11:27] <wikibugs>	 (03PS3) 10Joal: Update big spark jobs settings [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/482661 (https://phabricator.wikimedia.org/T213525)
[20:11:52] <ottomata>	 thanks joal!
[20:11:58] <wikibugs>	 (03CR) 10Joal: [C: 03+2] "Merging after providing doc." [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/482661 (https://phabricator.wikimedia.org/T213525) (owner: 10Joal)
[20:12:04] <joal>	 Thanks for reading ;)
[20:13:14] <joal>	 I'd also hapilly merge https://gerrit.wikimedia.org/r/c/analytics/refinery/source/+/489004 if any of you gives me a +1 :)
[20:21:52] <wikibugs>	 (03CR) 10Ottomata: [C: 03+1] Update graphframes to 0.7.0 in refinery-spark [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/489004 (https://phabricator.wikimedia.org/T215547) (owner: 10Joal)
[20:22:30] <wikibugs>	 (03Merged) 10jenkins-bot: Update big spark jobs settings [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/482661 (https://phabricator.wikimedia.org/T213525) (owner: 10Joal)
[20:29:55] <wikibugs>	 10Analytics, 10Wikimedia-Stream: EventStreams returns 502 errors from outside the WMF network - https://phabricator.wikimedia.org/T215013 (10Ottomata) Yes, at the least we need to make the aliveness check somehow check from outside prod networks.
[20:30:01] <wikibugs>	 (03CR) 10Joal: [C: 03+2] "Merging for bug-fix. Thanks ottomata" [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/489004 (https://phabricator.wikimedia.org/T215547) (owner: 10Joal)
[20:34:07] <wikibugs>	 10Analytics, 10Analytics-Kanban, 10WMDE-Analytics-Engineering, 10Wikidata, and 3 others: track number of editors from other Wikimedia projects who also edit on Wikidata over time - https://phabricator.wikimedia.org/T193641 (10JAllemandou) I confirm the fix :) Closing this task.
[20:35:55] <wikibugs>	 (03Merged) 10jenkins-bot: Update graphframes to 0.7.0 in refinery-spark [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/489004 (https://phabricator.wikimedia.org/T215547) (owner: 10Joal)
[20:36:18] <wikibugs>	 10Analytics, 10Operations, 10Research, 10serviceops, and 4 others: Transferring data from Hadoop to production MySQL database - https://phabricator.wikimedia.org/T213566 (10Nuria) Ideally I would prefer that stats machines are completely out of the workflow of pushing data to machines like mwmaint1002.eqia...
[20:38:17] <ottomata>	 nuria:  fyi in that ticket ^ stat1007 was just an example of rsync pull from analytics vlan
[20:38:26] <ottomata>	 not suggestion that we would use it
[20:38:43] <ottomata>	 T213976 is where we will figure out how to do this
[20:38:44] <stashbot>	 T213976: Workflow to be able to move data files computed in jobs from analytics cluster to production  - https://phabricator.wikimedia.org/T213976
[20:40:26] <nuria>	 ottomata: ah k, let me look at that one
[20:42:27] <wikibugs>	 10Analytics, 10Discovery, 10Operations, 10Research: Workflow to be able to move data files computed in jobs from analytics cluster to production - https://phabricator.wikimedia.org/T213976 (10Nuria)
[20:43:17] <wikibugs>	 10Analytics, 10Operations, 10Research, 10serviceops, and 4 others: Transferring data from Hadoop to production MySQL database - https://phabricator.wikimedia.org/T213566 (10Nuria) As @Ottomata pointed out more generic discussion about this topic can be found here: https://phabricator.wikimedia.org/T213976
[20:49:19] <wikibugs>	 10Analytics, 10Analytics-Kanban: Update sqoop base script for new analytics-db infra - https://phabricator.wikimedia.org/T215549 (10JAllemandou)
[20:49:50] <joal>	 milimetric, nuria - I just created that --^ (analytics board + kanban)
[20:51:09] <nuria>	 joal: i think is a duplicate of  https://phabricator.wikimedia.org/T215290? 
[20:51:37] <joal>	 It is nuria ! Didn't know about the original one - will close
[20:52:14] <wikibugs>	 10Analytics, 10Analytics-Kanban: update mw scooping to be able to scoop from new db cluster - https://phabricator.wikimedia.org/T215290 (10JAllemandou)
[20:52:16] <wikibugs>	 10Analytics, 10Analytics-Kanban: Update sqoop base script for new analytics-db infra - https://phabricator.wikimedia.org/T215549 (10JAllemandou)
[20:52:55] <wikibugs>	 10Analytics, 10Analytics-Kanban: Test sqooping from the new dedicated labsdb host - https://phabricator.wikimedia.org/T215550 (10JAllemandou)
[20:53:03] <joal>	 nuria: is that one also a dup --^
[20:53:05] <joal>	 ?
[20:55:47] <wikibugs>	 10Analytics, 10EventBus, 10Research, 10Wikidata, and 4 others: Surface link changes as a stream - https://phabricator.wikimedia.org/T214706 (10Pchelolo) We're getting closer. The last part of the puzzle would be to emit the event publically via event streams. In order to do that, we need to add the `mediaw...
[21:01:58] <joal>	 Gone for today team - see you tomorrow
[21:03:44] <mforns>	 byeeeee
[21:23:52] <wikibugs>	 10Analytics, 10User-Elukey: Restoring the daily traffic anomaly reports - https://phabricator.wikimedia.org/T215379 (10Tbayer) >>! In T215379#4934230, @elukey wrote: > This is jdcc's crontab (I thought it was not an active user anymore, I was wrong): Me too (based on T183291) ... >  > ` > 0 15 * * * USER=jdcc...
[21:40:12] <wikibugs>	 10Analytics, 10Product-Analytics: Superset's rolling average feature results in error message - https://phabricator.wikimedia.org/T213488 (10jlinehan) Since the Superset 0.28.1 upgrade ([[ https://phabricator.wikimedia.org/T211605 | T211605 ]]) has run aground for now, I talked with @Nuria about the possibilit...
[22:35:28] <wikibugs>	 (03PS1) 10Ladsgroup: Introduce WikimediaDbSectionMapper based on db-eqiad.php config [analytics/wmde/scripts] - 10https://gerrit.wikimedia.org/r/489097 (https://phabricator.wikimedia.org/T213894)
[22:56:16] <wikibugs>	 10Analytics, 10Product-Analytics: Superset's rolling average feature results in error message - https://phabricator.wikimedia.org/T213488 (10Nuria) p:05High→03Triage
[22:57:02] <wikibugs>	 10Analytics, 10Performance-Team: Plan navtiming data release - https://phabricator.wikimedia.org/T214925 (10Nuria) p:05Low→03Normal
[22:59:36] <wikibugs>	 10Analytics, 10Analytics-EventLogging, 10EventBus, 10Security-Team, and 3 others: Modern Event Platform: Stream Intake Service: AJV usage security review - https://phabricator.wikimedia.org/T208251 (10sbassett) !!**Security Review Summary - February 2019**!! Overall, this looks pretty good to me.  Issues d...
[23:05:05] <wikibugs>	 10Analytics, 10Performance-Team (Radar): [Bug] Type mismatch between NavigationTiming EL schema and Hive table schema - https://phabricator.wikimedia.org/T214384 (10Krinkle) @Milimetric If I understand correctly, you're proposing to change the schema and replace records for which we still have the raw data. Ol...