[00:11:10] (CR) Nuria: Update mediawiki-history comment and actor joins (1 comment) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/480796 (https://phabricator.wikimedia.org/T210543) (owner: Joal)
[07:54:21] Analytics, Analytics-EventLogging, Analytics-Kanban, MW-1.32-notes (WMF-deploy-2018-10-16 (1.32.0-wmf.26)), and 3 others: Spin out a tiny EventLogging RL module for lightweight logging - https://phabricator.wikimedia.org/T187207 (Krinkle) Yep, all good. Hence closed the task. The debugging looks...
[08:25:15] Analytics, Analytics-Kanban, DBA, Patch-For-Review, User-Banyek: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5] - https://phabricator.wikimedia.org/T210478 (Marostegui)
[09:31:38] Analytics, Analytics-Kanban, DBA, Patch-For-Review, User-Banyek: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5] - https://phabricator.wikimedia.org/T210478 (Marostegui)
[09:35:32] Analytics, Analytics-Kanban, Patch-For-Review, User-Elukey: Set up a Analytics Hadoop test cluster in production that runs a configuration as close as possible to the current one. - https://phabricator.wikimedia.org/T212256 (ops-monitoring-bot) Script wmf-auto-reimage was launched by elukey on cu...
[09:35:59] I am currently re-imaging analytics1041 (test cluster) into a druid test single node cluster
[09:36:08] it seemed the quickest option
[09:36:16] (rather than using a ganeti vm)
[10:45:15] ok we have a druid cluster on analytics1041 :)
[10:55:44] Analytics, Analytics-Kanban, Patch-For-Review, User-Elukey: Set up a Analytics Hadoop test cluster in production that runs a configuration as close as possible to the current one. - https://phabricator.wikimedia.org/T212256 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['analytics1041.e...
[11:03:07] Analytics, Operations, Research, Article-Recommendation, and 3 others: Transferring data from Hadoop to production MySQL database - https://phabricator.wikimedia.org/T213566 (Marostegui) Maybe @MoritzMuehlenhoff can give some ideas
[11:31:28] Analytics, Analytics-Kanban, User-Elukey: Convert Aria/Tokudb tables to InnoDB on dbstore1002 - https://phabricator.wikimedia.org/T213706 (Marostegui)
[11:37:04] Analytics, MediaWiki-API, Security, User-Ladsgroup: centralauthtoken should be redacted in logs (including hadoop wmf_raw.apiaction) - https://phabricator.wikimedia.org/T207814 (Ladsgroup) Open→Resolved a:Ladsgroup Given that CentralAuth is not part of the bundle, [[https://gerrit.wik...
[11:37:09] Analytics, MediaWiki-API, Security, User-Ladsgroup: centralauthtoken should be redacted in logs (including hadoop wmf_raw.apiaction) - https://phabricator.wikimedia.org/T207814 (Ladsgroup)
[11:37:52] Analytics, Analytics-Kanban, User-Elukey: Convert Aria/Tokudb tables to InnoDB on dbstore1002 - https://phabricator.wikimedia.org/T213706 (Marostegui) @elukey we need to get rid of TokuDB before importing on the final dbstore hosts. These are the tables that currently run TokuDB on staging: ` +------...
[11:38:24] Analytics, Analytics-Kanban, User-Elukey: Convert Aria/Tokudb tables to InnoDB on dbstore1002 - https://phabricator.wikimedia.org/T213706 (Marostegui)
[11:40:22] * elukey lunch + errand!
[13:07:06] PROBLEM - Check the last execution of reportupdater-interlanguage on stat1007 is CRITICAL: connect to address 10.64.21.118 port 5666: Connection refused
[13:12:22] PROBLEM - Check the last execution of refinery-import-page-history-dumps on stat1007 is CRITICAL: connect to address 10.64.21.118 port 5666: Connection refused
[13:14:08] PROBLEM - Check the last execution of reportupdater-browser on stat1007 is CRITICAL: connect to address 10.64.21.118 port 5666: Connection refused
[13:17:18] RECOVERY - Check the last execution of reportupdater-interlanguage on stat1007 is OK: OK: Status of the systemd unit reportupdater-interlanguage
[13:22:34] RECOVERY - Check the last execution of refinery-import-page-history-dumps on stat1007 is OK: OK: Status of the systemd unit refinery-import-page-history-dumps
[13:24:22] RECOVERY - Check the last execution of reportupdater-browser on stat1007 is OK: OK: Status of the systemd unit reportupdater-browser
[13:31:52] Analytics, Analytics-Kanban, User-Elukey: Convert Aria/Tokudb tables to InnoDB on dbstore1002 - https://phabricator.wikimedia.org/T213706 (Marostegui) I have altered the following tables: ` nettrom_creations_from_revision_sources nettrom_creations_from_revision nettrom_creations_from_page_sources ne...
[13:44:14] Analytics, Analytics-Kanban, DBA, Patch-For-Review, User-Banyek: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5] - https://phabricator.wikimedia.org/T210478 (Marostegui)
[13:50:38] Analytics, Analytics-Kanban, User-Elukey: Convert Aria/Tokudb tables to InnoDB on dbstore1002 - https://phabricator.wikimedia.org/T213706 (Marostegui)
[13:56:22] Analytics, Analytics-Kanban, User-Elukey: Convert Aria/Tokudb tables to InnoDB on dbstore1002 - https://phabricator.wikimedia.org/T213706 (Marostegui) The following Aria tables have been converted to InnoDB on dbstore1002 on the staging database: ` tbayer_test2 tbayer_test1 theodora tgr_uw_terminatin...
[13:57:29] Analytics, Analytics-Kanban, User-Elukey: Convert Aria/Tokudb tables to InnoDB on dbstore1002 - https://phabricator.wikimedia.org/T213706 (Marostegui)
[14:03:52] mmmm really annoying that when stat1007 gets overloaded the report updater alarms fire
[14:13:16] Analytics, Analytics-Kanban, User-Elukey: Convert Aria/Tokudb tables to InnoDB on dbstore1002 - https://phabricator.wikimedia.org/T213706 (elukey) >>! In T213706#4902189, @Marostegui wrote: > I have altered the following tables: > ` > > nettrom_creations_from_revision_sources nettrom_creations_from_...
[14:15:00] PROBLEM - Check the last execution of refinery-import-page-history-dumps on stat1007 is CRITICAL: connect to address 10.64.21.118 port 5666: Connection refused
[14:15:12] yeah stat1007 is not reachable
[14:16:46] PROBLEM - Check the last execution of reportupdater-browser on stat1007 is CRITICAL: connect to address 10.64.21.118 port 5666: Connection refused
[14:22:05] Analytics, Analytics-Kanban, DBA, Patch-For-Review, User-Banyek: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5] - https://phabricator.wikimedia.org/T210478 (Marostegui)
[14:25:14] RECOVERY - Check the last execution of refinery-import-page-history-dumps on stat1007 is OK: OK: Status of the systemd unit refinery-import-page-history-dumps
[14:27:00] RECOVERY - Check the last execution of reportupdater-browser on stat1007 is OK: OK: Status of the systemd unit reportupdater-browser
[14:27:01] joal: ran load and denormalize last night, all good, running checker now, but I have physical therapy this morning, so I'll be back before standup
[14:28:24] (PS2) Fdans: Only use artificial ID if there is no page id in delete events [analytics/refinery/source] - https://gerrit.wikimedia.org/r/484706
[14:28:49] hm, checker failed, will look at this later: https://hue.wikimedia.org/oozie/list_oozie_workflow/0089960-181112144035577-oozie-oozi-W/?coordinator_job_id=0089959-181112144035577-oozie-oozi-C
[14:30:40] Analytics, Analytics-Kanban, Patch-For-Review, User-Elukey: Refactor analytics cronjobs to alarm on failure reliably - https://phabricator.wikimedia.org/T172532 (elukey)
[14:30:52] Analytics, Analytics-Kanban, Patch-For-Review, User-Elukey: Refactor analytics cronjobs to alarm on failure reliably - https://phabricator.wikimedia.org/T172532 (elukey)
[14:40:30] yo a-team anything yall would like me to say in sos?
[14:40:54] not from me :)
[14:44:23] Analytics, Analytics-Kanban, Patch-For-Review, User-Elukey: Update git lfs on stat1006/7 - https://phabricator.wikimedia.org/T214089 (elukey) @Halfak should be done! Can you check and confirm?
[14:48:03] Analytics, Analytics-Kanban, Patch-For-Review, User-Elukey: Update git lfs on stat1006/7 - https://phabricator.wikimedia.org/T214089 (elukey)
[14:52:12] Analytics, Analytics-Kanban, User-Elukey: Convert Aria/Tokudb tables to InnoDB on dbstore1002 - https://phabricator.wikimedia.org/T213706 (Marostegui) All the tokudb tables on `staging` database have been migrated to InnoDB: ` root@DBSTORE[information_schema]> select TABLE_SCHEMA,TABLE_NAME,UPDATE_T...
[14:53:25] Analytics, Analytics-Kanban, User-Elukey: Convert Aria/Tokudb tables to InnoDB on dbstore1002 - https://phabricator.wikimedia.org/T213706 (Marostegui)
[14:55:18] WELL OK elukey I GUESS NO SOS FOR LUCA
[14:55:33] THANK YOU
[15:05:49] xD
[15:06:00] (PS2) Fdans: Change email send workflow to notify of completed jobs [analytics/refinery] - https://gerrit.wikimedia.org/r/484657 (https://phabricator.wikimedia.org/T206894)
[15:16:01] o/ elukey do some people sometimes have issue logging into superset for the first time?
[15:16:32] elukey: afaik https://tools.wmflabs.org/ldap/user/lea-wmde should be able to access superset?
[15:16:49] addshore: o/ they need to reach out to us to be able to create the user, since we have still to do a manual step
[15:16:52] even if the user is in ldap
[15:17:01] amazing, could you do that for the above user? :)
[15:17:38] could you also add https://tools.wmflabs.org/ldap/user/goransm if not already added?
[15:18:37] addshore: https://wikitech.wikimedia.org/wiki/Analytics/Systems/Superset#Access
[15:18:47] is lea-wmde part of wmf/nda ldap groups?
[15:18:50] (one of the two)
[15:18:53] nda yes
[15:19:14] (you can see on the ldap tool page i linked to)
[15:19:32] yep yep didn't check it
[15:19:37] so paperwork seems ok
[15:20:55] do I need to file a phab ticket for a paper trail? or?
[15:21:25] nah I think it is fine
[15:21:30] amazing
[15:21:42] since we give access to a lot of data we need to make sure that there is an nda signed or something
[15:21:48] yuppp
[15:22:09] i just filed a ticket to get lydia added to the nda group so that then I can get her access to superset too
[15:22:31] can you ping me when lea-wmde is done so I can get her to check? :)
[15:24:00] !log added lea-wmde and goransm to Superset
[15:24:01] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[15:24:03] done :)
[15:24:08] amazing
[16:02:40] Analytics, Analytics-Kanban, DBA, Patch-For-Review, User-Banyek: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5] - https://phabricator.wikimedia.org/T210478 (Marostegui)
[16:11:24] Analytics, Research, WMDE-Analytics-Engineering, User-Addshore, User-Elukey: Provide tools for querying MediaWiki replica databases without having to specify the shard - https://phabricator.wikimedia.org/T212386 (elukey) Thanks to Jaime's suggestions and Manuel's patient explanations about se...
[16:19:24] mforns: o/
[16:19:29] hey elukey :]
[16:19:35] you are lucky, I finished this morning the ensure support for eventlogging_to_druid_job
[16:19:37] and related
[16:19:38] :D
[16:19:44] oh! hehe
[16:19:49] didn't even notice that
[16:20:00] awesome :D
[16:20:17] I mean, didn't notice it wasn't there before
[16:21:13] you're like Gandalf, always on time :]
[16:23:13] ahahhaha
[16:23:31] is the code review ready?
[16:23:50] elukey, yes :]
[16:25:53] mforns: have you ever used https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler/ ?
[16:26:16] elukey, I have seen it being used, but never did myself
[16:26:25] ah!
[16:26:43] let's do it if you have time
[16:26:45] you need to log in
[16:27:23] yes, should I input the change id?
[16:28:21] exactly
[16:28:33] the host is in this case an-coord1001.eqiad.wmnet
[16:28:58] fdans: are you going to sos? (otherwise i can go, i have a few free mins)
[16:29:53] nuria: I pmd you for the notes, don’t know if you saw
[16:31:57] but elukey, if you need to push the code to gerrit to run the compiler, jenkins will run it anyway no?
[16:32:36] mforns: nope jenkins does not run it, it does only some validations for syntax, linting, etc..
[16:32:44] oooh
[16:32:54] yeah
[16:33:03] so you can see how the puppet catalog differs for that host
[16:33:05] elukey, it didn't recognize the change id that I gave it
[16:33:13] I see
[16:33:22] did you put only the number or all the link?
[16:33:53] Analytics, Research, WMDE-Analytics-Engineering, User-Addshore, User-Elukey: Provide tools for querying MediaWiki replica databases without having to specify the shard - https://phabricator.wikimedia.org/T212386 (jcrespo) I don't have comments about the script itself, but: ['s1', 's2', 's3',...
[16:34:07] I put the id, which is in hex
[16:34:12] but it requires an int
[16:34:50] ok, I should use the one in the url
[16:35:33] ah yes yes sorry!
[16:35:43] I didn't get that you were referring to that id
[16:36:22] it's green :]
[16:42:14] ah there you are https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler/14453/console
[16:42:22] in the console output there is a link
[16:42:25] to the result
[16:42:45] if you click on the hostname you should then see what changes
[16:44:35] mforns: --^
[16:44:52] elukey, looking at it right now!
[16:44:56] this is amazing :]
[16:45:28] wow so much stuff
[16:50:23] Analytics, Research, WMDE-Analytics-Engineering, User-Addshore, User-Elukey: Provide tools for querying MediaWiki replica databases without having to specify the shard - https://phabricator.wikimedia.org/T212386 (elukey) Thanks for the suggestion! This is a very quick and dirty prototype to s...
[16:52:22] Analytics, Analytics-EventLogging, Analytics-Kanban, MW-1.32-notes (WMF-deploy-2018-10-16 (1.32.0-wmf.26)), and 3 others: Spin out a tiny EventLogging RL module for lightweight logging - https://phabricator.wikimedia.org/T187207 (Nuria) Ping here @AndyRussG and @phuedx @jlinehan so they now Even...
[16:52:35] Analytics, Analytics-Data-Quality, Product-Analytics: page_creation_timestamp not always correct in mediawiki_history - https://phabricator.wikimedia.org/T214490 (mpopov)
[17:01:00] a-team: staddduppppppp
[17:01:09] ping milimetric , mforns
[17:01:12] ping joal
[17:27:36] Analytics, EventBus, Patch-For-Review, Services (watching): EventBus mediawiki extension should support multiple 'event service' endpoints - https://phabricator.wikimedia.org/T214446 (Pchelolo)
[17:48:31] milimetric: https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Edit_history_administration#QA:_Assessing_quality_of_a_snapshot
[17:48:41] sweet, thanks nuria
[17:48:42] fdans, milimetric : linked now from on call docs
[17:48:55] https://wikitech.wikimedia.org/wiki/Analytics/Ops_week#Mediawiki_Denormalize_checker_alarms
[17:50:08] milimetric: please ping me if you want help on this
[17:50:39] joal: no worries, I'm doing the stupid thing and copying the history to my folder :) Bringing the mountain to Mohamed
[17:50:57] joal: I'll do lunch and finish up checking, I'm guessing everything's fine since I already checked this anyway
[17:51:26] sounds good milimetric - I'll be around late-ish again tonight, don't hesitate :)
[17:52:10] milimetric: please update docs so the next mohamed knows what to do
[17:58:22] mforns: merging your change ok?
[17:58:29] elukey, thanks!
[18:07:06] mforns: done
[18:08:31] Analytics: Clean up home dirs for user mkroetzsch - https://phabricator.wikimedia.org/T214501 (MoritzMuehlenhoff)
[18:15:28] * elukey off!
[18:43:57] milimetric: have a meeting in a few mins but I am happy to look with you at the reduce snapshot
[18:57:13] Analytics, Analytics-EventLogging, EventBus, Core Platform Team (Modern Event Platform (TEC2)), and 2 others: Develop a library for JSON schema backwards incompatibility detection - https://phabricator.wikimedia.org/T206889 (Pchelolo) The topic appears to be deeper than I thought. Please read htt...
[19:26:03] Analytics, Analytics-EventLogging, EventBus, Core Platform Team (Modern Event Platform (TEC2)), and 2 others: Develop a library for JSON schema backwards incompatibility detection - https://phabricator.wikimedia.org/T206889 (mobrovac) >>! In T206889#4903298, @Pchelolo wrote: > Given that we are i...
[19:29:45] Analytics, Analytics-EventLogging, EventBus, Core Platform Team (Modern Event Platform (TEC2)), and 2 others: Develop a library for JSON schema backwards incompatibility detection - https://phabricator.wikimedia.org/T206889 (Pchelolo) > Hmmm you seem to assume we control all the producers, but I'...
[19:54:36] Analytics, Analytics-Kanban, Product-Analytics: Bug: can't make a YoY time series chart in Superset - https://phabricator.wikimedia.org/T210687 (mpopov) Cool! Thank you!
[20:22:25] Analytics: Clean up home dirs for user mkroetzsch - https://phabricator.wikimedia.org/T214501 (Peachey88)
[21:05:25] nuria: I can make our 1/1 slot you scheduled, but not the first one because I volunteered to do the tech help thing with cloud that day
[21:05:32] it wasn't on my calendar, my bad
[21:09:54] Analytics: LDAP login advice on https://superset.wikimedia.org/ specifies wrong kind of login name - https://phabricator.wikimedia.org/T214524 (Tbayer)
[23:16:30] Analytics, ORES, Scoring-platform-team (Current): Backfill ORES Hadoop scores with historical data - https://phabricator.wikimedia.org/T209737 (awight) Important change of plans—We're discussing backfilling, and it might be best to allow mismatched model versions in the dumps for now. In other words...