[00:16:19] Analytics, Analytics-Kanban, Product-Analytics: Give clear recommendations for Spark settings - https://phabricator.wikimedia.org/T245897 (Nuria) [00:17:44] Analytics, Analytics-Kanban, Product-Analytics: Give clear recommendations for Spark settings - https://phabricator.wikimedia.org/T245897 (Nuria) Open→Resolved [00:17:47] Analytics, Epic, Product-Analytics (Kanban): Analysts cannot reliably use wmfdata to run SQL queries against Hive databases - https://phabricator.wikimedia.org/T245891 (Nuria) [00:39:11] Analytics, Analytics-Kanban: Support CSV uploads in Superset - https://phabricator.wikimedia.org/T245679 (Nuria) Tested that this works well, pretty easy to create cc @cchen so she knows this is a possible option, tables have to be created on the mysql_staging database cc @kzimmerman cause uploading cvs... [00:40:31] Analytics, Analytics-Kanban: Support CSV uploads in Superset - https://phabricator.wikimedia.org/T245679 (Nuria) Open→Resolved [00:45:09] Analytics: Should reportupdater Pingback reports be refactored? - https://phabricator.wikimedia.org/T246154 (CCicalese_WMF) > It is true that, if we delete the data as proposed, we'll not be able to use the table for retroactive queries (say accumulate values until 2018-06-12). Right. Can we assume that the... [00:55:22] Analytics, Analytics-EventLogging, DiscussionTools, VisualEditor, and 3 others: New EventLogging queue doesn't log events in window.unload - https://phabricator.wikimedia.org/T246382 (Mayakp.wiki) Based on my discussion with @DLynch , I checked the data from the past few months and it looks like...
[01:37:26] Analytics: Add wikitech (labswiki) to the sqoop list - https://phabricator.wikimedia.org/T217792 (Jdforrester-WMF) [01:37:34] Analytics: Add wikitech (labswiki) to the sqoop list - https://phabricator.wikimedia.org/T217792 (Jdforrester-WMF) [01:48:19] Analytics: Should reportupdater Pingback reports be refactored? - https://phabricator.wikimedia.org/T246154 (Nuria) >Would we be able to save the report up to the point in December that we had the failure Yes, the "report" is just a csv file here: https://analytics.wikimedia.org/published/datasets/periodic... [01:59:50] Analytics: Should reportupdater Pingback reports be refactored? - https://phabricator.wikimedia.org/T246154 (CCicalese_WMF) Yes, my suggestion is that we stop the existing report but keep the data statically so people can still view the graphs for historical purposes. Then, we would create a new report with... [04:06:47] Analytics, Analytics-EventLogging, DiscussionTools, VisualEditor, and 3 others: New EventLogging queue doesn't log events in window.unload - https://phabricator.wikimedia.org/T246382 (ori) If EventLogging is inconsistent, unreliable, or opaque in its behavior, researchers will lose trust in it, a... [04:55:52] Analytics, Analytics-EventLogging, DiscussionTools, VisualEditor, and 3 others: New EventLogging queue doesn't log events in window.unload - https://phabricator.wikimedia.org/T246382 (DLynch) > I have some vague idea that this could be made better by using a promise instead.. I don’t think addin... [05:07:40] Analytics, Analytics-EventLogging, DiscussionTools, VisualEditor, and 3 others: New EventLogging queue doesn't log events in window.unload - https://phabricator.wikimedia.org/T246382 (Nuria) >It’s very recent, and apparently exists only for performance reasons on mobile — if it introduced problem...
[05:11:04] (PS10) Nuria: Classification of actors for bot detection [analytics/refinery] - https://gerrit.wikimedia.org/r/562368 (https://phabricator.wikimedia.org/T238361) [05:16:23] Analytics, Analytics-EventLogging, DiscussionTools, VisualEditor, and 3 others: New EventLogging queue doesn't log events in window.unload - https://phabricator.wikimedia.org/T246382 (DLynch) The existing instrumentation is from VisualEditor, mostly applies to desktop, and has been in place since... [05:17:34] (CR) Nuria: "Tested oozie jobs and they are working, to fully vet data I think we need to merge and per joseph's suggestion backfill with two months of" [analytics/refinery] - https://gerrit.wikimedia.org/r/562368 (https://phabricator.wikimedia.org/T238361) (owner: Nuria) [05:32:14] Analytics, Analytics-EventLogging, DiscussionTools, VisualEditor, and 3 others: New EventLogging queue doesn't log events in window.unload - https://phabricator.wikimedia.org/T246382 (Nuria) >The existing instrumentation is from VisualEditor, mostly applies to desktop, and has been in place since... [06:09:59] Analytics, Analytics-EventLogging, DiscussionTools, VisualEditor, and 3 others: New EventLogging queue doesn't log events in window.unload - https://phabricator.wikimedia.org/T246382 (DLynch) > And, given that times have changed since 2016 could the events possibly be attached to a more "modern"...
[07:32:04] (PS1) Elukey: refinery-drop-older-than: fix test for invalid_older_than [analytics/refinery] - https://gerrit.wikimedia.org/r/575464 [07:40:11] (PS2) Elukey: refinery-drop-older-than: improve test for older_than arg [analytics/refinery] - https://gerrit.wikimedia.org/r/575464 [07:46:02] (PS3) Elukey: refinery-drop-older-than: improve test for older_than arg [analytics/refinery] - https://gerrit.wikimedia.org/r/575464 [07:46:18] so I just realized that my last change broke drop-older-than :( [07:46:25] I filed a change to fix it [08:12:59] (PS1) Elukey: Add an-launcher1001 to the targets [analytics/refinery/scap] - https://gerrit.wikimedia.org/r/575466 [08:13:20] (CR) Elukey: [V: +2 C: +2] Add an-launcher1001 to the targets [analytics/refinery/scap] - https://gerrit.wikimedia.org/r/575466 (owner: Elukey) [08:14:09] PROBLEM - Check the last execution of reportupdater-edit-beta-features on stat1006 is CRITICAL: NRPE: Command check_check_reportupdater-edit-beta-features_status not defined https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers [08:14:15] PROBLEM - Check the last execution of reportupdater-language on stat1006 is CRITICAL: NRPE: Command check_check_reportupdater-language_status not defined https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers [08:14:51] this is me cleaning up --^ [08:15:09] (PS1) Elukey: Add an-launcher1001 to the targets [analytics/hdfs-tools/deploy] - https://gerrit.wikimedia.org/r/575467 [08:16:11] PROBLEM - Check the last execution of reportupdater-cx on stat1006 is CRITICAL: NRPE: Command check_check_reportupdater-cx_status not defined https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers [08:16:27] PROBLEM - Check the last execution of reportupdater-published_cx2_translations on stat1006 is CRITICAL: NRPE: Command check_check_reportupdater-published_cx2_translations_status not defined
https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers [08:17:57] going to downtime [08:19:53] (CR) Elukey: [V: +2 C: +2] Add an-launcher1001 to the targets [analytics/hdfs-tools/deploy] - https://gerrit.wikimedia.org/r/575467 (owner: Elukey) [09:12:59] Analytics, Analytics-Kanban, Release Pipeline, Patch-For-Review, and 2 others: Migrate EventStreams to k8s deployment pipeline - https://phabricator.wikimedia.org/T238658 (akosiaris) No it's not full. CPU and memory usage requests are barely at 10%. events have been cleared from the namespace so... [10:28:21] Analytics, Analytics-Kanban, Tool-Pageviews: Add mediarequests metrics to wikistats UI - https://phabricator.wikimedia.org/T234589 (BerndFiedlerWMDE) I love you guys! Thanks a lot! [10:31:10] so really strange [10:31:26] on stat1007 there is a /mnt/data mountpoint that I don't find config for [10:31:41] that refers to /data/xmldatadumps/public [10:35:14] Analytics, Analytics-Kanban, Tool-Pageviews: Add mediarequests metrics to wikistats UI - https://phabricator.wikimedia.org/T234589 (BerndFiedlerWMDE) The file "presidential election" can now be tracked https://tools.wmflabs.org/mediaviews/?project=commons.wikimedia.org&platform=&referer=all-referers&... [10:57:12] ok fixed :) [10:57:21] joal: all hdfs-rsync jobs on an-launcher! [10:59:24] on stat1006 we don't have any more analytics-related timers [11:04:06] so stat1006 is very close to become like 1004/1005, another hadoop node [11:04:23] we'll need to sort out some stuff, and probably now it is a good time to start merging posix groups [11:04:28] we have too many [11:31:09] * elukey lunch!
[12:35:59] (PS4) Fdans: Improve flexibility of media file paths [analytics/aqs] - https://gerrit.wikimedia.org/r/571968 (https://phabricator.wikimedia.org/T244712) [12:36:05] (CR) jerkins-bot: [V: -1] Improve flexibility of media file paths [analytics/aqs] - https://gerrit.wikimedia.org/r/571968 (https://phabricator.wikimedia.org/T244712) (owner: Fdans) [13:39:48] (CR) Mforns: [C: +2] "Yes, looking way better now!" [analytics/refinery] - https://gerrit.wikimedia.org/r/575464 (owner: Elukey) [13:46:13] Yay elukey! thanks a lot :) [13:55:00] elukey: "... so stat1006 is very close to become like 1004/1005, another hadoop node" - Yay :) Thanks a lot [13:55:56] GoranSM: still need to sort out some things, but hopefully soon all the stat boxes will be the same :) [13:56:13] elukey: :) [13:56:14] less posix groups too (hopefully only analytics/analytics-privatedata) [13:56:21] and notebooks on all [13:56:26] we are getting there :) [13:57:38] Analytics: Should reportupdater Pingback reports be refactored? - https://phabricator.wikimedia.org/T246154 (mforns) Oh! Didn't know about the heartbeat pingback. That is great, looking one month back would be completely fine of course. Is the heartbeat pingback issued at a fixed data (say first of month) fo... [13:58:45] (CR) Elukey: [V: +2] refinery-drop-older-than: improve test for older_than arg [analytics/refinery] - https://gerrit.wikimedia.org/r/575464 (owner: Elukey) [13:59:07] a-team: going to quickly deploy refinery (no hdfs needed, just to fix --^) [13:59:22] k! [13:59:26] ack! [14:05:39] Analytics, Analytics-Kanban: Move the Analytics infrastructure to Debian Buster - https://phabricator.wikimedia.org/T234629 (elukey) [14:09:34] (CR) Abijeet Patro: [C: +1] "I can't actually give a CR +2" [analytics/wikistats2] - https://gerrit.wikimedia.org/r/575289 (owner: L10n-bot) [14:29:00] mforns: I am stupid, part of the fix doesn't work [14:29:16] ??
[14:29:22] I didn't test one case, since I made a last minute change [14:29:22] I tested it :[ [14:29:36] what's happening? [14:29:57] so if you pass an int via args, it will be a string IIUC [14:29:59] not an int [14:30:04] so isinstance(etc..) fails [14:30:28] Feb 28 14:23:35 an-coord1001 kerberos-run-command[44421]: The older_than arg must be either a date or an int. [14:31:10] is it possible? [14:31:14] or am I crazy> [14:31:15] ? [14:33:32] if true we can add a test [14:34:56] mforns: --^ [14:36:24] oh elukey yes, [14:36:27] I missed that [14:37:43] I trusted the tests without thinking about adding a new one :) [14:37:44] elukey, we can do sth like: try: int(older_than) except ValueError: raise RuntimeError(...) [14:38:10] yes, i am not sure if it is super great to do that inside an exception handler [14:38:40] why not? [14:39:15] it works but it doesn't look pretty from my point of view, maybe it is only my brain :) [14:39:23] otherwise just change the test that checks that to expect a ValueError, and it will be fine [14:40:14] mforns: in this case the test works, it is the code that validates something like --older-than=24 that fails [14:41:54] elukey, I understand, we should remove the isinstance, but if we do so, the error raised by the code when older-than is not parseable will be a ValueError, not a RuntimeError [14:42:06] so either we change the test to expect that, or it will fail [14:42:17] or we can do the try catch thing in the code [14:43:08] yep yep [14:44:16] wondering if try/else could be good too [14:44:38] * milimetric o/ [14:46:16] mmm no [14:47:22] elukey, we can use '34'.isdigit() [14:47:32] if false, raise RuntimeError [14:47:46] it works only with integers, but that's what we want [14:47:57] yep also an option, I like it [14:48:15] mforns: do you want to send a patch? [14:48:21] actually, is perfect, because it fails with negative numbers as well [14:48:25] sure!
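Editor's note: the `'34'.isdigit()` idea discussed above can be sketched roughly as follows. This is only an illustration, not the code from the actual patch: the function name `parse_older_than` and the date format `%Y-%m-%d` are assumptions made for the example, and only the error message is taken from the log line quoted in the chat.

```python
from datetime import datetime

# Hypothetical date format; the format the real script accepts is not stated in the chat.
DATE_FORMAT = "%Y-%m-%d"


def parse_older_than(value):
    """Turn a command-line string into an int (number of days) or a datetime.

    Command-line arguments always arrive as strings, so an isinstance(value, int)
    check can never succeed; str.isdigit() is used instead, as suggested in the chat.
    """
    # isdigit() is False for "-5", "" and "abc", so negative numbers are rejected too.
    if value.isdigit():
        return int(value)
    try:
        return datetime.strptime(value, DATE_FORMAT)
    except ValueError:
        # Surface a RuntimeError so callers and tests see a single error type.
        raise RuntimeError(
            "The older_than arg must be either a date or an int."
        ) from None
```

With this shape, `parse_older_than("24")` yields the integer 24, while `parse_older_than("-5")` raises `RuntimeError`, which matches the behaviour the participants said they wanted.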
[14:48:55] +1 [14:49:52] !log Drop test keyspaces in cassandra cluster [14:49:53] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [15:10:03] Analytics, Analytics-Kanban: Problem with Matomo page overlay - https://phabricator.wikimedia.org/T246046 (Milimetric) p:Triage→High I found two possibilities. One might be the CORS headers are not set up on both the client site and the matomo site settings: https://forum.matomo.org/t/page-overl... [15:14:13] (PS1) Mforns: Fix refinery-drop-older-than tests and parameter check [analytics/refinery] - https://gerrit.wikimedia.org/r/575542 (https://phabricator.wikimedia.org/T246272) [15:17:10] woah what happened to hive, it's super fast at metadata queries [15:17:18] (CR) Mforns: "Checked that the tests are running fine." (3 comments) [analytics/refinery] - https://gerrit.wikimedia.org/r/575542 (https://phabricator.wikimedia.org/T246272) (owner: Mforns) [15:17:33] elukey, ^ [15:18:02] took me a bit to figure out the new test checksum [15:19:37] (PS1) Milimetric: Fix bot identification [analytics/refinery] - https://gerrit.wikimedia.org/r/575545 (https://phabricator.wikimedia.org/T244632) [15:19:39] (CR) Elukey: [C: +1] Fix refinery-drop-older-than tests and parameter check [analytics/refinery] - https://gerrit.wikimedia.org/r/575542 (https://phabricator.wikimedia.org/T246272) (owner: Mforns) [15:19:45] looks good! [15:19:50] cool [15:20:01] want me to merge? or wait for someone else?
[15:20:09] nono let's do it, I can deploy then [15:20:16] ok [15:20:32] (CR) Elukey: [V: +2 C: +2] Fix refinery-drop-older-than tests and parameter check [analytics/refinery] - https://gerrit.wikimedia.org/r/575542 (https://phabricator.wikimedia.org/T246272) (owner: Mforns) [15:20:41] deploying again :) [15:21:31] k [15:22:35] Analytics, Analytics-Data-Quality, Analytics-Kanban, Product-Analytics, Patch-For-Review: Bot field in edits_hourly dataset ignores username - https://phabricator.wikimedia.org/T244632 (Milimetric) Makes sense, thanks for the fix Neil. For future reference, totally feel free to submit this i... [15:23:25] milimetric: o/ - all RU jobs are on an-launcher1001 now, if you have time to triple check that nothing is gone horribly wrong today it would be great :D [15:25:59] elukey: will do [15:26:21] elukey: also, I didn't get around to restarting webrequest load bundle last night after deploy, because of the aqs stuff I was troubleshooting [15:26:38] I was going to do it now unless you think it's a bad idea [15:27:27] nono +1 [15:27:36] I am about to finish the scap deploy for refinery [15:27:49] (but shouldn't be impacting any restart) [15:30:48] yeah, I'm restarting referencing v0.0.115 which is already synced [15:31:02] oh elukey [15:31:15] nvm, nvm, sorry [15:31:16] :) [15:42:02] milimetric: have you merged = restarted the webrequest bundle stuff (host-normalization related) [15:42:05] ? [15:42:12] oh, and hi milimetric (sorry) [15:42:13] joal: I just did [15:42:16] \o/ [15:42:19] Thanks mate :) [15:42:19] mforns: it works!
[15:42:29] yay [15:42:33] I merged and deployed yesterday, and restarted just a few minutes ago, joal, was late [15:42:41] np [15:42:42] also, hi joal (sorry) :) [15:42:46] huhu :) [15:45:49] root@analytics1030:/home/elukey# file /usr/lib/oozie/lib/hadoop-hdfs.jar [15:45:52] /usr/lib/oozie/lib/hadoop-hdfs.jar: broken symbolic link to ../../hadoop/client/hadoop-hdfs.jar [15:45:55] * elukey cries in a corner [15:48:17] it is a mess of symlinks [15:48:38] so oozie can't find the hdfs client libs and fails [15:50:56] Analytics, Analytics-EventLogging, DiscussionTools, VisualEditor, and 3 others: New EventLogging queue doesn't log events in window.unload - https://phabricator.wikimedia.org/T246382 (ori) >>! In T246382#5925521, @Nuria wrote: > The perf issues are more prevalent in mobile but really exists anywh... [15:53:13] joal: hm... and it failed (webrequest load) https://hue.wikimedia.org/jobbrowser/jobs/job_1576512674871_312469/single_logs [15:53:34] I just saw that - milimetric - checking [15:53:42] looks like a sql error? [15:54:25] nope [15:55:37] was thinking of running it to see on the CLI what it says? [15:55:39] https://www.irccloud.com/pastebin/nGP7QrSN/ [15:56:52] why not milimetric, don't forget to load jar and create UDFs [15:57:39] joal: it gives me a parse error and it's exactly as copied from the logs... formatting now to see if I can spot it [15:58:05] milimetric: I have not changed a bit of that code! just the hive jar version! [15:58:16] yeah... weird [15:59:24] oh, lol, it has this in the middle of the SQL query in the logs!!!
" printed operations logs going to print operations logs" [15:59:49] milimetric: spotted errors in app logs look awful :( [16:00:35] I get bad formatting but this is literally like in the middle of the query randomly inside the select statement [16:01:19] Analytics, Analytics-EventLogging, DiscussionTools, VisualEditor, and 3 others: New EventLogging queue doesn't log events in window.unload - https://phabricator.wikimedia.org/T246382 (Nuria) >Are there any measurements to support this? It is not perf issues with our code but rather issues with wa... [16:04:05] this is the error I get in plenty workers: https://gist.github.com/jobar/f477958747a5321fe691e3af682cfa84 [16:04:20] elukey - possibly related to yesterday's restart? [16:09:47] the mapreduce job failed map-side, with super-bad errors (as the one I pasted above) [16:12:51] joal: I'm still getting a super confusing parse error, narrowed it down to this line: "case coalesce(tls, '-') when '-' then null else str_to_map(tls, ';', '=') end as tls_map" but have no idea why [16:13:07] (I rewrote the line just in case there were weird characters in the logs, but for some reason it doesn't like that) [16:13:11] milimetric: it's not a parse error for sure - no map-reduce job would have been started [16:17:19] MEH [16:23:12] ok, parse mystery solved - the logs hide the \ as in '\;' and apparently ';' is invalid HQL or something, breaks the parser [16:23:31] Ok I can replicate the error [16:24:14] really? Interesting [16:31:18] ok error unrelated to the host-normalization patch [16:34:39] sorry joal was in a metting [16:34:41] *meeting [16:35:36] whattttt a segfault? [16:37:13] hm [16:39:33] joal: what is the status?
[16:39:41] I am not really getting where we are at [16:39:53] I think I have the problem - my fix for geo-udf fixed local-mode but broke map-reduce mode :( [16:40:31] * joal is really sorry for not having thought of that case :S [16:40:33] the segfaults are a problem though, do you think that they are related for some reason? [16:40:44] I can't imagine anything [16:40:48] else [16:41:03] when I remove the geo-UDF from the request, no more error [16:41:40] pffff [16:41:49] let me know if I can help! [16:42:12] * joal reads the path again [16:55:22] joal: should I roll back and start refine with the previous settings? It's getting late and maybe we don't want to delay the jobs any longer [16:56:59] milimetric: I'm trying my last, then I'll give up [16:57:46] k [17:06:39] nuria: standuppp [17:15:54] joal: I killed the bundle, it was just wasting CO2 [17:16:00] Thanks [17:28:16] (CR) Joal: [C: +1] "LGTM - We need to be careful to remember that point - to me, bot means group-bot, not group+name (very personal)" [analytics/refinery] - https://gerrit.wikimedia.org/r/575545 (https://phabricator.wikimedia.org/T244632) (owner: Milimetric) [17:42:21] (PS1) Milimetric: Revert "Fix GetGeoDataUDF and underlying function" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/575577 [17:42:44] (CR) Milimetric: [C: +2] Revert "Fix GetGeoDataUDF and underlying function" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/575577 (owner: Milimetric) [17:47:39] (Merged) jenkins-bot: Revert "Fix GetGeoDataUDF and underlying function" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/575577 (owner: Milimetric) [17:48:50] Analytics, Analytics-Cluster, User-Elukey: Upgrade the Hadoop test cluster to BigTop - https://phabricator.wikimedia.org/T244499 (elukey) Created https://issues.apache.org/jira/browse/BIGTOP-3317 after debugging while the oozie sharedlib create command was failing.
[17:51:10] (PS1) Milimetric: Bump changelog.md to v0.0.116 [analytics/refinery/source] - https://gerrit.wikimedia.org/r/575579 [17:52:10] here we go fdans - https://gist.github.com/jobar/f969585562f5c5731c0195cc5c1fb311 [17:52:16] milimetric, nuria --^ [17:52:37] (CR) Milimetric: [C: +2] Bump changelog.md to v0.0.116 [analytics/refinery/source] - https://gerrit.wikimedia.org/r/575579 (owner: Milimetric) [17:52:45] (CR) Milimetric: [V: +2 C: +2] Bump changelog.md to v0.0.116 [analytics/refinery/source] - https://gerrit.wikimedia.org/r/575579 (owner: Milimetric) [17:53:10] wow joal [17:53:25] this is way more sophisticated than what i was cooking [18:02:35] going off! [18:02:36] o/ [18:12:28] (CR) Milimetric: [V: +2 C: +2] Fix bot identification [analytics/refinery] - https://gerrit.wikimedia.org/r/575545 (https://phabricator.wikimedia.org/T244632) (owner: Milimetric) [18:12:46] Analytics, Analytics-Kanban: Support CSV uploads in Superset - https://phabricator.wikimedia.org/T245679 (cchen) Thank you @Ottomata and @Nuria! i will try out and share with other teams. [18:14:15] Analytics: Hive log4j logging is misconfigured - https://phabricator.wikimedia.org/T216294 (nshahquinn-wmf) FYI, I'm having to think about this again; I'm working on Python wrapper for the Hive command line client as part of T246060, and [logspam](https://wikitech.wikimedia.org/wiki/Deployments/Holding_the_t... [18:14:23] (PS1) Milimetric: Bump webrequest load hive jar to 0.0.116 [analytics/refinery] - https://gerrit.wikimedia.org/r/575590 (https://phabricator.wikimedia.org/T245453) [18:14:34] (CR) Milimetric: [V: +2 C: +2] Bump webrequest load hive jar to 0.0.116 [analytics/refinery] - https://gerrit.wikimedia.org/r/575590 (https://phabricator.wikimedia.org/T245453) (owner: Milimetric) [18:32:26] hm...
I ran the refinery-source build, got no errors, but jenkins didn't add the new refinery-source jars to refinery like it usually does: https://integration.wikimedia.org/ci/job/analytics-refinery-release/242/ [18:32:36] that's a new one... [18:39:30] Analytics, Analytics-Kanban: Support CSV uploads in Superset - https://phabricator.wikimedia.org/T245679 (Nuria) pinging also @EYener in FR so she knows this is an easy way to prototype dashboards from ad hoc data sources, just a csv file is needed [18:45:17] milimetric: adding jars to refinery is another jenkins job IIRC [18:46:14] thanks for the bump, joal, I was dyslexing all over the instructions that I followed like 100 times [18:46:22] I knew there was a third step but my eyes just weren't seeing it [18:46:26] :) unblocked [19:10:12] !log deployed 0.0.116 and restarted webrequest load bundle at 2020-02-28T14 [19:10:13] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [19:21:10] \o/ milimetric :) [19:21:55] (My IRC bouncer is flapping and the bot logs are down so I don't know if my previous message went through. Excuse any echo.) Maya and I are testing the PrefUpdate_19799589 schema (https://meta.wikimedia.org/wiki/Schema:PrefUpdate) on the beta cluster. We see the events pouring into all-events.log but the table does not exist in the database. In a [19:21:56] prior ticket (https://phabricator.wikimedia.org/T203815), nuria suggested restarting eventlogging on the server but I don't know how to do that. Can someone help? [19:23:24] niedzielski-mobi: let me remember what machine was this one [19:23:55] nuria: hey! We think it's deployment-eventlog05. [19:24:09] (But if there's a wiki page listing this stuff, that'd be invaluable.) [19:25:02] niedzielski-mobi: https://wikitech.wikimedia.org/wiki/Analytics/Systems/EventLogging/TestingOnBetaCluster#How_to_check_if_events_are_valid [19:25:38] Yay!
[19:26:51] niedzielski-mobi: trying to reboot machine now [19:47:16] nuria: hm, so I think the machine has fully rebooted now but I'm still not seeing the table. Edward mentioned via Elena that the database logging was actually turned off February 5th. Do you know if that's still disabled? (Sorry, I haven't been keeping up with analytics!) [20:02:37] (PS1) GoranSMilovanovic: initial [analytics/wmde/WD/WD_languagesLandscape] - https://gerrit.wikimedia.org/r/575607 [20:03:15] (CR) GoranSMilovanovic: [V: +2 C: +2] initial [analytics/wmde/WD/WD_languagesLandscape] - https://gerrit.wikimedia.org/r/575607 (owner: GoranSMilovanovic) [20:07:36] (PS1) GoranSMilovanovic: ui.R [analytics/wmde/WD/WD_languagesLandscape] - https://gerrit.wikimedia.org/r/575608 [20:07:49] (CR) GoranSMilovanovic: [V: +2 C: +2] ui.R [analytics/wmde/WD/WD_languagesLandscape] - https://gerrit.wikimedia.org/r/575608 (owner: GoranSMilovanovic) [20:12:53] (just fyi webrequest load bundle is back to normal) [20:25:19] Analytics, Analytics-Data-Quality, Analytics-Kanban, Product-Analytics, Patch-For-Review: Bot field in edits_hourly dataset ignores username - https://phabricator.wikimedia.org/T244632 (Milimetric) I was able to fast track it due to another deployment, this will take effect with the next snap... [20:29:30] Analytics: Should reportupdater Pingback reports be refactored? - https://phabricator.wikimedia.org/T246154 (CCicalese_WMF) There's more detail about the heartbeat ping at T236178. [21:01:11] niedzielski-mobi: i thought mysql was still enabled for beta , but might be remembering it all wrong, can check in 30 min [21:01:37] nuria: thanks! [22:00:43] niedzielski-mobi: i cannot see anything running on deployment-eventlog05 [22:00:53] niedzielski-mobi: were you able to see your events before? [22:05:22] niedzielski-mobi: talked to labs team cause i cannot even run puppet on that host [23:24:39] nuria: sorry, for the disconnect.
What I was seeing was events landing in the all-events log but the table is never made in the database. [23:25:02] niedzielski-mobi: puppet was broken, i re-ran it [23:25:11] nuria: e.g., in response to changing a preference and saving on https://en.wikipedia.beta.wmflabs.org/wiki/Special:Preferences#mw-prefsection-rendering [23:25:14] niedzielski-mobi: and dropped some large tables [23:25:23] niedzielski-mobi: did you check again? [23:25:36] niedzielski-mobi: otherwise i can do my LAST TRICK [23:25:49] * niedzielski-mobi checking now [23:31:09] Analytics, Analytics-Kanban, User-Elukey: No queries run in Hue - https://phabricator.wikimedia.org/T242306 (MMiller_WMF) FYI this happened again to me today, but I recreated the session using the instructions above and now it is working again. [23:32:17] nuria: Sorry for the delay (I swear I'm seeing different shell history!). I only see the old table, PrefUpdate_5563398, not the expected one PrefUpdate_19799589. [23:32:39] nuria: I see the event come through `tail -f /srv/log/eventlogging/all-events.log|grep -i Pref`. [23:38:45] niedzielski-mobi: let me see if i can restart the db [23:45:01] nuria: By the way, I'm doing all this on deployment-eventlog05 which I think is the correct place being that I see the expected results in the log. [23:47:23] https://www.irccloud.com/pastebin/W1Eup9Rg/ [23:47:44] niedzielski-mobi: sorry, see my post the mariadb is running with eventlogging user as read only [23:47:52] niedzielski-mobi: why i do not know [23:48:03] niedzielski-mobi: but that means new tables would not be created [23:48:20] niedzielski-mobi: in any case the all-events log is sufficient to see they are valid [23:48:35] niedzielski-mobi: we probably should forget mariadb entirely [23:55:08] nuria: well, yay for getting to the bottom of things! Way to go, Nuria! Thank you soo much!