[00:53:38] PROBLEM - Check the last execution of monitor_refine_eventlogging_legacy_failure_flags on an-launcher1002 is CRITICAL: CRITICAL: Status of the systemd unit monitor_refine_eventlogging_legacy_failure_flags https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[02:03:55] (PS1) Milimetric: [WIP] Spike so far [analytics/refinery/source] - https://gerrit.wikimedia.org/r/618874 (https://phabricator.wikimedia.org/T258532)
[02:04:36] (PS2) Milimetric: [WIP] Spike so far [analytics/refinery/source] - https://gerrit.wikimedia.org/r/618874 (https://phabricator.wikimedia.org/T258532)
[02:07:05] (CR) jerkins-bot: [V: -1] [WIP] Spike so far [analytics/refinery/source] - https://gerrit.wikimedia.org/r/618874 (https://phabricator.wikimedia.org/T258532) (owner: Milimetric)
[06:30:09] RECOVERY - Check the last execution of monitor_refine_eventlogging_legacy_failure_flags on an-launcher1002 is OK: OK: Status of the systemd unit monitor_refine_eventlogging_legacy_failure_flags https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[07:36:33] fdans: o/
[07:36:42] if you are around I'd need a consult for npm + webpack
[07:36:49] since I am a total n00b
[07:47:44] elukey: sorry! just back from running an errand
[07:47:46] bc?
[07:48:02] fdans: I think I may have solved it, it turns out I am also a n00b at copying files
[07:48:08] haha
[07:48:18] lmk if you need anything
[07:48:29] basically I hadn't noticed all the .hidden files related to babel etc..
[07:48:37] so I wasn't copying them into the dir to build
[07:48:49] I'll ping you if it explodes again
[07:48:50] sigh
[08:22:14] Analytics: Check home/HDFS leftovers of nathante - https://phabricator.wikimedia.org/T256356 (akosiaris) Open→Resolved @Groceryheist You should have access again now. I'll resolve this, feel free to reopen
[10:30:58] * elukey lunch!
[10:33:47] Analytics, MediaWiki-REST-API, Platform Team Sprints Board (Sprint 1), Platform Team Workboards (Green), Story: System administrator reviews API usage by client - https://phabricator.wikimedia.org/T251812 (Naike)
[10:39:36] Analytics, MediaWiki-REST-API, Platform Team Sprints Board (Sprint 0), Platform Team Workboards (Green), Story: System administrator reviews API usage by client - https://phabricator.wikimedia.org/T251812 (Naike)
[10:39:58] Analytics, MediaWiki-REST-API, Platform Team Sprints Board (Sprint 1), Platform Team Workboards (Green), Story: System administrator reviews API usage by client - https://phabricator.wikimedia.org/T251812 (Naike)
[11:03:35] Analytics, MediaWiki-REST-API, Platform Team Sprints Board (Sprint 1), Platform Team Workboards (Green): Unify access log schema for Action API and API Gateway/REST API - https://phabricator.wikimedia.org/T259736 (Naike)
[11:50:29] Analytics-Radar, Datasets-General-or-Unknown, Product-Analytics, Structured-Data-Backlog: Set up generation of JSON dumps for Wikimedia Commons - https://phabricator.wikimedia.org/T259067 (Cparle) The wikibase json dump script seems to work just fine for this locally at least `mwscript extension...
[11:54:26] hey teaaam :]
[13:14:05] really nice reading https://phabricator.wikimedia.org/phame/post/view/190/internal_anycast
[14:06:45] o/ elukey
[14:06:53] i'm looking into CAS for jupyterhub
[14:07:01] got any tips? docs aren't great...
[14:07:09] using
[14:07:10] https://github.com/cwaldbieser/jhub_cas_authenticator
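To make the option concrete, here is a minimal jupyterhub_config.py sketch of what wiring up jhub_cas_authenticator looks like, based on that project's README; every hostname and the group attribute are placeholders, and the option names should be double-checked against the installed version (`c` is the configuration object JupyterHub injects into its config file):

```python
# Hedged sketch from the jhub_cas_authenticator README; all URLs are
# placeholders, not a real deployment.
c.JupyterHub.authenticator_class = 'jhub_cas_authenticator.cas_auth.CASAuthenticator'

# Where unauthenticated users are redirected to sign in.
c.CASAuthenticator.cas_login_url = 'https://cas.example.org/cas/login'

# The service URL CAS sends users back to after a successful login.
c.CASAuthenticator.cas_service_url = 'https://jupyterhub.example.org/hub/login'

# Endpoint the hub calls to validate the service ticket it received.
c.CASAuthenticator.cas_service_validate_url = 'https://cas.example.org/cas/serviceValidate'

# Optionally require a CAS attribute, e.g. membership in an LDAP group.
c.CASAuthenticator.cas_required_attribs = {('memberOf', 'jupyterhub-users')}
```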
[14:09:00] ottomata: o/
[14:09:12] mmm can you explain more what you want to do?
[14:09:24] ok so to use jupyterhub, you have to sign in
[14:09:30] right now we use an ldapauthenticator to do that
[14:09:46] but people have to ssh tunnel to the jupyterhub server
[14:09:49] and THEN sign in with ldap
[14:09:55] was hoping i could avoid that step with CAS
[14:10:05] expose jupyterhub just like we do with superset
[14:10:16] and use cas to authenticate into jupyterhub
[14:10:22] but we'd need to expose all stat boxes no?
[14:10:38] hm
[14:10:43] for local yes.
[14:10:46] if we were doing yarnspawner no.
[14:10:46] hm
[14:11:01] ah ok this part was not clear :)
[14:11:10] well i was thinking about both
[14:11:19] but ya hm that is true
[14:11:21] so for the moment we have been adding mod_cas to httpd in front of all the UIs
[14:11:36] right, but jupyterhub has a built in cas authenticator
[14:11:43] which would fix the double sign in issue
[14:12:01] well by 'built in' i mean extension
[14:12:31] yep yep, I think that the best option is to open a task and add Moritz/Jbond to it
[14:12:50] there is some backend config related to CAS/IDP to do, which I am not super familiar with
[14:13:08] for jupyterhub, after the above change, it should be simple to configure
[14:13:50] but I'd first solve the problem of having a single UI/endpoint to use
[14:14:17] yeah
[14:14:25] ok i will punt on that for now then
[14:14:33] i want to see if i can get a jupyterhub setup with anaconda
[14:14:38] can do cas later if we want to
[14:16:04] yes makes sense
[14:16:52] at some point we should lock down UIs with 2FA, and this one seems like one that would need mandatory use of a yubikey
[14:28:00] it is super weird
[14:28:19] I have a working debian rules file that calls "make apps" in hue's makefile
[14:28:29] but when pip starts, I get errors like
[14:28:30] Retrying (Retry(total=4, connect=None, read=None, redirect=None)) after connection broken by 'ProxyError('Cannot connect to proxy.', NewConnectionError(': Failed to establish a new connection: [Errno -3] Temporary failure in name resolution'))': /simple/asn1crypto/
[14:28:44] that seems to indicate that the proxy is not set
[14:28:53] but I pass it via pdebuild
[14:29:10] when I get the root console for the chroot
[14:29:10] root@deneb:/tmp/buildd/hue-4.7.1# env | grep https
[14:29:11] https_proxy=http://webproxy.codfw.wmnet:8080
[14:29:19] and if I execute make apps, it works
[14:29:34] (both pip and npm)
[14:29:51] not sure why it doesn't work before that
[15:26:21] ottomata: the wikimedia-event-utilities ci build is occurring at https://integration.wikimedia.org/ci/job/wikimedia-event-utilities-maven-java8-docker/1/console
[15:26:22] :]
[15:26:36] and it is super fast
[15:26:49] for the release job, that would need some more pairing. Joal would surely be able to help
[15:38:40] joal is out on vaca for a bit
[15:38:42] can I help?
[15:38:54] hashar: ^
[15:39:29] hashar: I can merge https://gerrit.wikimedia.org/r/c/wikimedia-event-utilities/+/618378
[15:39:33] and we can try to release?
[15:39:51] i guess we need to set up a jenkins job (via the UI?) like we have for refinery/source?
[15:41:28] ottomata: for the release job, I will need to dig into it but that is not a task for a friday evening ;]
[15:41:33] surely I can look into it next week
[15:41:35] ;]
[15:41:59] there is a long tail of things to check so it is not entirely trivial
[15:41:59] ok that's fine
[15:42:02] thanks hashar
[15:42:21] hashar: shall I make a task?
[15:42:36] ottomata: definitely! :]
[15:42:42] ok ty!
[15:43:07] I am vanishing for now :] Let's poke at it sometime next week
[15:43:39] Analytics, Analytics-Kanban, Event-Platform: Set up Jenkins maven release job for wikimedia-event-utilities like analytics/refinery/source - https://phabricator.wikimedia.org/T259898 (Ottomata)
[15:43:41] k ty
[15:43:42] laters!
[15:44:19] Analytics, Analytics-Kanban, Event-Platform, Continuous-Integration-Config, Release-Engineering-Team (CI & Testing services): Set up Jenkins maven release job for wikimedia-event-utilities like analytics/refinery/source - https://phabricator.wikimedia.org/T259898 (Ottomata)
[15:45:37] ottomata: perfect. Thx for the task and see you next week :]
[16:15:09] ottomata: are you around?
[16:15:39] I'd need a brainbounce about hue if you have time (on IRC is fine)
[16:16:21] ya
[16:16:23] am here heyaaa
[16:17:45] thanks :)
[16:18:01] so hue
[16:18:29] it provides a makefile with a target, "apps", that builds a ton of things including python and nodejs deps
[16:18:37] into a big ~500MB dir called build
[16:18:50] hue then runs from there, it doesn't need much more
[16:19:18] in order to build, it needs a lot of debian deps like libkrb5-dev, libsasl2-dev, libmariadb-dev, etc..
[16:20:07] so instead of adding them on deneb, Filippo and I were wondering if debian/rules could be used to pull everything into a sandbox
[16:20:15] basically adding the build deps to control
[16:20:32] so I ended up with https://gerrit.wikimedia.org/r/c/operations/debs/hue/+/618728
[16:20:50] but, sadly, when I run pdebuild it fails at the first use of the network
[16:20:54] in a strange way, like
[16:20:58] Collecting asn1crypto==0.24.0 (from -r /build/hue-4.7.1/desktop/core/requirements.txt (line 2))
[16:21:01] Retrying (Retry(total=4, connect=None, read=None, redirect=None)) after connection broken by 'ProxyError('Cannot connect to proxy.', NewConnectionError(': Failed to establish a new connection: [Errno -3] Temporary failure in name resolution'))': /simple/asn1crypto/
[16:21:08] (note the temporary DNS resolution failure)
[16:21:32] and I tried to pass https_proxy=etc.. in all possible ways
[16:22:02] the funny part is that when the build drops to the root chroot shell, it works
[16:22:13] I can pull all files, build them, etc..
[16:22:42] Antoine suggested setting pbuilder to use the network, etc..
[16:23:05] but it doesn't feel like the right direction
[16:23:20] so I was wondering if a scap repo could be an alternative
[16:23:31] (it would mean ~500MB to gerrit but..)
[16:23:55] simple, we build hue where we want, then we push to gerrit
[16:24:02] and a systemd unit executes hue from there
[16:24:26] the other alternative is asking SRE if I can add all the packages that I need to deneb
[16:24:45] build beforehand (not via rules) and then simply add the build dir to the repo
[16:36:47] ok so the temporary dns failure seems to change to a better "connection timeout" when I use USENETWORK=yes with pbuilder
[16:37:14] ah no it was a false hope sigh
[16:52:23] Analytics, Operations, SRE-Access-Requests, Patch-For-Review: Requesting access to production shell for Denny Vrandecic - https://phabricator.wikimedia.org/T259388 (Nuria) I think the L3 doc is signed now so we can proceed?
[16:53:34] ok I found a way to make this work
[16:53:59] but it was adding USENETWORK=yes to /etc/pbuilder's config
[16:54:20] passing it to pdebuild doesn't work, nor does adding it to .pbuilderrc
[16:54:23] in my home
[16:54:23] sigh
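For the record, a sketch of the setting elukey lands on above (only the system-wide file worked in this case); pbuilder sources this file as shell:

```
# /etc/pbuilderrc -- per-user ~/.pbuilderrc and passing the variable on the
# pdebuild command line did not take effect in this setup.

# Without this, pbuilder isolates the build from the network, which is what
# produced the "Temporary failure in name resolution" errors even though
# https_proxy was visible inside the chroot.
USENETWORK=yes
```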
[17:24:37] * elukey off!
[17:24:41] have a good weekend folks :)
[17:37:06] Analytics: Check home/HDFS leftovers of nathante - https://phabricator.wikimedia.org/T256356 (Groceryheist) Thank you @akosiaris. For your info, my revisions deadline was just extended from September 9 to October 15th. I'll be able to wrap up this project by the end of September. But I might still be workin...
[17:54:31] ottomata: i need to file a request / sort-of-bug-report for eventstreamconfig. should I make it a sub-task of https://phabricator.wikimedia.org/T242122 ? it's to fix the JSON it returns in the mw API so that streams without configs (e.g. test.event) are {} (good), not [] (bad)
[17:56:54] ottomata: ran into this issue testing out the event platform client on ios and deserializing from the retrieved json.
[18:16:39] Any eventgate changes around aug 3 before 17:00? Looking into an issue where nested data in event.mediawiki_cirrussearch_request is either put in the wrong place by mediawiki, or moved by some processing before it gets to hive
[18:18:00] bearloga: oh interesting
[18:18:04] you can make it a subtask of https://phabricator.wikimedia.org/T205319
[18:18:19] hmmm
[18:18:32] ebernhardson: i did deploy a refine change on aug 3
[18:18:45] oh but not to cirrussearch_request
[18:18:49] OHOH
[18:18:50] yes i
[18:18:51] to that
[18:18:59] sorry was thinking that was EventLogging
[18:19:08] ebernhardson: what nested data
[18:19:09] ?
[18:19:11] what are the field names
[18:19:24] the change I made changes what happens to some special character field names
[18:19:32] ottomata: elasticsearch_requests is an array of structs, elasticsearch_requests[n].query_type now contains the query, and the query has a number in it
[18:19:56] ?
[18:20:08] https://phabricator.wikimedia.org/T255818#6313784
[18:20:08] so
[18:20:32] ottomata: the data is something like {"elasticsearch_requests":[{"query": "foo", "query_type": "hi mom"}]}, hive has {"elasticsearch_requests":[{"query": "0", "query_type": "foo"}]}
[18:21:49] hm
[18:21:58] hm ok, my change should not affect that
[18:22:07] the types are correct everywhere, and there's nothing weird about that field name
[18:22:08] i'll see if i can find a matching source and refined row, might take a minute to find the matching timespan in kafka
[18:22:23] ebernhardson: raw json data is also in /wmf/data/raw
[18:22:29] in sequence files
[18:22:41] ok, that might be easier. We have unique ids in the messages so it's just a filter
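A rough PySpark sketch of the lookup ebernhardson describes, assuming the raw events sit in SequenceFiles of (key, JSON string) pairs as ottomata says; the path, hour, and id value are illustrative placeholders, and `meta.id` is an assumption about the event's unique identifier:

```python
# PySpark sketch -- the path and id below are made-up placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Raw events are SequenceFiles of (key, json_string) records; keep the JSON.
raw = spark.sparkContext.sequenceFile(
    "/wmf/data/raw/event/SOME_cirrussearch_request_topic/hourly/2020/08/03/14"
).values()

# Parse the JSON text and filter down to one event by its unique id.
events = spark.read.json(raw)
events.where("meta.id = 'SOME-UUID'").show(1, truncate=False)
```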
[18:23:15] ebernhardson: FYI i merged the change to refine at 15:29:38
[18:23:24] so it would have taken effect at the next refinement
[18:23:37] my first empty partition is the run at 17:00, i guess the 16:00 run could be partial
[18:23:45] but it looked mostly complete, eyeballing size
[18:23:58] empty partition?
[18:24:13] the refine lags by 2 hours, and this job launches at :20 after
[18:24:14] so
[18:24:18] the first hour affected would have been
[18:24:21] my downstream data is empty, because the query_type field doesn't contain expected values so it's all filtered
[18:24:27] 13 or 14 i think?
[18:24:32] oh
[18:25:03] sounds like i'll need an hour or so to dig through some more and find more concrete data, but have some things to start with :)
[18:26:29] ebernhardson: yeah check in /wmf/data/raw/event for your table and hour data
[18:26:37] to see if you can find a corresponding event and check whether it is in the json or not
[18:27:46] hey a-team, in case y'all find yourself in a place where you're asked about the impact of your work, you should check out this thread: https://wikimedia.slack.com/archives/CTU0ZVA22/p1596818258274000 . We wouldn't have publicly available COVID-19 data without your tools & infrastructure!
[18:29:21] ebernhardson: example of reading sequence file json with spark:
[18:29:21] https://github.com/wikimedia/analytics-refinery-source/blob/master/refinery-job/src/main/scala/org/wikimedia/analytics/refinery/job/refine/RefineTarget.scala#L482
[18:29:32] oh
[18:29:34] no actually
[18:29:39] i moved it...
[18:29:56] sc.sequenceFile() reads it, plain RDD[Int, String] but it works
[18:29:58] https://github.com/wikimedia/analytics-refinery-source/blob/master/refinery-spark/src/main/scala/org/wikimedia/analytics/refinery/spark/sql/HiveExtensions.scala#L784
[18:30:00] yeah
[18:30:07] that'll do too
[18:30:10] if you want a DataFrame ^
[18:30:28] ok found one, sec
[18:33:25] ottomata: https://phabricator.wikimedia.org/P12200
[18:47:01] looking
[18:52:04] hm, kzeta that'd be nice but I don't think our stuff helps with the on-wiki data. I have long-form thoughts on the whole thing though, we should talk
[19:00:35] ok ebernhardson my change is def related
[19:00:45] i'm not quite sure what is wrong
[19:00:56] but i'm using the jsonschema to read in the jsondata now
[19:01:10] whereas before i was using the hive schema merged with the jsonschema
[19:01:36] when I use just the jsonschema, the hits_returned field is the first one in that struct
[19:01:46] hmm
[19:01:51] when I use the merged schema, it is query
[19:02:11] i think the data is being read in properly using the jsonschema
[19:02:15] hmmm
[19:04:49] milimetric: the Washington Post article talks about how many pageviews, articles, editors, edits, and languages there are for COVID-19 Wikipedia content. Shay, Diego, Connie, & Maya used APIs, PAWS, etc to provide that data for https://wikimediafoundation.org/covid19/data/, which was the main source for the Washington Post article.
[19:08:31] hmmm yeah def a bug ebernhardson
[19:08:34] maybe because it's an array of structs
[19:08:44] the refine code seems like it is not merging properly
[19:12:46] milimetric: what I meant was that we wouldn't have publicly available data *about Wikipedia's coverage of* COVID-19 without your tools & infrastructure.
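To make the suspected failure mode concrete, a small PySpark repro (the real refine code is Scala, so this is an illustration, not refinery code): casting an array of structs to a struct type whose fields have the same types in a different order matches fields by position, which is consistent with query_type's value showing up in query above:

```python
# PySpark repro of the suspected bug; field names follow the
# cirrussearch_request discussion, values are made up.
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import (ArrayType, LongType, StringType, StructField,
                               StructType)

spark = SparkSession.builder.master("local[1]").getOrCreate()

# Field order as one schema presents it: query_type first.
source_type = ArrayType(StructType([
    StructField("query_type", StringType()),
    StructField("query", StringType()),
    StructField("hits_returned", LongType()),
]))

# Field order as the other schema has it: query first.
target_type = ArrayType(StructType([
    StructField("query", StringType()),
    StructField("query_type", StringType()),
    StructField("hits_returned", LongType()),
]))

df = spark.createDataFrame(
    [([("full_text", "hi mom", 3)],)],
    StructType([StructField("elasticsearch_requests", source_type)]),
)

# Spark's struct cast matches fields by POSITION, not by name, so the
# query_type value silently lands in `query` and vice versa -- no error.
df.select(F.col("elasticsearch_requests").cast(target_type)).show(truncate=False)
```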
[19:24:56] Analytics, Event-Platform, Product-Infrastructure-Data: Streams with empty configs should be rendered as {} in the JSON returned by StreamConfig API - https://phabricator.wikimedia.org/T259917 (mpopov)
[19:25:36] Analytics, Event-Platform, Product-Infrastructure-Data: Streams with empty configs should be rendered as {} in the JSON returned by StreamConfig API - https://phabricator.wikimedia.org/T259917 (mpopov)
[19:26:04] Analytics, Event-Platform, Product-Infrastructure-Data: Streams with empty configs should be rendered as {} in the JSON returned by StreamConfig API - https://phabricator.wikimedia.org/T259917 (mpopov) p: Triage→High
[19:42:12] Analytics, Event-Platform, Product-Infrastructure-Data: Streams with empty configs should be rendered as {} in the JSON returned by StreamConfig API - https://phabricator.wikimedia.org/T259917 (Ottomata) Thanks, sounds like a classic PHP can't tell the difference between an empty object and an empty...
[20:22:18] kzeta: just read that article (via wikiresearch on twitter, seems a good one)
[20:22:33] kzeta: thanks for sending it along
[20:26:30] Analytics: HiveExtensions.convertToSchema does not properly convert arrays of structs - https://phabricator.wikimedia.org/T259924 (Ottomata)
[20:28:35] (PS1) Ottomata: [WIP] Fix for convertToSchema with array of structs [analytics/refinery/source] - https://gerrit.wikimedia.org/r/619034 (https://phabricator.wikimedia.org/T259924)
[20:29:05] Analytics, Patch-For-Review: HiveExtensions.convertToSchema does not properly convert arrays of structs - https://phabricator.wikimedia.org/T259924 (Ottomata)
[20:30:33] (CR) jerkins-bot: [V: -1] [WIP] Fix for convertToSchema with array of structs [analytics/refinery/source] - https://gerrit.wikimedia.org/r/619034 (https://phabricator.wikimedia.org/T259924) (owner: Ottomata)
[20:30:47] (PS2) Ottomata: [WIP] Fix for convertToSchema with array of structs [analytics/refinery/source] - https://gerrit.wikimedia.org/r/619034 (https://phabricator.wikimedia.org/T259924)
[20:33:16] (CR) jerkins-bot: [V: -1] [WIP] Fix for convertToSchema with array of structs [analytics/refinery/source] - https://gerrit.wikimedia.org/r/619034 (https://phabricator.wikimedia.org/T259924) (owner: Ottomata)
[20:33:22] kzeta: and thanks for the shoutout
[20:35:22] Analytics, Patch-For-Review: HiveExtensions.convertToSchema does not properly convert arrays of structs - https://phabricator.wikimedia.org/T259924 (Ottomata) Attempt at https://gerrit.wikimedia.org/r/c/analytics/refinery/source/+/619034, but it doesn't quite work. That changes the SQL to: ` ARRAY(NAMED...
[20:37:45] Analytics, Patch-For-Review: HiveExtensions.convertToSchema does not properly convert arrays of structs - https://phabricator.wikimedia.org/T259924 (Ottomata) This was noticed by @EBernhardson this week as I merged the fix for {T255818} on Monday. I'm no longer merging (and properly reordering?) the str...
[20:38:21] ottomata: thanks for looking into it!
[20:47:51] Analytics, Patch-For-Review: HiveExtensions.convertToSchema does not properly convert arrays of structs - https://phabricator.wikimedia.org/T259924 (Ottomata) Launching backfill: ` sudo -u analytics /usr/bin/spark2-submit \ --name refine_event_backfill_cirrussearch_request \ --class org.wikimedia.analyt...
[20:58:44] ebernhardson: can you please add me here so i can see the example? https://phabricator.wikimedia.org/P12200
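Spelling out the T259917 problem filed above, per ottomata's comment: PHP represents an empty JSON object and an empty JSON list with the same empty array, so json_encode picks [], and a strictly typed consumer (like the iOS event platform client mentioned earlier) fails to decode the stream's settings. A small Python illustration of the consumer side, with a made-up payload:

```python
# Consumer-side view of T259917; the payloads are illustrative.
import json

expected = json.loads('{"test.event": {}}')  # settings decode as a mapping
actual = json.loads('{"test.event": []}')    # PHP's empty array comes out as a list

assert isinstance(expected["test.event"], dict)
assert isinstance(actual["test.event"], list)
# A client that models per-stream settings as an object/dictionary will
# reject or crash on the list-typed variant.
```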
[20:59:20] Analytics, Patch-For-Review: HiveExtensions.convertToSchema does not properly convert arrays of structs - https://phabricator.wikimedia.org/T259924 (Nuria) >I don't know if this is affecting other data. Is the hits field the one affected? https://schema.wikimedia.org/repositories//primary/jsonschema/med...
[21:03:59] oh, very cool, thanks kzeta :)
[21:21:51] Analytics, Event-Platform, WMF-JobQueue, MW-1.36-notes (1.36.0-wmf.4; 2020-08-11), and 3 others: EventBus extension must not send batches that are too large - https://phabricator.wikimedia.org/T232392 (Clarakosi) Open→Resolved
[21:25:55] Analytics-Radar, Product-Analytics, MW-1.36-notes (1.36.0-wmf.2; 2020-07-28), Patch-For-Review, Platform Team Workboards (Clinic Duty Team): Update mediawiki_user_blocks_change to log partial block parameters - https://phabricator.wikimedia.org/T252455 (Pchelolo) Open→Resolved Verifie...
[21:34:02] nuria: added. it's only under acl since i didn't clean out private data from raw logs
[21:51:47] (PS3) Ottomata: [WIP] Fix for convertToSchema with array of structs [analytics/refinery/source] - https://gerrit.wikimedia.org/r/619034 (https://phabricator.wikimedia.org/T259924)
[21:52:27] Analytics, Patch-For-Review: HiveExtensions.convertToSchema does not properly convert arrays of structs - https://phabricator.wikimedia.org/T259924 (Ottomata) > which I don't yet understand. Why does it think I want to cast d to a BIGINT? Ah, because I had a field in the test already called `d`, doh. Ok...
[21:54:59] Analytics, Patch-For-Review: HiveExtensions.convertToSchema does not properly convert arrays of structs - https://phabricator.wikimedia.org/T259924 (Ottomata) > Is the hits field the one affected? In this case it is the `elasticsearch_requests[].hits_returned` vs `elasticsearch_requests[].query` (and al...
[21:55:01] (CR) jerkins-bot: [V: -1] [WIP] Fix for convertToSchema with array of structs [analytics/refinery/source] - https://gerrit.wikimedia.org/r/619034 (https://phabricator.wikimedia.org/T259924) (owner: Ottomata)
[22:52:03] Analytics, Product-Analytics: NULL-values for useragent column in event.searchsatisfaction - https://phabricator.wikimedia.org/T259944 (nettrom_WMF)
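Following up on the ARRAY(NAMED_STRUCT(...)) SQL that change 619034 generates above: the general shape of aligning array-of-struct fields by name instead of position, sketched in PySpark as an illustration of the idea (not the actual patch); `transform` needs Spark 2.4+, and `df` refers to the repro DataFrame from the earlier example:

```python
# Name-based realignment sketch. Rebuilding each struct with named_struct
# pins every value to its field name, so target field order no longer matters.
from pyspark.sql import functions as F

aligned = df.select(
    F.expr("""
        transform(elasticsearch_requests, r -> named_struct(
            'query',         r.query,
            'query_type',    r.query_type,
            'hits_returned', r.hits_returned))
    """).alias("elasticsearch_requests")
)
aligned.show(truncate=False)
```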