[00:39:15] Analytics, Analytics-Wikistats: Add flagged revision status statistics to Wikistats 2.0 - https://phabricator.wikimedia.org/T177951 (Nuria)
[00:40:02] Analytics, Analytics-Wikistats: Add flagged revision status statistics to Wikistats 2.0 - https://phabricator.wikimedia.org/T177951 (Nuria) Putting on wikistats backlog. In the near term our plans are around reporting active editors for all wikis, which is still a pending metric in wikistats, so i doubt w...
[00:49:29] Analytics: Optimization tips and feedback - https://phabricator.wikimedia.org/T245373 (Nuria) @JAllemandou might know best but i think the partition predicate might not be doing what you think. Partitions are better specified like year, month, day, like (year = 2019 and month=1 and day in (11, 12...31) )...
[06:45:46] (PS20) Fdans: Add vue-i18n integration, English strings [analytics/wikistats2] - https://gerrit.wikimedia.org/r/558702 (https://phabricator.wikimedia.org/T240617)
[06:46:00] (CR) Fdans: Add vue-i18n integration, English strings (2 comments) [analytics/wikistats2] - https://gerrit.wikimedia.org/r/558702 (https://phabricator.wikimedia.org/T240617) (owner: Fdans)
[07:09:23] (PS2) Fdans: Url encode file name before querying the data store [analytics/aqs] - https://gerrit.wikimedia.org/r/570608 (https://phabricator.wikimedia.org/T244373)
[07:09:43] (CR) Fdans: Url encode file name before querying the data store (1 comment) [analytics/aqs] - https://gerrit.wikimedia.org/r/570608 (https://phabricator.wikimedia.org/T244373) (owner: Fdans)
[07:09:59] (CR) jerkins-bot: [V: -1] Url encode file name before querying the data store [analytics/aqs] - https://gerrit.wikimedia.org/r/570608 (https://phabricator.wikimedia.org/T244373) (owner: Fdans)
[07:12:19] Analytics: Optimization tips and feedback - https://phabricator.wikimedia.org/T245373 (JAllemandou) Hi @Iflorez, About the pageview query, @Nuria is right: partition predicates need
to use single partition fields and simple definitions (the same as what you do in comments), otherwise the engine doesn't manag...
[08:36:46] good morning :)
[08:37:12] so I am testing Spark in hadoop test, and sadly encryption doesn't work again
[08:37:32] this time hadoop checknative returns
[08:37:33] openssl: false EVP_CIPHER_CTX_encrypting
[08:37:56] even with the symlink to libcrypto 1.0.2
[08:38:45] without the symlink, it fails saying that no lib is found
[08:39:03] so it is not picking up libcrypto1.1
[08:39:36] I guess that there was a change in hadoop 2.8 that doesn't work with our version of libssl 1.0.2?
[08:41:00] to compare, checknative returns the following when libcrypto1.1.0 is used
[08:41:03] openssl: false EVP_CIPHER_CTX_cleanup
[08:45:35] so the _encrypting feature seems used if OPENSSL_VERSION_NUMBER >= 0x10100000L
[08:50:20] ah ok so checknative uses apache crypto
[08:53:34] should be related to https://github.com/apache/bigtop/pull/306/commits/256a196d50ac5e346cc3f6bd12e2a179b5fb3f37
[08:54:56] but I'd expect checknative to work with 1.1 then
[08:55:10] instead, same old false EVP_CIPHER_CTX_cleanup
[08:59:32] tried to force the libcrypto symlink to 1.1 and tested spark, no joy
[09:05:21] https://issues.apache.org/jira/browse/HADOOP-14597
[09:05:30] "Good point about EVP_CIPHER_CTX_encrypting. The function doesn't exist in OpenSSL-1.0.2. I've put an ugly ifdef for that too now."
[09:08:27] hi team - caring for Naé this morning, will not be online regularly
[09:08:39] joal: ack <3
[09:13:42] https://github.com/apache/hadoop/blob/branch-2.8.5/hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/crypto/OpensslCipher.c doesn't list any EVP_CIPHER_CTX_encrypting.
[09:13:54] so it must be the patch that I referred to above
[09:14:32] EVP_CIPHER_CTX_encrypting though seems pulled in only if OPENSSL_VERSION is >= 0x10100000L, that should be 1.1
[09:15:30] so maybe having both 1.1 and 1.0.2 confuses hadoop's libs?
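The partition-predicate advice in T245373 above (spell partitions out as plain year/month/day comparisons so the engine can prune them, rather than computed expressions) can be sketched as a small helper that generates such a predicate for a date range. The helper and its field names are illustrative only, not part of any refinery tooling:

```python
from datetime import date, timedelta
from itertools import groupby

def partition_predicate(start, end):
    """Build an explicit Hive-style partition predicate for a date range,
    using only plain comparisons on the partition fields (year, month, day),
    grouped by (year, month) so each clause stays prunable."""
    days = []
    d = start
    while d <= end:
        days.append(d)
        d += timedelta(days=1)
    clauses = []
    for (y, m), group in groupby(days, key=lambda x: (x.year, x.month)):
        day_list = ", ".join(str(g.day) for g in group)
        clauses.append("(year = {} AND month = {} AND day IN ({}))".format(y, m, day_list))
    return " OR ".join(clauses)

# For example, to cover 2019-01-11 .. 2019-01-13:
print(partition_predicate(date(2019, 1, 11), date(2019, 1, 13)))
# -> (year = 2019 AND month = 1 AND day IN (11, 12, 13))
```

The output string is meant to be pasted into a WHERE clause as-is, so the partition columns appear bare and the pruner can match them against partition metadata.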
[09:18:12] elukey: I'd be confused myself --^ :)
[09:18:27] does Hadoop dlopen() at runtime or use regular linking?
[09:18:53] if you run ldd on the Hadoop binary, does it link against 1.0.2 or 1.1?
[09:18:59] moritzm: trying to find that
[09:19:04] which host is that, something in the test cluster?
[09:19:11] yep, analytics1035
[09:19:15] looking
[09:19:46] I was checking deps and there is no mention, afaics, of libssl
[09:19:48] which package name is that?
[09:19:56] or which binary?
[09:20:29] I am trying to work out what it is, I think it is one of the hadoop core libs
[09:20:41] the c file that is used, IIUC, is https://github.com/apache/hadoop/blob/branch-2.8.5/hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/crypto/OpensslCipher.c
[09:21:04] but I don't know where it is located yet
[09:22:29] it is very weird though, since forcing the use of libcrypto1.1.0 leads to EVP_CIPHER_CTX_cleanup, which got dropped in 1.1, and in the patch I don't see an ifdef guard for it
[09:23:05] (the patch is old)
[09:23:51] even in hadoop master though EVP_CIPHER_CTX_cleanup is not guarded
[09:25:57] http://mail-archives.us.apache.org/mod_mbox/hadoop-user/201910.mbox/%3cCADiq6=weDFxHTL_7eGwDNnxVCza39y2QYQTSggfLn7mXhMLOdg@mail.gmail.com%3e
[09:26:21] https://issues.apache.org/jira/browse/HADOOP-16647
[09:26:25] what a disaster
[09:26:26] sigh
[09:27:13] buster has 1.1.0, not 1.1.1
[09:27:54] there should also be no API change between 1.1.0 and 1.1.1
[09:28:09] only some defaults change in terms of what ciphers are offered by default
[09:28:48] no idea
[10:03:40] Analytics, Wikidata, Wikidata-Campsite (Wikidata-Campsite-Iteration-∞): Add time limits to scripts executed on stat1007 as part of analytics/wmde/scripts - https://phabricator.wikimedia.org/T243894 (Ladsgroup) So this is for weekly runs: ` ladsgroup@stat1007:/srv/analytics-wmde/graphite/log$ tail wee...
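moritzm's ldd question above can be answered mechanically; a toy sketch that pulls the libcrypto soname out of `ldd` output. The sample output below is fabricated for illustration, not taken from analytics1035, and note the caveat from the same exchange: if Hadoop's native code dlopen()s libcrypto at runtime instead of linking it, ldd won't show it at all.

```python
import re

def libcrypto_version(ldd_output):
    """Return the libcrypto soname version a binary is linked against,
    given the text output of `ldd <binary>`, or None if libcrypto is absent."""
    m = re.search(r"libcrypto\.so\.([0-9][\w.]*)\s*=>", ldd_output)
    return m.group(1) if m else None

# Fabricated ldd output; on a real host you would run something like:
#   ldd /usr/lib/hadoop/lib/native/libhadoop.so
sample = """\
\tlinux-vdso.so.1 (0x00007ffd3b5fe000)
\tlibcrypto.so.1.0.2 => /usr/lib/x86_64-linux-gnu/libcrypto.so.1.0.2 (0x00007f00aa000000)
\tlibc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f00a9c00000)
"""
print(libcrypto_version(sample))  # -> 1.0.2
```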
[10:07:38] moritzm: sorry I needed to take care of other things - http://mail-archives.us.apache.org/mod_mbox/hadoop-user/201910.mbox/%3cCALh-6sAug29Ua2iX+aZnR_TjzjQpwVBcVV_macQDxNYYgoOLzA@mail.gmail.com%3e seems to list the current problem that I have, but it says 1.1.1, so in theory 1.1.0 should work (our version). I see that EVP_CIPHER_CTX_cleanup is deprecated for 1.1.0, this is why I was asking
[10:07:44] before if maybe the hadoop patch is not complete or correct (or maybe EVP_CIPHER_CTX_cleanup was deprecated in a version afterwards?)
[10:10:06] I can open an issue with BigTop in case, and ask for more info from their isde
[10:10:09] *side
[10:11:12] Analytics, Wikidata, Wikidata-Campsite (Wikidata-Campsite-Iteration-∞): Add time limits to scripts executed on stat1007 as part of analytics/wmde/scripts - https://phabricator.wikimedia.org/T243894 (Ladsgroup) On daily ones: ` 2020-02-11 03:00:01 wikidata-sparql-instanceof Script Started! 2020-02-11...
[10:15:16] I had a look at the changelog; EVP_CIPHER_CTX_cleanup was removed in 1.1.0
[10:17:42] thanks, didn't find it
[10:18:02] so even the hadoop patch seems a little bit broken, they didn't guard the usage of it
[10:18:34] the patch from bigtop-2932 is a compile-time change; are we building the current bigtop packages from source? otherwise it appears as if the packages distributed by them were built against OpenSSL 1.0.2?
[10:21:59] moritzm: nono, the packages are synced from the bigtop repos
[10:23:37] moritzm: EVP_CIPHER_CTX_encrypting seems available, judging from the ifdef, with 1.1.0+ no?
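The version gate being discussed can be made concrete. OPENSSL_VERSION_NUMBER packs the version as 0xMNNFFPPS (one nibble of major, a byte of minor, a byte of fix, then patch and status), so the ifdef threshold 0x10100000L from HADOOP-14597 is exactly "1.1.0 or later". A small sketch; the constants below are standard encodings, but the helper itself is just an illustration of the check:

```python
def decode_openssl_version(n):
    """Decode an OPENSSL_VERSION_NUMBER (layout 0xMNNFFPPS) into a
    "major.minor.fix" string."""
    major = (n >> 28) & 0xF
    minor = (n >> 20) & 0xFF
    fix = (n >> 12) & 0xFF
    return "{}.{}.{}".format(major, minor, fix)

def has_ctx_encrypting(n):
    """Mirror the guard discussed above: EVP_CIPHER_CTX_encrypting only
    exists when OPENSSL_VERSION_NUMBER >= 0x10100000, i.e. OpenSSL 1.1.0+."""
    return n >= 0x10100000

print(decode_openssl_version(0x1010000F))  # 1.1.0 release -> "1.1.0"
print(has_ctx_encrypting(0x1010000F))      # 1.1.0  -> True
print(has_ctx_encrypting(0x1000207F))      # 1.0.2g -> False
```

This matches the log: a 1.0.2 build fails on EVP_CIPHER_CTX_encrypting (below the threshold), while EVP_CIPHER_CTX_cleanup goes the other way, since it was removed in 1.1.0 and the Hadoop source leaves it unguarded.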
Maybe they are building on Debian 9 using libssl-dev, which creates the symlink to libcrypto.so.1.1.0
[10:24:08] so I guess that the overall build would be against 1.1.0
[10:24:28] (not sure if I am saying something silly or not)
[10:27:16] stretch has both; if they link against libssl1.0-dev, OpenSSL 1.0.2 is used, and libssl-dev uses OpenSSL 1.1.0
[10:27:30] so, yes, you're right
[10:27:51] do they publish source packages for the debs? or some other insight into how the debs are generated?
[10:29:40] there are some scripts in their github repo about building etc., not sure if they publish the src packages in their repos (never checked). I am opening a Jira with BigTop to ask for some guidance, so we'll avoid going crazy :)
[10:32:40] sounds good
[10:35:55] created https://issues.apache.org/jira/browse/BIGTOP-3308
[10:36:22] (PS3) Fdans: Url encode file name before querying the data store [analytics/aqs] - https://gerrit.wikimedia.org/r/570608 (https://phabricator.wikimedia.org/T244373)
[10:45:13] and also https://issues.apache.org/jira/browse/BIGTOP-3309 to ask for Debian Buster
[10:50:34] Analytics, Analytics-Kanban, Patch-For-Review: Enable encryption in Spark 2.4 by default - https://phabricator.wikimedia.org/T240934 (elukey) As FYI, filed https://issues.apache.org/jira/browse/BIGTOP-3308 for BigTop
[10:51:36] Analytics, Analytics-Cluster, User-Elukey: Upgrade the Hadoop test cluster to BigTop - https://phabricator.wikimedia.org/T244499 (elukey) Filed https://issues.apache.org/jira/browse/BIGTOP-3308 - openssl is not correctly picked up, Spark RPC encryption doesn't work. So far it is the biggest issue see...
[11:00:40] Analytics, Analytics-Kanban: links to stats.wikimedia.org/v2/[wiki] have an extra slash - https://phabricator.wikimedia.org/T245414 (fdans)
[11:09:05] Analytics, Analytics-Cluster, User-Elukey: Upgrade the Hadoop test cluster to BigTop - https://phabricator.wikimedia.org/T244499 (MoritzMuehlenhoff) Looking at "hadoop checknative": it opens /usr/lib/x86_64-linux-gnu/libcrypto.so, which is a symlink to /usr/lib/x86_64-linux-gnu/libcrypto.so.1.0.2. But E...
[11:12:14] moritzm: thanks! ==^
[11:27:24] Analytics, Analytics-Kanban: links to stats.wikimedia.org/v2/[wiki] have an extra slash - https://phabricator.wikimedia.org/T245414 (elukey) I think that the main problem is `RedirectMatch permanent ^/v2(.*) /$1`, since the .* capturing group also grabs the trailing `/`. Maybe something like: ` Redirec...
[11:28:04] fdans: you there?
[11:28:13] (hola)
[11:33:53] need to go now, will be back in ~2h
[12:39:48] Analytics, Operations, decommission, serviceops: decommission kraz.wikimedia.org - https://phabricator.wikimedia.org/T245279 (jbond) p: Triage→Medium
[12:39:48] oh elukey sorry, I was out for lunch, ping me when you're back :)
[13:50:36] fdans: I am back!
[13:51:06] elukey: helloooo
[13:51:29] fdans: ok if I test a new httpd config on thorium?
[13:51:49] elukey: please!
[13:54:49] still not working
[14:04:03] it is really weird, I am trying various configs but nothing works
[14:04:07] also tried to purge the URL
[14:17:10] fdans: as a test I left only Redirect /v2/ /v2
[14:17:22] still not working, very strange
[14:17:44] are there any urls different than /v2/ excluding the #blabla part?
[14:17:59] elukey: hmmm
[14:18:27] elukey: I'm not sure I understand the question, sorry
[14:19:15] fdans: IIRC after the # it is something handled by the js, no?
[14:19:20] I might be wrong
[14:20:33] fdans: all right, can you check now with incognito?
[14:20:34] elukey: funny that stats.wikimedia.org/v2 redirects to stats.wikimedia.org//
[14:20:56] elukey: oh it works now
[14:21:00] (i think)
[14:21:09] fdans: let's try to check
[14:21:25] elukey: yep, links from wikis are good
[14:22:12] fdans: if you have a sec for bc I can explain
[14:22:14] elukey: all cases seem to work for me
[14:22:51] elukey: omw
[14:38:00] fdans: I am wondering one thing - is it possible that in the stats.w.o code the trailing slash is added somewhere?
[14:43:22] fdans: because I don't find the 307 in the httpd logs
[14:44:38] elukey: hmmm, I know relatively little about the routing in wikistats
[14:44:49] elukey: mforns_ might know a bit more
[14:45:07] I'm not sure if the router does any wrangling with the trailing slash
[14:46:56] I am reasonably sure it does, otherwise I can't explain this
[14:47:26] fdans: I mean, is there any mention of a /v2/ or similar in the routing code?
[14:55:32] elukey: let me take a look (sorry for not responding timely, my notifications are off)
[14:58:34] fdans: sorry, I need to recheck the logs, I found some redirects for /v2 ending up in 301
[14:58:38] that is what curl returns
[14:58:42] I misread 307
[14:59:24] mmm ok I am going a little bit crazy
[14:59:39] so google dev tools shows me the /v2 -> /v2/ 307
[15:00:16] ah ok but it is not for /v2
[15:00:17] okok
[15:01:50] added more info in the task
[15:01:51] Analytics, Analytics-Kanban: links to stats.wikimedia.org/v2/[wiki] have an extra slash - https://phabricator.wikimedia.org/T245414 (elukey) After some tries on thorium, the following rule alone seems to work: ` Redirect /v2/ / ` Interesting curl traces: * /v2/ seems to be redirected following the new...
[15:06:07] heya team
[15:08:21] elukey, was reading scrollback about the wikistats trailing slash
[15:08:28] ahhhhhhhhh https://httpd.apache.org/docs/2.4/mod/mod_dir.html#directoryslash
[15:08:32] TIL
[15:08:39] mforns, fdans I think I found why
[15:08:42] it is httpd doing it
[15:08:48] elukey, the wikistats router only modifies the url after the #
[15:08:53] oh, reading
[15:09:12] "Typically if a user requests a resource without a trailing slash, which points to a directory, mod_dir redirects him to the same resource, but with trailing slash for some good reasons"
[15:09:50] so v2 is a directory, and httpd redirects /v2 to /v2/ by itself
[15:09:59] then Redirect /v2/ / works as intended
[15:10:35] aha
[15:13:11] mforns, fdans can you confirm that all the urls are behaving fine? If so I'll commit the change to puppet
[15:14:36] elukey, in prod?
[15:14:47] elukey: all of these looking fine:
[15:14:49] https://stats.wikimedia.org/#/all-projects
[15:14:49] https://stats.wikimedia.org/
[15:14:49] https://stats.wikimedia.org/v2
[15:14:49] https://stats.wikimedia.org/v2/
[15:14:49] https://stats.wikimedia.org/v2/#/all-projects
[15:15:03] mforns: yes yes I temporarily fixed the httpd config on thorium
[15:15:05] but not in puppet
[15:15:08] ok
[15:15:10] looking
[15:15:35] mforns: beware of cache! you might want to do a full reload/use incognito
[15:15:43] oh ok
[15:18:00] I see the permanent link is broken for some metrics, but I don't think this is related to the changes at all
[15:19:12] elukey, fdans, I don't find any wrong link! Works for me!
[15:19:20] ack!
[15:19:25] what permanent link is broken?
[15:19:35] Analytics, Analytics-Kanban: links to stats.wikimedia.org/v2/[wiki] have an extra slash - https://phabricator.wikimedia.org/T245414 (elukey) This seems to be the reason: https://httpd.apache.org/docs/2.4/mod/mod_dir.html#directoryslash > Typically if a user requests a resource without a trailing slash,...
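The behaviour pieced together above (mod_dir's DirectorySlash first redirects /v2 to /v2/, after which the plain `Redirect /v2/ /` strips the prefix, whereas the old `RedirectMatch permanent ^/v2(.*) /$1` also captured the trailing slash and produced //) can be modelled outside httpd. This is a toy simulation of the rules as described in the log, not Apache's actual matching code, and real fragments (#/...) never reach the server anyway:

```python
import re

def old_redirectmatch(path):
    """Old rule: RedirectMatch permanent ^/v2(.*) /$1
    The .* group also captures the trailing slash, so /v2/ becomes //."""
    m = re.match(r"^/v2(.*)$", path)
    return "/" + m.group(1) if m else path

def new_rules(path):
    """New behaviour: DirectorySlash redirects the bare directory /v2 to
    /v2/, then `Redirect /v2/ /` maps the /v2/ prefix to /."""
    if path == "/v2":                 # mod_dir: /v2 is a directory
        path = "/v2/"
    if path.startswith("/v2/"):       # Redirect /v2/ /
        path = "/" + path[len("/v2/"):]
    return path

print(old_redirectmatch("/v2/"))          # -> //  (the reported extra slash)
print(new_rules("/v2"))                   # -> /
print(new_rules("/v2/"))                  # -> /
print(new_rules("/v2/en.wikipedia.org"))  # -> /en.wikipedia.org
```

In real traffic the two hops are separate 301 responses (DirectorySlash, then the Redirect), which is why the earlier single-rule attempts on thorium looked like they were not working when only the bare /v2 was tested.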
[15:20:52] Analytics, Analytics-Wikistats: [Wikistats] The permanent link is broken - https://phabricator.wikimedia.org/T245445 (mforns)
[15:21:13] elukey, the one generated by the code, see the "Permanent link" button near the chart type dropdown
[15:21:20] ah okok
[15:30:30] mforns, fdans https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/572690/
[15:35:01] all right, all done!
[16:04:35] Analytics, Analytics-Kanban, Patch-For-Review: links to stats.wikimedia.org/v2/[wiki] have an extra slash - https://phabricator.wikimedia.org/T245414 (elukey)
[16:07:39] Analytics, Wikidata, Wikidata-Campsite (Wikidata-Campsite-Iteration-∞): Add time limits to scripts executed on stat1007 as part of analytics/wmde/scripts - https://phabricator.wikimedia.org/T243894 (Rosalie_WMDE) https://github.com/wikimedia/analytics-wmde-scripts/pull/1
[16:12:03] Analytics, Analytics-Kanban, ArticlePlaceholder, Wikidata, and 4 others: ArticlePlaceholder dashboard stopped tracking page views - https://phabricator.wikimedia.org/T236895 (Ladsgroup) a: Ladsgroup
[16:18:58] Analytics: Get pytorch running on AMD GPU - https://phabricator.wikimedia.org/T245449 (Nuria)
[16:20:19] (PS1) Rosalie Perside (WMDE): Add Time limit to scripts executed on stat1007 [analytics/wmde/scripts] - https://gerrit.wikimedia.org/r/572705 (https://phabricator.wikimedia.org/T243894)
[16:30:03] Analytics: Get pytorch running on AMD GPU - https://phabricator.wikimedia.org/T245449 (elukey) https://github.com/pytorch/pytorch/issues/10657 is interesting, even if it doesn't seem really great in terms of solutions :)
[16:34:09] Analytics: Get pytorch running on AMD GPU - https://phabricator.wikimedia.org/T245449 (fdans) p: Triage→Medium
[16:34:40] Analytics, Analytics-Wikistats: [Wikistats] The permanent link is broken - https://phabricator.wikimedia.org/T245445 (fdans) p: Triage→High
[16:36:44] Analytics, Analytics-Kanban: links to
stats.wikimedia.org/v2/[wiki] have an extra slash - https://phabricator.wikimedia.org/T245414 (fdans) p: Triage→High
[16:36:54] Analytics, Analytics-Kanban: links to stats.wikimedia.org/v2/[wiki] have an extra slash - https://phabricator.wikimedia.org/T245414 (fdans) Open→Resolved
[16:41:02] Analytics: Setup refinement/sanitization on netflow data similar to how it happens for other event-based data - https://phabricator.wikimedia.org/T245287 (fdans) p: Triage→High
[16:44:20] Analytics: Add SWAP profile to stat1005 - https://phabricator.wikimedia.org/T245179 (fdans) p: Triage→High
[16:47:18] Analytics, Analytics-Kanban, Patch-For-Review: Remove tranquility and banner-impressions streaming from refinery-job - https://phabricator.wikimedia.org/T245151 (fdans) p: Triage→High
[16:53:05] Analytics: Delete raw events after some time even if not needed - https://phabricator.wikimedia.org/T245126 (fdans) p: Triage→High
[16:54:09] Analytics: Create a Kerberos access for sguebo - https://phabricator.wikimedia.org/T244913 (fdans) p: Triage→High
[16:54:49] Analytics: Create a Kerberos access for sguebo - https://phabricator.wikimedia.org/T244913 (elukey) ` elukey@krb1001:~$ sudo manage_principals.py create sguebo --email_address=sguebo@wikimedia.org Principal successfully created. Make sure to update data.yaml in Puppet. Successfully sent email to sguebo@wikim...
[16:55:11] (PS1) Ladsgroup: Add sparkJobJar as a parameter in WikidataArticlePlaceholderMetrics [analytics/refinery/source] - https://gerrit.wikimedia.org/r/572711 (https://phabricator.wikimedia.org/T236895)
[16:55:32] Analytics, Fundraising-Backlog, MediaWiki-extensions-CentralNotice, Patch-For-Review: Refining is failing to refine centralnoticeimpression events - https://phabricator.wikimedia.org/T244771 (Nuria) Let's: 1) update table in hive to latest schema 2) refine (if possible) the last 90 days of data
[16:57:37] (PS1) Ladsgroup: Pass spark_job_jar as an argument in ArticlePlaceholder oozie job [analytics/refinery] - https://gerrit.wikimedia.org/r/572713 (https://phabricator.wikimedia.org/T236895)
[17:00:42] (CR) Ladsgroup: "Hey, Can you take a look if this and I808bc889 make sense? I couldn't find a place that argument turns to a param in the spark job and I a" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/572711 (https://phabricator.wikimedia.org/T236895) (owner: Ladsgroup)
[17:02:45] Analytics, Analytics-Kanban, LDAP-Access-Requests, Operations: LDAP access to the wmf group for CherRaye Glenn (superset, turnilo, hue) - https://phabricator.wikimedia.org/T244410 (Nuria)
[17:05:38] Analytics, Analytics-Kanban: HDFS space usage steadily increased over the past three months - https://phabricator.wikimedia.org/T244889 (Nuria) Open→Resolved
[17:10:51] Analytics, Cite, Reference Previews, CPT Initiatives (Modern Event Platform (TEC2)): Remove or simplify tracking metrics - https://phabricator.wikimedia.org/T242127 (Nuria) >The correct approach would be to simply emit StatsD metrics which can be directly consumed from Grafana to give all the sam...
[17:12:51] Analytics, Analytics-SWAP, GLOW: Viewing Santali and Javanese characters on SWAP via Chrome only displays Tofu signs - https://phabricator.wikimedia.org/T242490 (Nuria) The image you pasted is a wikipedia page, do you have a repro scenario for SWAP?
[17:13:34] Analytics: [Wikistats2] Normalize pageviews per country by population - https://phabricator.wikimedia.org/T242621 (Nuria) p: Triage→Medium
[17:17:01] Analytics: Revise wiki scoop list from labs once a quarter - https://phabricator.wikimedia.org/T239136 (Nuria) a: fdans
[17:17:10] Analytics: Revise wiki scoop list from labs once a quarter - https://phabricator.wikimedia.org/T239136 (Nuria) @fdans to revise for q3.
[17:20:24] Analytics, Product-Analytics: Check home leftovers of dfoy - https://phabricator.wikimedia.org/T239571 (Nuria) a: Milimetric→fdans
[17:21:59] Analytics, Product-Analytics: Check home leftovers of dfoy - https://phabricator.wikimedia.org/T239571 (Nuria) Assigning to @fdans for ops week, let's just remove the data from hdfs
[17:23:43] Analytics, Analytics-Kanban: Webrequest text fails to refine regularly - https://phabricator.wikimedia.org/T240815 (Nuria) Open→Resolved
[17:24:43] Analytics: Issues querying table in Hive - https://phabricator.wikimedia.org/T244484 (Nuria) Let's not use this data, closing ticket. The data is in the events database, in the centralbannerhistory table, and has been so for a few years.
[17:24:56] Analytics: Issues querying table in Hive - https://phabricator.wikimedia.org/T244484 (Nuria) Open→Declined
[17:25:24] Analytics: Add druid load job for data quality table - https://phabricator.wikimedia.org/T244379 (Nuria) p: Triage→High
[17:26:33] Analytics, Analytics-Kanban, Product-Analytics: Superset aggregation across edit tags uses all tags - https://phabricator.wikimedia.org/T243552 (Nuria) a: fdans
[17:46:36] Analytics, Analytics-Cluster, User-Elukey: Upgrade the Hadoop test cluster to BigTop - https://phabricator.wikimedia.org/T244499 (elukey) To keep track of everything, on the coordinator I had to run the following to upgrade the Metastore's schema - `/usr/lib/hive/bin/schematool -dbType mysql -upgrade...
[17:56:29] !log restart hadoop daemons on analytics1042 to pick up new openjdk updates
[17:56:30] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[17:58:03] !log restart cassandra on aqs1004 to pick up new openjdk updates
[17:58:04] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[17:58:28] Analytics, Analytics-Kanban: Fix webrequest host normalization - https://phabricator.wikimedia.org/T245453 (JAllemandou)
[17:58:38] Analytics, Analytics-Kanban: Fix webrequest host normalization - https://phabricator.wikimedia.org/T245453 (JAllemandou) a: JAllemandou
[17:59:29] !log restart druid daemons on druid1003 to pick up new openjdk updates
[17:59:30] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[18:00:07] (PS1) Joal: Fix webrequest host normalization [analytics/refinery/source] - https://gerrit.wikimedia.org/r/572726 (https://phabricator.wikimedia.org/T245453)
[18:00:19] nuria: if you have a minute please --^
[18:12:41] PROBLEM - aqs endpoints health on aqs1006 is CRITICAL: /analytics.wikimedia.org/v1/legacy/pagecounts/aggregate/{project}/{access-site}/{granularity}/{start}/{end} (Get pagecounts)
timed out before a response was received: /analytics.wikimedia.org/v1/unique-devices/{project}/{access-site}/{granularity}/{start}/{end} (Get unique devices) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring
[18:12:53] Ouch
[18:13:47] yeah aqs1004-a didn't like the restart, I am re-doing it
[18:14:34] ack elukey
[18:16:23] (CR) Joal: "Hi Amir - I'm sorry, I should have been more careful on my previous review. Adding a jar is not the best solution here. Instead you should " [analytics/refinery/source] - https://gerrit.wikimedia.org/r/572711 (https://phabricator.wikimedia.org/T236895) (owner: Ladsgroup)
[18:16:33] RECOVERY - aqs endpoints health on aqs1006 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/aqs
[18:22:29] ok, all good from the aqs side
[18:22:46] !log restart cassandra on aqs1004 to pick up new openjdk updates
[18:22:47] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[18:25:39] !log restart kafka on kafka-jumbo1001 to pick up new openjdk updates
[18:25:40] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[18:29:36] !log reboot turnilo and superset's hosts for kernel upgrades
[18:29:37] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[19:14:47] * elukey off!
[19:45:45] (CR) Ladsgroup: "> Patch Set 1:" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/572711 (https://phabricator.wikimedia.org/T236895) (owner: Ladsgroup)
[19:48:33] (PS1) Ladsgroup: Stop using the jar file in the WikidataArticlePlaceholderMetrics [analytics/refinery/source] - https://gerrit.wikimedia.org/r/572734 (https://phabricator.wikimedia.org/T236895)
[19:54:47] (CR) Joal: "Let's triple check with @nuria that the solution I suggest works for her."
[analytics/refinery/source] - https://gerrit.wikimedia.org/r/572711 (https://phabricator.wikimedia.org/T236895) (owner: Ladsgroup)
[20:16:03] Analytics, Analytics-Kanban, Patch-For-Review: Fix webrequest host normalization - https://phabricator.wikimedia.org/T245453 (Nuria) Wow. How did we notice this was happening?
[20:17:56] (CR) Nuria: [C: +1] Fix webrequest host normalization (1 comment) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/572726 (https://phabricator.wikimedia.org/T245453) (owner: Joal)
[20:18:04] Analytics, Analytics-Kanban, Patch-For-Review: Fix webrequest host normalization - https://phabricator.wikimedia.org/T245453 (JAllemandou) I tried to validate the approach of using `normalized_host.project` and `normalized_host.project_family` instead of `pageview_info[project]` for @Ladsgroup...
[20:19:07] (CR) Joal: Fix webrequest host normalization (1 comment) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/572726 (https://phabricator.wikimedia.org/T245453) (owner: Joal)
[20:46:13] (CR) Nuria: Fix webrequest host normalization (1 comment) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/572726 (https://phabricator.wikimedia.org/T245453) (owner: Joal)
[20:53:12] (CR) Joal: Fix webrequest host normalization (1 comment) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/572726 (https://phabricator.wikimedia.org/T245453) (owner: Joal)
[21:31:05] (PS1) Joal: [WIP] Add wikidata item_page_link job [analytics/refinery/source] - https://gerrit.wikimedia.org/r/572746
[21:46:38] (CR) Nuria: Stop using the jar file in the WikidataArticlePlaceholderMetrics (1 comment) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/572734 (https://phabricator.wikimedia.org/T236895) (owner: Ladsgroup)
[21:47:54] (CR) Nuria: [C: +2] Add spark code for wikidata json dumps parsing [analytics/refinery/source] -
https://gerrit.wikimedia.org/r/346726 (https://phabricator.wikimedia.org/T209655) (owner: Joal)
[21:51:02] (CR) Joal: Stop using the jar file in the WikidataArticlePlaceholderMetrics (2 comments) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/572734 (https://phabricator.wikimedia.org/T236895) (owner: Ladsgroup)
[21:52:38] (Merged) jenkins-bot: Add spark code for wikidata json dumps parsing [analytics/refinery/source] - https://gerrit.wikimedia.org/r/346726 (https://phabricator.wikimedia.org/T209655) (owner: Joal)
[22:03:41] (PS2) Joal: Add wikidata item_page_link spark job [analytics/refinery/source] - https://gerrit.wikimedia.org/r/572746
[22:03:59] (CR) Joal: "Tested on cluster." [analytics/refinery/source] - https://gerrit.wikimedia.org/r/572746 (owner: Joal)
[22:05:52] Gone for tonight - see you tomorrow, team
[22:08:38] Analytics, Pageviews-API: Pageviews missing for titles with emojis since April 23, 2019 - https://phabricator.wikimedia.org/T245468 (MusikAnimal)
[22:18:40] Analytics: Optimization tips and feedback - https://phabricator.wikimedia.org/T245373 (Iflorez) Thank you for the feedback! I've updated the date handling in the pageviews query and added event_entity, revision_is_identity_reverted, and revision_is_deleted_by_page_deletion to the fields used in the revisio...
[23:48:09] Analytics, Pageviews-API: Pageviews missing for titles with emojis since April 23, 2019 - https://phabricator.wikimedia.org/T245468 (Nuria) Are these accepted Mediawiki page titles? cc @awight
[23:50:46] Analytics, Analytics-Kanban: Make history and current wikitext available in hadoop - https://phabricator.wikimedia.org/T238858 (Nuria) Do we need to update the docs? https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Content/XMLDumps/Mediawiki_wikitext_history