[00:27:06] is there any self-serve way to get data from the MariaDB replicas to the Data Lake?
[01:29:40] Analytics: Metrics request on portal namespace usage - https://phabricator.wikimedia.org/T205681 (AfroThundr3007730)
[01:42:54] Analytics: Metrics request on portal namespace usage - https://phabricator.wikimedia.org/T205681 (AfroThundr3007730)
[02:25:15] Analytics, EventBus, MediaWiki-General-or-Unknown, Services (doing), Wikimedia-production-error: Some requests fail with UIDGenerator error "Process clock is outdated or drifted" - https://phabricator.wikimedia.org/T94522 (Krinkle) (Still seen on 1.32.0-wmf.23)
[03:11:24] Is there a good way to output pandas dataframes as mediawiki tables?
[05:04:58] RECOVERY - Throughput of EventLogging events on einsteinium is OK: (C)8000 ge (W)1500 ge 1486 https://grafana.wikimedia.org/dashboard/db/eventlogging?panelId=6&fullscreen&orgId=1
[05:21:14] Analytics, User-Banyek: dbstore1002 /srv filling up - https://phabricator.wikimedia.org/T205544 (Marostegui) Coordinate with me, as I am doing some maintenance on change_tag table on dbstore1002 and we should probably avoid running several heavy alters at the same time on this host.
[08:33:18] So much the better
[08:33:32] Mwarf - Wrong window
[09:40:12] Analytics-Kanban, Beta-Cluster-Infrastructure, Operations, Patch-For-Review, and 2 others: Prometheus resources in deployment-prep to create grafana graphs of EventLogging - https://phabricator.wikimedia.org/T204088 (MoritzMuehlenhoff) p:Triage>Normal
[09:41:38] hello joal, elukey
[09:42:13] has the aqs side of https://github.com/wikimedia/restbase/pull/1070 been deployed already?
[10:19:49] mobrovac: I think they're both out atm, but as far as i remember from yesterday, this remained unmerged when yesterday's restbase deploy happened
[10:19:50] https://gerrit.wikimedia.org/r/#/c/mediawiki/services/restbase/deploy/+/463305/
[10:55:30] (PS6) Fdans: Add top editors metric to Wikistats [analytics/wikistats2] - https://gerrit.wikimedia.org/r/461635 (https://phabricator.wikimedia.org/T189882)
[10:55:36] (CR) jerkins-bot: [V: -1] Add top editors metric to Wikistats [analytics/wikistats2] - https://gerrit.wikimedia.org/r/461635 (https://phabricator.wikimedia.org/T189882) (owner: Fdans)
[11:01:59] (PS7) Fdans: Add top editors metric to Wikistats [analytics/wikistats2] - https://gerrit.wikimedia.org/r/461635 (https://phabricator.wikimedia.org/T189882)
[11:05:34] oh btw joal, this doesn't really test the endpoints from the UI, but if you can pass me the json body of a request to aqs on both new metrics, I can serve those locally and at least make sure that the data makes sense with the configuration in the UI
[11:51:29] fdans: right, that PR hasn't been deployed on the RB side yet, but i want to know if the matching aqs changes are already live in production. if they are, we should deploy rb too, if not, it'd be better to wait next week to do so
[11:54:01] mobrovac: 99.9% sure the aqs side is deployed, but I'm not the owner of the change, so I'd wait for joal to be back :)
[11:54:35] and wait we shall
[11:54:38] thnx fdans
[11:54:40] mobrovac: is there a time by which today you wouldn't feel comfortable deploying?
[11:55:18] i'd be ok doing so in the next hour or so, afterwards it will be too late
[12:23:27] mobrovac: https://usercontent.irccloud-cdn.com/file/X89S8ZNg/Screen%20Shot%202018-09-28%20at%201.23.04%20PM.png
[12:23:57] this is from yesterday, so the change should def be deployed on aqs, it was just the restbase part that we were missing
[12:24:16] kk fdans thnx for the info
[12:24:30] (i was expecting a meme, tbh, when I saw the link :P)
[12:24:41] mobrovac: ha, sorry to disappoint
[12:25:10] comms technology from the 1980s doesn't allow direct linking of messages ;)
[12:28:12] :)
[12:29:11] mobrovac: you're deploying then? (sorry to be a pain, it's just this deploy is part of a couple of goals for the quarter)
[12:29:42] fdans: waiting for the go-ahead from an sre, since it's friday afternoon
[12:29:55] mobrovac: thank you :)
[12:30:02] but, it's lunch time now, so we have to be patient
[12:30:10] I understand
[13:11:51] mobrovac: just verified the endpoint and it seems to be working as expected! thank you so much
[13:12:25] fdans: the deploy is still in progress, will ping you once it's been completely done (should be in the next 5 mins)
[13:12:28] so don't use it just yet
[13:12:54] oh mobrovac sure, thanks
[13:17:48] fdans: kk it's fully done now, enjoy :)
[13:41:49] Hi fdans and mobrovac - Sorry for having been afk at the wrong moment
[13:41:57] mobrovac: Super many thanks for that deploy :)
[13:46:34] fdans: Thanks for having made that possible :)
[13:46:52] np :)
[13:47:02] sorry for the unfortunate delay
[13:47:42] no worries mobrovac - I hope Pietr is not too bad :/
[14:31:09] (PS8) Fdans: Add top editors metric to Wikistats [analytics/wikistats2] - https://gerrit.wikimedia.org/r/461635 (https://phabricator.wikimedia.org/T189882)
[14:36:30] mforns helloooo you here? I want to move forward with the two metrics :)
[14:36:42] fdans, yep
[14:36:49] wanna pair?
[14:36:55] mforns: omw
[14:36:58] k
[14:54:42] Analytics, Analytics-Wikistats: Beta: Provide easier mapping between Wikistats1 metrics and Wikistats2 metrics (example: "active editors") - https://phabricator.wikimedia.org/T187806 (ezachte) @ChrisPins thanks for pointing this out. Wikistats 1 doesn't count ip addresses as contributors, in that setup...
[14:58:23] (PS9) Fdans: Add top editors metric to Wikistats [analytics/wikistats2] - https://gerrit.wikimedia.org/r/461635 (https://phabricator.wikimedia.org/T189882)
[15:05:24] (CR) Mforns: [C: 2] "Left an optional comment! LGTM!" (1 comment) [analytics/wikistats2] - https://gerrit.wikimedia.org/r/463230 (https://phabricator.wikimedia.org/T204965) (owner: Fdans)
[15:06:05] (CR) Mforns: [C: 2] "LGTM!" [analytics/wikistats2] - https://gerrit.wikimedia.org/r/461635 (https://phabricator.wikimedia.org/T189882) (owner: Fdans)
[15:07:14] (CR) Fdans: "yeaaa but new pages and pages to date are also contributing metrics sooo I don't know!
We can rearrange them later" [analytics/wikistats2] - https://gerrit.wikimedia.org/r/463230 (https://phabricator.wikimedia.org/T204965) (owner: Fdans)
[15:08:12] (Merged) jenkins-bot: Add top pages by edits metric [analytics/wikistats2] - https://gerrit.wikimedia.org/r/463230 (https://phabricator.wikimedia.org/T204965) (owner: Fdans)
[15:08:14] (CR) jerkins-bot: [V: -1] Add top editors metric to Wikistats [analytics/wikistats2] - https://gerrit.wikimedia.org/r/461635 (https://phabricator.wikimedia.org/T189882) (owner: Fdans)
[15:14:44] (PS10) Fdans: Add top editors metric to Wikistats [analytics/wikistats2] - https://gerrit.wikimedia.org/r/461635 (https://phabricator.wikimedia.org/T189882)
[15:15:19] (PS11) Fdans: Add top editors metric to Wikistats [analytics/wikistats2] - https://gerrit.wikimedia.org/r/461635 (https://phabricator.wikimedia.org/T189882)
[15:16:31] (PS12) Fdans: Add top editors metric to Wikistats [analytics/wikistats2] - https://gerrit.wikimedia.org/r/461635 (https://phabricator.wikimedia.org/T189882)
[15:17:20] (CR) Fdans: [V: 2 C: 2] Add top editors metric to Wikistats [analytics/wikistats2] - https://gerrit.wikimedia.org/r/461635 (https://phabricator.wikimedia.org/T189882) (owner: Fdans)
[15:20:34] (PS1) Fdans: Release 2.4.1 [analytics/wikistats2] - https://gerrit.wikimedia.org/r/463497
[15:21:11] (CR) Fdans: [V: 2 C: 2] Release 2.4.1 [analytics/wikistats2] - https://gerrit.wikimedia.org/r/463497 (owner: Fdans)
[15:25:35] (Abandoned) Fdans: [DO NOT MERGE] Test new metrics using tunnel aqs url [analytics/wikistats2] - https://gerrit.wikimedia.org/r/463324 (owner: Fdans)
[15:27:50] joal: Hola! were we able to deploy restbase ?
[15:30:09] neilpquinn: data from mariadb gets scooped monthly from several tables.
The data lake uses that scooping to "reconstruct" the history, which is a process that takes a couple of days, so overall it is about 4 days to scoop and reconstruct mariadb. Can you let us know what you need so we can understand where your question comes from?
[15:30:55] nuria, neil isn't in here
[15:31:18] joal: i see (should have read backscroll) that we deployed restbase
[15:56:37] nuria: we as in Marko, helped by fdans for the go on our side :)
[15:59:08] Analytics-Kanban: Why do dumps and pageview api have slightly different counts? - https://phabricator.wikimedia.org/T205457 (Nuria) Open>Resolved
[15:59:45] Analytics-Kanban, Analytics-Wikistats, Patch-For-Review: Last 12 months aggregate is actually taking the first 12 months - https://phabricator.wikimedia.org/T205565 (Nuria)
[16:02:25] ping fdans
[16:02:30] fdans: standduppp
[16:26:53] Analytics, Analytics-Kanban: Split "top contributors" metric for bot and editors - https://phabricator.wikimedia.org/T205725 (Nuria)
[16:27:02] Analytics, Analytics-Kanban: Split "top contributors" metric for bot and editors - https://phabricator.wikimedia.org/T205725 (Nuria)
[16:36:17] Analytics, Analytics-Kanban: Add ability to filter "top contributors" metric for bot and editors - https://phabricator.wikimedia.org/T205725 (Nuria) a:mforns>fdans
[16:43:22] Analytics, Analytics-Kanban: Ingest data into druid for readingDepth schema - https://phabricator.wikimedia.org/T205562 (mforns) a:fdans>mforns
[17:13:44] Analytics, Analytics-Kanban: Add ability to filter "top contributors" metric for bot and editors - https://phabricator.wikimedia.org/T205725 (Nuria)
[17:13:48] Analytics, Analytics-Kanban, Analytics-Wikistats, Patch-For-Review: Create report for "articles with most contributors" in Wikistats2 - https://phabricator.wikimedia.org/T204965 (Nuria)
[17:14:28] Analytics, Analytics-Kanban, Analytics-Wikistats, Patch-For-Review: Create report for "articles 
with most contributors" in Wikistats2 - https://phabricator.wikimedia.org/T204965 (Nuria) This task is deployed but we think we want to do changes in T205725 before announcing metric
[17:16:25] Analytics, Analytics-Wikistats: Group pageview data per family in AQS so we can surface it in wikistats per-family pageview metrics - https://phabricator.wikimedia.org/T205730 (Nuria) p:Triage>High
[17:22:04] joal: see if my note here is correct: https://meta.wikimedia.org/wiki/Research:Wikistats_metrics/Top_edited_pages about "top edited pages" being "most edited pages" not pages with "most contributors"
[17:30:19] joal: also our "top contributors" count is done per month right? not for all time
[17:31:39] Analytics: stats.wikimedia.org home page should link to wikistats 2 - https://phabricator.wikimedia.org/T191555 (Nuria) >Wikistats 1.0 dump-based reports are often still their only resort, right? I do not think so, there are two additional sources of data the API and the new wikistats2 UI: http://stats.wikim...
[17:34:30] fdans: isn't pages to date a better fit for "content" than contributing ?
cc mforns
[17:34:49] agree
[17:35:15] I also think there are other metrics that could be repositioned:
[17:35:40] top viewed articles: could go from reading to content
[17:36:03] top edited pages: could go from contributing to content
[17:36:27] both byte diffs: could go from content to contributing
[17:36:37] mforns: ok, i think I am going to move pages to date to content today if nobody disagrees cc milimetric fdans joal
[17:36:50] mforns: on bytes i disagree, bytes measures content
[17:37:04] mforns: or, rather, is the "best measure of content we have"
[17:37:05] hm ok
[17:37:18] mforns: i can tell you agree COMPLETELY jajajaj
[17:37:25] hehe
[17:37:36] no, makes sense xD
[17:37:43] cc milimetric fdans for feedback
[17:46:55] (PS3) Bmansurov: Add CitationUsage and CitationUsagePageLoad to EL whitelist [analytics/refinery] - https://gerrit.wikimedia.org/r/462521 (https://phabricator.wikimedia.org/T205272)
[18:17:40] nuria: Thanks for clarification on top edited pages
[18:17:58] Also nuria - Top metrics are computed 'on the fly' by druid
[18:18:40] nuria: We currently generate requests for monthly and daily top with AQS, and we could relatively easily generate an all-time top
[18:42:17] joal: ok
[18:54:23] Analytics, Discovery-Analysis, Product-Analytics, Wikidata, and 2 others: Query stats dashboard not updating - https://phabricator.wikimedia.org/T204415 (mpopov) Alright, I wiped all the request counts starting with August 10th (after making a backup) so Golden/Reportupdater is going to start a r...
[19:23:54] (PS4) Joal: Update python/refinery/utils/HdfsUtils [analytics/refinery] - https://gerrit.wikimedia.org/r/459780 (https://phabricator.wikimedia.org/T202489)
[19:32:40] (PS6) Joal: Add python script importing xml dumps onto hdfs [analytics/refinery] - https://gerrit.wikimedia.org/r/456654 (https://phabricator.wikimedia.org/T202489)
[20:18:59] (PS1) Joal: Add mediawiki-history-wikitext oozie job [analytics/refinery] - https://gerrit.wikimedia.org/r/463548 (https://phabricator.wikimedia.org/T202490)
[20:26:13] (PS2) Joal: Add MediawikiXMLDumpsConverter spark job [analytics/refinery/source] - https://gerrit.wikimedia.org/r/463370 (https://phabricator.wikimedia.org/T202490)
[20:28:53] Analytics, Analytics-EventLogging, Analytics-Kanban: Deprecation Information for EventLogging ResourceLoader modules - https://phabricator.wikimedia.org/T205744 (Milimetric) p:Triage>High
[20:42:11] Analytics, Analytics-Kanban: Ingest data into druid for readingDepth schema - https://phabricator.wikimedia.org/T205562 (Tbayer) >>! In T205562#4622849, @fdans wrote: > Hi @Tbayer, just wanted to let you know that at the moment we can't index the time values stored in this dataset due to their high card...
[20:43:14] Analytics, Analytics-Kanban: Add ability to bucketize integers as part of event ingestion - https://phabricator.wikimedia.org/T205641 (Tbayer) As indicated over at T205562#4626349 , in many or most cases we will want to treat such integer fields as measures, rather than as dimensions. It seems bucketing...
[20:46:54] Analytics, Page-Issue-Warnings, Product-Analytics, Reading-analysis, Readers-Web-Backlog (Tracking): Ingest data from PageIssues EventLogging schema into Druid - https://phabricator.wikimedia.org/T202751 (Tbayer) For the record: @nuria and I discussed this task earlier this week, and I unders...
[20:58:06] Analytics, Analytics-Cluster, Operations: stat1004 - /mnt/hdfs is not accessible - https://phabricator.wikimedia.org/T182342 (Dzahn) 16:45 < icinga-wm> PROBLEM - Disk space on analytics1003 is CRITICAL: DISK CRITICAL - /mnt/hdfs is not accessible: No such file or directory 16:50 < icinga-wm> PROBLEM...
[21:16:57] (PS5) Joal: Update python/refinery/utils/HdfsUtils [analytics/refinery] - https://gerrit.wikimedia.org/r/459780 (https://phabricator.wikimedia.org/T202489)
[21:18:31] (PS7) Joal: Add python script importing xml dumps onto hdfs [analytics/refinery] - https://gerrit.wikimedia.org/r/456654 (https://phabricator.wikimedia.org/T202489)
[21:30:31] (CR) Joal: [V: 1] "Tested on cluster" [analytics/refinery] - https://gerrit.wikimedia.org/r/456654 (https://phabricator.wikimedia.org/T202489) (owner: Joal)
[21:31:24] (CR) Joal: [V: 1] "Tested through https://gerrit.wikimedia.org/r/c/analytics/refinery/+/456654" [analytics/refinery] - https://gerrit.wikimedia.org/r/459780 (https://phabricator.wikimedia.org/T202489) (owner: Joal)
[21:46:37] Analytics, Analytics-Kanban: Ingest data into druid for readingDepth schema - https://phabricator.wikimedia.org/T205562 (Nuria) @tbayer these are the numeric aggregations supported by druid and the ones that are available for this data: http://druid.io/docs/latest/querying/aggregations Druid strength is...
[21:48:12] (PS1) Joal: Correct bug in MediawikiXMLRevisionInputFormat [analytics/wikihadoop] - https://gerrit.wikimedia.org/r/463564
[21:52:14] Analytics: Move pages to date to "content" frrom "contributing" category on wikistats UI - https://phabricator.wikimedia.org/T205752 (Nuria)
[21:58:54] Analytics: Move pages to date to "content" frrom "contributing" category on wikistats UI - https://phabricator.wikimedia.org/T205752 (Nuria)
[21:59:03] Analytics, Analytics-Kanban: Move pages to date to "content" frrom "contributing" category on wikistats UI - https://phabricator.wikimedia.org/T205752 (Nuria)
[22:01:34] Analytics, Analytics-Kanban: Ingest data into druid for readingDepth schema - https://phabricator.wikimedia.org/T205562 (Tbayer) @Nuria I figured that percentiles including the median might be more demanding, but I didn't expect that the mean would be a problem too. Considering that Druid's aggregators i...
[22:23:06] (PS3) Joal: Add MediawikiXMLDumpsConverter spark job [analytics/refinery/source] - https://gerrit.wikimedia.org/r/463370 (https://phabricator.wikimedia.org/T202490)
[22:23:26] (CR) Joal: [V: 2 C: 2] "self-merging bug correction" [analytics/wikihadoop] - https://gerrit.wikimedia.org/r/463564 (owner: Joal)
[22:24:03] (CR) Joal: [V: 1] "Tested on cluster" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/463370 (https://phabricator.wikimedia.org/T202490) (owner: Joal)
[22:29:42] groceryheist: (pandas tables) https://en.wikipedia.org/wiki/Help:Table#Converting_spreadsheets_and_database_tables_to_wikitable_format and the other page linked there might help
[22:39:12] (PS1) Nuria: Move pages-to-date to "content" frrom "contributing" category [analytics/wikistats2] - https://gerrit.wikimedia.org/r/463574
[22:39:44] (PS2) Joal: Add mediawiki-history-wikitext oozie job [analytics/refinery] - https://gerrit.wikimedia.org/r/463548 (https://phabricator.wikimedia.org/T202490)
[22:41:04] Analytics, Analytics-Kanban, Analytics-Wikistats, Patch-For-Review: Add project-families to AQS for additive-metrics - https://phabricator.wikimedia.org/T203258 (Nuria) Ping @JAllemandou let's update documentation https://wikitech.wikimedia.org/wiki/Analytics/AQS/Wikistats_2 with the metrics avai...
[22:41:22] Analytics-Kanban, Analytics-Wikistats, Patch-For-Review: Last 12 months aggregate is actually taking the first 12 months - https://phabricator.wikimedia.org/T205565 (Nuria) Open>Resolved
[22:42:29] Analytics, Analytics-Kanban, Patch-For-Review: Wikistats: add functions you apply to dimensional data such as "accumulate" - https://phabricator.wikimedia.org/T203180 (Nuria)
[22:42:35] Analytics, Analytics-Kanban, Analytics-Wikistats: "Total Article Count" (a.k.a "pages to date") Wikistats metric (per project and overall) - https://phabricator.wikimedia.org/T198425 (Nuria)
[22:42:41] Analytics, Analytics-Kanban, Patch-For-Review: Wikistats: add functions you apply to dimensional data such as "accumulate" - https://phabricator.wikimedia.org/T203180 (Nuria) Open>Resolved
[22:42:53] Analytics, Analytics-Kanban, Analytics-Wikistats: "Total Article Count" (a.k.a "pages to date") Wikistats metric (per project and overall) - https://phabricator.wikimedia.org/T198425 (Nuria)
[22:43:02] Analytics-Kanban, Analytics-Wikistats: Wikistats 2.0 Remaining reports. 
- https://phabricator.wikimedia.org/T186121 (Nuria)
[22:43:06] Analytics, Analytics-Kanban, Analytics-Wikistats: "Total Article Count" (a.k.a "pages to date") Wikistats metric (per project and overall) - https://phabricator.wikimedia.org/T198425 (Nuria) Open>Resolved
[22:43:27] Analytics-Kanban, Patch-For-Review: Update top-(editor/pages) endpoints in AQS to follow top-pageviews semantics - https://phabricator.wikimedia.org/T204707 (Nuria)
[22:44:05] Analytics-Kanban, Patch-For-Review: Update top-(editor/pages) endpoints in AQS to follow top-pageviews semantics - https://phabricator.wikimedia.org/T204707 (Nuria) Ping @JAllemandou Let's please update documentation in wikitech
[22:44:27] Analytics, Analytics-Kanban, Patch-For-Review: Make hover info-box on line charts consistent with bar charts - https://phabricator.wikimedia.org/T205461 (Nuria) Open>Resolved
[22:44:45] Analytics-Kanban: Add endpoints to RESTBase for new WKS2 endpoints - https://phabricator.wikimedia.org/T203175 (Nuria) Open>Resolved
[22:46:23] (CR) Joal: [V: 1] "Tested on cluster" [analytics/refinery] - https://gerrit.wikimedia.org/r/463548 (https://phabricator.wikimedia.org/T202490) (owner: Joal)
[23:06:29] docs updated nuria :)
[23:06:38] joal: super thanks
[23:13:26] nuria: I have the full chain for dumps in CR
[23:13:36] joal: WOW nice!
[23:14:04] nuria: I suggest we use the patches manually this month, to debunk possible issues and leaving time for reviews
[23:14:07] and docs
[23:15:02] And with that, i'm gonna start my weekend :)
[23:15:09] See y'all next week :)
[23:17:26] joal: ciaooo
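
On the 03:11:24 question about outputting pandas dataframes as mediawiki tables (answered at 22:29:42 with a pointer to Help:Table), here is a minimal sketch of one way it might be done by hand. The function name `to_wikitable` and its parameters are made up for illustration and are not from any library mentioned in the channel. It takes plain column names and row tuples, so it runs without pandas, and it does not escape cell values that contain pipes or newlines.

```python
def to_wikitable(columns, rows, table_class="wikitable"):
    """Render column names and row tuples as MediaWiki table markup.

    Hypothetical helper for illustration only: cell values are written
    as-is, so values containing "|" or newlines would need escaping first.
    """
    lines = ['{| class="%s"' % table_class]
    # Header row: one "! name" line per column.
    lines.append("|-")
    lines.extend("! %s" % col for col in columns)
    # Data rows: a "|-" row separator, then one "| value" line per cell.
    for row in rows:
        lines.append("|-")
        lines.extend("| %s" % cell for cell in row)
    lines.append("|}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up sample data, only to show the output shape.
    print(to_wikitable(["wiki", "edits"], [("enwiki", 10), ("dewiki", 7)]))
```

With a DataFrame `df`, the call would presumably look like `to_wikitable(df.columns, df.itertuples(index=False))`, since `itertuples(index=False)` yields one tuple of cell values per row.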