[00:04:11] Analytics, MediaWiki-API, Reading-Infrastructure-Team: Add pageview stats to the action API - https://phabricator.wikimedia.org/T144865#2631418 (Tgr)
[00:06:17] Analytics, MediaWiki-API, Reading-Infrastructure-Team: Add pageview stats to the action API - https://phabricator.wikimedia.org/T144865#2612941 (Legoktm) Would this be a proxy to the pageview API or would we store some of the data in the MW database?
[00:51:53] Analytics, MediaWiki-API, Reading-Infrastructure-Team: Add pageview stats to the action API - https://phabricator.wikimedia.org/T144865#2631476 (Tgr) Some open questions: * The pageview REST API deals with URLs; the action API deals with pages. There are many ways in which the two do no align (page m...
[00:52:47] Analytics, MediaWiki-API, Reading-Infrastructure-Team: Add pageview stats to the action API - https://phabricator.wikimedia.org/T144865#2631478 (Tgr) @Legoktm I hope that non-answers your question :)
[00:55:42] Analytics, MediaWiki-API, Reading-Infrastructure-Team: Add pageview stats to the action API - https://phabricator.wikimedia.org/T144865#2631479 (Tgr)
[01:20:50] Analytics, MediaWiki-API, Reading-Infrastructure-Team: Add pageview stats to the action API - https://phabricator.wikimedia.org/T144865#2631487 (Tgr) Also, if this is written as an extension (which it probably will be), how (and to what extent) should we avoid making it Wikimedia-specific? MediaWiki...
[01:23:41] Analytics, MediaWiki-API, Reading-Infrastructure-Team: Add pageview stats to the action API - https://phabricator.wikimedia.org/T144865#2631489 (Legoktm) I was told that the pageviews API was Wikimedia-specific, so I created the WikimediaPageViewInfo extension for displaying its counts on action=info...
[01:33:42] Analytics, MediaWiki-API, Reading-Infrastructure-Team: Add pageview stats to the action API - https://phabricator.wikimedia.org/T144865#2631511 (Tgr) Should we also expose unique devices data? I guess there is no reason not to; it's per-site only so most of the complexity with pageview data does not...
[04:06:14] Analytics, Dumps-Generation, Security: Pageview dumps incorrectly formatted, looks like a result of possibly malicious activity - https://phabricator.wikimedia.org/T144100#2588550 (Bawolff) "<" and ">" are invalid character to use in titles, and should result in a 400 status code. Does the pageviews...
[08:11:47] Analytics, Operations: kafkatee's logrotate/syslog default pkg files needs to be removed - https://phabricator.wikimedia.org/T145490#2631941 (elukey)
[10:35:49] hi team
[10:36:10] o/
[10:38:20] hey mforns !
[10:38:33] hello guys!
[11:10:00] joal,mforns: can you try to ssh to druid1001.eqiad.wmnet?
[11:10:07] elukey, sure
[11:10:11] you are now in the druid-admins group
[11:10:33] woohoo! yes :]
[11:10:34] elukey: YAY, works for me :)
[11:10:46] good :)
[11:10:49] Thanks elukey !
[11:10:54] thx!
[11:10:57] I am going to send an email with a recap, you can do a lot now :)
[11:11:07] curiosity - can you try journalctl?
[11:11:23] No journal files were found.
[11:11:27] elukey: --^
[11:11:35] something like sudo journalctl -u druid-overlord.service ?
[11:11:58] elukey: no work :(
[11:12:10] elukey: asks for password, and without sudo no journal
[11:12:25] ah ok I thought it wasn't taken care by a patch that Marko worked on for services
[11:12:28] will add it
[11:13:06] elukey: As you've noticed, I don't know how to use journalctl, so for me it's not that important ;)
[11:13:44] joal, btw, the changes we did in scala, work!
[11:14:18] I still changed a <= to just <, because there was a shift in the states, but now it looks perfect
[11:14:28] mforns: I didn't check, wanted to but got to do other things
[11:14:32] mforns: That's great :)
[11:14:36] mforns: data looks reaso
[11:14:46] yes
[11:14:47] again: mforns: data looks reasonably correct?
[11:14:57] that's really great :)
[11:15:09] joal, the Billz example looks really good, which is a good sign
[11:15:15] YAY
[11:15:44] joal, I should check now that the user side makes sense, and I think we'll be good
[11:15:54] mforns: sounds awesome!
[11:16:34] also mforns, since we have changed the code quite a bit, maybe we can try to rerun the thing for enwiki as well, in case it changes something :)
[11:16:52] joal, sure
[11:17:08] joal, how much memory did you use per node?
[11:17:25] mforns: tried with up to 32G per worker I think
[11:17:55] but mforns, other thing I didn't try is to raise partition number
[11:18:03] smaller splits
[11:18:13] Not sure it'll do, but who knows
[11:18:20] joal, aha, ok, and driver memory?
[11:18:27] mforns: 8G
[11:18:39] errors were memory errors on workers
[11:18:45] how many partitions did you use with 32 gigs?
[11:19:11] mforns: I'll also try to load the simplewiki data in druid, see how it looks using pivot :D
[11:19:23] joal, awesome!
[11:34:38] (CR) Thiemo Mättig (WMDE): [C: 1] "I still can not merge in this repository. ;-) +2, the change is correct and my question is resolved. Thanks for the explanation and the mi" [analytics/statsv] - https://gerrit.wikimedia.org/r/308959 (owner: Addshore)
[11:34:45] * elukey lunch!
[12:04:47] (PS4) Mforns: Add re-run script [analytics/reportupdater] - https://gerrit.wikimedia.org/r/308977 (https://phabricator.wikimedia.org/T117538)
[12:07:44] (CR) Mforns: [C: -1] "The dependency needs to be deployed first." [analytics/reportupdater] - https://gerrit.wikimedia.org/r/308977 (https://phabricator.wikimedia.org/T117538) (owner: Mforns)
[12:55:17] Analytics-Kanban: Compile a request data set for caching research and tuning - https://phabricator.wikimedia.org/T128132#2632731 (Danielsberger) Open>Resolved I have finally been able to take a look at the data set: it's great - exactly what we need to analyze caching performance. Below are the over...
[13:38:05] is there any soul with a bit of compassion that can answer my questions about the pivot UI? :D
[13:38:30] elukey: I can always try :)
[13:40:38] elukey: But it has nothing to do with compassion ;)
[13:41:07] ahahhaah
[13:41:25] so it has probably something to do with my understanding of how we deploy nodejs apps
[13:41:53] we now have https://gerrit.wikimedia.org/r/#/admin/projects/analytics/pivot/deploy and https://gerrit.wikimedia.org/r/#/admin/projects/analytics/pivot
[13:41:56] thanks to Dan
[13:42:50] my understanding is that /pivot/deploy contains nodejs modules and it uses the /pivot repo as submodule in its src dir
[13:43:00] elukey: that's my understanding too
[13:43:05] so we deploy the /pivot/deploy repo and we are good
[13:43:11] hello ottomata !
[13:43:11] elukey: if it works the same way restbase does, that's the thing
[13:43:38] so service-runner needs an entry point
[13:43:55] and the one currently used in our screen session is bin/pivot
[13:44:11] that calls build/server/www.js
[13:44:44] but I have no idea how it is created
[13:44:50] hm
[13:44:56] build/server/www.js doesn't exist?
[13:45:26] it does in your repo on stat1002
[13:45:33] but I don't find it in the gerrit one..
[13:46:00] elukey: i probably ran the instructions in the pivot readme
[13:46:06] for dev mode
[13:46:48] uh, what's the pivot github repo now?
[13:46:58] ahhh now I found it
[13:47:03] npm install
[13:47:19] and gulp
[13:47:30] aye
[13:53:46] and then the next step would be something like https://wikitech.wikimedia.org/wiki/Analytics/AQS#Update_the_deploy_repository
[14:01:00] joal: want to come to dba meeting?
[14:01:27] milimetric: ^ ?
[14:02:11] aww damn, I feel awful but I really do. Yeah, ok
[14:02:24] s'ok you don't have to!
[14:02:26] if you want to: https://hangouts.google.com/hangouts/_/wikimedia.org/analytics-dba
[14:02:42] y/n ?
[14:03:30] brt
[14:04:49] hoooo, forgot about that
[14:04:53] joining !
[14:43:59] elukey, ottomata, question for you
[14:44:04] woah, check this out:
[14:44:09] http://debezium.io/
[14:44:12] particularly
[14:44:38] grr, i can't link, but
[14:44:38] http://debezium.io/docs/tutorial/
[14:44:47] search for 'Here’s that new event’s value formatted to be easier to read:
[14:44:47] '
[14:45:04] we see no changes in the schema section and a couple of changes in the payload section:
[14:45:04] • The op field value is now u, signifying that this row changed because of an update
[14:45:04] • The before field now has the state of the row with the values before the database commit
[14:45:05] • The after field now has the updated state of the row, and here was can see that the first_name value is now Anne Marie.
[14:45:13] elukey, ottomata: Do you know if pivot ui keeps datasource metadata in cache or something?
[14:45:49] i think it writes to a file or something by default
[14:46:02] we should prob configure it to save in mysql metastore or something
[14:46:11] ottomata: k
[14:46:35] ottomata: My question was more, can we get rid of that easily?
[14:48:42] Analytics, MediaWiki-API, Reading-Infrastructure-Team: Add pageview stats to the action API - https://phabricator.wikimedia.org/T144865#2612941 (Anomie) >>! In T144865#2631487, @Tgr wrote: > MediaWiki core has a pageview counting feature (which no one uses [citation needed]), Wasn't that removed in...
[14:48:45] joal: whatcha mean?
[14:49:19] ottomata: I have created a test datasource, and changed its structure after first loading :(
[14:49:33] ottomata: looks like pivot doesn't add the field I added
[14:49:50] hm
[14:49:54] joal: is it Edit test data?
[14:50:00] elukey: correct
[14:50:52] what is the field ? Just to double check
[14:51:11] elukey: it's a measure field, named : edits
[14:51:50] okok I wanted to make sure that Varnish wasn't caching something but this is not the case, sorry :)
[14:52:22] elukey: no prob, all potential solutions are good
[14:52:30] joal: maybe I can try to restart it to see if it works?
[14:52:39] the usual turn off/on again solution
[14:52:43] elukey: why not (if easy)
[14:53:25] joal: done
[14:53:55] elukey: Yay ! Worked :)
[14:53:59] elukey: thanks mate :)
[14:54:16] ah so the thing caches as Andrew mentioned
[14:54:26] probably there is an option to tune this
[14:54:30] will check :)
[14:54:53] hm
[14:55:00] i'm not sure what it is, and i can't remember how I changed it!
[14:55:03] mforns, milimetric : If you want to have a look at simplewiki edit data in druid, it's now available :)
[14:55:34] ottomata, elukey : Most of the time, caching data schemas is the right thing to do :)
[14:58:16] so pivot does
[14:58:18] Adding Druid data source: 'edit-history-test'
[14:58:18] Adding Druid data source: 'pageviews-daily'
[14:58:18] Adding Druid data source: 'pageviews-hourly'
[14:58:18] Initial load and introspection complete. Got 3 data sources, 3 queryable
[14:58:56] maybe after this bootstrap it doesn't recognize new stuff like measures?
[14:59:14] it could be an option
[14:59:38] elukey: I assume after it loads a datasource it doesn't update its schema
[15:00:01] more elegant, yes :)
[15:03:33] Analytics, Operations: kafkatee's logrotate/syslog default pkg files needs to be removed - https://phabricator.wikimedia.org/T145490#2633102 (elukey) p:Triage>Low
[15:33:22] Analytics: Project: deploy notebook solution on top of hadoop to replace mysql eventlogging database - https://phabricator.wikimedia.org/T145527#2633255 (Nuria)
[15:56:37] elukey: irc for deploy?
[15:56:42] goin lunchin + trying again at phone fixing bbl
[15:56:52] nuria_: sure :)
[15:57:16] elukey: head of branch is this one: https://gerrit.wikimedia.org/r/#/c/309604/
[15:57:54] elukey: so i think when we do the pull in tin or similar we have to do git pull origin new-aqs-cluster
[15:59:34] nuria_: will it create a new branch or use the current one?
[15:59:41] I have some doubts now
[15:59:50] because it would be great to have two branches on tin too
[15:59:57] elukey: it will use that branch
[16:00:13] elukey: so when you pull if you do > git branch
[16:00:23] you shall see two branches master and this one
[16:01:13] ah you already created it
[16:01:14] checking
[16:02:05] elukey: we can do batcave and share screen if you want
[16:03:18] nuria_: qq - I thought that you were going to deploy and I was going to check the service, is it the right understanding or do you want me to do it?
[16:04:08] elukey: i am not sure i have permits to deploy, i did not last time i tried
[16:04:13] Analytics-Cluster, Operations: Migrate titanium to jessie (archiva.wikimedia.org upgrade) - https://phabricator.wikimedia.org/T123725#1936502 (Dzahn) Cool, i'll take it as a reminder to shut titanium down after a waiting period.
[16:04:25] Analytics-Cluster, Operations: Migrate titanium to jessie (archiva.wikimedia.org upgrade) - https://phabricator.wikimedia.org/T123725#2633405 (Dzahn) a:Dzahn
[16:04:43] elukey: jaja i thought we were doing it the other way around
[16:04:43] nuria_: you should be able to
[16:04:48] okok :)
[16:04:56] so https://gerrit.wikimedia.org/r/#/c/309604/ needs to be merged right?
[16:05:31] because what I thought we were going to do was
[16:05:33] merge the code
[16:05:36] git fetch
[16:05:39] elukey: no need to merge
[16:05:44] git checkout new-aqs-cluster
[16:05:52] scap deploy
[16:06:01] elukey: ah sorry. merge to that branch right
[16:06:31] elukey: ahem, yes. batcave?
[16:06:33] (scap deploy only to aqs100[456])
[16:07:33] elukey: batcave 2?
[16:07:49] sure
[16:07:53] elukey: https://hangouts.google.com/hangouts/_/wikimedia.org/a-batcave-2
[16:08:54] elukey: i am there
[16:09:09] ah the '-'
[16:11:21] (CR) Nuria: [C: 2 V: 2] Update per-article compression scheme to default (LCS) [analytics/aqs] (new-aqs-cluster) - https://gerrit.wikimedia.org/r/309602 (https://phabricator.wikimedia.org/T140866) (owner: Nuria)
[16:13:56] (CR) Nuria: [C: 2 V: 2] Map null count values to 0 in per-article output [analytics/aqs] (new-aqs-cluster) - https://gerrit.wikimedia.org/r/309604 (https://phabricator.wikimedia.org/T144521) (owner: Nuria)
[16:47:29] wikimedia/mediawiki-extensions-EventLogging#598 (wmf/1.28.0-wmf.19 - 40402c6 : Chad Horohoe): The build has errored.
[16:47:29] Change view : https://github.com/wikimedia/mediawiki-extensions-EventLogging/commit/40402c668b63
[16:47:29] Build details : https://travis-ci.org/wikimedia/mediawiki-extensions-EventLogging/builds/159648232
[16:47:48] (PS4) Nuria: Map null count values to zeros in output [analytics/aqs] (new-aqs-cluster) - https://gerrit.wikimedia.org/r/309604 (https://phabricator.wikimedia.org/T144521)
[16:51:59] (PS5) Nuria: Map null values to zeros in output in per-article endpoint [analytics/aqs] (new-aqs-cluster) - https://gerrit.wikimedia.org/r/309604 (https://phabricator.wikimedia.org/T144521)
[16:53:54] mobrovac: o/ do you have a min for a docker question?
[16:54:22] I am trying to run the server.js --deploy-repo from Mac Os
[16:54:43] uh
[16:54:49] is it a dead end or somebody tried?
[16:54:54] ah ok I guess no
[16:54:58] I can see a weird Step 5 : RUN groupadd -g 20 -r rungroup && useradd -m -r -g rungroup -u 502 runuser
[16:55:02] that fails
[16:55:20] I have a group with id 20 on macos
[16:55:21] aren't those your user's uid/gid?
[16:55:34] yes they are
[16:55:38] it's needed for the mapping
[16:55:41] of files
[16:55:58] it fails with what error?
[16:56:38] (PS6) Nuria: Map null values to zeros in output in per-article endpoint [analytics/aqs] (new-aqs-cluster) - https://gerrit.wikimedia.org/r/309604 (https://phabricator.wikimedia.org/T144521)
[16:56:49] GID 20 exists, but I thought it was doing it inside the container
[16:56:57] yes
[16:56:59] it does
[16:57:10] so maybe it already exists inside the container
[16:57:11] hm
[16:57:15] damn permissions
[16:57:22] it says
[16:57:22] The command '/bin/sh -c groupadd -g 20 -r rungroup && useradd -m -r -g rungroup -u 502 runuser' returned a non-zero code: 4
[16:57:54] yup, that's gid not unique
[16:58:00] ahhh ok got it now,
[16:58:02] elukey: are you using native docker for mac or boot2docker?
[16:58:12] Pchelolo: o/
[16:58:13] (PS7) Nuria: Map null values to zeros in output in per-article endpoint [analytics/aqs] (new-aqs-cluster) - https://gerrit.wikimedia.org/r/309604 (https://phabricator.wikimedia.org/T144521)
[16:58:16] native docker
[16:58:18] only native (latest) docker for mac is supported
[16:58:36] elukey: i think i have a solution
[16:58:40] lemme do a quick pr
[16:58:45] (CR) Nuria: [C: 2 V: 2] Map null values to zeros in output in per-article endpoint [analytics/aqs] (new-aqs-cluster) - https://gerrit.wikimedia.org/r/309604 (https://phabricator.wikimedia.org/T144521) (owner: Nuria)
[16:58:45] thanks a lot!
[16:58:48] sorry for the trouble
[16:59:10] elukey: sorry about this, now merged: https://gerrit.wikimedia.org/r/#/c/309604/
[16:59:27] elukey: make sure you have service-runner 2.0.6+
[16:59:39] after that version it should work out of the box
[17:02:50] elukey: Pchelolo: https://github.com/wikimedia/service-runner/pull/126
[17:02:53] that ought to fix it
[17:03:11] mobrovac: em... https://github.com/wikimedia/service-runner/pull/126/files#diff-f68e1ed72a6a9bbc03d6989560fa9708R214
[17:03:21] this code shouldn't be run on mac at all
[17:03:49] hmm
[17:04:28] and yet, it is executed on elukey's box
[17:04:53] mobrovac: that check was added in version 2.0.6, maybe elukey has older version installed?
[17:05:12] Pchelolo: either way, the non-unique uid/gid should be supported
[17:05:31] elukey: what version of service-runner are you using?
[17:05:55] mobrovac: ye, no question, just it has to work out of the box on Mac, at least works for me
[17:06:19] I was trying to figure it out, bear in mind that I've started today to check this
[17:06:22] :)
[17:06:48] elukey: grep version package.json
[17:07:06] or node_modules/service-runner/package.json
[17:07:51] 0.1.0 afaics
[17:08:02] elukey: what did you clone / install?
[17:08:18] * mobrovac is lost a bit here
[17:08:23] nono "version": "1.3.1"
[17:08:30] uh that's old
[17:08:34] this is from node_modules/service-runner/package.json
[17:08:41] where did you get this version from?
[17:08:45] I just checked out the aqs deploy repo :)
[17:08:48] ottomata: updated couple dumps tickets: https://phabricator.wikimedia.org/T128513
[17:09:14] elukey: yeah, aqs needs to be brought up to date with the latest service-runner/hyperswitch developments
[17:09:34] elukey: just npm install service-runner
[17:10:12] mobrovac: It's pretty up to date in src repo as far as I can tell (it's 2.0 service-runner at least)
[17:13:38] mobrovac: it installs 1.3.1
[17:13:45] basically this is a bit old right? https://github.com/wikimedia/analytics-aqs/blob/master/package.json#L26-L30
[17:14:11] but I don't know what ^ means
[17:14:15] elukey: AQS is hosted in gerrit I thought?
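On the "^" question above: in npm's documented range syntax a caret allows any version that keeps the left-most non-zero component unchanged, so a range such as "^1.1.0" (the value quoted for the aqs package.json later in this log) accepts 1.3.1 but not 2.x. The Python sketch below only illustrates that rule; it is not npm's own implementation, the helper names are made up for the example, and pre-release tags are deliberately ignored.

```python
import re


def parse(version):
    """Split 'MAJOR.MINOR.PATCH' into a tuple of ints (pre-release tags unsupported)."""
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unsupported version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def caret_upper_bound(base):
    """Exclusive upper bound of a caret range: bump the left-most non-zero part."""
    major, minor, patch = base
    if major > 0:
        return (major + 1, 0, 0)
    if minor > 0:
        return (0, minor + 1, 0)
    return (0, 0, patch + 1)


def satisfies_caret(version, caret_range):
    """Return True if `version` falls inside a range written as '^X.Y.Z'."""
    if not caret_range.startswith("^"):
        raise ValueError("this sketch only handles caret ranges")
    base = parse(caret_range[1:])
    return base <= parse(version) < caret_upper_bound(base)


if __name__ == "__main__":
    # Versions mentioned in the conversation, checked against the quoted range.
    for candidate in ("1.3.1", "2.0.6", "2.1.2"):
        print(candidate, satisfies_caret(candidate, "^1.1.0"))
```

Under that rule a plain npm install inside a checkout pinned to "^1.1.0" would be expected to stay on a 1.x release, which is consistent with 1.3.1 being installed here.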
[17:14:24] sorry people if you want I can RTFM myself and come back tomorrow
[17:14:26] elukey: because it is reading package.json, try npm install service-runner@2.1.2
[17:14:57] msg elukey walking to library, we can touch base later or tomorrow
[17:15:08] elukey: https://gerrit.wikimedia.org/r/#/admin/projects/analytics/aqs
[17:15:19] Pchelolo: yes, it is hosted in gerrit
[17:15:32] Pchelolo: it is just easy to link to latest code on giuthub
[17:15:35] *github
[17:16:27] nuria_: tomorrow would be better, would it be ok?
[17:16:43] Pchelolo: I get service-runner@2.1.2 invalid
[17:16:59] └─┬ limitation@0.1.9
[17:16:59] └── kad@1.3.6 (git+https://github.com/gwicke/kad.git#f35971036f43814043245da82b12d035b7bbfd16)
[17:17:11] elukey: why do you need to be in the aqs repo?
[17:17:21] elukey: just go elsewhere
[17:17:52] elukey: what is the end goal here?
[17:17:54] :)
[17:19:25] mobrovac: thanks for the help but I'll read some stuff before bothering you, I am just trying to deploy aqs
[17:19:28] that's it
[17:19:36] I'll figure out a way
[17:19:53] thanks Pchelolo too for the help
[17:21:49] elukey: yes tomorrow works, i will continue the deployment
[17:21:56] mobrovac: we are trying to deploy aqs
[17:22:09] mobrovac: and running into issues
[17:25:00] nuria_: it seems that my build issue could be resolved using a more recent version of service-runner
[17:25:02] mobrovac: so question (for Pchelolo also)
[17:25:26] elukey: right, but the version we have on aqs package.json is "^1.1.0",
[17:25:40] mobrovac: we deployed just like 1 month ago and that was fine cc Pchelolo
[17:25:47] mobrovac: does that need to be updated?
[17:26:01] nuria_: that was not a deploy natively from OSX.
[17:26:24] that was supported all the time. building natively on OSX is supported only v 2.0.6+
[17:26:30] so you need to update service-runner
[17:26:38] Pchelolo:right, cause deploying from OS at the time it did not work, ok
[17:26:52] Pchelolo: because of the group id issue, correct?
[17:27:06] nuria_: yes.
[17:27:14] Pchelolo: ok, will do and try to build again
[17:27:49] nuria_: I would not upgrade service-runner now, but maybe ask joal to build the docker image
[17:27:56] too many changes in flight
[17:28:09] we can definitely schedule a version bump after this
[17:28:44] elukey: right, i was going to build from mu ubuntu image ..
[17:28:56] *my ubuntu
[17:28:59] ah okok, super :)
[17:29:07] elukey: be back in a bit
[17:29:57] I am going to go afk, will sync with you tomorrow ok? Let me know if you succeed, otherwise I'll create a VM or ask joal to build
[17:30:00] thanks!
[17:30:38] or just build it in labs this time around
[17:30:57] sure
[17:31:13] thanks for the help, going afk!
[17:46:49] Analytics, MediaWiki-API, Reading-Infrastructure-Team: Add pageview stats to the action API - https://phabricator.wikimedia.org/T144865#2633788 (Tgr) You are right, it was migrated to [[https://www.mediawiki.org/wiki/Extension:HitCounters|HitCounters]], which actually it easier to check how widely it...
[17:57:43] Analytics, MediaWiki-API, Reading-Infrastructure-Team: Add pageview stats to the action API - https://phabricator.wikimedia.org/T144865#2633851 (Anomie) >>! In T144865#2633788, @Tgr wrote: > ```lang=php > public function getPageData( Title $title, $metric = self::METRIC_VIEW ); > ``` Ideally the...
[18:51:22] Analytics-Kanban: Compile a request data set for caching research and tuning - https://phabricator.wikimedia.org/T128132#2634143 (Nuria) >I had one minor question: Am I right to assume that the dates of this dataset are August 17 to August 31? no, dates are not those, since we did not need timestamps for th...
[19:04:36] Analytics, MediaWiki-API, Reading-Infrastructure-Team: Add pageview stats to the action API - https://phabricator.wikimedia.org/T144865#2634288 (Tgr) Good point, updated.
[19:42:13] Pchelolo: yt?
[19:42:20] nuria_: ??
[19:45:08] Pchelolo: my aqs install in ubuntu fails with dependency issues on sqlite3
[19:45:43] Pchelolo: on mac Os deps look like this for restbase-mod-table-sqlite
[19:45:57] Pchelolo:
[19:46:03] https://www.irccloud.com/pastebin/2nleAxlr/
[19:46:56] nuria_: sqlite3 has native code, so why is your module compiled for darwin while it's inside ubuntu?
[19:47:22] Pchelolo: what I pasted is MAC OS X
[19:48:14] Pchelolo: the dir on ubuntu the lib directory for sqlite3 doesn't have any bindings ...?
[19:49:37] Analytics, Dumps-Generation, Security: Pageview dumps incorrectly formatted, looks like a result of possibly malicious activity - https://phabricator.wikimedia.org/T144100#2634472 (Nuria) Also, I can see garbage like requests in pageview_hourly table, on that hour: <