[09:19:12] elukey: o/ [09:20:11] o/ [09:22:02] elukey: I managed to have the deploy repo setup, now fighting with gerrit to push the thing in a branch [09:23:21] nice! thnaks! [09:23:27] *thanks [09:24:24] I guess that git config deploy.remote deploy_repo_remote_name is not sufficient [09:27:17] elukey: I don't have push rights on the deploy repo on gerrit :( [09:27:27] So I can't push my new branch [09:29:33] (PS1) Joal: Update aqs to 4d9f516 [analytics/aqs/deploy] - https://gerrit.wikimedia.org/r/310763 [09:29:57] elukey: I managed that --^ [09:30:06] woooooo [09:30:37] elukey: but I'm not sure of what I did :( [09:30:54] can you list the things that you did in here? [09:31:03] just to brainstorm [09:31:07] elukey: not really, too much of a mess [09:31:10] :) [09:31:13] I'll do :) [09:32:18] I got the last patches from nuria on the aqs repo, getting the new-aqs-cluster branch [09:34:40] I merged that into master locally and pushed that master to a github repo, to have a new repo having the updated code on master [09:35:25] Then I patched the local aqs-deploy git config for the submodule to reference the github repo, and deployed with server.js [09:35:35] it worked, creating a branch [09:36:45] (Abandoned) Joal: Update aqs to 4d9f516 [analytics/aqs/deploy] - https://gerrit.wikimedia.org/r/310763 (owner: Joal) [09:37:18] then now I'm trying to push the patch to aqs-deploy:new-aqs-cluster (and not master) [09:37:34] but since I don't have push rights on gerrit, I don't know how to do it [09:37:38] elukey: --^ [09:38:39] mmmm I didn't get the whole thing :D [09:38:53] tell me what needs to be detailed :) [09:39:10] I am trying to get the github repo thing [09:39:42] elukey: Using server.js deploy implies using master [09:40:21] Since we don't want to do that on gerrit (we use new-aqs-cluster), I created a tmp repo where the code is in master instead of new-aqs-cluster [09:41:26] elukey: now, I realize that my thing might completely fail, since git will be expecting to be able to access a
certain commit in the submodule, not having it available ... [09:41:52] ah okok now it is clearer :) [09:42:06] but Marko yesterday mentioned that server.js could use a different branch [09:42:13] even if the code yesterday suggested otherwise [09:42:52] elukey: I think mobrovac suggested scap could take a branch, not server.js [09:44:09] I was referring to https://www.mediawiki.org/wiki/ServiceTemplateNode/Deployment#Local_git [09:44:21] re-reading [09:45:34] elukey: no mention of a branch in there I think [09:46:56] git config deploy.remote deploy_repo_remote_name [09:47:06] this one seemed the one to use [09:47:33] but not sure if you can specify a branch [09:47:44] because it mentions only origin [09:47:58] just to know: did you try to play with it by any chance? [09:48:25] elukey: nope [09:48:46] but I read from it in the code :) [09:49:00] and it doesn't do what we need [09:49:01] sigh [09:49:21] SO [09:49:32] I know that what I am about to say is not elegant [09:49:34] but [09:49:40] :) [09:50:04] what if we deploy master on aqs100[456], and hack the DTCS settings manually? [09:50:34] are there also other things that differ from the current cluster? [09:50:39] I might have missed some points [09:50:48] but IIRC the only difference is that one [09:51:06] not sure about the 0/NULL diff though [09:51:13] elukey: I'd suggest something different - Let's move to master assuming everything will be fine in transitioning to the new cluster [09:51:28] elukey: that means, no deploys for old aqs anymore [09:52:13] this could be another option as well [09:52:27] and we'll roll back new-aqs stuff if we need to deploy old-aqs anew [09:53:01] elukey: 2 things to deploy for new-aqs: compaction setting and 0/null conversion [09:53:40] is the 0/null conversion already in the current cluster?
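joal's workaround described above (mirroring the branch's code to the master of a temporary repo, then repointing the deploy repo's src submodule at that mirror) can be sketched with plain git. All repo names and paths below are illustrative stand-ins for the real gerrit/github repos; only the repointing technique itself comes from the log:

```shell
set -e
ROOT="$(mktemp -d)"; cd "$ROOT"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.org
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.org

# upstream src repo (stand-in for analytics/aqs on gerrit)
git init -q src
(cd src && echo 'code' > server.js && git add . && git commit -qm 'initial')

# temporary mirror whose master already carries the wanted code
git clone -q src mirror

# deploy repo embedding src as a submodule (stand-in for analytics/aqs/deploy)
git init -q deploy
cd deploy
git -c protocol.file.allow=always submodule add -q "$ROOT/src" src
git commit -qm 'add src submodule'

# repoint the submodule at the mirror, then propagate the new URL
git config -f .gitmodules submodule.src.url "$ROOT/mirror"
git submodule sync -q src
git -c protocol.file.allow=always submodule update -q --init src
git config --get submodule.src.url
```

The `protocol.file.allow=always` override is only needed because this demo uses local file-path repos, which newer git versions block for submodules by default.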
[09:53:52] (sorry very basic questions but I'd need to clear out some things) [09:54:12] elukey: not yet, but it's in the new-aqs-cluster branch [09:55:25] and this is because the new data loaded on aqs100[456] has this difference while the one in aqs100[123] has not [09:55:30] correct [09:55:40] ok now it makes sense :D [09:55:43] SOOOOO [09:55:57] +1 on merging into master and deploying only to aqs100[456] [09:56:09] simpler and quicker [09:56:23] and with scap we can also do scap deploy --rev [09:56:27] elukey: agreed - Let's wait on nuria_ to triple check [09:56:51] so in case of fire we can just do scap deploy --rev ROLLBACK --limit 100[123] [09:57:13] well no it doesn't make sense [09:57:15] elukey: k [09:57:16] sorry [09:57:17] ahhahaha [09:57:21] anyhooowww [09:57:23] huhu [09:57:25] let's wait for nuria_ [09:57:29] +1 [09:57:31] I am going crazy with nodejs stuff [09:58:41] yeah, we should continue the talk we started a few days ago about package management systems for non-java languages [10:06:49] elukey: joal: right, sorry, service-runner doesn't take a src repo branch [10:06:51] but! [10:06:54] we can make it do so [10:07:49] if not super difficult it would help in these cases [10:08:29] mobrovac: agreed, not super sdifficult :) [10:08:54] sdifficult? [10:08:55] :D [10:09:07] ...and i shall make it so [10:09:35] give me half an hour to go through some stuff and i'll be on it [10:09:38] shouldn't take long [10:10:01] you rock mobrovac :) [10:10:16] i have a heart of gold! [10:10:17] hahaha [10:10:20] :D [10:10:20] ahahahah [10:30:30] joal: elukey: so, you want to build a specific branch of the deploy repo starting from a src repo branch different from master?
[10:30:48] mobrovac: you have it :) [10:30:54] kk [10:31:23] mobrovac: specific branch in repo, I can easily deal with from sync-deploy [10:31:34] but building from another branch than master will be helpful [10:31:46] (the other one as well, but less crucial_ [10:32:57] ok, i [10:33:12] i'll make it possible to do src/branchA -> deploy/branchB [10:33:35] awesome mobrovac :) [10:33:37] Thanks [10:51:58] joal: elukey: uh, gerrit is being a dick about creating remote branches [10:52:04] so, here's what we are going to do [10:52:28] i'll add a feature to service-runner that will allow you to specify the src repo branch [10:52:40] you can then use it to build the deploy repo [10:52:49] but don't use the --review switch [10:53:19] instead, you'll have to create the remote branch manually and then submit the build patch for the deploy repo against that branch manually [10:53:42] it sucks a bit, but it at least allows you to get around the current problems [10:54:44] hm, actually, lemme check something [10:56:19] disregard that, i'll add the deploy branch option too [10:56:45] joal: elukey: but, you have to remember to create the branch manually in gerrit before building the deploy repo [10:56:52] otherwise the process will fail [10:58:21] this seems awesome [10:58:34] glad we could come to an agreement [10:58:35] :) [10:59:36] the only drawback is that we are using an ancient version of service-runner in package.json [10:59:48] so we'll need to upgrade it if we want to use the awesomeness [10:59:49] right? [11:01:54] Analytics, MediaWiki-extensions-WikimediaEvents, The-Wikipedia-Library, Wikimedia-General-or-Unknown, Patch-For-Review: Implement Schema:ExternalLinksChange - https://phabricator.wikimedia.org/T115119#2640146 (Samwalton9) >>! In T115119#2630233, @Nuria wrote: > @Samwalton9: data is being coll... [11:02:48] elukey: yup [11:02:53] no way arround that [11:03:07] i can help with that [11:04:04] hi team! [11:04:49] hey mforns [11:04:54] hi joal ! 
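mobrovac's advice above, creating the remote branch explicitly before building against it, comes down to pushing a ref by its full name. A local bare repo stands in for gerrit in this runnable sketch (on gerrit itself, branch creation additionally requires the Create Reference permission or the admin UI):

```shell
set -e
cd "$(mktemp -d)"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.org
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.org

git init -q --bare remote.git          # stand-in for the gerrit-hosted repo
git clone -q remote.git work
cd work
git commit -q --allow-empty -m 'initial commit'
git push -q origin HEAD:refs/heads/master

# create the new branch on the remote explicitly, before any build/deploy
# step that expects it to already exist
git push -q origin HEAD:refs/heads/new-aqs-cluster
git ls-remote --heads origin
```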
[11:04:55] Thanks a mil mobrovac :) [11:05:58] hey joal, how is pivot data hell? [11:06:31] mforns: didn't pivot that much this morning, fought with git a bit [11:07:52] joal, me neither, tried executing enwiki [11:08:43] Did it work mforns ? [11:09:02] joal, it's taking some time, because I needed the page history and user history before [11:09:02] mforns: I assume not, givewn you dais try :) [11:09:15] Ah right :) [11:10:36] mforns: sorry for unreadable typos :) [11:11:27] mforns: I assume going for enwiki means simplewiki data using month-splitting were the same as other? [11:13:00] * elukey lunch! [11:22:50] joal, after a couple trials, yes* (* = some "latest" values were different, because the split by month broke the state chains...) [11:23:15] mforns: makes sense [11:23:31] joal, but I thought to move on, because Erik does not need the "latest" values, [11:23:49] joal, and we can compute those in the page/user reconstruction code easily. [11:24:02] or else we have to do a medium refactor in the denormalize code [11:24:44] np mforns [11:25:12] I mean, I think we'll only send simplewiki to Erik [11:26:13] joal, sure! [11:30:36] Now the decision to be taken is: Do we send data to Erik as-is, or do we want to improve it before (2 main issues are page history cut in the middle, and anonymous users user text not well recorded) [11:30:40] mforns: --^ [11:31:49] joal: elukey: tested, works, will release a new version of service-runner soon [11:32:44] joal, maybe we can dig a bit more into the page history cut, and timebox it, and see if we find a quick fix or partial fix [11:33:14] k mforns, will start on that then (after my break though ;) [11:33:20] a-team, taking a reak ! [11:33:29] joal, I'll spend my time until standup looking into this [11:33:42] k, will ping you whan I get back [11:33:45] cya! 
[11:38:25] mobrovac: you rock [11:39:44] mobrovac: one thing that I wanted to ask you (since you love your colleagues) would be to create a service-node-template to wrap basic nodejs services that don't need cassandra, api, etc.. [11:39:56] or maybe some doc about the bare minimum skeleton [11:40:33] service-node-template doesn't use or need cassandra [11:41:06] it is a skeleton around the express framework with added bonuses, like logging, metrics collection, error handling etc [11:41:43] sorry I referenced cassandra because I found var uuid = require('cassandra-uuid'); in one of the lib/ files [11:41:50] anyhow, you got my point [11:48:59] elukey: euh, no not sure i do [11:49:27] the things that are in the template are minimal things that we expect prod services to have [11:53:44] mobrovac: all right so probably I need to dive a bit more into that service template :) [11:53:47] will let you know [11:59:18] elukey: ok, new version published [12:00:00] elukey: you don't need to update service-runner in package.json, just do npm i service-runner@2.1.6 in the root of the src repo [12:00:43] this will install the new service-runner temporarily and you will be able to use the new features [12:01:17] ah nice! [12:01:21] then, create the needed branches in gerrit for the src and deploy repos and initialise them properly (pull them locally and do git review -s) [12:01:52] finally, you need to tell the build script the branches' names with git config srcbranch|deploybranch [12:01:57] and you are ready to build [12:02:17] updated https://www.mediawiki.org/wiki/ServiceTemplateNode/Deployment#Local_git with the info on srcbranch and deploybranch [12:02:29] let me know if you encounter any problems [12:03:02] sounds easy, will wait for joal to test it [12:04:10] mobrovac: the only thing that doesn't work is the npm i in the root of src [12:04:18] npm tells me "invalid" [12:04:49] you are in the root of the src repo, not the src submodule of the deploy repo, right?
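The build-script configuration mobrovac describes boils down to custom git config keys read at build time. A minimal runnable sketch; the fully qualified key names (`deploy.srcbranch` / `deploy.deploybranch`) are an assumption here, since the log only says "git config srcbranch|deploybranch" — the ServiceTemplateNode/Deployment wiki page has the authoritative ones:

```shell
set -e
cd "$(mktemp -d)"
git init -q .
# key names below are assumed for illustration; check the wiki page cited
# in the log for the exact section/key the build script reads
git config deploy.srcbranch new-aqs-cluster
git config deploy.deploybranch new-aqs-cluster
git config --get deploy.srcbranch
```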
[12:05:15] yes [12:05:32] where node_modules is [12:05:54] └─┬ service-runner@2.1.6 invalid [12:05:55] └─┬ limitation@0.1.9 [12:05:55] └── kad@1.3.6 [12:06:07] hm so it did install it? [12:06:29] what does grep version node_modules/service-runner/package.json say? [12:06:59] WHAT - 2.1.6 [12:07:07] at this point I am confused [12:07:45] it's probably because in your package.json you have a version that doesn't match 2.1.6 so it complained about that [12:08:04] never mind that now [12:08:13] I've read it as "mess with dependencies, not installing it" [12:08:27] it will still install the service-runner version you have in package.json [12:08:28] anyhow [12:08:44] I got your point, thanks :) [12:11:15] for the build process, that is ^ [12:11:20] s/for/during/ [12:11:26] anyhow, all good, you can continue [12:11:27] :D [12:11:39] thanks! [12:35:02] Analytics-Tech-community-metrics, Developer-Relations (Jul-Sep-2016), Documentation: Create basic/high-level Kibana (dashboard) documentation - https://phabricator.wikimedia.org/T132323#2640352 (Aklapper) I updated the screenshot and the terms in https://www.mediawiki.org/wiki/Community_metrics#User_... [12:48:14] (PS1) Joal: Update aqs to 4d9f516 [analytics/aqs/deploy] (new-aqs-cluster) - https://gerrit.wikimedia.org/r/310816 [12:48:38] mobrovac: That patch of yours is awesome :D [12:48:55] elukey: you're good to go --^ [12:49:08] Back to being away ;) [12:53:02] it worked i see :) [12:53:03] awesome! [13:01:56] (CR) Elukey: [C: 1] "I haven't checked node_modules dependencies for obvious reasons but the src SHA looks good :)" [analytics/aqs/deploy] (new-aqs-cluster) - https://gerrit.wikimedia.org/r/310816 (owner: Joal) [13:02:22] joal: --^ [13:02:24] LGTM [13:02:51] now the next step is to merge and do the trick with scap [13:03:09] maybe --rev X --limit aqs1004 to test? 
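On the confusing `npm i` output above: npm's `invalid` marker in the tree just means the installed version does not satisfy the range declared in package.json — the install itself still happens. Checking the installed package's own manifest, as mobrovac suggests, settles it. A runnable sketch with a fabricated `node_modules` tree standing in for the real one:

```shell
set -e
cd "$(mktemp -d)"
# fabricate the layout npm would have produced after `npm i service-runner@2.1.6`
mkdir -p node_modules/service-runner
printf '{\n  "name": "service-runner",\n  "version": "2.1.6"\n}\n' \
  > node_modules/service-runner/package.json
# the check from the log: read the actually-installed version
grep '"version"' node_modules/service-runner/package.json
```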
[13:09:05] * elukey awards a hearth token to mobrovac [13:09:43] maybe without the ending h [13:09:46] details [13:09:55] :) [13:10:10] for deploying, you can perhaps use --environement too [13:10:12] * mobrovac checking the docs [13:16:12] https://doc.wikimedia.org/mw-tools-scap/scap3/quickstart/setup.html documents the environments [13:16:19] but there is no mention of the list of nodes... [13:16:26] so -l ought to be used [13:19:21] there is a --list no? [13:19:25] sorry limit [13:20:30] yes [13:20:33] -l == --limit [13:20:34] :) [13:20:51] also, i looked at the scap code, and you can use --rev branch_ame [13:20:52] name [13:21:02] so you should be good to go here [13:22:14] nice [13:22:28] joal: ? [13:25:25] elukey: you can also have a specific scap.cfg for the new cluster if you want [13:25:47] elukey: I'm here, let's try [13:26:17] eqi aqs1004 [13:26:19] oop [13:31:50] mobrovac: thanks, will check that.. but atm I just want a way to make this deployment.. After that, the plan is to put the new cluster in prod and nuke the old cluster :) [13:32:08] sure, sure, elukey [13:32:12] just mentioning .,.. [13:32:15] (CR) Elukey: [C: 2 V: 2] Update aqs to 4d9f516 [analytics/aqs/deploy] (new-aqs-cluster) - https://gerrit.wikimedia.org/r/310816 (owner: Joal) [13:32:41] mobrovac: you are very kind sir, thanks ;) [13:38:43] elukey: moving forward? 
[13:39:36] I was finishing a thing in deployment-prep, checking tin now [13:39:49] elukey: no rush, I was wondering :) [13:44:49] so what I did [13:44:59] 1) git fetch && git checkout new-aqs-cluster [13:45:03] all good and sound [13:45:12] 2) git checkout master [13:45:29] now following what has been written, I should be able to do: scap deploy --rev new-aqs-cluster --limit aqs1004.eqiad.wmnet [13:45:49] seems correct elukey [13:46:05] all right executing [13:46:41] I forgot submodule [13:46:42] grr [13:46:51] :( [13:48:18] it seems doing promote and restart now [13:50:10] all right promote failed and I rolled back [13:50:13] checking logs [13:51:16] elukey: don't git checkout master, stay on new-aqs-cluster and git submodule update --init from there [13:52:59] mobrovac: yes yes it was my next action, 13:52:32 Finished Deploy: analytics/aqs/deploy (duration: 00m 01s) [13:53:10] all good :) [13:53:24] yay :) [13:53:55] joal: do you want to sanity check with me aqs1004? [13:54:03] sure elukey [13:56:06] mmmm I don't find anything in /srv/deployment/analytics/aqs/deploy/src [13:56:32] in fact alerts in ops :D [13:56:49] right ... [13:56:58] elukey: checkout issues? [13:57:11] or maybe submodule issue [13:57:22] * mobrovac looking [13:57:57] the submodule wasn't checked out there [13:58:59] mobrovac: ? [13:59:18] ? [13:59:46] this one --> "the submodule wasn't checked out there" [13:59:48] :D [14:00:04] on aqs1004 [14:00:08] the src dir is empty [14:00:21] ah yes ok [14:00:29] I thought you were referrint to tin [14:00:41] ah! [14:00:42] maybe the --ref doesn't work as expected? [14:00:58] elukey: try scap deploy --force --rev new_aqs_cluster [14:01:01] what if I try scap deploy --limit aqs1004 from new-aqs-cluster? 
[14:02:04] i see in the logs "/srv/deployment/analytics/aqs/deploy-cache/revs/56ab863456b409b1f27075597bc9c0b2f69a5da3 is already live" [14:02:05] so it didn't do anything [14:02:06] so you need to force the deployment [14:02:23] but the bizarre thing is - why does it say that the dir already exists? [14:04:10] mobrovac: mmm I did but scap didn't like it a lot [14:04:49] Unable to checkout '4d9f5160687f8dc3df3401453d2da5e861c19db7' in submodule path 'src' [14:06:24] elukey: a workaround would be to create a dummy commit in the new_aqs_cluster branch (like change the readme or touch a stupid file or whatever) and then try to deploy the new SHA1 [14:06:45] sometimes scap gets confused with re-deploying the same rev or sth [14:06:54] anyhow, better to have a clean one [14:07:35] Analytics-Tech-community-metrics, Developer-Relations (Jul-Sep-2016), Documentation: Create basic/high-level Kibana (dashboard) documentation - https://phabricator.wikimedia.org/T132323#2640523 (Aklapper) Open>Resolved https://www.mediawiki.org/w/index.php?title=Community_metrics&oldid=223914... [14:08:32] mobrovac: we can try [14:09:17] ottomata: we have event logging events in hadoop, right? [14:09:25] joal: do you have time to file a new code review? [14:09:27] but there's no hive table? [14:09:29] or I just can't find it [14:10:28] milimetric: no hive table [14:10:29] heheh [14:10:31] i was about to do something similar [14:10:38] no, it would be a night mare to maintain hive tables for that :) [14:10:47] i'm trying to get a spark shell now to try uaparser [14:10:53] was going to add an example to the wikitech page [14:11:00] buuut, am still waiting for room on hadoop! [14:11:05] ottomata: ok, cool, to answer tilman's question right? 
[14:11:11] ya [14:11:18] elukey: hmm, the scap state does not seem clean on aqs1004 [14:11:19] elukey: I can do that [14:11:27] ok, I'll let you do that then, and post a simple example of using it from Hive [14:11:33] ok cool [14:11:39] but explain why it's hard to maintain hive tables [14:11:47] because the data is [14:11:49] 'free form' [14:11:51] yes there are schemas [14:11:58] but the schemas vary in backwards incompatible ways [14:12:07] and there are hundreds of them [14:12:09] mobrovac: we have that revision on the new-aqs-cluster branch even in the src repo [14:12:19] could it be that scap tries to look in master? [14:12:19] elukey: i'd advise forcing a deploy of 2c471aa2abbfc21d31eb7edbfb8df8387d6d922d first (the current master), then introduce the dummy commit and then try to deploy new_aqs_cluster [14:12:27] we'd have to manually maintain each hive table [14:13:03] mobrovac: so checkout master and then scap deploy --rev 2c471aa2abbfc21d31eb7edbfb8df8387d6d922d --limit aqs1004? [14:16:32] elukey: git checkout master && git submodule update --init && scap deploy -f --limit aqs1004 [14:16:45] yes ok [14:16:59] ottomata: in case of 'mutually compatible' schemas, spark could merge data :) [14:17:19] ok done [14:17:39] Analytics-Tech-community-metrics, Developer-Relations (Jul-Sep-2016), Documentation: Create basic/high-level Kibana (dashboard) documentation - https://phabricator.wikimedia.org/T132323#2640549 (Qgil) Thank you very much! [14:17:45] now I can see src populared [14:17:47] mforns: I'm watching elukey fight with aqs, but can still listen to you while I'm not needed [14:17:49] *populated [14:18:06] joal, batcave?
[14:18:14] sure mforns [14:18:16] k [14:18:59] joal: ja, spark can deal, but hive not as well [14:19:08] ottomata: yes [14:24:00] joal: I am wondering if the fact that we have the new-aqs-cluster branch also in the src repo might cause this [14:25:00] elukey: I wouldn't think it cause troubl [14:25:38] elukey: I would tend to think the same you did: does scap look in branches for submodules? [14:26:09] yes this is what I meant [14:26:15] sorry maybe it wasn't clear [14:26:22] elukey: maybe it was related to having tried a deploy without submodule update [14:26:24] i checked on tin the deploy repo [14:26:32] and .git/modules/src/config shows only master [14:26:34] it happened to me before and generated a mess [14:26:52] ah nice [14:28:05] but "Unable to checkout '4d9f5160687f8dc3df3401453d2da5e861c19db7' in submodule path 'src'" looks really like it doesn't find the commit [14:33:40] :( elukey [14:35:56] Analytics-Kanban, Patch-For-Review: Make reportupdater support passing the values of an explode_by using a file path - https://phabricator.wikimedia.org/T132481#2199910 (Milimetric) reviewing now. btw, to ensure changes get merged in a certain order, you can just chain them in gerrit (make them on top o... 
[14:36:40] elukey: since you have root, on aqs1004 do: sudo rm -rf /srv/deployment/analytics/aqs/deploy-cache/revs/56ab863456b409b1f27075597bc9c0b2f69a5da3 before the next deploy [14:36:45] to clean up the mess [14:37:15] and then try deploying new_aqs_cluster again [14:38:37] mobrovac: current seems to be revs/2c471aa2abbfc21d31eb7edbfb8df8387d6d922d, ok it should be the right thing to do [14:39:19] yes, 56ab is not used currently, so just clean it up because it doesn't have the submodule checked-out [14:40:13] done [14:42:06] (CR) Milimetric: [C: 2 V: 2] Support passing the exploded values by file path [analytics/reportupdater] - https://gerrit.wikimedia.org/r/306966 (https://phabricator.wikimedia.org/T132481) (owner: Mforns) [14:42:32] mobrovac: so now I retry the deployment [14:42:46] scap deploy --rev new-aqs-cluster --force --limit aqs1004.eqiad.wmnet [14:44:48] wow this time it worked [14:45:06] so it was probably me forgetting to submodule init before deploying? [14:45:11] as joal mentioned [14:45:14] (CR) Milimetric: [C: 2] Make use of the new explode by file feature [analytics/limn-language-data] - https://gerrit.wikimedia.org/r/307243 (https://phabricator.wikimedia.org/T132481) (owner: Mforns) [14:45:19] (Merged) jenkins-bot: Make use of the new explode by file feature [analytics/limn-language-data] - https://gerrit.wikimedia.org/r/307243 (https://phabricator.wikimedia.org/T132481) (owner: Mforns) [14:45:31] thanks milimetric :] [14:45:54] milimetric, are you merging and deploying? [14:45:54] thanks mobrovac [14:46:05] hope that I didn't delay your trip :) [14:46:11] mforns: reportupdater deploys automatically right, it's still ensure: latest? 
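Pulling the thread together, the sequence that finally worked (per the log: clear the stale half-checked-out revision on the target, make sure the submodule is initialised, then force the deploy) was roughly the following. This is a recap of commands quoted above, not something runnable outside the deployment host:

```
# on aqs1004: remove the stale revision that was "already live" but had
# no submodule checked out
sudo rm -rf /srv/deployment/analytics/aqs/deploy-cache/revs/56ab863456b409b1f27075597bc9c0b2f69a5da3

# on tin, in /srv/deployment/analytics/aqs/deploy
git fetch
git checkout new-aqs-cluster
git submodule update --init        # the step missing from the first attempt
scap deploy --rev new-aqs-cluster --force --limit aqs1004.eqiad.wmnet
```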
[14:46:34] I'll double check [14:47:18] ok, I git pulled just in case [14:48:24] elukey: tested usual requests, works like a charm :) [14:48:25] (CR) Milimetric: [C: 2 V: 2] Make use of the new explode by file feature [analytics/limn-ee-data] - https://gerrit.wikimedia.org/r/307244 (https://phabricator.wikimedia.org/T132481) (owner: Mforns) [14:49:13] elukey: Also, tested 0/null conversion: works :) [14:49:17] elukey: uh just in time, just in time [14:49:20] now, i have to run! [14:49:29] (CR) Milimetric: [C: 2 V: 2] Make use of the new explode by file feature [analytics/limn-multimedia-data] - https://gerrit.wikimedia.org/r/307273 (https://phabricator.wikimedia.org/T132481) (owner: Mforns) [14:49:40] milimetric, the 6 RU patches were to be deployed like 1) support feature, 2) all query repositories using the feature, 3) disable deprecated old way [14:49:53] mobrovac: o/ [14:49:59] joal: gooood! [14:50:00] mforns: yeah, I read that [14:50:06] mforns: that's what should happen [14:50:07] milimetric, ok ok [14:50:09] can I proceed with 1005 and 1006 ? [14:50:26] I merged reportupdater, it should've pulled it before doing the next job, but I pulled it manually just in cse [14:50:34] now I'm merging all the configs [14:50:47] mforns: how come in some places you changed it to wiki_db and in others you left it as "wiki"? [14:50:53] awesomel milimetric thanks a lot! :] [14:51:22] milimetric: sorry, y'day and today time simply passed by without me having time to look at pivot [14:51:29] i'll get to it monday, scout's honour! 
[14:51:42] mobrovac: you don't have to look at pivot at all really [14:51:56] mobrovac: it's just in typescript, and needs to be compiled to JS before it runs [14:52:02] i do need to look at the gulp buid though [14:52:08] build [14:52:22] mobrovac: oh ok, sure, no huge rush, but the sooner the better [14:52:30] kk [14:52:34] mobrovac: I can always work around it by building gulp and checking that into the repo [14:52:42] mobrovac: so if you don't have time to think/look let me know [14:52:54] let you know monday [14:52:57] * mobrovac out [14:53:02] have a nice night [14:54:21] milimetric, I tried to leave the var names as they were, I didn't change wiki to wiki_db IIRC, did I? [14:54:26] (CR) Milimetric: [C: 2 V: 2] Disable the deprecated option by_wiki [analytics/reportupdater] - https://gerrit.wikimedia.org/r/306968 (https://phabricator.wikimedia.org/T132481) (owner: Mforns) [14:55:02] mforns: in one of the patches you did, the multimedia one [14:55:27] (CR) jenkins-bot: [V: -1] Disable the deprecated option by_wiki [analytics/reportupdater] - https://gerrit.wikimedia.org/r/306968 (https://phabricator.wikimedia.org/T132481) (owner: Mforns) [14:55:40] milimetric, oh you're right [14:55:49] https://gerrit.wikimedia.org/r/#/c/307273/1/multimedia/config.yaml oh yeah, but it didn't have a param before [14:55:56] no big deal either way [14:56:28] milimetric, I did it, because they were using it as 'wiki' because it was the default name when using the old `by_wiki` option [14:56:39] I'm a little scared of what this means for maintaining the list of wikis. It would be nice to factor that out somewhere, like we make a few specific sets of wikis and publish them, and the path can be a URI [14:56:48] yep yep, np [14:57:03] milimetric, but if you look at the last report config, they were using 'wiki_db' when exploding for a reduced set of wikis [14:57:34] so I thought they would prefer wiki_db, instead of wiki [14:57:39] anyway... 
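The explode_by-from-file feature merged above would look roughly like this in a reportupdater report config. The exact syntax and the file name are assumptions for illustration (the real convention is defined in the analytics/reportupdater repo and the chained patches); only the idea of replacing an inline wiki list with a file path comes from the log:

```yaml
reports:
    edits_per_wiki:
        granularity: days
        # before: an inline list baked into each query repo's config
        #   explode_by:
        #       wiki: enwiki, dewiki, frwiki
        # after: values come from a shared file, one wiki per line
        explode_by:
            wiki: wikis_relevant_to_edit_analysis.txt
```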
[14:58:10] milimetric, yes, those lists of wikis... but actually, there's only 1 list that is full-size, the others are partial on purpose [14:58:42] (CR) Milimetric: [C: -1] "oops, missed the tox problem, commented inline" (1 comment) [analytics/reportupdater] - https://gerrit.wikimedia.org/r/306968 (https://phabricator.wikimedia.org/T132481) (owner: Mforns) [14:59:56] milimetric, oh... I removed that already... git is tricking me... [14:59:56] mforns: yeah, some are partial like as they roll out features, and that's fine, those have to be custom. But some are partial because they wanted "most of the important wikis". For those, like edit, we can make a "wikis relevant to edit analysis" list and publish it [15:00:11] weird [15:00:11] aha [15:01:00] mforns: you're right, the next one is fine: https://gerrit.wikimedia.org/r/#/c/308977/ [15:01:04] and the previous one was fine too [15:01:06] weird... [15:01:28] anyway, I'll review that next one when you fix it so I can rebase [15:38:17] Analytics, Analytics-Kanban, Pageviews-API: Special characters showing up as question marks in /pageviews/top endpoint - https://phabricator.wikimedia.org/T145043#2640862 (Milimetric) p:Triage>Normal a:Nuria [15:39:12] Analytics-Kanban: Redact data so it can be public - https://phabricator.wikimedia.org/T145091#2640864 (Milimetric) This is harder than we thought because we need to exclude not only columns but also **rows**. 
[15:39:33] Analytics: Redact data so it can be public - https://phabricator.wikimedia.org/T145091#2640865 (Milimetric) [15:40:14] Analytics: Add hardware capacity to AQS - https://phabricator.wikimedia.org/T144833#2640868 (Milimetric) p:Triage>Normal [15:40:37] Analytics-Kanban: Switch AQS to new cluster - https://phabricator.wikimedia.org/T144497#2640871 (Milimetric) a:elukey [15:40:45] Analytics-Kanban: Switch AQS to new cluster - https://phabricator.wikimedia.org/T144497#2601636 (Milimetric) p:Triage>Normal [15:41:02] Analytics, EventBus, Wikimedia-Stream: Productionize Public Event Stream Prototype - https://phabricator.wikimedia.org/T143925#2640873 (Milimetric) [15:43:00] Analytics-Kanban, Spike: Spike - Slowly Changing Dimensions on Druid - https://phabricator.wikimedia.org/T134792#2640881 (Milimetric) a:Milimetric>JAllemandou [15:43:42] Analytics-Kanban, Spike: Spike - Slowly Changing Dimensions on Druid - https://phabricator.wikimedia.org/T134792#2277183 (Milimetric) Idea: test this on the pageview datasource already loaded, making a lookup table for the Chrome 41 bug or something else. 
[15:44:05] Analytics: Productionize loading of edit data into Druid (contingent on success of research spike) - https://phabricator.wikimedia.org/T141473#2640885 (Milimetric) [15:44:43] Analytics, Spike: Research spike: load enwiki data into Druid to study lookup table performance - https://phabricator.wikimedia.org/T141472#2640887 (Milimetric) [15:45:10] Analytics-Kanban: Productionize scala code for edit reconstruction - https://phabricator.wikimedia.org/T142552#2640889 (Milimetric) a:JAllemandou [15:45:36] Analytics: Count pageviews for all wikis/systems behind varnish - https://phabricator.wikimedia.org/T130249#2640892 (Milimetric) [15:45:43] Analytics-Kanban: Make top pages for WP:MED articles - https://phabricator.wikimedia.org/T139324#2640893 (Milimetric) a:Milimetric [15:47:00] Analytics: Improve user management for AQS - https://phabricator.wikimedia.org/T142073#2640895 (Milimetric) [15:48:56] Analytics, Analytics-Kanban, Pageviews-API: Special characters showing up as question marks in /pageviews/top endpoint - https://phabricator.wikimedia.org/T145043#2640899 (MusikAnimal) I have another question. Would it be possible to include logic that ensures the page actually exists at the time the... 
[15:49:38] Analytics: Capacity projections of pageview API document on wikitech - https://phabricator.wikimedia.org/T138318#2640905 (Milimetric) [15:52:48] Analytics, Analytics-EventLogging, scap, Patch-For-Review, Scap3 (Scap3-Adoption-Phase1): Use scap3 to deploy eventlogging/eventlogging - https://phabricator.wikimedia.org/T118772#2640913 (Milimetric) [15:53:51] Analytics, Patch-For-Review: Load Avro schemas from configurable external path - https://phabricator.wikimedia.org/T126501#2640917 (Milimetric) [15:57:26] Analytics-Kanban: Create clean simplewiki output from edit history reconstruction - https://phabricator.wikimedia.org/T143321#2640963 (Milimetric) a:Milimetric>mforns [15:59:25] Analytics, Analytics-Dashiki, Need-volunteer: Vital-signs layout is broken - https://phabricator.wikimedia.org/T118846#2640970 (Milimetric) a:Milimetric [16:03:47] Analytics: Create documentation for edit history reconstruction - https://phabricator.wikimedia.org/T139763#2641008 (Milimetric) a:mforns [16:04:35] Analytics: User History: Add history of annonymous users to history reconstruction - https://phabricator.wikimedia.org/T139760#2441784 (Milimetric) [16:08:18] Analytics: Better identify varnish/vcl timeouts and document - https://phabricator.wikimedia.org/T138511#2641061 (Milimetric) a:elukey [16:08:42] Analytics-Kanban: Better identify varnish/vcl timeouts and document - https://phabricator.wikimedia.org/T138511#2402494 (Milimetric) [16:12:10] Analytics, Analytics-Cluster: Monitor cluster running out of HEAP space with Icinga - https://phabricator.wikimedia.org/T88640#2641080 (Milimetric) a:elukey [16:13:09] Analytics, Pageviews-API, Wikimedia-General-or-Unknown: 404.php was most read article - https://phabricator.wikimedia.org/T145791#2641089 (Jdlrobson) [16:21:33] Analytics, Analytics-Dashiki, Need-volunteer: Vital-signs layout is broken - https://phabricator.wikimedia.org/T118846#2641155 (Milimetric) [16:23:38] Analytics: Capacity projections of pageview API document 
on wikitech - https://phabricator.wikimedia.org/T138318#2641167 (Milimetric) [16:30:04] Analytics: Reportupdater calculations for Pages Created and Edit counts - https://phabricator.wikimedia.org/T141479#2641209 (Milimetric) a:mforns [16:32:39] Analytics: Put data needed for edits metrics through Event Bus into HDFS - https://phabricator.wikimedia.org/T131782#2641220 (Milimetric) a:Ottomata [16:36:34] Analytics, RESTBase: REST API entry point web request statistics at the Varnish level - https://phabricator.wikimedia.org/T122245#2641252 (Milimetric) We can't prioritize this this quarter due to other work, but we can point you to other teams (like search) that have deployed similar jobs, and help you a... [16:39:00] Analytics: Check if we can deprecate legacy TSVs production (same time as pagecounts?) - https://phabricator.wikimedia.org/T130729#2144772 (Milimetric) [16:39:47] Analytics-Cluster, Analytics-Kanban: Deploy hive-site.xml to HDFS separately from refinery - https://phabricator.wikimedia.org/T133208#2641305 (Milimetric) a:JAllemandou [16:42:57] Analytics: Check if we can deprecate legacy TSVs production (same time as pagecounts?) - https://phabricator.wikimedia.org/T130729#2144772 (Ottomata) This class could be removed: https://github.com/wikimedia/operations-puppet/blob/production/modules/statistics/manifests/rsync/webrequest.pp That would stop t... [16:45:20] Analytics, Spike: Spike - Slowly Changing Dimensions on Druid - https://phabricator.wikimedia.org/T134792#2641317 (Milimetric) [17:04:28] going afk team! o/ [17:20:40] milimetric, I need to cook sth for my family before they get back home, they had an unexpected day, do you mind if I skip the meeting? [17:27:58] not at all mforns I'll keep you updated [17:28:20] have a nice dinner / night [17:29:52] Analytics, Analytics-Dashiki: Enable HTTP2 for dashiki - https://phabricator.wikimedia.org/T145801#2641495 (Milimetric) [17:30:12] milimetric, thanks! 
[18:07:27] Analytics, Wikimedia-Stream, service-runner: Support node cluster sticky-session in service-runner - https://phabricator.wikimedia.org/T145805#2641608 (Ottomata) [18:24:41] (PS1) Milimetric: Remove pesky bad symlinks [analytics/dashiki] - https://gerrit.wikimedia.org/r/310902 [18:24:50] (CR) Milimetric: [C: 2 V: 2] Remove pesky bad symlinks [analytics/dashiki] - https://gerrit.wikimedia.org/r/310902 (owner: Milimetric) [18:25:49] Analytics, Wikimedia-Stream, service-runner: Support node cluster sticky-session in service-runner - https://phabricator.wikimedia.org/T145805#2641696 (Ottomata) Hm, we will need to support some kind of ip hash based load balancing (from varnish?) in order to route sessions to the same box anyway. A... [19:05:59] Analytics-Cluster, Operations, Patch-For-Review: decom titanium - https://phabricator.wikimedia.org/T145666#2641821 (Dzahn) 11:33 < mutante> !log titanium - stop salt, stop puppet, revoke puppet cert, delete salt key [19:36:30] (PS4) Mforns: Disable the deprecated option by_wiki [analytics/reportupdater] - https://gerrit.wikimedia.org/r/306968 (https://phabricator.wikimedia.org/T132481) [19:39:03] joal: still around [19:39:04] ? [19:39:07] need spark magic fingers [19:39:16] hey ottomata [19:39:22] you're lucky, good timing :) [19:39:25] haha [19:39:35] am trying to use UAParser from refinery-core in spark-shell [19:39:36] What's up? [19:39:42] org.apache.spark.SparkException: Task not serializable [19:39:47] Caused by: java.io.NotSerializableException: org.wikimedia.analytics.refinery.core.UAParser [19:39:55] yessir [19:39:59] gone through that already [19:40:02] because it's an instantiated object? [19:40:09] yeah i remember this sorttaaaaa [19:40:11] been a while [19:40:34] is it possible?
[19:41:35] ottomata: not possible without changing UAParser a bit [19:42:37] (CR) Mforns: "Ready for merge" [analytics/limn-edit-data] - https://gerrit.wikimedia.org/r/307274 (https://phabricator.wikimedia.org/T132481) (owner: Mforns) [19:42:53] ottomata: UAParser makes use of CachingParser which is not serializable [19:44:04] ottomata: only way I found to overcome that is to emulate caching parser using LRUMap [19:44:21] aye so I can't easily use our UAParser directly [19:44:23] was afraid of that [19:44:31] could make a SerializableUAParser [19:44:38] that uses only serializable stuff [19:44:39] ottomata: definitely [19:44:43] but mehhhhh [19:44:48] was hoping for an easy solution for tiliman [19:44:58] i think i'll tell him how to make a hive table and use the UDF [19:45:00] ottomata: Yeah I assume so :( [19:45:01] probably easier :/ [19:45:24] ottomata: ohhhh, thinking of that: I wonder if it's not possible to use UDFs in spark [19:45:30] oh [19:45:31] hm [19:45:36] would be awkward but maybe [19:46:31] ottomata: inefficient way is to call new UAParser at every row :( [19:46:34] sadness [19:52:56] ottomata: tried to use hive udf - same issue [19:53:27] ottomata: sorry mate :( [19:53:59] ottomata: I'll try tomorrow morning to make it work using some other tricks [19:54:10] Going away for tonight, later a-team [19:54:14] aye [19:54:15] s'ok np [19:54:19] gonna just give the hive solution [19:54:20] thanks joal [19:54:23] goodnight! [19:54:27] bye joal [19:54:30] ! [20:22:21] Analytics-Kanban: Make top pages for WP:MED articles - https://phabricator.wikimedia.org/T139324#2642067 (Doc_James) Look forwards to the outcome. Andrew West does the top 5000 but we do not have a total for the entire project. Would also like similar totals for other languages. 
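[Editor's note] The `Task not serializable` exchange above describes the standard Spark closure-serialization problem: a field holding a non-serializable object (here `CachingParser` inside `UAParser`) cannot be shipped to executors. A common workaround, besides instantiating the parser per row or per partition, is a serializable wrapper whose parser lives in a `@transient lazy val`, so it is rebuilt on first use in each executor JVM instead of being serialized. The sketch below uses a hypothetical `NonSerializableParser` stand-in (refinery-core's `UAParser` is not available here) and simulates what Spark does by round-tripping the wrapper through Java serialization:

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}

// Hypothetical stand-in for a non-serializable parser such as
// refinery-core's UAParser wrapping CachingParser.
class NonSerializableParser {
  def parse(ua: String): String = ua.toLowerCase
}

// Serializable wrapper: the @transient lazy val is skipped during
// serialization and rebuilt on first access after deserialization,
// i.e. once per executor JVM rather than once per row.
class ParserWrapper extends Serializable {
  @transient lazy val parser: NonSerializableParser = new NonSerializableParser
  def parse(ua: String): String = parser.parse(ua)
}

object SerializationDemo {
  // Round-trip an object through Java serialization, roughly what Spark
  // does when shipping a task closure to executors.
  def roundTrip[T <: Serializable](obj: T): T = {
    val buf = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(buf)
    out.writeObject(obj)
    out.close()
    val in = new ObjectInputStream(new ByteArrayInputStream(buf.toByteArray))
    in.readObject().asInstanceOf[T]
  }

  def main(args: Array[String]): Unit = {
    val wrapper = roundTrip(new ParserWrapper)
    // The parser field was not serialized; it is lazily re-created here.
    println(wrapper.parse("Mozilla/5.0"))
  }
}
```

In an actual job one would use the wrapper inside `map`, or instantiate the parser once per partition via `rdd.mapPartitions { rows => val p = new UAParser(...); rows.map(p.parse) }`, which avoids both serialization and the per-row construction cost lamented above.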
[20:22:48] Analytics-Kanban: Make top pages for WP:MED articles - https://phabricator.wikimedia.org/T139324#2642069 (Doc_James) Which articles pertain to medicine in other languages can be found through wikidata language links. [23:02:35] Analytics, Analytics-Dashiki: passport-mediawiki-oauth doesn't support callback parameter - https://phabricator.wikimedia.org/T145828#2642376 (Jdlrobson) [23:02:44] Analytics, Analytics-Dashiki: passport-mediawiki-oauth doesn't support callback parameter - https://phabricator.wikimedia.org/T145828#2642391 (Jdlrobson)