[06:54:43] Hi a-team [07:14:58] Analytics-Kanban: Compile a request data set for caching research and tuning - https://phabricator.wikimedia.org/T128132#2535901 (Danielsberger) I cannot answer the first point as we did not include the http_status column into this data set. Thanks for pointing this out, including http_status might help clea... [07:24:27] goood morning :) [07:31:12] joal: Moritz has just merged a firewall patch for druid, that is a no-op. If you see anything weird let me know :) [07:31:41] elukey: only two things I can think of are: loading fails, querying fails [07:32:10] elukey: Querying still works [07:32:21] elukey: We'll know about loading tonight [07:34:29] so the patch limits access to druid to the analytics network only [07:34:37] loading will be ok [07:34:52] but let's double check just in case :) [07:35:23] elukey: from a querying perspective, I think for the moment we only use UIs that run on stat1002 (not-prod ottomata managed stuff) [07:48:53] joal: any preference for the new cassandra username? [07:49:31] elukey: aqs-related name? [07:49:48] like aqs-worker for instance, or something else in the area? [07:51:29] aqs might be good [07:51:38] k [07:51:48] all right [07:52:45] would it be possible for you to use different users for loading jobs? cassandra for the current cluster and aqs for the new one? [07:53:18] even if I would really like to switch the old cluster too [08:24:18] elukey: sorry didn't notice your ping [08:24:23] elukey: sure, very easy [08:24:46] elukey: however I think it could also be good to have a dedicated user for loading, no? [08:36:43] joal: could I poke you back towards https://gerrit.wikimedia.org/r/#/c/302102/ https://gerrit.wikimedia.org/r/#/c/301661/ and https://gerrit.wikimedia.org/r/#/c/301657/ to see if we can get them in this week? 
:) [08:39:59] joal: I have been thinking if it is worth it since afaik we don't set any special access control on the users [08:40:25] I mean, we could use one user for restbase and one for loading but they would be basically the same [08:40:38] I'd say that the first step would be to move away from admin [08:40:43] to a single user [08:40:55] then we could implement more fine grained control if we need [08:41:15] does it make sense? [08:45:42] Analytics-EventLogging, DBA, ImageMetrics: Drop EventLogging tables for ImageMetricsLoadingTime and ImageMetricsCorsSupport - https://phabricator.wikimedia.org/T141407#2536049 (jcrespo) @Jdforrester-WMF Metrics are still coming in as recent as `20160808090453`, and thus these tables are being recreat... [08:46:27] joal: ahh ok now I got it, adduser.cql.erb in operations/puppet regulates the permissions for the new user [08:46:49] so we could think to have two users [08:47:21] or just to use one in the beginning (aqs) and then add another one afterwards [08:47:34] the big part is to switch the restbase user [08:47:44] then the other ones would be easy to add [08:48:29] the script is a bit weird though since it grants permissions to all the keyspaces? Does it mean also system.auth? [08:48:32] mmmmm [08:48:37] * elukey reads documentation [09:15:25] Analytics-Kanban: Improve user management for AQS - https://phabricator.wikimedia.org/T142073#2536087 (elukey) Interesting reading: https://issues.apache.org/jira/browse/CASSANDRA-5310 > QUORUM is only used for default superuser ('cassandra'), for other users ONE is used. You are not supposed to use 'cassan... 
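elukey's question above (does granting on all keyspaces also cover the auth keyspace?) can be made concrete with a small sketch. The Python below is a hypothetical illustration and not the contents of adduser.cql.erb: it builds per-keyspace `GRANT` statements for a user and skips Cassandra's own keyspaces. The helper name `grant_statements`, the `SYSTEM_KEYSPACES` set and the chosen permissions are assumptions made for the example.

```python
# Keyspaces that belong to Cassandra itself (assumed list for a 2.x cluster).
SYSTEM_KEYSPACES = {"system", "system_auth", "system_traces"}


def grant_statements(user, keyspaces, permissions=("SELECT", "MODIFY")):
    """Return CQL GRANT statements for every non-system keyspace.

    `user` is inserted verbatim, so it must be a plain identifier
    such as `aqs` (a hyphenated name would need quoting).
    """
    statements = []
    for keyspace in sorted(keyspaces):
        if keyspace in SYSTEM_KEYSPACES:
            # Never hand out rights on the auth/system tables.
            continue
        for permission in permissions:
            statements.append(
                'GRANT {} ON KEYSPACE "{}" TO {};'.format(permission, keyspace, user)
            )
    return statements


if __name__ == "__main__":
    example = ["system_auth", "local_group_default_T_pageviews_per_article_flat"]
    for statement in grant_statements("aqs", example):
        print(statement)
```

Granting keyspace by keyspace is more verbose than a single grant on all keyspaces, but it keeps the list of what the user can touch explicit.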
[10:48:03] (PS1) Addshore: Allow passing a date to WD dumpDownloads [analytics/wmde/scripts] - https://gerrit.wikimedia.org/r/303784 [10:51:05] (PS2) Addshore: Allow passing a date to WD dumpDownloads [analytics/wmde/scripts] - https://gerrit.wikimedia.org/r/303784 [10:55:55] Analytics, Datasets-General-or-Unknown: UploadWizard dataset is empty, limn has no data - https://phabricator.wikimedia.org/T112851#2536158 (Milimetric) Invalid>declined @Nemo_bis, those are the uploading metrics that the team wanted to track, that probably just changed over time. Anyone can eas... [10:57:04] Analytics: Better publishing of Annotations about Data Issues - https://phabricator.wikimedia.org/T142408#2536161 (Milimetric) @MusikAnimal: happy to work with you on this if you'd like, to understand what would be easier from your point of view. [10:59:57] (PS3) Addshore: Allow passing a date to WD dumpDownloads [analytics/wmde/scripts] - https://gerrit.wikimedia.org/r/303784 [11:00:08] (PS1) Addshore: Allow passing a date to WD dumpDownloads [analytics/wmde/scripts] - https://gerrit.wikimedia.org/r/303785 [11:00:12] (CR) Addshore: [C: 2] Allow passing a date to WD dumpDownloads [analytics/wmde/scripts] - https://gerrit.wikimedia.org/r/303785 (owner: Addshore) [11:00:15] (CR) Addshore: [C: 2] Allow passing a date to WD dumpDownloads [analytics/wmde/scripts] - https://gerrit.wikimedia.org/r/303784 (owner: Addshore) [11:01:48] (Merged) jenkins-bot: Allow passing a date to WD dumpDownloads [analytics/wmde/scripts] - https://gerrit.wikimedia.org/r/303785 (owner: Addshore) [11:01:51] (Merged) jenkins-bot: Allow passing a date to WD dumpDownloads [analytics/wmde/scripts] - https://gerrit.wikimedia.org/r/303784 (owner: Addshore) [11:10:40] hi a-team :] [11:10:53] o/ [11:22:03] joal: I just merged the first part of the user management changes to aqs100[456] and aqs1001 to create the adduser.cql file in /etc [11:22:33] will run puppet also to 100[23] in a bit, just wanted to make sure that the 
patch created only that file [11:23:00] the next step would be to execute the adduser.cql file on aqs100[456] and then switch restbase to use it via puppet [11:23:28] test and then maybe switch only aqs1001 to the same settings [11:23:33] to see performance differences [11:23:38] the procedure seems super safe [11:24:49] my understanding is that I could proceed even with compaction ongoing [11:24:58] since adding a user affects only the system.auth table [11:25:12] buuuut after yesterday I am not in a hurry anymore :P [11:34:34] all right merged also to aqs100[23] all good [11:43:16] elukey: I t [11:43:30] elukey: I think you can create the new user, it won't affect anything [11:44:17] elukey: as for multiple users, it's just a suggestion, I really don't mind using aqs for loading [11:45:25] joal: I started to not trust myself after yesterday when it comes to cassandra [11:45:27] finally: about 100[456], just keep in mind we can't easily deploy (or we need to do it manually) [11:45:28] ahahahaha [11:45:32] huhuhu :) [11:45:41] ah yes that problem too [11:46:33] we might change the username manually [11:46:40] elukey: I think its best [11:47:04] I thought it was in puppet though [11:47:20] there is also an aqs change too? [11:47:21] elukey: I'll also load using the new user (to test) [11:47:34] elukey: puppet updates aqs conf file [11:47:36] I think [11:47:44] ah there you go [11:47:45] okok [11:47:53] so I am going to create the new user [11:48:03] on all nodes [11:48:11] elukey: And actually you're right, since aqs conf is puppet managed, no need to deploy, all good (sorry, my mistake) ! 
[11:48:26] nono I like to brainstorm with you [11:48:39] I ask your opinion on purpose [11:48:54] elukey: I take that as a compliment :) [11:48:58] * joal blushes a bit [11:49:06] it was :) [11:49:07] Hi mforns [11:49:12] hi joal :] [11:49:18] so that you know mforns [11:49:47] mforns: I'm currently updating PhpUnserializer and SubGraphPartitioner to match your comments [11:50:03] I'll also create a unit test for partitioner [11:50:14] joal, ok, but shouldn't we discuss them first? [11:50:31] mforns: I'm only modifying the things where I think you're right ;) [11:50:40] mforns: I don't mind multiple passes [11:50:45] ok [11:51:13] joal, yesterday I found out what was happening with English wiki [11:51:31] mforns: Great ! What was it? [11:51:49] it was a parsing problem, just another alternative storage format for a segment of newusers events that I had overlooked [11:52:17] I fixed it and executed on enwiki and it worked fine, like in 10 minutes, as expected [11:52:23] mforns: Maaaaan, you're our expert in logging table user format [11:52:30] mforns: AWESOME :) [11:52:40] mforns: how many unmatched events? [11:52:56] :] also vetted the data and looks good, I'm pushing it now to gerrit [11:53:03] so we can comment on it [11:53:20] (CR) Joal: "Reply for nuria." (1 comment) [analytics/refinery] - https://gerrit.wikimedia.org/r/301661 (https://phabricator.wikimedia.org/T141525) (owner: Addshore) [11:54:47] (CR) Joal: [C: 2] Fix WikidataArticlePlaceholderMetrics class doc [analytics/refinery/source] - https://gerrit.wikimedia.org/r/302102 (owner: Addshore) [11:56:06] addshore: on https://gerrit.wikimedia.org/r/#/c/301657/ and https://gerrit.wikimedia.org/r/#/c/301661/, nuria asked for more descriptive comments [11:56:25] ahh, I didn't see that there were comments as there was no +/- ! 
[11:56:28] addshore: except from that, I think it should be ok to merge them this week (not sure about deploy though) [11:56:39] addshore: :) [11:58:17] (Merged) jenkins-bot: Fix WikidataArticlePlaceholderMetrics class doc [analytics/refinery/source] - https://gerrit.wikimedia.org/r/302102 (owner: Addshore) [11:59:36] a-team, taking a break, will be back in a while [11:59:46] ok, cya [12:06:21] aqs@cqlsh> select project from "local_group_default_T_pageviews_per_article_flat".data limit 10; [12:06:31] works very fine on aqs1004 [12:06:37] so the new user works :) [12:10:53] Analytics-Kanban: Improve user management for AQS - https://phabricator.wikimedia.org/T142073#2536277 (elukey) aqs user added to aqs100[456] and verified that it returns data on each instance with the following query: ``` elukey@aqs1004:~$ cat showdata.cql select project from "local_group_default_T_pageview... [12:17:05] (PS15) Mforns: [WIP] Refactor Mediawiki History scala code [analytics/refinery/source] - https://gerrit.wikimedia.org/r/301837 (https://phabricator.wikimedia.org/T141548) (owner: Joal) [12:45:32] all right aqs100[456] are running with restbase and the aqs user [12:45:34] not cassandra [12:48:39] (CR) Mforns: [WIP] Refactor Mediawiki History scala code (10 comments) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/301837 (https://phabricator.wikimedia.org/T141548) (owner: Joal) [12:52:57] test data works (like curl localhost:7232/analytics.wikimedia.org/v1/unique-devices/en.wikipedia/all-sites/daily/1970010100/1970010100) [12:57:50] Analytics-Kanban: User History: Make it work with enwiki - https://phabricator.wikimedia.org/T142472#2536335 (mforns) [12:57:53] Analytics-Kanban: User History: Make it work for enwiki - https://phabricator.wikimedia.org/T142472#2536335 (mforns) [12:58:46] Analytics-Kanban: Compile a request data set for caching research and tuning - https://phabricator.wikimedia.org/T128132#2536351 (BBlack) >>! 
In T128132#2535901, @Danielsberger wrote: > I should first clarify that the 12% figure did not account for the Varnish cache request routing, so it's **all** requests... [13:01:09] Analytics-Kanban: User History: Make it work for enwiki - https://phabricator.wikimedia.org/T142472#2536335 (mforns) The code is in the common gerrit change, patch 15: https://gerrit.wikimedia.org/r/#/c/301837/15 [13:10:44] !log deploying the aqs cassandra user to aqs100[456] (not using it in aqs-restbase yet) [13:10:45] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log, Master [13:12:29] !log deploying the aqs cassandra user to aqs100[123] (not using it in aqs-restbase yet) [13:12:32] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log, Master [13:12:57] the first one was of course wrong :) [13:15:34] Analytics-Kanban: Improve user management for AQS - https://phabricator.wikimedia.org/T142073#2536405 (elukey) New cluster switched, installed the new user in the current one (aqs100[123]) and tested: ``` elukey@aqs1001:~$ cqlsh -u aqs -f showdata.cql aqs1001.eqiad.wmnet Password: project ---------------... [13:17:25] all right, I am technically ready to switch the aqs100[123] cluster [13:17:52] If I run puppet only on aqs1001 it will be possible to test the change there first [13:18:16] the user seems to be working fine from the tests that I've made (updated the phab task with them) [13:24:40] * elukey prepares the code review and waits for urandom and joal [13:26:05] https://gerrit.wikimedia.org/r/#/c/303798/2 [13:27:15] * elukey goes afk for a bit! [14:02:40] * elukey back [14:04:32] urandom: hello! [14:04:42] are you onlinez by any chance? 
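The `aqs100[123]` shorthand and the "aqs1001 first, then the rest" plan above can be sketched in a few lines. This is only an illustration of the notation and of a canary-first ordering; the helper names `expand_hosts` and `rollout_order` are hypothetical, and the sketch treats every character inside the brackets as one host suffix (it does not understand ranges such as `[1-3]`).

```python
import re


def expand_hosts(pattern):
    """Expand 'aqs100[456]' into ['aqs1004', 'aqs1005', 'aqs1006'].

    Each character between the brackets becomes one host name.
    A pattern without brackets is returned unchanged as a single host.
    """
    match = re.fullmatch(r"([^\[\]]*)\[([^\[\]]+)\]([^\[\]]*)", pattern)
    if match is None:
        return [pattern]
    prefix, choices, suffix = match.groups()
    return [prefix + choice + suffix for choice in choices]


def rollout_order(pattern, canary):
    """Return the hosts with the canary first, the others in pattern order."""
    hosts = expand_hosts(pattern)
    if canary not in hosts:
        raise ValueError("canary {!r} is not part of {!r}".format(canary, pattern))
    return [canary] + [host for host in hosts if host != canary]


if __name__ == "__main__":
    # Switch one host, check it, then continue with the remaining ones.
    for host in rollout_order("aqs100[123]", "aqs1001"):
        print(host)
```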
[14:21:06] (CR) Addshore: Create WikidataSpecialEntityDataMetrics (1 comment) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/301657 (https://phabricator.wikimedia.org/T141525) (owner: Addshore) [14:21:17] elukey: Give me a minute to test restbase with the new user on aqs1004 [14:21:20] ;) [14:21:27] of course [14:21:28] (PS9) Addshore: Create WikidataSpecialEntityDataMetrics [analytics/refinery/source] - https://gerrit.wikimedia.org/r/301657 (https://phabricator.wikimedia.org/T141525) [14:21:35] you have also the user on aqs100[123] [14:21:53] I tested it with a basic query but not sure if enough [14:21:59] let me know :) [14:24:55] elukey: have you restarted restbase on aqs1004? [14:25:40] my test is for restbase, not cassandra? [14:25:41] ;) [14:26:39] yeah I restarted on aqs100[456] [14:26:42] (restbase) [14:28:48] great :) [14:28:56] Checking content using a restbase url [14:29:18] I ran something as stupid as select project from "local_group_default_T_pageviews_per_article_flat".data limit 10; [14:29:43] elukey: this tests cassandra from cqlsh, I want the same but from restbase [14:30:00] elukey: still weird thing for icinga checks: /usr/local/lib/nagios/plugins/service_checker: No such file or directory [14:30:29] ah yeah I also ran the test urls [14:30:35] to hit testbase [14:30:39] *resbase [14:30:46] ok today I can't write [14:30:56] :D [14:31:04] where did you find the error? [14:31:16] running /srv/deployment/analytics/aqs/deploy/test/test_local_aqs_urls.sh: [14:31:17] I'll follow up, didn't see it [14:31:29] ah snap this one is new to me [14:31:31] :P [14:31:44] ahh no SElfie obama [14:31:45] okok [14:31:46] elukey: the missing file is the one icinga should use to test restbase is up [14:32:36] elukey: I confirm restbase works with the new user (with the test url I hit at least) [14:32:45] * joal claps at elukey ! [14:33:29] joal: ah yes, maybe service_checker is not used anymore in there? 
I am going to check [14:33:41] weird :( [14:33:42] also joal if you want to review https://gerrit.wikimedia.org/r/#/c/303798/2 [14:33:49] I put a description of what I have in mind [14:35:25] joal: you are not a swagger [14:35:28] this is the problem [14:35:30] :P [14:35:32] elukey@aqs1004:~$ service-checker-swagger 127.0.0.1 http://localhost:7232/analytics.wikimedia.org/v1 [14:35:35] All endpoints are healthy [14:35:47] hm elukey [14:35:59] elukey: service checker must have changed place !P [14:35:59] I can find only that one on aqs [14:36:07] checking also puppet [14:39:58] so joal the only reference to service_checker that I can find is in modules/lvs/manifests/monitor_services.pp, in a comment [14:40:54] we use [14:40:55] monitoring::service { 'aqs_http_root': [14:40:55] description => 'AQS root url', [14:40:55] check_command => "check_http_port_url!${::aqs::port}!/", [14:40:56] contact_group => 'admins,team-services', [14:40:58] } [14:41:19] no bueno [14:41:26] we need a better one [14:41:49] :( [14:42:21] elukey: we have a reference to that file (the service checker one) cause the tests are defined in restbase itself [14:43:03] mforns_lunch: I found the tweet I talked to you about (elukey: you can check it if you wanna laugh) : https://twitter.com/AlexanderEin/status/759485679185387521 [14:43:43] aahhah [14:43:58] :) [14:48:10] mforns_lunch: I have also read the discussion about a code formatter for scala - I agree on having an independent one [14:48:23] milimetric: --^ [14:48:28] milimetric: hi sorry :) [14:49:01] hi joal [14:49:35] independent formatter? ok [14:49:53] milimetric, mforns_lunch : I suggest using scalafmt - I met the guy writing it at the airport when getting back from Berlin, and I think it's good [14:50:00] joal: I found out something by accident that I'm wondering if it's obvious math that I just don't remember [14:50:03] wanna hear? 
[14:50:06] so joal restbase apparently implements the same check that we have [14:50:12] in restbase::monitoring [14:50:14] joal: scalafmt sounds good [14:50:21] but they watch a lot of graphite metrics [14:50:23] I'll ask Eric [14:50:24] elukey: howdy! I am now (though I'm about to go into a meeting) [14:50:34] morning! [14:50:40] good afternoon! [14:50:44] whenever you have time there is a code review for you :) [14:50:45] elukey: hmm [14:50:55] elukey: that doesn't ring a bell for me [14:50:57] I'd like to switch the aqs100[123] cluster to the aqs user [14:51:05] elukey: ok; will do [14:51:05] great milimetric :) [14:51:17] joal: there are also checks from LVS [14:51:20] on the LVS hosts [14:51:28] if the endpoint answers [14:51:36] and I think AQS does not have anything in there [14:51:47] I am going to open a phab task to make a summary and fix [14:52:13] but atm if aqs-restbase is down we get the alarm [14:52:28] Analytics-Kanban: Compile a request data set for caching research and tuning - https://phabricator.wikimedia.org/T128132#2536627 (Danielsberger) Thank you for clarifying the X_cache field, this helps a lot. It seems then that the current Hive query (x_cache like '%cp4006%') allows us to reproduce the cache h... [14:53:17] elukey: I think I don't get it [14:53:23] elukey: But I trust you ! [14:55:55] joal: so my understanding - the aqs::monitoring class executes check_command => "check_http_port_url!${::aqs::port}!/", which basically checks on each aqs host whether the HTTP service is up. This is why I was saying that we get an alarm if aqs is down. It is not super complete but it should work. From the LVS/load-balancer side we could add other alarms, but towards the aqs endpoint, not single hosts 
[14:57:06] elukey: I don't really know how it works (if it's through LVS or not), but I know hyperswitch (restbase, and therefore aqs) defines some tests in project configuration [14:57:30] elukey: Those are the tests using default test data that needs to be inserted at cluster restart [15:01:40] ah this is another one ok ok [15:01:51] didn't know that :) [15:02:00] milimetric, joal: standddupppp [15:02:20] Joining !b [15:12:02] Analytics, Analytics-EventLogging: Ensure no dropped messages in analytics eventlogging processors when stopping broker - https://phabricator.wikimedia.org/T142430#2536657 (Ottomata) p:Triage>Normal [15:24:55] joal: looks like addshore patches are reday to merge right? [15:24:58] *ready [15:25:09] should be nuria_ ! [15:25:19] nuria_: agreed :) [15:26:07] (CR) Nuria: [C: 2] Create WikidataSpecialEntityDataMetrics [analytics/refinery/source] - https://gerrit.wikimedia.org/r/301657 (https://phabricator.wikimedia.org/T141525) (owner: Addshore) [15:26:44] addshore: maybe you can make commit message a bit more descriptive: https://gerrit.wikimedia.org/r/#/c/301661/ [15:32:14] elukey: ops sync? [15:32:32] yes sorry [15:32:36] joining [15:32:44] Analytics, Operations, Traffic: Correct cache_status field on webrequest dataset - https://phabricator.wikimedia.org/T142410#2536733 (Nuria) @elukey : If I understand things right this changeset will emit a new header that we need to publish via varnishkafka. The header value should replace whatever... [15:37:41] Analytics, Operations, Traffic: Correct cache_status field on webrequest dataset - https://phabricator.wikimedia.org/T142410#2533998 (BBlack) @Nuria + @elukey - the second patch just merged above takes care of the varnishkafka part. So this should gradually go live over the next ~30 minutes. 
[15:45:02] (PS3) Addshore: Create wikidata/specialentitydata_metrics coordinator [analytics/refinery] - https://gerrit.wikimedia.org/r/301661 (https://phabricator.wikimedia.org/T141525) [15:45:04] nuria_: ^^ [15:46:11] Analytics-EventLogging, DBA, ImageMetrics: Drop EventLogging tables for ImageMetricsLoadingTime and ImageMetricsCorsSupport - https://phabricator.wikimedia.org/T141407#2536814 (Jdforrester-WMF) Isn't client-side code caching great? OK, wait a month and then try? [15:48:46] Analytics-EventLogging, DBA, ImageMetrics: Drop EventLogging tables for ImageMetricsLoadingTime and ImageMetricsCorsSupport - https://phabricator.wikimedia.org/T141407#2536822 (jcrespo) :-) To be fair- I understand why those would be generated. But shouldn't we have a way to discard those on server... [15:54:33] Analytics-EventLogging: Have a way to mark schemata as disabled, so valid events (e.g. from cached client code) don't get logged - https://phabricator.wikimedia.org/T142490#2536861 (Jdforrester-WMF) [15:54:43] Analytics-EventLogging, DBA, ImageMetrics: Drop EventLogging tables for ImageMetricsLoadingTime and ImageMetricsCorsSupport - https://phabricator.wikimedia.org/T141407#2497306 (Jdforrester-WMF) >>! In T141407#2536822, @jcrespo wrote: > :-) > > To be fair- I understand why those would be generated. B... [16:05:32] joal: aqs1001 would be ready to be re-pooled [16:05:42] do you have a minute to check? [16:07:27] I ran test urls and service-checker-swagger, plus checked the host in general [16:07:31] it looks good to me [16:10:06] Analytics-EventLogging, DBA, ImageMetrics: Drop EventLogging tables for ImageMetricsLoadingTime and ImageMetricsCorsSupport - https://phabricator.wikimedia.org/T141407#2536923 (jcrespo) Let's wait, then. It is not that important, it was just annoying. 
[16:11:40] all right I am going to re-pool it, it looks really good [16:12:19] traffic flowing with 200s, goood [16:13:56] (CR) Nuria: [C: 2 V: 2] Create wikidata/specialentitydata_metrics coordinator [analytics/refinery] - https://gerrit.wikimedia.org/r/301661 (https://phabricator.wikimedia.org/T141525) (owner: Addshore) [16:19:31] elukey: sorry missed your ping ! [16:21:16] don't worry! [16:21:20] traffic looks good [16:21:56] (PS16) Joal: [WIP] Refactor Mediawiki History scala code [analytics/refinery/source] - https://gerrit.wikimedia.org/r/301837 (https://phabricator.wikimedia.org/T141548) [16:22:00] elukey: great :) [16:22:09] if you are ok I am going to proceed with aqs1002 [16:22:48] milimetric, mforns: --^ that patch is reformatted (plus php serializer and subgraph partitionner modified as per your comments) [16:22:56] eqi aqs1001 [16:22:59] oops :) [16:23:21] joal: ok, should I take a look too or you guys still working on it [16:23:36] milimetric: there is still ongoing work [16:23:46] elukey: please go ahead with 2 [16:25:00] Analytics, Analytics-EventLogging: Ensure no dropped messages in analytics eventlogging processors when stopping broker - https://phabricator.wikimedia.org/T142430#2536939 (Ottomata) Am testing high volume production for eventbus in labs. Created this upstream issue: https://github.com/dpkp/kafka-python... [16:26:34] done :) [16:27:03] a-team: you all discuss moving the staff meeting up? I moved the other meetings up against standup to give you a chance to see if that would work [16:27:32] milimetric: everything works for me :) [16:28:08] milimetric: do you mean anticipating staff? [16:29:40] thanks joal! 
will look at it [16:29:45] elukey: moving staff after the new standup time, so 15:30 UTC [16:30:20] milimetric: I have a vague memory that nuria_ said she might not be able to have it moved, but better to ask her [16:30:58] milimetric: I have a meeting that conflicts (management meeting) [16:31:07] milimetric: so I would not be able to make it [16:31:11] ok [16:31:14] thx [16:31:16] milimetric: lgtm! [16:31:32] (we can talk more about schedules during staff in 30 min. then) [16:32:18] aqs100[123] switched to use the 'aqs' cassandra user for restbase [16:32:23] all good [16:32:41] there was only a brief spike in latency after aqs1001's upgrade [16:32:43] but nothing else [16:32:49] traffic looks good on all the hosts [16:36:18] Analytics-Kanban: Improve user management for AQS - https://phabricator.wikimedia.org/T142073#2536980 (elukey) Remaining steps: 1) establish how to distribute the new user/password credentials to oozie; 2) move oozie away from the 'cassandra' user, either using the newly created 'aqs' user or creating a new... [16:41:35] mforns, milimetric : argh, I merged the change to add the new wiki to reportupdater [16:41:59] milimetric: but I realized that mforns had done +2 but not merged it, were we waiting? [16:42:02] nuria_: that's good, it should've merged automatically anyway [16:42:06] milimetric: ah ok [16:49:09] nuria_, no, you did it right, I am still not used to the new gerrit ui [16:49:36] mforns: ya, same here [16:59:31] a-team: fast staff meeting? [16:59:37] a-team: batcave? [16:59:40] sure [17:00:33] nuria_: yeah we're all here [17:15:59] going offline a-team! The aqs100[123] cluster looks good, but please keep an eye on it today :) [17:16:07] talk with you tomorrow! [17:16:09] Bye elukey :) [17:16:11] bye elukey ! 
[17:16:26] nite [17:16:31] * milimetric going to lunch [17:17:02] https://usercontent.irccloud-cdn.com/file/uLlGKjFs/relative%20sizes%20of%20wikis [17:17:13] btw, ^ [17:17:50] milimetric: taking vertical slices, we have our parallelization scheme ! [17:18:10] I'll re-render it as I put wikis into groups and include a few in the diagrams folder [17:18:37] yeah, joal, but I'm thinking to be a little gentler for the big ones [17:18:50] so I run wikidata and commons separately, and then 2 by 2 and so on [17:19:01] until I get to the thinner slices which I'll do together [17:25:54] Analytics-Kanban: User History: Add unit tests to UserHistoryDataExtractors and UserHistoryBuilder - https://phabricator.wikimedia.org/T142500#2537165 (mforns) [17:26:52] awesome milimetric :) [17:32:41] wikimedia/mediawiki-extensions-EventLogging#582 (wmf/1.28.0-wmf.14 - 9190570 : Mukunda Modell): The build has errored. [17:32:41] Change view : https://github.com/wikimedia/mediawiki-extensions-EventLogging/commit/9190570aed90 [17:32:41] Build details : https://travis-ci.org/wikimedia/mediawiki-extensions-EventLogging/builds/150992697 [17:48:41] !log restarting eventlogging with kafka-python 1.3.1 (and bugfix), will be testing kafka broker restarts again today [17:48:43] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log, Master [17:49:37] Analytics, Editing-Analysis, Notifications, Collab-Team-Q1-July-Sep-2016: Numerous Notification Tracking Graphs Stopped Working at End of 2015 - https://phabricator.wikimedia.org/T132116#2537277 (Neil_P._Quinn_WMF) p:Low>Normal [17:55:46] Analytics, Analytics-EventLogging: Ensure no dropped messages in analytics eventlogging processors when stopping broker - https://phabricator.wikimedia.org/T142430#2537341 (Ottomata) kafka-python 1.3.1 was released yesterday. This fixed a bug I found when running sync producer in labs and stopping a bro... 
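milimetric's batching idea above (the biggest wikis on their own, the next ones two by two, the thin slices together) could look roughly like this. It is a simplified sketch with two fixed thresholds rather than the gradually growing groups he hints at; the function name `plan_batches`, the thresholds and all sizes are made up for illustration.

```python
def plan_batches(sizes, solo_above, pair_above):
    """Group wikis into processing batches by relative size.

    sizes:      dict of wiki name -> size (any consistent unit)
    solo_above: wikis larger than this run alone
    pair_above: wikis larger than this (but not solo) run in pairs;
                everything else goes into one final batch
    """
    ordered = sorted(sizes, key=sizes.get, reverse=True)
    solo = [wiki for wiki in ordered if sizes[wiki] > solo_above]
    medium = [wiki for wiki in ordered if pair_above < sizes[wiki] <= solo_above]
    small = [wiki for wiki in ordered if sizes[wiki] <= pair_above]

    batches = [[wiki] for wiki in solo]
    # Pair up the medium wikis; an odd one out ends up alone.
    batches += [medium[i:i + 2] for i in range(0, len(medium), 2)]
    if small:
        batches.append(small)
    return batches


if __name__ == "__main__":
    # Hypothetical relative sizes, not real measurements.
    example_sizes = {
        "wikidatawiki": 100, "commonswiki": 90, "enwiki": 50,
        "dewiki": 30, "frwiki": 25, "simplewiki": 2, "tawiki": 1,
    }
    for batch in plan_batches(example_sizes, solo_above=80, pair_above=10):
        print(batch)
```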
[18:05:22] Analytics, MediaWiki-extensions-WikimediaEvents, The-Wikipedia-Library, Wikimedia-General-or-Unknown, Patch-For-Review: Implement Schema:ExternalLinksChange - https://phabricator.wikimedia.org/T115119#2537376 (Samwalton9) @Legoktm Any luck fixing this bug? [18:20:53] Logging off a-team, see you tomorrow ! [18:21:23] laters! [18:34:31] hi! does anybody know anything about the gap in wmf.webrequest data for jun 1 to jun 7? [18:34:57] this produces no results: select * from webrequest where year=2016 AND month=6 and day=1 AND uri_host='query.wikidata.org' AND webrequest_source='misc' limit 100; [18:41:44] ah figured it out, turns out we store only last 60 days there... [19:33:06] Analytics-Kanban, EventBus, Wikimedia-Stream: Public Event Streams - https://phabricator.wikimedia.org/T130651#2537741 (Ottomata) Hm, thoughts about filtering: > Key correlates to a root key, or dot separated key path, in message data. E.g. 'server_name' or 'revision.old'. I agree that we should b... [20:25:10] Analytics, Pageviews-API, Reading-analysis: Suddenly outrageous higher pageviews for main pages - https://phabricator.wikimedia.org/T141506#2538048 (Amire80) >>! In T141506#2533974, @Milimetric wrote: > @Amire80 so we could try to clean up this data in our pageview data pipeline, but it would be a *l... [20:33:46] Analytics, Pageviews-API, Reading-analysis: Suddenly outrageous higher pageviews for main pages - https://phabricator.wikimedia.org/T141506#2538082 (Milimetric) Amire, if you apply the following filter to any query against the `pageview_hourly` table on stat1002, it will exclude the spike. You will... [20:43:43] Analytics, Pageviews-API, Reading-analysis: Suddenly outrageous higher pageviews for main pages - https://phabricator.wikimedia.org/T141506#2538134 (Amire80) >>! In T141506#2538082, @Milimetric wrote: > @Amire80, if you apply the following filter to any query against the `pageview_hourly` table on st... [20:56:41] bye a-team, see you tomorrow! 
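The empty result for June 1 follows directly from the 60-day retention mentioned above. A tiny sketch makes the arithmetic explicit; the function name `within_retention`, the inclusive boundary and the "today" value of 2016-08-09 are assumptions for the example, only the 60 days come from the conversation.

```python
from datetime import date, timedelta


def within_retention(partition_day, today, retention_days=60):
    """Return True if `partition_day` is still inside the retention window."""
    oldest_kept = today - timedelta(days=retention_days)
    return oldest_kept <= partition_day <= today


if __name__ == "__main__":
    assumed_today = date(2016, 8, 9)  # hypothetical date of the query
    for day in (date(2016, 6, 1), date(2016, 6, 7), date(2016, 7, 1)):
        print(day, within_retention(day, assumed_today))
```

Checking the requested partition against the window before running a query avoids mistaking purged data for a gap in the dataset.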
[20:56:46] nite mforns [21:35:57] Analytics, Spike: Spike: Evaluate alternatives to varnishkafka: varnishevents - https://phabricator.wikimedia.org/T138426#2538353 (Danny_B) [22:18:07] Analytics: Better publishing of Annotations about Data Issues - https://phabricator.wikimedia.org/T142408#2538476 (MusikAnimal) >>! In T142408#2536161, @Milimetric wrote: > @MusikAnimal: happy to work with you on this if you'd like, to understand what would be easier from your point of view. Certainly! Than... [22:22:37] Analytics: Find out what happens to the old rows in the revision table - https://phabricator.wikimedia.org/T142535#2538516 (Milimetric)