[00:55:43] Quarry, Labs: Phantom entries in "Quarry" query and or labs replica of enwiki db - https://phabricator.wikimedia.org/T141818#2513833 (Danny_B) [07:27:31] Analytics, Analytics-Wikistats, Operations, Regression: [Regression] stats.wikipedia.org redirect no longer works ("Domain not served here") - https://phabricator.wikimedia.org/T126281#2514295 (Nemo_bis) p:Triage>Normal [08:21:29] Analytics-Cluster, Analytics-Kanban, Deployment-Systems, scap, and 2 others: Deploy analytics-refinery with scap3 - https://phabricator.wikimedia.org/T129151#2514384 (MoritzMuehlenhoff) @elukey: I've dropped Yuvi's expired key from pwstore, so new entries can be added now. [08:23:00] joal: gooood morning! I'd be ready to bring down our dear cassandra cluster [08:23:03] to reimage with raid10 [08:24:48] elukey: good morning :) [08:25:15] elukey: Please go ahead, let me know if I can be of any help [09:01:35] * elukey writes 100 times: when you change a partman recipe you need to remember to run puppet on carbon otherwise you'll get the old one [09:01:45] * elukey is creating raid0 again on aqs1004 [09:01:46] sig [09:01:49] *sigh [09:02:18] joal: would it make sense to try to reimage aqs1004 and then see if the instances get in sync again? [09:02:25] or is it asking too much of the cluster? [09:02:35] because we might avoid losing the four months [09:02:45] elukey: Interesting idea !!!1 [09:02:59] elukey: you can try that, let's see how the thing behaves :) [09:03:36] super [09:03:46] I'll start a new reimage in a bit [09:03:52] this time with raid10 [09:03:53] :/ [09:04:22] elukey: let me know when trying to sync, I'd like to monitor with you :) [09:04:38] sure! [09:08:43] elukey: any ideas if i should / can use -slave as well as -master? :D [09:10:31] no idea, but jynus would be the right one to ask (our dba/ops). I will follow up with him today :) [09:10:37] okay!
:D [09:35:38] joal: I am running puppet for the first time on AQS now [09:35:41] aqs1004 [09:35:47] elukey: okey [09:40:11] joal: would you mind deploying on aqs1004? [09:40:35] elukey: hm, problems with old/new versions (schema change etc) [09:40:46] elukey: manual changes needed [09:40:52] elukey: But we'll do [09:41:22] ahhh yeah sorry! [09:41:34] I just remembered that I installed cassandra manually on the host [09:41:40] this is why puppet was angry at me [09:41:42] ahahahah [09:43:12] but new raid10 ready [09:43:13] Filesystem 1K-blocks Used Available Use% Mounted on [09:43:21] awesome elukey [09:43:32] /dev/md2 3028128480 73944 2874211336 1% /srv/cassandra-b [09:43:32] /dev/md1 3028128480 73944 2874211336 1% /srv/cassandra-a [09:43:35] elukey: I don't really know how to deal with installs now [09:43:42] 3TB as expected right? [09:44:00] elukey: Half of what we had before, sounds correct :) [09:44:16] me too joal, I just press random buttons and swear a lot [09:44:18] it works :) [09:46:25] elukey: restbase needs some manual update on schemas IIRC [09:47:27] yeah but I don't remember which ones..
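The "3TB as expected / Half of what we had before" exchange above is just the RAID capacity rule: RAID10 mirrors disk pairs, so usable space is half the raw capacity, while the old raid0 layout striped across everything with no redundancy. A minimal sketch of that arithmetic (the disk count and size below are illustrative, not the actual aqs1004 layout):

```python
def usable_tb(disk_tb, num_disks, level):
    """Usable capacity for the two RAID levels discussed above.

    raid0 stripes across all disks: no redundancy, full raw capacity.
    raid10 mirrors disk pairs: only half the raw capacity is usable.
    """
    raw = disk_tb * num_disks
    if level == "raid0":
        return raw
    if level == "raid10":
        return raw / 2
    raise ValueError("unsupported RAID level: %r" % level)

# With four hypothetical 1.5 TB disks per array, raid0 gives 6 TB and
# raid10 gives 3 TB, matching the ~3 TB /srv/cassandra-* mounts above.
```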
we'd probably need to write them down somewhere [09:56:38] INFO [main] 2016-08-02 09:56:20,900 StorageService.java:1199 - JOINING: calculation complete, ready to bootstrap [09:56:46] cassandra-a is bootstrapping [09:57:01] elukey: I know :) [09:57:08] :D [09:57:23] I had to follow https://docs.datastax.com/en/cassandra/2.0/cassandra/operations/ops_replace_node_t.html [09:57:26] interesting [09:57:31] elukey: issue is, cassandra might try to get back in sync with others having schema diffs due to restbase [09:57:53] elukey: you shouldn't start restbase is my idea [09:58:12] aqs is down afaik because of failed dependencies [09:58:15] let me check [09:58:51] yeah it is down [09:59:00] ok [09:59:09] so it shouldn't have touched cassandra [10:00:41] elukey: can't join cassandra on aqs1004 [10:00:47] cqlsh fails [10:03:04] it is bootstrapping [10:03:24] probably not ready yet [10:03:34] elukey: I'd like to check schemas before ... [10:03:47] elukey: For instance, will it use default compaction? [10:03:57] elukey: and compression? [10:04:54] I have no idea :) [10:05:17] elukey: We probably want to know that before finishing bootstrap, no ? [10:07:27] joal: nodetool-a netstats - it is almost finished [10:08:02] worst case we'll do it again with proper settings ? [10:08:10] ok [10:08:19] I am not sure how to start cassandra without going through this path [10:08:24] I mean, bootstrap [10:16:28] elukey: from nodetool-a netstats, seems not finished at all ! (received ~15 of ~1000 files) [10:17:11] elukey: Having cassandra doing a bootstrap without being sure settings are correct is kinda uncool I think [10:17:51] IIRC, cassandra will get fed all data, and then will need to compact etc [10:17:51] yeah I just noticed the format of the output [10:18:49] what are the settings that we need to check?
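The netstats misreading above ("received ~15 of ~1000 files" after it looked "almost finished") is an easy mistake; the progress can be pulled out of the output mechanically. A sketch, assuming `nodetool netstats` prints something like "Receiving N files, ... Already received M files" during bootstrap (the exact wording varies between Cassandra versions):

```python
import re

def bootstrap_progress(netstats_output):
    """Return the fraction of files received during bootstrap, or None.

    Assumes a netstats line shaped like:
      Receiving 1000 files, 2199023255552 bytes total.
      Already received 15 files, 32212254720 bytes total
    """
    m = re.search(r"Receiving (\d+) files.*?received (\d+) files",
                  netstats_output, re.S)
    if m is None:
        return None
    total, received = int(m.group(1)), int(m.group(2))
    return received / total
```

With the numbers from the log, this reports 1.5% done rather than "almost finished".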
[10:19:07] I mean, how did you set them [10:19:08] elukey: schema correctness, compaction and compression [10:19:13] all right [10:19:16] elukey: in cqlsh [10:21:17] the last two should be set in configuration thoug [10:21:22] *though [10:21:36] elukey: nope - They are schema dependent, set in cql [10:22:14] ok what I meant is that we shouldn't have to set them manually in the perfect world [10:22:37] elukey: schemas are defined by restbase, that's why we don't have to set them [10:23:13] yes ok but if these are global and related to schemas a new node should pick them up from the other ones [10:23:19] elukey: Someone needs to do it :) [10:23:49] elukey: I hope so, that's why I wanted to double check using cqlsh [10:24:09] ah yes it would be great but I didn't trigger any "cassandra bootstrap" or similar [10:24:12] I just started it [10:24:15] :D [10:24:24] started what? [10:24:33] cassandra-a, the service [10:25:00] hm [10:26:08] that is the service related to the a instance [10:26:11] in systemd [10:26:45] elukey: still no cqlsh [10:27:19] joal: the instance is in joining mode, I think that you can't use cqlsh in this timeframe to avoid messing up with settings [10:27:42] I was wrong about the fact that it would have finished soon, I misread the netstats [10:28:04] I am almost sure that you won't be able to use cqlsh until the instance will have done the full boottrap [10:28:07] *bootstrap [10:28:12] elukey: That would make sense ! [10:29:32] elukey: from nodetool-a cfstats, compaction seems properly set - I have no info about compression though [10:31:41] Hallo. [10:31:46] I noticed a weird thing. [10:32:04] In both English and Russian Wikipedias there is a large spike of visits to the main page in the last few days. [10:32:15] (If I'm using the pageviews site correctly.) 
[10:32:21] Here's what I'm talking about: [10:32:27] Russian: https://tools.wmflabs.org/pageviews/?project=ru.wikipedia.org&platform=desktop&agent=user&start=2015-07-01&end=2016-08-01&pages=%D0%97%D0%B0%D0%B3%D0%BB%D0%B0%D0%B2%D0%BD%D0%B0%D1%8F_%D1%81%D1%82%D1%80%D0%B0%D0%BD%D0%B8%D1%86%D0%B0 [10:32:32] English: https://tools.wmflabs.org/pageviews/?project=en.wikipedia.org&platform=desktop&agent=user&start=2015-07-01&end=2016-08-01&pages=Main_Page [10:32:44] Does anybody have an idea why does it happen? [10:32:59] I don't see anything like that in other major languages. [10:33:07] Spanish: https://tools.wmflabs.org/pageviews/?project=es.wikipedia.org&platform=desktop&agent=user&start=2015-07-01&end=2016-08-01&pages=Wikipedia:Portada [10:33:14] Portuguese: https://tools.wmflabs.org/pageviews/?project=pt.wikipedia.org&platform=desktop&agent=user&start=2015-07-01&end=2016-08-01&pages=Wikip%C3%A9dia:P%C3%A1gina_principal [10:33:22] Japanese: https://tools.wmflabs.org/pageviews/?project=ja.wikipedia.org&platform=desktop&agent=user&start=2015-07-01&end=2016-08-01&pages=%E3%83%A1%E3%82%A4%E3%83%B3%E3%83%9A%E3%83%BC%E3%82%B8 [10:33:26] aharoni: Hi, known issue: T141506 [10:33:27] T141506: Suddenly outrageous higher pageviews for main pages - https://phabricator.wikimedia.org/T141506 [10:33:44] Ah, cool, thanks. [10:37:07] joal: nice! I can see compression ratio 0.8 but nothing more [10:37:23] worst case we'll set it up later on [10:37:34] I am wondering now if I can start the bootstrap for cassandra-b [10:38:32] also bootstrapping would take a lot I am afraid [10:39:11] elukey: if bootstrapping * 3 < loading * 4, we're good :) [10:39:29] yes definitely :D [10:39:48] joal: what do you think about leaving this bootstrap to finish, and then make the calculations? 
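joal's rule of thumb a few lines up ("if bootstrapping * 3 < loading * 4, we're good") is a straight comparison: re-bootstrapping the remaining nodes wins if it takes less total time than reloading the four months of data. Sketched with placeholder units (none of these numbers are measured figures from the cluster):

```python
def reimage_wins(bootstrap_days_per_node, reload_days_per_month,
                 remaining_nodes=3, months_of_data=4):
    """joal's back-of-the-envelope check: re-bootstrapping the remaining
    nodes beats reloading the data if the total bootstrap time is smaller.
    All inputs are hypothetical estimates, not cluster measurements.
    """
    return (bootstrap_days_per_node * remaining_nodes
            < reload_days_per_month * months_of_data)
```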
[10:40:09] elukey: I think you should launch the other bootstrap as well [10:40:22] elukey: 2 instances, one node - Stuff should happen in parallel [10:40:35] ok I agree, let's see how it goes [10:40:51] IIRC cassandra complained in the past with multiple instances joining [10:40:56] failing to start the second one [10:40:59] let's see [10:41:01] Arrf [10:41:13] Maybe multi-bootstrap is not something [10:42:08] Analytics, Pageviews-API: Suddenly outrageous higher pageviews for main pages - https://phabricator.wikimedia.org/T141506#2501321 (Amire80) I noticed the same while analyzing pageview data in correlation to interlanguage clicks data. It happens only in some languages. It happens in English, Russian and... [10:43:58] joal: it seems that cassandra b is bootstrapping [10:44:30] elukey: Great ! [10:45:21] all right going to lunch, ttl! [10:46:00] (brb in ~30 mins) [10:47:01] (PS2) Joal: [WIP] Refactor Mediawiki History scala code [analytics/refinery/source] - https://gerrit.wikimedia.org/r/301837 (https://phabricator.wikimedia.org/T141548) [10:49:18] (CR) jenkins-bot: [V: -1] [WIP] Refactor Mediawiki History scala code [analytics/refinery/source] - https://gerrit.wikimedia.org/r/301837 (https://phabricator.wikimedia.org/T141548) (owner: Joal) [10:49:54] taking a break a-team, later ! [10:52:15] Analytics, Pageviews-API: Suddenly outrageous higher pageviews for main pages - https://phabricator.wikimedia.org/T141506#2501321 (PrimeHunter) I tested around 20 other main pages in Wikipedias and other Wikimedia projects. The only huge increase was [[https://tools.wmflabs.org/pageviews/?project=ru.wiki... [11:03:23] hi team :] [11:16:03] Analytics, Pageviews-API: Suddenly outrageous higher pageviews for main pages - https://phabricator.wikimedia.org/T141506#2514958 (Amire80) FWIW, this is high priority for me because these statistics significantly affect the info about pageviews in general, and I need pageview stats to be as precise as p...
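The tools.wmflabs.org links in the T141506 discussion above are a frontend over the Wikimedia Pageview REST API, so the same per-article numbers can be fetched directly. A sketch of building a request URL; the endpoint shape is per-article/{project}/{access}/{agent}/{title}/{granularity}/{start}/{end}, with YYYYMMDD00-style timestamps and percent-encoded titles:

```python
from urllib.parse import quote

def per_article_url(project, title, start, end,
                    access="desktop", agent="user", granularity="daily"):
    """Build a Wikimedia Pageview API per-article URL.

    Titles must be percent-encoded, which is why the Russian main page
    title shows up as %D0%97%D0%B0... in the links above.
    """
    return ("https://wikimedia.org/api/rest_v1/metrics/pageviews/"
            "per-article/{}/{}/{}/{}/{}/{}/{}".format(
                project, access, agent, quote(title, safe=""),
                granularity, start, end))
```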
[12:58:20] heyall [12:58:57] Hey milimetric [12:59:16] milimetric: how is basement? [12:59:29] uh... terrible :) [12:59:42] :( [12:59:45] but the plumbers aren't here yet so I have nothing to do yet [12:59:54] Arf, ok [13:09:28] pwstore unblocked, we can proceed with adding the refinery keys for scap \o/ [13:47:01] addshore: you can query both but you have to keep in mind that they contain different shards, but other than that there shouldn't be any issue [13:50:43] elukey okay! [13:50:53] *quickly checks what the slave has* [13:51:59] ahh elukey okay, I hadn't actually looked at that! yeh, the slave only has 23dbs! [13:55:40] (PS3) Addshore: Run all 03 cron scripts at the same time [analytics/wmde/scripts] - https://gerrit.wikimedia.org/r/302131 [13:56:20] (PS2) Addshore: Run all 03 cron scripts at the same time [analytics/wmde/scripts] - https://gerrit.wikimedia.org/r/302132 [13:56:51] (CR) Addshore: [C: 2] Run all 03 cron scripts at the same time [analytics/wmde/scripts] - https://gerrit.wikimedia.org/r/302131 (owner: Addshore) [13:56:55] (CR) Addshore: [C: 2] Run all 03 cron scripts at the same time [analytics/wmde/scripts] - https://gerrit.wikimedia.org/r/302132 (owner: Addshore) [13:56:58] (Merged) jenkins-bot: Run all 03 cron scripts at the same time [analytics/wmde/scripts] - https://gerrit.wikimedia.org/r/302131 (owner: Addshore) [13:57:01] (Merged) jenkins-bot: Run all 03 cron scripts at the same time [analytics/wmde/scripts] - https://gerrit.wikimedia.org/r/302132 (owner: Addshore) [13:58:36] addshore: I didn't know it too, I'll try to improve the docs! 
[13:58:46] [= [13:59:49] (PS2) Addshore: Add echo statusNotifications script to 03 daily cron [analytics/wmde/scripts] - https://gerrit.wikimedia.org/r/302126 (https://phabricator.wikimedia.org/T140928) [14:01:17] (PS3) Addshore: Add echo statusNotifications script to 03 daily cron [analytics/wmde/scripts] - https://gerrit.wikimedia.org/r/302126 (https://phabricator.wikimedia.org/T140928) [14:01:37] (PS4) Addshore: Add echo statusNotifications script to 03 daily cron [analytics/wmde/scripts] - https://gerrit.wikimedia.org/r/302126 (https://phabricator.wikimedia.org/T140928) [14:02:09] (PS2) Addshore: Add echo statusNotifications script to 03 daily cron [analytics/wmde/scripts] - https://gerrit.wikimedia.org/r/302128 (https://phabricator.wikimedia.org/T140928) [14:24:43] (PS3) Joal: [WIP] Refactor Mediawiki History scala code [analytics/refinery/source] - https://gerrit.wikimedia.org/r/301837 (https://phabricator.wikimedia.org/T141548) [14:24:48] mforns: --^ [14:25:27] mforns: Updated as discussed (package rename and explode-non-connected-states thing) [14:25:46] joal, thanks a lot! [14:25:56] np ! [14:27:36] (CR) jenkins-bot: [V: -1] [WIP] Refactor Mediawiki History scala code [analytics/refinery/source] - https://gerrit.wikimedia.org/r/301837 (https://phabricator.wikimedia.org/T141548) (owner: Joal) [14:35:32] Analytics-Cluster, Analytics-Kanban, Deployment-Systems, scap, and 2 others: Deploy analytics-refinery with scap3 - https://phabricator.wikimedia.org/T129151#2515412 (elukey) Created the keys in the private repo and encrypted them with the pass stored in pwstore under analytics-deployment-key-pas... 
[14:35:55] (PS4) Joal: [WIP] Refactor Mediawiki History scala code [analytics/refinery/source] - https://gerrit.wikimedia.org/r/301837 (https://phabricator.wikimedia.org/T141548) [14:36:06] mforns: --^ sorry, corrected a small bug [14:38:17] joal: https://wikitech.wikimedia.org/wiki/Cassandra#Adding_a_new_.28empty.29_node - with -Dconsistent.rangemovement=false we could go considerably faster but there might be consistency issues [14:39:33] elukey: I don't know what to say [14:40:15] joal: just support me and don't kill me in the process [14:40:19] :D [14:40:50] elukey: I SUPPORT ! [14:40:50] jokes aside, it seems that we stream from only one node.. sigh [14:42:21] ottomata: o/ [14:42:25] I added downtime to kafka2002 [14:42:28] fyi [14:42:59] elukey downtime? [14:43:15] in icinga [14:43:25] what's up with kafka2002? [14:43:38] there is the weird icinga error for the service check [14:43:47] the fake post [14:43:52] it sends alarms to ops [14:44:30] 15:24 PROBLEM - eventlogging-service-eventbus endpoints health on kafka2002 is CRITICAL: /v1/events (Produce a valid test event) is CRITICAL: Test Produce a valid test event returned the unexpected status 500 (expecting: 201) [14:44:37] grrrr [14:44:39] these ones [14:44:40] ok. [14:44:49] hadn't seen them in a while, thought they had just gone away ;/ [14:44:50] hmmm [14:45:21] if nuria_ reviews my change, maybe we can go ahead and deploy the new eventlogging producer in codfw. [14:45:26] that's because I added downtime in icinga the last time, I think that I mentioned it but it has probably gotten lost in the IRC conversations :( [14:45:26] i'm waiting for an upstream tag to really do it [14:45:31] but it won't hurt to just do latest in codfw [14:45:52] nah it is fine, I added some days of downtime IIRC, so we are good [14:45:56] I just wanted to let you know [14:46:19] ok [14:46:20] thanks [14:46:49] brb... [15:01:57] ottomata: will review change today [15:02:15] yay danke! [15:25:47] hey a-team. i have a weird q for you!
[15:26:09] my girlfriend is thinking about making a performance piece out of our techy meeting speak [15:26:15] she often is around while i'm in meetings with yall [15:26:18] and hears us talking [15:26:40] she asked me if I could record a meeting today so she could listen to it and see if she could make it work [15:26:50] mind if I record standup? [15:26:57] no problem if not [15:27:28] ottomata: no problem for me :) [15:29:25] ottomata, elukey : Can't make it to ops-sync - Only cassandra that elukey knows about on my end :) [15:30:21] ok np [15:32:11] hm, might wait til we have a really techy one [15:32:13] for the recording :/ [15:33:30] ottomata: if you have time we can batcave quickly [15:33:33] otherwise we can skip [15:34:37] i'm in batcave [15:34:39] ja? [15:34:42] no? [15:35:35] ottomata, no problem for me too [15:47:27] Analytics-Kanban: User History: Populate the causedByUserId and causedByUserName fields in 'create' states. - https://phabricator.wikimedia.org/T139761#2515596 (mforns) a:mforns [15:57:08] ah ottomata I forgot to ask you about mirror maker [15:57:25] I added some comments to the code review but not sure if they are useful [15:57:30] the rest looks very good to me [15:58:00] ah thanks! i haven't looked yet [15:58:10] will look [16:00:26] milimetric: did you change piwik already for iOS? i looked and the url was already: https://m.en.wikipedia.org [16:42:25] (PS5) Joal: [WIP] Refactor Mediawiki History scala code [analytics/refinery/source] - https://gerrit.wikimedia.org/r/301837 (https://phabricator.wikimedia.org/T141548) [16:45:11] (PS6) Joal: [WIP] Refactor Mediawiki History scala code [analytics/refinery/source] - https://gerrit.wikimedia.org/r/301837 (https://phabricator.wikimedia.org/T141548) [16:56:43] a-team retro cancelled, what a darn shame [16:56:46] i am so disappointed [17:00:42] lol [17:07:15] a-team fyi, deployment-eventlogging03 has puppet disabled for a bit.
i'm testing kafka-python consumer on all-events and mysql consumers [17:07:21] hmm, think i will do it for consumer and producer of processor too [17:08:17] k [17:15:53] wikimedia/mediawiki-extensions-EventLogging#578 (wmf/1.28.0-wmf.13 - 3902c49 : thcipriani): The build has errored. [17:15:53] Change view : https://github.com/wikimedia/mediawiki-extensions-EventLogging/commit/3902c49a8bb0 [17:15:53] Build details : https://travis-ci.org/wikimedia/mediawiki-extensions-EventLogging/builds/149247513 [17:23:08] ottomata: 5 minutes without friends after talking bad about our retro. [17:23:22] "speaking bad".. ahem... [17:23:34] ottomata: did you guys alredy looked at your async code? [17:23:45] *already [17:23:47] cc milimetric [17:26:05] going afk team! [17:26:08] byeeeee o/ [17:27:28] nuria_: milimetric has seen some of it, buuuut you don't need to review that now [17:27:33] it is wip and i don't plan on deploying it anytime soon [17:27:35] maybe eventually [17:28:12] k [17:29:04] ottomata: ok, back to testing kafkaX [17:39:06] nuria_: it sounds like we need to sync up on piwik and aqs deplo [17:39:08] *deploy [17:39:17] lemme know when, I'm in a meeting in 20 min. but after that free [17:40:12] k , you let me know [17:40:28] I am available [17:41:26] Quarry, DBA, Labs: Phantom entries in "Quarry" query and or labs replica of enwiki db - https://phabricator.wikimedia.org/T141818#2516050 (yuvipanda) [17:44:22] (Abandoned) Milimetric: [WIP] Process Mediawiki page history [analytics/refinery/source] - https://gerrit.wikimedia.org/r/295693 (https://phabricator.wikimedia.org/T134790) (owner: Milimetric) [17:44:44] ottomata: does this cmd look ok to test the new reader in vagrant? 
[17:44:46] ./bin/eventlogging-service --num-processes 4 --port 8087 --schemas-path /vagrant/srv/event-schemas/jsonschema --topic-config /vagrant/srv/event-schemas/config/eventbus-topics.yaml 'kafka:///localhost:9092?async=False&topic=datacenter1.{meta[topic]}' 'kafka-python:///localhost:9092?async=False&topic=datacenter1.test' [17:46:20] Analytics-Kanban, Patch-For-Review: Extract edit oriented data from MySQL for simplewiki - https://phabricator.wikimedia.org/T134790#2277147 (Milimetric) a:Milimetric>mforns [17:46:29] Quarry, DBA, Labs: Phantom entries in "Quarry" query and or labs replica of enwiki db - https://phabricator.wikimedia.org/T141818#2513270 (jcrespo) Please follow the recommendations mentioned at: https://wikitech.wikimedia.org/wiki/Help:Tool_Labs/Database/Replica_drift [17:47:06] mforns: I assigned that to you ^ because first you did most of that work and second the simplewiki page stuff is mostly done, and now you're finishing the user history stuff so you can move it to done when you're done. [17:47:40] milimetric, ok, thanks! [17:47:52] Quarry, DBA, Labs: Phantom entries in "Quarry" query and or labs replica of enwiki db - https://phabricator.wikimedia.org/T141818#2516093 (jcrespo) [17:48:59] Analytics-Kanban: Page History: write scala for page history reconstruction algorithm - https://phabricator.wikimedia.org/T138853#2516098 (Milimetric) There was a lot of back and forth on this algorithm. The 34 points is probably an under-estimate when you count all the brainstorming, fixing, edge cases, et... [17:50:09] Quarry, DBA, Labs: Phantom entries in "Quarry" query and or labs replica of enwiki db - https://phabricator.wikimedia.org/T141818#2516102 (ShakespeareFan00) [17:52:50] Analytics-Kanban: Productionize edit history extraction for all wikis using Sqoop - https://phabricator.wikimedia.org/T141476#2516106 (Milimetric) a:Milimetric [17:56:08] https://www.irccloud.com/pastebin/dJL6cOem/ [17:56:25] ottomata: does the cmd above look ok? 
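About the writer URIs in that eventlogging-service command: the kafka:///host:port?async=False&topic=... form puts the broker list in the path component (hence the triple slash) and the handler options in the query string. A purely illustrative parse of that shape; the real handling lives in eventlogging's own URI/handler code:

```python
from urllib.parse import urlparse, parse_qs

def split_writer_uri(uri):
    """Split a writer URI like
    'kafka:///localhost:9092?async=False&topic=datacenter1.test'
    into (scheme, brokers, options). Illustrative only."""
    parsed = urlparse(uri)
    # The triple slash leaves netloc empty and puts brokers in the path.
    brokers = parsed.path.lstrip("/")
    options = {k: v[0] for k, v in parse_qs(parsed.query).items()}
    return parsed.scheme, brokers, options
```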
[18:00:04] nuria_: I was wrong, I have 30 min. now [18:00:19] k [18:00:19] omw batcave [18:01:17] cc milimetric [18:01:23] omw [18:03:44] nuria_: that looks like you are outputting to two sync producers [18:03:45] is that what you want? [18:03:52] you probably don't need that last argument [18:19:16] elukey: hm [18:19:18] you still there? [18:19:21] i'm getting Warning: the following recipients are invalid: EADD29C7EC1C57228DFCA2DC55030251A5487A66. Try again (or proceed)? [Y/n] Y [18:19:23] still :/ [18:19:28] but i don't see that key in .users [18:20:31] oh [18:20:32] yes i do [18:20:33] it's mark [18:22:38] yeah mark's is expired [18:25:42] ah! there we go, i just had to refresh keys [18:26:42] Analytics-Kanban: Create separate archiva credentials to be loaded to the Jenkins cred store {hawk} - https://phabricator.wikimedia.org/T132177#2516212 (Ottomata) Woot, finally was able to store password in pwstore. [18:44:55] ottomata: i changed cmd [18:44:57] ottomata: [18:45:10] ./bin/eventlogging-service --num-processes 4 --port 8087 --schemas-path /vagrant/srv/event-schemas/jsonschema --topic-config /vagrant/srv/event-schemas/config/eventbus-topics.yaml 'kafka:///localhost:9092?async=False&topic=datacenter1.{meta[topic]}' 'kafka:///localhost:9092?topic=test' [18:45:25] but still, is there a way to post a simple event in vagrant [18:46:19] this one [18:46:45] https://www.irccloud.com/pastebin/ixQwuWj0/ [18:47:05] the post errors no matter the topic [18:51:53] ottomata: nevermindddd [18:53:05] nuria_: ha ok [18:53:23] i'm here if you need help [18:53:47] nuria_: i've thought about maybe modifying the code so that if you don't give a topic-config, it will allow produces to any topic [18:53:49] dunno [18:54:42] ottomata: do we have any events that work on vagrant via changing them a biT/ [18:54:46] *a bit?
[19:00:18] nuria_: sure i got you one :) [19:00:58] nuria_: https://gist.github.com/ottomata/95053a826a3167948e359c84b07bac5f [19:02:20] nuria_: I'm back and I'll just let this build thing go for a long time. Maybe I wasn't patient enough [19:02:32] milimetric: did it work? [19:02:32] I'll check it at the end of the day and if it's still not done I'll file a bug with services [19:02:46] no, I upgraded to node 4.4.6 and it's stuck in the same place [19:02:58] milimetric: if after 30 mins it is not done we should file a bug [19:03:01] k [19:03:13] milimetric: imagine if we need to fix an important bug in prod [19:03:36] peter is at lunch but if you file a ticket I iwll ping him [19:03:54] * i will ping him [19:03:57] yeah, makes sense [19:06:33] I'm back from lunch, what's up? [19:09:31] Pchelolo: milimetric 's build for aqs on docker is not working [19:09:52] not working in which way? [19:10:22] Pchelolo: stack forever [19:10:27] sorry *stuck [19:11:12] Could you maybe file a bug or create a gist with console output, version of service-runner and what's the OS? [19:11:30] and version of docker [19:12:16] cc milimetric [19:12:52] Pchelolo: this seems like it could be improved if we all build in the same machine, do we have one set for that?
[19:13:10] sure, will file with details [19:13:38] Pchelolo: otherwise the combinations of OS + docker are going to be too many for you guys to support [19:13:40] nuria_: I used to have a VM locally with ubuntu, but now since it supports mac I will build on my mac [19:14:00] Pchelolo: but i bet you we have different ubuntu versions + docker versions [19:14:26] Pchelolo: unless we all build in the same place troubleshooting problems like this one will be a time sucker for you [19:14:57] nuria_: ya, we may think about setting up a labs VM for everyone to use [19:15:40] Pchelolo: if we had a prod bug we needed to fix now we couldn't, as neither dan nor me can build - for different reasons [19:26:41] ottomata: besides vanilla event producing with the kafka handler is there something else i should test? i tested also that utf-8 does not break anything, using arabic [19:30:27] nice :) [19:30:59] hmm, nuria_ not off the top of my head [19:31:19] i've done a bunch of testing like that [19:31:34] if you like, test when shutting down kafka [19:31:50] the main thing is we don't want it to error, at least eventually [19:31:53] ottomata: ok, does upstart also manage kafka or kill -9? [19:32:00] upstart [19:32:02] sudo service kafka stop [19:32:06] etc. [19:33:45] so posting an event should say "no brokers avialable, right?" [19:33:50] *available [19:33:54] ^ ottomata [19:38:13] nuria_: maybe. it might just say failed to produce or send [19:38:22] the eventbus logs should say something like that [19:38:24] ottomata: ok, merging then? [19:38:36] nuria_: does it block for you when kafka is down? [19:38:52] ottomata: the production, no it returns with error: [19:39:07] does it take a couple of seconds, or does it return right away?
[19:39:11] https://www.irccloud.com/pastebin/vLG7w3Lx/ [19:40:35] ottomata: but if you try to start the service it will loop forever with the error "no brokers available" [19:40:39] that's fine [19:40:57] nuria_: cool that's a good error to get back [19:41:06] did it give you back that error right away, or did it take a few seconds? [19:41:18] ottomata: ok, then we are good to merge [19:45:32] cool [19:50:21] Pchelolo: https://phabricator.wikimedia.org/T141917 [19:50:39] milimetric: okey, I see... [19:50:53] I will fix that, but here's a quick fix for you: [19:51:35] Pchelolo: milimetric, got a mw + eventbus q for yall when you have a sec [19:51:47] cd to your deploy repo locally and do 'git review' [19:52:31] the 'review' part sometimes hangs for me too, I've never had time to look at why that was happening [19:52:39] Pchelolo: I know what you mean, but in this case nothing was committed on the deploy repo [19:52:40] looks like now is a good time [19:52:52] it just shows the src folder as being updated, but there's no new commit in git log [19:53:11] (in the past when it hung like that, and I ctrl+C it, it would leave a commit there that I could push) [19:53:25] milimetric: 'git checkout sync-repo' perhaps? [19:53:42] oh, right, I forgot it puts it on that branch [19:54:00] yep, grr, I [19:54:03] I'll add that to the docs [19:54:11] (PS1) Milimetric: Update aqs to 532ba2a [analytics/aqs/deploy] - https://gerrit.wikimedia.org/r/302499 [19:54:29] nuria_: ^ that's the patch - it puts it on a separate branch in the deploy repo and I didn't see it [19:54:43] milimetric: kk, in the meantime I will look at why the '--review' part hangs sometimes, but looks like it's not super-urgent [19:54:58] ottomata: what's your q? [19:55:21] Pchelolo: I think in the past it left the deploy repo in the sync-repo branch if it failed before review, so I'd prefer that honestly.
Because otherwise it's hard to remember [19:56:07] milimetric: ye, I will look at that part again, it shouldn't hang at all ideally [19:57:15] Pchelolo: revision_create sets user_text based on $revision->getUserText() [19:57:15] https://github.com/wikimedia/mediawiki/blob/master/includes/Revision.php#L867 [19:57:20] all other events use [19:57:25] $user->getName() [19:57:26] https://github.com/wikimedia/mediawiki/blob/master/includes/user/User.php#L2135 [19:57:29] which should we use? [19:57:44] would prefer to be consistent, but maybe that is problematic with permissions and/or revision user_text accuracy? [19:57:58] hm.. lemme check [19:58:07] if we are using user->getName elsewhere, it seems like there shouldn't be a privacy problem with the revision too [19:58:40] milimetric: ^^^ relevant to you too, especially if you are creating a user's edit history from events [19:58:49] ottomata: maybe that's due to rev_user_text sometimes needing to be the IP? [19:58:56] getName returns IP too [19:59:02] oh really, ok [19:59:04] if no id [19:59:05] then no idea [19:59:10] https://github.com/wikimedia/mediawiki/blob/master/includes/user/User.php#L2135 [19:59:34] ottomata: ok, I know why - I was lazy back in the day :) [19:59:34] milimetric: that is *some* patch that screams: "we need a build server" cc Pchelolo [20:00:49] Pchelolo: seems to me that some time needs to be put towards building restbase services in a lighter way. [20:01:03] (CR) Nuria: [C: 2 V: 2] Update aqs to 532ba2a [analytics/aqs/deploy] - https://gerrit.wikimedia.org/r/302499 (owner: Milimetric) [20:01:14] milimetric: merged now [20:01:35] ottomata: the 'revision->getUser' returns the userId, so it was easier to call 'revision->getUserText' than to load the user object by ID and take the text from it, and revision->getUserText() essentially does the same thing [20:02:00] but I do agree that it would be better/more consistent to use $user->getName() everywhere.
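The getName-vs-getUserText question above comes down to the anonymous-edit fallback: both end up identifying an edit by IP address when there is no registered user, which is why consistency is safe. A toy rendition of that fallback logic (MediaWiki itself is PHP; the function and parameter names here are illustrative, not MediaWiki's API):

```python
def actor_name(user_id, user_name, ip_address):
    """Mimic the MediaWiki fallback discussed above: registered users
    are identified by name, anonymous edits fall back to the IP."""
    if user_id and user_name:
        return user_name
    return ip_address
```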
[20:02:03] thx nuria_, next step is to do the scap deploy, so we can do that whenever [20:02:18] milimetric: batcave? [20:02:21] Pchelolo / ottomata: agree on user->getName [20:02:25] omw nuria_ [20:03:56] ok cool, thanks. [20:03:57] will do that then [20:06:59] Analytics, Pageviews-API, Reading-analysis: Suddenly outrageous higher pageviews for main pages - https://phabricator.wikimedia.org/T141506#2516626 (MusikAnimal) [20:18:17] Analytics, Pageviews-API, Reading-analysis: Suddenly outrageous higher pageviews for main pages - https://phabricator.wikimedia.org/T141506#2501321 (Milimetric) Thanks everyone for the analysis. This kind of bug is really hard to pin down. Because you're right, it's probably a bot. But finding som... [20:20:00] (CR) MarcoAurelio: "Sorry, I added Mxn and JGirault here in error." [analytics/refinery] (jenkins-test) - https://gerrit.wikimedia.org/r/290630 (owner: Maven-release-user) [20:32:34] milimetric: ok, one less hurdle, let's keep trying to deploy tomorrow [20:32:51] if anything we would have cleared a bunch of obstacles for the next time we need to do it [20:32:52] k [20:33:04] yeah, all good [21:10:29] Analytics-Kanban: User History: Populate the causedByUserId and causedByUserName fields in 'create' states. - https://phabricator.wikimedia.org/T139761#2516794 (mforns) I tested that and it works. The algorithm takes much longer to execute, probably because there are 500k create events in simplewiki, which n... [22:27:40] Analytics, Pageviews-API, Reading-analysis: Suddenly outrageous higher pageviews for main pages - https://phabricator.wikimedia.org/T141506#2501321 (Tbayer) >>! In T141506#2502584, @MusikAnimal wrote: > I believe @Sjoerddebruin is looking for a general research investigation. Per [[ https://meta.wiki...