[00:00:24] nuria: well i was going to ask - in the current vagrant setup - is there any test data in the wiki database?
[00:00:49] madhuvishy: no, 'wiki' database is vagrant's mediawiki install
[00:00:56] madhuvishy: our tests create test_wiki
[00:01:02] madhuvishy: using sql alchemy bindings
[00:01:11] nuria: that's fine - but there's no data right?
[00:01:13] madhuvishy: so we can freely insert /delete data per test
[00:01:31] madhuvishy: it's empty until you do a test run
[00:01:34] nuria: i understand the tests part
[00:01:42] I'm asking only about the dev setup
[00:02:11] are the wiki and centralauth databases empty when vagrant sets it up?
[00:02:31] madhuvishy: wiki we do not set up, that comes out of the box in vagrant, neither centralauth
[00:02:37] madhuvishy: those are separate roles
[00:02:42] nuria: yes
[00:02:49] but do they have data in them?
[00:02:56] madhuvishy: not that we use
[00:03:05] some minimal data like admin i've seen
[00:03:09] madhuvishy: the role might set up some for other purposes
[00:03:25] madhuvishy: permits likely yes
[00:03:33] nuria: okay - so if we setup our own dbs for dev in a docker container
[00:03:34] madhuvishy: as in grants for access
[00:03:40] does it have to have any data in it?
[00:03:47] madhuvishy: through the alchemy db bindings?
[00:03:52] ummmm
[00:03:54] yes
[00:04:02] creating the db via alchemy bindings
[00:04:05] madhuvishy: no, but if we use sql alchemy we will set up the db EMPTY
[00:04:13] the tables are set up via the bindings
[00:04:15] nuria: yes :) i know...
[00:04:19] ah sorry
[00:04:31] i'm asking if we should populate it with some test data to start with
[00:04:35] for dev
[00:04:50] or if it's okay if it's just empty
[00:04:56] We could but i would do it through sql alchemy
[00:05:06] not with a frozen snapshot or it will fall out of date
[00:05:14] to start i'd say empty is fine
[00:05:19] nuria: i dont want to do a frozen snapshot
[00:05:39] milimetric suggested we can set it up using test fixtures - and not tear it down
[00:05:50] madhuvishy: ok, so empty dbs will work, we used to have a script that through the ui populated some data
[00:06:21] nuria: okay - i'm going to stop worrying about populating any data for now - and set up everything else.
[00:06:29] will worry about this at the very end
[00:06:40] madhuvishy: sounds good
[00:06:53] cool, thanks
[00:50:04] Analytics-Kanban, Editing-Analysis: Edit schema needs purging, table is too big for queries to run (500G before conversion) {oryx} [8 pts] - https://phabricator.wikimedia.org/T124676#1964444 (Neil_P._Quinn_WMF) I talked with @madhuvishy today, and she suggested another possibility: storing all the data in...
[10:08:52] Hello a-team! Just wanted to let you know that today I have to work with ops to re-image some memcached hosts plus probably rolling reboot of Kafka/Hadoop for the new kernel
[10:09:09] Hi elukey
[10:09:11] np :)
[10:09:37] elukey: please let us know when rolling reboot hadoop/kafka ;)
[10:09:45] have a good day !
[10:11:37] joal: of course!
[11:08:14] hi a-team!
[11:09:01] Hey ! Hi mforns :)
[11:09:22] hey joal :]
[11:11:20] mforns: I'm happy, I'm beginning (finally) to find results about anonymization :)
[11:11:49] took me long, I felt a bit lost in translation, but it seems it'll pay :)
[11:12:16] woohooo joal, so what did you get?
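As context for the exchange above: the pattern being discussed - point SQLAlchemy at an empty database, let the declarative bindings create the tables, and optionally seed a few fixture rows for dev instead of loading a frozen snapshot - looks roughly like the sketch below. The model, names and connection URL are illustrative, not the project's actual bindings.

```python
# Minimal sketch of the setup discussed above: empty DB, tables created from
# the SQLAlchemy bindings, optional fixture rows for dev. All names here
# (Base, User, the connection URL) are illustrative assumptions.
from sqlalchemy import create_engine, Column, Integer, String
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker

Base = declarative_base()

class User(Base):
    __tablename__ = 'user'
    user_id = Column(Integer, primary_key=True)
    user_name = Column(String(255), nullable=False)

def init_dev_db(url='mysql+pymysql://root:@127.0.0.1/test_wiki'):
    engine = create_engine(url)
    Base.metadata.create_all(engine)      # tables come from the bindings; DB starts empty
    session = sessionmaker(bind=engine)()
    if session.query(User).count() == 0:  # seed fixtures once, never a frozen dump
        session.add(User(user_name='DevAdmin'))
        session.commit()
    return session
```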
[11:12:44] I'm curious to see the results of the entropy analysis
[11:12:56] I have confirmation that circadian rhythm (days and nights) impacts anonymization
[11:13:04] (that was important :)
[11:13:07] aha
[11:13:17] And I have entropy --> we don't lose a lot of information
[11:13:22] batcave ?
[11:13:23] cool!
[11:13:25] sure!
[12:21:59] mforns: entropies per dimension done: we lose ~40% of wmf_app_version, ~25% of device family, ~11% of cities, ~10% of zero_carriers, and less than 10% for the rest of dims
[12:22:27] aha
[12:22:44] :)
[12:22:45] joal does this match with the other global measures?
[12:22:57] it makes total sense to me
[12:23:00] hm, not sure I understand the question :S
[12:23:18] I mean these per-dimension measures make total sense
[12:23:22] yeah mforns, orders of magnitude of anonymization makes sense to me as well :)
[12:23:46] This stuff seems to work :)
[12:24:14] and I think it matches with the 3-4% global loss we saw, if you count all other dimensions
[12:25:27] mforns: seems correct from 10 thousand feet ;)
[12:25:33] yes
[12:26:14] the only unexpected thing is that wmf_app_version has a higher loss... why should it? it is less variable than device_family, no?
[12:28:13] mforns: I think it's not because of variability, but because of global frequency - Whenever present on a row that needs anonymization, it is removed first because almost never present (in comparison to others)
[12:28:47] ooooh... I see
[12:29:01] makes sense
[12:29:02] :]
[12:51:24] (CR) Mforns: Add split-by-os argument to AppSessionMetrics job (1 comment) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/264297 (https://phabricator.wikimedia.org/T117615) (owner: Mforns)
[12:58:45] (CR) Mforns: Add split-by-os argument to AppSessionMetrics job (1 comment) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/264297 (https://phabricator.wikimedia.org/T117615) (owner: Mforns)
[12:59:13] (PS9) Mforns: Add split-by-os argument to AppSessionMetrics job [analytics/refinery/source] - https://gerrit.wikimedia.org/r/264297 (https://phabricator.wikimedia.org/T117615)
[13:00:33] (CR) Mforns: "Madhu, please let me know which version of the code you meant (see comments in previous patch), or if your idea was different :]" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/264297 (https://phabricator.wikimedia.org/T117615) (owner: Mforns)
[13:28:20] Analytics-Kanban: Add wm:BE-Wikimedia-Belgium to Wikimetrics tags {dove} [1 pts] - https://phabricator.wikimedia.org/T124492#1965537 (mforns) a:mforns
[13:39:25] Analytics-Tech-community-metrics: Port MediaWikiAnalysis to SQLAlchemy - https://phabricator.wikimedia.org/T114437#1965552 (AbdealiJK) @Anmolkalia are you still doing this ? Can I continue it or take over if you've stopped ?
[13:50:05] Analytics-Tech-community-metrics: Port MediaWikiAnalysis to SQLAlchemy - https://phabricator.wikimedia.org/T114437#1965571 (Anmolkalia) >>! In T114437#1965552, @AbdealiJK wrote: > @Anmolkalia are you still doing this ? Can I continue it or take over if you've stopped ? Hi. Actually, I had made the requisite...
[14:03:29] o/ joal & milimetric
[14:04:03] Hi halfak
[14:04:21] halfak: joining !
[14:04:25] joal, did you see my research showcase presentation last week?
[14:04:52] Hi halfak, I'm sorry I'm sick, kinda debating taking today off
[14:05:27] milimetric, don't sweat it. Take care of yourself!
[14:05:49] +1 !
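A rough illustration of the per-dimension measurement joal describes above: Shannon entropy of each column before and after anonymization, and the relative loss. It assumes pandas DataFrames and echoes the dimension names from the conversation, but it is not the team's actual analysis job.

```python
# Illustrative per-dimension entropy-loss computation, in the spirit of the
# numbers quoted above (e.g. ~40% loss for wmf_app_version). `before` and
# `after` are assumed DataFrames with the same columns, `after` anonymized.
import numpy as np
import pandas as pd

def shannon_entropy(series):
    """Shannon entropy (bits) of one categorical column."""
    p = series.value_counts(normalize=True, dropna=False)
    return float(-(p * np.log2(p)).sum())

def entropy_loss(before, after, dims):
    """Relative entropy loss per dimension, e.g. 0.40 means 40% lost."""
    loss = {}
    for dim in dims:
        h_before = shannon_entropy(before[dim])
        h_after = shannon_entropy(after[dim])
        loss[dim] = 1 - h_after / h_before if h_before else 0.0
    return pd.Series(loss).sort_values(ascending=False)

# entropy_loss(raw_df, anon_df, ['wmf_app_version', 'device_family', 'city', 'zero_carrier'])
```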
take some rest milimetric, I hope you'll feel better soon
[14:29:51] Can someone rm -rf /srv/limn-public-data/metrics/multimedia-health/cross-wiki-uploads/ for me on stat1003? I need it to be regenerated with the new query
[14:39:09] Hm, halfak, I don't see any negative deltas in my data for illustrations. Is this a bug in the script?
[14:39:45] I have one that added, apparently, 73,000 images
[14:39:51] But no removals
[14:41:19] Oh
[14:41:28] halfak: I see exactly what went wrong, ignore me
[14:41:41] On the bright side I have to run it again, so I don't need to worry about updating my data for a while
[14:47:05] Re-running the script, adding some extra data for my edification, and we'll see how it goes
[14:47:48] halfak: I don't know if you took that script verbatim for your example, but you might want to fix it
[14:49:55] * halfak does meeting things.
[14:49:58] Will be back shortly
[14:50:02] OK!
[15:22:43] Analytics-Kanban: Tune eventlogging.overall.inserted.rate alert to use a movingAverage transformation - https://phabricator.wikimedia.org/T124204#1965745 (elukey) Code pushed.
[15:23:03] Analytics-Kanban: Tune eventlogging.overall.inserted.rate alert to use a movingAverage transformation - https://phabricator.wikimedia.org/T124204#1965746 (elukey) Open>Resolved
[15:24:31] :)
[15:37:51] ottomata: o/
[15:38:23] hiay!
[15:55:08] Analytics-Kanban, Wikipedia-Android-App: Beta Event Logging no longer functional {oryx} - https://phabricator.wikimedia.org/T123781#1965828 (Niedzielski) Resolved>Open Reopening so this doesn't slip through the cracks.
[16:00:36] ottomata: hello Andrew
[16:01:10] hiya
[16:01:11] ottomata: I got a few patches for operations/puppet kafkatee and varnishkafka puppet modules. Merely to run python linting checks and actually run the unittests which are around
[16:01:26] https://gerrit.wikimedia.org/r/#/q/project:%255Eoperations/puppet/.*+is:open+owner:hashar+reviewer:Ottomata,n,z should give you the list. If you can have a look at it it would be nice
[16:01:37] then I will get the tox job to run on both repos
[16:01:49] lookin!
[16:04:00] mforns: meeting?
[16:04:00] nuria, 1x1?
[16:05:00] ottomata: great thanks :-}
[16:05:05] hashar: you have been mERGED
[16:11:19] Analytics, Editing-Analysis, Editing-Department: Consider scrapping Schema:PageContentSaveComplete and Schema:NewEditorEdit, given we have Schema:Edit - https://phabricator.wikimedia.org/T123958#1965896 (Halfak) Nuria, I don't think breaking down Edits by who does the editing makes sense, but there's s...
[16:17:12] Analytics, MediaWiki-extensions-WikimediaEvents, The-Wikipedia-Library, Wikimedia-General-or-Unknown, Patch-For-Review: Implement Schema:ExternalLinkChange - https://phabricator.wikimedia.org/T115119#1965911 (Halfak) Generally, we'll want to be able to query based on "domain". If we write a q...
[16:55:06] a-team: https://grafana.wikimedia.org/dashboard/db/eventlogging?panelId=8&fullscreen
[16:55:35] elukey, that's good no?
[16:55:46] i see burrow lag alert for processor, not sure why yet
[16:55:51] things seem to look ok
[16:55:58] but ja, that graph looks ok, no?
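The T124204 change above is about smoothing eventlogging.overall.inserted.rate with graphite's movingAverage transformation before alerting on it; something in that spirit is sketched below. The window, lookback and threshold idea are guesses for illustration, not the real puppet/icinga alert definition.

```python
# Loose illustration of alerting on a movingAverage-smoothed graphite metric,
# as described in T124204 above. Metric path is from the task title; window,
# lookback and the threshold comment are assumptions.
import json
import urllib.parse
import urllib.request

GRAPHITE = 'https://graphite.wikimedia.org/render'
TARGET = 'movingAverage(eventlogging.overall.inserted.rate, "15min")'

def latest_smoothed_rate(lookback='-1h'):
    qs = urllib.parse.urlencode({'target': TARGET, 'from': lookback, 'format': 'json'})
    with urllib.request.urlopen('%s?%s' % (GRAPHITE, qs)) as resp:
        series = json.load(resp)
    # graphite returns [{"target": ..., "datapoints": [[value, ts], ...]}]
    values = [v for v, _ts in series[0]['datapoints'] if v is not None]
    return values[-1] if values else None

# alert if latest_smoothed_rate() drops below an agreed minimum events/second
```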
[16:56:55] mmmm yesterday spiked at the same time of the outage
[16:57:10] I thought it was related to an issue while validating
[16:57:23] so I am probably wrong :)
[16:57:42] I saw the burrow alarm for el-mixed and I thought this metric was indicating an issue
[17:00:10] Analytics, Editing-Analysis, Editing-Department: Consider scrapping Schema:PageContentSaveComplete and Schema:NewEditorEdit, given we have Schema:Edit - https://phabricator.wikimedia.org/T123958#1966068 (Nuria) @halfak: agreed, my meta-point is that edit schema doesn't need more data flowing into it, r...
[17:09:24] Analytics: Restore MobileWebSectionUsage_14321266 and MobileWebSectionUsage_15038458 - https://phabricator.wikimedia.org/T123595#1966101 (Tbayer) It seems the "disk getting full" blocking task [[https://phabricator.wikimedia.org/T120187#1955662|is now resolved]] too. @jcrespo or others, could you confirm th...
[17:09:40] Analytics-Kanban, DBA, Patch-For-Review: Pending maintenance on the eventlogging databases (db1046, db1047, dbstore1002, other dbstores) - https://phabricator.wikimedia.org/T120187#1966104 (Tbayer)
[17:09:42] Analytics: Restore MobileWebSectionUsage_14321266 and MobileWebSectionUsage_15038458 - https://phabricator.wikimedia.org/T123595#1966103 (Tbayer)
[17:27:29] ottomata: have we found the reason for burrow alarms ?
[17:27:48] not yet
[17:27:55] finishing this up...
[17:38:02] unsure of what's happening, but restarting eventlogging
[17:38:11] !log restarting eventlogging
[17:39:53] I am a bit puzzled about the EL metrics
[17:40:05] the only thing that I can see for the time of the weirdness is
[17:40:06] 2016-01-26 17:21:11,036 (Thread-10 ) Rebalancing consumer eventlog1001:802ea649-8b02-4b50-b752-f098a159ac44 for topic eventlogging-valid-mixed.
[17:40:23] plus a lot of requests for topics to kafka
[17:40:26] i am also puzzled
[17:41:19] throughput metric goes down https://grafana.wikimedia.org/dashboard/db/eventlogging?panelId=6&fullscreen but I can't understand what this means, events are flowing in the logs
[17:42:50] elukey: i don't think throughput is going down, i think that's just graphite funkiness
[17:43:15] it looked like that 30 mins ago too, ja?
[17:44:21] yep could be, not really used to it.. sigh
[17:45:40] it could be the graphite cache lag from what I can read but I am super ignorant
[17:49:55] Analytics-Kanban: Restore MobileWebSectionUsage_14321266 and MobileWebSectionUsage_15038458 - https://phabricator.wikimedia.org/T123595#1966217 (Nuria) p:Triage>Normal
[17:52:47] the logs on eventlogging1001 are a bit weird
[17:52:49] like eventlogging_consumer-mysql-m4-master-00.log
[17:53:22] mforns: can you check?
[17:55:11] elukey: i'm sorta checking, will be 100% with you in a few...
[17:57:57] sure!
[17:58:20] (CR) Madhuvishy: Add split-by-os argument to AppSessionMetrics job (1 comment) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/264297 (https://phabricator.wikimedia.org/T117615) (owner: Mforns)
[17:58:22] So Burrow complains about mixed, might be related to mysql insertion rate?
[17:58:35] ellery: i'm also very confused
[17:58:40] well, according to burrow
[17:58:48] lag is 0, but the 'status' is STOP
[17:59:14] ottomata: you are not pinging elukey :)
[17:59:22] https://github.com/linkedin/Burrow/wiki/http-request-consumer-group-status
[17:59:23] ops!
[17:59:24] oops!
[17:59:27] elukey: ^
[17:59:35] (sorry ellery!)
[18:00:09] ah nice!
[18:00:44] 2016-01-26 17:55:42,307 (Thread-11 ) Received error(xid=29) NodeExistsError((), {})
[18:00:45] ?
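For reference on the Burrow discussion above (total lag 0 but status STOP): the consumer-status endpoint linked by ottomata returns JSON with a status object, and polling it could look roughly like the sketch below. The URL and group name come from the curl command that follows; the field names follow Burrow's documented lag response, but treat this as an illustration rather than the actual monitoring check.

```python
# Sketch of polling Burrow's consumer lag/status endpoint for the
# mysql-m4-master group discussed above. Field names ("status", "totallag")
# are per Burrow's wiki; interpretation here is simplified.
import json
import urllib.request

BURROW = 'http://krypton.eqiad.wmnet:8000/v2/kafka/eqiad/consumer/mysql-m4-master/lag'

def consumer_health(url=BURROW):
    with urllib.request.urlopen(url) as resp:
        body = json.load(resp)
    status = body.get('status', {})
    # Burrow can report STOP/STALL even when totallag is 0, as seen above:
    # the group stopped committing offsets rather than falling behind.
    return status.get('status'), status.get('totallag')

# state, lag = consumer_health()
# print('burrow says %s with total lag %s' % (state, lag))
```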
[18:00:58] ottomata: are you running that query from krypton ?
[18:01:12] yes
[18:01:27] curl http://krypton.eqiad.wmnet:8000/v2/kafka/eqiad/consumer/mysql-m4-master/lag | jq .
[18:01:30] ottomata: staff meeting?
[18:01:35] I'll add a reference in the oncall wiki
[18:01:36] elukey: staff meeting?
[18:01:38] oh ja
[18:01:41] nuria: sure!
[18:01:58] ?!? analytics has staff now?
[18:02:15] Nuria, I thought we were going to talk budget ;-)
[18:02:43] i sent you an e-mail
[18:02:52] it conflicts with staff
[18:03:23] can we do it tomorrow (super thanks for your help on this)
[18:03:52] ok
[18:10:55] i may see something, a problem with the mysql consumer hmm
[18:20:20] Analytics, Analytics-EventLogging: EventLogging dies when fetching a schema over HTTP that does not exist. - https://phabricator.wikimedia.org/T124799#1966396 (Ottomata) NEW
[18:23:07] ottomata: is the 404 the cause of the issue? So EL processes dying ?
[18:23:24] i think so, yeah, although, i'm not sure why'd they 404 in the mysql consumer
[18:23:56] if they are in the eventlogging-valid-mixed topic, that means that the schema previously existed and was validated
[18:24:14] looks like a series of events 404ed
[18:24:19] which caused several mysql processes to die
[18:24:28] several iterations of mysql consumers
[18:24:31] start, consume, 404, die
[18:24:34] a few times
[18:24:36] looks ok for now...
[18:24:39] and this caused the lag
[18:24:57] hmm actually
[18:24:57] no
[18:25:05] it just happened again a few mins ago
[18:25:37] theoretically when a process dies upstart brings it up again right?
[18:27:55] hey mforns let's talk about mysql handler?
[18:29:18] ottomata, sure! but I have the wikimedia bot meeting now...
[18:29:30] can we talk in 30 mins?
[18:29:45] ja
[18:30:28] madhuvishy: cost estimates for wikimania: https://office.wikimedia.org/wiki/Travel/Wikimania_2016
[18:31:00] nuria: there's nothing in it?
[18:31:02] only for flights
[18:31:25] https://office.wikimedia.org/wiki/Travel/Wikimania_2016
[18:31:27] mforns: I am going to skip the meeting sorry :
[18:31:29] :(
[18:31:33] wait .. i see content
[18:31:52] madhuvishy: it just got deleted/ whattt?
[18:32:27] madhuvishy: ah no, there is content on office wiki
[18:32:37] but for some reason it redirects to wikimediafoundation
[18:32:43] nuria: hhmmm - when i click it it redirects
[18:32:44] ottomata: going offline atm, I'll recheck later on for news about EL
[18:32:49] do you need me?
[18:33:29] naw its cool
[18:33:47] super, thanks!
[18:34:14] byyyeeeeee a-team! talk with you tomorrow :)
[18:34:24] byye
[18:42:00] ja ok, mysql consumers busted because of this, i need to add some logging...
[18:47:39] ottomata: I do not understand what is happening to consumers, can you explain a bit?
[18:47:47] one min..
[18:47:51] k
[18:48:43] nuria: not totally sure, but they are attempting to get a schema from meta that 404s
[18:48:46] that throws an exception
[18:48:47] and the process dies
[18:48:52] this happens for each iteration of the process
[18:49:22] also, I am not certain that the exception finally catchall in the mysql handler is inserting queued events when this happens
[18:49:24] going to add some logging
[18:49:43] ottomata: the http get? odd, seems that the parsing might but the GET...
[18:50:07] yes, it is strange
[18:50:12] because, these events are in eventlogging-valid-mixed
[18:50:17] which means they have been validated
[18:50:22] which means that the schema didn't 404 at some point
[18:50:30] need logging to know what is 404ing...
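What ottomata describes above - the consumer fetching a schema from meta, getting a 404, raising, and dying without saying which schema failed - suggests a fetch wrapper that at least logs the offending title and revision. A hedged sketch follows; the function name, error type and policy are made up for illustration, and only the jsonschema API shape (title + revid) is taken from the URLs that come up below.

```python
# Illustrative "log the 404 instead of dying silently" schema fetch.
# Not the actual EventLogging code; SCHEMA_API mirrors the meta.wikimedia.org
# jsonschema endpoint seen later in this log.
import json
import logging
import urllib.error
import urllib.request

SCHEMA_API = ('https://meta.wikimedia.org/w/api.php'
              '?action=jsonschema&title=%s&revid=%d&formatversion=2')

class SchemaError(Exception):
    pass

def fetch_schema(title, revid):
    url = SCHEMA_API % (title, revid)
    try:
        with urllib.request.urlopen(url) as resp:
            return json.load(resp)
    except urllib.error.HTTPError as e:
        # Say exactly which scid failed, so "what is 404ing" is in the logs.
        logging.error('Schema fetch failed for %s rev %d: HTTP %d (%s)',
                      title, revid, e.code, url)
        raise SchemaError('schema %s rev %d unavailable' % (title, revid))
```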
[18:52:28] ottomata: the capsule?
[18:52:37] ottomata: as in the revision of capsule is wrong?
[18:53:45] not sure
[18:53:51] yet
[18:55:18] !log stopping eventlogging mysql consumers to troubleshoot
[18:59:55] a-team, gone for dinner !
[18:59:59] nuria: https://meta.wikimedia.org/w/api.php?action=jsonschema&title=PageDeletion&revid=7481655&formatversion=2
[19:01:34] oh
[19:03:09] am very confused
[19:04:11] formatversion=2?
[19:04:33] ottomata: https://wikimediafoundation.org/wiki/Schema:PageDeletion
[19:04:36] gah
[19:05:22] ja
[19:05:22] hm
[19:05:23] ah
[19:05:25] some redirect?
[19:05:29] this works though
[19:05:30] https://meta.wikimedia.org/w/api.php?action=jsonschema&title=PageDeletion&revid=7481655
[19:05:35] ottomata: yeah
[19:05:38] something is wrong
[19:05:40] some work though?
[19:05:40] https://meta.wikimedia.org/wiki/Schema:DeprecatedUsage
[19:05:50] going to releng
[19:05:55] meta.wikimedia.org keeps taking me to wikimediafoundation
[19:06:01] so does office
[19:06:45] very strange though
[19:06:45] https://meta.wikimedia.org/w/api.php?action=jsonschema&title=DeprecatedUsage&revid=7906187&formatversion=2
[19:06:46] works fine
[19:07:05] ottomata, back, did you want to talk about mysql handler?
[19:07:20] mforns: nm i think things are ok
[19:07:22] but see above
[19:07:43] i was worried that the finally block wasn't doing what I thought it did and we were losing queued events
[19:07:45] but we aren't, it's working
[19:08:07] I see
[19:08:09] a-team, *.wikimedia.org is broken
[19:08:12] i'm stopping eventlogging for now
[19:08:12] reading scrollback
[19:08:21] oh
[19:08:38] !log stopping all of eventlogging
[19:08:47] wow
[19:08:56] ottomata: should I send email to list?
[19:09:11] madhuvishy: hold, we aren't losing data while this is happening, as long as i stop processors, etc.
[19:09:18] ah
[19:09:20] we should send email, but we need more info
[19:09:27] well
[19:09:29] i take it back
[19:09:32] we are losing server side data
[19:09:36] since that is not sent to kafka
[19:09:37] hmmm
[19:09:48] i will stop that one process, since it doesn't do validation
[19:09:48] client side works?
[19:09:48] and it will send to kafka
[19:09:50] yes
[19:09:54] it is varnishkafka -> kafka
[19:10:02] its being buffered in kafka when el is down
[19:10:08] right kay
[19:11:56] ottomata: i see that the original problem is fixed - all redirects are because of them being cached at different levels
[19:12:00] ops is purging them
[19:12:43] Analytics, Editing-Analysis, Editing-Department: Consider scrapping Schema:PageContentSaveComplete and Schema:NewEditorEdit, given we have Schema:Edit - https://phabricator.wikimedia.org/T123958#1966716 (Neil_P._Quinn_WMF) @Halfak, cool! Normalizing the schema seems like a good idea; that could be part...
[19:13:11] https://meta.wikimedia.org/w/api.php?action=jsonschema&title=PageDeletion&revid=7481655&formatversion=2 200ing now
[19:13:25] not getting redirects
[19:15:13] cool
[19:15:32] first time i ever remember an EL problem caused by meta
[19:15:46] ha ha
[19:15:48] aha
[19:15:58] i made a task to not die on a 404
[19:16:05] buuut, maybe it is the right thing to do after all
[19:16:05] there are possible gaps in the server side?
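The manual checks above - the same jsonschema URL sometimes 200ing, sometimes bouncing to wikimediafoundation.org - can be reproduced with a small probe that refuses to follow redirects, so a caching or deploy problem shows up as a 30x instead of silently "working". This is purely illustrative and not part of EventLogging or the ops tooling.

```python
# Probe a URL without following redirects; report the status and, for a 30x,
# the Location it tried to send us to. Hypothetical helper for the kind of
# by-hand checking done above.
import urllib.error
import urllib.request

class NoRedirect(urllib.request.HTTPRedirectHandler):
    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None  # report the redirect instead of following it

def probe(url):
    opener = urllib.request.build_opener(NoRedirect)
    try:
        resp = opener.open(url, timeout=10)
        return resp.getcode(), resp.geturl()
    except urllib.error.HTTPError as e:
        return e.code, e.headers.get('Location')

# probe('https://meta.wikimedia.org/w/api.php?action=jsonschema'
#       '&title=PageDeletion&revid=7481655&formatversion=2')
```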
[19:16:09] madhuvishy: yes
[19:16:17] not long though
[19:16:25] okay
[19:16:30] schemas are cached, so they could be kept
[19:16:31] only 5 minutes ish, however long it took me from when i stopped all EL and then started a forwarder process
[19:16:39] mforns: yeah, for the most part
[19:16:50] but, for schemas that weren't cached, it was causing processes to die
[19:16:59] aha
[19:19:22] !log restarting eventlogging after *.wikimedia.org outage
[19:21:39] we should really get that server side stuff ported to kafka
[19:21:39] ottomata, if we change the code to not die on a 404, and we also avoid cache eviction in that case, we can still let the process run for quite a long time, probably all schemas are cached in all processes at all moments
[19:22:03] yeah
[19:22:12] but, we'd lose messages for the 404s
[19:22:23] whereas now, we don't lose any messages
[19:22:29] but things come to a halt
[19:22:31] ottomata, I see...
[19:23:30] but I mean, EL code requests the schema json when the eviction time triggers no? we could keep the old schema when a request goes 404
[19:24:02] eviction time?
[19:24:11] i don't think there's any eviction time for cached schemas
[19:24:19] I mean we could keep the last valid schema at all times
[19:24:29] no?
[19:24:36] all schemas are cached
[19:24:42] ottomata: sorry i missed ping, i see now
[19:24:47] you want to validate an event against its previous version?
[19:24:52] looking
[19:25:11] an event is validated against a specific scid (schema,revision)
[19:25:19] ottomata, I see, I thought there was some eviction
[19:25:21] anw
[19:25:22] naw
[19:25:24] because
[19:25:30] revisions are immutable
[19:25:33] no need to ever re-request
[19:25:33] aha
[19:25:36] makes sense
[19:25:41] ottomata: so schemas were 404-ing at times
[19:25:51] ottomata: but how is that related to server side validation?
[19:25:56] uncached schemas (mostly in mysql consumers) were 404ing
[19:26:00] causing the process to die
[19:26:08] nuria, i stopped all of eventlogging
[19:26:14] server side is the only one we still lose messages for when we do that
[19:26:19] it still uses UDP
[19:26:21] ottomata: ya, understood
[19:26:28] ottomata: cause those are going to zero mq
[19:26:32] not zmq
[19:26:37] UDP -> eventloggingforwarder -> kafka
[19:26:37] ah sorry
[19:26:39] udp
[19:26:45] yes yes
[19:26:47] PROBLEM - Difference between raw and validated EventLogging overall message rates on graphite1001 is CRITICAL: CRITICAL: 53.33% of data above the critical threshold [30.0]
[19:27:01] here
[19:27:01] https://wikitech.wikimedia.org/w/images/c/cc/EventLoggingStag.jpg
[19:27:14] note the (to be removed) italic :)
[19:27:22] then: the 404s in meta had nothing to do with us right?
[19:27:22] we have kafka support in MW now
[19:27:28] we should make the EventLogging extension use it
[19:27:42] they were caused by a bad deploy that was reverted
[19:27:42] ottomata: let's make task
[19:27:46] ops had to purge caches everywhere
[19:27:48] i had a task, can't find it...
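The caching argument ottomata makes above - an event is validated against a specific scid, and schema revisions are immutable, so cached schemas never need eviction or re-fetching - amounts to a permanent memo keyed by (title, revision). A minimal sketch, with the fetcher passed in (for example the hypothetical fetch_schema from the earlier sketch):

```python
# Sketch of an evict-nothing schema cache keyed by scid. Because a revision
# never changes once created, each scid is fetched at most once for the life
# of the process. `fetch` is whatever actually retrieves the schema.
schema_cache = {}

def get_schema(scid, fetch):
    """scid is a (title, revid) tuple; revisions are immutable."""
    if scid not in schema_cache:
        title, revid = scid
        schema_cache[scid] = fetch(title, revid)  # fetched once, kept forever
    return schema_cache[scid]

# get_schema(('PageDeletion', 7481655), fetch_schema)
```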
[19:28:09] AH
[19:28:10] https://phabricator.wikimedia.org/T106257
[19:28:57] Analytics: Server side eventlogging should publish to kafka and not use udp - https://phabricator.wikimedia.org/T124813#1966797 (Nuria) NEW
[19:29:01] Analytics: Server side eventlogging should publish to kafka and not use udp - https://phabricator.wikimedia.org/T124813#1966806 (Nuria)
[19:29:19] nuria ^^^
[19:29:23] argjhh
[19:30:16] ottomata: consolidated tasks now
[19:30:32] Analytics-EventLogging, Analytics-Kanban: Send raw server side events to Kafka using a PHP Kafka Client - https://phabricator.wikimedia.org/T106257#1966814 (Nuria)
[19:30:39] Analytics-EventLogging, Analytics-Kanban: Send raw server side events to Kafka using a PHP Kafka Client - https://phabricator.wikimedia.org/T106257#1966815 (Nuria) p:Low>Normal
[19:41:39] ottomata: let's prioritize the EL server side switchover
[19:43:45] k
[19:44:47] Jamesofur: yt?
[19:45:57] nuria: kinda sorta, sup?
[19:46:10] Jamesofur: how has piwik worked out?
[19:46:57] overall I like it, was really nice to see more real time stats and more breadth of stats without using google analytics
[19:47:25] Jamesofur: are the wikimedia15 folks still using it?
[19:48:48] Some, but not particularly actively atm since the main push is all over, I need to show a couple more of them how to get in.
[19:48:48] RECOVERY - Difference between raw and validated EventLogging overall message rates on graphite1001 is OK: OK: Less than 20.00% above the threshold [20.0]
[19:49:08] Jamesofur: ok
[20:01:28] bye a-team, see you tomorrow! o/
[20:01:46] night mforns :)
[20:05:54] laters!
[20:40:32] oh boy, all broken?? O_O
[20:40:32] bad deploy??
[20:40:32] elukey: all good now
[20:40:32] but yeah, bad deploy
[20:46:12] PROBLEM - Difference between raw and validated EventLogging overall message rates on graphite1001 is CRITICAL: CRITICAL: 26.67% of data above the critical threshold [30.0]
[20:47:47] i think thats me
[20:47:50] not real
[20:47:57] i forgot to silence that
[20:48:00] i'm changing kafka metric names
[21:05:09] Analytics, MediaWiki-extensions-WikimediaEvents, The-Wikipedia-Library, Wikimedia-General-or-Unknown, Patch-For-Review: Implement Schema:ExternalLinkChange - https://phabricator.wikimedia.org/T115119#1967284 (Sadads) I did a totally unscientific quick scroll through of P587, most of the items...
[22:02:44] RECOVERY - Difference between raw and validated EventLogging overall message rates on graphite1001 is OK: OK: Less than 20.00% above the threshold [20.0]
[22:03:15] cool
[22:07:31] laters a-team!
[22:51:15] (PS5) Bearloga: Functions for categorizing queries. (Work In Progress) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/254461 (https://phabricator.wikimedia.org/T118218)
[23:21:16] Analytics-Kanban, Editing-Analysis: Edit schema needs purging, table is too big for queries to run (500G before conversion) {oryx} [8 pts] - https://phabricator.wikimedia.org/T124676#1968026 (Neil_P._Quinn_WMF) It looks like we could reduce the logging rate to a //quarter// of what it is now, just by redu...