[03:48:39] bearloga: BTW (saw your tweet) there is now https://phabricator.wikimedia.org/T160941 ;)
[04:06:28] aah, I wish I could get to all the things :(
[06:25:20] Analytics-Kanban, DBA, Patch-For-Review: Change length of userAgent column on EL tables - https://phabricator.wikimedia.org/T160454#3142820 (Marostegui) The eventlogging script on db1047 is failing due to: ``` Thu Mar 30 06:07:49 UTC 2017 localhost ContentTranslationError_11767097, createERROR 1005 (...
[08:28:09] Analytics-EventLogging, Analytics-Kanban, DBA, Patch-For-Review: Add autoincrement id to EventLogging MySQL tables. {oryx} - https://phabricator.wikimedia.org/T125135#3143065 (jcrespo) That would be a huge win! And it would make the terbium back-filling unnecessary, finally!
[08:44:32] Analytics-Kanban, DBA, Patch-For-Review: Change length of userAgent column on EL tables - https://phabricator.wikimedia.org/T160454#3143073 (jcrespo) Should we increase open_files_limit or do you think this was a one-time issue due to the rename process?
[08:46:06] Analytics-Kanban, DBA, Patch-For-Review: Change length of userAgent column on EL tables - https://phabricator.wikimedia.org/T160454#3143081 (Marostegui) I thought about it, but it had not happened until we did the rename thingy yesterday. So I would leave it for now.
[11:18:26] Analytics-Tech-community-metrics: Git code repository is listed but not all recent activity is shown on wikimedia.biterg.io - https://phabricator.wikimedia.org/T161211#3143339 (Aklapper)
[12:12:29] Analytics-Tech-community-metrics, Gerrit: Numerous Gerrit (draft) patchsets cannot be accessed: "Cannot display change because it has no revisions." - https://phabricator.wikimedia.org/T161207#3143407 (Aklapper)
[12:38:36] ah snap, I found https://github.com/wikimedia/varnishkafka/issues/5
[12:38:39] from October..
[12:38:47] I put myself on the watchlist
[12:45:17] elukey: yeah, gerrit is just not good for actual open source work. It gets the review part right, but completely fails on the social part
[12:46:03] I wonder if we should think about moving to differential
[12:48:11] elukey: Would you have a minute for some explanations on puppet in labs for the newbie I am?
[12:48:24] joal: whatcha doin?
[12:49:45] milimetric: I launched an instance on labs yesterday, and couldn't find how to apply a puppet role to it
[12:51:21] well, did you make it a self-hosted puppetmaster yet?
[12:51:36] milimetric: didn't do anything :)
[12:51:53] one sec, looking up instructions in case they changed
[12:52:02] milimetric: it's been a long time since I last did some labs infra, and last time there was the possibility
[12:52:10] to apply roles through the UI
[12:52:12] https://wikitech.wikimedia.org/wiki/Standalone_puppetmaster
[12:52:13] IIRC
[12:52:24] right, those are the built-in roles
[12:52:35] you can do that from wikitech... unless they moved it, one sec
[12:52:48] wait wait, why a self-hosted puppetmaster? :)
[12:53:08] joal: do you have access to horizon?
[12:53:29] elukey: I think I don't - I tried yesterday and didn't manage to log in
[12:54:33] joal: you should be able to access it via wikitech credentials..
[12:54:42] aha, they moved puppet stuff to horizon
[12:55:20] I thought he needed a role he was working on, to test, self-hosted is still the only way to do that, right?
[12:55:23] if the puppet role is already in puppet and you don't need to live-hack it etc.. it suffices to connect to Horizon and select the role
[12:55:36] elukey: I suspected something like that :)
[12:56:04] milimetric: yes exactly, my understanding is that self-hosted pm is a super painful process that you need only if you have to hack :D
[12:56:06] elukey: horizon gives me an invalid creds message (I use GAuthenticator as 2nd validation)
[12:56:45] mmmm
[12:57:52] elukey: I think 2-factor auth was not set up (from what I see), even if I had a Gauth in place (weird)
[12:58:27] so you can now log in?
[12:58:32] yeah, they sent an email a while back that it was required, we should've mentioned it at standup
[13:02:24] I triple checked 2-factor auth on wikitech - works fine - and Horizon still doesn't allow me :(
[13:02:57] hm, I remember something weird like that, one sec
[13:04:42] joal: it looks like you need to be a projectadmin on a project, elukey is that just being in the admin role? or something else?
[13:04:56] Yay ! Finally managed to
[13:05:04] milimetric: it was a space problem
[13:05:21] oh yay. Ok, I just checked and you're projectadmin on analytics
[13:05:22] :O
[13:05:25] hiii
[13:05:34] milimetric: My 2FA program gives me a code with 6 digits, 3 then a space then 3
[13:05:42] whaaaa?!
[13:05:49] Well, that space causes a problem for Horizon
[13:05:58] But not for wikitech, from what I have tested
[13:06:03] WEIRDOOOOH
[13:06:06] I've never heard of a space in a 2fa
[13:06:30] so you just leave the space out?
[13:06:40] correct
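The workaround joal landed on, sketched as the input normalization wikitech apparently performs and Horizon apparently does not. This is a hypothetical helper for illustration, not wikitech or Horizon code:

```python
# Hypothetical sketch: treat '123 456' and '123456' as the same one-time code
# by stripping the whitespace some authenticator apps insert for readability.
def normalize_totp_code(raw: str) -> str:
    code = "".join(raw.split())  # drop all whitespace
    if not (code.isdigit() and len(code) == 6):
        raise ValueError("expected a 6-digit one-time code, got %r" % raw)
    return code

assert normalize_totp_code("123 456") == "123456"
```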
[13:06:53] cool, and did you find puppet in there?
[13:07:04] not yet, still looking around :)
[13:07:25] I did !
[13:07:29] joal: if you click on the instance there should be a puppet tab
[13:07:31] ok :)
[13:07:42] And it looks like ottomata already did the things for me yesterday :)
[13:07:54] i broke some things
[13:07:57] we both broke things
[13:07:59] Ah by the way elukey - Looks like I'll have a strong incentive to move fast to jessie ;)
[13:08:07] ottomata: I surely did !
[13:08:26] joal: should I be scared? :D
[13:08:27] joal: what is the status? i was just going to stand you up a new jessie cluster this morning
[13:08:28] ottomata: I kinda felt removing 3 instances like that would have broken something :)
[13:08:52] ottomata: I managed to run a job with jessie when running locally
[13:09:03] I'd love to have a small cluster to test it in yarn mode
[13:09:17] (like 1 master and 1 or 2 workers)
[13:09:23] ok, i'm going to just blast all the existing nodes and make a new 3-node cluster, non-HA
[13:09:27] needs to be jessie, and to have some packages installed
[13:09:29] Analytics-Tech-community-metrics, Developer-Relations (Jan-Mar-2017): Investigate why there is a mismatch between six names and certain email address in mediawiki-identities data - https://phabricator.wikimedia.org/T123643#3143480 (Aklapper) a: Aklapper
[13:09:45] awesome, thanks a lot andrew
[13:09:58] ottomata: By the way, can I shadow you on that (learning will be interesting)
[13:10:21] Analytics-Tech-community-metrics, Developer-Relations (Jan-Mar-2017): Investigate why there is a mismatch between six names and certain email address in mediawiki-identities data - https://phabricator.wikimedia.org/T123643#1934774 (Aklapper) Open>Resolved **** This is wrong / corrupted dat...
[13:11:24] joal: sure!
[13:11:25] bc?
[13:11:31] OMW !
[13:11:47] Analytics-Tech-community-metrics, Developer-Relations (Jan-Mar-2017): Investigate why there is a mismatch between six names and certain email address in mediawiki-identities data - https://phabricator.wikimedia.org/T123643#3143493 (Aklapper)
[13:25:13] (PS7) Fdans: Add changes to support reportcard in Dashiki [analytics/dashiki] - https://gerrit.wikimedia.org/r/344114 (https://phabricator.wikimedia.org/T143906)
[13:26:11] fdans: want me to review that? I see it doesn't have a WIP
[13:27:11] milimetric: that'd be great! It's missing a couple of tests that I'm adding now, just fyi
[13:27:18] k
[13:31:44] (CR) Milimetric: [V: 2 C: 2] Update sqoop script to better handle failures (1 comment) [analytics/refinery] - https://gerrit.wikimedia.org/r/345327 (owner: Joal)
[13:32:16] lost you andrew
[13:33:02] ottomata: --^
[13:34:01] thanks milimetric :)
[13:59:38] Analytics-Kanban, User-Elukey: Review the recent Varnishkafka patches - https://phabricator.wikimedia.org/T158854#3143560 (elukey) Patch reviewed! Definitely a little bug (not relevant to our settings btw) gets resolved, but I am unsure about the other ones. Asked for clarifications.
[14:00:51] finally vk patches reviewed \o/
[14:01:35] hello analytics! I need access to some EventLogging data, I understand you may need to set it up to replicate in Hive?
[14:01:46] specifically, this is for cookie blocks
[14:03:41] ottomata: https://gerrit.wikimedia.org/r/#/c/345509/2/wmf-config/ProductionServices.php
[14:07:43] nvm, I found it :)
[14:14:04] musikanimal: ya! hi. EL has been going to HDFS for a while
[14:14:12] we plan to make the hive integration better soonish
[14:14:19] for now you have to do it like you probably just found on wikitech
[14:14:33] elukey: .discovery.?!
[14:20:36] ottomata: looks like a success :)
[14:20:47] ottomata: I had to do another manual step though :(
[14:20:49] ottomata: is EventLogging data exposed in grafana?
[14:20:59] I found what looks like what I want, but I see no data
[14:21:04] so we could have just set it up wrong
[14:21:18] but also I don't see "grafana" mentioned in the wikitech docs
[14:21:45] https://grafana.wikimedia.org/dashboard/db/eventlogging?orgId=1&var-topic=eventlogging_CookieBlock&from=1469071723354&to=1490882923354
[14:23:17] musikanimal: sorta, but nothing beyond # of events
[14:23:45] you can see there aren't many messages for that schema, so it is hard to graph there
[14:23:46] hm
[14:24:30] musikanimal: you can also kinda see it here
[14:24:31] https://grafana.wikimedia.org/dashboard/db/kafka-by-topic?from=1469071723354&to=1490882923354&refresh=5m&orgId=1&var-cluster=analytics-eqiad&var-kafka_brokers=All&var-topic=eventlogging_CookieBlock
[14:24:34] but yeah hardly any messages
[14:25:00] joal: what's your other manual step?
[14:25:01] ottomata: nice! now I see the data I expected
[14:25:09] ottomata: yep! Now the auth dns is able to respond with the active DC IP
[14:25:11] we only had it set up on testwiki
[14:25:15] just wanted to let you know
[14:25:20] it is for the switchover
[14:25:21] ottomata: downloading some NLTK resources through python
[14:25:40] ottomata: I think this step can be seen as putting some resource under /usr/local/share/nltk_data
[14:25:47] * elukey brb
[14:28:27] elukey: cool!
[14:28:29] that's pretty awesome
[14:28:40] hmmm
[14:28:41] ok...
[14:31:01] Analytics-Kanban, DBA, Patch-For-Review: Change length of userAgent column on EL tables - https://phabricator.wikimedia.org/T160454#3143638 (Ottomata) Wow, ok. Thanks.
[14:32:44] ottomata: hey, so what does "Kafka Messages by topic" mean?
[14:32:56] sorry for not reading the docs
[14:33:32] musikanimal: do you know what kafka is?
[14:33:50] haha no I guess not
[14:33:57] ok, simple answer then :)
[14:34:03] kafka is our distributed log buffer
[14:34:17] so, it's what allows us to ship logs and streaming data around the cluster
[14:34:19] kinda like a message queue
[14:34:31] EventLogging uses it
[14:35:00] oh ok
[14:35:04] it gets incoming events from browsers or wherever, parses them, then produces them to separate kafka 'topics' (queues)
[14:35:06] so
[14:35:18] the EL schema based ones are all prefixed with 'eventlogging_'
[14:35:18] Analytics, EventBus, MW-1.29-release (WMF-deploy-2017-03-28_(1.29.0-wmf.18)), Services (done): Page properties-change event is rejected if page was deleted - https://phabricator.wikimedia.org/T158702#3143645 (Pchelolo) Open>Resolved Deployed, verified, resolving.
[14:35:33] so all messages for a particular schema name will go into the 'eventlogging_MySchemaName' topic
[14:35:51] from there
[14:35:59] the messages are consumed by various pieces
[14:36:03] one of which is a mysql inserter
[14:36:07] that's how the data gets into mysql
[14:36:15] another is an HDFS writer
[14:36:20] that is how the data gets into HDFS
[14:36:24] ottomata: Tried to provide the nltk corpora as a resource to spark, but didn't work :(
[14:36:51] joal: i don't know what that is, but is that something we can just put into hdfs via refinery or something?
[14:37:05] ottomata: I see, thank you for the explanation!
[14:37:24] so i was kind of hoping to know in integers how many events were logged, is that possible?
[14:37:38] musikanimal: https://wikitech.wikimedia.org/wiki/EventLogging
[14:37:41] has a little diagram
[14:38:08] also, i gave a lightning tech talk about it a while ago
[14:38:10] if you are curious
[14:38:10] https://www.youtube.com/watch?v=yUQ5d192z3M
[14:38:16] https://www.mediawiki.org/wiki/File:EventLogging_on_Kafka_-_Lightning_Talk.pdf
[14:39:19] ottomata: ok thank you!
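For the "how many events were logged, in integers" question, a minimal sketch with the kafka-python client, consuming the per-schema topic ottomata describes above. The broker hostname and settings are placeholders, not the production values, and only events still inside Kafka's retention window get counted — full history lives in MySQL and HDFS:

```python
from kafka import KafkaConsumer  # pip install kafka-python

# Each EventLogging schema gets its own topic: 'eventlogging_<SchemaName>'.
consumer = KafkaConsumer(
    'eventlogging_CookieBlock',
    bootstrap_servers='kafka1012.eqiad.wmnet:9092',  # placeholder broker
    auto_offset_reset='earliest',   # start at the oldest retained event
    consumer_timeout_ms=10000,      # stop iterating once we're caught up
    group_id=None,                  # ad-hoc read, commit no offsets
)

# One Kafka message per validated event, so counting messages counts events.
print('events seen:', sum(1 for _ in consumer))
```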
[14:41:45] Analytics, Analytics-Dashiki, Patch-For-Review: Create dashboard for upload wizard - https://phabricator.wikimedia.org/T159233#3143678 (matthiasmullie) Yes and no. I do have time to work on this, but no experience with Dashiki. So far, I've not had much luck getting anywhere myself, so some pointers...
[14:46:08] ottomata: it won't work with HDFS as far as I understand :(
[14:49:55] ottomata: the nltk package as used in revscoring expects data in specific places: https://gist.github.com/jobar/2ee394541c4cec991b6b3e7078dfd85f
[14:50:26] oof, so it'd need to be on all the workers in one of those dirs?
[14:50:28] ergh
[14:50:54] joal: is there an nltk data python package?
[14:50:56] ottomata: Mwarf indeed
[14:51:04] ottomata: didn't check !!!1
[14:51:18] http://www.nltk.org/data.html
[14:51:25] If you did not install the data to one of the above central locations, you will need to set the NLTK_DATA environment variable to specify the location of the data
[14:52:22] joal: i betcha we could make a deb package that includes the data
[14:52:35] ottomata: Seems not too complicated
[14:52:50] ottomata: except that I don't know how to make packages :S
[14:54:19] haha yeah i can do that for ya
[14:54:40] ottomata: from other tutorials I see, they run the download command on every worker node (ugly)
[14:54:44] yeah
[14:54:48] no good
[14:55:42] some utilities coming with the cluster make ugly easy (command: acluster --> runs the command on every worker - WROOOOONG !)
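What the nltk.org quote above boils down to in code — a sketch using the directory joal mentioned. Both the environment variable (read when nltk is imported) and the nltk.data.path list steer where corpora are looked up:

```python
import os

# Setting NLTK_DATA before importing nltk adds the dir to its search path.
os.environ['NLTK_DATA'] = '/usr/local/share/nltk_data'

import nltk

# Equivalent after import: prepend the directory to nltk's search path list.
nltk.data.path.insert(0, '/usr/local/share/nltk_data')

# Raises a descriptive LookupError if the corpus isn't in any search dir.
nltk.data.find('corpora/wordnet')
```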
[15:03:27] Analytics-Kanban: Document and publicize AQS legacy page counts endpoint - https://phabricator.wikimedia.org/T159959#3143706 (mforns) a: mforns
[15:14:29] Analytics-Kanban, DBA, Operations: Improve eventlogging replication procedure - https://phabricator.wikimedia.org/T124307#3143721 (Nuria) a: Ottomata
[15:15:28] Analytics-Kanban, DBA, Operations: Improve eventlogging replication procedure - https://phabricator.wikimedia.org/T124307#1952524 (Nuria) Let's take advantage of the fact that after the rename we now have autoincrement ids on new tables.
[15:29:43] Analytics, MediaWiki-extensions-WikimediaEvents, The-Wikipedia-Library, Wikimedia-General-or-Unknown, and 4 others: Implement Schema:ExternalLinksChange - https://phabricator.wikimedia.org/T115119#3143785 (Milimetric) Sam knows about the replication, that's why he knows there's a delay. But Sam,...
[15:43:45] nuria: the full config for the reportcard is now in https://meta.wikimedia.org/wiki/Config:Dashiki:Sample/tabs
[15:43:56] if you want to take a look
[15:44:02] looking
[15:45:20] fdans: I see, it should be "daily" unique devices only, no monthly, other than that it looks good
[15:45:41] nuria: cool, removing monthly
[15:54:16] Analytics, MediaWiki-extensions-WikimediaEvents, The-Wikipedia-Library, Wikimedia-General-or-Unknown, and 4 others: Implement Schema:ExternalLinksChange - https://phabricator.wikimedia.org/T115119#3143837 (Samwalton9) >>! In T115119#3143785, @Milimetric wrote: > Sam knows about the replication, t...
[15:54:40] Analytics, Operations, Documentation: Improve SSH access information in onboarding documentation - https://phabricator.wikimedia.org/T160941#3115695 (Nuria) Can we be specific as to what needs improvements to help ops document what is needed? cc @Zareenf, @Tbayer @mpopov who had had trouble with thi...
[15:56:56] Analytics-Kanban, DC-Ops, Operations, ops-eqiad, Patch-For-Review: Decom/Reclaim analytics1027 - https://phabricator.wikimedia.org/T161597#3143852 (Nuria)
[15:57:17] Analytics-Kanban, DC-Ops, Operations, ops-eqiad, Patch-For-Review: Decom/Reclaim analytics1027 - https://phabricator.wikimedia.org/T161597#3136552 (Nuria) a: elukey
[15:57:48] Analytics: Improve Oozie error emails for testing - https://phabricator.wikimedia.org/T161619#3143856 (Nuria) p: Triage>Normal
[15:58:44] Analytics: Update pivot to latest version - https://phabricator.wikimedia.org/T161630#3143858 (Nuria) p: Triage>Normal
[16:01:12] Analytics, Analytics-Cluster, Operations, hardware-requests: CODFW: 6 Nodes for Kafka refresh/upgrade - https://phabricator.wikimedia.org/T161637#3138006 (Nuria) p: Triage>Normal
[16:02:40] Analytics: Upgrade pivot - https://phabricator.wikimedia.org/T161725#3143873 (Nuria) Open>Invalid
[16:11:03] Analytics-Kanban: Security Upgrade for piwik - https://phabricator.wikimedia.org/T158322#3143894 (Nuria) a: Milimetric>None
[16:11:34] Analytics, Fundraising-Backlog: Storage for banner history data - https://phabricator.wikimedia.org/T161635#3143898 (DStrine) We are checking some points with legal here: T161656
[16:11:54] Analytics-Kanban: Create robots.txt policy for datasets - https://phabricator.wikimedia.org/T159189#3143900 (Milimetric) a: Milimetric
[16:15:10] Analytics, MediaWiki-extensions-WikimediaEvents, The-Wikipedia-Library, Wikimedia-General-or-Unknown, and 4 others: Implement Schema:ExternalLinksChange - https://phabricator.wikimedia.org/T115119#3143902 (Milimetric) The two tables are part of a maintenance on EventLogging, we needed to delete d...
[16:37:29] Analytics, Operations, Documentation: Improve SSH access information in onboarding documentation - https://phabricator.wikimedia.org/T160941#3144005 (Tbayer) >>! In T160941#3143836, @Nuria wrote: > Can we be specific as to what needs improvements to help ops document what is needed? cc @Zareenf, @Tb...
[16:39:02] Analytics-EventLogging, Analytics-Kanban, DBA, Operations: Improve eventlogging replication procedure - https://phabricator.wikimedia.org/T124307#3144009 (Nuria)
[16:44:58] Analytics, Analytics-Dashiki, Patch-For-Review: Create dashboard for upload wizard - https://phabricator.wikimedia.org/T159233#3144111 (Nuria) Mathias: Let's backtrack a little here: seems to me that your case has little to do with analytics but rather error logging, I understand that mediawiki does...
[16:47:34] Analytics, Operations, Documentation: Improve SSH access information in onboarding documentation - https://phabricator.wikimedia.org/T160941#3144130 (Nuria) This task will be seen by the ops on cleaning duty next week. It will help to have a list of issues so they can know what problems the documenta...
[17:29:14] Analytics: Add zero carrier to pageview_hourly data on druid - https://phabricator.wikimedia.org/T161824#3144399 (Nuria)
[17:30:10] team I am shutting off analytics1039
[17:30:18] so Chris will be able to apply thermal paste
[17:39:42] elukey: i love the fact that 1039 is a metal box somewhere, now covered in some goo
[17:41:54] :D
[17:45:24] joal: I think that I killed one of your spark jobs, sorry :(
[17:46:11] ottomata: ping?
[17:46:21] hii
[17:46:35] oh elukey yeah!
[17:46:39] lets do one?
[17:48:36] sure
[17:49:03] in bc if you want
[17:49:10] which one?
[17:49:41] we can do in here too!
[17:49:47] so 1046 could be a good candidate
[17:49:53] next in line for the reimage :)
[17:50:13] ok here is good
[17:50:17] ok
[17:50:25] ok if i just follow https://wikitech.wikimedia.org/wiki/Analytics/Cluster/Hadoop/Administration#Worker_Reimage_.2812_disk.2C_2_flex_bay_drives_-_analytics1028-analytics1057.29
[17:50:26] and ask qs?
[17:50:47] so what I usually do is schedule a bit of downtime and shut down yarn/hdfs, to let jobs drain. Not sure if it is a good procedure, but with the new settings (Yarn stop doesn't kill the containers) it might let computations survive?
[17:50:58] anyhow, then neodymium and type:
[17:51:40] sudo -E wmf-auto-reimage analytics1046.eqiad.wmnet -p T160333
[17:51:41] T160333: Reimage all the Hadoop worker nodes to Debian Jessie - https://phabricator.wikimedia.org/T160333
[17:51:57] it will ask the pwstore management passzorz
[17:52:10] after that it will be following the above guide :)
[17:52:43] elukey: should I just try and see what happens without downtime?
[17:53:08] oh hm
[17:53:13] yarn stop doesn't kill containers
[17:53:14] i get it
[17:53:15] ya i'll do that
[17:53:38] IIRC if the appmaster is running somewhere it might still get the containers' computations
[17:53:43] but I am not sure :(
[17:53:53] aye
[17:54:38] !log stopping hadoop services on analytics1046 for jessie upgrade
[17:54:40] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[17:55:14] ok yeah, so lots of yarn procs still running
[17:55:19] so i'll wait til those disappear
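The waiting step above, sketched as a poll for leftover YARN processes before launching wmf-auto-reimage. The pgrep match on the `yarn` user is an assumption about what the stopped NodeManager leaves behind; adjust for what actually keeps running:

```python
import subprocess
import time

def yarn_procs_running() -> bool:
    # pgrep exits 0 if any process owned by user 'yarn' exists, 1 otherwise.
    return subprocess.run(['pgrep', '-u', 'yarn'],
                          stdout=subprocess.DEVNULL).returncode == 0

while yarn_procs_running():
    time.sleep(60)  # running containers can take a long time to drain
print('no yarn processes left, safe to reimage')
```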
[17:56:29] ah one thing that I wanted to talk with you about
[17:56:39] /var/lib/hadoop/data is not created automagically
[17:57:11] at least, it is not when the disks don't have all the datanode partitions that can be mounted
[17:57:18] so a manual mkdir is needed
[17:58:03] ah right, makes sense, cause puppet creates the path
[17:58:15] you do that after it comes up elukey?
[17:58:42] usually after the first puppet run that breaks when setting up the datanode partitions
[18:02:34] !log an1039 back up again after thermal paste applied
[18:02:35] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[18:03:16] Analytics, Analytics-Cluster, Operations, ops-eqiad, User-Elukey: Analytics hosts showed high temperature alarms - https://phabricator.wikimedia.org/T132256#3144546 (elukey) Chris applied the thermal paste and the host is up and running again. Will watch mcelog during the next days to see if...
[18:05:43] ouch, mw-history-denorm spark job killed, joal will be mad :)
[18:05:49] sorryyyyy
[18:06:13] uuuuh oh
[18:06:15] how'd it happen
[18:06:17] an39?
[18:09:02] yeah probably, there were spark jobs running and I shut down the host since I knew they would have taken a lot of time.. :(
[18:09:21] 1039 is the host showing most of the temp alerts
[18:17:22] ook elukey 0 yarn procs on 1046 now
[18:18:22] running the reimage thang
[18:18:52] Analytics-Kanban, Operations, User-Elukey: Reimage all the Hadoop worker nodes to Debian Jessie - https://phabricator.wikimedia.org/T160333#3144577 (ops-monitoring-bot) Script wmf_auto_reimage was launched by otto on neodymium.eqiad.wmnet for hosts: ``` ['analytics1046.eqiad.wmnet'] ``` The log can b...
[18:19:05] nice :)
[18:19:30] ok so now everything should be handled by the script
[18:19:38] reimage, first puppet run, reboot
[18:19:46] amazing
[18:19:47] and salt check etc..
[18:19:55] it'll verify the salt key?!
[18:20:26] it will call the host via salt
[18:20:29] after signing the key
[18:21:05] so ideally if the script finishes without any issue you'll just ssh in and fix the datanode partitions
[18:23:28] ottomata: brb later on if you need me!
[18:24:39] amazing
[18:24:40] ok cool
[18:25:22] fdans: reportcard looks good! some minor nits
[18:34:49] nuria: would you say that it is safe to no longer bother replicating any of the tables that we didn't rename?
[18:37:11] ottomata: that would mean not replicating tables that get less than 1 event per day (which is how i compiled the initial list of tables to rename)
[18:37:25] hm
[18:37:44] nuria: shouldn't we have just renamed all tables then? if there are tables that are 'active' at less than 1 event per day
[18:37:52] those tables will still have the short varchars
[18:37:57] and also no id field
[18:38:02] (well, maybe no id field)
[18:40:08] ottomata: right, but I'd say that less than 1 event per day is a very uncommon use case, we see that pattern in schemas that are being "replaced" by a new one and thus being phased out
[18:42:06] hm oook
[18:42:43] ottomata: i guess it is possible that we have some tables with very little usage that still have columns of the old length
[18:44:00] ja nuria, i just noticed because the replication script replicates all tables on master -> slave
[18:44:04] and if the table on master does not have an id field
[18:44:08] the script will fail
[18:44:11] if i try to replicate by ids
[18:44:16] so, i'm going to have to be smarter
[18:44:30] i think i can code it to prefer ids, but fall back to timestamp
[18:44:54] ottomata: let's see, I think it is possible that some tables that have less than one event per day will have a column with the old UA length
[18:45:18] ya, the old ua length isn't such a big deal (to me), as stuff will keep working with that
[18:45:23] the data will just be truncated for those ones
[18:45:29] i'm trying to do the id based replication
[18:45:30] ottomata: that did not seem a huge deal as the truncation of UA was not happening every time and things would work on those tables; now, auto increment
[18:45:36] and lots of tables don't have ids :/
[18:45:39] ottomata: by definition those tables have to be real small
[18:45:51] nuria: also
[18:45:55] ottomata: if they receive such a small flow of events
[18:45:57] is data currently purged at all?
[18:45:59] on slaves?
[18:47:46] I think purging has issues so maybe we could remove all data that is old before fixing replication?
[18:50:06] nuria: yah, i just noticed that purging is done by this script, but it is commented out
[18:50:10] so i'm pretty sure it's not happening
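The "prefer ids, fall back to timestamp" idea ottomata describes, as a sketch. Table and column names are illustrative; this is not the real replication script, and the assumption that every EL table carries a `timestamp` column comes from the EventCapsule schema:

```python
def next_batch_query(table, columns, batch_size=10000):
    """Build a parameterized SELECT for the next replication batch:
    id-based when the table has an auto-increment id, else timestamp-based."""
    if 'id' in columns:
        # Strictly greater-than is safe: ids are unique and monotonic.
        return ('SELECT * FROM `%s` WHERE id > %%s ORDER BY id LIMIT %d'
                % (table, batch_size))
    # Fallback for tables without ids: >= plus de-duplication on insert
    # covers several rows sharing the same second-resolution timestamp.
    return ('SELECT * FROM `%s` WHERE timestamp >= %%s '
            'ORDER BY timestamp LIMIT %d' % (table, batch_size))
```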
[19:03:14] ottomata: o/
[19:03:20] I saw that the reimage went fine
[19:03:25] https://phabricator.wikimedia.org/T160333#3144625
[19:03:56] ya!
[19:03:58] following things now
[19:04:02] chowning...
[19:04:26] goood! it will take a while :(
[19:04:52] oh ya lots!
[19:05:22] elukey: that would go a lot faster if you added a & to background each letter search
[19:05:38] ah this is a good point!
[19:05:50] didn't think about it!
[19:06:28] will try next run then, thanks!
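elukey's "&" tip, sketched in Python rather than the shell one-liner: fan out one recursive chown per top-level directory and block until all of them finish. The directory layout and ownership are placeholders:

```python
import glob
import subprocess
from concurrent.futures import ThreadPoolExecutor

def chown_tree(path):
    # One recursive chown per top-level dir, like `chown -R ... "$d" &` in bash.
    subprocess.run(['chown', '-R', 'hdfs:hdfs', path], check=True)

dirs = glob.glob('/var/lib/hadoop/data/*/')  # placeholder layout
with ThreadPoolExecutor(max_workers=8) as pool:
    list(pool.map(chown_tree, dirs))  # the `wait` step: block until all done
```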
[19:07:02] ottomata: all right I am logging off then!
[19:09:59] laters!
[19:33:04] elukey: I am indeed somehow unhappy, but it's for a good cause :) Jessie EVERYWHERE !
[19:34:44] elukey: Restarted :)
[19:50:55] milimetric: super good work with those formatters on dygraphs, i was able to figure out what was wrong and change to "kmb" real fast
[19:53:25] cool. Vue has "filters" so like if you have {{value}} you can be like {{value | kmb}}
[19:53:55] pretty elegant, and you can have a function return the filter based on config
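The actual "kmb" filter in Dashiki is JavaScript; here is the formatting logic it implies, sketched in Python as a hypothetical helper rather than the project's real implementation:

```python
def kmb(value):
    """Abbreviate large numbers: 1234 -> '1.2k', 1234567 -> '1.2m'."""
    for threshold, suffix in ((1e9, 'b'), (1e6, 'm'), (1e3, 'k')):
        if abs(value) >= threshold:
            return '%.1f%s' % (value / threshold, suffix)
    return str(value)

assert kmb(1234567) == '1.2m'
```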
Vue has "filters" so like if you have {{value}} you can be like {{value | kmb}} [19:53:55] pretty elegant, and you can have a function return the filter based on config [19:58:42] nuria: I think the dashiki work would be so much more useful with actual documentation of the layouts. But I don't love the layouts as they are, lots of bugs. So I was thinking: one week to re-shape the configs, then two days to document them properly. Let me know if you think that fits into our plan. While I think it would be nice, we can get by fine [19:58:42] with just me making dashboards for people whenever they need them [19:59:58] milimetric: mmmm.. I think they are pretty good, i do not think layouts need much doc besides readme meta config and nice code [21:38:53] Hi! I finally logged into stat1003 to get some event logging data but I am not sure how to access it. I first checked grafana but https://grafana.wikimedia.org/dashboard/db/eventlogging-schema?refresh=5m&orgId=1 does not list CookieBlock schema in the drop-down. [21:41:32] Oh, I found the sample query on https://wikitech.wikimedia.org/wiki/Analytics/EventLogging#MariaDB. That's helpful. Thanks!