[09:49:25] morning elukey! :D Where in puppet is it defined which roles / profiles are on which stat box?
[09:49:55] addshore: o/
[09:50:06] I need to split up some of the wmde stuff to run on separate hosts, some on 1005 and some on 1004
[09:50:17] I usually start from site.pp and then check out what roles are applied
[09:50:21] okay!
[09:50:27] let me know if you need help!
[09:50:48] ahh yes, so our current stuff is in statistics::private which is included on 1005
[09:51:36] elukey: where would be the best place for stuff for 1004 then? as that only has role(analytics_cluster::client, analytics_cluster::refinery)
[09:51:48] maybe I should refactor my wmde stuff out of statistics::private ? or ? :/
[09:54:29] addshore: can I ask what your plans are on splitting things? Just to have an idea
[09:54:44] anyhow, I think a profile would be the best, I might need to do some refactoring first though
[09:54:57] so we'll have only one role for each stat* box
[09:55:28] Yup, so the stuff in the graphite manifest can stay on 1005
[09:55:46] then the new wdcm stuff will be split between 1004 and 1005, there will be 1 R script to run on each box afaik
[09:55:57] so the wmde user should be on both machines
[09:56:13] also the wdcm code should be on both machines, with different crons set up
[09:56:42] the mysql details should be included in both the wdcm and graphite manifest on 1005
[09:57:16] I'm willing to take a crack at any refactoring if you give me some rough guidelines :)
[09:58:33] addshore: there are a lot of things in progress that you might need to dive deep into for this refactoring, I don't want to put you in this position :)
[09:58:43] hehe :)
[09:58:44] we are trying to slowly move to roles/profiles
[09:58:53] lemme try to check how hard it is
[09:59:29] I think the first thing that would make sense is to get 2 profiles / roles set up, and make the wmde user appear on both machines
[09:59:42] and from there I guess I should be able to take over / make some drafts at least for review
[10:01:09] is stat1004 the correct one to choose? From site.pp it seems preferred for accessing hadoop stuff
[10:01:27] stat100[56] are more data crunchers
[10:01:32] so apparently this r code is doing some sqoop stuff which can't run from 1005
[10:01:38] ah okok
[10:01:40] *finds the ticket*
[10:03:41] I guess the last line of this comment https://phabricator.wikimedia.org/T171258#3772815
[10:05:55] ah snap because of java 8
[10:06:15] this is so annoying
[10:06:43] if we had time we'd have already migrated hadoop to java 8 and this wouldn't be an issue :D
[10:07:05] stat1004 is jessie, meanwhile 100[56] (newer) are stretch
[10:07:22] the latter two run java 8 and we have java 7 on the hadoop cluster
[10:14:12] addshore: need to finish one thing for druid, do you mind if I attempt a refactoring after lunch?
[10:14:18] (not sure how urgent this thing is)
[10:14:28] after that you should be able to create a profile and just include it
[10:14:58] yup that sounds fine! :)
[10:15:02] Thanks!!
[11:08:36] Hi all. Is there a replica of the Elasticsearch enwiki index with public read access on labs or somewhere else? I want to run some ES queries that can be done with the MediaWiki search API.
[11:09:44] mschwarzer_: hi! Probably better to ask in #wikimedia-discovery
[11:09:50] mschwarzer_: http://en-wp-ltr-0617-relforge.wmflabs.org/wiki/Main_Page
[11:09:59] not very up to date
[11:10:09] or wait for the super dcausse to answer in here :P
[11:10:14] :)
[11:10:41] and it has only namespace 0
[11:11:44] dcausse thanks! How can I access the ES REST API?
[11:12:18] mschwarzer_: relforge1001.eqiad.wmnet:9248 should be accessible from labs
[11:12:27] with https
[11:13:02] this index is en-wp-ltr-0617_content
[11:14:43] err: https://relforge1001.eqiad.wmnet:9243/ is the proper endpoint for sending queries directly to elastic
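
For reference, a direct query like the one discussed here can be sent straight to that endpoint. Below is a minimal Python sketch assuming the requests library; the endpoint and index name are the ones dcausse gives above, while the "text" field name, the seed text, and skipping certificate verification are illustrative guesses, not anything confirmed in the channel.

    # Minimal sketch of a morelikethis query sent directly to relforge.
    # Endpoint and index come from the conversation above; the "text" field,
    # the like-text, and verify=False (internal certificate) are assumptions.
    import requests

    RELFORGE = "https://relforge1001.eqiad.wmnet:9243"
    INDEX = "en-wp-ltr-0617_content"

    query = {
        "query": {
            "more_like_this": {
                "fields": ["text"],              # field name is a guess
                "like": "information retrieval", # any seed text or document
                "min_term_freq": 1,
                "min_doc_freq": 1,
            }
        },
        "size": 5,
    }

    resp = requests.post(RELFORGE + "/" + INDEX + "/_search",
                         json=query, verify=False)
    for hit in resp.json()["hits"]["hits"]:
        print(hit["_score"], hit["_source"].get("title"))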
[11:17:46] dcausse wow! it works. thanks. you saved me a lot of work
[11:17:53] yw!
[11:23:07] mschwarzer_: beware that this cluster is not meant for building stable tools, only for experimentation (we may break it from time to time)
[11:24:41] of course.. i need it only for some tests with the morelikethis query
[12:24:45] and after this morning of happiness and love with Python 2/3 string differences, I go to lunch
[13:31:16] joal: o/
[13:36:28] Hi elukey
[13:36:31] How are you?
[13:36:46] goooood
[13:36:53] all good from your side?
[13:36:56] I've seen the snake caused you some trouble?
[13:37:12] it is my brain that caused troubles, not python
[13:37:14] yessir - Some more investigations, not yet as good findings, but at least moving
[13:37:15] as usual
[13:37:19] huhu
[13:37:26] :D
[13:37:37] buuuut the druid_exporter is deployed in stretch-wikimedia!
[13:37:47] elukey: This is the problem with computers: most of the time, they do what they're told - therefore WE are the problem !
[13:37:52] hahahaha
[13:37:53] Yay !!!
[13:37:55] so true
[13:38:05] https://gerrit.wikimedia.org/r/#/c/392052/3 is the code change to enable the http emitters
[13:38:28] Awesome
[13:39:31] elukey: the AAAA in ferm range seems such a secret code to me :)
[13:40:10] elukey: Kudos, that path for deployment is very clean - means the underneath code must be as well :)
[13:43:12] the ferm stuff is usually copied/pasted from some pre-existing code :D
[13:43:27] Which is close to secret coding then ;)
[13:51:29] joal: if you have time we can merge + restart the druid puppet code and see how it goes
[13:51:39] works for me elukey
[13:51:52] elukey: it'll divert me from metrics vetting ::sigh::
[13:52:03] joal: ah no sorry!
[13:52:11] please keep going
[13:52:14] elukey: let's do that yes ;)
[13:53:36] joal: sure? I can apply it to druid100[456] and restart daemons only there, should be ok and won't need coordination :)
[13:54:38] awesome elukey
[13:54:42] let's start with that
[13:55:32] all right proceeding
[14:00:11] hi joal, qq: is there any way to create temporary tables in the SQL databases
[14:00:13] (analytics-store.eqiad.wmnet)
[14:00:19] ?
[14:00:49] or any other way to filter a query using an external .csv file
[14:00:59] dsaez: I have no clue, I don't use DBs -- I think elukey knows more
[14:01:26] hi elukey, any idea?
[14:01:58] dsaez: you can use the staging database, as far as I know that one is the playground for research
[14:03:27] elukey: great! thanks
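
A rough sketch of the staging-database route elukey suggests: load the .csv into a helper table in staging, then join against it instead of pasting a huge IN (...) list into the query. The table and column names here are invented for illustration, and pymysql plus credentials in ~/.my.cnf are assumptions.

    # Sketch: filter a query by an external .csv via a helper table in staging.
    # All table/column names are hypothetical; assumes pymysql and ~/.my.cnf.
    import csv
    import os
    import pymysql

    conn = pymysql.connect(host="analytics-store.eqiad.wmnet",
                           db="staging",
                           read_default_file=os.path.expanduser("~/.my.cnf"))
    with conn.cursor() as cur:
        cur.execute("""CREATE TABLE IF NOT EXISTS my_page_filter (
                           page_id INT UNSIGNED PRIMARY KEY
                       )""")
        with open("pages.csv") as f:
            rows = [(int(r[0]),) for r in csv.reader(f)]
        cur.executemany("INSERT IGNORE INTO my_page_filter VALUES (%s)", rows)
        # join against the helper table to filter the real query
        cur.execute("""SELECT p.page_id, p.page_title
                       FROM enwiki.page p
                       JOIN staging.my_page_filter f USING (page_id)
                       LIMIT 10""")
        for row in cur.fetchall():
            print(row)
    conn.commit()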
[14:09:24] joal: elukey@druid1004:/var/log/druid$ curl localhost:8000/metrics -s | grep -v "#" (if you want to check :)
[14:10:27] Mwaaaaahaha :)
[14:10:31] Works for me elukey :)
[14:10:46] \o/ \o/
[14:11:39] I've only restarted the broker on 1004, going to do the same with coordinator and historical
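
The same spot check can be done from Python, which is handy for diffing counters before and after a daemon restart. A best-effort sketch, roughly equivalent to the curl | grep above; it only handles simple, unlabelled samples in the Prometheus text format.

    # Sketch: fetch the druid_exporter metrics and skip comment lines,
    # roughly `curl -s localhost:8000/metrics | grep -v "#"`.
    from urllib.request import urlopen

    def fetch_metrics(url="http://localhost:8000/metrics"):
        """Return {metric_name: value} for simple, unlabelled samples."""
        metrics = {}
        for line in urlopen(url).read().decode("utf-8").splitlines():
            if line.startswith("#") or not line.strip():
                continue  # skip HELP/TYPE comments and blank lines
            name, _, value = line.rpartition(" ")
            metrics[name] = float(value)
        return metrics

    for name, value in sorted(fetch_metrics().items()):
        print(name, value)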
[14:11:59] hello :]
[14:12:05] Hi mforns :)
[14:17:12] mforns: o/
[14:17:23] the staging db has been copied from db1047 to db1108
[14:17:37] if research confirms that the other dbs are not needed, we are good :)
[14:23:02] elukey, awesome
[14:39:03] elukey: just a reminder poke :) ;)
[14:40:17] addshore: hey, I didn't forget but I didn't have the time to follow up due to other tasks :( I'll try later on and will ping you when ready
[14:40:59] hehe, that's fine :) just wanted to make sure you weren't just sitting there wanting something to do ;)
[14:50:05] morning ottomata !
[14:51:33] joal: found a bug! https://gerrit.wikimedia.org/r/#/c/392424 - whenever you have time let me know if my commit msg makes sense
[14:53:46] hiii
[14:54:22] o/
[14:58:47] ottomata: whenever you have time lemme know, I have a naming question to ask you :)
[15:00:11] elukey: I must say the headline of the commit message is not super explicit :) But the commit message makes it clearer
[15:03:32] joal: does it make sense to drop those metrics? I never thought about it but now it seems a good moment :)
[15:07:24] elukey: reading your commit message, I assume we have them already -- And I can't foresee a specific use as of now
[15:08:28] joal: yes I can introduce those metrics to you as a PS, they might be useful but I don't really see their value now :(
[15:08:43] ok elukey sounds good
[15:25:07] addshore: the refactoring is not easy, I need to sync with andrew first and figure out how to do it.. will ask during standup :)
[15:26:07] mforns: do you have a minute?
[15:26:39] elukey: okay! :D
[15:26:53] mforns: I am planning to merge https://gerrit.wikimedia.org/r/#/c/391828/2/modules/profile/manifests/mariadb/misc/eventlogging/replication.pp today if possible
[15:33:44] elukey, cool, do we have a start-ts file already in place? or will it create a new one?
[15:34:58] elukey: yes! hello let's name things!
[15:34:59] :)
[15:35:00] what's up?
[15:35:03] mforns: nope we need to put one in place
[15:35:21] k
[15:35:47] ottomata: I'll bother you during standup :D
[15:36:41] ok! :)
[15:45:14] !log deploying fixes to EL EventCapsule discrepancies: https://phabricator.wikimedia.org/T179625#3755242
[15:45:16] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[15:59:30] Analytics-Kanban, Patch-For-Review, User-Elukey: Improve purging for analytics-slave data on Eventlogging - https://phabricator.wikimedia.org/T156933#3774716 (jcrespo) Probably I am stating the obvious, but some throttling could be disabled if it only affects the new hosts (not dbstore1002), as they sho...
[16:01:13] a-team standup!
[16:01:20] oh right
[16:16:21] hello there! not sure this is the right channel but I'll give it a try :) can someone tell me if the pixels (actual image files) of commons images are easily accessible from the stats machines?
[16:18:55] mforns: hey ya i'm deploying the timestamp changes for the EL data
[16:19:02] so I have to wipe all previously imported data to avoid conflicts
[16:19:27] ok ok, yea np, will adapt the code to parse the new formats today
[16:19:31] ok
[16:19:55] afaik, the popups experiment has stopped, and they've stopped emitting (lots?) of popup data now
[16:20:08] so, if you want to import historical data into druid, you probably should do it from tbayer's table
[16:20:20] but, make it a different datasource than the one you might automate
[16:20:23] oh! ok
[16:20:26] tbayer_popups or something
[16:20:45] ottomata, I don't think we'll automate it for now, no?
[16:20:47] ok
[16:21:01] the idea was more to have a vertical proof of concept
[16:21:03] well, in case we do, you still might not want to name it just 'popups' in druid
[16:21:04] aye
[16:21:05] ok
[16:49:43] Analytics-Kanban: Check data from new API endpoints agains existing sources - https://phabricator.wikimedia.org/T178478#3774933 (Ottomata)
[16:51:02] Analytics: Some fields in Pivot should be numbers - https://phabricator.wikimedia.org/T167494#3774936 (Ottomata)
[16:55:18] Analytics: Create LVS endpoing for druid-public-overlord (for oozie job indexing) - https://phabricator.wikimedia.org/T180971#3774951 (Ottomata)
[16:55:31] Analytics-Kanban, Analytics-Wikistats, Patch-For-Review: Create Druid public cluster such AQS can query druid public data - https://phabricator.wikimedia.org/T176223#3774967 (Ottomata)
[16:55:55] Analytics-Kanban, Analytics-Wikistats, Patch-For-Review: Create Druid public cluster such AQS can query druid public data - https://phabricator.wikimedia.org/T176223#3617909 (Ottomata) Moved 'Create LVS endpoing for druid-public-overlord (for oozie job indexing)' to its own task so we can close this...
[16:59:30] Analytics-Kanban, Patch-For-Review: Provide oozie job running ClickStream spark job regularly - https://phabricator.wikimedia.org/T175844#3774995 (Ottomata)
[17:00:32] Analytics, Performance-Team, Patch-For-Review: Explore NavigationTiming by faceted properties - EventLogging refine - https://phabricator.wikimedia.org/T166414#3775000 (Ottomata)
[18:28:55] !log deployed prometheus-druid-exporter (still not released in apt) on druid1004 for testing
[18:28:56] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[18:28:58] ottomata: --^
[18:29:08] argh forgot the 0.4 bit
[18:29:33] I'll leave it running for the next hours and then complete the work tomorrow morning :)
[18:29:40] seems running fine
[18:30:22] * elukey off!
[18:30:24] byyeeee
[18:31:23] :)
[18:34:34] super useful data from a few minutes of grabbing metrics
[18:34:36] druid_broker_query_cache_timeouts_count 0.0
[18:34:36] druid_broker_query_cache_errors_count 0.0
[18:34:36] druid_broker_query_cache_misses_count 188.0
[18:34:36] druid_broker_query_cache_hits_count 3952.0
[18:34:38] druid_broker_query_cache_sizebytes_count 30682.0
[18:34:41] druid_broker_query_cache_evictions_count 0.0
[18:34:43] druid_broker_query_cache_numentries_count 186.0
[18:34:46] this is from druid1004
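
For what it's worth, those counters already give a broker cache hit rate; a quick back-of-the-envelope check on the numbers pasted above:

    # Quick arithmetic on the druid1004 broker cache counters above.
    hits, misses = 3952.0, 188.0
    print("cache hit rate: {:.1%}".format(hits / (hits + misses)))   # ~95.5%
    print("avg cached entry: {:.0f} bytes".format(30682.0 / 186.0))  # ~165 bytes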
[18:46:44] hey joal, how about revisions on pages that do not have interwiki links?
[18:46:57] wasn't that a thing in wikistats1
[18:47:17] ?
[18:47:31] mforns: not sure about interwiki links, but any link IIRC
[18:47:46] aha
[18:48:30] Like revisions without any link ... On the edits page (https://stats.wikimedia.org/EN/TablesDatabaseEdits.htm) it's not noted though
[18:48:44] aha
[18:49:07] but... why would that affect es, ja, ru more than en? doesn't make much sense
[18:49:46] joal, and wikistats1-blacklisted bots?
[18:50:03] no bots filtered on that metric
[18:50:10] oh ok
[18:55:08] joal, eswiki has its images in commons, but enwiki doesn't
[18:55:49] all enwiki images come from enwiki...
[18:56:06] seems
[18:56:24] by the 90-second research I did :P
[18:57:56] mforns: no flagged anything in spanish from what I see
[18:57:58] japanese has its images also on jawiki
[18:58:21] aha
[19:08:37] Gone for tonight a-team, fed up with stats :(
[19:08:58] ok joal, cya!
[19:09:49] laterrrs
[20:37:34] Analytics-EventLogging, Analytics-Kanban, Patch-For-Review: Resolve EventCapsule / MySQL / Hive schema discrepancies - https://phabricator.wikimedia.org/T179625#3775786 (Ottomata) ALllriiight! Capsule changes all deployed and good. Json Refine jobs restarted and moving along nicely. Documentation...
[20:38:08] Analytics-EventLogging, Analytics-Kanban, Patch-For-Review: Resolve EventCapsule / MySQL / Hive schema discrepancies - https://phabricator.wikimedia.org/T179625#3775787 (Ottomata) Remaining task is to send an announcement. Let's let this go until next week and announce then.
[20:39:28] Analytics-EventLogging, Analytics-Kanban, Patch-For-Review: Implement EventLogging Hive refinement - https://phabricator.wikimedia.org/T162610#3775796 (Ottomata) Looking goooood! Remaining tasks: - purging in Hadoop after 90 days. We'll want to get smart purging in as a new task. - send an announc...
[21:04:55] hey ottomata
[21:05:27] sorry, I've been in a series of back to back meetings and will still be for a while :-/
[21:05:39] let me check if one of my team members can help out
[21:15:41] ow heeeeey ottomata. ;) happy to help. Call in Hangout whenever.
[21:17:32] hiii
[21:17:32] ok
[21:29:54] Analytics, Analytics-EventLogging, Performance-Team: Make webperf eventlogging consumers use eventlogging on Kafka - https://phabricator.wikimedia.org/T110903#3775945 (Krinkle) stalled>Open Un-blocking this from T175087 as that will take longer to complete. In the interim, I'll prioritise ge...
[21:29:59] Analytics, Analytics-EventLogging, Performance-Team: Make webperf eventlogging consumers use eventlogging on Kafka - https://phabricator.wikimedia.org/T110903#3775949 (Krinkle)