[00:34:36] 10Quarry, 10Discovery, 10VPS-Projects, 10Wikidata, and 2 others: Setup sparqly service at https://sparqly.wmflabs.org/ (like Quarry) - https://phabricator.wikimedia.org/T104762 (10Smalyshev) Not exactly Quarry, but see https://commons.wikimedia.org/wiki/User:TabulistBot - this should be similar to Listeri...
[00:46:46] 10Analytics-Data-Quality, 10WMDE-Analytics-Engineering, 10User-GoranSMilovanovic: Data set review for the Wiktionary Cognate Dashboard - https://phabricator.wikimedia.org/T199851 (10GoranSMilovanovic)
[07:14:27] (03CR) 10Jforrester: "🤷🏽‍♂️" [analytics/refinery] - 10https://gerrit.wikimedia.org/r/446399 (https://phabricator.wikimedia.org/T188776) (owner: 10Reedy)
[08:46:29] sorry the oozie failures are due to me
[08:46:33] I need to reboot an1030
[08:48:15] !log re-run hour 7 of webrequest upload/text via Hue (failed due to a hadoop node restart)
[08:48:16] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[09:02:22] 10Analytics-Tech-community-metrics, 10Developer-Advocacy: Consider enabling GitHub backend in wikimedia.biterg.io to cover canonical Wikimedia repositories not in Gerrit - https://phabricator.wikimedia.org/T186736 (10Aklapper) Correcting the actual scope of this task (Bitergia to deploy Bestiary; Wikimedia to...
[09:02:49] 10Analytics-Tech-community-metrics, 10Developer-Advocacy: Track canonical Wikimedia repositories on Github in wikimedia.biterg.io - https://phabricator.wikimedia.org/T186736 (10Aklapper)
[09:34:22] fdans neilpquinn https://phabricator.wikimedia.org/T183145#4127859
[11:00:50] * elukey lunch!
[11:02:49] 10Analytics-Kanban: [EL sanitization] Write and productionize script to drop partitions older than 90 days in events database - https://phabricator.wikimedia.org/T199836 (10mforns)
[11:03:43] 10Analytics, 10Analytics-Kanban: [EL sanitization] Add ability to salt and hash to eventlogging sanitization in Hive - https://phabricator.wikimedia.org/T198426 (10mforns)
[11:05:37] 10Analytics, 10Analytics-Kanban, 10Patch-For-Review: [EL sanitization] Productionize EventLoggingSanitization.scala - https://phabricator.wikimedia.org/T193176 (10mforns)
[11:06:31] 10Analytics: [EL sanitization] Make WhitelistSanitization support arrays of structs, maps or other arrays - https://phabricator.wikimedia.org/T199230 (10mforns)
[11:11:54] 10Analytics, 10Analytics-EventLogging, 10Analytics-Kanban: EventLogging sanitization - https://phabricator.wikimedia.org/T199898 (10mforns)
[11:12:07] 10Analytics, 10Analytics-EventLogging, 10Analytics-Kanban: EventLogging sanitization - https://phabricator.wikimedia.org/T199898 (10mforns)
[11:12:33] 10Analytics, 10Analytics-EventLogging, 10Analytics-Kanban: EventLogging sanitization - https://phabricator.wikimedia.org/T199898 (10mforns) a:03mforns
[11:13:37] 10Analytics-EventLogging, 10Analytics-Kanban, 10Patch-For-Review: Remove AppInstallIId from EventLogging purging white-list - https://phabricator.wikimedia.org/T178174 (10mforns)
[11:13:39] 10Analytics, 10Analytics-EventLogging, 10Analytics-Kanban: EventLogging sanitization - https://phabricator.wikimedia.org/T199898 (10mforns)
[11:18:30] 10Analytics-EventLogging, 10Analytics-Kanban, 10Patch-For-Review: [EL sanitization] Remove AppInstallIId from EventLogging purging white-list - https://phabricator.wikimedia.org/T178174 (10mforns)
[11:19:15] 10Analytics-Kanban: [EL sanitization] Write and productionize script to drop partitions older than 90 days in events database - https://phabricator.wikimedia.org/T199836 (10mforns)
[11:19:16] 10Analytics: [EL sanitization] Make WhitelistSanitization support arrays of structs, maps or other arrays - https://phabricator.wikimedia.org/T199230 (10mforns)
[11:19:18] 10Analytics, 10Analytics-Kanban: [EL sanitization] Add ability to salt and hash to eventlogging sanitization in Hive - https://phabricator.wikimedia.org/T198426 (10mforns)
[11:19:20] 10Analytics, 10Analytics-EventLogging, 10Analytics-Kanban: EventLogging sanitization - https://phabricator.wikimedia.org/T199898 (10mforns)
[11:22:55] 10Analytics, 10Analytics-EventLogging: [EL Sanitization] Set up salt creation and rotation - https://phabricator.wikimedia.org/T199899 (10mforns)
[11:23:20] 10Analytics, 10Analytics-EventLogging: [EL Sanitization] Set up salt creation and rotation - https://phabricator.wikimedia.org/T199899 (10mforns)
[11:23:22] 10Analytics, 10Analytics-EventLogging, 10Analytics-Kanban: EventLogging sanitization - https://phabricator.wikimedia.org/T199898 (10mforns)
[11:26:13] 10Analytics, 10Analytics-EventLogging: [EL sanitization] Store the old salt for 2 extra weeks - https://phabricator.wikimedia.org/T199900 (10mforns)
[11:26:26] 10Analytics, 10Analytics-EventLogging: [EL sanitization] Store the old salt for 2 extra weeks - https://phabricator.wikimedia.org/T199900 (10mforns)
[11:30:35] 10Analytics, 10Analytics-EventLogging: [EL sanitization] Retroactively hash and salt appInstallId fields in the event_sanitized database - https://phabricator.wikimedia.org/T199902 (10mforns)
[13:00:41] PROBLEM - Check if active EventStreams endpoint is delivering messages. on scb2002 is CRITICAL: CRITICAL: No EventStreams message was consumed from http://scb2002.codfw.wmnet:8092/v2/stream/recentchange within 10 seconds.
[13:01:17] mmm
[13:03:15] elukey, I'm reading wikitech docs to see what I can do
[13:03:22] nono I am working on it mforns
[13:03:27] it is part of an investigation
[13:03:29] don't worry
[13:03:30] oh ok
[13:03:34] not sure what's happening though
[13:03:35] cool
[13:03:47] need a rubber duck?
[13:04:57] nono thanks I'll explain during standup, it might be working as expected..
[13:05:18] basically it seems that an ES consumer can fetch/buffer up to a GB of messages if the consumer is lagging
[13:05:30] aha
[13:05:33] and this seems to be the root cause for ES in codfw consuming a ton of memory
[13:05:36] so we are trying to tune it
[13:05:44] k
[13:30:47] RECOVERY - Check if active EventStreams endpoint is delivering messages. on scb2002 is OK: OK: An EventStreams message was consumed from http://scb2002.codfw.wmnet:8092/v2/stream/recentchange within 10 seconds.
[13:35:55] \o/
[13:50:33] 10Analytics, 10User-Elukey: Add a safe failover for analytics1003 - https://phabricator.wikimedia.org/T198093 (10elukey) Any comment about my last entry? Sorry to ping you guys, maybe a quick meeting between the three of us would be better?
[13:58:39] 10Analytics, 10Analytics-Kanban, 10User-Elukey: Move piwik's database to db1108 (eventlogging slave) - https://phabricator.wikimedia.org/T198217 (10elukey) 05Open>03declined Declining this for the moment, it doesn't seem a good way to go.
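[Editor's note: the per-consumer buffering elukey describes matches librdkafka's consumer queue behaviour, where `queued.max.messages.kbytes` in older releases defaulted to roughly 1 GB, and T199813 above mentions fixing the receive buffer to 64MB. The sketch below uses real librdkafka property names, but the values are illustrative assumptions, not the tuning that was actually deployed:]

```properties
# Illustrative librdkafka consumer settings to cap how much a lagging
# consumer may pre-fetch (values are examples, not the T199813 fix).

# Total size of locally queued, not-yet-consumed messages per consumer.
# Older librdkafka releases defaulted this to ~1 GB.
queued.max.messages.kbytes=65536

# Maximum bytes fetched from a single topic+partition per request.
fetch.message.max.bytes=1048576

# Minimum number of messages librdkafka tries to keep pre-fetched
# in the local queue per topic+partition.
queued.min.messages=10000
```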
[14:26:52] (03PS2) 10Mforns: Fix case insensibility for MapMaskNodes in WhitelistSanitization [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/443508 (https://phabricator.wikimedia.org/T193176)
[14:27:55] (03PS1) 10Mforns: [WIP] Add ability to salt and hash to eventlogging sanitization [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/446592 (https://phabricator.wikimedia.org/T198426)
[14:28:37] (03CR) 10Mforns: [C: 04-1] "Still needs testing" [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/446592 (https://phabricator.wikimedia.org/T198426) (owner: 10Mforns)
[14:38:32] hi :)
[14:38:47] (03CR) 10jerkins-bot: [V: 04-1] [WIP] Add ability to salt and hash to eventlogging sanitization [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/446592 (https://phabricator.wikimedia.org/T198426) (owner: 10Mforns)
[14:39:09] hi milimetric :]
[14:39:26] hey mforns!
[14:39:28] o/
[14:39:52] o/ Luca
[14:51:47] 10Analytics, 10EventBus, 10Operations, 10Wikimedia-Stream, and 4 others: EventStreams accumulates too much memory on SCB nodes in CODFW - https://phabricator.wikimedia.org/T199813 (10mobrovac) p:05Unbreak!>03High a:03mobrovac Lowering the priority as fixing the receive buffer size to 64MB should do t...
[15:34:16] (03PS1) 10Sahil505: Improved wikistats2 map zoom [analytics/wikistats2] - 10https://gerrit.wikimedia.org/r/446610 (https://phabricator.wikimedia.org/T198867)
[15:36:43] (03CR) 10Sahil505: [C: 04-1] "WIP : Still need to work on diagonal zoom." (031 comment) [analytics/wikistats2] - 10https://gerrit.wikimedia.org/r/446610 (https://phabricator.wikimedia.org/T198867) (owner: 10Sahil505)
[16:00:10] a-team: standddupppp
[16:21:20] (03PS2) 10Sahil505: Improved wikistats2 map zoom [analytics/wikistats2] - 10https://gerrit.wikimedia.org/r/446610 (https://phabricator.wikimedia.org/T198867)
[16:25:25] (03CR) 10Sahil505: [C: 031] "1. Patch 1 fixes the first point of the commit description (please read inline comment in patch 1 for exact location of code that fixes th" [analytics/wikistats2] - 10https://gerrit.wikimedia.org/r/446610 (https://phabricator.wikimedia.org/T198867) (owner: 10Sahil505)
[16:30:52] 10Analytics, 10User-Elukey: Add a safe failover for analytics1003 - https://phabricator.wikimedia.org/T198093 (10Marostegui) >>! In T198093#4407109, @elukey wrote: >>>! In T198093#4407099, @jcrespo wrote: >>> This should be ok since we already have some dbproxies whitelisted in the analytics vlan's firewall, s...
[16:36:58] I'm biking home, will be back in time for SoS
[16:39:23] bye team, see you later
[17:04:06] * elukey off!
[17:08:46] 10Analytics-Kanban: quantifyvolume of traffic on piwik with DNT header set - https://phabricator.wikimedia.org/T199928 (10Nuria)
[17:09:00] 10Analytics-Kanban: quantifyvolume of traffic on piwik with DNT header set - https://phabricator.wikimedia.org/T199928 (10Nuria)
[17:12:59] 10Analytics-Kanban: Quantify volume of traffic on piwik with DNT header set - https://phabricator.wikimedia.org/T199928 (10Nuria)
[18:23:05] nuria_ are you available to supervise as I try to ingest some data into druid? I have an ingestion spec and data ready to go
[18:24:08] nuria_: *will you be available today
[18:24:33] bearloga: sorry on interviews and meetings for next couple hours
[18:26:50] nuria_: np! do you know if anyone else would be available today? if not, can I schedule something for you and me for tomorrow?
[18:29:11] bearloga: sorry i will be out for a while from tomorrow onwards, our team works mostly on EU if you ping the channel earlier in the morning you can probably get help easily.
[18:35:34] 10Analytics, 10ChangeProp, 10MediaWiki-JobQueue, 10Operations, and 2 others: Consider the possibility of separating ChangeProp and JobQueue on Kafka level - https://phabricator.wikimedia.org/T199431 (10herron) p:05Triage>03Normal
[18:36:12] 10Analytics, 10ChangeProp, 10Operations, 10Services (designing), 10Wikimedia-Incident: Separate dev Change-Prop from production Kafka cluster - https://phabricator.wikimedia.org/T199427 (10herron) p:05Triage>03Normal
[19:02:16] I tried submitting an indexing task with curl to ingest hdfs://analytics-hadoop/user/bearloga/wmf_gsc/output.csv into druid. didn't get an error or a message so that's a good sign, I suppose :) will check in with someone tomorrow morning if I don't see the new "gsc_all" datasource show up in turnilo/superset by then
[19:25:22] 10Analytics, 10Readers-Web-Backlog: Problems with external referrals? - https://phabricator.wikimedia.org/T195880 (10Nuria) Of interest: https://caniuse.com/#feat=referrer-policy
[20:20:42] back
[21:23:57] bearloga: back, let me see about your question
[21:31:20] bearloga: no, i do not see your datasource indexed
[21:32:24] bearloga: maybe you want to paste here your injection spec to get help
[21:32:26] ?
[21:32:59] nuria_: sure! thanks!
[21:33:15] bearloga: your datasource needs to be suffixed with _test
[21:33:23] otherwise it will not index: To prevent reindexing production data, a non-hdfs user will be prevented to index datasources with name not starting with "test_"
[21:33:33] bearloga: https://wikitech.wikimedia.org/wiki/Analytics/Systems/Druid#Regular_indexations_through_Oozie
[21:34:05] bearloga: ah, no wait, cause you are not using oozie right?
[21:34:13] bearloga: you are posting directly?
[21:34:42] nuria_: yup, posting directly. productionizing with oozie will be next step after I return from vacation
[21:36:07] bearloga: your data is not json though
[21:36:14] bearloga: it is a cvs right?
[21:36:49] bearloga: ah yes, i see it above "csv"
[21:37:17] nuria_: correct https://gist.github.com/bearloga/c311cdcd3a61f4435b4b006cf119c30e
[21:40:23] nuria_: I modified metrics specs after my initial attempt but otherwise it's the same ingestion spec
[21:47:06] bearloga: i really cannot see your indexation anywhere, you might want to try again
[21:47:25] bearloga: did you get a 200 code when you submitted it?
[21:47:35] bearloga: i can monitor while you try
[21:48:22] nuria_: I actually didn't get anything. I'll try it with -v this time. here's the command I'm using: `curl -X POST -H "content-type: application/json" -d@druid-spec_country-all.json http://druid1001.eqiad.wmnet:8090/druid/indexer/v1/task`
[21:49:59] nuria_: is the overlord url correct? I was using http://druid.io/docs/0.12.1/tutorials/tutorial-batch.html + https://github.com/wikimedia/analytics-refinery/blob/master/oozie/pageview/druid/daily/coordinator.properties for reference
[21:50:47] I want to make sure the command is correct before I try again
[21:51:55] bearloga: a 200 indicates job was received, command looks fine
[21:52:17] bearloga: let me make sure it needs to be posted to 1001
[21:53:32] bearloga: ya 8090 on druid 1001
[21:53:44] bearloga: but ... where are you running your command from?
[21:54:05] nuria_: (1) 👍 (2) from stat1005
[21:54:33] bearloga: and did you make sure you can reach that host via http?
[21:54:40] nuria_: I can query druid no problem
[21:55:04] bearloga: ok, one thing we do not have to worry about
[21:56:24] bearloga: i *think* indexer is being redirected to 1003
[21:56:59] bearloga: if you do curl -v http://druid1001.eqiad.wmnet:8090/druid/indexer/v1/task you see a 307
[21:57:00] nuria_: curl debug output https://www.irccloud.com/pastebin/070xryTj/
[21:57:31] bearloga: ya, a 307
[21:57:54] bearloga: try posting to 1003
[21:58:26] nuria_: got a 200 :P
[21:58:38] nuria_: do you see anything on your end?
[21:59:32] bearloga: your initial post was a 30X which means (in http-speak) please repost to this other location, i think druid has moved indexing to 1003
[21:59:37] bearloga: let me see
[22:00:32] nuria_: yup I changed it to 1003 in the command and that's when I got success status
[22:00:51] bearloga: ya, now it is working
[22:01:24] bearloga: will take a bit, gotta say i have not indexed csv data before
[22:02:03] nuria_: w00t! YAY!
[22:02:23] bearloga: wa8it, i might have spoken too soon, one sec
[22:02:30] bearloga: ayayaya
[22:02:56] nuria_: also in that case shouldn't https://github.com/wikimedia/analytics-refinery/blob/master/oozie/pageview/druid/daily/coordinator.properties be changed to refer to 1003 instead, then?
[22:03:11] (and other coordinator properties in other oozie jobs)
[22:03:16] bearloga: druid moves things arround and jobs get redirected, is fine
[22:03:22] ah, okay
[22:05:31] 10Analytics, 10Discovery-Search (Current work), 10Patch-For-Review: Use kafka for communication from analytics cluster to elasticsearch - https://phabricator.wikimedia.org/T198490 (10EBernhardson) Ran a quick test for data volume we will be shipping over kafka, looks like we will be generating around 2-3GB o...
[22:06:23] nuria_: are you seeing errors with the indexing? :(
[22:07:12] bearloga: one sec
[22:13:07] bearloga: ya, it gets an OOM
[22:14:08] bearloga: wait, no, it is also in other jobs .. mmm
[22:14:50] nuria_: I was going to say how surprising that is given the csv file is 55MB uncompressed
[22:15:21] bearloga: yaya, my mistake, it is not that
[22:15:25] bearloga: totally
[22:15:56] bearloga: can you try to submit command gain?
[22:16:01] *again
[22:16:07] bearloga: i will look closely
[22:16:31] nuria_: sure, np!
[22:16:38] nuria_: submitting now
[22:16:50] bearloga: got 200?
[22:16:56] nuria_: yup!
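[Editor's note: the 30X exchange above is standard Druid overlord behaviour: only the current leader accepts indexing tasks, and a non-leader answers 307 with a Location header pointing at the leader, which plain `curl -X POST` (without `-L`) does not follow. A minimal self-contained sketch of that handshake; the stub servers below are local stand-ins for druid1001/druid1003, and the task id is invented:]

```python
import http.server
import json
import threading
import urllib.error
import urllib.request

TASK_PATH = "/druid/indexer/v1/task"

class Leader(http.server.BaseHTTPRequestHandler):
    """Stands in for the overlord leader (druid1003 in the log)."""
    def do_POST(self):
        self.rfile.read(int(self.headers.get("Content-Length", 0)))
        body = json.dumps({"task": "index_hadoop_gsc_all_example"}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)
    def log_message(self, *args):
        pass

class Follower(http.server.BaseHTTPRequestHandler):
    """Stands in for a non-leader overlord (druid1001): answers 307."""
    def do_POST(self):
        self.rfile.read(int(self.headers.get("Content-Length", 0)))
        self.send_response(307)
        self.send_header("Location", LEADER_URL)
        self.send_header("Content-Length", "0")
        self.end_headers()
    def log_message(self, *args):
        pass

def submit_task(url, spec):
    """POST an ingestion spec, re-POSTing to Location on a redirect
    (urllib refuses to auto-follow a 307 for a POST with a body)."""
    req = urllib.request.Request(
        url, data=json.dumps(spec).encode(),
        headers={"Content-Type": "application/json"})
    try:
        with urllib.request.urlopen(req) as resp:
            return json.load(resp)
    except urllib.error.HTTPError as err:
        if err.code in (301, 302, 307, 308):
            return submit_task(err.headers["Location"], spec)
        raise

leader_srv = http.server.HTTPServer(("127.0.0.1", 0), Leader)
LEADER_URL = f"http://127.0.0.1:{leader_srv.server_address[1]}{TASK_PATH}"
follower_srv = http.server.HTTPServer(("127.0.0.1", 0), Follower)
for srv in (leader_srv, follower_srv):
    threading.Thread(target=srv.serve_forever, daemon=True).start()

# Posting to the non-leader still succeeds once the 307 is followed,
# which is what switching the curl target from 1001 to 1003 did by hand.
follower_url = f"http://127.0.0.1:{follower_srv.server_address[1]}{TASK_PATH}"
result = submit_task(follower_url, {"type": "index_hadoop"})
print(result)
```

[With curl, the equivalent shortcut is adding `-L`, which re-POSTs to the Location target on a 307.]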
[22:20:19] bearloga: indexation job looks done, but i do not see data anywhere
[22:20:52] bearloga: you do not have querygranularity
[22:21:00] bearloga: on your spec
[22:21:32] bearloga: makes sense?
[22:22:45] nuria_: not particularly, no. queryGranularity is set to "none" in http://druid.io/docs/0.12.1/tutorials/tutorial-batch.html but I can try setting it to "day" and seeing if that fixes it
[22:24:22] nuria_: submitting indexing job again with the updated spec…got 200.
[22:25:17] nuria_: the path to the csv is correct, right? https://gist.github.com/bearloga/c311cdcd3a61f4435b4b006cf119c30e#file-druid-spec_country-all-json-L37
[22:32:01] bearloga: paths look good to me , i wonder if this is due to permits of druid not being able to access your dir?
[22:32:13] bearloga: did you try putting file on hadoop /tmp?
[22:33:00] bearloga: hdfs://analytics-hadoop/tmp/? let me look at other logs see if i find something
[22:33:19] nuria_: I'll try putting it in /tmp
[22:33:23] bearloga: did you submitted it with querygrnularity?
[22:34:22] nuria_: yep I changed it to "day"
[22:34:46] bearloga: ok, let's try again, index logs look fine
[22:36:43] nuria_: okay, submitted (200) with paths set to hdfs://analytics-hadoop/tmp/wmf_gsc.csv
[22:39:14] nuria_: btw I'm keeping that gist updated with every attempt if you want to to verify the ingestion spec
[22:39:36] (03PS2) 10Mforns: [WIP] Add ability to salt and hash to eventlogging sanitization [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/446592 (https://phabricator.wikimedia.org/T198426)
[22:40:34] bearloga: i think your data is ingested but is as if it your file was empty and thus no data source is created
[22:40:51] bye teaaam
[22:42:21] nuria_: do you want to try changing ownership to druid?
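[Editor's note: for reference, `queryGranularity` lives in the `granularitySpec` block of the `dataSchema` in a Druid 0.12 batch ingestion spec. The field names below are standard Druid fields; the interval is a made-up stand-in, not the one from the gist. `"none"` (as in the Druid tutorial) keeps each row's timestamp, while `"day"` pre-aggregates rows into daily buckets:]

```json
"granularitySpec": {
  "type": "uniform",
  "segmentGranularity": "day",
  "queryGranularity": "day",
  "intervals": ["2018-07-01/2018-07-18"]
}
```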
[22:43:07] bearloga: let me see if in other log there is a clue as to error
[22:43:37] bearloga: we can try to insert as json, that is the only other thing i can think could be the issue
[22:43:56] bearloga: https://www.csvjson.com/csv2json
[22:44:26] nuria_: I also have a tiny idea but it would be so wild if this is the reason. the CSV example shown in http://druid.io/docs/0.12.1/ingestion/data-formats.html has everything but timestamp and numbers quoted, but my CSV doesn't
[22:45:15] bearloga: if you are willing to change it do change it to json as we know that format works
[22:45:31] bearloga: does that sound good?
[22:45:40] bearloga: we can do two tries if you want
[22:45:46] i will be grepping logs
[22:48:00] nuria_: uhhhhh let me see if I can actually get it into json because there's no way I'm uploading this sensitive and private dataset to a public website
[22:49:55] bearloga: for indexing purposes it does not need to be real data, just use whatever fake one
[22:50:34] nuria_: good point. I'll try that
[22:50:37] bearloga: ahem... "to try indexing" that is
[23:00:18] bearloga: this might be it "io.druid.segment.indexing.DataSchema: No metricsSpec has been specified. Are you sure this is what you want?"
[23:02:19] nuria_: okay, just updated the spec (lmk if you need the gist url) and added the (fake) example json-formatted data that we'll try to ingest: https://gist.github.com/bearloga/c311cdcd3a61f4435b4b006cf119c30e#file-wmf_gsc-json
[23:03:09] bearloga: ok, so i found error
[23:03:12] bearloga:
[23:03:16] https://www.irccloud.com/pastebin/PCfaTmPY/
[23:05:37] bearloga: can i see ingestion spec?
[23:06:29] nuria_: https://gist.github.com/bearloga/c311cdcd3a61f4435b4b006cf119c30e#file-druid-spec_country-all-json
[23:07:34] nuria_: btw I haven't tried that one yet, wanted to get an OK from you first
[23:07:41] bearloga: well, let's try that
[23:09:20] nuria_: okay, submitted (with 200)
[23:16:05] bearloga: still i see metricsSpec as empty .. "metricsSpec" : [ ],
[23:17:22] nuria_: well that is super weird because as you can see it is super not empty in the ingestion spec
[23:24:23] nuria_: I'm happy to table it for now if you want to call it a day or have other things you'd rather focus on
[23:24:52] bearloga: in case this is abug , let's do one last try with metricsspec like :
[23:24:55] https://www.irccloud.com/pastebin/2r5zxVTA/
[23:25:17] I mean "name/type/fieldName" fields on that order
[23:26:22] bearloga: sorry, I mean "name/type/fieldName" structure on json file
[23:29:02] bearloga: let me know if you want to try that as our last thing
[23:29:19] nuria_: yup, preparing the spec right now
[23:32:50] nuria_: oh my god
[23:33:08] bearloga: typo?
[23:33:15] bearloga: cause that would explain all this?
[23:33:27] nuria_: I think in the course of editing metricsSpec somehow got out of dataSchema
[23:33:43] ooooooyyyyyyyyy
[23:33:48] bearloga: ok, then that is our problem
[23:34:05] bearloga: you can retry and it will probably succeed
[23:34:13] bearloga: will check later on!
[23:42:50] nuria_: thanks! I submitted it with this revision of the spec: https://gist.github.com/bearloga/c311cdcd3a61f4435b4b006cf119c30e/07af7acf299f3572ddc97bbeebb675910f66a721 hopefully that works! if it does, I'll try again with the CSV too :)
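[Editor's note: the root cause — `metricsSpec` drifting out of `dataSchema` while editing — is easy to catch mechanically, since Druid silently treats it as absent and logs `"metricsSpec" : [ ]`. A small sketch of such a pre-submit check; `lint_spec` is a hypothetical helper (not part of analytics/refinery), and the `clicks` metric is an invented stand-in for whatever the real gist contains:]

```python
def lint_spec(task: dict) -> list:
    """Return human-readable problems found in a Druid batch ingestion
    task. Hypothetical helper for pre-submit sanity checks."""
    problems = []
    inner = task.get("spec", {})
    schema = inner.get("dataSchema", {})
    # Druid only honours metricsSpec when it is nested under dataSchema.
    if not schema.get("metricsSpec"):
        problems.append("dataSchema.metricsSpec is missing or empty")
    if "metricsSpec" in inner:
        problems.append("metricsSpec sits under spec instead of spec.dataSchema")
    if "queryGranularity" not in schema.get("granularitySpec", {}):
        problems.append("granularitySpec.queryGranularity is not set")
    return problems

# Reproduce the bug from the log: metricsSpec outside dataSchema.
broken = {
    "type": "index_hadoop",
    "spec": {
        "dataSchema": {
            "dataSource": "gsc_all",
            "granularitySpec": {"type": "uniform", "queryGranularity": "day"},
        },
        "metricsSpec": [
            {"name": "clicks", "type": "longSum", "fieldName": "clicks"}
        ],
    },
}
for problem in lint_spec(broken):
    print(problem)
```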