[06:41:20] Analytics-Kanban: Improve mediawiki data redaction - https://phabricator.wikimedia.org/T146444#2698679 (jcrespo) > clean up Sanitarium to Jaime's standards and make it so it works with tungsten (a real-time sqoop) I like the "I have no idea how it works", and all guesses of how it works (like Alex's) are wr...
[06:44:43] Analytics-Kanban: Improve mediawiki data redaction - https://phabricator.wikimedia.org/T146444#2698681 (AlexMonk-WMF) >>! In T146444#2698679, @jcrespo wrote: >> clean up Sanitarium to Jaime's standards and make it so it works with tungsten (a real-time sqoop) > > all guesses of how it works (like Alex's) ar...
[10:30:56] joal: o/
[12:03:05] Analytics-Kanban, Patch-For-Review: Kill dashboards on limn1 that are no longer used - https://phabricator.wikimedia.org/T147000#2699120 (mforns) Action item (1): Remove reportupdater jobs from puppet. See gerrit change above ^ We'll need to remove by hand the cron jobs in stat1003, I guess.
[12:13:05] Analytics: Clean up datasets.wikimedia.org - https://phabricator.wikimedia.org/T125854#1998898 (mforns) The following folders in datasets.wikimedia.org contain data that isn't used any more. We can recheck with their owners, but the dashboards that retrieved them don't exist any more. So, when reorganizing t...
[12:13:38] Analytics-Kanban, Patch-For-Review: Kill dashboards on limn1 that are no longer used - https://phabricator.wikimedia.org/T147000#2677428 (mforns) Action item (2): These folders in datasets.wikimedia.org should be deleted. But I think it would be cool to leave them there for a couple weeks, as a temporary...
[12:17:45] Analytics-Kanban, Patch-For-Review: Kill dashboards on limn1 that are no longer used - https://phabricator.wikimedia.org/T147000#2699160 (mforns) Action item (3): Delete the web proxies (dns) for the following domains: - mobile-reportcard.wmflabs.org - edit-reportcard.wmflabs.org - debugging.wmfla...
[12:19:54] I am trying to re-run the failed job in hue
[12:19:59] the error emails are mine :)
[12:20:04] Analytics-Kanban, Patch-For-Review: Kill dashboards on limn1 that are no longer used - https://phabricator.wikimedia.org/T147000#2699161 (mforns) Action item (4): Remove reportupdater's report files in stat1003 for: - limn-mobile-data - limn-extdist-data This should be done after the puppet patch has...
[12:21:59] Permission denied: user=elukey, access=WRITE, inode="/wmf/data/raw/webrequest/webrequest_upload/hourly/2016/10/06/17/_PARTITIONED
[12:22:15] Analytics-Kanban, Patch-For-Review: Kill dashboards on limn1 that are no longer used - https://phabricator.wikimedia.org/T147000#2699162 (mforns) So, what remains to do here is: 1. Review and merge the puppet patch 2. After merge and puppet run, remove the cron jobs for mobile and extdist in stat1003 3....
[12:24:34] Hi elukey
[12:26:51] joal: o/
[12:27:14] do you have time to teach how to restart oozie jobs with higher error threshold to a n00b?
[12:27:35] I thought that re-running it via hue changing the parameter would have been enough :P
[12:28:19] elukey: normally it should, but IIRC that when using hue, job is not run with hdfs as user
[12:28:31] elukey: correction: When re-running changing a parameter
[12:28:48] yeah it runs with my username
[12:29:00] and I need hdfs to touch the file right?
[12:29:22] elukey: you need to be hdfs to write to those folders yes
[12:29:27] ah okok
[12:29:39] what do you usually do?
[12:30:02] I mean, do you have a precanned oozie command ?
[12:30:10] elukey: I do
[12:30:29] I did my homework and the total errors registered are 47% :(
[12:30:54] elukey: it needs a modified properties file --> we use a bundle for load coords, and in the precise correction case we only want a coord
[12:31:59] so we need to run a coordinator like hue does when I re-run but with the error threshold set to something like 50
[12:32:36] elukey: The file I use is in stat1004:/home/joal/code/oozie_props/coord_load_webrequest_upload.properties
[12:36:17] * elukey checks it out
[12:38:11] joal: and the oozie command? (mind if I run it this time to learn?)
[12:40:15] elukey: sudo -u hdfs oozie job --oozie $OOZIE_URL -Drefinery_directory=hdfs://analytics-hadoop$(hdfs dfs -ls -d /wmf/refinery/2016* | tail -n 1 | awk '{print $NF}') -Dqueue_name=production -Doozie_launcher_queue_name=production -Doozie_launcher_memory=256 -Dstart_time=2016-10-06T17:00Z -Dstop_time=2016-10-06T17:59Z -config /home/joal/code/oozie_props/coord_load_webrequest_upload.properties -run
[12:40:29] elukey: please go ahead ;0
[12:40:54] thankssss
[12:41:15] np elukey
[12:41:20] ah you already changed dates
[12:41:22] too easy :P
[12:41:44] elukey: Let me know if you want us to take some time for aqs-old->new move changes
[12:42:22] sure!
[12:42:33] there are also other oozie alerts
[12:46:33] elukey: other alerts are warnings about new project (olo.wikimedia)
[12:48:05] yep! How did you find it?
[12:48:15] (I guess in which table it was)
[12:48:21] wait wait wait there is a wiki page
[12:48:23] let me find it
[12:48:24] :P
[12:53:51] now I have https://wikitech.wikimedia.org/wiki/User:Elukey/Analytics/Oozie
[12:53:54] :)
[12:56:19] maybe I should update https://wikitech.wikimedia.org/wiki/Analytics/Cluster/Oozie too
[12:57:29] possible elukey, just the fact that the properties file not being productionized is not really
[12:57:32] great
[12:57:48] yeah, I'll keep in my docs for the moment
[12:57:52] k
[12:58:56] ah the wmf.pageview_unexpected_values table!
[13:03:19] select * from pageview_unexpected_values where year = 2016 and month = 10 and day = 7;
[13:03:29] :)
[14:58:49] Analytics-Kanban: Improve mediawiki data redaction - https://phabricator.wikimedia.org/T146444#2699546 (Milimetric) @jcrespo: no work has been done, no technologies have been decided, these are just words on a phab task right now. Tungsten is the only tech that I know of that helps export mysql data to hado...
[15:01:14] (CR) Nuria: [C: 2 V: 2] Bringing master up to date with aqs-new-cluster branch [analytics/aqs] - https://gerrit.wikimedia.org/r/314284 (https://phabricator.wikimedia.org/T144497) (owner: Nuria)
[15:29:37] mforns: do you need any help for https://phabricator.wikimedia.org/T147000#2699162
[15:29:41] ?
[15:32:30] Analytics, Cassandra, Discovery, Maps, and 2 others: Investigate and implement possible simplification of Cassandra Logstash filtering - https://phabricator.wikimedia.org/T130861#2699616 (Eevans) Open>Resolved Complete.
[15:52:11] elukey, it worked fine, thx!
[15:54:30] Analytics-Kanban, Patch-For-Review: Kill dashboards on limn1 that are no longer used - https://phabricator.wikimedia.org/T147000#2699640 (mforns) Action item (1) DONE - Removed the crontab entries for limn-mobile-data and limn-extdist-data - Removed the limn-mobile-data and limn-extdist-data repos from...
[16:05:34] urandom: you there?
[16:05:52] elukey: yup
[16:05:56] o/
[16:05:58] re: gimme all yer nodes.
[16:06:07] heh
[16:06:55] I am in favor but bear in mind that 1) they don't have SSDs and 2) from what I've gathered the raid/disk controllers perform really poorly
[16:07:05] I want to make some performance tests for 2)
[16:07:20] ok; we are already all set to buy SSDs
[16:07:27] so that's not an issue
[16:07:42] if you are ok with the above warnings, I can go ahead with wiping them and assigning the role spare, then I'll make a phab task for DC ops to assign them to you guys
[16:07:45] we have some nodes ear-marked in AMS, but that location is less ideal
[16:07:56] as for the raid control...
[16:08:07] controllers, that is
[16:08:33] we usually use software raid, no... is it that you cannot not use these controllers?
[16:08:47] (sorry for the double-negative there)
[16:09:06] yeah but from what I know these controllers will be used to control the disks too, even if you do use them as JBOD
[16:09:25] (sorry miswritten but you got my point)
[16:09:40] Ok, and they perform badly
[16:09:45] do you know what machines these are?
[16:09:50] are the Dells?
[16:10:07] yes I think so, but atm these are speculations and not final proofs
[16:10:08] s/the/they/
[16:10:12] I want to have more data next week
[16:10:48] i received some vague warnings about the controllers in the AMS machines, too
[16:10:54] Dell PowerEdge R720xd
[16:10:59] this is aqs1001
[16:11:03] that rings a bell
[16:13:19] but again this might be only a speculation
[16:13:34] so I'll make tests and I'll let you know :)
[16:13:56] elukey: kk
[16:14:03] (for all the readers - we are talking about disk/raid performances for a specific heavy use case like Cassandra)
[16:14:29] elukey: while i have you, do you think on Monday I could get a merge of https://gerrit.wikimedia.org/r/#/c/314603?
[16:14:38] PC output is linked
[16:14:56] I thought it was part of yesterday's SWAT
[16:14:57] it just adds a jar to the classpath for TWCS
[16:15:07] no, i needed to get the jar deployed first
[16:15:23] and now it's friday :(
[16:15:24] sure let's touch base on Monday
[16:15:28] kk
[16:15:34] elukey: thanks!
[16:15:51] urandom: will you be in Seville for the ApacheCon?
[16:15:58] it took more wrangling than i cared for to get everything deployed
[16:15:59] (probably I already asked you that)
[16:16:00] elukey: nope
[16:16:14] elukey: are you going?
[16:16:21] ah snap! it would have been nice to meet
[16:16:23] yep!
[16:16:50] ah yeah, that would have been nice
[16:25:52] (Abandoned) Nuria: Bringing master up to date with aqs-new-cluster branch [analytics/aqs] - https://gerrit.wikimedia.org/r/314284 (https://phabricator.wikimedia.org/T144497) (owner: Nuria)
[16:28:04] (PS1) Nuria: Updating master with new-aqs-cluster branch [analytics/aqs] - https://gerrit.wikimedia.org/r/314722 (https://phabricator.wikimedia.org/T144497)
[16:29:17] (CR) Nuria: "@elukey please look again, I squashed commits in order to merge branch cause I think gerrit prefers that workflow" [analytics/aqs] - https://gerrit.wikimedia.org/r/314722 (https://phabricator.wikimedia.org/T144497) (owner: Nuria)
[16:29:49] elukey: corrected merge on the way I *think* gerrit prefers it
[16:30:42] \o/
[16:31:32] mmmm
[16:31:44] how many commits were squashed?
[16:31:54] not a big deal but it would be great to keep the history
[16:32:35] elukey: ah two commits i can redo commit history, gimme a sec
[16:37:17] nuria: I am going afk atm but I'll review the code reviews on Monday first thing :)
[16:37:24] elukey: k, ciao
[16:37:33] have a good weekend team! byeeeee o/
[16:39:25] bye elukey !
[16:41:30] elukey: ciao!
[16:47:08] (PS2) Nuria: Updating master with new-aqs-cluster branch [analytics/aqs] - https://gerrit.wikimedia.org/r/314722 (https://phabricator.wikimedia.org/T144497)
[16:57:53] joal: I do not think this has been deployed yet (404 change) https://gerrit.wikimedia.org/r/#/c/312561/
[17:02:31] joal: i take it back it has, today
[17:04:04] k nuria
[17:05:03] nuria: I'll investigate on the ratio nbReqs / distinct UAs this weekend
[18:17:34] joal: ok, I am still seeing pages that *I think* should no longer be there, let me look a bit more
[18:22:17] joal: ahem ... nuria forgot what month are we in
[18:22:25] time flies...
[18:26:08] joal: ok, no, those pages are no longer there
[18:27:01] Analytics-Kanban: Spamy - User-like pages distort our pageview metrics (they return 200 when they should return 404) - https://phabricator.wikimedia.org/T145922#2645266 (Nuria) Confirming that pages such as this one: https://en.wikipedia.org/?title=User:GoogleAnalitycsRoman/google-api&action=history return 4...
[18:47:42] Analytics-Kanban: {kudu} Wikimetrics for IPL - https://phabricator.wikimedia.org/T114423#1694971 (Nuria) Closing, old, no longer relevant.
[18:47:46] Analytics-Kanban: {kudu} Wikimetrics for IPL - https://phabricator.wikimedia.org/T114423#2700257 (Nuria) Open>Resolved
[20:22:55] Analytics-Kanban, Research-and-Data, Research-collaborations, Research-management, Patch-For-Review: Oozie job to extract data for WDQS research - https://phabricator.wikimedia.org/T146064#2700397 (leila) @Nuria can you let us know when the code is running and we can close this task? (I think...
[20:24:48] Analytics-Kanban, Research-and-Data, Research-collaborations, Research-management, Patch-For-Review: Oozie job to extract data for WDQS research - https://phabricator.wikimedia.org/T146064#2700443 (Nuria) Code will be deployed next week.
[21:38:48] Analytics-Kanban, Research-and-Data, Research-collaborations, Research-management, Patch-For-Review: Oozie job to extract data for WDQS research - https://phabricator.wikimedia.org/T146064#2700656 (leila) Thanks, Nuria.
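Note on the 12:40 re-run command: the following is a minimal sketch, assuming a POSIX shell, of how that one-hour coordinator re-run could be assembled and printed for review before anyone runs it as hdfs. The helper name build_rerun_cmd and its arguments are hypothetical; the -D properties and the -config / -run flags are copied from the command quoted in the log, and the raised error threshold is assumed to live in the modified properties file, as joal describes.

```shell
# Sketch only: prints the re-run command instead of executing it.
# build_rerun_cmd is a hypothetical helper, not part of oozie or refinery.
#   $1  oozie server URL (the value of $OOZIE_URL in the log)
#   $2  HDFS path of the deployed refinery directory (placeholder value)
#   $3  hour to re-run, formatted YYYY-MM-DDTHH
#   $4  properties file that carries the raised error threshold
build_rerun_cmd() {
    # printf reuses the format for every argument, so each word is
    # followed by one space; the final word is printed with a newline.
    printf '%s ' sudo -u hdfs oozie job --oozie "$1" \
        "-Drefinery_directory=hdfs://analytics-hadoop$2" \
        "-Dqueue_name=production" \
        "-Doozie_launcher_queue_name=production" \
        "-Doozie_launcher_memory=256" \
        "-Dstart_time=$3:00Z" \
        "-Dstop_time=$3:59Z" \
        -config "$4"
    printf '%s\n' -run
}
```

Printing rather than executing keeps the sketch safe to try anywhere; the printed line would only work on a host that has the oozie client and the right to sudo to hdfs, which the 12:21 "Permission denied" message shows is the whole point of not re-running from hue.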