[03:37:19] anyone alive around? looking for some geo guidance on the wmf.webrequest table
[05:34:48] Analytics: Junk in wmf.webrequest.uri_host field - https://phabricator.wikimedia.org/T95836#1201517 (Yurik) NEW
[06:39:51] Analytics: Junk in wmf.webrequest.uri_host field - https://phabricator.wikimedia.org/T95836#1201543 (Yurik) I did some stats to see the most frequent cases, those that might actually cause wrong results (if filtered/processed incorrectly), rather than just an annoyance (like random web sites). The most common...
[08:35:09] (PS1) Yurik: CountryCounts.hql implementation, lower(uri_host) [analytics/zero-sms] - https://gerrit.wikimedia.org/r/203650
[08:35:53] (CR) Yurik: [C: 2 V: 2] CountryCounts.hql implementation, lower(uri_host) [analytics/zero-sms] - https://gerrit.wikimedia.org/r/203650 (owner: Yurik)
[10:53:43] Analytics-EventLogging: agent_type field does not work for anything except last few hours - https://phabricator.wikimedia.org/T95806#1201679 (Ironholds) Open>Invalid a:Ironholds That's not a bug. The complexity of regenerating ~60 days of data, where a day is 24*60*125000 rows, is extreme, and addin...
[15:14:35] Analytics-Tech-community-metrics, ECT-April-2015: Instructions to update user data in korma - https://phabricator.wikimedia.org/T88277#1201801 (Dicortazar) I've added information about SortingHat in the Contributions section [1]. I may add extra details if needed. Hope this is useful! Some old text was r...
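The gerrit change above is the script that comes up again later in the evening. As a rough illustration only — the real CountryCounts.hql lives in the analytics/zero-sms repo at that link, and the geocoded_data field and partition names used here are assumptions rather than a quote from that change — a per-country, per-domain count with a lower(uri_host) fold (the fix the patch title names, addressing the mixed-case junk from T95836) has roughly this shape:

    # Minimal sketch, not the actual CountryCounts.hql (see gerrit change
    # 203650 in analytics/zero-sms). geocoded_data and the year/month/day
    # partition layout are assumed, not confirmed from that change.
    hive -e "
      SELECT geocoded_data['country_code'] AS country,
             lower(uri_host)               AS domain,   -- fold mixed-case junk (T95836)
             count(*)                      AS requests
      FROM   wmf.webrequest
      WHERE  year = 2015 AND month = 4 AND day = 11     -- one day per run
      GROUP  BY geocoded_data['country_code'], lower(uri_host);
    "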
[23:20:36] cluster is having issues and needs more nodes :)
[23:45:16] killed application_1424966181866_83018 - rsrvdmem 350208
[23:51:24] yurik, more precisely?
[23:51:34] Ironholds, ??
[23:51:42] the cluster was hanging
[23:51:45] was that the error you got, or?
[23:51:45] totally flat
[23:51:53] http://ganglia.wikimedia.org/latest/graph.php?r=hour&z=xlarge&hreg[]=analytics1012.eqiad.wmnet|analytics1018.eqiad.wmnet|analytics1021.eqiad.wmnet|analytics1022.eqiad.wmnet&mreg[]=kafka.server.BrokerTopicMetrics.%2B-BytesOutPerSec.OneMinuteRate&gtype=stack&title=kafka.server.BrokerTopicMetrics.%2B-BytesOutPerSec.OneMinuteRate&aggregate=1
[23:52:25] running mapred job showed an identical list
[23:52:48] so i killed one of my jobs - didn't help
[23:52:59] uhm
[23:53:02] killed the one with the highest rsrvdmem - and it worked
[23:53:15] just a warning in case you might need to re-run it
[23:53:25] run hadoop job -list on stat1002
[23:53:41] and tell me who you think is responsible for there being no memory
[23:53:44] wait, you killed someone else's job?
[23:53:51] correct
[23:53:54] who owned it?
[23:54:07] hdfs
[23:54:14] ...
[23:54:17] and what are your jobs doing?
[23:54:48] calculating per-country, per-domain usage
[23:54:51] for one day
[23:55:01] got a link to the script?
[23:55:15] sec, need to check it in
[23:55:18] and, what for?
[23:55:38] zero
[23:55:56] Ironholds, could you document the hadoop job https://wikitech.wikimedia.org/wiki/Analytics/Cluster/Hadoop/Load
[23:55:59] okay. more specifically, for a zero production-level task or?
[23:56:08] i only see the mapred
[23:56:11] what's the question you're answering?
[23:56:32] it is for the portal
[23:56:43] Imagine for a second that I do not know what the portal is
[23:56:45] how much traffic / requests / pageviews a country generates
[23:56:57] we need to plot this for our partner
[23:56:59] hold on
[23:57:01] typing :)))
[23:57:23] partners need to see their own usage graphs + compare them with their entire country
[23:57:31] gotcha
[23:57:50] are the jobs checked/CRd by analytics or just introduced at your end?
[23:57:56] we have pure zero (except that it doesn't include uploads/desktop, even though we need that too later on)
[23:58:14] not checked yet
[23:58:25] so, to summarise: you decided to run two jobs using 485376M and 497664M of memory respectively, on a weekend, to calculate a dataset for partners, without checking in the jobs for code review.
[23:58:26] hold on, let me check it in
[23:58:29] When these didn't run fast enough, you killed one of the regularly scheduled, production-level maintenance and ETL tasks, without telling anybody in advance or asking permission.
[23:58:44] not exactly right :)
[23:58:51] i run jobs all the time
[23:58:58] i have tons of them scheduled; they update the zero portal site
[23:59:00] on schedule
[23:59:13] (under my cron)
[23:59:16] okay, but this job was not CRd, not logged, not checked?
[23:59:29] and, you realise that the scheduling of the task is not my biggest bugbear here?
[23:59:38] the cron part was written with ottomata, but he never checked the job itself
[23:59:43] this one was very similar to it
[23:59:57] yes, but no one has ever mentioned any review process for the jobs
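For anyone retracing the exchange above: only `hadoop job -list` is quoted in the log itself; the yarn commands below are the standard Hadoop 2 equivalents, shown as a sketch of how the triage could have gone rather than a record of what was typed.

    # Run from stat1002. `hadoop job -list` is the command quoted above;
    # its listing includes UserName and RsvdMem columns, which is how the
    # job with the highest reserved memory was picked out.
    hadoop job -list
    yarn application -list    # same picture at the YARN application level
    yarn application -status application_1424966181866_83018   # check owner/state first...
    yarn application -kill   application_1424966181866_83018   # ...before killing

The -status step is the one that was skipped here: it shows the owning user before anything is killed, which would have flagged the job as belonging to hdfs, i.e. a scheduled production ETL task rather than a stray user query.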