[05:15:37] 10Quarry: Ask python scripts to use custom user agents - https://phabricator.wikimedia.org/T197258#4284548 (10zhuyifei1999) >>! In T197258#4283799, @Framawiki wrote: > Mmm, is it some kind of monitoring tool ? benchmark test ? :) @zhuyifei1999 I don't know of one. [05:26:10] 10Quarry: Ask python scripts to use custom user agents - https://phabricator.wikimedia.org/T197258#4284549 (10zhuyifei1999) > 20240 14.29% GET HTTP/1.1 /query/new This should flood the new query list. Maybe we can check the list and see if there's something unhuman there? [05:35:38] 10Quarry: Ask python scripts to use custom user agents - https://phabricator.wikimedia.org/T197258#4284551 (10zhuyifei1999) ``` MariaDB [quarry]> SELECT -> COUNT(DISTINCT query.id) AS numempty, -> user_id, -> (SELECT username FROM user WHERE user.id = user_id) AS username -> FROM query... [06:13:27] morning! [06:13:38] that webrequest hour is super weird [06:13:44] it doesn't want to refine :D [06:13:54] I just tried another re-run to see if we are lucky [06:14:32] if the failures were more I'd point the finger to the new journal nodes, but everything looks good up to now except for those hours [06:59:37] Hi elukey [06:59:43] I'm gonna spend time on this now [07:00:00] elukey: This will be problematic if it doesn't work at some point :) [07:00:11] yeah weird :) [07:00:20] I am merging an apache change, then I'll be there to help :) [07:00:59] Ah - Just saw nur*ia_ message - Should we wait, or work? [07:03:01] I'd say no, let's try to fix it, then if we have a solution we'll show to the team [07:23:33] 10Analytics-Kanban: Fix failing webrequest hours (upload and text 2018-06-14-11) - https://phabricator.wikimedia.org/T197281#4284615 (10JAllemandou) [07:37:51] elukey: somnething is wseird with spark on an1004 [07:37:55] stat1004 sorry [07:38:34] ? [07:38:35] I can't access the hive databases :( [07:38:52] have you tried it after the reimage to stretch? 
[07:39:00] I think I didn't [07:39:08] ah ok so something might have changed [07:39:18] what does it say? [07:39:33] elukey: I'm betting on spark-scripts no linking to hive-site.xml anymore :) [07:39:47] elukey: it says it dosn't know of any of hive databases; [07:39:55] I was about to check the hive-site link [07:39:58] elukey: Spark doesn't connect to the hive metastore [07:40:00] :) [07:40:59] so /etc/spark/conf.analytics-hadoop is there [07:41:02] with hive-site [07:41:40] and lrwxrwxrwx 1 root root 45 May 22 16:59 hive-site.xml -> /etc/hive/conf.analytics-hadoop/hive-site.xml [07:43:55] does it work on stat1005? [07:44:04] elukey: didn't try - sill do [07:44:23] spark2-shell --master yarn right ? [07:44:27] I am trying it now [07:44:46] correct [07:45:08] ah but I don't remember how to access the hive databases [07:45:21] goal for this year joal for me is to learn spark [07:45:25] I am too ignorant [07:45:30] spark.sql("show databases").collet.foreach(println) [07:45:33] so you'll need to be patient [07:45:37] more than usual :D [07:45:38] collect sporry [07:45:55] works :) [07:46:16] you have the list? [07:46:21] yep [07:46:25] MEEEEEH ! [07:47:53] Well at least we know it's not completely broken! [07:48:31] ah! [07:48:37] found the issue [07:48:42] ? [07:49:00] on stat1004 /etc/spark2/etc.. is missing hive config [07:49:08] :*( [07:49:10] (I checked in /etc/spark/ before) [07:53:09] joal: so reading puppet, profile::hadoop::spark2 - there is a comment in there [07:53:17] # The deb package creates as post-install step a symlink like [07:53:17] # /etc/spark/conf/hive-site.xml -> /etc/hive/conf.analytics/hive-site.xml [07:53:20] # This package needs to be installed after the deploy of the Hive configuration. [07:53:23] # (should be guaranteed by the puppet evaluation order). 
[07:53:52] for spark 1 we explicitly define it in puppet [07:54:00] IIRC Andrew mentioned an issue like this one [07:54:22] say for example that the post install step of the spark2 deb executes before the hive config [07:54:23] elukey: How come it works in stat1005 then??? [07:54:25] then no link [07:54:32] Ah ! Puppet install order [07:54:52] it might have been fixed manually before, and in this case the stat1004 reimage's first puppet run hit the race condition [07:55:09] even if there is an explicit require for profile::hadoop::common [07:55:20] ah but not hive client! [07:55:27] ahhhhhhh [07:55:31] that one needs to happen before [07:55:33] okok now I get it [07:59:11] I must say I don't :) [08:00:13] I think it is a puppet thing [08:00:23] so if you check profile::hadoop::spark2 in puppet [08:00:59] you'll see that there is the comment that I pasted above [08:01:06] but it is guarded by a if [08:01:11] so we require in there require ::profile::hadoop::common [08:01:26] that basically deploys most of our config files for hadoop [08:01:29] but not the hive ones [08:01:35] those are in ::profile::hive::client [08:01:54] we only state in there "if hive is defined, then.." [08:02:01] 10Analytics-Kanban, 10AbuseFilter, 10Data-release, 10HAWelcome, and 9 others: Fix failing webrequest hours (upload and text 2018-06-14-11) - https://phabricator.wikimedia.org/T197281#4284668 (10238482n375) p:05Triage>03Lowest a:05JAllemandou>03None SG9tZVBoYWJyaWNhdG9yCk5vIG1lc3NhZ2VzLiBObyBub3RpZm... [08:02:08] 10Analytics-Kanban, 10AbuseFilter, 10Data-release, 10Fundraising-Backlog, and 16 others: Handle timeout in PaypalEC Orphan Rectifier and enable job - https://phabricator.wikimedia.org/T184284#4284674 (10238482n375) p:05Triage>03Lowest a:05mepps>03None SG9tZVBoYWJyaWNhdG9yCk5vIG1lc3NhZ2VzLiBObyBub3R... 
[08:02:09] so without an explicit ordering, if somewhere else ::profile::hive::client is included [08:02:27] and runs BEFORE profile::hadoop::spark2 [08:02:41] then when the deb gets installed we get the symlink
[08:02:44] otherwise we don't [08:04:24] So we're waiting for hadoop to be present, but not hive ! Get it
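The ordering race discussed above (the spark2 package's post-install step links hive-site.xml, so the Hive config has to be deployed first) can be sketched as a small simulation. This is only an illustration under assumptions: the two functions are hypothetical stand-ins for the Hive client profile and the package post-install step, the directories live in a temporary folder rather than /etc, and the "link only if the target already exists" behaviour is inferred from the conversation, not taken from the actual package scripts.

```python
import os
import tempfile


def deploy_hive_config(root):
    """Hypothetical stand-in for the Hive client profile: writes hive-site.xml."""
    hive_dir = os.path.join(root, "hive", "conf")
    os.makedirs(hive_dir, exist_ok=True)
    path = os.path.join(hive_dir, "hive-site.xml")
    with open(path, "w") as f:
        f.write("<configuration/>\n")
    return path


def install_spark2(root):
    """Hypothetical stand-in for the spark2 post-install step.

    Assumption: the step links hive-site.xml only if the Hive config
    is already present at install time, and is never re-run afterwards.
    """
    spark_dir = os.path.join(root, "spark2", "conf")
    os.makedirs(spark_dir, exist_ok=True)
    target = os.path.join(root, "hive", "conf", "hive-site.xml")
    link = os.path.join(spark_dir, "hive-site.xml")
    if os.path.exists(target) and not os.path.lexists(link):
        os.symlink(target, link)
    return link


def run(steps):
    """Apply the steps in order on a fresh root; report whether the link exists."""
    with tempfile.TemporaryDirectory() as root:
        link = None
        for step in steps:
            result = step(root)
            if step is install_spark2:
                link = result
        return os.path.islink(link)


# Hive config first, then the package: the symlink gets created.
hive_first = run([deploy_hive_config, install_spark2])
# Package first, then the Hive config: the symlink is silently missing.
spark_first = run([install_spark2, deploy_hive_config])
print(hive_first, spark_first)
```

Under these assumptions only the first ordering leaves a usable link, which matches the symptom seen on stat1004 after the reimage: Spark starts, but has no Hive metastore configuration and lists no Hive databases.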
[08:06:09] 10Analytics-Kanban, 10AbuseFilter, 10Data-release, 10HAWelcome, and 13 others: Upgrade Puppet Master Infrastructure to Debian Stretch - https://phabricator.wikimedia.org/T184562#4285292 (10238482n375) p:05Normal>03Lowest a:05fgiunchedi>03None SG9tZVBoYWJyaWNhdG9yCk5vIG1lc3NhZ2VzLiBObyBub3RpZmljYXRp... [08:06:12] whattt [08:06:17] spam day? :D [08:06:23] joal: does it make sense what I wrote? [08:06:43] 10Analytics-Kanban, 10AbuseFilter, 10Data-release, 10HAWelcome, and 11 others: Create updated release v0.3.0beta2 - https://phabricator.wikimedia.org/T184519#4285443 (10238482n375) p:05Triage>03Lowest a:05HannaLindgren>03None SG9tZVBoYWJyaWNhdG9yCk5vIG1lc3NhZ2VzLiBObyBub3RpZmljYXRpb25zLgoKICAgIFNlY... [08:06:48] 10Analytics-Kanban, 10AbuseFilter, 10Data-release, 10HAWelcome, and 10 others: Change my Phabricator username to "Cgt" - https://phabricator.wikimedia.org/T184509#4285449 (10238482n375) p:05Triage>03Lowest a:05Aklapper>03None SG9tZVBoYWJyaWNhdG9yCk5vIG1lc3NhZ2VzLiBObyBub3RpZmljYXRpb25zLgoKICAgIFNlY... [08:06:59] 10Analytics-Kanban, 10AbuseFilter, 10Data-release, 10HAWelcome, and 10 others: EHU - Gipuzkoako Campusa - HEFA - Pedagogia: Hezkuntzaren Teoria eta Erakunde Garaikideak - https://phabricator.wikimedia.org/T184632#4285473 (10238482n375) p:05Normal>03Lowest a:05Theklan>03None SG9tZVBoYWJyaWNhdG9yCk5v... [08:07:04] 10Analytics-Kanban, 10AbuseFilter, 10Advanced-Search, 10Data-release, and 13 others: Keep namespace selection when entering advancedSearch - https://phabricator.wikimedia.org/T184589#4285491 (10238482n375) p:05Normal>03Lowest a:05gabriel-wmde>03None SG9tZVBoYWJyaWNhdG9yCk5vIG1lc3NhZ2VzLiBObyBub3RpZ... [08:07:08] 10Analytics-Kanban, 10AbuseFilter, 10Data-release, 10HAWelcome, and 10 others: tools: out of disk: tools-worker-1020 - https://phabricator.wikimedia.org/T184604#4285521 (10238482n375) p:05Low>03Lowest a:05aborrero>03None SG9tZVBoYWJyaWNhdG9yCk5vIG1lc3NhZ2VzLiBObyBub3RpZmljYXRpb25zLgoKICAgIFNlYXJjaA... 
[08:15:26] (03CR) 10Addshore: "Are you also working on CI for this?"
[analytics/wmde/toolkit-analyzer] - 10https://gerrit.wikimedia.org/r/440133 (https://phabricator.wikimedia.org/T196883) (owner: 10WMDE-leszek) [08:28:06] (03CR) 10WMDE-leszek: "CI job attempt no 1 is I3d5026e83daa4dc219cdec1635a33030a764e242" [analytics/wmde/toolkit-analyzer] - 10https://gerrit.wikimedia.org/r/440133 (https://phabricator.wikimedia.org/T196883) (owner: 10WMDE-leszek) [08:28:45] joal: in the meantime, I created https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/440507/1/modules/profile/manifests/hadoop/spark2.pp [08:28:52] let's see what andrew thinks about it [08:30:01] Great - Thanks elukey [08:30:14] (03CR) 10WMDE-leszek: "thanks for mentioning -build thing. Will do!" [analytics/wmde/toolkit-analyzer] - 10https://gerrit.wikimedia.org/r/440133 (https://phabricator.wikimedia.org/T196883) (owner: 10WMDE-leszek) [08:44:47] joal: let me know if you need any help for the webrequest stuff [08:48:24] elukey: so far I'm ok - Looking for the corrupted file [08:49:24] elukey: I wonder if the corruption have come from the journal-nodes, or from the namednode-restart [08:55:59] joal: yeah I am wondering the same as well [08:56:28] elukey: I think it's because of the namenode [08:56:38] elukey: hdfs://analytics-hadoop/user/joal/wmf/data/raw/webrequest/webrequest_upload/hourly/2018/06/14/11/webrequest_upload.1004.10.1214791.15490650727.1528974000000._COPYING_ [08:58:29] ah yes makes sense [08:58:58] not sure how because in theory it shouldn't have happened [08:59:05] yeah I know ! [08:59:29] so what's the best fix? Re-run camus for that specific hour? [08:59:38] elukey: the correct exist (without the _COPYING_ extension), is bigger, and has the same exact start as the faulty [08:59:53] I request permission to just drop the faulty file [09:00:00] elukey: --^ [09:00:01] +2 [09:00:46] same thing for webrequest-text I suppose? 
[09:00:55] !log Deleting corrupted file hdfs://analytics-hadoop/user/joal/wmf/data/raw/webrequest/webrequest_upload/hourly/2018/06/14/11/webrequest_upload.1004.10.1214791.15490650727.1528974000000._COPYING_ to prevent webrequest refine jobs from failing. No data will be lost as the correct file exist. [09:01:00] elukey: Will check [09:01:01] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [09:01:23] joal: is /usr/joal/wmf the correct path? [09:01:40] elukey: surely not :) [09:01:47] elukey: hmmmmm [09:02:01] elukey: maybe the problem comes from the data I copied [09:02:05] Good catch, will check [09:02:53] Ah crap - False joy [09:03:15] elukey: You're right - the problmatic file doesn' exist in the core folder, mus have an artifact of me copying the data [09:03:24] elukey: sorry for the noise and thanks for catching that [09:03:39] :( [09:03:53] * joal gets back to trying to understand [09:04:27] is there a way to re-import the raw data for that hour [09:04:28] ? [09:04:44] there is, but it's a bit of work [09:04:49] yeah I imagine [09:05:40] elukey: find the correct camus job in history for the restart, reset a camus working directory, and run camus, [09:06:33] For the moment, trying to confirm data correction from spark (counting number of rows in each file) [09:06:44] super [09:07:00] elukey: from REAL data folder this time [09:30:30] 10Analytics, 10EventBus, 10MediaWiki-JobQueue, 10Operations, and 3 others: Clean up cpjobqueue metrics - https://phabricator.wikimedia.org/T196067#4290118 (10fgiunchedi) List of metrics at https://phabricator.wikimedia.org/P7262, I'll remove those if the list looks good. [09:32:44] 10Analytics, 10EventBus, 10MediaWiki-JobQueue, 10Operations, and 3 others: Clean up cpjobqueue metrics - https://phabricator.wikimedia.org/T196067#4290128 (10Pchelolo) The list is insanely long, I've poked around and didn't find anything that should remain. LGTM. 
[09:49:14] 10Analytics, 10EventBus, 10MediaWiki-JobQueue, 10Operations, and 3 others: Clean up cpjobqueue metrics - https://phabricator.wikimedia.org/T196067#4290234 (10fgiunchedi) 05Open>03Resolved [09:49:20] 10Analytics, 10EventBus, 10MediaWiki-JobQueue, 10Goal, and 3 others: FY17/18 Q4 Program 8 Services Goal: Complete the JobQueue transition to EventBus - https://phabricator.wikimedia.org/T190327#4290235 (10fgiunchedi) [09:57:13] 10Analytics-Kanban: Fix failing webrequest hours (upload and text 2018-06-14-11) - https://phabricator.wikimedia.org/T197281#4290264 (10akosiaris) a:03JAllemandou [09:57:17] 10Analytics-Kanban: Fix failing webrequest hours (upload and text 2018-06-14-11) - https://phabricator.wikimedia.org/T197281#4290269 (10akosiaris) [09:57:19] 10Analytics-Kanban: Fix failing webrequest hours (upload and text 2018-06-14-11) - https://phabricator.wikimedia.org/T197281#4284615 (10akosiaris) [09:57:36] 10Analytics-Kanban: Fix failing webrequest hours (upload and text 2018-06-14-11) - https://phabricator.wikimedia.org/T197281#4284615 (10akosiaris) p:05Lowest>03Triage [10:01:23] elukey: every file of the uplaod hour is readable and countable [10:02:12] 10Analytics-Tech-community-metrics, 10Developer-Relations (Apr-Jun-2018), 10Security, 10Wikimedia-VE-Campaigns (S2-2018): Explain decrease in number of patchset authors for same time span when accessed 3 months later - https://phabricator.wikimedia.org/T184427#4290315 (10Dzahn) p:05Lowest>03Low a:03Ak... [10:07:08] joal: is there any file that spikes for number of rows or something that may lead to timeouts? 
[10:07:28] elukey: nope [10:07:30] 10Analytics-Kanban, 10AbuseFilter, 10Data-release, 10HAWelcome, and 10 others: Consider not removing multiple blank lines/white space between paragraphs - https://phabricator.wikimedia.org/T184755#4290365 (10Pginer-WMF) [10:07:37] 10Analytics-Kanban, 10AbuseFilter, 10Data-release, 10Discovery-Search, and 9 others: Set up RelForge test of phonetic title search - https://phabricator.wikimedia.org/T184771#4290367 (10Pginer-WMF) [10:07:40] 10Analytics-Kanban, 10AbuseFilter, 10Data-release, 10HAWelcome, and 12 others: Evaluate and set up a test instance of FOSS persistent chat software as a companion to Q&A system for communication with third-party developers - https://phabricator.wikimedia.org/T184606#4290366 (10Pginer-WMF) [10:07:46] 10Analytics-Kanban, 10AbuseFilter, 10Data-release, 10HAWelcome, and 10 others: Tech audiences map - https://phabricator.wikimedia.org/T184770#4290368 (10Pginer-WMF) [10:07:50] 10Analytics-Kanban, 10AbuseFilter, 10Data-release, 10HAWelcome, and 11 others: Improve access to Commons image data for research and development - https://phabricator.wikimedia.org/T184744#4290369 (10Pginer-WMF) [10:08:01] 10Analytics-Kanban, 10AbuseFilter, 10Data-release, 10Developer-Relations, and 11 others: Create and publish a multi-tiered support level system for MediaWiki extensions frequently used by third parties - https://phabricator.wikimedia.org/T184648#4290372 (10Pginer-WMF) [10:08:48] elukey: regular file size and row number, end of hour files are smaller is all [10:08:57] joal: the other question that I have is about the maxmind geolocalizaion of the ip.. should it be considered quick enough, or a pile of ips not recognized might cause a ton of time to be spent? 
[10:09:35] (I am throwing stupid questions out loud, don't really explain what's happening) [10:09:38] elukey: IIRC we use a LRU cache to make sure it;s fast enough [10:09:46] elukey: no stupid questions :) [10:11:07] 10Analytics-Kanban, 10AbuseFilter, 10Data-release, 10HAWelcome, and 12 others: Use cached page leads when creating page summaries to reduce MCS load - https://phabricator.wikimedia.org/T184753#4290396 (10Pginer-WMF) [10:11:19] elukey: Trying a manual refine over copied (and cleaned) data [10:22:33] 10Analytics, 10Operations, 10hardware-requests: eqiad: (1) new stat box to offload users from stat1005 - https://phabricator.wikimedia.org/T196345#4290498 (10elukey) I had a chat with Ottomata and I think that the spare could work for the moment. The warranty will expire soonish so in case we'll see that a m... [10:25:11] joal: going afk in a bit for a couple of hours, will read when back.. anything needed before I go? [10:25:24] nope, continuing my investigation, slowlyn [10:25:36] cluster pace :) [10:25:39] ack, thanks for the work :) [10:25:49] np - Hopefully we'll find [10:31:38] 10Analytics-Tech-community-metrics, 10Developer-Relations (Apr-Jun-2018), 10Wikimedia-VE-Campaigns (S2-2018): Explain decrease in number of patchset authors for same time span when accessed 3 months later - https://phabricator.wikimedia.org/T184427#4290550 (10Dzahn) [10:41:19] 10Analytics, 10Contributors-Analysis, 10Product-Analytics: Make an Analytics Data Lake table to provide meta info about wikis - https://phabricator.wikimedia.org/T184576#4290688 (10Dzahn) [10:46:27] 10Analytics-Kanban, 10AbuseFilter, 10Collaboration-Team-Triage, 10Data-release, and 14 others: Localize StructuredDiscussion namespace into lfn - https://phabricator.wikimedia.org/T184517#4290739 (10Dzahn) [11:03:41] 10Analytics-Cluster, 10Analytics-Kanban, 10Operations, 10Traffic, and 2 others: TLS security review of the Kafka stack - https://phabricator.wikimedia.org/T182993#4290888 (10Vgutierrez) I've just 
tested a new build of librdkafka (0.11.3-1~bpo8+1+wikimedia2) on cp1008 that includes the new TLS configuration... [11:21:02] 10Analytics-Kanban, 10AbuseFilter, 10Data-release, 10HAWelcome, and 12 others: [wmf.17 - regression] Edit post page shows no-js reply link - https://phabricator.wikimedia.org/T184636#4290953 (10Aklapper) p:05Lowest>03Triage a:03Catrope [11:21:07] 10Analytics-Kanban, 10AbuseFilter, 10Data-release, 10HAWelcome, and 15 others: language-screenshots-VisualEditor job fails if there is a problem in `before each` hook. - https://phabricator.wikimedia.org/T184724#4290955 (10Aklapper) p:05Lowest>03Triage a:03zeljkofilipin [11:21:11] 10Analytics-Kanban, 10AbuseFilter, 10Data-release, 10HAWelcome, and 13 others: CI npm job for VisualEditor repo fails as can't find the npm-browser-test docker image - https://phabricator.wikimedia.org/T184810#4290957 (10Aklapper) p:05Lowest>03Triage a:03hashar [11:21:16] 10Analytics-Kanban, 10AbuseFilter, 10Data-release, 10HAWelcome, and 10 others: Consider not removing multiple blank lines/white space between paragraphs - https://phabricator.wikimedia.org/T184755#4290962 (10Aklapper) p:05Lowest>03High a:03ssastry [11:21:22] 10Analytics-Kanban, 10AbuseFilter, 10Data-release, 10HAWelcome, and 13 others: [betalabs] "Uncaught TypeError: this.isFloatableOutOfView is not a function" when clicking on "Switch editor" - https://phabricator.wikimedia.org/T184665#4290959 (10Aklapper) p:05Lowest>03Unbreak! 
a:03matmarex [11:21:27] 10Analytics-Kanban, 10AbuseFilter, 10Data-release, 10HAWelcome, and 15 others: Reduce the amount of time it takes for the 2017 wikitext editor to become interactive - https://phabricator.wikimedia.org/T184614#4290965 (10Aklapper) p:05Lowest>03High a:03Esanders [11:30:22] 10Analytics, 10Analytics-Wikistats, 10ORES, 10Scoring-platform-team, 10Security: Discuss Wikistats integration for ORES - https://phabricator.wikimedia.org/T184479#4291029 (10akosiaris) [11:30:31] 10Analytics, 10Analytics-Wikistats, 10ORES, 10Scoring-platform-team: Discuss Wikistats integration for ORES - https://phabricator.wikimedia.org/T184479#4291032 (10akosiaris) [11:32:00] 10Analytics, 10Analytics-Wikistats: Wikistats Bug - https://phabricator.wikimedia.org/T184475#4291055 (10akosiaris) [11:32:02] 10Analytics, 10Analytics-Wikistats: Wikistats Bug - https://phabricator.wikimedia.org/T184475#4291063 (10akosiaris) [11:33:03] 10Analytics-Kanban, 10AbuseFilter, 10Addwiki, 10Data-release, and 9 others: There is no way to associate blocks to the User data-model - https://phabricator.wikimedia.org/T184635#4291073 (10Aklapper) p:05Lowest>03Triage a:03dmaza [11:33:09] 10Analytics-Kanban, 10AbuseFilter, 10Data-release, 10HAWelcome, and 10 others: Notify when installing MediaWiki without HTTP external connector - https://phabricator.wikimedia.org/T184652#4291076 (10Aklapper) p:05Lowest>03Triage a:03RazeSoldier [11:33:14] 10Analytics-Kanban, 10AbuseFilter, 10Data-release, 10Fundraising-Backlog, and 13 others: Omnimail recipient load tripping over non-downloaded file - https://phabricator.wikimedia.org/T184823#4291080 (10Aklapper) p:05Lowest>03Triage a:03Eileenmcnaughton [11:33:18] 10Analytics-Kanban, 10AbuseFilter, 10Data-release, 10HAWelcome, and 10 others: Measure impact of Singapore data center on Wikimedia usage - https://phabricator.wikimedia.org/T184677#4291082 (10Aklapper) p:05Lowest>03Normal a:03MNeisler [11:33:30] 10Analytics-Kanban, 10AbuseFilter, 10Data-release, 
10Discovery-Search, and 9 others: Set up RelForge test of phonetic title search - https://phabricator.wikimedia.org/T184771#4291095 (10Aklapper) p:05Lowest>03Normal a:03TJones [11:33:34] 10Analytics-Kanban, 10AbuseFilter, 10Data-release, 10HAWelcome, and 11 others: Improve access to Commons image data for research and development - https://phabricator.wikimedia.org/T184744#4291093 (10Aklapper) p:05Lowest>03Normal a:03Miriam [11:33:38] 10Analytics-Kanban, 10AbuseFilter, 10Data-release, 10Developer-Relations, and 11 others: Create and publish a multi-tiered support level system for MediaWiki extensions frequently used by third parties - https://phabricator.wikimedia.org/T184648#4291088 (10Aklapper) p:05Lowest>03Low a:03Johan [11:33:52] 10Analytics, 10DBA, 10EventBus, 10MediaWiki-Categories, and 5 others: {{PAGESINCATEGORY}} returns incorrect value on en-wiki Category:Candidates for speedy deletion - https://phabricator.wikimedia.org/T195397#4291098 (10jcrespo) [11:38:50] 10Analytics, 10DBA, 10EventBus, 10MediaWiki-Categories, and 6 others: {{PAGESINCATEGORY}} returns incorrect value on en-wiki Category:Candidates for speedy deletion - https://phabricator.wikimedia.org/T195397#4291167 (10jcrespo) This was known to me. This points me I should be more vocal about #wikimedia-l... [11:47:07] 10Quarry, 10DBA, 10Data-Services: Cannot reliably get the EXPLAIN for a query on analytics wiki replica cluster - https://phabricator.wikimedia.org/T195836#4291244 (10jcrespo) p:05Triage>03Low So the workaround for now is to make sure one is connected to the same server by doing: ``` SELECT @@GLOBAL.hos... [12:13:25] 10Analytics, 10DBA, 10EventBus, 10MediaWiki-Categories, and 6 others: {{PAGESINCATEGORY}} returns incorrect value on en-wiki Category:Candidates for speedy deletion - https://phabricator.wikimedia.org/T195397#4291360 (10Anomie) 05Open>03Resolved a:03Anomie This looks likely to be resolved now: the ch... 
[12:44:57] 10Analytics-Kanban, 10Patch-For-Review, 10User-Elukey: Add the prometheus jmx agent to AQS Cassandra - https://phabricator.wikimedia.org/T184795#4291550 (10Aklapper) a:03elukey [12:45:34] 10Analytics-Kanban, 10Patch-For-Review, 10User-Elukey: Fix outstanding bugs preventing the use of prometheus jmx agent for Hive/Oozie - https://phabricator.wikimedia.org/T184794#4291597 (10Aklapper) a:03elukey [12:46:06] 10Analytics-Kanban, 10Discovery-Analysis, 10Product-Analytics, 10Wikipedia-Android-App-Backlog, 10Patch-For-Review: Bug behavior of QTree[Long] for quantileBounds - https://phabricator.wikimedia.org/T184768#4291602 (10Aklapper) a:03Nuria [12:47:56] 10Analytics, 10DBA, 10EventBus, 10MediaWiki-Categories, and 6 others: {{PAGESINCATEGORY}} returns incorrect value on en-wiki Category:Candidates for speedy deletion - https://phabricator.wikimedia.org/T195397#4291625 (10jcrespo) Please don't run `count(*)` + `LOCK IN SHARE MODE` on the masters or you will... [12:50:54] joal: o/ I'm trying to read this file 'hdfs://analytics-hadoop/user/joal/wikidata/parquet' with pyspark in a standalone script, but the script isn't able to resolve analytics-hadoop. Do you know what I'm missing? [12:51:35] bmansurov: You can try without hdfs://analytics-hadoop [12:52:00] joal: I did that too, then the script complains about /user [12:52:00] bmansurov: If you manage to read other data with pyspark on the cluster, this should be readable as well :) [12:52:09] 10Analytics, 10DBA, 10EventBus, 10MediaWiki-Categories, and 6 others: {{PAGESINCATEGORY}} returns incorrect value on en-wiki Category:Candidates for speedy deletion - https://phabricator.wikimedia.org/T195397#4291673 (10Anomie) It looks like that locking was added in 2008 in {80a5874828} so the counts woul... [12:52:28] joal: I can read data only from within pyspark, and not from a python script [12:52:42] joal: in scala, how do you read data? any code examples?
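Since a code example was asked for: below is a minimal PySpark sketch (Python rather than Scala) of reading the Parquet path quoted above from a standalone script. It is a sketch under assumptions, not a confirmed fix: it assumes the script runs on a host where Spark can find the cluster's Hadoop client configuration (typically via `HADOOP_CONF_DIR`), so that the logical name `analytics-hadoop` can be resolved, and that the session is started against YARN instead of locally. The function names `build_session`, `read_parquet` and `main` are made up for illustration.

```python
# Path quoted in the conversation above.
PARQUET_PATH = "hdfs://analytics-hadoop/user/joal/wikidata/parquet"


def build_session(app_name="read-parquet-sketch"):
    """Create a SparkSession that runs on the cluster (YARN), not locally."""
    # Imported lazily so the rest of the module loads without Spark installed.
    from pyspark.sql import SparkSession

    return SparkSession.builder.master("yarn").appName(app_name).getOrCreate()


def read_parquet(spark, path=PARQUET_PATH):
    """Read a Parquet dataset through an existing SparkSession."""
    return spark.read.parquet(path)


def main():
    # Hypothetical entry point; call it from a script launched on the cluster.
    spark = build_session()
    df = read_parquet(spark)
    df.printSchema()
    spark.stop()
```

Keeping the session construction separate from the read makes it easy to swap the master or to pass in a session created elsewhere, for example the one provided by the interactive `pyspark` shell.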
[12:52:42] bmansurov: Yes, this is normal [12:53:25] bmansurov: to read data from hdfs, you need to either have an HDFS client, or use a 'system' (spark, hive) that knows how to read it [12:53:42] joal: here's what I'm doing https://github.com/wikimedia/research-translation-recommendation-models/blob/master/train.py#L41 [12:54:15] oh, so I need to somehow use the system spark in my spark session? [12:54:46] bmansurov: code you pasted should work [12:54:59] joal: o/ [12:55:02] any luck? [12:55:10] joal: ok, I'll look around, thanks for the help [12:55:52] bmansurov: I think the issue here comes from your spark-session not being instantiated on the cluster [12:56:21] elukey: problem is related to extracting geo-data and ua from certain rows [12:56:35] elukey: I'm trying to pinpoint examples [12:56:54] joal: I see [12:56:56] joal: ah! Does it take a ton of time for some rows? [12:57:10] elukey: YES [12:57:28] elukey: I'm assuming this is ua-parser (regex based) [12:57:28] uffffff [12:57:51] joal: do you think that we could increase the map timeout for this specific use case?
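One generic way to pinpoint rows that take a long time to process, as described above, is to time the extraction per row and keep the slowest ones. The sketch below is an illustration only, not the method actually used in this investigation; `parse` is a stand-in for whatever extraction function (UA parsing, geo lookup) is under suspicion, and `slowest_rows` is a hypothetical helper name.

```python
import time


def slowest_rows(rows, parse, top_n=5):
    """Time parse(row) for every row; return the top_n slowest as (seconds, row)."""
    timings = []
    for row in rows:
        start = time.perf_counter()
        parse(row)  # result is discarded, only the elapsed time matters here
        timings.append((time.perf_counter() - start, row))
    # Sort on elapsed time only, slowest first.
    timings.sort(key=lambda pair: pair[0], reverse=True)
    return timings[:top_n]
```

On a sample of the problematic hour, the rows returned first would be the candidates to inspect by hand.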
[12:58:01] I know that 10m are a lot [12:58:09] but maybe 20m will unblock the refinement [12:58:22] elukey: possible [12:58:27] elukey: we can try [12:58:50] elukey: I have looked after how to change the the parameter [13:00:15] +NOT [13:00:42] 10Analytics-Kanban, 10AbuseFilter, 10Collaboration-Community-Engagement, 10Data-release, and 12 others: February 2018 Collaboration newsletter - https://phabricator.wikimedia.org/T184799#4291742 (10Aklapper) a:03Trizek-WMF [13:00:51] 10Analytics-Kanban, 10AbuseFilter, 10Data-release, 10HAWelcome, and 14 others: Prepare config for WikibaseLexeme on beta wikidata - https://phabricator.wikimedia.org/T184745#4291743 (10Aklapper) a:03Addshore [13:00:58] 10Analytics-Kanban, 10AbuseFilter, 10Data-release, 10HAWelcome, and 11 others: Metrics for Android quarterly update - https://phabricator.wikimedia.org/T184641#4291745 (10Aklapper) a:03chelsyx [13:01:03] 10Analytics-Kanban, 10AbuseFilter, 10Data-release, 10HAWelcome, and 11 others: Remove FunctionComment.MissingParamComment from default rules - https://phabricator.wikimedia.org/T184650#4291744 (10Aklapper) a:03Reedy [13:01:07] 10Analytics-Kanban, 10AbuseFilter, 10Advanced-Search, 10Data-release, and 15 others: Set up first hello world browser test - https://phabricator.wikimedia.org/T184608#4291747 (10Aklapper) a:03Lea_WMDE [13:01:29] 10Analytics-Kanban, 10AbuseFilter, 10DBA, 10Data-release, and 11 others: Setup tendril database monitoring on 2 new hosts, one on eqiad and one on codfw - https://phabricator.wikimedia.org/T184704#4291754 (10Aklapper) a:03jcrespo [13:01:33] 10Analytics-Kanban, 10AbuseFilter, 10DBA, 10Data-release, and 12 others: Generate consistent logical database backups in CODFW - https://phabricator.wikimedia.org/T184699#4291756 (10Aklapper) a:03jcrespo [13:01:39] 10Analytics-Kanban, 10AbuseFilter, 10DBA, 10Data-release, and 14 others: Decommission db1011 - https://phabricator.wikimedia.org/T184703#4291755 (10Aklapper) a:03Cmjohnson [13:01:46] 
10Analytics-Kanban, 10AbuseFilter, 10Data-release, 10HAWelcome, and 11 others: Prepare material for the quarterly check-in deck (Q2 Sep-Dec FY18) - https://phabricator.wikimedia.org/T184672#4291759 (10Aklapper) a:03DarTar [13:01:55] elukey: from my example file, UA extraction is done in less than a minute for 2M rows, and basically takes forever because of 884 rows [13:02:03] 10Analytics-Kanban, 10AbuseFilter, 10Data-release, 10HAWelcome, and 11 others: "Syncing 1 articles" notification on beta ??? - https://phabricator.wikimedia.org/T184827#4291765 (10Aklapper) a:03Dbrant [13:03:41] joal: nice finding! [13:04:21] elukey: I could probably reduce the number of rows being problemaic [13:06:47] 10Analytics-Kanban, 10AbuseFilter, 10Data-release, 10HAWelcome, and 12 others: QA of 'specialpages' feature branch (settings page) - https://phabricator.wikimedia.org/T184742#4291837 (10Aklapper) a:03ABorbaWMF [13:06:53] 10Analytics-Kanban, 10AbuseFilter, 10Data-release, 10HAWelcome, and 13 others: Alert instrumentation returning 500 errors - https://phabricator.wikimedia.org/T184721#4291838 (10Aklapper) a:03ema [13:07:02] 10Analytics-Kanban, 10AbuseFilter, 10Data-release, 10HAWelcome, and 16 others: SpecialMobileContributions - Fatal exception of type "MediaWiki\Storage\RevisionAccessException" - https://phabricator.wikimedia.org/T184689#4291839 (10Aklapper) a:03daniel [13:07:09] 10Analytics-Kanban, 10AbuseFilter, 10Data-release, 10HAWelcome, and 12 others: Installation of PyYaml is failing on Python 2.6 appveyor builds - https://phabricator.wikimedia.org/T184678#4291840 (10Aklapper) a:03Dalba [13:07:13] 10Analytics-Kanban, 10AbuseFilter, 10Data-release, 10HAWelcome, and 11 others: Update 2018 privacy statement - https://phabricator.wikimedia.org/T184659#4291841 (10Aklapper) a:03Niharika [13:07:17] 10Analytics-Kanban, 10AbuseFilter, 10Data-release, 10HAWelcome, and 11 others: Initial outreach: Raising awareness of technical translation - 
https://phabricator.wikimedia.org/T184640#4291842 (10Aklapper) a:03contraexemplo [13:07:20] 10Analytics-Kanban, 10AbuseFilter, 10Data-release, 10HAWelcome, and 15 others: [wmf.16-regression] Fatal exception of type "Flow\Exception\InvalidDataException" for opting out from "Structured Discussions on user talk" - https://phabricator.wikimedia.org/T184670#4291843 (10Aklapper) a:03SBisson [13:07:26] 10Analytics-Kanban, 10AbuseFilter, 10Data-release, 10HAWelcome, and 11 others: Latest Trusty kernel boot issues: network and grub - https://phabricator.wikimedia.org/T184639#4291844 (10Aklapper) a:03chasemp [13:07:30] 10Analytics-Kanban, 10AbuseFilter, 10Data-release, 10HAWelcome, and 10 others: Submit grant proposal WLM 2018 international - https://phabricator.wikimedia.org/T184679#4291845 (10Aklapper) a:03Effeietsanders [13:07:50] hey team [13:08:19] 10Analytics, 10EventBus, 10Pywikibot-core, 10Patch-For-Review: EventStreams doesnt find any messages anymore - https://phabricator.wikimedia.org/T184713#4291871 (10Aklapper) p:05Lowest>03High [13:09:08] 10Analytics, 10Analytics-Kanban, 10EventBus, 10Pywikibot-core, 10Patch-For-Review: EventStreams doesnt find any messages anymore - https://phabricator.wikimedia.org/T184713#4291896 (10Aklapper) a:03Xqt [13:09:10] 10Analytics-Kanban, 10Patch-For-Review: Sqoop cu_changes table for geowiki - https://phabricator.wikimedia.org/T184759#4291898 (10Aklapper) a:03Milimetric [13:18:55] mforns: o/ [13:19:03] hey elukey :] [13:19:31] 10Analytics, 10DBA, 10EventBus, 10MediaWiki-Categories, and 6 others: {{PAGESINCATEGORY}} returns incorrect value on en-wiki Category:Candidates for speedy deletion - https://phabricator.wikimedia.org/T195397#4291929 (10jcrespo) I see most issues arose from updates to things like 'CC-BY-SA-4.0', and 'Self-... 
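On the idea raised earlier of raising the map timeout from 10 to 20 minutes: the sketch below shows the arithmetic and one possible way to express the override. It rests on two assumptions that the log does not confirm: that the timeout in question is Hadoop's `mapreduce.task.timeout` property (expressed in milliseconds), and that the refine step runs as a Hive query where a `SET` statement applies. The helper names are hypothetical.

```python
# Assumption: the "map timeout" is Hadoop's mapreduce.task.timeout (milliseconds).
TIMEOUT_PROPERTY = "mapreduce.task.timeout"


def timeout_ms(minutes):
    """Convert a timeout in minutes to the millisecond value Hadoop expects."""
    if minutes <= 0:
        raise ValueError("timeout must be positive, got %r" % (minutes,))
    return int(minutes * 60 * 1000)


def hive_set_statement(minutes):
    """Build a Hive SET statement overriding the task timeout for one session."""
    return "SET %s=%d;" % (TIMEOUT_PROPERTY, timeout_ms(minutes))
```

For example, `hive_set_statement(20)` yields a statement that could be placed at the top of a one-off re-run, leaving the cluster-wide default untouched.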
[13:20:22] elukey: I have launched my processing for the long rows more than 1/2 hour ago - still not done :( [13:21:03] elukey: Plus I'll need to drop soon and I'll miss standup tonight [13:21:33] elukey: can we spend a minute in batcave so that I explain what I think we could do for the thing? [13:21:39] joal: of course [13:21:41] I was about to ask [13:21:43] joining [13:22:08] mforns: can you join too? [13:22:21] elukey, sure [13:22:39] gimme 3 mins [13:30:21] 10Analytics-Kanban, 10Security: please add Casey Dentinger to Phabricator Security Project - https://phabricator.wikimedia.org/T184465#4292016 (10Reedy) [13:31:50] 10Analytics-Kanban, 10Security: please add Casey Dentinger to Phabricator Security Project - https://phabricator.wikimedia.org/T184465#3883673 (10Reedy) p:05Lowest>03Normal a:03Bawolff [13:39:34] 10Analytics-Kanban, 10ORES, 10Scoring-platform-team (Current), 10Security: Convert CloudVPS instances to stretch. - https://phabricator.wikimedia.org/T184296#4292133 (10Halfak) a:03Halfak [13:43:05] 10Analytics-Kanban, 10MediaWiki-Core-Tests, 10MW-1.31-release-notes (WMF-deploy-2018-01-09 (1.31.0-wmf.16)), 10Patch-For-Review: Query: DROP TEMPORARY TABLE IF EXISTS unittest_imagelinks Error: 1 near "TEMPORARY": syntax error breaking coverage ... - https://phabricator.wikimedia.org/T184333#4292154 [13:43:12] 10Analytics-Kanban, 10MediaWiki-Core-Tests, 10MW-1.31-release-notes (WMF-deploy-2018-01-09 (1.31.0-wmf.16)), 10Patch-For-Review: Query: DROP TEMPORARY TABLE IF EXISTS unittest_imagelinks Error: 1 near "TEMPORARY": syntax error breaking coverage ... 
- https://phabricator.wikimedia.org/T184333#4292161 [13:47:07] 10Analytics-Kanban, 10ORES, 10Scoring-platform-team, 10cloud-services-team: dpkg ailing on ores-misc-01.ores-staging.eqiad.wmflabs - https://phabricator.wikimedia.org/T184494#4292197 (10Halfak) [13:52:26] o/ elukey :) [13:52:31] qq, not sure I understadn this one: https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/440507/ [13:53:33] ottomata: morning :) - it is an attempt to explicitly deploy hive's config before spark deb gets installed [13:54:08] today Joseph saw that stat1004 had the symlink missing problem [13:54:24] so spark2-shell wasn't able to see hive databases [13:54:27] 10Analytics-Kanban, 10DBA, 10Security: db1011 possibly faulty BBU - https://phabricator.wikimedia.org/T184401#4292333 (10Reedy) [13:54:44] (03PS1) 10Joal: Update UA parsing to limit agent length [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/440533 (https://phabricator.wikimedia.org/T197281) [13:54:49] mforns, elukey --^ [13:54:59] cool joal thanks [13:55:27] hmm [13:55:36] we should probably just get rid of the symlink creation from the .deb ? [13:55:37] mforns: Currently trying ot parse the faulty hour with the 512 limit - I'll see if it succeeds [13:55:40] that's a little hacky [13:55:42] puppet can ensure it better [13:55:46] or even better, just put it in puppet too [13:55:53] joal, ok [13:56:23] elukey: maybe instead of adding another param, since we don't really need it, just add ensure symlink in the if defined(Class) part [13:57:01] ottomata: so even if the symlink doesn't get created the first time, it will be during the second puppet run? 
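The suggestion above to let puppet ensure the symlink relies on declarative convergence: describe the desired end state and re-apply it on every run, so a link that is missing after one run gets created on the next. The Python sketch below illustrates that idea only; it is not Puppet code, and `ensure_symlink` is a hypothetical helper.

```python
import os


def ensure_symlink(link_path, target):
    """Make link_path a symlink to target; return True if anything changed."""
    if os.path.islink(link_path):
        if os.readlink(link_path) == target:
            return False  # already in the desired state, nothing to do
        os.remove(link_path)  # symlink pointing elsewhere: replace it
    elif os.path.lexists(link_path):
        # Refuse to clobber a regular file or directory.
        raise FileExistsError("%s exists and is not a symlink" % link_path)
    os.symlink(target, link_path)
    return True
```

Calling it repeatedly is harmless: the first call creates the link, and later calls detect that the state is already correct and do nothing. A one-shot post-install step in a package does not have that property.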
[13:57:20] it probably will be created the first puppet run no matter what [13:57:35] 10Analytics-Tech-community-metrics, 10Developer-Relations (Jan-Mar-2018): Explain difference in number of repositories when trying to manually exclude imported third party repositories - https://phabricator.wikimedia.org/T184420#3882502 (10Aklapper) [13:58:08] i dunno why the deb package didn't create the symlink on install. i don't think the explicit require/include change you are adding would be different than the -> class dependency [13:58:09] ottomata: not sure, anyhow as you prefer, I thought the parameter was clearner but no strong opinion [13:58:19] the -> should put the same dep requirement as 'require' kw [14:00:10] sure, but afaics profile::hadoop::spark2 is required at the same time as profile::hive::client, and I am not sure if when puppet executes profile::hadoop::spark2 it is already aware of profile::hive::client (speculations) [14:00:33] Ok dropping team [14:00:39] mforns: I confirm value of 512 is ok :) [14:00:41] 10Analytics-Kanban, 10Patch-For-Review: Fix failing webrequest hours (upload and text 2018-06-14-11) - https://phabricator.wikimedia.org/T197281#4292419 (10JAllemandou) One problem is related to user-agent parsing for very long strings: ``` sudo -u hdfs spark2-shell --master yarn --conf spark.dynamicAllocation... [14:00:41] it has to compile the full catalog before doing anything tho, no? [14:00:47] all require does is [14:01:06] if you do say [14:01:17] class A { require B } [14:01:17] is [14:01:41] class A { [14:01:41] include B [14:01:41] Class[B] -> Class[A] [14:01:41] } [14:01:48] ok I get it [14:02:10] but I am talking about profile::analytics::cluster::client [14:02:18] where we have [14:02:18] require ::profile::oozie::client [14:02:18] # Spark 2 is manually packaged by us, it is not part of CDH. 
[14:02:19] require ::profile::hadoop::spark2 [14:03:09] i believe (could be wrong) [14:03:10] my speculation was that the two may have been evaluated in whatever order puppet decided [14:03:26] if it weren't for the dep we declare in spark2 class [14:03:29] you would be right [14:03:46] but the -> dep should force hive to happen before spark2 [14:03:48] and if it doesn't [14:03:59] i'm not sure how require of profile::hive::client would be different [14:04:15] since it is essentially the same thing as -> + include [14:04:21] my main doubt was around the defined profile::hive::cient [14:04:22] *client [14:04:36] not about the -> vs require [14:04:48] 10Analytics-Kanban, 10Edit-Review-Improvements-Integrated-Filters, 10Collaboration-Team-Triage (Collab-Team-This-Quarter), 10Security: Metrics: Pull New Filters data so we can have a new baseline to measure against - https://phabricator.wikimedia.org/T184493#4292464 (10Reedy) [14:04:51] hmmmmm [14:05:08] yeah hm, i suppose we do with files sometimes if defined(File[A]) [14:05:10] in multiple places [14:05:14] and we know one of them will happen first [14:05:18] but we don't know which one [14:06:49] ok elukey maybe ya...i have a diff suggestion then (i really prefer to not add parameters we have to tweak in labs if we don't have to). ... :D [14:07:00] can we just specify the dependency in analytics::cluster::client [14:07:04] and profile::hadoop::worker? [14:07:04] 10Analytics-Kanban, 10cloud-services-team, 10Security: weird root shell on wmde-wikidiff2-patched.wikidiff2-wmde-dev.eqiad.wmflabs - https://phabricator.wikimedia.org/T184495#4292649 (10Reedy) [14:07:08] the places they are included [14:07:29] class { '::profile::hadoop::spark2': [14:07:29] require => Class['::profile::hive::client'], [14:07:29] } [14:07:30] ? [14:08:17] sure sure :) [14:08:46] then we can probably remove the whole if defined block, in spark2 class, ya?
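The ordering argument above can be illustrated with a small dependency graph: without an edge between two classes any relative order is valid, while an explicit dependency forces one to come first. The Python sketch below is an analogy only and does not model how Puppet compiles or applies a catalog; the class names come from the discussion, the graph and the helper `evaluation_order` are illustrative.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def evaluation_order(dependencies):
    """Return one valid order in which every class comes after its dependencies.

    dependencies maps a class name to the set of classes it requires.
    """
    return list(TopologicalSorter(dependencies).static_order())


# Illustrative graph: spark2 depends on the hive client, oozie is unrelated.
deps = {
    "profile::hadoop::spark2": {"profile::hive::client"},
    "profile::hive::client": set(),
    "profile::oozie::client": set(),
}
order = evaluation_order(deps)
```

In `order`, the hive client always precedes spark2, whereas the position of the oozie client relative to the other two is not constrained. Removing the edge from `deps` would make the hive/spark2 order arbitrary as well, which is the situation being speculated about.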
[14:08:59] 10Analytics-Kanban, 10Community-Tech, 10Grant-Metrics, 10Security: Investigation: Create job queue system for calculating event statistics - https://phabricator.wikimedia.org/T184492#4292675 (10Reedy) [14:09:33] ottomata: ack, +1 [14:09:40] (03CR) 10Ottomata: [C: 031] "Is 512 really long enough? Other wise +1!" [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/440533 (https://phabricator.wikimedia.org/T197281) (owner: 10Joal) [14:09:58] as side note, Joe found a way to merge the varnishkafka module into ops/puppet without making fireworks [14:10:05] and we just merged it, all good [14:10:07] \o/ [14:10:48] 10Analytics-Kanban, 10Data-Services, 10MediaWiki-Logging, 10SpamBlacklist, and 2 others: Expose spamblacklist log type on wiki replica servers - https://phabricator.wikimedia.org/T184483#4292713 (10Reedy) [14:10:53] elukey: really!? [14:10:54] HOW? [14:11:01] like if i go git pull right now things are ok? [14:11:11] i must at least have to delete my local submodule right? [14:11:56] in theory no [14:12:12] we moved the module into environments/production/modules [14:12:27] that is looked before the main modules, I didn't know that [14:12:34] and then moved again back under modules [14:12:39] so two steps [14:12:46] curious if your git pull breaks [14:14:09] ok trying [14:15:28] nnmm [14:15:29] nopers [14:15:29] error: The following untracked working tree files would be overwritten by merge: [14:15:29] modules/varnishkafka/README.md [14:16:37] ah snap, then those needs to be cleared [14:16:49] can you try and see if it works? 
[14:17:31] yup that worked [14:17:44] super, I need to send an email to ops then [14:30:57] 10Analytics-Kanban, 10RESTBase-API, 10Patch-For-Review, 10Services (done): Update AQS pageview-top definition - https://phabricator.wikimedia.org/T184541#4293080 (10Aklapper) [14:31:28] (03CR) 10Elukey: [C: 031] Update UA parsing to limit agent length [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/440533 (https://phabricator.wikimedia.org/T197281) (owner: 10Joal) [14:32:54] 10Analytics, 10Operations, 10hardware-requests: EQIAD: (1) hardware request for eventlog1001 replacement - eventlog1002. - https://phabricator.wikimedia.org/T184551#4293105 (10Aklapper) [14:36:13] 10Analytics-Kanban, 10Puppet, 10User-Elukey: analytics VPS project puppet errors - https://phabricator.wikimedia.org/T184482#4293128 (10Aklapper) [14:37:39] https://grafana.wikimedia.org/dashboard/db/varnishkafka?orgId=1&from=now-7d&to=now&var-instance=eventlogging [14:37:43] whattttt [14:44:04] joal: how how how, did you figured out that the ua length was leading to broken refine? [14:45:02] * nuria_ redaing backscroll [14:45:06] *reading [14:47:45] nuria_: the mappers were timing out, so after checking the integrity of the camus imported files via spark etc.. joseph ran a manual refine, and found the issue [14:47:52] (this is what I gathered) [14:48:05] mostly Joseph's magic [14:48:14] elukey: still no compredou [14:48:31] elukey: will talk to joal about it when he's back [14:49:01] elukey: cause i do not see (if it was UA) where that error was in any of the logs... 
there must have been a logfile(s) that i missed [14:50:21] nuria_: yeah this is the main issue, it seems that they don't error but simply take a ton of processing time, eventually causing the map to hit its timeout (10m) [14:50:31] the UA was like more than 2700 chars [14:50:35] or something similar [14:50:40] (one of the problematic ones) [14:51:18] I asked to Joseph to post the UA as it may contain the answer to the question about the universe and what we do on earth [14:51:25] :D [14:51:31] elukey: ah ok, so joal must have just looked at data for the hour (duh nuria) [14:51:36] elukey: ayayay [14:52:41] yep it used spark IIUC [14:52:47] *he used [14:57:52] (03CR) 10Nuria: [C: 032] Update UA parsing to limit agent length [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/440533 (https://phabricator.wikimedia.org/T197281) (owner: 10Joal) [14:58:51] nuria_: the idea is to avoid rushing a (long) deployment today and possibly wait for monday, announcing the task/issue to analytics@ [14:59:10] but if anybody wants to go forward we can do it sooner [14:59:28] elukey: yaya, agreed, Monday sounds fine [14:59:47] elukey: I am reruning tests on patch [15:05:11] (03CR) 10Nuria: [V: 032 C: 032] Update UA parsing to limit agent length [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/440533 (https://phabricator.wikimedia.org/T197281) (owner: 10Joal) [15:16:13] 10Analytics, 10Pageviews-API, 10Tool-Pageviews: Pageviews agent=bot is always 0 - https://phabricator.wikimedia.org/T197277#4293341 (10MusikAnimal) It's just rare, I'm guessing. Pageviews Analysis is pulling this from the #pageviews-api, which returns zero: https://wikimedia.org/api/rest_v1/metrics/pageviews... [15:20:58] elukey, I'm looking at the deployment-eventlog05 alerts, the start-ts-file is missing. Should I create one? [15:26:30] I didn't get them! 
Yes please [15:36:13] 10Analytics, 10Analytics-EventLogging, 10Readers-Web-Backlog: Some VirtualPageView are too long and fail EventLogging processing - https://phabricator.wikimedia.org/T196904#4293412 (10Tbayer) >>! In T196904#4283794, @Jdlrobson wrote: > That's definitely an option. I'm not sure what the limit would be though... [15:38:04] elukey, done, hope this avoids them [15:38:25] * mforns leaving now for school meeting [15:38:31] se ya later [15:39:00] please, see e-scrum [15:49:31] a-team, do we know that and/or why the 20180614 hour 12 is missing here? https://dumps.wikimedia.org/other/pageviews/2018/2018-06/ [15:50:31] ottomata: is it 12 or 11? because we had an issue with hour 11, webrequest text/upload is not refined (the code review the joal posted) [15:50:58] so pageviews hourly surely didn't run, and daily will be blocked until we deploy [15:51:34] https://hue.wikimedia.org/oozie/list_oozie_workflow/0048177-180510140726946-oozie-oozi-W/?coordinator_job_id=0027735-180510140726946-oozie-oozi-C&bundle_job_id=0027733-180510140726946-oozie-oozi-B [15:52:46] 10Analytics, 10Analytics-EventLogging, 10Readers-Web-Backlog: Some VirtualPageView are too long and fail EventLogging processing - https://phabricator.wikimedia.org/T196904#4293485 (10Jdlrobson) >>! In T196904#4293428, @Ottomata wrote: > I mentioned this to @mforns in chat, not sure if it works or not. Inst... [15:53:03] weird, it is 12 [15:53:30] but I am pretty sure that it is related [16:00:52] elukey: yeah sounds related for sure [16:00:57] a guy emailed the list [16:02:45] standuuuup [16:02:49] AH [16:02:56] ping ottomata mforns_away joal [16:05:13] 10Analytics, 10Analytics-EventLogging, 10Readers-Web-Backlog: Some VirtualPageView are too long and fail EventLogging processing - https://phabricator.wikimedia.org/T196904#4293531 (10Nuria) >that should be enough to get the project and language_variant. The language variant is extracted from the uri like:... 
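On the user-agent issue discussed above (a user agent of more than 2700 characters keeping the regex-based parser busy until the 10-minute map timeout), one way to limit agent length is to cut the string to a fixed bound before parsing. The sketch below uses the 512 value mentioned for the patch; whether the actual patch truncates or handles long strings differently is not stated in the log, so treat this as an assumption. `parser` is a stand-in for any UA parsing function, and both helper names are hypothetical.

```python
MAX_UA_LENGTH = 512  # limit mentioned in the log for the UA parsing patch


def bounded_user_agent(user_agent, max_length=MAX_UA_LENGTH):
    """Truncate a user-agent string so regex parsing sees a bounded input."""
    if user_agent is None:
        return ""
    return user_agent[:max_length]


def parse_user_agent(user_agent, parser, max_length=MAX_UA_LENGTH):
    """Run parser (a stand-in for a regex-based UA parser) on the bounded string."""
    return parser(bounded_user_agent(user_agent, max_length))
```

Bounding the input caps the worst-case work per row regardless of which regular expression is slow, at the cost of possibly losing information that only appears past the cut-off.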
[16:14:43] 10Analytics, 10Analytics-EventLogging, 10Readers-Web-Backlog: Some VirtualPageView are too long and fail EventLogging processing - https://phabricator.wikimedia.org/T196904#4293548 (10Ottomata) OHhh right 'variant'. Hm ok. [16:15:19] 10Analytics, 10Analytics-EventLogging, 10Readers-Web-Backlog: Some VirtualPageView are too long and fail EventLogging processing - https://phabricator.wikimedia.org/T196904#4293551 (10Ottomata) You could truncate everything after language variant? [16:15:34] 10Analytics, 10Analytics-Wikistats: Shortcut icon is not showing - https://phabricator.wikimedia.org/T197482#4293552 (10sahil505) [16:20:06] (03CR) 10Sahil505: "> I think we should make all metric names capitalized only in the" [analytics/wikistats2] - 10https://gerrit.wikimedia.org/r/440387 (https://phabricator.wikimedia.org/T197103) (owner: 10Sahil505) [16:56:42] all right email(s) sent [16:57:09] I think I'll drop off for today, have a nice weekend a-team! [17:10:28] thanks luca! [17:16:07] 10Analytics, 10Operations, 10SRE-Access-Requests: Requesting access for mbsantos - https://phabricator.wikimedia.org/T197237#4293707 (10herron) Hello @MSantos! To provision access we need to list down the specific group memberships that are requested. Could you please coordinate the gathering of this infor... [17:18:52] 10Analytics, 10Operations, 10SRE-Access-Requests: Requesting access for mbsantos - https://phabricator.wikimedia.org/T197237#4293712 (10herron) p:05Triage>03Normal [18:09:04] Since data is missing from hour 12, does that mean we need to wait until that's populated to get the daily viewcount totals? 
[18:30:07] 10Quarry: Add nofollow attribute in some links to prevent bots from following unnecessary ones - https://phabricator.wikimedia.org/T197488#4293811 (10Framawiki) [18:32:00] 10Quarry, 10Cloud-Services: GoogleDocs bot has download 125 000 csv exports in the last month - https://phabricator.wikimedia.org/T197256#4293823 (10Framawiki) I've found this page from google help center that describes their import function. It can match our problem. https://support.google.com/docs/answer/30933... [18:54:28] ottomata: Curious if it would make sense for statsv.py to run on a server other than webperf1001? It's fine where it is in terms of load, but given we don't actively look after it anymore, it might make more sense to run elsewhere. Thoughts? [18:55:13] also because webperfx001 is now multi-dc, and I think things went fine, but I honestly forgot about it (oops) [18:56:01] nuria_: I'm curious whether https://phabricator.wikimedia.org/T187207 would fit onto next quarter's roadmap? [18:57:16] 10Analytics, 10Analytics-EventLogging, 10Performance-Team (Radar): Spin out a tiny EventLogging RL module for lightweight logging - https://phabricator.wikimedia.org/T187207#4293842 (10Krinkle) [18:59:30] Krinkle: yes, i think it's already there on my --ahem.. super rough notes etherpad [18:59:39] Thanks :) [19:00:01] Krinkle: https://etherpad.wikimedia.org/p/analytics-goals [19:00:29] Krinkle: no need to look but so you know what i mean with "rough" [19:10:34] hey, I'm back [19:14:14] Krinkle: hm [19:14:17] sure it can run anywhere [19:14:25] not really sure where is a better place than it is though [19:14:33] a ganeti instance or 2? [19:15:04] yeah, or one of (eventloggingx001, statsd-lb, eventlogging processor, another analytics-maintained host) [19:15:59] statsd-lb? what's that?
[19:16:02] but yeah, a VM seems suitable, especially given we already (accidentally) moved it from hafnium to webperf1001 (a VM) [19:16:09] it would be good to have a place in eqiad and codfw for it [19:16:19] ottomata: the server that hosts statsd.eqiad.wmnet and its LB and instances on various internal ports. [19:16:31] is there one in codfw? [19:16:31] Right, assuming it makes sense to run multi-dc at the moment. [19:16:44] There should be one yes, but I don't know for sure. [19:16:50] Krinkle: hm, tbh, this is a service that should probably be moved officially to ops ownership [19:16:57] it is for operational metrics [19:16:59] statsd is active-inactive but the host should exist [19:17:09] we are trying to push more non-analytics usages of kafka like this to ops :) [19:17:11] ya [19:17:26] they've agreed to that [19:17:50] Right. Well, I suppose that's the next level down indeed, when looking at the slice of this service. [19:18:42] It's currently in a fairly odd state [19:19:00] the repo is analytics/statsv (originally moved from puppet:/webperf/files) [19:19:05] the scap name is statsv/statsv [19:19:12] the puppet class is (still) webperf::statsv [19:19:37] Could you take on the move to a VM and/or ask ops to? [19:20:14] I can help out with the puppet part of it, but I don't know its code or function well enough to coordinate or transfer the knowledge. [19:26:05] nuria_: hey, is there a codebase for https://phabricator.wikimedia.org/T191964 so I can help out? [19:43:30] ya sure [20:34:40] Amir1: I'd rather do that work myself, looking into those issues is a few brief minutes of joy when i am drowning in management issues [20:38:36] nuria_: As you wish :) Have fun! [20:57:50] 10Analytics: Measure traffic for new wikimedia foundation site - https://phabricator.wikimedia.org/T188419#4294032 (10Nuria) @Varnent we can probably accommodate the site but not immediately. What are the target dates you are looking for launch?
[21:00:13] 10Analytics: Measure traffic for new wikimedia foundation site - https://phabricator.wikimedia.org/T188419#4294051 (10Varnent) Site launch is scheduled for 30 July. Should we utilize third-party until internal is ready? Any sense of if we are talking about months or a fiscal year? :)